WHS Crash: Delayed Write Failure

  • Question

  • Couldn't connect to WHS, did a restart, took over 10 minutes.  Thought I should check it out, went into Event Viewer found this:

    http://img295.imageshack.us/img295/8145/whsll3.jpg

    Log looks like that from 12 noon to 12 midnight.  Mostly affected DVD folder, lost data.  WHS Console won't start.  (File is blocking it: HomeServerConsoleTab.Sharing.dll (sometimes Storage.dll) "Do you want to open without?" Yes.  Then you lose those two tabs.) Disk Management won't start.  What is wrong with my WHS?

    All started here, it seems:

    http://img221.imageshack.us/my.php?image=whsht5.jpg

    Home Built
    WHS w/ PP1
    Athlon Processor 3200+ 2.0GHz
    1GB RAM
    WinFast 6150K8MA-8EKRS Motherboard
    (2) Rosewill 4 port PCI SATA Controllers
    (3) Seagate 400GB SATA HDD
    (6) Seagate 500GB SATA HDD

    Duplication was turned on for Photos.  Sharing was enabled.

    What can I do?

    Saturday, September 20, 2008 6:27 AM

All replies

  • Run chkdsk /f /r on each disk (if available boot from a Vista DVD and use the system repair options command prompt to run this command).

    If this does not help, check whether you can perform a server reinstall. If you are only offered a new installation, things become ugly (I speak from bad experience) and you need another storage device (an external disk) capable of holding the files stored on the disks, or at least those on the system disk.
    If you start a new installation (in case server reinstall is not offered), disconnect all other disks beforehand, so that you can salvage the files from them later.
    Best greetings from Germany
    Olaf
    Saturday, September 20, 2008 11:34 AM
    Moderator
  • visualechodesigns said:

    Couldn't connect to WHS, did a restart, took over 10 minutes.  Thought I should check it out, went into Event Viewer found this:

    http://img295.imageshack.us/img295/8145/whsll3.jpg

    Log looks like that from 12 noon to 12 midnight.  Mostly affected DVD folder, lost data.  WHS Console won't start.  (File is blocking it: HomeServerConsoleTab.Sharing.dll (sometimes Storage.dll) "Do you want to open without?" Yes.  Then you lose those two tabs.) Disk Management won't start.  What is wrong with my WHS?

    All started here, it seems:

    http://img221.imageshack.us/my.php?image=whsht5.jpg

    Home Built
    WHS w/ PP1
    Athlon Processor 3200+ 2.0GHz
    1GB RAM
    WinFast 6150K8MA-8EKRS Motherboard
    (2) Rosewill 4 port PCI SATA Controllers
    (3) Seagate 400GB SATA HDD
    (6) Seagate 500GB SATA HDD

    Duplication was turned on for Photos.  Sharing was enabled.

    What can I do?



    You say that the photo folders were set to duplication, were the DVD folders?

    Thank you
    Lara Jones [MSFT] Windows Home Server Team
    Saturday, September 20, 2008 4:21 PM
    Moderator
  • Olaf Engelke said:

    Run chkdsk /f /r on each disk (if available boot from a Vista DVD and use the system repair options command prompt to run this command).

    If this does not help, check whether you can perform a server reinstall. If you are only offered a new installation, things become ugly (I speak from bad experience) and you need another storage device (an external disk) capable of holding the files stored on the disks, or at least those on the system disk.
    If you start a new installation (in case server reinstall is not offered), disconnect all other disks beforehand, so that you can salvage the files from them later.
    Best greetings from Germany
    Olaf


    Alright, I think I understand the chkdsk part.  On the server, use a Vista DVD, boot into repair, and run chkdsk on each drive.

    If I have to go to a server reinstall, we are talking about 4TB of data, and I don't have enough storage elsewhere.  However, it is my understanding that each of my WHS data drives is readable in any Windows PC.  So my thought would be, remove data disks, install clean WHS, and start transferring data back onto WHS one disk at a time from another PC.  Is that a possibility?  What are the disadvantages besides taking a lot of time and WHS reallocates/etc. the data?

    What has happened?  I'd like to understand what went wrong.....

    Can anyone else confirm that this is what I should do?
    Saturday, September 20, 2008 4:27 PM
  • Lara Jones said:


    You say that the photo folders were set to duplication, were the DVD folders?

    Thank you
    Lara Jones [MSFT] Windows Home Server Team



    No, total storage was about 4.5TB, the DVD folder was almost 3TB of data. 

    Also, the 2nd screen didn't upload correctly.  I noticed an event around 1:45pm that kicked off the entire mishap.  Something along the lines of "Hardrive 3 failure".  I'll turn on WHS again and grab the screen.
    Saturday, September 20, 2008 4:29 PM
  • visualechodesigns said:

    Alright, I think I understand the chkdsk part.  On the server, use a Vista DVD, boot into repair, and run chkdsk on each drive.

    If I have to go to a server reinstall, we are talking about 4TB of data, and I don't have enough storage elsewhere.  However, it is my understanding that each of my WHS data drives is readable in any Windows PC.

    That's correct.

    visualechodesigns said:

    So my thought would be, remove data disks, install clean WHS, and start transferring data back onto WHS one disk at a time from another PC.  Is that a possibility?  What are the disadvantages besides taking a lot of time and WHS reallocates/etc. the data?

    I would suggest Server Reinstallation first.  It's a specialized installation mode that wipes the OS partition but leaves everything else intact.

    visualechodesigns said:

    What has happened?  I'd like to understand what went wrong.....

    Can anyone else confirm that this is what I should do?



    Saturday, September 20, 2008 5:33 PM
    Moderator
  • I got this problem after installing a dodgy SATA controller card which I have sent in for RMA. But it can also be a dodgy driver.

    I have a Highpoint 2300 and I bought a Highpoint 2310 for a little more pep (and I needed the 2300 in another box). As soon as I installed the 2310, I got all sorts of Delayed Write Failures. Put the 2300 back in and they went away.
    Saturday, September 20, 2008 8:07 PM
  • Server has been running for about a year with no problems. I didn't update the drivers, auto-update maybe?  Still digging to see what has happened.   CHKDSK on C:\ no errors.  CHKDSK on D:\ .................
    Saturday, September 20, 2008 8:23 PM
  • visualechodesigns said:

    Server has been running for about a year with no problems. I didn't update the drivers, auto-update maybe?  Still digging to see what has happened.   CHKDSK on C:\ no errors.  CHKDSK on D:\ .................


    No.  WHS doesn't automatically update drivers (as a matter of fact, no MS OS does).  If it's not the hard drives, I'd start looking at the controller cards and the power and data cables.
    Saturday, September 20, 2008 9:55 PM
    Moderator
  • My guess is a failing hard drive.  You can download software from most hard drive manufacturers to test their drives for errors.
    athlon 3400, 2gb ram, 9 drives totaling about 3.5 tbs.
    Sunday, September 21, 2008 9:05 AM
  • Enchanter said:

    My guess is a failing hard drive.  You can download software from most hard drive manufacturers to test their drives for errors.


    athlon 3400, 2gb ram, 9 drives totaling about 3.5 tbs.



    I think Enchanter is probably right. If you haven't recently updated device drivers (controller, disk, ....) it's most likely a hardware problem, especially since you also have an NTFS Event ID 50 error. If you're lucky it's just a bad cable or bad connection; however, a failing disk is the most likely cause.

    A failing disk could also mean a limited number of bad blocks, which can normally be fixed by running chkdsk /f /r on the problematic disk.

    If chkdsk /f /r on the problematic hard disk doesn't fix your problem, I would suggest you remove the failing drive using the WHS console, then replace it.

    To run chkdsk on the failing hard disk, log on to your WHS using remote desktop, then:

    1. Start, Run, type cmd, Hit Enter
    2. Type net stop pdl and Hit Enter
    3. Type net stop whsbackup and Hit Enter
    4. In command shell type chkdsk C:\fs\J /f /r and hit Enter. 
      1. If you get a question asking you to dismount the volume answer Yes
      2. If you get a question to schedule a chkdsk at next boot answer Yes and reboot
    5. Do NOT write to your shares when chkdsk is running
    6. Reboot after chkdsk has finished
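
The command sequence in steps 2 to 4 can be sketched as a small Python script. This is only an illustration: the function names `build_chkdsk_plan` and `run_plan` are made up for this example, the commands and the `C:\fs\J` path are taken from the steps above, and the script defaults to a dry run that only prints what it would execute.

```python
import subprocess


def build_chkdsk_plan(mount_point):
    """Return the commands from steps 2-4, in order, as argument lists."""
    return [
        ["net", "stop", "pdl"],
        ["net", "stop", "whsbackup"],
        ["chkdsk", mount_point, "/f", "/r"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            # chkdsk may ask to dismount the volume or to schedule a check
            # at next boot; those prompts are answered at the console.
            subprocess.run(argv, check=False)


plan = build_chkdsk_plan(r"C:\fs\J")
run_plan(plan)  # dry run: prints the three commands without running them
```

Calling `run_plan(plan, dry_run=False)` on the server itself would actually run the commands, so the same cautions apply: do not write to the shares while chkdsk runs, and reboot afterwards.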

     

    Please note: if you leave a failing hard disk in your machine, you're bound to run into more serious errors.
    Sunday, September 21, 2008 2:30 PM
    Moderator
  • Chkdsk is taking forever......

    How can I find out which drive is the "J" drive?  I'd like to focus my efforts.
    Monday, September 22, 2008 1:54 AM
  • visualechodesigns said:

    Chkdsk is taking forever......

    How can I find out which drive is the "J" drive?  I'd like to focus my efforts.


    First, chkdsk can take a long time.  My suggestion is to leave it alone (and prepare to replace the drive ASAP).  You really can't determine which drive is which while chkdsk is running.  You just have to let it go through its process.
    Monday, September 22, 2008 2:51 AM
    Moderator
  • visualechodesigns said:

    Chkdsk is taking forever......

    How can I find out which drive is the "J" drive?  I'd like to focus my efforts.


    As Kariya21 already said please let chkdsk finish. It can run for a long time, especially if there are disk errors that need to be fixed.

    When it's finished, hopefully your system will be OK again. If so, I would advise you to check the SMART parameters of your hard disks using a tool that can report them, for example Speedfan.

    When SMART reports all your disks are healthy you can keep on using all of your disks with little or no risk. If one or more disks are reported unhealthy the best thing is to replace them.

    I don't know of any simple way to find out which hdd is C:\fs\J. If you know which one is the system drive, you can disconnect power to the other disks one by one and check in Explorer which entry under C:\fs disappears (press F5 to refresh after disconnecting a disk). As long as you're not writing to the disks, this shouldn't harm them.
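
The before/after comparison described above can also be done with a short Python sketch instead of eyeballing Explorer. It assumes each data disk appears as one entry under C:\fs, as the C:\fs\J path suggests; the helper names `list_mount_points` and `disappeared` and the sample letters are hypothetical.

```python
import os


def list_mount_points(root):
    """Return the set of entry names under root (empty if root is missing)."""
    if not os.path.isdir(root):
        return set()
    return set(os.listdir(root))


def disappeared(before, after):
    """Return the entries present before a disk was unplugged but not after."""
    return sorted(set(before) - set(after))


# Illustrative values only. On the server you would instead call
# list_mount_points(r"C:\fs") once before and once after unplugging a disk.
before = {"J", "K", "L"}
after = {"K", "L"}
print(disappeared(before, after))
```

Whichever entry is reported as missing corresponds to the disk that was just disconnected.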
    Monday, September 22, 2008 5:04 AM
    Moderator
  • Chkdsk is done.  Said it corrected some errors.  Seagate Tools reported no SMART errors on any drives and the short test came back PASS on all.  I will run the long test on all as well.  WHS is booting normally, reporting healthy, and running well except for the backup service.  Error message saying that the backup service is not running......

    So the crisis may have been fixed, almost anyway.

    Thank you for all your help/suggestions.

    Will do more work tomorrow.

    Monday, September 22, 2008 6:41 AM
  • visualechodesigns said:

    Error message saying that the backup service is not running......



    The net stop pdl and net stop whsbackup commands stop the backup service. You can restart the services using net start pdl and net start whsbackup, or simply reboot.
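
As a rough sketch of the restart, the Python below builds the two net start commands and checks a service's state from the text printed by the standard Windows `sc query` command. The helper names are made up for this example, and the sample text is illustrative output typed in by hand, not captured from the poster's server.

```python
def restart_commands(services=("pdl", "whsbackup")):
    """Return one 'net start' command per service, as argument lists."""
    return [["net", "start", name] for name in services]


def service_is_running(sc_output):
    """Return True if the STATE line of 'sc query' output says RUNNING."""
    for line in sc_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "STATE":
            return "RUNNING" in value
    return False


# Illustrative 'sc query whsbackup' output for a stopped service.
sample = """SERVICE_NAME: whsbackup
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
"""

for argv in restart_commands():
    print(" ".join(argv))
print(service_is_running(sample))
```

On the server, the printed commands can be typed into the same command prompt used for chkdsk; a reboot achieves the same result.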
    Monday, September 22, 2008 8:05 AM
    Moderator