Need Help with Strange WHS Disk Errors

    Question

  • Hello-

    I am fairly new to home server and hope someone can help me figure out a resolution to this critical disk issue I am having. Here's some background.

    I have a new Asus WHS that came with a 1TB HD, to which I added two 2TB HDs. I promptly loaded it up with about 1 TB of data across 9 different duplicated shares. It seemed to distribute all the new data on the new drives rather evenly (based on the Disk Mgmt Add-In). A quick read of the Drive Extender White Paper helped me understand why.

    Now, about a week later I rebooted the home server hard because my router stopped giving it an IP address, and I initially assumed it was an issue with the WHS. When I resolved the network issue and connected to the home server I had the following errors for ALL of my duplicated shares:

    The system cannot find the drive specified
    A device attached to the system is not functioning



    Additionally, I see other strange messages throughout the console:

    1) Server Storage sees both the primary data partition and one of the 2TB disks as not added

    2) The pie chart in Server Storage is scrolling with the message Calculating sizes.... (I don't think this is the permission issue since it started happening at the same time as these other issues)

    3) Shared Folders shows that duplication is on all but one of my folders (that one is empty). It also shows that all the duplicated folders have a status of Failing (Check Health)

    4) The Disk Management Add-In reports that Disk 0 (the 1TB system disk) has a status of not added under the Unmanaged Disks group

    5) The Disk Management Add-In reports that Disk 2 (the second 2TB disk) has a status of not added under the Action Required group

    6) The little shield is yellow with an at risk message. The blue lights beside all of the drives are a friendly, healthy solid blue

    7) I am still able to access all of my shares

    Troubleshooting:
    - I tried rebooting properly a few times to no avail.

    - I removed the failing 2TB drive (the Disk Management Add-In helped me identify the correct drive, since the two added drives were otherwise identical). This caused the disk light to start flashing red, the shield turned to red (critical), and I got the Network Critical error of "the xxx hard drive has failed. Ensure it is connected." I then reconnected it, and the shield returned to yellow, but now the disk shows up as HEALTHY!

    BUT . . . The primary data partition is still showing not added and shared folders is still showing Failing (Check Health)

    I obviously cannot do the same thing to fix the primary disk, since it is also running the system partition. So then I rebooted, and I was back to square one, with both disks showing as not added.

    I can repeat this procedure and get to the same point, with the second 2TB disk reporting healthy but the primary data partition reporting as Failing (Check Health). When I then right-click on the primary data partition in Server Storage, no options are available. When I right-click on the primary data partition in the Disk Management Add-In, I can Add but not Remove. If I try to Add it to the pool, it does nothing, but then the purple light starts blinking beside drive 0.
    QUESTION:

    How do I fix this problem? I believe there are no actual drive failures right now, but that Drive Extender has just gotten itself into a strange state. Since I can get the second 2TB disk reporting healthy before a reboot, my guess is that if I can get the primary data partition reporting healthy as well, it should resolve itself, and hopefully Drive Extender will not forget about these disks after each reboot. Since the primary data partition should have nothing but tombstones on it, I would love to just remove it properly through Server Storage and then add it back to the pool, but that is not an option provided through the console.

    Please Help!

    Thursday, February 25, 2010 11:35 PM

All replies

  • How do I fix this problem? I believe there are no actual drive failures right now, but that Drive Extender has just gotten itself into a strange state.

    I would have to disagree.  I think you should start by running chkdsk /r on each drive in your server.  See the FAQ post:  How to check all the drives in your server for errors for details on how to do that.
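
    As a rough illustration of that suggestion (a sketch only, not the batch file from the FAQ post), the Python snippet below builds one chkdsk /r command per volume and prints it. The helper names chkdsk_commands and run_all and the volume list are hypothetical; substitute the volumes that actually exist on your server. It defaults to a dry run so that nothing is executed by accident.

```python
import subprocess


def chkdsk_commands(volumes):
    """Build one 'chkdsk <volume> /r' command (as an argument list) per volume."""
    return [["chkdsk", volume, "/r"] for volume in volumes]


def run_all(volumes, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in chkdsk_commands(volumes):
        print(" ".join(command))
        if not dry_run:
            # chkdsk may prompt to schedule the check at the next boot
            # when the volume is in use, so run it interactively.
            subprocess.run(command, check=False)


if __name__ == "__main__":
    # Hypothetical volume list (assumption): adjust to your own server.
    run_all(["C:", "D:"], dry_run=True)
```

    Keeping the command construction separate from execution makes it easy to review exactly what would run before committing to a long /r scan.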

    Since I can get the second 2TB disk reporting healthy before a reboot, my guess is that if I can get the primary data partition reporting healthy as well, it should resolve itself, and hopefully Drive Extender will not forget about these disks after each reboot. Since the primary data partition should have nothing but tombstones on it, I would love to just remove it properly through Server Storage and then add it back to the pool, but that is not an option provided through the console.

    Please Help!

    The only supported option to get everything back to normal is to do a Server Recovery (or whatever Asus calls the function that wipes the OS partition, leaving the rest of the data intact).  That should be covered in the documentation you got with your server.
    Friday, February 26, 2010 12:03 AM
    Moderator
  • It seems likely that something happened to your server, such as a power failure or a disk error, which caused the issue you're seeing. Can you please submit a bug report on Connect? Include logs from your server; they can be collected using the Windows Home Server toolkit. (You'll find links for the toolkit and its documentation here.) There have been several other reports of users who have had a very similar problem recently, and it would be useful if folks could submit bug reports when this sort of thing comes up, in case it indicates a problem in the Windows Home Server software...
    I'm not on the WHS team, I just post a lot. :)
    Friday, February 26, 2010 4:41 AM
    Moderator
  • Thank you for your response.  I ran chkdsk using the convenient batch file you linked me to.  I then followed the instructions for checking the application event log for chkdsk reports, filtering on Winlogon events, and everything came up clean.  I am now even more confident that Drive Extender has gotten itself into a strange state.

    If I do a Server Recovery, won't I have to re-add the disks to the pool, and in doing so won't I wipe out all the data that is currently stored on them?
    Saturday, February 27, 2010 3:10 PM
  • Correct, I did a hard restart of my machine when it lost its IP address last weekend.  I know that I have a lot of logs building up because the WHS has been complaining about disk space on C a lot, and it looks like the culprit is these DEUtil logs.  I will submit a bug.  Thanks for the suggestion.  Is there a way to follow up or track bugs you submit?

    In the meantime, what is the best way for me to get my machine running properly again?  Is Server Recovery my only option?  And if so, won't I lose data once DE asks me to re-add these disks to the pool?  I am very nervous about doing anything that might cause me to lose all of my data.  I guess what I would like to find is a Drive Extender maintenance utility that would just scan the Drive Extender configuration, realize that it is out of whack, and reintroduce the missing partitions into the pool.

    Looking at the log files, here's what keeps repeating in the DEUtil log:

    [2/26/2010 2:05:48 AM  558] OpenedFile::Open: Can't find VolumeInfo for {412638DB-D0AA-4D6B-8C00-A89020B1C138}
    [2/26/2010 2:05:48 AM  558] OpenedFile::Open: Can't find VolumeInfo for {412638DB-D0AA-4D6B-8C00-A89020B1C138}
    [2/26/2010 2:05:48 AM  558] ERROR WITH: D:\shares\Pictures\5.30.02\5.30.02 009.jpg because shadow 1 is in state Unknown
    [2/26/2010 2:05:48 AM  558] Tombstone D:\shares\Pictures\5.30.02\5.30.02 009.jpg has shadow on 412638db-d0aa-4d6b-8c00-a89020b1c138 which couldn't be found. Setting error to 15.
    [2/26/2010 2:05:48 AM  558] ErrorNeedsReporting is reporting fatal error 15 on file D:\shares\Pictures\5.30.02\5.30.02 009.jpg as an error
    [2/26/2010 2:05:48 AM  558] Info for D:\shares\Pictures\5.30.02\5.30.02 009.jpg
    [2/26/2010 2:05:48 AM  558]     State = Migrated  NumberOfShadows = 2
    [2/26/2010 2:05:48 AM  558]     Shadow(0)  Volume(865cb764-43cc-4b0e-b78e-c068606e8376) State(Healthy)
    [2/26/2010 2:05:48 AM  558]     Shadow(1)  Volume(412638db-d0aa-4d6b-8c00-a89020b1c138) State(Unknown)
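
    Since these logs get large, a small script can help see which volume the errors point at. The following is a minimal sketch that assumes the Shadow(n) Volume(guid) State(x) line format shown in the excerpt above; the function names and the regular expression are my own, not part of any WHS tool. It counts shadow entries per volume GUID and state, then lists the GUIDs that show up in any state other than Healthy.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... Shadow(1)  Volume(412638db-d0aa-4d6b-8c00-a89020b1c138) State(Unknown)
SHADOW_RE = re.compile(
    r"Shadow\((\d+)\)\s+Volume\(([0-9A-Fa-f-]+)\)\s+State\((\w+)\)"
)


def summarize_shadows(lines):
    """Count shadow entries per (volume GUID, state) pair."""
    counts = Counter()
    for line in lines:
        match = SHADOW_RE.search(line)
        if match:
            _index, volume, state = match.groups()
            counts[(volume.lower(), state)] += 1
    return counts


def volumes_not_healthy(counts):
    """Return the sorted GUIDs seen with any state other than Healthy."""
    return sorted({volume for (volume, state) in counts if state != "Healthy"})


# Two lines copied from the log excerpt above.
SAMPLE = [
    "[2/26/2010 2:05:48 AM  558]     Shadow(0)  Volume(865cb764-43cc-4b0e-b78e-c068606e8376) State(Healthy)",
    "[2/26/2010 2:05:48 AM  558]     Shadow(1)  Volume(412638db-d0aa-4d6b-8c00-a89020b1c138) State(Unknown)",
]

if __name__ == "__main__":
    counts = summarize_shadows(SAMPLE)
    for (volume, state), n in sorted(counts.items()):
        print(volume, state, n)
    print("Not healthy:", volumes_not_healthy(counts))
```

    To run it against a real log, pass the open file object to summarize_shadows instead of SAMPLE. If a single GUID accounts for every non-healthy shadow, that would be consistent with one volume having dropped out of the pool rather than with widespread file damage.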

    Now that I look through the log files, there's a lot of personal information in there I am not sure I am comfortable sharing with Microsoft.  But if you think this is a bug, would this be a free telephone support incident?

    Thanks
    Saturday, February 27, 2010 3:43 PM
  • Microsoft isn't going to give you any support, per se. The purpose of submitting a bug report is to give the Windows Home Server team enough information to research the problem and determine whether it's internal to the product (in which case they will hopefully be able to correct it in a future update) or external and the result of some other software or user action.

    As for the submission on Connect, you'll get email if/when Microsoft updates it.
    Saturday, February 27, 2010 4:18 PM
    Moderator
  • Thanks Ken.  So filing a bug may help improve future versions of WHS and DE, but won't help me resolve my immediate issue in the short term.  I have saved off the log files using the toolkit add-in so I will file a bug once I get my issue resolved. 

    As for resolving the issue, are there any tools that allow you to view and modify the DE configuration?  Or is the only way to fix this problem to do a Server Restore?  And if so, won't I lose data when I have to rejoin the "not added" disks to the pool?

    Thanks!
    Sunday, February 28, 2010 3:00 PM
  • File the bug now, if you're going to. If you resolve your issue without filing a bug, it's extremely likely that any bug you do submit will be closed, since there will be no opportunity for investigation beyond the server logs (which may not be sufficient). And I'm sure that the logs themselves will "expire" from Microsoft's servers at some point, though I don't know how long that might take.
    Sunday, February 28, 2010 3:13 PM
    Moderator
  • Ok Ken.  I filed a bug last Sunday and have gotten the generic response that they are looking at it.  I have also turned off my home server, since duplication is not working to protect my data.  How long should I wait to hear back on the bug before I can start accessing data on my home server again?  Also, if nothing comes of my bug submission, is a server restore my only option to get the disks to rejoin the pool?  And if so, won't I lose data when I have to rejoin the "not added" disks to the pool during or after the server restore?

    Thanks
    Saturday, March 06, 2010 4:35 PM