Drive faulty - cannot remove it (or any other drive)??

  • Question

  • Hi

    I really hope someone can help?

    I've got a WHS (home build) that has been working very well for at least 6 months - well at least until Christmas?!

    My WHS has 3x 500 GB Samsung drives (one with the system software on) and a recently added 1 TB WD drive. All shares are duplicated.

    Just after Christmas I noticed that a Samsung drive had gone missing. After a quick reboot it was back and reporting as healthy. Unfortunately this seems to recur every three days or so. Because I have two identical Samsung data drives, I wanted to discover which one was faulty, so I swapped the SATA connectors on the motherboard and waited for the drive to go missing again. (Not sure if this has caused a problem?)

    So I've decided to remove the drive and replace it with a new WD.

    However, when I use the 'drive removal wizard', the WHS freezes up. I've left it for hours but no movement at all. So after 'Many' attempts I decided to try and remove the new WD drive - same result. It seems that I cannot remove ANY of the data drives?? No idea why.

    Any thoughts?

    Many thanks in advance

    Monday, February 9, 2009 1:23 PM

All replies

  • First things first:  are you now able (even after a reboot) to get WHS to at least temporarily see all of your drives?  Second, have you been able to determine which drive is failing?  Also, how much data do you currently have stored on it (both under Shared Folders and Duplication)?
    Monday, February 9, 2009 3:01 PM
    Moderator
  • Hi Kariya

    Thankfully all the drives seem fully functional. I can use the drives as normal, e.g. accessing data, running backups etc.

    I think I have found the faulty drive - it's got about 380 GB of data on it (500 GB drive). It's the drive with the most data on it. For that reason I felt it was right to let the 'removal wizard' take its time, but after 5 hrs it hadn't even registered a start. I then had to power-cycle the server. After the WHS came back up, everything seemed fine and all drives work. I've done this a number of times after the drive goes 'missing' every few days.

    I then decided to try removing another drive (just to see if it would work) and it didn't.

    Many thanks for your help.

    PS - I have backed up the shares to an external drive

    Monday, February 9, 2009 5:32 PM
  • Hi,
    please log in locally on the server and check the event logs in Control Panel/Administrative Tools/Event Viewer.
    This may give you more details about the situation. Look especially for file system and NTFS errors and for details related to the missing drive.
    Other than that:
    Did you enable any power-saving settings in power management that let the disks spin down? This may also cause issues with the WHS components detecting the drives in time.
    You could install the WHS Disk Management add-in. It shows some useful details (e.g. the serial number of each disk and its SMART status). With the serial numbers you can determine which disk has gone missing.

    Best greetings from Germany
    Olaf
    Monday, February 9, 2009 9:59 PM
    Moderator
  • I would try running chkdsk /r on each drive and see if it finds any errors.
    Tuesday, February 10, 2009 12:17 AM
    Moderator
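
For anyone wanting to script the "chkdsk /r on each drive" suggestion, here is a minimal Python sketch. It only assumes that `chkdsk <volume> /r` is the command to run; the volume list (`C:`, `D:`) and the helper names are hypothetical and need adjusting to the machine. It defaults to a dry run that just prints the commands. A real run needs an administrator prompt, and chkdsk may ask to schedule the check for the next restart if a volume is in use.

```python
import subprocess

def build_chkdsk_commands(volumes):
    """Return one 'chkdsk <volume> /r' argument list per volume."""
    return [["chkdsk", vol, "/r"] for vol in volumes]

def run_all(volumes, dry_run=True):
    """Print (dry run) or execute chkdsk for each volume.

    Returns a dict mapping volume -> exit code (None in a dry run).
    """
    results = {}
    for cmd in build_chkdsk_commands(volumes):
        volume = cmd[1]
        if dry_run:
            # Only show what would be run; nothing touches the disks.
            print(" ".join(cmd))
            results[volume] = None
        else:
            # Runs chkdsk and waits for it to finish.
            results[volume] = subprocess.run(cmd).returncode
    return results

# Hypothetical volume list; replace with the volumes on your own server.
VOLUMES = ["C:", "D:"]
results = run_all(VOLUMES)
```

Calling `run_all(VOLUMES, dry_run=False)` would actually start the checks, one volume after another.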
  • Hi Olaf

    Many thanks

    I've checked the event viewer on the server and there are over 60k events under the system section!!!!

    Most of the recent ones are:

    The driver detected a controller error on \Device\Harddisk3

    The device, \Device\Ide\IdePort3, did not respond within the timeout period.

    The driver detected a controller error on \Device\Harddisk0.

    A parity error was detected on \Device\Ide\IdePort1.

    An error was detected on device \Device\Harddisk3 during a paging operation.

    I have also just had a notification from the WHS console that 'There are file conflicts on one of the shares'


    Does this point to anything?

    Many thanks again

    Tuesday, February 10, 2009 2:13 PM
  • This indicates a failing disk drive, disk controller, or cable issue. The most likely is a failing drive.


    I'm not on the WHS team, I just post a lot. :)
    Tuesday, February 10, 2009 3:56 PM
    Moderator
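
With tens of thousands of events in the log, it can help to tally which devices the errors name. The Python sketch below does that for the five messages quoted in this thread. The regular expression and the function name `tally_devices` are illustrative assumptions, and a real log would have to be exported to text first.

```python
import re
from collections import Counter

# Messages copied from the post in this thread; a real log holds many more.
EVENTS = [
    r"The driver detected a controller error on \Device\Harddisk3",
    r"The device, \Device\Ide\IdePort3, did not respond within the timeout period.",
    r"The driver detected a controller error on \Device\Harddisk0.",
    r"A parity error was detected on \Device\Ide\IdePort1.",
    r"An error was detected on device \Device\Harddisk3 during a paging operation.",
]

# Matches \Device\HarddiskN and \Device\Ide\IdePortN and captures the last part.
DEVICE_RE = re.compile(r"\\Device\\(?:Ide\\)?(Harddisk\d+|IdePort\d+)")

def tally_devices(messages):
    """Count how many messages mention each disk or IDE port."""
    counts = Counter()
    for msg in messages:
        match = DEVICE_RE.search(msg)
        if match:
            counts[match.group(1)] += 1
    return counts

counts = tally_devices(EVENTS)
disks = sorted(name for name in counts if name.startswith("Harddisk"))
ports = sorted(name for name in counts if name.startswith("IdePort"))

for name, n in sorted(counts.items()):
    print(name, n)
```

If the tally shows errors against more than one disk or port, as it does for these five messages, a shared component such as the controller or cabling becomes a more plausible suspect than a single drive.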
  • Thanks Ken

    I have swapped all the SATA cables so I'm sure it's not that. However:

    If it's a failing drive, how do I go about replacing it if I can't use the 'remove drive wizard'? As I can't use the wizard on any of the drives within the pool, is it likely to be the controller?

    If it's the disk controller, how do I correct this?

    Cheers
    Tuesday, February 10, 2009 4:08 PM
  • If it's the disk controller you replace it, possibly by replacing the motherboard. And I just re-read the post I quoted; since you mention errors on multiple disks, I would be more suspicious of the controller than a single drive.

    As for removing the drive, if it were a single drive you could shut your server down, physically remove the drive, and then use the console to remove the drive from the storage pool. That will succeed because the drive is "missing" at that point, instead of generating errors at a furious rate. You will potentially lose your backups, and files in shares that don't have duplication turned on. You should simply abandon the backup database if it's affected, but you can try to use standard data recovery techniques for other files.

    And if you haven't yet, you should run chkdsk (as recommended above) on all drives. There is further information on how to do this in a post in the FAQ section.

    Tuesday, February 10, 2009 4:58 PM
    Moderator
  • Thanks Ken

    The more I read into this the more it sounds like the controllers are at fault?

    I've read much on the forum about changing out motherboards (including the re-activation issue). I happen to have an almost identical MB, so would you see any complications if I were to replace the MB with the 'almost identical' one, using all the existing CPU, memory etc.?

    Do I just replace and power up as normal or do I need to do anything else? I will of course Back up shares before I start. (just in case)!

    If I were to do this wouldn't the issues of the disk controller come back?

    Cheers

    Tuesday, February 10, 2009 5:59 PM
  • The only way (short of reinstallation) to be certain you won't have any driver problems with a motherboard replacement is to obtain the exact same type of motherboard, update to the exact same BIOS version, and configure the BIOS identically. If you can't get the exact same model, the more similar the boards are the better. Ideally they will use the same processor, memory, etc, have the same chipset (and peripheral chips, like network interface, etc.), and be from the same manufacturer and product line. 

    Sorting out possible issues is, umm, tough. You might be better off if you A) back up your shares, and then B) move the drives to the new MB and do a server reinstallation, on the assumption that the controller is what's bad.

    But you should still run chkdsk on all the drives. And if you get errors, shut the server down and temporarily connect the drives to some other computer to run chkdsk. A lack of errors on another computer is another argument in favor of the controller.


    Tuesday, February 10, 2009 7:00 PM
    Moderator
  • Many thanks Ken

    Will do and will report back (may take a week or so)

    Cheers again
    Tuesday, February 10, 2009 7:06 PM
  • Hi,
    replacing the mainboard will often work, especially if the chipset is nearly identical, although it's unsupported.
    Take care that the system disk is attached to the first controller port and selected as the boot drive in the BIOS. Also, the mode of the SATA controller (if one is used) should be set to the same mode as on the old board (e.g. IDE compatible).
    Product activation may also kick in after such an operation.
    If this does not work, you can try changing the BIOS settings (especially the SATA controller mode again), and if this also fails, the option is a server reinstall after booting from the Windows Home Server DVD.
    Should this also fail (no server reinstall is offered), a new installation is your last option. To keep your data in this case, please check this FAQ.
    Best greetings from Germany
    Olaf
    Tuesday, February 10, 2009 7:08 PM
    Moderator