WHS - Failing Disk Advice

  • Question

  • Hi, I hope you can help me with some advice on the best next steps to take.

    I am running an HP MediaSmart EX470 WHS with 4 disks in total, the original 500GB, one 1TB drive and two 1.5TB drives.

    In a nutshell, I believe that one of the 1.5TB drives is failing and my absolute number one priority is to ensure my data (mainly photos and videos) is safe.  Duplication is turned on for all the shares that I am concerned about.

    I started to see file conflicts appearing and noticed that copying some files was slow, and then one drive went offline, i.e. it no longer appeared as a drive.

    I restarted the server, reseated the drive and started doing some investigation.  I looked at the logs and noticed some NTFS errors, so I started thinking I needed to replace the drive ASAP.

    So I started the 'Remove Drive' process, which reported that I had enough space, but it hung, even after I left it for 24 hours.  I looked at the logs again and saw NTFS errors (and some others), which I will post up tonight.

    After doing some reading I thought I would run chkdsk /r on the drive, but this also hung (I checked the amount of data I/O via Task Manager, and it wasn't increasing after 12 hours), so I restarted the server again.

    I have already purchased a replacement drive; I just want to check what the best next steps are.  I've already written off the old drive, and I just want to ensure my data is safe.  So my questions are:

    1. Are there any other steps I should try to save / fix the existing drive?

    2. If the answer to 1 is no, what is the approach to replacing this drive with the new one so that no data is lost?  Do I just physically remove the bad drive and slot in the new one (on the basis that the 'Remove Drive' hasn't worked)?  Will WHS handle this, accept the new drive and re-duplicate as appropriate?  I've also accepted that, as part of the process, it's highly likely I will lose my backups, which again I can live with as long as my data is preserved.

    Interestingly, I've read some posts that indicate that cables might be an issue.  I have a spare EX470 chassis with a faulty backplane, so I have access to some spares if needed.

    All advice welcomed.

    Thanks,

    Martin

    Friday, April 15, 2011 11:26 AM

All replies

  • When I had a failing disk a few months back, I found that the removal bombed just as it was about finished, because there were some corrupt files (where the bad blocks were).

     

    If you RDP into your WHS and use Explorer to view the individual drives under C:\FS\(drive letter)\DE\, can you figure out which one of them is the bad drive?  You should be able to work it out, I guess, from the fact that it's one of your 1.5TB drives and probably has only a few files left on it.  Any files that remain are probably corrupt, and you will probably lose them unless you have a backup.  I'm wondering whether, if you delete the remaining files and then try to remove the drive again from the console, it will then un-mount.  (Before deleting those files, it would probably be a good idea to see whether you can get a good copy of each file from your network, or directly from wherever that file is duplicated in the C:\FS\(drive letter)\DE\ hierarchy.)
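
    The check described above (which files are still on the suspect drive, and do they exist anywhere else?) could be sketched roughly as below.  This is only an illustration, not a WHS tool: it assumes Python is available on a machine that can see the DE folders, the function name find_duplicates and the drive letters E, F and G are made up for the example, and it only tests whether a file with the same relative path exists on another drive.  It deliberately does not read file contents, since reading from a failing disk may hang.

```python
import os
from pathlib import Path


def find_duplicates(suspect_de, other_de_roots):
    """For every file under suspect_de, list same-path copies under other roots.

    Returns a dict mapping each file's path relative to suspect_de to a list
    of Paths (possibly empty) where a file with that relative path exists.
    Only existence is checked; contents are never read or compared.
    """
    suspect_de = Path(suspect_de)
    report = {}
    for dirpath, _dirnames, filenames in os.walk(suspect_de):
        for name in filenames:
            rel = (Path(dirpath) / name).relative_to(suspect_de)
            copies = []
            for root in other_de_roots:
                candidate = Path(root) / rel
                if candidate.is_file():
                    copies.append(candidate)
            report[rel] = copies
    return report


if __name__ == "__main__":
    # Hypothetical drive letters: substitute the ones you see under C:\FS.
    report = find_duplicates(r"C:\FS\E\DE", [r"C:\FS\F\DE", r"C:\FS\G\DE"])
    for rel, copies in sorted(report.items()):
        status = "duplicated" if copies else "NO OTHER COPY"
        print(f"{status}: {rel}")
```

    Anything reported as "NO OTHER COPY" is a file you would want to try to copy off, or restore from elsewhere, before deleting it from the suspect drive.  A matching path on another drive does not prove that copy is intact, so it is still worth opening a few of the files to check.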

     

    I don't have any experience with this, but I think I've read in some other forum posts that, as a last resort if the drive will not permit removal from the console, you should be able to shut down and physically remove the drive.  When the server reboots it will complain about a missing disk; at that point you should be able to select Remove, and the disk should be removed from the console.  I'm not 100% sure, though, so maybe someone else here can provide some insight on that procedure.

     

     

    Friday, April 15, 2011 7:59 PM