How Long to remove HDD from Storage Pool

  • Question

  • I have started removing a WD 1.5TB HDD at about 60% and was wondering how long it will take to offload about 700 GB of data to the remaining storage drives. I have been getting an error message that the drive has 1 bad block, which I believe is causing my backups to fail partway through with a cyclic redundancy error while reading the backup file. Yesterday I ran chkdsk /f on volume d: and it eventually finished at least 8 hours later (I have no idea how long it actually took because I went to bed), but I still cannot complete any backups; they always fail with a disk error. I am concerned that the removal process will not be able to complete.
    Wednesday, May 12, 2010 12:41 AM
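
For a rough sense of scale, the time to move about 700 GB depends mostly on sustained throughput, which a drive with read errors may not maintain. A minimal sketch of the arithmetic, assuming decimal units (1 GB = 1000 MB) and purely illustrative throughput figures that are not measurements from this server:

```python
def estimate_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Estimate hours needed to move data_gb at a sustained throughput_mb_s.

    Assumes decimal units (1 GB = 1000 MB) and a constant transfer rate.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 1000 / throughput_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # The rates below are assumptions for illustration only.
    for rate in (5, 20, 50):
        print(f"{rate:>3} MB/s -> {estimate_hours(700, rate):.1f} h")
```

Retries on bad sectors can push the effective rate well below any of these figures, so the result is best read as a lower bound.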

Answers

  • I ran chkdsk /f yesterday, and the server also rebooted once on its own to run another chkdsk. It took a very long time to attempt to fix the HDD, over 8 hours, which seems like a waste of time to me. The backups have been failing for a week now. I have even re-installed WHS and tried many times to repair the backups, but the repair simply won't complete: it says I have lost connection to the server and then closes the remote console after about 5 minutes. I have deleted all the backup files in d: /folders/{00008086-058D-4C89-AB57-A7F909A47AB4} based on a recommendation here. Automatic and manual backups all eventually fail, and each time I get the same errors in the event log. They are IO errors from the backup file, and the system event log shows disk error event code 7 and says that Harddisk 1 has a bad block. It repeats this error code many times in the log.

    I have over 21TB in the storage pool with close to 8TB free. I think at this point I will let the removal process finish and then see if I can do my backups. If they still fail I will run checkall.cmd to see if I have other HDD problems or something else. If the removal doesn't complete I will do it the way Tcalp suggests.

    • Marked as answer by HTPCat Friday, May 14, 2010 1:57 PM
    Wednesday, May 12, 2010 3:35 AM

All replies

  • I gave up on trying to remove the drive the normal way. I pulled the drive I wanted to remove from the machine and removed it from the pool while it was in a 'missing' state (this takes 2-3 minutes to complete). Then I hooked up the drive over USB and manually copied the data back.

    I cannot guarantee this will work 100% in all situations, but if you're just doing 'basic' data storage it *should* be fine.

    You also get the added plus of your storage array actually being accessible during the file copy.


    Tcalp
    Wednesday, May 12, 2010 1:23 AM
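
The manual copy step described above can be sketched as a script that keeps going when a file cannot be read, which matters on a drive reporting bad blocks. This is only an illustration: the function name and the example paths are hypothetical, and where the data actually lives on the pulled drive is not stated in the post.

```python
import shutil
from pathlib import Path


def copy_tree_skip_errors(src: Path, dst: Path) -> list:
    """Copy every file under src into dst, preserving relative paths.

    Files that raise an OS-level error (for example a CRC/read error) are
    skipped and returned as (path, error message) pairs for later review.
    """
    failed = []
    for path in src.rglob("*"):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copies data and timestamps
        except OSError as exc:
            failed.append((path, str(exc)))
    return failed


# Hypothetical usage; both paths are placeholders, not real WHS locations:
# failures = copy_tree_skip_errors(Path("E:/"), Path("//SERVER/Recovered"))
# for path, reason in failures:
#     print(f"skipped {path}: {reason}")
```

Collecting failures instead of stopping at the first one leaves a list of exactly which files sit on damaged sectors.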
  • I suggest you cancel the disk removal process, then run chkdsk on all drives / partitions on your server as described in the FAQ post How to check all the drives in your server for errors.

    When that has finished, try running a backup again before you attempt to remove any disks. If backups still fail, try to Repair your backup database from the WHS console (settings, computers and backup). Please note that this may result in the loss of all your backups if the backup database cannot be repaired.

    Wednesday, May 12, 2010 2:36 AM
    Moderator
  • You ran chkdsk on the D partition, but the error could very well reside on another disk. If you originally built your WHS with more than one disk, the actual backup database files will be stored on another disk, not on the D partition. The files you see in d: /folders/{00008086-058D-4C89-AB57-A7F909A47AB4} are not real files; they are tombstones (reparse points) pointing to the actual files on another disk.
    Wednesday, May 12, 2010 7:10 AM
    Moderator
  • In my experience, when one disk drops into PIO mode, everything in a WHS slows down to a crawl with all sorts of strange behaviour. Have you checked for this? See http://support.microsoft.com/kb/817472

    Wednesday, May 12, 2010 2:36 PM
  • So once I run checkall.cmd, where do I look in the event viewer to find out if there were errors?

    How would I know which disk is actually the problem?

    Since I am getting CRC errors over and over, I would assume that I do have a bad disk, right?

    I went into Device Manager and looked at the IDE ATA controllers, which showed 2 primary channels and 2 secondary channels. I then looked at the Advanced tab for each of these, and they all show a DMA transfer mode and not PIO. Is there anywhere else I should check?

    Wednesday, May 12, 2010 5:39 PM
  • Please read the FAQ brubber linked; it tells you where chkdsk reports are to be found.

    Frequent CRC errors, particularly if they're reported for more than one disk, are more suggestive of hardware issues other than the disk drives.


    I'm not on the WHS team, I just post a lot. :)
    Wednesday, May 12, 2010 5:44 PM
    Moderator
  • When I view the system event log, all the errors are reported as Harddisk 1 has a bad block. How do I know which disk this is referring to? I have already removed disk 1 as reported by Disk Management, which I assume isn't the disk the errors refer to, since I am still getting them.

    Wednesday, May 12, 2010 6:25 PM
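
One way to see whether every bad-block report really names the same disk is to tally the disk number across exported event log text. A minimal sketch, assuming the messages look roughly like "The device, \Device\Harddisk1\D, has a bad block." or "Harddisk 1 has a bad block"; the exact wording in a given log may differ, and the sample lines below are made up for illustration:

```python
import re
from collections import Counter

# Assumed message shape: "Harddisk" followed by a disk number, then "bad block".
BAD_BLOCK = re.compile(r"Harddisk\s*(\d+)\b.*?bad block", re.IGNORECASE)


def count_bad_blocks(lines) -> Counter:
    """Count bad-block reports per disk number found in the given lines."""
    counts = Counter()
    for line in lines:
        match = BAD_BLOCK.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample lines, not taken from a real log.
    sample = [
        r"The device, \Device\Harddisk1\D, has a bad block.",
        r"The device, \Device\Harddisk1\D, has a bad block.",
        "The driver detected a controller error.",
    ]
    for disk, hits in count_bad_blocks(sample).items():
        print(f"Harddisk{disk}: {hits} bad-block report(s)")
```

The number in the message is the index the operating system assigned to the disk, so it is worth comparing against the current Disk Management numbering after any drive has been added or removed.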