Why would WHS log me off during backup database repair?

Question
-
A number of file conflicts are being reported with the "data error (cyclic redundancy check)" error. The backup database also has errors.
I am trying to repair the database, but about 15% of the way through the repair, WHS logs me out of the console. When I log back in, the backup process has stopped running, and it appears that the database repair tool has also stopped running.
How do I get around this?
Thursday, November 5, 2009 8:23 PM
All replies
-
It sounds like the console is crashing during the repair. Probably this is a result of the CRC errors, which usually indicate a failing hard disk. Please run chkdsk on all the drives in your server, examine the reports (in the event logs on the server) and let us know what you find.
I'm not on the WHS team, I just post a lot. :)
Thursday, November 5, 2009 8:33 PM Moderator
-
I'm currently running chkdsk against the HDD that I believe to be bad (a couple of months ago I ran into 20KB of bad sectors on that one). Once I get past this problem, I am going to identify that drive and yank it out of the WHS. Using the proper methods, of course.
Thursday, November 5, 2009 8:40 PM
-
I think it took somewhere around 12 hours to complete the chkdsk this time around. One drive alone took six hours.
I last ran into this problem in September, when [c:\fs\1a] had 20KB of bad sectors. The other three HDDs were fine. This time, [c:\fs\1f] and [c:\fs\18] had 416KB and 20KB in bad sectors, respectively, while [c:\fs\1a] showed no problems. [c:\fs\16] and the system drive also showed no problems.
Having done that, I tried to repair my backup database and I am still being logged out, although this time it got about halfway through. I still have file conflicts, but this time I get two new errors, along with the same old one:
"Access is denied"
"Data error (cyclic redundancy check)"
"The group or resource is not in the correct state to perform the requested operation"
It appears that I am not closer to fixing this problem. What am I supposed to do now?
Friday, November 6, 2009 4:40 PM
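For anyone comparing chkdsk results across several pool drives as in the post above, here is a minimal Python sketch that pulls the "KB in bad sectors" figure out of saved chkdsk report text and lists the affected mount points, worst first. It assumes you have already copied each report (for example, from the server's event log) into a string. The function names and the sample report text are hypothetical; only the bad-sector figures are taken from this thread.

```python
import re

# chkdsk's summary contains a line such as "     20 KB in bad sectors."
# Allow thousands separators in case a localized report includes them.
BAD_SECTORS = re.compile(r"([\d,]+)\s+KB in bad sectors", re.IGNORECASE)


def bad_sector_kb(report_text):
    """Return the KB in bad sectors found in one chkdsk report, or None if the line is absent."""
    match = BAD_SECTORS.search(report_text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def drives_with_bad_sectors(reports):
    """Given {mount_point: report_text}, return mount points with bad sectors, worst first."""
    flagged = {}
    for mount_point, text in reports.items():
        kb = bad_sector_kb(text)
        if kb:  # skips both None (no summary line) and 0 KB
            flagged[mount_point] = kb
    return sorted(flagged, key=flagged.get, reverse=True)


# Hypothetical report excerpts using the figures quoted in this thread.
sample_reports = {
    r"c:\fs\1f": "       416 KB in bad sectors.",
    r"c:\fs\18": "        20 KB in bad sectors.",
    r"c:\fs\16": "         0 KB in bad sectors.",
}
print(drives_with_bad_sectors(sample_reports))
```

Any drive this flags is a candidate for replacement rather than repair; the script only summarizes what chkdsk already reported and does not test the drive itself.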
-
Actually you are closer. It's just that the news is not good. You appear to have one or more failing drives, or failing drive controllers. I.e. it's a hardware issue. You can try locating a drive test utility supplied by the drive maker and running that, to see if you can get any additional information about the drive(s).
As for how to recover from the situation, you can start with this FAQ posting. It describes the usual causes of file conflict type errors, and also indicates a likely hardware issue. Also look at this FAQ; since you seem to have more than one failing drive I'm more concerned that you be able to recover as much of your data as possible, rather than worrying about your server hardware. I don't think your backup database is going to be recoverable...
Note: if chkdsk reports bad sectors, the only reasonable thing to do is replace the drive. Don't rely on chkdsk (or anything) to repair a drive that's demonstrated it's experiencing errors that are spreading. It's normal for a drive to have some bad sectors from the factory, but those are mapped out in advance. And it's normal for additional errors to occur as a drive ages, but the drive normally recovers from those on its own. When you start seeing bad sectors reported through chkdsk, the drive is not (for whatever reason) recovering from errors any more.
As for why you may have failing drives: power line fluctuations can cause drive failure. A motherboard can experience a failure of the disk controller(s). Your power supply may be overloaded. You may have bad cabling (power or data). Et cetera. You could even have bad RAM.
I'm not on the WHS team, I just post a lot. :)
- Marked as answer by Will Diaz Friday, November 13, 2009 7:41 PM
Friday, November 6, 2009 5:09 PM Moderator
-
I apologize for the delay in my response. I've been traveling today.
While this news isn't the best, I can deal with it. All three Western Digital HDDs that I purchased last year have shown some sort of problem since I put them into the WHS, so I will need to replace each of them. I will also need to invest in a UPS, since my house does experience infrequent power outages. I understand that using a UPS is not a supported scenario, but I don't really see what other option I have right now.
I'm used to losing my backup database; every time I run into an issue with my WHS, it seems to be the first casualty.
I won't be back home for a few days, so I'll get started on your suggestions then. I will report back with any good or bad news as it comes. As always, I really appreciate your help. I've become so dependent on my WHS that I can't have it running at anything below its best.
Saturday, November 7, 2009 9:06 AM
-
I have also run into this same issue. Are you by any chance running PP3 beta or RC? If not, what version of WHS are you running?
Duane Thomas
Saturday, November 7, 2009 11:12 AM
-
I am not running the RC or the PP3 beta. I'm still on PP2.
I'm slowly working through the file conflicts, and I am stumped on the ones that show the 'Access is denied' error message. I have not found any documentation or forum posts that cover that particular error. I assumed that the file is open, but after shutting down all of my PCs and restarting the server, they still come up as 'Access is denied.'
What should I do about those files?
Tuesday, November 10, 2009 7:59 PM
-
It appears as if one of my hard drives is damaged beyond repair; the Western Digital diagnostic tool can't repair it.
At this point, I no longer care about salvaging the data. I just want to remove the hard drive from the console in the proper manner, but whenever I attempt to do that, the removal fails due to file conflicts.
How can I remove this hard drive from the server?
Wednesday, November 11, 2009 10:24 PM
-
If nothing else, you should be able to power down your server, disconnect the failed hard drive, then power up the server again. At that point, WHS will say a drive is missing (obviously :) ). Just remove the missing drive through the Console and it should clear itself.
- Marked as answer by Will Diaz Friday, November 13, 2009 7:41 PM
Thursday, November 12, 2009 2:21 AM Moderator
-
I was just getting ready to do that (it took a while to find a blog entry about what I was asking about). Since I last posted, I made serious headway in fixing this problem.
While Western Digital's diagnostic tool was unable to fix all of the bad sectors, it fixed enough of them to allow me to remove the bad hard drive (the tool wiped out whatever data was in those bad sectors though). After removing the bad drive, I was able to repair the backup database without losing all of my backups (I think my wife's netbook backups were all lost though).
The blue light is back on my server! I do have one last problem though. The status of three folders is reported as 'Failing.' They have a few file conflicts, all of which state "The system cannot find the drive specified." After running chkdsk again, I will yank each drive out, one at a time, and copy the files to another computer, as suggested in this thread.
Thursday, November 12, 2009 4:36 AM
-
I am happy to report that my server is back to normal. Here's a quick summary (cutting out many gory details):
- Ran chkdsk on all drives and identified the bad drive (the one mounted at c:\fs\1F).
- chkdsk was unable to repair the drive enough to remove it from the server, so I plugged the drive into another PC and ran a diagnostic.
- The diagnostic was unable to fully repair the drive, but it took care of enough bad sectors to allow the drive to be removed from the server.
- I ran the database repair tool, salvaging some of my backups.
- I connected to the server over Remote Desktop to track down the files marked with the "The system cannot find the drive specified" file conflict in each of the c:\fs\[drive name]\DE\shares folders, then cut them and pasted them to the desktop (I could not do this from any of the clients due to an "Invalid file handle" error).
- Once the server was rebooted and everything reported as healthy, I put those cut files back where they were before.
This multi-day battle with the WHS has taught me a few things:
- Enable folder duplication on each folder. If I had not done this, losing that drive would have been much worse.
- Do not use external hard drives as part of the drive pool. Of my server's six drives, the only two to fail in such a spectacular manner were my externals.
- Back up your shares. I did lose a few files, which could have been restored if I had bothered to run that backup more frequently.
- When you see bad sectors the first time, consider replacing the drive immediately. I knew this drive was going bad months ago, but I ignored the warning signs.
As always, thank you Ken and kariya21. You helped me get this server back to 100%.
- Marked as answer by Will Diaz Friday, November 13, 2009 7:41 PM
Friday, November 13, 2009 7:41 PM
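The cut-and-paste step in the summary above (move the conflicted files out of a drive's shares folder, reboot, then put them back) could also be scripted. The following is only a rough Python sketch of that idea, not what was actually run in this thread: the function names, the `manifest.json` file, and every path passed in are hypothetical, and it assumes you already have the list of conflicted files for one drive and run it on the server itself.

```python
import json
import shutil
from pathlib import Path


def stage_files(files, shares_root, staging_dir):
    """Move each file out of shares_root into staging_dir, keeping its relative path.

    Writes a manifest so the files can later be put back exactly where they were.
    """
    shares_root, staging_dir = Path(shares_root), Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for src in map(Path, files):
        dest = staging_dir / src.relative_to(shares_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        manifest[str(dest)] = str(src)
    (staging_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest


def restore_files(staging_dir):
    """Move every staged file back to the original location recorded in the manifest."""
    manifest = json.loads((Path(staging_dir) / "manifest.json").read_text())
    for staged, original in manifest.items():
        Path(original).parent.mkdir(parents=True, exist_ok=True)
        shutil.move(staged, original)
    return list(manifest.values())
```

Use a separate staging folder per drive, since two drives can hold files with the same relative path. Call `stage_files` before the reboot and `restore_files` once the console reports everything as healthy.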