Better Recovery Options

  • General discussion

  • I have a home server that I built for the beta / RCs and moved forward into RTM with the System Builder SKU. Overall I love the product, and I especially like the automatic backups. My configuration was a P4 3.4 GHz hyperthreaded CPU with 2 GB RAM, 2 x 500 GB SATA drives and 2 x 250 GB SATA drives (1.5 TB total, only 400 GB used), with folder duplication turned on for all shares.


    The other day, my son came down and asked if the server was down, as he could not log on. (I have written an app using SQL Server Express as a backend that controls access to the power to a Wii and an Xbox. It won't let them play except during certain time windows and only if they have "hours" remaining in their bank. It has a simple service on the server and client software on their PCs.) I checked the console (with a REAL screen) and it was at a BSOD showing me "A hardware error has occurred, please consult your hardware vendor, the system has been halted". Great. So I rebooted it, and the BIOS screen told me that SATA drive 2 was missing. (SATA drive 2 does NOT have Home Server itself on it; it is just one of the data partitions. This is all AHCI and not set up as SATA RAID or anything - Just a Bunch of Drives mode.)


    So, I booted the server up and it showed the drive missing in the console. I also saw that the status of the network was critical (good) and that the backup database had trouble: "A possible database consistency problem has been detected in the backup database", with a "click here for details" link. That link takes you to a site trying to market Home Server. (You need to fix that - it should go to more help, not to a marketing site.)


    All backups would fail, as would all attempts to open existing backups. I would get these errors in the event log:


    Unexpected error 0x6 from CreateFile on D:\folders\{00008086-058D-4C89-AB57-A7F909A47AB4}\Index.4096.dat: The handle is invalid.


    Unexpected error 0x6 from CreateFile on D:\folders\{00008086-058D-4C89-AB57-A7F909A47AB4}\JERRYPC.D.FileRecordHash.4096.dat: The handle is invalid.


    The actual files it pointed at seemed to exist, but you could not copy them. The backup service would crash and restart whenever this happened.
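
    For anyone who wants to check which files in a folder exist but cannot actually be read, here is a minimal sketch in Python. It is not part of Home Server and is not how the backup service checks its database; `find_unreadable_files` is a hypothetical helper name, and the folder to scan is whatever you pass in. It simply tries to read every file in full and reports the ones that fail.

```python
import os
import sys


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Return (path, error) pairs for files under root that cannot be read in full.

    Hypothetical helper for illustration only; it opens files read-only
    and never modifies anything.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read the whole file in chunks so bad sectors or
                # invalid handles surface as an OSError.
                with open(path, "rb") as handle:
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # The folder to scan is given on the command line.
    for bad_path, error in find_unreadable_files(sys.argv[1]):
        print(f"{bad_path}: {error}")
```

    Run it with the folder to scan as the only argument. An empty result only means every file could be read from start to end; it says nothing about whether the contents are consistent.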


    I added a replacement drive (a 500 GB this time, so now it has 3 x 500 GB and 1 x 250 GB drives), added the drive to the storage list and removed the "missing" one. The wizard claimed at the end that "all backups" would be gone and that "some files" would be missing if I finished the removal. I find that odd, as folder duplication should deal with this. In fact, I have not so far found any files missing. As for the backups, they still showed as there, but would still not open. Each time I tried to open one, the backup service would crash with the same errors.


    I finally had to use the console to "delete all" backups. After doing this, the first 2 backups failed, but then they started to work. The first successful one is just about to finish now (it has been taking 35% of a 1 Gbps connection for 20 minutes and must be nearly done).


    I think that:


    1) This product should NOT corrupt the database when 1 drive out of 4 fails with folder duplication on, especially when it is the smallest physical drive and is not the one with the Windows partition on it.

    2) If the database does get corrupted somehow it should be able to be rebuilt automatically from log files. Possibly a message asking if you want to repair it - but the user shouldn't have to do any more than say, "yes, repair it". It might have to toss the last backup or something but that should be it.

    3) There should not be any situations where the home server has to have all of the backups deleted. Those of us using Home Server rely on those backups. Speaking as someone who has in the past had two hard drives fail in one day (a year ago), I can't afford to have my Home Server bork my backups due to a single drive failure. With 6 other machines in the house and Murphy's law, one of the other machines will either have a failure or have a software update or something (even spyware) *** them out the same day, and I will need to restore.



    Thursday, January 3, 2008 9:32 PM

All replies

  • That would be the same day I got hit with a virus that infected all our computers here in the WIRE workgroup.


    Friday, January 4, 2008 9:00 AM