locked
Data recovery story for drive failure

  • Question

  • Hi all,

    A lot of great questions are coming up regarding the new Drive Extender behavior in http://social.microsoft.com/Forums/en-US/whsvailbeta/thread/32844aae-9f41-41cb-8a4a-f6c26ddfdd6f 

    I'd like to ask the DE team about the recovery story in the scenario where one or more drives fail in a disk set.

    Suppose there are two hard drives in a disk set and all the files are under 1GB each, so no striping is in operation. Folder duplication is enabled for the folders that contain half the files. The other half have no duplication.

    If a drive has a total failure leaving me with a single disk in the disk set, how good is the recovery story for this?

    I'd like to approach it from the angle of a Vail system that still boots (core OS on a third, non-pool drive): will ALL of the non-striped, sub-1GB files contained on the drive be recoverable?

    Whereas in V1 of the product I could connect the drive to any Windows machine and start recovering files, what's the story for connecting the drive to a friend's Vail box? Is it dependent on a database being present? Can my friend's Vail easily allow the disk into a new temporary disk set to recover all the files? Will I be able to ensure it doesn't get added to his storage pool, and that the demigrator doesn't write to the disk?

     

    I'd really appreciate any insight into this.

    Thanks!

    Monday, June 7, 2010 1:23 AM

Answers

  • First, I highly recommend you read the release notes, Getting Started Guide, and stickies at the top of this forum, as some of your questions are answered there. You will probably find yourself with other questions that are also answered in the documentation. Next, I also highly recommend you test the scenarios that are important to you, and submit bug reports where they don't meet your needs. Remember that this is still a beta product; there are bugs known and unknown, and the main reason for this type of beta test is to stress the product in ways that Microsoft and private beta groups didn't think to, so as to find as many of those bugs as possible. Much like voting, if you don't participate in the process your complaints about the results will fall on rather deaf ears. :)

    Now to your questions:

    Disk failures aren't documented, and probably won't be, by Microsoft other than in the most general terms. My experience with simulated disk failure (i.e. pulling the cable on a running disk) was that I lost approximately 50% of my files when I disconnected a single disk from a two-disk set with no duplication. (With duplication, I lost nothing.) Another user has conducted a similar test and says he lost everything. I don't have any idea why his results differed from mine, so I think this needs more user testing. In any case, Microsoft's design really requires the use of duplication for any data protection. Don't turn duplication off unless you don't mind losing everything in a particular share. This remains unchanged from V1, where Drive Extender tends (through its storage use algorithm) to cluster files that are in the same location in the folder tree on the same disk(s).

    Accessing your storage post-failure is addressed in the documentation. You can connect a server storage drive, or drive set, to another server and it will appear as "non default storage". The "non default storage" will also be mounted to a folder on the server's system drive. As long as a drive is part of that "non default storage" you won't be able to add it to the default storage pool. Files in "non default storage" appear (I didn't examine every one) to be available and accessible; obviously if one or more "blocks" of a large file are on another (missing) disk you won't be able to access that file but (if I recall correctly) it won't be visible unless all "blocks" with pieces of the file are present.


    I'm not on the WHS team, I just post a lot. :)
    Monday, June 7, 2010 3:52 AM
    Moderator

All replies

  • This weekend a user came in with their HP home server, which they had owned for about 2 years. They picked it up just as Home Server came out, as one of the first early adopters. Anyway, a cat peed in it (don't ask), killing the power supply and board. Fine. We set up a new box, re-installed Home Server, and managed to recover about 85% of the data (everything except what was on that first drive).

    Totally different hardware, OS not bootable anymore, etc.

    That really made me start thinking about what Vail's real disaster recovery methods are. From everything I've read, seen, and played with, I'm not sure recovery would be possible at all in a situation like this, where a hardware change and a re-install come into play.

    • Proposed as answer by joshstix Monday, June 7, 2010 7:10 AM
    • Unproposed as answer by Ken Warren (Moderator) Monday, June 7, 2010 10:01 AM
    Monday, June 7, 2010 5:34 AM
  • You'd do a new install of Vail and remove the drive that was created during install from the storage pool, then connect the old drives and make them the active storage pool. In theory this is even easier than WHS1, because you wouldn't need to move the data between drives and format drives, as long as you could add the additional system drive to the machine.
    Monday, June 7, 2010 7:12 AM