Mixing Duplication On/Off: Bad Idea?

  • Question

  • When space gets tight, I am tempted to set some of my shares to "Duplication=No" because I have hardcopy backup for what is in them.

    e.g. I have about 300 DVDs in a closet that also have been ripped to the WHS pool.   These are all in their own share called "DVDs_Owned".   They are ripped to .VOB format.   i.e. for each DVD there is a parent directory named for the DVD and subdirectories containing .VOB and other types of files.

     

    The Question: Is turning off duplication for only this share going to create unforeseen problems when recovering from a drive failure?

    For somebody who knows nothing, it seems like as long as I have a printed list of what should be in the share, all I need to do post-crash is to check the share's list against the printed list and re-rip as needed.

    Or can directories - or even files - span multiple drives and I'll never know what survived and what did not until I try to play each DVD to the end?
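
    Whatever the answer on how files are placed across drives, the "check the share against a list" idea can be made finer-grained than a printed list of titles. The following is only a minimal sketch, assuming the share can be read as an ordinary directory path; the function names (`build_manifest`, `compare_manifests`, `titles_to_rerip`) are hypothetical and not part of WHS. It records every file's relative path and size, so a later comparison shows which individual files are gone rather than just which top-level directories.

```python
from pathlib import Path


def build_manifest(share_root):
    """Map each file's path (relative to share_root) to its size in bytes."""
    root = Path(share_root)
    manifest = {}
    for path in root.rglob("*"):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = path.stat().st_size
    return manifest


def compare_manifests(before, after):
    """Return (missing, changed): files absent from `after`, and files whose size differs."""
    missing = sorted(p for p in before if p not in after)
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    return missing, changed


def titles_to_rerip(before, after):
    """Top-level directory names (one per DVD) with any missing or changed file."""
    missing, changed = compare_manifests(before, after)
    return sorted({p.split("/", 1)[0] for p in missing + changed})
```

    The manifest would have to be saved somewhere off the server (for example as a JSON file on another machine) while the share is healthy, then rebuilt and compared after a failure. A size comparison only catches missing or truncated files; detecting silent corruption would need checksums, which take much longer to compute over a few terabytes.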

    Sunday, November 21, 2010 6:45 PM

All replies

  • Duplication shouldn't be thought of as protecting your data. It does that (to an extent), and Microsoft promotes the feature as protection, but mostly it provides high availability. For protection, you should back your data up and take it off-site on a regular basis. I use duplication on all shares, for high availability, but I also take backups off site weekly.
    I'm not on the WHS team, I just post a lot. :)
    Sunday, November 21, 2010 7:09 PM
    Moderator
  • ...you should back your data up and take it off-site on a regular basis. I use duplication on all shares, for high availability, but I also take backups off site weekly.

    I've got a second server to which I manually mirror every so often - and for low-volume data I do scheduled incrementals to removable drives.

    But I don't do anything except the mirror thing for high-volume files - specifically ripped DVDs and recorded TV programs.

    What media would you use for, say, 5 terabytes offsite?

     

    It's starting to sound like a significant post-crash problem is determining what does and does not need restoring.

    Sunday, November 21, 2010 7:32 PM
  • Ken,
     
    Could you expand on the following:
     
    "I use duplication on all shares, for high availability..."
     
    What exactly is meant by high availability?  Does duplication make it easier or faster to access files? 

    --
    _________________
     
    BullDawg
    In God We Trust
    _________________

    BullDawg
    Sunday, November 21, 2010 8:43 PM
  • A good definition of "high availability" can be found on Wikipedia. I don't mean everything you'll find there.

    In this case, all I mean is that I will not lose access to any of the data in my shares for an extended period even if a drive fails in my server. In that eventuality, I would shut the server down, remove the drive, start back up, and remove it from server storage. This is the rough equivalent of rebuilding a RAID array. Since I personally keep quite a bit of space free on my server I wouldn't even need to add a drive in the short term. If you don't have the equivalent of a free disk, you might have to turn duplication off for some shares in the same situation, until you could replace the drive.

    Without duplication, if you lose a drive, you are guaranteed to lose files. Then you've got to retrieve them from a backup, or recreate them. Either way, you have significant "down time" until all the files are available again.

    Even with duplication, however, you could still lose files to force majeure events, which is why off-site backups are a critical component of a complete disaster recovery plan.


    I'm not on the WHS team, I just post a lot. :)
    Sunday, November 21, 2010 9:29 PM
    Moderator
  • What media would you use for, say, 5 terabytes offsite?

    It's starting to sound like a significant post-crash problem is determining what does and does not need restoring.

    Hard disks. 3 external disks will hold that volume of data. Best practice is to rotate disks on and off-site regularly (I do so weekly) so you would need two sets. There are other ways to deal with the question: archiving everything and supplementing that archive from time to time might work out better, but you should still have two copies of the archive just in case. Yes, this represents a lot of money. No, there's no cheaper alternative that will "deliver the goods" that I'm aware of. Tape sounds like a good idea (and tapes are smaller than disk drives, a factor for large data sets), until you price tape drives. Cloud backup sounds good too, until you price the storage and figure out how long restoring 5 TB will take over your internet connection. (Even if you have the equivalent of Fast Ethernet you'll need the better part of a week to restore: about 4.6 days at the full 100 Mbit/s line rate, and longer in practice. Will your ISP let you suck down that much bandwidth, or will they cap or terminate your connection?)
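
    The restore-time estimate for a 5 TB download can be checked with simple arithmetic. This is a rough sketch, not a measurement: `transfer_days` is a hypothetical helper, 5 TB is taken as 5 × 10^12 bytes, Fast Ethernet as 100 Mbit/s, and the 0.7 efficiency figure is purely an assumed allowance for protocol overhead and a connection that does not stay saturated.

```python
def transfer_days(size_bytes, link_bits_per_second, efficiency=1.0):
    """Days needed to move size_bytes over a link running at the given efficiency."""
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 86_400  # seconds per day


FIVE_TB = 5 * 10**12          # bytes (decimal terabytes)
FAST_ETHERNET = 100 * 10**6   # bits per second

# Ideal case: the link runs flat out the whole time.
print(round(transfer_days(FIVE_TB, FAST_ETHERNET), 1))       # 4.6

# Assumed 70% effective throughput.
print(round(transfer_days(FIVE_TB, FAST_ETHERNET, 0.7), 1))  # 6.6
```

    Most home connections in this scenario would be well below 100 Mbit/s, so the real figure scales up accordingly; halving the link speed doubles the time.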

    I don't worry about what does or doesn't need restoring. Figuring that out is expensive in terms of time, and my hourly rate is, umm, "not low". Besides, why should I care as long as I can restore everything? That takes a while, but will proceed without my intervention once I set the process in motion. People tend to forget that there's a cost associated with everything; that cost may be hidden, it may be acceptable due to other factors (the fun factor is why I used to participate in amateur motorsports, one of the most expensive hobbies in the world), it may be low enough that it doesn't matter, but if you spend all day on this, you didn't spend it on that.


    I'm not on the WHS team, I just post a lot. :)
    Sunday, November 21, 2010 9:48 PM
    Moderator
  • Great answer, got it!  (I would mark it as an answer, but I use the NNTP Bridge)

    --
    _________________
     
    BullDawg
    In God We Trust
    _________________


    BullDawg
    Monday, November 22, 2010 12:56 PM