Another WHS system disc failure thread

  • Question

  • Have reviewed the whole forum, thought I'd add my notes.

    Hardware: Intel D510MO 1.66GHz Dual Core Atom Mini-ITX Motherboard (2GB RAM), which is running just fine in an old Dell server case (mainly used for the plethora of drive bays). It has two SATA connectors; I installed WHS with the primary hard drive (320GB) on one and the CDROM on the other. I back up the system files onto a 500GB USB drive using ntbackup (cough, spit).

    Since then I have copied ALL our music and ALL our photos / videos onto it. I was running out of space, so I added a 4 Port SATA controller and two 1TB drives (£80 sterling!). It runs just fine, acting as our shared printer host and scanner host (I'm aiming for a paperless household...ahem), and we can access the music on our media receiver.

    I had a blip when, after adding the two new drives, I decided to move them into other drive bays for the sake of tidiness. Oops - all the folder structure was there but no files... I think, having read the threads, that I might have reconnected them on different SATA ports, and so the tombstones were mismatched. Anyway, I'm paranoid, so no data loss so far.

    However, after a couple of MS updates, it failed to reboot, and disc diagnostics indicate the system drive is having problems. I added a third 1TB drive on motherboard SATA2, intending to make this the new system drive. It is recognised; I have added it to the array and removed it again, and it seems fine.

    So I'm down to one minor problem. The motherboard will not boot from the CDROM if it's on the 4 port SATA card (it has to load drivers first). If I put it on one of the two SATA ports on the motherboard, then I can only have one hard disc on the motherboard.

    My plan is:

    Put CDROM back on motherboard SATA 2

    Yank old 320GB disc, and plug in the new 1TB disc on motherboard SATA 1, and hope that I can boot from the CD, and will see the "Reinstall" option.

    If I can reinstall, I will do so, and see if I can read the files from the array.

    If I don't see "reinstall", panic.

    Question: what other parts of the system disc should I backup?

    Suggestion: three things in life are CERTAIN, death, taxes and hard disc failure. Failure of the system disc should not cause such heartache, and should be covered in detail in the manual / helpfiles. This is meant to be a consumer product. I work for a MS partner as a development manager, and I'm scared about this. I should not have to trawl the interweb for answers.

    It should be as simple as:

    Install the new drive, on any available SATA port, and add it to the array via the UI.

    Use the UI to "remove" the old system drive. The system files and tombstones should all be replicated already (it's only RAID after all). IF REMOVAL OF THE DRIVE ON THAT SATA PORT WOULD LEAVE THE COMPUTER UNABLE TO BOOT, it should explain why, and offer detailed instructions.

    Physically remove the old drive and plug in the new one. The reinstall option should simply rebuild the raid set onto the new drive. This "just works" on "REAL" servers.

    It should be possible to re-arrange the drives on different SATA ports without the WHS going bonkers. Each drive is given a unique ID anyway; how hard can it be? (I know, I'm living in an ideal world, I know how hard it is really, but playing Devil's advocate, this is meant to be a consumer product. Not all motorists know, or want to know, how an IC engine works.)

    At present, I am not confident that, in the event of a system drive failure, I will have access to all our data.

    [Rant]

    Microsoft, you need to get over the concern that once the OS is RAIDed, someone could remove the disc and rebuild the array in another box. I bought this product, and the hardware. A few pirates' activities should not put my storage at risk.

    [/Rant]

    Saturday, January 1, 2011 12:48 PM

All replies

  • Oh joy. I just tried to add my daughter's "new" laptop (a Compaq 6650) which was running Vista, now upgraded to Windows 7 - she says it's never run this fast before.

    It turns out that the "Blip" I mentioned above includes the "software" share - the folders are there but there are no connector files in the folder, so the laptop cannot configure backups etc, and is not joined to WHS.

    This is a real shame - I really like WHS in many ways, it's just not quite finished. I wonder if WHS 2008 is any better...

    Anyway, can anyone please tell me which files should appear in the "software" share? I don't seem to have that in the WHS ntbackup set... Grrr

     

    Saturday, January 1, 2011 2:13 PM
  • Something to remember: Windows Home Server is available in two forms. Neither form is intended for an end user to install on their own hardware. One form is as an OEM hardware/software bundle; usually recoveries are relatively easy in this environment. The other form is as a software-only "system builder" pack, intended for installation by a small system builder (your corner computer store) who presumably has significant experience in building computers. ("Significant" is more than the usual enthusiast "one a year or so".) The system builder channel also sells to end users, because Microsoft can't prevent it, and because some enthusiasts are also system builders (or vice versa, some system builders are also enthusiasts). But that software is not "shrink wrap" software that you could go to Best Buy or Microcenter to purchase.

    In any case, Windows Home Server isn't intended for "Joe Sixpack" to install on random hardware, and the OEM or system builder (you, in this case) is expected to be able to sort out just about all issues on their own. There is no free support from Microsoft for this product; you get support from your OEM or system builder. We try to help here in the forums, but we don't work for Microsoft, we aren't paid for our time, and we can't always help someone with their (usually very specific) configuration or hardware issues.

    Dealing with system drive failure in a server you built yourself:

    • You remove the old system drive.
    • You install a new system drive.
    • You boot from your Windows Home Server installation media.
    • You supply storage drivers at the appropriate time so that all of your server storage drives are visible in the GUI portion of setup, including the system drive. These drivers should normally be Windows Server 2003 drivers, but occasionally you won't have those. In that case, you can try XP drivers (which are preferable down the line) or Vista drivers (because setup runs on WinPE 2.0). If you need storage drivers for the system drive, you should also prepare an F6 floppy for each storage controller (more reliable overall). If you need storage drivers for secondary drives, setup may offer to cache them for you. You might have some luck with this, but it's not a 100% guarantee.
      Note: To avoid the need for F6 floppies, set your SATA controllers to Legacy IDE, PATA, etc. mode; this will use in-box drivers.
    • You're offered a "server reinstallation" or "server recovery". You select this and proceed.
    • After the first reboot, if you need an F6 floppy, press F6 at the appropriate time. I usually start tapping a second or three before the prompt should display, and keep tapping until it goes away. ("Tap" isn't the same as "pound as fast as possible". :) More like 2-3 times a second.) Supply your drivers once you have the option. I usually get this to work about 75% of the time; if you never get the keypresses registered, start over from scratch.
    • Continue through setup.
    • At some point, Windows Home Server will start a rebuild of the primary DATA drive (D:). Depending on how much data you have on your secondary drives, this can take a very long time, up to days, if you have several terabytes of data.

    Things not to do if you want a successful recovery with data intact:

    • Disconnect the data drives. You won't be able to get them to reintegrate later.
    • Fail to supply drivers when needed. Your drives won't be visible, which is effectively the same as disconnecting them.
    • Have both your old and new system drives connected. This results in very odd behavior; you can't have two Windows Home Server system drives connected at once.

    "Removing" the old system drive: not possible, other than through server recovery. The system drive always contains two partitions, and both of those partitions are required.

    RAID: isn't supported, but it's not to prevent piracy. (Windows Home Server has a thriving pirate subculture). It's because your average computer user (not enthusiast) is, according to Microsoft's market research, intimidated by RAID. RAID is not easy for a non-technical person to use, and particularly when an issue arises (though it's easier than it was a few years ago). If you want to run your server on a RAID array, you can, and it should work fine (assuming you made decent hardware choices), but RAID-specific issues are guaranteed to result in advice along these lines: "get your data off your server, break the array, and let Windows Home Server manage your drives."

    Backing up your server: you can back up the shares using the tools available in the Windows Home Server console application. You can back up the backup database following instructions in the Home Computer Backup and Restore technical brief, or you can use an add-in such as Backup DataBase Backup (BDBB). You can't use NTBackup to back your server up, however, and expect to be able to restore it at an arbitrary point in the future. Unless, that is, you back up the entire server at once, and restore the entire server at once. "Entire Server" is: All drives to include SYSTEM partition and all DATA partitions, and system state. All at once. Windows Home Server uses reparse points to link files from your shares (exposed on D:, though you should never access them that way) to other drives, and reparse points are usually not backed up correctly by backup utilities. Also, Windows Home Server stores some information about storage configuration in the registry, and if that isn't backed up/restored at the same time as all of server storage, you'll wind up with worse issues than when you started.
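To make the reparse point problem a little more concrete, here is a minimal Python sketch of how a tool can tell a reparse point apart from an ordinary file. It is only an illustration, not anything Windows Home Server itself ships or uses: it assumes Python 3 on a machine that can see the NTFS volume directly, the function names are made up for this example, and on non-Windows systems it falls back to treating symbolic links as the nearest equivalent.

```python
import os
import stat


def is_reparse_point(path):
    """Return True if path is an NTFS reparse point (or a symlink elsewhere)."""
    info = os.lstat(path)  # lstat: inspect the entry itself, do not follow it
    # st_file_attributes only exists on Windows; default to 0 elsewhere.
    attrs = getattr(info, "st_file_attributes", 0)
    if attrs & stat.FILE_ATTRIBUTE_REPARSE_POINT:
        return True
    # Assumption for non-Windows systems: a symlink is the closest analogue.
    return stat.S_ISLNK(info.st_mode)


def find_reparse_points(root):
    """Walk root and return the paths of every reparse point found."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if is_reparse_point(full):
                found.append(full)
    return found
```

A backup utility that copies only the file data it can read through such an entry, without recording that the entry was a reparse point, would restore something different from what was originally on the disk, which is roughly the failure described above.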


    I'm not on the WHS team, I just post a lot. :)
    Saturday, January 1, 2011 5:36 PM
    Moderator
  • Thanks for the comments Ken, I see you on this forum a lot. I appreciate you guys are all volunteers, and you do a great job.

    Joe Sixpack says:

    I eventually persuaded my daughter's laptop to back up. The problem was indeed revealed by the WHS toolkit: along with some spurious messages about encryption key permissions, it had a message about the connector software being different to the server. So it was the aforementioned blip. I created a virtual server, installed WHS on that, allowed it to install three years of updates, and then blatted \\server\Software\Home Server Connector Software and copied it over from the VM. I'll keep that VM handy in case something like this happens again.
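For anyone wanting to check a suspect share against a known-good copy before overwriting it, something along these lines should do. This is a rough Python sketch only: the function name is made up, both folders are assumed to be reachable as ordinary paths, and it only reports names that are missing, not files whose contents differ.

```python
import filecmp
import os


def missing_files(reference_dir, target_dir):
    """Return relative paths present in reference_dir but absent from target_dir."""
    missing = []

    def walk(comparison, prefix):
        # left_only holds names that exist only in the reference tree.
        # A missing folder is reported once, not file by file.
        for name in comparison.left_only:
            missing.append(os.path.join(prefix, name))
        # Recurse into folders that exist on both sides.
        for name, sub in comparison.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(reference_dir, target_dir), "")
    return sorted(missing)
```

Called with the VM's copy of the folder as the reference and the server's copy as the target, an empty list would suggest nothing is missing by name.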

    I'll be reviewing your note above in detail before attempting the disc replacement, but a few immediate comments:

    1) OEM hardware software bundle: if the system drive fails in an HP MediaSmart, you buy a new drive (from HP?) and plug it in, you get the recovery option, and HP have supplied a recovery disc with all the required drivers...(all this is clearly documented in the manual, right?) I don't know how it works in the US, but here in the UK you can forget any support from HP - you will be on hold to Bangalore for an hour before they tell you to re-install the operating system. So you're on your own. Assuming you successfully re-install WHS, it will then spend 8 hours getting the service packs up to date (mine did). It will then spend a day or two rebuilding the tombstones, with no progress indicator. My fellow Sixpackers will have rebooted a dozen times by then.

    And if you were unwise enough to send your MediaSmart back to PC World, with your 2TB of photos, music, video? Read their small print: it's your responsibility to back up your data first.

    2) System builder, with "significant experience"...I was designing computers before Compaq released their first clone. I'm an unashamed fiddler, always swapping hardware about, upgrading, I'm the IT support guy for the entire family and I'm struggling with this. My initial comments were not a criticism of anyone in particular, more an observation that as it stands WHS DOES require way too much knowledge of how operating systems, BIOS, SATA/PATA work. 

    3) Backups: Yes we need to. Yes we should all have off-site backups, yes we should all be nice to each other. However, people on this forum have arrays of 12TB or more (I saw some stats from Microsoft that quoted a record of 72TB on one system). How do you back that up? The main appeal for most people is that WHS folder duplication takes care of that for you, so if one drive fails, the data is still there on another drive. I really don't care if it's RAID or Drive Extender, I want my files safe. If I can't guarantee to be able to recover my files in the event of, say, a drive controller failure, requiring me / someone to take the machine apart, then the design is flawed.

    I once passed the IBM service guy exam. Recommended procedure for fixing a PC:

    Is the computer plugged in and switched on? Yes

    Ctrl-Alt-Del OK? No

    Are the cards seated properly? Yes

    Replace motherboard. 

    Rant over :-)

    The solution, in my view, is that the entire system should be RAIDed / Mirrored / Drive Extended whatever, and ANY drive can be unplugged and replaced without risk of data loss or lengthy rebuilds. The current half-way house is not ideal. I know this would require a significant amount of work on Microsoft's part.

    Interesting that MS seem to have lost the plot and dropped Drive Extender entirely in Vail: http://windowsteamblog.com/windows/b/windowshomeserver/archive/2010/11/23/windows-home-server-code-name-vail-update.aspx

    Seems a lot of folk dislike that decision!

     

    Sunday, January 2, 2011 11:07 AM
  • The story for an OEM drive replacement is a little different than for a system builder drive replacement. A system builder will probably have installed a DVD-ROM in the server; OEM servers don't (can't; Microsoft doesn't permit it) have optical drives, so the recovery process runs remotely from a client. As for waiting and rebooting, yes, it can take a long time. Some people will wind up rebooting, if they have lots of data. Per Microsoft market research and SQM data most users of Windows Home Server have under a TB, though, which will usually take several hours for the rebuild, but not usually even a full day, and those people probably will wait (because they have an HP or similar OEM device, and they called HP or read the manual before they started).

    System builder vs. enthusiast: I generally don't put it this plainly, but someone who builds a couple of computers a year, upgrades, etc. isn't a system builder, as Microsoft would define the term internally. A system builder, per Microsoft's Partner site, is: "… anyone who assembles, reassembles, or installs software on a new or used computer system." This sounds like you or me, and that's why you or I can buy system builder software. But in reality Microsoft thinks of a system builder as the corner computer store that builds 50 to 150 computers a year, and upgrades/repairs computers proportionally. This really represents at least an order of magnitude more practical experience than the average enthusiast has.

    Add to that the tendency that enthusiasts have to make bad hardware choices (they buy high-performance desktop boards for low-end servers, which is both overkill and likely to deliver flaky drivers) and it starts to become clear why enthusiasts struggle with system issues. Add to all of that the fact that enthusiasts then want to use their servers as Windows Server "Lite" and pile on other software, when Microsoft warns clearly that desktop use is unsupported and likely to result in issues, and you can see that there's a serious disconnect between Microsoft's intent and the end-user's intent, and that's always a source of friction and stability issues.

    Backups: backing up more than a couple of TB is a challenge. I'm an IT pro, and I think it's a challenge. :) The people with that amount of data are just not in the target market segment for Windows Home Server no matter how much they insist that they should be, so things may be harder for them. (It would be nice if the world worked in a different way; unfortunately it doesn't.) Fortunately for these people, most of their data is probably static (DVD/BD rips, torrents, etc.), and doesn't need to be backed up every time. Unfortunately, sorting out what does need to be backed up sucks. Microsoft rather clearly didn't take these people into account when planning their products, and that was probably a decision that Microsoft made consciously. Is this a good thing? A bad thing? Further deponent sayeth not...
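One rough way to take some of the pain out of sorting static data from changing data is to list only what has been modified since the last backup. The Python sketch below is just that, a sketch: the function names are made up for illustration, it trusts file modification times (which some copy tools reset), and it does nothing about deleted files.

```python
import os
import time


def days_ago(days):
    """Return the timestamp for a point the given number of days in the past."""
    return time.time() - days * 24 * 60 * 60


def changed_since(root, cutoff_timestamp):
    """Yield (path, size_in_bytes) for files under root modified after the cutoff."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                info = os.stat(full)
            except OSError:
                continue  # File vanished or is unreadable; skip it.
            if info.st_mtime > cutoff_timestamp:
                yield full, info.st_size
```

For example, `list(changed_since(some_share_path, days_ago(30)))` would give the files touched in the last month, along with their sizes, so you can see whether the changing part of the data fits on the backup drive you actually own.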

    As for the reasons for backups vs. duplication: duplication, by itself, is somewhat better than nothing. It does protect against the most likely hardware failure. However, the number one cause of data loss for consumers isn't hardware failure, it's user error (of various sorts) against which DE offers absolutely no protection. The sense of security a user gets from duplication is something of a false one, I'm afraid. Do your backups: even if you don't do them regularly, most consumers have mostly static data, so irregular and infrequent will still only lose a small percentage of total data.

    DE removal: No, it's not a popular decision. There are several possible reasons floating around the press; I have no proof of this, but I tend to believe that the decision was taken because A) DE causes problems with Line of Business (LoB) apps and the business SKUs of Vail will have to be able to run at least some LoB apps, and B) there will be a ton more profit in the business SKUs than the consumer SKU. Pulling DE entirely is cheaper than fixing it (I would bet that it had turned into a "fishing expedition", where Microsoft would fix one problem, and two more would pop up), and gets product out the door sooner. Is it the right decision? I would say "Not for consumers." Oh well. <shrugs>

    IBM's "repair" procedure: it's cheaper to replace the motherboard than it is to figure out the specific component that's failing, because parts cost less than training and technician troubleshooting time, so you replace the motherboard early on in the troubleshooting process. This is true even if the motherboard isn't always the problem. You check the disk for errors first (in case it's a disk issue; those are even cheaper), then you move on to the motherboard. If that doesn't fix it, you do further troubleshooting. It seems reasonable to me. :)


    I'm not on the WHS team, I just post a lot. :)
    Sunday, January 2, 2011 2:44 PM
    Moderator
  • I couldn’t resist the download of WHS 2011 (curiosity) and it was just as disappointing as predicted. I have already loaded v1 on my new server in place of 2011. There is no reason to upgrade (even if that were possible) as WHS v1 does most things just as well and some (DE of course) so much better. Is there any point in blogging all this as Microsoft have made no reasonable response to all the negative mail? They seem to be completely ignoring the vast majority of feedback supporting the retention of DE as the core to WHS’s survival. So it is stick with WHS v1 or off to Fedora/Amahi for those more ambitious types.
    GOOD LUCK to all version 1 users everywhere.
    Monday, February 7, 2011 3:43 PM