WHS hangs frequently

  • Question

  • Hello,
     
    My WHS (home build) frequently hangs. I have a monitor, keyboard and mouse hooked up and when it happens, the screen blanks out, the keyboard and mouse become unresponsive, and it drops from the network. The system is still 'on' but hung. This used to happen every few days. I clocked the memory speed down at the suggestion of a post (which I can no longer find) and that fixed the problem for a few months, but now it's doing it again, only now the server stays up for half an hour at best before it hangs.

    When the system boots it displays an error that a driver or service failed to start, but I don't see anything in the event logs that pertains to a failed driver or service, and the server does work (until it crashes).

    This machine used to be my primary PC before it got relegated to my WHS and I never had problems with the hardware before, so while not ruling it out, I'm hesitant to think it's a hardware or memory problem.

    I'm thinking it's some kind of driver issue but the logs don't show anything.

    Thoughts?
    Monday, January 5, 2009 5:09 PM

Answers

  • I did just see an error in the System log that I had not seen before. It is event ID 14 and the source is nv

    From what I could find online this pertains to a corrupted dll or similar file pertaining to the nvidia display drivers. I do have an nvidia video card in the machine. But since it really does not matter what drivers it uses I removed the nvidia video driver (as well as the nvidia pci management drivers) and let the native Windows drivers do all the heavy lifting.

    So far it's been up for about 2 hours. I'll keep my fingers crossed.

    Tuesday, January 6, 2009 2:07 AM

All replies

  • Have you run any hardware diagnostics or memory tests at all? Are all the fans working? How old is the PC?
    I'm not on the WHS team, I just post a lot. :)
    Monday, January 5, 2009 5:15 PM
    Moderator
  • As the system has been an evolution of components over the years it's a little difficult to quantify its age. It's a socket 939 (Epox 9NPA+ I think .... I'm not home right now) motherboard with a dual core AMD Opteron (175 I think) installed. 2GB of PC3200 RAM I believe. So, it's a little long in the tooth but certainly not ancient.

    I believe the motherboard uses the nVidia nForce4 chipset, which I think I read somewhere WHS doesn't like. Not sure though.

    All the fans are working. It's in an old Antec aluminum gamer case so there are plenty of places for fans, which I have taken advantage of. Cooling shouldn't be an issue. Doesn't mean that it isn't, just shouldn't be.

    I have not done any memory diagnostics as of yet. It's been a while since I've used any memory tools. Is Memtest86 still the way to go or is there something better now?
    Monday, January 5, 2009 5:30 PM
  • Hi,

    Memtest is fine, it should do the job OK. Yes, there have been some reported problems of the nForce4 chipset being incompatible, but a lot of them are down to the drivers. Are all your drivers Server 2003 versions and do the server logs show anything at all?

    Colin



    If anyone answers your query successfully, please mark it as 'Helpful', to guide other users.
    Monday, January 5, 2009 6:21 PM
    Moderator
    Since this is desktop hardware with a server OS, Server 2K3 drivers were not always available when I went looking for them, but I used Server 2K3 drivers whenever possible and XP drivers where Server 2K3 drivers were unavailable.

    I do have a few devices which I could not find drivers for but since they were all audio devices I didn't bother looking too hard for them. I suppose it couldn't hurt to give it another try to find them.

    I didn't see anything majorly out of place in the event logs. I do get an occasional COM+ error that I cannot find a solution for but it didn't seem to be the 'crash your system' type of error.
    Monday, January 5, 2009 7:07 PM
  • Dan,

    Could be worthwhile, if Memtest doesn't show anything, to check through your drivers and ensure they are the latest versions etc., as completely blank 'hangs' are usually either a driver or hardware issue.
    As a matter of interest, what is the COM+ error message?

    Colin
    Monday, January 5, 2009 7:51 PM
    Moderator
  • Hi,
    I saw similar behavior in the early days of Server 2003 when working with larger files, either locally or over the network, usually on systems using IDE disks. I assumed driver issues but could never pin it down. (At that time I used my desktop PC as a test PC for that OS.) When I monitored the machine during such operations, like copying a CD image locally or to a shared folder on that server, Explorer hogged more and more memory, responses got slower and slower, and eventually even the network connection dropped, just as you describe. Luckily it never happened on real server hardware.
    I'm not sure whether that information helps, but maybe you can run some tests in that direction.
    Is it always around the same time, like at the backup cleanup stage?
    Do you have additional software installed?
    How much memory does the box have?
    Any possibly related entries in the event log?
    Best greetings from Germany
    Olaf
    Monday, January 5, 2009 7:53 PM
    Moderator
  • The COM error description is:

    Description:
    The COM+ Event System failed to create an instance of the subscriber
    partition:{41E90F3E-56C1-4633-81C3-6E8BAC8BDD70}!new:{58FC39EB-9DBD-4EA7-B7B4-9404CC6ACFAB}.
    CoGetObject returned HRESULT 8000401A.
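
    For reference, the HRESULT above can be split into its fields. This is only a minimal sketch, assuming the standard HRESULT bit layout (failure flag in bit 31, facility in bits 16-26, error code in the low 16 bits); `decode_hresult` is a hypothetical helper name, not part of any Windows tool, and it does not look up what the code means.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure flag, facility and code fields."""
    value &= 0xFFFFFFFF  # treat the input as an unsigned 32-bit number
    return {
        "failure": bool(value >> 31),       # bit 31: set means failure
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility number
        "code": value & 0xFFFF,             # bits 0-15: error code
    }


# The value reported by CoGetObject in the event description
fields = decode_hresult(0x8000401A)
print(fields["failure"], fields["facility"], hex(fields["code"]))
```

    For 8000401A the failure bit is set, the facility is 0 and the code is 0x401a; the symbolic name would still have to be looked up in the Windows SDK error headers.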

     

    Unfortunately the crashes do not happen with any regularity, nor do they seem to coincide with any particular server activity. I do have Virtual Server 2005(?) installed but am not running any VMs. The box has 2GB of RAM.

     

    Looks like I'll be spending tonight hunting for updated drivers.  :)

    Monday, January 5, 2009 8:14 PM
  • Dan,

    In general, that type of COM+ error shouldn't cause a lock-up, but the Application Event Log should also have an Event Category and Event ID number - the ID number being the important one - as it identifies the actual error event.

    Good luck with the error-checking!

    Colin
    Monday, January 5, 2009 8:32 PM
    Moderator