Server Freezing

  • Question

  • Not sure what is happening, but my server freezes after some 20 to 30 minutes of uptime. I am running a backup of my main machine, and at some point it loses its connection to the server, which has stopped responding even though it is still powered on. Even if I hook up a keyboard, video and mouse, I can't get an image on the monitor...

    It is a new machine I just built for this project, which I was really eager for... here are the machine specs:

    MoBo: Intel® Desktop Board DG965OT

    Processor: Intel® Core™2 Duo E6400, 2 MB L2, 2.13 GHz, 1066 MHz

    RAM: 2 Gig OCZ EL DDR2 PC2-6400 / 800MHz

    Case: Antec Fusion

    Hard Drives: 2X SATA Hard Drives 500 GB, 300 MB/s, 16 MB Cache, 7200 RPM

    And this is pretty much it. It is driving me crazy that the system won't stay up 24x7. Any ideas or suggestions? Is there any stress-test software that I can use? Can I make it dual boot and install Vista or Win XP on the second hard drive to see what happens on a different OS? Thanks guys,

    Monday, March 19, 2007 9:10 PM

All replies

  • I would not mess with any dual boot stuff, stick with CD boot stuff or stuff you can run natively IMO.

     

    I would grab this http://www.memtest.org/ burn the iso to a cd, and run it for 24 hours.

    I would also consider downclocking (via memory divider) your RAM to 667 or even better 533, if the bios supports it, at least till you have a rock solid system.

    Then, if you are looking for a great system stability test, I would use Orthos http://sp2004.fre3.com/beta/beta2.htm

    If you are at all concerned that temperature might be a problem (maybe the HS is not perfectly mounted?) grab the Intel utility or, better yet, SpeedFan http://www.almico.com/speedfan.php

    I am not sure SpeedFan is a long-term solution for WHS, but it should be fine for troubleshooting.

     

     

    Monday, March 19, 2007 11:54 PM
  • Thanks MervinCM, I will get the RAM testing/stressing program and run it as you suggested to see what happens.

    As for downclocking the memory, I am not sure how to do it. I was never in the OC business myself, but I'll sniff around in the BIOS. What I know (or think I do) is that Intel mobos are generally pretty straightforward, in a What You See Is What You Get kind of way, and usually do not offer any OC capability, so I would expect the opposite to be true too, with no DC (downclocking) capabilities. Last time I checked while roaming around the BIOS, there were lots of settings that are locked and cannot be changed. Does it matter if the dual core is enabled or disabled? I would assume that for an OS like this, on this relatively huge monster I've built, I can disable dual core and it should be fine, correct? There won't be any benefit to having dual core enabled, will there?

    As for Orthos, which actually might be a good one too, do I need to go back to Win 2000? Not sure I still have its media around, but I might.

    How about the hard drives? I did have to exchange them twice at the store, since they were causing some issues right out of the gate. Is there any good application I can install to have them run like crazy (hopefully collecting logs)?

    As for temperature, I don't think that could be the issue, but I will put in a fan blowing on the hard drives. I did notice they get quite hot, and they fit in a relatively small compartment, given it's a mATX case.

    Thanks a bunch, I'll report the results as I progress...

     

     

    Tuesday, March 20, 2007 9:22 PM
  • OK, I got some preliminary results...

    I ran the RAM test (Windows Memory Diagnostic) and went out for a run. I did observe that the first pass completed successfully in all 6 of 6 tests. To my surprise, when I came back the screen looked a bit funky, with lots of white dots all over the place. Still, I could see that the test had made its way up to pass #4 and froze while performing test 5 of 6.

    I already went to the BIOS and found where to manually configure the RAM to downclock it from 800 to 600 and even 533. I can also manually change other latency timings, although I have no idea if I should venture such a thing. While in the BIOS I disabled the dual core, and also some other interfaces, such as the floppy, serial and parallel ports, etc...

    I also have the other mem test that you recommended, and might leave it running tonight, although chances are it will freeze too.

    Since I have no logs on what is going on, what should I conclude up to this point? Could it be a malfunction in the motherboard itself?

    The quest goes on, all comments are very welcome, cheers

    Tuesday, March 20, 2007 11:42 PM
  • When a machine freezes during a memory diagnostic, it usually means that some very low-level error has occurred.  There won't be any logs to check because the machine isn't healthy enough to detect the error, much less write to the file.

    I'm afraid that you will have to troubleshoot it the hard way by substituting parts until you can isolate the problem to one of:

    • Motherboard
    • RAM
    • Add-in cards (video, etc.)
    • Power supply
    • Disk or controller

    I would recommend that you not change the BIOS settings from the default to troubleshoot this problem.  Even if you find a setting that works, you will still have some marginal component in your system that needs replacing.

    Good luck.

    Wednesday, March 21, 2007 12:28 AM
  • as for downclocking the memory, I am not sure how to do it.

    OK, well, the first step for me would be to check what speed and timings the RAM is running at.  DDR2-800 type speeds are not always 100% stable at the default voltages and timings.  The sticks are often not even designed to work at them.  I know my OCZ Plat DDR2-6400 r2 runs DDR2-800 speeds stably only if I manually set the voltage to 2.0 V or manually specify the timings. Get a program like CPU-Z http://www.cpuid.com/cpuz.php to check what speed your RAM is running at.  On the 4th tab (Memory), what does it say for frequency? If it says 266 or 333, it is already running slower than rated and should not be a worry.  If it says 400, and there is nothing in your BIOS about a memory speed or a memory divider, then I would say that you should check with OCZ to be sure that your sticks work on your system board at full speed with default timings and voltages.
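
    The frequency figures above follow from DDR2 moving data twice per memory clock. A minimal sketch of that arithmetic, assuming a standard 64-bit (8-byte) memory channel; the function names are made up for illustration and are not part of CPU-Z:

```python
# DDR2 is "double data rate": two transfers per memory clock, so the
# DDR2-xxx rating is twice the frequency a tool such as CPU-Z reports.

def ddr2_rating(clock_mhz: float) -> int:
    """Effective DDR2 data rate in MT/s for a memory clock given in MHz."""
    return round(clock_mhz * 2)


def peak_bandwidth_mb_s(rating: int, bus_bytes: int = 8) -> int:
    """Theoretical peak of one channel in MB/s (assumes an 8-byte-wide bus)."""
    return rating * bus_bytes


if __name__ == "__main__":
    for clock in (266.67, 333.33, 400.0):
        rating = ddr2_rating(clock)
        print(f"{clock:.0f} MHz -> DDR2-{rating}, "
              f"about {peak_bandwidth_mb_s(rating)} MB/s peak per channel")
```

    The 400 MHz case gives DDR2-800 and 6400 MB/s, which is where the PC2-6400 label on the sticks comes from.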

    I don't think dual core can be disabled. You can easily check whether the OS sees both cores by running Task Manager on your WHS box and looking for 2 CPU graphs on the Performance tab.  You absolutely want both cores working.  I have pegged my single core at times, and I am moving it to a dual core system as soon as I figure out how to shuffle all my systems around correctly :)

    Orthos will work fine under many operating systems, just run it in WHS, but not till after you have clean memory tests.

    Every drive manufacturer makes a bootable CD diagnostic that you can use to test your drives.  If you're willing to buy something, SpinRite was always a good choice, but I have not used it in a long time.
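
    For a quick sanity check from inside the OS, a write-and-read-back loop can at least catch gross corruption. This is only a rough sketch, not a substitute for the manufacturer diagnostics: the function name and scratch file name are made up here, and the read-back may be served from the OS cache rather than from the disk itself.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory: str, blocks: int = 16,
                     block_size: int = 1024 * 1024) -> bool:
    """Write random blocks to a scratch file, read them back, compare hashes."""
    path = os.path.join(directory, "burnin_scratch.bin")  # hypothetical name
    written = hashlib.sha256()
    try:
        with open(path, "wb") as f:
            for _ in range(blocks):
                block = os.urandom(block_size)
                written.update(block)
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the drive

        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(block_size), b""):
                read_back.update(chunk)
        return written.digest() == read_back.digest()
    finally:
        if os.path.exists(path):
            os.remove(path)  # always clean up the scratch file


if __name__ == "__main__":
    # Uses a temporary directory here; point it at a folder on the drive
    # under test and raise `blocks` for a longer run.
    with tempfile.TemporaryDirectory() as scratch:
        ok = write_and_verify(scratch, blocks=4)
        print("match" if ok else "MISMATCH: data read back differs")
```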

    Are you positive that temperature is not an issue?  I don't think you can be till you check it with something like SpeedFan.  A simple speck of dirt between your heatsink and CPU can cause temps to spike and cause all manner of ugliness.

     

     

     

     

     

    Wednesday, March 21, 2007 1:18 AM
  • Yikes, you do have a problem!  I would double check that both your northbridge and CPU heatsinks are properly seated and doing their job.  Check the temps using SpeedFan.  I would also start with a single stick of RAM, at 533 speed, and do the memory test again.  If it works, then try the other one by itself at 533.  Then both at 533.  Since you don't sound too confident about mucking around with memory timings, I would leave it at 533, since that is likely the only speed at which your motherboard will be able to read the appropriate timings from the memory sticks.

    First make sure your cooling is running properly, all fans are spinning and clean .. etc

    1) remove one stick, change speed to 533 and mem test

    2) Swap sticks and repeat mem test.  (533)  if you have a single bad stick it will be apparent now.

    3) install both sticks and repeat mem test (533)

    Once you have a 24-hour pass on memtest86+, you can do a stress test with Orthos, right within WHS.  Orthos will work your CPU, system board and RAM hard, so be sure your cooling is up to the task.  Once you pass 24 hours on Orthos, you can be fairly confident in it.
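
    The three-step test above boils down to a small decision table. A sketch of how the results could be read, assuming each flag means "passed the memory test at 533"; the function and its wording are illustrative only:

```python
def diagnose(stick_a_ok: bool, stick_b_ok: bool, pair_ok: bool) -> str:
    """Interpret the results of testing each stick alone, then both together."""
    if stick_a_ok and stick_b_ok:
        if pair_ok:
            return "RAM looks good at this speed; move on to Orthos."
        # Sticks that pass alone can still fail as a dual-channel pair.
        return "Each stick passes alone but the pair fails; suspect the pairing."
    if stick_a_ok != stick_b_ok:
        bad = "B" if stick_a_ok else "A"
        return f"Stick {bad} fails on its own; it is the likely culprit."
    # Both sticks failing alone points away from a single bad stick.
    return "Both sticks fail alone; suspect board, settings, cooling or power."


if __name__ == "__main__":
    print(diagnose(stick_a_ok=True, stick_b_ok=False, pair_ok=False))
```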

    hope that makes sense :)

     

     

    Wednesday, March 21, 2007 1:36 AM
  • Ouch!

    I dug a little further and you have a problem....

    http://www.intel.com/products/motherboard/ddr2/index.htm

    tells me that your board is VERY picky about what memory it will support at 800 speeds.  It will not support the RAM you bought.  Your RAM is rated for 1.95 V+ and the board only supplies 1.8 V.  Also, it only supports very limited memory timings, and those are not the same as the RAM you chose.

     

    533 Might be supported though ... Give it a shot....

     

    Wednesday, March 21, 2007 1:46 AM
  • Hey Mervin, thanks a bunch man, you've been of great help ;)

    Progress last night after things went wrong with the Microsoft mem test:

    1. Ran Memtest86+ v1.70 just to check how long it would hold... and not surprisingly it failed after 5 to 10 minutes, B-A-D;

    2. Changed the speed to 533 in the BIOS and left the Microsoft test running all night long. Early this morning, almost 12 hours down the road, things were still running smoothly: 171 full passes!

    3. Went to the Intel web site and checked the memory compliance list that I recall seeing when I first bought the mobo. To be honest, I did buy 4 gig of recommended Kingston memory, but those 4 sticks are now running in the other machine, with a D975XBX2, which became the main machine (running Vista Ultimate with the 8800GTS). I can still swap later on, if I can get those to run smoothly at 800, but let's continue the tests;

    4. Changed the mem speed in the BIOS again to be in the 667 neighborhood, and right now it is running its 7th pass. I am leaving for the office in a bit and will leave this running until I return later today.

    You've been of great help. Questions: why would I need to run the test for 24 hours? (Not that I can't, but what can happen after the 12th hour that didn't already happen?) Also, why would I need to run 3 batches of testing: one with each individual stick, and another one with both? All tests I've run so far were made with both (again, not that I can't, I probably will now, but I confess that being relatively ignorant of some specs, in my mind it does not seem to make much sense).

    More updates later today. Feeling much happier now that it is running successfully at 533. That is a start. Rod

    Wednesday, March 21, 2007 12:12 PM
  • I have had a failure in the 8th hour before, so I feel comfortable at 24.  It is not the result of a formula or anything.   

    The way I look at it .. 24 hours is not that long to prove that your memory is doing what it is supposed to.

    I really don't think that this server will be starved for memory performance, I doubt you could even measure the performance drop from a client, I would not hesitate at all to leave it at 533 if that was what was required.

    The individual stick testing idea was to find weak/bad sticks.  If it fails with one and not the other then you know that it is likely the stick that is bad and the rest of the system is probably fine. And the reason that you need to retest together after the individual tests is that sometimes sticks will work individually at a specific speed, but will fail when paired together in dual channel mode at the same speed.  

    Now that we know that your combo is not "supposed to work together", the goal (IMO) has changed from finding something defective to finding a way to make it all work together anyway.  I would stick with 533 till you can do a 24-hour memtest, then install WHS, then run 24 hours of Orthos without issue.  Then, after it has been in service for a while, at least a week or two .. rock solid ... try 667 for an extra few % of server performance :)  A server that is not perfectly stable is of no value at all, so why worry about anything else till you have that?

    good luck!

     

     

     

    Thursday, March 22, 2007 1:03 AM
  • OK Mervin, pretty happy with your commitment to help out, please keep these helpful thoughts coming. Now for the latest results...

    After 10 hours of testing with both mem sticks running at 667, it was still running smoothly, 173 passes and counting. I stopped it and decided to do some testing in real production mode (but I will resume the testing shortly, fear not).

    In production, I have been able to keep it up and running on the WHS OS for 6 hours now. In this time I did a full backup of my main desktop, and that is 270 gig of data. In addition to that, I've copied all my music and movies to the server's shared folder, which is some additional 180 gig of files that made it through to the server successfully. And it is still running solid (too early to celebrate just yet, but remarkable).
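
    For a sense of scale, the figures above imply a sustained average transfer rate that can be worked out as follows. This is a back-of-the-envelope sketch that assumes "gig" means 10^9 bytes and that the transfers took no longer than the six hours of uptime, so the result is a lower bound:

```python
def average_mb_per_s(total_gb: float, hours: float) -> float:
    """Average throughput in MB/s, using decimal units (1 GB = 1000 MB)."""
    seconds = hours * 3600
    return total_gb * 1000 / seconds


if __name__ == "__main__":
    backup_gb = 270   # full backup of the main desktop
    media_gb = 180    # music and movies copied to the shared folder
    rate = average_mb_per_s(backup_gb + media_gb, hours=6)
    print(f"{rate:.1f} MB/s")  # prints "20.8 MB/s"
```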

    I will let it run in the WHS OS for a day or so to see if it behaves as expected, and then will resume testing for the heck of it. So far, so good. Thanks a bunch. I will mark this as answered but will still update here as I move forward.

    Thursday, March 22, 2007 3:57 AM
  • Kind of hard to argue with a working system.  I am glad that you have had promising results.  Have fun with it! When you have the time, I would reinstall it simply because it was installed when your hardware was in a known unreliable condition, you can't know what might have been corrupted and when it might come to haunt you.  Also when you have time Orthos is well worth running.  It identified a failure in one system of mine in less than  an hour that would bluescreen every month or two under normal use!

     

    Thursday, March 22, 2007 4:39 AM
  • Great, thanks.

    Also, last night while I was backing up some files to the server, I was also streaming audio from it to the Xbox 360 and it was running perfecto!

    What I am curious to know is whether, if I swap back the Kingston mem sticks from my desktop, which were certified for this machine, I will be able to run it at 800 MHz, and of course whether the other machine, also running an Intel mobo, the 975BX2, would be stable and happy with the OCZ sticks at full speed. That is part of what I'd like to test really soon.

    What I am a bit disappointed with at this time is the lack of integration of WHS with the Media Center interface, like you get on Windows MCE and Vista. I got really used to it and spoiled. I understand that this could be seen as an additional layer of sophistication that doesn't need to be in the early beta versions, but I am really praying that it will come in the final version. I mean, given Microsoft's talk about being the full provider of the multimedia experience, it only makes sense to push all that data (photos, music, videos, recorded TV, ...) from your regular PC (Windows Vista, XP, MCE, ...) to the WHS. Hopefully this is not just wishful thinking. Cheers, Rod

    Thursday, March 22, 2007 2:36 PM
  • Your Bx2 supports changing memory voltage and timings, so, if you want, you could swap them and run both at 800.  However I didn't suggest that since you have 4gb in your desktop and 2gb in your server and that makes a lot more sense than the other way around. 

    One option would be to buy another similar pair of OCZ and run 4 GB in each, but that would be awfully expensive when the only benefit you would see is another GB or so of RAM (it is 32-bit, remember) and 667 to 800 memory speed on your server.  I am pretty confident that it would be difficult to come up with a situation where you would see a real-life performance boost for your dollars.
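
    The "another GB or so" remark comes from the 32-bit address limit: 2^32 bytes is 4 GiB, and part of that range is reserved for devices rather than RAM. A small sketch of the arithmetic, where the 0.75 GiB reserved figure is purely an assumed example and the real amount depends on the board and installed cards:

```python
ADDRESS_SPACE_GIB = 2**32 / 2**30  # a 32-bit address space spans 4 GiB


def usable_ram_gib(installed_gib: float, reserved_gib: float = 0.75) -> float:
    """RAM a plain 32-bit OS can address; reserved_gib is a hypothetical value."""
    return min(installed_gib, ADDRESS_SPACE_GIB - reserved_gib)


if __name__ == "__main__":
    for installed in (2, 4):
        print(f"{installed} GiB installed -> "
              f"{usable_ram_gib(installed):.2f} GiB usable")
```

    With these assumed numbers, going from 2 GiB to 4 GiB installed gains only about 1.25 GiB of usable memory.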

    PS I wish I had your budget for toys .. you picked some really, really nice stuff!!!  Just what I would want if I were to buy it myself.

     

    Thursday, March 22, 2007 4:49 PM
  • Yeah I know :) Well, there is not that much budget, to be honest. I did carve out some resources from other places to be able to put this all together, and these two machines will have to last for the long run. I mean, at least 3 years, 5 is preferable, and if they reach some 7 I would be happy. It's hard to predict how far they can go, but given that 64-bit is still the Holy Grail that might not become feasible in the short term, and these babies are up for Vista and a bit beyond, I am sure it will pay off in the long run. That is also why I don't go for slightly cheaper mobos with fancier overclocking capabilities to try to squeeze out more for the buck. Not that there is anything wrong with that, but I prefer rock solid Intel, a set it up and forget about it kind of thing.

    So you think the OCZ would run OK in the 975 mobo @ 800, huh? I can totally swap them and go back to Microcenter and get another pair of OCZs for the desktop. I think they were $189, which is not a bad deal for this memory, is it?

    I confess I used to enjoy moving parts around and doing this years ago, but nowadays it is painful to pop open the hood and keep moving parts around. I get no kick out of it anymore.

    If you are already impressed with these parts, I better not tell you about the new monitor I just got to run with the 8800GTS... It came in yesterday. So sweet, and I still haven't moved the other one along, although I have a committed buyer who is helping me sponsor the upgrade. I would post a pic here if it was possible. Fortunately it is not.

    Thursday, March 22, 2007 9:35 PM
  • Latest developments... I swapped the 4 gig of Kingston mem sticks that were in the 975BX and placed them in the 965OT, and put the 2 gig of OCZ that were in the server (965OT) in the main PC (975BX). And started testing...

    On the server I am using memtest86, since the Windows Memory Diagnostic didn't seem to support more than 4 gigs, and on the main PC I am also running memtest86. Both have been running @ 800 MHz since 9 PM last night, so 12 hours into testing both machines seem to be holding up fine at top memory speed. Wee wee, that is great news. Later tonight, or early tomorrow morning, I will flip both machines back to their OS (WHS and Vista Ultimate) and will play with them for a few hours. If everyone remains stable and happy, I will have to go get 2 gig more of the OCZ for the bad boy. Then the machines will be done done done. I mean, we can always add an additional hard drive here and there, but aside from the growing storage need, they are where they need to be.

    That is, provided that the PCI HD cable TV capture card, for "TiVo"-style recording using the Media Center Edition, doesn't come out soon. The OCUR, I think that's what they are calling these new babies, is still pretty hard to come by. Then I will be done (until the next big thing comes out!!!!)

    Friday, March 23, 2007 2:45 PM