WHS Having Periods Of Slowness?

  • Question

  • In the console, when I right-click a drive and choose "Details", the details dialog takes 20-40+ seconds to come up. It used to take under 2 seconds.

    I was doing that to check drive temps, which seem to be in the "OK" range of 35-45 °C.

    Reason I was checking drive temps was that one of the services on the system (SageTV) has started having episodes of very slow performance - as in video freezing and having to kill the client.

    Could the very long times to bring up "Details" dialogs be a clue to the slow performance?

    TaskMan shows something called "QSM" taking close to 50% of CPU, but only briefly/intermittently.

    I added a few columns to TaskMan's display, but don't know what to make of them - except that there seem to be a lot of "Page Faults".

    viz: http://tinyurl.com/2cbsrwq

    Could the page faults be a clue? From what little I can glean from reading, it sounds like a page fault means there was insufficient physical memory to complete an operation - but the WHS box has 4 GB and is only using 1.6 GB.

    viz: http://tinyurl.com/27a33qn

     

    EDIT 2010 08-01 14:16:

    I re-booted the server and now those "Details" dialogs are opening in under two seconds.

    But I am still seeing rather high CPU usage - but nothing in the "Processes" pane seems to explain it.

    CPU E.G.

    Processes E.G.  (view sorted by CPU, descending)

     

    EDIT 2010 08-01 21:30:

    I fired up an RDP into the WHS box, watching TaskMan | Performance.   It never got below 49% CPU and had spikes to 70-90+ percent.

    All the time I was watching, I could not see anything in Processes that was taking more than about three percent - except, of course, for System Idle, which was mostly high 90's.

    One by one, I killed off everything that was not plain-vanilla WHS, but there seemed to be no change in CPU activity.

    Unencumbered by any real knowledge, I get the impression that whatever is doing this is hiding in System Idle. How else could Performance be showing 50+% CPU usage while System Idle was in the high 90's?

    All this is based on the assumption that, just sitting there, with nobody outside the box doing any file activity at all (all other PCs on the LAN are turned off or asleep) a constant 50+% CPU activity is anomalous.

     

    EDIT 2010 08-01 22:35

    I've re-booted two more times.  Everything is running (i.e. I haven't killed anything).

    Now Performance is reading mostly 0-10% CPU with occasional spikes to 49.

    I'm *really* puzzled now...

     

    EDIT 2010 08-02 12:23

    Last night, looking good.

    This morning: back to the 50% thing jumping much higher from time-to-time.  

    Right now, it's behaving itself - more or less: 0-3% CPU with occasional spikes into the teens and spikes to 50 every minute or so.

    To reiterate, this situation is NOT the problem. The problem comes when CPU becomes 50+ percent on a more-or-less continuous basis with spikes up to 100.

    Can anybody suggest a process that could be causing those 50% spikes - yet not appearing in Task Manager | Processes?

     

    Sunday, August 1, 2010 5:01 PM

Answers

  • Please check to see if one or more of your drives is running in PIO mode. If so, run chkdsk on all the drives in your server, and if no errors are reported, set the offending drive(s) back to UDMA mode (see e.g. this KB article for more). If errors are reported by chkdsk, you may have a failing hard drive. In that case, you can set the drive back to UDMA as above, but you should monitor your server for a recurrence of the problem.

    As for changing the schedule on the migrator service, you can submit a product suggestion on Connect. However, Microsoft has chosen to prioritize protecting users' data very highly, and because of the target audience has chosen not to include an interface for manipulating how/when storage balancing occurs. I doubt that this will change.


    I'm not on the WHS team, I just post a lot. :)
    • Proposed as answer by kariya21 (Moderator), Wednesday, September 8, 2010 4:13 AM
    • Marked as answer by PeteCress Friday, December 10, 2010 12:28 AM
    Tuesday, August 3, 2010 12:58 PM
    Moderator

All replies

  • Well...

    From here it's hard to tell what is going on.
    In most cases high CPU usage is caused by some service or process running (like the indexing service). But Task Manager shows this is not the case.

    High CPU *spikes* could indicate a driver or hardware problem, but based on your CPU graphs you can rule this out (for now).
    The first thing I would check is the disk access mode. In Device Manager, check whether the drives are running in UDMA mode and not PIO.
    Next, check the system event log for anything suspicious (warnings/errors).

    Apart from this, I would first leave the system for at least 24 hours to see if it settles down.

    - Theo.


    No home server like Home Server
    Monday, August 2, 2010 9:12 PM
    Moderator

  • Sounds like the spikes are a red herring.

    I've watched more episodes of sustained high CPU and the cause seems to be something called "DeMigrator.exe".

    Read a few threads on demigrator, and it sounds like a problem child that MS is ignoring.

    From what I can glean, there needs to be a scheduling facility added so that demigrator can be told to only do its thing during certain hours (like 2 AM to 5 AM) and not while people are using the server to watch movies and such.

    Monday, August 2, 2010 11:22 PM
  • Please check to see if one or more of your drives is running in PIO mode. If so, run chkdsk on all the drives in your server, and if no errors are reported, set the offending drive(s) back to UDMA mode (see e.g. this KB article for more).

     

    I don't know how to tell which "IDE ATA/ATAPI controllers" instance a given drive is connected to, but all 3 of the Primary IDE Channel entries and all 3 of the Secondaries are set to "DMA if available", and every one that shows a transfer mode is Ultra DMA Mode 5 except one, which is Ultra DMA Mode 6.

    My drive array is all connected to SATA ports on the mobo, though, and neither of the ATA Storage Controller devices' properties shows such information.
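
    Readings like the ones above are easy to misread when there are six channels to click through, so one option is to jot them down and scan them mechanically. What follows is only a sketch: the channel names and the `readings` dictionary are made up for illustration (the third entry is invented to show what a demoted channel would look like), and the mode strings are assumed to match what the "Current Transfer Mode" field displays.

```python
# Hypothetical readings copied by hand from Device Manager
# (IDE channel -> "Current Transfer Mode" of each device on it).
readings = {
    "Primary IDE Channel #1": ["Ultra DMA Mode 5", "Not Applicable"],
    "Secondary IDE Channel #1": ["Ultra DMA Mode 6", "Not Applicable"],
    "Primary IDE Channel #2": ["PIO Mode", "Not Applicable"],  # invented example
}


def channels_in_pio(readings):
    """Return the channels that have at least one device running in PIO."""
    flagged = []
    for channel, modes in readings.items():
        if any(mode.strip().lower().startswith("pio") for mode in modes):
            flagged.append(channel)
    return flagged


print(channels_in_pio(readings))
```

    Any channel the function returns is the one to look at first; an empty list means every recorded device is still in a DMA mode.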

    I *think* I ran ChkDsk against all drives - assuming that "D" nails the drive pool.

    [CODE]
    Microsoft Windows [Version 5.2.3790]
    (C) Copyright 1985-2003 Microsoft Corp.

    c:\>chkdsk
    The type of the file system is NTFS.
    Volume label is SYS.

    WARNING!  F parameter not specified.
    Running CHKDSK in read-only mode.

    CHKDSK is verifying files (stage 1 of 3)...
    318816 file records processed.
    File verification completed.
    199 large file records processed.
    0 bad file records processed.
    0 EA records processed.
    8 reparse records processed.
    CHKDSK is verifying indexes (stage 2 of 3)...
    741575 index entries processed.
    Index verification completed.
    5 unindexed files processed.
    CHKDSK is verifying security descriptors (stage 3 of 3)...
    318816 security descriptors processed.
    Security descriptor verification completed.
    10094 data files processed.
    CHKDSK is verifying Usn Journal...
    35374544 USN bytes processed.
    Usn Journal verification completed.

      20972825 KB total disk space.
      16308968 KB in 34935 files.
         31796 KB in 10095 indexes.
             0 KB in bad sectors.
        420905 KB in use by the system.
         65536 KB occupied by the log file.
       4211156 KB available on disk.

          4096 bytes in each allocation unit.
       5243206 total allocation units on disk.
       1052789 allocation units available on disk.

    c:\>d:

    D:\>chkdsk
    The type of the file system is NTFS.
    Volume label is DATA.

    WARNING!  F parameter not specified.
    Running CHKDSK in read-only mode.

    CHKDSK is verifying files (stage 1 of 3)...
    366336 file records processed.
    File verification completed.
    85 large file records processed.
    0 bad file records processed.
    0 EA records processed.
    163646 reparse records processed.
    CHKDSK is verifying indexes (stage 2 of 3)...
    1114830 index entries processed.
    Index verification completed.
    5 unindexed files processed.
    CHKDSK is verifying security descriptors (stage 3 of 3)...
    366336 security descriptors processed.
    Security descriptor verification completed.
    16324 data files processed.
    CHKDSK is verifying Usn Journal...
    34716032 USN bytes processed.
    Usn Journal verification completed.

     291587782 KB total disk space.
     273420936 KB in 198230 files.
         92924 KB in 16325 indexes.
             0 KB in bad sectors.
        475634 KB in use by the system.
         65536 KB occupied by the log file.
      17598288 KB available on disk.

          4096 bytes in each allocation unit.
      72896945 total allocation units on disk.
       4399572 allocation units available on disk.

    D:\>ipconfig /all

    Windows IP Configuration

       Host Name . . . . . . . . . . . . : sage
       Primary Dns Suffix  . . . . . . . :
       Node Type . . . . . . . . . . . . : Unknown
       IP Routing Enabled. . . . . . . . : No
       WINS Proxy Enabled. . . . . . . . : No

    Ethernet adapter Local Area Connection 2:

       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Realtek RTL8169 Gigabit Ethernet Adapter
       Physical Address. . . . . . . . . : 00-A1-B0-80-32-84
       DHCP Enabled. . . . . . . . . . . : No
       IP Address. . . . . . . . . . . . : 192.169.1.112
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . :

    Ethernet adapter Local Area Connection:

       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Realtek RTL8168C(P)/8111C(P) PCI-E Gigabit Ethernet NIC
       Physical Address. . . . . . . . . : 00-23-54-79-39-A8
       DHCP Enabled. . . . . . . . . . . : Yes
       Autoconfiguration Enabled . . . . : Yes
       IP Address. . . . . . . . . . . . : 192.168.0.102
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . : 192.168.0.1
       DHCP Server . . . . . . . . . . . : 192.168.0.1
       DNS Servers . . . . . . . . . . . : 71.242.0.12
                                           71.250.0.12
       Lease Obtained. . . . . . . . . . : Tuesday, August 03, 2010 11:42:47 AM
       Lease Expires . . . . . . . . . . : Wednesday, August 04, 2010 11:42:47 AM

    D:\>
    [/CODE]

    Tuesday, August 3, 2010 8:21 PM
  • According to what you describe, all drives are running in UDMA mode. So there is no problem.

    Chkdsk output for D: looks OK, but the check is not complete.
    Please see this FAQ from Ken on how to check all drives in your server for errors.
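
    To illustrate the point that running chkdsk on C: and D: alone does not cover every drive, here is a rough sketch of looping one check over a list of volumes. It is not the procedure from the FAQ: the `VOLUMES` list is a placeholder that would have to be extended to every drive in the pool, the helper names are made up, and it defaults to a dry run that only prints the commands. The only chkdsk switch used is `/F` (fix errors); without it chkdsk runs read-only, as in the output pasted earlier.

```python
import subprocess

# Placeholder list of volumes; on a real server this would have to include
# every drive in the pool, not just C: and D:.
VOLUMES = ["C:", "D:"]


def chkdsk_commands(volumes, fix=False):
    """Build one chkdsk command line per volume (read-only unless fix=True)."""
    commands = []
    for volume in volumes:
        command = ["chkdsk", volume]
        if fix:
            command.append("/F")  # /F asks chkdsk to fix the errors it finds
        commands.append(command)
    return commands


def run_all(volumes, fix=False, dry_run=True):
    """Print each command; actually run it only when dry_run is False."""
    for command in chkdsk_commands(volumes, fix):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=False)


run_all(VOLUMES)  # dry run: prints the two commands without executing them
```

    Keeping the dry run as the default is deliberate: a `/F` pass can dismount a volume or schedule a check at next boot, which is not something to trigger by accident on a server that is in use.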

    - Theo.


    No home server like Home Server
    Tuesday, August 3, 2010 9:17 PM
    Moderator
  • It's Demigrator.

    Here's why I say that:

    - Fire up the app that renders my video (in this case a ripped movie)

    - Boot up my little net book.

    - Open up an RDP window into the server

    - Open up TaskMan

    - Click Processes tab, and find Demigrator in the alphabetic listing

    - Watch the movie until it freezes

    - Use TaskMan to kill the Demigrator process.

    - Movie unfreezes instantly - click Ok on the kill, movie resumes.  Understood that there has to be a pause/delay, but it's totally imperceptible.

     

    I've done the kill thing at least 20 times now and it's been exactly the same each time.

    Only thing different is that sometimes TaskMan shows Demigrator taking zero CPU (it's usually 48-50%).

    Somebody in another thread ventured that it's not the CPU cycles Demigrator is using, but the disc activity, that freezes up the video playback.

    But killing Demigrator *always* unfroze the movie.... *always* and instantly.

    Of course Demigrator starts up again within 10-12 seconds.... -)

     

    My current theory is that this follows deleting a lot of recorded TV shows - like 3 one-hour episodes of The PBS News Hour and a few other shows.

    I'll test this theory over the next few days.   Understood that correlation is not causation... but the instant un-freezing when Demigrator is killed over and over again seems to implicate it pretty thoroughly.

     

    The remaining question seems to be: "Is Demigrator's interference with video playback just Demigrator? Or is it a disc problem that manifests itself when Demigrator runs?"

    My reading of other threads on the subject leaves me with the impression that it's just Demigrator - but then again, maybe the authors of those threads had disc problems and didn't know it....

     

    Going back to one of your other replies, what is it about Demigrator's running that is relevant to data integrity?   Performance, I can see.    Efficient disc usage, I can see.   But integrity suggests I don't really know what Demigrator does.   I had assumed that it just shuffles things around to balance out the load on different physical discs.

    Thursday, August 5, 2010 1:22 AM
  • Please check to see if one or more of your drives is running in PIO mode. 

     Bingo!!!

    Took me a while to figure out how to do that.... and now I'm in the process of doping out which drive is connected to that IDE channel... but that was the immediate problem - the root problem seeming to be a drive or controller on its way out.

    Did the workaround where one uninstalls the problem IDE Channel's driver and then lets Windows find the channel and re-install the driver next boot - whereupon the access mode is re-initialized to DMA.

    For anybody wondering: it turns out that once Windows (Server 2003?) detects six CRC errors on a drive, it automagically demotes that drive from DMA (fast) access to PIO (Programmed I/O - sloooow) access.
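
    The error-count rule can be pictured with a toy model. This is a sketch under loud assumptions, not a description of what the Windows driver actually does: the mode ladder is hypothetical, and the idea that the drive drops one rung for every six cumulative errors is a simplification chosen for illustration. Only the threshold of six comes from the paragraph above.

```python
# Hypothetical ladder of transfer modes, fastest first (an assumption).
MODE_LADDER = ["Ultra DMA Mode 5", "Ultra DMA Mode 4", "Ultra DMA Mode 2", "PIO Mode"]
ERROR_THRESHOLD = 6  # the "six CRC errors" figure quoted above


def mode_after_errors(error_count, ladder=MODE_LADDER, threshold=ERROR_THRESHOLD):
    """Toy model: drop one rung for every `threshold` cumulative errors."""
    steps_down = error_count // threshold
    return ladder[min(steps_down, len(ladder) - 1)]


for errors in (0, 5, 6, 18, 100):
    print(errors, mode_after_errors(errors))
```

    Whatever the real stepping rule is, the practical point is the same: the downgrade sticks once it has happened, which is why the access mode has to be reset by hand after the underlying errors have been dealt with.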

    Pretty early on, I started to suspect that TaskMan was hiding *something* in System Idle, but didn't know what to do about it and mistakenly focused on what I could see - which was DeMigrator.exe

    Then somebody told me about a freebie replacement for Task Manager called "Process Explorer".    As soon as I fired it up, I could see that "Interrupts" was hiding within System Idle Process and kicking the brains out of my CPU.
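
    The mismatch that gave it away comes down to simple arithmetic: if the Performance tab reports the CPU as busy but the listed processes do not add up to that figure, the remainder is being spent somewhere the Processes tab does not show. Here is a minimal sketch of that calculation; the function name and the snapshot numbers are made up for illustration, and System Idle must be left out of the per-process figures.

```python
def unattributed_cpu(total_busy_percent, per_process_percent):
    """CPU time that no listed process accounts for (never below zero)."""
    attributed = sum(per_process_percent.values())
    return max(0.0, total_busy_percent - attributed)


# Hypothetical snapshot: the Performance tab shows 50% busy, yet the busiest
# ordinary processes add up to only a few percent (System Idle excluded).
snapshot = {"QSM": 2.0, "DeMigrator.exe": 1.0, "taskmgr.exe": 1.0}
print(unattributed_cpu(50.0, snapshot))  # 46.0
```

    A large leftover like that is the cue to reach for a tool that breaks out interrupt time separately, which is what Process Explorer did here.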

    From there, it was just a few Google searches to discover that a common cause of this was a drive's access mode being downgraded from DMA to PIO and a couple of proposed fixes/workarounds.... and Theo and Ken's recommendation suddenly became clear.

    This is in spite of my statement to the contrary on August 3.   I must have missed the PIO at that time.  From my years as a mainframe User Support guy: "Never trust a user".

    I wimped on the hotfix, but did the driver re-install thing and my SageTV rendering problems, generally awful response, and various other flaky behaviours went away immediately.

     

    It's held up overnight, but I still have a bad drive to hunt down.    Overnight somebody somewhere threw a CRC error.   WHS caught it and reported the file name - but, of course, did not identify the drive.   I guess if I Google a little, I'll find out how to backtrack from the file name to the drive...  Even as I write that, I'm getting a faint glimmer of something like "Tombstone Files"

    There is a manual process to get back to a drive's SN or GUID from the IDE Channel it is connected to but I'm currently trying to locate a little utility that will do that for me - both because I'm a little lazy and not the brightest bulb on the tree and because I want to build a toolkit/procedure for dealing with repetitions of the DMA==>PIO problem.

     

    Needless to say, Process Explorer is the first part of that toolkit.   I guess ChkDsk is the second - although I'm currently scanning each drive using something called HD Tune to check for bad sectors.... dunno if it does the other stuff that ChkDsk does... probably not.

     

    Wednesday, September 8, 2010 1:01 AM