locked
Unknown Hardware Interrupts clogs processor

  • Question

  • Hello,

    I have a TranquilPC WHS with 2x Intel Atom 330 processors (4 processor cores in all), 2 GB RAM, and a Gb/s NIC.

    Comparing with my friends, who also have WHS machines, I realized that my server performs remarkably worse. I get only 3 MB/s of network throughput on a Gb/s network, and other activities, such as unRARing a file or running the standard WHS nightly backups of my clients, take significantly longer than expected. My friends get much better performance doing the same things, and they can even do them simultaneously, something I can only dream of.

    At first I suspected this to be a network problem, so I bought a separate fast switch and new cat6 ethernet cables, but to no avail. Then I started to investigate the WHS and found symptoms of problems.

    Looking at the Task Manager in Windows while doing a file transfer, I noticed that one of the four "processors" was constantly at 100% workload. This is unusual, and it looks the same when doing other things as well, i.e. anything I could think of except letting the server sit idle.

    I downloaded the "Process Explorer" tool from Microsoft Sysinternals (http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx) which gives a more detailed view of the running processes. I now found out that the problem seems to be Hardware Interrupts!

    Twenty five percent of the whole processor (i.e. 100% of one of the processor cores) is busy with Hardware Interrupts whenever I do anything!

    I have tried different things to solve this issue to no avail:

    - Checking the Device Manager and the Event Log in Windows. Neither reports anything unusual.
    - Upgrading the BIOS to the latest version.
    - Loading the "Optimized defaults" in BIOS.
    - Upgrading the NIC drivers and the motherboard inf files in Windows to their latest versions.
    - Disabling the NIC in BIOS, along with as many other things as possible. It turns out I get that 25% Hardware Interrupts even immediately after boot, and it remains so until all programs are loaded and the computer is idle.
    - Trying the KrView tool (http://www.microsoft.com/whdc/system/sysperf/krview.mspx). I cannot really do anything with the results it gave me. Maybe you can, see below.

    I have searched the Internet quite a bit, but cannot find anything that seems perfectly relevant to my problem.

    Looking forward to your help,
    Fred

    ------------ KrView log follows:
    C:\Program Files\KrView\Kernrates>Kernrate_i386_XP.exe
     /==============================\
    <         KERNRATE LOG           >
     \==============================/
    Date: 2009/09/19   Time:  9:55:21
    Machine Name: SERVER
    Number of Processors: 4
    PROCESSOR_ARCHITECTURE: x86
    PROCESSOR_LEVEL: 6
    PROCESSOR_REVISION: 1c02
    Physical Memory: 2038 MB
    Pagefile Total: 3936 MB
    Virtual Total: 2047 MB
    PageFile1: \??\C:\pagefile.sys, 2046MB
    OS Version: 5.2 Build 3790 Service-Pack: 2.0
    WinDir: C:\WINDOWS

    Kernrate User-Specified Command Line:
    Kernrate_i386_XP.exe


    Kernel Profile (PID = 0): Source= Time,
    Using Kernrate Default Rate of 25000 events/hit
    Starting to collect profile data

    ***> Press ctrl-c to finish collecting profile data
    ===> Finished Collecting Data, Starting to Process Results

    ------------Overall Summary:--------------

    P0     K 0:00:12.843 (99.3%)  U 0:00:00.000 ( 0.0%)  I 0:00:00.093 ( 0.7%)  DPC 0:00:00.562 ( 4.3%)  Interrupt 0:00:12.250 (94.7%)
           Interrupts= 11832, Interrupt Rate= 915/sec.

    P1     K 0:00:00.031 ( 0.2%)  U 0:00:00.000 ( 0.0%)  I 0:00:12.906 (99.8%)  DPC 0:00:00.015 ( 0.1%)  Interrupt 0:00:00.015 ( 0.1%)
           Interrupts= 6004, Interrupt Rate= 464/sec.

    P2     K 0:00:00.312 ( 2.4%)  U 0:00:00.015 ( 0.1%)  I 0:00:12.609 (97.5%)  DPC 0:00:00.000 ( 0.0%)  Interrupt 0:00:00.015 ( 0.1%)
           Interrupts= 6003, Interrupt Rate= 464/sec.

    P3     K 0:00:00.078 ( 0.6%)  U 0:00:00.000 ( 0.0%)  I 0:00:12.859 (99.4%)  DPC 0:00:00.015 ( 0.1%)  Interrupt 0:00:00.046 ( 0.4%)
           Interrupts= 6004, Interrupt Rate= 464/sec.

    TOTAL  K 0:00:13.265 (25.6%)  U 0:00:00.015 ( 0.0%)  I 0:00:38.468 (74.3%)  DPC 0:00:00.593 ( 1.1%)  Interrupt 0:00:12.328 (23.8%)
           Total Interrupts= 29843, Total Interrupt Rate= 2307/sec.


    Total Profile Time = 12937 msec

                                           BytesStart          BytesStop         BytesDiff.
        Available Physical Memory   ,      1602830336,      1601867776,         -962560
        Available Pagefile(s)       ,      3766513664,      3765891072,         -622592
        Available Virtual           ,      2131640320,      2130591744,        -1048576
        Available Extended Virtual  ,               0,               0,               0

                                      Total      Avg. Rate
        Context Switches     ,         6384,         493/sec.
        System Calls         ,         3590,         277/sec.
        Page Faults          ,        10078,         779/sec.
        I/O Read Operations  ,           60,         5/sec.
        I/O Write Operations ,           28,         2/sec.
        I/O Other Operations ,          594,         46/sec.
        I/O Read Bytes       ,        10151,         169/ I/O
        I/O Write Bytes      ,        34350,         1227/ I/O
        I/O Other Bytes      ,     36725094,         61827/ I/O

    -----------------------------

    Results for Kernel Mode:
    -----------------------------

    OutputResults: KernelModuleCount = 98
    Percentage in the following table is based on the Total Hits for the Kernel

    Time   20687 hits, 25000 events per hit --------
     Module                                Hits   msec  %Total  Events/Sec
    intelppm                              15435      12937    74 %    29827239
    hal                                    5023      12937    24 %     9706655
    ntkrnlpa                                 80      12937     0 %      154595
    tcpip                                    70      12937     0 %      135270
    Rtenicxp                                 21      12937     0 %       40581
    atapi                                    10      12937     0 %       19324
    fltMgr                                    9      12937     0 %       17391
    NDIS                                      8      12937     0 %       15459
    Ntfs                                      8      12937     0 %       15459
    ipnat                                     7      12937     0 %       13527
    win32k                                    3      12937     0 %        5797
    netbt                                     3      12937     0 %        5797
    ipsec                                     2      12937     0 %        3864
    srv                                       1      12937     0 %        1932
    ndisuio                                   1      12937     0 %        1932
    USBPORT                                   1      12937     0 %        1932
    SiWinAcc                                  1      12937     0 %        1932
    DEfilter                                  1      12937     0 %        1932
    CLASSPNP                                  1      12937     0 %        1932
    PCIIDEX                                   1      12937     0 %        1932
    ACPI                                      1      12937     0 %        1932

    ================================= END OF RUN ==================================
    ============================== NORMAL END OF RUN ==============================

    C:\Program Files\KrView\Kernrates>

    Wednesday, October 14, 2009 7:38 PM
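
    The "25% of the whole processor" figure can be cross-checked against the Kernrate summary in the post above. The following is a minimal sketch, not part of KrView: it parses the per-processor lines (copied from the log with the wrapped continuation joined back on) and expresses interrupt time as a share of the 12937 msec profile. The function names and the regular expression are illustrative assumptions about this one log layout.

```python
import re

# Per-processor lines copied from the Kernrate "Overall Summary" in the post,
# with the wrapped "DPC ... Interrupt ..." continuation joined back on.
SUMMARY = """\
P0     K 0:00:12.843 (99.3%)  U 0:00:00.000 ( 0.0%)  I 0:00:00.093 ( 0.7%)  DPC 0:00:00.562 ( 4.3%)  Interrupt 0:00:12.250 (94.7%)
P1     K 0:00:00.031 ( 0.2%)  U 0:00:00.000 ( 0.0%)  I 0:00:12.906 (99.8%)  DPC 0:00:00.015 ( 0.1%)  Interrupt 0:00:00.015 ( 0.1%)
P2     K 0:00:00.312 ( 2.4%)  U 0:00:00.015 ( 0.1%)  I 0:00:12.609 (97.5%)  DPC 0:00:00.000 ( 0.0%)  Interrupt 0:00:00.015 ( 0.1%)
P3     K 0:00:00.078 ( 0.6%)  U 0:00:00.000 ( 0.0%)  I 0:00:12.859 (99.4%)  DPC 0:00:00.015 ( 0.1%)  Interrupt 0:00:00.046 ( 0.4%)
"""

PROFILE_MSEC = 12937  # "Total Profile Time" from the same log

# Captures the processor label and the h:mm:ss.mmm value after "Interrupt".
LINE_RE = re.compile(r"^(P\d+)\s.*Interrupt\s+(\d+):(\d+):(\d+\.\d+)", re.M)


def interrupt_seconds(summary):
    """Return {processor: seconds spent servicing interrupts}."""
    result = {}
    for cpu, hours, minutes, seconds in LINE_RE.findall(summary):
        result[cpu] = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return result


def interrupt_share(summary, profile_msec):
    """Return interrupt time as a percentage, per processor and overall."""
    per_cpu = interrupt_seconds(summary)
    wall = profile_msec / 1000.0
    shares = {cpu: 100.0 * secs / wall for cpu, secs in per_cpu.items()}
    # Overall share: total interrupt time over (wall time x processor count).
    shares["TOTAL"] = 100.0 * sum(per_cpu.values()) / (wall * len(per_cpu))
    return shares


shares = interrupt_share(SUMMARY, PROFILE_MSEC)
for cpu, pct in shares.items():
    print(f"{cpu}: {pct:.1f}% of time in interrupt handlers")
```

    With these inputs P0 comes out at about 94.7% and the overall share at about 23.8%, which agrees with the percentages Kernrate printed: one logical processor is almost entirely consumed by interrupts while the other three are idle.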

All replies

  • I don't know if it's the Atom or something else. Really, only the WHS team can analyze what it is doing... You may want to report this on Microsoft Connect.
    Wednesday, October 14, 2009 9:35 PM
  • Have you identified the source of the interrupts? I don't pretend to know how to interpret the KrView logs, but a casual glance indicates that the module "intelppm" is generating most of the kernel hits. What is this module?

    I had a similar problem with a desktop PC and it turned out in my case that the source of the interrupt storm was having a virtual network adapter (from Cisco VPN client) bound to the physical NIC. Unbinding the virtual adapter would stop the interrupts. An upgrade of the NIC driver fixed the issue.
    Wednesday, October 14, 2009 11:45 PM
  • Evaders99 and Mark Warton,

    Thank you for your answers. Good idea, I will take this to MS Connect.

    (The intelppm module is a Microsoft driver for Intel processors.)
    Thursday, October 15, 2009 6:19 AM
  • It could also be that one of your drives is running in PIO mode. Please check out this thread and this one and this one.
    • Marked as answer by Dettner Thursday, October 15, 2009 1:00 PM
    Thursday, October 15, 2009 7:23 AM
    Moderator
  • Thank you brubber,

    You were spot on! It was the Secondary IDE Channel that had reverted to PIO mode, see http://support.microsoft.com/default.aspx?scid=kb;en-us;817472.

    Regards, /Fred
    Thursday, October 15, 2009 1:01 PM
  • Glad you solved it!
    Thursday, October 15, 2009 1:16 PM
    Moderator
  • Well it was you who solved it and I am also glad. =) Thanks!!!
    Thursday, October 15, 2009 1:19 PM
  • Please bear in mind that Windows normally drops a drive back to PIO mode for a reason. That reason is repeated errors on the drive: every time a certain threshold is reached, the drive is stepped back one level until the lowest level (PIO mode) is reached. Sometimes this happens for reasons not related to hardware issues, but it's more common that it is related.
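
    The step-down behaviour described here can be pictured with a small model. This is only a sketch of the idea, not the actual Atapi.sys logic: the class name `IdeChannel`, the ladder of modes, and the threshold of six errors per step are all assumptions chosen for illustration.

```python
# Transfer modes from fastest to slowest. The real ladder depends on the
# controller and the drive, so this list is illustrative only.
MODES = ["UDMA 5", "UDMA 4", "UDMA 3", "UDMA 2", "UDMA 1", "UDMA 0", "PIO"]

# Assumed number of accumulated errors that triggers one step down.
ERROR_THRESHOLD = 6


class IdeChannel:
    """Toy model of a channel that slows down as errors accumulate."""

    def __init__(self, modes=MODES, threshold=ERROR_THRESHOLD):
        self.modes = list(modes)
        self.threshold = threshold
        self.level = 0    # index into self.modes, 0 is the fastest mode
        self.errors = 0   # errors counted since the last step down

    @property
    def mode(self):
        return self.modes[self.level]

    def record_error(self):
        """Count one timeout or CRC error and step down if the threshold is hit."""
        self.errors += 1
        at_slowest = self.level == len(self.modes) - 1
        if self.errors >= self.threshold and not at_slowest:
            self.level += 1
            self.errors = 0
        return self.mode


channel = IdeChannel()
for _ in range(ERROR_THRESHOLD):
    channel.record_error()
print(channel.mode)  # one step below the fastest mode
```

    Note that nothing in the model ever steps the channel back up, so occasional errors spread over a long time are enough to end in PIO mode. That matches the situation in this thread, where the channel stayed slow until it was reset.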

    So I would recommend running chkdsk on all the drives in your server to check for errors, even if you're in the habit of putting your server into a low power state (and thus fall under the KB article linked above). Consider it proactive error recovery. :)
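
    For a server with several drives, the check can be scripted. The sketch below is one possible approach, assuming Python is available on the machine and the script runs from an elevated (administrator) prompt. The helper names and the drive letters in the usage note are hypothetical. It calls `chkdsk` without `/F` or `/R`, so the tool only reports problems and changes nothing on disk.

```python
import platform
import subprocess


def chkdsk_command(drive):
    """Build a read-only chkdsk command line for a drive letter such as 'D:'."""
    drive = drive.strip().upper()
    if len(drive) != 2 or not drive[0].isalpha() or drive[1] != ":":
        raise ValueError(f"expected a drive letter like 'D:', got {drive!r}")
    # No /F or /R switch: chkdsk runs in read-only mode.
    return ["chkdsk", drive]


def check_drives(drives):
    """Run read-only chkdsk on each drive and return {drive: exit code}."""
    if platform.system() != "Windows":
        raise RuntimeError("chkdsk is only available on Windows")
    results = {}
    for drive in drives:
        completed = subprocess.run(
            chkdsk_command(drive), capture_output=True, text=True
        )
        # Show the report so any errors it lists are visible to the operator.
        print(completed.stdout)
        results[drive] = completed.returncode
    return results
```

    A call such as `check_drives(["C:", "D:"])` (example drive letters) returns the exit code for each drive. An exit code of 0 means chkdsk found no errors; anything else deserves a closer look at the printed report before deciding whether to rerun with repair switches.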

    I'm not on the WHS team, I just post a lot. :)
    Thursday, October 15, 2009 4:15 PM
    Moderator