application crashing ( Fault Module Name:ntdll.dll, exit code 0xc0000374 )

  • Question

  • Hi

    My application runs when I execute it with 1 process, but when I run it with 2 processes using $ mpiexec -np 2 flash3.exe it crashes with the following message:

     

    --------------

     

    Problem signature:

      Problem Event Name: APPCRASH

      Application Name: flash3.exe

      Application Version: 0.0.0.0

      Application Timestamp: 4c91b3c9

      Fault Module Name: ntdll.dll

      Fault Module Version: 6.0.6002.18005

      Fault Module Timestamp: 49e0421d

      Exception Code: c000007b

      Exception Offset: 00000000000b8fb8

      OS Version: 6.0.6002.2.2.0.272.18

      Locale ID: 2057

      Additional Information 1: fa3e

      Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900

      Additional Information 3: fa3e

      Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900

    ----------------

    Later it gives this message:

     

    --------------

    job aborted:
    [ranks] message

    [0] terminated

    [1] process exited without calling finalize

    ---- error analysis -----

    [1] on Head
    ./flash3 ended prematurely and may have crashed. exit code 0xc0000374

    ---- error analysis -----
    ---------------------

    I am working on Windows Server 2008, HPC Edition. Kindly let me know what is going wrong and how I can resolve it. Is there some issue with ntdll.dll on this OS, or is it my application?

    Thanks & Regards,
    Kunal

     

    Thursday, September 16, 2010 12:06 AM

Answers

  • Hi Kunal,

    Exit code 0xc0000374 is STATUS_HEAP_CORRUPTION, which means the process heap was corrupted by an invalid memory access (for example, a write past the end of an allocated buffer). Also, your app passes with a single rank but fails with 2 ranks, which makes me wonder whether the problem is MPI-related. Could you find the debugger for the compiler you used to compile your app?
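For reference, the codes quoted in this thread can be decoded with a small lookup. This is only a sketch: the table holds a handful of well-known NTSTATUS values, and the helper name `describe_ntstatus` is made up for illustration.

```python
# A few well-known NTSTATUS values (as defined in the Windows SDK header ntstatus.h).
# The table is deliberately tiny; anything else is reported as "unknown".
NTSTATUS_NAMES = {
    0x80000003: "STATUS_BREAKPOINT",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000007B: "STATUS_INVALID_IMAGE_FORMAT",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}

# The top two bits of an NTSTATUS encode its severity.
SEVERITIES = ("success", "informational", "warning", "error")


def describe_ntstatus(code):
    """Return a readable description for an NTSTATUS given as int or hex string."""
    if isinstance(code, str):
        code = int(code, 16)  # accepts both "c0000374" and "0xc0000374"
    code &= 0xFFFFFFFF
    name = NTSTATUS_NAMES.get(code, "unknown")
    return f"0x{code:08X} {name} (severity: {SEVERITIES[code >> 30]})"


if __name__ == "__main__":
    # The exit code reported by mpiexec and the exception code from the first crash report.
    for raw in ("0xc0000374", "c000007b"):
        print(describe_ntstatus(raw))
```

The first crash report in the question shows exception code c000007b while mpiexec reports exit code 0xc0000374, so it may be worth decoding both.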

    Thanks,

    James

    Friday, October 8, 2010 4:55 AM

All replies

  • Hello Kunal,

    If possible, could you reduce your source code to a small example and post it here for debugging? It is hard to tell what's wrong from the error message alone.

    Thanks,

    James

    Thursday, September 16, 2010 7:20 PM
  • Hi James

    Thanks for your reply. The source code is too large to post, and I cannot point to any specific file that could be the source of the error. I compiled the application again, and when I executed it with 2 processes, this time I got this error:

     

    ------------------

    Problem signature:
      Problem Event Name: APPCRASH
      Application Name: flash3.exe
      Application Version: 0.0.0.0
      Application Timestamp: 4c94f5ff
      Fault Module Name: StackHash_8f98
      Fault Module Version: 6.0.6002.18005
      Fault Module Timestamp: 49e0421d
      Exception Code: c0000374
      Exception Offset: 00000000000aef37
      OS Version: 6.0.6002.2.2.0.272.18
      Locale ID: 2057
      Additional Information 1: 8f98
      Additional Information 2: 3926b45e3f7f9075e413d7f43231ac3c
      Additional Information 3: e6c5
      Additional Information 4: 0b6cf2f93119e73fa1670cd3360652c4

    -------------------

    and later:

    --------------

    job aborted:
    [ranks] message

    [0] terminated

    [1] process exited without calling finalize

    ---- error analysis -----

    [1] on Head
    ./flash3 ended prematurely and may have crashed. exit code 0xc0000374

    ---- error analysis -----
    ---------------------

    Sorry, I am not sure if there is any specific portion of the code that is causing this error. Could it be some issue with the Windows DLLs that are being loaded?

    Thanks & Regards,
    Kunal


    Saturday, September 18, 2010 5:34 PM
  • I used Application Verifier to check what is going on. I selected my executable and ran it with 2 processes. This is what I got in the log for process 1:

    ------------------------

    <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
    <avrf:logfile xmlns:avrf="Application Verifier">
      <avrf:logSession TimeStarted="2010-09-18 : 23:47:28" PID="4868" Version="2">
        <avrf:logEntry Time="2010-09-18 : 23:47:29" LayerName="Heaps" StopCode="0x13" Severity="Error">
          <avrf:message>First chance access violation for current stack trace.</avrf:message>
          <avrf:parameter1>849e0f0 - Invalid address causing the exception.</avrf:parameter1>
          <avrf:parameter2>74cae3f0 - Code address executing the invalid access.</avrf:parameter2>
          <avrf:parameter3>12f2e0 - Exception record.</avrf:parameter3>
          <avrf:parameter4>12ee10 - Context record.</avrf:parameter4>
          <avrf:stackTrace>
            <avrf:trace>vrfcore!VerifierDisableVerifier+934 ( @ 0)</avrf:trace>
            <avrf:trace>ntdll!RtlApplicationVerifierStop+d3 ( @ 0)</avrf:trace>
            <avrf:trace>vfbasics!+7fef0f26377 ( @ 0)</avrf:trace>
            <avrf:trace>vfbasics!+7fef0f27c9b ( @ 0)</avrf:trace>
            <avrf:trace>vfbasics!+7fef0f27392 ( @ 0)</avrf:trace>
            <avrf:trace>ntdll!RtlIpv4AddressToStringA+1cb ( @ 0)</avrf:trace>
            <avrf:trace>ntdll!_C_specific_handler+27d ( @ 0)</avrf:trace>
            <avrf:trace>ntdll!KiUserExceptionDispatcher+2e ( @ 0)</avrf:trace>
            <avrf:trace>MSVCR80!memcpy+250 ( @ 0)</avrf:trace>
          </avrf:stackTrace>
        </avrf:logEntry>
      </avrf:logSession>
    </avrf:logfile>

    ------------------------------------


    The job this time aborted with this message:

    ------------------------------------
    forrtl: severe (159): Program Exception - breakpoint
    Image              PC                Routine            Line        Source
    ntdll.dll          0000000076E76060  Unknown               Unknown  Unknown
    vrfcore.dll        000007FEF0FC37EE  Unknown               Unknown  Unknown
    vrfcore.dll        000007FEF0FC9970  Unknown               Unknown  Unknown
    ntdll.dll          0000000076EEC193  Unknown               Unknown  Unknown
    vfbasics.dll       000007FEF0F26377  Unknown               Unknown  Unknown
    vfbasics.dll       000007FEF0F27C9B  Unknown               Unknown  Unknown
    vfbasics.dll       000007FEF0F27392  Unknown               Unknown  Unknown
    ntdll.dll          0000000076E5396B  Unknown               Unknown  Unknown
    ntdll.dll          0000000076E69795  Unknown               Unknown  Unknown
    ntdll.dll          0000000076E76C78  Unknown               Unknown  Unknown
    MSVCR80.dll        0000000074CAE3F0  Unknown               Unknown  Unknown
    msmpi.dll          0000000068D758DD  Unknown               Unknown  Unknown
    msmpi.dll          0000000068D724A5  Unknown               Unknown  Unknown
    msmpi.dll          0000000068D6F21B  Unknown               Unknown  Unknown
    msmpi.dll          0000000068D66DD8  Unknown               Unknown  Unknown
    msmpi.dll          0000000068D1757D  Unknown               Unknown  Unknown
    msmpi.dll          0000000068D0A9F4  Unknown               Unknown  Unknown
    msmpi.dll          0000000068D0B0F5  Unknown               Unknown  Unknown
    flash3.exe         0000000140162666  Unknown               Unknown  Unknown
    flash3.exe         000000014013E814  Unknown               Unknown  Unknown
    flash3.exe         0000000140032553  Unknown               Unknown  Unknown
    flash3.exe         0000000140032CF2  Unknown               Unknown  Unknown
    flash3.exe         0000000140004E6F  Unknown               Unknown  Unknown
    flash3.exe         000000014000CFA1  Unknown               Unknown  Unknown
    flash3.exe         00000001401FD08C  Unknown               Unknown  Unknown
    flash3.exe         00000001401F874A  Unknown               Unknown  Unknown
    kernel32.dll       0000000076C4BE3D  Unknown               Unknown  Unknown
    ntdll.dll          0000000076E56A51  Unknown               Unknown  Unknown

    job aborted:
    [ranks] message

    [0] process exited without calling finalize

    [1] terminated

    ---- error analysis -----

    [0] on WIN-MN7DR40J561
    ./flash3 ended prematurely and may have crashed. exit code 159

    ---- error analysis -----

    ---------------------------------------

    Can we get any hints from this?

    Thanks & Regards,
    Kunal

    Saturday, September 18, 2010 10:56 PM
  • Hi Kunal,

    It is hard to tell what's going wrong from the error message alone. It shows the call stack but not which frame is at fault; it could be your app or it could be MSMPI. I suggest trying the MSMPI debugger to debug your program. You can get it in 2 ways:

    1) In Visual Studio 2010, the MSMPI debugger is built in.

    2) In Visual Studio 2008, you can download the add-in from the HPC web site; see http://msdn.microsoft.com/en-us/library/dd560808.aspx
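Even without a debugger, the two logs posted earlier can be cross-checked. The Application Verifier entry gives 74cae3f0 as the "code address executing the invalid access" and names the innermost frame as MSVCR80!memcpy; the same address appears as the MSVCR80.dll frame in the forrtl traceback, directly above the msmpi.dll and flash3.exe frames. The Python sketch below does that check on a few frames copied from the traceback. The helper names are made up for illustration, and the parser assumes each frame line starts with a module name followed by a hexadecimal PC.

```python
from itertools import groupby

# A subset of frames copied from the forrtl traceback in this thread.
TRACEBACK = """\
Image              PC                Routine            Line        Source
ntdll.dll          0000000076E76C78  Unknown               Unknown  Unknown
MSVCR80.dll        0000000074CAE3F0  Unknown               Unknown  Unknown
msmpi.dll          0000000068D758DD  Unknown               Unknown  Unknown
msmpi.dll          0000000068D0B0F5  Unknown               Unknown  Unknown
flash3.exe         0000000140162666  Unknown               Unknown  Unknown
flash3.exe         000000014013E814  Unknown               Unknown  Unknown
kernel32.dll       0000000076C4BE3D  Unknown               Unknown  Unknown
"""


def parse_frames(text):
    """Return (module, pc) pairs, innermost frame first; skip non-frame lines."""
    frames = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            frames.append((parts[0], int(parts[1], 16)))
        except ValueError:
            continue  # header line or other text whose second field is not hex
    return frames


def module_chain(frames):
    """Collapse consecutive frames from the same module into one entry."""
    return [module for module, _ in groupby(frames, key=lambda frame: frame[0])]


def module_at(frames, address):
    """Return the module of the frame whose PC equals address, or None."""
    return next((module for module, pc in frames if pc == address), None)


if __name__ == "__main__":
    frames = parse_frames(TRACEBACK)
    print(" <- ".join(module_chain(frames)))
    # Address reported by Application Verifier as executing the invalid access.
    print(module_at(frames, 0x74CAE3F0))
```

Read this way, the invalid access happened in a memcpy that msmpi.dll performed during an MPI call made by flash3.exe. That narrows down where it happened, but it does not show whether the buffer or count passed to that call was wrong or whether the fault lies elsewhere.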

    Hope it helps,

    Thanks,

    Wednesday, September 22, 2010 9:50 PM
  • Hi James,

     May I know why you think it could be MSMPI? I am compiling and executing my application from the Cygwin environment.

     So I don't think I can use the MSMPI debugger, as I am not using Visual Studio. Any other suggestions?

     Could it be that some Windows DLL or msmpi.dll is faulty? In that case, should I get a fresh copy of Windows and/or install the latest version of the MS HPC Pack 2008 SDK?

    Thanks and Regards,

    Kunal

    Thursday, September 23, 2010 5:19 AM