Application crashing (Fault Module Name: ntdll.dll, exit code 0xc0000374)
-
16 September 2010 0:06
Hi
My application runs when I execute it with 1 process. But when I run it with 2 processes using $ mpiexec -np 2 flash3.exe, it crashes with the following message:
--------------
Problem signature:
Problem Event Name: APPCRASH
Application Name: flash3.exe
Application Version: 0.0.0.0
Application Timestamp: 4c91b3c9
Fault Module Name: ntdll.dll
Fault Module Version: 6.0.6002.18005
Fault Module Timestamp: 49e0421d
Exception Code: c000007b
Exception Offset: 00000000000b8fb8
OS Version: 6.0.6002.2.2.0.272.18
Locale ID: 2057
Additional Information 1: fa3e
Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900
Additional Information 3: fa3e
Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900
----------------
Later it gives this message:
--------------
job aborted:[ranks] message
[0] terminated
[1] process exited without calling finalize
---- error analysis -----
[1] on Head./flash3 ended prematurely and may have crashed. exit code 0xc0000374
---- error analysis --------------------------
I am working on Windows Server 2008, HPC Edition. Kindly let me know what is going wrong and how I can resolve it. Is there some issue with ntdll.dll on this OS, or is it my application?
Thanks & Regards,
Kunal
All Replies
-
16 September 2010 19:20
Hello Kunal,
If possible, could you reduce your source code and post here for debugging? It is hard to tell what's wrong from the error message itself.
Thanks,
James
-
18 September 2010 17:34
Hi James
Thanks for your reply. The source code is too large, and I cannot point to any specific file that could be the source of the error. I compiled the application again, and when I executed it with 2 processes, this time I got this error:
------------------
Problem signature:
Problem Event Name: APPCRASH
Application Name: flash3.exe
Application Version: 0.0.0.0
Application Timestamp: 4c94f5ff
Fault Module Name: StackHash_8f98
Fault Module Version: 6.0.6002.18005
Fault Module Timestamp: 49e0421d
Exception Code: c0000374
Exception Offset: 00000000000aef37
OS Version: 6.0.6002.2.2.0.272.18
Locale ID: 2057
Additional Information 1: 8f98
Additional Information 2: 3926b45e3f7f9075e413d7f43231ac3c
Additional Information 3: e6c5
Additional Information 4: 0b6cf2f93119e73fa1670cd3360652c4
-------------------
and later:
--------------
job aborted:[ranks] message
[0] terminated
[1] process exited without calling finalize
---- error analysis -----
[1] on Head./flash3 ended prematurely and may have crashed. exit code 0xc0000374
---- error analysis --------------------------
Sorry, I am not sure whether any specific portion of the code is causing this error. Could it be some issue with the Windows DLLs that are being loaded?
Thanks & Regards,
Kunal
-
18 September 2010 22:56
I used Application Verifier to check what is going on. I selected my executable and ran it with 2 processes. This is what I got in the log for process 1:
------------------------
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<avrf:message>First chance access violation for current stack trace.</avrf:message>
<avrf:parameter1>849e0f0 - Invalid address causing the exception.</avrf:parameter1>
<avrf:parameter2>74cae3f0 - Code address executing the invalid access.</avrf:parameter2>
<avrf:parameter3>12f2e0 - Exception record.</avrf:parameter3>
<avrf:parameter4>12ee10 - Context record.</avrf:parameter4>
<avrf:trace>vrfcore!VerifierDisableVerifier+934 ( @ 0)</avrf:trace>
<avrf:trace>ntdll!RtlApplicationVerifierStop+d3 ( @ 0)</avrf:trace>
<avrf:trace>vfbasics!+7fef0f26377 ( @ 0)</avrf:trace>
<avrf:trace>vfbasics!+7fef0f27c9b ( @ 0)</avrf:trace>
<avrf:trace>vfbasics!+7fef0f27392 ( @ 0)</avrf:trace>
<avrf:trace>ntdll!RtlIpv4AddressToStringA+1cb ( @ 0)</avrf:trace>
<avrf:trace>ntdll!_C_specific_handler+27d ( @ 0)</avrf:trace>
<avrf:trace>ntdll!KiUserExceptionDispatcher+2e ( @ 0)</avrf:trace>
<avrf:trace>MSVCR80!memcpy+250 ( @ 0)</avrf:trace>
</avrf:stackTrace>
</avrf:logEntry>
</avrf:logSession>
</avrf:logfile>
------------------------------------
The job this time aborted with this message:
------------------------------------
forrtl: severe (159): Program Exception - breakpoint
Image         PC                Routine  Line     Source
ntdll.dll     0000000076E76060  Unknown  Unknown  Unknown
vrfcore.dll   000007FEF0FC37EE  Unknown  Unknown  Unknown
vrfcore.dll   000007FEF0FC9970  Unknown  Unknown  Unknown
ntdll.dll     0000000076EEC193  Unknown  Unknown  Unknown
vfbasics.dll  000007FEF0F26377  Unknown  Unknown  Unknown
vfbasics.dll  000007FEF0F27C9B  Unknown  Unknown  Unknown
vfbasics.dll  000007FEF0F27392  Unknown  Unknown  Unknown
ntdll.dll     0000000076E5396B  Unknown  Unknown  Unknown
ntdll.dll     0000000076E69795  Unknown  Unknown  Unknown
ntdll.dll     0000000076E76C78  Unknown  Unknown  Unknown
MSVCR80.dll   0000000074CAE3F0  Unknown  Unknown  Unknown
msmpi.dll     0000000068D758DD  Unknown  Unknown  Unknown
msmpi.dll     0000000068D724A5  Unknown  Unknown  Unknown
msmpi.dll     0000000068D6F21B  Unknown  Unknown  Unknown
msmpi.dll     0000000068D66DD8  Unknown  Unknown  Unknown
msmpi.dll     0000000068D1757D  Unknown  Unknown  Unknown
msmpi.dll     0000000068D0A9F4  Unknown  Unknown  Unknown
msmpi.dll     0000000068D0B0F5  Unknown  Unknown  Unknown
flash3.exe    0000000140162666  Unknown  Unknown  Unknown
flash3.exe    000000014013E814  Unknown  Unknown  Unknown
flash3.exe    0000000140032553  Unknown  Unknown  Unknown
flash3.exe    0000000140032CF2  Unknown  Unknown  Unknown
flash3.exe    0000000140004E6F  Unknown  Unknown  Unknown
flash3.exe    000000014000CFA1  Unknown  Unknown  Unknown
flash3.exe    00000001401FD08C  Unknown  Unknown  Unknown
flash3.exe    00000001401F874A  Unknown  Unknown  Unknown
kernel32.dll  0000000076C4BE3D  Unknown  Unknown  Unknown
ntdll.dll     0000000076E56A51  Unknown  Unknown  Unknown
job aborted:[ranks] message
[0] process exited without calling finalize
[1] terminated
---- error analysis -----
[0] on WIN-MN7DR40J561./flash3 ended prematurely and may have crashed. exit code 159
---- error analysis -----
---------------------------------------
Can we get any hints from this?
Thanks & Regards,
Kunal
-
22 September 2010 21:50
Hi Kunal,
It is hard to tell what is going wrong merely from the error message. It shows the call stack but does not tell us which component is at fault. It could be your app or it could be MSMPI. I suggest you try the MSMPI debugger to debug your program. You can get it in 2 ways:
1) In Visual Studio 2010, the MSMPI debugger is built in.
2) In Visual Studio 2008, you can download the add-in from the HPC web site. See http://msdn.microsoft.com/en-us/library/dd560808.aspx
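Even without symbols, the forrtl traceback Kunal posted shows the order of modules on the failing stack. The innermost frames are listed first, so the first flash3.exe entry is the application frame that called into msmpi.dll, and the MSVCR80.dll address (74CAE3F0) matches the "code address executing the invalid access" that Application Verifier reported inside memcpy. The following is only a minimal Python sketch of how such a traceback could be parsed to pull out that frame. The helper names (parse_frames, first_frame_in) are hypothetical, and mapping the address back to a source line would still need a linker map file or debug symbols from the compiler.

```python
import re

# A few frames copied from the forrtl traceback in this thread, innermost first.
TRACEBACK = """\
ntdll.dll 0000000076E76C78 Unknown Unknown Unknown
MSVCR80.dll 0000000074CAE3F0 Unknown Unknown Unknown
msmpi.dll 0000000068D758DD Unknown Unknown Unknown
msmpi.dll 0000000068D0B0F5 Unknown Unknown Unknown
flash3.exe 0000000140162666 Unknown Unknown Unknown
flash3.exe 000000014013E814 Unknown Unknown Unknown
"""

# Module name followed by a 16-digit hexadecimal program counter.
FRAME = re.compile(r"^(\S+)\s+([0-9A-Fa-f]{16})\b")


def parse_frames(text):
    """Return a list of (module, pc) tuples in traceback order."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append((match.group(1), int(match.group(2), 16)))
    return frames


def first_frame_in(frames, module):
    """Return the pc of the innermost frame belonging to `module`, or None."""
    for name, pc in frames:
        if name.lower() == module.lower():
            return pc
    return None


frames = parse_frames(TRACEBACK)
call_site = first_frame_in(frames, "flash3.exe")
print("innermost flash3.exe frame:", hex(call_site))
```

The address found this way is where to start looking in a map file or a debugger; it does not by itself say whether the bad buffer came from the application or from the MPI library.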
Hope it helps,
Thanks,
-
23 September 2010 5:19
Hi James,
May I know why you think it could be MSMPI? I am compiling and executing my application from the Cygwin environment.
So I don't think I can use the MSMPI debugger, as I am not using Visual Studio. Any other suggestions?
Could it be that some Windows DLL or msmpi.dll is faulty? In that case, should I get a fresh copy of Windows and/or install the latest version of the MS HPC Pack 2008 SDK?
Thanks and Regards,
Kunal
-
08 October 2010 4:55
Hi Kunal,
Exit code 0xc0000374 is STATUS_HEAP_CORRUPTION, which means that heap memory was corrupted (for example, by a write past the end of an allocated buffer). Also, your app passes with a single rank but fails with 2 ranks. This makes me wonder whether the problem is in the MPI communication. Could you find the debugger for the compiler you used to compile your app?
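For reference, the hexadecimal codes in the crash reports and the mpiexec output are NTSTATUS values. A minimal Python sketch of decoding them is shown below. The table covers only the two codes seen in this thread plus the common access-violation code, and the function name describe_exit_code is hypothetical, not part of any Windows or MPI tool.

```python
# NTSTATUS values seen in this thread, plus the common access-violation code.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000007B: "STATUS_INVALID_IMAGE_FORMAT",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}

# The top two bits of an NTSTATUS value encode its severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def describe_exit_code(code):
    """Return (name, severity) for an exit code given as an int or hex string."""
    if isinstance(code, str):
        code = int(code, 16)  # accepts "c0000374" and "0xc0000374"
    code &= 0xFFFFFFFF
    name = NTSTATUS_NAMES.get(code, "unknown")
    return name, SEVERITY[code >> 30]


for raw in ("0xc0000374", "c000007b"):
    print(raw, "->", describe_exit_code(raw))
```

Note that the first crash report in this thread lists exception code c000007b while mpiexec reports exit code 0xc0000374, so the two runs may not have failed in exactly the same way.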
Thanks,
James
- Marked as Answer by Don Pattee (Moderator), 12 January 2011 3:00