[Answered] Compute nodes became unreachable randomly!!!

  • Saturday, June 27, 2009 4:43 AM, Ali Nazemian
     
    Hi,
    I have a problem with my HPC cluster: our compute nodes become unreachable at random, and all the jobs running on those nodes stop working.
    Also, this error sometimes shows up on the head node:
    Windows NT Intersite Messaging Service has stopped working
    Error details:
      Problem Event Name:    APPCRASH
      Application Name:    ismserv.exe
      Application Version:    6.0.6001.18000
      Application Timestamp:    4791966a
      Fault Module Name:    ismip.dll
      Fault Module Version:    6.0.6001.18000
      Fault Module Timestamp:    4791ad8a
      Exception Code:    c0000005
      Exception Offset:    0000000000005a5d
      OS Version:    6.0.6001.2.1.0.274.10
      Locale ID:    1033
      Additional Information 1:    86de
      Additional Information 2:    4200a2a0bdf4f799aa942c465bfdf13c
      Additional Information 3:    23ff
      Additional Information 4:    637e5f46119c24aa761e48e05f314b3e
    Can anybody tell me what I should do?
    P.S. I have Windows Server 2008 Enterprise x64 on the head node and the compute nodes, and I am using HPC Pack 2008.
    Best regards.

Answer

  • Friday, July 10, 2009 1:12 PM, Ben Ybarra (MSFT)
     Marked as answer
    Hello Ali

    On an unresponsive node, take a look at HPCManagement.log. This log is located under "C:\Program Files\Microsoft HPC Pack\Data\Logfiles". Please reply with the errors you see in the log.

    Thanks
    Ben
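
For anyone following the suggestion above, here is a minimal sketch of one way to pull likely error lines out of the log before posting them. It is not an HPC Pack tool. It assumes the log is plain text readable as UTF-8 with one entry per line, and the keyword list and the `find_error_lines` helper are hypothetical choices for illustration; the actual log layout is not described in this thread.

```python
from pathlib import Path

# Assumption: entries are plain text, one per line. The keywords below are
# a guess at what an error entry might contain, not a documented format.
KEYWORDS = ("error", "exception", "fail")


def find_error_lines(log_path, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword."""
    matches = []
    # errors="replace" keeps the scan going if the file contains odd bytes.
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            lowered = line.lower()
            if any(word in lowered for word in keywords):
                matches.append((number, line.rstrip()))
    return matches


if __name__ == "__main__":
    # Directory quoted in the reply above. The exact file name and extension
    # inside it may differ, so look for anything starting with HPCManagement.
    log_dir = Path(r"C:\Program Files\Microsoft HPC Pack\Data\Logfiles")
    if log_dir.is_dir():
        for log_file in sorted(log_dir.glob("HPCManagement*")):
            for number, text in find_error_lines(log_file):
                print(f"{log_file.name}:{number}: {text}")
```

Run it on the unresponsive node itself (or against a copy of its log folder), and paste the matching lines into your reply. If nothing matches, the keyword list probably does not fit the log's wording and should be adjusted.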
