Insufficient Memory to continue Execution of the program HPC Pack 2012 R2

    Question

  • I installed HPC Pack 2012 R2 on an 8-core, 16 GB host running Windows Server 2012 R2.

    I am getting an error in the session logs that I have never seen before, and it is preventing me from scheduling jobs (domain name redacted):

    7/28/2018 00:32:57.694 i HpcSoa 4672 4732 Get 1 broker nodes from the scheduler.  
    07/28/2018 00:32:57.694 i HpcSoa 4672 4732 Add broker node WIN-KLK5GP8CM6P to the list, domain name is XXXXXXXXXXXX.  
    07/28/2018 00:32:57.701 e HpcSoa 4672 4732 Cannot get BN WIN-KLK5GP8CM6P SSDL. Exception = System.OutOfMemoryException: Insufficient memory to continue the execution of the program.
       at System.Security.Principal.WindowsIdentity.KerbS4ULogon(String upn, SafeAccessTokenHandle& safeTokenHandle)
       at System.Security.Principal.WindowsIdentity..ctor(String sUserPrincipalName, String type)
       at System.Security.Principal.WindowsIdentity..ctor(String sUserPrincipalName)
       at Microsoft.Hpc.Scheduler.Session.Internal.SessionLauncher.BrokerNodesManager.GetSDDI(String nodeName, Exception& exception)

    Everything on the cluster looks normal compared with other deployments I have done, but this one has me stumped and I am not sure where to look next.

    Saturday, 28 July 2018 1:14 AM
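
To pull only the error entries out of a session log like the one quoted above, a small parser along these lines may help. It is a minimal sketch: the field layout is inferred from the three quoted lines, and treating the two numeric columns as process and thread IDs is an assumption, as are the names `parse_line` and `errors`. The sample message is shortened to its first sentence.

```python
import re
from datetime import datetime

# Layout inferred from the quoted lines:
#   date time level source number number message
# Naming the two numbers "pid" and "tid" is an assumption.
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[a-z]) (?P<source>\S+) "
    r"(?P<pid>\d+) (?P<tid>\d+) (?P<message>.*)$"
)

def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    stamp = rec.pop("date") + " " + rec.pop("time")
    rec["timestamp"] = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S.%f")
    rec["pid"] = int(rec["pid"])
    rec["tid"] = int(rec["tid"])
    return rec

def errors(lines):
    """Keep only the records whose level column is 'e'."""
    parsed = (parse_line(line) for line in lines)
    return [rec for rec in parsed if rec is not None and rec["level"] == "e"]

# Lines copied from the post; the last message is cut after its first sentence.
SAMPLE = [
    "7/28/2018 00:32:57.694 i HpcSoa 4672 4732 Get 1 broker nodes from the scheduler.",
    "07/28/2018 00:32:57.694 i HpcSoa 4672 4732 Add broker node WIN-KLK5GP8CM6P to the list, domain name is XXXXXXXXXXXX.",
    "07/28/2018 00:32:57.701 e HpcSoa 4672 4732 Cannot get BN WIN-KLK5GP8CM6P SSDL. "
    "Exception = System.OutOfMemoryException: Insufficient memory to continue the execution of the program.",
]

for rec in errors(SAMPLE):
    print(rec["timestamp"], rec["source"], rec["message"][:60])
```

Stack-trace continuation lines (those starting with "at") do not match the pattern and are skipped, so only the first line of each error is reported.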

All replies

  • Hi SoggyStrargazer,

    Could you check the memory footprint on the head node to see whether HpcSession.exe, HpcBrokerWorker.exe (if the head node is also a broker node), or other processes are consuming most of the memory and causing the System.OutOfMemoryException?

    Regards,

    Yutong Sun

    Monday, 30 July 2018 3:39 AM
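
One way to run the memory check suggested above is to total the "Mem Usage" column of the Windows `tasklist` command per process name. The following is a rough sketch that parses the output of `tasklist /FO CSV /NH`; the function names and the sample rows are made up for illustration, and the process names are the two mentioned in the reply.

```python
import csv
import io

# Process names taken from the reply above, lower-cased for comparison.
HPC_PROCESSES = {"hpcsession.exe", "hpcbrokerworker.exe"}

def parse_tasklist_csv(text):
    """Return {image name: total memory in K} from `tasklist /FO CSV /NH` output."""
    totals = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue  # skip blank or malformed rows
        name, mem = row[0], row[4]
        # "123,456 K" -> 123456; dropping non-digits also copes with
        # locales that use a different thousands separator.
        digits = "".join(ch for ch in mem if ch.isdigit())
        if digits:
            totals[name] = totals.get(name, 0) + int(digits)
    return totals

def hpc_memory(totals):
    """Keep only the HPC Pack processes named in HPC_PROCESSES."""
    return {n: k for n, k in totals.items() if n.lower() in HPC_PROCESSES}

# On the head node the input could be obtained with:
#   import subprocess
#   text = subprocess.run(["tasklist", "/FO", "CSV", "/NH"],
#                         capture_output=True, text=True).stdout
# The rows below are invented sample data, not values from this cluster.
sample = (
    '"HpcSession.exe","4672","Services","0","123,456 K"\n'
    '"svchost.exe","800","Services","0","10,000 K"\n'
    '"svchost.exe","900","Services","0","5,000 K"\n'
)
totals = parse_tasklist_csv(sample)
print(hpc_memory(totals))
print(sorted(totals.items(), key=lambda item: item[1], reverse=True)[:5])
```

Sorting the full table by memory, as in the last line, also shows whether a non-HPC process is the one using most of the memory.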
  • Yutong,

    Please refer to the attached screen capture of the memory usage. We have attempted to redeploy HPC Pack on the same host with no change in behavior. I see this out-of-memory error at regular intervals in the session log, even when no jobs are submitted to the cluster.

    Based on the stats below, it looks like all of HPC Pack is using only about 3 GB of memory, which is well below the available memory.

    Are there any other log files I should check?

    Monday, 30 July 2018 3:16 PM
  • Hi SoggyStrargazer,

    Could you dump the HpcSession.exe process, upload the dump to a network disk together with the HpcSession_*.bin log files under %CCP_DATA%LogFiles\SOA, and send us (hpcpack@microsoft.com) the download link?

    Meanwhile, could you check which HPC Pack version you are using? Is it HPC Pack 2012 R2 Update 3 with the latest QFE (version 4.5.5194)?

    Regards,

    Yutong Sun


    Tuesday, 31 July 2018 8:40 AM
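
Gathering the requested HpcSession_*.bin files into a single archive before uploading could look roughly like this. It is a sketch only: `collect_session_logs` and the output file name are hypothetical, and the process dump itself has to be created separately.

```python
import glob
import os
import zipfile

def collect_session_logs(log_dir, out_zip):
    """Zip every HpcSession_*.bin file in log_dir; return the archived names."""
    paths = sorted(glob.glob(os.path.join(log_dir, "HpcSession_*.bin")))
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # Store only the file name so the archive has no directory tree.
            archive.write(path, arcname=os.path.basename(path))
    return [os.path.basename(path) for path in paths]

# Possible use on the head node, assuming CCP_DATA is set in the environment:
#   log_dir = os.path.join(os.environ["CCP_DATA"], "LogFiles", "SOA")
#   print(collect_session_logs(log_dir, "hpcsession_logs.zip"))
```

The function returns the list of archived file names, which makes it easy to confirm that the pattern matched anything before sending the link.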
  • Yutong,

    The version is HPC Pack 2012 R2 4.5.5079.0. I have sent a download link to the requested email address.

    -Zach

    Tuesday, 31 July 2018 1:33 PM
  • We have upgraded to 4.5.5187 and are still seeing the issue.
    Tuesday, 31 July 2018 6:32 PM
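
Comparing the build numbers quoted in the thread numerically shows that both 4.5.5079.0 and 4.5.5187 are lower than the 4.5.5194 QFE mentioned earlier. The helper below is a small sketch of that comparison; `parse_version` and `is_older` are illustrative names, and it assumes that a higher build number means a newer release.

```python
def parse_version(text):
    """Turn '4.5.5079.0' into the tuple (4, 5, 5079, 0)."""
    return tuple(int(part) for part in text.strip().split("."))

def is_older(installed, required):
    """True if `installed` sorts before `required`, padding missing parts with 0."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b

# Version strings as quoted in the thread.
for installed in ("4.5.5079.0", "4.5.5187"):
    print(installed, "older than 4.5.5194:", is_older(installed, "4.5.5194"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings character by character.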
  • I believe this was ultimately caused by attempting to use AWS Simple AD as the domain controller.

    We switched to a full-fledged AD host and it works now.

    Wednesday, 8 August 2018 6:25 PM