Enable Hyper-Threading on Compute Nodes

  • Question

  • Hi,

    My compute nodes have 40 cores each and are hyper-threaded. When I submit a task to HPC Cluster Manager, it assigns only 40 tasks to each compute node, so their CPU usage maxes out at 50%. I want to take advantage of hyper-threading and get the CPU usage up to 100%. I tried changing the subscribed cores to 80, but then 40 tasks run fine while the other 40 fail with the following error:

    Error from node: HPCPROC01:Microsoft.Hpc.Activation.NodeManagerException: Exception 'The parameter is incorrect' reported creating the task.

    Server stack trace:

       at Microsoft.Hpc.NodeManager.RemotingExecutor.RemotingNMExecImpl.StartTask(Int32 jobId, Int32 taskId, ProcessStartInfo startInfo)

       at Microsoft.Hpc.NodeManager.RemotingCommunicator.RemotingNMCommImpl.StartTask(Int32 jobId, Int32 taskId, ProcessStartInfo startInfo)

       at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs)

       at System.Runtime.Remoting.Messaging.StackBuilderSink.SyncProcessMessage(IMessage msg)

    Exception rethrown at [0]:

       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

       at Microsoft.Hpc.Scheduler.Communicator.Remoting.NodeController.StartTaskWorker.EndInvoke(IAsyncResult result)

       at Microsoft.Hpc.Scheduler.Communicator.Remoting.NodeController.AsyncContext`1.EndCall(IAsyncResult result)

    I am able to run 80 processes on the compute node locally and get 100% CPU usage, but I can't seem to do this through HPC Cluster Manager.
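
    The 50% figure is consistent with simple arithmetic: 40 physical cores with Hyper-Threading expose 80 logical processors, so 40 busy tasks occupy half of them. A minimal sketch of that calculation follows, assuming each task is single-threaded and CPU-bound (the thread does not state this); the function name is hypothetical.

    ```python
    def expected_cpu_utilization(running_tasks: int, logical_processors: int) -> float:
        """Fraction of logical processors kept busy.

        Assumes one single-threaded, CPU-bound process per task
        (an assumption for illustration, not stated in the thread).
        """
        if logical_processors <= 0:
            raise ValueError("logical_processors must be positive")
        return min(running_tasks, logical_processors) / logical_processors


    physical_cores = 40
    # Hyper-Threading exposes two hardware threads per physical core.
    logical_processors = physical_cores * 2

    print(expected_cpu_utilization(40, logical_processors))  # 40 tasks scheduled
    print(expected_cpu_utilization(80, logical_processors))  # 80 processes run locally
    ```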

    Thanks,

    Tiffany

    Friday, March 17, 2017 9:10 PM

All replies

  • Hi,

      Could you tell us your HPC Pack version as well as the Windows version on the compute node?


    Qiufang Shi

    Sunday, March 19, 2017 2:55 PM
  • And please also check whether you have enabled affinity for your task. We may also need the node manager log from the compute node to see what happened during task start.

    Qiufang Shi

    Monday, March 20, 2017 2:41 PM
  • The HPC Pack version is 4.5.5079.0.

    The compute nodes run Windows 7 Professional SP1.

    Wednesday, March 22, 2017 2:31 PM
  • To submit tasks we use the IScheduler Interface, which doesn’t invoke the mpiexec command directly and doesn’t set the affinity parameter.
    Wednesday, March 22, 2017 3:13 PM
  • Hi, Tiffany

      It looks like there is an issue when affinity kicks in for your task running on Windows 7. By default, affinity is set for "Non-Exclusive jobs". Could you try configuring affinity to "no jobs" on the "Cluster Manager --> Options --> Job Scheduler Configuration --> Affinity" page?

      If this still doesn't work, please share the node manager logs from your compute node. They should be located under %CCP_HOME%DATA\LogFiles\Scheduler\ and are named "HPCNodeManager_*.bin". Just send the latest 2 .bin files to hpcpack@microsoft.com.
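
      To pick out the two most recent log files, something like the following Python sketch may help. It assumes Python is available on the compute node, that the CCP_HOME environment variable is set there, and that "latest" means most recently modified; the helper name is hypothetical.

      ```python
      import glob
      import os


      def latest_node_manager_logs(log_dir, count=2):
          """Return up to `count` HPCNodeManager_*.bin paths, newest first.

          "Newest" is judged by file modification time, which is an
          assumption about what "latest" means in the reply above.
          """
          pattern = os.path.join(log_dir, "HPCNodeManager_*.bin")
          files = glob.glob(pattern)
          files.sort(key=os.path.getmtime, reverse=True)
          return files[:count]


      if __name__ == "__main__":
          # Directory taken from the reply: %CCP_HOME%DATA\LogFiles\Scheduler\
          ccp_home = os.environ.get("CCP_HOME", "")
          log_dir = os.path.join(ccp_home, "DATA", "LogFiles", "Scheduler")
          for path in latest_node_manager_logs(log_dir):
              print(path)
      ```

      Run on the compute node, this prints the paths of the files to attach; if it prints nothing, check that CCP_HOME points at the HPC Pack installation.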


    Qiufang Shi

    Thursday, March 23, 2017 2:11 AM