How to force to only run 1 task per node?

  • Question

  • I switched the app I'm currently working on over so that it no longer uses our HPC wrapper and instead creates the SOA code directly.  The issue I'm running into now is that I want to limit it so that only 1 task runs per node (we have 50 compute nodes, each with 4 cores).  Because of the way our job is designed (multi-threaded), we want only 1 task running per node.  Is that possible?  

    I can't seem to figure out how to set the task properties when using the SOA option so that the task has exclusive = true.  I set the job template to exclusive = true, but that still has it running 200 tasks concurrently (50 machines * 4 cores, I'm guessing).  The code I'm using is very similar to the HelloWorldR2 sample in the SDK.  As far as I can see, there doesn't seem to be a way to pass in task options when submitting each request.

    Thanks!


    -Jason

    EDIT: I switched the code over to use the IScheduler/IJobScheduler/ITaskScheduler and was able to accomplish what I was looking to do (set IsExclusive = true on the task).  That said, I'd love to be able to do this using the SOA option as well, as that gives us more capability in terms of passing parameters to/from the processes running on HPC.

    Thanks!

    Monday, August 31, 2015 2:30 PM
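
    The scheduler-API workaround mentioned in the EDIT above might look roughly like the following. This is only a minimal sketch, not the poster's actual code: the head node name "HEADNODE" and the executable "MyWorker.exe" are placeholders, and it assumes a project reference to the HPC Pack SDK assembly Microsoft.Hpc.Scheduler.

```csharp
using Microsoft.Hpc.Scheduler;
using Microsoft.Hpc.Scheduler.Properties;

class ExclusiveTaskSketch
{
    static void Main()
    {
        // "HEADNODE" is a placeholder for the cluster head node name.
        IScheduler scheduler = new Scheduler();
        scheduler.Connect("HEADNODE");

        ISchedulerJob job = scheduler.CreateJob();
        job.UnitType = JobUnitType.Node;    // allocate resources by node instead of by core

        ISchedulerTask task = job.CreateTask();
        task.CommandLine = "MyWorker.exe";  // hypothetical multi-threaded worker
        task.MinimumNumberOfNodes = 1;
        task.MaximumNumberOfNodes = 1;
        task.IsExclusive = true;            // do not share this task's node with other tasks

        job.AddTask(task);

        // Null credentials: the scheduler falls back to cached credentials or prompts.
        scheduler.SubmitJob(job, null, null);
    }
}
```

    In a real job, one such task would be added per unit of work before the job is submitted.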


All replies

  • You can check the job's resource allocation unit type. By default it is Core, and you can change it to Socket or Node. In your case, you can just set the job's unit type to Node. After doing so, each task will occupy a whole node (i.e., one task per node).

    You can also fine-tune this through sockets or cores. The PowerShell command "Set-HpcNode -SubscribedCores/SubscribedSockets" lets you change the number of cores or sockets the scheduler assumes for a node. For example, you may oversubscribe one of your nodes to 100 cores and submit lightweight jobs/tasks to it.


    Qiufang Shi

    Tuesday, September 1, 2015 1:29 AM
  • Hi Qiufang,

    Thanks for the quick response.  Is setting the resource allocation unit type all I need to do?  

                SessionStartInfo info = new SessionStartInfo(_hpcHeadNode, _serviceName);
                info.JobTemplate = _jobTemplate;
                info.SessionResourceUnitType = SessionUnitType.Node;
                info.MaximumUnits = 50;

    I tried setting it in both my job template and explicitly on the job itself (I also tried capping MaximumUnits to see what would happen), but none of those seems to accomplish the goal of running only 1 task per node.  In the job manager's job details, I can see that Unit Type = Node, Exclusive = true, Maximum Nodes: 50, Minimum Nodes: Auto, and the tasks themselves show up as Unit: Node, min: 1, max: 1, exclusive: false.  

    The other thing I noticed is that under the job, the tasks only show 1.1-1.50, but I know for certain that more than 50 jobs have started based on what I'm seeing in the database that the job writes to.  The session progress says: Total Requests: 300, Calculating: 169, Incoming: 131.

    My preference is not to have to use the PowerShell command to trick the system into thinking there are more or fewer cores/sockets, as that is an additional configuration step that would be required when building a new machine, and I would prefer to avoid it.

    Thanks!

    -Jason 

    Tuesday, September 1, 2015 12:14 PM
  • Hi Jason,

    Please check the maxConcurrentCalls setting in the service registration file and see whether the value is set to 0. You can also set the WCF service behavior ConcurrencyMode = ConcurrencyMode.Single to ensure that requests are handled one at a time. Refer to this forum thread for more info.

    BR,

    Yutong Sun

    • Marked as answer by Jason Lee 1234 Wednesday, September 2, 2015 7:37 PM
    Wednesday, September 2, 2015 3:32 AM
    Moderator
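
    As an illustration of the ConcurrencyMode suggestion above, a WCF service class can be decorated as follows. This is a minimal sketch: the contract IWorkService and the class WorkService are hypothetical names, not taken from the poster's code.

```csharp
using System.ServiceModel;

// Hypothetical service contract, for illustration only.
[ServiceContract]
public interface IWorkService
{
    [OperationContract]
    string Process(string input);
}

// ConcurrencyMode.Single lets each service instance handle one call at a time.
[ServiceBehavior(ConcurrencyMode = ConcurrencyMode.Single)]
public class WorkService : IWorkService
{
    public string Process(string input)
    {
        // The multi-threaded work would run here.
        return input;
    }
}
```

    Note that this attribute only serializes calls within a single service instance, so it complements the maxConcurrentCalls setting rather than replacing it.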
  • Thanks.

    Previously I did not have maxConcurrentCalls in the service registration file.  I added it, set it to 1, and that seemed to solve my problem.  Thanks!

    Wednesday, September 2, 2015 7:37 PM
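
    For reference, the fix described in this thread corresponds to a service registration file fragment along these lines. It is a sketch only: the assembly path, contract, and type names are placeholders, and the rest of the registration file (configSections, WCF bindings, and so on) is omitted.

```xml
<microsoft.Hpc.Session.ServiceRegistration>
  <!-- assembly, contract, and type are placeholder values -->
  <!-- maxConcurrentCalls="1" limits each service host to one request at a time -->
  <service assembly="C:\Services\MyService.dll"
           contract="MyNamespace.IMyService"
           type="MyNamespace.MyService"
           maxConcurrentCalls="1" />
</microsoft.Hpc.Session.ServiceRegistration>
```

    Combined with SessionResourceUnitType = SessionUnitType.Node, so that one service host is started per node, this should give one running request per node.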