HPC evenly distributed job scheduling

  • Question

  • Is there a way to schedule a job so that its tasks are distributed evenly across all available nodes instead of being assigned to the first few nodes? Specifically, I have 132 cores spread across 11 physical systems (two six-core CPUs per system). When I kick off a job with 32 tasks, the first three nodes get all the tasks. This is not ideal, since each task does a lot of CPU work and disk writing. I don't want to change the "UnitType" to Socket, because then only 22 tasks would run at a time (for 32 tasks, that means 2 batches). Ideally, I could distribute tasks evenly across all the nodes, up to full saturation.

    Thanks.


    • Edited by Gogol's Sum Tuesday, January 14, 2014 5:32 PM
    Tuesday, January 14, 2014 5:30 PM

All replies

  • For the time being I set "MinimumNumberOfCores" to 2 for each task, so that each task consumes 2 cores and the distribution is at least more even. This might be what works best for my scenario.
    Tuesday, January 14, 2014 9:43 PM
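
To illustrate the arithmetic behind this workaround, here is a minimal Python sketch of a fill-first placement, assuming the scheduler simply fills one node's cores before moving to the next, as described in this thread. It is a simplified model, not the actual HPC scheduler; the `fill_first` function and the node names are hypothetical.

```python
def fill_first(num_tasks, nodes, cores_per_node, cores_per_task=1):
    """Model a fill-first placement: fill each node before using the next.

    Simplified model (assumption): every task reserves `cores_per_task`
    cores, and a node accepts tasks until its cores run out.
    """
    slots_per_node = cores_per_node // cores_per_task
    placement = {}
    remaining = num_tasks
    for node in nodes:
        if remaining == 0:
            break
        count = min(slots_per_node, remaining)
        placement[node] = count
        remaining -= count
    return placement


# Hypothetical node names for the 11 systems with 12 cores each.
nodes = [f"NODE{i:02d}" for i in range(1, 12)]

one_core = fill_first(32, nodes, cores_per_node=12, cores_per_task=1)
two_core = fill_first(32, nodes, cores_per_node=12, cores_per_task=2)

print(one_core)  # tasks per node when each task reserves 1 core
print(two_core)  # tasks per node when each task reserves 2 cores
```

Under this model, 32 single-core tasks land on the first three nodes (12, 12 and 8), while reserving 2 cores per task spreads them over six nodes. The spread is wider, but five of the eleven nodes still stay idle.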
  • I have the same problem.

    Changing the "MinimumNumberOfCores" of the job or the subscribedCores of the node would lower the number of jobs each core handles, which is less than ideal.

    Is there any way to force HPC to allocate tasks evenly between nodes,

    instead of completely filling one node and then moving to the next as it does now?

    Tuesday, November 18, 2014 8:27 AM
  • You can set the required node for each task in your job to explicitly control which task is dispatched to which node. This is not perfect either, but it removes the pain of changing "MinimumNumberOfCores" for a task.

    Currently there is no scheduling option that dispatches tasks evenly.

    Tuesday, November 18, 2014 10:30 AM
  • So if I have a job with 300 tasks (standard), I'll have to go task by task and change their required node... 300 times...

    I don't see how that is different from changing the "MinimumNumberOfCores". It's 6 of one and half a dozen of the other.

    Tuesday, November 18, 2014 11:14 AM
  • You can set it programmatically through the HPC API.

    Changing "MinCores" of the tasks lowers the number of tasks that run on one node, but changing the required node doesn't.

    Thursday, November 20, 2014 7:21 AM
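
As a rough illustration of the "set it programmatically" suggestion, the Python sketch below computes a round-robin mapping from tasks to required nodes. It only builds the plan; applying it would still mean setting each task's required node through the HPC API, which is not shown here. The `round_robin_plan` function, the task IDs and the node names are hypothetical.

```python
from collections import Counter


def round_robin_plan(task_ids, nodes):
    """Assign each task a required node by cycling through the node list."""
    return {task_id: nodes[i % len(nodes)] for i, task_id in enumerate(task_ids)}


# Hypothetical node names for the 11 systems in the original question.
nodes = [f"NODE{i:02d}" for i in range(1, 12)]

# 32 tasks, numbered 1..32 for illustration.
plan = round_robin_plan(range(1, 33), nodes)

# Count how many tasks each node would receive under this plan.
per_node = Counter(plan.values())
for node in nodes:
    print(node, per_node[node])
```

With 32 tasks and 11 nodes, every node receives either 2 or 3 tasks. The same loop covers a 300-task job without editing tasks by hand, although pinning tasks to nodes means a task waits for its assigned node even when another node is free.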
  • Hi,

    I have this problem too. Is there a good workaround/fix for this problem of not distributing evenly across nodes?

    Do we know why Microsoft is keeping it this way?

    Also, I am confused as to what the default behavior is. Look at this post, where they claim it distributes evenly for them and they want it to do otherwise.

    https://social.microsoft.com/Forums/en-US/bf223998-be24-4c75-af70-43fd99abfee1/avoid-even-distribution-of-jobs-among-nodes?forum=windowshpcsched

    Thanks,

    Mani


    ManiMJ

    Tuesday, April 4, 2017 12:56 PM