HPC Nodes Allocation & Task Dependency

  • Question

  • Hi,

    I have a cluster with N nodes. My workflow requires a job that contains two tasks, A and B. Task A is to be executed using 1 node only. Task B depends on A and is to be executed using all nodes.

    I set MinimumNumberOfNodes and MaximumNumberOfNodes to 1 for task A, and MinimumNumberOfNodes to 1 and MaximumNumberOfNodes to N for task B. When I run my job, task A uses the first node, and task B then starts on that same single node after task A finishes, even though N nodes are available.

    Is it possible to configure it to use N nodes for task B instead of only 1 node?

    Thanks in advance

    • Moved by Don Pattee (Moderator) Sunday, May 13, 2012 2:56 AM (From: Windows HPC Server Developers - General)
    Wednesday, April 18, 2012 1:06 AM

All replies

  • What is the min/max nodes configuration of your job, and what is the grow/shrink policy of your scheduler?
    Wednesday, April 18, 2012 6:09 PM
  • Hi,

    The min is 1 and the max is N for the job. I checked the allocated resources after the job completed, and they show all N nodes. I did not set the grow/shrink policy; I left it at the default.

    Thanks

    Thursday, April 19, 2012 2:11 AM
  • Hi,

    Could you try to set Task B to run on min of N nodes and max of N nodes as well? Because if you wish Task B to run on all the nodes, you have to specify it to use all the nodes. Does this make sense?

    Also, if you don't mind, could you share your job XML file? This will give us an opportunity to take a closer look at your problem and understand it better.

    Thanks.

    Michael

    Thursday, April 19, 2012 9:11 PM
  • Hi Michael,

    If I were to set the min nodes of Task B to N, it would mean that Task B cannot start if fewer than N nodes are available. I want the task to start as long as there is at least 1 node, and to use more if they are available.

    Sorry, I couldn't retrieve the XML because it is on a different network.

    Thanks

    Friday, April 20, 2012 12:58 AM
  • Hi William,

    This is a very interesting user scenario. I just tried it and got a repro. I talked to the developer, and the answer is that the job scheduler only grows a job automatically once in a while. For example, if a job has 1000 tasks, where Tasks 2 to 1000 are like your Task B, the allocation will start to grow after a while, say after 5 tasks. In your scenario, at the end of Task A the scheduler looks at the next task, Task B, sees that the number of currently assigned nodes (1) satisfies Task B's minimum requirement, and launches Task B with 1 node.
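
    The launch decision described above can be sketched as a toy model. This is only an illustration of the behavior as explained in this thread, not the real scheduler code: the `Task` class, the `nodes_at_launch` function, and the rule that a task whose minimum is not yet met acquires exactly enough nodes to reach it are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    min_nodes: int
    max_nodes: int


def nodes_at_launch(task: Task, held_nodes: int, idle_nodes: int) -> Optional[int]:
    """Toy model (hypothetical, not the HPC scheduler) of the launch decision.

    held_nodes: nodes the job already holds when the task becomes ready.
    idle_nodes: other free nodes in the cluster.
    Returns the node count the task starts with, or None if it has to wait.
    """
    if held_nodes >= task.min_nodes:
        # The minimum is already satisfied, so the task launches on what the
        # job holds; no growth happens at this point.
        return min(held_nodes, task.max_nodes)
    if held_nodes + idle_nodes >= task.min_nodes:
        # Assumption: the job acquires just enough nodes to reach the minimum.
        return task.min_nodes
    # Not enough nodes anywhere to meet the minimum yet.
    return None
```

    With a hypothetical N of 8, `nodes_at_launch(Task("B", 1, 8), held_nodes=1, idle_nodes=7)` returns 1 in this model, which matches the behavior reported in the question: the single node left over from Task A already meets Task B's minimum.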

    I have filed a bug with the development team for this, and we will try to get a fix into our next release.

    In the meanwhile, I suggest you try one of the following alternatives:

    1. Set Task B to run with a minimum of Y nodes, where 1<Y<N, so that it gets at least Y nodes.

    2. Set Task A to be a "node preparation task" before Task B, so that Task B is not affected by Task A. However, this means that Task A will run once on each node that Task B runs on, so you will probably have to do some work to allow for this.
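
    For the second alternative, one possible way to let Task A's logic run safely on every node is to guard it so that only the first caller does the real work. The sketch below is a hedged illustration only: `run_once`, the marker file names, and the use of a directory on storage shared by all nodes are hypothetical, and it assumes that exclusive file creation is atomic on that storage.

```python
import os
import time


def run_once(marker_dir, work, poll_seconds=0.1, timeout_seconds=60.0):
    """Run work() in only one caller; the others wait until it has finished.

    marker_dir is assumed to be a directory visible to every node.
    Returns True if this caller did the work, False if another caller did.
    """
    claim = os.path.join(marker_dir, "taskA.claim")  # hypothetical file name
    done = os.path.join(marker_dir, "taskA.done")    # hypothetical file name
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so exactly one
        # caller succeeds in creating the claim file.
        fd = os.open(claim, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another caller owns the work: wait for its completion marker.
        deadline = time.monotonic() + timeout_seconds
        while not os.path.exists(done):
            if time.monotonic() > deadline:
                raise TimeoutError("Task A work did not finish in time")
            time.sleep(poll_seconds)
        return False
    os.close(fd)
    work()
    # Signal completion to the waiting callers.
    with open(done, "w") as marker:
        marker.write("ok")
    return True
```

    A node preparation command could then call `run_once(shared_dir, do_task_a)`, where `shared_dir` and `do_task_a` stand in for your own path and logic. The marker files would need to be cleaned up between job runs, which this sketch does not do.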

    Thank you again for bringing this issue to us!

    Friday, April 27, 2012 5:56 PM
  • Oh, and here is what you need for node preparation task

    http://technet.microsoft.com/en-us/library/ee783543(v=WS.10).aspx

    Friday, April 27, 2012 5:58 PM
  • Hi Michael,

    Thanks for the clarification and suggestions. I will find a workaround for the time being. I look forward to having this bug fixed in the next release :)

    Monday, May 7, 2012 1:21 AM
  • Hi William,

    Thanks for your understanding! As you know, we are very late in the current development cycle, so I'm afraid this bug won't get fixed in the next release. That being said, it is definitely on our radar, and we will strive to get it fixed as early as possible. Thank you again for reporting this to us!

    Michael 

    Monday, May 7, 2012 5:30 PM