Wednesday, April 18, 2012 1:06 AM
I have a cluster with N nodes. My workflow requires a job that contains task A and B. Task A is to be executed using 1 node only. Task B is dependent on A and is to be executed using all nodes.
I set MinimumNumberOfNodes and MaximumNumberOfNodes to 1 for task A, and MinimumNumberOfNodes to 1 and MaximumNumberOfNodes to N for task B. When I run the job, task A uses the first node, and task B then starts on that same single node after task A finishes, even though N nodes are available.
Is it possible to configure it to use N nodes for task B instead of only 1 node?
Thanks in advance
- Moved by Don Pattee (Moderator) Sunday, May 13, 2012 2:56 AM (From: Windows HPC Server Developers - General)
Wednesday, April 18, 2012 6:09 PM
What's the min/max nodes configuration of your job? And what's the grow/shrink policy of your scheduler?
Thursday, April 19, 2012 2:11 AM
The min is 1 and the max is N for the job. I checked the allocated resources after the job completed, and it shows all N nodes. I did not set the grow/shrink policy; I left it at the default.
Thursday, April 19, 2012 9:11 PM
Could you try to set Task B to run on min of N nodes and max of N nodes as well? Because if you wish Task B to run on all the nodes, you have to specify it to use all the nodes. Does this make sense?
Also, if you don't mind, could you share your job XML file? This will give us an opportunity to take a closer look at your problem and understand it better.
Friday, April 20, 2012 12:58 AM
If I set the min nodes of Task B to N, Task B could not start when fewer than N nodes are available. I want the task to start as long as at least 1 node is available, and to use more nodes if they are available.
Sorry, I couldn't retrieve the XML, as it is on a different network.
Friday, April 27, 2012 5:56 PM
This is a very interesting user scenario. I just tried it and got a repro. I talked to the developer, and the answer is that the job scheduler only grows a job automatically once in a while. For example, if a job has 1000 tasks, where Tasks 2 to 1000 are like your Task B, the job will start to grow after a while (say, after 5 tasks). In your scenario, at the end of Task A the scheduler looks at the next task, Task B, sees that the currently assigned node count (1) satisfies the minimum requirement of Task B, and so launches Task B with 1 node.
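The decision described above can be sketched as a toy model. This is not the scheduler's actual code: the function name, its parameters, and the branch for a minimum larger than the current allocation are assumptions made for illustration only.

```python
def nodes_at_task_launch(held_nodes: int, free_nodes: int,
                         task_min: int, task_max: int) -> int:
    """Toy model of the launch decision described in this thread.

    held_nodes: nodes the job already holds when the previous task ends.
    free_nodes: idle nodes in the cluster that the job does not hold yet.
    Returns the node count the next task is launched with, or 0 if it
    cannot start yet. All names here are hypothetical.
    """
    if not 1 <= task_min <= task_max:
        raise ValueError("expected 1 <= task_min <= task_max")
    if held_nodes >= task_min:
        # The current allocation already meets the task minimum, so the
        # task is launched on it without growing first (reported behavior).
        return min(held_nodes, task_max)
    if held_nodes + free_nodes >= task_min:
        # Assumption: the job is grown just far enough to reach the minimum.
        return task_min
    return 0  # not enough nodes to satisfy the minimum


# Task A ran on 1 node of an 8-node cluster; the other 7 nodes are idle.
for minimum in (1, 4, 8):
    print(minimum, nodes_at_task_launch(1, 7, minimum, 8))
```

With a minimum of 1, the model launches Task B on the single node left over from Task A, which matches the behavior reported in the question.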
I have filed a bug with the development team for this, and we will try to get a fix into our next release.
In the meantime, I suggest you try one of the following alternatives:
1. Set Task B to run with a minimum of Y nodes, where 1<Y<N, so that it gets at least Y nodes.
2. Set Task A to be a "node preparation task" before Task B, so that Task B is not affected by Task A. However, this means that Task A will run once on each node that Task B runs on, so you will probably have to do some work to allow for that.
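For alternative 1, the choice of Y is a trade-off: a larger Y guarantees more nodes, but Task B cannot start until that many nodes are available. A minimal sketch of one way to pick Y follows, assuming you can estimate how many nodes are usually available. The helper name and its selection rule are hypothetical and are not part of the HPC scheduler.

```python
from typing import Optional


def pick_min_nodes(total_nodes: int, expected_available: int) -> Optional[int]:
    """Pick the largest Y with 1 < Y < total_nodes that does not exceed
    the number of nodes expected to be available.

    Returns None when no such Y exists, for example on a 2-node cluster
    or when only 1 node is expected to be available.
    """
    upper = min(total_nodes - 1, expected_available)
    return upper if upper >= 2 else None


# On an 8-node cluster where about 5 nodes are usually idle, Y would be 5.
print(pick_min_nodes(8, 5))
```

The value returned would then be used as the minimum number of nodes for Task B, with the maximum left at N.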
Thank you again for bringing this issue to us!
Friday, April 27, 2012 5:58 PM
Oh, and here is what you need for node preparation task
Monday, May 7, 2012 1:21 AM
Thanks for the clarification and suggestion. I will find a workaround for the time being. I look forward to having this bug solved in the next release :)
Monday, May 7, 2012 5:30 PM
Thanks for your understanding! As you know, we are very late in the current development cycle, so I'm afraid this bug won't get fixed in the next release. That being said, it is definitely on our radar, and we will strive to get it fixed as early as possible. Thank you again for reporting this to us!