Task delays

Question
-
Hello, I am using an HPC cluster with 2 compute nodes and one node that acts as head, broker, and compute node. All nodes have 4 cores. I submit a WCF job with 4 tasks, but only 2 tasks run at any one time. The other tasks start only after a running task finishes. It looks like a queue, but the incoming task count is 0.
I have not found any setting in the service registration or broker configuration that could cause this. The min and max cores for the job are set to 4, and for each task to 1 core.
What could be causing this?
- Edited by Vovstehn Tuesday, May 25, 2010 6:30 AM
Monday, May 24, 2010 6:02 PM
Answers
-
Hello,
The problem is not directly related to HPC Server. It depends on the binding configuration. I changed the binding from https to net.tcp, and now everything works.
Later I will investigate the problem in more detail and post the results here.
- Marked as answer by Rae Wang, Moderator Monday, June 21, 2010 11:13 PM
Tuesday, June 8, 2010 12:34 PM
All replies
-
Hello,
If I understand your question correctly, you have a cluster with 2 compute nodes, a head node, and a broker node. Each compute node has 4 cores. You submit a WCF job with 4 tasks, and each task has min and max cores set to 4. Only 2 of the tasks are running, and the other 2 are queued.
If this is correct, then this behavior is expected. Your cluster has only 2 compute nodes, and therefore only 8 cores on which tasks can run. Your job has 4 tasks, and each task requires 4 cores, for a total of 16 cores. Since only 8 cores are available to run your tasks, the job scheduler cannot start all 4 tasks simultaneously. Instead, it starts only 2 tasks and gives 4 cores to each of them. The other 2 tasks are queued.
If you would like to start all tasks in parallel, you can reduce the number of tasks in the job to 2, reduce each task's resource requirement to 2 cores instead of 4, or, if possible, add more compute nodes to your cluster.
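The core arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming the scheduler simply packs tasks onto whole free cores; the function name `max_concurrent_tasks` is hypothetical and not part of HPC Server.

```python
def max_concurrent_tasks(total_cores: int, cores_per_task: int, task_count: int) -> int:
    """Upper bound on tasks that can run at once (illustrative model only)."""
    if cores_per_task <= 0:
        raise ValueError("cores_per_task must be positive")
    # Each running task holds cores_per_task cores, so at most
    # total_cores // cores_per_task tasks fit at the same time.
    return min(task_count, total_cores // cores_per_task)


if __name__ == "__main__":
    total = 2 * 4  # 2 compute nodes with 4 cores each, as assumed in this reply
    print(max_concurrent_tasks(total, 4, 4))  # 4 cores per task -> 2
    print(max_concurrent_tasks(total, 2, 4))  # 2 cores per task -> 4
```

Under this simple model, 4 cores per task leaves room for only 2 tasks at a time, while 2 cores per task lets all 4 run together.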
Best regards,
Leonid.
Monday, May 24, 2010 8:05 PM
-
Hello,
Thank you for the answer. I am sorry, I was unclear. I meant that those resource requirements are set for the job; each task's resource requirement is 1 core.
Tuesday, May 25, 2010 6:29 AM
-
Hello,
Sorry about the misunderstanding. Could you please provide some additional details about your job and tasks? After you submit the job, please enter the following commands at the Command Prompt.
job view <YourJobId>
task view <YourJobId>.1
task view <YourJobId>.2
task view <YourJobId>.3
task view <YourJobId>.4
node listcores
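If it is easier to gather everything in one go, the commands above could be scripted. The following is a rough Python sketch, assuming Python 3.9 or later is available on a machine where the `job`, `task`, and `node` command-line tools are on the PATH. The helper names `build_commands` and `collect_output` are made up for this example.

```python
import subprocess


def build_commands(job_id: str, task_count: int) -> list[list[str]]:
    """Return the diagnostic commands listed above as argument lists."""
    commands = [["job", "view", job_id]]
    for task_number in range(1, task_count + 1):
        commands.append(["task", "view", f"{job_id}.{task_number}"])
    commands.append(["node", "listcores"])
    return commands


def collect_output(job_id: str, task_count: int) -> str:
    """Run each command and join the captured output into one report."""
    sections = []
    for command in build_commands(job_id, task_count):
        # Raises FileNotFoundError if the tool is not on the PATH.
        result = subprocess.run(command, capture_output=True, text=True)
        header = "> " + " ".join(command)
        sections.append(header + "\n" + result.stdout + result.stderr)
    return "\n".join(sections)


if __name__ == "__main__":
    import sys
    # Usage: python collect.py <YourJobId>
    print(collect_output(sys.argv[1], task_count=4))
```

The task count of 4 matches the job described in this thread; adjust it for other jobs.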
Please copy and paste the output from each command here. Hopefully, this will give us enough information to diagnose the issue.
Best regards,
Leonid.
Saturday, May 29, 2010 6:46 AM