a job queued by activation filter queues another job

Question
-
Hi specialists,
I have run into a strange problem with an activation filter.
An online compute node has 8 cores. No job is running on it or queued for it.
Now, I submit a job requiring 3 cores to the node and let it be queued by an activation filter. As expected, the pending reason is “Activation of this job was delayed by the activation filter program”. Then I submit another job requiring one core to the same node. That job is queued too. The pending reason is “Not enough available cores.” Why is the second job queued?
Thanks
Monday, November 30, 2009 1:36 PM
Answers
-
When an activation filter decides not to activate a job, the entire scheduling queue is blocked by this one job. So, even though it looks like the second job would have enough cores to run, the second job will not be scheduled until the activation filter releases or cancels the blocking job.
- Proposed as answer by Don Pattee (Moderator) Wednesday, December 9, 2009 6:15 AM
- Marked as answer by Don Pattee (Moderator) Friday, January 29, 2010 7:33 AM
Wednesday, December 9, 2009 12:53 AM
All replies
-
Add a small piece of output to your test program that writes the ID of each delayed job to a file.
You will probably see only the first job. Because that job was queued earlier, its priority is higher, so the later job is starved out.
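A minimal sketch of such a logging filter follows. It assumes the scheduler passes the path of the job's XML file as the first command-line argument, that the root element of that file carries an `Id` attribute, and that a nonzero exit code delays the job; please check these against the activation filter documentation for your version. The log file name and the function names are made up for illustration.

```python
import sys
import xml.etree.ElementTree as ET
from datetime import datetime

LOG_PATH = "activation_filter.log"  # hypothetical log location


def job_id_from_file(xml_path):
    """Read the job ID from the job XML file (assumes an 'Id' attribute on the root)."""
    root = ET.parse(xml_path).getroot()
    return root.get("Id", "unknown")


def log_delayed_job(job_id, log_path):
    """Append one line per filter invocation so you can see which jobs get delayed."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{datetime.now().isoformat()} delayed job {job_id}\n")


def main(argv, log_path=LOG_PATH):
    job_id = job_id_from_file(argv[1])
    log_delayed_job(job_id, log_path)
    return 1  # assumption: nonzero means "do not activate this job"


# Only act as a filter when the scheduler supplies the job XML path.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

If only the first job's ID ever shows up in the log, the filter is never being asked about the second job.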
Johannes
JH
Monday, November 30, 2009 3:55 PM -
When an activation filter decides not to activate a job, the entire scheduling queue is blocked by this one job.
It's my impression, however, that this applies only to the jobs of one user. You are not blocking other users' jobs, are you?
JH
Wednesday, December 9, 2009 11:14 AM -
But it is strange, isn't it? A queued job should not block other jobs. If the queued job were never activated, the other jobs could never run, which users would really not accept.
Is there any solution that fixes this issue?
Wednesday, December 9, 2009 12:31 PM -
When an activation filter decides not to activate a job, this blocks the entire scheduling queue regardless of user. Otherwise, for example, a job being blocked by the activation filter because it requires two licenses may never be scheduled if other jobs that require only one license are allowed to pass it in the queue. If this is not the desired behavior, then the activation filter developer may design it so that it cancels the job instead of allowing it to block the queue.
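The trade-off described here can be illustrated with a toy model. This is only a sketch of the reasoning, not the scheduler's actual algorithm; the `schedule` function and the job tuples are invented for the example.

```python
def schedule(queue, free_licenses, allow_passing):
    """Return the names of the jobs started in one scheduling pass.

    queue: list of (job_name, licenses_needed) in queue order.
    allow_passing: if False, the first job that cannot start blocks
    everything behind it (the behavior described in this thread).
    """
    started = []
    for name, needed in queue:
        if needed <= free_licenses:
            free_licenses -= needed
            started.append(name)
        elif not allow_passing:
            break  # the held job stops every job queued behind it
    return started


queue = [("A", 2), ("B", 1), ("C", 1)]

# Blocking queue: A cannot start with one free license, so nothing starts.
print(schedule(queue, free_licenses=1, allow_passing=False))  # []

# Passing allowed: B takes the only free license and A keeps waiting.
print(schedule(queue, free_licenses=1, allow_passing=True))   # ['B']
```

With passing allowed, every license that frees up can be taken by the next one-license job, so job A may never see two free licenses at once. The blocking queue avoids that at the cost of holding back B and C.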
Wednesday, December 9, 2009 10:09 PM -
Vancloud:
We, too, want another solution to this, and the only way we see is to use an activation filter that is aware of all jobs. We are planning to do this by keeping a list of all jobs that have been submitted and are not running.
JH
Thursday, December 10, 2009 9:53 AM -
Thank you for the feedback. I agree that the current behavior of Job Activation Filters is not ideal for all cases, but it does prevent failed jobs and license starvation in a license-constrained environment. Job Activation Filters are good for specialized cases, but are not designed for generalized resource scheduling. We are actively considering how to improve the behavior in the next version.
For now, if you require that the queue not be blocked and that jobs do not fail or get canceled, then you might like to consider removing the Activation Filter and instead push its behavior down into your jobs. That is, allow the scheduler to start the job, and provide a way for the job to perform the wait-until-<resource>-is-ready checks before starting the application (where <resource> might be licenses, memory, and so on).
Friday, December 11, 2009 10:22 PM -
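One way to sketch the "wait inside the job" idea is a small wrapper that polls a readiness check and only then launches the application. The names `wait_until_ready` and `run_when_ready` are hypothetical, and the `is_ready` callable stands in for whatever license or memory check applies at your site. Note that the job holds its allocated cores while it waits.

```python
import subprocess
import time


def wait_until_ready(is_ready, poll_seconds=30.0, timeout_seconds=None,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True; return False on timeout."""
    deadline = None if timeout_seconds is None else clock() + timeout_seconds
    while not is_ready():
        if deadline is not None and clock() >= deadline:
            return False
        sleep(poll_seconds)
    return True


def run_when_ready(command, is_ready, **wait_options):
    """Start the application once the resource check passes.

    Returns the application's exit code, or 1 if the wait timed out.
    """
    if not wait_until_ready(is_ready, **wait_options):
        return 1
    return subprocess.call(command)
```

The job's command line would then invoke this wrapper instead of the application itself, for example `run_when_ready(["myapp.exe"], licenses_available)`, where both the application name and the `licenses_available` function are placeholders.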
Does anyone have a hack to implement a custom resource schedule? I'm looking to distribute an even number of cores per client. Is support for this indeed in R2?
Thanks,
Matt
Friday, January 8, 2010 9:57 PM
-
If you look at the activation filter example on MSDN, you will notice that it has some special logic to cancel a job if the activation filter encounters a special situation. Right now in V2 this is really the only option if you do not want the activation filter to block. You can build your own logic into the filter to decide whether to block (return 1), execute the job (return 0), or cancel the job (return 0).
http://technet.microsoft.com/en-us/library/dd346641(WS.10).aspx
Friday, January 22, 2010 6:39 PM -
Hi Matt, HPC 2008 R2 has a new scheduling policy called "Resource Balanced Scheduling" that will try to do this.
Friday, January 22, 2010 6:40 PM
-
Great. Looking forward to it. Any word on a potential release date for R2?
Thanks,
Matt
Friday, January 22, 2010 10:24 PM
-
Hello,
I actually had the same problem with my own activation filter (for license usage). I realized that the filter blocks the queue even when other jobs could run.
My workaround was to have the activation filter not cancel the job but instead move it to the "configuring" state. This unblocks the queue for the rest of the jobs. I then have another program running in the background that monitors all the jobs that are in the "configuring" state because of licenses and tries to resubmit them after a period of time that you can set. If the activation filter again detects that there are not enough licenses, it puts the job back in the "configuring" state, and the cycle repeats.
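The background program in this workaround could be structured roughly as below. This is only a sketch of the loop: `list_parked_jobs` and `resubmit` are hypothetical callables that you would implement against the scheduler API or command-line tools, and nothing here is taken from the poster's actual program.

```python
import time


def resubmit_parked_jobs(list_parked_jobs, resubmit):
    """Try to resubmit every parked job once; return the IDs that were resubmitted."""
    resubmitted = []
    for job_id in list_parked_jobs():
        try:
            resubmit(job_id)
        except Exception as error:  # keep going if one job fails to resubmit
            print(f"could not resubmit job {job_id}: {error}")
            continue
        resubmitted.append(job_id)
    return resubmitted


def monitor(list_parked_jobs, resubmit, period_seconds=300.0,
            max_cycles=None, sleep=time.sleep):
    """Repeat the resubmit pass every period_seconds (forever if max_cycles is None)."""
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        resubmit_parked_jobs(list_parked_jobs, resubmit)
        cycle += 1
        sleep(period_seconds)
```

A resubmitted job goes through the activation filter again, so a job that still lacks licenses is simply parked once more and picked up on a later cycle.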
As other users have said, there is a risk that such a job never runs because the resources keep being taken by other jobs, but so far this implementation works for me.
This lets me "emulate" the behavior of the next version by deciding whether to move the job to the "configuring" state for a period of time, cancel it, or just block the queue.
The next version of HPC will fully support this capability, but for now my activation filter is working just fine. I do have another problem when trying to resubmit the job (see the thread "Activation filter: how to get rid of Remember this password? (Y/N)").
Regards
Thursday, May 6, 2010 12:34 PM