node prep running every time a task runs

  • Question

  • Instead of running just once at the beginning of the job on each available resource, our node prep task runs many times, apparently once for each regular task, and every run is successful. Before each task runs, node prep runs on any available node, so in a job that can only run on 25 nodes I sometimes see 900+ node prep runs. Does anyone have any idea why this could be happening?

    Thanks!

    Thursday, August 3, 2017 8:22 PM

All replies

  • Hi Nick,

      The node prep task runs on a node each time that node is allocated to the job, before any body task runs on it. That means that if a node is allocated to the job many times, the node prep task can run many times.

      Thus, either of the situations below will cause this:

    1. Preemption happens and the node is released from this job for another job.

    2. The node becomes unreachable and is no longer usable for the job, then becomes reachable again and gets assigned to the job again.

    In situation 1, the number of node prep task instances should match the number of node release task instances. In situation 2, there will be fewer node release task instances, because the node release task won't run on "unreachable" nodes.
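
To make the counting concrete, here is a small Python sketch. It is not HPC Pack code and does not call any scheduler API. The event names ("allocate", "preempt", "unreachable") and the function `count_prep_and_release` are made up for illustration, and it assumes that node release also runs on nodes still allocated when the job ends. It only models the rule described above: node prep runs on every allocation, while node release is skipped when a node is lost because it became unreachable.

```python
def count_prep_and_release(events):
    """Count node prep and node release runs for a simplified job model.

    events: iterable of (node_name, kind), where kind is one of the
    hypothetical labels "allocate", "preempt" or "unreachable".
    Returns (prep_runs, release_runs).
    """
    allocated = set()   # nodes currently allocated to the job
    prep_runs = 0
    release_runs = 0

    for node, kind in events:
        if kind == "allocate":
            if node not in allocated:
                allocated.add(node)
                prep_runs += 1          # prep runs on every (re)allocation
        elif kind == "preempt":
            if node in allocated:
                allocated.remove(node)
                release_runs += 1       # node is released normally
        elif kind == "unreachable":
            allocated.discard(node)     # node is lost, no release run
        else:
            raise ValueError(f"unknown event kind: {kind!r}")

    # Assumption: at job end, release runs on every node still allocated.
    release_runs += len(allocated)
    return prep_runs, release_runs


# Situation 1: preemption only, so the two counts match.
preempted = [("n1", "allocate"), ("n1", "preempt"), ("n1", "allocate")]
print(count_prep_and_release(preempted))

# Situation 2: the node drops out and comes back, so release runs fewer times.
flaky = [("n1", "allocate"), ("n1", "unreachable"), ("n1", "allocate")]
print(count_prep_and_release(flaky))
```

If you compare the node prep and node release counts for your own job in the same way, equal counts would point to preemption and a lower release count would point to nodes going unreachable.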


    Qiufang Shi

    Friday, August 4, 2017 3:08 AM