Job shrinking / growing with different resource granularity

  • Question

  • Our cluster is configured to adjust resources automatically to give precedence to jobs with higher priority (graceful pre-emption plus increasing and decreasing of resources).

    We have one job (A) with many tasks, each requiring a single (4-core) socket. The job runs at BelowNormal priority, with its job resources set to 1-auto sockets.
    A second job (B) with Normal priority was submitted afterwards. It has several tasks requiring 1 core each, 2 of which are queued. Its job resources are set to auto-auto cores.
    Neither of the jobs/tasks is set to exclusive resource usage.

    We were under the impression that job (A) should shrink whenever it finishes a task, as long as job (B) still has tasks waiting for cores. However, this does not seem to happen: job (B) has tasks waiting, while job (A) finished several tasks but kept all 15 of its sockets to start new tasks, instead of handing a socket over to (B), which needs only 2 cores for its two remaining tasks.

    Is this the expected behavior or does it look like we have a configuration problem?


    On a side note, the activity log of job (A) lists "added 1 core on X" when it should probably be 1 socket.


    HPC Version: 2.1.1703.0 on Windows 2008 Server.


    PS: Adding a third job caused job (A) to give up its socket. It looks like a socket is only given up if all of its cores are requested by jobs with higher priority...
    Tuesday, September 15, 2009 11:23 AM

Answers

  • Actually, job A will only shrink far enough to get job B running.

    We may change this behavior in a future version to be more in line with your expectations; that is still under discussion.
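
As a rough illustration of the rule stated above, here is a toy model in Python. It is not the scheduler's actual logic. The function name `sockets_to_release`, its parameters, and the assumption that an "auto" minimum resolves to 1 core are all hypothetical; only the 4-core socket size comes from the question.

```python
import math

CORES_PER_SOCKET = 4  # from the question: each socket has 4 cores


def sockets_to_release(b_running_cores, b_min_cores, free_cores):
    """Toy model: the lower-priority job gives up only as many sockets
    as the higher-priority job needs to reach its minimum and start."""
    # Cores the higher-priority job still lacks before it can run at all.
    shortfall = max(0, b_min_cores - b_running_cores - free_cores)
    # The lower-priority job holds whole sockets, so round up.
    return math.ceil(shortfall / CORES_PER_SOCKET)


# Job B is already running (here on a hypothetical 3 cores) with an
# assumed minimum of 1 core: its queued tasks do not make job A shrink.
print(sockets_to_release(b_running_cores=3, b_min_cores=1, free_cores=0))

# A higher-priority job that is not running at all makes job A
# release a socket.
print(sockets_to_release(b_running_cores=0, b_min_cores=1, free_cores=0))
```

Under this model, queued tasks of a job that has already reached its minimum never trigger a shrink, which matches what was observed for job (B); a newly submitted job that cannot start at all does, which matches the observation in the PS.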

    Thanks,
    Josh
    Friday, September 18, 2009 11:44 PM
    Moderator