Job dependency on Windows HPC Cluster Manager

  • Question

  • I'm using Microsoft HPC 2008 R2 Cluster Manager and I need to add a dependency between several jobs (the second job must not run while the first one is still running), but I found that the HPC system does not have this feature (although dependencies can be added between tasks).
    How can I create this dependency?
    Thursday, February 13, 2014 10:42 AM

All replies

  • Unfortunately, 2008 R2 does not support job dependencies, but this feature was introduced in HPC Pack 2012.

    In 2012, there is a new panel in the New Job dialog for setting dependencies, and the SDK also provides a public interface for this:

    http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulerjob.childjobids(v=vs.85).aspx

    So I suggest upgrading to 2012.

    If you do need dependencies in 2008 R2, you can use a PowerShell script to check the state of all the parent jobs and submit the child jobs once the parents have finished:

    while ($true) { if ( (Get-HpcJob -Id 111).State -eq 'Finished' ) { Submit-HpcJob -Id 222 } Sleep(3) }
    Friday, February 14, 2014 2:21 AM
  • Thank you for your response.

    1- I can't upgrade the HPC system because it would cost me a fortune, and if I do decide to upgrade it one day, I will have to upgrade the OS as well.

    2- About the PowerShell script: at first sight, it looks like it will run forever (while ($true))!

    In addition, I'm not familiar with the scripting language (how can I get the job ID? Does Sleep(3) mean the job will sleep for 3 ms? ...)

    Thank you again

    Friday, February 14, 2014 11:05 AM
  • It's just a sample to show the main logic.

    I'll improve it and explain a bit more.

    The job ID can be found in Cluster Manager or Job Manager once you submit a job.

    Lines starting with # in the script are comments.

    while ($true) {

        # Get the job with ID 111 (the 'parent') and check whether it has finished.
        if ( (Get-HpcJob -Id 111).State -eq 'Finished' )
        {
            # The parent has finished, so submit the 'child' (job 222).
            Submit-HpcJob -Id 222

            # Leave the loop, since the child has been submitted.
            break
        }

        # Wait 3 seconds before checking again. Adjust the interval to suit your needs.
        Start-Sleep -Seconds 3

    }
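The loop above keeps polling forever if the parent job fails or is canceled. A minimal sketch of a bounded variant follows. The function name `Wait-ParentJob` and its parameters are hypothetical, and it assumes the scheduler reports the state names `Finished`, `Failed` and `Canceled`. The state lookup is passed in as a script block so the waiting logic stays separate from the HPC cmdlets.

```powershell
# Hypothetical helper: returns $true once the parent job has finished,
# $false if it failed, was canceled, or the timeout expired.
function Wait-ParentJob {
    param(
        [scriptblock]$GetState,        # returns the parent's current state
        [int]$TimeoutSeconds = 3600,   # give up after this long
        [int]$PollSeconds = 3          # delay between checks
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        $state = [string](& $GetState)
        if ($state -eq 'Finished') { return $true }
        if ($state -eq 'Failed' -or $state -eq 'Canceled') { return $false }
        Start-Sleep -Seconds $PollSeconds
    }
    return $false
}

# Example use (requires the HPC cmdlets; 111 and 222 are the sample IDs from above):
# if (Wait-ParentJob -GetState { (Get-HpcJob -Id 111).State }) {
#     Submit-HpcJob -Id 222
# }
```

Returning a Boolean leaves the decision about the child job to the caller, for example skipping or canceling it when the parent did not finish.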


    • Edited by SnOoPy1214 Monday, February 17, 2014 1:53 AM
    Monday, February 17, 2014 1:49 AM
  • That seems very helpful, I will try it.

    However, I have a small doubt about the job ID: in my case there are many jobs in the grid, so I can't look up the job ID and hard-code it like you did (the jobs I execute will be scheduled, so I don't know which ID will be assigned to them).

    Monday, February 17, 2014 11:13 AM
  • To be honest, it's tough to achieve job dependencies with scripts if the dependencies are complex or the number of jobs is large.

    Still, if you really need it, you can place the code in a function and pass the job IDs (or names) as parameters (job names can be used to filter jobs, too).

    You may need to define a data structure that holds all the job dependencies and let the script check them at regular intervals.

    There might be lots of script work there...
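As a rough illustration of the data-structure idea, the sketch below keeps the dependencies in a hashtable keyed by child job name and works out which children are ready to submit. The function name `Get-ReadyJobs`, the table layout and the sample job names are assumptions made for this example, not part of HPC Pack. It also assumes job names are unique and that a finished job reports the state `Finished`.

```powershell
# Hypothetical dependency table: child job name -> names of its parent jobs.
$dependencies = @{
    'JobFour' = @('JobThree')
    'JobFive' = @('JobTwo', 'JobThree')
}

# Returns the names of the children whose parents have all finished.
function Get-ReadyJobs {
    param(
        [hashtable]$Dependencies,   # child name -> array of parent names
        [hashtable]$States          # job name -> current state as a string
    )
    $ready = @()
    foreach ($child in $Dependencies.Keys) {
        # A parent that is missing from $States counts as not finished.
        $pending = @($Dependencies[$child] | Where-Object { $States[$_] -ne 'Finished' })
        if ($pending.Count -eq 0) { $ready += $child }
    }
    $ready | Sort-Object
}

# Possible polling loop (requires the HPC cmdlets; assumes the child jobs
# were created but not yet submitted):
# while ($dependencies.Count -gt 0) {
#     $states = @{}
#     Get-HpcJob -State All | ForEach-Object { $states[$_.Name] = [string]$_.State }
#     foreach ($name in @(Get-ReadyJobs $dependencies $states)) {
#         Get-HpcJob -Name $name -State Configuring | Submit-HpcJob
#         $dependencies.Remove($name)
#     }
#     Start-Sleep -Seconds 30
# }
```

Keeping the readiness check free of scheduler calls means it can be tried out with made-up state tables before it is pointed at a real cluster.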

    Tuesday, February 18, 2014 2:39 AM
  • I will try to implement this solution

    I could not have done it without your help

    Thanks, and I hope I did not disturb you.

    Tuesday, February 18, 2014 9:44 AM
  • I have another question, if you don't mind:

    I have a batch file that launches the jobs, something like this:

    for /f ....  do (
    job add %%i /name:"JobOne"

    job add %%i /name:"JobTwo"

    job add %%i /name:"JobThree"

    job add %%i /name:"JobFour" /depend:"JobThree" ...

    job add %%i /name:"JobFive" /depend:"JobThree" ...

    job add %%i /name:"JobSix" /depend:"JobThree" ...

    job add %%i /name:"JobSeven" /depend:"JobThree" ...

    job submit /sched......

    )

    In this case, jobs 4, 5, 6 and 7 should not be executed until job 3 finishes, right?
    In fact, that is not what happens:

    even when job 3 finishes, jobs 4, 5, 6 and 7 are still waiting for jobs 1 and 2 to finish!

    What can I do?

    Wednesday, February 26, 2014 11:01 AM
  • I wrote a PowerShell function that waits until all jobs are finished:

    function WaitNoJobs
    {
        # Suppress error output. The loop below ends as soon as any error is
        # recorded in $Error, which this function expects to happen when
        # Get-HpcJob finds no jobs.
        $ErrorActionPreference = 'SilentlyContinue'
        $Error.Clear()
        $jobs = Get-HpcJob -Scheduler <headnode>
        while (!$Error) {
            Write-Host 'Waiting, jobs remaining = ' $jobs.Count
            # Check once a minute
            Start-Sleep -Seconds 60
            $jobs = Get-HpcJob -Scheduler <headnode>
        }
    }

    Friday, February 6, 2015 9:16 PM