How do you determine which machines have been allocated for your job?

  • Question

  • Under systems such as PBS, TORQUE, and LSF, when a job is submitted and enters the "Running" state, the workload manager sets an environment variable such as PBS_NODEFILE or LSB_MCPU_HOSTS that identifies the nodes allocated to the job. Is there an MS Job Scheduler equivalent?

    Basically, I've created an application that uses the command line to create a job and then submit it. I use a job file as a "template" and modify it based on the user's input. [NOTE: Here "template" refers to my application's template, not the job scheduler template used to define job policy.] The problem is that my application starts a client application that in turn calls mpiexec. My goal is to make sure that mpiexec uses the resources allocated for the job. The basic pattern looks as follows:

    query user for job details
    generate an XML job file from a generic XML job file
    create new job: 
         job new /jobfile:[my_jobfile.xml]
    submit job: 
         job submit /id:[my_job_id]
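
For illustration, the create and submit steps above could be driven from a script roughly like this. It is only a sketch: Python as the driver language and the helper names `build_job_commands` and `run` are assumptions, and how the job id is read back from the output of `job new` is deliberately left out because that output format is not shown here.

```python
import subprocess


def build_job_commands(jobfile, job_id):
    """Return the argument lists for the 'job new' and 'job submit' steps.

    Only the two command forms from the pattern above are used; both
    helper names in this sketch are hypothetical.
    """
    new_cmd = ["job", "new", f"/jobfile:{jobfile}"]
    submit_cmd = ["job", "submit", f"/id:{job_id}"]
    return new_cmd, submit_cmd


def run(cmd):
    """Run one scheduler command and return whatever it printed."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Building the argument lists separately from running them keeps the scheduler-specific part small, which makes it easier to swap in another workload manager's commands.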

    The CommandLine attribute in the Task element of my job file starts an application, which in turn starts an MPI application via mpiexec.
    I would like to be able to make sure mpiexec uses the resources allocated to the job.

    I know that this is somewhat convoluted, but one of my goals is to create a framework that is as generic as possible and allows me to run on multiple platforms and under multiple workload managers.  My current abstraction gives me that flexibility, so I am hesitant to move to a new implementation unless I can be convinced that it is really the right thing to do.
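
To make the abstraction concrete, this is roughly what the existing lookup does for the two variables mentioned above; the function names are hypothetical and the code is a sketch, not the actual implementation. It assumes PBS_NODEFILE holds the path to a file with one host name per line (repeated once per allocated slot) and LSB_MCPU_HOSTS holds alternating host names and slot counts. An MS Job Scheduler branch would slot in alongside these if an equivalent variable exists.

```python
import os


def hosts_from_pbs_nodefile(path):
    """PBS/TORQUE: count how many times each host appears in the node file."""
    counts = {}
    with open(path) as fh:
        for line in fh:
            host = line.strip()
            if host:
                counts[host] = counts.get(host, 0) + 1
    return counts


def hosts_from_lsb_mcpu_hosts(value):
    """LSF: parse 'hostA 2 hostB 4' into {'hostA': 2, 'hostB': 4}."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating host names and slot counts")
    return {tokens[i]: int(tokens[i + 1]) for i in range(0, len(tokens), 2)}


def allocated_hosts(env=None):
    """Return {host: slots} for the current job, or None if nothing is set."""
    env = os.environ if env is None else env
    if "PBS_NODEFILE" in env:
        return hosts_from_pbs_nodefile(env["PBS_NODEFILE"])
    if "LSB_MCPU_HOSTS" in env:
        return hosts_from_lsb_mcpu_hosts(env["LSB_MCPU_HOSTS"])
    return None
```

The resulting host-to-slot mapping is what the client application would turn into a host list for mpiexec.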

    Thanks for any suggestions.

    Monday, August 31, 2009 7:15 PM