Running an MPI job using the HPC scheduler on several hosts

  • Question

  • Hi,

    I'm trying to run an MPI application on several compute nodes of an HPC cluster. I've copied the MPI application to the local disk of each compute node, and I thought I should run it like this:

    job submit /scheduler:my_head_node /jobtemplate:my_job_template /jobname:TransformationEngine-MPI /stdout:c:\temp\stdout.txt /stderr:c:\temp\stderr.txt /numnodes:2 mpiexec -n 1 c:\dev\mpi_job.exe

    I would expect the job scheduler to start one instance of mpi_job.exe on each of the two allocated compute nodes, and I would expect the two instances to be able to communicate. However, only one task is started, even though two compute nodes are allocated.

    I've also tried to start the job like this:

    job new /scheduler:my_head_node /jobtemplate:my_job_template /jobname:TransformationEngine-MPI /numnodes:2
    job add my_job_id mpiexec -n 1 c:\dev\mpi_job.exe
    job add my_job_id mpiexec -n 1 c:\dev\mpi_job.exe
    job submit /id:my_job_id

    And in this case two MPI tasks are started, but both of them get rank 0, and don't communicate with each other.

    Finally, I've also tried to run the application like this:

    job submit /scheduler:my_head_node /jobtemplate:my_job_template /jobname:TransformationEngine-MPI /stdout:c:\temp\stdout.txt /stderr:c:\temp\stderr.txt /numnodes:2 mpiexec -hosts 2 hostname_1 hostname_2 c:\dev\mpi_job.exe

    In this case the tasks failed with an error saying that host allocation is managed by the job scheduler, so I can't override it with mpiexec (I don't have the exact error message right now; someone else is hogging the cluster).

    I don't know what else to try. Any ideas?



    Friday, September 7, 2012 6:54 AM


  • The right command line was (for a 2-node cluster):

    job submit /scheduler:my_head_node /jobtemplate:my_job_template /jobname:TransformationEngine-MPI /stdout:c:\temp\stdout.txt /stderr:c:\temp\stderr.txt /numnodes:2 mpiexec -n 2 c:\dev\mpi_job.exe

    or (to run one process on each node of an arbitrarily sized cluster):

    job submit /scheduler:my_head_node /jobtemplate:my_job_template /jobname:TransformationEngine-MPI /stdout:c:\temp\stdout.txt /stderr:c:\temp\stderr.txt /numnodes:2 mpiexec -c 1 c:\dev\mpi_job.exe

    I got confused because I was expecting to see as many tasks as MPI processes, while there is just one task, with the whole MPI world inside.
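
    To make the difference concrete, here is a toy Python model of the two setups. It is only a sketch and not HPC Pack or MS-MPI code: the `worlds` helper is hypothetical, and it assumes nothing more than the standard MPI behavior that every mpiexec invocation creates its own world with ranks numbered from 0.

```python
# Toy model (not HPC Pack or MS-MPI code): each mpiexec invocation
# creates its own MPI world, and ranks in every world start at 0.

def worlds(process_counts):
    """Return the rank lists for a sequence of mpiexec invocations.

    process_counts[i] is the number of processes that the i-th
    mpiexec call launches (its -n value in the commands above).
    """
    return [list(range(n)) for n in process_counts]


# Two scheduler tasks, each running "mpiexec -n 1": two separate worlds.
two_tasks = worlds([1, 1])

# One scheduler task running "mpiexec -n 2": a single world of size 2.
one_task = worlds([2])

print(two_tasks)  # [[0], [0]]
print(one_task)   # [[0, 1]]
```

    In the first case both processes are rank 0 of their own one-process world, which matches what I saw with the two "job add" tasks. In the second case the scheduler shows a single task, but that task contains ranks 0 and 1 of the same world.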

    Monday, September 10, 2012 4:52 AM