Submitting job on 4 processors but compute cluster admin showing it as running on 8 processors.

  • Question

  • Hi All,

    I want to submit 64 jobs, each running on four processors, for a total of 256 cores. I submit the jobs as follows:

    job submit /numprocessors:4 mpiexec -n 4 executable -np 4

    The jobs get submitted and run. Each job shows up in the job scheduler as running on 4 processors, but the Compute Cluster Administrator shows it as running on 8 processors. I logged into one of the nodes and the job is in fact running on 4 processors. As a result, 32 jobs are queued and only 32 are running, because of the discrepancy between what the Compute Cluster Administrator reports and how the jobs are actually running. Any idea what's going on?

    Thanks,

    Ilya

    Monday, February 25, 2008 10:42 PM

Answers

  •  

    Ilya,

    There are a couple of things going on here. First, you don't need to (and shouldn't) specify -n for mpiexec; MS-MPI should pick up the correct settings from the Job Scheduler.

     

    Second, by default jobs are tagged as Exclusive (meaning no other jobs can share a node with them).  This means if you submit a job with "job submit /numprocessors:4 . . ." on a cluster with 8-core nodes, your job will claim an entire node even though you only requested 4 procs.  In your case it sounds like you need to use the flag "/exclusive:false" to change this setting.

     

    So your new command line would be:

     

    Code Snippet
    job submit /exclusive:false /numprocessors:4 mpiexec executable.exe

     

     

     

    Let me know if that fixes your problem.

     

    Thanks,
    Josh Barnard

    Microsoft HPC

    • Proposed as answer by Lio Monday, July 7, 2008 7:08 PM
    • Marked as answer by Don Pattee Monday, April 13, 2009 5:38 AM
    Thursday, February 28, 2008 7:52 PM
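
The arithmetic behind the 32 running / 32 queued split can be sketched in a few lines of Python. This is only a rough model, not how the scheduler is actually implemented: it assumes 8-core nodes (the example used in the answer above), 256 cores in total (from the question), and that an exclusive job is charged for every whole node it touches. The function name `concurrent_jobs` is made up for illustration.

```python
def concurrent_jobs(total_cores, cores_per_node, procs_per_job, exclusive):
    """Estimate how many jobs can run at once under a simplified allocation model."""
    nodes = total_cores // cores_per_node
    if exclusive:
        # An exclusive job claims whole nodes, even if it uses only part of one.
        nodes_per_job = -(-procs_per_job // cores_per_node)  # ceiling division
        return nodes // nodes_per_job
    # Non-exclusive jobs may share a node, so only the requested cores are charged.
    return total_cores // procs_per_job


if __name__ == "__main__":
    submitted = 64
    for exclusive in (True, False):
        running = min(submitted, concurrent_jobs(256, 8, 4, exclusive))
        print(f"exclusive={exclusive}: {running} running, {submitted - running} queued")
```

Under these assumptions the exclusive case gives 32 running and 32 queued, which matches what the question describes, while the non-exclusive case lets all 64 jobs run at once.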

All replies


  • Hi Ilya,

    Is this on Compute Cluster Server 2003, or are you testing Windows HPC Server 2008 Beta 1?

    Thanks,
    Phil

    Thursday, February 28, 2008 4:41 PM
  • 2003

    Thanks,

    Ilya

    Thursday, February 28, 2008 5:03 PM