How to use only one core of a node?

  • Question

  • Hello, I am a new user of a Windows HPC cluster consisting of 3 dual-socket, dual-core nodes, i.e. each node has 4 cores. Until now I have only worked with Linux clusters, so I first have to get used to the job manager. At the moment I intend to compare the Linux and Windows clusters with the Intel MPI benchmark, and I want to test different combinations of nodes and cores. But I am wondering how it is possible to start only ONE process per node, i.e. use only ONE core per node, so that a 3-process job uses all 3 nodes and not just 3 cores of one node.  The mpiexec man page specifies the -pernode option, but this option doesn't work on the cluster when written on the command line (after mpiexec) in the Task List. I also played around with different settings for maximum cores when creating a new job template, but without the desired success. I would be very happy if anybody gave me a hint on how to solve this problem. Many thanks, parsus 
    Wednesday, September 10, 2008 10:49

Answers

  • Hello, Parsus.
    Yes, the job scheduler on HPC Server (or CCS) is a bit different from running MPI on a set of Linux nodes.  The key difference is the primary role the job scheduler takes in assigning resources, as opposed to mpiexec.  However, there are cases where the job scheduler and mpiexec arguments can be used together to better control process placement of your MPI application. 

    The simplest means of running a single process on each of N nodes is with node-based scheduling (/numnodes:) like this: 
        job submit /numnodes:3 mpiexec imb.exe
    which would run the MPI application named "imb.exe" on 3 nodes with one process (MPI rank) per node.  The HPCS scheduler allows you to schedule by node, socket, or core.  

    You can also combine job scheduler and mpiexec arguments to more closely control the process placement of your application.  For example: 
        job submit /numnodes:3 mpiexec -cores 2 imb.exe
    will run 2 MPI processes on each of 3 nodes for a total of 6 MPI ranks for the job. 

    Note that mpiexec's "-affinity" argument can be used to separate processes on a node to avoid contention (and the resulting memory swapping and poor performance).  The -affinity option causes mpiexec to place processes so that no 2 processes share the same L1 cache, L2 cache, Lx cache, physical package, or NUMA node (in order of precedence). 
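    Putting the pieces above together, a sketch of such a run might look like this (the flags are the ones named in this thread; imb.exe stands in for your own benchmark binary):

    ```shell
    rem Sketch: 2 ranks on each of 3 nodes, with -affinity so the two ranks
    rem on a node avoid sharing the same cache or physical package.
    job submit /numnodes:3 mpiexec -cores 2 -affinity imb.exe
    ```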

    IMPORTANT NOTE:  You mentioned "the -pernode option..." by which I believe you intended the "/corespernode" argument.  Please be aware /corespernode is a requirement and not a resource request.  For example, 
        job submit /corespernode:4  /numnodes:2 app.exe
    will run a single process on each of 2 nodes, where each of those nodes must have at least 4 cores. 


    Hope this helps.
    Eric

    Eric Lantz (Microsoft)
    • Proposed as answer by elantz (Microsoft employee) Tuesday, September 16, 2008 19:28
    • Marked as answer by parsus Thursday, September 18, 2008 11:39
    Tuesday, September 16, 2008 19:28
  • We've gotten a few questions about this, so I've posted this blog entry about how to do process placement with Windows HPC Server 2008.  Check it out and let me know if it helps (or is wrong in any way!):
     https://windowshpc.net/Blogs/jobscheduler/Lists/Posts/Post.aspx?ID=9

    Thanks!
    Josh

    • Proposed as answer by Josh Barnard (Moderator) Wednesday, September 17, 2008 00:43
    • Marked as answer by parsus Thursday, September 18, 2008 11:39
    Wednesday, September 17, 2008 00:43
    Moderator
  • Many thanks to both of you!!! Your detailed explanations helped a lot. Now jobs can be easily started via cmd or PowerShell. Within the Job Manager, selecting e.g. 'Job resources': Nodes, Min: 2, Max: 2 replaces the explicit option /numnodes:2, while '-cores' on the command line specifies the number of cores.

    My only concern is the following: If I start a job

    job submit /numnodes:2 mpiexec -cores 2 Job.exe

    on two nodes of a cluster with 3 dual-socket, dual-core nodes, i.e. 4 cores per node, and look at 'Heat Map/Cores in Use', I would expect to see 2 busy cores on each of 2 nodes. But the heat map indicates that 4 cores per node are working on 2 nodes (no other jobs are running!), although the output of the job clearly indicates that only a total of 4 processes (not 8) worked together. In contrast, 'Heat Map/CPU Usage' behaves as expected: only a maximum of 50% is reached on both nodes.

    Many greetings, Parsus

    The "cores in use" refers to the cores that have been allocated by the scheduler, not necessarily those being used by your application.  In your case, since you requested whole nodes, all cores are assigned to your job.
    -Josh
    Thursday, May 07, 2009 17:30
    Moderator

All Replies

  • Many thanks to both of you!!! Your detailed explanations helped a lot. Now jobs can be easily started via cmd or PowerShell. Within the Job Manager, selecting e.g. 'Job resources': Nodes, Min: 2, Max: 2 replaces the explicit option /numnodes:2, while '-cores' on the command line specifies the number of cores.

    My only concern is the following: If I start a job

    job submit /numnodes:2 mpiexec -cores 2 Job.exe

    on two nodes of a cluster with 3 dual-socket, dual-core nodes, i.e. 4 cores per node, and look at 'Heat Map/Cores in Use', I would expect to see 2 busy cores on each of 2 nodes. But the heat map indicates that 4 cores per node are working on 2 nodes (no other jobs are running!), although the output of the job clearly indicates that only a total of 4 processes (not 8) worked together. In contrast, 'Heat Map/CPU Usage' behaves as expected: only a maximum of 50% is reached on both nodes.

    Many greetings, Parsus
    Thursday, September 18, 2008 06:52
  • Please note that the -cores switch only applies to MS-MPI and not to other MPI implementations.

    As to your question about the heat map:
    "Cores in Use" is the number of cores allocated to the job; the scheduler did indeed allocate all cores to your job, so all 4 show up in the heat map.
    "CPU Usage" indicates what percentage of the CPUs is actually in use, and as you expected it reaches 50%.

    thanks,
    .Erez
    • Edited by Lio Monday, September 22, 2008 22:34
    Monday, September 22, 2008 22:34
  • Hi,
    Is it possible to specify one core on a node in CCS 2003?

    Thanks,
    Kenji
    Wednesday, October 01, 2008 12:50
  • Hi,
    Is it possible to run 2 tasks/jobs at the same time, as follows?

    MPIapp1: 4 Cores of Node1 and  4 Cores of Node2
    MPIapp2: 4 Cores of Node1 and  4 Cores of Node2

     - each node has 2 quad-core processors

    I couldn't find how to run those jobs/tasks at the same time (not sequentially).
    I think "mpiexec -cores" does not work when the UnitType is not Node.
    And if the UnitType is Node, multiple tasks cannot run on the same node.

    Thanks,

    Friday, October 31, 2008 03:21
  • I don't think this is possible today with separate jobs.  If you really want this, the best way is to create 1 job with a single task, where that task is an mpiexec command that starts two separate applications.
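    As a sketch of that approach: mpiexec accepts multiple colon-separated command blocks, so a single task can launch both applications together as one MPI job (the application names and rank counts below are placeholders matching the question above):

    ```shell
    rem One job, one task: a single mpiexec starts 8 ranks of MPIapp1.exe
    rem and 8 ranks of MPIapp2.exe together as one MPI job (MPMD style).
    job submit /numnodes:2 mpiexec -n 8 MPIapp1.exe : -n 8 MPIapp2.exe
    ```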

    Thanks,
    Josh


    Thursday, May 07, 2009 17:32
    Moderator
  • By the way, the link above seems to be broken.  Here is an updated link:
    http://blogs.technet.com/windowshpc/archive/2008/09/16/mpi-process-placement-with-windows-hpc-server-2008.aspx

    Thanks!
    -Josh
    Thursday, May 07, 2009 17:32
    Moderator
  • Hi all,

        If I submit a job with "job submit /numnodes:2 mpiexec -cores 1 job.exe", will it use one core from each node?

    And I have another query: how can I submit a 6-core job by selecting 2 cores from one node and another 4 cores from the other node? If it is possible, please give the command.

    Regards,
    Kalyan
    Monday, June 15, 2009 17:02