How to use only one core of a node?

  • Question

  • Hello, I am a new user of a Windows HPC cluster consisting of 3 double-dual nodes, i.e. each node has 4 cores. Up to now I have only worked with Linux clusters, so I first have to get used to the job manager. At the moment I intend to compare the Linux and Windows clusters with the Intel MPI benchmark and want to test different combinations of nodes and cores. But I am wondering how it is possible to start only ONE process per node, i.e. use only ONE core per node, so that a 3-process job uses all 3 nodes and not just 3 cores of one node.  The mpiexec man page documents a -pernode option, but this option doesn't work on the cluster when written on the command line (after mpiexec) in the Task List. I also played around with different settings for maximum cores when creating a new job template, but without the desired success. I would be very happy if anybody could give me a hint on how to solve this problem. Many thanks, parsus 
    September 10, 2008, 10:49

Answers

  • Hello, Parsus.
    Yes, the job scheduler on HPC Server (or CCS) is a bit different than running MPI on a set of Linux nodes.  The key difference is the primary role the job scheduler takes in assigning resources as opposed to mpiexec.  However, there are cases where the job scheduler and mpiexec arguments can be used together to better control process placement of your MPI application. 

    The simplest means of running a single process on each of N nodes is with node-based scheduling (/numnodes:) like this: 
        job submit /numnodes:3 mpiexec imb.exe
    which would run the MPI application named "imb.exe" on 3 nodes with one process (MPI rank) per node.  The HPCS scheduler allows you to schedule by node, socket, or core.  

    You can also combine job scheduler and mpiexec arguments to more closely control the process placement of your application.  For example: 
        job submit /numnodes:3 mpiexec -cores 2 imb.exe
    will run 2 MPI processes on each of 3 nodes for a total of 6 MPI ranks for the job. 

    Note that mpiexec's "-affinity" argument can be used to separate processes on a node to avoid contention (and the resultant memory swapping and poor performance).  The -affinity option will cause mpiexec to place processes in such a way as to avoid any 2 processes sharing the same: L1 cache, L2 cache, Lx cache, physical package, NUMA node (this list in order of precedence). 
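    As a sketch of how these pieces combine (assuming the same "imb.exe" benchmark as above; exact placement behavior may vary by MS-MPI version):

    ```shell
    # Schedule 3 whole nodes, run 2 ranks per node, and ask mpiexec
    # to pin the ranks so that no 2 of them share a cache or socket.
    job submit /numnodes:3 mpiexec -cores 2 -affinity imb.exe
    ```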

    IMPORTANT NOTE:  You mentioned "the -pernode option..." by which I believe you intended the "/corespernode" argument.  Please be aware /corespernode is a requirement and not a resource request.  For example, 
        job submit /corespernode:4 /numnodes:2 app.exe
    will run a single process on each of 2 nodes, where each of those nodes must have at least 4 cores. 


    Hope this helps.
    Eric

    Eric Lantz (Microsoft)
    • Proposed as answer by elantz (Microsoft employee) September 16, 2008, 19:28
    • Marked as answer by parsus September 18, 2008, 11:39
    September 16, 2008, 19:28
  • We've gotten a few questions about this . . . so I've posted this blog entry about how to do process placement with Windows Server 2008.  Check it out and let me know if it helps (or is wrong in any way!):
     https://windowshpc.net/Blogs/jobscheduler/Lists/Posts/Post.aspx?ID=9

    Thanks!
    -Josh
    • Proposed as answer by Josh Barnard (Moderator) September 17, 2008, 0:43
    • Marked as answer by parsus September 18, 2008, 11:39
    September 17, 2008, 0:43
    Moderator
  • Many thanks to both of you!!! Your detailed explanations helped a lot. Now jobs can be easily started via cmd or PowerShell. Within the Job Manager, selecting e.g. 'Job resources': Nodes, Min: 2, Max: 2 replaces the explicit option /numnodes:2, while using '-cores' on the Command Line specifies the number of cores.

    My only concern is the following: If I start a job

    job submit /numnodes:2 mpiexec -cores 2 Job.exe

    on two nodes of a cluster with 3 double-duals, i.e. 4 cores per node, and have a look at 'Heat Map/Cores in Use', I would expect to see 2 busy cores on each of 2 nodes. But the heat map indicates that 4 cores per node are working on 2 nodes (no other jobs are running!), although the output of the job clearly indicates that only a total of 4 processes (not 8) worked together. In contrast, the 'Heat Map/CPU Usage' behaves as expected: only a maximum of 50% is reached on both nodes.

    Many greetings, Parsus

    The "cores in use" refers to the cores that have been allocated by the scheduler, not necessarily those being used by your application.  In your case, since you requested the whole nodes, all cores are assigned to your job.
    -Josh
    May 7, 2009, 17:30
    Moderator

All replies

  • Many thanks to both of you!!! Your detailed explanations helped a lot. Now jobs can be easily started via cmd or PowerShell. Within the Job Manager, selecting e.g. 'Job resources': Nodes, Min: 2, Max: 2 replaces the explicit option /numnodes:2, while using '-cores' on the Command Line specifies the number of cores.

    My only concern is the following: If I start a job

    job submit /numnodes:2 mpiexec -cores 2 Job.exe

    on two nodes of a cluster with 3 double-duals, i.e. 4 cores per node, and have a look at 'Heat Map/Cores in Use', I would expect to see 2 busy cores on each of 2 nodes. But the heat map indicates that 4 cores per node are working on 2 nodes (no other jobs are running!), although the output of the job clearly indicates that only a total of 4 processes (not 8) worked together. In contrast, the 'Heat Map/CPU Usage' behaves as expected: only a maximum of 50% is reached on both nodes.

    Many greetings, Parsus
    September 18, 2008, 6:52
  • Please note that the -cores switch only applies to MS-MPI and not to other MPI implementations.

    As to your question about the heat map:
    The "Cores in Use" is the number of cores allocated for the job, and indeed the scheduler allocated all cores to your job, so all 4 show up in the heat map.
    The "CPU Usage" indicates what percentage of the CPUs is actually used, and as you expected it would be 50%.

    Thanks,
    .Erez
    • Edited by Lio September 22, 2008, 22:34
    September 22, 2008, 22:34
  • Hi,
    Is it possible to specify one core on a node in CCS 2003?

    Thanks,
    Kenji
    October 1, 2008, 12:50
  • Hi,
    Is it possible to run 2 tasks/jobs at the same time as follows?

    MPIapp1: 4 cores of Node1 and 4 cores of Node2
    MPIapp2: 4 cores of Node1 and 4 cores of Node2

     - each node has 2 quad-core processors

    I couldn't find how to run those jobs/tasks at the same time (not sequentially).
    I think "mpiexec -cores" does not work when the UnitType is not Node.
    And if the UnitType is Node, multiple tasks cannot run on the same node.

    Thanks,

    October 31, 2008, 3:21
  • I don't think this is possible today with separate jobs.  If you really want this, the best way is to create 1 job with a single task, where that task is an mpiexec command that starts two separate applications.
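    A sketch of that approach (the application names are placeholders): mpiexec accepts multiple application blocks separated by a colon, so one task can start both programs at once:

    ```shell
    # One job, one task: the colon separates two application blocks,
    # each with its own rank count, launched together by one mpiexec.
    job submit /numnodes:2 mpiexec -n 4 MPIapp1.exe : -n 4 MPIapp2.exe
    ```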

    Thanks,
    -Josh
    May 7, 2009, 17:32
    Moderator
  • By the way, the link above seemed broken.  Here is an updated link:
    http://blogs.technet.com/windowshpc/archive/2008/09/16/mpi-process-placement-with-windows-hpc-server-2008.aspx

    Thanks!
    -Josh
    May 7, 2009, 17:32
    Moderator
  • Hi all,

        If I submit a job with "job submit /numnodes:2 mpiexec -cores 1 job.exe", will it use one core from each node?

    And I have another query: how can I submit a 6-core job by selecting 2 cores from one node and another 4 cores from the other node? If it is possible, please give the command.

    Regards,
    Kalyan
    June 15, 2009, 17:02