Single Server Multiple Processor Group Syntax

  • Question

  • I have a server with 96 physical cores, available as 4 NUMA nodes or as 2 processor groups. MPI applications execute correctly using the default localhost or using the syntax:

    ./mpiexec -hosts 1 hostname 48 ./appname.exe

    However, I have not been able to use all 96 cores at the same time, because I cannot figure out the proper syntax for submitting the processor groups as two hosts.

    Either two hosts to access the processor groups or four hosts to access the NUMA nodes is fine.  We just need to be able to use all 96 physical cores for the same job.

    Using: MS-MPI version 9.0.12497.9, Windows Server 2016 64-bit, Intel Xeon 24-core (4 processors)
    Tuesday, March 6, 2018 5:42 PM

All replies

  • What happens if you try

    ./mpiexec -n 96 ./appname.exe

    or

    ./mpiexec -hosts 1 hostname 96 ./appname.exe

    Is submitting processor groups as separate hosts critical for you?

    Friday, March 9, 2018 6:51 AM
  • Submitting with any value higher than 48 gives the same result as 48: all 48 logical processors in one group become 100% active, while all logical processors in the other group stay dormant.

    As far as I can find, the issue is that an affinity mask is limited to a maximum of 64 cores. One host can only generate one affinity mask and can therefore (currently) manage at most 64 cores. The operating system and hardware setup result in the processors being managed as two groups of 48. Changing the server setup doesn't help, because disabling one of the NUMA nodes still leaves 72 cores, which would still be split into two groups (and we would like to use all 96 cores that we have).
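To make the arithmetic above concrete, here is a minimal sketch of how 24-core NUMA nodes might be packed into processor groups of at most 64 logical processors. The greedy packing in `pack_numa_nodes` is an illustrative assumption (a hypothetical helper, not the algorithm Windows actually uses); it is only meant to show why both 96 and 72 cores end up in two groups.

```python
MAX_GROUP_SIZE = 64  # an affinity mask covers at most 64 logical processors


def pack_numa_nodes(node_sizes, max_group_size=MAX_GROUP_SIZE):
    """Greedily pack whole NUMA nodes into groups (illustrative assumption only)."""
    groups = []
    for size in node_sizes:
        # Add the node to the current group if it still fits, else open a new group.
        if groups and groups[-1] + size <= max_group_size:
            groups[-1] += size
        else:
            groups.append(size)
    return groups


if __name__ == "__main__":
    print(pack_numa_nodes([24] * 4))  # four NUMA nodes, 96 cores
    print(pack_numa_nodes([24] * 3))  # one NUMA node disabled, 72 cores
```

Under this assumption, four nodes give two groups of 48 and three nodes give groups of 48 and 24, so neither layout fits in a single group.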

    The only critical detail is that all 96 logical cores work on one processing job.

    "mpiexec -cores 96 -np *" is the goal ... how to get that is the problem

    Edit/add: Recently we have noticed that if we submit multiple jobs on the server (each asking for 48 cores), there is a small chance that at least one of those jobs will be started on the other group. When this occurs, all 96 cores work at 100% concurrently. This is rare, however; usually all jobs compete for the same group, which leaves the other group unused.

    Monday, March 12, 2018 4:58 PM
  • Partial Update:

    I can submit the application to a specific affinity mask on either processor group with this syntax:

    ./mpiexec.exe -hosts 2 hostname 2,4:0,8:0 hostname 1,20:0 ./appname.exe

    where each host entry gives the affinity mask as bitmap:group. In this example, 8:0 allows the second task on the first host to run only on the 4th core of the 1st group, while 3:0 would allow the task to move between the first and second cores of the first group. The bitmap is the binary representation of the allowed core positions in the affinity mask.
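The bitmap:group notation can be decoded with a few lines of Python. This is a sketch under stated assumptions: `decode_affinity` and `full_group_mask` are hypothetical helpers, the bitmap is read as hexadecimal (the examples 8 and 3 decode the same way in decimal, so the thread does not settle this), and a missing group suffix is treated as group 0.

```python
def decode_affinity(spec, base=16):
    """Split 'bitmap:group' into (group, zero-based core indices).

    Assumption: the bitmap is hexadecimal; pass base=10 to read it as decimal.
    """
    mask_text, _, group_text = spec.partition(":")
    mask = int(mask_text, base)
    group = int(group_text) if group_text else 0  # assumed default group
    cores = [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
    return group, cores


def full_group_mask(core_count):
    """Hex bitmap with the lowest core_count bits set."""
    return format((1 << core_count) - 1, "x")


if __name__ == "__main__":
    print(decode_affinity("8:0"))  # (0, [3]): 4th core of the 1st group
    print(decode_affinity("3:0"))  # (0, [0, 1]): 1st and 2nd cores of the 1st group
    print(full_group_mask(48))     # ffffffffffff
```

Core indices here are zero-based, so index 3 corresponds to the "4th core" in the description above.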

    Changing the syntax to

    ./mpiexec.exe -hosts 2 hostname 2,4:1,8:1 hostname 1,20:1 ./appname.exe

    gives me the same behavior on the second group, but

    ./mpiexec.exe -hosts 2 hostname 2,4:0,8:0 hostname 1,20:1 ./appname.exe

    fails with the error "Aborting: An explicit affinity was used but the cores specified were either not allowed or do not exits."

    Since the first two commands run without error, I know the cores exist. Does this mean that spanning processor groups is still not allowed or supported?
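The pattern in the three commands can be summarized by listing which processor groups each `-hosts` argument references. The sketch below only reproduces the observation from this post (the two single-group commands run, the mixed one aborts); it does not confirm what rule MS-MPI applies internally. `groups_per_host` and `spans_groups` are hypothetical helpers, and the parsing assumes each host entry has the form `name count,bitmap:group,...` as in the commands shown.

```python
def groups_per_host(hosts_args):
    """Return [(host name, set of group numbers)] for a '-hosts' argument string."""
    tokens = hosts_args.split()
    host_count = int(tokens[0])
    result = []
    pos = 1
    for _ in range(host_count):
        name, spec = tokens[pos], tokens[pos + 1]
        pos += 2
        masks = spec.split(",")[1:]  # first field is the process count
        # Text after ':' is the group; an absent group is assumed to be 0.
        groups = {int(m.partition(":")[2] or 0) for m in masks}
        result.append((name, groups))
    return result


def spans_groups(hosts_args):
    """True if the masks in one command reference more than one processor group."""
    all_groups = set().union(*(g for _, g in groups_per_host(hosts_args)))
    return len(all_groups) > 1


if __name__ == "__main__":
    for args in (
        "2 hostname 2,4:0,8:0 hostname 1,20:0",  # ran on group 0
        "2 hostname 2,4:1,8:1 hostname 1,20:1",  # ran on group 1
        "2 hostname 2,4:0,8:0 hostname 1,20:1",  # aborted
    ):
        print(args, "-> spans groups:", spans_groups(args))
```

Only the third command mixes group 0 and group 1, and it is the only one that aborted.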

    Tuesday, March 13, 2018 9:06 PM
  • Have you tried submitting your job through HPC Pack? HPC Pack should be able to handle this situation.

    Qiufang Shi

    Friday, March 16, 2018 5:35 AM