Specify cores per node in an MPI application and run multiple tasks on a node
-
Friday, October 31, 2008 3:30 AM
Hi,
Is it possible to run two tasks/jobs at the same time, laid out as follows?
MPIapp1: 4 Cores of Node1 and 4 Cores of Node2
MPIapp2: 4 Cores of Node1 and 4 Cores of Node2
- each node has 2 quad-core processors
I believe that using the cluster this way sometimes gives the best performance,
but I couldn't find how to run those jobs/tasks at the same time (not sequentially).
I think "mpiexec -cores" does not work when the UnitType is not Node.
And if the UnitType is Node, multiple tasks cannot run on the same node.
> https://windowshpc.net/Blogs/jobscheduler/Lists/Posts/Post.aspx?ID=3#Comments
> https://windowshpc.net/Blogs/jobscheduler/Lists/Posts/Post.aspx?ID=9
Thanks,
All Replies
-
Friday, October 31, 2008 4:48 AM
Hi,
If the scheduler doesn't support this, how about running mpiexec (MS-MPI) without the scheduler?
> I believe that using cluster as above sometimes makes best performance,
That is based on my experience with a Linux cluster (same hardware).
Thanks,
------------------
hirakata
-
Friday, October 31, 2008 11:22 PM
I assume that you are using Windows HPC 2008 (the story is a bit different for Windows Compute Cluster Server 2003).
The natural way would be to run app1 and app2 on different machines/cores; that is rather simple:
job submit /numcores:16 mpiexec -n 8 mpiapp1 : -n * mpiapp2
This assumes your nodes have 8 cores. You can drop the -n * or replace it with -n 8, with the same effect.
(see mpiexec /help2 for details on the syntax above)
If you really want to splice them as described above, you should let mpiexec know that each node "has" 4 cores and that you are "oversubscribing":
job submit /numcores:16 mpiexec -cores 4 -n 8 mpiapp1 : -n * mpiapp2
In this case mpiexec will put mpiapp1 on the two nodes, thinking that each has only 4 cores; then it would "oversubscribe" the cores again with mpiapp2.
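The intended layout can be sketched as a small Python model. This is only an illustration of the placement described here, under the assumption that mpiexec fills each node with the number of ranks given by -cores before moving to the next node, and wraps around to the first node once every node is "full". The function name and node names are hypothetical, and the real MS-MPI placement may differ.

```python
def place_ranks(nodes, cores_per_node, sections):
    """Model block placement: fill each node with `cores_per_node` ranks,
    move to the next node, and wrap around (oversubscribe) when all are full.

    `sections` is a list of (app_name, process_count) pairs, mirroring the
    colon-separated sections of an mpiexec command line.
    """
    layout = {node: [] for node in nodes}
    slot = 0  # running count of ranks placed so far
    for app, count in sections:
        for _ in range(count):
            node = nodes[(slot // cores_per_node) % len(nodes)]
            layout[node].append(app)
            slot += 1
    return layout


# Two 8-core nodes, but mpiexec is told that each one "has" 4 cores (-cores 4).
layout = place_ranks(["Node1", "Node2"], 4, [("mpiapp1", 8), ("mpiapp2", 8)])
for node, apps in layout.items():
    print(node, {app: apps.count(app) for app in sorted(set(apps))})
```

Under this model each node ends up with 4 ranks of mpiapp1 and 4 ranks of mpiapp2, which is the layout asked for in the original question.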
hope this helps,
.Erez
-
Thursday, November 06, 2008 1:17 AM
Hello Lio-san,
Thank you for your reply.
Before testing on the large cluster (8 cores x 32 nodes, WHPCS2008 RTM), I tried the following command on a small test cluster (2 cores x 8 nodes, WHPCS2008 RTM), as you described above.
job submit /numcores:16 mpiexec -cores 1 -n 8 mpiapp arg1_for_mpiapp : -n * mpiapp arg2_for_mpiapp
But the result is:
Error: not enough cores left for 'mpiapp' in section 2. the command line already subscribes 8 processes on 8 cores.
The same error occurs when I submit mpipingpong.exe as below.
job submit /numcores:16 mpiexec -cores 1 -n 8 "C:\Program Files\Microsoft HPC Pack\Bin\mpipingpong.exe" : -n * "C:\Program Files\Microsoft HPC Pack\Bin\mpipingpong.exe"
Error: not enough cores left for 'C:\Program Files\Microsoft HPC Pack\Bin\mpipingpong.exe' in section 2. the command line already subscribes 8 processes on 8 cores.
It seems that oversubscribing is not permitted.
Is any other command-line option necessary? Or is there another way to resolve this?
Thank you.
------------------
hirakata
-
Friday, November 14, 2008 1:57 AM
Hello Hirakata-san
The error that you get is because you specify -cores 1; this makes mpiexec think that there is only one core on each node, and thus a total of 2 cores (assuming 2 nodes). This was my mistake: in my example you should not specify -n * in the second section, but rather the specific number of processes that are needed.
you should specify
job submit /numcores:16 mpiexec -cores 1 -n 8 mpiapp arg1_for_mpiapp : -n 8 mpiapp arg2_for_mpiapp
btw: this will put all the odd ranks on the first node and all the even ranks on the other.
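The arithmetic behind the error can be sketched in a few lines of Python. This is a simplified model, not MS-MPI's actual logic: it assumes that -n * resolves to whatever cores mpiexec believes are still free after the sections with an explicit -n have been counted. The function name is hypothetical.

```python
def resolve_star(node_count, cores_per_node, explicit_counts):
    """Return how many processes `-n *` would get under this simple model:
    the cores mpiexec believes it has, minus those already claimed by
    sections with an explicit -n value."""
    believed_cores = node_count * cores_per_node
    claimed = sum(explicit_counts)
    remaining = believed_cores - claimed
    if remaining <= 0:
        raise ValueError(
            f"not enough cores left: {claimed} processes "
            f"already subscribed on {believed_cores} cores"
        )
    return remaining


# Test cluster from the thread: 8 nodes, -cores 1, first section uses -n 8.
try:
    resolve_star(8, 1, [8])
except ValueError as err:
    print("error:", err)

# Without -cores, the same 8 dual-core nodes leave 8 cores for the second section.
print(resolve_star(8, 2, [8]))
```

With an explicit -n 8 in the second section nothing needs to be resolved, so the request goes through and the cores are simply oversubscribed.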
hope this helps
.Erez
- Edited by Lio Friday, November 14, 2008 1:57 AM
- Proposed As Answer by Lio Saturday, November 15, 2008 8:45 AM
- Marked As Answer by Josh Barnard (Moderator) Tuesday, November 18, 2008 9:19 PM