27 March 2012 22:20
When creating a new job, there is a section for selecting the type of resource to request for the job.
In this section we can select core or node. For example, I have a cluster with 10 compute nodes, each with 8 cores.
Now, in the job resources, suppose I select core and define minimum cores = 12 and maximum cores = 18, and the command is: mpiexec -n 18 ...
If all the nodes are free, is it right that Windows HPC assigns 8 cores of the first compute node, 8 cores of the second compute node, and 2 cores of the third compute node to the job?
If I select node in the job resources and define minimum nodes = 2 and maximum nodes = 3, and the command is: mpiexec -n 12 ...
If all the nodes are free, how does Windows HPC handle this situation?
How many nodes are assigned to the job?
How many cores of each node are assigned to the job?
In other words, if the job is divided among 3 nodes, how many cores of each of these three nodes are assigned? Just 4 cores of each node, or all the cores of each node?
Thanks a lot.
Any help is appreciated.
28 March 2012 0:15
1. Yes, you should be able to expect that behavior.
2. The scheduler will give you 3 nodes if 3 are available, and it will then evenly distribute 4 MPI processes onto each node, assuming you still have 8 cores per node.
However, note that in your second scenario the scheduler has given the full node to mpiexec, so if your MPI job is multi-threaded, it could use the other cores on the same node; in other words, it could use all the cores in that machine. (Note that if you schedule your job by number of cores, you may not be able to use extra cores on the node mpiexec runs on.)
- Marked as answer by hd banki 3 April 2012 8:52
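The two scenarios above can be checked with a little arithmetic. The sketch below is only a simplified model of the behavior described in this thread, not the actual Windows HPC scheduler algorithm; the function names `pack_cores` and `spread_processes` are hypothetical and chosen for illustration.

```python
def pack_cores(requested_cores, cores_per_node, node_count):
    """Core-based request: fill one node after another (simplified model)."""
    allocation = []
    remaining = requested_cores
    for _ in range(node_count):
        if remaining <= 0:
            break
        take = min(cores_per_node, remaining)
        allocation.append(take)
        remaining -= take
    if remaining > 0:
        raise ValueError("not enough free cores for this request")
    return allocation


def spread_processes(process_count, node_count):
    """Node-based request: distribute MPI processes evenly over the nodes."""
    base, extra = divmod(process_count, node_count)
    return [base + (1 if i < extra else 0) for i in range(node_count)]


# Scenario 1: max cores = 18 on a free cluster of 10 nodes with 8 cores each.
print(pack_cores(18, cores_per_node=8, node_count=10))

# Scenario 2: max nodes = 3 and "mpiexec -n 12".
print(spread_processes(12, node_count=3))
```

In this model the first call yields 8, 8 and 2 cores on three nodes, and the second yields 4 processes per node, matching the answer above.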
28 March 2012 8:56
Thanks a lot for your attention.
So in the second scenario we have 4 processes on each node. Now, if the MPI program is not multi-threaded, is it right that each process runs on one core?
Therefore, in each of these three nodes, 4 cores are assigned to the job and 4 cores are free?
28 March 2012 20:42
Suppose that I have a cluster with 5 compute nodes, each with 8 cores, and an MPI program that I want to run with 5 processes. (Node is also selected for the job resources.)
According to your response, each process will run on its own node. In other words, the scheduler gives one node to each process.
Now my question is: when a node executes the process that has been assigned to it, does this process use just one core of the node, or can it use all 8 cores of the node?
If the process uses just one core of the node, can I use TPL in my MPI program to make it multi-threaded and take advantage of all 8 cores in each node, to increase performance?
Thanks a lot for your help.
- Edited by hd banki 28 March 2012 22:51
30 March 2012 17:35
1. You are absolutely right about your second scenario.
2. For your new scenario, whether it can take all 8 cores depends on your job submission. If your job only limits the number of nodes (say maxnode=5), your MPI process can run on all 8 cores of the node, if multi-threading is enabled. However, if your job limits the number of cores (say maxcores = 5), your MPI process can only run on the core it is assigned to on a specific node.
3. We generally don't encourage using TPL to increase performance, because some nice features such as CPU affinity cannot be applied if you use TPL, and sometimes you could decrease your performance by doing so: if you run 8 processes on your 8-core machine and each of them is multi-threaded, you waste CPU/memory resources on context switching.
4. If you want to utilize all your cores, you should launch your MPI application with 40 processes and take over the entire cluster.
Hope that helps.
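To make points 3 and 4 concrete, here is a minimal sketch of the thread-count arithmetic behind them. It assumes the 5-node, 8-cores-per-node cluster from the question; the helper `is_oversubscribed` is a hypothetical name, and the model only counts threads against cores, nothing more.

```python
CORES_PER_NODE = 8
NODES = 5


def is_oversubscribed(processes_per_node, threads_per_process,
                      cores_per_node=CORES_PER_NODE):
    """True when a node would run more threads than it has cores."""
    return processes_per_node * threads_per_process > cores_per_node


# Point 3: 8 processes per node, each spawning 8 threads, means 64 threads
# competing for 8 cores, so time is lost to context switching.
print(is_oversubscribed(processes_per_node=8, threads_per_process=8))

# One process per node with 8 threads fits the node exactly.
print(is_oversubscribed(processes_per_node=1, threads_per_process=8))

# Point 4: one single-threaded process per core across the whole cluster.
full_cluster_processes = NODES * CORES_PER_NODE
print(full_cluster_processes)
```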
31 March 2012 1:28
Thanks a lot for your helpful responses.
Actually, I don't want to run 8 processes on an 8-core machine and also use multi-threading.
I want to run an MPI program with 5 processes on 5 nodes (maxnode = 5).
In this scenario each process will run on one core of each node.
Now I want to use TPL in my code so that each process can use all 8 cores of its node.
Is this a good idea?
3 April 2012 0:20
There are some MPI applications that do this, but that way you end up dealing with thread communication yourself AND communication between all processes. Why not just schedule one MPI process per core, and let the MPI implementation handle all communications for you?
That being said, it's a design decision you choose to make, and what you just described should work.
- Proposed as answer by Michael_Man 3 April 2012 0:20
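The trade-off discussed in the last two posts can be illustrated by how the work is divided in each design. The sketch below is an assumption-laden illustration only: it uses no MPI or TPL calls, the names `split`, `hybrid_layout` and `flat_layout` are hypothetical, and the 80 work items are made up for the example.

```python
def split(items, parts):
    """Split a list into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def hybrid_layout(items, ranks, threads_per_rank):
    """One process per node, then threads inside each process (two levels)."""
    return [split(chunk, threads_per_rank) for chunk in split(items, ranks)]


def flat_layout(items, ranks):
    """One single-threaded process per core (one level)."""
    return split(items, ranks)


work = list(range(80))  # hypothetical work items

hybrid = hybrid_layout(work, ranks=5, threads_per_rank=8)
flat = flat_layout(work, ranks=40)

# Each of the 5 processes must hand out its 16 items to 8 threads itself.
print(len(hybrid), len(hybrid[0]), hybrid[0][0])
# With 40 processes, each one simply owns its 2 items.
print(len(flat), flat[0])
```

Both layouts put the same items on the same cores here; the difference is that the hybrid design adds a second level of partitioning and thread coordination that the program has to manage on its own.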