February 12, 2010 2:38
Dear all,
I encountered this problem when using Windows Compute Cluster Server 2003 (CCS 2003). When I run the command line "mpiexec -hosts 2 cn01 1 cn02 1 myapp.exe" or "mpiexec -hosts 2 cn02 1 cn01 1 myapp.exe" in the graphical interface of CCS 2003 on cn01 or cn02, it works fine.
But when I try other nodes, say cn01 and cn03, and type "mpiexec -hosts 2 cn01 1 cn03 1 myapp.exe", the system reports 'access denied by node cn03, a common cause: this node is not allocated to job by scheduler'. If I type "mpiexec -hosts 2 cn03 1 cn01 1 myapp.exe", the system reports 'unable to read authorization result from cn03. socket connection closed'. The result is the same when I use cn01 and cn04.
I also tried cn02 and cn03; the result is the same as in the cn01 and cn03 case described above.
Does this mean there is a connection problem in this cluster?
February 16, 2010 3:49
Hi Skiff,
Can you try the following for your command line "mpiexec -hosts 2 cn01 1 cn03 1 myapp.exe":
1) Make sure that cn01 and cn03 are allocated to the job. From the command line, that is: job submit /askednodes:cn01,cn03 ...
2) Make sure that you have N processors allocated, where N = total number of processors on cn01 and cn03.
From the command line, it will be something like the following:
job submit /askednodes:cn01,cn03 /numprocessors:N mpiexec -hosts 2 cn01 1 cn03 1 myapp.exe
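If you switch node pairs often, the command line above can be assembled by a small script. The following is only a rough Python sketch: the helper name build_job_submit and the per-node processor counts (4 each) are hypothetical, so substitute the real counts for your nodes. It only builds the command string and does not submit anything.

```python
def build_job_submit(nodes, cores_per_node, app, ranks_per_node=1):
    """Build a 'job submit' command line that asks the scheduler for the
    given nodes and then runs mpiexec on exactly those nodes.

    nodes          -- node names, in the order they are passed to mpiexec
    cores_per_node -- dict mapping node name to its processor count
                      (hypothetical values; check your own cluster)
    app            -- the MPI executable to run
    ranks_per_node -- MPI processes to start on each node
    """
    if not nodes:
        raise ValueError("at least one node is required")

    # N = total number of processors on all requested nodes
    total = sum(cores_per_node[n] for n in nodes)

    # mpiexec -hosts expects: <host count> <host1> <procs1> <host2> <procs2> ...
    hosts = " ".join(f"{n} {ranks_per_node}" for n in nodes)

    return (
        f"job submit /askednodes:{','.join(nodes)} /numprocessors:{total} "
        f"mpiexec -hosts {len(nodes)} {hosts} {app}"
    )


if __name__ == "__main__":
    # Assumed: cn01 and cn03 each have 4 processors.
    print(build_job_submit(["cn01", "cn03"], {"cn01": 4, "cn03": 4}, "myapp.exe"))
```

The point of generating both halves from one node list is that the nodes requested from the scheduler and the nodes named after -hosts cannot drift apart, which is the mismatch behind the "not allocated to job by scheduler" message.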
Hope the above helps,
- Marked as answer by Don Pattee (Moderator), January 12, 2011 2:50
February 27, 2010 3:49
Thank you, Liwei. The problem is solved now.