I can run the program on two computers that are not part of our HPC cluster: I start smpd (or msmpilaunchsvc) on both machines and then run the following command from either computer. Everything works well.
mpiexec -hosts 2 MachineA 1 MachineB 1 MPIApp.exe
However, I can't run the same MPI program on our HPC cluster. I get the following error message.
ERROR: Failed RpcCliStartMgr error -2147024809
Aborting: mpiexec on MachineA is unable to connect to the smpd service on MachineB:8677
Other MPI error, error stack:
connect failed - The parameter is incorrect. (errno -2147024809)
I then tried to start smpd manually (running smpd -d), but got an access-denied error. Do we really need to start smpd manually on every compute node of the cluster?
Any suggestions on how to fix this? Thank you very much!