Two questions about executing an MPI job on HPC Server 2008

Question
-
Hi,
I have two questions:
1. I set up a small cluster with one head node and two compute nodes. I currently copy the MPI executable to the same path on all nodes. How can I run the job without copying the MPI executable to every node?
2. My MPI job has only one task. I submitted the job, and the task has finished, but the job still appears to be running. Why does the job still appear to be running?
Thank you
Wednesday, January 7, 2015 9:00 AM
All replies
-
Hi,
1. You can put your executable on a file share so that you don't need to copy it to each node's local disk; your task command line can then use a UNC path.
2. Please check your job configuration; settings such as "RunUntilCancelled" keep the job running after its tasks finish.
Qiufang Shi
Thursday, January 8, 2015 8:16 AM -
Thank you for the reply. Question 2 is resolved: I found the "RunUntilCancelled" option. For question 1, I put the executable on a file share and verified that I can access it from all nodes. I gave the path as a UNC path, as shown below, but I get the error below.
My task in HPC Cluster Manager: mpiexec.exe -n 3 \\HEADNODE\Share\task.exe
Working directory: \\HEADNODE\Share
Standard output: \\HEADNODE\Share\out.txt
I get this error: Task failed during execution with exit code -1073740777. Please check task's output for error details.
There is nothing in out.txt.
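Negative exit codes like this one are usually easier to look up in hexadecimal. The short Python sketch below reinterprets the signed 32-bit exit code as an unsigned value; the function name is hypothetical and chosen only for illustration.

```python
def exit_code_to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex value."""
    # Masking with 0xFFFFFFFF maps negative values into the 0..2**32-1 range.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(exit_code_to_ntstatus(-1073740777))  # prints 0xC0000417
```

In the Windows NTSTATUS list, 0xC0000417 is STATUS_INVALID_CRUNTIME_PARAMETER (an invalid parameter was passed to a C runtime function). That only narrows the search; it does not by itself say which call failed in this task.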
Thursday, January 8, 2015 9:27 AM -
I solved the problem. I changed the command from mpiexec.exe -n 3 \\HEADNODE\Share\task.exe to mpiexec.exe -n 3 task.exe
With the working directory set to \\HEADNODE\Share, the command must not include the working directory path.
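For anyone who submits from the command line rather than HPC Cluster Manager, the same setup can be sketched as below. This is only an illustration: the thread configured the task in the GUI, and the /workdir, /stdout and /numcores options are assumed from the HPC Pack "job submit" syntax. The helper name build_submit_command is hypothetical. The script only builds and prints the command; it does not run it.

```python
def build_submit_command(share, executable, ranks, stdout_path):
    """Build a 'job submit' argument list with the share as the working directory."""
    # The executable is given relative to the working directory,
    # as in the fix described in this thread.
    task = ["mpiexec.exe", "-n", str(ranks), executable]
    # Assumed HPC Pack options: /workdir, /stdout, /numcores.
    options = [
        f"/workdir:{share}",
        f"/stdout:{stdout_path}",
        f"/numcores:{ranks}",
    ]
    return ["job", "submit"] + options + task


cmd = build_submit_command(
    r"\\HEADNODE\Share", "task.exe", 3, r"\\HEADNODE\Share\out.txt"
)
print(" ".join(cmd))
```

The point of the sketch is the split of responsibilities: the share path appears once, as the working directory, and the task command names only task.exe.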
- Proposed as answer by Qiufang Shi (Microsoft employee), Friday, January 9, 2015 4:02 AM
Thursday, January 8, 2015 1:04 PM