Can't start MPI procedure

  • Question

  • Hello, I'm a student and I'm trying to build a test program that uses MPI across multiple machines.

    Host: Windows 10
    VMware "client": Windows 7

    Both have the latest MS-MPI (v8) installed, downloaded from here:
    https://www.microsoft.com/en-us/download/details.aspx?id=54607
    (both the SDK and the redistributable)

    I ran this from the Windows 7 console:
    smpd -d

    And this from the Windows 10 console:
    mpiexec -hosts 1 192.168.30.12 1 C:\mpi\MsMPI.exe
    (The target's local IP responds to ping, so it is up, and both machines have that exe in the right place.)


    The result is:

    C:\Windows\system32>mpiexec -hosts 1 192.168.30.12 1 C:\mpi\MsMPI.exe
    ERROR: Failed RpcCliCreateContext error 5
    
    Aborting: mpiexec on CODING-STATION is unable to connect to the smpd service on 192.168.30.12:8677
    Other MPI error, error stack:
    connect failed - Access is denied.  (errno 5)

    Both consoles were started under an administrator account. Any idea what the problem could be? How should I go about debugging it?

    Thank you for every answer :)

    Saturday, April 22, 2017 2:48 PM

All replies

  • Hi,

    From the Windows 7 console, can you run "smpd -d 3" instead of "smpd -d"? Also, please rerun the mpiexec command with -d 3 (e.g., mpiexec -d 3 -hosts 1 192.168.30.12 1 C:\mpi\MsMPI.exe) and let us know the output.

    Anh

    Saturday, April 22, 2017 4:15 PM
  • Here is the output:

    ( http://img.imgland.net/broBknq.png )

    • Edited by Uncenzured Saturday, April 22, 2017 4:25 PM added image
    Saturday, April 22, 2017 4:24 PM
  • What is the output of the command "whoami" on each machine? For accounts that are not domain-joined, you should be running on both machines with the same username and password.
    Monday, April 24, 2017 10:44 PM
  • I'm sure they're completely different.

    Is there a simple way to domain-join these machines?
    Or is it possible to supply the username and password manually, so I don't have to domain-join them?

    Tuesday, April 25, 2017 9:16 PM
  • You can manually create a new user on each machine; just make sure both users have the same name and the same password.
    Tuesday, April 25, 2017 9:31 PM
  • Sorry for the long silence; sadly I didn't have time to play with MPI :)

    But I'm back. I finally created the same user on both PCs, and this time it connected perfectly, but I ran into another problem: it gives an error.
    Here's my source code:

    #include <Windows.h>
    #include <stdio.h>
    #include <time.h>
    #include "mpi.h"
    
    int main(int argc, char **argv) {
    	int rank, result, rc, numtasks;
    	clock_t t1, t2;
    	float ratio;
    
    	ratio = 1. / CLOCKS_PER_SEC;
    
    	rc = MPI_Init(&argc, &argv);
    	if (rc != MPI_SUCCESS) {
    		printf("Error starting MPI program. Terminating.\n");
    		MPI_Abort(MPI_COMM_WORLD, rc);
    	}
    
    	MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    	MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    
    	if (rank == 0) {
    		t1 = clock();
    	}
    
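    	/* Sum the ranks of all processes. MPI_INT is the C datatype for int
    	   (MPI_INTEGER is the Fortran one). */
    	MPI_Allreduce(&rank, &result, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);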
    
    	MPI_Finalize();
    
    	if (rank == 0) {
    		t2 = clock();
    		printf("result = %d\n", result);
    		/* Elapsed time is the difference of the two clock() readings, not their sum. */
    		printf("time = %f\n", ratio * (t2 - t1));
    	}
    }

    The output, including the error, on the "slave":

    C:\Users\Unknown User>smpd -d 3
    [-1:484] Launching SMPD service.
    [-1:484] smpd listening on port 8677
    [-1:484] Authentication completed. Successfully obtained Context for Client.
    [-1:484] version check complete, using PMP version 3.
    [-1:484] create manager process (using smpd daemon credentials)
    [-1:484] smpd reading the port string from the manager
    [-1:2844] Launching smpd manager instance.
    [-1:2844] created set for manager listener, 112
    [-1:2844] smpd manager listening on port 49230
    [-1:484] closing the pipe to the manager
    [-1:2844] Authentication completed. Successfully obtained Context for Client.
    [-1:2844] Authorization completed.
    [-1:2844] version check complete, using PMP version 3.
    [-1:2844] Received session header from parent id=1, parent=0, level=0
    [01:2844] Connecting back to parent using host CODING-STATION and endpoint 6743
    [01:2844] Authentication completed. Successfully obtained Context for Client.
    [01:2844] Authorization completed.
    [01:2844] handling command SMPD_COLLECT src=0
    [01:2844] handling command SMPD_STARTDBS src=0
    [01:2844] sending start_dbs result command kvs = 623a1fbb-79d5-4772-aa50-3bf79a7df0e0.
    [01:2844] handling command SMPD_LAUNCH src=0
    [01:2844] Successfully handled bcast nodeids command.
    [01:2844] setting environment variable: <MPIEXEC_HOSTNAME> = <CODING-STATION>
    [01:2844] env: PMI_SIZE=1
    [01:2844] env: PMI_KVS=623a1fbb-79d5-4772-aa50-3bf79a7df0e0
    [01:2844] env: PMI_DOMAIN=a38e37bf-b8b0-448d-b1a8-cbe197452e74
    [01:2844] env: PMI_HOST=localhost
    [01:2844] env: PMI_PORT=965989b3-aa4f-4d0b-86a3-edc4442df9f0
    [01:2844] env: PMI_SMPD_ID=1
    [01:2844] env: PMI_APPNUM=0
    [01:2844] env: PMI_NODE_IDS=s
    [01:2844] env: PMI_RANK_AFFINITIES=a
    [01:2844] searching for 'C:\mpi\MsMPI.exe' in workdir 'C:\Windows\system32'
    [01:2844] C>CreateProcess(C:\mpi\MsMPI.exe C:\mpi\MsMPI.exe)
    [01:2844] env: PMI_RANK=0
    [01:2844] env: PMI_SMPD_KEY=0
    [01:2844] Authentication completed. Successfully obtained Context for Client.
    [01:2844] Authorization completed.
    [01:2844] version check complete, using PMP version 3.
    [01:2844] 1 -> 0 : returning parent_context: 0 < 1
    [01:2844] forwarding command SMPD_INIT to 0
    [01:2844] posting command SMPD_INIT to parent, src=1, ctx_key=0, dest=0.
    [01:2844] Handling cmd=SMPD_INIT result
    [01:2844] forward SMPD_INIT result to dest=1 ctx_key=0
    [01:2844] handling command SMPD_BCPUT src=1 ctx_key=0
    [01:2844] Handling SMPD_BCPUT command from smpd 1
            ctx_key=0
            rank=0
            value=port=49233 description="192.168.30.12 192.168.131.132 Pocket-PC " shm_host=Pocket-PC shm_queue=3796:164
            result=success
    [01:2844] 1 -> 0 : returning parent_context: 0 < 1
    [01:2844] forwarding command SMPD_FINALIZE to 0
    [01:2844] posting command SMPD_FINALIZE to parent, src=1, ctx_key=0, dest=0.
    [01:2844] Handling cmd=SMPD_FINALIZE result
    [01:2844] forward SMPD_FINALIZE result to dest=1 ctx_key=0
    [01:2844] process_id=0 process refcount == 2, pmi client closed.
    [01:2844] read 29 bytes from stdout
    [01:2844] posting command SMPD_STDOUT to parent, src=1, dest=0.
    [01:2844] reading failed, assuming stdout is closed. error 0xc000014b
    [01:2844] process_id=0 process refcount == 1, stdout closed.
    [01:2844] reading failed, assuming stderr is closed. error 0xc000014b
    [01:2844] process_id=0 process refcount == 0, stderr closed.
    [01:2844] process_id=0 rank=0 refcount=0, waiting for the process to finish exiting.
    [01:2844] creating an exit command for process id=0  rank=0, pid=3796, exit code=0.
    [01:2844] posting command SMPD_EXIT to parent, src=1, dest=0.
    [01:2844] Handling cmd=SMPD_STDOUT result
    [01:2844] cmd=SMPD_STDOUT result will be handled locally
    [01:2844] handling command SMPD_CLOSE from parent
    [01:2844] sending 'closed' command to parent context
    [01:2844] posting command SMPD_CLOSED to parent, src=1, dest=0.
    [01:2844] Handling cmd=SMPD_EXIT result
    [01:2844] cmd=SMPD_EXIT result will be handled locally
    [01:2844] Handling cmd=SMPD_CLOSED result
    [01:2844] cmd=SMPD_CLOSED result will be handled locally
    [01:2844] smpd manager successfully stopped listening.
    [01:2844] SMPD exiting with error code 0.
    ^C
    C:\Users\Unknown User>

    Output on the host machine:

    C:\Windows\system32>mpiexec -d 3 -hosts 1 192.168.30.12 1 C:\mpi\MsMPI.exe
    [00:8004] host tree:
    [00:8004]  host: 192.168.30.12, parent: 0, id: 1
    [00:8004] mpiexec started smpd manager listening on port 6743
    [00:8004] using spn RestrictedKrbHost/192.168.30.12 to contact server
    [00:8004] CODING-STATION posting a re-connect to 192.168.30.12:49230 in left child context.
    [00:8004] Authentication completed. Successfully obtained Context for Client.
    [00:8004] Authorization completed.
    [00:8004] version check complete, using PMP version 3.
    [00:8004] posting command SMPD_COLLECT to left child, src=0, dest=1.
    [00:8004] Handling cmd=SMPD_COLLECT result
    [00:8004] cmd=SMPD_COLLECT result will be handled locally
    [00:8004] Finished collecting hardware summary.
    [00:8004] posting command SMPD_STARTDBS to left child, src=0, dest=1.
    [00:8004] Handling cmd=SMPD_STARTDBS result
    [00:8004] cmd=SMPD_STARTDBS result will be handled locally
    [00:8004] start_dbs succeeded, kvs_name: '623a1fbb-79d5-4772-aa50-3bf79a7df0e0', domain_name: 'a38e37bf-b8b0-448d-b1a8-cbe197452e74'
    [00:8004] creating a process group of size 1 on node 0 called 623a1fbb-79d5-4772-aa50-3bf79a7df0e0
    [00:8004] launching the processes.
    [00:8004] posting command SMPD_LAUNCH to left child, src=0, dest=1.
    [00:8004] Handling cmd=SMPD_LAUNCH result
    [00:8004] cmd=SMPD_LAUNCH result will be handled locally
    [00:8004] successfully launched process 0
    [00:8004] root process launched, starting stdin redirection.
    [00:8004] Authentication completed. Successfully obtained Context for Client.
    [00:8004] Authorization completed.
    [00:8004] handling command SMPD_INIT src=1 ctx_key=0
    [00:8004] init: 0:1:623a1fbb-79d5-4772-aa50-3bf79a7df0e0
    [00:8004] handling command SMPD_FINALIZE src=1 ctx_key=0
    [00:8004] finalize: 0:623a1fbb-79d5-4772-aa50-3bf79a7df0e0
    [00:8004] handling command SMPD_STDOUT src=1
    [00:8004] Handling SMPD_STDOUT
    [00:8004] Decoding stdout/stderr buffer 726573756C74203D20300D0A74696D65203D20302E3037393030300D0A
    result = 0
    time = 0.079000
    [00:8004] handling command SMPD_EXIT src=1
    [00:8004] saving exit code: rank 0, exitcode 0, pg <623a1fbb-79d5-4772-aa50-3bf79a7df0e0>
    [00:8004] last process exited, tearing down the job tree num_exited=1 num_procs=1.
    [00:8004] handling command SMPD_CLOSED src=1
    [00:8004] closed command received from left child.
    [00:8004] smpd manager successfully stopped listening.

    I hope you have some idea :)
    As far as I can tell, 0xc000014b is STATUS_PIPE_BROKEN, but I'm out of ideas as to why it happens :O

    Thank you again for all your replies! :)
    Friday, May 5, 2017 9:03 AM
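
The `Decoding stdout/stderr buffer` line in the mpiexec output above carries the launched process's stdout as hex-encoded bytes. Below is a minimal sketch in C of decoding such a string by hand, which can help when checking what the remote process actually printed. `hex_value` and `decode_hex` are hypothetical helper names used for illustration; they are not part of MS-MPI.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a hex string into out and NUL-terminate it.
   Returns the number of decoded bytes, or -1 if the input has odd length,
   contains a non-hex character, or does not fit into out. */
long decode_hex(const char *hex, char *out, size_t out_size) {
    size_t len = strlen(hex);
    if (len % 2 != 0 || len / 2 + 1 > out_size) return -1;
    for (size_t i = 0; i < len / 2; i++) {
        int hi = hex_value(hex[2 * i]);
        int lo = hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        out[i] = (char)(hi * 16 + lo);
    }
    out[len / 2] = '\0';
    return (long)(len / 2);
}
```

Applied to the buffer from the log, this yields 29 bytes, matching the `read 29 bytes from stdout` line on the smpd side: the text `result = 0` and `time = 0.079000`, each followed by CR LF. In other words, the program ran to completion and its output was relayed back to the host.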
  • Hi

    That error is harmless and only appears in the debug output. If you remove the "-d 3" flag from mpiexec, you should not see it.

    Friday, May 5, 2017 2:41 PM
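
The value 0xc000014b in the smpd output is an NTSTATUS code. As a rough illustration of how such a value breaks down, the sketch below splits it into fields, assuming the standard NTSTATUS bit layout (severity in the top two bits, then a customer flag, a reserved bit, a 12-bit facility, and a 16-bit code). `ntstatus_fields` and `split_ntstatus` are names made up for this example.

```c
#include <stdint.h>

/* Fields of a 32-bit NTSTATUS value (assumed standard layout). */
typedef struct {
    unsigned severity;  /* bits 31-30: 0 success, 1 informational, 2 warning, 3 error */
    unsigned customer;  /* bit 29: set for customer-defined codes */
    unsigned facility;  /* bits 27-16 */
    unsigned code;      /* bits 15-0 */
} ntstatus_fields;

/* Split a raw status value into its fields. */
ntstatus_fields split_ntstatus(uint32_t status) {
    ntstatus_fields f;
    f.severity = (status >> 30) & 0x3u;
    f.customer = (status >> 29) & 0x1u;
    f.facility = (status >> 16) & 0xFFFu;
    f.code     = status & 0xFFFFu;
    return f;
}
```

For 0xC000014B this gives severity 3 (error), facility 0, and code 0x14B, which is STATUS_PIPE_BROKEN. That is consistent with the reply above: the read on the child's stdout or stderr pipe fails once the process has exited and closed its end, which is what the "assuming stdout is closed" wording in the log describes.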