Cannot run mpiexec on different Windows PCs

  • Question

  • Hi,

    I am trying MS-MPI for the first time, without HPC (running on 2 Windows PCs), compiling with Visual C++ 2015.
    MS-MPI 7 and the MS-MPI SDK are installed on both machines.

    A simple test program (count the nodes) works on the local machine with

    mpiexec -n 2 MPITest.exe

    The same command also runs locally on the other machine.

    But if I try to run remotely from a Windows 10 machine called LAPTOP, using the IP address of the remote Windows 7 machine,

    mpiexec -hosts 2 localhost 1 192.168.1.9\Share 1 -debug MPITest.exe

    it gives

    [00:13948] host tree:
    [00:13948]  host: localhost, parent: 0, id: 1
    [00:13948]  host: 192.168.1.9\Share, parent: 1, id: 2
    [00:13948] successfully loaded and initialized the extension C:\Program Files\msmpidbg\msmpidbg.dll
    [00:13948] mpiexec started smpd manager listening on port 59913
    [00:13948] using spn msmpi/localhost to contact server
    [00:13948] LAPTOP posting a re-connect to localhost:59915 in left child context.
    [00:13948] Authentication completed. Successfully obtained Context for Client.
    [00:13948] Authorization completed.
    [00:13948] version check complete, using PMP version 3.
    [00:13948] creating connect command for left node
    [00:13948] creating connect command to '192.168.1.9\Share'
    [00:13948] posting command SMPD_CONNECT to left child, src=0, dest=1.
    [00:13948] host 192.168.1.9\Share is not connected yet
    [00:13948] Handling cmd=SMPD_CONNECT result
    [00:13948] cmd=SMPD_CONNECT result will be handled locally
    [00:13948] Authentication completed. Successfully obtained Context for Client.
    [00:13948] smpd id 1 failed to process cmd=SMPD_CONNECT error=1722.
    [00:13948] Authorization completed.
    [00:13948] error 1722 detected during previous command, initiating abort.
    [00:13948] handling command SMPD_ABORT src=1

    Aborting: smpd on LAPTOP is unable to connect to the smpd service on 192.168.1.9\Share:8677
    Other MPI error, error stack:
    connect failed - The RPC server is unavailable.  (errno 1722)
    [00:13948] smpd manager successfully stopped listening.

    I presume there is some authentication issue and it cannot connect to the remote machine, but I don't know how to find the problem.

    The firewalls are disabled on both computers and they use the same user name on both.

    I can connect with Remote Desktop Protocol from LAPTOP to the other PC.

    The same version of MSMPI is running on both and the smpd -d command is run in an Administrator window.

    Any idea what I need to do?

    Many thanks

    Monday, October 31, 2016 8:51 PM

Answers


  • That showed me where to find the problem.

    The machines display the same user name, but the Windows 10 PC uses a Microsoft account for log-in, so I guess the "real" user name is the e-mail address. The Windows 7 machine has no such subterfuge.

    So I changed the Windows 10 to a local account and now they talk.

    Thanks for your help

    Roger

    • Edited by Roger567 Thursday, November 3, 2016 10:13 AM
    • Marked as answer by Roger567 Thursday, November 3, 2016 10:14 AM
    Thursday, November 3, 2016 10:12 AM

All replies

  • You should specify the other host using only the IP address.

    Can you try mpiexec -hosts 2 localhost 1 192.168.1.9 1 -debug MPITest.exe ?

    Thanks

    Anh

    Wednesday, November 2, 2016 5:40 AM
  • Thanks Anh,

    The result was very similar, except the last 3 lines were now:

    Other MPI error, error stack:
    connect failed - Access is denied.  (errno 5)
    [00:10564] smpd manager successfully stopped listening.

    instead of

    Other MPI error, error stack:
    connect failed - The RPC server is unavailable.  (errno 1722)
    [00:13948] smpd manager successfully stopped listening.
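
    For anyone comparing the two failures: 1722 and 5 are standard Win32 error codes (RPC_S_SERVER_UNAVAILABLE and ERROR_ACCESS_DENIED). Below is a minimal Python sketch that pulls the errno out of mpiexec output and attaches a hint. The function name and the hint texts are illustrative assumptions based on this thread, not part of MS-MPI.

```python
import re

# Win32 error codes seen in this thread, with their standard Windows names.
# The hints are this thread's interpretation, not an official MS-MPI table.
KNOWN_ERRORS = {
    5: ("ERROR_ACCESS_DENIED",
        "smpd was reached but did not accept the caller's credentials"),
    1722: ("RPC_S_SERVER_UNAVAILABLE",
           "no smpd endpoint could be reached under the given host name"),
}

ERRNO_PATTERN = re.compile(r"\(errno (\d+)\)")


def explain_mpiexec_error(output):
    """Return (code, name, hint) for the first errno in the text, or None."""
    match = ERRNO_PATTERN.search(output)
    if match is None:
        return None
    code = int(match.group(1))
    name, hint = KNOWN_ERRORS.get(code, ("unknown", "not covered by this sketch"))
    return code, name, hint
```

    Read this way, the change from 1722 to 5 suggests that dropping the \Share suffix let mpiexec reach the remote smpd, leaving only the credential problem.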

    What I do not understand is how mpiexec expects to get access to the remote PC without a user name and password being specified explicitly (although they are the same on both PCs), or how it knows which directory on the remote PC contains the .exe file (again, it is the same on both PCs).


    Wednesday, November 2, 2016 1:05 PM
  • Did you start the smpd daemon by running smpd -d on both machines?

    If things are not working, I would suggest stepping back and trying things out step by step:

    1) Verify that a simple command is working: mpiexec -n 2 hostname

    2) Start the smpd daemon by running smpd -d in a separate console. Then run mpiexec -host localhost -n 1 hostname

    3) Start the smpd daemon on the remote machine. Then run mpiexec -host remotehost -n 1 hostname

    4) Now run across both machines with mpiexec -hosts 2 localhost 1 remotehost 1 hostname
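
    The four checks above can also be scripted. This is only a sketch in Python: the function names are made up for illustration, it assumes mpiexec is on the PATH and returns a non-zero exit code on failure, and smpd -d still has to be started by hand on each machine before steps 2 to 4.

```python
import subprocess


def diagnostic_commands(remote_host):
    """Return the four mpiexec invocations from the checklist, in order."""
    return [
        ["mpiexec", "-n", "2", "hostname"],
        ["mpiexec", "-host", "localhost", "-n", "1", "hostname"],
        ["mpiexec", "-host", remote_host, "-n", "1", "hostname"],
        ["mpiexec", "-hosts", "2", "localhost", "1", remote_host, "1", "hostname"],
    ]


def run_until_failure(commands):
    """Run each command in turn; return the 1-based failing step, or None."""
    for step, argv in enumerate(commands, start=1):
        print("step", step, ":", " ".join(argv))
        # Assumption: a non-zero exit code means the step failed.
        if subprocess.run(argv).returncode != 0:
            return step
    return None
```

    Calling run_until_failure(diagnostic_commands("192.168.1.9")) would stop at the first step that fails, which narrows the problem down to the local install, the local smpd, the remote smpd, or the combination.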

    mpiexec can only launch an application on a remote machine if an smpd daemon is running there on a known port. The default port is 8677; you can specify a different one by supplying the -p flag to both smpd and mpiexec. In addition, either the machines have to be domain-joined, or the user name and password have to be the same on both machines.
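
    A quick way to check the port part of this is a plain TCP connection attempt. The Python sketch below assumes smpd accepts ordinary TCP connections on its port; the function name is hypothetical. A successful connect only shows that something is listening there. It says nothing about whether the credentials will be accepted.

```python
import socket

SMPD_DEFAULT_PORT = 8677  # default smpd port mentioned in this thread


def smpd_port_open(host, port=SMPD_DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False
```

    If smpd_port_open("192.168.1.9") returns False while smpd -d is running on that machine, a firewall or a wrong address is the more likely cause than account settings.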

    Regarding the working directory, mpiexec will use the current working directory to search for the application on the remote PC. In practice it is best to specify exactly where the application is and also to specify a working directory (by passing the -wdir flag). For example:

    mpiexec -hosts 2 localhost 1 remotehost 1 -wdir c:\myWorkingDir C:\apps\myapp.exe
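
    If the command line is assembled by a script, a small helper can also catch the mistake from the original question, where the host was written as 192.168.1.9\Share. This is a hedged Python sketch; build_mpiexec_args is a hypothetical name, and it only uses the -hosts and -wdir flags shown in this thread.

```python
def build_mpiexec_args(hosts, executable, wdir=None):
    """Build an mpiexec argument list from (host, process_count) pairs."""
    if not hosts:
        raise ValueError("at least one host is required")
    args = ["mpiexec", "-hosts", str(len(hosts))]
    for host, count in hosts:
        # A share or path suffix (as in the original question) is not a host name.
        if "\\" in host or "/" in host:
            raise ValueError(f"host must be a bare name or IP address: {host!r}")
        if count < 1:
            raise ValueError(f"process count for {host!r} must be at least 1")
        args += [host, str(count)]
    if wdir is not None:
        args += ["-wdir", wdir]
    args.append(executable)
    return args
```

    For instance, build_mpiexec_args([("localhost", 1), ("remotehost", 1)], r"C:\apps\myapp.exe", wdir=r"c:\myWorkingDir") yields the same arguments as the command above, and the resulting list can be passed to subprocess.run.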

    Anh

    Thursday, November 3, 2016 6:37 AM
