Using MS MPI in Client Server Model

  • Question

  • Hi everyone,

    I am new to HPC and MS-MPI. I want to build an application with one client, a Windows Forms application, and many servers, which are console applications running on different compute nodes.

    How can I use MS-MPI to communicate between the client and the servers?

    The client will run on the Head node and the servers will run on all the compute nodes available.

    Please help me.

    Tuesday, May 11, 2010 6:31 AM

Answers

  • Hi Humayun,

    It looks like there is a problem in your network communications. If you have installed the Microsoft HPC Pack, please run:

    cluscfg setenvs MPICH_NETMASK=10.66.0.0/255.255.0.0

    and then try your test again.

    Thanks,

    James

    Wednesday, August 4, 2010 9:31 PM

All replies

  • Hi,

    One way to resolve your problem is:

    1) Build 2 applications. Both use MS-MPI and share the same MPI_COMM_WORLD.

    2) Run the 2 applications at the same time, like below:

    job submit /numnodes:3 /askednodes:server1,node1,node2 mpiexec -hosts 1 server1 1 win-form-app.exe : -hosts 2 node1 1 node2 1 console-app.exe

    where:

    - In total, 3 nodes are used and each node runs 1 process. If you want to run more than 1 process on a certain node, you can do -hosts 2 node1 M node2 N ...

    - server1 will run your win form application; node1 and node2 will run the console application.
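
    The shared-MPI_COMM_WORLD pattern described above can be sketched in a single source file. This is only an illustrative sketch, not the poster's actual code: the role split by rank and the message contents are assumptions.

```c
/* Both executables call MPI_Init and land in the same MPI_COMM_WORLD
   when launched by one mpiexec command. Rank 0 plays the client role;
   every other rank plays a server role. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* Client: collect one result from each server rank. */
        for (int src = 1; src < size; src++) {
            int result;
            MPI_Recv(&result, 1, MPI_INT, src, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("client: received %d from server rank %d\n", result, src);
        }
    } else {
        /* Server: do some work and report back to rank 0. */
        int result = rank * 100;   /* placeholder computation */
        MPI_Send(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```

    In practice the client and server are two separate executables built from role-specific code, but because one mpiexec command starts them as a single MPI job, they can address each other by rank exactly as above.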

    Does this answer your question?

    Liwei

     

    Tuesday, May 18, 2010 2:18 PM
  • Thanks Liwei,

    I will try this...

    Friday, May 21, 2010 1:17 PM
  • Hi Liwei,

    Let me define the problem more clearly:

    I have two executable files, "client.exe" and "server.exe". We have a cluster with 1 head node and two compute nodes (the head node is also configured as a compute node). I want to run one instance of "client.exe" on the head node and as many instances of "server.exe" as there are compute nodes. In short, I want to execute one instance of "server.exe" on each compute node.

    I then want the client on the head node to communicate with all the servers running on the compute nodes.

    I have tested client.exe and server.exe on a single machine and they run absolutely fine.

    As you suggested, I tried to achieve this by submitting a job. The command I used looks like this:

    "mpiexec -n 1 -host <headnode hostname> : -n 1 -host <computenode1 host name> : -n 1 -host <compute node 2 host name>"

    I assigned the head node and all the compute nodes exclusively to this job and tried to execute it, but it failed with the error shown below:

    job aborted:
    [ranks] message

    [0] terminated

    [1] fatal error
    Fatal error in MPI_Comm_dup: Other MPI error, error stack:
    MPI_Comm_dup(136)..............: MPI_Comm_dup(MPI_COMM_WORLD, new_comm=0x00F8AEE0) failed
    MPIR_Comm_copy(500)............:
    MPIR_Get_contextid(248)........:
    MPI_Allreduce(609).............: MPI_Allreduce(sbuf=MPI_IN_PLACE, rbuf=0x0023F058, count=32, MPI_INT, MPI_BAND, MPI_COMM_WORLD) failed
    MPIR_Allreduce(211)............:
    MPIC_Sendrecv(170).............:
    MPID_Send(66)..................:
    MPIDI_CH3_SendEager(85)........:
    MPIDI_CH3I_VC_post_connect(376):
    MPIDI_CH3I_Sock_connect(304)...: [ch3:sock] rank 1 unable to connect to rank 2 using business card <port=54836 description="10.66.194.131 computenode1 " shm_host=computenode1 shm_queue=1048:592 >
    MPIDU_Sock_post_connect(1118)..:
    save_valid_endpoints(1047).....: unable to connect to 10.66.194.131 computenode1  on port 54836, no endpoint matches the netmask 10.66.193.0/255.255.255.0

    [2] terminated

    ---- error analysis -----

    [1] on computenode1
    mpi has detected a fatal error and aborted C:\MPITEST\server.exe

    ---- error analysis -----

     

    Please help me on this.

    Thursday, June 3, 2010 1:05 PM