MS-MPI Connection Errors

  • Question

  • Hello, I've had problems with MS-MPI on large numbers of cores. My research programs run fine on up to 64 cores, but something goes wrong when run on 128 cores (4 cores per compute node, MS-MPI through CCS). I haven't experimented with core counts in between. The error output for a 128-core job is below.

    Code Snippet

    44: CS021: fatal error: Fatal error in MPI_Alltoallv: Other MPI error, error stack:
    MPI_Alltoallv(406)..................: MPI_Alltoallv(sbuf=0x000000005AAF9D90, scnts=0x000000001DD08B90, sdispls=0x000000001E89BF50, MPI_DOUBLE_PRECISION, rbuf=0x000000005AAFFD90, rcnts=0x000000001DD08990, rdispls=0x000000001E89BD40, MPI_DOUBLE_PRECISION, MPI_COMM_WORLD) failed
    MPIR_Alltoallv(122).................:
    MPIC_Isend(250).....................:
    MPID_Isend(146).....................: failure occurred while attempting to send an eager message
    MPIDI_CH3_iSendv_internal(242)......:
    MPIDI_CH3I_Sock_connect(381)........: [ch3:sock] rank 44 unable to connect to rank 26 using business card <port=1336 description="12.88.43.135 177.252.115.175 cs025 " shm_host=cs025 shm_queue=E525B349-443F-47ae-BC3F-1021C4DE2728 >
    MPIDU_Sock_post_connect_filter(1258): unable to connect to 12.88.43.135 177.252.115.175 cs025  on port 1336, exhausted all endpoints
    MPIDU_Sock_post_connect_filter(1308): unable to connect to 177.252.115.175 on port 1336, No connection could be made because the target machine actively refused it.  (errno 10061)


    Is there a limit to the number of connections a machine can have at one time? Or could there be a maximum number of receive counts (rcnts) in the MPI_Alltoallv function for this particular cluster configuration? Some other tests I've done have shown that when rcnts is very low (<100), the program will function. Anyway, I'd appreciate any advice on debugging this problem. Thanks - Dave
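
    For reference, here is a stripped-down C sketch of the call pattern described above (hypothetical buffer sizes and counts, not the actual program):

    Code Snippet

    /* Minimal sketch of the failing pattern: every rank exchanges a block of
     * doubles with every other rank in one MPI_Alltoallv over MPI_COMM_WORLD,
     * and this is the first communication call of the run. Sizes are made up. */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int block = 1000;   /* hypothetical per-rank element count */
        int *scnts   = malloc(size * sizeof(int));
        int *rcnts   = malloc(size * sizeof(int));
        int *sdispls = malloc(size * sizeof(int));
        int *rdispls = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++) {
            scnts[i] = rcnts[i] = block;
            sdispls[i] = rdispls[i] = i * block;
        }
        double *sbuf = calloc((size_t)size * block, sizeof(double));
        double *rbuf = calloc((size_t)size * block, sizeof(double));

        /* First communication call: connects every rank pair at once. */
        MPI_Alltoallv(sbuf, scnts, sdispls, MPI_DOUBLE,
                      rbuf, rcnts, rdispls, MPI_DOUBLE, MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }
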
    Monday, June 16, 2008 6:13 PM

Answers

  • Are you using Windows Compute Cluster Server (i.e., v1) or the Windows HPC Server Beta?

     

    When MPI_Alltoallv is the first (or one of the first) communication functions called, it tries to connect all cores with all cores at the same time; the original MPICH2 code did not handle that very well with the 200-entry Winsock backlog queue. This was fixed in MS-MPI v2 (Windows HPC Server).

     

    If you modify your program to create some of the connections up front using the point-to-point API, you will not run into this problem (e.g., do a dummy send/receive from all to all).
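
    A minimal sketch of that idea (nothing MSMPI-specific; warm_up_connections is just an illustrative name): each rank does a staggered dummy MPI_Sendrecv with every other rank once, right after MPI_Init and before the first collective, so the socket connections are opened a few at a time instead of all at once.

    Code Snippet

    #include <mpi.h>

    /* Dummy all-to-all point-to-point exchange to open connections up front.
     * The (rank + step) pattern staggers the pairing so that all ranks do not
     * try to connect to the same target simultaneously. */
    static void warm_up_connections(MPI_Comm comm)
    {
        int rank, size;
        int sdummy = 0, rdummy = 0;   /* separate send/recv buffers, as MPI requires */
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        for (int step = 1; step < size; step++) {
            int dst = (rank + step) % size;          /* partner I send to this step  */
            int src = (rank - step + size) % size;   /* partner that sends to me now */
            MPI_Sendrecv(&sdummy, 1, MPI_INT, dst, 0,
                         &rdummy, 1, MPI_INT, src, 0,
                         comm, MPI_STATUS_IGNORE);
        }
    }

    Call it once on MPI_COMM_WORLD before your first MPI_Alltoallv.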

     

    thanks,

    .Erez

    • Proposed as answer by Lio Thursday, June 26, 2008 5:06 PM
    Tuesday, June 17, 2008 1:35 AM

All replies

  • I'm using Windows Compute Cluster Server (v1). Thanks for the tip; the problem is solved. The same issue also occurred in a different mode of the program, where the first MPI communication calls were mpi_isend instead of mpi_alltoallv, and the same solution worked for that mode too. Thanks again - Dave
    Tuesday, June 17, 2008 8:41 PM