MPI communication problems with multiple nodes

  • Question

  • Hello!

    Could you help me? I am trying to launch my MPI application using the Microsoft HPC Pack 2008 SDK and two virtual machines provided by Hyper-V. I start smpd.exe -d from the command line on both virtual machines, and then I run the following command on one of them:

     >"C:\Program Files\Microsoft HPC Pack 2008 SDK\Bin\mpiexec.exe" -machi nefile file.txt GlimpseOfMPI.exe

    When file.txt contains:

    10.0.16.6 5 

    everything is OK

    When file.txt contains:

    10.0.16.7 5 

    everything is OK

    But when file.txt contains:

    10.0.16.6 5

    10.0.16.7 5

    I see:

    Running 10 MPI processes, n = 1000

    job aborted:
    [ranks] message

    [0-4] terminated

    [5] fatal error
    Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(175)..................: MPI_Send(buf=0x00267880, count=101, MPI_DOUBLE, dest=0, tag=0, MPI_COMM_WORLD) failed
    MPIDI_EagerContigSend(177).....: failure occurred while attempting to send an eager message
    MPIDI_CH3_iStartMsgv(240)......:
    MPIDI_CH3I_VC_post_connect(405): MPIDI_CH3I_Shm_connect failed in VC_post_connect
    MPIDI_CH3I_Shm_connect(194)....: failed to attach to a bootstrap queue

    [6] fatal error
    Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(175)..................: MPI_Send(buf=0x021D7880, count=101, MPI_DOUBLE, dest=0, tag=0, MPI_COMM_WORLD) failed
    MPIDI_EagerContigSend(177).....: failure occurred while attempting to send an eager message
    MPIDI_CH3_iStartMsgv(240)......:
    MPIDI_CH3I_VC_post_connect(405): MPIDI_CH3I_Shm_connect failed in VC_post_connect
    MPIDI_CH3I_Shm_connect(194)....: failed to attach to a bootstrap queue

    [7] fatal error
    Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(175)..................: MPI_Send(buf=0x020E7880, count=101, MPI_DOUBLE, dest=0, tag=0, MPI_COMM_WORLD) failed
    MPIDI_EagerContigSend(177).....: failure occurred while attempting to send an eager message
    MPIDI_CH3_iStartMsgv(240)......:
    MPIDI_CH3I_VC_post_connect(405): MPIDI_CH3I_Shm_connect failed in VC_post_connect
    MPIDI_CH3I_Shm_connect(194)....: failed to attach to a bootstrap queue

    [8] fatal error
    Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(175)..................: MPI_Send(buf=0x02267880, count=101, MPI_DOUBLE, dest=0, tag=0, MPI_COMM_WORLD) failed
    MPIDI_EagerContigSend(177).....: failure occurred while attempting to send an eager message
    MPIDI_CH3_iStartMsgv(240)......:
    MPIDI_CH3I_VC_post_connect(405): MPIDI_CH3I_Shm_connect failed in VC_post_connect
    MPIDI_CH3I_Shm_connect(194)....: failed to attach to a bootstrap queue

    [9] fatal error
    Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(175)..................: MPI_Send(buf=0x008B7880, count=100, MPI_DOUBLE, dest=0, tag=0, MPI_COMM_WORLD) failed
    MPIDI_EagerContigSend(177).....: failure occurred while attempting to send an eager message
    MPIDI_CH3_iStartMsgv(240)......:
    MPIDI_CH3I_VC_post_connect(405): MPIDI_CH3I_Shm_connect failed in VC_post_connect
    MPIDI_CH3I_Shm_connect(194)....: failed to attach to a bootstrap queue

    ---- error analysis -----

    [5-9] on 10.0.16.7
    mpi has detected a fatal error and aborted GlimpseOfMPI.exe

    ---- error analysis -----

    Any ideas? I would be very much obliged to you. Thank you!
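
    GlimpseOfMPI's source is not shown here, but the error stack implies a pattern roughly like the minimal sketch below (each worker rank sends about n/size doubles to rank 0 with tag 0; the receive loop on rank 0 is only an assumption):

    /* Minimal sketch only; not the actual GlimpseOfMPI source. Each worker
     * rank sends its block of doubles to rank 0 with tag 0, which is the
     * MPI_Send call seen in the error stack. */
    #include <mpi.h>
    #include <stdio.h>

    #define N 1000
    #define MAX_CHUNK 101          /* the error stack shows counts of 100 and 101 */

    int main(int argc, char **argv)
    {
        int rank, size;
        double buf[MAX_CHUNK];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int chunk = N / size;      /* each rank works on roughly n/size values */
        for (int i = 0; i < chunk; i++)
            buf[i] = (double)(rank * chunk + i);

        if (rank == 0) {
            printf("Running %d MPI processes, n = %d\n", size, N);
            for (int src = 1; src < size; src++)
                MPI_Recv(buf, MAX_CHUNK, MPI_DOUBLE, src, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        } else {
            /* This is the call that fails on ranks 5-9 (the processes started
             * on 10.0.16.7) when both VMs are listed in the machinefile. */
            MPI_Send(buf, chunk, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }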


    Tuesday, August 30, 2011 2:19 PM

All replies

  • Do your VMs have different names? It appears that processes on 10.0.16.7 are trying to use shared memory to communicate with processes on 10.0.16.6, which won't work across nodes.

    You can work around this issue by adding "-env MSMPI_DISABLE_SHM 1" (HPC Pack 2012) or "-env MPICH_DISABLE_SHM 1" (earlier versions of the HPC Pack) to your mpiexec command line, but your on-node performance will suffer.
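
    For example, with the machinefile and executable from your original post (HPC Pack 2008 SDK, so the MPICH_ name applies), the command line would look like:

     >"C:\Program Files\Microsoft HPC Pack 2008 SDK\Bin\mpiexec.exe" -env MPICH_DISABLE_SHM 1 -machinefile file.txt GlimpseOfMPI.exe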

    -Fab

    Thursday, June 27, 2013 10:52 PM