Subroutine execution hangs at run time when using msmpi.

  • Question

  • Dear all,

    I have routines that are executed repeatedly by different parts of my .f90 programs. When I use MS-MPI and include MPI Send and Receive statements in these routines, execution hangs in one of these programs. Because the routine variables are publicly declared, as a workaround I put the routine in an isolated part of the program that I can branch to from anywhere, and at the end of the routine I branch back conditionally to where I came from. Surprisingly, this attempt succeeded, and I would like to know why. The computer is an i3 PC, and the operating system is Windows 7 (64-bit).

    Best regards.

    Prof. Dr. Said El Noshokaty



    Wednesday, October 30, 2013 10:48 PM

All replies

  • Hello Professor,

    MPI_Send is a blocking operation, and depending on whether there is buffering space in the network transport, may block until a matching MPI_Recv is issued by the peer process.  Likewise, the MPI_Recv call will block until a matching MPI_Send is issued by the peer process, and received locally.  In my experience, mismatched tags, source, or destination parameters are the leading cause of deadlocks.  What version of MS-MPI are you running?  We released a debugger extension for HPC Pack 2012 SP1 version of MS-MPI, available here:

    http://www.microsoft.com/en-us/download/details.aspx?id=39964

    You can inspect the message queues at the different processes, which might show why the processes are deadlocked.
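
    To illustrate the matching rule described above, here is a minimal sketch of a master and its workers exchanging one integer each. It is not taken from the original poster's program: the program name, the tag constants TAG_WORK and TAG_RESULT, and the payload values are hypothetical, and it assumes the MPI Fortran include file mpif.h is available to the compiler. Every MPI_Send has a receive on the peer with the same tag, the right source, and the right destination.

    ```fortran
    ! Minimal sketch (assumed names): rank 0 is the master, all other ranks are workers.
    ! Launch with, for example: mpiexec -n 3 tag_match_demo.exe
    program tag_match_demo
      implicit none
      include 'mpif.h'

      ! Hypothetical tag values; each send must use the tag its receive expects.
      integer, parameter :: TAG_WORK = 1, TAG_RESULT = 2
      integer :: ierr, rank, nprocs, worker, work, answer
      integer :: status(MPI_STATUS_SIZE)

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

      if (rank == 0) then
         ! Master: hand one work item to every worker.
         do worker = 1, nprocs - 1
            work = 10 * worker
            ! Matched on the worker by a receive with source 0 and tag TAG_WORK.
            call MPI_Send(work, 1, MPI_INTEGER, worker, TAG_WORK, &
                          MPI_COMM_WORLD, ierr)
         end do
         ! Master: collect one result from every worker.
         do worker = 1, nprocs - 1
            call MPI_Recv(answer, 1, MPI_INTEGER, worker, TAG_RESULT, &
                          MPI_COMM_WORLD, status, ierr)
            print *, 'master received', answer, 'from rank', worker
         end do
      else
         ! Worker: blocks here until the master's matching send arrives.
         call MPI_Recv(work, 1, MPI_INTEGER, 0, TAG_WORK, &
                       MPI_COMM_WORLD, status, ierr)
         answer = work + rank
         ! Sending with TAG_WORK instead of TAG_RESULT here would leave the
         ! master's receive without a match, so the master would wait forever.
         call MPI_Send(answer, 1, MPI_INTEGER, 0, TAG_RESULT, &
                       MPI_COMM_WORLD, ierr)
      end if

      call MPI_Finalize(ierr)
    end program tag_match_demo
    ```

    If a program hangs, comparing each send's destination and tag against the corresponding receive's source and tag in this way is a reasonable first check.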

    How many processes do you need to run with in order to see the deadlock?

    Thanks,
    -Fab

    Wednesday, November 6, 2013 5:20 PM
  • Hello Mr. Fab,

    I am running MS-MPI 2008. Three processes are running: one master and two subordinates. In other programs, I have run 7 processes successfully. The handshaking of sends and receives with the same tags completes successfully when I reach the subroutine containing the send by branching to it and branching back, instead of by call and return. Another workaround is to insert some filler statements, such as print statements, at the point where the program hangs.

    Best regards.

    Prof. Dr. Said El Noshokaty

    Monday, November 11, 2013 3:23 PM
  • I wonder if this might be a Fortran compiler issue.  What Fortran compiler are you using?

    Thanks,
    -Fab

    Monday, November 11, 2013 10:34 PM
  • Hello Mr. Fab,

    It is Intel Fortran 11.01.051.

    Best regards.

    Prof. Dr. Said El Noshokaty

    Thursday, November 14, 2013 12:17 PM
  • Is your program single-threaded, or multi-threaded?

    Thanks,
    -Fab

    Thursday, December 5, 2013 9:47 PM