attempting to send a message to the local process without a prior matching receive

  • Question

  • I use MPICH2 combined with Visual C++ 6.0 to solve a system of equations with the Cholesky method. It gives me the following errors:

    Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(174): MPI_Send(buf=0012BF80, count=4, MPI_DOUBLE, dest=0, tag=0, MPI_CO
    MM_WORLD) failed
    MPID_Send(53): DEADLOCK: attempting to send a message to the local process witho
    ut a prior matching receive

    job aborted:
    process: node: exit code: error message:
    0: localhost: 1: Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(174): MPI_Send(buf=0012BF80, count=4, MPI_DOUBLE, dest=0, tag=0, MPI_CO
    MM_WORLD) failed
    MPID_Send(53): DEADLOCK: attempting to send a message to the local process witho
    ut a prior matching receive
    Press any key to continue

     

    My code is as follows:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    //#define MPICH_SKIP_MPICXX   // this line must be present, otherwise an error is reported
    #include "mpi.h"

    #define  MAX_PROCESSOR_NUM  12   /*set the max number of the Processor*/
    #define  MAX_ARRAY_SIZE     32   /*set the max size of the array*/

    int main(int argc, char *argv[])
    {
        double a[MAX_ARRAY_SIZE][MAX_ARRAY_SIZE], g[MAX_ARRAY_SIZE][MAX_ARRAY_SIZE];
        int n;
        double transTime = 0,tempCurrentTime, beginTime;

        MPI_Status status;
        int rank, size = 2;
        FILE *fin;
        int i, j, k;

     int membershipKey;
     membershipKey=rank%3;


        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD,&rank);
       
    // MPI_Comm_split(MPI_COMM_WORLD,membershipKey,rank,&myComm);

     MPI_Comm_size(MPI_COMM_WORLD,&size);

    // MPI_Group_excl(MPI_GROUP_WORLD,1,ranks,&grprem); /* create a new group that does not include process 0 */


        if(rank == 0)
        {
     fin = fopen("dataIn.txt","r");
            if (fin == NULL)     /* have no input source file */
            {
                puts("Not find input data file");    
             puts("Please create a file \"dataIn.txt\"");
                puts("<example for dataIn.txt> ");
                puts("4");
                puts("1  3  4  5");
                puts("3  5  6  1");
                puts("4  6  2  7");
                puts("5  1  7  8");
                puts("\nArter\'s default data are running for you\n");
            }
            else
            {
                fscanf(fin, "%d", &n);
         if ((n < 1)||(n > MAX_ARRAY_SIZE)) /* the matrix input error */
                {
                    puts("Input the Matrix\'s size is out of range!");
                    exit(-1);
                }
                for(i = 0; i < n; i ++) /* read the matrix data in */
                {
                    for(j = 0; j < n; j ++) fscanf(fin,"%lf", &a[i][j]);
                }
            }

            /* put out the matrix */
            puts("Cholersky Decomposion");
            puts("Input Matrix A from dataIn.txt");
            for(i = 0; i < n; i ++)
            {
                for(j = 0; j < n; j ++) printf("%9.5f  ", a[i][j]);
                printf("\n");
            }
            printf("\n");
        }

        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

        for(k = 0; k < n; k ++)
        {
            /* gather the result, then broadcast it to each processor */
            MPI_Bcast(a[k], (n-k)*MAX_ARRAY_SIZE, MPI_DOUBLE, 0, MPI_COMM_WORLD);

      size = 2;
                                                                                                               
            for(i = k+rank; i < n; i += size) // does the parallel part start here?
            {
                for(j = 0; j < k; j ++)
                {
                    g[i][j] = a[i][j];
                }
                if (i == k)//rank = 0
                {
                    for(j = k; j < n; j ++) g[i][j] = a[i][j]/sqrt(a[k][k]); // includes a[k][k]
                }
                else
                {
                    g[i][k] = a[i][k]/sqrt(a[k][k]);
                    for(j = k+1; j < n; j ++) g[i][j] = a[i][j] - a[i][k]*a[k][j]/a[k][k];
                }
            }

            /* apply the Cholesky algorithm */
            for(i = k +rank; i < n; i ++)
            {
                MPI_Send(g[i], n, MPI_DOUBLE, 0, k*1000+i, MPI_COMM_WORLD);
            }

            if(rank == 0)
            {
                for(j = 0; j < size; j ++)
                {
                    for(i = k + j; i < n; i += size)
                    {
                        MPI_Recv(a[i], n, MPI_DOUBLE, j, k*1000+i, MPI_COMM_WORLD, &status);
                    }
                }
            }
        }

        if (rank == 0)
        {
            puts("After Cholersky Discomposion");
            puts("Output Matrix G");
            for(i =0; i < n; i ++)
            {
                for(j = 0; j < i; j ++) printf("           ");
                for(j = i; j < n; j ++) printf("%9.5f  ", a[i][j]);
                printf("\n");
            } /* output the result */
        }

        MPI_Finalize();/* end of the program */
    }

     

     

     

    Tuesday, November 16, 2010 4:39 AM

Answers

  • Hello Changyun,

    I did a quick check of the part of your source code related to MPI_Send and MPI_Recv, and it looks like they don't match.

    To illustrate the problem, let's assume there are just two processes and the array size is just 4 (n = 4 in your code).

    In the first iteration of the k loop (say k = 0), rank 0 issues 4 MPI_Send calls to rank 0 and rank 1 issues 3:

     for(i = k +rank; i < n; i ++)
     {
            MPI_Send(g[i], n, MPI_DOUBLE, 0, k*1000+i, MPI_COMM_WORLD);
     }

    Next, see what you have in MPI_Recv, still assuming size = 2 and k = 0. Rank 0 posts 2 MPI_Recv calls for messages from rank 0 and 2 for messages from rank 1. But remember that rank 0 issued 4 MPI_Send calls and rank 1 issued 3, so the sends and receives do not match, exactly as the error message indicates. In particular, rank 0 executes its blocking MPI_Send to itself before it posts any receive, which is the self-send deadlock that MPICH2 reports.

     if(rank == 0)
     {
          for(j = 0; j < size; j ++)
          {
              for(i = k + j; i < n; i += size)
              {
                  MPI_Recv(a[i], n, MPI_DOUBLE, j, k*1000+i, MPI_COMM_WORLD, &status);
              }
          }
     }
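
    One way to avoid the mismatch is to have every rank send only the rows it actually computed (stepping by size), and to let rank 0 keep its own rows with a plain local copy instead of sending to itself, so that no process ever issues a blocking send to the local process. Below is a minimal, self-contained sketch of just that communication pattern, not your full Cholesky code; the matrix size N, the fill values, and the single k step are only for illustration:

     /* minimal sketch of a matched send/receive pattern; N, the tag scheme
        and the fill values are illustrative only */
     #include <stdio.h>
     #include <string.h>
     #include "mpi.h"

     #define N 4

     int main(int argc, char *argv[])
     {
         double g[N][N], a[N][N];
         int rank, size, i, j, k = 0;
         MPI_Status status;

         MPI_Init(&argc, &argv);
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         MPI_Comm_size(MPI_COMM_WORLD, &size);

         /* each rank "computes" only the rows i = k+rank, k+rank+size, ... */
         for (i = k + rank; i < N; i += size)
             for (j = 0; j < N; j++)
                 g[i][j] = rank + 0.01 * i;

         if (rank != 0)
         {
             /* non-zero ranks send exactly the rows they own */
             for (i = k + rank; i < N; i += size)
                 MPI_Send(g[i], N, MPI_DOUBLE, 0, k * 1000 + i, MPI_COMM_WORLD);
         }
         else
         {
             /* rank 0 keeps its own rows with a local copy instead of a self-send */
             for (i = k; i < N; i += size)
                 memcpy(a[i], g[i], N * sizeof(double));

             /* ... and posts exactly one receive per row that each other rank sent */
             for (j = 1; j < size; j++)
                 for (i = k + j; i < N; i += size)
                     MPI_Recv(a[i], N, MPI_DOUBLE, j, k * 1000 + i,
                              MPI_COMM_WORLD, &status);

             for (i = 0; i < N; i++)
             {
                 for (j = 0; j < N; j++) printf("%9.5f  ", a[i][j]);
                 printf("\n");
             }
         }

         MPI_Finalize();
         return 0;
     }

    Applied to your program, this would mean stepping the send loop by size, skipping the send/receive pair for rank 0's own rows (or, alternatively, replacing the hand-written loops with MPI_Gatherv or non-blocking MPI_Isend/MPI_Irecv), so that the number of sends each rank issues equals the number of receives rank 0 posts for that rank.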

    Thanks,

    James

    Tuesday, November 16, 2010 5:16 PM