Can different MPI processes write data in one file (with different offset)?

  • Question

  • I'm developing a serious mathematical system that uses MPI, and I've run into a problem: can different MPI processes write data to one file (at different offsets)? If it is possible, please tell me how to do it. It would be very helpful if you could show source code that solves my problem. P.S. English is not my native language.
    Wednesday, August 5, 2009 9:38 AM

Answers

  • Hi, Anton.

    Yes, Johannes is correct that MPI IO does exactly that...it enables many MPI processes to read/write data to a single file.  Now, understand that MPI IO doesn't do anything you couldn't do yourself by opening a file for shared write with the native Windows APIs.  However, MPI IO has some very cool features that may save you quite a bit of time in programming/debugging your app.  You would just create a file share on your cluster and then reference that share by UNC path (e.g. \\MyHeadNode\MyShare\data1) when using the MPI_FILE_* functions.

    Some of the cool MPI IO features:
        * MPI IO does all the data marshalling and file layout for you...you don't have to partition the file yourself.
        * Writing files is very similar to sending MPI messages, and reading files is similar to receiving them, so MPI IO fits smoothly into most MPI applications.  You can also write/read in non-blocking mode, so you can do work while the file reads/writes proceed in the background.
        * The partitions used by each rank do NOT have to be contiguous.  
        * Works with your custom (derived) MPI datatypes.

    All of these features are available in MS-MPI, which is wired into the Windows file system.  In the future, I'd love to have it wired into parallel file systems too, but...I digress :)

    There's a cool online tutorial on MPI IO here:  http://beige.ucs.indiana.edu/I590/node86.html
    And an informative presentation/course here:  http://www.mcs.anl.gov/research/projects/mpi/tutorial/advmpi/sc2005-advmpi.pdf
    And more on the MPICH site (below).

    Good luck and let us know how it goes!
    Eric

    Eric Lantz
    MS-MPI Program Manager




    More MPI IO references from the MPICH-2 site:
    http://www.mcs.anl.gov/research/projects/mpich2/publications/index.php?s=pubs

    MPI-IO

    • R. Thakur, W. Gropp, and E. Lusk, "Optimizing Noncontiguous Accesses in MPI-IO," Parallel Computing, 28(1):83-105, January 2002.
    • R. Thakur, W. Gropp, and E. Lusk, "On Implementing MPI-IO Portably and with High Performance," in Proc. of the Sixth Workshop on I/O in Parallel and Distributed Systems, May 1999, pp. 23-32.
    • R. Thakur, W. Gropp, and E. Lusk, "Data Sieving and Collective I/O in ROMIO," in Proc. of the 7th Symposium on the Frontiers of Massively Parallel Computation, February 1999, pp. 182-189.
    • R. Thakur, W. Gropp, and E. Lusk, "A Case for Using MPI's Derived Datatypes to Improve I/O Performance," in Proc. of SC98: High Performance Networking and Computing, November 1998.
    • R. Thakur, W. Gropp, and E. Lusk, "An Abstract-Device Interface for Implementing Portable Parallel-I/O Interfaces," in Proc. of the 6th Symposium on the Frontiers of Massively Parallel Computation, October 1996, pp. 180-187.
    • R. Thakur, R. Ross, E. Lusk, and W. Gropp, "Users Guide for ROMIO: A High-Performance, Portable MPI-IO Implementation," Technical Memorandum ANL/MCS-TM-234, Mathematics and Computer Science Division, Argonne National Laboratory, Revised May 2004.

    Eric Lantz (Microsoft)
    Friday, August 7, 2009 11:42 PM

All replies

  • Hi,

    What you basically want to do is what MPI-IO implements.
    I cannot provide you with source code, but there should be some examples on the web.
    I don't know whether or how well MPI-IO is implemented in MS-MPI nowadays.

    Johannes
    JH
    Thursday, August 6, 2009 6:38 AM