Starting MPI Programs without using mpiexec?

  • Question

  • Is it possible to start mpi programs without using mpiexec?

    I am developing a distributed application that runs on many cores or computers. The code currently uses TCP/IP sockets (not MPI), and I am investigating moving it to MPI to allow the use of other transports such as RDMA or low-latency InfiniBand. The complete app can run on numerous computers, each with different (long) pathnames and so on, which makes it difficult to launch from a single mpiexec command line (even with a config file).

    Each executable is different and communicates with one or more other executables, and each end of a link knows the hostname and TCP port of the other end. Each program first does non-blocking sends on all communication channels, followed by blocking receives on all channels, so the entire sequence operates in lock-step. We have set it up so each process can start other processes (using long, complex PowerShell remote commands). In some cases each app is started from a GUI (and communicates with it via other sockets). I explain all this to show why it is difficult to start each process via mpiexec.

    I have seen posts for other MPI implementations which allow MPI programs to be started manually by setting a few environment variables (to identify the smpd hostname, port, rank, number of processes, etc.). I tried this with MSMPI (a simple ping-pong app): I set the environment variables in the shell, as well as directly from the code just before MPI_INIT is called, but the rank, number of processes, etc. are not set properly.

    Ideally we would want each application to start up on its own, then block at some point in the MPI calls until all of the other processes have started.  We currently use Windows 7 and Vista desktop computers.

    1) How does each MPI program get its information from mpiexec/smpd (for example, the rank and the number of tasks)?

    2) Can I duplicate this setup information somehow so I can start each program myself without using mpiexec?

    Any suggestions?  In advance, thanks and all help appreciated!


    • Edited by girwin Monday, June 18, 2012 5:53 PM
    Saturday, June 16, 2012 3:07 AM

All replies

  • I was really hoping for some expert guidance here - anyone?
    Monday, June 18, 2012 5:54 PM
  • Hi Girwin,

    What you are trying to do is really painful. Rank and other information are passed at MPI_INIT time, and setting that up manually would require far more work than setting up the HPC cluster properly.

    mpiexec should be able to start your GUI if that's your concern.

    Michael

    Tuesday, June 19, 2012 8:25 PM
  • Hi Michael - thanks for the reply.

    I have determined that MSMPI does not support starting processes manually (i.e. without using mpiexec), although most other MPI implementations do support this.

    We are now trying Intel MPI, which supports the MPI_Comm_accept and MPI_Comm_connect calls. We can start each client and server program manually, they connect, and then we can use Send/Recv normally.

    It seems all MPI development is based heavily on distributed processing requirements (i.e. where each program is the same, you get your rank, then do something specific to it). Our setup is a simple client-server one: each program is different and we simply want to communicate between the programs (which may be on the same computer or on other computers).

    Thanks!



    • Edited by girwin Wednesday, June 20, 2012 3:47 PM
    Wednesday, June 20, 2012 3:45 PM
  • I want to run an MPI app manually, too

    This is my code (using Boost.MPI under Windows 10 with Microsoft MPI v7.1):

    #include <boost/mpi.hpp>
    #include <boost/serialization/string.hpp>
    #include <boost/exception/all.hpp>
    #include <chrono>
    #include <exception>
    #include <iostream>
    #include <string>
    #include <thread>
    
    using namespace std;
    namespace mpi = boost::mpi;
    
    int main(int argc, char* argv[]) {
    
       for (int i = 0; i < argc; ++i) {
          std::cout << "argc " << i << " value " << argv[i] << std::endl;
       }
    
       mpi::environment env(false);
       std::cout << "processor_name " << env.processor_name() <<
          " thread_level: " << env.thread_level() <<
          " is_main_thread: " << env.is_main_thread() << std::endl;
    
       std::this_thread::sleep_for(std::chrono::seconds(2));
    
       mpi::communicator world;
       std::cout << "I am process " << world.rank() << " of " << world.size()
          << "." << std::endl;
    
       do {
          if (world.rank() == 0) {
             try {
                mpi::request reqs[2];
                std::string msg, out_msg = "Hello";
    
                reqs[0] = world.isend(1, 0, out_msg);
                reqs[1] = world.irecv(1, 1, msg);
                mpi::wait_all(reqs, reqs + 2);
                std::cout << msg << "!" << std::endl;
             }
             catch (boost::exception &ex) {
                std::cerr << "[" << world.rank() << "] boost::exception" << boost::diagnostic_information(ex);
             }
             catch (std::exception &ex) {
                std::cerr << "[" << world.rank() << "] std::exception" << ex.what();
             }
             catch (...) {
                std::cerr << "[" << world.rank() << "] exception";
             }
          }
          else {
             try {
                mpi::request reqs[2];
                std::string msg, out_msg = "world";
                reqs[0] = world.isend(0, 1, out_msg);
                reqs[1] = world.irecv(0, 0, msg);
                mpi::wait_all(reqs, reqs + 2);
                std::cout << msg << ", ";
             }
             catch (boost::exception &ex) {
                std::cerr << "[" << world.rank() << "]" << boost::diagnostic_information(ex);
             }
             catch (std::exception &ex) {
                std::cerr << "[" << world.rank() << "]" << ex.what();
             }
             catch (...) {
                std::cerr << "[" << world.rank() << "]" << "exception";
             }
          }
          std::this_thread::sleep_for(std::chrono::seconds(1));
       } while (true);
    
       return 0;
    }
    

    This is the output using mpiexec (mpiexec -debug 3 -n 2 app_name.exe):

    argc 0 value boost_mpi_main_MDd_x86_v140.exe
    argc 0 value boost_mpi_main_MDd_x86_v140.exe
    processor_name jagomezw7.indra.es thread_level: single is_main_thread: 1
    processor_name jagomezw7.indra.es thread_level: single is_main_thread: 1
    I am process 0 of 2.
    I am process 1 of 2.
    world!

    This works well, but I want to run the program manually, so I tried starting smpd and app_name.exe directly.

    This is the output when starting smpd manually:

    set MSMPI_LOCAL_ONLY=1
    smpd.exe -p 8677 -d 2 -localonly

    [-1:67884] Launching SMPD service.
    [-1:67884] smpd listening on port 8677

    And this is the output when calling app_name.exe manually:

    SET MSMPI_LOCAL_ONLY=1
    SET PMI_PORT=8677
    SET PMI_RANK_AFFINITIES=affinity_region_67884
    SET PMI_SIZE=2
    SET PMI_RANK=0

    app_name.exe

    argc 0 value boost_mpi_main_MDd_x86_v140.exe
    processor_name jagomezw7.indra.es thread_level: single is_main_thread: 1
    I am process 0 of 1.
    [0] boost::exceptionThrow location unknown (consider using BOOST_THROW_EXCEPTION)
    Dynamic exception type: class boost::exception_detail::clone_impl<struct boost::exception_detail::error_info_injector<class boost::mpi::exception> >
    std::exception::what: MPI_Isend: Invalid rank, error stack:
    MPI_Isend(buf=0x0042C1A4, count=1, MPI_UNSIGNED, dest=1, tag=0, MPI_COMM_WORLD, request=0x001CF228) failed
    Invalid rank has value 1 but must be nonnegative and less than 1

    I also tried these other environment variables, but I can't change world.size(); it is always 1.

    SET MSMPI_LOCAL_ONLY=1
    SET PMI_APPNUM=0
    SET PMI_DOMAIN=441ea856-59f5-4c93-b8e8-a75c31bd6f31
    SET PMI_HOST=localhost
    SET PMI_KVS=16944d75-bb84-4f98-9a52-011dd5b9ea38
    SET PMI_NODE_IDS=smp_region_64096
    SET PMI_PORT=fe13ea10-8d7d-42ae-9de8-dcdbd43cfa3a
    SET PMI_RANK_AFFINITIES=affinity_region_67884
    SET PMI_SIZE=2
    SET PMI_SMPD_ID=1

    SET PMI_RANK=0
    SET PMI_SMPD_KEY=1

    Is it possible? Do you know which environment variables need to be set?

    Thank you in advance.




    jose andres gomez tovar

    Wednesday, November 23, 2016 3:12 PM
  • Hi Jose,

    Currently only these scenarios are supported:

    1) Launching MPI processes using mpiexec

    2) Launching individual MPI processes and connecting them together using MPI_Comm_connect/MPI_Comm_accept

    If you launch a process without using mpiexec, it will start with an MPI_COMM_WORLD size of 1. Note that the environment variable PMI_SIZE is for reference only and you should not try to set it manually.

    Is there a reason why you do not want to launch the processes using mpiexec?

    Anh

    Wednesday, November 23, 2016 8:15 PM
  • Is it possible to have an architecture like this?
    main->dll (mpi_point_to_point)
    other_main(mpi_point_to_point)

    That is, to use MPI as an RPC mechanism (like gRPC or something similar).

    I have a main.exe with a lot of DLLs. I want to create a generic DLL with an MPI (point-to-point) interface, plus a main.exe with a specific DLL.


    jose andres gomez tovar

    Wednesday, November 23, 2016 9:40 PM
  • I'm not sure I completely understand the scenario that you're describing. Do you want to have some sort of RPC server running (main dll), and then some other client calls into the RPC server to execute some workload? If yes, it is possible using MPI_Comm_connect and MPI_Comm_accept.

    You can refer to this article for some examples on the usage of those two APIs:

    http://mpi-forum.org/docs/mpi-3.1/mpi31-report/node248.htm

    Thanks

    Anh


    Tuesday, November 29, 2016 10:11 PM

  • jose andres gomez tovar

    Sunday, December 4, 2016 8:14 PM