MPI environment

  • General discussion

  • Hi,

    We have been using Pallas MPI for a long time and usually set only MPICH_DISABLE_SOCK 1 as an environment variable.

    It may be a mistake not to use any other parameters when validating Network Direct on the Windows HPC platform.

    Is there a public list of options we should consider on the 'mpiexec' command line to validate Network Direct in more depth?

    Can anyone share their experience using the various Pallas MPI environment variables when testing Network Direct?

    Thanks in advance,

    Monday, November 23, 2009 4:52 PM

All replies

  • Run 'mpiexec -help3' to get a list of environment variables. Several of them are related to Network Direct.
    You can also try disabling the shared memory interconnect with '-env MPICH_DISABLE_SHM 1'.

    thanks,
    .Erez
    Sunday, November 29, 2009 5:16 AM
  • Hi Techie,

    The following is a list of Network Direct related environment variables, sorted by importance and common use (from high to low):


    MPICH_DISABLE_ND=[0|1]
     When set to 1, disables the use of the Network Direct interconnect.

    MPICH_DISABLE_SHM=[0|1]
     When set to 1, disables the use of the Shared Memory interconnect.

    MPICH_DISABLE_SOCK=[0|1]
     When set to 1, disables the use of the Sockets interconnect.

    MPICH_NETMASK=address/subnet
     When set, limits the Sockets and Network Direct interconnects to use only
     connections that match the network mask. For example,
           -env MPICH_NETMASK 10.0.0.5/255.255.255.0
     will use only networks that match 10.0.0.x.
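
To make the matching rule concrete, here is a minimal Python sketch of how an address/subnet pair selects addresses. It only illustrates the arithmetic; it is not how MS-MPI implements the check, and the function name `matches_netmask` is hypothetical.

```python
import ipaddress

def matches_netmask(candidate: str, netmask_spec: str) -> bool:
    """Return True if candidate lies in the network given as 'address/subnet'."""
    # strict=False allows the address part to carry host bits
    # (10.0.0.5 instead of 10.0.0.0), as in the example above.
    network = ipaddress.ip_network(netmask_spec, strict=False)
    return ipaddress.ip_address(candidate) in network

spec = "10.0.0.5/255.255.255.0"
print(matches_netmask("10.0.0.17", spec))  # True: same 10.0.0.x network
print(matches_netmask("10.0.1.17", spec))  # False: different network
```

A quick check like this can help confirm that the interface you expect Network Direct to use actually falls inside the mask you pass to mpiexec.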


    MPICH_ND_ZCOPY_THRESHOLD=size (bytes)
     Set the message size above which to perform zcopy transfers.
     The default of 0 uses the threshold indicated by the Network Direct provider.
     The value -1 disables zcopy transfers.


    MPICH_SOCKET_SBUFFER_SIZE=size (bytes)
     Set the Sockets send buffer size in bytes (SO_SNDBUF). The default is 32768.

    MPICH_SHM_EAGER_LIMIT=size (bytes)
     Set the message size above which to use the rendezvous protocol for shared
     memory communication. The default is 128000 (1500 - 2G).

    MPICH_SOCK_EAGER_LIMIT=size (bytes)
     Set the message size above which to use the rendezvous protocol for sockets
     communication. The default is 128000 (1500 - 2G).

    MPICH_ND_EAGER_LIMIT=size (bytes)
     Set the message size above which to use the rendezvous protocol for
     Network Direct communication. The default is 128000 (1500 - 2G).
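
The three eager limits above share the same rule: messages larger than the limit use the rendezvous protocol. The Python sketch below illustrates that decision. It assumes that a message exactly at the limit is still sent eagerly, which the descriptions above do not state, and the names `choose_protocol` and `EAGER_LIMIT_DEFAULT` are hypothetical.

```python
EAGER_LIMIT_DEFAULT = 128000  # default quoted above for SHM, SOCK and ND

def choose_protocol(message_size: int,
                    eager_limit: int = EAGER_LIMIT_DEFAULT) -> str:
    """Pick the protocol for a message of message_size bytes."""
    # Assumption: sizes strictly above the limit switch to rendezvous;
    # the behaviour exactly at the limit is not documented in this thread.
    return "rendezvous" if message_size > eager_limit else "eager"

for size in (1024, 128000, 128001, 4 * 1024 * 1024):
    print(size, choose_protocol(size))
```

When validating Network Direct, it may be worth running the benchmark with message sizes on both sides of MPICH_ND_EAGER_LIMIT so that both code paths are exercised.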

    MPICH_ND_ENABLE_FALLBACK=[0|1]
     When set, enables the use of the sockets interconnect if the Network Direct
     interconnect is enabled but connection over Network Direct fails.

    MPICH_ND_MR_CACHE_SIZE=size (MB)
     Set the size in megabytes of the Network Direct memory registration cache.
     The default is half of physical memory divided by the number of cores.
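
As a worked example of the default cache size, the Python sketch below applies "half of physical memory divided by the number of cores". The rounding (integer division) is an assumption, and `default_mr_cache_mb` is a hypothetical helper, not part of MS-MPI.

```python
def default_mr_cache_mb(physical_memory_mb: int, cores: int) -> int:
    """Estimate the default MPICH_ND_MR_CACHE_SIZE in megabytes."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    # Half of physical memory, shared evenly across the cores.
    # Assumption: the result is rounded down to a whole megabyte.
    return (physical_memory_mb // 2) // cores

# A node with 16 GB of RAM and 8 cores: (16384 / 2) / 8 = 1024 MB.
print(default_mr_cache_mb(16384, 8))  # 1024
```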

    Tuesday, December 8, 2009 5:11 AM
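
Putting the replies together, one way to validate Network Direct is to disable the other two interconnects so that it is the only one left enabled. The Python sketch below just assembles such an mpiexec command line from '-env NAME VALUE' pairs. The program name `benchmark.exe` and the helper `build_mpiexec_args` are placeholders, not names from this thread.

```python
from typing import Dict, List

def build_mpiexec_args(num_procs: int, env: Dict[str, str],
                       program: List[str]) -> List[str]:
    """Build an mpiexec argument list with one '-env NAME VALUE' per variable."""
    args = ["mpiexec", "-n", str(num_procs)]
    for name, value in env.items():
        args += ["-env", name, value]
    return args + program

# Disable sockets and shared memory, leaving Network Direct enabled.
nd_only = {"MPICH_DISABLE_SOCK": "1", "MPICH_DISABLE_SHM": "1"}
cmd = build_mpiexec_args(2, nd_only, ["benchmark.exe"])  # placeholder program
print(" ".join(cmd))
# mpiexec -n 2 -env MPICH_DISABLE_SOCK 1 -env MPICH_DISABLE_SHM 1 benchmark.exe
```

The resulting list can be passed to `subprocess.run` on a machine where mpiexec is installed. Further variables from the list above, such as MPICH_NETMASK or MPICH_ND_EAGER_LIMIT, can be added to the dictionary in the same way.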