Unable to connect to the smpd service

  • Question

  • Hi Team,

    Need your support in troubleshooting the MPI communication error.

    Name                      Version
    Windows Server OS         2016
    Windows HPC Pack          5.3.6450
    Microsoft MPI             10.0.12498.5
    Number of Master Nodes    1 [L11SGRIHPC001]
    Number of Compute Nodes   4 [L11SGRIHPC002 - 005]

    HPC Pack version (HPC Cluster Manager -> Help -> About): 5.3.6450.0

    What topology are you set to: Topology 3

    How many head nodes: one head node

    How many compute nodes: four compute nodes

    Are we using SOA: No

    Are they on premise or in Azure: on premise

    Are nodes virtualized? Using what hypervisor: not virtualized (physical servers)

    Error Details:

    Aborting: smpd on L11SGRIHPC002 is unable to connect to the smpd service on L11SGRIHPC003:8677

    Other MPI error, error stack:

    connect failed - Access is denied.  (errno 5)

    [00:1328] smpd manager successfully stopped listening.

     

    Thank You

    Atul Yadav

    9980066464


    Atul Yadav

    Tuesday, April 7, 2020 4:06 AM

All replies

  • Hi Atul,

    Could you check under which user the MPI job was running? Is the user in the same domain as the cluster nodes?

    Besides, could you run the built-in MPI Ping-Pong diagnostic tests and check the results?

    Regards,

    Yutong Sun

    Friday, April 17, 2020 3:55 PM
    Moderator
  • Hi Yutong,

    Please find my responses to your questions below:

    Could you check under which user the MPI job was running?

     The user belongs to the local admin group and the Domain Admin group.

    Is the user in the same domain as the cluster nodes?

     Yes, the user is part of the same domain.

    Could you run the built-in MPI Ping-Pong diagnostic tests and check the results?

     The MPI Ping-Pong diagnostic test failed.

    Thank You

    Atul Yadav


    Atul Yadav

    Monday, May 11, 2020 3:23 PM
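
The error above names a specific endpoint, L11SGRIHPC003:8677. The following is a minimal sketch, not part of the original thread, that checks whether a plain TCP connection to that port can be opened from another node. It assumes Python 3 is available on the nodes, which the thread does not state. The helper names `can_connect` and `probe` are hypothetical, and the node list is expanded from the "L11SGRIHPC002 - 005" range given in the question. If the TCP connection succeeds, the "Access is denied" failure probably occurs after the connection is established (for example during authentication). If it fails, the network path, a firewall, or the listening service would be the first things to check. This only narrows the search and is not a confirmed diagnosis.

```python
import socket

# Port taken from the error message in the question.
SMPD_PORT = 8677

# Compute node names expanded from the range in the question (an assumption).
COMPUTE_NODES = [f"L11SGRIHPC{n:03d}" for n in range(2, 6)]


def can_connect(host: str, port: int = SMPD_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def probe(hosts, port: int = SMPD_PORT, timeout: float = 3.0) -> dict:
    """Map each host name to whether its port accepted a TCP connection."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Running `probe(COMPUTE_NODES)` on each compute node in turn would show which node pairs can reach each other on port 8677 at the TCP level.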