Workstation nodes on different domain to the cluster - no endpoint matches the netmask error

  • Question

  • I hope someone can help me...

    According to the documentation, "...you can also use workstation computers that are joined to any domain that has an established trust relationship with the domain to which the head node is joined", and my two domains A and B do have a trust relationship.

    I have two machines in the cluster in domain A (a head-/compute-node and a compute-node), and I can add four workstations in domain B to the head-node as workstation-nodes. I am running Windows Server HPC Pack 2008 R2 SP4 on all the nodes.

    The problem is that I can't run jobs that use the whole cluster, that is, the two nodes in domain A and the four nodes in domain B. I get an error message that states...

    " unable to connect to <list_of_IP_addresses_for_a_workstation_node> on port 63505, no endpoint matches the netmask <domain_A_netmask>"
    Is there a solution that doesn't involve moving all the machines to the same domain? Is the documentation correct, and should I be able to run jobs across the whole cluster?

    Many thanks in advance!

    <update>

    The head node and the workstation node have different subnet masks. The instructions for altering the subnet mask that the MPI process uses are well described here, but what exactly should that value be changed to? Should it match the subnet mask of the head node, or the subnet mask of the workstation node?

    </update>


    • Edited by Matt JC Barron Wednesday, January 21, 2015 1:21 PM Updated question
    Friday, October 31, 2014 3:55 PM

Answers

  • It sounds like you're trying to run the MPI application on nodes that are in different subnets, which might not work. Try setting a wider subnet mask that includes both sets of nodes. What are the current subnets for your workstation nodes?
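
To illustrate what "a wider subnet mask that includes both nodes" means, here is a minimal sketch using Python's standard `ipaddress` module. It only tests whether two addresses fall in the same network under a given mask; it does not reproduce the matching logic inside MPI itself. The function name `same_subnet` and the addresses are hypothetical, chosen for illustration.

```python
import ipaddress

def same_subnet(ip_a: str, ip_b: str, netmask: str) -> bool:
    """Return True if ip_b lies in the network formed by ip_a and netmask."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    return ipaddress.ip_address(ip_b) in network

# Hypothetical head-node and workstation addresses, one subnet apart.
head_node = "10.0.0.5"
workstation = "10.0.1.7"

narrow = same_subnet(head_node, workstation, "255.255.255.0")  # /24
wide = same_subnet(head_node, workstation, "255.255.254.0")    # /23
```

With the narrower mask the two addresses land in different networks; widening the mask by one bit puts them in the same one.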

    Qiufang Shi

    Tuesday, January 27, 2015 8:45 AM

All replies

  • Some questions:

    1. Are the workstation nodes "Healthy" and "Online" in Admin UI?

    2. Which HPC network topology are you using?

    3. Where did you see the error? In Event Viewer on the head node?

    4. Can you see whether anything is listening on port 63505 on the workstation nodes?

    5. Is a firewall enabled on the workstation nodes?
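
One way to check question 4 from another machine is a plain TCP connection attempt. The sketch below uses only Python's standard `socket` module; the helper name `port_is_open` and the host name in the usage comment are hypothetical. A port may only be listening while a job is actually running, so a negative result outside a job is not conclusive.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

# Example usage (placeholder host name, port taken from the error message):
# port_is_open("workstation01", 63505)
```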

    Tuesday, November 4, 2014 3:12 AM
  • Some questions:

    1. Are the workstation nodes "Healthy" and "Online" in Admin UI?
       Answer: Node State: Online; Node Health: OK

    2. What kinds of HPC network topology are you using?
       Answer: Topology 5

    3. Where did you see the error? Event Viewer on the head node?
       Answer: In the Standard Output for the job

    4. Can you see if there is a connection listening on port 63505 on the workstation nodes?
       Answer: I cannot see that port number in the list of listening ports

    5. Is there any firewall opened on the workstation nodes?
       Answer: No

    Hi zhongl1, my answers are above. Many thanks, Matt.

    Tuesday, January 20, 2015 3:16 PM
  • I managed to solve this (finally!) yesterday. As @Qiufang suggested, I widened the subnet mask so as to include the IP range of my cluster and the IP range of my workstations.
    Tuesday, January 27, 2015 10:40 AM
  • Hey @Matt JC Barron, how did you widen the subnet mask "so as to include the IP range of my cluster and the IP range of my workstations"?

    I have the same problem:

    Head node subnet mask: 255.255.248.0

    New node subnet mask: 255.255.255.0

    thanks,

    Baruch

    Tuesday, September 13, 2016 3:47 PM
  • Hi Baruch,

    I used an IP Subnet Calculator (http://jodies.de/ipcalc) to work out the subnet mask needed to cover all my node IP addresses. Look at the HostMin and HostMax values for the subnet mask generated by that website.

    For example, if your head-node IP is 192.168.0.1, then a subnet mask of 255.255.255.0 will cover IPs in the range 192.168.0.1 to 192.168.0.254.
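
The same calculation can be sketched in Python with the standard `ipaddress` module, as an alternative to the web calculator. This is a minimal illustration, assuming IPv4 addresses; the function name `covering_network` and the second address list are hypothetical. It finds the smallest network containing every listed address, and its netmask and first/last host addresses correspond to the calculator's Netmask, HostMin and HostMax fields.

```python
import ipaddress

def covering_network(addresses):
    """Return the smallest IPv4 network that contains every address given."""
    values = [int(ipaddress.IPv4Address(a)) for a in addresses]
    lo, hi = min(values), max(values)
    prefix = 32
    # Shorten the prefix until the lowest and highest address share
    # the same leading (network) bits.
    while prefix > 0 and (lo >> (32 - prefix)) != (hi >> (32 - prefix)):
        prefix -= 1
    return ipaddress.ip_network(f"{ipaddress.IPv4Address(lo)}/{prefix}", strict=False)

# The range from the example above.
net = covering_network(["192.168.0.1", "192.168.0.254"])
host_min, host_max = net[1], net[-2]   # first and last usable host addresses

# Hypothetical cluster and workstation addresses a few subnets apart.
wider = covering_network(["192.168.0.10", "192.168.5.20"])
wider_mask = wider.netmask
```

Here `net` is the 255.255.255.0 network from the example, while the second address list needs the wider mask 255.255.248.0 to cover both machines.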

    Hope you get it working! Cheers,

    Matt

    Wednesday, September 14, 2016 9:05 AM