Azure IaaS node error - 'HPC Node Manager Service unreachable'

    Question

  • Environment: I have an on-premises head node and am trying to add Azure IaaS nodes for bursting.

    Everything is on HPC Pack 2016 Update 1

    Azure and our on-premises network are connected via a VNet.

    I am able to start/stop the Azure nodes just fine, but those nodes show an Online state with a node health of Error.

    Upon digging deeper, they have a Node connectivity error of 'HPC Node Manager Service unreachable'.

    I can ping them by IP and RDP into the VMs by IP.

    I am guessing the error is due to the Azure VMs not being on our domain, which causes a DNS resolution issue.

    Is there a way to force the head node to resolve those VMs by IP? Or is there a workaround? I would hate to use the hosts.config approach to resolve these, because the IPs could change when we burst into Azure and bring the VMs down at a later point, and they could also change if I ask for fewer VMs, etc.

    Friday, August 24, 2018 9:09 PM

All replies

  • Hi, 

    Per your description, I assume the IaaS nodes run Windows and that you didn't choose "Join Domain" when you created the Azure IaaS node template, so the compute nodes are not domain joined.

    Have you enabled network security group rules on the virtual network?

    Can you ping the head node by FQDN from the compute node?

    How did you configure the DNS server of the virtual network?


    • Edited by Sunbin Zhu Monday, August 27, 2018 3:42 AM
    Monday, August 27, 2018 3:38 AM
  • 1) Yes, the IaaS nodes run Windows Server 2016

    2) I did not choose 'join domain' when creating node template

    3) How can I enable or verify the 'network security group rules' on the virtual network?

    4) I cannot ping the HN by name (or FQDN) from the Azure IaaS compute node. Pinging by IP address works in both directions, from the HN to the compute node and vice versa.

    Monday, August 27, 2018 4:53 PM
  • You can check the subnet of the virtual network in the Azure portal to see whether a network security group is applied to it.

    But per your description, this looks like a DNS resolution issue. You should manually configure a DNS server on the virtual network that can resolve the FQDN of the head node.
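
    To confirm which of the two it is, a small check along these lines can be run on the compute node (and again on the head node, pointed at a compute node). It is only a sketch, not part of HPC Pack: the FQDN, IP address, and port in the example are placeholders, and port 1856 for the HPC Node Manager Service is an assumption you should verify for your deployment. The helper names resolve, can_connect, and diagnose are made up for illustration.

```python
import socket


def resolve(name):
    """Return the sorted IP addresses a name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry's sockaddr starts with the address string; deduplicate them.
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(name, ip, port):
    """Classify the failure as 'network', 'dns', or 'ok'."""
    if not can_connect(ip, port):
        # The port is not reachable even by IP: look at NSG or firewall rules.
        return "network"
    if not resolve(name):
        # Reachable by IP but the name does not resolve: a DNS problem.
        return "dns"
    return "ok"


if __name__ == "__main__":
    # Placeholder values (hypothetical): replace with your own.
    HEAD_NODE_FQDN = "headnode.example.local"
    HEAD_NODE_IP = "10.0.0.4"
    NODE_MANAGER_PORT = 1856  # assumption: verify for your HPC Pack version

    print("resolved addresses:", resolve(HEAD_NODE_FQDN))
    print("result:", diagnose(HEAD_NODE_FQDN, HEAD_NODE_IP, NODE_MANAGER_PORT))
```

    A result of "dns" matches what you describe (reachable by IP, name not resolving) and points at the DNS server setting of the virtual network. A result of "network" would point at a network security group or firewall rule instead.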

    Wednesday, August 29, 2018 3:22 AM