HPC headnode local hosts file management

  • Question

  • Hi, 

    I noticed that the HPC cluster dynamically updates the local hosts file, C:\Windows\System32\drivers\etc\hosts.

    Can anybody explain, or point me to documentation on, how often this file is updated and how it is used in the cluster? Should I disable this behavior if I use the Enterprise network topology?


    Monday, January 16, 2017 5:05 PM

All replies

  • Hi,

      HPC Pack manages the hosts file on both the headnode and the compute nodes so that, in a complex network environment, the services pick the right network for each kind of traffic. For example:

    - All user applications go over the Application network, which is usually an IB (InfiniBand) network

    - All service and deployment traffic goes over the Private network

    - The Enterprise network is used for internet access, DNS resolution, AD connectivity ...

    To do this, HPC Pack puts the following entries in the hosts file:

         <Ip>   Enterprise.HeadnodeName #HPC

         <Ip>   Private.HeadnodeName #HPC

         <Ip>   HeadnodeName #HPC

         <Ip>   Application.HeadnodeName #HPC

         <Ip>   Enterprise.ComputeNodeName #HPC

         <Ip>   Private.ComputeNodeName #HPC

         <Ip>   ComputeNodeName #HPC

         <Ip>   Application.ComputeNodeName #HPC

    All entries marked "#HPC" are managed by the HPC service, so you shouldn't change them. To answer your questions:

    1. The file is changed when the IP address of the headnode or a compute node changes, or when compute nodes are added or removed

    2. The file is usually updated within 5 minutes

    3. Updates to the file can be disabled by setting "# ManageFile = false" (the leading "#" is part of the line and must stay)

    4. With the Enterprise network, it should be okay to disable it as long as DNS resolution works without issues

    But I'd also like to understand why you want to disable this.


    Qiufang Shi

    Tuesday, January 17, 2017 1:20 AM
  • Thank you for the prompt response.

    The hosts file exists on the system to address problems when you cannot rely on DNS. I would think of it as a temporary or auxiliary solution rather than main functionality.

    Does the HPC head node preserve manual entries unrelated to the HPC cluster when it updates the file?


    Tuesday, January 17, 2017 2:47 PM
  • The HPC Pack service on the headnode collects the IP information reported by the compute nodes, builds an IP<-->ComputeNodes mapping list, and distributes it to all compute nodes whenever the contents change. The services running on the compute nodes then write or update the entries marked "#HPC" in the hosts file.

    Thus, HPC won't touch entries that aren't related to the HPC cluster. It only adds entries marked "#HPC" and manages only those entries.
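
    As a rough illustration of the "#HPC" marker convention described above, the following Python sketch splits the text of a hosts file into HPC-managed entries and manually maintained ones. The function name, the sample IP addresses, and the host names are all invented for illustration; this is not how the HPC service itself is implemented.

```python
def split_hosts_entries(text):
    """Return (managed, manual): lists of (ip, hostname) pairs.

    An entry counts as HPC-managed when its trailing comment is "HPC",
    mirroring the "#HPC" marker described in this thread.
    """
    managed, manual = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        body, _, comment = line.partition("#")
        fields = body.split()
        if len(fields) < 2:
            continue  # not a usable "ip name" entry
        ip, names = fields[0], fields[1:]
        bucket = managed if comment.strip().upper() == "HPC" else manual
        bucket.extend((ip, name) for name in names)
    return managed, manual


# Hypothetical sample content; addresses and names are invented.
SAMPLE = """\
127.0.0.1    localhost
192.168.1.50 fileserver          # added by hand
10.0.0.1     Enterprise.HEADNODE #HPC
10.1.0.1     Private.HEADNODE    #HPC
"""

managed, manual = split_hosts_entries(SAMPLE)
print("HPC-managed:", managed)
print("manual:", manual)
```

    Running something like this against a copy of the file is a quick way to see which lines you own and which ones the cluster will rewrite.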

    Qiufang Shi

    Wednesday, January 18, 2017 1:15 AM
  • Hi Qiufang,

    I have an HPC cluster installed on 4 machines running Windows Server 2012 R2:

    MSE-CL-Head-SRV (Cluster Head)

    MSE-CL-WCF-SRV (Windows Communication Foundation)

    MSE-CL-Node1 and MSE-CL-Node2 are compute nodes.

    I have also installed MATLAB Distributed Computing Server (MDCS) on the head node and the compute nodes.

    When I test connectivity from the MATLAB Admin Center, there is a test called ResolveIPToHostname Test. Its result is a Warning, and the detailed message says: The hostname (mse-cl-node1) and canonical hostname (enterprise.mse-cl-node1) do not match. mse-cl-node1 may be misconfigured or the domain name may not set up correctly.

    If I comment out this line in the hosts file:            Enterprise.MSE-CL-NODE1        #HPC

    the warning disappears.

    But when I uncomment and set ManageFile = false, the file resets after a few minutes to what it was before. This happens on the head node and on all the compute nodes.

    Any suggestions? How can I make my changes to the hosts file persist?



    Tuesday, March 14, 2017 10:32 PM
  • HPC Pack manages the hosts file on compute nodes and the headnode (not on unmanaged server nodes or workstation nodes) so that our services communicate over the right network interface. For example, if your topology has three networks configured, we make sure that:

    1. Connections to AD and the internet go to "Enterprise.xxxx #HPC"
    2. Traffic from our services, such as deployment, management, and metric data, goes to "Private.xxx #HPC"
    3. Traffic from applications, such as MPI communication and SOA requests/responses, goes to "Application.xxx #HPC"

    When a node is added to or removed from the cluster, the cluster updates the IP mapping table and pushes it to all compute nodes so that the above rules are followed. If you want to manage the hosts file yourself, you just need to change "true" to "false" in the line "# ManageFile = true". Please note that you must not remove the "#". I suppose you removed the "#" in your testing, right?
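
    To make the point about the "#" concrete, here is a minimal Python sketch that reads and flips the "# ManageFile = ..." line in hosts-file text while leaving the leading "#" in place. The helper names are made up, and accepting extra whitespace or mixed case is an assumption of the sketch; how the HPC service itself parses the line is not described in this thread.

```python
import re

# Matches the control line quoted in this thread, e.g. "# ManageFile = true".
# The leading "#" is required; a line without it is deliberately not matched.
_MANAGE_RE = re.compile(r"^(\s*#\s*ManageFile\s*=\s*)(true|false)\s*$", re.IGNORECASE)


def get_manage_flag(text):
    """Return True/False from the '# ManageFile = ...' line, or None if absent."""
    for line in text.splitlines():
        m = _MANAGE_RE.match(line)
        if m:
            return m.group(2).lower() == "true"
    return None


def set_manage_flag(text, enabled):
    """Return text with the flag rewritten, keeping the leading '#' intact."""
    value = "true" if enabled else "false"
    out = []
    for line in text.splitlines():
        m = _MANAGE_RE.match(line)
        out.append(m.group(1) + value if m else line)
    return "\n".join(out) + "\n"


# Hypothetical file content; the address and name are invented.
before = "# ManageFile = true\n10.0.0.1 HEADNODE #HPC\n"
after = set_manage_flag(before, False)
print(after)
```

    Note that `get_manage_flag("ManageFile = false\n")` returns `None` in this sketch: without the "#" the line is not recognized as the control line, which matches the symptom of the file being reset.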

    Qiufang Shi

    Wednesday, March 15, 2017 1:25 AM
  • You're right, I had been removing the "#".

    So now it is working as expected.

    Thank you.

    Wednesday, March 15, 2017 4:05 PM
  • Hi, does this resolve node connectivity issues, such as a node not being reachable?
    • Edited by HPCMan Wednesday, July 8, 2020 3:07 AM
    Wednesday, July 8, 2020 3:06 AM
  • Hi HPCMan,

    I suppose this hosts file should solve DNS problems that can make a node unreachable. However, a node can also be unreachable because of other network connectivity issues.
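
    One way to tell the two cases apart is to check name resolution and the TCP connection separately. The Python sketch below does that with the standard `socket` module; the function name, the result labels, and the timeout are arbitrary choices for illustration, and it says nothing about how HPC Pack itself decides that a node is unreachable.

```python
import socket


def check_node(host, port, timeout=3.0):
    """Classify a connection attempt as 'dns_failure', 'connect_failure', or 'ok'."""
    try:
        # Uses the OS resolver, which consults the hosts file as well as DNS.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"  # the name did not resolve at all
    for family, socktype, proto, _canon, addr in infos:
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(addr)
                return "ok"
        except OSError:
            continue  # refused, timed out, or unroutable; try the next address
    return "connect_failure"  # the name resolved, but no address accepted the connection
```

    For example, calling `check_node("MSE-CL-Node1", 445)` from another node (the port here is only an example) would return "dns_failure" if the name cannot be resolved, and "connect_failure" if it resolves but the connection is blocked or the target is down.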


    Yutong Sun

    Wednesday, July 8, 2020 9:01 AM
  • Hi Yutong Sun,
    I have corrected the hosts file, which had a stale entry for a server (the HPC head node), but I still see the "nodes not reachable" error message on the compute nodes when trying to reach the DB server. Have you had these issues before?

    Monday, July 13, 2020 3:46 PM