Windows Server 2008 HPC Edition "No Internet connectivity in compute nodes"

    Question

  • I have set up an HPC 2008 Server and can deploy compute nodes without trouble, but I have one annoying problem that I cannot figure out.

    I can access the Internet on the head node without a problem, but if I try to access the Internet from a compute node, this breaks Internet connectivity on the head node. In the Network and Sharing Center you can see the connection being lost: the colored Internet globe is replaced by a grey globe, and a red X appears in place of the domain name. Under Manage network connections on the head node, I can also see the private NIC being disabled and then re-enabled automatically during this process, which takes a few seconds to complete.

    During this process, if I have IE open on the head node, I can no longer open any web pages. If I close IE and wait a few seconds before opening a new page, it works again.

    Meanwhile, IE on the compute node eventually shows a message at the bottom left reporting a DNS error.

    On the compute node, only the private LAN was active. If I also connect the enterprise LAN, the same problem occurs. If I unplug the private LAN, however, I can access a web page with IE without a problem, so something is wrong with the private LAN settings, but I don't understand what.

    Network topology on the head node is set to "1 Compute nodes isolated on a private network".
    On the head node the IPv4 settings are:

    private LAN
    IP address: 10.0.0.1
    Subnet mask: 255.255.0.0
    Preferred DNS Server: 127.0.0.1 (also tried 10.0.0.1)

    enterprise LAN
    IP address: 172.0.1.48
    Subnet mask: 255.255.0.0
    Default gateway: 172.0.0.3
    Preferred DNS Server: 172.0.0.1

    Motherboard is a Super Micro X8DTT for headnode and compute nodes
    LAN drivers used are Intel 14.0

    Windows Server 2008 HPC Edition with SP1
    (no updates have been applied yet)


    Thank you in advance for any suggestions.

    Gilles

    Monday, December 14, 2009 8:59 PM

Answers

  • Hi Gilles,

    With network topology 1, I would recommend NOT specifying a preferred DNS server for the head node's private network interface since, by default, the head node is not actually a DNS server.  The compute nodes, however, should point to the head node (10.0.0.1) for both DNS and gateway because routing/NAT is configured on the head node.
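As a rough illustration of the recommendation above, the following Python sketch checks that a compute node's private-LAN settings point at the head node for both gateway and DNS. The head node address and mask come from the question; `check_compute_node` and the example compute node address `10.0.0.25` are hypothetical names and values introduced only for this example, not part of any HPC tool.

```python
import ipaddress

# Values taken from the thread: the head node's private interface.
HEAD_NODE_PRIVATE_IP = "10.0.0.1"
PRIVATE_NETMASK = "255.255.0.0"


def check_compute_node(ip, gateway, dns,
                       head_ip=HEAD_NODE_PRIVATE_IP,
                       netmask=PRIVATE_NETMASK):
    """Return a list of problems with a compute node's private-LAN settings.

    Hypothetical helper: it only compares addresses, it changes nothing.
    """
    head = ipaddress.ip_address(head_ip)
    # strict=False derives the network from a host address plus mask.
    network = ipaddress.ip_network(f"{head_ip}/{netmask}", strict=False)

    problems = []
    if ipaddress.ip_address(ip) not in network:
        problems.append(f"{ip} is not inside the private network {network}")
    if ipaddress.ip_address(gateway) != head:
        problems.append(f"default gateway is {gateway}, expected {head}")
    if ipaddress.ip_address(dns) != head:
        problems.append(f"DNS server is {dns}, expected {head}")
    return problems
```

Calling `check_compute_node("10.0.0.25", "10.0.0.1", "10.0.0.1")` returns an empty list, while a node whose DNS entry points at the enterprise DNS server would get one entry describing the mismatch. The values to feed in can be read from `ipconfig /all` on the compute node.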

    If you continue to encounter problems after making this change, could you please share 'ipconfig /all' for both the head node and a compute node along with the full text of the DNS error that you mentioned?

    Thanks,
    --Brian

    Monday, December 14, 2009 10:31 PM

All replies

  • Hi,

     I have already seen a case where the head node was running NAT and, as soon as a compute node tried to resolve a name through the head node, a NIC on the head node failed.

     Can you test the following:
         From the head node: ping -t [server_on_public]

         From a compute node:
           ping [ipaddress_server_on_public]
              => check on the head node whether its ping is still OK

           ping [name_server_on_public]
              => check on the head node whether its ping is still OK
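The compute-node half of the test above could be scripted roughly as follows. This is only a sketch: `ping_once`, `diagnose`, and `run_test` are hypothetical helper names, and ping's exit status is treated as a rough success signal, which may not be fully reliable on every Windows version. Comparing ping by address with ping by name separates a routing/NAT failure from a name-resolution failure.

```python
import platform
import subprocess


def ping_once(target, timeout_s=5):
    """Send one echo request; return True if ping exits with status 0."""
    # Windows ping takes -n for the count, most Unix-like pings take -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", target],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def diagnose(ip_ok, name_ok):
    """Interpret the two ping results (by address, then by name)."""
    if ip_ok and name_ok:
        return "routing and name resolution both work"
    if ip_ok:
        return "routing works, name resolution fails: check DNS settings"
    if name_ok:
        return "name answered but address did not: recheck the address used"
    return "no connectivity through the head node: check gateway and NAT"


def run_test(server_ip, server_name):
    """Ping a public server by address and by name from a compute node."""
    return diagnose(ping_once(server_ip), ping_once(server_name))
```

On a compute node, `run_test` would be called with the address and the name of the same server on the public network, while `ping -t` keeps running on the head node to show whether its own connectivity drops at the same moment.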

    In my case, I solved it by using the topology with compute nodes on both the private and public networks and no NAT on the head node (connectivity to the domain went through the NIC on the public network). In this case you will need a DHCP server on the public network!

    Best regards, Tom
    Wednesday, December 23, 2009 5:03 PM
  • Hi Brian,

    I stopped working on this issue a while back as I had other projects, but a colleague continued. The problem we were having was related to the Intel network driver and the onboard NIC of the motherboard used.

    After updating to the latest Intel 14.8 drivers, the problem was solved.

    I don't know all the details, but thought it would be a good idea to mention this in case others run into similar problems.

    Regards,
    Gilles
    Thursday, February 11, 2010 4:07 PM