HPC diagnostics - DNS test fails

  • Question

  • We have a POC cluster running HPC R2-Beta2. The cluster is set up in topology number 3 (compute nodes on the private and application networks only). The cluster was recently moved from one AD domain to another. During the move, the compute nodes were joined to the new domain with their public IPs from the old domain, and as a result DNS records for those IPs were created in the AD domain DNS zone. When I run the tests, all networking tests pass, but the DNS tests on all compute nodes fail with the same error, shown below.

    Failure

    • IP address 10.23.80.59 is associated with this node in the DNS server. This IP address is not expected to be associated with this node.

    Enterprise interfaces on the compute nodes are now disabled, so they can only NAT through the head node. I went in and cleaned up the public IPs registered with the public DNS, waited for replication to clean up all DNS servers, and then tried again. Same result: the test still fails. The local hosts file on the head node is fine, with the right private and application network addresses for the compute nodes.

    My question is: where is the cluster still seeing those stale public IPs? Is there anywhere else I should be looking?

     

    Wednesday, May 12, 2010 5:35 PM

Answers

  • Hi Kamran

    Are you still seeing this issue? If so, what do you see in the DNS cache on the head node (ipconfig /displaydns)? How about the same on the compute nodes?

    If there are still records in the cache(s), clear them out using ipconfig /flushdns and retry the diagnostics.

    I can certainly replicate the diagnostic failure on my test cluster here by adding a spurious record to my 'enterprise' DNS, but it's quickly resolved when the record is deleted. Do records in your enterprise DNS zone have an extended TTL?
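To see which addresses a node name still resolves to after flushing the caches, a small script can help. This is only a rough sketch, not part of the HPC diagnostics: the subnets in EXPECTED_NETWORKS are made-up placeholders for the cluster's private and application ranges, and the function names are hypothetical. It resolves names through the operating system's resolver, so it should reflect what the local machine sees (hosts file and DNS included), and it flags any address outside the expected ranges.

```python
import ipaddress
import socket
import sys

# Placeholder subnets (assumption): substitute the cluster's real
# private and application network ranges.
EXPECTED_NETWORKS = ["192.168.10.0/24", "192.168.20.0/24"]


def unexpected_addresses(addresses, expected_networks):
    """Return the addresses that fall outside every expected network."""
    networks = [ipaddress.ip_network(n) for n in expected_networks]
    return [
        addr
        for addr in addresses
        if not any(ipaddress.ip_address(addr) in net for net in networks)
    ]


def resolve_ipv4(hostname):
    """Resolve a host name to its IPv4 addresses via the OS resolver."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Usage: python check_dns.py NODE1 NODE2 ...
    for node in sys.argv[1:]:
        try:
            resolved = resolve_ipv4(node)
        except socket.gaierror as exc:
            print(f"{node}: lookup failed ({exc})")
            continue
        stale = unexpected_addresses(resolved, EXPECTED_NETWORKS)
        status = f"unexpected: {stale}" if stale else "ok"
        print(f"{node}: {resolved} -> {status}")
```

Running it on the head node and on a compute node, with the compute node names as arguments, would show whether an address such as 10.23.80.59 is still being returned from either side.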

    Cheers

    Dan

    • Marked as answer by Don Pattee Friday, February 4, 2011 10:12 PM
    Friday, May 14, 2010 8:51 AM