HPC diagnostics - DNS test fails
12 May 2010 17:35
We have a POC cluster of HPC R2-Beta2. The cluster is set up in topology number 3 (compute nodes on private and application networks only). The cluster was recently moved from one AD domain to another. During the process, the compute nodes were joined to the new domain with their public IPs from the old domain; as a result, DNS records for them were created in the AD domain DNS zone. When I run the tests, all networking tests pass, but the DNS tests on all compute nodes fail with the same error below:
Failure
- IP address 10.23.80.59 is associated with this node in the DNS server. This IP address is not expected to be associated with this node.
The Enterprise interfaces on the compute nodes are now disabled, so they can only NAT through the head node. I went in and cleaned up the public IPs registered with the public DNS, waited for replication to clean up all DNS servers, and then tried again. Same result: the test still fails. The local hosts file on the head node is fine, with the right private and application network addresses for the compute nodes.
My question is: where is the cluster still seeing those stale public IPs? Is there anywhere else I should be looking?
All replies
14 May 2010 08:51
Hi Kamran
Are you still seeing this issue? If so, what do you see in the DNS cache on the head node (ipconfig /displaydns)? How about the same on the compute nodes?
If there are still records in the cache(s), clear them out using ipconfig /flushdns and retry the diagnostics.
I can certainly replicate the diagnostic failure on my test cluster here by adding a spurious record to my 'enterprise' DNS, but it's quickly resolved when the record is deleted. Do records in your enterprise DNS zone have an extended TTL?
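As a rough way to see what a machine's resolver currently returns for each compute node (hosts file, local cache and DNS server combined), something like the Python sketch below may help. It is only an illustration: the function names, the node names and the private/application subnets are hypothetical placeholders, not values from this cluster, and it does not reproduce what the HPC diagnostic itself checks.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def unexpected_addresses(addresses, expected_networks):
    """Return the addresses that fall outside every expected network."""
    networks = [ipaddress.ip_network(n) for n in expected_networks]
    return [
        addr for addr in addresses
        if not any(ipaddress.ip_address(addr) in net for net in networks)
    ]


def check_nodes(node_names, expected_networks):
    """Map each node name to the resolved addresses that look stale."""
    report = {}
    for name in node_names:
        try:
            resolved = resolve_ipv4(name)
        except socket.gaierror:
            # Name did not resolve at all; record that rather than failing.
            report[name] = None
            continue
        report[name] = unexpected_addresses(resolved, expected_networks)
    return report
```

Run on the head node and on a compute node, a call such as `check_nodes(["NODE01", "NODE02"], ["192.168.0.0/24", "172.16.0.0/24"])` (placeholder names and subnets) would list any address outside the private and application ranges. If an address like 10.23.80.59 still shows up after the enterprise zone has been cleaned, a cached or not-yet-expired record is a likely suspect.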
Cheers
Dan
- Marked as answer by Don Pattee (Moderator), 4 February 2011 22:12