All of our nodes except the head node use DHCP. The DHCP server is our firewall. We upgraded one of our clusters to 2008 R2, and ever since, all of the DHCP clients drop out of DNS and will not re-register unless we go into the DNS
properties of the NIC, check the boxes for "Register this connection's addresses in DNS" and "Use this connection's DNS suffix in DNS registration", and force a registration. That works, but after the lease times out or a server reboots, the boxes become unchecked
and the nodes once again remove themselves from DNS.
After dealing with my compute nodes having an "Error" status instead of "OK" for 3 weeks, I landed on your thread here which helped me determine the problem.
I found the following solution, which seems to be holding through many reboots of the server and the nodes:
1) On each compute node, right-click the "HPC PowerShell" application and select "Run as administrator".
2) Type in the following: Set-HpcNetwork -PrivateDnsRegistrationType WithConnectionDnsSuffix
This permanently checks the 2 boxes that you mention.
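If you want to confirm that the two boxes really stay checked after a reboot or lease renewal, something like the following may help. It is only a sketch, assuming the stock WMI class Win32_NetworkAdapterConfiguration is available on the node. As far as I know, FullDNSRegistrationEnabled corresponds to "Register this connection's addresses in DNS" and DomainDNSRegistrationEnabled to "Use this connection's DNS suffix in DNS registration". I have not tried this on every HPC Pack version.

```powershell
# Run in an elevated PowerShell session on a compute node.
# Show the state of the two DNS registration checkboxes for every IP-enabled adapter.
# FullDNSRegistrationEnabled   -> "Register this connection's addresses in DNS"
# DomainDNSRegistrationEnabled -> "Use this connection's DNS suffix in DNS registration"
Get-WmiObject -Class Win32_NetworkAdapterConfiguration -Filter "IPEnabled = TRUE" |
    Select-Object Description, DHCPEnabled, FullDNSRegistrationEnabled, DomainDNSRegistrationEnabled |
    Format-List

# Force an immediate re-registration instead of waiting for the next lease renewal.
ipconfig /registerdns
```

If both properties still show True after a reboot, the setting has held. If either shows False, the node will probably drop out of DNS again at the next lease renewal.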
I do not like the fact that the Head Node wants to communicate with the Compute Nodes using the FQDN of the Enterprise domain (so there have to be DNS entries on an Enterprise DNS Server), but I haven't found a way to get around this.