Clean up orphaned head node name on the network

  • Question

  • Does anybody on the forum know how HPC Cluster Manager detects the head node name on the network?

    We replaced a slow HPC cluster with a new HPC cluster, but HPC Cluster Manager still detects the old name even though the hardware is gone. It doesn't affect production or job submission; I'm just curious how to remove the old name from the network completely.

    HPC2008R2SP3 cluster

    Any suggestions are welcome, and thank you for your help.


    • Edited by lijun1234 Tuesday, December 24, 2013 4:54 AM
    Tuesday, December 24, 2013 4:53 AM

Answers

  • Before decommissioning an HPC cluster, it is better to uninstall the HPC components from the head node (from all head nodes in an HA setup). The uninstallation removes some objects from AD. In the compute node installation wizard, the drop-down list of head nodes is queried from AD.

    If the decommissioned head node still shows in the list, please try deleting the "MicrosoftComputeCluster" object under the head node's computer object in AD. For HA, please delete "MicrosoftComputeCluster" from every head node's computer object in AD. If the machine itself has been decommissioned, deleting the whole computer object in AD should also fix the problem.
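
    To make it easier to find the objects in ADSI Edit, here is a minimal sketch that builds the distinguished names to look for. It assumes the child object's naming attribute is CN, and the head node names and the container DN are hypothetical placeholders. It only builds strings; it does not query or modify AD.

```python
# Characters that must be backslash-escaped inside a DN component value (RFC 4514).
# Leading '#'/space and trailing space are not handled here; computer names do not use them.
_DN_SPECIAL = '\\,+"<>;'


def escape_rdn_value(value: str) -> str:
    """Backslash-escape DN special characters in a single component value."""
    return "".join("\\" + ch if ch in _DN_SPECIAL else ch for ch in value)


def cluster_object_dns(head_nodes, computers_container):
    """Return the expected DN of the "MicrosoftComputeCluster" child object
    for each head node (several head nodes in an HA setup).

    Assumption: the child object is named CN=MicrosoftComputeCluster and sits
    directly under the head node's computer object.
    """
    return [
        f"CN=MicrosoftComputeCluster,CN={escape_rdn_value(name)},{computers_container}"
        for name in head_nodes
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for dn in cluster_object_dns(["OLDHEAD01", "OLDHEAD02"],
                                 "CN=Computers,DC=example,DC=com"):
        print(dn)
```

    Each printed DN can then be located in ADSI Edit and checked by hand before anything is deleted.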

    Hope this helps.

    Best regards,
    Yizhong

     

    • Marked as answer by lijun1234 Friday, December 27, 2013 4:01 PM
    Thursday, December 26, 2013 8:59 AM

All replies

  • I don't think HPC Cluster Manager detects head nodes on the network through any magical approach (at least none that I am aware of).

    HPC Cluster Manager does remember the name of the head node it last connected to successfully (this should be recorded in the registry). As far as I know, once Cluster Manager has connected to the new HPC cluster successfully, it won't connect to the slow HPC cluster anymore.

    Hope this helps,

    Best regards,
    Yizhong

    Wednesday, December 25, 2013 7:29 AM
  • Hi Yizhong

    Thank you for the quick response. Your answer is yes and no. It's true that Cluster Manager remembers and always opens the last head node I connected to. But that is not automatic head node detection.

    Another good example is installing the HPC software on a fresh OS. In the installation wizard, choose the second option for installation type, which installs the machine as a compute node and joins an existing cluster. At this step, the user can type the head node name or click the drop-down menu to pick a head node. That is proof HPC can detect head nodes on the network, and that if an HPC head node is not decommissioned properly, its name will not disappear. That's what is happening on our network right now.

    I resolved a similar problem with a DHCP server. A DHCP server was not removed gracefully, and the DHCP management tool kept detecting it. In the end I removed the server name from the AD database with ADSI Edit.

    Do you have a proven working procedure for decommissioning an HPC head node properly?

    Thank you for your help.

    Thursday, December 26, 2013 3:40 AM
  • If a computer account is configured as a head node, its servicePrincipalName attribute contains a value with COMPUTECLUSTER in its name; deleting this value might resolve your problem. I have not tested this, however.

    1. On the domain controller, go to Server Manager / Roles / Active Directory Domain Services / Active Directory Users and Computers / Computers
    2. From the top menu bar of Server Manager, select View and then select Advanced Features
    3. Select your old head node's computer name and open its properties
    4. Select the Attribute Editor tab in the properties window
    5. Search for the servicePrincipalName attribute
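
    If the attribute holds many values, it may help to copy them out and sort them by service class (the part before the first "/"). The following is a rough sketch under that assumption; the sample values and host names are made up, and the COMPUTECLUSTER match is the same untested guess as above.

```python
from collections import defaultdict


def group_by_service_class(spns):
    """Group SPN strings of the form 'serviceclass/host[:port]' by service class."""
    groups = defaultdict(list)
    for spn in spns:
        service_class, _, rest = spn.partition("/")
        groups[service_class].append(rest)
    return dict(groups)


def find_matching(spns, keyword):
    """Return the SPN values that contain the keyword, ignoring case."""
    needle = keyword.lower()
    return [spn for spn in spns if needle in spn.lower()]


if __name__ == "__main__":
    # Hypothetical values copied from the Attribute Editor of an old head node.
    values = [
        "MSSQLSvc/oldhead01.example.com:1433",
        "HOST/OLDHEAD01",
        "HOST/oldhead01.example.com",
        "MSClusterVirtualServer/OLDHEAD01",
    ]
    print(group_by_service_class(values))
    print(find_matching(values, "COMPUTECLUSTER"))
```

    An empty result from the second call would simply mean no such value is registered on that account.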

    Daniel Drypczewski




    Friday, December 27, 2013 2:55 AM
  • Hi Yizhong

    As you said, I do see the MicrosoftComputeCluster object under the head node's computer account in AD. I remember that the sequence for setting up a head node is to install the HPC software, after which the system assigns the head node template automatically, and then to bring the head node online. So I did it in reverse: I took the head node offline, uninstalled all the HPC programs from it, and changed the system from domain-joined to workgroup. After that, I no longer see the MicrosoftComputeCluster object under the head node's computer account in AD, and HPC Cluster Manager no longer detects the decommissioned head node.

    But you said that deleting the whole computer object in AD should also fix the problem. That is not true in our case: we have head nodes shown in the Cluster Manager list whose computer accounts were purged a long time ago.

    Daniel's reply is also very helpful, even though there is no COMPUTECLUSTER value in the SPN attribute, because the compute cluster entry is an object, not an attribute. I say it's helpful because in the SPN attribute I see service names like MSSQLSvc and MSClusterVirtualServer. I guess that may be how SSMS detects SQL instances and how failover clustering detects failover nodes on the network.

    I appreciate the help from both of you.


    • Edited by lijun1234 Friday, December 27, 2013 4:19 PM
    Friday, December 27, 2013 4:17 PM