Cannot Remove Compute Node

    Question

  • Hello, 

    I am running HPC Manager 2008 R2 on both the head node and the compute nodes. I wanted to delete a compute node from the cluster, so I right-clicked the node and selected Delete. That did not work: the node goes to the Unapproved list. I tried Reject, deleting it from the command line, uninstalling HPC Pack from the compute node, and many other things, but I am still not able to remove it. Can someone please tell me how to do it?

    Thank you

    Thursday, April 7, 2016 2:35 PM

All replies

  • Hi,

      Your first attempt at deleting actually works. The node goes to the Unapproved list because the compute node reports itself to the system again. To completely remove the node, try:

    1. Take the node offline first, so that no job is impacted

    2. Go to the compute node and uninstall HPC Pack from the system, so that it won't report itself again

    3. From the admin console UI, select the node and delete it


    Qiufang Shi

    Friday, April 8, 2016 1:03 AM
  • Thank you for the reply. I had already done those steps. I uninstalled HPC Pack from the node and tried to delete it, but had no success. I might have messed up the system since I tried so many things. Let me list some of the things I did :)

    - I renamed the node that I want to remove (from HPC Manager on the head node) and deleted it. This puts it directly in the Unapproved list. I could not do anything with these nodes. When I try to rename those nodes back to the original name, I get the error "Error 7000: Collection was modified, enumeration may not execute".

    - I tried to install HPC Pack from scratch on the node which I had removed: first I uninstalled HPC Pack and all its components, then I reinstalled it. But I could not add this node to HPC Manager as a compute node; HPC Manager could not detect the node.

    In short, I am still not able to remove the nodes in the Unapproved list :) Any further help will be appreciated. 

    Thank you,

    Mak


    Friday, April 8, 2016 7:00 AM
  • Hi, Mak,

    What do you mean by renaming the node? Did you rename the hostname on that compute node?

    Also, can you check whether there are any executing operations in HpcClusterManager? (From the left navigation pane, select "Node Management", then select "Operations".)

    For example, if the old name is node1 and the new name is node2, then in your cluster node1 is already removed but node2 is still in the Unapproved list, right?

    To remove compute nodes, you don't need to reinstall HPC on that compute node.

    Thanks,

    Yongjun

    Friday, April 8, 2016 10:38 AM
  • Hi Yongjun,

    I changed the name on HPC Manager, not the actual host name of the machine:

    1. Node Management->By Node Health->Unapproved group

    2. Right click on one of the nodes, let's say node10

    3. Select Edit Properties

    4. In Edit Node Properties window, change Name to Node 15

    If I try to delete the node in the Unapproved list, it first goes to the Transitional list while the deletion is attempted, then goes back to the Unapproved list with errors:

    Reverted
    The operation failed due to errors during execution.
    Disassociating template from node <NODE NAME>
    The operation failed and will not be retried.
    Could not contact node 'NODE NAME' to perform change. The management service was unable to connect to the node using any of the IP addresses resolved for the node.


    For your question "for example, the old name is node1, then the new name is node2, now in your cluster, the node1 is already removed, still has node2 in unapproved list, right?", yes, the new node, node2 in your example, is still in unapproved list. 

    Thank you again,

    Mak





    • Edited by Mak_Mak Friday, April 8, 2016 11:01 AM
    Friday, April 8, 2016 10:51 AM
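
    The "Could not contact node" error quoted above says that the management service could not connect to the node using any of the IP addresses resolved for it. A minimal sketch of how that could be checked by hand from the head node is below. It is only a generic name-resolution and TCP reachability test, not how HPC Pack itself performs the check. The function names are made up for illustration, and the port is a parameter because this thread does not say which port the management service uses.

```python
import socket


def resolve_addresses(hostname, port):
    """Return the unique IP addresses the name resolves to (empty list if none)."""
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name does not resolve at all.
        return []
    addresses = []
    for info in infos:
        address = info[4][0]  # sockaddr tuple starts with the IP address
        if address not in addresses:
            addresses.append(address)
    return addresses


def can_connect(address, port, timeout=3.0):
    """Attempt one TCP connection; True only if the connection is accepted."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(hostname, port, timeout=3.0):
    """Map every resolved address to whether a TCP connection succeeded."""
    return {
        address: can_connect(address, port, timeout)
        for address in resolve_addresses(hostname, port)
    }


# Hypothetical usage: "node10" is the example name from this thread, and
# some_port must be replaced with a port the node is expected to listen on.
# print(check_node("node10", some_port))
```

    An empty result means the name did not resolve at all. A result where every address maps to False would match the situation the error describes: a stale record for a node that was renamed only in HPC Manager, or one where HPC Pack has been uninstalled, would be expected to look like this.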
  • Hi, Mak,

    I see your point now. I tried to reproduce your issue on our latest release, but could not reproduce it.

    So it seems there are two options to solve this issue:

    1. Can you upgrade to HPC 2012 R2 Update 3?

    2. If not, you may need to uninstall the head node and then reinstall it (you can export some data first, such as node templates, job templates, etc.)

    Thanks,

    Yongjun

    Friday, April 8, 2016 11:26 AM
  • Hi Yongjun,

    I can probably do the second one. I have Windows 2008 on the head node, and HPC Pack 2012 requires Windows 2012 on the head node, so the first option would require me to upgrade the OS as well. But thank you very much for all the help.

    Have a nice day,

    Mak

    Friday, April 8, 2016 1:36 PM