Duplicate nodes

  • Question

  • Hi Everyone

    I am using Microsoft HPC Pack 2016 (5.0.5826.0).

    Problem: I have an issue with my nodes: some of them are duplicated. The duplicates appeared after I did the steps below.

    1. I installed the head node (without high availability).

    2. I installed a compute node. There was some issue connecting to that server: the node was in the Unknown state with Unapproved health.

    3. In Cluster Manager I went to Resource Management, selected the Unapproved node, right-clicked it and chose Delete. It was deleted successfully. I did not uninstall the compute node software from that server; I only deleted the node from Cluster Manager.

    4. The next day, instead of one head node, I saw 3 nodes:

       1. Head node --- State: Online --- Health: OK
       2. Head node --- State: Provisioning --- Health: Transitional
       3. Compute node --- State: Unknown --- Health: Unapproved

    I uninstalled that compute node and restarted the head node server. It did not help, nothing changed. I still see the same 3 nodes with the same statuses.

    Questions:

    1. How can I solve it?

    2. Why did it happen?



    Monday, December 25, 2017 8:45 AM

Answers

  • HPC Pack 2016 uses a Service Fabric cluster with 3 virtual nodes for one head node. The HPC management service should run on only one virtual node at a time. I guess that in your cluster the service somehow ran on two virtual nodes for a while, so two head nodes are displayed. You can try deleting the head node that is in the Provisioning state.

    For the compute node, the behavior is as expected. When a compute node is installed, it reports itself to the head node and is shown as "unapproved" in Cluster Manager. If you delete it from Cluster Manager, it will report itself again. If you want it to disappear for good, you need to uninstall the compute node AND then remove it from Cluster Manager manually.
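
    To make the order of operations concrete, here is a toy Python model of the behavior described above. It is only a sketch and not HPC Pack code: the class names, the `heartbeat` method and the node name `CN01` are all made up for illustration, and the real reporting mechanism is certainly more involved.

```python
class HeadNode:
    """Toy stand-in for the head node's node list (hypothetical, not HPC Pack code)."""

    def __init__(self):
        self.nodes = {}  # node name -> health shown in the node list

    def receive_report(self, name):
        # A node the head node does not know yet is listed as Unapproved.
        self.nodes.setdefault(name, "Unapproved")

    def delete(self, name):
        # Models "Delete" in Cluster Manager: removes the entry only.
        self.nodes.pop(name, None)


class ComputeNode:
    """Toy stand-in for a server with the compute node software installed."""

    def __init__(self, name):
        self.name = name
        self.installed = True

    def heartbeat(self, head):
        # Only a server that still has the software installed reports itself.
        if self.installed:
            head.receive_report(self.name)


def demo():
    head = HeadNode()
    node = ComputeNode("CN01")  # hypothetical node name

    node.heartbeat(head)        # node shows up as Unapproved
    head.delete("CN01")         # deleted from the list only
    node.heartbeat(head)        # still installed, so it reports again
    reappeared = "CN01" in head.nodes

    node.installed = False      # uninstall first...
    head.delete("CN01")         # ...then delete from the list
    node.heartbeat(head)        # nothing is reported any more
    gone = "CN01" not in head.nodes
    return reappeared, gone


if __name__ == "__main__":
    print(demo())
```

    In this model, deleting alone lets the entry come back on the next report, while uninstalling and then deleting keeps it gone, which matches what was seen on the cluster.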

    By the way, HPC Pack 2016 Update 1 has already been released. It fixes many bugs found in HPC Pack 2016, and it removes the dependency on a Service Fabric cluster for a single head node. Since your HPC Pack 2016 cluster isn't working well, we recommend that you uninstall it and install HPC Pack 2016 Update 1.

    Download HPC Pack 2016 Update 1 from https://www.microsoft.com/en-us/download/details.aspx?id=56360. Right-click the downloaded zip file, choose Properties, and unblock it if there is a "blocked" security warning. Extract it to a local folder, for example d:\, and then:

    1. Run the following command to remove the Service Fabric cluster from the head node, since it is no longer needed for a single head node in HPC Pack 2016 Update 1:

    d:\5.1.6086.0\ServiceFabric\CleanFabric.ps1

    2. Uninstall the HPC Pack 2016 components from Control Panel.

    3. Install the HPC Pack 2016 Update 1 head node.

    For the compute node, uninstall the HPC Pack 2016 components from Control Panel and install HPC Pack 2016 Update 1 as a compute node.

    • Marked as answer by Artem Azaryan Sunday, January 7, 2018 8:37 PM
    Wednesday, January 3, 2018 3:37 AM

All replies

  • Thanks!!!
    Sunday, January 7, 2018 8:41 PM