HPC Pack 2016 Update 1: Preconfigured node not visible in Add Node Wizard

    Question

  • Problem:

    Can’t add preconfigured compute node. Can’t even see it.

    Context:

                Head Node:

                            Dell R610 hostname HTC03

                            Windows Server Standard 2016

                            Domain “domain” joined

                            Topology 2: Enterprise (1 GbE) + Private (10 GbE). No NAT, no DHCP on Private network.

                            HPC Pack 2016 Update 1 with May28 Fix

                            Private network: 10.0.1.0/24

                            Enterprise network: 10.0.0.0/24

     

                Compute Node

                            Dell M610 in M1000e enclosure (Blade 1 hostname Server01)

                            Windows Server Standard 2008 R2 SP1

                            Domain “domain” joined

                            HPC Pack 2016 Update 1 with May28 Fix.

                            Point HPC at head node HTC03.domain (using FQDN) during install.

    Observation:

                Head and Compute can ping each other on both E and P networks.

                For the Compute node certificate, I just copied the Head node certificate to the Compute node and imported it. Same certificate as found on Head node in C:\Program Files\Microsoft HPC Pack 2016\Data\InstallShare\Certificate\HpcCnCommunication.pfx.

                Configuration of the Head node allowed me to specify adapters for P and E networks. No such choice when installing HPC Pack on Compute node. These servers have a separate management network interface (separate from P and E). Perhaps Compute node is trying to use that interface?

                Compute node is up and running:

                            HpcManagement.exe

                            HpcMonitoringClient.exe

                            HpcNodeManager.exe

                            HpcSoaDiagMon.exe

                Add Node Wizard from Head node

                            “Add … node already preconfigured”

                            Use default preconfigured template (only action is “Activate Windows”)

                            No preconfigured nodes appear.
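
One way to narrow down the adapter question raised above is to check which cluster network each address on the compute node belongs to. The following is only a minimal sketch: it uses the two subnets listed in the context, the function name `classify` is made up for illustration, and the sample addresses are hypothetical placeholders for whatever ipconfig reports on the node.

```python
import ipaddress

# Subnets taken from the question; the labels are only for readability.
NETWORKS = {
    "Enterprise": ipaddress.ip_network("10.0.0.0/24"),
    "Private": ipaddress.ip_network("10.0.1.0/24"),
}


def classify(address: str) -> str:
    """Return the cluster network an IPv4 address belongs to, or 'other'."""
    ip = ipaddress.ip_address(address)
    for name, network in NETWORKS.items():
        if ip in network:
            return name
    return "other"


if __name__ == "__main__":
    # Hypothetical addresses; substitute the ones reported on the compute node.
    for addr in ("10.0.0.21", "10.0.1.21", "192.168.50.21"):
        print(addr, "->", classify(addr))
```

An address that comes back as "other" would be a candidate for the separate management interface.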

     

     

     



    • Edited by RokShox Sunday, 24 June 2018 5:01 AM Clarified no NAT DHCP
    Saturday, 23 June 2018 4:15 AM

Answers

  • Thanks for sharing the logs. It is a certificate issue. You chose to use two different self-signed certificates on your head node and compute node (which is recommended for an on-premises cluster), so when installing the compute node you should also have chosen to install HpcHnPublicCert.cer, as per step 9 of "Install Microsoft HPC Pack 2016 on the computer", so that the compute node trusts the self-signed certificate of the head node. Since that was not done during the compute node installation, you should manually import HpcHnPublicCert.cer into the "Local Computer\Trusted Root CA" store on the compute node to fix the issue.
    • Marked as answer by RokShox Tuesday, 3 July 2018 9:32 AM
    Tuesday, 3 July 2018 8:50 AM
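
Before importing, it may help to confirm that the copy of HpcHnPublicCert.cer on the compute node is the same certificate as the one on the head node. The sketch below is not part of HPC Pack: the helper names `cert_thumbprint` and `import_command` are hypothetical. It computes the SHA-1 thumbprint of a .cer file (DER or Base64 encoded) so that the two copies can be compared, and builds a `certutil -addstore Root` command line, which to my knowledge adds a certificate to the local machine Trusted Root store when run from an elevated prompt. The command is only constructed here, not executed.

```python
import hashlib
import ssl


def cert_thumbprint(data: bytes) -> str:
    """SHA-1 thumbprint of a certificate given as DER bytes or Base64 (PEM) text."""
    if b"-----BEGIN CERTIFICATE-----" in data:
        # Base64-encoded .cer: strip surrounding whitespace and decode to DER first.
        data = ssl.PEM_cert_to_DER_cert(data.decode("ascii").strip())
    return hashlib.sha1(data).hexdigest().upper()


def import_command(cer_path: str) -> list:
    """Build (but do not run) a certutil command for the Trusted Root store.

    Assumption: the command is run from an elevated prompt on the compute node.
    """
    return ["certutil", "-addstore", "Root", cer_path]
```

To use it, read both files in binary mode, for example `cert_thumbprint(open(path, "rb").read())`, and compare the two results. If they match, the list returned by `import_command` can be passed to `subprocess.run(..., check=True)` on the compute node, or the import can be done by hand in the Certificates snap-in.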

All replies

  • I disabled firewall on Head node. Still can't see Compute node.

    Doesn't anyone have a suggestion?

    Sunday, 24 June 2018 5:13 AM
  • Hello,

    Could you send C:\Program Files\Microsoft HPC Pack 2016\Data\LogFiles\Management\HpcManagement_AA_00000.bin from the compute node to hpcpack@microsoft.com?

    Monday, 25 June 2018 2:43 AM
  • I sent the file. There was also an ...AA_000001.bin; I sent a link to that one as well.
    Tuesday, 26 June 2018 3:21 PM
  • Now my head node is reporting a Node Health error on "Node Connectivity". I bet this is some kind of certificate problem.


    • Edited by RokShox Tuesday, 3 July 2018 9:33 AM remove dead link
    Saturday, 30 June 2018 6:03 AM
  • Replied by email.
    Tuesday, 3 July 2018 3:26 AM