Adding workstation nodes in HPC Pack 2016

  • Question

  • I am using Microsoft HPC Pack 2016 Update 2 (5.2.6277.0) on an on-premises cluster on a local network. We use topology 5 (all nodes on the enterprise network). The head node is set up and running successfully.

    The problem is that after manually installing HPC Pack on several Windows 10 workstations, all on the same local network, some of them cannot be found and added to the cluster using HPC Cluster Manager.

    To be clear, I took exactly the same steps for each workstation in the installation wizard, i.e. same SSL, same head node, and so on. While some nodes were added successfully (particularly the fresh installs), others were not. They do not even show up as pending in HPC Cluster Manager running on the head node.

    My rough guess is that uninstalling HPC Pack may have left some entries behind in the registry: I have not seen the problem on fresh installs so far, but some workstations that previously had HPC Pack installed fail to be added to the cluster. Is there any way to track down the cause? Thanks.


    Thursday, June 27, 2019 9:53 PM

Answers

  • After reading the log files, it turned out that in my case the issue was a broken trust relationship. This can be checked by running the nltest /trusted_domains command in cmd. Resetting the trust relationship fixed the problem.
    • Marked as answer by Ramin Mafi Tuesday, July 2, 2019 8:54 PM
    Tuesday, July 2, 2019 8:54 PM
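
When several workstations need checking, the command from the answer above could be wrapped in a small script. The following is only a minimal sketch: the function name `check_trusted_domains` and its `run` parameter are hypothetical, and it assumes that a non-zero exit code from nltest means the trust relationship needs a closer look. It has to be run on the Windows workstation itself.

```python
import subprocess


def check_trusted_domains(run=subprocess.run):
    """Run `nltest /trusted_domains` and return (succeeded, combined_output).

    `run` is injectable so the function can be exercised without nltest.
    """
    result = run(
        ["nltest", "/trusted_domains"],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output as text
    )
    # Assumption: a non-zero exit code means the query failed and the
    # workstation's trust relationship should be investigated further.
    succeeded = result.returncode == 0
    return succeeded, result.stdout + result.stderr
```

Calling `check_trusted_domains()` on an affected workstation returns a flag plus the raw nltest output, which can then be compared with the output from a workstation that joined the cluster without problems.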

All replies

  • Hi Ramin,

    Could you send the following HPC management logs to hpcpack@microsoft.com? 

    C:\Program Files\Microsoft HPC Pack 2016\Data\LogFiles\Management\HpcManagement_<index>.bin

    Please send the three files with the latest indexes.
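
Picking out those files by hand is easy to get wrong when the folder holds many logs, so the selection can be sketched as a short script. This is a minimal sketch and not part of HPC Pack: the function name `latest_management_logs` is hypothetical, and it assumes that the files with the latest indexes are also the most recently modified ones.

```python
from pathlib import Path


def latest_management_logs(log_dir, count=3):
    """Return the `count` most recently modified HpcManagement_*.bin files."""
    candidates = Path(log_dir).glob("HpcManagement_*.bin")
    # Assumption: modification time stands in for "latest index".
    newest_first = sorted(
        candidates,
        key=lambda path: path.stat().st_mtime,
        reverse=True,
    )
    return newest_first[:count]
```

For example, `latest_management_logs(r"C:\Program Files\Microsoft HPC Pack 2016\Data\LogFiles\Management")` would return up to three `Path` objects, newest first, ready to be attached to an email.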

    Friday, June 28, 2019 3:08 AM
  • Thanks Sunbin,

    I just emailed the files. Here is some extra information that I forgot to include in my first post:

    My workstation is one example of a machine that is not visible in HPC Cluster Manager. Since my user account has been added as an Administrator in HPC Cluster Manager, I can launch HPC Cluster Manager from my workstation and even run some jobs. Even so, my workstation does not appear as a node in the cluster and does not run any HPC computation.

    Friday, June 28, 2019 2:49 PM