HPC Pack 2016 Update 3 deployment times out

  • Question

  • During the deployment of a compute node, the process times out at the point of copying the WIM file to the machine. The only message that appears in the logs is that it timed out.

    The template robocopies the WIM file to the compute node, and I verified that the copy finished, but it never makes it to the next step of the process, which creates the install folder.

    I am migrating from Windows HPC Pack 2012 Update 2 over to HPC Pack 2016 Update 3.

    The head node was completely rebuilt, and I am now trying to deploy the compute nodes.

    Any assistance is greatly appreciated.

    Wednesday, September 4, 2019 7:09 PM

Answers

  • Upgrading WinPE 5.x to version 10.x resolved my issue. Please refer to the WinPE 5.x thread for additional instructions.
    • Marked as answer by Ken Parr Tuesday, February 25, 2020 7:12 PM
    Tuesday, February 25, 2020 7:12 PM

All replies

  • It turns out that even though the firewall was turned off during the installation, the customer had not removed the GPO as he said he had.

    For whatever reason, provisioning would not continue past the copying of the WIM file until the firewall setting was removed from the GPO. I also reconfigured the network in the deployment, just to be safe.

    I hope this helps someone, as this was a little frustrating and I had looked at it a couple of times.

    Wednesday, September 4, 2019 9:14 PM
  • I am on another HPC Pack deployment for another customer, and it turns out this is happening again. This time, though, I have turned off the firewall through the GPO and that did not fix it.

    I can see that robocopy is transferring data, but not continuously. I have updated the drivers with no luck.

    I am not sure whether this is an issue in Update 3.

    Any ideas on what to try next?

    Thursday, December 12, 2019 3:47 AM
  • Did you see the same issue when you were using HPC Pack 2016 Update 2?

    There should be no changes to the bare-metal deployment part in Update 3.

    If "robocopy" didn't finish in time, you should check why the network is slow.

    Monday, December 16, 2019 4:18 AM
  • This is a new install so Update 2 was not used. I have not tried to redeploy with Update 2 since I could not get a hold of the DB Team to rebuild the remote DBs.

    Now the only problem is copying the WIM file down to the node. I am able to press CTRL+C to break out of the task and get to the command prompt.

    1) I am able to copy files from the compute node to the head node without issue. It is really quick.

    2) The copy of the boot.wim file does not take long and has no issue. The problem does not appear until it gets into the WinPE part and copies the WIM file prior to the install tasks.

    3) All other network troubleshooting has not shown any issues.

    I even found Dell WinPE 5.x drivers for the NIC and storage, and they did not resolve anything.

    No errors in any of the logs.

    Any other ideas?

    Monday, December 16, 2019 7:41 PM
  • So how did you copy files from the CN to the HN? Did you also use robocopy, and copy to Z: (which is actually mapped to the network share on the HN)?

    Could you try to manually copy the OS WIM file from Z: to C: (i.e. from the HN to the CN) with the "robocopy" and "copy" commands separately, and check whether it makes any difference in speed?

    Another thing you can do is check the size of the OS WIM file. Is it very large?

    Tuesday, December 17, 2019 2:43 AM
  • I have copied files both down and up.

    Copying the file down does not work with either command. Copying any file down either fails or is extremely slow.

    Only copying from the compute node to the head node has no issues. That is why this makes no sense.

    I have transferred files between X:, Z:, and C:, all with different results.

    The only time I have no issues copying files from the head node to the compute node is during the boot.wim copy. 

    The WIM file is just over 5 GB. It is the standard build from HPC Pack for Windows Server 2016, and it has 4 indexes in it.
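
For a sense of scale, here is a rough sketch of how long a file of about 5 GB should take to transfer at a few link speeds. The link speeds and the 0.8 efficiency factor are assumptions for illustration only, not measurements from this deployment; the function name is hypothetical.

```python
def estimated_copy_seconds(size_bytes, link_mbit_per_s, efficiency=0.8):
    """Rough transfer-time estimate for a file over a network link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable for the copy (protocol overhead, disk speed, etc.).
    """
    usable_bits_per_s = link_mbit_per_s * 1_000_000 * efficiency
    return size_bytes * 8 / usable_bits_per_s


if __name__ == "__main__":
    wim_bytes = 5 * 1024**3  # "just over 5 GB", rounded down to 5 GiB
    for link in (100, 1000, 10000):  # assumed link speeds in Mbit/s
        seconds = estimated_copy_seconds(wim_bytes, link)
        print(f"{link} Mbit/s: roughly {seconds:.0f} s")
```

If the observed copy takes far longer than an estimate like this for the actual link, the file size alone is unlikely to explain the timeout.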

    Tuesday, December 17, 2019 8:45 PM
  • That's really weird. Could you ping the head node by hostname from the WinPE cmd console on the compute node? Is the IP address of the head node in the same subnet as the compute node's IP address?
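
The subnet question can be checked by hand or with a small script run on any machine that has Python. This is a minimal sketch using the standard `ipaddress` module; the addresses and prefix length below are made-up placeholders, not values from this cluster, and the function name is hypothetical.

```python
import ipaddress


def same_subnet(ip_a, ip_b, prefix_len):
    """Return True if both addresses fall in the same network
    for the given prefix length (e.g. 24 for 255.255.255.0)."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


if __name__ == "__main__":
    # Placeholder addresses: substitute the head node and compute node
    # addresses reported by ipconfig on each machine.
    head_node = "10.0.0.1"
    compute_node = "10.0.0.57"
    print(same_subnet(head_node, compute_node, 24))
```

If the two addresses are not in the same subnet, traffic between the nodes is being routed, which is one more place to look for a slow path.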
    Monday, December 23, 2019 1:52 AM
  • A ping of the hostname resolves to the private IP address and gets full responses.

    A ping -a against the other 2 IPs (application and enterprise) resolves to the correct names.

    One thing that I am now questioning is the hard drive setup. These nodes are using a Dell BOSS-S1 card. Has anyone been successful in deploying a node with BOSS-S1 cards?

    Wednesday, January 8, 2020 2:36 AM