Bare metal deployment in V2 CTP stuck in "Pause waiting for management approval..." state

  • Question

  • I am trying to deploy a compute node on bare metal using the V2 CTP release of Windows HPC Server.  The PXE boot on the compute node has gotten to the point where it has downloaded "WDSNBP" but is stalled with a message that says: "Pause waiting for management approval..."   It has a request ID of 1, but no request is visible in the HPC Admin Console, so there is no obvious way to approve the request.

     

    Is this a known issue?

     

    -Calvin

    Tuesday, April 8, 2008 11:37 PM

Answers

  • Simon

    Many thanks for these helpful suggestions.

    I had toyed with the idea of changing the naming scheme, but shied away from it.

    This worked following a reboot of the head node.

    I am still getting similar problems with other nodes, though.

    Interestingly, when I connected a monitor and keyboard to a node to see what was going on, it looped as described before, awaiting approval.

    If I hit F12 to select a boot device (these are IBM servers), the only options are the Ethernet cards; the local SATA disk is visible to the BIOS but not available to boot from.

    Hitting F1 for setup and resetting to defaults enables the servers to boot as normal and continue provisioning.

    Not sure why the other boot options had been removed.

    Many thanks for your suggestions.

     

    Regards

    Jon

    Thursday, May 8, 2008 10:50 AM

All replies

  • Hello Calvin,

     

    You may be running into a known issue. Were these nodes previously deleted and redeployed?

     

    Thanks!

     

    Wednesday, April 9, 2008 4:47 PM
  • Exactly the same problem here, using the April 2008 CTP release to deploy to bare metal.

    Is there a fix for this, or a way around it?

    Yours in desperation

    Jon

     

     

    Tuesday, May 6, 2008 2:32 PM
  • I had this error yesterday as well, and I found out that you can get the same behavior for a couple of different reasons.

    As I understand it, the nodes have PXE booted and are waiting to be provisioned at this stage. The first place that I ran into trouble was with the "provisioning task" in my template. Navigate to configurations->deployment->node templates, then edit the template you are using. The task "create computer account" (normally with an index of 1) might need to be modified. To check whether this is the issue, you can simply delete the task: if it is, the node will install, but it will not be recognised by the head node server console.

    You should be able to see events in Node Management->Operations->Executing when you deploy compute nodes and they enter the provisioning state. If there is a problem with provisioning, the operation will fail fairly quickly, and you can then look through the events in "reverted" to find the problem. My issue was that the head node couldn't contact the domain controller for provisioning.

    I also got the same "pause waiting for management approval..." problem after I had already deployed the cluster once and tried to deploy again with a few changes. By that stage I had made quite a few changes on the head node and had managed to break something. There's a bit of a checklist that I go through when encountering a problem like this (although the only thing that worked in this case was rebooting the head node). Here's what I try, in order (normally I'm successful before getting to a head node reboot):

    • Power nodes off
    • Check executing operations and cancel any that should not be running (good for getting nodes out of the "provisioning" state if they are stuck)
    • Use PowerShell to clean up any rogue operations: Get-HpcOperation -NodeName <name> | Stop-HpcOperation
    • Close GUI and restart Microsoft HPC services
    • Remove nodes from cluster admin GUI
    • Change the node naming scheme
    • Reboot the head node
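
The PowerShell parts of the checklist above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested procedure. The node names are placeholders. The `Microsoft.HPC` snap-in name is the one used by the released HPC Server 2008 and may differ in the CTP. The `Hpc*` service name pattern is a guess to confirm on your own head node before restarting anything.

```powershell
# Assumption: run in an elevated PowerShell session on the head node.
# Assumption: the snap-in is called Microsoft.HPC (as in released HPC Server 2008).
Add-PSSnapin Microsoft.HPC -ErrorAction SilentlyContinue

# Hypothetical node names; substitute the nodes that are stuck.
$stuckNodes = 'NODE001', 'NODE002'

foreach ($name in $stuckNodes) {
    # Stop any operations still attached to this node
    # (the same command as in the checklist above).
    Get-HpcOperation -NodeName $name | Stop-HpcOperation
}

# Assumption: the HPC services have names beginning with "Hpc".
# List them first to confirm what would be restarted.
Get-Service -Name 'Hpc*' | Format-Table Name, Status

# Once the list looks right and the admin GUI is closed, uncomment to restart:
# Get-Service -Name 'Hpc*' | Restart-Service -Force
```

The restart line is left commented out on purpose, so that the list of matching services can be checked before anything is stopped.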

    I hope that this helps in some way.

    Regards,

    Simon.


    Wednesday, May 7, 2008 3:18 AM