Nodes remain in Draining state.

  • Question

  • I'm running an HPC Pack 2016 cluster (1 head node and several compute nodes), all running Windows Server 2016 Datacenter.

    Recently, one specific node has started staying in the Draining state indefinitely. We have restarted the head node, the non-working compute node, and the database, without any success.

    "set-hpcnodestate -force" command shows that the node is draining.

    While googling, I found this link that talks about a patch, but it's not available anymore.

    1. Is this a known issue?

    2. If yes, is there a patch for it?

    Regards.

    Rody. 



    • Edited by Rody-gogan Monday, June 8, 2020 3:01 AM
    Monday, June 8, 2020 2:55 AM

All replies

  • Hi Rody-gogan,

    Please check whether there are any active (running/queued) jobs on this node. Also, which version of HPC Pack are you running (HPC Cluster Manager -> Help -> About)? Please check https://aka.ms/hpcgit to see whether you can upgrade to the latest HPC Pack 2016 Update 3 (5.3.6450).

    Cheers,

    Yutong Sun

    Tuesday, June 9, 2020 3:20 PM
    Moderator
  • Hi Yutong,

     Thanks for your reply,

    There aren't any active (running or queued) jobs, and we're using HPC Pack 2016 Update 3 (5.3.6450).

    Tuesday, June 9, 2020 7:18 PM
  • Hi Rody-gogan,

    Could you collect the HpcManagement service log files under the %CCP_DATA%LogFiles\Management folder on the head node and on the non-working compute node, and send them to hpcpack@microsoft.com for further analysis? Check this KB for how to collect the logs.
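
    As a rough illustration of gathering that folder into a single archive, here is a minimal PowerShell sketch. It is not the procedure from the KB; it uses only built-in cmdlets and assumes that the CCP_DATA environment variable is defined on the machine and that PowerShell 5.0 or later is available (for Compress-Archive). The staging folder and archive name are hypothetical choices for illustration.

```powershell
# Sketch only: run on the head node and again on the affected compute node.
# Assumption: CCP_DATA is set by the HPC Pack installation.
# Join-Path works whether or not CCP_DATA ends with a path separator.
$logDir  = Join-Path $env:CCP_DATA 'LogFiles\Management'

# Hypothetical staging folder and archive name; adjust as needed.
$staging = Join-Path $env:TEMP "HpcManagementLogs_$env:COMPUTERNAME"
$zipPath = "$staging.zip"

if (-not (Test-Path $logDir)) {
    throw "Log folder not found: $logDir"
}

# Copy first, so the archive is built from a snapshot rather than from
# files the service may still be writing to.
if (Test-Path $staging) { Remove-Item $staging -Recurse -Force }
New-Item -ItemType Directory -Path $staging | Out-Null
Copy-Item -Path (Join-Path $logDir '*') -Destination $staging -Recurse

# Compress the snapshot; -Force overwrites an archive left by an earlier run.
Compress-Archive -Path (Join-Path $staging '*') -DestinationPath $zipPath -Force
Write-Output "Logs archived to $zipPath"
```

    Running it on each machine produces one zip file per node under %TEMP%, named after the computer, so the head node and compute node logs are easy to tell apart.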

    Cheers,

    Yutong Sun

    Wednesday, June 10, 2020 1:52 PM
    Moderator