HPC 2 Headnodes setup for HA and reboot

  • Question

  • Hi,

    Please let me know whether a weekly reboot of HPC Grid 2016 is recommended.

    Also, please let me know the reboot order to follow when Windows patches are applied to the servers.

    Should the head nodes be rebooted first and then the compute nodes? Should we make sure the head node is up before the compute nodes are rebooted?

    thanks

    Julia

    Wednesday, April 8, 2020 4:33 AM

All replies

  • Hi Julia,

    We have a known issue in HPC Pack 2016 Update 3: restarting the HPC Job Scheduler service while it is busy could cause the scheduler to hang. So when restarting the service or the head node, please make sure there are no active running or queued jobs.

    There is no requirement for reboot order among the nodes.

    Regards,

    Yutong Sun

    Friday, April 17, 2020 3:01 PM
    Moderator
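As a rough illustration of the pre-restart check described in the reply above, here is a minimal Python sketch. It assumes you have already collected the state of each job as a string by whatever means you normally use. The function name `safe_to_restart` and the exact state strings are hypothetical, chosen only for this example.

```python
# Job states that block a restart, per the reply above ("running or queued").
# The exact state strings are illustrative assumptions, not confirmed names.
BLOCKING_STATES = {"running", "queued"}


def safe_to_restart(job_states):
    """Return True when no job is in a blocking state."""
    return not any(
        state.strip().lower() in BLOCKING_STATES for state in job_states
    )


if __name__ == "__main__":
    print(safe_to_restart(["Finished", "Canceled"]))  # no blocking jobs
    print(safe_to_restart(["Finished", "Queued"]))    # one queued job
```

The comparison is case-insensitive so that the check does not depend on how the states were capitalized when they were collected.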
  • Hi Yutong,

    Thank you for the reply. What needs to be done if that does happen and the scheduler is stuck? I just want to be prepared.

    Also, is there a log reader available for HPC logs?

    thank you

    Julia

    Saturday, April 18, 2020 3:00 PM
  • Hi Julia,

    We've fixed this issue in the latest HPC Pack 2016 Update 3 QFE below. If you are already on 5.3.6450, then there is no worry about the scheduler hang after restart, and there is no need to cancel all active jobs before restart (though some jobs may fail due to head node restart).

    HPC Pack 2016 Update 3 QFE KB4537169 (5.3.6450) - 2/14/2020 (Download)

    And for GUI log readers, you may check this KB.

    Regards,

    Yutong Sun


    Tuesday, April 21, 2020 8:24 AM
    Moderator
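To make the version advice in the reply above concrete, here is a minimal Python sketch that compares an installed version string against the fixed build 5.3.6450 named in that reply. The helper names `parse_version` and `has_scheduler_fix` are hypothetical, and the sketch assumes that any version at or above 5.3.6450 carries the fix and that version strings are dot-separated integers.

```python
# Build that contains the scheduler-hang fix, as stated in the reply above.
FIXED_BUILD = (5, 3, 6450)


def parse_version(text):
    """Turn a dotted version string such as '5.3.6450' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_scheduler_fix(version_text):
    """Return True when the version is at or above the fixed build.

    Assumption: later builds also include the fix.
    """
    return parse_version(version_text) >= FIXED_BUILD


if __name__ == "__main__":
    # '5.3.6435' is an assumed full version string for build 6435.
    for version in ("5.3.6435", "5.3.6450"):
        status = "has the fix" if has_scheduler_fix(version) else "needs the QFE"
        print(version, status)
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.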
  • Hi Yutong ,

    We have 6435 for now, so I am curious to know what to do if a situation like this happens.

    Can you also let me know whether any HPC log reader program exists?

    thanks a lot

    Julia

    Tuesday, April 21, 2020 1:33 PM
  • Hi Julia,

    If you are on 6435, please patch the cluster to 6450 to avoid this issue. If the scheduler gets stuck after a restart, please restart it again to see if it recovers.

    For the HPC log reader, please see the KB mentioned above; the relevant section is also listed below:

    2. How to open and search logs

    Use the following GUI tools

    1. LogFlow – LogFlow is a graphical tool that can parse HPC logs in BIN format. It can be downloaded and installed from http://logflow.blob.core.windows.net/install/publish.htm
    2. LogViewerUI – LogViewerUI is an alternative graphical tool that can parse HPC Pack 2016 logs. It is available here: https://hpconlineservice.blob.core.windows.net/logviewer/LogViewer.UI.application

    Cheers,

    Yutong Sun

    Thursday, April 23, 2020 4:36 AM
    Moderator