How to disable HPC2008 crash recovery (i.e. no user input necessary)

  • July 6, 2009 8:40
     
     
    Hi,
    Recently we had a power failure that affected some of our HPC2008 compute nodes.
    After power was restored, the nodes didn't come up. I hooked up a monitor and keyboard to the nodes; they showed the Windows crash recovery menu and were waiting for user input.
    Some of them even showed a boot menu with just one entry, still waiting for user input.

    So my question is how to fix this.
    Ideally it would be done through Group Policy, but so far I haven't found anything useful.
    Failing that, how can I disable this behavior with a cmd or PowerShell script?

    Thanks,

    Johannes
    JH

All replies

  • July 10, 2009 14:00
     
     
    Hello Johannes_de,

    What are the Startup and Recovery options set to on the compute node?

    To check:
    1. Right-click Computer in the Start menu and choose Properties.
    2. Choose Advanced system settings, then click Settings under Startup and Recovery.
    3. In the Startup and Recovery dialog:
            Does "Time to display list of operating systems:" have a 30-second timeout?
            Is "Automatically restart" enabled?

    Thanks,
    Ben
  • July 20, 2009 11:43
     
     
    Hi,

    I'm aware of this setting and have already configured it accordingly. However, doing that by hand is not feasible on XX cluster nodes.
    Currently I'm trying with bcdedit.
    Do you have a better suggestion?
    In particular, I'm not aware of a bcdedit option to set a recovery timeout.

    Johannes
    JH
  • July 21, 2009 17:33
     
     
    Johannes,

    What do you mean by "I'm not aware of an option to set a recovery timeout with bcdedit"?

    There is only one timeout available with BCDEdit that I am aware of.

    Using BCDEdit

    To specify the boot menu time-out value, use the /timeout option:

    bcdedit /timeout <timeout>

    Use the /timeout option and specify the timeout value in seconds. For example, to specify a 15-second timeout value:

    bcdedit /timeout 15

    Reference:
    http://msdn.microsoft.com/en-us/library/ms791525.aspx
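
    For anyone scripting this, here is a minimal Python sketch that wraps the same bcdedit /timeout call. The function names timeout_command and set_boot_menu_timeout are hypothetical helpers, not part of any Microsoft tool, and the sketch assumes Python is available on the node and that the script runs from an elevated prompt.

```python
import subprocess


def timeout_command(seconds):
    """Build the argument list for `bcdedit /timeout <seconds>`."""
    if not isinstance(seconds, int) or seconds < 0:
        raise ValueError("timeout must be a non-negative whole number of seconds")
    return ["bcdedit", "/timeout", str(seconds)]


def set_boot_menu_timeout(seconds, dry_run=True):
    """Return the command line; run it only when dry_run is False.

    bcdedit needs administrator rights, so the real run is assumed to
    happen from an elevated session on a Windows node.
    """
    cmd = timeout_command(seconds)
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return " ".join(cmd)


if __name__ == "__main__":
    # Dry run: only show what would be executed.
    print(set_boot_menu_timeout(15))
```

    The dry-run default is a deliberate choice so the command line can be inspected before anything touches the boot configuration.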

    Thanks,
    Ben

  • July 22, 2009 8:07
     
     
    Ben,

    That's exactly what I meant.
    However, if you go to Control Panel\System and Maintenance\System --> Advanced System Properties and open the Startup and Recovery settings, you see a timeout called:

    Time to display recovery options when needed.
    By default this is disabled, and the system halts at a screen with various recovery choices.

    Do you have suggestions on how to set this timeout as well?

    Thanks,

    Johannes
    JH
  • July 22, 2009 21:21
     
     Answered
    Johannes,

    This timeout is stored outside the BCD, in %windows%\bootstat.dat, and is only configurable through the GUI. You can disable the feature completely with the following command:

    bcdedit /set {default} BOOTSTATUSPOLICY IgnoreAllFailures
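
    Since the goal is to avoid touching each node by hand, the command above could be wrapped in a small Python sketch like the one below. The function names are hypothetical, and the optional "clusrun /all" prefix is an assumption about how the HPC Pack clusrun tool would be used to fan the command out from the head node; check it against your cluster before relying on it.

```python
import subprocess


def boot_status_policy_command(cluster_wide=False):
    """Build the bcdedit call that disables the crash recovery screen.

    With cluster_wide=True the command is prefixed with `clusrun /all`
    (assumed HPC Pack syntax for running a command on all nodes).
    """
    cmd = ["bcdedit", "/set", "{default}", "BOOTSTATUSPOLICY", "IgnoreAllFailures"]
    if cluster_wide:
        cmd = ["clusrun", "/all"] + cmd
    return cmd


def apply_boot_status_policy(cluster_wide=False, dry_run=True):
    """Return the command line; execute it only when dry_run is False.

    A real run is assumed to happen from an elevated prompt, because
    bcdedit requires administrator rights.
    """
    cmd = boot_status_policy_command(cluster_wide)
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return " ".join(cmd)


if __name__ == "__main__":
    # Dry run: print the command for a single node and for the whole cluster.
    print(apply_boot_status_policy())
    print(apply_boot_status_policy(cluster_wide=True))
```

    Keeping dry_run on by default makes it easy to review the exact command line before it changes the boot configuration on any node.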

    Thanks,
    Ben
  • July 23, 2009 6:31
     
     
    Thank you, Ben!

    I'll try it ASAP.




    JH