Using the TaskExecutionFailureRetryLimit

  • Question

  • I am somewhat new to Microsoft HPC so my problem may be a misunderstanding.

    I want submitted jobs to use TaskExecutionFailureRetryLimit=3 by way of a Job Template. To do this, I created a special Job Template named HPCRetry and set this property to Default=3, Min=0, Max=20. I then submitted a test job with this template, without specifying TaskExecutionFailureRetryLimit. I expected the template to set the value to 3, but it did not; the value stayed at 0.

    I then modified the Default job template and gave TaskExecutionFailureRetryLimit settings similar to those I had tried on HPCRetry (Default=3, Min=0, Max=2147483647). I ran my test job again, and TaskExecutionFailureRetryLimit was still 0 rather than 3.

    If I explicitly set the value on my test job with "Set-HpcJob -Id $jobId -TemplateName HPCRetry -TaskExecutionFailureRetryLimit 3", then it does get set to 3 and I can see the retries in the job XML.
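
    For reference, here is a minimal sketch of that explicit workaround as a full create-and-submit sequence. It assumes the HPC Pack PowerShell snap-in is available on the client; the job name "RetryTest" and the "hostname" command line are placeholders, not part of my real job. I have only confirmed the Set-HpcJob call itself, so treat the rest as an outline rather than a verified script.

    ```powershell
    # Sketch only: needs the HPC Pack client utilities and access to the head node.
    Add-PSSnapin Microsoft.HPC

    # Create the job from the HPCRetry template. Per the behavior described above,
    # the template default of 3 is NOT applied to TaskExecutionFailureRetryLimit.
    $job = New-HpcJob -Name "RetryTest" -TemplateName "HPCRetry"

    # Add a single task; "hostname" is a placeholder command line.
    Add-HpcTask -Job $job -CommandLine "hostname" | Out-Null

    # Set the retry limit explicitly, the same way as the Set-HpcJob command above.
    Set-HpcJob -Id $job.Id -TaskExecutionFailureRetryLimit 3

    # Submit the job, then dump its properties to check what was actually stored.
    Submit-HpcJob -Id $job.Id
    Get-HpcJob -Id $job.Id | Format-List *
    ```

    Setting the value before submission keeps the explicit override in one place until the template default works as expected.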

    If I set a different property on my HPCRetry template, such as Node Groups, then the Node Group specified as the template's Default does get used in my test job. So Node Groups behaves as I expect and picks up the default setting, but TaskExecutionFailureRetryLimit does not.

    Thanks 

    System is

    HPC Pack 2012 R2. Server version: 4.3.4652.0 Client version: 4.3.4652.0

    Wednesday, July 1, 2015 5:40 PM

Answers

  • Hi, this is an issue in HPC Pack; we will investigate it. Thanks for reporting your findings!


    • Edited by Yongjun Tian (MSFT) Thursday, July 2, 2015 7:08 AM change the solution
    • Marked as answer by BG1983 Thursday, July 2, 2015 12:29 PM
    Wednesday, July 1, 2015 8:13 PM