Job's hold until 1/1/0001 12:00:00 AM

  • Question

  • Hi everyone,
    I have run into a strange issue with job state.
    1. I submitted a simple job (not SOA).
    2. The HPC scheduler took it from the queue and started executing it on the chosen compute node.
    3. The job stays in the running state and never changes status.

    In the HPC Scheduler log files I found the message below:
                Resource 0, Job's hold until 1/1/0001 12:00:00 AM, pending reason None.

    How can I solve it?



    Friday, August 25, 2017 3:01 PM

All replies

  • Hi,

      Could you export the job to a job XML file and share the XML file content here? (You can remove sensitive info before pasting.) The message you shared means that the job's hold-until time is not set.
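The timestamp 1/1/0001 12:00:00 AM is how .NET prints its minimum `DateTime` value in US format, which is why it reads as "not set" rather than as a real hold time. A minimal sketch of how a log scan could tell the two apart, assuming the log line has exactly the shape quoted in the question (the function name `parse_hold_until` is hypothetical, not part of HPC Pack):

```python
import re
from datetime import datetime

# Assumption: the scheduler log prints the hold time as M/D/YYYY h:mm:ss AM|PM,
# as in the line quoted in the question.
HOLD_RE = re.compile(
    r"hold until (\d{1,2})/(\d{1,2})/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM)"
)

# 0001-01-01 00:00:00, the value .NET uses for DateTime.MinValue.
UNSET = datetime(1, 1, 1, 0, 0, 0)


def parse_hold_until(line):
    """Return the hold-until time from a log line, or None if it is absent or unset."""
    match = HOLD_RE.search(line)
    if match is None:
        return None
    month, day, year, hour, minute, second = (int(g) for g in match.groups()[:6])
    # Convert the 12-hour clock to 24-hour: 12 AM -> 0, 12 PM -> 12.
    hour = hour % 12 + (12 if match.group(7) == "PM" else 0)
    stamp = datetime(year, month, day, hour, minute, second)
    return None if stamp == UNSET else stamp
```

A `None` result for the line in the question means the hold-until field is not what keeps the job running.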

      There are a few situations that keep a job in the running state, for example:

    1. the job is set to "run until cancelled"

    2. there is still a running task in the job
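Both situations can be checked against the exported job XML. The sketch below is only an illustration: the attribute names `RunUntilCanceled`, `State` and `Id`, the `Task` element name, and the list of terminal task states are assumptions about the export format and should be verified against a real exported file. The function name `why_still_running` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: task states that mean "no longer running".
TERMINAL_TASK_STATES = {"Finished", "Failed", "Canceled"}


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}Task' down to 'Task'."""
    return tag.rsplit("}", 1)[-1]


def why_still_running(job_xml):
    """Return a list of reasons (possibly empty) the job may stay in the running state."""
    root = ET.fromstring(job_xml)
    reasons = []
    # Situation 1: the job is configured to run until it is cancelled.
    if root.get("RunUntilCanceled", "false").lower() == "true":
        reasons.append("job is set to run until cancelled")
    # Situation 2: at least one task has a state that is not terminal.
    active = [
        el.get("Id", "?")
        for el in root.iter()
        if local_name(el.tag) == "Task"
        and el.get("State") is not None
        and el.get("State") not in TERMINAL_TASK_STATES
    ]
    if active:
        reasons.append("tasks still active: " + ", ".join(active))
    return reasons
```

An empty list means neither of the two situations applies, which would point to a different cause.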

    Qiufang Shi

    Monday, August 28, 2017 1:53 AM
  • Hi Qiufang,

    Artem and I work on the same team, where we are integrating HPC into our products. We are facing a blocking issue with HPC in our production environment: from time to time a job does not update its status from running to finished, although its task has completed. Moreover, our program creates the next job only when the previous job is finished or cancelled.

    So when a job does not update its status from running to finished, our system hangs and we have to cancel the stuck job manually. This is becoming a serious problem for us, and I wonder if you can help. I suspect a bug in HPC where it enters an infinite loop and cannot update the job status. I have emailed the logs and job configuration to you at qiufang.shi@microsoft.com. Please help us understand what happened.
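Until the root cause is known, the "wait for the previous job before creating the next one" logic could be given a timeout so that a stuck job does not block the pipeline indefinitely. A minimal sketch, assuming a caller-supplied `get_state` function that returns the job's current state as a string. The function names, the state names, and the default polling interval are hypothetical and not part of the HPC Pack API.

```python
import time

# Assumption: state names that mean the job will not change any further.
TERMINAL_STATES = {"Finished", "Failed", "Canceled"}


def wait_for_job(get_state, timeout_s, poll_s=5.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until the job reaches a terminal state or the timeout expires.

    Returns the terminal state, or None if the deadline passed first.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            return None  # caller decides: cancel the job, alert, or retry
        sleep(poll_s)
```

A `None` result can then trigger the same manual cancel the team performs today, while the run is logged for later analysis.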

    Thanks a lot in advance.

    - Puneet & Artem

    Puneet Sharma

    Friday, September 1, 2017 7:47 PM