Error from node: CNODEXXX:Exception 'The job identifier XXXXXX is invalid.' reported creating the task.

  • Question

  • Hi all,

    I'm getting the following error message on a Windows HPC 2012 cluster:

    Error from node: CNODEXXX:Exception 'The job identifier XXXXXX is invalid.' reported creating the task.

    Do you know how to solve it?

    Thanks,

    Shai

    Monday, May 18, 2015 3:54 PM

All replies

  • Can you give more details on the issue?

    1. Does it happen all the time?

    2. Are you seeing it with an SOA job or a normal batch job?

    3. Which version are you using? In 4.03 we made an improvement for a similar issue with SOA jobs.


    Qiufang Shi

    Tuesday, May 19, 2015 2:43 AM
  • Hi,

    It happens only for heavy applications (~100% CPU utilization).

    I have seen it only on batch jobs, specifically a parametric sweep job.

    We are using Windows HPC 2012 R2 4.2.4400.0.

    Regards,

    Shai.

     

    Tuesday, May 19, 2015 3:01 PM
  • This should be a known case. It usually happens during a status resync between the scheduler and the compute node.

    In your case, do the tasks eventually fail? If all tasks finished successfully, you can ignore this error message for the job; we are looking to improve the error message. If you do see tasks fail because of this error, we need to investigate further.
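
    The check described above can be sketched roughly as follows. This is only an illustration in Python, not an HPC Pack API: `TaskRecord`, its fields, and `triage` are hypothetical names, and the task data is assumed to have been exported from the scheduler by some other means.

```python
import re
from dataclasses import dataclass

# Pattern for the message quoted in this thread; the job identifier varies.
INVALID_JOB_ID = re.compile(r"The job identifier \S+ is invalid\.")


@dataclass
class TaskRecord:
    """Hypothetical, simplified view of one task (not an HPC Pack type)."""
    task_id: str
    state: str               # assumed values: "Finished", "Failed", ...
    error_message: str = ""


def triage(tasks):
    """Classify a job's tasks following the advice in this reply.

    Returns "ignore" when no task failed, "investigate" when at least one
    failed task carries the invalid-job-identifier message, and
    "other-failure" when tasks failed for a different reason.
    """
    failed = [t for t in tasks if t.state == "Failed"]
    if not failed:
        return "ignore"
    if any(INVALID_JOB_ID.search(t.error_message) for t in failed):
        return "investigate"
    return "other-failure"
```

    A job whose tasks all finished would come back as "ignore" even if the message was logged; only a failed task that carries the message is flagged for further investigation.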


    Qiufang Shi

    Thursday, May 21, 2015 6:48 AM
  • I get the same error on our grid from time to time. It occurred last night (06/11/2015) and resulted in 58 task failures on one node. The job is a normal batch job. We are using HPC 2012 R2 4.3.4652.0.

    Thanks.

    Friday, June 12, 2015 12:20 PM
  • I have a similar error on HPC 2012 as well. Is it also a known issue in HPC 2012?
    Wednesday, September 2, 2015 3:10 AM
  • When this error happens, the HPC Scheduler appears to stop responding.
    Wednesday, September 2, 2015 4:02 AM