SQL errors cause job to not be validated

Question
Hi,
We have been seeing several of these errors in the HPC head node server's event log:
An unexpected exception occurred. For more information about this exception, see the Details tab.
Additional data:
Expected to update 2 rows, but actually updated 1, for SQL command:
SET NOCOUNT OFF;
SET NOCOUNT ON;
SET NOCOUNT OFF;
UPDATE Job SET
ChangeTime = N'2016-07-21 15:37:41.247'
WHERE ID IN (82288,82290) AND timestamp <= 0x0000000002AE7021
SET NOCOUNT ON;
The scheduler was unable to commit a transaction.
An unexpected exception occurred. For more information about this exception, see the Details tab.
Additional data:
The operation could not be completed because the affected object is already in use by the scheduler. Please try again later.
After these errors, the task is marked as failed and the message is:
An unexpected exception occurred while validating the job. Please try submitting the job later. If the problem persists, please contact your system administrator or check the HPC server event log for more details
Do you know why this is happening? We are using HPC 2012 R2 Update 4 with an external SQL database. Are there any other troubleshooting steps that we can perform? The task runs fine when it is requeued.
Thanks!!
Thursday, July 21, 2016 4:57 PM
All replies
Hi,
This usually means the job record has been updated since the time it was read, so the transaction cannot be committed because the record's timestamp has changed.
When this error occurs, the current transaction should abort the modification, reload the job, and perform the action again. If the rate of these errors is not high, then retry logic is the fix.
Thanks,
Evan
Friday, July 22, 2016 6:59 AM -
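The abort, reload, and retry pattern described in the reply above can be sketched as follows. This is only an illustration, not HPC Pack's actual scheduler code: it uses Python's built-in sqlite3 module, a hypothetical Job table, and an integer Version column that stands in for the SQL Server timestamp column seen in the logged UPDATE statement. All function names are made up for the example.

```python
import sqlite3


class ConcurrencyConflict(Exception):
    """Raised when an UPDATE touches fewer rows than the caller expected."""


def make_db():
    # Hypothetical stand-in for the scheduler's Job table; Version plays
    # the role of the timestamp column in the logged SQL command.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Job (ID INTEGER PRIMARY KEY, ChangeTime TEXT, Version INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Job VALUES (?, ?, ?)", [(82288, "", 1), (82290, "", 1)]
    )
    conn.commit()
    return conn


def read_max_version(conn, job_ids):
    # "Reload the job": fetch the newest version among the affected rows.
    marks = ",".join("?" for _ in job_ids)
    sql = f"SELECT MAX(Version) FROM Job WHERE ID IN ({marks})"
    return conn.execute(sql, job_ids).fetchone()[0]


def update_change_time(conn, job_ids, change_time, seen_version):
    # Optimistic check: only rows not modified since we read them match.
    marks = ",".join("?" for _ in job_ids)
    cur = conn.execute(
        f"UPDATE Job SET ChangeTime = ?, Version = Version + 1 "
        f"WHERE ID IN ({marks}) AND Version <= ?",
        [change_time, *job_ids, seen_version],
    )
    if cur.rowcount != len(job_ids):
        conn.rollback()  # abort the partial modification
        raise ConcurrencyConflict(
            f"Expected to update {len(job_ids)} rows, "
            f"but actually updated {cur.rowcount}"
        )
    conn.commit()


def update_with_retry(conn, job_ids, change_time, seen_version=None, max_attempts=3):
    # Returns the number of attempts that were needed.
    for attempt in range(1, max_attempts + 1):
        if seen_version is None:
            seen_version = read_max_version(conn, job_ids)
        try:
            update_change_time(conn, job_ids, change_time, seen_version)
            return attempt
        except ConcurrencyConflict:
            seen_version = None  # stale read: reload before the next attempt
    raise ConcurrencyConflict(f"gave up after {max_attempts} attempts")
```

If another writer bumps the version of one job between the read and the update, the first attempt matches only one of the two rows, which mirrors the "Expected to update 2 rows, but actually updated 1" message; the second attempt reloads the version and succeeds.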
ok that makes sense...
What all could update the job record? Could adding tasks to an already running job cause that?
Thanks!
Friday, July 22, 2016 5:24 PM -
There could be many reasons. The scheduler can modify the record while the job is running, and clients can also modify the record through API calls.
Thanks,
Evan
Sunday, July 24, 2016 6:05 AM -
ok - can you explain what you meant by "if this rate is not high"? Do you mean the number of transactions? Sometimes we get this error when we only have a few jobs running... but those running jobs can have anywhere from 100 to 400 tasks.
Thanks,
Nicki
Monday, July 25, 2016 1:53 PM -
Hi Nicki,
You can measure the rate by dividing the count of this exception by the total number of operations (transactions). We don't have a threshold for the rate, but we know this error can happen, and it should not happen at a high rate.
Thanks,
Evan
Thursday, July 28, 2016 8:00 AM
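The rate described in the last reply is a simple ratio. A minimal sketch, assuming you have already counted the exceptions in the event log and know the total number of transactions over the same period (the function name and the sample numbers below are hypothetical, not from the thread):

```python
def conflict_rate(conflict_count, total_transactions):
    """Fraction of transactions that hit the 'Expected to update N rows' error."""
    if total_transactions <= 0:
        raise ValueError("total_transactions must be positive")
    if conflict_count < 0:
        raise ValueError("conflict_count must not be negative")
    return conflict_count / total_transactions


# Hypothetical example: 3 conflicts observed across 1200 transactions.
rate = conflict_rate(3, 1200)
print(f"{rate:.2%}")
```

Since the thread gives no threshold, what counts as "high" is a judgment call; tracking the ratio over time is mainly useful for spotting a sudden increase.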