Exceptions from C# HPC library while submitting jobs

  • Question

  • Hello,

    I am using HPC 2012 R2 update 2.
    From time to time the HPC client library for C# throws exceptions while submitting jobs:

    System.AggregateException: One or more errors occurred. ---> System.NullReferenceException: Object reference not set to an instance of an object.
       at Microsoft.Hpc.Scheduler.Store.RowSetWrapper.OpenRemoteRowSet()
       at Microsoft.Hpc.Scheduler.Store.LocalRowSet.GetCount()
       at Microsoft.Hpc.Scheduler.Store.StoreServer.CountTasksForJob(Int32 jobId, List`1 taskIdList)
       at Microsoft.Hpc.Scheduler.Store.StoreServer.Task_AddTasksToJob(Int32 jobId, List`1& taskIdList, List`1 taskPropsList)
       at Microsoft.Hpc.Scheduler.Store.JobEx.CreateTasks(List`1 taskPropertyList)
       at Microsoft.Hpc.Scheduler.SchedulerJob.CreateAddedTasks(Int32 rootGrpId, Dictionary`2 taskGroupIdMapping)
       at Microsoft.Hpc.Scheduler.SchedulerJob.CreateJob()
       at Microsoft.Hpc.Scheduler.SchedulerJob.Submit(ISchedulerStore store, String username, String password)
       at Microsoft.Hpc.Scheduler.Scheduler.SubmitJob(ISchedulerJob job, String username, String password)
       at System.Threading.Tasks.Task.Execute()
    I am afraid we are somehow using the library the wrong way.

    In particular, I noticed that several threads share a single Microsoft.Hpc.Scheduler.Scheduler object via a static variable.
    Is this object supposed to be thread safe, or should I use a different Scheduler object per thread?
    Otherwise, do you know of any other tips I can use to debug this one?

    Thanks

    Thursday, September 14, 2017 4:57 PM

All replies

  • Hi cguevaramari,

    Thank you for your feedback. We will investigate this.

    Thanks,
    Zihao

    Saturday, September 16, 2017 9:23 AM
  • Hi cguevaramari,

    From your description and our preliminary investigation, we believe this is either an SDK issue or a race condition caused by sharing a Scheduler instance across threads. We need more information to diagnose it.
    Which HPC version are you using specifically? 4.4.4864.0 or 4.4.4868.0?
    Could you please provide a short code snippet which can reproduce the error you saw?

    According to the usual .NET Framework convention, public static members are thread safe, while instance members are not guaranteed to be thread safe. If you share a Scheduler instance among threads, you will need to introduce a lock to guard every access to it.
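
    A minimal sketch of that locking approach, not a definitive implementation. The class name SharedScheduler, the Submit helper and the head node name "my-head-node" are hypothetical, and passing null credentials assumes the scheduler can fall back to the current user's (possibly cached) credentials:

```csharp
using System;
using Microsoft.Hpc.Scheduler;

// Hypothetical wrapper: one shared Scheduler, with every access serialized by a lock.
public static class SharedScheduler
{
    // Assumption: replace with the name of your own head node.
    private const string HeadNode = "my-head-node";

    private static readonly object Gate = new object();

    // Lazy<T> is thread safe by default, so the connection is opened exactly once.
    private static readonly Lazy<IScheduler> Instance =
        new Lazy<IScheduler>(CreateConnected);

    private static IScheduler CreateConnected()
    {
        IScheduler scheduler = new Scheduler();
        scheduler.Connect(HeadNode);
        return scheduler;
    }

    // Creates and submits a single-task job and returns the job ID.
    public static int Submit(string commandLine)
    {
        lock (Gate)
        {
            // Job creation and submission both go through the shared instance,
            // so both stay inside the lock.
            IScheduler scheduler = Instance.Value;
            ISchedulerJob job = scheduler.CreateJob();
            ISchedulerTask task = job.CreateTask();
            task.CommandLine = commandLine;
            job.AddTask(task);

            // Assumption: null credentials mean "use the current user's credentials".
            scheduler.SubmitJob(job, null, null);
            return job.Id;
        }
    }
}
```

    With this in place, worker threads would call SharedScheduler.Submit(...) instead of touching the static Scheduler directly. The trade-off is that submissions are serialized, so only one thread talks to the scheduler at a time.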

    Thanks,
    Zihao

    Monday, September 18, 2017 1:43 AM
  • Hi,

    Thanks for your answer.
    As for your questions:
     - Which HPC version are you using specifically? 4.4.4864.0
     - Could you please provide a short code snippet which can reproduce the error you saw? Unfortunately not, because it does not happen every time. I often see this issue when I restart my application and its threads attempt to submit jobs at the same time. However, the error does not happen at every application restart, which led me to suspect some kind of race condition.

    Anyway, I will patch my code to use one Scheduler instance per thread.
    That should hopefully eliminate the problem.
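
    A rough sketch of what one instance per thread could look like, under stated assumptions: the class name PerThreadScheduler and the head node name "my-head-node" are hypothetical, and this is untested against a real cluster:

```csharp
using System.Threading;
using Microsoft.Hpc.Scheduler;

// Hypothetical helper: each thread lazily gets its own connected Scheduler.
public static class PerThreadScheduler
{
    // Assumption: replace with the name of your own head node.
    private const string HeadNode = "my-head-node";

    // The factory runs once per thread, on first access from that thread.
    private static readonly ThreadLocal<IScheduler> Local =
        new ThreadLocal<IScheduler>(() =>
        {
            IScheduler scheduler = new Scheduler();
            scheduler.Connect(HeadNode);
            return scheduler;
        });

    // The Scheduler that belongs to the calling thread; never shared across threads.
    public static IScheduler Current
    {
        get { return Local.Value; }
    }
}
```

    Each thread would then create and submit its jobs through PerThreadScheduler.Current. Since the stack trace shows the submission running inside a Task, the calls probably land on thread-pool threads, so this may open one connection per pool thread; a job should also be created and submitted on the same thread so that it never crosses Scheduler instances.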

    Thanks
    Monday, September 18, 2017 1:37 PM