Not getting some of the job status callback events

Question
-
Hi,
I'm implementing the HPC communication layer for an in-house app and I'm trying to get the job and task status update callbacks working. My code is a virtual copy of https://msdn.microsoft.com/en-us/library/cc853426(v=vs.85).aspx and https://msdn.microsoft.com/en-us/library/cc853482(v=vs.85).aspx. I'm getting all the events up to and including Queued, but nothing after that. I commented out the task notifications and subscribed only to the job events, and the result is 100% consistent and repeatable:
JobStateCallback: Job 33 state is Submitted
JobStateCallback: Job 33 state is Validating
JobStateCallback: Job 33 state is Queued
I get nothing after that (the test job is set up to run for about 50 seconds, so there's plenty of time). I don't get anything when the job finishes, or when the job is cancelled from the cluster manager screen. I can see the job running and finishing fine in the manager screen...
I'm running against HPC Pack 2012 R2 Cluster Manager, using HPC Pack 2012 SDK, Visual Studio 2010 Ultimate, C#
Any ideas would be greatly appreciated...
Thanks!
Damian
Thursday, March 26, 2015 5:04 PM
Answers
-
Can you try my sample code and run it on your head node? Please make sure that you have a "Console.ReadLine();" line so that your app doesn't exit before the job gets executed.
Qiufang Shi
- Marked as answer by DamianRK Friday, April 10, 2015 5:08 PM
Thursday, April 9, 2015 2:09 AM
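The advice above comes down to keeping the process alive until a terminal job state has been reported. As a minimal sketch of that pattern, which does not use the HPC Pack SDK at all, the main thread can block on a ManualResetEvent that the callback sets. FakeJob and FakeJobState are hypothetical stand-ins invented for this illustration, not HPC Pack types, and the 5-second timeout is an arbitrary assumption:

```csharp
using System;
using System.Threading;

// Hypothetical stand-ins for illustration only; these are not HPC Pack SDK types.
enum FakeJobState { Submitted, Validating, Queued, Running, Finished }

class FakeJob
{
    public event Action<FakeJobState> OnState;

    // Raises every state in order from a background thread, the way an
    // asynchronous notification source would.
    public void Submit()
    {
        var worker = new Thread(() =>
        {
            foreach (FakeJobState state in Enum.GetValues(typeof(FakeJobState)))
            {
                Thread.Sleep(50);
                Action<FakeJobState> handler = OnState;
                if (handler != null) handler(state);
            }
        });
        worker.IsBackground = true; // background threads do not keep the process alive
        worker.Start();
    }
}

static class Program
{
    public static void Main()
    {
        var done = new ManualResetEvent(false);
        var job = new FakeJob();
        job.OnState += state =>
        {
            Console.WriteLine("JobStateCallback: Job state is " + state);
            if (state == FakeJobState.Finished) done.Set();
        };
        job.Submit();

        // Block until the callback reports the terminal state (or give up after 5 s).
        if (!done.WaitOne(TimeSpan.FromSeconds(5)))
            Console.WriteLine("Timed out waiting for a terminal state");
    }
}
```

If the WaitOne call is removed, Main returns while the background thread is still sleeping, so the process can exit before any of the later states are printed. The sketch only demonstrates the blocking pattern; it says nothing about how the real scheduler delivers its events.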
All replies
-
Can you paste your code here?
Friday, March 27, 2015 5:18 AM
-
Hi,
In my testing it works without any problem; please check your code:
C:\Users\azureuser\Desktop>hpctest
JobStateCallback: Job state is Submitted
JobStateCallback: Job state is Validating
TaskStateCallback: State for task 57017.1 is Queued
JobStateCallback: Job state is Queued
JobStateCallback: Job state is Running
TaskStateCallback: State for task 57017.1 is Dispatching
TaskStateCallback: State for task 57017.1 is Running
TaskStateCallback: State for task 57017.1 is Finished
Output from task:
Pinging 127.0.0.1 with 32 bytes of data:
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Ping statistics for 127.0.0.1:
Packets: Sent = 10, Received = 10, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
JobStateCallback: Job state is Finished
Press Enter to quit
The code is below:
using System;
using System.Threading;
using Microsoft.Hpc.Scheduler;
using Microsoft.Hpc.Scheduler.Properties;

namespace Creating_and_Submitting_a_Job
{
    class Program
    {
        private static ManualResetEvent manualEvent = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            IScheduler scheduler = new Scheduler();
            ISchedulerJob job = null;
            ISchedulerTask task = null;

            try
            {
                scheduler.Connect("localhost");

                // Create a job and add a task to the job.
                job = scheduler.CreateJob();
                task = job.CreateTask();
                task.CommandLine = "ping 127.0.0.1 -n 10";
                job.AddTask(task);

                // Specify the events that you want to receive.
                job.OnJobState += JobStateCallback;
                job.OnTaskState += TaskStateCallback;

                // Start the job.
                scheduler.SubmitJob(job, @"hpc\azureuser", null);

                // Blocks so the events get delivered. One of your event
                // handlers needs to set this event.
                manualEvent.WaitOne();

                Console.Write("\nPress Enter to quit");
                Console.ReadLine();
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }

        // Add the code from the Implementing the Event Handlers for Job Events in C# topic here.
        // Implements the delegates for the SchedulerJob events

        private static void JobStateCallback(object sender, IJobStateEventArg args)
        {
            IScheduler scheduler = null;
            ISchedulerJob job = null;

            Console.WriteLine("JobStateCallback: Job state is " + args.NewState);
            if (JobState.Canceled == args.NewState ||
                JobState.Failed == args.NewState ||
                JobState.Finished == args.NewState)
            {
                // Terminal state: release the main thread.
                manualEvent.Set();
            }
            else
            {
                try
                {
                    scheduler = (IScheduler)sender;
                    job = scheduler.OpenJob(args.JobId);

                    // TODO: Do something with the job
                }
                catch (Exception e)
                {
                    Console.WriteLine(e.Message);
                }
            }
        }

        private static void TaskStateCallback(object sender, ITaskStateEventArg args)
        {
            IScheduler scheduler = null;
            ISchedulerJob job = null;
            ISchedulerTask task = null;

            Console.WriteLine("TaskStateCallback: State for task {0} is {1}", args.TaskId.ToString(), args.NewState);
            if (TaskState.Finished == args.NewState ||
                TaskState.Failed == args.NewState)
            {
                try
                {
                    scheduler = (IScheduler)sender;
                    job = scheduler.OpenJob(args.JobId);

                    task = job.OpenTask(args.TaskId);
                    Console.WriteLine("Output from task:\n" + task.Output);
                }
                catch (Exception e)
                {
                    Console.WriteLine(e.Message);
                }
            }
        }
    }
}
Qiufang Shi
Friday, March 27, 2015 6:54 AM -
static private void SubmitTestJob(JobDetails jobdetails)
{
    using (Scheduler scheduler = new Scheduler())
    {
        scheduler.Connect(config.SchedulerAddress);
        ISchedulerJob job = scheduler.CreateJob();
        job.Name = "Testing HPC job";
        ISchedulerTask task = job.CreateTask();
        task.CommandLine = @"c:\share\sleep.exe 30";
        job.AddTask(task);
        job.AddTask(task);
        job.OnJobState += JobStateCallback;
        //job.OnTaskState += TaskStateCallback;
        scheduler.SubmitJob(job, null, null);
        jobdetails.HPCJob = job;
    }
}

private static void JobStateCallback(object sender, IJobStateEventArg args)
{
    long jobID = args.JobId;
    IScheduler scheduler = (IScheduler)sender;
    Console.WriteLine("JobStateCallback: Job " + jobID + " state is " + args.NewState + " on thread " + Thread.CurrentThread.ManagedThreadId );
}
JobStateCallback: Job 35 state is Submitted on thread 16
JobStateCallback: Job 35 state is Validating on thread 16
JobStateCallback: Job 35 state is Queued on thread 16
and then nothing....
Friday, March 27, 2015 1:44 PM -
I also changed the command to ping:
task.CommandLine = "ping -n 8 127.0.0.1";
and got exactly the same result... Nothing after Queued, even though the job runs and finishes fine in the cluster manager...
Friday, March 27, 2015 1:51 PM -