SQL Tuning to Address Job & Task Performance

  • Question

  • Like others I see posting here, we are experiencing performance problems on our Windows HPC 2008 cluster with task management in jobs that include hundreds of tasks.  I see from those other posts that MSFT is working to address these performance problems.  Will that come in the form of a service pack or something else?  When can we expect some help?


    For right now, I'd like to ask what we can do to tune and optimize the SQL database.  To give some background, our application architecture has a workflow web service that directs work on our cluster.  This workflow web service uses the .NET Scheduler API to add tasks to existing jobs.
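
    For context, the core of what the web service does on each callback boils down to this (a minimal sketch only; "devhead" and the TaskAppender wrapper are placeholders for our real code):

    using Microsoft.Hpc.Scheduler;

    static class TaskAppender
    {
        // Minimal sketch: open an existing job and append one task to it.
        public static void AppendTask(int jobId, string commandLine)
        {
            IScheduler scheduler = new Scheduler();
            scheduler.Connect("devhead");        // this Connect alone takes seconds (see point 1 below)

            ISchedulerJob job = scheduler.OpenJob(jobId);
            ISchedulerTask task = job.CreateTask();
            task.CommandLine = commandLine;
            job.SubmitTask(task);                // one scheduler round trip per task
        }
    }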

    Here's what we're doing to squeeze better performance out of the scheduler, along with some questions about each item.

    1) We've found that it takes seconds to establish a connection to the scheduler, so we're implementing a connection pool to manage ISchedulerJob instances (a simplified sketch of the pool follows this list).

    2) We've run SQL Profiler on the CCPClusterService database and see many operations that are taking 10+ seconds, so we're taking standard database administration steps to separate the data, index, and log files, as well as running maintenance plans on the database.  Can we do any of the following?

    - use x64 SQL Server?
    - use SQL 2008?
    - offload the SQL load onto another box?  I know the HPC head node wants a COMPUTECLUSTER named instance, but can it be remote?
    - tweak fill factor values to limit index fragmentation?
    - modify the indexing?  Does MSFT have any revised index guidance?
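
    Here is the simplified sketch of the pool from point 1 (SchedulerPool is our own wrapper class, not part of the HPC API; the pool size and head node come from our config):

    using System.Collections.Generic;
    using Microsoft.Hpc.Scheduler;

    class SchedulerPool
    {
        private readonly object m_lock = new object();
        private readonly Queue<IScheduler> m_idle = new Queue<IScheduler>();

        public SchedulerPool(string headNode, int size)
        {
            for (int i = 0; i < size; i++)
            {
                IScheduler scheduler = new Scheduler();
                scheduler.Connect(headNode);   // pay the multi-second Connect once, up front
                m_idle.Enqueue(scheduler);
            }
        }

        public IScheduler Take()
        {
            lock (m_lock)
            {
                // Null means the pool is empty; the caller falls back to a fresh Connect.
                return m_idle.Count > 0 ? m_idle.Dequeue() : null;
            }
        }

        public void Return(IScheduler scheduler)
        {
            lock (m_lock) { m_idle.Enqueue(scheduler); }
        }
    }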


    I appreciate any feedback from the community and will share any practices that we come up with.

    Thanks!
    Luke

    Thursday, April 23, 2009 6:28 PM

Answers

  • The dependency issue you reported should be fixed in SP1, as should large job XML import.

    Thanks!
    Josh
    Wednesday, June 3, 2009 9:50 PM
    Moderator

All replies

  • Sure, it would be a huge help for us to get more information.  What types of jobs are you submitting?  How are you submitting them (a code sample would be great!)?  How long do you expect it to take vs. how long is it actually taking?

    More information on how to configure SQL with HPC is available here: http://go.microsoft.com/fwlink/?LinkId=137791

    Thanks!
    Josh
    Friday, April 24, 2009 6:24 AM
    Moderator
  • Hi Josh,

    Thanks for your reply and the pointer to that doc.  I reviewed it quickly and will go back through it in more detail.  From what I see, it is not possible to use a remote SQL database.  If that is true, then it is a real flaw in the architecture that I hope you can address in the next release, because most everyone has money committed to a robust SQL Server with sophisticated storage backing it before HPC comes into the picture.  Allow us to use that infrastructure as the SQL backend for the head node too.  Can't you just expose a connection string somewhere?


    We've got a 16-node, 128-core cluster that runs image processing on very large images - several gigabytes each.  We've got a >25-step hierarchical process that we run on each image, and it is totally data-driven, i.e., what we find in the image drives what the next step in the process is.  A job represents the processing of one image, and the tasks are dynamically determined and populated via the workflow web service.  We are not using MPI yet, so all of our parallelism is achieved through the scheduler - dynamically subdividing the image into processing units and executing each of those units as a task.  The EXE run by each task calls back to the workflow web service (WS) when complete and tells the WS what it did; the WS loads the ISchedulerJob from HPC via an IScheduler connection, determines whether all sibling tasks are complete, and if so adds the next task in the sequence.


    A couple of odds and ends:

    - I said in my original post that we are pooling ISchedulerJob instances, but I meant IScheduler instances

    - As a rule of thumb, our tasks will not take less than 30 seconds.  However, we could have 100+ tasks complete within a couple of seconds of each other.  Each of them would call back to the web service and cause a lookup of the job on the head node.  The last one to finish would cause another 100+ tasks to be added to the job, and this happens one by one because there is no way that we know of to throw a collection of tasks over in one call (see the sketch after this list).  This all should occur within a couple of seconds (1-5).  I think API support for adding N tasks at once would really solve this issue because you could do one round trip to the database instead of N.

    - Our approach is partially complicated by struggles to make task dependencies work.  My recollection is that we couldn't get these to work without a pre-existing job template.  Again, our process is quite dynamic, so a pre-constructed job template isn't flexible enough.  Are there known bugs in this area of task dependencies?
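
    Here is the sketch referenced in the second point above - roughly what our web service does when the last sibling finishes (simplified; FanOutSketch and the parameter names are ours, shown for illustration):

    using Microsoft.Hpc.Scheduler;

    static class FanOutSketch
    {
        // When the last sibling task reports in, open the job and append
        // the next level of tasks one at a time.
        public static void AddNextStep(IScheduler scheduler, int jobId, string[] nextStepCommandLines)
        {
            ISchedulerJob job = scheduler.OpenJob(jobId);      // job lookup on every callback

            // Typically 100+ entries, each submitted individually - there is no batch call.
            foreach (string commandLine in nextStepCommandLines)
            {
                ISchedulerTask task = job.CreateTask();
                task.CommandLine = commandLine;
                job.SubmitTask(task);   // one round trip per task; a batch API would make this one
            }
        }
    }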


    I'll ask the engineer responsible for these features to post code examples of our task addition and an example of the task dependency problems we're experiencing.

    Thanks!
    Luke

    Friday, April 24, 2009 6:00 PM
  • Josh,

    One other thing: it would be EXTREMELY helpful to us if the task that failed were differentiated from all the others that are cancelled.  When one task fails, it causes the others to be cancelled, but they also show up in the Failed state in the Cluster/Job Manager's job view.  This forces us to hunt through every one of them currently.

    -Luke
    Friday, April 24, 2009 6:20 PM
  • Josh,

    Below is a code example where you can't get two tasks to converge back into one task.  See the code for more notes.

    -Scott

    //// UNIT TEST SET #1 OUTPUT
    Took 749 ms to connect to devhead
    Job 5024
    Adding tasks - Level A
    Task Added 0 Refresh took 0 ms.
    Task Submitted 5024.1 Refresh took 70 ms.
    Job Submitted 5024 SubmitJobById took 138 ms.
    Adding tasks - Level B
    Task Added 0 Refresh took 0 ms.
    Task Submitted 5024.2 Refresh took 385 ms.
    Task Added 0 Refresh took 0 ms.
    Task Submitted 5024.3 Refresh took 498 ms.
    Adding tasks - Level C
    Task Added 0 Refresh took 0 ms.
    ----> Exception thrown that "Task3" does not exist.  Task 4 shows up in the HPC Job Manager but is marked as failed.

    //// UNIT TEST SET #2 OUTPUT
    Took 621 ms to connect to devhead
    Job 5025
    Adding tasks - Level A
    Task Added 0 Refresh took 0 ms.
    Task Submitted 5025.1 Refresh took 66 ms.
    Job Submitted 5025 SubmitJobById took 131 ms.
    Adding tasks - Level B
    Task Added 0 Refresh took 0 ms.
    Task Submitted 5025.2 Refresh took 390 ms.
    Adding tasks - Level C
    Task Added 0 Refresh took 0 ms.
    Task Submitted 5025.3 Refresh took 496 ms.
    End.

    //// CODE BELOW HERE

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Diagnostics;
    using System.Threading;

    using Microsoft.Hpc.Scheduler;
    using Microsoft.Hpc.Scheduler.Properties;

    namespace HpcDependsTest
    {
        class Program
        {
            static void Main(string[] args)
            {
                String sHeadnode = "devhead";
                string[] sTaskNamesEmpty = { };

                //
                // UNIT TEST SET #1
                //
                // THIS FAILS
                //

                //  T         Task 1
                //  i        /     \
                //  m       |       |
                //  e    Task 2   Task 3
                //          |       |
                //  |        \     /
                //  v        Task 4

                string[] sTaskNamesA = { "Task1" };
                string[] sTaskNamesB = { "Task2", "Task3" };
                string[] sTaskNamesC = { "Task4" };

                //
                // UNIT TEST SET #1
                //

                //
                // UNIT TEST SET #2
                //
                // THIS PASSES
                //

                //  T         Task 1
                //  i           |
                //  m           |
                //  e         Task 2
                //              |
                //  |           |
                //  v        Task 4

                //string[] sTaskNamesA = { "Task1" };
                //string[] sTaskNamesB = { "Task2" };
                //string[] sTaskNamesC = { "Task4" };

                //
                // UNIT TEST SET #2
                //

                Stopwatch watch = new Stopwatch();
                watch.Start();
                IScheduler scheduler = new Scheduler();
                scheduler.Connect(sHeadnode);
                watch.Stop();
                Console.WriteLine("Took " + watch.ElapsedMilliseconds + " ms to connect to " + sHeadnode);

                ISchedulerJob Job = scheduler.CreateJob();
                Job.Name = "HpcDependsTest";
                Job.FailOnTaskFailure = true;
                Job.IsExclusive = false;

                scheduler.AddJob(Job);
                Job.Refresh();  // To get the Job Id
                int iJobId = Job.Id;
                Console.WriteLine("Job " + iJobId);

                Console.WriteLine("Adding tasks - Level A");
                Addtasks(Job, sTaskNamesA, sTaskNamesEmpty);

                watch.Reset();
                watch.Start();
                scheduler.SubmitJobById(iJobId, "USER NAME HERE", "PASSWORD HERE");
                watch.Stop();
                Console.WriteLine("Job Submitted " + iJobId + " SubmitJobById took " + watch.ElapsedMilliseconds + " ms.");

                Console.WriteLine("Adding tasks - Level B");
                Addtasks(Job, sTaskNamesB, sTaskNamesA);

                //
                // FOR UNIT TEST SET #1
                // None of these next lines will allow you to add Task4 with dependencies on Task2 & Task3.
                // You fail during submit of Task4 at Job.SubmitTask(task); with this exception:
                //
                //Job.Commit();
                //Job.Refresh();
                //Job = scheduler.OpenJob(iJobId);
                /*
                 * Invalid task dependency: There is no task with the name Task3.  Check your spelling and try again.
                 */
                // FOR UNIT TEST SET #1
                //

                Console.WriteLine("Adding tasks - Level C");
                Addtasks(Job, sTaskNamesC, sTaskNamesB);

                Console.WriteLine("End.");
            }

            public static void Addtasks(ISchedulerJob Job, string[] taskNames, string[] taskDepsNames)
            {
                Stopwatch watch = new Stopwatch();
                foreach (String sTaskName in taskNames)
                {
                    ISchedulerTask task = Job.CreateTask();
                    task.CommandLine = "ping -n 5 localhost";
                    task.Name = sTaskName;
                    task.IsParametric = false;
                    task.IsExclusive = false;
                    task.Runtime = 1440 * 60;
                    task.StdOutFilePath = "NUL";
                    task.StdErrFilePath = "NUL";

                    foreach (string sTaskDepName in taskDepsNames)
                    {
                        task.DependsOn.Add(sTaskDepName);
                    }

                    watch.Reset();
                    watch.Start();
                    // Note: nothing is actually timed here; JobTaskId is still 0 because the
                    // task has not been submitted yet (hence the "0 ... 0 ms" lines in the output).
                    Console.WriteLine("Task Added " + task.TaskId.JobTaskId + " Refresh took " + watch.ElapsedMilliseconds + " ms.");
                    watch.Stop();

                    watch.Reset();
                    watch.Start();
                    Job.SubmitTask(task);
                    watch.Stop();
                    // The "Refresh took" figure below is really the SubmitTask round-trip time.
                    Console.WriteLine("Task Submitted " + task.TaskId + " Refresh took " + watch.ElapsedMilliseconds + " ms.");
                }
            }
        }
    }

    Friday, April 24, 2009 8:00 PM

  • Our process workflow, as Luke mentioned, is dynamic.  What we find in the image determines what happens later in the process.  The process in place has three processing levels: the whole image, a part of the image (a blob), and the tiles that make up a blob.  To give you an idea of quantities, there are a few blobs per image and approximately 100 tiles per blob.

    Each HPC task, prior to exiting, calls back to the central web service that manages the job.  In this callback we determine what tasks need to be added next.

    In the case where we go from many tiles back to processing the blob, we need all the tiles to complete processing before running the next step on the blob.  This is the case that does not work in the test code above.

    So to work around this, the callback won't add the next step for the blob until the last tile step calls back to notify of its completion.
    For each tile callback, a query via GetTaskList() is done to determine whether this is the last task completing before we move to the next step.  This adds quite a bit of load on the HPC SQL instance, since 100 of them can be asking the same thing nearly all at once:
    // Fragment from our web service class (GetScheduler() and m_iJobId are our own);
    // it counts the finished tasks for the job on every tile callback.
    IFilterCollection filters = GetScheduler().CreateFilterCollection();
    filters.Add(FilterOperator.Equal, PropId.Task_State, TaskState.Finished);
    filters.Add(FilterOperator.Equal, PropId.Task_ParentJobId, m_iJobId);
    ISchedulerCollection tasks = GetTaskList(filters, null, false);
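
    The check itself then just compares the finished count against the number of sibling tiles (simplified; expectedTileCount and AddBlobStepTask are our own, shown for illustration):

    // If every sibling tile task is finished, add the blob step to the job.
    int finishedCount = tasks.Count;          // from the filtered GetTaskList above
    if (finishedCount == expectedTileCount)   // tile count tracked by the web service
    {
        AddBlobStepTask();                    // our method; appends the next task to the job
    }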

    If we could get the dependency tree to work, the GetTaskList(...) call above would no longer be needed.  That would definitely help with scalability and performance, and we could move to having only one of the tile tasks call back to the web service.

    Performance savings all around.

    -Scott
    Friday, April 24, 2009 9:10 PM
  • One other thing: it would be EXTREMELY helpful to us if the task that failed were differentiated from all the others that are cancelled. [...]

    -Luke

    We're working on making this easier to diagnose in v3.
    -Josh
    Monday, May 11, 2009 10:31 PM
    Moderator
  • Below is a code example where you can't get two tasks to converge back into one task.  See the code for more notes. [...]

    -Scott

    Thanks for this amazingly detailed response :-)  I believe we've repro'd your issue.  We will be looking into it over the next few days and I'll try to get back to you once we've figured it out.

    Most likely it is a bug in adding tasks with dependencies to running jobs.

    Thanks!
    Josh
    Monday, May 11, 2009 10:55 PM
    Moderator
  • In case others experience this, we've found that adding the "Error Message" column to the task view can help identify the failed task.  However, I still think distinguishing failed tasks from cancelled ones would be preferable.


    Josh,

    Any feedback on the other questions: the SQL connection string and adding a task list in one shot?

    Thanks,
    Luke
    Tuesday, May 12, 2009 10:24 PM
  • Remote SQL databases are unfortunately not supported in v2; we hope to provide support for this in v3.

    Adding tasks in a batch isn't supported in v2 either.  We have a fix (coming in SP1) that makes it a bit faster to add multiple tasks at once using XML.  This is another thing that we are looking into for v3.
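
    For reference, the XML route looks roughly like this (a sketch only; the head node, file name, and credentials are placeholders - RestoreFromXml loads the job definition, including its tasks, from a job XML file):

    using Microsoft.Hpc.Scheduler;

    class XmlImportSketch
    {
        static void Main()
        {
            IScheduler scheduler = new Scheduler();
            scheduler.Connect("devhead");

            ISchedulerJob job = scheduler.CreateJob();
            job.RestoreFromXml("job-with-tasks.xml");   // loads the job and all of its tasks
            scheduler.SubmitJob(job, "USER NAME HERE", "PASSWORD HERE");
        }
    }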

    Thanks!
    Josh
    Tuesday, May 19, 2009 9:14 PM
    Moderator
  • Josh,

    Any news on whether this will be in SP1?

    Thanks,
    Scott
    Friday, May 29, 2009 6:17 PM
  • The dependency issue you reported should be fixed in SP1, as should large job XML import.

    Thanks!
    Josh
    Wednesday, June 3, 2009 9:50 PM
    Moderator