Exponential time of queueing tasks of a single job - is it a bug?

  • Question

  • Hi again,

     

    Recently I wrote some code that creates a job with a single task, and that single task then submits thousands of subsequent tasks which depend on it (i.e., they begin to run right after it terminates).

    The submission itself takes a number of seconds, but surprisingly the transition from the Submitted to the Queued state takes a long time, and that time keeps growing with the number of tasks (on my machine the 100th task takes 0.4 s to queue, the 500th task 1 s, the 1000th task 5 s, the 2000th 10.5 s, and so on). Summing it all up, 2000 tasks take about 1.5 hours just to queue!
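For reference, the pattern described above can be sketched with the Microsoft.Hpc.Scheduler .NET API roughly as follows; this is not code from the original post, and "HEADNODE", "master.exe", and "worker.exe" are placeholder names:

```csharp
using Microsoft.Hpc.Scheduler;

class SubmitExample
{
    static void Main()
    {
        IScheduler scheduler = new Scheduler();
        scheduler.Connect("HEADNODE");

        ISchedulerJob job = scheduler.CreateJob();

        // Master task: its executable later calls back into the scheduler
        // and submits the thousands of dependent sub-tasks.
        ISchedulerTask master = job.CreateTask();
        master.Name = "Master";
        master.CommandLine = "master.exe";
        job.AddTask(master);

        scheduler.SubmitJob(job, null, null);

        // Inside master.exe, each sub-task would be created roughly like this:
        ISchedulerTask sub = job.CreateTask();
        sub.CommandLine = "worker.exe";
        sub.DependsOn.Add("Master");   // runs only after "Master" finishes
        job.SubmitTask(sub);           // add the task to the already-running job
    }
}
```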

    Since we need to work with hundreds of thousands of tasks, we consider this performance a serious obstacle to adopting Windows HPC, unless you can explain what I'm doing wrong. Perhaps the job's priority should be specified, or the scheduling policy should be changed?

    I even tried splitting the creation of tasks into a tree (i.e., create 10 tasks, each of which creates 10 tasks, each of which creates 10 tasks, and so on), and it did not improve performance.

    Analyzing resource usage on the HPC head node, I found that during this queuing process most of the CPU time is consumed by the sqlservr.exe (50% CPU) and HpcScheduler.exe (30% CPU) processes.

     

    I'd appreciate your help very much, since we must soon decide whether or not to take the HPC direction.

     

    All best,

    Alex.

    Monday, June 21, 2010 7:22 AM

Answers

  • Hi Alex,

    Task-dependency performance will also improve if you put the newly created tasks into the same dependency group. To do that, just give them the same name (assign the same value to the ISchedulerTask.Name property).
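A minimal sketch of this suggestion, assuming the Microsoft.Hpc.Scheduler .NET API (`job` is an existing ISchedulerJob; "Master", "Workers", and the executable names are placeholders):

```csharp
// Master task that the sub-tasks will depend on.
ISchedulerTask master = job.CreateTask();
master.Name = "Master";
master.CommandLine = "master.exe";
job.AddTask(master);

for (int i = 0; i < 1000; i++)
{
    ISchedulerTask sub = job.CreateTask();
    sub.Name = "Workers";          // same Name => same dependency group
    sub.CommandLine = "worker.exe";
    sub.DependsOn.Add("Master");   // one group-level dependency, not 1000 individual ones
    job.AddTask(sub);
}
```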

    Thanks,
    Łukasz

    Monday, June 28, 2010 10:51 PM

All replies

  • Hi Alex,

    Thank you for reporting this issue. Could you answer a few questions to help us investigate?

    1. Which version of Windows HPC are you using? Is it V2 or one of the V3 betas? If it's V3, is there a chance you could submit one or more parametric tasks that expand to the required 100,000 subtasks? This should be much faster than basic tasks.

    2. What is the testing hardware? How many nodes are in the cluster, etc.?

    3. Which version of SQL Server are you using?

    4. While trying this, is only a single job like this running, or are there many 'self-expanding' jobs at the same time?

    Thank you,
    Łukasz

    Monday, June 21, 2010 6:23 PM
  • Hi Alex,

    I tried to reproduce your issue, and it looks like the mentioned performance hit occurs mainly when task dependencies are used. Your report has been added to our bug database for further investigation.

    For now, as a workaround, I can suggest two things:

    1. Try using ParametricSweep tasks (if this is possible)

    or

    2. To prevent other tasks from starting before the master task finishes, use task resource requirements instead of dependencies. The master task requires all resources available within the job, so it blocks the others from starting. Example: the job requires 20-20 cores, the master task (which is submitted first) requires 20-20 cores, and the other tasks require 1-1 cores.
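Workaround 2 can be sketched like this, again assuming the Microsoft.Hpc.Scheduler .NET API (`scheduler` is a connected IScheduler; the 20-core figures match the example above and should be adjusted to your job size):

```csharp
ISchedulerJob job = scheduler.CreateJob();
job.UnitType = JobUnitType.Core;
job.MinimumNumberOfCores = 20;
job.MaximumNumberOfCores = 20;

// Master task, submitted first: claims every core in the job,
// so no other task can start until it exits.
ISchedulerTask master = job.CreateTask();
master.CommandLine = "master.exe";
master.MinimumNumberOfCores = 20;
master.MaximumNumberOfCores = 20;
job.AddTask(master);

// Worker task: cannot start until the master releases its cores.
ISchedulerTask worker = job.CreateTask();
worker.CommandLine = "worker.exe";
worker.MinimumNumberOfCores = 1;
worker.MaximumNumberOfCores = 1;
job.AddTask(worker);

scheduler.SubmitJob(job, null, null);
```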

    Thanks,
    Łukasz

    Tuesday, June 22, 2010 3:37 AM
  • Hi Lukasz,

     

    I'm using the V3 beta with SQL Server Express (the one that came with V3). I tested it all on a dual-core virtual machine with 4 GB RAM and 2 worker nodes (which I think is not significant, because apart from the master task no other task had started running). There was a single job with a single task creating many sub-tasks.

     

    Using parametric sweep tasks is not possible, since we need more than a simple step-by-step change of a single parameter.

    Also, allocating all the resources to the master task might work in such a simple case, but sub-tasks can be created at any time during the job, without the constraint that only one task is running and creating sub-tasks.

     

    Thank you anyway,

     

    Alex.

    Sunday, June 27, 2010 11:54 AM