We use HPC to generate multiple sub-reports that are then combined into one overall report. We had previously been creating a Basic Task for every sub-report and allowing them to run concurrently. Another part of our application then
combines all of the sub-report PDFs into a final PDF.
We have realized that some individual sub-reports are very slow but could easily be broken down and run concurrently as well. To achieve this, we converted those Basic Tasks into Parametric Sweep tasks with a partitioning scheme, and we
added a Basic Task after each one that combines the outputs from the parametric sweep. So we end up with an HPC job that looks like this (each parametric sweep task assigns 2 cores to each subtask, each basic task needs only 1 core, and the
job is limited to 6 cores; a simplified sketch of how we build the job follows the list):
- Report1 - Parametric Sweep, Task IDs = 1.1 - 1.5 (2 cores per subtask, so 3 can run at a time)
- FinalizeReport1 - Basic Task, Task ID 2, DependsOn "Report1" (requires 1 core)
- Report2 - Parametric Sweep, Task IDs = 3.1 - 3.3
- FinalizeReport2 - Basic Task, Task ID 4, DependsOn "Report2"
- Report3 - Parametric Sweep, Task IDs = 5.1 - 5.10
- FinalizeReport3 - Basic Task, Task ID 6, DependsOn "Report3"
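For reference, here is roughly how we build such a job. This is a simplified sketch assuming the Microsoft.Hpc.Scheduler .NET API; the head node name, executable names, and arguments are placeholders rather than our actual code, but the task types, core settings, and DependsOn wiring match the layout above (only the Report1 pair is shown, the others are added the same way):

```csharp
using Microsoft.Hpc.Scheduler;
using Microsoft.Hpc.Scheduler.Properties;

class JobBuilder
{
    static void Main()
    {
        IScheduler scheduler = new Scheduler();
        scheduler.Connect("HEADNODE");            // placeholder head node name

        ISchedulerJob job = scheduler.CreateJob();
        job.AutoCalculateMin = false;
        job.AutoCalculateMax = false;
        job.MinimumNumberOfCores = 1;
        job.MaximumNumberOfCores = 6;             // job is limited to 6 cores

        // Parametric sweep: one subtask per partition, 2 cores each.
        ISchedulerTask report1 = job.CreateTask();
        report1.Name = "Report1";
        report1.Type = TaskType.ParametricSweep;
        report1.StartValue = 1;
        report1.EndValue = 5;
        report1.IncrementValue = 1;
        report1.MinimumNumberOfCores = 2;
        report1.MaximumNumberOfCores = 2;
        report1.CommandLine = @"GenReport.exe /report:1 /partition:*";  // * = sweep index
        job.AddTask(report1);

        // Basic task that combines the sweep outputs once all of Report1 has finished.
        ISchedulerTask finalize1 = job.CreateTask();
        finalize1.Name = "FinalizeReport1";
        finalize1.Type = TaskType.Basic;
        finalize1.MinimumNumberOfCores = 1;
        finalize1.MaximumNumberOfCores = 1;
        finalize1.DependsOn.Add("Report1");
        finalize1.CommandLine = @"CombineReport.exe /report:1";
        job.AddTask(finalize1);

        // Report2/FinalizeReport2 and Report3/FinalizeReport3 follow the same pattern.

        scheduler.SubmitJob(job, username: null, password: null);
    }
}
```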
This does work correctly. But what we are seeing is that as the Report1 subtasks finish (say 1.1, 1.2, and 1.3 have finished, but 1.4 and 1.5 are still running), Task 2 doesn't start running, because it depends on task 1 and task 1 hasn't
fully finished yet. So task 3.1 will start running. This is all still fine.
When all of task 1 finishes, task 2 still doesn't start. Instead, task 3.3 may start... and then 5.1, 5.2, 5.3, etc. will usually start. Only after all the parametric sweep tasks have run will the scheduler go back and run tasks 2, 4, and 6.
Is there any way to increase the priority of these finalize tasks? Some way of saying, "I want these tasks to run as soon as their dependencies are met"?
Thanks!