My organization just stood up a new HPC Pack 2016 U2 cluster (with the 1/4/19 QFE), using balanced scheduling and immediate preemption, with several thousand cores and a 3-way clustered head node. When we submit a parametric job (it doesn't matter what
the actual tasks are), the job is allocated only one core and stays at one core. If we submit a second job, the first job gets half the cores in the cluster, but the second gets only 1. If we add a third, two of them get 1/3 of the cores each,
but one of the jobs always remains at one core.
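To make the pattern concrete, here is a small Python sketch of the allocation we observe next to what we expected. It is only a model of the symptom, not of how the HPC Pack scheduler works internally. The 3000-core total and the function names are made up for illustration, and the "expected" even split is our assumption for equal-priority jobs with plenty of tasks each.

```python
def observed_allocation(total_cores, num_jobs):
    """Cores per job as we see them on the new cluster.

    One job is stuck at a single core; every other job gets
    total_cores // num_jobs. Which job is stuck is not modeled here;
    it is simply placed last in the list.
    """
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    share = total_cores // num_jobs
    return [share] * (num_jobs - 1) + [1]


def expected_allocation(total_cores, num_jobs):
    """Cores per job under our assumption of an even balanced split."""
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    return [total_cores // num_jobs] * num_jobs


if __name__ == "__main__":
    total = 3000  # hypothetical cluster size
    for jobs in (1, 2, 3):
        seen = observed_allocation(total, jobs)
        wanted = expected_allocation(total, jobs)
        idle = total - sum(seen)
        print(f"{jobs} job(s): observed={seen} expected={wanted} idle={idle}")
```

In this model a full share of the cluster (minus one core) sits idle no matter how many jobs are queued, which matches what we see.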
We have used a 2012R2U3 cluster for years (without a clustered head node), and it has worked fine for us. This behavior is new and baffling. Does anyone have an idea what is going wrong?
Thanks