When I submit a job and ask for GPUs with /numgpus, the most GPU cores I can get is 2, even though 4 are accessible. I have verified that I have access to all 4 and that they are unoccupied. There are also 48 CPU cores.
The best case is when I submit with /numgpus:1-4: then I can access 2 of the 4 GPU cores on my 2 K80s. If I submit with any of the following, I only get 1 of the GPU cores: /numgpus:2-4, /numgpus:3-4, or /numgpus:4.
Is there anything wrong with this command:
C:\Users\H155568>job submit /parametric:0-3:1 /workdir:c:\FrontendDNN\TestCaseAH /stderr:part_*_HPCstderr.txt /stdout:part_*_HPCstdout.txt /requestednodes:pa94pw3r9tv52 /numgpus:4 RunHpc_pa94pw3r9tv52_*.bat
I have found one way to access all 4 GPU cores: I made a job template that only specifies that the node has to be GPU capable, by setting GPUNodes as the Node Groups.
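For reference, a submission through such a template might look roughly like the sketch below. The template name GpuTemplate is hypothetical, since the real name is not given here. Leaving out /requestednodes and /numgpus is also an assumption about what "only specify that the node had to be GPU capable" means. The remaining options are copied from the command above.

```
rem GpuTemplate is a hypothetical name for the job template whose Node Groups is set to GPUNodes
rem Assumption: node and GPU selection is left to the template, so /requestednodes and /numgpus are omitted
job submit /jobtemplate:GpuTemplate /parametric:0-3:1 /workdir:c:\FrontendDNN\TestCaseAH /stderr:part_*_HPCstderr.txt /stdout:part_*_HPCstdout.txt RunHpc_pa94pw3r9tv52_*.bat
```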