Not getting more than 2 GPU cores with job submit /numgpus

  • Question

  • When I submit a job and ask for GPUs with /numgpus, the most GPU cores I can get is 2, even though 4 are accessible. I have verified that I have access to all 4 and that they are unoccupied. There are also 48 CPU cores.

    The best case is when I submit with /numgpus:1-4: I can then access 2 of the 4 GPU cores on my 2 K80s. If I submit with any of the following, I get only 1 of the GPU cores: /numgpus:2-4, /numgpus:3-4, or /numgpus:4.

    Is there anything wrong with this command:

    C:\Users\H155568>job submit /parametric:0-3:1 /workdir:c:\FrontendDNN\TestCaseAH /stderr:part_*_HPCstderr.txt /stdout:part_*_HPCstdout.txt /requestednodes:pa94pw3r9tv52 /numgpus:4 RunHpc_pa94pw3r9tv52_*.bat

    I have found one way to access all the cores: I made a job template that only specifies that the node has to be GPU capable, by setting GPUNodes as the Node Groups.



    Monday, April 18, 2016 8:37 PM

All replies

  • Hi, Albert,

    What is your GPU task running, and does it use the environment variable "CCP_GPUIDS"?

    You can refer to https://technet.microsoft.com/en-us/library/mt595856.aspx#BKMK_CUDA
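
    For illustration, here is a minimal sketch of how a task might read the GPU ID variable mentioned above (CCP_GPUIDS in the HPC Pack documentation) and limit itself to the GPUs the scheduler assigned. It is not taken from the linked page. The exact value format is an assumption (integer IDs separated by spaces or commas), and the helper names assigned_gpu_ids and restrict_cuda_to_assigned are hypothetical.

```python
import os
import re


def assigned_gpu_ids(env=None):
    """Return the GPU indices listed in CCP_GPUIDS, or [] if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CCP_GPUIDS", "")
    # Assumption: IDs are integers separated by spaces and/or commas.
    return [int(tok) for tok in re.split(r"[,\s]+", raw.strip()) if tok]


def restrict_cuda_to_assigned(env=None):
    """Copy the assigned IDs into CUDA_VISIBLE_DEVICES and return them."""
    env = os.environ if env is None else env
    ids = assigned_gpu_ids(env)
    if ids:
        # CUDA_VISIBLE_DEVICES must be set before the CUDA runtime starts.
        env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in ids)
    return ids


if __name__ == "__main__":
    print("GPUs assigned to this task:", restrict_cuda_to_assigned())
```

    If a task ignores this variable and always opens device 0, several tasks can end up sharing one GPU even when the scheduler has assigned them different ones, so it may be worth printing the value from each task first.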

    When you submit with /numgpus:1-4, you said you can access 2 GPUs. Do you mean that, of the 4 tasks in your job, only 2 run at a time, and after they finish the other 2 start running on the same GPUs?

    Thanks,

    Yongjun

     

     

    Wednesday, April 20, 2016 1:57 AM