HPC 2012 Update 3: Couldn't monitor GPU time

    Question

  • We have a small cluster with 4 compute nodes. We installed HPC 2012 Update 3, which can monitor GPU usage. Monitoring works well on 3 of the compute nodes but not on the fourth. On that node, we can see many GPU performance metrics (e.g., temperature, power, and used memory), but not the real GPU time: it is always displayed as 0%, which is not accurate. Can anyone help?


    Thursday, February 18, 2016 8:49 PM

Answers

  • Hi, what is the CUDA version on that compute node? Is it the same as on the other 3 compute nodes?

    Can you run nvidia-smi.exe under C:\Program Files\NVIDIA Corporation\NVSMI and check whether it reports the GPU time correctly?

    • Marked as answer by Yefeng Zheng Friday, February 26, 2016 5:34 PM
    Friday, February 19, 2016 2:51 AM

All replies

  • The original driver version was 353.90. After upgrading the driver, everything works like a charm. Thank you very much!

    Friday, February 26, 2016 5:35 PM
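
The check suggested in the answer can be scripted so that all nodes are compared in the same way. The following is only a minimal sketch, not part of the original thread: it assumes Python 3.7+ is available on each node, that nvidia-smi.exe sits in the folder quoted above, and that the `utilization.gpu` query field is the value HPC shows as GPU time. The helper names (`parse_gpu_report`, `query_gpus`, `driver_versions`) are hypothetical.

```python
import csv
import io
import subprocess

# Install location quoted in the thread; adjust if nvidia-smi lives elsewhere.
NVIDIA_SMI = r"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe"

# Fields requested from nvidia-smi via --query-gpu.
QUERY_FIELDS = ["index", "name", "driver_version", "utilization.gpu"]


def parse_gpu_report(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not record:  # skip blank lines
            continue
        values = (field.strip() for field in record)
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus(exe=NVIDIA_SMI):
    """Run nvidia-smi on the local node and return the parsed report."""
    cmd = [
        exe,
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_report(result.stdout)


def driver_versions(reports_by_node):
    """Map each driver version to the set of nodes that report it."""
    versions = {}
    for node, gpus in reports_by_node.items():
        for gpu in gpus:
            versions.setdefault(gpu["driver_version"], set()).add(node)
    return versions
```

Calling `query_gpus()` on each compute node and passing the collected results to `driver_versions` as a dict keyed by node name (any labels will do) shows at a glance whether one node runs a different driver. Values are kept as strings because nvidia-smi may print a placeholder instead of a number when a metric is unavailable.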