How to programmatically get node CPU usage and disk throughput, as displayed in the Cluster Manager

  • Question

  • Hi

    Is it possible to programmatically get the node CPU usage and disk throughput that Cluster Manager shows and continually updates? I can find the static information about the nodes, but not the dynamic values. I need this information even when the nodes are not running any jobs.

    Thanks a lot in advance.

    Friday, October 14, 2011 4:35 PM

Answers

  • There are no public C# APIs for accessing these counters.  These cmdlets are currently the only mechanism provided for doing so.

    In writing my own automation I do exactly what Dasch87 suggests and wrap my PowerShell cmdlets in C# - I use C# for command line parsing and results validation and reporting in my test code.  Please note that I didn't include an exhaustive list of all of the metrics in my example.

     

    Here is a listing of all of the metrics:

    PS C:\Windows\system32> get-hpcmetricvalue -NodeName ComputeNode01 | ft metric,counter

    Metric                                  Counter
    ------                                  -------
    HPCContextSwitches
    HPCCoresInUse
    HPCCpuUsage                             _Total
    HPCDiskQueue                            _Total
    HPCDiskSpace                            _Total
    HPCDiskSpace                            C:
    HPCDiskThroughput                       _Total
    HPCJobsRunning
    HPCMemoryPaging
    HPCMSMQRequestQueueLength
    HPCMSMQResponseQueueLength
    HPCMSMQTotalBytes
    HPCMSMQTotalMessages
    HPCNetwork                              Intel[R] 82567LM-3 Gigabit Network C...
    HPCNetwork                              Intel[R] PRO_1000 PT Dual Port Serve...
    HPCNetwork                              Intel[R] PRO_1000 PT Dual Port Serve...
    HPCNetwork                              isatap.{0E5D3597-651B-4158-AD8F-B317...
    HPCNetwork                              isatap.{96128632-0CEA-4F09-8FA0-DAF5...
    HPCNetwork                              isatap.{A972AE60-0814-4EAF-A0FE-65FE...
    HPCNetwork                              Local Area Connection* 11
    HPCPhysicalMem
    HPCSystemCalls
    HPCTasksRunning
    HPCWcfBrokerCalls
    HPCWcfFailedBrokerCalls
    HPCWcfIncomingCalls
    HPCWcfRetrievedResults

     

    • Marked as answer by Dasch87 Monday, November 28, 2011 5:36 PM
    Monday, November 21, 2011 9:18 PM

All replies

  • You can write a simple PowerShell script. For example, I recently wrote a PowerShell script that collects statistics such as process ID, CPU usage, and memory usage, and set it up as a scheduled task. Every 5 minutes the script runs and sends the information to my C# web service, which writes it to a SQL Server database.

    If you want this script, you can contact me at ahriman[at]tpu.ru.
    Sunday, October 16, 2011 9:20 AM
  • Did you solve it? I'm interested as well.

     

    Regards

    Matthias

    Friday, November 11, 2011 8:49 AM
  • Thanks for your suggestion ahriman. Do you have this script running on every compute node then? Or can you actually collect the information for all nodes on the headnode?
    Friday, November 11, 2011 9:02 AM
  • Hi Matthias, I don't have a good solution for this yet. My best solution so far is to use performance counters for each node and collect information from those every 5-10 seconds in my C# app. But, I would like a more centralized solution if possible.
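The polling approach described here (sampling every 5-10 seconds) can be sketched roughly as follows. This is only an illustration in Python, not the poster's C# code: `poll_metrics` and the `sample` callback are hypothetical names, and `sample` stands in for whatever actually reads the counters for a node.

```python
import time
from typing import Callable, Dict, List


def poll_metrics(
    sample: Callable[[], Dict[str, float]],
    interval_s: float,
    count: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, float]]:
    """Call sample() `count` times, pausing `interval_s` seconds between calls.

    `sample` is a placeholder for the real counter read (assumption);
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    readings: List[Dict[str, float]] = []
    for i in range(count):
        readings.append(sample())
        # No need to wait after the final reading.
        if i < count - 1:
            sleep(interval_s)
    return readings
```

Passing the sampler in as a callback keeps the timing loop independent of where the values come from, so the same loop could later be pointed at a centralized source instead of per-node counters.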
    Friday, November 11, 2011 9:04 AM
  • Have you tried any of the built-in HPC PowerShell cmdlets, e.g. Get-HpcMetric?

     

    Monday, November 14, 2011 11:47 PM
  • No, not yet. I was looking for a pure C# API solution. Or did I miss something?
    Tuesday, November 15, 2011 5:41 AM
  • I have had a quick look at the PowerShell cmdlets, but as far as I can see they only provide overall performance information for the entire cluster. I am not sure that this is correct, as I haven't investigated it in detail. I would like to get the performance per node or per node group.

    Tuesday, November 15, 2011 8:19 AM
  • Here is an example:

     

    PS C:\Windows\system32> get-hpcmetricvalue -NodeName <name of Machine>

    NodeName             Metric                Counter    Value
    --------             ------                -------    -----
    <name of Machine>    HPCContextSwitches               1336.328
    <name of Machine>    HPCCoresInUse                    0
    <name of Machine>    HPCCpuUsage           _Total     28.84661
    <name of Machine>    HPCDiskQueue          _Total     0
    <name of Machine>    HPCDiskSpace          _Total     94.30038
    <name of Machine>    HPCDiskSpace          C:         94.30038
    <name of Machine>    HPCDiskThroughput     _Total     0
    <name of Machine>    HPCJobsRunning                   0

    <snip>

     

     

    PS C:\Windows\system32> get-help get-hpcmetricvalue

    NAME
        Get-HpcMetricValue

    SYNOPSIS
        Gets the current value of the specified metrics that HPC Cluster Manager
        uses in the heat maps for the nodes and the monitoring charts.

        For information about the parameters for this cmdlet, additional remarks,
        and examples, type: "Get-Help Get-HpcMetricValue -online". To download a
        Help file for all of the cmdlets that this product provides, see
        http://go.microsoft.com/fwlink/?LinkId=217345.


    SYNTAX
        Get-HpcMetricValue [[-Name] <String[]>] [-Counter <String[]>] [-MetricTarget <MetricTarget[]>] [-Node <HpcNode[]>] [-Scheduler <String>] [-Type <String[]>] [<CommonParameters>]

        Get-HpcMetricValue [[-Name] <String[]>] [-Counter <String[]>] [-MetricTarget <MetricTarget[]>] [-NodeName <String[]>] [-Scheduler <String>] [-Type <String[]>] [<CommonParameters>]


    DESCRIPTION
        Gets the current value of the specified set of metrics for the specified
        nodes that HPC Cluster Manager uses in the heat maps for the nodes and
        the monitoring charts.

        You can specify the metric values that you want to get by any combination
        of the names of the metrics, the locations where the metrics are
        generated, and the categories for the metrics. You can specify the nodes
        for which you want to get the metric values by specifying the node name
        or an HpcNode object for the NodeName or Node parameters, respectively.

        If you do not specify any names, locations, or categories, the
        Get-HpcMetricValue cmdlet gets the values of all of the metrics for the
        specified nodes, or for all of nodes in the HPC cluster if no nodes are
        specified.


    RELATED LINKS
        Online Version: http://go.microsoft.com/fwlink/?LinkId=182816
        Get-HpcMetric
        Get-HpcMetricValueHistory
        Get-HpcNode

    REMARKS
        To see the examples, type: "get-help Get-HpcMetricValue -examples".
        For more information, type: "get-help Get-HpcMetricValue -detailed".
        For technical information, type: "get-help Get-HpcMetricValue -full".       
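If another program needs to consume this output, one option is to have PowerShell emit CSV rather than a formatted table, for example by piping the cmdlet through `Select-Object NodeName,Metric,Counter,Value | ConvertTo-Csv -NoTypeInformation`. The sketch below parses such text in Python, which is used here only as a stand-in for the host application. It assumes the object property names match the column headers shown above, which is not confirmed in this thread; `parse_metric_csv` is a hypothetical helper name.

```python
import csv
import io
from typing import Dict, List, Optional, Union

Record = Dict[str, Union[str, float, None]]


def parse_metric_csv(text: str) -> List[Record]:
    """Turn CSV text with NodeName, Metric, Counter, Value columns into records.

    The column names are an assumption based on the table headers in the
    example output; an empty Counter cell is mapped to None.
    """
    records: List[Record] = []
    for row in csv.DictReader(io.StringIO(text)):
        counter: Optional[str] = row["Counter"] or None
        records.append(
            {
                "node": row["NodeName"],
                "metric": row["Metric"],
                "counter": counter,
                "value": float(row["Value"]),
            }
        )
    return records
```

Parsing CSV avoids depending on the column widths of the table view, which truncates long counter names with "...".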

    • Proposed as answer by Mark Staveley Friday, November 18, 2011 6:27 PM
    Friday, November 18, 2011 6:27 PM
  • Hi Mark,

     

    thanks for your help. But the question remains: How does Cluster Manager do it? There must be some native C# API...

     

    Regards

    Matthias

    Monday, November 21, 2011 6:08 AM
  • Thanks a lot for the example, Mark. The cmdlet seems to have at least some of the values that I am looking for. It might be the way to do it, though I would prefer that those values were provided through the C# API.

    Thanks a lot.

    Monday, November 21, 2011 8:26 AM
  • Hi Matthias,

    I haven't found a native C# API to retrieve the information, but it is possible to use PowerShell from your C# app:

    http://msdn.microsoft.com/en-us/library/ee706614%28v=VS.85%29.aspx
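As a rough illustration of calling the cmdlets from application code, here is a Python sketch. Note that it uses a different technique from the linked article: it launches `powershell.exe` as a child process instead of hosting a PowerShell runspace in-process. The helper names are hypothetical, and loading the cmdlets with `Add-PSSnapin Microsoft.HPC` is an assumption about how HPC Pack is installed on the machine.

```python
import subprocess
from typing import List


def ps_quote(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_command(node_name: str, metric: str) -> List[str]:
    """Build the powershell.exe argument list for one node and one metric."""
    script = (
        # Assumption: the HPC cmdlets are provided by this snap-in.
        "Add-PSSnapin Microsoft.HPC; "
        f"Get-HpcMetricValue -NodeName {ps_quote(node_name)} "
        f"-Name {ps_quote(metric)} | "
        "Select-Object NodeName,Metric,Counter,Value | "
        "ConvertTo-Csv -NoTypeInformation"
    )
    return ["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", script]


def run_metric_query(node_name: str, metric: str) -> str:
    """Run the query and return the CSV text; raises if PowerShell fails."""
    result = subprocess.run(
        build_command(node_name, metric),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Starting a new PowerShell process for every query adds startup overhead, so an application that polls frequently may be better served by hosting a runspace as the article describes.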

     

    Regards

    Monday, November 21, 2011 8:28 AM
  • Mark, do you know if there are any performance benefits or downsides to using the cmdlets rather than performance counters?
    Wednesday, November 23, 2011 8:48 AM
  • I ran into problems with using Performance Counters. I cannot get the performance counter values on Azure nodes (there might be a workaround for this, but I decided to give the cmdlets a go).

    Mark, do you know if it is possible to write a script that gets just the HPCCpuUsage value for a specific node? I can't seem to find the correct combination of parameters. I tried something like:

    "Get-HpcMetricValue -Scheduler UK2WRMAPHPC001 -Counter _Total -NodeName <nodeName>"

    But that obviously gives me every metric that has a "_Total" counter:

    HPCCpuUsage                             _Total
    HPCDiskQueue                            _Total
    HPCDiskSpace                            _Total
    HPCDiskThroughput                       _Total

    Any help is greatly appreciated.

     

    Thanks,

    Dasch

     

    Thursday, November 24, 2011 5:55 PM
  • Figured it out:

    Get-HpcMetricValue -NodeName <NodeName> -Name HPCCpuUsage

    This gives the CpuUsage value.
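Once the rows are available to a program, picking out a single value means matching on the metric name and, where a metric has several counters (HPCDiskSpace reports both `_Total` and `C:`), on the counter as well. A minimal Python sketch, assuming each row is a mapping keyed by the column headers from the cmdlet output; `metric_value` is a hypothetical helper:

```python
from typing import Iterable, Mapping, Optional


def metric_value(
    rows: Iterable[Mapping[str, str]],
    metric: str,
    counter: Optional[str] = None,
) -> Optional[float]:
    """Return the first value matching `metric` (and `counter`, if given).

    Rows are assumed to carry "Metric", "Counter" and "Value" keys.
    Returns None when nothing matches.
    """
    for row in rows:
        if row["Metric"] != metric:
            continue
        if counter is not None and row["Counter"] != counter:
            continue
        return float(row["Value"])
    return None
```

For example, with rows holding the values from the earlier output, `metric_value(rows, "HPCDiskSpace", "C:")` selects the per-drive row rather than the `_Total` row.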

    Friday, November 25, 2011 8:25 AM
  • Hi Dasch87,

     

    Just catching up after the Thanksgiving holiday.  I don't know of any performance hit from using the PowerShell cmdlets.  They access the same information that the heat map uses in the Cluster Manager UI.

     

    You can get performance counter information about Azure nodes, but what is delivered is a subset of the on-premises information.  I know that there are counters that deliver the following values from Azure nodes, but I don't think there are any others.

    Processor-CPU, Memory-Pages, System - Context Switches, System Calls, Physical Disk - Bytes, Logical Disk Queue, Memory - MBytes, Number of Cores, Number of Jobs, Number of Tasks.

    You can play around with the Heat Map in the UI to see what kinds of values you can get from Azure Nodes.  There is a bit of time for a refresh of counters if you change the values you have selected (e.g. going from just CPU to showing CPU + Memory in the heatmap).

    Hope this helps and please let me know if you have any other questions.

     

    Mark

     

     

     

    Monday, November 28, 2011 5:32 PM
  • Hi Mark,

    I hope you had a great Thanksgiving. That sounds great regarding the performance and the Azure counters. I believe that I can get all the information that I need.

    Thanks a lot for all your help.

    /Dasch

    Monday, November 28, 2011 5:36 PM