How to programmatically get node CPU usage and disk throughput, like what can be displayed in the Cluster Manager
Friday, 14 October 2011 16:35
Is it possible to programmatically get the node CPU usage and disk throughput, like what is shown and continually updated in the Cluster Manager? I can find the static information about the nodes, but not the dynamic information. I need this information even when the nodes are not running any jobs.
Thanks a lot in advance.
Sunday, 16 October 2011 09:20
You can write a simple PowerShell script. For example, I recently wrote a PS script that collects statistics in the form "processId-CPUUsage-MemoryUsage" and registered it in the Task Scheduler. Every 5 minutes the script runs and sends the information to my C# web service, which writes it to a SQL Server database.
If you want this script, you can contact me at ahriman[at]tpu.ru.
Friday, 11 November 2011 08:49
Did you solve it? I'm interested as well.
Friday, 11 November 2011 09:02
Thanks for your suggestion, ahriman. Do you have this script running on every compute node, then? Or can you actually collect the information for all nodes on the head node?
Friday, 11 November 2011 09:04
Hi Matthias, I don't have a good solution for this yet. My best solution so far is to use performance counters for each node and collect information from them every 5-10 seconds in my C# app. However, I would like a more centralized solution if possible.
Monday, 14 November 2011 23:47
Have you tried any of the built-in HPC PowerShell cmdlets, e.g. Get-HpcMetric?
Tuesday, 15 November 2011 05:41
No, not yet. I was looking for a pure C# API solution. Or did I miss something?
Tuesday, 15 November 2011 08:19
I have had a quick look at the PowerShell cmdlets, but as far as I can see they only provide overall performance information for the entire cluster. I am not sure this is correct, as I haven't investigated it in detail. I would like to get the performance per node or node group.
Friday, 18 November 2011 18:27
Here is an example:
PS C:\Windows\system32> get-hpcmetricvalue -NodeName <name of Machine>
NodeName           Metric              Counter  Value
--------           ------              -------  -----
<name of Machine>  HPCContextSwitches           1336.328
<name of Machine>  HPCCoresInUse                0
<name of Machine>  HPCCpuUsage         _Total   28.84661
<name of Machine>  HPCDiskQueue        _Total   0
<name of Machine>  HPCDiskSpace        _Total   94.30038
<name of Machine>  HPCDiskSpace        C:       94.30038
<name of Machine>  HPCDiskThroughput   _Total   0
<name of Machine>  HPCJobsRunning               0
PS C:\Windows\system32> get-help get-hpcmetricvalue
Gets the current value of the specified metrics that HPC Cluster Manager
uses in the heat maps for the nodes and the monitoring charts.
For information about the parameters for this cmdlet, additional remarks,
and examples, type: "Get-Help Get-HpcMetricValue -online". To download a
Help file for all of the cmdlets that this product provides, see
Get-HpcMetricValue [[-Name] <String>] [-Counter <String>] [-MetricTarget <MetricTarget>] [-Node <HpcNode>] [-Scheduler <String>] [-Type <String>] [<CommonParameters>]
Get-HpcMetricValue [[-Name] <String>] [-Counter <String>] [-MetricTarget <MetricTarget>] [-NodeName <String>] [-Scheduler <String>] [-Type <String>] [<CommonParameters>]
Gets the current value of the specified set of metrics for the specified
nodes that HPC Cluster Manager uses in the heat maps for the nodes and
the monitoring charts.
You can specify the metric values that you want to get by any combination
of the names of the metrics, the locations where the metrics are
generated, and the categories for the metrics. You can specify the nodes
for which you want to get the metric values by specifying the node name
or an HpcNode object for the NodeName or Node parameters, respectively.
If you do not specify any names, locations, or categories, the
Get-HpcMetricValue cmdlet gets the values of all of the metrics for the
specified nodes, or for all of nodes in the HPC cluster if no nodes are
Online Version: http://go.microsoft.com/fwlink/?LinkId=182816
To see the examples, type: "get-help Get-HpcMetricValue -examples".
For more information, type: "get-help Get-HpcMetricValue -detailed".
For technical information, type: "get-help Get-HpcMetricValue -full".
- Proposed as answer by Mark Staveley, Friday, 18 November 2011 18:27
Monday, 21 November 2011 06:08
Thanks for your help. But the question remains: how does Cluster Manager do it? There must be some native C# API...
Monday, 21 November 2011 08:26
Thanks a lot for the example, Mark. The cmdlet seems to have at least some of the values that I am looking for. It might be the way to do it, though I would prefer that those values were provided through the C# API.
Thanks a lot.
Monday, 21 November 2011 08:28
I haven't found a native C# API to retrieve the information, but it is possible to use PowerShell from your C# app:
Monday, 21 November 2011 21:18
There are no public C# APIs to access these counters. These cmdlets are currently the only mechanism provided to do so.
In writing my own automation I do exactly what Dasch87 suggests and wrap my PowerShell cmdlets in C# - I use C# for command line parsing and results validation and reporting in my test code. Please note that I didn't include an exhaustive list of all of the metrics in my example.
Here is a listing of all of the metrics:
PS C:\Windows\system32> get-hpcmetricvalue -NodeName ComputeNode01 | ft metric,counter
HPCNetwork Intel[R] 82567LM-3 Gigabit Network C...
HPCNetwork Intel[R] PRO_1000 PT Dual Port Serve...
HPCNetwork Intel[R] PRO_1000 PT Dual Port Serve...
HPCNetwork Local Area Connection* 11
- Marked as answer by Dasch87, Monday, 28 November 2011 17:36
Wednesday, 23 November 2011 08:48
Mark, do you know if there are any performance benefits or downsides to using the cmdlets over performance counters?
Thursday, 24 November 2011 17:55
I ran into problems with using Performance Counters. I cannot get the performance counter values on Azure nodes (there might be a workaround for this, but I decided to give the cmdlets a go).
Mark, do you know if it is possible to write a script to get just the HPCCpuUsage value for a specific node? I can't seem to get the correct combination of parameters. I tried something like:
"Get-HpcMetricValue -Scheduler UK2WRMAPHPC001 -Counter _Total -NodeName <nodeName>"
But that obviously gives me all values that have a "_Total" counter:
Any help is greatly appreciated.
Friday, 25 November 2011 08:25
Figured it out:
Get-HpcMetricValue -NodeName <NodeName> -Name HPCCpuUsage
This gives the CpuUsage value.
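Building on that command, here is a minimal sketch for getting both CPU usage and disk throughput as one row per node. It is not taken from the HPC Pack documentation. It assumes that the objects returned by Get-HpcMetricValue expose NodeName, Metric, Counter and Value properties (matching the column headings in the output earlier in this thread) and that Metric compares as a string. ConvertTo-NodeSummary is a hypothetical helper name, and ComputeNode01 is just the example node name used above.

```powershell
# Hypothetical helper: pivots metric rows into one object per node.
# Assumes each input object has NodeName, Metric, Counter and Value
# properties, as suggested by the Get-HpcMetricValue table output.
function ConvertTo-NodeSummary {
    param([object[]]$MetricValue)

    $MetricValue |
        Where-Object { $_.Counter -eq '_Total' } |   # keep only the aggregate counters
        Group-Object -Property NodeName |
        ForEach-Object {
            $cpu  = $_.Group | Where-Object { [string]$_.Metric -eq 'HPCCpuUsage' } |
                        Select-Object -First 1
            $disk = $_.Group | Where-Object { [string]$_.Metric -eq 'HPCDiskThroughput' } |
                        Select-Object -First 1
            New-Object PSObject -Property @{
                NodeName       = $_.Name
                CpuUsage       = $cpu.Value
                DiskThroughput = $disk.Value
            }
        }
}

# Intended usage on a machine with the HPC PowerShell snap-in (untested here):
#   Add-PSSnapin Microsoft.HPC
#   ConvertTo-NodeSummary (Get-HpcMetricValue -NodeName ComputeNode01)
```

Calling Get-HpcMetricValue without -NodeName should, according to the help text quoted earlier, return metrics for all nodes, so the same helper would then yield one row per node. Running it in a loop with Start-Sleep would approximate the continuously updated view.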
Monday, 28 November 2011 17:32
Just catching up after the Thanksgiving holiday. I don't know of any performance hit or other downside in using the PowerShell cmdlets. They access the same information that the heat map uses in the Cluster Manager UI.
You can get performance counter information about Azure nodes, but the counter information that is delivered is a subset of the on-premises information. I know that there are counters to deliver the following values from Azure nodes, but I don't think there are any others.
Processor-CPU, Memory-Pages, System - Context Switches, System Calls, Physical Disk - Bytes, Logical Disk Queue, Memory - MBytes, Number of Cores, Number of Jobs, Number of Tasks.
You can play around with the Heat Map in the UI to see what kinds of values you can get from Azure Nodes. There is a bit of time for a refresh of counters if you change the values you have selected (e.g. going from just CPU to showing CPU + Memory in the heatmap).
Hope this helps and please let me know if you have any other questions.
Monday, 28 November 2011 17:36
I hope you had a great Thanksgiving. That sounds great regarding the performance and the Azure counters. I believe that I can get all the information that I need.
Thanks a lot for all your help.