Windows HPC Multiple Head Nodes

Question
-
Dear all,
We are testing Windows HPC Pack 2016 and would like to know more about the maximum number of head nodes a cluster can have. Please help us with the following questions:
1. What is the maximum cluster size that a single head node can handle?
2. How many head nodes can a cluster have? If I have five head nodes, will all of them participate in job scheduling?
3. Is a failover head node active or passive? Does it work only when the primary head node goes down, or is it always active and sharing the load?
4. If I have multiple head nodes, how is the load balanced between them?
Thanks a lot in advance.
- Puneet
Puneet Sharma
Wednesday, March 8, 2017 11:03 PM
Answers
-
1. It depends on your head node hardware spec. With more RAM, CPU, and SQL capacity, a head node should have no problem handling a big cluster.
2. Currently we support only 3 head nodes, for simplicity; we will support more head nodes in a future release.
3. Most of the services are active/passive, for example: scheduler, monitoring, SDM, stateful management service, session. Some are stateless (one active instance on every node), for example: stateless management service, REST service, Naming Service.
4. Currently load balancing is disabled. An admin can move the services between the three nodes manually with the Service Fabric PowerShell tool.
Qiufang Shi
- Marked as answer by PuneetSharma035 Thursday, March 9, 2017 11:39 PM
Thursday, March 9, 2017 1:20 AM
All replies
-
Dear Qiufang,
From your reply, I understand that even though we have 3 head nodes, only one head node will be active (submitting jobs, monitoring, etc.) at any given time, and the other two head nodes will not be doing anything. Is that correct?
Based on our requirements, we want to add multiple active head nodes as the cluster grows. I believe we cannot do that with the current HPC Pack. Is that correct?
Thanks,
Puneet
Puneet Sharma
- Edited by PuneetSharma035 Wednesday, March 15, 2017 9:42 PM Edited
Wednesday, March 15, 2017 9:41 PM -
Hi Puneet,
Most of your understanding is correct. From our testing, SQL easily becomes the bottleneck. A few clarifications:
1. Currently only 3 head nodes are supported, for simplicity; we will support more in the future.
2. The other two head nodes are actually doing work, because different services may be active on different head nodes. These active/passive services include:
- Scheduler Service: schedules jobs
- Monitoring Service: receives metric data from all compute nodes and aggregates it
- Diagnostics Service
- Stateful Management Service: performs management operations
- Session Service: serves SOA jobs
Meanwhile, there are stateless services that run on all head nodes and are active all the time. These include:
- Stateless Management Service: serves client requests and client-triggered operations such as node offline/online ...
- Scheduler REST Service: serves client job requests
- Naming Service
Because we have disabled load balancing, a service fails over only when it encounters an error. Thus you can manually distribute the primary instances of the services across the head nodes, for example the primary Scheduler service on one head node and the other primary services on the other two, to get a better load balance between the three head nodes.
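As a rough illustration of the manual distribution described above, here is a minimal Python sketch that only plans a placement; it does not talk to the cluster or move anything. The head node names (`HN1` to `HN3`) and the function `plan_primaries` are hypothetical, and the service labels are just the names used in this thread, not official service identifiers.

```python
from itertools import cycle

# Active/passive services named in this thread (labels only, not official IDs).
STATEFUL_SERVICES = [
    "Scheduler",
    "Monitoring",
    "Diagnostics",
    "Stateful Management",
    "Session",
]


def plan_primaries(head_nodes, services, dedicated="Scheduler"):
    """Give `dedicated` its own head node and spread the rest over the others."""
    if len(head_nodes) < 2:
        raise ValueError("need at least two head nodes to spread primaries")
    if dedicated not in services:
        raise ValueError(f"{dedicated!r} is not in the service list")

    # The dedicated service gets the first head node to itself.
    plan = {dedicated: head_nodes[0]}

    # Remaining services alternate over the remaining head nodes.
    others = cycle(head_nodes[1:])
    for service in services:
        if service != dedicated:
            plan[service] = next(others)
    return plan


if __name__ == "__main__":
    # Hypothetical head node names, for illustration only.
    for service, node in plan_primaries(["HN1", "HN2", "HN3"], STATEFUL_SERVICES).items():
        print(f"{service:<20} -> {node}")
```

The resulting plan would still have to be applied by hand with the Service Fabric PowerShell tool mentioned above.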
Qiufang Shi
Thursday, March 16, 2017 2:14 AM -
Qiufang,
How can I see which server runs each of the services above? For example, how do I see which server runs the Scheduler Store?
Thanks
Julia
- Edited by juliakir Friday, December 13, 2019 8:42 PM
Friday, December 13, 2019 8:41 PM -
Hi juliakir,
You can check the Service Fabric cluster portal (https://<oneheadnodeip>:10400) to view the status of each service and the server it is running on. Note that before you can visit the portal, you need to install the certificate (with private key) that you used to install the head nodes into the certificate store “Current User\Personal”.
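If you would rather script the lookup than click through the portal, the sketch below may help. It assumes the Service Fabric REST API (`GetPartitions` and `GetReplicas`, api-version 6.0) is reachable on the same port 10400 as the portal, which I have not verified for HPC Pack. The service name `fabric:/ExampleApp/ExampleService`, the node names, and the sample payload are hypothetical placeholders; look up the real service names in the portal. The code only builds the URLs and parses a response, so you still need to make the HTTPS call yourself with the client certificate.

```python
import json
from urllib.parse import quote

API_VERSION = "6.0"


def service_id(service_name):
    """Convert 'fabric:/App/Svc' into the REST service id 'App~Svc'."""
    prefix = "fabric:/"
    if not service_name.startswith(prefix):
        raise ValueError("service name must start with 'fabric:/'")
    return service_name[len(prefix):].replace("/", "~")


def partitions_url(head_node, service_name, port=10400):
    """URL that lists the partitions of a service (port 10400 is an assumption)."""
    sid = quote(service_id(service_name), safe="~")
    return f"https://{head_node}:{port}/Services/{sid}/$/GetPartitions?api-version={API_VERSION}"


def replicas_url(head_node, partition_id, port=10400):
    """URL that lists the replicas of one partition."""
    return f"https://{head_node}:{port}/Partitions/{partition_id}/$/GetReplicas?api-version={API_VERSION}"


def primary_node(replicas_json):
    """Return the NodeName of the replica whose ReplicaRole is 'Primary', or None."""
    for item in json.loads(replicas_json).get("Items", []):
        if item.get("ReplicaRole") == "Primary":
            return item.get("NodeName")
    return None


# Hypothetical GetReplicas response, trimmed to the fields used here.
SAMPLE_REPLICAS = json.dumps({
    "ContinuationToken": "",
    "Items": [
        {"ReplicaRole": "ActiveSecondary", "NodeName": "HEADNODE1"},
        {"ReplicaRole": "Primary", "NodeName": "HEADNODE2"},
        {"ReplicaRole": "ActiveSecondary", "NodeName": "HEADNODE3"},
    ],
})

if __name__ == "__main__":
    print(partitions_url("HEADNODE1", "fabric:/ExampleApp/ExampleService"))
    print(primary_node(SAMPLE_REPLICAS))
```

The node reported as `Primary` is the head node currently running the active instance of that service.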
Regards,
Yutong Sun
Friday, December 27, 2019 3:18 AM