HPC Performance and Scalability - How many Compute Nodes (Workstations) can be added to a Head Node

    Question

  • Microsoft HPC Pack 2012 R2 Update 3 is new to us, and we recently discovered that having 2700 compute nodes on one head node is problematic: the scheduled task on the server can't complete in the expected amount of time. The compute nodes are CAD workstations, and the head node is a virtual machine with a quad-core Intel Xeon E5-2680 at 2.7 GHz and 16 GB of RAM.

    We tested adding 2 more CPUs, but even though CPU usage stays around 50 to 70% most of the time, the scheduled task never completes successfully. The cluster size was then reduced to 500 nodes and the system now works, but it still requires a lot of CPU and RAM.

    What would be required to use all 2700 compute nodes on the same head node? How can the system be sized properly to work with all available compute nodes?

    Friday, 13 April 2018 1:54 PM

Answers

  • Could you reach us through hpcpack@microsoft.com?

    To scale to 2700 compute nodes, you need to size your SQL Server and head node carefully. You could share your current configuration with us, and I could share the recommendations from our scale testing (10K+ cores).


    Qiufang Shi

    • Marked as answer by Acadien75 Tuesday, 17 April 2018 3:53 AM
    Tuesday, 17 April 2018 3:38 AM

All replies

  • Thanks for the email address. I will establish initial contact using my business email account tomorrow and provide any necessary details.
    Tuesday, 17 April 2018 3:53 AM
  • Can you tell me what configuration information you are looking for, and whether there is a script I can use to extract it?
    Tuesday, 17 April 2018 3:56 AM
  • For example:

    the CPU, the RAM, whether the storage is SSD, the SQL Server edition, and whether the database is remote. Usually the bottleneck is SQL.


    Qiufang Shi

    Tuesday, 17 April 2018 7:53 AM
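The details requested above could be collected into a short plain-text summary before emailing them. The following is only a minimal sketch, not an official HPC Pack tool: the function name `build_config_report` and its parameters are hypothetical, only the logical CPU count and platform strings are detected automatically, and the RAM, SSD, SQL edition, and remote-database values are assumed to be filled in by hand.

```python
import os
import platform


def build_config_report(ram_gb, uses_ssd, sql_edition, sql_is_remote):
    """Return a plain-text summary of the head node details listed above.

    Hypothetical helper: only the platform strings and the logical CPU
    count are detected automatically. Everything else is supplied by the
    caller, because the standard library cannot reliably detect it.
    """
    fields = [
        ("Machine", platform.node() or "unknown"),
        ("OS", f"{platform.system()} {platform.release()}"),
        ("CPU", platform.processor() or platform.machine() or "unknown"),
        ("Logical CPUs", os.cpu_count() or "unknown"),
        ("RAM (GB)", ram_gb),
        ("SSD storage", "yes" if uses_ssd else "no"),
        ("SQL edition", sql_edition),
        ("Remote database", "yes" if sql_is_remote else "no"),
    ]
    return "\n".join(f"{name}: {value}" for name, value in fields)


if __name__ == "__main__":
    # 16 GB comes from the question; the other values are placeholders
    # that must be replaced with the real configuration.
    print(build_config_report(ram_gb=16, uses_ssd=False,
                              sql_edition="(fill in)", sql_is_remote=False))
```

Run on the head node itself, this prints one `name: value` line per item, which can be pasted into the email as-is.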
  • An email was sent to hpcpack@microsoft.com to initiate the discussion/investigation.
    Tuesday, 17 April 2018 3:06 PM