Basic cluster configuration questions

  • Question

  • We are preparing to set up our first HPC cluster and have some performance questions. We are looking at a configuration based on Topology 3, with the head node connected to the enterprise network and the compute nodes connected via private and enterprise networks. The topology shows the head node connected to both the private and application networks. If the head node is not used as a compute node, is this necessary? If so, can it be on a 1 Gb NIC while the compute nodes have 10 Gb NICs, or is there a lot of traffic between the head node and the compute nodes? Also, if it is only used as a head node, could this machine be virtual, or would that hinder performance? One last thing: all of this assumes that we will not use the head node as a compute node. It seems like compute functions would be slower on the head node, since bookkeeping etc. is also being done on that machine. Is this assumption correct, or should we in fact use the same type of computer for the head node as for the compute nodes and let it serve both functions? Any guidance will be greatly appreciated. We are in a hurry to get something set up, and training courses seem to be few and far between.
    Tuesday, June 22, 2010 6:58 PM

Answers

  • Hello there

    Just to confirm, Topology 3 is "Compute nodes isolated on private and application networks", with no direct compute node connection to the enterprise network. If this is what you are after, then I have the following suggestions.

    1. If it is not acting as a compute node, the head node does not necessarily need to be physically attached to the application network at all. Giovanni Marchetti describes on his blog how to use a 'fake' loopback adapter on the head node to trick HPC Server into operating as you require: http://blogs.technet.com/gmarchetti/archive/2009/02/23/faking-networks.aspx Of course, if you have a physical adapter available, the loopback interface is not required and you can put a real connection in place, but it goes to show that this will not be a performance issue. The application network is only used for application traffic; all cluster operations are carried out over the private network.

    2. For best performance, do not use the head node as a compute node. The head node is constantly checking cluster health, submitting jobs and checking on their status, managing resource availability, reading from and writing to SQL Server, and possibly deploying or redeploying nodes. With all of these interruptions, it's not an ideal platform for also running jobs. Of course, if you are constrained hardware-wise and need all the compute resource you can get, using the head node as a compute node may be too tempting to pass up, but at least be aware that you will not see perfect performance from it.

    3. It is possible to use a VM for the head node. I've done this for various demo / test / play clusters and it works just fine, but I've not deployed a production system using a VM for the head node. The main reason is the driver issues sometimes seen with virtual machines. Try installing an InfiniBand card in a Hyper-V VM, for example.
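
    As a rough illustration of points 1 and 2, the adapter requirements per node role in Topology 3 could be written down as below. This is only a sketch of the reasoning, not HPC Server code: the `Network` enum and the `required_networks` function are names made up for this example.

```python
from enum import Enum


class Network(Enum):
    # Hypothetical labels for the three networks discussed in this thread.
    ENTERPRISE = "enterprise"
    PRIVATE = "private"
    APPLICATION = "application"


def required_networks(is_head_node, runs_jobs):
    """Return the networks a node needs a real adapter on in Topology 3.

    Illustrative only: it encodes the advice in this thread, not any
    check that HPC Server itself performs.
    """
    # All cluster operations are carried out over the private network.
    networks = {Network.PRIVATE}
    if is_head_node:
        # In Topology 3 only the head node touches the enterprise network.
        networks.add(Network.ENTERPRISE)
    if runs_jobs:
        # The application network carries application traffic only.
        networks.add(Network.APPLICATION)
    return networks


# A dedicated head node versus an ordinary compute node.
head_only = required_networks(is_head_node=True, runs_jobs=False)
compute = required_networks(is_head_node=False, runs_jobs=True)
print(sorted(n.value for n in head_only))
print(sorted(n.value for n in compute))
```

    A dedicated head node comes out without the application network, which is why the loopback workaround (or a slower NIC) on that side should not hurt performance.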

    Hope this helps,

    Dan

    Wednesday, June 23, 2010 8:40 AM

All replies

  • Dan,

    Thanks for the information, it was extremely helpful. Looks like we are at least headed down the right path.

    Thanks again,

    Wednesday, June 23, 2010 12:41 PM