Can I assign cores to HN function, and leave the rest to Compute function on my Headnode?

  • Question

  • I have a 4-node grid running HPC Server 2008 R2 SP4. Each machine has 4 sockets with 4 cores per socket. This is not a high-availability/critical-path environment, so I've set up the HN to also be a compute node. I know this is not best practice, but we need the compute power.

    My question is: Can I somehow configure the HN function (scheduler, etc.) to use 1 socket (or some number of cores) and leave the rest of the cores available on that machine for the Compute Node function? We have an application that pretty much floods the head node with short jobs (couple hundred at a time), and it can overwhelm the HN with computing obligations such that nothing much else can get done (including, in short bursts, further scheduling).

    I know that another option is to have an additional, less powerful machine take over the HN functions, and have 4 compute nodes, but that would require a bit of redistribution of wealth here, and I was lucky enough to get these machines.

    Failing the above scenario, does this newest version of HPC allow job templates to say (basically) "schedule these nodes first" or "schedule this node only when the others are fully utilized"?

    Thanks for any insight.



    Wednesday, October 10, 2012 7:10 PM

All replies

  • HPC job scheduling prevents different HPC jobs from sharing the same cores, but it does not prevent non-HPC tasks from using those cores/sockets. Therefore, if you want at most, say, 3 sockets to be used for compute work on the HN, you can create a long-running dummy HPC job (such as a sleep) that occupies one socket on the HN. Whatever you schedule next will then not be placed on that socket.
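
    A placeholder job like that might be submitted from HPC PowerShell roughly as follows. This is only a sketch under assumptions: HEADNODE01 is a hypothetical node name, the job name is made up, an endless ping stands in for "sleep" (which is not a built-in Windows command), and the job template in use is assumed not to impose a run-time limit.

    ```powershell
    # Sketch only: HEADNODE01 and the job name are hypothetical placeholders.
    # Load the HPC snap-in if this is a plain PowerShell session
    # (an "HPC PowerShell" window already has it loaded).
    Add-PSSnapin Microsoft.HPC -ErrorAction SilentlyContinue

    # Ask for exactly one socket, and only on the head node.
    $job = New-HpcJob -Name "Reserve HN socket" -NumSockets "1-1" -RequestedNodes "HEADNODE01"

    # A command that never exits on its own; it stands in for "sleep".
    Add-HpcTask -Job $job -CommandLine "cmd /c ping -t localhost > nul"

    # Queue the job. The scheduler treats the socket as busy until the job is cancelled.
    Submit-HpcJob -Job $job
    ```

    The socket is only held while the job is running, so the job would have to be resubmitted if it is cancelled or the node restarts.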
    Friday, October 12, 2012 6:11 PM
  • I don't think you understand my question: This is not a question of allowing more than one job to share a core, or even a socket. This is about allowing a combined HN / CN to assign exclusive use of a socket to HN activity only.

    I have 4 machines in my grid (16 cores/machine).
    Machines 2, 3, 4 are configured as Compute Nodes.
    Machine 1 is configured as a Head Node/Compute Node.

    I can flood the grid with jobs so that all cores are busy. This makes head node activity (Job Manager, scheduler, for example) on Machine 1 very slow.

    I'd like to reserve 1 socket (4 cores) on Machine 1 for Head Node activity only. That leaves 3 sockets (12 cores) on Machine 1 exclusively for Compute Node activities.

    So my grid would have 60 cores available in total.

    Am I able to configure my Head Node in this way?


    Monday, October 15, 2012 12:07 AM
  • You can do it with HPC PowerShell:

    Take the head node offline.
    In HPC PowerShell, run "Set-HpcNode <nameofheadnode> -SubscribedSockets 3".
    Bring the head node back online.
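
    Scripted end to end, those three steps might look something like the sketch below. HEADNODE01 is a hypothetical node name, and using Set-HpcNodeState for the offline/online steps is an assumption on my part (the same can be done from HPC Cluster Manager); only the Set-HpcNode call comes from the steps above.

    ```powershell
    # Sketch only: HEADNODE01 is a hypothetical placeholder for the head node's name.
    # Load the HPC snap-in if this is a plain PowerShell session
    # (an "HPC PowerShell" window already has it loaded).
    Add-PSSnapin Microsoft.HPC -ErrorAction SilentlyContinue

    $headNode = "HEADNODE01"

    # 1. Take the head node offline so the scheduler stops placing jobs on it.
    Set-HpcNodeState -Name $headNode -State Offline

    # 2. Let the scheduler use only 3 of the 4 sockets on this node.
    Set-HpcNode -Name $headNode -SubscribedSockets 3

    # 3. Bring the node back online with the reduced socket count.
    Set-HpcNodeState -Name $headNode -State Online

    # Inspect the node afterwards to confirm the change took effect.
    Get-HpcNode -Name $headNode | Format-List *
    ```

    With 3 of 4 sockets subscribed, the scheduler would see 12 cores on the head node, which gives the 60 schedulable cores described in the question (12 + 3 x 16).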
    Monday, October 15, 2012 9:08 PM