Share compute nodes with another HPC software?

    Question

  • Hi,

    Is it possible for MS HPC Server 2008 to share compute nodes with HPC software from a different vendor?  I'm assuming we shouldn't have any issue running both the MS HPC Server 2008 client and the other vendor's client software on all the compute nodes.  The possible issue I see, though, is resource allocation.  Is it possible to create a rule in MS HPC Server 2008 to not use a core that's at over 50% utilization, for example?

    Thanks in advance!
    Wednesday, December 9, 2009 3:05 AM

Answers

  • Microsoft HPC Server does not provide the kind of core-level rules you're asking about. However, since it's only a few times a month, could you simply tell Microsoft HPC Server to take the nodes that will be running the other vendor's application offline, and then bring them back online when the application has finished?

    Tuesday, December 15, 2009 12:00 AM

All replies

  • We assume that a compute node is completely dedicated to the cluster. There could be all sorts of things that conflict if you try having a single node joined to two different types of clusters (and we prevent you from joining a single node to two MS HPC clusters).

    You may encounter configuration issues (where we try to configure the network on the node, but your other cluster software also tries to) and admin issues (the HPC admin takes the node offline to do maintenance, but the other cluster is still trying to use it), and probably some other things, in addition to potentially inefficient CPU usage as two schedulers each make decisions based on their own assumptions about the machine state.

    What's the actual problem that you are trying to solve here? (Perhaps you own only one cluster but have third-party apps, some of which are integrated with our scheduler and some with other schedulers?)
    Wednesday, December 9, 2009 7:09 AM
    Moderator
  • Hi Don,

    Thanks for the quick response.  The other issues you mentioned are definitely something for us to think about as well.

    Our problem is that we have just this one application that was created by the vendor of the other cluster manager and will only run on their cluster.  The application may only need to be run a few times a month, but those are pretty big jobs, so we still need a big cluster for it.  Giving it a dedicated environment seems like a waste, since it would sit idle much of the time.  All our other applications will run on the MS HPC environment, which will be very busy throughout the day, so we can't really just split the compute nodes when someone needs to run the other application.

    So what we were thinking is that if we can set a rule on the schedulers to not use cores/nodes that are already using a certain amount of resources, we can prevent running more than one application on one core.  I believe the other scheduler has this capability, but I'm not sure about MS HPC, and we'll need both schedulers to be able to do this for it to work.  There could of course be bigger issues than this, like you said.

    Any ideas?
    Thursday, December 10, 2009 2:34 AM
  • Microsoft HPC Server does not provide the kind of core-level rules you're asking about. However, since it's only a few times a month, could you simply tell Microsoft HPC Server to take the nodes that will be running the other vendor's application offline, and then bring them back online when the application has finished?

    Tuesday, December 15, 2009 12:00 AM
  • Or maybe dual boot, should the vendor software require exclusivity on the node. There are cluster resource allocation tools out there (Moab adaptive suite, for one, off the top of my head) which will dynamically automate node deployment to co-located clusters. Obviously there is a cost to that sort of thing, though, and it's really designed for a Linux / Windows HPC mix, so I'm not sure how well it would work with two Windows clusters.
    Dan

    Tuesday, December 15, 2009 9:47 AM
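
The offline/online handoff suggested in the thread could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes the HPC Pack command line accepts `node offline <name>` and `node online <name>` (check the exact syntax against your installation's command reference), the node names are hypothetical, and the default runner only prints the commands rather than executing them. It does not check whether MS HPC jobs are still running on the nodes.

```python
from contextlib import contextmanager


def dry_run(cmd):
    """Default runner: print the command instead of executing it."""
    print(" ".join(cmd))


@contextmanager
def nodes_offline(node_names, run=dry_run):
    """Take nodes offline for the duration of the `with` block.

    `run` is any callable that accepts a command as a list of strings.
    The `node offline` / `node online` commands are ASSUMED syntax for
    the HPC Pack CLI; verify them before using a real runner.
    """
    taken = []
    try:
        for name in node_names:
            run(["node", "offline", name])
            taken.append(name)
        yield taken
    finally:
        # Always hand the nodes back, even if the other job failed.
        for name in taken:
            run(["node", "online", name])


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    with nodes_offline(["NODE01", "NODE02"]):
        print("run the other vendor's job here")
```

To execute the commands for real, a runner such as `lambda cmd: subprocess.run(cmd, check=True)` could be passed as `run`. The `finally` clause is the main design point: nodes that were taken offline are returned to the MS HPC cluster even if the other vendor's job raises an error.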