Where does the queue go when the server crashes?


  • I have a cluster set up with one head node and a bunch of compute nodes.  I believe the job queue that HPC uses lives somewhere on the head node.  Correct me if I'm wrong here.

    My question is: what happens to all the jobs sitting in the queue if the head node goes down?  If the queue is lost when the server fails, are there any methods I can use to back up the job queue in real time, other than the obvious server backups to another medium at specific intervals?

    February 21, 2012, 21:37