Unable to run jobs on nodes besides head node

  • Question

  • Hello, 

    I am a student trying to set up a 3-node HPC cluster. The cluster is configured, but when I run a job, it runs only on the head node. The Job Manager Activity Log for the job shows two attempts to contact the compute nodes and reports that the job "started" and "ended" on them, but the "Hello World" output shows that the job ran successfully only on the head node. That job used either the auto request setting or a minimum of 1 node. If I instead request exactly 3 nodes (3-3), the job fails.
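    One way to make this symptom concrete is to count how many ranks report from each host. The following is only a minimal sketch, not the poster's code. It assumes each rank prints a line of the hypothetical form "Hello World from rank N on HOST" (in an MPI program the host name could come from MPI_Get_processor_name), and the function name count_ranks_per_host is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Assumed (hypothetical) output format, one line per rank:
//   "Hello World from rank <N> on <HOST>"
// Returns the number of ranks that reported from each host.
std::map<std::string, int> count_ranks_per_host(const std::vector<std::string>& lines) {
    std::map<std::string, int> counts;
    const std::string marker = " on ";
    for (const std::string& line : lines) {
        // The host name is taken to be everything after the last " on ".
        std::size_t pos = line.rfind(marker);
        if (pos == std::string::npos) {
            continue;  // Ignore lines that do not match the assumed format.
        }
        std::string host = line.substr(pos + marker.size());
        // Trim trailing spaces, tabs, and carriage returns (Windows line endings).
        while (!host.empty() &&
               (host.back() == '\r' || host.back() == ' ' || host.back() == '\t')) {
            host.pop_back();
        }
        if (!host.empty()) {
            ++counts[host];
        }
    }
    return counts;
}
```

    If the returned map contains only the head node's name, every rank ran there. An entry for each compute node would mean the job was actually distributed.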

    More Info:

    My network topology is "5: enterprise network". In the Cluster Manager, all the nodes are shown as "online", but they have warnings from diagnostic tests. I don't currently have privileges to run diagnostics, so I don't have any more detail on these, but I could contact the admin about it.

    Hardware Details:

    The head node is an Intel Pentium 4 (2 cores); the compute nodes are Intel Xeon (4 cores).

    Software Details (same on all 3 nodes):

    HPC Pack 2008 R2 (jobs submitted using both Job Manager and HPC PowerShell)

    I am programming in C++ with MS-MPI in Microsoft Visual Studio 2012. The OS is Windows Server 2008 R2 Enterprise.

    Can anyone offer any suggestions on why this is happening and what could be done to fix it?  Thanks for your time,

    Lucas

    Friday, October 4, 2013 7:14 PM

All replies

  • If anyone happens to be looking at this, I had better luck with another thread:

    http://social.microsoft.com/Forums/en-US/340d8e72-467b-4688-82c2-0849526d8931/jobs-run-on-head-node-only?forum=windowshpcmpi&prof=required

    Tuesday, October 15, 2013 6:11 PM