Cluster Manager Job State stuck on Configuring RRS feed

  • Question

  • Hi,
    I am testing a SOA-style service on an HPC Server 2008 SP1 cluster. I have one head node and two compute nodes. All nodes show State=Online and Node Health=OK in the Node Management pane of Cluster Manager. I ran diagnostics and they reported no issues.

    When my client tries to create a session, Cluster Manager's Job Management pane shows two items. The first is Job ID 4, "WCF service"; the second is Job ID 5, "WCF service - Broker for service job 4". The State for both is 'Configuring'. There is no further detail in the Job Details. My client just hangs. I let it run overnight and it never got past the Configuring stage.

    I need advice on how to troubleshoot this. I can't find any more information on what happens during the Configuring state.
    thanks
    Phil
    Friday, June 26, 2009 6:45 PM

Answers

  • I can answer my own question. I had not set credentials on my SessionStartInfo. I'd assumed that if the client and server were running as the same user, it would be OK. For whatever reason, Cluster Manager does not report an error; it just hangs.

    Monday, June 29, 2009 12:59 PM
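
For readers who hit the same hang, a minimal C# sketch of setting credentials explicitly on SessionStartInfo might look like the following. It assumes the HPC Pack 2008 SDK assembly Microsoft.Hpc.Scheduler.Session is referenced. The head node name, service name, and account shown are placeholders, not values from this thread, and the sketch has not been run against the poster's cluster.

```csharp
using System;
using Microsoft.Hpc.Scheduler.Session; // HPC Pack 2008 SDK (assumed to be referenced)

class CreateSessionWithCredentials
{
    static void Main()
    {
        // Placeholder head node and service names (assumptions): use your own.
        SessionStartInfo info = new SessionStartInfo("HEADNODE", "MyService");

        // Set credentials explicitly instead of relying on the caller's identity.
        // Per the answer above, leaving these unset left both jobs in Configuring.
        info.Username = @"DOMAIN\user"; // placeholder account
        Console.Write("Password: ");
        info.Password = Console.ReadLine();

        // CreateSession submits the service job and the broker job.
        using (Session session = Session.CreateSession(info))
        {
            Console.WriteLine("Broker endpoint: {0}", session.EndpointReference);
        }
    }
}
```

If the session is created, the service job and its broker job should move past Configuring in the Job Management pane. Reading the password from the console is only for illustration; a real client would obtain it from a more secure source.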

All replies

  • Thanks.  I'll bring this up and hopefully we can get better error handling in a future version.

    -J
    -Josh
    Wednesday, July 22, 2009 9:52 PM
    Moderator
  • I'm getting a different exception for a job stuck at "Configuring". The job was submitted from
    Job Management after HPC Pack SP1. The error is:

    Database Exception
    Procedure or function 'Schd_NextTaskId' expects parameter '@numTasks', which was not supplied.

    Interestingly, if I copy a finished job, the copy does not get stuck at the "Configuring" stage.
    This looks like a bug introduced in HPC Pack SP1.

    Wednesday, August 5, 2009 8:57 PM
  • I don't think I understand the problem.

    Is this a SOA job?

    If the job is stuck in Configuring state with the error:

    Database Exception
    Procedure or function 'Schd_NextTaskId' expects parameter '@numTasks', which was not supplied.

    What did you do to get it running so that it reached the Finished state? Did you add numTasks? If so, I would guess that a copy of the finished job would include numTasks and would therefore work when resubmitted.

    Friday, August 7, 2009 5:12 PM