Exception thrown when NodeGroupList is set in SessionInfo

    Question

  • I get the following exception:

    Failed to submit job: This job requires at least 1 cores, but the list of candidate nodes that the Job Scheduler service returned for this job contains only 0 cores. The Job Scheduler service determines the candidate node list using the following job properties: NodeGroup, RequestedNodes, MinMemoryPerNode, MaxMemoryPerNode, MinCoresPerNode, MaxCoresPerNode, and ExcludedNodes. Either reduce the number of resources that the job requires, or redefine the relevant job properties, and then submit the job again.

    SessionStartInfo info = new SessionStartInfo("HeadNode", "Foo");
    info.NodeGroupList = new List<string> { "ABC", "DEF" };
    ..
    ..

    If I comment out the NodeGroupList line, it works fine. What could I be missing?

    (I have created the groups in HPC Cluster Manager, assigned a couple of nodes to each of them, and made sure that no other jobs are running.)

    Wednesday, February 01, 2017 5:13 PM

All replies

  • Hi SRIRAM,

    Could you reproduce this by running 'EchoClient.exe -groups ABC,DEF' from the %CCP_HOME%Bin folder? It simply creates an Echo session with the node groups "ABC" and "DEF".

    You may also check the full list of SessionStartInfo property settings, e.g. RequestedNodesList, as well as the node status (are the nodes online?), to see whether any other restriction on resource selection results in 0 available cores being returned.
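
One way to follow this suggestion is to print every public property of the start-info object just before the session is created. The sketch below does that with reflection. `StartInfoStandIn` and `StartInfoDump` are hypothetical names introduced for illustration; the stand-in class models only the two properties named in this thread so that the example compiles without HPC Pack installed. In real code you would presumably pass the actual SessionStartInfo instance to `Describe` instead.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for SessionStartInfo (assumption: not the real class).
// Only the two property names mentioned in this thread are modelled.
public class StartInfoStandIn
{
    public List<string> NodeGroupList { get; set; }
    public List<string> RequestedNodesList { get; set; }
}

public static class StartInfoDump
{
    // Returns one "Name = value" line per public, readable, non-indexed
    // instance property, sorted by property name.
    public static List<string> Describe(object startInfo)
    {
        var props = startInfo.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(q => q.CanRead && q.GetIndexParameters().Length == 0)
            .OrderBy(q => q.Name, StringComparer.Ordinal);

        var lines = new List<string>();
        foreach (PropertyInfo p in props)
        {
            object value;
            try { value = p.GetValue(startInfo, null); }
            catch (TargetInvocationException) { value = "<unreadable>"; }
            lines.Add(p.Name + " = " + Format(value));
        }
        return lines;
    }

    // Renders null, strings, and collections in a readable form.
    private static string Format(object value)
    {
        if (value == null) return "(null)";
        if (value is string s) return s;
        if (value is IEnumerable items)
            return "[" + string.Join(", ", items.Cast<object>().Select(x => Format(x))) + "]";
        return value.ToString();
    }
}
```

With `NodeGroupList` set to "ABC" and "DEF" and nothing else assigned, `Describe` returns the two lines `NodeGroupList = [ABC, DEF]` and `RequestedNodesList = (null)`, which makes it easy to spot a property that narrows the candidate node list unexpectedly.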

    Regards,

    Yutong Sun

    Friday, February 03, 2017 3:42 AM
  • When I look at the EchoClient job's Resource Selection, only the first group is displayed. If I run

    echoclient -groups ABC,DEF

    then Resource Selection -> Selected Node Groups shows only ABC, and if I run

    echoclient -groups DEF,ABC

    it shows only DEF. If I pass a single group to the -groups parameter, that group is selected (as expected).

    And yes, the nodes in all groups are available and online. Where should I look next?

    All my nodes are also tagged with the WorkstationNodes group (a default that is greyed out and cannot be unchecked).

    If I leave the NodeGroupList setting out of SessionStartInfo, all nodes across all groups are considered.

    Thanks!


    • Edited by SRIRAM R Friday, February 03, 2017 2:56 PM
    Friday, February 03, 2017 2:36 PM
  • Hi SRIRAM,

    When I run 'EchoClient.exe -groups ABC,DEF', the job's resource selection contains both ABC and DEF. This can also be checked from the command line:

    job view <jobId> /detailed | findstr /i nodegroups
    NodeGroups                       : ABC,DEF
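
The same check can be scripted when comparing several jobs. The sketch below parses text shaped like the NodeGroups line above and reports which requested groups did not make it into the job. `JobViewCheck` and its methods are hypothetical helpers, and the "Name : value" layout is assumed only from the single line shown in this thread, not from a documented output format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JobViewCheck
{
    // Finds the "NodeGroups : a,b" line in captured 'job view' output
    // (layout assumed from the sample in this thread) and returns the group names.
    public static string[] ParseNodeGroups(string jobViewOutput)
    {
        var lines = jobViewOutput.Split(new[] { '\r', '\n' },
                                        StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            int colon = line.IndexOf(':');
            if (colon < 0) continue;

            string name = line.Substring(0, colon).Trim();
            if (!name.Equals("NodeGroups", StringComparison.OrdinalIgnoreCase)) continue;

            return line.Substring(colon + 1)
                       .Split(',')
                       .Select(g => g.Trim())
                       .Where(g => g.Length > 0)
                       .ToArray();
        }
        return new string[0]; // no NodeGroups line found
    }

    // Returns the requested groups that are absent from the job's NodeGroups value.
    public static string[] MissingGroups(IEnumerable<string> requested, string jobViewOutput)
    {
        return requested
            .Except(ParseNodeGroups(jobViewOutput), StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }
}
```

For the symptom described earlier in this thread, `JobViewCheck.MissingGroups(new[] { "ABC", "DEF" }, "NodeGroups : ABC")` returns an array containing only "DEF", while the output shown above yields an empty array.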

    Could you post the job and cluster properties returned by the following commands? Please also let me know the HPC Pack versions on the client machine and on the head node.

    job view <jobId> /detailed

    cluscfg listparams

    Regards,

    Yutong Sun


    Saturday, February 04, 2017 12:19 PM