Windows HPC : JobHistory

  • Question

  • Hi all,

    My issue concerns 'HPC Job History': using 'Get-HpcJobHistory', I can't seem to retrieve information on consumed resources such as:

    -      ‘core’

    -      ‘socket’

    -      ‘node’

    Is there any way to get all this information?

    Thanks in advance for any help.


    Tuesday, January 31, 2012 4:33 PM

All replies

  • Hi

    You can use "job view [job ID]" to get information about an executed job, as shown below:

    >job view 186283
    Id                              : 186283
    State                           : Failed
    Name                            : testjob1
    Project Name                    :
    Owner                           : Administrator
    Template                        : Default
    Priority                        : Normal
    Resource Request                : 2-10 cores
    Type                            : Batch
    Node Groups                     :
    Requested Nodes                 :
    Allocated Nodes                 : NODE01,NODE02
    Current Allocation              : 6 cores
    Submit Time                     : 2012/02/07 17:33:40
    Start Time                      : 2012/02/07 17:33:40
    End Time                        : 2012/02/07 17:36:20
    Elapsed Time                    : 00:00:02:39
    Wait Time                       : 00:00:00:00
    Run As                          : testuser
    Pending Reason                  :
    Error Message                   :
    Task 186283.1 failed. Please check the failed task for more details on the failu
    Progress                        : 100%
    Progress Message                :
    Task Count                      : 1
        Configuring tasks           : 0
        Queued tasks                : 0
        Running tasks               : 0
        Finished tasks              : 0
        Failed tasks                : 1
        Canceled tasks              : 0
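
Output like the listing above is plain "Label : Value" text, so it can be turned into a dictionary for scripting. The following is only a minimal sketch: the function name `parse_job_view` is hypothetical, and it assumes every field label consists of letters and spaces and that a line without a colon continues the previous field (as the wrapped error message does here).

```python
import re

# Assumption: a field label is letters/spaces, followed by a colon and an optional value.
_FIELD = re.compile(r"^([A-Za-z][A-Za-z ]*?)\s*:\s*(.*)$")


def parse_job_view(text):
    """Parse the 'Label : Value' lines of `job view` output into a dict of strings."""
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and the echoed command line
        match = _FIELD.match(line)
        if match:
            last_key, value = match.group(1), match.group(2)
            fields[last_key] = value
        elif last_key is not None:
            # No colon: treat the line as a continuation of the previous field.
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


# A few lines copied from the output quoted in the thread.
sample = """\
>job view 186283
Id                                 : 186283
State                           : Failed
Allocated Nodes                 : NODE01,NODE02
Current Allocation              : 6 cores
Start Time                      : 2012/02/07 17:33:40
Error Message                   :
Task 186283.1 failed. Please check the failed task for more details on the failu
Failed tasks                : 1
"""

job = parse_job_view(sample)
print(job["State"], job["Allocated Nodes"].split(","), job["Current Allocation"])
```

All values stay as strings; converting "6 cores" or the timestamps to numbers and dates is left to the caller, since the exact formats may differ between cluster versions and locales.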

    As for sockets: the "netstat -anbo | more" command may help, though note that it lists network sockets rather than processor sockets.


    Daniel Drypczewski

    Tuesday, February 7, 2012 9:11 AM
  • Hi Daniel,

    First, thanks for answering.

    Using 'Get-HpcJobHistory', I obtain a list of job IDs, which I then pass to 'job view ...'.
    The result is:

       The specified Job ID is not valid. Check your Job ID and try again.

    Thanks again in advance for telling me what's wrong here.

    Friday, February 10, 2012 8:54 AM
  • I think you're waiting too long to try to get the info about the job.  There are two databases involved here: the scheduler database and the reporting database.  The scheduler database has very detailed information about queued, running and recently completed jobs.  A subset of the job info is copied from the scheduler database to the reporting database periodically (every 15 minutes).  In addition, completed jobs are removed from the scheduler database after 5 days (by default; you can change this).

    So I think you are getting some info from the reporting database and then trying to go back to the scheduler database for additional info, but the job is no longer there.  You need to pull the more detailed info from the scheduler database within 5 days of completion, before the job is deleted.
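
The retention rule described above can be sketched as a simple date check before calling 'job view' on IDs taken from the job history. This is only an illustration: the helper names are hypothetical, the 5-day window is the default mentioned in this reply (your cluster may be configured differently), and the exact moment the scheduler purges a job is an assumption, so treat the boundary as approximate.

```python
from datetime import datetime, timedelta

# Assumption: default retention of completed jobs in the scheduler database.
RETENTION = timedelta(days=5)


def still_in_scheduler_db(end_time, now, retention=RETENTION):
    """Return True if a job that ended at end_time should still be queryable."""
    return now - end_time < retention


def split_by_retention(jobs, now, retention=RETENTION):
    """Split (job_id, end_time) pairs into IDs still queryable and IDs likely purged."""
    live, expired = [], []
    for job_id, end_time in jobs:
        if still_in_scheduler_db(end_time, now, retention):
            live.append(job_id)
        else:
            expired.append(job_id)
    return live, expired


now = datetime(2012, 2, 10, 8, 54)
history = [
    (186283, datetime(2012, 2, 7, 17, 36, 20)),  # end time from the output quoted earlier
    (100001, datetime(2012, 1, 31, 12, 0, 0)),   # hypothetical older job
]
live, expired = split_by_retention(history, now)
print("query with job view:", live)
print("probably gone from scheduler database:", expired)
```

Only the IDs in `live` are worth passing to 'job view'; for the others, the reporting database is the only remaining source.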


    Tuesday, February 14, 2012 10:39 PM
  • Hi Chris,

    Thanks to your hints, I now have 'job view' working. My problem now is that I can't seem to get information on allocated cores: all I get via 'Current Allocation' is 0 cores.

    Thanks again in advance for telling me what's wrong.

    Regards, Chanh

    Wednesday, February 15, 2012 3:41 PM
  • With 'job view jobID /detailed', I got all the info I need, i.e.:

    AllocatedCores                   : node1 5
    AllocatedNodes                   : node1 1
    AllocatedSockets                 : node1 2

    Thursday, February 23, 2012 9:46 AM
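
For anyone scripting this, the three Allocated* lines from the detailed output can be read into per-node counts. This is a rough sketch only: `parse_allocations` is a hypothetical helper, and it assumes each value is a sequence of "nodename count" pairs separated by whitespace. The thread only shows a single node, so the multi-node layout is a guess.

```python
def parse_allocations(text):
    """Map each Allocated* field to a {node_name: count} dict."""
    result = {}
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        key = key.strip()
        if not sep or not key.startswith("Allocated"):
            continue  # ignore every other line of the output
        tokens = value.split()
        # Assumption: tokens alternate between node name and count.
        result[key] = {
            name: int(count) for name, count in zip(tokens[0::2], tokens[1::2])
        }
    return result


# The three lines reported in the last reply.
detailed = """\
AllocatedCores                   : node1 5
AllocatedNodes                   : node1 1
AllocatedSockets                 : node1 2
"""

alloc = parse_allocations(detailed)
total_cores = sum(alloc["AllocatedCores"].values())
print(alloc)
print("total cores:", total_cores)
```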