Data archival/removal of Jobs/Tasks data in HPC RRS feed

  • Question

  • Hi

    1. What options does HPC Server provide for data archival/removal of Jobs/Tasks data?

    2. As job data is stored in SQL Server on the compute cluster, what is the effect of a lot of data being created over time?

    3. Will there be any difference in performance between a server with a few jobs and one with lakhs (hundreds of thousands) of jobs for the following activities:

        Job submission time

        Query for specific Job/Task 

    4. As the data type of JobId is int (maximum 2,147,483,647), what happens if my JobId reaches 2,147,483,647?

    Please advise us if any document is available.

    Thanks in advance.

    PVASN Murthy

    Wednesday, August 13, 2008 11:25 AM

Answers

  • 1. You can use the Get-HpcJobHistory PowerShell cmdlet to dump job history data from the DB; we are looking to improve this ability in v3. For archival, you can use the various SQL tools (though you may need to install full SQL Server instead of SQL Express for many of those features). For removal, we clean old jobs out of the database regularly; you can set how long completed jobs are kept using the TTLCompletedJobs property in cluscfg (or in the UI: Job Management -> Options -> Scheduler Configuration). See the example commands after this list.

    2. Your database will get bigger :-)  We are working on a Capacity Planning document which details how much data you will generate based on your workload; look for that to be available in a month or so.

    3. Seems like there's a typo there . . . are you asking about the difference between 1 big job with many tasks vs. many small jobs with few tasks?  Submitting a single big job will generally be faster overall.  Querying shouldn't be too strongly affected by this.

    4. If you submit 2.1 billion jobs to your cluster, let us know and we'll be happy to take a look at the problem with you :-)
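
    A minimal sketch of what item 1 looks like in practice, assuming the HPC Pack PowerShell snap-in (Microsoft.HPC) is installed on the head node and that your version of Get-HpcJobHistory supports date filtering; the output path and the 7-day retention value are just example choices:

        # Load the HPC cmdlets, then dump the last 30 days of job history to CSV for archival
        Add-PSSnapin Microsoft.HPC
        Get-HpcJobHistory -StartDate (Get-Date).AddDays(-30) -EndDate (Get-Date) |
            Export-Csv C:\Archive\JobHistory_LastMonth.csv -NoTypeInformation

        # Keep completed jobs for 7 days before the scheduler cleans them out of the DB
        cluscfg setparams TTLCompletedJobs=7
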

    Thanks!
    -Josh
    Wednesday, August 27, 2008 5:47 PM
    Moderator