High SQL Server Memory usage on HPC system causes significant slowdown

    Question

  • Currently using HPC Pack 2016 Update 1. 

    After leaving the cluster running over a weekend, we had roughly 40 SOA jobs, each with a varying number of requests (somewhere in the range of 200 to 70,000). After a few days, the Cluster Manager application locked up, and viewing job details threw random exceptions. The SQL Server process (running on the single head node) was using roughly 3 GB of RAM. Restarting the SQL Server process made Cluster Manager responsive again. Are there any known resource leaks in how HPC Pack interacts with SQL Server? 

    The SQL Server version on the head node is:

    2018-04-26 09:33:44.55 Server      Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64) 
    Apr 29 2016 23:23:58 
    Copyright (c) Microsoft Corporation
    Express Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)

    This is a development HPC instance, so the topology is 1 head node, 2 worker nodes, all on an Enterprise network. The SQL Server instance is running on the Head Node. 

    Is there a way to detect what is causing SQL Server's memory to grow slowly over time, and is there any additional diagnostic information I should provide? 

    Monday, April 30, 2018 23:22
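
One possible way to approach the detection question above is to snapshot SQL Server's memory clerks at two points in time and see which ones grew. The sketch below is only an illustration: the query uses the `sys.dm_os_memory_clerks` view, while the helper name `rank_clerk_growth` and the sample figures are hypothetical and not taken from this cluster. Running the query is left to whatever SQL client is available.

```python
# T-SQL that totals memory per clerk type; run it with any SQL client
# and keep the results from two different points in time.
MEMORY_CLERK_QUERY = """
SELECT type, SUM(pages_kb) AS total_kb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY total_kb DESC;
"""


def rank_clerk_growth(before, after, top=5):
    """Rank memory clerks by growth in KB between two snapshots.

    `before` and `after` map clerk type -> total KB, as returned by
    MEMORY_CLERK_QUERY. Clerks missing from a snapshot count as 0 KB.
    """
    clerks = set(before) | set(after)
    growth = {c: after.get(c, 0) - before.get(c, 0) for c in clerks}
    # Largest growth first; ties are broken by clerk name for stable output.
    ranked = sorted(growth.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top]


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    friday = {"MEMORYCLERK_SQLBUFFERPOOL": 1000, "CACHESTORE_SQLCP": 200}
    monday = {"MEMORYCLERK_SQLBUFFERPOOL": 1100, "CACHESTORE_SQLCP": 900}
    for clerk, delta_kb in rank_clerk_growth(friday, monday):
        print(f"{clerk}: {delta_kb:+d} KB")
```

A clerk that keeps growing across several snapshots would be a reasonable candidate to report as additional diagnostic information.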

Answers

  • Hi,

    Could you share the details of the exceptions (type, inner exception, message, stack trace, etc.) you saw when you tried to view job details?

    Memory usage of SQL Server can be configured according to this page.

    Thanks,
    Zihao

    • Marked as answer by KB_apl Wednesday, May 2, 2018 15:20
    Tuesday, May 1, 2018 16:28
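
As a rough illustration of the kind of configuration the answer refers to, the sketch below builds the T-SQL batch that sets SQL Server's `max server memory (MB)` option through `sp_configure`. The helper name `build_max_memory_batch`, the 2048 MB value, and the 128 MB floor used as a sanity check are assumptions made for this example; the thread does not state which cap was actually used.

```python
def build_max_memory_batch(cap_mb: int) -> str:
    """Return a T-SQL batch that caps 'max server memory (MB)' at cap_mb."""
    if isinstance(cap_mb, bool) or not isinstance(cap_mb, int):
        raise TypeError("cap_mb must be an integer number of megabytes")
    if cap_mb < 128:
        # Sanity-check floor assumed for this sketch.
        raise ValueError("cap_mb is too small to be a sensible cap")
    statements = [
        # 'max server memory (MB)' is an advanced option, so expose it first.
        "EXEC sys.sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sys.sp_configure 'max server memory (MB)', {cap_mb};",
        "RECONFIGURE;",
    ]
    return "\n".join(statements)


if __name__ == "__main__":
    # 2048 MB is a hypothetical cap, not a recommendation.
    print(build_max_memory_batch(2048))
```

The printed batch can be run against the head node's instance with any SQL client. The right cap depends on how much memory the other HPC Pack services on the head node need.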

All replies

  • Zihao,

    Unfortunately, there were no log files containing relevant error messages or stack traces. However, the linked page did provide the information needed to prevent SQL Server from consuming all memory. After we specified a memory cap, Cluster Manager no longer became unresponsive after extended use. It might be helpful to link those notes from the HPC Pack configuration page, but our case might be special since we are not a SQL Server shop.

    Much appreciated,

    -Kyle

    Wednesday, May 2, 2018 15:25