HPC Server 2008 SOA service calls longer than one hour

    Question

  • In our application, we followed the suggestions at http://blogs.technet.com/b/windowshpc/archive/2010/04/15/suggestions-to-avoid-certain-timeouts-and-exceptions-when-using-soa.aspx to support long-running service calls.

    Specifically, we set the client OperationTimeout to TimeSpan.MaxValue and the receiveTimeout in serviceAssembly.dll.config to "24.00:00:00".
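
    For reference, a WCF receiveTimeout of this kind is normally set as an attribute on a binding element. The fragment below is only a minimal sketch of that shape, not the actual contents of serviceAssembly.dll.config: the netTcpBinding transport and the binding name ExampleBackEndBinding are assumptions made for illustration. Note that .NET reads "24.00:00:00" as d.hh:mm:ss, which is 24 days rather than 24 hours; either value is well above one hour.

    ```xml
    <system.serviceModel>
      <bindings>
        <netTcpBinding>
          <!-- Hypothetical binding name; use the binding your service configuration actually references. -->
          <binding name="ExampleBackEndBinding" receiveTimeout="24.00:00:00" />
        </netTcpBinding>
      </bindings>
    </system.serviceModel>
    ```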

    However, at the one-hour mark our calls are ended and our service jobs finish.

    It seems that the suggestions from that blog page are not sufficient to allow calls longer than one hour.

    Are there additional timeouts that we can configure to allow our calls to run for more than an hour?

    Thanks!

    Monday, June 21, 2010 8:47 PM

Answers

  • Hi Derek,

    It seems that the issue is network related.

    In an isolated network environment (without IPSec), with the suggestions posted at http://blogs.technet.com/b/windowshpc/archive/2010/04/15/suggestions-to-avoid-certain-timeouts-and-exceptions-when-using-soa.aspx applied, I was able to run a request that lasted 80 minutes.

    Can you try the following diagnostics?

    1) Check whether IPSec is running and, if so, turn it off. On the cluster with the one-hour problem, is IPSec running on the nodes? If it is, can you turn it off and try again? On some networks, turning IPSec off means the computers can no longer be accessed remotely, and the HPC cluster may stop working. So please make sure that all nodes are still online after IPSec is turned off.

    2) Collect tracing and view it with SvcTraceViewer, which is part of the free Windows SDK. The steps below show how to get the tracing logs.


    1. Add tracing to the service host on each compute node.

     Edit the system.diagnostics section of the file %CCP_HOME%\bin\HpcServiceHost.exe.config:

     <system.diagnostics>
         <sources>
           <source name="Microsoft.Hpc.HpcServiceHosting" switchValue="All">
             <listeners>
               <add name="Console" />
               <add name="ServiceHostTraceListener" />
             </listeners>
           </source>
         </sources>
         <sharedListeners>
           <!-- Replace the HEADNODE placeholder below with the name of your head node. -->
           <add initializeData="\\<HEADNODE>\CcpSpoolDir\host.svclog" type="System.Diagnostics.XmlWriterTraceListener"
             name="ServiceHostTraceListener">
             <filter type="" />
           </add>
           <add type="System.Diagnostics.ConsoleTraceListener, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
             name="Console" traceOutputOptions="DateTime, ThreadId">
             <filter type="" />
           </add>
         </sharedListeners>
         <trace autoflush="true" />
     </system.diagnostics>

    2. Add tracing to the broker nodes.
     Edit the system.diagnostics section of the file %CCP_HOME%\bin\HpcWcfBroker.exe.config:

      <system.diagnostics>
        <sources>
          <source name="Microsoft.Hpc.ServiceBroker" switchValue="All">
            <listeners>
              <add name="Console">
                <filter type="" />
              </add>
              <add name="WSLBTraceListener">
                <filter type="" />
              </add>
              <remove name="Default" />
            </listeners>
          </source>
        </sources>
        <sharedListeners>
          <!-- Replace the HEADNODE placeholder below with the name of your head node. -->
          <add type="System.Diagnostics.ConsoleTraceListener, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
            name="Console" traceOutputOptions="DateTime, ThreadId">
            <filter type="" />
          </add>
          <add initializeData="\\<HEADNODE>\CcpSpoolDir\broker.svclog"
            type="System.Diagnostics.XmlWriterTraceListener, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
            name="WSLBTraceListener" traceOutputOptions="Timestamp">
            <filter type="" />
          </add>
        </sharedListeners>
        <trace autoflush="true" />
      </system.diagnostics>
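
    If the HPC-specific trace sources above do not show why a call ended, WCF's own trace source can be sent to the same log so that channel faults and aborted connections appear in SvcTraceViewer as well. The fragment below is a sketch under the assumption that the broker process accepts standard WCF diagnostics configuration. System.ServiceModel is the standard WCF trace source name; the element would go inside the existing sources element of HpcWcfBroker.exe.config, and it reuses the WSLBTraceListener shared listener defined there.

    ```xml
    <!-- Place inside the existing <sources> element of HpcWcfBroker.exe.config. -->
    <source name="System.ServiceModel" switchValue="Warning, ActivityTracing" propagateActivity="true">
      <listeners>
        <!-- Reuses the shared listener that already writes broker.svclog. -->
        <add name="WSLBTraceListener" />
      </listeners>
    </source>
    ```

    The Warning level keeps the log small on a long-running call; it can be raised to Information or Verbose if the warnings alone do not explain the disconnect.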

     

     

    Friday, June 25, 2010 4:05 PM

All replies

  • We noticed this phenomenon of HPC SOA calls ending after one hour on two HPC clusters on our internal network.

    Then, on another HPC cluster on a separate network, I was able to run service calls that took longer than an hour without failing.

    So I am wondering whether another timeout, perhaps network related, might interact with HPC SOA tasks and cause calls to fail at the one-hour mark.

    Thanks.

    Wednesday, June 23, 2010 3:53 PM
  • Thank you Liwei. You are right that it turned out to be network related.

    We found a setting on our firewall that was limiting connections to one hour. We have increased this limit, and calls can now run for longer than an hour.

    Derek

    Monday, June 28, 2010 3:26 PM
  • Hi Derek,

    It is great to know your issue was resolved. Can you share details about which firewall setting limits connections to an hour? It would be good to know for future troubleshooting.

    Thanks,

    Liwei

    Tuesday, June 29, 2010 4:25 PM