All Diagnostics and Admin jobs Failed To Run but application can submit Job normally on HPC Pack 2012 R2 Cluster Manager

  • Question

  • Hi All,

    We are using HPC Pack 2012 R2 Update 3 on a failover cluster.

    Until recently, all diagnostics and admin jobs ran without problems.

    Now the following error occurs whenever we run HPC diagnostics or admin jobs such as Run Command.

    Strangely, Cluster Manager can still submit jobs normally.

    Below is the Diagnostics log from when the error occurs.

    09/25/2020 04:46:29.147 i HpcDiagnostics 1896 14432 [Store] New test run submitted for test Id#4 by [Domain Account]
    09/25/2020 04:46:29.148 i HpcDiagnostics 1896 14432 [Store] new RunId is #1419
    09/25/2020 04:46:29.148 i HpcDiagnostics 1896 14432 [Store] parameter: -count:4
    09/25/2020 04:46:29.148 i HpcDiagnostics 1896 14432 [Store] parameter: -Network:Default
    09/25/2020 04:46:29.154 e HpcDiagnostics 1896 4752 An unexpected exception occurred. For more information about this exception, see the Details tab. Additional data: End of Stream encountered before parsing was completed.
    Exception detail: System.Runtime.Serialization.SerializationException: End of Stream encountered before parsing was completed.

    Server stack trace:
       at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.Run()
       at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
       at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
       at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, IMethodCallMessage methodCallMessage)
       at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)

    Exception rethrown at [0]:
       at Microsoft.Hpc.Scheduler.NodeManagement.NodeQuery.HandleException(Exception e)
       at Microsoft.Hpc.Scheduler.NodeManagement.NodeQuery.QueryNodes(IEnumerable`1 constraints)
       at Microsoft.Hpc.Diagnostics.Controller.Utilities.GetNodesListFromNodeTypes(INodeQuery nodeQuery, DiagnosticTargetNodeType type)
       at Microsoft.Hpc.Diagnostics.Controller.SubmittedTestHandler.FilterOutNotSupportedNodes(DiagnosticTestRun testRun, DiagnosticTest testDef)
       at Microsoft.Hpc.Diagnostics.Controller.SubmittedTestHandler.ExecuteInternal(DiagnosticTestRun testRun)
       at Microsoft.Hpc.Diagnostics.Controller.StateHandlerBase.Execute()
       [frames unreadable in the posted log]
       at Microsoft.Hpc.Diagnostics.Controller.DiagnosticsController.RunStateHandlers(Object o)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
       at System.Threading.ThreadPoolWorkQueue.Dispatch()
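
When an exception entry like this one is flattened onto a single log line, it can help to split it back into one frame per line before reading it. The following is a minimal Python sketch, not an HPC Pack tool. It assumes (this is not confirmed anywhere in the thread) that each original line break shows up as two or more dots followed by a space in front of the next `at` frame. The function name `split_frames` is made up for illustration.

```python
import re


def split_frames(flattened: str) -> list[str]:
    """Split a flattened .NET stack trace into one entry per frame.

    Assumption: every original line break was written to the log as
    two or more dots followed by whitespace, right before "at ".
    """
    # Split on a run of dots plus whitespace, but only in front of "at ".
    parts = re.split(r"\.{2,}\s+(?=at )", flattened)
    # Trim whitespace and any leftover trailing dots from each piece.
    return [p.strip().rstrip(".").strip() for p in parts if p.strip()]


if __name__ == "__main__":
    # Shortened, hypothetical sample in the same shape as the log entry.
    sample = (
        "Server stack trace: .. at A.B.Run().. "
        "at C.D.Deserialize(Stream s, Boolean f).. at E.F.Dispatch().."
    )
    for frame in split_frames(sample):
        print(frame)
```

Frames that are corrupted in the log itself stay corrupted. The sketch only restores line breaks.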

    I tried restarting the HPC Diagnostics service, but the result was the same.

    I would appreciate any help with this problem.
    Thanks.

    Monday, October 12, 2020 3:33 AM

All replies

  • Hi jmpark,

    What's the detailed HPC Pack version? Please check HPC Cluster Manager -> Help -> About.

    Could you run 'node list /group:<GroupName>' to list nodes in a node group?

    Regards,

    Yutong Sun

    Monday, October 12, 2020 2:11 PM
    Moderator

  • Hi Yutong Sun, thank you for your answer.

    The detailed HPC Pack version is 4.5.5111.0.

    When I run 'node list /group:HeadNodes', the following error message appears.

    'End of Stream encountered before parsing was completed.'

    The nodes are listed correctly when I run just 'node list' without the group option.

    Thank you.




    • Edited by jmpark Tuesday, October 13, 2020 4:51 AM
    Tuesday, October 13, 2020 4:47 AM
  • Hi jmpark,

    Does this reproduce consistently when running 'node list /group:<GroupName>'? Are you running it on the active head node?

    Could you collect the HpcManagement_*.bin log files under the %CCP_DATA%LogFiles\Management folder on the active head node after reproducing the issue? Just zip the latest few and email them to hpcpack@microsoft.com.

    Regards,

    Yutong Sun

    Thursday, October 15, 2020 1:59 PM
    Moderator