Jobs failing to run on HPC Pack 2016

Question
-
When attempting to run a job on one of my cluster's compute nodes, I keep getting the following error:
The job encountered an error: "Job failed to start on some nodes or some nodes are unreachable"
Error from node: WINDY-CN-01:System.Runtime.Serialization.SerializationException: Unable to find assembly 'Microsoft.Hpc.NodeManager.RemotingExecutor, Version=5.0.0.0, Culture=neutral, PublicKeyToken=null'.
at System.Runtime.Serialization.Formatters.Binary.BinaryAssemblyInfo.GetAssembly()
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.GetType(BinaryAssemblyInfo assemblyInfo, String name)
at System.Runtime.Serialization.Formatters.Binary.ObjectMap..ctor(String objectName, String[] memberNames, BinaryTypeEnum[] binaryTypeEnumA, Object[] typeInformationA, Int32[] memberAssemIds, ObjectReader objectReader, Int32 objectId, BinaryAssemblyInfo assemblyInfo, SizedArray assemIdToAssemblyTable)
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.ReadObjectWithMapTyped(BinaryObjectWithMapTyped record)
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.Run()
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at Microsoft.Hpc.ExceptionWrapper.DeserializeException()
at Microsoft.Hpc.Scheduler.Communicator.Remoting.NodeManagerServiceProxy.<InvokeOperationWithRetryAsync>d__2`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Hpc.WcfReliableClient`1.<InvokeOperationWithRetryAsync>d__7.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Hpc.WcfReliableClient`1.<InvokeOperationWithRetryAsync>d__6.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Hpc.Scheduler.Communicator.Remoting.NodeManagerServiceProxy.<StartJobAndTaskAsync>d__7.MoveNext()
Has anyone else bumped into an issue like this?
Friday, January 26, 2018 12:27 PM
Answers
-
Please migrate your cluster to HPC Pack 2016 Update 1. The migration doc is here: https://technet.microsoft.com/en-us/library/mt829314(v=ws.11).aspx and Update 1 is available as the HPC Pack 2016 Update 1 download.
Qiufang Shi
- Marked as answer by wkerr128 Monday, February 5, 2018 11:48 AM
Tuesday, January 30, 2018 5:35 AM
All replies
-
Hi,
This could be related to a known issue:
- Proposed as answer by qiufang shi (Microsoft employee) Tuesday, January 30, 2018 5:34 AM
Tuesday, January 30, 2018 2:25 AM -
Hi Qiufang Shi,
This appears to be a common theme amongst all the forums regarding HPC Pack 2016 RTM. A lot of problems do indeed appear to be resolved with the HPC Pack 2016 Update 1.
Jobs are now running on my cluster.
--
Cheers,
William
Monday, February 5, 2018 11:49 AM -
Thanks, we are considering removing the HPC Pack 2016 RTM bits from download center.
Qiufang Shi
Tuesday, February 6, 2018 1:52 AM