Cannot connect to HPC Manager on Headnode after installation of HPC Pack 2016

  • Question

  • After successfully installing HPC Pack 2016 on Windows Server 2016, HPC Manager (started on the head node) will not connect (to the scheduler service?). The head node is an AD domain server.

    I got the following error message:

    The connection to the scheduler service failed. detail error: System.ServiceModel.EndpointNotFoundException: Could not connect to net.tcp://ServerHostName:5800/SchedulerStoreService. The connection attempt lasted for a time span of 00:00:06.0089951. TCP error code 10061: No connection could be made because the target machine actively refused it 172.26.1.0:5800.  ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 172.26.1.0:5800
       at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
       at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
       at System.ServiceModel.Channels.SocketConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
       --- End of inner exception stack trace ---
    
    Server stack trace: 
       at System.ServiceModel.Channels.SocketConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
       at System.ServiceModel.Channels.BufferedConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
       at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)
       at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)
       at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
       at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
       at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
       at System.ServiceModel.Channels.ServiceChannel.CallOpenOnce.System.ServiceModel.Channels.ServiceChannel.ICallOnce.Call(ServiceChannel channel, TimeSpan timeout)
       at System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce(TimeSpan timeout, CallOnceManager cascade)
       at System.ServiceModel.Channels.ServiceChannel.EnsureOpened(TimeSpan timeout)
       at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
       at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
       at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
    
    Exception rethrown at [0]: 
       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
       at Microsoft.Hpc.Scheduler.Store.ISchedulerStoreInternal.Register(String clientSource, String userName, ConnectionRole role, Version clientVersion, ConnectionToken& token, UserPrivilege& privilege, Version& serverVersion, Dictionary`2& serverProps)
       at Microsoft.Hpc.Scheduler.Store.StoreServer.RegisterWithServer()
       at Microsoft.Hpc.Scheduler.Store.StoreServer.<_Connect>d__33.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Microsoft.Hpc.Scheduler.Store.StoreServer.<ConnectAsync>d__29.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Microsoft.Hpc.Scheduler.Store.SchedulerStoreSvc.<InitializeAsync>d__41.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Microsoft.Hpc.Scheduler.Store.SchedulerStoreSvc.<RemoteConnectAsync>d__4.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Microsoft.Hpc.Scheduler.Store.SchedulerStore.<ConnectAsync>d__0.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Microsoft.ComputeCluster.Admin.ConnectionManager.ConnectScheduler(Object sender)

    It is also not possible to install HPC Pack Update 1 or Update 2: during installation of the server components, setup fails with an error because it cannot start the scheduler service.

    Thanks for your help!

    Tuesday, February 12, 2019 8:29 AM

All replies

  • Hi andigx,

    From the error, it looks like the scheduler service was not running, so HPC Cluster Manager failed to connect to it.

    Could you first check whether there is any error information in Event Viewer, under both Windows Logs -> Application and Applications and Services Logs -> Microsoft -> HPC -> Scheduler?

    Regards,

    Yutong Sun

    Wednesday, February 13, 2019 1:51 AM
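
The diagnosis in the reply (nothing is listening on port 5800, hence TCP error 10061 "actively refused") can be checked independently of HPC Manager. The following is a minimal sketch in Python, assuming Python is available on the head node. `probe_tcp` is a hypothetical helper name, not part of HPC Pack. The host name and port are the ones shown in the error message.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a plain TCP connection and classify the outcome.

    Returns "open" if something accepted the connection, "refused" if the
    target machine rejected it (Winsock error 10061 on Windows), "timeout"
    if no answer arrived in time, or "error: ..." for anything else
    (for example a host name that does not resolve).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The machine is reachable but no process listens on this port.
        return "refused"
    except socket.timeout:
        # Packets are probably being dropped, e.g. by a firewall.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Calling `probe_tcp("ServerHostName", 5800)` on the head node would be expected to return "refused" while the scheduler service is stopped, which matches the exception above. A result of "timeout" would instead point toward a firewall or routing problem rather than a stopped service.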