Error in CcpManagement on add node

  • Question

    Hi

    Can someone give me some insight into how I might resolve a problem with adding compute nodes to my cluster?

    This is a 'simple' 4-node cluster used to evaluate some 10GbE hardware. I am having trouble getting my compute nodes added to the cluster.

     

    The cluster has 3 networks

    Public network, used by the head node and service node – 10.10.0.x

    Private network (management traffic) – 192.168.30.x

    MPI network (Chelsio 10G) – 192.168.10.xxx

     

    All nodes are in the same domain – MSCLUSTER

    I found some similar error reports on the web and tried the suggested fix of changing the SID, but that did not work.

     

    I initiate Add Node from the head node (VIC22). It reports that an exception occurred while adding the compute node.

    On the compute node, I have the following error in the event log:

     

    Source: CcpManagement

    The Management service encountered an error while discovering or updating the configuration of this compute node. This error may prevent the compute node from joining the cluster. A call to SSPI failed, see inner exception.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

     

    Also, CcpManagement.log in the system directory contains the following error, which repeats each time the compute node tries to join the cluster.

     

     

    2008/06/18 13:50:50 [CcpManagement] [Info] Trying to establish connection to headnode.

    2008/06/18 13:50:50 [CcpManagement] [Info] The head node name is VIC22.

    2008/06/18 13:50:50 [CcpManagement] [Info] Resolved host VIC22 to IP 10.10.0.152

    2008/06/18 13:50:50 [CcpManagement] [Info] Resolved host VIC22 to IP 192.168.10.100

    2008/06/18 13:50:50 [CcpManagement] [Info] Resolved host VIC22 to IP 192.168.30.2

    2008/06/18 13:50:50 [CcpManagement] [Info] Connecting to ClusterManager service on host VIC22 with uri tcp://10.10.0.152:6729/Microsoft.Ccp.ClusterManager

    2008/06/18 13:50:50 [CcpManagement] [Info] Connecting to ClusterManager service on host VIC22 with uri tcp://192.168.10.100:6729/Microsoft.Ccp.ClusterManager

    2008/06/18 13:50:50 [CcpManagement] [Info] Connecting to ClusterManager service on host VIC22 with uri tcp://192.168.30.2:6729/Microsoft.Ccp.ClusterManager

    2008/06/18 13:50:50 [CcpManagement] [Error] Event 6103: The Management service encountered an error while discovering or updating the configuration of this compute node. This error may prevent the compute node from joining the cluster. A call to SSPI failed, see inner exception.

    2008/06/18 13:50:50 [CcpManagement] [Error] Exception:

    System.Security.Authentication.AuthenticationException: A call to SSPI failed, see inner exception. ---> System.ComponentModel.Win32Exception: The system detected a possible attempt to compromise security.  Please ensure that you can contact the server that authenticated you

       --- End of inner exception stack trace ---

     

    Server stack trace:

       at System.Net.Security.NegoState.ProcessAuthentication(LazyAsyncResult lazyResult)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.CreateAuthenticatedStream(Stream netStream, String machinePortAndSid)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.CreateSocketHandler(Socket socket, SocketCache socketCache, String machinePortAndSid)

       at System.Runtime.Remoting.Channels.RemoteConnection.CreateNewSocket(EndPoint ipEndPoint)

       at System.Runtime.Remoting.Channels.RemoteConnection.CreateNewSocket()

       at System.Runtime.Remoting.Channels.SocketCache.GetSocket(String machinePortAndSid, Boolean openNew)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.SendRequestWithRetry(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.ProcessMessage(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)

       at Microsoft.ComputeCluster.ClientSink.ProcessMessage(IMessage message, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)

       at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)

     

    Exception rethrown at [0]:

       at Microsoft.ComputeCluster.Management.ManagementServicesConnectionHelper.ConnectToClusterManager(String host)

       at Microsoft.ComputeCluster.Management.HpcComputeNode.ResetHeadNodeConnection()

       at Microsoft.ComputeCluster.Management.HpcComputeNode.RefreshStatus(Object data, Boolean fromTimer)

     

    Inner exception:

    System.ComponentModel.Win32Exception: The system detected a possible attempt to compromise security.  Please ensure that you can contact the server that authenticated you

    2008/06/18 13:55:50 [CcpManagement] [Info] Trying to establish connection to headnode.

    2008/06/18 13:55:50 [CcpManagement] [Info] The head node name is VIC22.

    2008/06/18 13:55:50 [CcpManagement] [Info] Resolved host VIC22 to IP 10.10.0.152

    2008/06/18 13:55:50 [CcpManagement] [Info] Resolved host VIC22 to IP 192.168.10.100

    2008/06/18 13:55:50 [CcpManagement] [Info] Resolved host VIC22 to IP 192.168.30.2

    2008/06/18 13:55:50 [CcpManagement] [Info] Connecting to ClusterManager service on host VIC22 with uri tcp://10.10.0.152:6729/Microsoft.Ccp.ClusterManager

    2008/06/18 13:55:50 [CcpManagement] [Info] Connecting to ClusterManager service on host VIC22 with uri tcp://192.168.10.100:6729/Microsoft.Ccp.ClusterManager

    2008/06/18 13:55:50 [CcpManagement] [Info] Connecting to ClusterManager service on host VIC22 with uri tcp://192.168.30.2:6729/Microsoft.Ccp.ClusterManager

    2008/06/18 13:55:50 [CcpManagement] [Error] Event 6103: The Management service encountered an error while discovering or updating the configuration of this compute node. This error may prevent the compute node from joining the cluster. A call to SSPI failed, see inner exception.

    2008/06/18 13:55:50 [CcpManagement] [Error] Exception:

    System.Security.Authentication.AuthenticationException: A call to SSPI failed, see inner exception. ---> System.ComponentModel.Win32Exception: The system detected a possible attempt to compromise security.  Please ensure that you can contact the server that authenticated you

       --- End of inner exception stack trace ---

     

    Server stack trace:

       at System.Net.Security.NegoState.ProcessAuthentication(LazyAsyncResult lazyResult)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.CreateAuthenticatedStream(Stream netStream, String machinePortAndSid)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.CreateSocketHandler(Socket socket, SocketCache socketCache, String machinePortAndSid)

       at System.Runtime.Remoting.Channels.RemoteConnection.CreateNewSocket(EndPoint ipEndPoint)

       at System.Runtime.Remoting.Channels.RemoteConnection.CreateNewSocket()

       at System.Runtime.Remoting.Channels.SocketCache.GetSocket(String machinePortAndSid, Boolean openNew)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.SendRequestWithRetry(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream)

       at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.ProcessMessage(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)

       at Microsoft.ComputeCluster.ClientSink.ProcessMessage(IMessage message, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)

       at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)

     

    Exception rethrown at [0]:

       at Microsoft.ComputeCluster.Management.ManagementServicesConnectionHelper.ConnectToClusterManager(String host)

       at Microsoft.ComputeCluster.Management.HpcComputeNode.ResetHeadNodeConnection()

       at Microsoft.ComputeCluster.Management.HpcComputeNode.RefreshStatus(Object data, Boolean fromTimer)

     

    Inner exception:

     

    Peter

    Wednesday, June 18, 2008 10:38 PM

Answers

  • Hi Peter,

    At this point you should try our RC1 release and see if you can reproduce the problem.  Sorry we couldn't provide any solutions sooner but the team was completely focused on fixing these types of issues for RC1.

    Thanks for trying out HPC Pack 2008; hopefully we'll get to your post much sooner should you encounter any issues with our latest release.

    Regards,
    --Brian

    Wednesday, July 2, 2008 4:59 AM

All replies

  • Did you have a chance to move to RC1? Did that help with the issue? If not, please let us know.

    Given that this is a security error, I'd check that the account used to provision the compute nodes has appropriate permissions to create and modify machine accounts in Active Directory.

    Ryan Waite - Product Unit Manager - Windows HPC
    Monday, July 14, 2008 6:45 PM
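
    For anyone hitting the same error: the log in the question shows VIC22 resolving to one address on each of the three networks, with a connection attempt to port 6729 on every one. A rough way to see which network each resolved address belongs to, and whether the port is reachable at all from the compute node, might look like the Python sketch below. The /24 prefix lengths are an assumption (the post only gives 10.10.0.x, 192.168.30.x and 192.168.10.xxx), and the function names are made up for illustration. A successful TCP connect only shows the port is reachable; it does not exercise the SSPI authentication that fails in the log.

    ```python
    import ipaddress
    import socket

    # Subnets taken from the question; the /24 prefix length is an assumption.
    NETWORKS = {
        "public": ipaddress.ip_network("10.10.0.0/24"),
        "private": ipaddress.ip_network("192.168.30.0/24"),
        "mpi": ipaddress.ip_network("192.168.10.0/24"),
    }


    def classify(ip):
        """Return the name of the cluster network that contains ip, or 'unknown'."""
        addr = ipaddress.ip_address(ip)
        for name, net in NETWORKS.items():
            if addr in net:
                return name
        return "unknown"


    def resolve_ipv4(host):
        """Return every IPv4 address the local resolver gives for host."""
        infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
        return sorted({info[4][0] for info in infos})


    def can_connect(ip, port=6729, timeout=3.0):
        """Check plain TCP reachability only; no authentication is attempted."""
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return True
        except OSError:
            return False


    def report(host, port=6729):
        """Print one line per resolved address: network name and reachability."""
        for ip in resolve_ipv4(host):
            network = classify(ip)
            reachable = can_connect(ip, port)
            print(f"{host} -> {ip} [{network}] tcp/{port} reachable: {reachable}")
    ```

    Running `report("VIC22")` on the compute node would print one line per address. If the head node name resolves to the MPI or public address first, that may be worth comparing against the network the cluster is supposed to use for management traffic.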