Couldn't add preconfigured ComputeNode

  • Question

  • Hi all.

    I have two preconfigured compute nodes. When I try to add them to the HPC cluster (or assign a template), I get the following error:

     

    Time	Message
    
    11.01.2011 13:47:33	Reverted
    11.01.2011 13:47:33	Disassociating template from node HPCS\NODE-W2K82
    11.01.2011 13:47:33	The operation failed due to errors during execution.
    11.01.2011 13:47:33	The operation failed and will not be retried.
    11.01.2011 13:47:33	Failed to execute the change on the target node
    11.01.2011 13:47:33	Could not contact node 'NODE-W2K82' to perform change. The management service was unable to connect to the node using any of the IP addresses resolved for the node.
    11.01.2011 13:47:33	Could not contact node 'NODE-W2K82' to perform change. Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
    11.01.2011 13:47:33	Checking the configuration of node HPCS\NODE-W2K82
    
    ...
    
    11.01.2011 13:47:23	Failed to execute the change on the target node
    11.01.2011 13:47:23	Could not contact node 'NODE-W2K82' to perform change. The management service was unable to connect to the node using any of the IP addresses resolved for the node.
    11.01.2011 13:47:23	Could not contact node 'NODE-W2K82' to perform change. Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
    11.01.2011 13:47:23	Checking the configuration of node HPCS\NODE-W2K82
    11.01.2011 13:47:23	Associating template ComputeNode Template with node HPCS\NODE-W2K82
    11.01.2011 13:47:23	Moving node HPCS\NODE-W2K82 from state Unknown to state Provisioning
    11.01.2011 13:47:23	Assigning template ComputeNode Template to node NODE-W2K82
    
    

    And Wireshark shows the following (this is a fragment):

    No.   Time    Source        Destination      Protocol Info
       27 12.836361  HEADNODE      NODE-W2K82      TCP   50346 > 6730 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=8 SACK_PERM=1
       28 12.836889  NODE-W2K82      HEADNODE      TCP   6730 > 50346 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1460 WS=8 SACK_PERM=1
       29 12.836994  HEADNODE      NODE-W2K82      TCP   50346 > 6730 [ACK] Seq=1 Ack=1 Win=65536 Len=0
       30 12.838043  HEADNODE      NODE-W2K82      TCP   50346 > 6730 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=1771
       31 12.838530  NODE-W2K82      HEADNODE      TCP   6730 > 50346 [ACK] Seq=1 Ack=1772 Win=65536 Len=0
       32 12.839860  NODE-W2K82      HEADNODE      TCP   6730 > 50346 [PSH, ACK] Seq=1 Ack=1772 Win=65536 Len=190
       33 12.840542  HEADNODE      NODE-W2K82      TCP   50346 > 6730 [PSH, ACK] Seq=1772 Ack=191 Win=65280 Len=592
       34 12.842255  NODE-W2K82      HEADNODE      TCP   6730 > 50346 [RST, ACK] Seq=191 Ack=2364 Win=0 Len=0
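
Reading the fragment: the TCP handshake on port 6730 completes (packets 27 to 29), the head node sends 1771 bytes, the node answers with 190 bytes, the head node sends another 592 bytes, and the node then replies with [RST, ACK]. That matches the "forcibly closed by the remote host" message in the log. For longer captures, a minimal Python sketch like the one below can pick out which side sent the reset. It assumes the capture was pasted as text in the same column layout as above; the `ROW` pattern and the `find_resets` name are illustrative, not part of Wireshark or HPC Pack.

```python
import re

# Matches one row of a Wireshark text summary in the layout shown above, e.g.
#   34 12.842255  NODE-W2K82  HEADNODE  TCP  6730 > 50346 [RST, ACK] Seq=191 ...
# (assumed layout: No., Time, Source, Destination, Protocol, Info)
ROW = re.compile(
    r"^\s*(?P<no>\d+)\s+(?P<time>[\d.]+)\s+(?P<src>\S+)\s+(?P<dst>\S+)\s+TCP\s+"
    r"(?P<sport>\d+)\s*>\s*(?P<dport>\d+)\s+\[(?P<flags>[^\]]+)\]"
)


def find_resets(capture_text):
    """Return (packet number, sender, source port) for every row with RST set."""
    resets = []
    for line in capture_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header line or a row that is not in the expected layout
        flags = [flag.strip() for flag in match.group("flags").split(",")]
        if "RST" in flags:
            resets.append(
                (int(match.group("no")), match.group("src"), int(match.group("sport")))
            )
    return resets


if __name__ == "__main__":
    sample = (
        "   33 12.840542  HEADNODE      NODE-W2K82      TCP   "
        "50346 > 6730 [PSH, ACK] Seq=1772 Ack=191 Win=65280 Len=592\n"
        "   34 12.842255  NODE-W2K82      HEADNODE      TCP   "
        "6730 > 50346 [RST, ACK] Seq=191 Ack=2364 Win=0 Len=0\n"
    )
    print(find_resets(sample))
```

Here the reset comes from the compute node's side of the connection (source port 6730), so the node accepted the connection and then dropped it after the head node's second message, rather than being unreachable at the network level.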
    

    The problem is the same for both compute nodes. Reinstalling HPC Pack on the compute nodes doesn't help.

    Could anyone help me?

    Tuesday, January 11, 2011 11:21 AM

Answers

  • Hi. I found the cause, and it was in my AD: there were a few AD-related errors in the Event Log.

    So I reinstalled AD and HPC and the problem no longer exists.

    Thank you all!

    Thursday, January 13, 2011 6:56 AM

All replies

  • Hi dim.tdv,

     

    I experienced almost the same problem as you. Mine was fixed after checking the network configuration. My network layout uses 2 LANs on the head node and 1 LAN on the compute node.

    What is your network layout?

     

    KC POLIRAN

    Wednesday, January 12, 2011 1:00 AM
  • It's possible that you have incorrect or stale name-resolution entries, so the head node (HN) is not talking to the nodes you want it to. Please make sure that the entries in the hosts file (\windows\system32\drivers\etc\hosts) are correct.
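
One way to run that check is to list what the hosts file says for the node and compare it with what the resolver actually returns on the head node. The following is a rough Python sketch, not an HPC Pack tool: the function names are made up for illustration, and the `C:` drive letter in the path is an assumption (the path above is given without a drive).

```python
import socket


def hosts_entries(hosts_text, hostname):
    """Return the IP addresses that a hosts-file text maps to hostname.

    Host names are compared case-insensitively; '#' starts a comment.
    """
    wanted = hostname.lower()
    addresses = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], [name.lower() for name in fields[1:]]
        if wanted in names:
            addresses.append(ip)
    return addresses


def resolved_addresses(hostname):
    """Return the IPv4 addresses the system resolver gives for hostname."""
    return socket.gethostbyname_ex(hostname)[2]


if __name__ == "__main__":
    node = "NODE-W2K82"
    # Assumed location; adjust the drive letter to match the head node.
    with open(r"C:\Windows\System32\drivers\etc\hosts") as handle:
        from_hosts = hosts_entries(handle.read(), node)
    print("hosts file:", from_hosts)
    print("resolver:  ", resolved_addresses(node))
```

If the hosts file lists an address the node no longer uses, or several conflicting addresses, that entry is a likely candidate for the stale record described above.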

     

    Wednesday, January 12, 2011 11:19 PM