Couldn't add preconfigured ComputeNode

Question
Hi all.
I have two preconfigured compute nodes. When I try to add them to the HPC cluster (or assign a template), I get the following error:
Time  Message
11.01.2011 13:47:33  Reverted
11.01.2011 13:47:33  Disassociating template from node HPCS\NODE-W2K82
11.01.2011 13:47:33  The operation failed due to errors during execution.
11.01.2011 13:47:33  The operation failed and will not be retried.
11.01.2011 13:47:33  Failed to execute the change on the target node
11.01.2011 13:47:33  Could not contact node 'NODE-W2K82' to perform change. The management service was unable to connect to the node using any of the IP addresses resolved for the node.
11.01.2011 13:47:33  Could not contact node 'NODE-W2K82' to perform change. Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
11.01.2011 13:47:33  Checking the configuration of node HPCS\NODE-W2K82 ...
11.01.2011 13:47:23  Failed to execute the change on the target node
11.01.2011 13:47:23  Could not contact node 'NODE-W2K82' to perform change. The management service was unable to connect to the node using any of the IP addresses resolved for the node.
11.01.2011 13:47:23  Could not contact node 'NODE-W2K82' to perform change. Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
11.01.2011 13:47:23  Checking the configuration of node HPCS\NODE-W2K82
11.01.2011 13:47:23  Associating template ComputeNode Template with node HPCS\NODE-W2K82
11.01.2011 13:47:23  Moving node HPCS\NODE-W2K82 from state Unknown to state Provisioning
11.01.2011 13:47:23  Assigning template ComputeNode Template to node NODE-W2K82
And Wireshark shows the following (this is a fragment):
No.  Time       Source      Destination  Protocol  Info
27   12.836361  HEADNODE    NODE-W2K82   TCP       50346 > 6730 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=8 SACK_PERM=1
28   12.836889  NODE-W2K82  HEADNODE     TCP       6730 > 50346 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1460 WS=8 SACK_PERM=1
29   12.836994  HEADNODE    NODE-W2K82   TCP       50346 > 6730 [ACK] Seq=1 Ack=1 Win=65536 Len=0
30   12.838043  HEADNODE    NODE-W2K82   TCP       50346 > 6730 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=1771
31   12.838530  NODE-W2K82  HEADNODE     TCP       6730 > 50346 [ACK] Seq=1 Ack=1772 Win=65536 Len=0
32   12.839860  NODE-W2K82  HEADNODE     TCP       6730 > 50346 [PSH, ACK] Seq=1 Ack=1772 Win=65536 Len=190
33   12.840542  HEADNODE    NODE-W2K82   TCP       50346 > 6730 [PSH, ACK] Seq=1772 Ack=191 Win=65280 Len=592
34   12.842255  NODE-W2K82  HEADNODE     TCP       6730 > 50346 [RST, ACK] Seq=191 Ack=2364 Win=0 Len=0
The problem is the same for both compute nodes. Reinstalling HPC Pack on the compute nodes doesn't help.
Could anyone help me?
Tuesday, January 11, 2011 11:21 AM
Answers
Hi. I found the cause, and it was in my AD: I found a few AD-related errors in the Event Log.
So I reinstalled AD and HPC and the problem no longer exists.
Thank you all!
- Marked as answer by Don Pattee Friday, February 4, 2011 10:00 PM
Thursday, January 13, 2011 6:56 AM
All replies
Hi dim.tdv,
I experienced almost the same problem as you. Mine was fixed after checking the network configuration. My network layout has 2 LANs on the head node and 1 LAN on the compute node.
What is your network layout?
KC POLIRAN
Wednesday, January 12, 2011 1:00 AM
It's possible that you have incorrect or stale name resolution entries, so the head node (HN) is not talking to the nodes you want it to. Please make sure that the entries in the hosts file (\windows\system32\drivers\etc\hosts) are correct.
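One way to check this from the head node is sketched below in Python. It is only an illustration, not part of HPC Pack: the helper names (`parse_hosts`, `resolved_addresses`, `can_connect`, `check_node`) are made up for this example, the hosts path assumes a default Windows install, and port 6730 is simply the port that appears in the Wireshark trace in the question. Note that in that trace the TCP handshake succeeds and the reset arrives later, so a successful connect here does not rule out a failure further along.

```python
import os
import socket


def parse_hosts(text):
    """Map each lowercased host name in hosts-file text to the IPs listed for it."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            entries.setdefault(name.lower(), []).append(ip)
    return entries


def resolved_addresses(name):
    """Return the sorted, unique IPs the OS resolver gives for name ([] on failure)."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(ip, port, timeout=3.0):
    """Return True if a TCP connection to (ip, port) can be opened."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(node, port=6730, hosts_path=None):
    """Print hosts-file entries, resolver results and TCP reachability for node."""
    if hosts_path is None:
        # Assumption: default hosts file location on Windows.
        root = os.environ.get("SystemRoot", r"C:\Windows")
        hosts_path = os.path.join(root, "System32", "drivers", "etc", "hosts")
    try:
        with open(hosts_path, encoding="utf-8", errors="replace") as fh:
            in_hosts = parse_hosts(fh.read()).get(node.lower(), [])
    except OSError:
        in_hosts = []
    print("hosts file entries:", in_hosts)
    for ip in resolved_addresses(node):
        print(ip, "port", port, "reachable:", can_connect(ip, port))
```

Running `check_node("NODE-W2K82")` on the head node lists every address the node name resolves to. An address that does not belong to the compute node, or several conflicting hosts entries for the same name, would point to the stale entries described above.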
Wednesday, January 12, 2011 11:19 PM