FQDN and endpoint problem

  • Question

  • We're experimenting with HPC and are running into a little glitch that I'd like to understand.  Our HPC deployment uses the enterprise model where the clients reside on one corporate domain, and the head and compute nodes reside on another corporate domain.

    When playing around with the Echo example, we've found that we can create a session with no problem, but when it comes to invoking the service, we get errors about not being able to locate the head node machine.  If we look at the endpoint that resides in the session, it contains just the head node's machine name, not its FQDN (which is on a different domain than the client).  We can make things work by adding the domain suffix to the DNS suffix list in the client NIC's TCP/IP settings, but we really don't want to do this, as it might cause problems when this thing eventually gets rolled out.

    So the question...  is there some configuration setting we can use so that we can avoid modifying the client's NIC DNS setting?  Manually replacing the endpoint obtained from the session with the FQDN in the call creating the client:
         client = new MyServiceClient( binding, endpoint );
    goes through OK, but then the wait for the service to complete never terminates, and no error is reported.

    Thanks in advance.
    Thursday, March 11, 2010 3:20 PM

All replies

  • When you create the session, what kind of name do you use for the headnode? Is it FQDN or the netbios name only?

    Monday, March 15, 2010 5:45 AM

  • Thanks for replying--I should have been clearer on that point.  Because I don't have the domain suffix added to my DNS suffixes, the machine name on its own is not sufficient to create the session, so I use the FQDN when specifying the head node.

    I've also used the IP address as the head node string, and that also works when creating the session, but the endpoint string in the session object still contains only the machine name, without the suffix.

    I've also tried manually constructing an endpoint string in the

    client = new MyServiceClient( binding, endpoint )

    call so that the endpoint string contains the fqdn, but that just causes the above call to hang.
    Monday, March 15, 2010 1:20 PM
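For readers following along, the manual rewrite described in the post above might look roughly like the sketch below. It only illustrates swapping the host part of an endpoint URI for an FQDN. The helper name `WithHost`, the `net.tcp` address, the port, the path, and the domain are all made up for illustration and are not taken from the HPC or WCF APIs. As the poster reports, doing this still left the call hanging, so treat it as a picture of the attempt rather than a fix.

```csharp
using System;

static class EndpointRewrite
{
    // Hypothetical helper (not part of the HPC or WCF APIs): returns the same
    // URI with only its host replaced by the supplied fully qualified name.
    public static Uri WithHost(Uri original, string fqdn)
    {
        var builder = new UriBuilder(original) { Host = fqdn };
        return builder.Uri;
    }

    static void Main()
    {
        // Placeholder address and domain; the real endpoint comes from the session.
        var fromSession = new Uri("net.tcp://headnode:9091/Broker");
        var rewritten = WithHost(fromSession, "headnode.cluster.example.com");
        Console.WriteLine(rewritten);
    }
}
```

The resulting `Uri` could then be wrapped in a `System.ServiceModel.EndpointAddress` and passed as the `endpoint` argument to the client constructor shown in the post.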
  • Can you work with your domain admin to get the cluster's suffix added to the DNS suffix search list of the client's domain?  This can be done with a GPO, for example.

    Monday, March 15, 2010 5:33 PM

  • Thanks for the thought.

    Unfortunately, we'll be deploying the software on client HPC systems where we don't necessarily have the capability to tell the client that they must change their network settings for everyone who wishes to invoke jobs.  It would be nice not to have to tell them that they must do this.
    Tuesday, March 16, 2010 1:50 AM
  • We are currently working with the WCF team on this issue; please give us some time to find a solution. Thanks.


    Monday, March 22, 2010 7:36 AM
  • Hi,

    We have tried to reproduce this problem in our own multiple domain environment and cannot see this issue.

    Judging from the situation you described, it seems that you have a problem resolving names to IP addresses across multiple corporate domains. Can you work with your administrator on the DNS issue? Also, can you ping a computer on the other domain without using its FQDN?


    Wednesday, March 24, 2010 8:48 AM
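The ping test suggested above can also be approximated from code. The sketch below is a minimal check, assuming hypothetical host names, that reports whether the short name and the FQDN each resolve from the client machine. It uses only `System.Net.Dns.GetHostAddresses` and says nothing about whether the head node is actually reachable.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class ResolveCheck
{
    // Returns true if the name resolves to at least one address on this machine.
    public static bool CanResolve(string host)
    {
        try
        {
            return Dns.GetHostAddresses(host).Length > 0;
        }
        catch (SocketException)
        {
            // Resolution failed, e.g. the name is unknown without a DNS suffix.
            return false;
        }
    }

    static void Main()
    {
        // Placeholder names; substitute the real head node name and its FQDN.
        foreach (var name in new[] { "headnode", "headnode.cluster.example.com" })
        {
            Console.WriteLine("{0}: {1}", name,
                CanResolve(name) ? "resolves" : "does not resolve");
        }
    }
}
```

If the short name fails while the FQDN succeeds, that matches the behavior described in the thread, where the short name only resolves once the domain suffix is added to the client's DNS suffix list.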

  • I will raise this issue again with our IT staff and see what they have to say.


    As for pinging a machine on the other domain without the domain suffix, the answer is no, I cannot (unless I put the domain suffix in the NIC's DNS suffix list).  I believe there's a reason why the IT staff has set things up this way...

    Wednesday, March 24, 2010 2:10 PM