Standard Sample does not work (EchoSvcLib)

  • Question

  • Hello everyone,

    We tried to run a sample solution on our cluster network, but it did not work.

    Sample Solution: http://channel9.msdn.com/learn/courses/HPCLearningCourse/soaandwcf/lab39/

    (it is the standard sample with EchoSvcLib and EchoClient)

     

    Error Message found in the log of running tasks (HPC Cluster Manager):

    Task ID: 1
    Task Name: (empty)
    State: Finished
    Command Line: "%CCP_HOME%bin\HpcWcfBroker.exe"
    Requested Resources: 1-1 Cores
    Start Time: 27.05.2010 15:23:47
    End Time: 27.05.2010 15:24:15
    Output / Error Message:

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1078/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:50.0418357Z

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1079/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:50.1218357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1078/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:50.3438357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1079/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:50.3488357Z

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1080/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:52.0098357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1080/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:52.0128357Z

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1077/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:52.0158357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1077/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:52.0178357Z

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1082/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:52.3578357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1082/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:52.3608357Z

    Microsoft.Hpc.ServiceBroker Warning: 10000 : Request urn:uuid:bb8456ff-b5cd-4602-8990-9d22bc12dde7 from user DSOROD\Administrator has been given up because Request has failed more than 3 times, broker will not deliver it again

        DateTime=2010-05-27T13:23:52.3638357Z

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1083/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:53.3318357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1083/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:53.3348357Z

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1084/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:54.3408357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1084/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:54.3438357Z

    Microsoft.Hpc.ServiceBroker Warning: 10006 : Failed to open service net.tcp://private.dsor-optdie02:9088/97/1085/_defaultEndpoint, error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:55.3558357Z

    Microsoft.Hpc.ServiceBroker Warning: 0 : Service net.tcp://private.dsor-optdie02:9088/97/1085/_defaultEndpoint failed. Error:No DNS entries exist for host private.dsor-optdie02.

        DateTime=2010-05-27T13:23:55.3588357Z

    Microsoft.Hpc.ServiceBroker Warning: 10000 : Request urn:uuid:750b2afe-a54b-4a84-b9f9-f54875c118fc from us

    -------------------------- Output Truncated --------------------------

     

    We also created WCF trace files to find out where the problem is:

    http://marian-schiemann.de/dl/hpc/host.svclog
    http://marian-schiemann.de/dl/hpc/broker.svclog

     

    Any ideas how to solve this?

     

    Thanks,

    Marian

    Monday, May 31, 2010 7:41 PM

All replies

  • Hi Marian,

    Can you run the SOA diagnostic tests, if you have not run them yet? Do they pass? If they fail, what is the error message?

    Thanks,

    Liwei

    Tuesday, June 1, 2010 9:59 PM
  • Also, which network topology do you use? (I mean the five network topology choices offered in the To-do List of the Cluster Manager.)

    Besides that, can you run "cluscfg listenvs /scheduler:<your_headnode>"? Please let us know the value of WCF_NETWORKPREFIX.

    The error looks like you have an out-of-sync network (thus "No DNS entries exist for host private...").
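
    A minimal Python sketch of this check, for anyone who wants to script it. It assumes that "cluscfg listenvs" prints one NAME=VALUE pair per line, and that the broker forms the host name as the lowercased prefix, a dot, and the node name. That naming rule is only inferred from the log in this thread ("private.dsor-optdie02"), not taken from the HPC documentation. The function names are hypothetical.

```python
import socket


def parse_listenvs(output: str) -> dict:
    """Parse NAME=VALUE lines, the format assumed for `cluscfg listenvs` output."""
    env = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            env[key.strip()] = value.strip()
    return env


def prefixed_host_name(prefix: str, node: str) -> str:
    """Build the name the broker appears to use (assumption: '<prefix>.<node>')."""
    return f"{prefix.lower()}.{node}"


def resolves(host: str) -> bool:
    """Return True if the local resolver can find an address for host."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False
```

    To use it, paste the listenvs output into parse_listenvs, pass its WCF_NETWORKPREFIX value and a compute node name to prefixed_host_name, and call resolves on the result from the broker node. If it returns False, the prefix does not match a network that DNS knows about.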

    Tuesday, June 1, 2010 10:53 PM
  • Hello, and thanks for the fast replies.

     

    All SOA diagnostics tests pass and there are no error messages.

    We use the network topology 5 (everything on an enterprise network).

     

    cluscfg listenvs /scheduler output:

    WCF_NETWORKPREFIX=Private

     

    What exactly do you mean by "out-of-sync network", and what are the main causes of an "out-of-sync network"?

    Thanks,

    Marian

    Wednesday, June 2, 2010 7:48 AM
  • Hi Marian,

    For network topology 5, the WCF_NETWORKPREFIX value should be Enterprise. Can you try this?

    I am not sure why you got Private on your cluster. By default, if you use our HpcClusterManager to configure the cluster, WCF_NETWORKPREFIX should be configured automatically for you.

    thanks,

    Liwei

     

    Thursday, June 3, 2010 8:32 PM
  • Great.

    We changed the WCF_NETWORKPREFIX manually and now everything works fine. For all the others with the same problem:

    cluscfg setenvs -scheduler:<headnode> "WCF_NETWORKPREFIX=Enterprise"
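
    To confirm that a cluster shows the same symptom before changing the setting, the broker task output can be scanned for the host names that failed to resolve. This is a rough Python sketch, assuming the error text matches the log lines quoted in the question; the function names are hypothetical.

```python
import re
from collections import Counter

# Matches the error text seen in the broker output in this thread.
DNS_ERROR = re.compile(r"No DNS entries exist for host (\S+)")


def unresolved_hosts(log_text: str) -> Counter:
    """Count how often each host name appears in a DNS failure message."""
    hosts = Counter()
    for match in DNS_ERROR.finditer(log_text):
        # The log ends each sentence with a period; drop it from the name.
        hosts[match.group(1).rstrip(".")] += 1
    return hosts


def network_prefix(host: str) -> str:
    """Return the first label of the host name, e.g. 'private'."""
    return host.partition(".")[0]
```

    If every failing host starts with a label such as "private" that does not match the network topology in use, the WCF_NETWORKPREFIX value is the likely cause.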

     

    But we still don't know why the HpcClusterManager did not configure it automatically.

     

    Thank you for the great support,

    Marian

    Thursday, June 3, 2010 9:36 PM