Configuration Test MPI Ping Pong Lightweight Throughput failed

  • Question

  • Hi, all!

    I tried to run the test "MPI Ping Pong Lightweight Throughput" from "HPC Cluster Manager" and obtained the following output:

    Node: HPC-HEAD
    Message: There is an error in XML document (1, 1). --> Data at the root level is invalid. Line 1, position 1.



    I am not sure I understand this result, so any help would be appreciated.
    My cluster is based on HPC Server 2008 and has two nodes in topology 5 (all nodes on the enterprise network).




     

    • Moved by parmita mehta (Moderator), Thursday, November 19, 2009 9:51 PM (From: Windows HPC Server Deployment, Management, and Administration)
    Wednesday, November 4, 2009 12:58 PM

Answers

  • That pointed me in the right direction, thanks.

    The root cause is the firewall on the nodes.  Either the install doesn't open the correct ports, or some group policy settings are being propagated (although I have my OU set to "do not inherit").

    Turning the firewall off for everyone allows all the tests to complete, so now it's just a matter of figuring out what settings and ports need to be set correctly.

    Thanks for the help.
    Thursday, February 18, 2010 5:56 PM
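
    When narrowing down which firewall rules are missing, one option is to probe individual TCP ports between nodes rather than disabling the firewall everywhere. The following is a minimal sketch, not part of HPC Pack: the function name can_connect is hypothetical, and the thread does not say which ports the MPI diagnostics need, so the port in the usage comment is only a placeholder.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example use from a compute node (12345 is a placeholder port, not a documented one):
#     can_connect("HPC-HEAD", 12345)
```

    A successful probe only shows that one port is reachable and that something is listening on it; it does not prove that every port the tests use is open.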

All replies

  • Hi Igor,

    Thanks for reporting this issue. Can you check the job management console to see whether a new job was created by the above test? If so, can you save the job XML and send it to me? Thanks

    Liwei
    Tuesday, December 8, 2009 4:46 AM
  • I am getting the same error when running the two MPI diagnostics.

    Here's the XML from the results:

    Post Step run failed due to :
    Unhandled Exception: System.InvalidOperationException: There is an error in XML document (2, 1). ---> System.Xml.XmlException: Data at the root level is invalid. Line 2, position 1.
       at System.Xml.XmlTextReaderImpl.Throw(Exception e)
       at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
       at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
       at System.Xml.XsdValidatingReader.Read()
       at System.Xml.XmlReader.MoveToContent()
       at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderMpiPingPongResults.Read36_MpiPingPongResults()
       --- End of inner exception stack trace ---
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader)
       at Microsoft.Hpc.Diagnostics.Host.MpiTestBase.ExecutePostStep(IDiagnosticProgram owner, List`1 nodes, List`1 args)
       at Microsoft.Hpc.Diagnostics.Host.TestCommand.Execute(IDiagnosticProgram owner, CommandList parentCmdList, List`1 args)
       at Microsoft.Hpc.Diagnostics.Shared.CommandList.Execute(IDiagnosticProgram owner, CommandList parentCmdList, List`1 args)
       at Microsoft.Hpc.Diagnostics.Shared.CommandList.Execute(IDiagnosticProgram owner, CommandList parentCmdList, List`1 args)
       at Microsoft.Hpc.Diagnostics.Host.Program.Run(String[] argsIn)
       at Microsoft.Hpc.Diagnostics.Host.Program.Main(String[] args)
    Wednesday, February 17, 2010 8:57 PM
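
    The "Data at the root level is invalid" exception generally means that the text handed to the deserializer did not start with well-formed XML, for example because an error message was produced in place of the expected results. As a rough illustration only (this is not HPC Pack code, and the function name first_xml_problem is hypothetical), captured output can be checked for well-formedness like this:

```python
import xml.etree.ElementTree as ET


def first_xml_problem(text):
    """Return None if text parses as XML, otherwise a short description of the parse error."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        # exc.position is a (line, column) tuple reported by the parser.
        line, column = exc.position
        return f"line {line}, column {column}: {exc}"
    return None
```

    If this returns a description rather than None for the text the post step tried to read, the content itself is the place to look for the underlying failure.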
  • Hi,

    I think I saw the same problem recently but I need more data from you to verify.

    1. Select the failed run and go to 'Jobs for the Tests' from the Actions->Pivot To menu. You should see three jobs there. Please check which of them report the 'Failed' state.
    2. Go to the 'c:\Program Files\Microsoft HPC Pack 2008 R2\Data\SpoolDir\Diagnostics\<runid>' folder and check the content of the runstep.out file. This should give a clue to the cause. (<runid> is the number you will see in the 'ID' column for your failed run.)

    Lukasz
    Wednesday, February 17, 2010 11:12 PM
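
    Step 2 above can also be scripted. This is a minimal sketch that assumes the spool path quoted in the reply; the helper name read_runstep_output and the spool_root parameter are hypothetical and not part of HPC Pack.

```python
from pathlib import Path

# Spool path as quoted in the reply above; adjust if HPC Pack is installed elsewhere.
SPOOL_ROOT = Path(r"c:\Program Files\Microsoft HPC Pack 2008 R2\Data\SpoolDir\Diagnostics")


def read_runstep_output(run_id, spool_root=SPOOL_ROOT):
    """Return the text of runstep.out for the given diagnostic run ID, or None if it is missing."""
    path = Path(spool_root) / str(run_id) / "runstep.out"
    if not path.is_file():
        return None
    # errors="replace" keeps the read from failing on unexpected bytes in the log.
    return path.read_text(errors="replace")
```

    Call it with the number shown in the 'ID' column of the failed run; a return value of None means no runstep.out was found under that run's folder.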