Configuration Test MPI Ping Pong Lightweight Throughput failed
-
4 November 2009 12:58
Hi all,
I tried to run the "MPI Ping Pong Lightweight Throughput" test from HPC Cluster Manager and got the following output:
Node: HPC-HEAD
Message: There is an error in XML document (1, 1). --> Data at the root level is invalid. Line 1, position 1.
I am not sure I understand this result, so any help would be appreciated.
My cluster is based on HPC Server 2008 and has two nodes in topology 5 (all nodes on the enterprise network).
- Moved by parmita mehta (Moderator), 19 November 2009 21:51 (From: Windows HPC Server Deployment, Management, and Administration)
All replies
-
8 December 2009 04:46
Hi Igor,
Thanks for reporting this issue. Can you check the job management console to see whether a new job was created by the above test? If so, can you save the job XML and send it to me? Thanks,
Liwei -
17 February 2010 20:57
I am getting the same error when running the two MPI diagnostics.
Here's the XML from the results:
Post Step run failed due to :
Unhandled Exception: System.InvalidOperationException: There is an error in XML document (2, 1). ---> System.Xml.XmlException: Data at the root level is invalid. Line 2, position 1.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XsdValidatingReader.Read()
at System.Xml.XmlReader.MoveToContent()
at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderMpiPingPongResults.Read36_MpiPingPongResults()
--- End of inner exception stack trace ---
at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader)
at Microsoft.Hpc.Diagnostics.Host.MpiTestBase.ExecutePostStep(IDiagnosticProgram owner, List`1 nodes, List`1 args)
at Microsoft.Hpc.Diagnostics.Host.TestCommand.Execute(IDiagnosticProgram owner, CommandList parentCmdList, List`1 args)
at Microsoft.Hpc.Diagnostics.Shared.CommandList.Execute(IDiagnosticProgram owner, CommandList parentCmdList, List`1 args)
at Microsoft.Hpc.Diagnostics.Shared.CommandList.Execute(IDiagnosticProgram owner, CommandList parentCmdList, List`1 args)
at Microsoft.Hpc.Diagnostics.Host.Program.Run(String[] argsIn)
at Microsoft.Hpc.Diagnostics.Host.Program.Main(String[] args) -
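The exception in the trace above says the deserializer found something other than XML at the very start of the data it was given. A minimal Python sketch of the same failure mode follows. It is only an analogy, not the HPC Pack code, and the sample strings and the element names in them are made up for illustration:

```python
import xml.etree.ElementTree as ET


def try_parse_results(text):
    """Return (True, root tag) if text parses as XML, else (False, parser message)."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        # Raised when the input is not well-formed XML, e.g. plain text.
        return False, str(exc)
    return True, root.tag


# Hypothetical well-formed results document (element names are invented).
print(try_parse_results("<Results><Latency>1</Latency></Results>"))

# Hypothetical plain-text error message written where XML was expected.
print(try_parse_results("connection attempt failed"))
```

If the test's output step received an error message instead of a results document, a parse failure at the first line is what one would expect to see.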
17 February 2010 23:12
Hi,
I think I saw the same problem recently but I need more data from you to verify.
1. Select the failed run and go to 'Jobs for the Tests' from the Actions->Pivot To menu. You should see three jobs there. Please check which of them report a 'Failed' state.
2. Go to the 'c:\Program Files\Microsoft HPC Pack 2008 R2\Data\SpoolDir\Diagnostics\<runid>' folder and check the contents of the runstep.out file. This should give a clue to the reason. (<runid> is the number shown in the 'ID' column for your failed run.)
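If it is easier than browsing to the folder, step 2 can be sketched in Python as below. The folder and file name come from the step above; the function name and the run ID used in the example call are hypothetical, and the default path assumes the install location quoted above:

```python
from pathlib import Path

# Path quoted in step 2; adjust it if HPC Pack is installed elsewhere.
DEFAULT_DIAGNOSTICS_DIR = Path(
    r"c:\Program Files\Microsoft HPC Pack 2008 R2\Data\SpoolDir\Diagnostics"
)


def read_runstep_output(run_id, diagnostics_dir=DEFAULT_DIAGNOSTICS_DIR):
    """Return the text of runstep.out for the given run, or None if it is missing."""
    path = Path(diagnostics_dir) / str(run_id) / "runstep.out"
    if not path.is_file():
        return None
    # Replace undecodable bytes rather than failing on an unexpected encoding.
    return path.read_text(errors="replace")


# Hypothetical run ID; use the value from the 'ID' column of the failed run.
output = read_runstep_output(42)
print(output if output is not None else "runstep.out not found for this run")
```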
Lukasz -
18 February 2010 17:56
That pointed me in the right direction, thanks.
The root cause is the firewall on the nodes. Either the install doesn't open the correct ports, or some group policy settings are being propagated (even though I have my OU set to "do not inherit").
Turning the firewall off for everyone allows all the tests to complete, so now it's just a matter of figuring out what settings and ports need to be set correctly.
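One way to narrow down which ports are affected is to probe them from another node with the firewall back on. The following is a rough Python sketch, not an HPC Pack tool. The function name is made up, and the host names and port numbers you pass in are your own assumptions, since the required ports are not listed in this thread:

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `can_connect("NODE1", 12345)` from the head node (both values are placeholders) returns False if the port is blocked or nothing is listening on it, so a service has to be running on that port for the result to say anything about the firewall.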
Thanks for the help.
- Marked as answer by Don Pattee (Moderator), 12 January 2011 02:50