Windows HPC Server Message Passing Interface (MPI) Forum
Dedicated to all aspects of Message Passing Interface (MPI) on Windows HPC Server 2008 and Windows Compute Cluster Server 2003
© 2009 Microsoft Corporation. All rights reserved.
Wed, 25 Nov 2009 04:38:01 Z
1d45beb7-b9b5-40f1-be90-812802f66485

http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/6e7d2bb0-5944-428e-b531-2446a8551955
R. Roger
http://social.microsoft.com/Profile/en-US/?user=R.%20Roger
Failed to submit an MPI application built as a 64-bit binary
I have an application running on Windows HPC 2008. There is no problem with the 32-bit binary. However, I cannot submit the job with the 64-bit binary. It is a release build linked against the amd64 msmpi.lib. I can run the code on one node with the mpiexec command. However, if I submit the job on multiple nodes, it fails with the following error message.<br/><br/>Aborting: failed to launch '\\TDCMFCPWOPHpF01\HPC_APPS\rluo\mpi_demo_64.exe' on TDCMFCPWOPCNA01<br/>Error (14001) The application has failed to start because its side-by-side configuration is incorrect. Please see the application event log for more detail.
Thu, 05 Nov 2009 21:24:33 Z
2009-11-25T04:38:01Z

http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/75827d7a-0acf-4731-b5a4-c888f413fe8c
raluca_v
http://social.microsoft.com/Profile/en-US/?user=raluca_v
program not working when using more than 2 processes...
Hello!  <div>I'm a complete beginner in MPI, and the answer to my question might be obvious, but I just can't see it right now. <div>Could anyone say what causes the program containing the following lines not to work when using more than 2 processes? (It works fine serially and also when using 2 processes.)</div> <div><br/></div> <div> <div> <pre lang=x-cpp> if (myrank == 0) { // read from file ...
    // send target value to all other processes
    for (i = 1; i &lt; nrprocs; i++) {
        MPI_Send (&amp;target, 1, MPI_INT, i, TAG1, MPI_COMM_WORLD);
    }
    // calculate len
    len = n / nrprocs;
    // send len to all other processes
    for (i = 1; i &lt; nrprocs; i++) {
        MPI_Send (&amp;len, 1, MPI_INT, i, TAG2, MPI_COMM_WORLD);
    }
    // send to each process a part of the array to be searched
    for (i = 1; i &lt; nrprocs; i++) {
        MPI_Send (&amp;b[(i-1)*len], len, MPI_INT, i, TAG3, MPI_COMM_WORLD);
    }
    // the root process searches the last portion
    for (i = ((nrprocs - 1)*len); i &lt; n; i++) {
        if (b[i] == target)
            printf (&quot;%d &quot;, i);
    }
    no = 0;
    while ( no != (nrprocs - 1) ) {
        MPI_Recv (&amp;index, 1, MPI_INT, 1, MPI_ANY_TAG, MPI_COMM_WORLD, &amp;status);
        if (status.MPI_TAG == END_TAG) {
            no++;
        } else {
            printf (&quot;%d &quot;, index);
        }
    }
} else {
    MPI_Recv (&amp;target, 1, MPI_INT, 0, TAG1, MPI_COMM_WORLD, &amp;status);
    MPI_Recv (&amp;len, 1, MPI_INT, 0, TAG2, MPI_COMM_WORLD, &amp;status);
    MPI_Recv (&amp;b[0], len, MPI_INT, 0, TAG3, MPI_COMM_WORLD, &amp;status);
    index = -1;
    for (i = 0; i &lt; len; i++) {
        if (b[i] == target) {
            index = (myrank - 1)*len + i;
            MPI_Send (&amp;index, 1, MPI_INT, 0, TAG, MPI_COMM_WORLD);
        }
    }
    // message saying that the current process has finished searching
    MPI_Send (&amp;index, 1, MPI_INT, 0, END_TAG, MPI_COMM_WORLD);
}
</pre> <br/></div> <div>Thanks in advance!</div> <div><br/></div> </div> </div>
Tue, 24 Nov 2009 19:18:05 Z
2009-11-24T19:18:06Z

http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/07107efc-3167-4b7f-b104-4e9b5fe03b13
YuJinSu
http://social.microsoft.com/Profile/en-US/?user=YuJinSu
mpi.dll
My PC OS is Windows HPC Server 2008<br/><br/>When I use HPC Cluster Manager add Task to MPI.exe (Hello world) , the task error message:<br/><br/>ERROR: unable to read the cmd header on the pmi context, Error= -1<br/><br/>How do I solve this problem !? 
<br/><br/>Did OS lack &quot;mpi.dll&quot;  !?  If it did,how can I found the  &quot;mpi.dll&quot; !?<br/><br/>Or problem have else !?<br/><br/>by the way....<br/><br/>Do you install MPICH2 (x86 or x64) ? <br/><br/>The MPICH2 have to install !?Tue, 24 Nov 2009 09:19:34 Z2009-11-24T09:19:35Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/b7ab3990-f9b9-4782-be42-432d98ab9b84http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/b7ab3990-f9b9-4782-be42-432d98ab9b84techie_http://social.microsoft.com/Profile/en-US/?user=techie_MPI environment<p>Hi,<br/><br/>We have been using pallas mpi for long time and usually has <span style="font-size:x-small;color:#000080;font-family:Arial">MPICH_DISABLE_SOCK 1 as environment </span>variable only.<br/><br/>It may be wrong to not use any other parameters while validation of network direct on windows hpc platform.<br/><br/>Is there any open list available which we should consider in command line of 'mpiexec' to validate Network Direct to some more depth level?<br/><br/>Can anyone share thier experience using various pallas mpi environement variables in thier testing of Network Direct?<br/><br/>Thanks in advance,<br/></p>Mon, 23 Nov 2009 16:52:25 Z2009-11-23T16:52:25Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/1877841c-32bb-4493-9c26-c13b26afc5c2http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/1877841c-32bb-4493-9c26-c13b26afc5c2YuJinSuhttp://social.microsoft.com/Profile/en-US/?user=YuJinSuHPC Cluster Manager And MPI Task<p>When I use HPC Cluster Manager run a sample MPI program (EX:Hello world).<br/>I find that always failed. 
I guess the MPI environment set be problem.<br/>I have 5 PC, 1 HeadNode 4 ComputerNode.<br/>Each PC installation MPICH2,the MPICH2 need for the installation on Windows HPC Server 2008 !?<br/><br/>When use HPC Cluster Manager run Diagnostics MPI test is Succeeded<br/><a href="http://img163.imageshack.us/img163/7493/23749122.jpg"><span style="color:#0033cc">http://img163.imageshack.us/img163/7493/23749122.jpg</span></a> (HeadNode)<br/><a href="http://img29.imageshack.us/img29/9378/44036017.jpg"><span style="color:#0033cc">http://img29.imageshack.us/img29/9378/44036017.jpg</span></a>    (ComputerNode)<br/>And than add new job to add task(MPI) always failed.<br/>How can I solve this problem !?<br/><br/>By the way....<br/>&quot;Required Resources&quot; under add task, what does it mean mean !?<br/><br/>MPI programs to provide test ....<br/>download :<a href="http://www.xun6.com/file/899af4f11/MPI.exe.html"><span style="color:#0033cc">http://www.xun6.com/file/899af4f11/MPI.exe.html</span></a><br/><br/></p>Tue, 17 Nov 2009 13:45:12 Z2009-11-17T13:45:13Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/6eb88343-99ee-4885-8488-60077a7f25c5http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/6eb88343-99ee-4885-8488-60077a7f25c5YuJinSuhttp://social.microsoft.com/Profile/en-US/?user=YuJinSuMPI Job Question <p>On the Job Management ,when I new job  and add task,the &quot;required nodes&quot; under Required Resources.<br/>What 's the  &quot;required nodes&quot; that it mean !? <br/><a href="http://img91.imageshack.us/img91/869/132456.jpg">http://img91.imageshack.us/img91/869/132456.jpg</a><br/><br/>by the way .... 
when I create a new job,the task is failed .<br/>The task  failed   data (View  Failed Tasks) download the file <br/>→ http://www.xun6.com/file/b92173a11/mpierror.txt.html<br/>How can I solve this problem !?<br/></p>Thu, 12 Nov 2009 07:40:14 Z2009-11-12T07:40:14Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/aff0d5c8-512d-4eb6-9679-f5441fcd0c7chttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/aff0d5c8-512d-4eb6-9679-f5441fcd0c7cyyallihttp://social.microsoft.com/Profile/en-US/?user=yyalliMPI problem<div>Hi. I have got an MPI job which makes the following error. I have no problem if I use one node with multiple processes but if I use multiple nodes, I got always the following error. My HPC clusters are using Infiniband. Is there any option I should use for this type of network? I don't know where I should look into.</div> <div><br/></div> <div>job aborted:</div> <div>[ranks] message</div> <div><br/></div> <div>[0] fatal error</div> <div>Fatal error in MPI_Scatterv: Other MPI error, error stack:</div> <div>MPI_Scatterv(358).......................: MPI_Scatterv(sbuf=0x06F47100, scnts=0x02138718, displs=0x0</div> <div>1E21180, MPI_DOUBLE, rbuf=0x01C19660, rcount=1, MPI_DOUBLE, root=0, comm=0x84000001) failed</div> <div>MPIR_Scatterv(119)......................:</div> <div>MPIC_Send(39)...........................:</div> <div>MPIC_Wait(277)..........................:</div> <div>CH3_ND::CCq::Poll(136)..................:</div> <div>CH3_ND::CEndpoint::RecvSucceeded(1476)..:</div> <div>CH3_ND::CEndpoint::ProcessReceives(1120):</div> <div>CH3_ND::CEndpoint::ProcessDataMsg(1281).:</div> <div>MPIDI_CH3_RndvSend(271).................: failure occurred while attempting to send message data</div> <div>CH3_ND::CEndpoint::ProcessSends(869)....:</div> <div>CH3_ND::CEnvironment::CreateMr(490).....:</div> <div>CH3_ND::CMr::Create(91).................:</div> <div>CH3_ND::CMr::Init(66)...................:</div> <div>CH3_ND::CAdapter::RegisterMemory(293)...: 
[ch3:nd] INDAdapter::RegisterMemory failed with 0xc0000001</div> <div><br/></div> <div><br/></div> <div>[1-4] terminated</div> <div><br/></div> <div>Any advice or help will be greatly appreciated. </div> <div><br/></div> <div>Thanks,</div> <div>Jong</div>Fri, 06 Nov 2009 06:10:56 Z2009-11-06T06:10:56Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/edd34211-8efc-45b9-84d6-00c0ee379ae9http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/edd34211-8efc-45b9-84d6-00c0ee379ae9YuJinSuhttp://social.microsoft.com/Profile/en-US/?user=YuJinSuMPI command lineMy PC system is Windows HPC Server 2008<br/>I have problem on command line<br/>follow this picture <a href="http://img18.imageshack.us/img18/9416/123ik.jpg"><span style="color:#0033cc">http://img18.imageshack.us/img18/9416/123ik.jpg</span></a><br/>How can I solve this problem ?<br/><br/><br/>The &quot;sendmpi.exe&quot; that I use Microsoft Visual C++ compile<br/>And I haven install MPICH2 on PC.<br/><br/>If I debug the code , it is no problem.Thu, 05 Nov 2009 08:55:10 Z2009-11-05T08:55:11Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/8eec6f3a-c4db-4013-bf07-494274c1a9b4http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/8eec6f3a-c4db-4013-bf07-494274c1a9b4Igor Pasichnykhttp://social.microsoft.com/Profile/en-US/?user=Igor%20PasichnykConfiguration Test MPI Ping Pong Lightweight Throughput failedHi, all! <div><br/></div> <div>I tried to run the test &quot;<span style="font-family:Arial;font-size:13px;white-space:pre">MPI Ping Pong Lightweight Throughput&quot; from &quot;HPC Cluster Manager&quot; and obtained the following output:</span></div> <div><span style="font-family:Arial;font-size:small"><span style="font-size:13px;white-space:pre"><br/></span></span></div> <div><span style="font-family:Arial;font-size:small"><span style="font-size:13px;white-space:pre"> </span></span> Node Message HPC-HEAD There is an error in XML document (1, 1). 
--&gt; Data at the root level is invalid. Line 1, position 1.</div> <div><br/></div> <div>I am not sure that I understand this result, so any help will be appreciated.</div> <div>My cluster is based on HPC Server 2008 and has two nodes in topology 5 (all nodes on the enterprise network).</div>
Wed, 04 Nov 2009 12:58:21 Z
2009-11-19T21:51:11Z

http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/4c67afdb-2915-48b7-a56a-59e672a01d29
skiffzzz
http://social.microsoft.com/Profile/en-US/?user=skiffzzz
Cannot connect to head node from local host
Hi, I am using CCS 2003. 
I want to use computing node 1 and 2 (16 processor in total), I can send the job using the user interface on CN01 and there is no problem. But when I tried the command line: <a><span style="font-size:x-small">c:\documents and settings\hpcuser&gt; job submit /numprocessors:16 /askednodes:cn01,cn02 mpiexec -hosts 2 cn01 cn02 mpitest2.exe</span></a><br/>the system tells me that cannot connect to head node from 'localhost'. <br/><br/>Does someone experience the same problem? Furthermore, the reason I use the command line is that I hope to place rank 0 on CN01, rank 1 on CN02. Then rank 0 and rank 1 can pass message between each other, then they can initialize OpenMP on their own node for future use. Is it possible to do this on ccs 2003?<br/><br/>Thanks and regards<br/>SkiffFri, 30 Oct 2009 02:45:26 Z2009-10-30T02:45:26Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/a84906cd-ba37-462c-b356-b096decc7e32http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/a84906cd-ba37-462c-b356-b096decc7e32skiffzzzhttp://social.microsoft.com/Profile/en-US/?user=skiffzzzUsing one processor from each of multiple nodes on ccp 2003Dear all, I am using the windows ccp 2003. I want to run my program on 2 nodes. Inside each node, I want to use OpenMP since all the processors on one node have access to the shared memory. <br/>Can I use only one processor from each of the two nodes? Then the two processors will be the master thread in their own node and pass message between each other. 
But in ccp2003, I can't find the right way to specify two processors from 2 nodes with each node having 8 processors.<br/><br/>Thanks and regards<br/>skiffThu, 29 Oct 2009 02:27:51 Z2009-10-29T02:27:52Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/ab7462a9-bb8f-4711-8573-873ef8130480http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/ab7462a9-bb8f-4711-8573-873ef8130480LeeComphttp://social.microsoft.com/Profile/en-US/?user=LeeCompNew MPI user question.I have an HP laptop computer with dual core CPU's and 6 GB of RAM, running the MS Vista 64-bit operating system and<br />the Intel 64-bit FORTRAN compiler, version 10.0.027.&nbsp; I am trying to follow the MPI FORTRAN example codes in the book,<br />"Using MPI:&nbsp; Portable Parallel Programming with the Message-Passing Interface," 2nd ed., by W. Gropp, et. al.&nbsp; Specifically,<br />I am trying to run the simple FORTRAN program shown on page 25 which calculates pi, CalcPi.&nbsp; I have successfully, I think, loaded,<br />compiled, and linked this short FORTRAN program with the MPI library--no errors.&nbsp; When I run it, just by initiating the<br />CalcPi.exe, as in C:&gt; CalcPi&nbsp; the program runs and produces the correct pi result; however, the variable "numprocs," which is<br />the number of processes used to do the calculation, is always 1, no matter how many integration steps I request.&nbsp; Doing more<br />research, I found that I was not properly starting this MPI code:&nbsp; I should be initiating the command:<br />C:&gt; mpiexec -n 4 CalcPi.exe&nbsp;&nbsp; if I wanted more than 1 process (4 in this case).&nbsp; However, when I run this, I get this error message:<br /><br />Error while connecting to host.&nbsp; No connection could be made because the target machine actively refused it.&nbsp; (10061)<br />Connect on sock (host=Owner-PC, port=8676) failed, exhaused [sic] all end points<br />Unable to connect to 'Owner-PC:8676', sock error:&nbsp; Error = -1<br /><br />I am the 
administrator on this laptop, and I am not trying to connect to another machine, to my knowledge.&nbsp; So, what am I doing wrong?<br />As I say, when I run CalcPi.exe as C:&gt; CalcPi&nbsp; all appears to run well, except that "numprocs" always equals 1 no matter how many<br />integrations steps I request, and I have gone as high as 1 billion.<br />Wed, 14 Oct 2009 17:51:42 Z2009-10-28T18:34:37Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/d357fa27-6560-442f-b3cd-732c96cd1929http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/d357fa27-6560-442f-b3cd-732c96cd1929YuJinSuhttp://social.microsoft.com/Profile/en-US/?user=YuJinSuMpi Error Question<p>My PC system is Windows HPC Server 2008<br/>on Command line :<br/>mpiexec sendmpi.exe<br/>I have two error message <br/>Link the picture <a href="http://img163.imageshack.us/img163/2276/789r.jpg">http://img163.imageshack.us/img163/2276/789r.jpg</a><br/><br/>how solve it !?</p>Wed, 28 Oct 2009 13:02:16 Z2009-10-28T17:43:36Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/64ca4f42-f18f-4cb9-9bdd-f0de71869740http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/64ca4f42-f18f-4cb9-9bdd-f0de71869740AminZhttp://social.microsoft.com/Profile/en-US/?user=AminZBuilding a function out of a mpi program and inserting it into the new main function....?Hi:<br /> I have a mpi program with its bash file which works completely fine. It is as follows:<br /> <br /> &nbsp;&nbsp;&nbsp; int main(int argc, char *argv[])<br /> {<br /> <br /> &nbsp;&nbsp;&nbsp; MPI_Status status; <br /> &nbsp;&nbsp;&nbsp; MPI_Init(&amp;argc,&amp;argv);<br /> &nbsp;&nbsp;&nbsp; MPI_Comm_rank(MPI_COMM_WORLD, &amp;MYRANK);<br /> &nbsp;&nbsp;&nbsp; MPI_Comm_size(MPI_COMM_WORLD, &amp;NP);<br /> &nbsp;&nbsp;&nbsp;&nbsp; <br /> &nbsp; .......CODES.......<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp; MPI_Finalize();<br /> &nbsp; &nbsp;&nbsp;&nbsp; return 0;<br /> }<br /> <br /> <br /> Its final results is a double vector that I need. 
Now what I want to do is to write a function out of it (for example changing the int main(int argc, char *argv[]) to double foo(&quot;some arguments&quot;)) so I use it couple of times in another main program. I don't know how to do it. I inserted the function foo as it was (with all mpi stuff) in the new main program that has not mpi stuff but it doesn't work (it takes the error on arguments of MPI_Init(&amp;argc,&amp;argv) since they are not defined in foo ofcourse. I have even tried to pass these arguments from the new main to foo but it doesn't work.).<br /> Any help is really appreciated.<br /> Thanks,<br /> Amin<br /> <br />Tue, 06 Oct 2009 16:59:09 Z2009-10-12T16:22:28Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/f166abf5-9719-47d2-8d12-08254bf98ccahttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/f166abf5-9719-47d2-8d12-08254bf98ccaRajahShttp://social.microsoft.com/Profile/en-US/?user=RajahSForcing Rank 0 to a particular cluster node<p>Our cluster has a mix of machines, and<em>  I </em>need rank 0 in my application to run on a particular node that has the capabilities the controlling rank needs. Is there some way to do this I'm missing?</p> <p> </p>Mon, 28 Sep 2009 11:57:19 Z2009-10-12T16:19:27Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/fa213e91-5b83-419b-a1df-e11a76568664http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/fa213e91-5b83-419b-a1df-e11a76568664Gu Xuhttp://social.microsoft.com/Profile/en-US/?user=Gu%20XuCannot link to msmpi.lib with both g77 and gfortranParpack is the parallel version of arpack, a well-known package for sparse eigenproblems written in Fortran.  I am trying to port this package and let it run on Microsoft HPC clusters. However, I encontered problems when linking with Micorosft MPI SDK.  I already successfully built the package on both MPICH and MPICH2. 
So I am quite sure the problem is not related to Parpack itself.<br/><br/>I am using the 32-bit SDK and two libraries, msmpifc.lib and msmpi.lib.  The order of the libraries is important.  If I put msmpi.lib before msmpifc.lib, I get a large number of link errors (&gt;100).  With msmpifc.lib before msmpi.lib, I get 15 link errors (two root errors):<br/><br/><strong>undefined reference to _imp_MPI_F_STATUSES_IGNOR in mpif.obj</strong><br/><strong>undefined reference to _imp_MPI_F_STATUS_IGNOR in mpif.obj<br/><br/></strong>Has anybody encountered a similar problem and found a solution? <br/><br/>BTW: I also tried the combination of msmpifmc.lib and msmpi.lib.  The errors are the same.<br/><br/>Thanks,<br/>Gu<br/>
Wed, 16 Sep 2009 16:02:27 Z
2009-09-21T07:00:19Z

http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/49ea3751-10a7-4c48-9edd-9a51cce13a4d
rteav
http://social.microsoft.com/Profile/en-US/?user=rteav
MS MPI and GUI, COM
I have an application that has its own GUI and uses COM technology. I want to use MPI in one of the modules of the application. I wonder how the system will behave with respect to COM technology in general. 
<div><br/></div> <div>And how will the existence of a GUI affect the MPI parallelization?</div> <div><br/></div> <div>I would be thankful if someone could give a link to a reference about the internal working principles of MS MPI. I mean, what REALLY happens when I call MPI_Init(): is it something like UNIX fork(), copying the whole process environment over the network (including all DLLs that the process loads), or just some binary code delegation?</div>
Fri, 17 Jul 2009 10:47:10 Z
2009-09-21T03:21:26Z

http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/94fd3fb7-d330-4b59-b922-04b7a20289f3
creepa
http://social.microsoft.com/Profile/en-US/?user=creepa
MPI_FINALIZE fails with error: rank 0 unable to connect to rank 1
Hi all, <br/> <br/> I have MPI Fortran code that I would like to run on two computers, my home computer and my work computer, connected through a VPN. I run mpiexec with two hosts and I get the output from the program, but MPI_Finalize crashes with the above error message. job aborted:<br/> <br/> rank: node: exit code[: error message]<br/> 0: &lt;IP rank0&gt;: 1: Fatal error in MPI_Finalize: Other MPI error, error stack:<br/> MPI_Finalize(307)............: MPI_Finalize failed<br/> MPI_Finalize(198)............:<br/> MPID_Finalize(92)............:<br/> PMPI_Barrier(476)............: MPI_Barrier(comm=0x44000002) failed<br/> MPIR_Barrier(82).............:<br/> MPIC_Sendrecv(158)...........:<br/> MPID_Isend(116)..............: failure occurred while attempting to send an eager message<br/> MPIDI_CH3_iSend(175).........:<br/> MPIDI_CH3I_Sock_connect(1215): [ch3:sock] rank 0 unable to connect to rank 1 using business card &lt;port=2652 description=&lt;rank 1's network here&gt; ifname=&lt;rank 1s IP&gt;&gt;<br/> MPIDU_Sock_post_connect(1231): unable to connect to ... 
on p<br/> ort 2652, exhausted all endpoints (errno -1)<br/> MPIDU_Sock_post_connect(1247): gethostbyname failed, The requested name is valid<br/>  and was found in the database, but it does not have the correct associated data<br/>  being resolved for. (errno 11004)<br/> 1: ...: 1<br/> <br/> I set MPICH_NETMASK to the correct IP but it doesnt seem to make a difference. <br/> <br/> Does anyone have a hint for me? Let me know if you need more information... <br/> <br/> Benjamin<br/>Sat, 12 Sep 2009 21:44:13 Z2009-09-14T22:24:57Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/39821e2f-cf24-410c-8e23-1bca7694c06dhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/39821e2f-cf24-410c-8e23-1bca7694c06dtrimtrimhttp://social.microsoft.com/Profile/en-US/?user=trimtrimHow to evenly create the processors on the nodes.I have a MPI program running on the Windows HPC cluster. The cluster have 50 nodes, each nodes have 8 processors. I tried to run my program using 50 processors. The windows HPC always creates 8 processors on 1 nodes. How can I use 25 nodes and each nodes create 2 processors. I tried to use machine file, however, The windoes HPC didn't create processors as my machine file specified.Mon, 03 Aug 2009 05:48:56 Z2009-09-11T08:32:32Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/ba804691-4325-4ea3-b426-8a945936c90bhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/ba804691-4325-4ea3-b426-8a945936c90bNeP0http://social.microsoft.com/Profile/en-US/?user=NeP0Running Intel MPI Benchmark for Network DirectHi,<br>   I am new to HPC clusters and running MPI job, I am trying to understand if I am missing something while running Intel MPI Benchmark(IMB) over Network Direct on a Windows HPC cluster.<br><br>   I have a IMB(source downloded from Intel site) compiled with MSMPI.lib(from MS Compute Cluster Pack) and tried running through the mpiexec and job submit which failed to run. 
Below is the error I got,<br><br>&quot;Error (14001) The application has failed to start because its side-by-side configuration is incorrect. Please see the application event log for more detail.&quot;<br><br>   Later I found the IMB source for compiling them for Windows 2008 server and compiled them with impi.lib which came with that source. I am able to start that job only with the mpiexec that cmae with the same source. I got some performance numbers too. I am confused now if the results I got is for Network Direct, since the IMB-MPI1 was not compiled with MSMPI.lib  &amp; the job was not run with mpiexec command from MS HPC Pack. How to confirm the same?<br><br>Thanks,<br>NMon, 16 Mar 2009 18:54:55 Z2009-09-10T12:18:49Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/bc8d2706-7afb-4c09-93fc-9e12b0fdcf95http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/bc8d2706-7afb-4c09-93fc-9e12b0fdcf95alejo_olhttp://social.microsoft.com/Profile/en-US/?user=alejo_olOther MPI error, error stacKHi,  I am working with MPI . 
I am trying to connect 2 processes with mpi_open_port and/or mpi_publish_name. <div>When I run the program in VC++ 2008 I have no problem, but when I run the program with mpiexec the problem appears.</div> <div><br/></div> <div>The error message is as follows:</div> <div><br/></div> <div>&quot;Fatal error in MPI_Open_port: Other MPI error, error stack:</div> <div>MPI_Open_port(119): MPI_Open_port(MPI_INFO_NULL, port=0012FD1C) failed</div> <div>MPID_Open_port(69): Function not implemented</div> <div>&quot;</div> <div><br/></div> <div>thanks</div>
Thu, 30 Jul 2009 03:06:50 Z
2009-08-10T18:56:03Z

http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/47b99479-37d9-43a4-b1e3-90efd3d52c0a
Anton Future
http://social.microsoft.com/Profile/en-US/?user=Anton%20Future
Can different MPI processes write data in one file (with different offset)?
I'm developing a serious mathematics system that uses MPI, and I've run into a question: &quot;Can different MPI processes write data to one file (at different offsets)?&quot; If it is possible, please tell me how to do it. 
It will be very healthy, if you'll show the source code solving my problem. P.S. English language is not my native language.Wed, 05 Aug 2009 09:38:33 Z2009-08-07T23:42:39Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/4121ae10-93f6-4db1-96c5-75974f4ca7eahttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/4121ae10-93f6-4db1-96c5-75974f4ca7eaAlbert E.http://social.microsoft.com/Profile/en-US/?user=Albert%20E.No MPI - clusrun doesn't workHi,<br/> <br/> i am an absolute newbie in using the HPC and i hope anyone can help me: I have a small Cluster with 1 Headnode and 2 compute Nodes. If i run a job on one compute node only, it works fine. But if i start a job wich should run on booth compute nodes, the job runs only on the first node. Now i tried &quot;clusrun smpd -status&quot; and i got the following error:<br/> <br/> ****************<br/> Command has failed on node CLUSTERHEAD. Message:Task failed during execution wit<br/> h exit code 1. Please check task's output for error details.<br/> -------------------------- CLUSTERHEAD returns 1 --------------------------<br/> 'smdp' is not recognized as an internal or external command,<br/> operable program or batch file.<br/> Command has failed on node CLUSTERPC02. Message:Error from node: CLUSTERPC02:Log<br/> on failure: unknown user name or bad passwordException of type 'Microsoft.Hpc.Ac<br/> tivation.NodeManagerException' was thrown.<br/> -------------------------- CLUSTERPC02 returns 1 --------------------------<br/> Command has failed on node CLUSTERPC01. 
Message:Error from node: CLUSTERPC01:Log<br/> on failure: unknown user name or bad passwordException of type 'Microsoft.Hpc.Ac<br/> tivation.NodeManagerException' was thrown.<br/> -------------------------- CLUSTERPC01 returns 1 --------------------------<br/> <br/> -------------------------- Summary --------------------------<br/> 0 Nodes succeeded<br/> 3 Nodes failed:CLUSTERHEAD,CLUSTERPC01,CLUSTERPC02<br/> ***************<br/> <br/> By using the command: clusrun /all hostname.exe i get this following error-message:<br/> <br/> ***************<br/> Command proxy has failed on node CLUSTERPC01. Message:Error from node: CLUSTERPC<br/> 01:Logon failure: unknown user name or bad passwordException of type 'Microsoft.<br/> Hpc.Activation.NodeManagerException' was thrown.<br/> Command has been canceled on node CLUSTERHEAD. Message:Command output proxy has<br/> failed.<br/> Command has been canceled on node CLUSTERPC01. Message:Command output proxy has<br/> failed.<br/> Command has been canceled on node CLUSTERPC02. Message:Command output proxy has<br/> failed.<br/> <br/> -------------------------- Summary --------------------------<br/> 0 Nodes succeeded<br/> 3 Nodes failed:CLUSTERHEAD,CLUSTERPC01,CLUSTERPC02<br/> ****************<br/> <br/> Sorry, but i dont know what it means. Can anyone help me???<br/> <br/> Thank you, AlbertMon, 03 Aug 2009 14:30:32 Z2009-08-19T02:53:01Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/54b03b57-8015-464e-9f19-0e9d7a26a2bchttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/54b03b57-8015-464e-9f19-0e9d7a26a2bcSashiBalahttp://social.microsoft.com/Profile/en-US/?user=SashiBalaMS MPI on non-HPC ServersI am migrating an HPC application from Linux to Windows platform. Got a few questions -<br/><br/>1) My platform has a few nodes, and they currently have Windows 2008 Embedded Server on it...will the MS-MPI in the HPC 2008 Pack work with this OS. 
In other words, can I run an MPI application without the HPC Server's 'Job Scheduler'?<br/><br/>2) What's the link to the MS-MPI reference documents, i.e. the document which details all the MPI calls, and more? <br/>I am especially interested in the 'One Sided Communication' calls, with 'Passive' synchronization, that use the RDMA feature of Infiniband.<br/><br/>Appreciate a prompt response.<br/><br/>Thanks,<br/>SashiWed, 01 Jul 2009 23:29:45 Z2009-07-30T17:25:20Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/910f14b3-ae9c-4ad5-bf1b-9263e3b9b794http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/910f14b3-ae9c-4ad5-bf1b-9263e3b9b794ChristopherMorleyhttp://social.microsoft.com/Profile/en-US/?user=ChristopherMorleyAPI to determine MSMPI versionIs there a way to determine from within my app what version of the msmpi dll was loaded (CCS/2003 or HPC server/2008)?<br/> MPI_Get_version appears to be for querying the version of the MPI standard that is supported (2.0 for both?).<br/> <br/> I can't find the MS MPI API on MSDN, and browsing mpi.h didn't reveal anything to me.<br/> <br/> Thanks,<br/> Chris<br/>Thu, 23 Jul 2009 19:49:39 Z2009-07-30T17:20:16Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/068dea80-0026-4699-9d07-1706198ed786http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/068dea80-0026-4699-9d07-1706198ed786trimtrimhttp://social.microsoft.com/Profile/en-US/?user=trimtrimunresolved external symbol MPI_INIT referenced in function MAIN___fpi.obj_ Hi everyone:<br/> I am trying to compile a simple MPI (fpi) program using MSMPI in VS2008 with Intel Fortran.<br/> The code is the example named &quot;fpi.f&quot; from the MPICH2 installation folder. <br/> The compiler is Intel Fortran 11, which is integrated into VS 2008.<br/> Before compiling the program, I made the following settings:<br/> 1. set &quot;C:\Program Files\Microsoft HPC Pack 2008 SDK\Include&quot; as additional include directories in Fortran tab.<br/> 2. 
set &quot;C:\Program Files\Microsoft HPC Pack 2008 SDK\Lib\amd64&quot; as additional library directories in Link tab.<br/> 3. set &quot;msmpi.lib  msmpifec.lib&quot; as additional dependencies in link tab.<br/> Then I built the program for x64, but the build reported the following errors:<br/> Error    1     error LNK2019: unresolved external symbol MPI_INIT referenced in function MAIN__    fpi.obj    <br/> Error    2     error LNK2019: unresolved external symbol MPI_COMM_RANK referenced in function MAIN__    fpi.obj    <br/> Error    3     error LNK2019: unresolved external symbol MPI_COMM_SIZE referenced in function MAIN__    fpi.obj    <br/> Error    4     error LNK2019: unresolved external symbol MPI_BCAST referenced in function MAIN__    fpi.obj    <br/> Error    5     error LNK2019: unresolved external symbol MPI_FINALIZE referenced in function MAIN__    fpi.obj    <br/> Error    6     error LNK2019: unresolved external symbol MPI_REDUCE referenced in function MAIN__    fpi.obj   <br/> It looks like the program cannot link with &quot;msmpi.lib&quot;. I also tried linking against &quot;C:\Program Files\Microsoft HPC Pack 2008 SDK\Lib\i386&quot;, and the same errors were reported. By the way, if I link the program with MPICH2, everything is fine. <br/> Can anyone tell me how to set the link library for Fortran?<br/> Many thanks.<br/> <br/>Fri, 10 Jul 2009 05:54:05 Z2009-07-17T17:56:56Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/f04f6ab8-42f5-4045-8b7f-a357bb5fb504http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/f04f6ab8-42f5-4045-8b7f-a357bb5fb504trimtrimhttp://social.microsoft.com/Profile/en-US/?user=trimtrimHow to link msMPI with intel fortran?Hi everyone:<br/> I am trying to compile a simple MPI (fpi) program using MSMPI in VS2008 with Intel Fortran.<br/> The code is the example named &quot;fpi.f&quot; from the MPICH2 installation folder. 
<br/> The compiler is Intel Fortran 11, which is integrated into VS 2008.<br/> Before compiling the program, I made the following settings:<br/> 1. set &quot;C:\Program Files\Microsoft HPC Pack 2008 SDK\Include&quot; as additional include directories in Fortran tab.<br/> 2. set &quot;C:\Program Files\Microsoft HPC Pack 2008 SDK\Lib\amd64&quot; as additional library directories in Link tab.<br/> 3. set &quot;msmpi.lib  msmpifec.lib&quot; as additional dependencies in link tab.<br/> Then I built the program for x64, but the build reported the following errors:<br/> Error    1     error LNK2019: unresolved external symbol MPI_INIT referenced in function MAIN__    fpi.obj    <br/> Error    2     error LNK2019: unresolved external symbol MPI_COMM_RANK referenced in function MAIN__    fpi.obj    <br/> Error    3     error LNK2019: unresolved external symbol MPI_COMM_SIZE referenced in function MAIN__    fpi.obj    <br/> Error    4     error LNK2019: unresolved external symbol MPI_BCAST referenced in function MAIN__    fpi.obj    <br/> Error    5     error LNK2019: unresolved external symbol MPI_FINALIZE referenced in function MAIN__    fpi.obj    <br/> Error    6     error LNK2019: unresolved external symbol MPI_REDUCE referenced in function MAIN__    fpi.obj   <br/> It looks like the program cannot link with &quot;msmpi.lib&quot;. I also tried linking against &quot;C:\Program Files\Microsoft HPC Pack 2008 SDK\Lib\i386&quot;, and the same errors were reported. By the way, if I link the program with MPICH2, everything is fine. 
<br/> Can anyone tell me how to set the link library for Fortran?<br/> Many thanks.<br/>Fri, 10 Jul 2009 05:57:08 Z2009-08-11T01:16:18Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/fc113c32-0d71-46ce-8ca1-119252d042f4http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/fc113c32-0d71-46ce-8ca1-119252d042f4kellydavidhttp://social.microsoft.com/Profile/en-US/?user=kellydavidWHPCS2008 MPI diagnostics failing randomlyHi, <div><br/></div> <div>I'm trying to help a customer who is running WHPCS 2008 on an SGI cluster (of Altix XE310 servers). WHPCS 2008 passes all the built-in diagnostic tests except for the MPI ping-pong and lightweight throughput tests. These fail on random nodes. The cluster is set up with a private GigE network for management and an Infiniband (IB) network for MPI traffic.  The IB network is connected to a Cisco SFS7000d switch which is running the subnet manager. </div> <div>When running the MPI ping-pong and lightweight throughput tests, multiple nodes fail randomly. For example, the tests work OK when run on a subset of the cluster, but once they reach around 14 nodes, the test fails on a random node. (The cluster has about 20 nodes in total.) However, a &quot;clusrun&quot; of simple commands works OK across all nodes. 
The log output from the MPI tests is as follows:</div> <div><br/></div> <div> <p class=MsoPlainText>Time Message</p>
<p class=MsoPlainText>18/06/2009 12:32:06 PM Reverted</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM The operation failed due to errors during execution.</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM The operation failed and will not be retried.</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM ---- error analysis -----</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM mpi has detected a fatal error and aborted mpipingpong.exe</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM [5] on MICRO15</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM ---- error analysis -----</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM [6-19] terminated</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM CH3_ND::CEndpoint::ConnReqFailed(407): [ch3:nd] INDConnector::Connect to 192.168.2.111:1 failed with 0x80070043</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM CH3_ND::CEndpoint::Connect(236)......:</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM CH3_ND::CEnvironment::Connect(400)...:</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM MPIDI_CH3I_VC_post_connect(426)......: MPIDI_CH3I_Nd_connect failed in VC_post_connect</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM MPIDI_CH3_iSendv(239)................:</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM MPIDI_EagerContigIsend(519)..........: failure occurred while attempting to send an eager message</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM MPIC_Sendrecv(120)...................:</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM MPIR_Allgather(487)..................:</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM MPI_Allgather(864)...................: MPI_Allgather(sbuf=0x000000000022F750, scount=128, MPI_CHAR, rbuf=0x0000000000B71100, rcount=128, MPI_CHAR, MPI_COMM_WORLD) failed</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM Fatal error in MPI_Allgather: Other MPI error, error stack:</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM [5] fatal error</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM [0-4] terminated</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM [ranks] message</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM job aborted:</p>
<p class=MsoPlainText>18/06/2009 12:32:05 PM</p>
<p class=MsoPlainText>18/06/2009 12:31:50 PM Connecting to scheduler service on node micro.</p>
<p class=MsoPlainText> </p> <p class=MsoPlainText>Has anyone seen this type of problem? 
Any suggestions for resolving it?</p> <p class=MsoPlainText>Thanks very much.</p> <p class=MsoPlainText>Regards,</p> <p class=MsoPlainText>David</p> </div>Fri, 19 Jun 2009 06:10:44 Z2009-07-17T17:55:14Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/299c27cf-98eb-4622-b62f-f867f5a99469http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/299c27cf-98eb-4622-b62f-f867f5a99469Parsa1985http://social.microsoft.com/Profile/en-US/?user=Parsa1985Problems while running an MPICH job<p>When I run code like this on a cluster with two nodes, using one core on each:</p> <pre>use mpi
implicit real*8(a-h,o-z)
call MPI_Init ( ierr )
call MPI_Comm_rank ( MPI_COMM_WORLD, my_id, ierr )
CALL MPI_COMM_SIZE ( MPI_COMM_WORLD, NUM_PROCS, IERR )
print*,my_id
call mpi_barrier(mpi_comm_world,ierr)
call MPI_Finalize ( ierr )
end</pre> <p>It does the first print, but at mpi_barrier it returns an error that includes the message: rank 0 unable to connect to rank 1 using business card (port=30776,....)<br/><br/>I cannot use any of the calls mpi_bcast, mpi_reduce, ...; none of them work. Each node can read and print, but as soon as it comes to communication, it returns this error.<br/><br/>I use a Windows HPC cluster.<br/><br/>I have run this code with this version of MPI on Windows XP and Vista PCs and laptops. Please help me solve this conundrum.<br/><br/>Thanks<br/><br/>Parsa</p>Sat, 16 May 2009 07:59:37 Z2009-06-25T23:21:25Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/0ffde564-8bc8-4ccd-8991-217278237ed8http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/0ffde564-8bc8-4ccd-8991-217278237ed8Parsa1985http://social.microsoft.com/Profile/en-US/?user=Parsa1985problems linking fortran 90 code with ms mpiHi,<br/>I installed the Microsoft Compute Cluster Pack, and there is no msmpif.lib or any other Fortran library.<br/>I used msmpi.lib, and it returns &quot;check your include directories: module [mpi]&quot;.<br/>It is said that MS-MPI works for Fortran 90 too.<br/>Please help me out.<br/><br/>Regards<br/>ParsaTue, 19 May 2009 11:05:30 Z2009-06-25T23:20:08Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/684eb470-bdc4-4bcf-85d2-f94e20790ff5http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/684eb470-bdc4-4bcf-85d2-f94e20790ff5YuriL2008http://social.microsoft.com/Profile/en-US/?user=YuriL2008cluster job manager fails all the tasks with the program I try to launch "Could not load file or assembly 'tmpWin32withCLR"<font size=2> <p>I've created a parallel program for a computer cluster managed by WCCS2003. Debugging with the SDK on a separate computer is completely successful. The only problem is that the cluster job manager fails all the tasks with the program I try to launch. Running on only the node that was used for development gives the same result.</p> <p>The program has been developed using managed C++ (CLR Console Application) and MPI in Microsoft Visual Studio 2005. 
</p> <p>This exception occurs when I try to execute it on the cluster:</p> <p>Unhandled Exception: System.IO.FileLoadException: Could not load file or assembly 'tmpWin32withCLR, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' or one of its dependencies. Failed to grant minimum permission requests. (Exception from HRESULT: 0x80131417) File name: 'tmpWin32withCLR, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' ---&gt; System.Security.Policy.PolicyException: Required permissions cannot be acquired.</p> <p>at System.Security.SecurityManager.ResolvePolicy(Evidence evidence, PermissionSet reqdPset, PermissionSet optPset, PermissionSet denyPset, PermissionSet&amp; denied, Boolean checkExecutionPermission)</p> <p>at System.Security.SecurityManager.ResolvePolicy(Evidence evidence, PermissionSet reqdPset, PermissionSet optPset, PermissionSet denyPset, PermissionSet&amp; denied, Int32&amp; securitySpecialFlags, Boolean checkExecutionPermission)</p> <p>Please help, or tell me how I can get around this error.</p></font> Thu, 27 Nov 2008 16:39:34 Z2009-06-25T23:17:02Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/93dda9bb-3c1c-4bde-bc85-4814ed3309f4http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/93dda9bb-3c1c-4bde-bc85-4814ed3309f4AJ Frethttp://social.microsoft.com/Profile/en-US/?user=AJ%20Fret-trace permission troubleI'm currently trying to run <strong>mpiexec -trace MPIApp.exe</strong>, but I am getting the following error.<br/> <br/> <br/> <em>Aborting: failed to start tracing on ...<br/> Error (5) Access is denied. </em> <br/> <br/> I know the problem is that I don't have access to the cluster, and the cluster administrator does not want to give me access to the cluster manager, but he asked me to post here to see whether there is a way around it.  Can he give me the ability to use -trace without giving me access to the cluster manager?  
Can we set up a job template that lets me use the -trace option?<br/> <br/> Any help is appreciated,<br/> <br/> AJMon, 22 Jun 2009 18:42:38 Z2009-06-24T23:30:12Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/8f9381cb-2f16-4f24-894c-0e15074ee71ehttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/8f9381cb-2f16-4f24-894c-0e15074ee71eJoe Hummel, PhDhttp://social.microsoft.com/Profile/en-US/?user=Joe%20Hummel%2c%20PhD32-bit SDK for HPC Server 2008 fails --- "The procedure entry point GetProcessIdOfThread could not be located..."  I'm wondering if anyone has run into the following with the RTM of the SDK for HPC Server 2008.  I have various 64-bit MPI apps that run fine with the 64-bit version of the new SDK.  But I wanted to do some demos under 32-bit Windows XP Pro, so I rebuilt the apps for 32-bit, installed the 32-bit SDK on the Windows XP machine, and then tried to run my apps via mpiexec:<br><br>  mpiexec -n 4 MyMPIApp.exe<br><br>Immediately, a dialog pops up that says &quot;The procedure entry point GetProcessIdOfThread could not be located in the dynamic link library KERNEL32.dll&quot;.  I close the dialog, and the console window then displays:<br><br>  ReadFile() failed, error 109<br>  Error: unable to start the local smpd manager<br><br>I disabled the firewall, same result.  Rebooted, same result.  Installed MPI.NET and ran one of my MPI.NET apps, exactly the same error.  If I uninstall, and install the SDK for Compute Cluster Pack, the error goes away.  So something is up with the new SDK.<br><br>Any ideas?  Thanks!Wed, 08 Oct 2008 19:18:26 Z2009-06-24T22:54:21Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/18ffa63d-e517-44c3-bdf3-ea15f6910fa9http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/18ffa63d-e517-44c3-bdf3-ea15f6910fa9pkalyanraohttp://social.microsoft.com/Profile/en-US/?user=pkalyanraohow to select cores from nodes to run job on multiple nodes. 
Hi all,<br/> <br/>     I have 2 nodes in my cluster with 4 cores on each node.<br/> <br/>     I have one <strong>exe</strong> file called <strong>sleep.exe</strong>. I submitted a job with <strong>Job submit /numnodes:2 mpiexec -cores 2 sleep.exe</strong>, and it started 2 sleep.exe processes on each node. <br/> <br/> <br/>     I also have a 4-core Ansys CFX job, and I want to run it on 2 cores of the first node and 2 cores of the second node. <br/> <br/>     I tried <strong>job submit /numnodes:2 /workdir:&lt;working directory path&gt; /stdout:out.log /stderr:error.log mpiexec -cores 2 cfx5solve.exe -v -def &lt;.def file&gt; -start-method MSMPI -part 4</strong>. The job failed and wrote the following error information to the error.log file:<br/> <br/> <strong><br/> &quot;An error has occurred in cfx5solve:<br/> <br/> Error reported by IO module: readIntFmtData: (fgets failed) syserr:: No<br/> error<br/> <br/> An error has occurred in cfx5solve:<br/> <br/> Error reported by IO module: iif_set_lock: error reading lock file<br/> //litocmaster/work/benchmark.def.lck: No error<br/> <br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name correctly.<br/> <br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name correctly.<br/> <br/> Can't call method &quot;name&quot; on an undefined value at C:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.<br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name correctly.<br/> <br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name 
correctly.<br/> <br/> Can't call method &quot;name&quot; on an undefined value at c:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.<br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name correctly.<br/> <br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name correctly.<br/> <br/> Can't call method &quot;name&quot; on an undefined value at c:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.<br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name correctly.<br/> <br/> An error has occurred in cfx5solve:<br/> <br/> Neither Start Command nor Option is defined for start method MSMPI; check<br/> that you have given the method name correctly.<br/> <br/> Can't call method &quot;name&quot; on an undefined value at C:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.</strong> &quot;<br/>  <br/> <br/> <br/> But when I submit the job without the <strong>mpiexec</strong> option, the job runs fine on the available resources.<br/> <br/> Does <strong>mpiexec</strong> work with all applications or not? Please give me suggestions on this. Has anybody tested this kind of scenario with the Starccm application? <br/> <br/> Regards,<br/> P. 
Kalyan Rao<br/>  Wed, 17 Jun 2009 07:29:54 Z2009-06-24T18:54:29Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/b512f122-d6a4-4c09-a8c7-7980b14bdc19http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/b512f122-d6a4-4c09-a8c7-7980b14bdc19estennerhttp://social.microsoft.com/Profile/en-US/?user=estennerGetting mpiexec.exe to run simple hello world I have just installed the HPC Pack 2008 Beta 2, and I am trying to execute a simple hello world program with the command:<br><br>mpiexec.exe -d 3 C:\Hello\Hello.exe<br><br>The command just hangs, and the last couple lines of the debug output say:<br><br>ERROR: AcceptSecurityContext failed with error 0x80090308<br>ERROR: unable to process the sspi server iteration buffer<br><br>If anyone has any ideas what my problem might be, I would greatly appreciate some help getting going with this.Fri, 27 Jun 2008 15:42:21 Z2009-06-22T21:10:42Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/0791b8a1-85e6-4271-9eef-5f3ec58e86b1http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/0791b8a1-85e6-4271-9eef-5f3ec58e86b1Cindy.Whttp://social.microsoft.com/Profile/en-US/?user=Cindy.Wa sudden interruption of the MPI job<p>I am running an MPI job on an MSHPC cluster. It runs OK at the beginning of the calculation and the output is written correctly, but the job suddenly stops in the middle.  The error message is as follows. I checked the line of code that &quot;tag=273&quot; points to, and it is correct. Does anybody know the reason? 
Cheers<br/><br/>Cindy<br/><br/>****************************************<br/>job aborted:<br/>rank: node: exit code: message<br/>0: SB-NODE011: fatal error: Fatal error in MPI_Recv: Other MPI error, error stack:<br/>MPI_Recv(179)...........: MPI_Recv(buf=0x0000000001A4F408, count=1, MPI_INTEGER, src=4, tag=273, MPI_COMM_WORLD, status=0x0000000000712400) failed<br/>MPIDI_CH3I_Progress(165): handle_sock_op failed<br/>handle_sock_read(530)...: <br/>ReadFailed(1518)........: An existing connection was forcibly closed by the remote host.  (errno 10054)<br/>1: SB-NODE011: fatal error: Fatal error in MPI_Bcast: Other MPI error, error stack:<br/>MPI_Bcast(791)..........: MPI_Bcast(buf=0x0000000001A4F408, count=1, MPI_INTEGER, root=0, MPI_COMM_WORLD) failed<br/>MPIR_Bcast(192).........: <br/>MPIC_Recv(98)...........: <br/>MPIC_Wait(321)..........: <br/>MPIDI_CH3I_Progress(165): handle_sock_op failed<br/>handle_sock_read(530)...: <br/>ReadFailed(1518)........: An existing connection was forcibly closed by the remote host.  (errno 10054)<br/>2: SB-NODE011: fatal error: Fatal error in MPI_Bcast: Other MPI error, error stack:<br/>MPI_Bcast(791)..........: MPI_Bcast(buf=0x0000000001A4F408, count=1, MPI_INTEGER, root=0, MPI_COMM_WORLD) failed<br/>MPIR_Bcast(192).........: <br/>MPIC_Recv(98)...........: <br/>MPIC_Wait(321)..........: <br/>MPIDI_CH3I_Progress(165): handle_sock_op failed<br/>handle_sock_read(530)...: <br/>ReadFailed(1518)........: An existing connection was forcibly closed by the remote host.  
(errno 10054)<br/>3: SB-NODE011: fatal error: Fatal error in MPI_Bcast: Other MPI error, error stack:<br/>MPI_Bcast(791)..........: MPI_Bcast(buf=0x0000000001A4F408, count=1, MPI_INTEGER, root=0, MPI_COMM_WORLD) failed<br/>MPIR_Bcast(192).........: <br/>MPIC_Recv(98)...........: <br/>MPIC_Wait(321)..........: <br/>MPIDI_CH3I_Progress(165): handle_sock_op failed<br/>handle_sock_read(530)...: <br/>ReadFailed(1518)........: An existing connection was forcibly closed by the remote host.  (errno 10054)<br/>4: SB-NODE013: 157: process exited without calling finalize<br/>5: SB-NODE013: terminated<br/>6: SB-NODE013: terminated<br/>7: SB-NODE013: terminated</p> <p>---- error analysis -----</p> <p>4: sotoncaa.exe ended prematurely and may have crashed on SB-NODE013<br/>**********************************************************</p>Thu, 11 Jun 2009 10:46:39 Z2009-06-17T21:30:20Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/04c786d1-e2c3-4bab-9231-601fd3842043http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/04c786d1-e2c3-4bab-9231-601fd3842043AJ Frethttp://social.microsoft.com/Profile/en-US/?user=AJ%20FretC#, MPI, and VS2008 troubleI found a simple program that is basically a hello world! program for C# in VS2008.<br/><br/>I am trying to submit this job to my cluster using <em>mpiexec -n 10 MPIHello.exe. </em>on the HPC Job Manager and I am currently getting this error.<br/><br/><em>Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly 'MPI, Version=1.0.0.0, Culture=neutral, PublicKeyToken=29b4a045737654fe' or one of its dependencies. 
The system cannot find the file specified.<br/>File name: 'MPI, Version=1.0.0.0, Culture=neutral, PublicKeyToken=29b4a045737654fe'<br/>   at MPIHello.Program.Main(String[] args)</em><br/><br/><br/>I am running the release build of the program, and I added the MPI.NET reference in VS2008 by following the tutorial I found here.<br/><br/><a href="http://www.osl.iu.edu/research/mpi.net/documentation/tutorial/hello.php">http://www.osl.iu.edu/research/mpi.net/documentation/tutorial/hello.php</a><br/><br/>I submitted a simple hello program using just C# with no problems, which leads me to believe that I am missing a couple of configuration steps.<br/><br/>Any help with my problem would be appreciated.<br/><br/>AJ FretWed, 10 Jun 2009 20:04:41 Z2009-06-13T08:03:20Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/73eb4867-5967-4fb8-8bbb-1de9963113bfhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/73eb4867-5967-4fb8-8bbb-1de9963113bfBluehivehttp://social.microsoft.com/Profile/en-US/?user=BluehivePerformance problem on multi-core system with multithreaded applicationsHello all,<br/>   I have two different multi-threaded applications that, when run _alone_ on a 16-core Intel Xeon box with Linux, take 4 hours (app1) and 10 hours (app2) respectively. With MPI, I specify 8 threads for each application and start them simultaneously.<br/> Now, when app1 completes, app2 keeps running with its 8 threads. I am wondering whether it is possible to create more threads dynamically when resources are lying idle? 
Here app2 just doesn't use the remaining 8 cores, right?<br/> <br/> Please let me know if I am assuming something wrong.<br/> <br/> Thanks,<br/> BlueThu, 28 May 2009 06:59:40 Z2009-05-28T22:12:57Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/ee181d29-53ee-4f17-9d18-d48736aef628http://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/ee181d29-53ee-4f17-9d18-d48736aef628Jeff - OShttp://social.microsoft.com/Profile/en-US/?user=Jeff%20-%20OSCan MPI take raw data stream?This is also posted in the developer section, as I wasn't sure which forum would be the best place to find a solution to my issue. Any help is VERY appreciated, TIA!<br/><br/>We have CCS 2003 SP1 implemented currently, but may need to move to HPC 2008 for this, which is part of the question.  The goal is to take streams of raw data, currently transported by TCP/IP (Winsock), convert them into something that can be passed through a scheduler so that it can manage near real-time analysis with very minimal latency through the cluster (which may alter the stream slightly), and then convert and send the traffic back through to the raw data interface.<br/><br/>There have been limitations with processing through one serial channel, so we'd like to run multiple channels, and we think we can produce what we need as long as we can get the data in and out fast enough.  <br/><br/>Setup:<br/>Right now we have a small cluster that would be used only for this one purpose: 2 quad-core nodes with 8Gb of RAM each, a Cisco 2960G, teamed NIC interfaces, and TOE cards as well.   Everything is completely isolated to this task, with the source/destination interface plugged into the switch.  I'm not sure whether this is enough firepower to do it, though.<br/><br/>We intend to run a job that opens a socket and streams the data into memory. The amount held in memory would be very small, but a lot of data passes through (still only 2Gb for the entire duration). Each stream would represent one channel, which we hope can be spread across multiple nodes.  
So this brings us to the challenges....<br/><br/>How do we write the interface that receives the data?  Does this need to run through MPI?  Can the interface be real-time, or does it have to be batched in?  i.e. do we need to try and break the data up into smaller transactions and batch it through to get the result?  How do we get the cluster to accept the data stream and allow us to view it and manipulate it?<br/><br/>Thanks so much for your help on this!Thu, 09 Apr 2009 03:47:10 Z2009-06-25T23:22:06Zhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/822cb08f-699f-41e2-ac73-016dc645175dhttp://social.microsoft.com/Forums/en-US/windowshpcmpi/thread/822cb08f-699f-41e2-ac73-016dc645175dChris Quirkhttp://social.microsoft.com/Profile/en-US/?user=Chris%20QuirkMPI job aborts for unexplained reasons<p>Hi there--<br/><br/>We're upgrading to HPC Pack 2008 from CCS Server 2003, moving our existing C++ and C# MPI programs to HPC Pack 2008.  Generally things seem to progress, but one C++ MPI program is failing.<br/><br/>We fire off a job, and things work for a while, then we get a crash.<br/><br/>In the task properties, the Error message is:  &quot;Task failed during execution with exit code -4. Please check task's output for error details.&quot;<br/><br/>The &quot;Output:&quot; box in the HPC Job manager is empty.  However if we look in the log file that captured stdout/stderr from the job, we see:</p> <pre> Aborting: smpd on MT-CCS-01 failed to communicate with smpd on MT-CCS-11 Other MPI error, error stack: ReadFailed(1317): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. (errno 10060) </pre> <p>No indication of failures in the logs on mt-ccs-11.  Under 2003, I would look for %windir%\pchealth\ErrorRep\UserDumps, but that directory isn't here.  Should I enable error reporting?  
Any suggestions on how to diagnose?<br/><br/>Thanks for your help!<br/><br/></p>Tue, 12 May 2009 22:12:14 Z2009-05-12T22:12:14Z