MS-MPI and cmd Windows server 2008

  • Question

  •  
    Hi,

    I have installed Windows Server 2008 with the HPC component and run the diagnostic tests; all of them passed.
    I have disabled the firewall and I am using network topology 5.

    I am trying to run a task from cmd (the Greetings program) without a scheduled job.

    When I type

    C:\>mpiexec -machinefile hosts.txt -n 2 Greetings.exe

    it gives me this message:

    "Aborting: Access denied by node H-node
    This node is a resource managed by the Microsoft HPC scheduler and mpiexec was attempting to use it without a scheduled job".
     
    where hosts.txt contains H-Node and C-Node1.
     
    Also, when I type in cmd
     
    C:\>smpd -d
     
    it gives me this message
     
    "[-1:2896] ERROR: MPIDU_Sock_listen failed,
    sock error: Other MPI error, error stack:
    MPIDU_Sock_listen(655).:
    easy_create_ranged(614): Only one usage of each socket
    address (protocol/network address/port) is normally permitted.  (errno 10048)"
     
    Need your help.

    Thanks.

    Regards
    Hisham Adel
    Friday, September 12, 2008 8:38 PM

Answers

  • Hi, Hisham (Your account is Hesham but you signed this note "Hisham Adel"?) 

    The concept you're missing here is that of the job scheduler in HPC Server 2008.  The job scheduler manages the resources of the cluster and enables secure use of those resources, namely the compute nodes.  You can learn more about the job scheduler by: 
    1. Starting up the HPCS2008 Job Console and typing F1 for help. 
    2. Or if you prefer a command line interface, you can type "job /?" or "job -help". 
    3. Or for a really powerful command line experience (please excuse the pun), you can start the HPCS Powershell tool. 

    The first error message you encountered, "Aborting: Access denied by node H-node.  This node is a resource managed by the Microsoft HPC scheduler and..." indicates you attempted to use H-node without first reserving H-node by running an HPCS job.  The good news is that it's easy to fix this! ;) 

    Run your job like this:  
        job submit /numnodes:2 mpiexec Greetings.exe
    to run the MPI program Greetings.exe on 2 nodes of your cluster. 

    Or, 
        job submit /numcores:4 mpiexec Greetings.exe
    to run Greetings on 4 cores and let the HPCS scheduler figure out how many nodes, and which ones, are needed to provide 4 cores. 

    Or, 
        job submit /numcores:8 /requestednodes:H-node,C-node1 mpiexec Greetings.exe
    to run Greetings on 8 cores using at least the nodes H-node and C-node1 (and maybe more nodes if needed to reach the requested 8 cores). 

    And there are many more options; check out "job submit /?" for more info.  The idea is to let the scheduler manage the resources. Note that there is no -hosts or -machinefile in my examples.  While these are valid mpiexec options and can be used in HPCS, they are not required.  If used, the resources in these arguments MUST match those assigned by the scheduler, or a permissions error like the one you encountered will ensue. 
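
    If you end up submitting jobs from a script rather than typing them by hand, the three command lines above can be assembled programmatically. The following is only a minimal Python sketch: the function name build_job_submit_args and its parameters are hypothetical, and it uses only the /numnodes, /numcores and /requestednodes options shown above. It builds the argument list and does not run anything.

```python
def build_job_submit_args(command, num_nodes=None, num_cores=None, requested_nodes=None):
    """Return the argument list for a `job submit` call (hypothetical helper).

    command         -- the task to run, e.g. ["mpiexec", "Greetings.exe"]
    num_nodes       -- value for /numnodes, or None to omit the option
    num_cores       -- value for /numcores, or None to omit the option
    requested_nodes -- node names for /requestednodes, or None to omit it
    """
    args = ["job", "submit"]
    if num_nodes is not None:
        args.append(f"/numnodes:{num_nodes}")
    if num_cores is not None:
        args.append(f"/numcores:{num_cores}")
    if requested_nodes:
        # Node names are joined with commas, as in the example above.
        args.append("/requestednodes:" + ",".join(requested_nodes))
    # Note: no -hosts or -machinefile is added; the scheduler picks the nodes.
    return args + list(command)


if __name__ == "__main__":
    greetings = ["mpiexec", "Greetings.exe"]
    print(" ".join(build_job_submit_args(greetings, num_nodes=2)))
    print(" ".join(build_job_submit_args(greetings, num_cores=4)))
    print(" ".join(build_job_submit_args(
        greetings, num_cores=8, requested_nodes=["H-node", "C-node1"])))
```

    On a machine where the HPC client tools are on the PATH, the returned list could be handed to subprocess.run; that part is left out here because it depends on your installation.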

    The second error, "[-1:2896] ERROR: MPIDU_Sock_listen failed...", occurred because you were manually starting an SMPD when one was already running.  Several services run on all HPCS2008 compute nodes (and on the head node when it's being used as a compute node), including the MS-MPI Service (the smpd).  If you want to run a second SMPD manually, you'll have to specify a communications port other than the default port for that SMPD to monitor.  Type "smpd /?" for more info on command line arguments, but there's NO NEED to run another SMPD on the nodes; this option is used for advanced debugging and is not normally something you'd want to do. 
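
    To see the same kind of failure outside of MS-MPI, here is a small Python sketch that reproduces an "address already in use" error by binding two listening sockets to one port. It is only an illustration of the general socket behaviour, not of how smpd is implemented; the helper names try_bind and demo_port_conflict are made up for this example, and it uses a free port on 127.0.0.1 rather than the smpd port.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to listen on (host, port).

    Returns (socket, None) on success or (None, error_number) on failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


def demo_port_conflict():
    """Start one listener, then try to start a second one on the same port."""
    first, _ = try_bind(0)           # port 0: let the OS pick a free port
    port = first.getsockname()[1]    # the port the first listener now owns
    second, err = try_bind(port)     # second listener on the same port
    first.close()
    if second is not None:
        second.close()
    return err


if __name__ == "__main__":
    err = demo_port_conflict()
    # errno.EADDRINUSE is the platform's "address already in use" code
    # (Winsock reports it as 10048, the number in the smpd message).
    print("second bind failed with EADDRINUSE:", err == errno.EADDRINUSE)
```

    The second listener fails for the same reason the manually started smpd did: the first one already owns the port.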

    Hope this helps ;)
    Please ask more questions if I've not been clear. 

    Eric

    Eric Lantz (Microsoft)
    Tuesday, September 16, 2008 3:16 PM