 

Question: How to select cores from nodes to run a job on multiple nodes

  • June 17, 2009, 7:29, pkalyanrao
     
    Hi all,

        I have 2 nodes in my cluster, with 4 cores on each node.

        I have an executable called sleep.exe. I submitted a job with job submit /numnodes:2 mpiexec -cores 2 sleep.exe, and it started 2 sleep.exe processes on each node.
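
        To make the observed behaviour concrete, here is a minimal Python sketch of the process count for the sleep.exe run. It assumes only what was observed above, namely that mpiexec -cores 2 started 2 processes on each of the 2 allocated nodes. The node names and the helper expected_placement are hypothetical and used for illustration only; nothing here calls the scheduler.

```python
def expected_placement(nodes, procs_per_node):
    """Map each allocated node to the number of processes started on it.

    Assumption: the launcher starts the same number of processes on
    every allocated node, as observed with sleep.exe.
    """
    return {node: procs_per_node for node in nodes}


# Hypothetical node names; the real cluster's names are not given.
nodes = ["node1", "node2"]

placement = expected_placement(nodes, procs_per_node=2)
total = sum(placement.values())

for node, count in placement.items():
    print(f"{node}: {count} processes")
print(f"total: {total} processes")
```

        Under that assumption, 2 nodes with 2 processes each gives 4 processes in total.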


        I also have a 4-core Ansys CFX job, and I want to run it on 2 cores from the first node and 2 cores from the second node.

        I tried job submit /numnodes:2 /workdir:<working directory path> /stdout:out.log /stderr:error.log mpiexec -cores 2 cfx5solve.exe -v -def <.def file> -start-method MSMPI -part 4. The job failed and wrote the following error information to the error.log file:


    "An error has occurred in cfx5solve:

    Error reported by IO module: readIntFmtData: (fgets failed) syserr:: No
    error

    An error has occurred in cfx5solve:

    Error reported by IO module: iif_set_lock: error reading lock file
    //litocmaster/work/benchmark.def.lck: No error

    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    Can't call method "name" on an undefined value at C:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.
    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    Can't call method "name" on an undefined value at c:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.
    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    Can't call method "name" on an undefined value at c:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.
    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    An error has occurred in cfx5solve:

    Neither Start Command nor Option is defined for start method MSMPI; check
    that you have given the method name correctly.

    Can't call method "name" on an undefined value at C:\Program Files\ANSYS Inc\v110\CFX\bin\/perllib/CFX5/Job/Settings.pm line 2464.
    "
     


    But when I submit the job without the mpiexec option, the job runs fine on the available resources.

    Does mpiexec work with all applications or not? Please give me suggestions on this. Also, has anybody tested this kind of scenario with the Starccm application?

    Regards,
    P. Kalyan Rao
     
    • Moved by Josh Barnard (MSFT, Owner), June 17, 2009, 19:03: Seems to be more about MPI startup (From: Windows HPC Server Job Submission and Scheduling)

All answers