HPC 2012 R2 4.5.5158.0 Unable to run a test Matlab job

  • Question

  • Good afternoon all,

    I am a sysadmin who just inherited an HPC 2012 R2 cluster from an engineer who recently left. He provided little to no training on the software or its capabilities, so I have been trying to educate myself on the setup. We will need to run Matlab jobs on the cluster in the future, so I have been putting together documentation on how to do this, but when I try to run a test job it fails. I have the Matlab runtime installed, and I also set the variable within the job task so it knows where to find the runtime. The logs are below. I think it could be a permissions issue, but I have checked the permissions and everything appears to be correct. Any help or insight would be greatly appreciated, as would links to videos or other material for learning about HPC. Thanks in advance.

    ERROR messages:

    Unable to open standard input file on node CORECUDA01:Microsoft.Hpc.Activation.NodeManagerException: Failed to open standard input file '\\hpcvmserver\nvme\Matlab HPC Test\magicsquare', The system cannot find the file specifiedException 'Failed to open standard input file '\\hpcvmserver\nvme\Matlab HPC Test\magicsquare', The system cannot find the file specified' reported creating the task. ---> System.ComponentModel.Win32Exception: Failed to open standard input file '\\hpcvmserver\nvme\Matlab HPC Test\magicsquare', The system cannot find the file specified

       at Microsoft.Hpc.NodeManager.RemotingExecutor.Process.CreateStandardFile(String errorName, String filePath, Int32 openOption, FileMode openMode)

       at Microsoft.Hpc.NodeManager.RemotingExecutor.Process.CreateUserProcess(JobEntry job, Int32 taskId, ProcessStartInfo psi)

       --- End of inner exception stack trace ---

     

    Server stack trace:

       at Microsoft.Hpc.NodeManager.RemotingExecutor.RemotingNMExecImpl.StartTask(Int32 jobId, Int32 taskId, ProcessStartInfo startInfo)

       at Microsoft.Hpc.NodeManager.RemotingCommunicator.RemotingNMCommImpl.StartTask(Int32 jobId, Int32 taskId, ProcessStartInfo startInfo)

       at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs)

       at System.Runtime.Remoting.Messaging.StackBuilderSink.SyncProcessMessage(IMessage msg)

     

    Exception rethrown at [0]:

       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

       at Microsoft.Hpc.Scheduler.Communicator.Remoting.NodeController.StartTaskWorker.EndInvoke(IAsyncResult result)

       at Microsoft.Hpc.Scheduler.Communicator.Remoting.NodeController.AsyncContext`1.EndCall(IAsyncResult result)

    Wednesday, June 26, 2019 1:45 PM

All replies

  • Hi xander2019,

    As the error indicates, could you double-check that the input file '\\hpcvmserver\nvme\Matlab HPC Test\magicsquare' exists and can be accessed from the task on the compute node?
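One way to run that check is a small script executed on the compute node under the same account the job runs as. The following is a minimal sketch, assuming Python is available on the node. The function name `diagnose_input_file` is hypothetical and not part of HPC Pack. It reports whether the folder is reachable, whether the exact file name exists, and whether the file can be opened for reading. It also lists files that share the name but carry an extension, since a standard input path with no extension is a common reason for "cannot find the file specified".

```python
import os
from pathlib import Path


def diagnose_input_file(path_str):
    """Return a short diagnosis of whether path_str can be opened for reading.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    path = Path(path_str)

    # If the share or folder is unreachable, the file check below is moot.
    if not path.parent.is_dir():
        return f"directory not reachable: {path.parent}"

    if not path.is_file():
        # Look for files with the same name plus an extension,
        # e.g. 'magicsquare.exe' when 'magicsquare' was requested.
        candidates = sorted(p.name for p in path.parent.glob(path.name + ".*"))
        hint = f"; similar names: {candidates}" if candidates else ""
        return f"file not found: {path.name}{hint}"

    # The file exists; confirm the current account can actually read it.
    try:
        with open(path, "rb"):
            pass
    except OSError as exc:
        return f"found but cannot be opened: {exc}"

    return "ok"
```

For the path in the error message, the call would be `diagnose_input_file(r"\\hpcvmserver\nvme\Matlab HPC Test\magicsquare")`. A result other than `ok` on the compute node, when the same call succeeds on the head node, would point to share access or permissions for the job's run-as account.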

    For HPC Pack docs, you may check this link.

    Regards,

    Yutong Sun

    Friday, July 26, 2019 3:42 PM