HPCJobScheduler crash with invalid job
© 2009 Microsoft Corporation. All rights reserved.
Wed, 25 Mar 2009 23:57:39 Z
http://social.microsoft.com/Forums/en-US/windowshpcbeta/thread/54a92b14-bc4b-414d-a3c5-11fc58d8a628#54a92b14-bc4b-414d-a3c5-11fc58d8a628
Somsak Sriprayoonsakul
http://social.microsoft.com/Profile/en-US/?user=Somsak%20Sriprayoonsakul
<p>We have a serious problem with the job manager.
We submitted a job using the &quot;parametric sweep&quot; link on the right of the HPCJobManager console. Somehow the job details seem to be corrupted, and the console crashes every time we try to look at them. We found that simply viewing the job details crashes the &quot;HPC Job Scheduler&quot; service, which is why the console freezes. The &quot;job view&quot; command, run with the erroneous job ID, also crashes the service. Sometimes we can view the job details for a short while, but trying to do anything to the job (cancel/modify) crashes the &quot;HPCScheduler&quot; service.</p>
<p align=left>We looked at HPCScheduler.log and found many entries like the one below.</p>
<p align=left>2008/05/13 09:30:01 [5][RC] [Error] Unexpected error when process message for job 369. Detail: Object reference not set to an instance of an object.</p>
<p align=left>It seems that something in the database is corrupted. Is there any way to fix this?</p>
<p align=left>For now we ignore the job completely: we do not click on it or try to look at its details. Note that the job state is &quot;Running&quot; but the task state is &quot;failed&quot;.</p>
Tue, 13 May 2008 02:49:53 Z
http://social.microsoft.com/Forums/en-US/windowshpcbeta/thread/54a92b14-bc4b-414d-a3c5-11fc58d8a628#be2d87fa-df56-4de9-a339-5f130b681373
Somsak Sriprayoonsakul
http://social.microsoft.com/Profile/en-US/?user=Somsak%20Sriprayoonsakul
<p>Now we can't submit new jobs to the system.
HPCScheduler crashes (a pop-up asks whether we want to debug the application) every time we try to do anything with HPC Job Manager.</p> <p align=left> <div class=quote> <table width="85%"> <tbody> <tr> <td class=txt4> <strong>Somsak Sriprayoonsakul wrote:</strong></td></tr> <tr> <td class=quoteTable> <table width="100%"> <tbody> <tr> <td class=txt4 valign=top width="100%">
<p>We have a serious problem with the job manager. We submitted a job using the &quot;parametric sweep&quot; link on the right of the HPCJobManager console. Somehow the job details seem to be corrupted, and the console crashes every time we try to look at them. We found that simply viewing the job details crashes the &quot;HPC Job Scheduler&quot; service, which is why the console freezes. The &quot;job view&quot; command, run with the erroneous job ID, also crashes the service. Sometimes we can view the job details for a short while, but trying to do anything to the job (cancel/modify) crashes the &quot;HPCScheduler&quot; service.</p>
<p align=left>We looked at HPCScheduler.log and found many entries like the one below.</p>
<p align=left>2008/05/13 09:30:01 [5][RC] [Error] Unexpected error when process message for job 369. Detail: Object reference not set to an instance of an object.</p>
<p align=left>It seems that something in the database is corrupted. Is there any way to fix this?</p>
<p align=left>For now we ignore the job completely: we do not click on it or try to look at its details. Note that the job state is &quot;Running&quot; but the task state is &quot;failed&quot;.
</p></td></tr></tbody></table></td></tr></tbody></table></div>
Tue, 13 May 2008 03:04:40 Z
http://social.microsoft.com/Forums/en-US/windowshpcbeta/thread/54a92b14-bc4b-414d-a3c5-11fc58d8a628#ffae24f7-2d5c-435f-9d24-9343a07a4572
carter_chen
http://social.microsoft.com/Profile/en-US/?user=carter_chen
<p align=left><font face=Arial size=2><span>Hi,</span></font></p>
<p align=left><font face=Arial size=2><span>Could you please provide the log files to us?</span></font></p>
<p align=left>Please run the following PowerShell script on the head node. It should create a folder called ClusterConfig under the directory you run the script from. Please zip the directory and send it to me by email (christc at microsoft dot com).</p>
<div class=codeseg> <div class=codecontent> <div class=codesniptitle> </div> <div class=codesniptitle> </div>
<p align=left>#Some location information<br>$OutputDirName = &quot;ClusterConfig&quot;<br>$NetworkInfoFile = &quot;$OutputDirName\NetworkInfo.txt&quot;<br>$NodeInfoFile = &quot;$OutputDirName\NodeInfo.txt&quot;<br>$HpcLogDir = &quot;$OutputDirName\HpcLogs&quot;<br>$LogDir = &quot;$OutputDirName\Logs&quot;</p>
<p align=left>#Create the directories in which to stash everything<br>#($LogDir must exist before the event logs are exported into it)<br>Echo &quot;Creating directories . . .&quot;<br>New-Item -Name $OutputDirName -ItemType directory<br>New-Item -Path $LogDir -ItemType directory</p>
<p align=left>#Get system information<br>Echo &quot;Getting system info . . .&quot;<br>msinfo32 /report &quot;$OutputDirName\SysInfo.txt&quot;</p>
<p align=left>#Dump the network information to a file<br>Echo &quot;Dumping network configuration . . .&quot;<br>&quot;Network Topology:&quot; &gt; $NetworkInfoFile<br>Get-HpcNetworkTopology &gt;&gt; $NetworkInfoFile<br>&quot;&quot; &gt;&gt; $NetworkInfoFile<br>&quot;Network Interfaces:&quot; &gt;&gt; $NetworkInfoFile<br>Get-HpcNetworkInterface | Format-List &gt;&gt; $NetworkInfoFile</p>
<p align=left>#Dump the node information to a file<br>Echo &quot;Dumping node info . . .&quot;<br>Get-HpcNode | sort NetBiosName | Format-List &gt;&gt; $NodeInfoFile</p>
<p align=left>#Copy over the HPC log files (robocopy creates $HpcLogDir)<br>Echo &quot;Copying HPC logs . . .&quot;<br>robocopy $env:CCP_DATA\Logfiles $HpcLogDir /E</p>
<p align=left>#Export the System and Application event logs<br>Echo &quot;Copying system logs . . .&quot;<br>wevtutil epl System &quot;$LogDir\System.evtx&quot;<br>Echo &quot;Copying application logs . . .&quot;<br>wevtutil epl Application &quot;$LogDir\Application.evtx&quot;</p>
</div></div>
<p align=left>Thanks,</p>
<p align=left>Christina</p>
Tue, 13 May 2008 06:17:44 Z
http://social.microsoft.com/Forums/en-US/windowshpcbeta/thread/54a92b14-bc4b-414d-a3c5-11fc58d8a628#b659a7b7-f95a-413b-b764-e4fa0d070648
carter_chen
http://social.microsoft.com/Profile/en-US/?user=carter_chen
<p>Also, what version of Cluster Manager are you using?
Please see Help-&gt;About for the version number.</p>
<p align=left>Thank you,</p>
<p align=left>Christina</p>
Tue, 13 May 2008 06:24:39 Z
http://social.microsoft.com/Forums/en-US/windowshpcbeta/thread/54a92b14-bc4b-414d-a3c5-11fc58d8a628#3429ca7b-5368-483b-b812-1dea9030d864
Somsak Sriprayoonsakul
http://social.microsoft.com/Profile/en-US/?user=Somsak%20Sriprayoonsakul
Hi,<br><br>Thanks for the quick reply.<br>Our Cluster Manager version is 2.0.1302.0.<br>I just sent the information to you. Some commands in the PowerShell script failed, though. I attached the output of the script in the zipped file.<br>
Wed, 14 May 2008 06:49:24 Z
http://social.microsoft.com/Forums/en-US/windowshpcbeta/thread/54a92b14-bc4b-414d-a3c5-11fc58d8a628#e07c81fd-1a21-4089-bcad-b17b3d917f8b
Don Pattee
http://social.microsoft.com/Profile/en-US/?user=Don%20Pattee
<p>HPC Server 2008 shipped in September 2008, so I'm going through and marking all questions in the beta forum as 'answered'.</p>
Wed, 25 Mar 2009 23:57:26 Z