Linux cancel job

  • Question

  • I have noticed that when I cancel a job on the scheduler, it does not cancel the job on the compute nodes.

    Friday, October 26, 2018 4:18 PM

All replies

  • Hi Spooner,

      Have you disabled CGroup on your Linux node?

    Qiufang Shi

    Monday, October 29, 2018 5:14 AM
  • Well, I commented it out in the common.sh file. Is this the same?
    Monday, October 29, 2018 11:41 AM
  • Hi Spooner,

      With CGroup, we can control all of a task's processes and clean them up without leaving runaway ones. As you have disabled CGroup, you need to manage the runaway processes yourself (a task may spawn new processes).

      We have an "Execution Filter" script, "OnJobEnd.sh", to help accomplish this. OnJobEnd.sh is executed on the Linux node when a job is completed, so you can put your own cleanup code there, including killing runaway processes. In your case, you can write something like "pkill -9 namd".
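
As an illustration of the kind of cleanup code that could go into OnJobEnd.sh, here is a minimal sketch. It assumes only what is stated above: the script runs on the Linux node and the runaway processes are named "namd". The function name `kill_runaway` and the grace period are hypothetical and not part of HPC Pack. It sends SIGTERM first and falls back to the "pkill -9" suggested above.

```shell
#!/bin/bash
# Sketch of cleanup code for OnJobEnd.sh (illustrative, not an official script).
# Assumption: the runaway processes are named "namd", as in the reply above.

# Ask matching processes to exit, then force-kill any that remain.
# kill_runaway is a hypothetical helper name, not an HPC Pack function.
kill_runaway() {
    local pattern="$1"
    local grace_seconds="${2:-5}"

    # pkill exits non-zero when nothing matches; that is not an error here.
    pkill -TERM "$pattern" 2>/dev/null || return 0

    # Give the processes a moment to shut down cleanly.
    sleep "$grace_seconds"

    # SIGKILL (the same signal as "pkill -9") for anything that ignored SIGTERM.
    pkill -KILL "$pattern" 2>/dev/null || true
    return 0
}

kill_runaway "namd" 5
```

Note that pkill treats its argument as a regular expression matched against process names, so "namd" also matches names such as "namd2". If other jobs on the same node may run processes with similar names, a narrower pattern is safer.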


    Qiufang Shi

    Tuesday, October 30, 2018 12:53 AM
  • With CGroup turned on, would this happen? I had turned it off because it was affecting performance. I have not tested it with u2 yet.
    Tuesday, October 30, 2018 1:44 AM
  • Hi Spooner,

      With CGroup turned on (the default), this wouldn't happen. You disabled it because of a performance issue, but in that case you need to use "OnJobEnd.sh" to clean up your job's processes.

    Qiufang Shi

    Thursday, November 1, 2018 3:04 AM
  • Sorry, I had forgotten about this.

    Where is the OnJobEnd.sh file, or is it something I have to create?

    Friday, May 17, 2019 9:18 PM
  • I took a look at the site you provided earlier.


    It made a reference to /opt/hpcnodemanager/filters, but I do not have a filters folder in hpcnodemanager.

    Friday, May 17, 2019 9:31 PM