Job Controls

After a job is started, it can be killed, suspended, or resumed by the system, an LSF user, or LSF administrator. LSF job control actions cause the status of a job to change. This chapter describes how to configure job control actions to override or augment the default job control actions.

Default job control actions

LSF supports the following default actions for job controls:
  • SUSPEND

  • RESUME

  • TERMINATE

On successful completion of the job control action, the LSF job control commands cause the status of a job to change.

The environment variable LS_EXEC_T is set to the value JOB_CONTROLS for a job when a job control action is initiated.

SUSPEND action

Change a running job from RUN state to one of the following states:
  • USUSP or PSUSP in response to bstop

  • SSUSP state when the LSF system suspends the job

The default action is to send the following signals to the job:
  • SIGTSTP for parallel or interactive jobs. SIGTSTP is caught by the master process and passed to all the slave processes running on other hosts.

  • SIGSTOP for sequential jobs. SIGSTOP cannot be caught by user programs. The SIGSTOP signal can be configured with the LSB_SIGSTOP parameter in lsf.conf.

LSF invokes the SUSPEND action when:
  • The user or LSF administrator issues a bstop or bkill command to the job

  • Load conditions on the execution host satisfy any of:
    • The suspend conditions of the queue, as specified by the STOP_COND parameter in lsb.queues

    • The scheduling thresholds of the queue or the execution host

  • The run window of the queue closes

  • The job is preempted by a higher priority job

RESUME action

Change a suspended job from SSUSP, USUSP, or PSUSP state to the RUN state. The default action is to send the signal SIGCONT.

LSF invokes the RESUME action when:
  • The user or LSF administrator issues a bresume command to the job

  • Load conditions on the execution host satisfy all of:
    • The resume conditions of the queue, as specified by the RESUME_COND parameter in lsb.queues

    • The scheduling thresholds of the queue and the execution host

  • A closed run window of the queue opens again

  • A preempted job finishes

TERMINATE action

Terminate a job. This usually causes the job to change to EXIT status. The default action is to send SIGINT first, then SIGTERM 10 seconds after SIGINT, then SIGKILL 10 seconds after SIGTERM. The delay between signals allows user programs to catch the signals and clean up before the job terminates.

To override the 10-second interval, use the JOB_TERMINATE_INTERVAL parameter in the lsb.params file. See the IBM Platform LSF Configuration Reference for information about the lsb.params file.
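Because the default TERMINATE action sends SIGINT and SIGTERM before SIGKILL, a job script can trap the first two signals and clean up within the interval. The following is a minimal sketch of such a handler in a POSIX shell job script. The scratch directory, the function names, and the exit status are illustrative assumptions, not part of LSF.

```shell
#!/bin/sh
# Sketch of a job script that cleans up when the default TERMINATE action
# delivers SIGINT or SIGTERM. SIGKILL cannot be trapped, so the cleanup
# must finish before the final signal arrives.
# SCRATCH_DIR, cleanup, and install_handlers are hypothetical names.

SCRATCH_DIR=${SCRATCH_DIR:-$(mktemp -d)}

cleanup() {
    # Remove temporary files created by the job.
    rm -rf "$SCRATCH_DIR"
}

install_handlers() {
    # On SIGINT or SIGTERM, run cleanup and exit with a non-zero status.
    trap 'cleanup; exit 1' INT TERM
}

install_handlers
# ... the real work of the job would follow here ...
```

If the cleanup can take longer than the configured interval, the job may receive SIGKILL before it finishes, so keep the handler short.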

LSF invokes the TERMINATE action when:
  • The user or LSF administrator issues a bkill or brequeue command to the job

  • The TERMINATE_WHEN parameter in the queue definition (lsb.queues) causes a SUSPEND action to be redirected to TERMINATE

  • The job reaches its CPULIMIT, MEMLIMIT, RUNLIMIT or PROCESSLIMIT

  • The administrator defines a cluster-wide termination grace period for killing orphan jobs, or the user issues the bsub -w command with the -ti suboption to enforce immediate automatic orphan job termination on a per-job basis.

If the execution of an action is in progress, no further actions are initiated unless it is the TERMINATE action. A TERMINATE action is issued for all job states except PEND.

Windows job control actions

On Windows, actions equivalent to the UNIX signals implement the default job control actions. Job control messages replace the SIGINT and SIGTERM signals, but only customized applications can process them. Termination is implemented by the TerminateProcess() system call.

See IBM Platform LSF Programmer’s Guide for more information about LSF signal handling on Windows.

Configure job control actions

Several situations may require overriding or augmenting the default actions for job control. For example:
  • Notifying users when their jobs are suspended, resumed, or terminated

  • An application holds resources that are not freed by suspending the job. The administrator can set up an action to be performed that causes the resource to be released before the job is suspended and re-acquired when the job is resumed.

  • The administrator wants the job checkpointed before being:
    • Suspended when a run window closes

    • Killed when the RUNLIMIT is reached

  • A distributed parallel application must receive a catchable signal when the job is suspended, resumed or terminated to propagate the signal to remote processes.

To override the default actions for the SUSPEND, RESUME, and TERMINATE job controls, specify the JOB_CONTROLS parameter in the queue definition in lsb.queues.

JOB_CONTROLS parameter (lsb.queues)

The JOB_CONTROLS parameter has the following format:
Begin Queue 
... 
JOB_CONTROLS = SUSPEND[signal | CHKPNT | command] \
               RESUME[signal | command] \
               TERMINATE[signal | CHKPNT | command]
... 
End Queue

When LSF needs to suspend, resume, or terminate a job, it invokes one of the following actions as specified by SUSPEND, RESUME, and TERMINATE.

signal

A UNIX signal name (for example, SIGTSTP or SIGTERM). The specified signal is sent to the job.

The same set of signals is not supported on all UNIX systems. To display a list of the symbolic names of the signals (without the SIG prefix) supported on your system, use the kill -l command.

CHKPNT

Checkpoint the job. Only valid for SUSPEND and TERMINATE actions.
  • If the SUSPEND action is CHKPNT, the job is checkpointed and then stopped by sending the SIGSTOP signal to the job automatically.

  • If the TERMINATE action is CHKPNT, then the job is checkpointed and killed automatically.

command

A /bin/sh command line.
  • Do not quote the command line inside an action definition.

  • Do not specify a signal followed by an action that triggers the same signal (for example, do not specify JOB_CONTROLS=TERMINATE[bkill] or JOB_CONTROLS=TERMINATE[brequeue]). This will cause a deadlock between the signal and the action.

Use a command as a job control action

  • The command line for the action is run with /bin/sh -c so you can use shell features in the command.

  • The command is run as the user of the job.

  • All environment variables set for the job are also set for the command action. The following additional environment variables are set:
    • LSB_JOBPGIDS: A list of current process group IDs of the job

    • LSB_JOBPIDS: A list of current process IDs of the job

  • For the SUSPEND action command, the environment variables LSB_SUSP_REASONS and LSB_SUSP_SUBREASONS are also set. Use them together in your custom job control to determine the exact load threshold that caused a job to be suspended.
    • LSB_SUSP_REASONS: An integer representing a bitmap of suspending reasons as defined in lsbatch.h. The suspending reason can allow the command to take different actions based on the reason for suspending the job.

    • LSB_SUSP_SUBREASONS: An integer representing the load index that caused the job to be suspended. When the suspending reason SUSP_LOAD_REASON (suspended by load) is set in LSB_SUSP_REASONS, LSB_SUSP_SUBREASONS is set to one of the load index values defined in lsf.h.

  • The standard input, output, and error of the command are redirected to the null device, so you cannot tell directly whether the command runs correctly. The default null device on UNIX is /dev/null.

  • Verify that the command line is correct. To see the output from the command line for testing purposes, redirect the output to a file inside the command line.
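The points above can be sketched as a small SUSPEND action script. This is an illustration under stated assumptions, not a reference implementation: the script path, the log file location, and the function names are hypothetical, and no suspend-reason bit values are hard-coded because they must be taken from lsbatch.h on your system.

```shell
#!/bin/sh
# Sketch of a SUSPEND action command. It could be referenced from a queue as
#   JOB_CONTROLS = SUSPEND[/path/to/suspend_action.sh]
# where the path is hypothetical.

# Standard output and error go to the null device, so log to a file instead.
# The default log location is an assumption.
ACTION_LOG=${ACTION_LOG:-/tmp/lsf_suspend_action.log}

log_action() {
    # Record the variables that LSF sets for the action command.
    printf 'job=%s pids=%s pgids=%s reasons=%s subreasons=%s\n' \
        "${LSB_JOBID:-unknown}" "${LSB_JOBPIDS:-}" "${LSB_JOBPGIDS:-}" \
        "${LSB_SUSP_REASONS:-0}" "${LSB_SUSP_SUBREASONS:-0}" >> "$ACTION_LOG"
}

# has_reason MASK: succeeds if any bit of MASK is set in LSB_SUSP_REASONS.
# Look up the real mask values (for example, SUSP_LOAD_REASON) in lsbatch.h.
has_reason() {
    [ $(( ${LSB_SUSP_REASONS:-0} & $1 )) -ne 0 ]
}

suspend_job() {
    log_action
    # LSB_JOBPIDS is left unquoted so that it splits into one PID per word.
    kill -s STOP $LSB_JOBPIDS 2>> "$ACTION_LOG"
}

# A real action script would end by calling: suspend_job
```

Writing to a log file inside the script follows the advice above about redirecting output for testing, since nothing the command prints is otherwise visible.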

TERMINATE job actions

Use caution when configuring TERMINATE job actions that do more than just kill a job. For example, resource usage limits that terminate jobs change the job state to SSUSP while LSF waits for the job to end. If the job is not killed by the TERMINATE action, it remains suspended indefinitely.
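One way to reduce this risk is to make sure that a custom TERMINATE command always ends by killing the job processes, whatever else it does first. The following sketch assumes a hypothetical helper named terminate_job and an arbitrary grace period. It relies only on the LSB_JOBPIDS variable described earlier.

```shell
#!/bin/sh
# Sketch of a TERMINATE action that performs extra work but still makes sure
# the job processes end. terminate_job and its grace period are hypothetical.

terminate_job() {
    grace=${1:-10}   # seconds to wait between SIGTERM and SIGKILL

    # ... site-specific work (notification, checkpoint, and so on) ...

    # LSB_JOBPIDS is left unquoted so that it splits into one PID per word.
    kill -s TERM $LSB_JOBPIDS 2>/dev/null || true
    sleep "$grace"

    # Force-kill anything that survived SIGTERM.
    for pid in $LSB_JOBPIDS; do
        if kill -0 "$pid" 2>/dev/null; then
            kill -s KILL "$pid" 2>/dev/null || true
        fi
    done
}
```

Sending SIGKILL last mirrors the final step of the default TERMINATE action, so the job cannot remain suspended because an earlier step failed.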

TERMINATE_WHEN parameter (lsb.queues)

In certain situations you may want to terminate the job instead of calling the default SUSPEND action. For example, you may want to kill jobs if the run window of the queue is closed. Use the TERMINATE_WHEN parameter to configure the queue to invoke the TERMINATE action instead of SUSPEND.

See the IBM Platform LSF Configuration Reference for information about the lsb.queues file and the TERMINATE_WHEN parameter.

Syntax

TERMINATE_WHEN = [LOAD] [PREEMPT] [WINDOW]

Example

The following defines a night queue that will kill jobs if the run window closes.
Begin Queue 
NAME           = night 
RUN_WINDOW     = 20:00-08:00 
TERMINATE_WHEN = WINDOW 
JOB_CONTROLS   = TERMINATE[ kill -KILL $LSB_JOBPIDS; \
     echo "job $LSB_JOBID killed by queue run window" | \
     mail $USER ] 
End Queue

LSB_SIGSTOP parameter (lsf.conf)

Use LSB_SIGSTOP to configure the SIGSTOP signal sent by the default SUSPEND action.

If LSB_SIGSTOP is set to anything other than SIGSTOP, the SIGTSTP signal that is normally sent by the SUSPEND action is not sent. For example, if LSB_SIGSTOP=SIGKILL, the three default signals sent by the TERMINATE action (SIGINT, SIGTERM, and SIGKILL) are sent 10 seconds apart.

Avoid signal and action deadlock

Do not configure a job control to contain the signal or command that is the same as the action associated with that job control. This will cause a deadlock between the signal and the action.

For example, the bkill command uses the TERMINATE action, so a deadlock results when the TERMINATE action itself contains the bkill command.

Any of the following job control specifications will cause a deadlock:
  • JOB_CONTROLS=TERMINATE[bkill]

  • JOB_CONTROLS=TERMINATE[brequeue]

  • JOB_CONTROLS=RESUME[bresume]

  • JOB_CONTROLS=SUSPEND[bstop]
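A quick way to look for the specifications listed above is to search the queue configuration file for them. The helper below is a heuristic sketch only: find_deadlock_controls is a hypothetical name, not an LSF command, and it cannot detect a deadlocking command that is wrapped inside a script called by the action.

```shell
#!/bin/sh
# Heuristic sketch: list lines in a queue configuration file whose action
# directly names the command that triggers the same action.
# Commented-out lines that contain the pattern are reported as well.

find_deadlock_controls() {
    # $1 is the path to the lsb.queues file to check.
    grep -nE '(TERMINATE\[[[:space:]]*(bkill|brequeue)|RESUME\[[[:space:]]*bresume|SUSPEND\[[[:space:]]*bstop)' "$1"
}
```

The function prints each suspicious line with its line number and, like grep, returns a non-zero status when nothing matches.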

Customize cross-platform signal conversion

LSF supports signal conversion between UNIX and Windows for remote interactive execution through RES.

On Windows, the CTRL+C and CTRL+BREAK key combinations are treated as signals for console applications (these signals are also called console control actions).

LSF supports these two Windows console signals for remote interactive execution. LSF regenerates these signals for user tasks on the execution host.

Default signal conversion

In a mixed Windows/UNIX environment, LSF has the following default conversion between the Windows console signals and the UNIX signals:

Windows       UNIX
CTRL+C        SIGINT
CTRL+BREAK    SIGQUIT

For example, if you issue the lsrun or bsub -I commands from a Windows console but the task is running on a UNIX host, pressing CTRL+C sends a UNIX SIGINT signal to your task on the UNIX host. The opposite is also true.

Custom signal conversion

For lsrun (but not bsub -I), LSF allows you to define your own signal conversion using the following environment variables:
  • LSF_NT2UNIX_CTRLC

  • LSF_NT2UNIX_CTRLB

For example:
  • LSF_NT2UNIX_CTRLC=SIGXXXX

  • LSF_NT2UNIX_CTRLB=SIGYYYY

Here, SIGXXXX and SIGYYYY are UNIX signal names such as SIGQUIT or SIGINT. The conversions are then CTRL+C=SIGXXXX and CTRL+BREAK=SIGYYYY.

If both LSF_NT2UNIX_CTRLC and LSF_NT2UNIX_CTRLB are set to the same value (LSF_NT2UNIX_CTRLC=SIGXXXX and LSF_NT2UNIX_CTRLB=SIGXXXX), CTRL+C will be generated on the Windows execution host.

For bsub -I, there is no conversion other than the default conversion.

Process tracking through cgroups

This feature depends on the Control Groups (cgroups) functions provided by the Linux kernel. The cgroups functions are supported on x86_64 and PowerPC Linux with kernel version 2.6.24 or later.

Process tracking through cgroups can capture job processes that are not in the existing job's process tree and have process group IDs that are different from the existing ones, or job processes that run very quickly, before LSF has a chance to find them in the regular or on-demand process table scan issued by PIM.

Process tracking is controlled by two parameters in lsf.conf:

  • LSF_PROCESS_TRACKING: Tracks job processes and executes job control functions such as termination, suspension, resumption, and other signaling on Linux systems that support the cgroup freezer subsystem.
  • LSF_LINUX_CGROUP_ACCT: Tracks processes based on CPU and memory accounting on Linux systems that support the cgroup memory and cpuacct subsystems.

If you plan to use process tracking and cgroup accounting, you must set up the freezer, cpuacct, and memory subsystems on each machine in the cluster that supports cgroups.
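Before changing any mounts, it can help to confirm that the kernel on a host offers the three subsystems. The sketch below reads /proc/cgroups, which on Linux lists each subsystem with an enabled flag in the fourth column. The function name missing_subsystems is a hypothetical helper, not an LSF command.

```shell
#!/bin/sh
# Sketch: print the cgroup subsystems needed by the two LSF features that the
# kernel does not report as enabled. missing_subsystems is a hypothetical name.

missing_subsystems() {
    # $1 optionally overrides the file to read, which is useful for testing.
    cgroups_file=${1:-/proc/cgroups}
    for subsys in freezer cpuacct memory; do
        # Columns in /proc/cgroups: subsys_name hierarchy num_cgroups enabled
        if ! awk -v s="$subsys" \
            '$1 == s && $4 == 1 { found = 1 } END { exit !found }' \
            "$cgroups_file"; then
            echo "$subsys"
        fi
    done
}
```

An empty result suggests that the host can support both features. Any name that is printed needs attention before the subsystem can be mounted.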

For example, to configure the cgroup subsystems to support both LSF cgroup features:

  • For Linux kernel versions earlier than 3.0 (for example, Red Hat 6.2, 6.3 and 6.4, and SUSE 11 Patch 1), add the following lines to /etc/fstab:

    CAUTION:
    Confirm that the appropriate functionality is correctly installed on the system before making updates to /etc/fstab.

    cgroup /cgroup/freezer cgroup freezer,ns 0 0
    cgroup /cgroup/cpuset cgroup cpuset 0 0
    cgroup /cgroup/cpuacct cgroup cpuacct 0 0
    cgroup /cgroup/memory cgroup memory 0 0
  • For Linux kernel versions 3.0 or later (for example, SUSE 11 Patch 2), add the following lines to /etc/fstab:

    cgroup /cgroup/freezer cgroup freezer 0 0
    cgroup /cgroup/cpuset cgroup cpuset 0 0
    cgroup /cgroup/cpuacct cgroup cpuacct 0 0
    cgroup /cgroup/memory cgroup memory 0 0

Then, for either kernel version, run the following command: mount -a -t cgroup

Make sure these directories (/cgroup/freezer, /cgroup/cpuset, /cgroup/cpuacct, /cgroup/memory) exist in the /cgroup directory before the mount command is issued.
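The mount points can be created in one step. The following sketch uses a hypothetical helper, make_cgroup_dirs, whose optional argument overrides the /cgroup root so that it can be tried in a scratch location first.

```shell
#!/bin/sh
# Sketch: create the four cgroup mount points before running mount.
# make_cgroup_dirs is a hypothetical helper, not an LSF command.

make_cgroup_dirs() {
    root=${1:-/cgroup}
    for subsys in freezer cpuset cpuacct memory; do
        # mkdir -p succeeds if the directory already exists.
        mkdir -p "$root/$subsys" || return 1
    done
}
```

Creating directories under /cgroup normally requires root privileges. After the directories exist, the mount command can be issued.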

If you only want to enable one LSF cgroup feature (for example, LSF_LINUX_CGROUP_ACCT), add the following lines to /etc/fstab:

cgroup /cgroup/cpuacct cgroup cpuacct 0 0
cgroup /cgroup/memory cgroup memory 0 0

Alternatively, if you use cgconfig to manage cgroups, you can configure the cgroup subsystems to support both LSF cgroup features by adding the following to /etc/cgconfig.conf:

mount {
 freezer = /cgroup/freezer;
 cpuset =  /cgroup/cpuset;
 cpuacct = /cgroup/cpuacct;
 memory = /cgroup/memory;
}

To start or restart the cgconfig service, use /etc/init.d/cgconfig start|restart. The cgconfig service is normally not installed by default. To install it, use the corresponding rpm package: libcgroup for Red Hat and libcgroup1 for SUSE.

To verify that the cgroup mount operation succeeded, check the file /proc/mounts. It should contain lines similar to the following:

cgroup /cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /cgroup/memory cgroup rw,relatime,memory 0 0
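This check can be scripted. The sketch below looks for a mount of type cgroup whose options include a given subsystem. The function name cgroup_mounted is a hypothetical helper, and its optional second argument exists only so that the check can be tried against a sample file.

```shell
#!/bin/sh
# Sketch: succeed if the given subsystem appears in the options of a
# cgroup-type mount. cgroup_mounted is a hypothetical helper.

cgroup_mounted() {
    subsys=$1
    mounts_file=${2:-/proc/mounts}
    # Field 3 is the file system type and field 4 the comma-separated options.
    # Wrapping both sides in commas avoids matching "cpu" inside "cpuacct".
    awk -v s="$subsys" '
        $3 == "cgroup" && index("," $4 ",", "," s ",") > 0 { found = 1 }
        END { exit !found }
    ' "$mounts_file"
}
```

For example, `cgroup_mounted freezer || echo "freezer is not mounted"` reports a missing freezer mount on the current host.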

If you no longer need the cgroup subsystems mounted, you can use the command umount -a -t cgroup to unmount all cgroup-type mount points listed in /etc/fstab.

You can also unmount them individually, for example:

umount /cgroup/freezer
umount /cgroup/cpuset
umount /cgroup/cpuacct
umount /cgroup/memory