When LSF detects that a job is terminated, bacct -l, bhist -l, and bjobs -l display one of the following termination reasons:
| Keyword displayed by bacct | Termination reason | Integer value logged to JOB_FINISH in lsb.acct |
|---|---|---|
| TERM_ADMIN | Job killed by root or LSF administrator | 15 |
| TERM_BUCKET_KILL | Job killed with bkill -b | 23 |
| TERM_CHKPNT | Job killed after checkpointing | 13 |
| TERM_CPULIMIT | Job killed after reaching LSF CPU usage limit | 12 |
| TERM_CWD_NOTEXIST | Current working directory is not accessible or does not exist on the execution host | 25 |
| TERM_DEADLINE | Job killed after deadline expires | 6 |
| TERM_EXTERNAL_SIGNAL | Job killed by a signal external to LSF | 17 |
| TERM_FORCE_ADMIN | Job killed by root or LSF administrator without time for cleanup | 9 |
| TERM_FORCE_OWNER | Job killed by owner without time for cleanup | 8 |
| TERM_LOAD | Job killed after load exceeds threshold | 3 |
| TERM_MEMLIMIT | Job killed after reaching LSF memory usage limit | 16 |
| TERM_OTHER | Member of a chunk job in WAIT state killed and requeued after being switched to another queue | 4 |
| TERM_OWNER | Job killed by owner | 14 |
| TERM_PREEMPT | Job killed after preemption | 1 |
| TERM_PROCESSLIMIT | Job killed after reaching LSF process limit | 7 |
| TERM_REMOVE_HUNG_JOB | Job removed from LSF | 26 |
| TERM_REQUEUE_ADMIN | Job killed and requeued by root or LSF administrator | 11 |
| TERM_REQUEUE_OWNER | Job killed and requeued by owner | 10 |
| TERM_RMS | Job exited from an RMS system error | 18 |
| TERM_RUNLIMIT | Job killed after reaching LSF run time limit | 5 |
| TERM_SWAP | Job killed after reaching LSF swap usage limit | 20 |
| TERM_THREADLIMIT | Job killed after reaching LSF thread limit | 21 |
| TERM_UNKNOWN | LSF cannot determine a termination reason; 0 is logged but TERM_UNKNOWN is not displayed | 0 |
| | The orphan job was automatically terminated by LSF | |
| TERM_WINDOW | | 2 |
| TERM_ZOMBIE | Job exited while LSF is not available | 19 |
The integer values logged to the JOB_FINISH event in lsb.acct and termination reason keywords are mapped in lsbatch.h.
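As a rough illustration of that mapping, the sketch below turns an integer taken from a JOB_FINISH record into its keyword. The dictionary is transcribed from the table above, not from lsbatch.h itself. The names `TERM_REASONS` and `term_reason_keyword` are hypothetical helpers, not part of LSF, and the orphan-job row is left out because the table gives neither its keyword nor its integer.

```python
# Integer values and keywords transcribed from the termination reason table.
# Hypothetical helper: LSF itself defines these constants in lsbatch.h.
TERM_REASONS = {
    0: "TERM_UNKNOWN",
    1: "TERM_PREEMPT",
    2: "TERM_WINDOW",
    3: "TERM_LOAD",
    4: "TERM_OTHER",
    5: "TERM_RUNLIMIT",
    6: "TERM_DEADLINE",
    7: "TERM_PROCESSLIMIT",
    8: "TERM_FORCE_OWNER",
    9: "TERM_FORCE_ADMIN",
    10: "TERM_REQUEUE_OWNER",
    11: "TERM_REQUEUE_ADMIN",
    12: "TERM_CPULIMIT",
    13: "TERM_CHKPNT",
    14: "TERM_OWNER",
    15: "TERM_ADMIN",
    16: "TERM_MEMLIMIT",
    17: "TERM_EXTERNAL_SIGNAL",
    18: "TERM_RMS",
    19: "TERM_ZOMBIE",
    20: "TERM_SWAP",
    21: "TERM_THREADLIMIT",
    23: "TERM_BUCKET_KILL",
    25: "TERM_CWD_NOTEXIST",
    26: "TERM_REMOVE_HUNG_JOB",
}


def term_reason_keyword(value: int) -> str:
    """Return the keyword for a logged integer, or a marker if it is not in the table."""
    return TERM_REASONS.get(value, f"unlisted ({value})")


if __name__ == "__main__":
    for logged in (15, 0, 22):
        print(logged, term_reason_keyword(logged))
```

Values that are not in the table (22, for example) are reported as unlisted rather than guessed, since the table does not say what they mean.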
If a queue-level JOB_CONTROL is configured, LSF cannot determine the result of the configured action. The termination reason then reflects only what the reason would be within LSF, not the actual outcome of the action.
LSF is not guaranteed to catch external signals sent directly to the job.
In MultiCluster, a brequeue request sent from the submission cluster is translated to TERM_OWNER or TERM_ADMIN in the remote execution cluster. The termination reason in the email notification sent from the execution cluster, and in lsb.acct, is set to TERM_OWNER or TERM_ADMIN.