For example, the following bjobs command shows that job 1 has been running longer than the configured JOB_OVERRUN threshold and is consuming no CPU time. bjobs displays the job idle factor and reports both the job overrun and job idle exceptions on the EXCEPTION STATUS line. A job that finishes before the configured JOB_UNDERRUN threshold would instead be reported with an exception status of underrun. The command and its output for job 1:
bjobs -x -l -a
Job <1>, User <user1>, Project <default>, Status <RUN>, Queue <normal>, Command
<sleep 600>
Wed Aug 13 14:23:35 2009: Submitted from host <hostA>, CWD <$HOME>, Output File
</dev/null>, Specified Hosts <hostB>;
Wed Aug 13 14:23:43 2009: Started on <hostB>, Execution Home </home/user1>, Execution
CWD </home/user1>;
Resource usage collected.
IDLE_FACTOR(cputime/runtime): 0.00
MEM: 3 Mbytes; SWAP: 4 Mbytes; NTHREAD: 3
PGID: 5027; PIDs: 5027 5028 5029
MEMORY USAGE:
MAX MEM: 8 Mbytes; AVG MEM: 4 Mbytes
SCHEDULING PARAMETERS:
            r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem
loadSched      -     -     -     -     -     -     -     -     -     -     -
loadStop       -     -     -     -     -     -     -     -     -     -     -
             cpuspeed  bandwidth
loadSched           -          -
loadStop            -          -
EXCEPTION STATUS: overrun idle
RESOURCE REQUIREMENT DETAILS:
Combined : {4*{select[type == local] order[r15s:pg] span[ptile=2]}} || {2*{select
[type == local] order[r15s:pg] span[hosts=1]}}
Effective : 2*{select[type == local] order[r15s:pg] span[hosts=1] }
Use bacct -l -x to trace the history of job exceptions.
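When many jobs are listed, it can help to extract the exception fields from the long-format output with a script. The following Python sketch is an illustration only and is not part of LSF. The function name parse_exceptions is hypothetical, and the parser assumes that the text looks like the sample output above: a "Job <id>" header, an optional IDLE_FACTOR line, and an optional EXCEPTION STATUS line.

```python
import re

def parse_exceptions(bjobs_output: str) -> dict:
    """Map each job ID to its idle factor and list of exception names.

    Assumes text shaped like the `bjobs -x -l` sample shown above;
    other LSF versions may format these lines differently.
    """
    jobs = {}
    current = None
    for raw in bjobs_output.splitlines():
        line = raw.strip()
        # A new job record starts with "Job <id>".
        m = re.match(r"Job <([^>]+)>", line)
        if m:
            current = m.group(1)
            jobs[current] = {"idle_factor": None, "exceptions": []}
            continue
        if current is None:
            continue  # skip anything before the first job header
        m = re.match(r"IDLE_FACTOR\(cputime/runtime\):\s*([\d.]+)", line)
        if m:
            jobs[current]["idle_factor"] = float(m.group(1))
            continue
        m = re.match(r"EXCEPTION STATUS:\s*(.*)", line)
        if m:
            # Exception names are separated by whitespace, e.g. "overrun idle".
            jobs[current]["exceptions"] = m.group(1).split()
    return jobs

# Abbreviated copy of the sample output above.
sample = """\
Job <1>, User <user1>, Project <default>, Status <RUN>, Queue <normal>, Command
<sleep 600>
IDLE_FACTOR(cputime/runtime): 0.00
EXCEPTION STATUS: overrun idle
"""

print(parse_exceptions(sample))
# {'1': {'idle_factor': 0.0, 'exceptions': ['overrun', 'idle']}}
```

On a live cluster, the same function could be given the captured text of bjobs -x -l -a, for example through Python's subprocess module, provided the output format matches the assumptions above.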