Configuration to modify preemptive scheduling behavior

Configuration parameters can modify various aspects of preemptive scheduling behavior by:

  • Modifying the selection of the queue to preempt jobs from

  • Modifying the selection of the job to preempt

  • Modifying preemption of backfill and exclusive jobs

  • Modifying the way job slot limits are calculated

  • Modifying the number of jobs to preempt for a parallel job

  • Modifying the control action applied to preempted jobs

  • Controlling how many times a job can be preempted

  • Specifying a grace period before preemption to improve cluster performance

Configuration to modify selection of queue to preempt

File: lsb.queues

Parameter: PREEMPTION

Syntax and description:

PREEMPTION=PREEMPTIVE[low_queue+pref …]
  • Jobs in this queue can preempt running jobs from the specified queues, starting with jobs in the queue with the highest value set for preference

PREEMPTION=PREEMPTABLE[hi_queue …]
  • Jobs in this queue can be preempted by jobs from the specified queues

PRIORITY=integer

  • Sets the priority for this queue relative to all other queues

  • The higher the priority value, the more likely it is that jobs from this queue may preempt jobs from other queues, and the less likely it is for jobs from this queue to be preempted by jobs from other queues
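The preference ordering described for PREEMPTION=PREEMPTIVE can be illustrated with a short sketch. This is not LSF code: the function name and the queue names are hypothetical, and the sketch assumes that a queue listed without +pref has a preference of 0 and that queues with equal preference keep the order in which they were listed.

```python
def preemption_order(spec: str) -> list[str]:
    """Return queue names from a 'name+pref name+pref ...' string,
    ordered by preference value, highest first.

    Assumption for this sketch only: a missing '+pref' means preference 0.
    """
    entries = []
    for token in spec.split():
        name, sep, pref = token.partition("+")
        entries.append((name, int(pref) if sep else 0))
    # sort() is stable, so queues with equal preference keep their listed order
    entries.sort(key=lambda entry: entry[1], reverse=True)
    return [name for name, _ in entries]


# Hypothetical queue names, as they might appear inside PREEMPTIVE[...]
order = preemption_order("short+1 night+3 idle+2")
print(order)
```

In this model, jobs in the queue named night would be considered for preemption first, because that queue has the highest preference value.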

Configuration to modify selection of job to preempt

Files: lsb.params, lsb.applications

Parameter: PREEMPT_FOR

Syntax and description:

PREEMPT_FOR=LEAST_RUN_TIME

  • Preempts the job that has been running for the shortest time

Parameter: NO_PREEMPT_RUN_TIME

NO_PREEMPT_RUN_TIME=%
  • Prevents preemption of jobs that have been running for the specified number of minutes or the specified percentage of the job duration, or longer

  • If NO_PREEMPT_RUN_TIME is specified as a percentage, the job cannot be preempted after running for that percentage of the job duration. For example, if the job run limit is 60 minutes and NO_PREEMPT_RUN_TIME=50%, the job cannot be preempted after it has been running for 30 minutes or longer.

  • If you specify a percentage for NO_PREEMPT_RUN_TIME, a run time (bsub -We or RUNTIME in lsb.applications) or a run limit (bsub -W, RUNLIMIT in lsb.queues, or RUNLIMIT in lsb.applications) must be specified for the job

Parameter: NO_PREEMPT_FINISH_TIME

NO_PREEMPT_FINISH_TIME=%
  • Prevents preemption of jobs that will finish within the specified number of minutes or the specified percentage of the job duration

  • If NO_PREEMPT_FINISH_TIME is specified as a percentage, the job cannot be preempted if it will finish within that percentage of the job duration. For example, if the job run limit is 60 minutes and NO_PREEMPT_FINISH_TIME=10%, the job cannot be preempted after it has been running for 54 minutes or longer.

  • If you specify a percentage for NO_PREEMPT_FINISH_TIME, a run time (bsub -We or RUNTIME in lsb.applications) or a run limit (bsub -W, RUNLIMIT in lsb.queues, or RUNLIMIT in lsb.applications) must be specified for the job
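The two percentage examples above (50% and 10% of a 60-minute run limit) can be checked with a short calculation. The following is only an illustrative sketch, not LSF code: the function names are hypothetical, and it assumes the job duration is the run limit expressed in minutes.

```python
def run_time_threshold(duration_min: float, percent: float) -> float:
    """Minutes of run time after which NO_PREEMPT_RUN_TIME=percent%
    protects the job (model only)."""
    return duration_min * percent / 100


def finish_time_threshold(duration_min: float, percent: float) -> float:
    """Minutes of run time after which the job is within percent% of
    its duration from finishing (model of NO_PREEMPT_FINISH_TIME)."""
    return duration_min * (100 - percent) / 100


# Values taken from the examples in the text: a 60-minute run limit
run_threshold = run_time_threshold(60, 50)        # NO_PREEMPT_RUN_TIME=50%
finish_threshold = finish_time_threshold(60, 10)  # NO_PREEMPT_FINISH_TIME=10%
print(run_threshold, finish_threshold)
```

Both results match the text: protection starts at 30 minutes in the first case and at 54 minutes in the second.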

Files: lsb.params, lsb.queues, lsb.applications

Parameter: MAX_TOTAL_TIME_PREEMPT

MAX_TOTAL_TIME_PREEMPT=minutes

  • Prevents preemption of jobs that already have an accumulated preemption time of minutes or greater.

  • The accumulated preemption time is reset in the following cases:

    • Job status becomes EXIT or DONE

    • Job is re-queued

    • Job is re-run

    • Job is migrated and restarted

  • MAX_TOTAL_TIME_PREEMPT does not affect preemption triggered by advance reservation or License Scheduler.

  • Accumulated preemption time does not include preemption by advance reservation or License Scheduler.

Parameter: NO_PREEMPT_INTERVAL

NO_PREEMPT_INTERVAL=minutes
  • Prevents preemption of jobs until after an uninterrupted run time interval of minutes since the job was dispatched or last resumed.

  • NO_PREEMPT_INTERVAL does not affect preemption triggered by advance reservation or License Scheduler.
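How MAX_TOTAL_TIME_PREEMPT and NO_PREEMPT_INTERVAL protect a job can be modeled as a simple check. This is a sketch under stated assumptions, not LSF's implementation: the function and argument names are hypothetical, all times are in minutes, and a job whose uninterrupted run time exactly equals the interval is assumed to be preemptable again.

```python
from typing import Optional


def is_protected(accumulated_preempt_min: float,
                 uninterrupted_run_min: float,
                 max_total_time_preempt: Optional[float] = None,
                 no_preempt_interval: Optional[float] = None,
                 by_reservation_or_license: bool = False) -> bool:
    """Return True if this model treats the job as protected from preemption."""
    # Neither parameter affects preemption triggered by advance
    # reservation or License Scheduler.
    if by_reservation_or_license:
        return False
    # MAX_TOTAL_TIME_PREEMPT: accumulated preemption time has reached the limit.
    if (max_total_time_preempt is not None
            and accumulated_preempt_min >= max_total_time_preempt):
        return True
    # NO_PREEMPT_INTERVAL: the job has not yet run uninterrupted long enough
    # (boundary handling is an assumption of this sketch).
    if (no_preempt_interval is not None
            and uninterrupted_run_min < no_preempt_interval):
        return True
    return False


# A job resumed 5 minutes ago, with NO_PREEMPT_INTERVAL=10 in this model
print(is_protected(0, 5, no_preempt_interval=10))
```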

Configuration to modify preemption of backfill and exclusive jobs

File: lsb.params

Parameter: PREEMPT_JOBTYPE

Syntax and description:

PREEMPT_JOBTYPE=BACKFILL

  • Enables preemption of backfill jobs.

  • Requires the line PREEMPTION=PREEMPTABLE in the queue definition.

  • Only jobs from queues with a higher priority than queues that define resource or slot reservations can preempt jobs from backfill queues.

PREEMPT_JOBTYPE=EXCLUSIVE

  • Enables preemption of and preemption by exclusive jobs.

  • Requires the line PREEMPTION=PREEMPTABLE or PREEMPTION=PREEMPTIVE in the queue definition.

  • Requires the definition of LSB_DISABLE_LIMLOCK_EXCL in lsf.conf.

PREEMPT_JOBTYPE=EXCLUSIVE BACKFILL

  • Enables preemption of exclusive jobs, backfill jobs, or both.

File: lsf.conf

Parameter: LSB_DISABLE_LIMLOCK_EXCL

LSB_DISABLE_LIMLOCK_EXCL=y
  • Enables preemption of exclusive jobs.

  • For a host running an exclusive job:
    • lsload displays the host status ok.

    • bhosts displays the host status closed.

    • Users can run tasks on the host using lsrun or lsgrun. To prevent users from running tasks during execution of an exclusive job, the parameter LSF_DISABLE_LSRUN=y must be defined in lsf.conf.

  • Changing this parameter requires a restart of all sbatchds in the cluster (badmin hrestart). Do not change this parameter while exclusive jobs are running.

Configuration to modify how job slot usage is calculated

File: lsb.params

Parameter: PREEMPT_FOR

Syntax and description:

PREEMPT_FOR=GROUP_JLP

  • Counts only running jobs when evaluating if a user group is approaching its per-processor job slot limit (SLOTS_PER_PROCESSOR, USERS, and PER_HOST=all in the lsb.resources file), ignoring suspended jobs

PREEMPT_FOR=GROUP_MAX

  • Counts only running jobs when evaluating if a user group is approaching its total job slot limit (SLOTS, PER_USER=all, and HOSTS in the lsb.resources file), ignoring suspended jobs

PREEMPT_FOR=HOST_JLU

  • Counts only running jobs when evaluating if a user or user group is approaching its per-host job slot limit (SLOTS, PER_USER=all, and HOSTS in the lsb.resources file), ignoring suspended jobs

PREEMPT_FOR=USER_JLP

  • Counts only running jobs when evaluating if a user is approaching their per-processor job slot limit (SLOTS_PER_PROCESSOR, USERS, and PER_HOST=all in the lsb.resources file)

  • Ignores suspended jobs when calculating the per-processor job slot limit for individual users
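The effect these PREEMPT_FOR keywords have on slot accounting can be shown with a small model. The sketch below is illustrative only: the function name and the RUN/SUSP state labels are hypothetical, and it assumes that without the keyword, suspended jobs also count toward the limit.

```python
def slots_in_use(jobs: list[tuple[str, int]], count_suspended: bool) -> int:
    """Sum the slots of the jobs that count toward a job slot limit.

    jobs is a list of (state, slots) pairs; the state labels are
    placeholders for this sketch.
    """
    total = 0
    for state, slots in jobs:
        if state == "RUN" or (count_suspended and state == "SUSP"):
            total += slots
    return total


jobs = [("RUN", 2), ("SUSP", 4), ("RUN", 1)]

# Assumed behavior without the keyword: suspended jobs count as well
default_usage = slots_in_use(jobs, count_suspended=True)
# With a keyword such as USER_JLP: only running jobs count
keyword_usage = slots_in_use(jobs, count_suspended=False)
print(default_usage, keyword_usage)
```

With a limit of 4 slots, this user would already be over the limit in the first calculation but still under it in the second.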

Configuration to modify preemption of parallel jobs

File: lsb.params

Parameter: PREEMPT_FOR

Syntax and description:

PREEMPT_FOR=MINI_JOB

  • Optimizes preemption of parallel jobs by preempting only enough low-priority parallel jobs to start the high-priority parallel job

PREEMPT_FOR=OPTIMAL_MINI_JOB

  • Optimizes preemption of parallel jobs by preempting only low-priority parallel jobs based on the least number of jobs that will be suspended to allow the high-priority parallel job to start
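The idea of suspending the least number of jobs, as described for OPTIMAL_MINI_JOB, can be sketched as a selection problem. The code below is a simplified illustration and not LSF's algorithm: the function name and job names are hypothetical, and it ignores hosts, queue priorities, and every other scheduling constraint, looking only at how many slots each low-priority job holds.

```python
from typing import Optional


def fewest_jobs_to_free(slots_needed: int,
                        running_jobs: dict[str, int]) -> Optional[list[str]]:
    """Pick the smallest set of jobs whose slots cover slots_needed.

    Taking the largest jobs first minimizes the number of jobs chosen.
    Returns None if all jobs together do not free enough slots.
    """
    chosen: list[str] = []
    freed = 0
    largest_first = sorted(running_jobs.items(),
                           key=lambda item: item[1], reverse=True)
    for job_id, slots in largest_first:
        if freed >= slots_needed:
            break
        chosen.append(job_id)
        freed += slots
    return chosen if freed >= slots_needed else None


# Hypothetical low-priority parallel jobs and the slots each one holds
low_priority = {"job1": 2, "job2": 8, "job3": 4}
selected = fewest_jobs_to_free(10, low_priority)
print(selected)
```

Here two jobs are enough to free the 10 slots requested, so the third job keeps running.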

Configuration to modify the control action applied to preempted jobs

File: lsb.queues

Parameter: TERMINATE_WHEN

Syntax and description:

TERMINATE_WHEN=PREEMPT

  • Changes the default control action of SUSPEND to TERMINATE so that LSF terminates preempted jobs

Configuration to control how many times a job can be preempted

By default, if preemption is enabled, there is no guarantee that a job will ever complete. A lower-priority job could be preempted again and again, and ultimately end up being killed due to a run limit.

Limiting the number of times a job can be preempted is configured cluster-wide (lsb.params), at the queue level (lsb.queues), and at the application level (lsb.applications). MAX_JOB_PREEMPT in lsb.applications overrides lsb.queues, and lsb.queues overrides lsb.params configuration.

Files: lsb.params, lsb.queues, lsb.applications

Parameter: MAX_JOB_PREEMPT

Syntax and description:

MAX_JOB_PREEMPT=integer
  • Specifies the maximum number of times a job can be preempted.

  • Specify a value within the following range:

    0 < MAX_JOB_PREEMPT < INFINIT_INT

    INFINIT_INT is defined in lsf.h

  • By default, the number of preemption times is unlimited.

When MAX_JOB_PREEMPT is set and a job is preempted by a higher-priority job, the number of job preemption times is set to 1. When the number of preemption times exceeds MAX_JOB_PREEMPT, the job runs to completion and cannot be preempted again.

The number of job preemption times is recovered when LSF is restarted or reconfigured.

If brequeue or bmig is invoked under a job suspend control (SUSPEND_CONTROL in lsb.applications or JOB_CONTROLS in lsb.queues), the job is requeued or migrated and the preempted counter is reset to 0. To prevent the preempted counter from resetting to 0 under job suspend control, set MAX_JOB_PREEMPT_RESET in lsb.params to N. LSF then does not reset the preempted count for MAX_JOB_PREEMPT when the started job is requeued, migrated, or rerun.
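The counter behavior described above can be modeled with a small class. This is a sketch under stated assumptions rather than LSF's implementation: the class and method names are hypothetical, the job is assumed to become non-preemptable once the counter reaches MAX_JOB_PREEMPT, and the reset_counter flag stands in for MAX_JOB_PREEMPT_RESET (False models the value N).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    max_job_preempt: Optional[int] = None  # None models the default (unlimited)
    preempt_count: int = 0

    def can_be_preempted(self) -> bool:
        # Assumption: the limit is the maximum number of preemptions allowed
        return (self.max_job_preempt is None
                or self.preempt_count < self.max_job_preempt)

    def preempt(self) -> bool:
        """Try to preempt the job; return True if the preemption happened."""
        if not self.can_be_preempted():
            return False
        self.preempt_count += 1
        return True

    def requeue_or_migrate(self, reset_counter: bool = True) -> None:
        """Model brequeue or bmig under a job suspend control."""
        if reset_counter:
            self.preempt_count = 0


job = Job(max_job_preempt=2)
results = [job.preempt() for _ in range(3)]
print(results, job.preempt_count)
```

In this model the third preemption attempt fails. Calling requeue_or_migrate() afterward makes the job preemptable again, while requeue_or_migrate(reset_counter=False) leaves it protected.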

Configuration of a grace period before preemption

For details, see PREEMPT_DELAY in the file configuration reference.

Files (in order of precedence): lsb.applications, lsb.queues, lsb.params

Parameter: PREEMPT_DELAY

Syntax and description:

PREEMPT_DELAY=seconds
  • Specifies the number of seconds for a preemptive job in the pending state to wait before a lower-priority job can be preempted.

  • By default, preemption is immediate.
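The order of precedence and the grace period can be illustrated together. The sketch below is a model only: the function names are hypothetical, None stands for a file in which PREEMPT_DELAY is not defined, and the comparison at exactly the delay value is an assumption.

```python
from typing import Optional


def effective_preempt_delay(application: Optional[int] = None,
                            queue: Optional[int] = None,
                            params: Optional[int] = None) -> int:
    """Pick the PREEMPT_DELAY value in order of precedence:
    lsb.applications, then lsb.queues, then lsb.params."""
    for value in (application, queue, params):
        if value is not None:
            return value
    return 0  # default: preemption is immediate


def may_preempt_now(pending_seconds: int, delay_seconds: int) -> bool:
    """True once the preemptive job has waited out the grace period."""
    return pending_seconds >= delay_seconds


# Hypothetical values: 120 seconds at the queue level, 30 cluster-wide
delay = effective_preempt_delay(queue=120, params=30)
print(delay, may_preempt_now(60, delay), may_preempt_now(120, delay))
```

Because the queue-level value takes precedence over the cluster-wide one, the preemptive job in this example waits 120 seconds before a lower-priority job can be preempted.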