Absolute job priority scheduling

Absolute job priority scheduling (APS) provides a mechanism to control the job dispatch order to prevent job starvation.

When configured in a queue, APS sorts pending jobs for dispatch according to a job priority value calculated based on several configurable job-related factors. Each job priority weighting factor can contain subfactors. Factors and subfactors can be independently assigned a weight.

APS provides administrators with detailed yet straightforward control of the job selection process.
  • APS only sorts the jobs; job scheduling is still based on configured LSF scheduling policies. LSF attempts to schedule and dispatch jobs based on their order in the APS queue, but the dispatch order is not guaranteed.

  • The job priority is calculated for pending jobs across multiple queues as the sum of configurable factor values. Jobs are then ordered based on the calculated APS value.

  • You can adjust the following for APS factors:
    • A weight for scaling each job-related factor and subfactor

    • Limits for each job-related factor and subfactor

    • A grace period for each factor and subfactor

  • To configure absolute priority scheduling (APS) across multiple queues, define APS queue groups. When you submit a job to any queue in a group, the job's dispatch priority is calculated using the formula defined in the group's master queue.

  • Administrators can also set a static system APS value for a job. A job with a system APS priority is guaranteed to have a higher priority than any calculated value. Jobs with higher system APS settings have priority over jobs with lower system APS settings.

  • Administrators can use the ADMIN factor to manually adjust the calculated APS value for individual jobs.
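The ordering rules in the list above can be sketched in a few lines of Python. This is an illustration only, not LSF's implementation: the PendingJob class, its field names, and the aps_order function are hypothetical, and the sketch assumes that the ADMIN adjustment is applied only to jobs without a system APS value.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PendingJob:
    """Hypothetical record of the values that decide APS ordering."""
    job_id: int
    calculated_aps: float              # sum of the weighted factor values
    admin_adjust: float = 0.0          # ADMIN factor, set with bmod -aps
    system_aps: Optional[int] = None   # static system APS value, if set


def sort_key(job: PendingJob) -> Tuple[int, float]:
    # A job with a system APS value outranks every job that has only a
    # calculated value; among such jobs, the higher setting wins.
    if job.system_aps is not None:
        return (1, float(job.system_aps))
    # Assumption: the ADMIN adjustment is simply added to the calculated value.
    return (0, job.calculated_aps + job.admin_adjust)


def aps_order(jobs: List[PendingJob]) -> List[PendingJob]:
    """Return jobs sorted from highest to lowest dispatch priority."""
    return sorted(jobs, key=sort_key, reverse=True)
```

As noted in the first item of the list, this order is only the order in which LSF attempts to schedule jobs; the actual dispatch order is not guaranteed.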

Scheduling priority factors

To calculate the job priority, APS divides job-related information into several categories. Each category becomes a factor in the calculation of the scheduling priority. You can configure the weight, limit, and grace period of each factor to get the desired job dispatch order.

LSF multiplies the value of each factor by the weight of that factor and sums the weighted values.

Factor weight

The weight of a factor expresses the importance of the factor in the absolute scheduling priority. The factor weight is multiplied by the value of the factor to change the factor value. A positive weight increases the importance of the factor, and a negative weight decreases the importance of a factor. Undefined factors have a weight of 0, which causes the factor to be ignored in the APS calculation.

Factor limit

The limit of a factor sets the minimum and maximum absolute value of each weighted factor. Factor limits must be positive values.

Factor grace period

Each factor can be configured with a grace period. The factor is counted as part of the APS value only after the job has been pending for longer than the grace period.
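The interaction of weight, limit, and grace period can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the LSF algorithm: the function names and parameters are hypothetical, the limit is assumed to clamp the weighted value to the range from -limit to +limit, and the order of the checks is a guess.

```python
from typing import Iterable, Optional


def weighted_factor(value: float,
                    weight: float,
                    limit: Optional[float] = None,
                    grace_period: Optional[float] = None,
                    pending_time: float = 0.0) -> float:
    """Contribution of one factor to the APS value (hypothetical sketch).

    grace_period and pending_time are in the same time unit, for example seconds.
    """
    # The factor is not counted until the job has been pending for longer
    # than the grace period.
    if grace_period is not None and pending_time <= grace_period:
        return 0.0
    # The weight scales the factor value; a weight of 0 removes the factor.
    weighted = weight * value
    # Assumption: a positive limit bounds the absolute value of the weighted factor.
    if limit is not None:
        weighted = max(-limit, min(limit, weighted))
    return weighted


def aps_value(contributions: Iterable[float]) -> float:
    """Sum the weighted factor contributions into a single APS value."""
    return sum(contributions)
```

For example, a factor value of 4.0 with a weight of 10.0 and a limit of 30.0 contributes 30.0 rather than 40.0, and contributes nothing while the job is still inside the grace period.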

Factors and subfactors

The following entries list each factor, its subfactors, and the metric that each one measures.

FS (user based fairshare factor)

The existing fairshare feature tunes the dynamic user priority.

The fairshare factor automatically adjusts the APS value based on dynamic user priority.

FAIRSHARE must be defined in the queue. The FS factor is ignored for non-fairshare queues.

The FS factor is influenced by the following fairshare parameters defined in lsb.queues or lsb.params:

  • CPU_TIME_FACTOR

  • RUN_TIME_FACTOR

  • RUN_JOB_FACTOR

  • HIST_HOURS

RSRC (resource factors)

PROC

Requested tasks is the max of bsub -n min_task, max_task, the min of bsub -n min, or the value of TASKLIMIT in lsb.queues.

MEM

Total real memory requested (in MB or in units set in LSF_UNIT_FOR_LIMITS in lsf.conf).

Memory requests appearing to the right of a || symbol in a usage string are ignored in the APS calculation.

For multi-phase memory reservation, the APS value is based on the first phase of reserved memory.

SWAP

Total swap space requested (in MB or in units set in LSF_UNIT_FOR_LIMITS in lsf.conf).

As with MEM, swap space requests appearing to the right of a || symbol in a usage string are ignored.

WORK (job attributes)

JPRIORITY

The job priority specified by:

  • Default specified by MAX_USER_PRIORITY in lsb.params

  • Users with bsub -sp or bmod -sp

  • Automatic priority escalation with JOB_PRIORITY_OVER_TIME in lsb.params

QPRIORITY

The priority of the submission queue.

ADMIN

Administrators use bmod -aps to set this subfactor value for each job. A positive value increases the APS. A negative value decreases the APS. The ADMIN value is added to the calculated APS value to adjust it. The ADMIN factor applies to the entire job. You cannot configure a separate weight, limit, or grace period for the ADMIN factor. The ADMIN factor takes effect as soon as it is set.

Where LSF gets the job information for each factor

Each entry names a factor or subfactor, followed by the source of its job information.

MEM

The value for jobs submitted with -R "rusage[mem]"

For compound resource requirements submitted with -R "n1*{rusage[mem1]} + n2*{rusage[mem2]}" the value of MEM depends on whether resources are reserved per slot.

  • If RESOURCE_RESERVE_PER_SLOT=N, then MEM=mem1+mem2

  • If RESOURCE_RESERVE_PER_SLOT=Y, then MEM=n1*mem1+n2*mem2

For alternative resource requirements, a plugin considers all alternatives and uses the maximum value for the resource under consideration (SWP or MEM).
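The MEM rules for compound and alternative resource requirements reduce to simple arithmetic, sketched here in Python. The function names and the input shapes are hypothetical; parsing of the -R string is left out, and each compound term is assumed to be already available as an (n, mem) pair.

```python
from typing import List, Tuple


def compound_mem(terms: List[Tuple[int, float]], reserve_per_slot: bool) -> float:
    """MEM for "n1*{rusage[mem1]} + n2*{rusage[mem2]}" given (n, mem) pairs."""
    if reserve_per_slot:
        # RESOURCE_RESERVE_PER_SLOT=Y: MEM = n1*mem1 + n2*mem2
        return sum(n * mem for n, mem in terms)
    # RESOURCE_RESERVE_PER_SLOT=N: MEM = mem1 + mem2
    return sum(mem for _, mem in terms)


def alternative_mem(per_alternative: List[float]) -> float:
    """MEM for alternative requirements: the maximum over all alternatives."""
    return max(per_alternative)
```

With the terms (2, 100) and (4, 50), the result is 150 when resources are not reserved per slot and 400 when they are.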

SWAP

The value for jobs submitted with -R "rusage[swp]"

For compound and alternative resource requirements, SWAP is determined in the same manner as MEM.

PROC


The value of n for jobs submitted with bsub -n (min_task, max_task), or the value of TASKLIMIT in lsb.queues

Task limits can be specified at the job level (bsub -n), the application level (TASKLIMIT), and the queue level (TASKLIMIT). Job-level limits (bsub -n) override application-level TASKLIMIT, which overrides queue-level TASKLIMIT. Job-level limits must fall within the maximum and minimum limits of the application profile and the queue.
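The precedence of task limits can be sketched as a small Python function. This is a hypothetical illustration, not LSF code: each level is represented as a (min, max) pair or None when that level sets no limit, and a job-level request outside the application or queue range is reported by raising ValueError.

```python
from typing import Optional, Tuple

TaskRange = Tuple[int, int]  # (minimum tasks, maximum tasks)


def effective_task_range(job: Optional[TaskRange] = None,
                         app: Optional[TaskRange] = None,
                         queue: Optional[TaskRange] = None) -> Optional[TaskRange]:
    """Pick the task range that applies, following job > application > queue."""
    if job is not None:
        # A job-level request must fall within the application and queue limits.
        for name, bounds in (("application", app), ("queue", queue)):
            if bounds is not None and not (bounds[0] <= job[0] and job[1] <= bounds[1]):
                raise ValueError(f"job-level range {job} is outside {name} limits {bounds}")
        return job
    if app is not None:
        # Application-level TASKLIMIT overrides queue-level TASKLIMIT.
        return app
    return queue
```

A job submitted with a range of (2, 8) against application limits (1, 16) and queue limits (1, 32) keeps its own range; without a job-level request, the application range (1, 16) applies.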

Compound resource requirements by their nature express the number of processors a job requires. The minimum number of processors requested by way of job-level (bsub -n), application-level (TASKLIMIT), and queue-level (TASKLIMIT) must be equal to, and possibly greater than, the number of processors requested through the resource requirement. If the final term of the compound resource requirement does not specify a number of processors, then the relationship is equal to or greater than. If the final term of the compound resource requirement does specify a number of processors, then the relationship is equal to, and the maximum number of processors requested must be equal to the minimum requested. LSF checks only that the default value supplied in TASKLIMIT (the first value of a pair or middle value of three values) is a multiple of a block. Maximum or minimum TASKLIMIT does not need to be a multiple of the block value.

Alternative resource requirements may or may not specify the number of processors a job requires. The minimum number of processors requested by way of job-level (bsub -n), application-level (TASKLIMIT), and queue-level (TASKLIMIT) must be less than or equal to the minimum implied through the resource requirement. The maximum number of processors requested by way of job-level (bsub -n), application-level (TASKLIMIT), and queue-level (TASKLIMIT) must be equal to or greater than the maximum implied through the resource requirement. Any alternative that does not specify the number of processors is assumed to request the range from minimum to maximum, or to request the default number of processors.


JPRIORITY

The dynamic priority of the job, updated every scheduling cycle and escalated at the interval defined by JOB_PRIORITY_OVER_TIME in lsb.params

QPRIORITY

The priority of the job submission queue

FS

The fairshare priority value of the submission user