Most LSF commands accept a -R res_req argument to specify resource requirements. The exact behavior depends on the command. For example, specifying a resource requirement for the lsload command displays the load levels for all hosts that have the requested resources.
Specifying resource requirements for the lsrun command causes LSF to select the best host out of the set of hosts that have the requested resources.
A resource requirement string describes the resources that a job needs. LSF uses resource requirements to select hosts for remote execution and job execution.
Resource requirement strings can be simple (applying to the entire job) or compound (applying to the specified number of slots).
A resource requirement string is divided into sections. Depending on the command, one or more of these sections may apply. For example:
select[selection_string] order[order_string] rusage[usage_string [, usage_string] [|| usage_string] ...] span[span_string] same[same_string] cu[cu_string] affinity[affinity_string]
With the bsub and bmod commands, and only with these commands, you can specify multiple -R order, same, rusage, and select sections. The bmod command does not support the use of the || operator.
The section names are select, order, rusage, span, same, cu, and affinity. Sections that do not apply for a command are ignored.
The square brackets must be typed as shown for each section. A blank space must separate each resource requirement section.
You can omit the select keyword and the square brackets, but the selection string must be the first string in the resource requirement string. If you do not give a section name, the first resource requirement string is treated as a selection string (select[selection_string]).
Each section has a different syntax.
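The section layout described above can be sketched as a small parser. This is an illustration only, not how LSF itself parses the string: the function name split_sections is hypothetical, and the sketch assumes section bodies never contain nested square brackets.

```python
import re

# Section names listed in the text above.
SECTION_NAMES = ("select", "order", "rusage", "span", "same", "cu", "affinity")
_SECTION_RE = re.compile(r"\b(%s)\[([^\]]*)\]" % "|".join(SECTION_NAMES))


def split_sections(res_req):
    """Split a simple resource requirement string into {section: [bodies]}.

    Hypothetical helper for illustration; assumes no nested square brackets.
    """
    sections = {}
    first = _SECTION_RE.search(res_req)
    # Text before the first named section is treated as the selection string,
    # mirroring the rule that the select keyword and brackets can be omitted.
    head = (res_req[: first.start()] if first else res_req).strip()
    if head:
        sections.setdefault("select", []).append(head)
    # A list per section allows repeated sections, as bsub and bmod permit.
    for match in _SECTION_RE.finditer(res_req):
        sections.setdefault(match.group(1), []).append(match.group(2).strip())
    return sections


print(split_sections("swp>100 order[ut] span[hosts=1]"))
```

The bare string swp>100 ends up under the select key, the same place it would land if it had been written as select[swp>100].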
By default, memory (mem) and swap (swp) limits in select[] and rusage[] sections are specified in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for these limits (MB, GB, TB, PB, or EB).
A compound resource requirement string has the following syntax:
num1*{simple_string1} + num2*{simple_string2} + ...
where numx is the number of slots affected and simple_stringx is a simple resource requirement string with the syntax:
select[selection_string] order[order_string] rusage[usage_string [, usage_string]...] span[span_string]
Resource requirements applying to the first execution host (if used) should appear in the first compound term num1*{simple_string1}.
Place specific (harder to fill) requirements before general (easier to fill) requirements since compound resource requirement terms are considered in the order they appear. Resource allocation for parallel jobs using compound resources is done for each compound resource term independently instead of considering all possible combinations.
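To make the term structure concrete, the following sketch splits a compound string into its (slot count, simple string) terms, preserving the order in which the terms are considered. The helper name parse_compound is hypothetical, and the sketch only handles terms with an explicit slot count.

```python
import re

# One compound term: a slot count, "*", and a simple string in braces.
_TERM_RE = re.compile(r"(\d+)\s*\*\s*\{([^{}]*)\}")


def parse_compound(res_req):
    """Return [(num_slots, simple_string), ...] in the order the terms appear.

    Hypothetical helper; terms without an explicit count are not handled.
    """
    return [(int(num), simple.strip()) for num, simple in _TERM_RE.findall(res_req)]


terms = parse_compound(
    "2*{select[type==X86_64] rusage[licA=1] span[hosts=1]} + 8*{select[type==any]}"
)
for slots, simple in terms:
    print(slots, simple)
```

The first tuple in the result corresponds to num1*{simple_string1}, which is where requirements for the first execution host belong.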
For jobs without the number of total slots specified using bsub -n, the final numx can be omitted. The final resource requirement is then applied to the zero or more slots not yet accounted for using the default slot setting of the parameter TASKLIMIT as follows:
For jobs with the total number of slots specified using bsub -n num_slots, the total number of slots must match the number of slots in the resource requirement as follows, and the final numx can be omitted:
For jobs with compound resource requirements and first execution host candidates specified using bsub -m, the first allocated host must satisfy the simple resource requirement string appearing first in the compound resource requirement. Thus the first execution host must satisfy the requirements in simple_string1 for the following compound resource requirement:
Compound resource requirements do not support use of the || operator within the component rusage simple resource requirements, or use of the cu section.
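The slot accounting for an omitted final numx, described above, might be sketched as follows. The arithmetic is one reading of "the zero or more slots not yet accounted for" and of "the total number of slots must match"; the function name and the default_slots argument (standing in for the default taken from TASKLIMIT) are assumptions made for illustration.

```python
def final_term_slots(explicit_counts, num_slots=None, default_slots=1):
    """Slots left for the final compound term when its count is omitted.

    explicit_counts: slot counts of the terms that do specify a number.
    num_slots:       value given with bsub -n, or None if -n was not used.
    default_slots:   assumed stand-in for the default slot setting of TASKLIMIT.
    """
    used = sum(explicit_counts)
    if num_slots is None:
        # Without -n: whatever the default leaves over, never below zero.
        return max(0, default_slots - used)
    remaining = num_slots - used
    if remaining < 0:
        # With -n: the terms cannot ask for more slots than the job total.
        raise ValueError("compound terms request more slots than bsub -n allows")
    return remaining


print(final_term_slots([2, 3], num_slots=8))
print(final_term_slots([2, 3], default_slots=4))
```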
Simple resource requirements can be specified at the job, application, and queue levels. When none of the resource requirements are compound, requirements defined at different levels are resolved in the following ways:
section | simple resource requirement multi-level behavior |
---|---|
select | all levels satisfied |
same | all levels combined |
order span cu | job-level section overwrites application-level section, which overwrites queue-level section (if a given level is present) |
rusage | all levels merge; if conflicts occur, the job-level section overwrites the application-level section, which overwrites the queue-level section |
For internal load indices and duration, jobs are rejected if the merged job-level and application-level resource reservation requirements exceed the requirements specified at the queue level.
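The multi-level rules for simple requirements can be sketched with plain dictionaries. This is a simplified model, not LSF's implementation: the function name is hypothetical, joining selection strings with && is only one way to express "all levels satisfied", rusage is reduced to a resource-to-value mapping, and the rejection rule for internal load indices is not modeled.

```python
def merge_simple_levels(job=None, application=None, queue=None):
    """Merge simple requirements given as {section: value} dicts per level.

    Hypothetical model: "select"/"order"/"span"/"cu" are strings,
    "same" is a list of resource names, "rusage" is a dict.
    """
    # Lowest precedence first, so later levels overwrite earlier ones.
    levels = [level or {} for level in (queue, application, job)]
    merged = {}

    # select: every level's selection string must be satisfied.
    selects = [level["select"] for level in levels if "select" in level]
    if selects:
        merged["select"] = " && ".join("(%s)" % s for s in selects)

    # same: all levels are combined.
    same = []
    for level in levels:
        for resource in level.get("same", []):
            if resource not in same:
                same.append(resource)
    if same:
        merged["same"] = same

    # order, span, cu: job overwrites application, which overwrites queue.
    for name in ("order", "span", "cu"):
        for level in levels:
            if name in level:
                merged[name] = level[name]

    # rusage: merge all levels; on conflict the higher level wins.
    rusage = {}
    for level in levels:
        rusage.update(level.get("rusage", {}))
    if rusage:
        merged["rusage"] = rusage
    return merged


queue = {"select": "type==any", "order": "ut", "rusage": {"mem": 1000, "licA": 1}}
job = {"select": "mem>1000", "order": "r1m", "rusage": {"mem": 800}}
print(merge_simple_levels(job=job, queue=queue))
```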
Compound resource requirements can be specified at the job, application, and queue levels. When one or more of the resource requirements is compound or alternative, requirements at different levels are resolved depending on where the compound resource requirement appears.
For internal load indices and duration, jobs are rejected if they specify resource reservation requirements that exceed the requirements specified at the application level or queue level.
When a compound resource requirement is set for a queue, it will be ignored unless it is the only resource requirement specified (no resource requirements are set at the job level or application level).
When a compound resource requirement is set at the application level, it will be ignored if any job-level resource requirements (simple or compound) are defined.
In the event no job-level resource requirements are set, the compound application-level requirements interact with queue-level resource requirement strings in the following ways:
section | compound application and simple queue behavior |
---|---|
select | both levels satisfied; queue requirement applies to all compound terms |
same | queue level ignored |
order span | application-level section overwrites queue-level section (if a given level is present); queue requirement (if used) applies to all compound terms |
rusage | For example: if the application-level requirement is num1*{rusage[R1]} + num2*{rusage[R2]} and the queue-level requirement is rusage[RQ] where RQ is a job-based resource, the merged requirement is num1*{rusage[merge(R1,RQ)]} + num2*{rusage[R2]} |
When a compound resource requirement is set at the job level, any simple or compound application-level resource requirements are ignored, and any compound queue-level resource requirements are ignored.
In the event a simple queue-level requirement appears along with a compound job-level requirement, the requirements interact as follows:
section | compound job and simple queue behavior |
---|---|
select | both levels satisfied; queue requirement applies to all compound terms |
same | queue level ignored |
order span | job-level section overwrites queue-level section (if a given level is present); queue requirement (if used) applies to all compound terms |
rusage | For example: if the job-level requirement is num1*{rusage[R1]} + num2*{rusage[R2]} and the queue-level requirement is rusage[RQ] where RQ is a job-based resource, the merged requirement is num1*{rusage[merge(R1,RQ)]} + num2*{rusage[R2]} |
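The rules for which levels are ignored when a compound requirement is present can be summarized in a short sketch. The function and helper names are hypothetical, a requirement is treated as compound simply when it contains a num*{...} term, and alternative resource requirements are not modeled.

```python
def is_compound(res_req):
    """Assumed test: a compound string contains at least one num*{...} term."""
    return res_req is not None and "*{" in res_req


def participating_levels(job=None, application=None, queue=None):
    """Return the levels whose requirements are not ignored, highest first."""
    levels = {"job": job, "application": application, "queue": queue}

    # A compound queue requirement only counts when it is the sole requirement.
    if is_compound(queue) and (job is not None or application is not None):
        levels["queue"] = None
    # A compound application requirement is ignored if any job-level one exists.
    if is_compound(application) and job is not None:
        levels["application"] = None
    # A compound job requirement causes any application-level requirement
    # to be ignored (a compound queue requirement was already dropped above).
    if is_compound(job):
        levels["application"] = None

    return [name for name, req in levels.items() if req is not None]


print(participating_levels(
    job="2*{select[type==X86_64]} + 8*{select[type==any]}",
    queue="rusage[perslot=1]",
))
```

For the levels that remain, the section-by-section behavior in the tables above still decides how the strings are combined.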
A compound job requirement and simple queue requirement.
job level: 2*{select[type==X86_64] rusage[licA=1] span[hosts=1]} + 8*{select[type==any]}
application level: not defined
queue level: rusage[perslot=1]
The final job scheduling resource requirement merges the simple queue-level rusage section into each term of the compound job-level requirement, resulting in: 2*{select[type==X86_64] rusage[licA=1:perslot=1] span[hosts=1]} + 8*{select[type==any] rusage[perslot=1]}
A compound job requirement and compound queue requirement.
job level: 2*{select[type==X86_64 && tmp>10000] rusage[mem=1000] span[hosts=1]} + 8*{select[type==X86_64]}
application level: not defined
queue level: 2*{select[type==X86_64] rusage[mem=1000] span[hosts=1]} + 8*{select[type==X86_64]}
The final job scheduling resource requirement ignores the compound queue-level requirement, resulting in: 2*{select[type==X86_64 && tmp>10000] rusage[mem=1000] span[hosts=1]} + 8*{select[type==X86_64]}
A compound job requirement and simple queue requirement where the queue requirement is a job-based resource.
job level: 2*{select[type==X86_64]} + 2*{select[mem>1000]}
application level: not defined
queue level: rusage[licA=1] where licA=1 is job-based.
The queue-level requirement is added to the first term of the compound job-level requirement, resulting in: 2*{select[type==X86_64] rusage[licA=1]} + 2*{select[mem>1000]}
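The two rusage merges shown in the examples above (a queue-level resource applied to every term, and a job-based queue-level resource applied to the first term only) can be reproduced with a small sketch. The data layout and function names are assumptions for illustration, rusage is modeled as a simple resource-to-value mapping, and on a conflict the job-level value is kept.

```python
# Section order follows the simple-string syntax given earlier.
SECTION_ORDER = ("select", "order", "rusage", "span")


def merge_queue_rusage(terms, queue_rusage, job_based=False):
    """Merge a simple queue-level rusage into compound job-level terms.

    terms: list of (slots, {section: value}); "rusage" values are dicts.
    job_based: if True, only the first term receives the queue rusage.
    """
    merged = []
    for index, (slots, sections) in enumerate(terms):
        sections = dict(sections)
        if not job_based or index == 0:
            rusage = dict(sections.get("rusage", {}))
            for resource, value in queue_rusage.items():
                rusage.setdefault(resource, value)  # job-level value wins
            sections["rusage"] = rusage
        merged.append((slots, sections))
    return merged


def format_compound(terms):
    """Render terms back into num1*{...} + num2*{...} form."""
    rendered = []
    for slots, sections in terms:
        parts = []
        for name in SECTION_ORDER:
            if name in sections:
                body = sections[name]
                if name == "rusage":
                    body = ":".join("%s=%s" % item for item in body.items())
                parts.append("%s[%s]" % (name, body))
        rendered.append("%d*{%s}" % (slots, " ".join(parts)))
    return " + ".join(rendered)


job_terms = [
    (2, {"select": "type==X86_64", "rusage": {"licA": 1}, "span": "hosts=1"}),
    (8, {"select": "type==any"}),
]
print(format_compound(merge_queue_rusage(job_terms, {"perslot": 1})))
```

Calling merge_queue_rusage with job_based=True on the terms of the job-based example leaves the second term untouched, matching the result shown there.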
Compound multi-phase job requirements and simple multi-phase queue requirements.
job level: 2*{rusage[mem=(400 350):duration=(10 15):decay=(0 1)]} + 2*{rusage[mem=300:duration=10:decay=1]}
application level: not defined
queue level: rusage[mem=(500 300):duration=(20 10):decay=(0 1)]
The queue-level requirement is overridden by the first term of the compound job-level requirement, resulting in: 2*{rusage[mem=(400 350):duration=(10 15):decay=(0 1)]} + 2*{rusage[mem=300:duration=10:decay=1]}
After the job is submitted, the pending reason given only applies to the first alternative even though LSF is trying the other applicable alternatives.
The combined resource requirement is the result of mbatchd merging job-level, application-level, and queue-level resource requirements for a job.
The effective resource requirement always represents the job's allocation. The effective resource requirement string for scheduled jobs represents the resource requirement that is used by the scheduler to make a dispatch decision. When a job is dispatched, mbschd generates the effective resource requirement for the job from the combined resource requirement according to the job's real allocation.
After the job has started, you can use bmod -R to modify the job's effective resource requirement along with the job allocation. The rusage section of the effective resource is updated with the rusage in the newly combined resource requirement. The other sections in the resource requirement string such as select, order, span, etc. are kept the same during job runtime because they are still used for the job by the scheduler.
For started jobs, you can only modify effective resource requirements from simple to simple. Any request to change effective resource requirements to compound or alternative resource requirements will be rejected. Attempting to modify the resource requirement of a running job to use rusage with OR ("||") branches will also be rejected.
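As a rough pre-check before running bmod -R against a started job, the restrictions above could be approximated as follows. This is a sketch under stated assumptions, not LSF's validation logic: the function name is hypothetical, and it assumes that curly braces appear only in compound and alternative requirement strings.

```python
import re


def running_job_change_allowed(new_res_req):
    """Approximate the simple-to-simple rule for started jobs.

    Assumption: braces occur only in compound or alternative strings.
    """
    if "{" in new_res_req:
        return False
    # "||" is fine inside select[], but not inside an rusage[] section.
    rusage_bodies = re.findall(r"\brusage\[([^\]]*)\]", new_res_req)
    return not any("||" in body for body in rusage_bodies)


print(running_job_change_allowed("select[type==any || type==X86_64] rusage[mem=500]"))
print(running_job_change_allowed("rusage[mem=500 || mem=400]"))
```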
By default, LSF does not modify effective resource requirements and job resource usage when running the bswitch command. However, you can set the BSWITCH_MODIFY_RUSAGE parameter to Y to allow bswitch to update job resource usage according to the resource requirements in the new queue.
When a job finishes, the effective resource requirement last used by the job will be saved in the JOB_FINISH event record of lsb.acct and JOB_FINISH2 of lsb.stream. bjobs -l always displays the effective resource requirement that is used by the job in the resource requirement details.