Preemptive scheduling in LSF allows a pending high-priority job to preempt a running job of lower priority. LSF suspends the lower-priority job and resumes it as soon as possible.
For more information on LSF preemption, refer to Preemptive Scheduling in Administering IBM Platform LSF.
When preempting VM jobs, Dynamic Cluster also allows you to live migrate the VM, save the VM to disk, or requeue the job (that is, kill the job and resubmit it to the queue). If a high-priority job is pending, Dynamic Cluster can preempt a lower-priority VM job, then start the high-priority job after the preemption action finishes on the hypervisor. For this to happen, the lower-priority job must have a specified preemption action, and either the high-priority job must be pending in a preemptive queue (a queue that can preempt other queues), or the lower-priority job must belong to a preemptable queue (a queue that can be preempted by other queues).
If the preemption action is to live migrate the VM, you can also specify a second preemption action to take if the live migration does not start before the specified wait trigger time elapses.
Saving the VM to disk allows the preempted job to continue later. The -dc_vmaction option defines the action that the lower-priority (preempted) job takes upon preemption, not an action that the higher-priority (preempting) job initiates.
With requeuejob, the system kills the VM job and submits a new VM job request to the queue.
With livemigvm, the system migrates the job to the destination host, then releases from the original hypervisor host all resources normally used by the job. During the migration, the job remains in a RUN state.
RHEL KVM hypervisors do not support live migration with VM job checkpointing. Do not use -dc_chkpntvm with -dc_vmaction livemigvm.
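The restriction above could be enforced before submission with a small wrapper check. The following is only a sketch: the function name check_vm_options is hypothetical and not part of LSF or Dynamic Cluster, and it only looks for the presence of the two options in an argument list without validating anything else.

```shell
#!/bin/sh
# Hypothetical pre-submission check (not part of LSF): reject an argument
# list that combines -dc_chkpntvm with a livemigvm preemption action.
check_vm_options() {
    has_chkpnt=0
    has_livemig=0
    prev=""
    for arg in "$@"; do
        if [ "$arg" = "-dc_chkpntvm" ]; then
            has_chkpnt=1
        fi
        # The value that follows -dc_vmaction names the preemption action(s).
        if [ "$prev" = "-dc_vmaction" ]; then
            case $arg in
                livemigvm*) has_livemig=1 ;;
            esac
        fi
        prev=$arg
    done
    if [ "$has_chkpnt" -eq 1 ] && [ "$has_livemig" -eq 1 ]; then
        echo "error: -dc_chkpntvm cannot be combined with -dc_vmaction livemigvm" >&2
        return 1
    fi
    return 0
}
```

A wrapper script might call check_vm_options "$@" and pass the same arguments on to bsub only when the check returns 0.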
You can also specify a second preemption action to trigger if the live migration action fails. Separate the two actions with a space and enclose both in quotation marks.
In addition, livemigvm accepts parameters. Enclose the parameters in square brackets ([]) and separate them with colons (:):
bsub -dc_tmpl rhel58_vm -dc_mtype vm -dc_vmaction \
"livemigvm[wait_trigger_time=100:livemig_max_downtime=0:livemig_max_exectime=1000] requeuejob" \
myjob
If the live migration fails (because the wait trigger time or the maximum execution time is exceeded), the requeuejob action triggers.
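Because the action string mixes brackets, colons, a space, and quotation marks, it can be convenient to assemble it in a script. The following is a minimal sketch: the helper name build_vmaction and its argument order are hypothetical, and only the parameter names and bsub options are taken from the example above. The script prints the resulting command rather than running it, so it does not require an LSF cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): build a -dc_vmaction value from the
# three livemigvm parameters and an optional second (fallback) action.
build_vmaction() {
    wait_trigger=$1      # value for wait_trigger_time
    max_downtime=$2      # value for livemig_max_downtime
    max_exectime=$3      # value for livemig_max_exectime
    fallback=${4:-}      # optional second action, for example requeuejob

    # Parameters go inside square brackets, separated by colons.
    action="livemigvm[wait_trigger_time=${wait_trigger}:livemig_max_downtime=${max_downtime}:livemig_max_exectime=${max_exectime}]"
    # A second action, if given, follows after a single space.
    if [ -n "$fallback" ]; then
        action="$action $fallback"
    fi
    printf '%s\n' "$action"
}

vmaction=$(build_vmaction 100 0 1000 requeuejob)
# Print the command for inspection; the whole action string stays quoted.
echo bsub -dc_tmpl rhel58_vm -dc_mtype vm -dc_vmaction "\"$vmaction\"" myjob
```

When the command is run for real, passing "$vmaction" in double quotation marks keeps both actions together as a single argument to -dc_vmaction.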