
VM job preemption

Preemptive scheduling in LSF allows a pending high-priority job to preempt a running job of lower priority. LSF suspends the lower-priority job and resumes it as soon as possible.

For more information on LSF preemption, refer to Preemptive Scheduling in Administering IBM Platform LSF.

When preempting VM jobs, Dynamic Cluster also allows you to live migrate the VM, save the VM to disk, or requeue the job (that is, kill the job and resubmit it to the queue). If a high-priority job is pending, Dynamic Cluster can preempt a lower-priority VM job, then start the high-priority job after the preemption action finishes on the hypervisor. For this to happen, the lower-priority job must have a specified preemption action, and either the high-priority job must be pending in a preemptive queue (a queue that can preempt other queues), or the lower-priority job must belong to a preemptable queue (a queue that can be preempted by other queues).

If the preemption action is to live migrate the VM, you can also specify a second preemption action to take if the live migration does not start before the specified wait trigger time.

To specify a preemption action for the VM job, use the -dc_vmaction action_name option at submission time, or specify DC_JOBVM_PREEMPTION_ACTION="action_name" in lsb.applications (for jobs submitted to the specified application profile), where action_name is one of the following:
  • -dc_vmaction savevm: Save the VM.

    Saving the VM allows this job to continue later on. This option defines the action that the lower priority (preempted) job should take upon preemption, not the one the higher priority (preempting) job should initiate.

  • -dc_vmaction requeuejob: Kill the VM job and resubmit it to the queue.

    The system kills the VM job and submits a new VM job request to the queue.

  • -dc_vmaction livemigvm: Live migrate the VM (and the job running on it) from one hypervisor host to another.

    The system migrates the job to the destination host, then releases all resources normally used by the job from the hypervisor host. During this time, the job remains in a RUN state.

    RHEL KVM hypervisors do not support live migration with VM job checkpointing. Do not use -dc_chkpntvm with -dc_vmaction livemigvm.

    You can also specify a second preemption action to trigger if the live migration action fails. Use a space to separate the two actions and quotation marks to enclose the two actions.

    In addition, livemigvm has parameters to specify as arguments. Use a colon (:) to separate the different parameters and square brackets ([]) to enclose the parameters:

    wait_trigger_time=integer
    The amount of time, in seconds, to wait for the live migration to start before taking the second preemption action. This parameter applies only if you also specified a second preemption action to trigger if the live migration fails. The default value is infinite (that is, Dynamic Cluster waits indefinitely for the live migration to trigger).
    livemig_max_downtime=integer
    The maximum amount of time that a VM can be down during a live migration. This is the time from when the VM is stopped on the source hypervisor to when it is started on the target hypervisor. If the live migration cannot guarantee this down time, the system retries the live migration until it can guarantee this maximum down time (or until the livemig_max_exectime value is reached). Specify the value in seconds, or specify 0 to use the hypervisor default for the down time. The default value is 0.
    livemig_max_exectime=integer
    The maximum amount of time that the system can attempt a live migration. If the live migration cannot guarantee the down time (as specified by the livemig_max_downtime parameter) within this amount of time, the live migration fails. Specify the value in seconds from 1 to 2147483646. The default value is 2147483646.
    For example,
    bsub -dc_tmpl rhel58_vm -dc_mtype vm -dc_vmaction \
    "livemigvm[wait_trigger_time=100:livemig_max_downtime=0:livemig_max_exectime=1000] requeuejob" \
    myjob

    If the live migration fails (because the trigger time or execute time is exceeded), the requeuejob action triggers.
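
The action string in the example above can be assembled and sanity-checked before submission. The following is a minimal sketch, not part of LSF: build_livemig_action is a hypothetical helper name, and accepting only savevm or requeuejob as the second action is an assumption of this sketch. The range check on livemig_max_exectime follows the limits documented above.

```shell
#!/bin/sh
# Hypothetical helper, not part of LSF: prints a -dc_vmaction argument for
# live migration with an optional second (fallback) action.
# Usage: build_livemig_action WAIT_TRIGGER_TIME MAX_DOWNTIME MAX_EXECTIME [FALLBACK]
build_livemig_action() {
    wait_time=${1:-}
    max_down=${2:-}
    max_exec=${3:-}
    fallback=${4:-}

    # Each parameter must be a non-negative integer of at most 10 digits,
    # which keeps the numeric comparisons below within shell integer range.
    for value in "$wait_time" "$max_down" "$max_exec"; do
        case $value in
            '' | *[!0-9]*)
                echo "error: '$value' is not a non-negative integer" >&2
                return 1
                ;;
        esac
        if [ "${#value}" -gt 10 ]; then
            echo "error: '$value' is too large" >&2
            return 1
        fi
    done

    # livemig_max_exectime is documented as 1 to 2147483646 seconds.
    if [ "$max_exec" -lt 1 ] || [ "$max_exec" -gt 2147483646 ]; then
        echo "error: livemig_max_exectime must be between 1 and 2147483646" >&2
        return 1
    fi

    # Assumption of this sketch: only the other documented actions are
    # accepted as the second action.
    case $fallback in
        '' | savevm | requeuejob) ;;
        *)
            echo "error: unsupported fallback action '$fallback'" >&2
            return 1
            ;;
    esac

    # Colons separate the parameters; a space separates the two actions.
    action="livemigvm[wait_trigger_time=${wait_time}:livemig_max_downtime=${max_down}:livemig_max_exectime=${max_exec}]"
    if [ -n "$fallback" ]; then
        action="$action $fallback"
    fi
    printf '%s\n' "$action"
}
```

With this helper defined, the earlier submission could be written as follows (the double quotes keep the two actions together as one argument):

    bsub -dc_tmpl rhel58_vm -dc_mtype vm -dc_vmaction "$(build_livemig_action 100 0 1000 requeuejob)" myjob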

Note: RHEL KVM hypervisors will not live migrate a VM that has a snapshot. When submitting jobs to RHEL KVM hypervisors, do not use VM job checkpointing (bsub -dc_chkpntvm option) with live migration.
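
As an illustration of the lsb.applications alternative to -dc_vmaction, an application profile might look like the following sketch. The profile name vm_requeue and its description are hypothetical. The Begin Application and End Application lines are the usual application profile layout, and the parameter line uses the form given earlier in this section.

```
# vm_requeue is a hypothetical profile name used only for illustration.
Begin Application
NAME        = vm_requeue
DESCRIPTION = VM jobs that are requeued when preempted
DC_JOBVM_PREEMPTION_ACTION="requeuejob"
End Application
```

Under these assumptions, a VM job submitted to the profile with bsub -app vm_requeue, along with the usual Dynamic Cluster options, would use requeuejob as its preemption action without specifying -dc_vmaction on the command line.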