Introduction

This chapter describes the IBM Platform Dynamic Cluster (Dynamic Cluster) system architecture and basic concepts. It also explains the benefits of using Dynamic Cluster and the concepts required to effectively administer an IBM Platform LSF (LSF) cluster with Dynamic Cluster enabled.

This guide assumes that you have a good knowledge of standard LSF features, as well as a working familiarity with common virtual infrastructure concepts such as hypervisors and virtual machines.

High-level architecture

Broadly speaking, Dynamic Cluster can be thought of as an add-on to LSF that provides the following benefits:

- Dynamically create new virtual machines to satisfy job demands
- Allow jobs to be saved and migrated to another host to release the resources of a priority host
- Restrict job memory usage on a host by running it in a virtual machine, avoiding the possibility of one job hoarding all of a host's memory and interfering with other running jobs

The following figure shows the high-level architecture of Dynamic Cluster.

The Dynamic Cluster system consists of two main components:

- The Dynamic Cluster module built into LSF that makes the scheduler provisioning-aware
- IBM Platform Cluster Manager Advanced Edition (Platform Cluster Manager), which is a product for managing infrastructure resources.
  PVMO (Physical and Virtual Machine Orchestrator) is the component of Platform Cluster Manager that interacts with the underlying virtual provisioning systems. The Platform Cluster Manager master host cannot be used to run virtual machines.

Note:

This means that an installation of Dynamic Cluster requires an installation of both LSF and Platform Cluster Manager. You can configure which hosts in your cluster can participate in dynamic provisioning. This allows you to isolate Dynamic Cluster functionality to as small or as large a subset of a standard LSF cluster as you wish.

Dynamic provisioning

Dynamic Cluster supports virtual machine provisioning. With the Dynamic Cluster module enabled, the LSF scheduler is provisioning-aware, so there are no race conditions or issues with "two-brain" scheduling.

The general flow of workload-driven job scheduling in Dynamic Cluster is as follows:

1. A user submits a job and specifies its machine resource requirements, including the desired OS/application stack, machine type, number of CPUs, and memory.
2. According to these resource requirements and the configured LSF policies, Dynamic Cluster works with the LSF scheduler to select suitable hosts to run the job.
3. If the selected machines match the job-level resources requested, LSF dispatches the job right away. Otherwise, LSF generates a machine provisioning request and communicates it to Platform Cluster Manager, which connects to external virtual provisioning systems to provision machines. During provisioning, LSF reserves the selected Dynamic Cluster host's memory and CPU resources for the job.
4. Once the provisioned machine is up and connects to the LSF master host, LSF dispatches the job to the selected machine.
5. When the job completes, the machine remains provisioned and is able to serve new workload.
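
The flow above can be sketched as a small decision function. This is an illustrative model only, not LSF or Platform Cluster Manager code: the `JobRequest`, `Host`, and `schedule` names and their fields are hypothetical, and the sketch omits the configured LSF policies that the real scheduler applies.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class JobRequest:
    template: str   # desired OS/application stack (hypothetical field)
    cpus: int
    mem_mb: int


@dataclass
class Host:
    name: str
    free_cpus: int
    free_mem_mb: int
    # Template of a machine already provisioned on this host, if any.
    provisioned_template: Optional[str] = None
    reserved_cpus: int = 0
    reserved_mem_mb: int = 0


def schedule(job: JobRequest, hosts: list[Host]) -> tuple[str, Optional[Host]]:
    """Return ("dispatch" | "provision" | "pend", chosen host)."""
    candidates = [h for h in hosts
                  if h.free_cpus >= job.cpus and h.free_mem_mb >= job.mem_mb]
    if not candidates:
        return "pend", None

    # A machine that already matches the request can take the job right away.
    # (Resource accounting for dispatched jobs is left out of this sketch.)
    for host in candidates:
        if host.provisioned_template == job.template:
            return "dispatch", host

    # Otherwise reserve CPU and memory on a host while the machine is provisioned.
    host = candidates[0]
    host.free_cpus -= job.cpus
    host.free_mem_mb -= job.mem_mb
    host.reserved_cpus += job.cpus
    host.reserved_mem_mb += job.mem_mb
    return "provision", host
```

The reservation in the "provision" branch models the point that the scheduler itself holds the resources during provisioning, so no second scheduler can hand them to another job in the meantime.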

Without Dynamic Cluster, LSF finds suitable resources and schedules jobs, but the resource attributes are fixed, and some jobs may be pending while resources that do not match the job’s requirements are idle. With Dynamic Cluster, idle resources that do not match job requirements are dynamically repurposed, so that LSF can schedule the pending jobs.

Dynamic Cluster can provision the machine type that is most appropriate for the workload:

- At submission time, jobs can demand to run on physical machines to assure greater performance.
- Other jobs can be contained in virtual machines (VMs) for greater flexibility.
- VM memory and CPU allocations can be modified when the VMs are powered on.

Dynamic Cluster hosts in the cluster are flexible resources. If workload requirements are constantly changing, and different types of workload require different execution environments, Dynamic Cluster can dynamically provision infrastructure according to workload needs (OS, memory, CPUs).

Maximize resource utilization

With Dynamic Cluster, you can keep hardware and license utilization high without affecting the service level for high-priority workload. Instead of reserving important resources for critical workload, you can use the resources to run low-priority workload, and then preempt those jobs when a high-priority job arrives.

Migration is driven by workload priority.

Without Dynamic Cluster, if the LSF job runs on a physical machine, the job is not mobile. The low-priority job must be terminated and rescheduled if it is preempted. If preemption occurs frequently, the job may be started and restarted several times, using up valuable resources without ever completing.

With Dynamic Cluster, the low-priority job can run on a VM, and if the job is preempted, the VM and job can be saved. When the VM is restored, the job continues and eventually finishes without wasting any resources.
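
The difference between the two cases can be illustrated with a toy calculation. The function below is a simplified model under stated assumptions, not a description of how LSF accounts for resources: it assumes that a job is preempted after a fixed amount of run time on every attempt, and the name `cpu_time_used` and its parameters are hypothetical.

```python
def cpu_time_used(work, run_before_preempt, max_attempts, resumable):
    """Return (completed, total_cpu_time) for a job that needs `work` time
    units and is preempted after every `run_before_preempt` units."""
    done = 0   # progress the job currently holds
    used = 0   # total CPU time consumed across all attempts
    for _ in range(max_attempts):
        ran = min(work - done, run_before_preempt)
        used += ran
        done += ran
        if done >= work:
            return True, used
        if not resumable:
            done = 0  # a terminated job loses its progress and starts over
    return False, used


# A 10-unit job preempted every 4 units, given at most 5 attempts.
print(cpu_time_used(10, 4, 5, resumable=True))   # saved and restored on a VM
print(cpu_time_used(10, 4, 5, resumable=False))  # terminated and rescheduled
```

In this model the resumable job finishes after consuming exactly the 10 units it needs, while the terminated job consumes 20 units across five attempts and still does not complete.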

Running workload is packed onto as few hypervisors as possible. This maximizes availability for new jobs and minimizes the need for migration.
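
As an illustration of what packing means here, the following sketch places VMs onto hypervisors with a first-fit decreasing heuristic based on memory alone. This guide does not describe the placement algorithm that Dynamic Cluster actually uses, so the heuristic, the `pack_vms` name, and the assumption of equally sized hypervisors are all hypothetical.

```python
from __future__ import annotations


def pack_vms(vm_mem_mb: list[int], hypervisor_mem_mb: int) -> list[list[int]]:
    """Group VM memory sizes onto equally sized hypervisors (first-fit decreasing)."""
    hypervisors: list[list[int]] = []
    for vm in sorted(vm_mem_mb, reverse=True):
        if vm > hypervisor_mem_mb:
            raise ValueError(f"VM of {vm} MB does not fit on any hypervisor")
        for placed in hypervisors:
            # Put the VM on the first hypervisor that still has room for it.
            if sum(placed) + vm <= hypervisor_mem_mb:
                placed.append(vm)
                break
        else:
            # No existing hypervisor has room, so start using another one.
            hypervisors.append([vm])
    return hypervisors


print(pack_vms([4096, 2048, 2048, 1024, 6144], 8192))
```

Here five VMs fit on two 8192 MB hypervisors, which leaves any remaining hypervisors completely free for new jobs.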

Restrict resource usage with VMs

Users of HPC applications cannot always predict the memory or CPU usage of a job. Without Dynamic Cluster, a job might unexpectedly use more resources than it asked for and interfere with other workload running on the execution host.

When Dynamic Cluster jobs run on VMs, one physical host can run many jobs, and each job is isolated in its own environment. For example, a job that runs out of memory and fails does not interfere with other jobs running on the same host.

Dynamic Cluster concepts

Dynamic Cluster hosts:
- Any number of physical hosts in the LSF cluster can become Dynamic Cluster hosts. Dynamic Cluster hosts are physical machines that are managed by Platform Cluster Manager and can run LSF jobs. Dynamic Cluster manages the hypervisors on which to run virtual machines.
- Dynamic Cluster hosts are identified in LSF by tagging them with the resource dchost in the LSF cluster file.
- The remaining physical hosts in the cluster are ordinary LSF hosts that cannot be repurposed based on workload demand.
Job VMs:
- Job VMs are the virtual machines that LSF treats specially in Dynamic Cluster. They are a special kind of LSF host that serves only as an execution container for the LSF job, and not as a scheduling unit.
- These machines are identified in LSF by the resource jobvm. The suggested installation described in this guide configures all dynamically created virtual machines to provide the jobvm resource.
Dynamic Cluster machine templates:
- A template can be viewed as a machine image containing an operating system and an application stack that can be used to instantiate a VM.
- Platform Cluster Manager requires that each provisioning request it receives references a template that it is aware of. This template is used to load an OS installation onto the target machine.
- Dynamic Cluster extends the notion of templates in Platform Cluster Manager. When a Dynamic Cluster user submits a job, that user must specify at least one of the Dynamic Cluster templates, each of which consists of:
  - The Platform Cluster Manager template. The host that this job will run on must be an instance of this template, whether it already exists or must be provisioned.
  - An optional post-provisioning script. This user-defined script is executed on the virtual machine after the virtual machine is powered on, allowing for further customization.
    The post-provisioning script executes only the first time the virtual machine is powered on. It does not execute on subsequent boots.
- Use the any template if you do not care which template the job runs in (Windows or Linux).
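
The template concepts above can be modeled as a small data structure. This is a conceptual sketch only and does not reflect how Dynamic Cluster stores or configures templates: the `DCTemplate` class, its field names, and both helper functions are hypothetical. Only the `any` template name is taken from this guide.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

ANY_TEMPLATE = "any"  # matches whichever template the host was built from


@dataclass(frozen=True)
class DCTemplate:
    name: str
    pcm_template: str                  # Platform Cluster Manager template
    post_script: Optional[str] = None  # optional post-provisioning script


def host_satisfies(requested: str, host_pcm_template: str,
                   templates: dict[str, DCTemplate]) -> bool:
    """Check whether a host is an instance of the requested template."""
    if requested == ANY_TEMPLATE:
        return True
    return templates[requested].pcm_template == host_pcm_template


def should_run_post_script(template: DCTemplate, power_on_count: int) -> bool:
    """The script runs only the first time the virtual machine is powered on."""
    return template.post_script is not None and power_on_count == 1
```

Keeping the Platform Cluster Manager template and the script as separate fields mirrors the description above: the first decides which image a host must be an instance of, and the second only customizes a newly created virtual machine.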

Compatibility notes

Platform Cluster Manager must be installed to manage the virtual machines.
To use Dynamic Cluster 9.1.3, you must complete a fresh installation of LSF 9.1.3.
Dynamic Cluster can be enabled for some or all of the hosts in an existing cluster.
Supported hypervisor operating systems:
- RHEL version 6.3 KVM with the following patches:
  - kernel-2.6.32-279.14.1.el6
  - libvirt-0.9.10-21.el6_3.5
  - qemu-kvm-0.12.1.2-2.295.el6_3.2
- VMware 5.x
Supported virtual machine guest operating systems:
- RHEL version 4.x, version 5.x, version 6.x (64-bit)
- Windows 7 (32-bit)
- Windows 2008 (64-bit)