While traditional LSF job submission, scheduling, and dispatch methods such as job arrays or job chunking are well suited to a mix of long and short running jobs, or jobs with dependencies on each other, Session Scheduler is ideal for large volumes of independent jobs with short run times.
As clusters grow and the volume of workload increases, the need to delegate scheduling decisions increases. Session Scheduler improves throughput and performance of the LSF scheduler by enabling multiple tasks to be submitted as a single LSF job.
Session Scheduler implements a hierarchical, personal scheduling paradigm that provides very low-latency execution. With very low latency per job, Session Scheduler is ideal for executing very short jobs, whether they are a list of tasks or job arrays with parametric execution.
The Session Scheduler lets users run large collections of short-duration tasks within the allocation of an LSF job. It uses a job-level task scheduler that allocates resources for the job once and reuses the allocated resources for each task.
Each Session Scheduler is dynamically scheduled in a similar manner to a parallel job. Each instance of the ssched command then manages its own workload within its assigned allocation. Work is submitted as a task array or a task definition file.
Session Scheduler is designed to:
Minimize the latency when scheduling short jobs
Improve overall cluster utilization and system performance
Allocate resources according to LSF policies
Support existing LSF pre-execution and post-execution programs, job starters, resource limits, and so on
Handle thousands of users and more than 50,000 short jobs per user
lsf9.1.3_ssched_lnx26-libc23-x64.tar.Z
Note: These libraries may not be installed by default by all Linux distributions.
libstdc++.so.5
libpthread-2.3.4.so or later
Red Hat Enterprise Linux AS 3 or later
SUSE Linux Enterprise Server 10
Session Scheduler is included with Platform LSF Advanced Edition and is available as an add-on for other editions of Platform LSF.
Job: A traditional LSF job that is individually scheduled and dispatched to sbatchd by mbatchd and mbschd.
Task: Similar to a job, a unit of workload that describes an executable and its environment that runs on an execution node. Tasks are managed and dispatched by the Session Scheduler.
Job session: An LSF job that is individually scheduled by mbatchd, but is not dispatched as an LSF job. Instead, a running Session Scheduler job session represents an allocation of nodes for running large collections of tasks.
Session Scheduler: The component that accepts and dispatches tasks within the nodes allocated for a job session.
Session Scheduler jobs are submitted, scheduled, and dispatched like normal LSF jobs.
When the Session Scheduler begins running, it starts a Session Scheduler execution agent on each host in its allocation.
The Session Scheduler then reads in the task definition file, which contains a list of tasks to run. Tasks are sent to an execution agent and run. When a task finishes, the next task in the list is dispatched to the available host. This continues until all tasks have been run.
Tasks submitted through Session Scheduler bypass the LSF mbatchd and mbschd. The LSF mbatchd is unaware of individual tasks.
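The dispatch loop described above (hand the next task in the list to whichever execution agent becomes available) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the ssched implementation: the function name `simulate_dispatch`, the host names, and the run times are hypothetical, each host is assumed to run one task at a time, and dispatch overhead is ignored.

```python
import heapq


def simulate_dispatch(task_runtimes, hosts):
    """Simulate in-order dispatch: each task goes to the host that frees up first.

    task_runtimes: run time of each task, in list order (hypothetical values).
    hosts: names of the hosts in the allocation (hypothetical names).
    Returns (schedule, makespan), where schedule holds
    (task index, host, start time, finish time) tuples.
    """
    # Min-heap of (time at which the host becomes free, host name).
    # All hosts are idle at time 0.
    free_at = [(0.0, host) for host in hosts]
    heapq.heapify(free_at)

    schedule = []
    for index, runtime in enumerate(task_runtimes):
        start, host = heapq.heappop(free_at)      # next available host
        finish = start + runtime
        schedule.append((index, host, start, finish))
        heapq.heappush(free_at, (finish, host))   # host is busy until 'finish'

    makespan = max((finish for _, _, _, finish in schedule), default=0.0)
    return schedule, makespan


if __name__ == "__main__":
    schedule, makespan = simulate_dispatch([3, 1, 2, 2], ["hostA", "hostB"])
    for index, host, start, finish in schedule:
        print(f"task {index} on {host}: {start} to {finish}")
    print("all tasks finished at", makespan)
```

The model touches only the task list and the set of hosts, which mirrors the point above that individual tasks never pass through mbatchd or mbschd.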
Session Scheduler comprises the following components.
The ssched command accepts and dispatches tasks within the nodes allocated for a job session. It reads the task definition file and sends tasks to the execution agents. ssched also logs errors, performs task accounting, and requeues tasks as necessary.
The execution agents run on each remote host in the allocation. They set up the task execution environment, run the tasks, and enable task monitoring and resource usage collection.
Average Runtime (seconds) | Recommended maximum allocation size (slots) |
---|---|
0 | 12 |
5 | 64 |
15 | 256 |
30 | 512 |
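As a rough aid for applying the sizing table, the lookup below returns the recommended maximum allocation size for a given average task run time. The source gives no rule for run times that fall between the listed rows, so this sketch assumes the conservative choice of rounding down to the nearest listed run time; the function name `recommended_max_slots` is hypothetical.

```python
# (average task run time in seconds, recommended maximum allocation size in slots),
# copied from the sizing table.
RECOMMENDED_MAX_SLOTS = [(0, 12), (5, 64), (15, 256), (30, 512)]


def recommended_max_slots(avg_runtime_seconds):
    """Return the recommended maximum slots for an average task run time.

    Assumption (not stated in the source): run times between two table rows
    use the lower row, which gives the smaller, more conservative allocation.
    """
    if avg_runtime_seconds < 0:
        raise ValueError("average run time cannot be negative")
    slots = RECOMMENDED_MAX_SLOTS[0][1]
    for threshold, max_slots in RECOMMENDED_MAX_SLOTS:
        if avg_runtime_seconds >= threshold:
            slots = max_slots
    return slots


if __name__ == "__main__":
    for runtime in (2, 5, 20, 120):
        print(runtime, "seconds:", recommended_max_slots(runtime), "slots")
```

Run times above 30 seconds return the last row's value, because the table does not list larger allocations.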