About job submission and execution controls

The job submission and execution controls feature uses the executables esub and eexec to control job options and the job execution environment.

External submission (esub)

An esub is an executable that you write to meet the job requirements at your site. The following are some of the things that you can use an esub to do:
  • Validate job options

  • Change the job options specified by a user

  • Change user environment variables on the submission host (at job submission only)

  • Reject jobs (at job submission only)

  • Pass data to stdin of eexec

When a user submits a job using bsub or modifies a job using bmod, LSF runs the esub executable(s) on the submission host before accepting the job. If the user submitted the job with options such as -R to specify required resources or -q to specify a queue, an esub can change the values of those options to conform to resource usage policies at your site.

Note:

When compound resource requirements are used at any level, an esub can create job-level resource requirements which overwrite most application-level and queue-level resource requirements.

An esub can also change the user environment on the submission host prior to job submission so that when LSF copies the submission host environment to the execution host, the job runs on the execution host with the values specified by the esub. For example, an esub can add user environment variables to those already associated with the job.

[Figure: Use of esub not enabled]

[Figure: With esub enabled]

An esub executable is typically used to enforce site-specific job submission policies and command-line syntax by validating or pre-parsing the command line. The file indicated by the environment variable LSB_SUB_PARM_FILE stores the option values submitted by the user. An esub reads this file and then accepts the option values, changes them, or rejects the job. Because an esub runs before job submission, using an esub to reject incorrect job submissions improves overall system performance by reducing the load on the master batch daemon (mbatchd).
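As a rough illustration of the "read the parameter file" step, the following Python sketch parses a file of NAME=value lines into a dictionary. The file layout (one option per line, values optionally wrapped in double quotes) and the option names LSB_SUB_QUEUE and LSB_SUB_NUM_PROCESSORS are assumptions not stated in this section; check them against the esub reference for your LSF version. The helper names parse_parm_file and load_submitted_options are hypothetical.

```python
import os


def parse_parm_file(text):
    """Parse NAME=value lines into a dict (assumed file layout)."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not NAME=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        value = value.strip()
        # Remove one pair of surrounding double quotes, if present
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        options[name.strip()] = value
    return options


def load_submitted_options(environ=os.environ):
    """Read the file named by LSB_SUB_PARM_FILE and parse it."""
    with open(environ["LSB_SUB_PARM_FILE"]) as handle:
        return parse_parm_file(handle.read())


# Option names below are assumed examples, not taken from this section
sample = 'LSB_SUB_QUEUE="normal"\nLSB_SUB_NUM_PROCESSORS=4\n'
print(parse_parm_file(sample))
```

All values are kept as strings here; an esub that compares numeric options would convert them explicitly.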

An esub can be used to:
  • Reject any job that requests more than a specified number of CPUs

  • Change the submission queue for specific user accounts to a higher priority queue

  • Check whether the job specifies an application and, if so, submit the job to the correct application profile
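The first two policies in the list above could be sketched as a single decision function. This is only an outline under stated assumptions: the option names LSB_SUB_NUM_PROCESSORS and LSB_SUB_QUEUE, the user name, the CPU limit, and the queue name are all illustrative, and the function apply_policy is hypothetical. How an esub reports a rejection or a changed option back to LSF is not covered in this section, so the sketch stops at the decision.

```python
def apply_policy(options, user, max_cpus=8,
                 priority_users=("alice",), priority_queue="priority"):
    """Return ("reject", reason) or ("accept", changed_options)."""
    # Treat a missing CPU count as a single-CPU request (assumption)
    cpus = int(options.get("LSB_SUB_NUM_PROCESSORS", "1"))
    if cpus > max_cpus:
        return "reject", f"job requests {cpus} CPUs; the limit is {max_cpus}"

    changes = {}
    # Move selected users to the higher-priority queue
    if user in priority_users and options.get("LSB_SUB_QUEUE") != priority_queue:
        changes["LSB_SUB_QUEUE"] = priority_queue
    return "accept", changes


print(apply_policy({"LSB_SUB_NUM_PROCESSORS": "16"}, "bob"))
print(apply_policy({"LSB_SUB_QUEUE": "normal"}, "alice"))
```

Keeping the policy as a pure function of the submitted options makes it easy to test outside the cluster before installing the esub in LSF_SERVERDIR.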

Note:

If an esub executable fails, the job will still be submitted to LSF.

Multiple esub executables

LSF provides a master external submission executable (LSF_SERVERDIR/mesub) that supports the use of application-specific esub executables. Users can specify one or more esub executables using the -a option of bsub or bmod. When a user submits or modifies a job or when a user restarts a job that was submitted or modified with the -a option included, mesub runs the specified esub executables.

An LSF administrator can specify one or more mandatory esub executables by defining the parameter LSB_ESUB_METHOD in lsf.conf. If a mandatory esub is defined, mesub runs the mandatory esub for all jobs submitted to LSF in addition to any esub executables specified with the -a option.

The naming convention is esub.application. LSF always runs the executable named "esub" (without .application) if it exists in LSF_SERVERDIR.

Note:

All esub executables must be stored in the LSF_SERVERDIR directory defined in lsf.conf.

The mesub runs multiple esub executables in the following order:
  1. The mandatory esub or esubs specified by LSB_ESUB_METHOD in lsf.conf

  2. Any executable with the name "esub" in LSF_SERVERDIR

  3. One or more esubs in the order specified by bsub -a

[Figure: Example of multiple esub execution]

An esub runs only once, even if it is specified by both the bsub -a option and the parameter LSB_ESUB_METHOD.
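The ordering and run-once rules can be modeled with a short Python sketch. It is not how mesub is implemented. The function esub_run_order and the application names "dce" and "fluent" are hypothetical, and the sketch assumes that a duplicated esub runs at its first position in the order, which this section does not state explicitly.

```python
def esub_run_order(mandatory, default_esub_exists, requested):
    """List esub executables in run order, each at most once.

    mandatory: application names from LSB_ESUB_METHOD
    default_esub_exists: True if LSF_SERVERDIR contains a plain "esub"
    requested: application names given with bsub -a, in order
    """
    candidates = [f"esub.{name}" for name in mandatory]
    if default_esub_exists:
        candidates.append("esub")
    candidates += [f"esub.{name}" for name in requested]

    order = []
    for executable in candidates:
        # Keep only the first occurrence (assumed position for duplicates)
        if executable not in order:
            order.append(executable)
    return order


print(esub_run_order(["dce"], True, ["fluent", "dce"]))
# ['esub.dce', 'esub', 'esub.fluent']
```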

External execution (eexec)

An eexec is an executable that you write to control the job environment on the execution host.

[Figure: Use of eexec not enabled]

[Figure: With eexec enabled]

The following are some of the things that you can use an eexec to do:
  • Monitor job state or resource usage

  • Receive data from stdout of esub

  • Run a shell script to create and populate environment variables needed by jobs

  • Monitor the number of tasks running on a host and raise a flag when this number exceeds a pre-determined limit

  • Pass DCE credentials and AFS tokens using a combination of esub and eexec executables; LSF functions as a pipe for passing data from the stdout of esub to the stdin of eexec

For example, if you have a mixed UNIX and Windows cluster, the submission and execution hosts might use different operating systems. In this case, the submission host environment might not meet the job requirements when the job runs on the execution host. You can use an eexec to set the correct user environment between the two operating systems.

Typically, an eexec executable is a shell script that creates and populates the environment variables required by the job. An eexec can also monitor job execution and enforce site-specific resource usage policies.

If an eexec executable exists in the directory specified by LSF_SERVERDIR, LSF invokes that eexec for all jobs submitted to the cluster. By default, LSF runs eexec on the execution host before the job starts. The job process that invokes eexec waits for eexec to finish before continuing with job execution.

Unlike a pre-execution command defined at the job, queue, or application levels, an eexec:
  • Runs at job start, finish, or checkpoint

  • Allows the job to run without pending if eexec fails; eexec has no effect on the job state

  • Runs for all jobs, regardless of queue or application profile
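Because an eexec can run at job start, finish, or checkpoint, it usually needs to know which phase invoked it. The sketch below assumes that LSF exposes the phase in an environment variable named LS_EXEC_T with the values START, END, and CHKPNT. Neither the variable nor its values appear in this section, so treat both as assumptions to verify against the eexec reference for your LSF version. The function eexec_phase is hypothetical.

```python
import os


def eexec_phase(environ=os.environ):
    """Map the assumed LS_EXEC_T value to a phase label."""
    phases = {"START": "start", "END": "finish", "CHKPNT": "checkpoint"}
    # Any missing or unrecognized value is reported as "unknown"
    return phases.get(environ.get("LS_EXEC_T", ""), "unknown")


print(eexec_phase({"LS_EXEC_T": "START"}))
print(eexec_phase({}))
```

An eexec built this way can set up the environment only in the start phase and skip that work at finish or checkpoint.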

Scope

Applicability and details

Operating system

  • UNIX and Linux

  • Windows

Security

  • Data passing between esub on the submission host and eexec on the execution host is not encrypted.

Job types

  • Batch jobs submitted with bsub or modified by bmod.

  • Batch jobs restarted with brestart.

  • Interactive tasks submitted with lsrun and lsgrun (eexec only).

Dependencies

  • UNIX and Windows user accounts must be valid on all hosts in the cluster, or the correct type of account mapping must be enabled.
    • For a mixed UNIX/Windows cluster, UNIX/Windows user account mapping must be enabled.

    • For a cluster with a non-uniform user name space, between-host account mapping must be enabled.

    • For a MultiCluster environment with a non-uniform user name space, cross-cluster user account mapping must be enabled.

  • User accounts must have the correct permissions to successfully run jobs.

  • An eexec that requires root privileges to run on UNIX must be configured to run as the root user.

Limitations

  • Only an esub invoked by bsub can change the job environment on the submission host. An esub invoked by bmod or brestart cannot change the environment.

  • Any esub messages provided to the user must be directed to standard error, not to standard output. Standard output from any esub is automatically passed to eexec.

  • An eexec can handle only one standard output stream from an esub as standard input to eexec. If any esub writes to standard output, you must make sure that your eexec handles that output correctly.

  • The esub/eexec combination cannot handle daemon authentication. To configure daemon authentication, you must enable external authentication, which uses the eauth executable.
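The stream rules in the limitations above (user messages on standard error, data for eexec on standard output) can be illustrated with a small Python sketch. The functions esub_emit and eexec_receive and the payload string are hypothetical, and an in-memory buffer stands in for the pipe that LSF provides between the two executables.

```python
import io
import sys


def esub_emit(payload, user_message, out=sys.stdout, err=sys.stderr):
    """esub side: messages for the user go to stderr, eexec data to stdout."""
    err.write(user_message + "\n")
    out.write(payload)


def eexec_receive(stdin=sys.stdin):
    """eexec side: read everything that arrived on standard input."""
    return stdin.read()


# In-memory stand-ins for the esub stdout and stderr streams
pipe = io.StringIO()
messages = io.StringIO()
esub_emit("TOKEN=abc123", "esub: credentials forwarded", out=pipe, err=messages)

pipe.seek(0)  # Rewind so the eexec side reads from the beginning
print(eexec_receive(pipe))
```

Because the data passed between esub and eexec is not encrypted, a real payload should not contain secrets in clear text.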