For simple parallel jobs you can use LSF utilities to start parts of the job on other hosts. Because LSF utilities handle signals transparently, LSF can suspend and resume all components of your job without additional programming.
The simplest parallel job runs an identical copy of the executable on every host. The lsgrun command takes a list of host names and runs the specified task on each host. The -p option tells lsgrun to run the task on all of the listed hosts in parallel.
bsub -n 10 'lsgrun -p -m "$LSB_HOSTS" myjob'
Job <3856> is submitted to default queue <normal>.
For more complicated jobs, you can write a shell script that runs lsrun in the background to start each component.
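A minimal sketch of such a script follows. It assumes the job was submitted with bsub -n, so that LSF has set LSB_HOSTS to the space-delimited list of allocated hosts. The function name start_components and the task name mycomponent are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: start one background component per entry in LSB_HOSTS.
# start_components and mycomponent are illustrative names, not LSF names.

start_components() {
    task=$1
    # LSB_HOSTS is left unquoted on purpose so the shell splits it on spaces.
    for host in $LSB_HOSTS; do
        lsrun -m "$host" "$task" &
    done
    # Block until every background component has finished.
    wait
}

# Run only when LSF has provided a host list (that is, inside a job).
if [ -n "${LSB_HOSTS:-}" ]; then
    start_components mycomponent
fi
```

Saved as, for example, components.sh, the script could be submitted with bsub -n 4 ./components.sh. Because each component is started through lsrun, LSF can still suspend and resume all of them together.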
Most MPI implementations and many distributed applications use rsh and ssh as their task launching mechanism. The blaunch command provides a drop-in replacement for rsh and ssh as a transparent method for launching parallel and distributed applications within LSF.
Like the lsrun command, blaunch connects directly to the RES/SBD on the remote host, creates and tracks the remote tasks, and provides the connection back to LSF. You do not need to insert pam or taskstarter into the rsh or ssh calling sequence, or configure any wrapper scripts.
You cannot run blaunch directly from the command line or as a standalone command.
blaunch works only within an LSF job, where it launches tasks on remote hosts that are part of the job allocation. On success, blaunch exits with 0.
Windows: blaunch is supported on Windows 2000 or later with the following exceptions:
Only the following signals are supported: SIGKILL, SIGSTOP, SIGCONT.
The -n option is not supported.
CMD.EXE /C <user command line> is used as the intermediate command shell unless -no-shell is specified.
Windows Vista User Account Control must be configured correctly to run jobs.
Use bsub to call blaunch, or to invoke a job script that calls blaunch. The blaunch command assumes that bsub -n implies one remote task per job slot.
The blaunch syntax is:
blaunch [-n] [-u host_file | -z host_name ... | host_name] [-use-login-shell | -no-shell ] command [argument ... ]
blaunch [-h | -V]
The following are some examples of blaunch usage:
Submit a parallel job:
bsub -n 4 blaunch myjob
Submit a job to an application profile:
bsub -n 4 -app pjob blaunch myjob
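A job script that calls blaunch might look like the following sketch. It relies on the -z form from the syntax above, on the assumption that -z accepts a quoted, space-delimited host list, and on LSB_JOBID and LSB_HOSTS being set by LSF inside the job. The names run_tasks and mytask are hypothetical. LSB_HOSTS can list a host once per slot, so check how your LSF version treats repeated host names.

```shell
#!/bin/sh
# Sketch only: launch one task on every host in the job allocation.
# run_tasks and mytask are illustrative names, not LSF names.

run_tasks() {
    task=$1
    # Without a host list there is nothing for blaunch to launch on.
    if [ -z "${LSB_HOSTS:-}" ]; then
        echo "run_tasks: LSB_HOSTS is empty, no job allocation found" >&2
        return 1
    fi
    # Pass the whole allocation to blaunch as one quoted host list.
    blaunch -z "$LSB_HOSTS" "$task"
}

# blaunch works only within an LSF job, so do nothing outside of one.
if [ -n "${LSB_JOBID:-}" ]; then
    run_tasks mytask
fi
```

Saved as, for example, run_tasks.sh, the script could be submitted with bsub -n 4 ./run_tasks.sh.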