COMSOL jobs under LSF


Running comsol from the command line

Once a comsol model is prepared (meshing included) and saved as a .mph file, you can run it from the command line, i.e. without the program trying to open the comsol Graphical User Interface. This requires the batch flag on the command line. For example

comsol batch -h

shows the command line options available.
To run the job from the command line, the most basic complete command line is:

comsol batch -inputfile wrench.mph -outputfile wrench_out.mph

The flags -inputfile and -outputfile, each followed by a file name, specify the model to run and the output file to write.

It is also possible to specify the name of the file where the Comsol log will be saved, with the flag -batchlog followed by a filename.
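
For example, a complete command line that also writes the log to a file (wrench.log is just an example name) could be:

comsol batch -inputfile wrench.mph -outputfile wrench_out.mph -batchlog wrench.log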

Note 1: by default, comsol tries to use all the processors and cores on the machine. This is almost never what you want on a cluster where several users share a node, so it is always better to specify explicitly the number of cores/processors the program should use.
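
For example, to limit comsol to 4 cores you can use the -np flag (discussed in the batch script examples below):

comsol batch -np 4 -inputfile wrench.mph -outputfile wrench_out.mph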

Note 2: Comsol writes some temporary files, which are removed after the run if it finishes successfully. If the run fails, these files are likely left behind and can fill up the disk. Their default location is the /tmp directory on linux machines. On the cluster this is a local directory that is not directly accessible to the user, so it is recommended to specify a different location, using the flag -tmpdir followed by the name of a directory, e.g.

-tmpdir $HOME/comsoltmp

Comsol creates the directory if it does not exist. In this way, the user can check if the directory is empty after the run.
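
For example, after the run you can verify that the directory is empty with:

ls -A $HOME/comsoltmp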

Comsol also writes some recovery files in a temporary directory, by default under your home directory. These files can be large and consume a lot of space there. You can specify the path to a different directory explicitly, with the option -recoverydir:

-recoverydir $HOME/comsolrecovery

Note 3: If you already have a scratch directory, use it as your tmpdir and recoverydir.
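
For example, assuming your scratch directory is available through a variable like $SCRATCH (replace it with the actual path on the cluster), the relevant flags would be:

-tmpdir $SCRATCH/comsoltmp -recoverydir $SCRATCH/comsolrecovery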

Note 4: the command comsol points to the latest version installed on the cluster.
The most recent versions are installed as modules. Use the command module avail comsol to get a list of the possible options. The default version may not be the one you want.
If you need an older version, look for the corresponding command. For example, to use comsol Version 4.4, the command is comsol44.
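
For example, to list the installed versions and then load the default one:

module avail comsol
module load comsol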

A batch job file for the cluster

To run a comsol job on the cluster non-interactively, you need to write a batch script and then call the program as shown in the previous section. In the following examples we use a tutorial case, the simulation of a wrench, and comsol 4.4. Here we discuss only the batch job options relevant for comsol; for all the others, have a look at this page.

Running a serial comsol job

To run a serial job (single processor), a basic script is

#!/bin/sh
# embedded options to bsub - start with #BSUB
# -- job name ---
#BSUB -J Com_wrench_serial
# -- Notify me by email when execution begins  --
#BSUB -B
# -- Notify me by email when execution ends    --
#BSUB -N
# -- email address -- 
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##BSUB -u your_email_address
# -- estimated wall clock time (execution time) --
#BSUB -W 4:00
### -- specify that we need 2GB of memory per core/slot -- 
#BSUB -R "rusage[mem=2GB]"
# -- parallel environment requests: 1 core --
#BSUB -n 1
### -- Specify the output and error file. %J is the job-id -- 
### -- -o and -e mean append, -oo and -eo mean overwrite -- 
#BSUB -o Output_%J.out 
#BSUB -e Error_%J.err 

# -- end of LSF options --

# -- commands you want to execute -- 
# load the comsol version you need. Use "module avail comsol"
# to see all the possibilities
module load comsol
 
unset JAVA_TOOL_OPTIONS
comsol batch -np 1 -inputfile wrench.mph -outputfile wrench_out.mph -tmpdir $HOME/comsoltmp

Notice that we request a single core

#BSUB -n 1

and we explicitly specify that we need a single core in the comsol command line

comsol batch -np 1  ...

Without this option, comsol would try to use more than one core anyway; the scheduler restricts the number of cores available to the job (using linux cpusets), but comsol is not aware of that.
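
Once the script is saved, e.g. as comsol_serial.sh (the file name is arbitrary), submit it to LSF with bsub and check its status with bjobs:

bsub < comsol_serial.sh
bjobs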

Running a shared-memory comsol job

Comsol by default uses shared-memory parallelism. The correct way to specify it is as follows:

#!/bin/sh
# embedded options to bsub - start with #BSUB
# -- job name ---
#BSUB -J Com_wrench_par_shared
# -- Notify me by email when execution begins  --
#BSUB -B
# -- Notify me by email when execution ends    --
#BSUB -N
# -- email address -- 
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##BSUB -u your_email_address
# -- estimated wall clock time (execution time) --
#BSUB -W  4:00
### -- specify that we need 2GB of memory per core/slot -- 
#BSUB -R "rusage[mem=2GB]"
# -- parallel environment requests: 4 cores --
#BSUB -n 4
### -- specify that the cores MUST BE on a single host! It's a SMP job! --
#BSUB -R "span[hosts=1]"
### -- Specify the output and error file. %J is the job-id -- 
### -- -o and -e mean append, -oo and -eo mean overwrite -- 
#BSUB -o Output_%J.out 
#BSUB -e Error_%J.err 

# -- end of LSF options --

# -- commands you want to execute -- 
# load the comsol version you need. Use "module avail comsol"
# to see all the possibilities
module load comsol
unset JAVA_TOOL_OPTIONS

comsol batch -np $LSB_DJOB_NUMPROC -inputfile wrench.mph -outputfile wrench_out.mph -tmpdir $HOME/comsoltmp

The script asks for 4 cores on a single node, through the lines

#BSUB -n 4
#BSUB -R "span[hosts=1]"

and then uses the $LSB_DJOB_NUMPROC environment variable, which is automatically set to the number of cores reserved by the scheduler. In this way there is no risk of a mismatch.
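
With the request above, $LSB_DJOB_NUMPROC is set to 4, so the comsol command line is equivalent to:

comsol batch -np 4 -inputfile wrench.mph -outputfile wrench_out.mph -tmpdir $HOME/comsoltmp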

Note: the comsol -np flag specifies only the number of processes/threads to run on a single node. If you need to run comsol on more than one node, have a look at the following examples. If in doubt, ask us by writing to support@hpc.dtu.dk.

Running an MPI comsol job

Comsol can also make use of distributed-memory parallelism, through the Intel MPI library. The script then needs an MPI environment variable to be set, plus the correct specification of the number of nodes and cores.

#!/bin/sh
# embedded options to bsub - start with #BSUB
# -- job name ---
#BSUB -J Com_par_mpi
# -- Notify me by email when execution begins  --
#BSUB -B
# -- Notify me by email when execution ends    --
#BSUB -N
# -- email address -- 
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##BSUB -u your_email_address
# -- estimated wall clock time (execution time) --
#BSUB -W  4:00
### -- specify that we need 2GB of memory per core/slot -- 
#BSUB -R "rusage[mem=2GB]"
# -- parallel environment requests: 8 cores --
#BSUB -n 8
### -- specify how the cores must be assigned: 4 on each node --
#BSUB -R "span[ptile=4]"
### -- Specify the output and error file. %J is the job-id -- 
### -- -o and -e mean append, -oo and -eo mean overwrite -- 
#BSUB -o Output_%J.out 
#BSUB -e Error_%J.err 

# -- end of LSF options --

# -- commands you want to execute -- 
# load the comsol version you need. Use "module avail comsol"
# to see all the possibilities
module load comsol
unset JAVA_TOOL_OPTIONS

# Set IntelMPI environment variables
export I_MPI_HYDRA_BOOTSTRAP=lsf

# -- program invocation here --
comsol batch -nn $LSB_DJOB_NUMPROC -np $OMP_NUM_THREADS -inputfile wrench.mph -outputfile wrench_out.mph -tmpdir $HOME/comsoltmp

The script reserves 8 cores (4 on each of 2 nodes)

#BSUB -n 8
#BSUB -R "span[ptile=4]"

then sets the environment variable needed for the Intel MPI library to launch its processes through the LSF scheduler

 export I_MPI_HYDRA_BOOTSTRAP=lsf

and then comsol is invoked with the correct option (-nn). Also in this case, to enforce consistency, the number of processes is specified using an environment variable ($LSB_DJOB_NUMPROC), which is set to the total number of cores requested.
The option

-np $OMP_NUM_THREADS

tells comsol to use only one thread per process, in a portable way.
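
With the request above, and assuming $OMP_NUM_THREADS is indeed set to 1 on the compute nodes (as described), the comsol command line is equivalent to:

comsol batch -nn 8 -np 1 -inputfile wrench.mph -outputfile wrench_out.mph -tmpdir $HOME/comsoltmp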

Running a hybrid MPI-shared-memory comsol job

In this case you must combine the options.
A simple script is the following:

#!/bin/sh
# embedded options to bsub - start with #BSUB
# -- job name ---
#BSUB -J Com_par_hybrid
# -- Notify me by email when execution begins  --
#BSUB -B
# -- Notify me by email when execution ends    --
#BSUB -N
# -- email address -- 
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##BSUB -u your_email_address
# -- estimated wall clock time (execution time) --
#BSUB -W  4:00
### -- specify that we need 8GB of memory per slot -- 
#BSUB -R "rusage[mem=8GB]"
# -- parallel environment requests: 2 "processes" --
#BSUB -n 2
### -- specify affinity: 8 "cores" for each "process" --
#BSUB -R "affinity[core(8)]"
### -- specify how the MPI processes must be assigned: 2 on each node --
#BSUB -R "span[ptile=2]"
### -- Specify the output and error file. %J is the job-id -- 
### -- -o and -e mean append, -oo and -eo mean overwrite -- 
#BSUB -o Output_%J.out 
#BSUB -e Error_%J.err 

# -- end of LSF options --

# -- commands you want to execute -- 
# load the comsol version you need. Use "module avail comsol"
# to see all the possibilities
module load comsol 

unset JAVA_TOOL_OPTIONS
# Set IntelMPI environment variables
export I_MPI_HYDRA_BOOTSTRAP=lsf

# -- program invocation here --

comsol batch -nn $LSB_DJOB_NUMPROC -np $OMP_NUM_THREADS  -inputfile wrench.mph -outputfile wrench_out.mph -tmpdir $HOME/comsoltmp

In this case, we use both the MPI (-nn) and the shared-memory (-np) processor flags, and set them to the values of two environment variables:
$LSB_DJOB_NUMPROC, which is set to the number of MPI processes requested, and
$OMP_NUM_THREADS, which is set to the number of cores per MPI process requested (the number specified with the affinity option). This will start $LSB_DJOB_NUMPROC MPI processes, and each of them will run 8 threads.
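
With the resource request above, $LSB_DJOB_NUMPROC is set to 2 and $OMP_NUM_THREADS to 8, so the comsol command line is equivalent to:

comsol batch -nn 2 -np 8 -inputfile wrench.mph -outputfile wrench_out.mph -tmpdir $HOME/comsoltmp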

Note: process placement
The resource specification for hybrid jobs is more complex than the one for pure MPI jobs.
The script reserves 16 cores: 2 MPI processes (-n 2), each bound to 8 cores (affinity[core(8)]), for a total of 2*8=16 cores. You can also select how the MPI processes are distributed across the nodes. If you set

#BSUB -n 2
#BSUB -R "affinity[core(8)]"
#BSUB -R "span[ptile=2]"

both MPI processes are dispatched to the same node.
If you specify

#BSUB -n 2
#BSUB -R "affinity[core(8)]"
#BSUB -R "span[ptile=1]"

the 2 MPI processes will be assigned to different nodes (span[ptile=1]), and each of them will get 8 cores.

More complex parallelization patterns can of course be specified along these lines; a further example is sketched below.
If you have any doubt, just write to us at support@hpc.dtu.dk
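
For instance, a possible layout with 4 MPI processes, each bound to 4 cores, and 2 processes per node (i.e. 2 nodes and 16 cores in total) would be requested as:

#BSUB -n 4
#BSUB -R "affinity[core(4)]"
#BSUB -R "span[ptile=2]"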

General Comment: The choice between a serial, shared-memory, MPI, or hybrid run depends on the specific characteristics of the model, and so does the performance that can be expected.