To run Matlab jobs under the batch system, you have to be aware that you cannot make use of graphics, i.e. the desktop GUI, plots in a GUI, etc.

That means you should be able to execute your Matlab code from the command line prompt first. Assuming that the Matlab script is called `my_matlab.m`, you should be able to run it by typing in a terminal:

```
matlab -nodisplay -r my_matlab
```

**NOTE:**

There is no `.m` extension on the command line!
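Note also that when Matlab is started with `-r`, it stays at the interactive prompt after the script finishes, so a batch job would keep running until it hits the walltime limit. It is therefore good practice to end the script with `exit`. A minimal sketch of a hypothetical `my_matlab.m` (the computation is purely illustrative):

```matlab
% my_matlab.m -- minimal batch-friendly Matlab script (illustrative example)
A = rand(100);                 % some computation
s = sum(A(:));
fprintf('sum = %f\n', s);      % print results to standard output / logfile
exit;                          % quit Matlab so the batch job can terminate
```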

You can then run Matlab on the cluster, either in serial or in parallel.

## Serial Run

In a serial run you have to reserve one single core. Assuming that your Matlab script is called `my_serial_matlab.m`, a basic job script could look like the following:

```
#!/bin/sh
# embedded options to qsub - start with #PBS
# -- our name ---
#PBS -N MySerialMatlab
# -- choose queue --
#PBS -q hpc
# -- Notify me by email when execution begins (b) and ends (e) --
#PBS -m be
# -- email address --
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##PBS -M your_email_address
# -- estimated wall clock time (execution time): hh:mm:ss --
#PBS -l walltime=00:10:00
# -- parallel environment requests --
#PBS -l nodes=1:ppn=1
# -- end of PBS options --

# -- change to working directory
cd $PBS_O_WORKDIR

# -- commands you want to execute --
#
matlab -nodisplay -r my_serial_matlab -logfile MySerialMatlabOut
```

The option `#PBS -l nodes=1:ppn=1` specifies that you reserve one single core on one node.

If you omit the `-logfile` option (MySerialMatlabOut), the output will be put into the standard output log file, which is named after the *job name* and the *job id*, e.g. for *job id* 12345: *MySerialMatlab.o12345*. Error messages will go to *MySerialMatlab.e12345*. Note: due to a bug in Matlab, the standard output may be scrambled and may miss lines you would expect. To catch all information, the `-logfile` option is preferable.

For more information on batch jobs, see the page on MOAB/Torque jobs.

## Parallel Run

Matlab can also run in parallel, both in a shared memory and in a distributed memory environment. If you want to run Matlab on a single node, as on your personal computer, follow the instructions for the shared memory script. This limits the number of workers that you can define in your Matlab session to the number of cores available on a single node. Currently the general HPC cluster has 8-core and 20-core nodes. If you need to use more than 20 cores, or in any case need more than one single node, you have to follow the instructions for the distributed memory script. In this case, you have to use the MATLAB Distributed Computing Server (MDCS).

**IMPORTANT:**

DTU has only **a limited number of MDCS licenses**, and each core uses a separate license. So:

- If you decide to run a parallel program, please **benchmark your Matlab program** to check if there really is a substantial advantage in using more cores. Doubling the number of cores for a speedup of 10% is wasting resources.
- If you need fewer than 20 workers, **DO NOT** use the MDCS profile, but use the default Matlab *local* profile. In that way you do not waste precious licenses that someone else could need.

### Shared Memory Script

Assuming that your script is called `my_shared_matlab.m`, your job script could look like this:

```
#!/bin/sh
# embedded options to qsub - start with #PBS
# -- our name ---
#PBS -N MySharedMatlab
# -- choose queue --
#PBS -q hpc
# -- Notify me by email when execution begins (b) and ends (e) --
#PBS -m be
# -- email address --
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##PBS -M your_email_address
# -- estimated wall clock time (execution time): hh:mm:ss --
#PBS -l walltime=00:10:00
# -- parallel environment requests --
#PBS -l nodes=1:ppn=4
# -- end of PBS options --

# -- change to working directory
cd $PBS_O_WORKDIR

# -- commands you want to execute --
#
matlab -nodisplay -r my_shared_matlab -logfile MySharedMatlabOut
```

The option `#PBS -l nodes=1:ppn=4` specifies that you reserve **4 cores** on one node. This means that you are reserving 4 cores for your Matlab job, and you should not use more than 4 workers!

**NOTE:**

If you use more workers inside Matlab than the cores you have reserved, Matlab will only run slowly! So use at most as many workers as the cores you asked for.

- You can get the number of cores reserved from inside Matlab and assign it to a variable: `nw=str2num(getenv('PBS_NUM_PPN'));`
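Putting it together, the shared memory script can open a pool of exactly the reserved number of workers using the default *local* profile. A minimal sketch of a hypothetical `my_shared_matlab.m` (the `parfor` body is purely illustrative):

```matlab
% read the number of reserved cores from the PBS environment
nw = str2num(getenv('PBS_NUM_PPN'));

% open a pool with the default local (shared memory) profile,
% never asking for more workers than reserved cores
parpool('local', nw);

% example parallel loop: each worker does an independent computation
r = zeros(1, nw);
parfor i = 1:nw
    r(i) = max(abs(eig(rand(200))));   % dummy workload
end
disp(r);

exit;   % quit Matlab so the batch job can terminate
```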

### Distributed Memory Script

Matlab by default does not know anything about the topology of the cluster, and so it cannot run processes across multiple nodes. If you need to use more than the cores available on a single node, you have to use the MATLAB Distributed Computing Server (MDCS), for which DTU has some licenses. So, first you have to load the corresponding profile in Matlab (the instructions are here), and then prepare a script like the following. Here we assume that your script is called `my_distributed_matlab.m`.

```
#!/bin/sh
# embedded options to qsub - start with #PBS
# -- our name ---
#PBS -N MyDistributedMatlab
# -- choose queue --
#PBS -q hpc
# -- Notify me by email when execution begins (b) and ends (e) --
#PBS -m be
# -- email address --
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##PBS -M your_email_address
# -- estimated wall clock time (execution time): hh:mm:ss --
#PBS -l walltime=10:00:00
# -- parallel environment requests --
#PBS -l nodes=1:ppn=1
# -- end of PBS options --

# -- change to working directory
cd $PBS_O_WORKDIR

# -- commands you want to execute --
#
matlab -nodisplay -r my_distributed_matlab -logfile MyDistributedMatlabOut
```

The option `#PBS -l nodes=1:ppn=1` specifies that you reserve **1 core only**. When you use the MDCS profile and open a parallel pool, Matlab will automatically create a new batch-job script for you, asking for a number of cores equal to the number of workers, and submits that job to the cluster for you. **So, in the end, you will have 2 jobs running**: your own single-core job, and the multi-core job that Matlab submits for you. Inside your Matlab script, you have to open a pool using the MDCS profile.

However, the job script that Matlab creates is very generic: it does not specify any memory requirements, any constraints on the core distribution, or a walltime. So it is up to you to tell Matlab what your specific needs are. For example, at the beginning of your `my_distributed_matlab.m` file you can have lines like the following:

```matlab
clust = parcluster('DTUcluster');                      % load the MDCS cluster profile
clust.ResourceTemplate = '-l nodes=4:ppn=8';           % options for the job scheduler
clust.SubmitArguments = '-q hpc -l walltime=08:00:00'; % options for the job scheduler
numW = 32;  % exactly the number of nodes times the number of cores per node requested
parpool(clust, numW);
% here is the rest of your matlab script
%
%
```

In this way, you help the scheduler to distribute the work across the cluster in a smarter way. In the same way you can add other submit arguments to the `clust.SubmitArguments` field, for example the memory requirements. And remember that it is the multi-core job that needs the memory, not your single-core job. Have a look at the batch job page for more information.
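As a sketch, a memory requirement could be appended to the same submit string; the exact resource syntax (here `-l mem=64gb`, a hypothetical value) depends on your Torque/MOAB configuration, so check the batch job page for the options valid on your cluster:

```matlab
% request walltime and (illustrative) total memory for the multi-core
% job that Matlab submits on your behalf
clust.SubmitArguments = '-q hpc -l walltime=08:00:00 -l mem=64gb';
```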

**NOTE:**

Even in this case, if you use more workers inside Matlab than the cores that you reserve, Matlab will only run slowly! So use at most as many workers as the cores you asked for.

- If your original job script, the one that launches Matlab, is killed, the child job submitted by Matlab will also be killed. So please select in your original job a walltime that is larger than the one specified inside Matlab. Remember that if the cluster is busy, it can take some time for the job submitted by Matlab to start, so give your single-core job some extra time.