MATLAB Parallel Server configuration on the LSF clusters


MATLAB can be used in parallel across different nodes thanks to its MATLAB Parallel Server (formerly: MDCS) product. To be able to use it on the DCC HPC LSF cluster, MATLAB needs to be configured correctly, so that it becomes aware of the underlying cluster management software.

This procedure needs to be done only once for each MATLAB version, and applies to MATLAB R2021b and later releases. For earlier releases, see the old MDCS documentation.

Log in to the LSF cluster (instructions here).

Open an interactive session (ThinLinc: xterm from the DTU menu; CLI: with the command linuxsh -X).

Start a MATLAB session by typing matlab (or the command for a specific MATLAB version if you do not want to run the default one).

In MATLAB, start the configuration of the cluster with the command configCluster. You will get some messages:

>> configCluster
   Must set MemUsage and WallTime before submitting jobs to DCC. E.g.                 
   >> c = parcluster;                 
   >> % 4 GB/core                 
   >> c.AdditionalProperties.MemPerCPU = '4GB'; 
   >> % 5 hours                 
   >> c.AdditionalProperties.WallTime = '05:00'; 
   >> c.saveProfile

You have now configured a cluster called dcc, with some default settings (e.g. a maximum of 32 workers), but you still need to set some general Additional Properties, such as your email address, the queue name, the walltime, and the memory usage per worker/core. For example:

>> % Get a handle to the cluster, using the default name
>> c = parcluster(dccClusterProfile()); 
>> % Specify e-mail address to receive notifications about your job  
>> % c.AdditionalProperties.EmailAddress = 'user-id@dtu.dk'; 
>> % Request 4 GB/core 
>> c.AdditionalProperties.MemPerCPU = '4GB';
>> % Specify procs per node ('0' => automatic/optimized for given queue)  
>> c.AdditionalProperties.ProcsPerNode = 0; 
>> % Specify the walltime (e.g. 5 hours) 
>> c.AdditionalProperties.WallTime = '05:00'; 
>> c.saveProfile 

The first command, c = parcluster(dccClusterProfile()), creates a cluster object for the session, using the default profile name set by configCluster. This name is returned by the function call dccClusterProfile().
The following commands set some additional properties for that cluster.
The last line, c.saveProfile, saves the settings for the cluster, so they persist across sessions. You can always check the settings with the command c.AdditionalProperties.
To clear a value, assign the property an empty value ('', [], or false).

>> % Turn off email notifications
>> c.AdditionalProperties.EmailAddress = '';

These values become the defaults for the cluster. They can be overridden for the current session by reassigning them without saving the profile.
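As a minimal sketch of such a session-only override, the lines below change the walltime on the cluster object without calling c.saveProfile. The one-hour value is only an illustrative assumption; the property names are the ones used in the examples above.

```matlab
% Get the cluster object with the saved defaults
c = parcluster(dccClusterProfile());

% Override the walltime for this session only (illustrative value: 1 hour)
c.AdditionalProperties.WallTime = '01:00';

% Inspect the values that jobs submitted through this object will use
disp(c.AdditionalProperties)

% c.saveProfile is deliberately not called, so the stored default is unchanged
```

In a new MATLAB session, parcluster(dccClusterProfile()) should again return the saved default walltime.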

Now you are ready to use the DCC HPC LSF clusters with the following standard settings:

  • Name: “dcc”
  • Used queue: hpc
  • Number of workers: up to 32

Please adapt the profile if you have different needs, e.g. if you want or have to use a different queue, or want to increase the number of workers. You can give your profiles individual names for the different settings, so it may be a good idea to keep the default profile above as a reference. To save your changes under a different profile name, e.g. “myDCCprofile2021b”, use

c.saveAsProfile("myDCCprofile2021b")
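Once saved, a named profile can be loaded in a later session by passing its name to parcluster. The sketch below assumes the profile name from the example above and that the profile was saved beforehand.

```matlab
% Load the previously saved profile by name
c2 = parcluster("myDCCprofile2021b");

% Check which settings were stored with this profile
disp(c2.AdditionalProperties)
```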

Note: For up to 20 workers, the ‘local’ profile can be used as well, and it does not use any of the Parallel Server licenses!
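For comparison, a small pool on the ‘local’ profile might be opened as sketched below. The worker count of 8 is an arbitrary assumption and should not exceed the cores available in your session; in newer MATLAB releases the same profile may be listed under the name ‘Processes’.

```matlab
% Open a pool of 8 workers on the current node using the 'local' profile
p = parpool('local', 8);

% Each iteration runs on one of the local workers
parfor i = 1:8
    fprintf('iteration %d\n', i);
end

% Shut down the pool when done
delete(p);
```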

 

How does it work?

In your script, you can select the cluster with a command like

>> profname = dccClusterProfile();
>> clust = parcluster(profname);

Then, in the script, you can create a pool of workers, specifying that cluster and the number of workers (30 in the example):

>> p=parpool(clust,30)

With this operation, MATLAB automatically submits a job to the cluster for you using the Additional Properties defined in the profile or in previous lines in the script, and keeps track of the workers.
From this point on everything works as if the workers were running locally on a single machine.
Remember that if you kill the MATLAB session, the workers are also killed automatically.
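Putting the steps together, a complete script could look like the sketch below. The parfor workload (summing random numbers) and the values for memory and walltime are placeholders chosen for illustration, not recommended settings.

```matlab
% Select the cluster configured with configCluster
clust = parcluster(dccClusterProfile());

% Optionally adjust properties for this run only (illustrative values)
clust.AdditionalProperties.MemPerCPU = '4GB';
clust.AdditionalProperties.WallTime = '05:00';

% Submit the pool job to the cluster and wait for 30 workers
p = parpool(clust, 30);

% Placeholder workload: each iteration is independent of the others
n = 300;
total = zeros(1, n);
parfor i = 1:n
    total(i) = sum(rand(1, 1e6));
end
fprintf('Mean of the partial sums: %g\n', mean(total));

% Release the workers so the cluster job ends
delete(p);
```

Deleting the pool explicitly at the end releases the workers as soon as the computation is finished, rather than holding them until the MATLAB session closes.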

If you want to use the MATLAB Parallel Server in a job script, please refer to this page.