HPC is a multi-user environment. To ensure effective usage of the available resources, the workload has to be managed. This task is accomplished by a Resource Manager (RM), a program (or sometimes a set of programs that work together) that assigns resources to users according to the current and expected load of the system, the users' needs, and a predefined assignment policy. This means that users do not run their applications directly, but instead “ask” the system to run them, usually by means of a job script. The Resource Manager parses the script and tries to optimize the usage of the available resources by scheduling the execution of the applications (jobs) at different times and on different nodes/cores in the cluster. Many Resource Manager packages are available. On the HPC, a combination of MOAB (scheduler) and Torque (resource manager) is used. However, most of the description in this guide is general. You can find some specific information at MOAB/Torque.
The job script
The Resource Manager assigns the requested resources to the user jobs, so that many jobs can run simultaneously without interfering with each other, and schedules the execution of the different requests. To do this well, the RM needs some information from the user. This is why the user is required to provide a job script (sometimes also called a batch file) that specifies:
- the resources requested, and the job constraints
- the queue to use (more details later)
- everything that makes up a working execution environment
By resources we mean the number of processors/cores/nodes, the amount of memory needed (total and/or per process), possibly some specific features (special hardware, for example), and so on.
Job constraints are typically time constraints, i.e. for how long you are reserving the resources. Notice that there are limits on the maximum number of cores, the number of jobs, and the time (see hpc queue parameters).
The concept of a queue is quite intuitive: since not all the applications/jobs are run immediately, they are ordered in a queue for delayed execution. However, the same scheduler can manage many different queues. At DTU, for example, the RM manages different clusters, some of which are for all users and some only for selected groups. There are also different queues for different kinds of applications (serial vs parallel, short vs long, and others), with different restrictions (for example the app and hpc queues, see hpc queue parameters).
Then you have to provide a functioning execution environment: remember that your application will be run when scheduled by the RM, so the application must be able to run without user intervention (sometimes called unattended execution). This means that the correct specification of the executables must be provided, with all the necessary input files, and a correct specification of the environment (for example libraries, modules…).
Notice that the RM reserves the resources for your job based on your request, and runs the job when enough resources are free, avoiding conflicts with other users’ jobs. It is therefore important that the resources you request correspond to what your application really needs.
All this information must be written in a text file, following a simple syntax that will be shown later.
Once you have this text file (the job script, let us call it submit.sh), you must submit it by typing in a terminal the command:
$ qsub submit.sh
Then you can check the status of your submission by issuing the command
$ qstat
NOTE: at present, there is no way to submit jobs other than from the command line
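The two steps above (submit, then check) can be combined in a small shell function. This is only a sketch: the function name submit_and_check is hypothetical, and it assumes that qsub prints the identifier of the new job on standard output and that qstat accepts a job identifier as argument, as is the case with Torque.

```shell
# Hypothetical helper: submit a job script and show the status of the new job.
# Usage: submit_and_check submit.sh
submit_and_check() {
    # qsub prints the identifier of the new job on standard output
    sac_jobid=$(qsub "$1") || return 1
    echo "Submitted job: $sac_jobid"
    # ask for the status of that job only
    qstat "$sac_jobid"
}
```

After defining the function in your shell, you would call it as submit_and_check submit.sh from the directory that contains the job script.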
Preparing a job script
The first thing to do when preparing a batch job is to make sure that the application can run unattended, without your supervision. So you have to make sure that the executables are in the right place, that all the input files are specified, and that the output gets written where expected. Let us assume that everything is set up properly, and that you would run your application from the command line as follows:
myapplication.x < input.in > output.out
This means that you run the program myapplication.x, reading the input from the file input.in and writing the output to the file output.out, in the directory where you are when you issue the command.
- An important part of the execution environment is the location of files. You have to make sure that the working directory, i.e. the directory where you want your files to be read and written, is correctly specified (see PBS_O_WORKDIR in the examples).
- It could be necessary to specify the full path of your application, to be sure that everything goes as expected.
- If your application needs some special environment (e.g. some special libraries), you have to make sure that they are loaded before the execution.
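Some of these points can be checked before submitting. The following is a minimal sketch of such a check; the function name check_job_files and its three arguments are hypothetical and not part of the batch system.

```shell
# Hypothetical pre-submission check.
# Usage: check_job_files <executable> <input file> <output directory>
check_job_files() {
    cjf_status=0
    if [ ! -x "$1" ]; then
        echo "executable not found or not executable: $1" >&2
        cjf_status=1
    fi
    if [ ! -r "$2" ]; then
        echo "input file not readable: $2" >&2
        cjf_status=1
    fi
    if [ ! -d "$3" ] || [ ! -w "$3" ]; then
        echo "output directory not writable: $3" >&2
        cjf_status=1
    fi
    # 0 if everything was found, 1 otherwise
    return "$cjf_status"
}
```

For example, check_job_files ./myapplication.x input.in . returns a non-zero status and prints a message for each item that is missing.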
Example: a not so basic job script
A job script for submitting the same program to a queue could look like this:
#!/bin/sh
# embedded options to qsub - start with #PBS
# -- Name of the job ---
#PBS -N My_Application
# -- specify queue --
#PBS -q hpc
# -- estimated wall clock time (execution time): hh:mm:ss --
#PBS -l walltime=01:00:00
# -- number of processors/cores/nodes --
#PBS -l nodes=1:ppn=4
# -- user email address --
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##PBS -M your_email_address
# -- mail notification --
#PBS -m abe
# here follow the commands you want to execute
cd $PBS_O_WORKDIR
myapplication.x < input.in > output.out
The file starts with the line
#!/bin/sh
that tells the system to call the sh command line interpreter (shell) to interpret the subsequent lines.
The lines that start with # are comments in the shell environment, and so the rest of the line is not executed.
The lines starting with #PBS are interpreted by the RM as lines that contain options for the resource manager.
After the section with the options for the Resource Manager, you have to insert the command(s) for running your programs.
NOTE: all the requests for the Resource Manager, that is all the lines starting with #PBS, MUST appear before the first shell command (i.e. the first non-blank line not starting with #). All the lines starting with #PBS that appear after that are simply ignored by the RM, but no warning or error message will signal that.
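Since misplaced #PBS lines are ignored silently, it can be useful to list the directives that will actually be taken into account. The following sketch applies the rule stated in the note with awk; the function name list_active_pbs_directives is hypothetical, and the check only mimics the rule, it does not ask the RM itself.

```shell
# Hypothetical helper: print the #PBS lines that appear before the
# first shell command of a job script.
# Usage: list_active_pbs_directives submit.sh
list_active_pbs_directives() {
    awk '
        /^[[:space:]]*$/ { next }          # skip blank lines
        /^#PBS/          { print; next }   # directive in the header
        /^#/             { next }          # ordinary comment, shebang, ##PBS
        { exit }                           # first shell command: stop reading
    ' "$1"
}
```

A #PBS line that you expect but that is missing from the output is either commented out (##PBS) or placed after the first command.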
Some of the most important options for the RM are shown in the script:
#PBS -N My_Application
Specifies the name of your job (-N flag). A recognizable name makes it easier to check the status of your job.
#PBS -q hpc
Specifies the queue you want your job to be run in (-q flag). Notice that different queues have different defaults, and access to specific queues can be restricted to specific groups of users.
#PBS -l walltime=01:00:00
Specifies that you want your job to run for AT MOST 1 hour (the format is hh:mm:ss). The -l flag is used for specifying the list of resources needed for the job.
#PBS -l nodes=1:ppn=4
Asks to reserve 1 node and 4 cores on each node.
For memory, the following options are available:
#PBS -l mem=4gb
Specifies the maximum amount of physical memory used by all the processes in the job.
#PBS -l pmem=512mb
Specifies the maximum amount of physical memory used by any single process in the job.
#PBS -l vmem=6gb
Specifies the maximum amount of virtual memory used by all the processes in the job.
#PBS -l pvmem=512mb
Specifies the maximum amount of virtual memory used by any single process in the job.
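The per-process and per-job limits should be consistent with each other: if every process may use up to pmem, the job as a whole can reach the number of processes times pmem. The following sketch does this arithmetic for the values used in this guide. The variable names are hypothetical, and the conversion 1 gb = 1024 mb is an assumption.

```shell
procs=4         # processes in the job, e.g. from "-l nodes=1:ppn=4"
pmem_mb=512     # per-process limit, e.g. from "-l pmem=512mb"
mem_mb=4096     # job-wide limit, e.g. "-l mem=4gb" (assuming 1 gb = 1024 mb)

# memory the job can reach if every process uses its full allowance
worst_case_mb=$((procs * pmem_mb))

if [ "$worst_case_mb" -le "$mem_mb" ]; then
    echo "per-process limits add up to ${worst_case_mb} MB, within the ${mem_mb} MB job limit"
else
    echo "per-process limits add up to ${worst_case_mb} MB, above the ${mem_mb} MB job limit" >&2
fi
```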
#PBS -M your_email_address
User email address, so that the user can receive notifications
#PBS -m abe
Asks the RM to send an email to the user address when the job (b)egins, (e)nds and (a)borts. It is not necessary to set all of them.
$PBS_O_WORKDIR is set to the absolute path of the directory from which you executed the qsub command. The cd command changes the current directory to that working directory, so that your application can find all the files it needs.
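When you test the command part of a script interactively, outside the batch system, $PBS_O_WORKDIR is not set. A possible variant of the cd step that works in both cases is sketched below; the variable name workdir is hypothetical, and falling back to the current directory is a design choice, not something the RM requires.

```shell
# Use the submission directory when run by the RM,
# the current directory otherwise.
workdir=${PBS_O_WORKDIR:-$PWD}

# stop immediately if the directory cannot be entered
cd "$workdir" || exit 1
echo "Running in: $(pwd)"
```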
After this first command no other PBS option will be considered. You can now write the commands you use for running your application:
myapplication.x < input.in > output.out
This section can be a full script. Even better, you could write an external script with all the instructions you need (for example a bash, python, or perl script) and call it from the submit script.
Notice that applications are managed with a modular approach on the HPC (see modules). This means that the software packages you need are probably not available by default on the node where your application runs. You therefore have to explicitly load the modules you need, by issuing the appropriate
module load <list of the modules the program needs>
before the command that executes your job.
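To show where the module load line sits in a complete job script, the following sketch writes such a script to a file, so that it can be inspected before submission. The function name write_submit_script is hypothetical, the module names are whatever you pass as arguments, and the #PBS values are simply those of the example in this guide.

```shell
# Hypothetical helper: write a job script that loads the given modules
# before running the application.
# Usage: write_submit_script <output file> <module> [<module> ...]
write_submit_script() {
    wss_out=$1
    shift
    # unquoted here-document: $* is expanded now, \$ stays a literal $
    cat > "$wss_out" <<EOF
#!/bin/sh
#PBS -N My_Application
#PBS -q hpc
#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=4

# load the software environment first
module load $*

# move to the directory the job was submitted from, then run
cd \$PBS_O_WORKDIR
myapplication.x < input.in > output.out
EOF
}
```

The generated file keeps all the #PBS lines before the first command, and places module load before the line that runs the application.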