A modular approach to the software stack is a natural choice in a complex environment such as an HPC cluster. The reasoning behind modules is quite simple.
In a multi-user environment, especially one used by people from different disciplines, the software stack can become very large. However, each user is interested in only a handful of programs, not all of them. Moreover, different users sometimes need different versions of the same software, so these versions must coexist on the system. This can lead to system-wide version conflicts and to confusion for users, who may inadvertently use the wrong version.
A modular approach makes the full pool of software packages available to all users while simplifying its management.
Using modules, each user has control over their own environment, for example the shell environment variable PATH, which holds the list of directories from which you can run programs without specifying the full path. As a minor drawback, the user will not find all the programs they need readily available at login: the program modules have to be loaded explicitly.
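To make the role of PATH concrete, the following sketch mimics by hand what loading a module does to command lookup. It is only an illustration: the tool name mytool and the temporary directories are made up for the demo, and a real module typically adjusts more than PATH.

```shell
#!/bin/sh
# Hypothetical demo: two "versions" of a tool called mytool, each in its own directory.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/v1/bin" "$demo_root/v2/bin"
printf '#!/bin/sh\necho "mytool 1.0"\n' > "$demo_root/v1/bin/mytool"
printf '#!/bin/sh\necho "mytool 2.0"\n' > "$demo_root/v2/bin/mytool"
chmod +x "$demo_root/v1/bin/mytool" "$demo_root/v2/bin/mytool"

original_path=$PATH

# With the v1 directory first in PATH, the name "mytool" resolves to version 1.0.
PATH="$demo_root/v1/bin:$original_path"
first=$(mytool)
echo "$first"

# With the v2 directory first instead, the same name runs version 2.0.
PATH="$demo_root/v2/bin:$original_path"
second=$(mytool)
echo "$second"

# Restore the environment and clean up the demo files.
PATH=$original_path
rm -rf "$demo_root"
```

The command name never changes; only the directory that comes first in PATH decides which program runs. Modules automate this kind of bookkeeping and undo it again when a module is unloaded.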
When the user logs on to the HPC, the environment is “standard”. The main software is available, but most of the scientific software and development environments are not. The module command, an interface to the Modules package, lets you manage the modules and thereby change the environment. It requires a switch, given right after the command name, that defines its behaviour. The most useful switches are described here, grouped according to their function:
Checking the system
module list
Show the list of the loaded modules.
module avail
Show the list of all the available modules.
module load <module name>
Load the specified module. Notice that after loading a module the environment changes. For example, if you load a different version of a program, the command name will refer to the version you just loaded.
module unload <module name>
Unload the specified module.
module switch <old module name> <new module name>
Unload the old module and load the new one.
Information about the modules
module whatis <module name>
Show a short description of the module
module show/display <module name>
Show module information and what the module sets, e.g. the PATH modifications.
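In scripts it can be handy to test whether a module is loaded without parsing the output of module list. The sketch below assumes that the Modules installation exports the colon-separated variable LOADEDMODULES (you can verify this with echo "$LOADEDMODULES" after loading something). The helper name module_is_loaded is made up for this example and is not part of the Modules package.

```shell
#!/bin/sh
# module_is_loaded NAME
# Succeeds if NAME appears as a full entry in the colon-separated
# LOADEDMODULES variable (assumed to be maintained by the module command).
module_is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example with a module name used later in this section.
if module_is_loaded gcc/4.7.2-cloog; then
    echo "gcc/4.7.2-cloog is loaded"
else
    echo "gcc/4.7.2-cloog is not loaded"
fi
```

The match is on the full module name, so asking for gcc alone does not match gcc/4.7.2-cloog.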
Example: loading a compiler and an MPI parallel library.
Situation: you want to compile your brand new program, and you need gcc and the Open MPI library.
After logging in, check whether gcc is available:
gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
This is not the version you want. Check which modules are already loaded:
module list
Currently Loaded Modulefiles:
  1) latex/TeXLive12
By default the LaTeX module is loaded. Now check whether a more recent gcc is available:
module avail -t
. . . . . .
CryptoMiniSat/3.2.0
. . . . . .
gcc/4.7.2
gcc/4.7.2-cloog(default)
. . . . . .
mpi/gcc(default)
mpi/gcc-4.7.2-openmpi-1.6.3
. . . . . .
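Choose the modules that you need and load them. The "(default)" tag shown by module avail is only a marker and is not part of the module name:
module load gcc/4.7.2-cloog
module load mpi/gcc-4.7.2-openmpi-1.6.3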
module list
Currently Loaded Modulefiles:
  1) latex/TeXLive12   3) gcc/4.7.2-cloog
  2) studio/12u3b      4) mpi/gcc-4.7.2-openmpi-1.6.3
Check that gcc is now the right one:
gcc --version
gcc (GCC) 4.7.2
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
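The same check can be scripted, which is useful when you want a build script to stop if the wrong compiler is active. This is a minimal sketch that assumes the banner layout shown in this section, where the version number is the third word of the first line. Other gcc builds may format the banner differently, and gcc -dumpversion is an alternative that prints only the version. The function name gcc_version_from_banner is made up for the example.

```shell
#!/bin/sh
# gcc_version_from_banner: read `gcc --version` output on stdin and print
# the third word of its first line (assumed layout: "gcc (GCC) 4.7.2 ...").
gcc_version_from_banner() {
    awk 'NR == 1 { print $3 }'
}

wanted="4.7.2"   # the version loaded in the example above
if command -v gcc >/dev/null 2>&1; then
    found=$(gcc --version | gcc_version_from_banner)
    if [ "$found" = "$wanted" ]; then
        echo "gcc $found is active"
    else
        echo "gcc reports $found, expected $wanted"
    fi
else
    echo "gcc not found in PATH"
fi
```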
NOTE: when you log out, the modules are unloaded, so you have to load them again the next time you log in. In principle you can set up your shell environment so that it loads some modules automatically (see the section Loading modules automatically). However, this can cause problems if you later want to use different modules, because you no longer start from a clean setup.
Modules and batch jobs
When you prepare a batch job to run a program, you must explicitly load the modules that the program requires. When your program is run by the Resource Manager, it runs in a clean environment (hopefully). If your program depends on specific libraries (compiler-related libraries, or a special MPI library, for example), it will simply abort if it cannot find them. So, in the batch job, before the command that launches the program, remember to add a line
module load <list of the modules needed>
The batch file could look like this:
#!/bin/sh
# embedded options to qsub - start with #PBS
# -- Name of the job ---
#PBS -N My_Application
# -- specify queue --
#PBS -q hpc
# -- estimated wall clock time (execution time): hh:mm:ss --
#PBS -l walltime=00:10:00
# -- number of processors/cores/nodes --
#PBS -l nodes=1:ppn=4
# -- user email address --
# please uncomment the following line and put in your e-mail address,
# if you want to receive e-mail notifications on a non-default address
##PBS -M your_email_address
# -- mail notification --
#PBS -m abe
# -- run in the current working (submission) directory --
if test X$PBS_ENVIRONMENT = XPBS_BATCH; then cd $PBS_O_WORKDIR; fi
# here follow the commands you want to execute
# Load modules needed by myapplication.x
module load gcc/4.7.2-cloog mpi/gcc-4.7.2-openmpi-1.6.3
# Run my program
myapplication.x < input.in > output.out
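Because a missing module only shows up when the program aborts, it can help to make the job stop early with a clear message. The helper below is a sketch, not part of the Modules package or of PBS: the name require_command is made up, and it only checks that a command can be found in PATH, not that every library the program needs is present.

```shell
#!/bin/sh
# require_command NAME
# Succeeds if NAME can be run from the current PATH; otherwise prints a
# message on stderr and returns 1.
require_command() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: '$1' not found in PATH (is the required module loaded?)" >&2
    return 1
}

# Possible use in the batch file, right after the module load line:
#   require_command myapplication.x || exit 1
```

With this line in place the job ends immediately, and the error message appears in the job's error output instead of a less obvious failure later on.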
Loading modules automatically
It is possible to load one or several modules automatically every time you start a new shell. Simply edit the .gbarrc file in your home directory (or create it if it does not exist):
$ nano -w ~/.gbarrc
Then type in the modules you want to load every time you start a new shell, as a comma separated list as follows (example):
Finally, type ctrl-o and confirm with y (or <enter>) to save, then type ctrl-x to exit.