HPC Queue Parameters

There are currently two general queues (classes) available to all HPC users: the hpc queue and the app queue. The table below shows their parameters:

Definition            hpc                            app
default queue         Yes                            No
max wallclock         72 hours                       72 hours
max cpu time          N/A                            24 hours
max nodes/job         N/A                            1 node
max cores/job         100                            1
max cores/queue       100…160 (depending on load)    20/24 (depending on node)
max processes/node    8/20 (depending on node)       1
default walltime      48 hours                       48 hours

As indicated in the table above, a job submitted without a queue request goes to the hpc queue. There it can run for at most 72 hours on at most 100 cores, with at most 8 or 20 cores per node, depending on the node type. Across all your jobs in the hpc queue, you also cannot hold more than 100 cores at any one time; depending on load, this limit can rise to 160.

Note: If you do not set a walltime in your job submission, the job is assigned the default walltime of 48 hours in the hpc queue.
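To illustrate the hpc limits described above, here is a minimal sketch that checks a request against the 100-core per-job limit and then builds a submission with an explicit walltime. The values of NODES, PPN and WALLTIME are illustrative assumptions, not site defaults, and the qsub command is printed rather than executed so the sketch can be tried outside the cluster.

```shell
# Illustrative request; these three values are assumptions for the example.
NODES=4
PPN=8
WALLTIME=24:00:00

# Per-job core limit of the hpc queue, taken from the table above.
MAX_CORES_PER_JOB=100

TOTAL_CORES=$((NODES * PPN))
RESOURCES="nodes=${NODES}:ppn=${PPN},walltime=${WALLTIME}"

if [ "$TOTAL_CORES" -le "$MAX_CORES_PER_JOB" ]; then
    # Printed, not executed: run the printed line on a login node to submit.
    echo "qsub -q hpc -l ${RESOURCES} job_script.sh"
else
    echo "Request of ${TOTAL_CORES} cores exceeds the ${MAX_CORES_PER_JOB}-core limit" >&2
fi
```

Setting the walltime explicitly, as here, avoids relying on the 48-hour default.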

In the app queue, you also have a maximum walltime of 72 hours, as well as a maximum CPU time of 24 hours (total processor-seconds used by any single job). If either of these limits is reached, your job will be terminated. You are also limited to no more than 20 or 24 cores (depending on the type of application node) within the app queue.

Note: Please don’t use the app queue as a destination for serial or parallel jobs. The app queue is for running menu applications and for testing your job scripts before submitting them to the cluster.
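A short test job for the app queue might look like the sketch below. The job name and the 10-minute walltime are assumptions chosen for illustration; the single node and single core match the app queue limits given above. Because the #PBS lines are ordinary comments to the shell, the script can also be run by hand to check it before submission.

```shell
#!/bin/sh
#PBS -q app
#PBS -N script_test
#PBS -l nodes=1:ppn=1,walltime=00:10:00

# Torque sets PBS_O_WORKDIR to the directory qsub was run from.
# Fall back to the current directory when the script is run by hand.
WORKDIR="${PBS_O_WORKDIR:-$PWD}"
cd "$WORKDIR" || echo "could not enter $WORKDIR" >&2

# Replace this line with the commands you want to test.
echo "Test job running on $(uname -n) in $WORKDIR"
```

Once the script behaves as expected, change the queue and resource request to suit the real job.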

Serial Jobs

Serial jobs (single-core jobs running on a single node) are detected automatically by MOAB/Torque and assigned to dedicated nodes within the hpc queue. This functionality is called JobTemplates in Moab. There is currently one serial-job template configured on the cluster: hpc_longserial. It caters for serial jobs requesting between 3 and 5 days of walltime, which exceeds the 3-day (72-hour) limit of the hpc queue, and ensures that they run within the hpc queue rather than being rejected by MOAB. For jobs requesting more than 5 days of walltime, send an email stating the reasons for your request to support@cc.dtu.dk.

Here is an example of a submitted job using the hpc_longserial template:

qsub -l nodes=1:ppn=1,walltime=73:00:00 job_script.sh

The example above will be picked up by the hpc_longserial template, since it requests a single node with a single processor (a serial job) and a walltime greater than 72 hours.

Note: If you do not specify a walltime greater than 72 hours, your job will be terminated automatically after 72 hours at the latest. If you need your serial job to run for up to 5 days, you must specify a walltime greater than 72 hours so that it ends up in the hpc_longserial job template.
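The walltime thresholds for serial jobs can be sketched as a small helper. The function names walltime_to_seconds and classify_serial_walltime are hypothetical and are not part of MOAB/Torque; the sketch only mirrors the rules stated above (up to 72 hours: normal hpc limits; more than 72 hours and up to 5 days: hpc_longserial; more than 5 days: contact support).

```shell
# Convert an HH:MM:SS walltime string to seconds.
# awk treats fields such as "08" as decimal, so leading zeros are safe.
walltime_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Report how a serial job (nodes=1:ppn=1) with the given walltime is handled.
classify_serial_walltime() {
    secs=$(walltime_to_seconds "$1")
    if [ "$secs" -le $((72 * 3600)) ]; then
        echo "hpc"
    elif [ "$secs" -le $((120 * 3600)) ]; then
        echo "hpc_longserial"
    else
        echo "contact support"
    fi
}
```

For example, classify_serial_walltime 73:00:00 prints hpc_longserial, which matches the qsub example above.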

Specific HPC Queues/Classes

In addition to the hpc and app queues, which are available to all users, MOAB/Torque manages other queues that are available only to specific users and departments.

Hpc_interactive Queue

The hpc_interactive queue is accessed by running qrsh from a terminal window (on the app nodes or one of our login nodes). It is served by a node dedicated to interactive jobs only. This node is identical to the other nodes in the default hpc queue.

Dyna Queue

The dyna queue consists of 12 HP SL390 nodes, which cater for the Chemistry users. The dyna queue does not have a limit on the number of cores, walltime or resources used by any of its authorised users.

TopOpt Queue

The topopt queue consists of 25 HP SL390s nodes, which cater for the TopOpt group within the MEK department. The topopt queue does not have a limit on the number of cores, walltime or resources used by any of its authorised users.

Fotonano Queue

The fotonano queue consists of 18 HP ProLiant SL230s Gen8 nodes and a mixture of Dell PowerEdge SC1435 and HP ProLiant DL165 G7 nodes. These nodes cater for all photonic and nanotech department users. The fotonano queue currently has the following limits on the number of cores, walltime and resources used by any of its authorised users:

Definition                 Value
default walltime           48 hours
max walltime               5 days
max cores/job              160
max cores/queue            256…360 (depending on load)
max eligible jobs/queue    16