To connect to our setup, ssh into login1.hpc.dtu.dk, login2.hpc.dtu.dk, login1.gbar.dtu.dk, or login2.gbar.dtu.dk, or open a terminal from within ThinLinc.
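For example, from a local terminal (user123 below is a placeholder for your own DTU username):
ssh user123@login1.hpc.dtu.dk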
Available GPUs
The following NVIDIA GPUs are currently available as part of the DCC managed HPC clusters:
| # GPUs | Name | Year | Architecture | CUDA cap. | CUDA cores | Clock (MHz, base / boost) | Mem (GiB) | SP peak (GFlops, base / boost) | DP peak (GFlops, base / boost) | Peak mem. bandwidth (GB/s) |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | Tesla K40c | 2013 | GK110B (Kepler) | 3.5 | 2880 | 745 / 875 | 11.17 | 4291 / 5040 | 1430 / 1680 | 288 |
| 8 | Tesla K80c (dual) | 2014 | GK210 (Kepler) | 3.7 | 2496 | 562 / 875 | 11.17 | 2796 / 4368 | 932 / 1456 | 240 |
| 8 | *TITAN X | 2016 | GP102 (Pascal) | 6.1 | 3584 | 1417 / 1531 | 11.90 | 10157 / 10974 | 317.4 / 342.9 | 480 |
| 22 | Tesla V100 | 2017 | GV100 (Volta) | 7.0 | 5120 | 1380 | 15.75 | 14131 | 7065 | 898 |
| 12 | Tesla V100-SXM2 | 2018 | GV100 (Volta) | 7.0 | 5120 | 1530 | 31.72 | 15667 | 7833 | 898 |
| 6 | Tesla A100-PCIE | 2020 | GA100 (Ampere) | 8.0 | 6912 | 1410 | 39.59 | 19492 | 9746 | 1555 |
| - | Tesla H100-PCIE | 2022 | GH100 (Hopper) | 9.0 | 7296 | 1755 | 79.18 | 51200 | 25600 | 2048 |
| - | Tesla H100-SXM5 | 2022 | GH100 (Hopper) | 9.0 | 8448 | 1980 | 79.18 | 66900 | 33500 | 3352 |
*Please note that the TITAN X consumer GPUs do not support ECC (error-correcting code) memory.
Running interactively on GPUs
There are currently three kinds of nodes available for running interactive jobs on NVIDIA GPUs: Tesla V100 and Tesla V100-SXM2, both based on the Volta architecture, and Tesla A100, based on the Ampere architecture. To run interactively on a Tesla V100 node, you can use the command
voltash
This node has 2 NVIDIA Volta V100 GPUs, each with 16 GB of memory.
To run interactively on a Tesla V100-SXM2 node, you can use the command
sxm2sh
This node has 4 NVIDIA Volta V100 GPUs, each with 32 GB of memory.
To run interactively on a Tesla A100 node, you can use the command
a100sh
This node has 2 NVIDIA A100 GPUs, each with 40 GB of memory.
Please note that multiple users are allowed on these nodes, and all users can access all the GPUs on the node. We have set the GPUs to the “Exclusive process” compute mode, which means that you will encounter a “device not available” (or similar) error if someone else is already using the GPU you are trying to access.
To avoid too many conflicts, we ask you to follow this code of conduct:
- Please monitor which GPUs are currently occupied using the command
nvidia-smi
and predominantly select unoccupied GPUs (e.g., using cudaSetDevice(); see the sketch after this list) for your application.
- If you need to run on all CPU cores, e.g., for performance profiling, please make sure that you are not disturbing other users.
- We kindly ask you to use the interactive nodes mainly for development, profiling, and short test jobs.
- Please submit ‘heavy’ jobs into the gpu queue (see the next section) and do not run long, resource-intensive jobs on the interactive nodes.
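As a minimal sketch of this workflow (the query fields and the chosen GPU index below are only examples), you can list per-GPU utilization with nvidia-smi and then expose a single idle GPU to your application through the CUDA_VISIBLE_DEVICES environment variable, as an alternative to calling cudaSetDevice() in your code:
nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used --format=csv   # show which GPUs are busy
export CUDA_VISIBLE_DEVICES=1   # example: expose only GPU 1 (pick an idle index from the output above)
./my_gpu_application   # placeholder for your own executable; it will see the selected GPU as device 0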
If you have further questions or issues using the GPUs, please write to support@hpc.dtu.dk.
Requesting GPUs under LSF10 for non-interactive use
For submitting jobs into the LSF10 setup, please follow these instructions:
Using GPUs under LSF10
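As an illustrative sketch only (the queue name gpuv100, the resource amounts, and the executable name below are assumptions rather than verified settings; the linked instructions are authoritative), a GPU batch job script could look roughly like this:
#!/bin/sh
#BSUB -q gpuv100   # GPU queue name (assumption; check the linked instructions)
#BSUB -J mygpujob   # job name
#BSUB -n 4   # number of CPU cores
#BSUB -R "span[hosts=1]"   # place all cores on a single host
#BSUB -gpu "num=1:mode=exclusive_process"   # request one GPU in exclusive-process mode
#BSUB -W 1:00   # wall-clock time limit (hh:mm)
#BSUB -R "rusage[mem=4GB]"   # memory per CPU core
#BSUB -o gpujob_%J.out   # standard output file (%J expands to the job ID)
#BSUB -e gpujob_%J.err   # standard error file
nvidia-smi   # log which GPU the job was assigned
./my_gpu_application   # placeholder for your own executable
Such a script would then be submitted with
bsub < jobscript.sh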
If you have further questions or issues using the GPUs, please write to support@hpc.dtu.dk.