To connect to our setup, ssh into login1.hpc.dtu.dk, login2.hpc.dtu.dk, login1.gbar.dtu.dk, or login2.gbar.dtu.dk, or open a terminal from within ThinLinc.
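For example, from a local terminal (user123 below is a placeholder for your own DTU username):
ssh user123@login1.hpc.dtu.dk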
Available GPUs
The following NVIDIA GPUs are currently available as part of the DCC managed HPC clusters:
| # GPUs | Name | Year | Architecture | CUDA cap. | CUDA cores | Clock (MHz, base / boost) | Mem (GiB) | SP peak (GFlops, base / boost) | DP peak (GFlops, base / boost) | Peak mem. bandwidth (GB/s) |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | Tesla K40c | 2013 | GK110B (Kepler) | 3.5 | 2880 | 745 / 875 | 11.17 | 4291 / 5040 | 1430 / 1680 | 288 |
| 8 | Tesla K80c (dual) | 2014 | GK210 (Kepler) | 3.7 | 2496 | 562 / 875 | 11.17 | 2796 / 4368 | 932 / 1456 | 240 |
| 8 | *TITAN X | 2016 | GP102 (Pascal) | 6.1 | 3584 | 1417 / 1531 | 11.90 | 10157 / 10974 | 317.4 / 342.9 | 480 |
| 22 | Tesla V100 | 2017 | GV100 (Volta) | 7.0 | 5120 | 1380 | 15.75 | 14131 | 7065 | 898 |
| 12 | Tesla V100-SXM2 | 2018 | GV100 (Volta) | 7.0 | 5120 | 1530 | 31.72 | 15667 | 7833 | 898 |
| 6 | Tesla A100-PCIE | 2020 | GA100 (Ampere) | 8.0 | 6912 | 1410 | 39.59 | 19492 | 9746 | 1555 |
| - | Tesla H100-PCIE | 2022 | GH100 (Hopper) | 9.0 | 7296 | 1755 | 79.18 | 51200 | 25600 | 2048 |
| - | Tesla H100-SXM5 | 2022 | GH100 (Hopper) | 9.0 | 8448 | 1980 | 79.18 | 66900 | 33500 | 3352 |
*Please note that the TITAN X consumer GPUs do not support ECC (error-correcting code) memory.
Running interactively on GPUs
There are currently three kinds of nodes available for running interactive jobs on NVIDIA GPUs: Tesla V100 and Tesla V100-SXM2, both based on the Volta architecture, and Tesla A100, based on the Ampere architecture. To run interactively on a Tesla V100 node, you can use the command
voltash
This node has 2 NVIDIA Volta V100 GPUs, each with 16 GB of memory.
To run interactively on a Tesla V100-SXM2 node, you can use the command
sxm2sh
This node has 4 NVIDIA Volta V100 GPUs, each with 32 GB of memory.
To run interactively on a Tesla A100 node, you can use the command
a100sh
This node has 2 NVIDIA A100 GPUs, each with 40 GB of memory.
Please note that multiple users are allowed on these nodes, and all users can access all the GPUs on the node. We have set the GPUs to the “Exclusive process” compute mode, which means that you will encounter a “device not available” (or similar) error if someone else is already using the GPU you are trying to access.
To avoid too many conflicts, we ask you to follow this code of conduct:
- Please monitor which GPUs are currently occupied using the command
nvidia-smi
and predominantly select unoccupied GPUs (e.g., using cudaSetDevice(); see the sketch after this list) for your application.
- If you need to run on all CPU cores, e.g., for performance profiling, please make sure that you are not disturbing other users.
- We kindly ask you to use the interactive nodes mainly for development, profiling, and short test jobs.
- Please submit ‘heavy’ jobs into the gpu queue (see the next section) and do not run long, resource-intensive jobs on the interactive nodes.
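As a minimal sketch of this workflow (the query fields and the chosen GPU index below are only examples), you can list per-GPU utilization with nvidia-smi and then expose a single idle GPU to your application through the CUDA_VISIBLE_DEVICES environment variable, as an alternative to calling cudaSetDevice() in your code:
nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used --format=csv   # show which GPUs are busy
export CUDA_VISIBLE_DEVICES=1   # example: expose only GPU 1 (pick an idle index from the output above)
./my_gpu_application   # placeholder for your own executable; it will see the selected GPU as device 0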
If you have further questions or issues using the GPUs, please write to support@hpc.dtu.dk.
Requesting GPUs under LSF10 for non-interactive use
For submitting jobs into the LSF10 setup, please follow these instructions:
Using GPUs under LSF10
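As an illustrative sketch only (the queue name gpuv100, the resource amounts, and the executable name below are assumptions rather than verified settings; the linked instructions are authoritative), a GPU batch job script could look roughly like this:
#!/bin/sh
#BSUB -q gpuv100   # GPU queue name (assumption; check the linked instructions)
#BSUB -J mygpujob   # job name
#BSUB -n 4   # number of CPU cores
#BSUB -R "span[hosts=1]"   # place all cores on a single host
#BSUB -gpu "num=1:mode=exclusive_process"   # request one GPU in exclusive-process mode
#BSUB -W 1:00   # wall-clock time limit (hh:mm)
#BSUB -R "rusage[mem=4GB]"   # memory per CPU core
#BSUB -o gpujob_%J.out   # standard output file (%J expands to the job ID)
#BSUB -e gpujob_%J.err   # standard error file
nvidia-smi   # log which GPU the job was assigned
./my_gpu_application   # placeholder for your own executable
Such a script would then be submitted with
bsub < jobscript.sh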
If you have further questions or issues using the GPUs, please write to support@hpc.dtu.dk.