Docker CUDA Runtime
This post documents the rough steps for using CUDA inside Docker.
The starting point: nvidia-smi works on the host, but Docker does not recognize the nvidia runtime.
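One way to confirm that starting point is to ask the Docker daemon which runtimes it knows about. The sketch below is mine, not part of the original session: the function names has_nvidia_runtime and report_docker_runtime are made up for illustration, and it assumes the runtime would be registered under the name "nvidia".

```shell
#!/bin/sh
# Succeeds if the runtime list read from stdin contains a runtime named "nvidia".
has_nvidia_runtime() {
    grep -q '"nvidia"'
}

# Ask the daemon for its runtimes as JSON and report whether nvidia is among them.
report_docker_runtime() {
    if docker info --format '{{json .Runtimes}}' 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered with Docker"
    else
        echo "nvidia runtime is not registered with Docker (or the daemon is unreachable)"
    fi
}

# Only query the daemon when the docker CLI is actually installed.
if command -v docker >/dev/null 2>&1; then
    report_docker_runtime
fi
```

Before the installation this should report that the runtime is not registered. Running it again after the configuration step is a quick way to see whether anything changed.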
The transcript below follows the NVIDIA Container Toolkit installation instructions.
root@Moebels:/home/sampsa# curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
&& \
sudo apt-get update
deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /
#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/experimental/deb/$(ARCH) /
Hit:1 http://fi.archive.ubuntu.com/ubuntu jammy InRelease
Hit:2 http://fi.archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:3 http://fi.archive.ubuntu.com/ubuntu jammy-backports InRelease
Get:4 https://nvidia.github.io/libnvidia-container/stable/deb/amd64 InRelease [1 477 B]
Hit:5 https://download.docker.com/linux/ubuntu jammy InRelease
Hit:6 http://packages.microsoft.com/repos/code stable InRelease
Hit:7 https://dl.google.com/linux/chrome/deb stable InRelease
Hit:8 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 InRelease
Hit:9 http://security.ubuntu.com/ubuntu jammy-security InRelease
Get:10 https://nvidia.github.io/libnvidia-container/stable/deb/amd64 Packages [5 376 B]
Fetched 6 853 B in 0s (19,3 kB/s)
Reading package lists... Done
root@Moebels:/home/sampsa# sudo apt-get install -y nvidia-container-toolkit
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
cuda-cccl-12-3 cuda-command-line-tools-12-3 cuda-compiler-12-3 cuda-crt-12-3 cuda-cudart-12-3 cuda-cudart-dev-12-3
cuda-cuobjdump-12-3 cuda-cupti-12-3 cuda-cupti-dev-12-3 cuda-cuxxfilt-12-3 cuda-documentation-12-3 cuda-driver-dev-12-3
cuda-gdb-12-3 cuda-libraries-12-3 cuda-libraries-dev-12-3 cuda-nsight-12-3 cuda-nsight-compute-12-3 cuda-nsight-systems-12-3
cuda-nvcc-12-3 cuda-nvdisasm-12-3 cuda-nvml-dev-12-3 cuda-nvprof-12-3 cuda-nvprune-12-3 cuda-nvrtc-12-3 cuda-nvrtc-dev-12-3
cuda-nvtx-12-3 cuda-nvvm-12-3 cuda-nvvp-12-3 cuda-opencl-12-3 cuda-opencl-dev-12-3 cuda-profiler-api-12-3 cuda-sanitizer-12-3
cuda-toolkit-12-3 cuda-tools-12-3 cuda-visual-tools-12-3 libaccinj64-11.5 libcub-dev libcublas-12-3 libcublas-dev-12-3
libcublas11 libcublaslt11 libcudart11.0 libcufft-12-3 libcufft-dev-12-3 libcufft10 libcufftw10 libcupti-dev libcupti-doc
libcupti11.5 libcurand-12-3 libcurand-dev-12-3 libcurand10 libcusolver-12-3 libcusolver-dev-12-3 libcusolver11 libcusolvermg11
libcusparse-12-3 libcusparse-dev-12-3 libcusparse11 libegl-dev libgl-dev libgl1-mesa-dev libgles-dev libgles1 libglvnd-core-dev
libglvnd-dev libglx-dev libnpp-12-3 libnpp-dev-12-3 libnppc11 libnppial11 libnppicc11 libnppidei11 libnppif11 libnppig11
libnppim11 libnppist11 libnppisu11 libnppitc11 libnpps11 libnvblas11 libnvidia-egl-wayland1 libnvjitlink-12-3
libnvjitlink-dev-12-3 libnvjpeg-12-3 libnvjpeg-dev-12-3 libnvjpeg11 libnvrtc-builtins11.5 libnvrtc11.2 libnvtoolsext1 libnvvm4
libopengl-dev libpcre2-16-0 libpthread-stubs0-dev libqt5core5a libqt5dbus5 libqt5network5 libtbb-dev libthrust-dev libvdpau-dev
libx11-dev libxau-dev libxcb1-dev libxdmcp-dev node-html5shiv nsight-compute nsight-compute-2023.1.0 nsight-compute-2023.3.0
nsight-compute-target nsight-systems-2023.3.3 nvidia-cuda-gdb nvidia-cuda-toolkit-doc nvidia-gds-12-1 nvidia-opencl-dev
ocl-icd-opencl-dev opencl-c-headers opencl-clhpp-headers qttranslations5-l10n x11proto-dev xorg-sgml-doctools xtrans-dev
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit-base
The following NEW packages will be installed:
libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit nvidia-container-toolkit-base
0 upgraded, 4 newly installed, 0 to remove and 75 not upgraded.
Need to get 4 199 kB of archives.
After this operation, 16,6 MB of additional disk space will be used.
Get:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 libnvidia-container1 1.14.3-1 [924 kB]
Get:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 libnvidia-container-tools 1.14.3-1 [20,6 kB]
Get:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 nvidia-container-toolkit-base 1.14.3-1 [2 337 kB]
Get:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 nvidia-container-toolkit 1.14.3-1 [918 kB]
Fetched 4 199 kB in 0s (10,2 MB/s)
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 323778 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.14.3-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.14.3-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.14.3-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.14.3-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.14.3-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.14.3-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.14.3-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.14.3-1) ...
Setting up nvidia-container-toolkit-base (1.14.3-1) ...
Setting up libnvidia-container1:amd64 (1.14.3-1) ...
Setting up libnvidia-container-tools (1.14.3-1) ...
Setting up nvidia-container-toolkit (1.14.3-1) ...
Processing triggers for libc-bin (2.35-0ubuntu3.4) ...
root@Moebels:/home/sampsa# sudo nvidia-ctk runtime configure --runtime=docker~
ERRO[0000] unrecognized runtime 'docker~'
root@Moebels:/home/sampsa# sudo nvidia-ctk runtime configure --runtime=docker
INFO[0000] Config file does not exist; using empty config
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.
root@Moebels:/home/sampsa# sudo systemctl restart docker
root@Moebels:/home/sampsa# sudo nvidia-ctk runtime configure --runtime=containerd
INFO[0000] Loading config from /etc/containerd/config.toml
INFO[0000] Wrote updated config to /etc/containerd/config.toml
INFO[0000] It is recommended that containerd daemon be restarted.
root@Moebels:/home/sampsa# sudo systemctl restart containerd
root@Moebels:/home/sampsa# sudo nvidia-ctk runtime configure --runtime=crio
INFO[0000] Loading config: /etc/crio/crio.conf
INFO[0000] Config file does not exist; using empty config
INFO[0000] Successfully loaded config
INFO[0000] Wrote updated config to /etc/crio/crio.conf
INFO[0000] It is recommended that crio daemon be restarted.
root@Moebels:/home/sampsa# sudo systemctl restart crio
Failed to restart crio.service: Unit crio.service not found.
root@Moebels:/home/sampsa#
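The transcript shows nvidia-ctk writing /etc/docker/daemon.json. A small check like the one below can confirm that the file really contains a runtime entry. This is a rough sketch: the function name daemon_has_nvidia is hypothetical, and it assumes the entry points at a binary called nvidia-container-runtime, which is what I would expect nvidia-ctk to write by default.

```shell
#!/bin/sh
# Succeeds if the given Docker daemon config mentions the nvidia runtime binary.
# The path defaults to the file named in the nvidia-ctk output.
daemon_has_nvidia() {
    config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] && grep -q 'nvidia-container-runtime' "$config"
}

if daemon_has_nvidia; then
    echo "daemon.json registers the nvidia runtime"
else
    echo "daemon.json is missing, unreadable, or has no nvidia runtime entry"
fi
```

A plain grep does not validate the JSON itself. It only checks that the expected string is present, which is enough for a quick sanity check after the restart.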
With the runtime configured, Docker can now run containers using the nvidia runtime:
docker run --runtime=nvidia --rm nvidia/cuda:11.1.1-runtime-ubuntu20.04 nvidia-smi
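The command above can be wrapped into a small smoke test that checks whether the container output actually looks like nvidia-smi. This is only a sketch under my own assumptions: the names looks_like_smi and gpu_smoke_test are invented here, and the check relies on the nvidia-smi table header containing the text "NVIDIA-SMI".

```shell
#!/bin/sh
# Succeeds if the text on stdin looks like nvidia-smi's status table.
looks_like_smi() {
    grep -q 'NVIDIA-SMI'
}

# Run nvidia-smi in a container with the nvidia runtime and report the result.
# The default image is the one used in the command above; pass another tag to override.
gpu_smoke_test() {
    image="${1:-nvidia/cuda:11.1.1-runtime-ubuntu20.04}"
    if docker run --rm --runtime=nvidia "$image" nvidia-smi 2>/dev/null | looks_like_smi; then
        echo "GPU is visible inside the container"
    else
        echo "GPU check failed for $image"
        return 1
    fi
}
```

After sourcing the file, calling gpu_smoke_test without arguments repeats the run shown above. Nothing touches Docker until the function is called, so sourcing it is safe on a machine without a GPU.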