How to Run Containers with GPU Support

In this short tutorial I will explain how to run a Docker container with GPU support. This becomes important if you want to run AI applications such as the Imixs-AI project within a local Docker environment or in a Kubernetes cluster. The tutorial is based on Debian 12 (Bookworm) and an NVIDIA graphics card. I assume that you have already installed Debian and the Docker engine. Read the official install guide on how to install the Docker engine on Debian.
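
As a minimal sketch, Docker also provides a convenience script for a quick setup (the repository-based install described in the official guide is the more controlled option):

$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sudo sh get-docker.sh
# quick smoke test
$ sudo docker run hello-world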

Install NVIDIA Driver

To run containers with GPU support you of course first need a graphics card installed in your server. In the following section I show how to install the proprietary NVIDIA drivers together with the CUDA Toolkit on Debian Bookworm. The CUDA Toolkit includes GPU-accelerated libraries to run high-performance applications. The CUDA framework is widely used, especially in combination with AI projects like llama.cpp. There are also open source drivers available, but we have not tested them.

To install the NVIDIA driver you first need to update your repo sources:

$ sudo apt update
$ sudo apt upgrade
# Remove previously installed NVIDIA drivers (optional)
$ sudo apt autoremove nvidia* --purge
# Enable Contrib and Non-Free Repositories on Debian
$ sudo apt install software-properties-common -y
$ sudo add-apt-repository contrib non-free-firmware
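
To verify that the additional components were enabled, you can simply grep your APT sources (assuming the standard file locations):

$ grep -r -E 'contrib|non-free' /etc/apt/sources.list /etc/apt/sources.list.d/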

Next import the NVIDIA APT repository. This repo allows access to additional NVIDIA tools like nvidia-smi:

$ sudo apt install dirmngr ca-certificates software-properties-common apt-transport-https dkms curl -y
$ curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/3bf863cc.pub | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-drivers.gpg
$ echo 'deb [signed-by=/usr/share/keyrings/nvidia-drivers.gpg] https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/ /' | sudo tee /etc/apt/sources.list.d/nvidia-drivers.list
$ sudo apt update
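
You can verify that the new repository is picked up by checking where APT would fetch the driver packages from:

$ apt-cache policy nvidia-driver cuda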

Finally install the NVIDIA drivers on Debian via the default APT repository. Here I assume you are running a 64-bit system.

$ sudo apt update
$ sudo apt install linux-headers-amd64 nvidia-detect

Now you can check the NVIDIA support on your server with the nvidia-detect command. This will give you hints about the hardware status and how to install the driver:

$ nvidia-detect
Detected NVIDIA GPUs:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)

Checking card:  NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
Your card is supported by all driver versions.
Your card is also supported by the Tesla drivers series.
Your card is also supported by the Tesla 470 drivers series.
It is recommended to install the
    nvidia-driver
package.

The output reveals that the machine features a GeForce GTX 1080 card and recommends installing the nvidia-driver package. Now you can finally install the recommended package together with the CUDA toolkit.

$ sudo apt install nvidia-driver nvidia-smi linux-image-amd64 cuda

In addition I install the NVIDIA System Management Interface (nvidia-smi), which allows you to verify your installation. Finally, reboot your system.

Note: It may happen that you need a hard reset of your machine, for example in case of conflicts with existing drivers.
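
A common source of such driver conflicts is the open source nouveau driver. The following is a sketch of how to blacklist it before the reboot (assuming the standard modprobe.d location):

# disable the open source nouveau driver
$ printf 'blacklist nouveau\noptions nouveau modeset=0\n' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
$ sudo update-initramfs -u
$ sudo reboot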

Verify Installation

To verify your installation, run nvidia-smi, which shows some details about your environment:

$ nvidia-smi
Sun Mar 31 10:46:20 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1080        Off |   00000000:01:00.0 Off |                  N/A |
| 36%   42C    P0             39W /  180W |       0MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
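
To check the CUDA toolkit itself you can query the compiler version. Note that the NVIDIA packages usually install it under /usr/local/cuda, which may not be on your PATH by default:

$ /usr/local/cuda/bin/nvcc --version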

Configuring Docker with GPU Support

In addition, it is necessary to install the ‘NVIDIA Container Toolkit’, which adds GPU support to your Docker engine. (This step is not necessary for Kubernetes – see the section below.)

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey |sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
&& sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit    
$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
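
The nvidia-ctk command above registers the NVIDIA runtime in /etc/docker/daemon.json. You can check that the Docker engine picked it up with:

$ docker info | grep -i runtimes
$ cat /etc/docker/daemon.json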

Test your setup with

$ nvidia-container-cli -k -d /dev/tty info

This should not show any errors. Now you can start a test container to verify your new Docker GPU support:

$ docker run --gpus all nvidia/cuda:12.3.1-base-ubuntu20.04 nvidia-smi

This test container should just show the nvidia-smi output from above.
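
If your server has more than one GPU, you can also pin a container to a specific device instead of using --gpus all (a sketch, using the same CUDA test image):

$ docker run --rm --gpus device=0 nvidia/cuda:12.3.1-base-ubuntu20.04 nvidia-smi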

That’s it! Now you have a local Docker Environment with GPU Support.

Kubernetes Integration

If you plan to run your containers in a Kubernetes cluster, things become even easier. This is surprising since Kubernetes typically has a steep learning curve. But NVIDIA did a great job with its so-called NVIDIA GPU Operator.

To install the operator into your Kubernetes cluster, you first need to install the CLI tool Helm. If it is not yet installed on your Kubernetes master node, run:

$ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
    && chmod 700 get_helm.sh \
    && ./get_helm.sh

Next add the NVIDIA Helm repository:

$ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
    && helm repo update

The operator will add GPU support to all worker nodes in your cluster. To exclude a worker node, this node has to be labeled with feature.node.kubernetes.io/pci-10de.present=false. You can check the current labels of your nodes with:

$ kubectl get nodes --show-labels=true

and add the exclusion label for a specific node with:

$ kubectl label nodes $NODE nvidia.com/gpu.deploy.operands=false

Where `$NODE` is the name of the worker node to be labeled.
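
To double check which nodes carry the exclusion label, you can filter the node list by that label:

$ kubectl get nodes -l nvidia.com/gpu.deploy.operands=false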

Install The GPU Operator

Now you can install the GPU Operator into your cluster. There are different scenarios for installing the NVIDIA GPU Operator. To install the operator including the installation of the NVIDIA driver, run:

helm install --wait --generate-name \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator 

The installation of all components in your cluster may take some time, so don't be too impatient…
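
You can watch the progress of the rollout with:

$ kubectl get pods -n gpu-operator --watch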

If you already have the NVIDIA driver installed on your worker nodes, you can run the Helm chart without the operator installing the driver:

helm install --wait --generate-name \
     -n gpu-operator --create-namespace \
     nvidia/gpu-operator \
     --set driver.enabled=false

If something goes wrong with the installation you can delete the operator with:

$ helm delete -n gpu-operator $(helm list -n gpu-operator | grep gpu-operator | awk '{print $1}')

That’s it.

To define a deployment using GPU support, you simply need to add the resource limit nvidia.com/gpu: 1.

This is an example of a deployment for the Imixs-AI-CPP container with GPU support in a Kubernetes cluster:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: imixs-ai-llama-cpp-gpu
  labels: 
    app: imixs-ai-llama-cpp-gpu

spec:
  replicas: 1
  selector:
    matchLabels:
      app: imixs-ai-llama-cpp-gpu
  template:
    metadata:
      labels:
        app: imixs-ai-llama-cpp-gpu
    spec:
      containers:
      - name: imixs-ai-llama-cpp-gpu
        image: imixs/imixs-ai-llama-cpp-gpu:latest
        imagePullPolicy: Always
        env:
        - name: PYTHONUNBUFFERED
          value: "1"        
        - name: USE_MLOCK
          value: "true"   
        ports: 
          - name: web
            containerPort: 8000
        resources:
          limits:
            nvidia.com/gpu: 1
        volumeMounts:
        - mountPath: /models
          name: imixs-ai-models
      restartPolicy: Always
      volumes:
      - name: imixs-ai-models
        persistentVolumeClaim:
          claimName: imixs-ai-models
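
Once the deployment is running, you can verify that the GPU resource was actually allocated. Note that the exec example assumes that nvidia-smi is available inside the container image:

# check the GPU capacity reported by the worker node
$ kubectl describe node $NODE | grep nvidia.com/gpu
# check GPU visibility from inside the running container
$ kubectl exec -it deploy/imixs-ai-llama-cpp-gpu -- nvidia-smi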
