This post walks through the use of GPGPUs with Kubernetes and DevicePlugins. We’ll use MicroK8s for a developer workstation example and Charmed Kubernetes for a cluster, since that’s a consistent multi-cloud Kubernetes approach. The various cloud CAAS offerings like GKE are also enabling GPGPU facilities, so you may want to try those too.
We’ll use Ubuntu as the OS because the underlying enablement for GPGPUs ‘Just Works’ in all the clouds and with all the local hardware, and building Docker images on Ubuntu ensures that the CUDA libraries line up with the drivers properly.
In order for this all to work, the correct (and matching) driver needs to be installed on the worker node to make the device accessible from the OS, and typically some userland libraries are required as well. With NVIDIA GPUs this enablement further depends on using the right Docker runtime (nvidia-docker2), which requires additional host-level configuration and post-deployment installation.
All of that is automated on Ubuntu with MicroK8s and the Charmed Kubernetes charms, across all the public clouds where GPUs are available. It’s also currently activated in GKE; other cloud CAAS offerings will follow.
Workstation GPGPU containers with MicroK8s
MicroK8s is a snap of upstream Kubernetes that is designed for development purposes. It’s not a cluster, but it gives you a small zero-ops Kubernetes environment that is compatible with all the major multi-cloud K8s offerings. For our purposes the important thing is that it includes GPGPU enablement in the box.
To install MicroK8s:
$ snap install microk8s --classic
This will give you the latest stable version of MicroK8s which tracks upstream releases closely.
You can select a particular version using snap channels; see ‘snap info microk8s’ for the available tracks. By selecting a particular track you can lock yourself to a particular version of Kubernetes. By default you will be on the ‘latest’ track, and get upgrades when upstream Kubernetes releases a new stable version. Select a particular track with --channel=track/stability from the available channels. ‘Stable’ maps to ‘latest/stable’.
$ snap info microk8s
[...]
channels:
  stable:         v1.13.0  (340) 204MB classic
  candidate:      v1.13.0  (340) 204MB classic
  beta:           v1.13.0  (340) 204MB classic
  edge:           v1.13.0  (340) 204MB classic
  1.13/stable:    v1.13.0  (340) 204MB classic
  1.13/candidate: v1.13.0  (340) 204MB classic
  1.13/beta:      v1.13.0  (340) 204MB classic
  1.13/edge:      v1.13.0  (341) 204MB classic
  1.12/stable:    v1.12.3  (336) 226MB classic
  1.12/candidate: v1.12.3  (336) 226MB classic
  1.12/beta:      v1.12.3  (336) 226MB classic
  1.12/edge:      v1.12.3  (336) 226MB classic
  1.11/stable:    v1.11.5  (322) 219MB classic
  1.11/candidate: v1.11.5  (322) 219MB classic
  1.11/beta:      v1.11.5  (322) 219MB classic
  1.11/edge:      v1.11.5  (322) 219MB classic
  1.10/stable:    v1.10.11 (321) 175MB classic
  1.10/candidate: v1.10.11 (321) 175MB classic
  1.10/beta:      v1.10.11 (321) 175MB classic
  1.10/edge:      v1.10.11 (321) 175MB classic
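If you script workstation setup, pinning to one of the tracks listed above can be wrapped in a small helper. The following is only a sketch: the function name snap_install_command and its parameters are made up for illustration, and it merely assembles the command line rather than running it.

```python
import shlex

# Risk levels used by snap channels, as seen in the listing above.
RISKS = ("stable", "candidate", "beta", "edge")


def snap_install_command(snap, track=None, risk="stable", classic=True):
    """Build the argv for `snap install`, optionally pinned to track/risk.

    Hypothetical helper for illustration; it does not execute anything.
    """
    if risk not in RISKS:
        raise ValueError(f"unknown risk level: {risk!r}")
    cmd = ["snap", "install", snap]
    if classic:
        cmd.append("--classic")
    if track is not None:
        # e.g. --channel=1.12/stable locks the install to the 1.12 track
        cmd.append(f"--channel={track}/{risk}")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(snap_install_command("microk8s", track="1.12")))
```

Leaving track unset keeps the default behaviour described above, where you follow the ‘latest’ track.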
Assuming you have an NVIDIA GPU with a current driver installed, you can activate Kubernetes support for it with the “enable” subcommand:
$ microk8s.enable gpu
You can confirm that the GPU is available to MicroK8s with this command:
$ microk8s.status
microk8s is running
addons:
gpu: enabled
storage: disabled
registry: disabled
ingress: disabled
dns: disabled
metrics-server: disabled
istio: disabled
dashboard: disabled
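If you want to check the addon state from a script rather than by eye, the status text can be parsed. This is a minimal sketch that assumes the plain "name: enabled/disabled" layout printed above; other MicroK8s releases may format the output differently, and parse_addons is a name invented for this example.

```python
def parse_addons(status_text):
    """Return {addon_name: bool} from `microk8s.status` text output.

    Assumes the layout shown in this post: an "addons:" line followed by
    one "name: enabled" or "name: disabled" line per addon.
    """
    addons = {}
    in_addons = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line == "addons:":
            in_addons = True
            continue
        if in_addons and ": " in line:
            name, state = line.split(": ", 1)
            addons[name] = state == "enabled"
    return addons


# Sample text copied from the status output in this post.
SAMPLE = """microk8s is running
addons:
gpu: enabled
storage: disabled
registry: disabled
ingress: disabled
dns: disabled
metrics-server: disabled
istio: disabled
dashboard: disabled
"""

if __name__ == "__main__":
    addons = parse_addons(SAMPLE)
    print("GPU addon enabled:", addons.get("gpu", False))
```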
Running GPGPU-accelerated containers on Kubernetes
Now that you have GPGPU capacity available to Kubernetes you can deploy containers there that get access to the special hardware they need.
Your container needs to have the right userspace pieces, so again we suggest that you build the OCI images on Ubuntu with the CUDA libraries provided; those will be most portable across all the different cloud CAAS offerings as well as offerings from Canonical, VMware, Pivotal, Cisco and others that also use Ubuntu for K8s.
Your workloads can now request GPUs through a resource limit like the one below, which steers them to appropriate worker nodes (example taken from here):
Listing 1: nvidia-pod-example.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
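When the number of GPUs or the image varies between jobs, the same manifest can be generated instead of hand-edited. The sketch below rebuilds Listing 1 as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The helper name gpu_pod_manifest is an assumption made for this example, not part of any Kubernetes tooling.

```python
import json


def gpu_pod_manifest(name, image, gpus=1):
    """Build a Pod manifest that requests `gpus` NVIDIA GPUs via limits.

    Hypothetical helper mirroring Listing 1. GPUs are requested in whole
    units, so the count must be a positive integer.
    """
    if not isinstance(gpus, int) or gpus < 1:
        raise ValueError("gpus must be a positive integer")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # The extended resource advertised by the device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest(
        "cuda-vector-add", "k8s.gcr.io/cuda-vector-add:v0.1"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into kubectl apply -f - (or microk8s.kubectl apply -f - on a MicroK8s workstation).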
Kubernetes cluster deployment with GPGPUs
A compelling feature of the Charmed Distribution of Kubernetes (CDK) is that it will automatically enable GPGPU resources which are present on the worker node for use by K8s pods.
GPU resources are enabled through the use of Device Plugins which are deployed as DaemonSets. This ensures that each GPU-enabled worker node is allowed access to the GPU and sets the right paths to the driver plugins on the host.
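With the DaemonSet deployed, each GPU-enabled worker node advertises nvidia.com/gpu as a schedulable resource, and the Kubernetes scheduler places pods that request it only on nodes with GPUs available. A nodeSelector on node labels can narrow the worker node candidates further when scheduling workloads.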
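To see which nodes the scheduler can consider for GPU workloads, you can inspect the allocatable resources that each node reports. The following sketch filters the JSON produced by kubectl get nodes -o json. The function name gpu_nodes and the node names in the sample are invented for illustration.

```python
def gpu_nodes(node_list, resource="nvidia.com/gpu"):
    """Return {node_name: gpu_count} for nodes advertising `resource`.

    `node_list` is the parsed output of `kubectl get nodes -o json`.
    Allocatable quantities are reported as strings, hence the int().
    """
    result = {}
    for node in node_list.get("items", []):
        allocatable = node.get("status", {}).get("allocatable", {})
        count = int(allocatable.get(resource, "0"))
        if count > 0:
            result[node["metadata"]["name"]] = count
    return result


# Trimmed, hypothetical sample of `kubectl get nodes -o json` output.
SAMPLE_NODES = {
    "items": [
        {
            "metadata": {"name": "worker-gpu-0"},
            "status": {"allocatable": {"cpu": "4", "nvidia.com/gpu": "1"}},
        },
        {
            "metadata": {"name": "worker-cpu-0"},
            "status": {"allocatable": {"cpu": "4"}},
        },
    ]
}

if __name__ == "__main__":
    for name, count in gpu_nodes(SAMPLE_NODES).items():
        print(f"{name}: {count} GPU(s) allocatable")
```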
Charms fully automate the deployment of Kubernetes in a way that is model-driven and thus flexible for use on different kinds of cloud or cluster. We use charms successfully for HPC deployments of Kubernetes, for example, making the deployment of AI/ML pipelines on top of Kubernetes easier. GPU enablement is important for those sorts of workloads.
However, before deploying Kubeflow or similar frameworks, the Kubernetes layer needs to be fully automated and GPUs activated.
The Kubernetes charms do all the work. As worker nodes get commissioned into the model, the charms auto-detect the presence of NVIDIA hardware, install the right driver and host libraries, replace the container runtime with the NVIDIA-supported one, deploy the DaemonSet for the DevicePlugin, and label the nodes automatically.
The K8s cluster is best deployed with conjure-up, which will walk you through the entire process. You can use conjure-up on a public cloud with GPU-enabled instance types, or on MAAS for bare metal clusters with servers that contain GPUs. In both cases, the deployment process is exactly the same.
For example, you can use p2.xlarge instances on AWS. In order to make that happen, we need to pass a constraint into the conjure-up command line so that we force the usage of the GPU-enabled instance types when deploying workers.
Listing 2: cdk-gpu-worker.yaml
services:
  "kubernetes-worker":
    charm: "cs:~containers/kubernetes-worker"
    num_units: 1
    options:
      channel: 1.13/stable
    expose: true
    constraints: "instance-type=p2.xlarge root-disk=32768"
Pass this to conjure-up:
$ conjure-up canonical-kubernetes --bundle-add cdk-gpu-worker.yaml
This will launch the conjure-up wizard interface and allow you to select additional add-ons to be deployed; Kubeflow, for example, can be selected here. On the controller selection screen, you can either deploy a dedicated Juju controller (one more VM) or take advantage of JAAS, which provides Juju-as-a-service on the major public clouds.
Once the installation is kicked off, conjure-up displays a status screen.
The status can also be shown using the Juju command directly. If you use JAAS, locate the model name using:
$ juju models -c jaas
Controller: jaas

Model                         Cloud/Region   Status     Machines  Cores  Access  Last connection
conjure-canonical-kubern-9dc  aws/us-east-1  available         0      0  admin   never connected
Then, inspect the status of the model:
$ juju status -m jaas:conjure-canonical-kubern-9dc
Model                         Controller  Cloud/Region   Version  SLA          Timestamp
conjure-canonical-kubern-9dc  jaas        aws/us-east-1  2.4.5    unsupported  16:13:37-08:00

App                    Version  Status       Scale  Charm                  Store       Rev  OS      Notes
aws-integrator         1.15.71  active           1  aws-integrator         jujucharms    7  ubuntu
easyrsa                3.0.1    maintenance      1  easyrsa                jujucharms  117  ubuntu
etcd                            maintenance      3  etcd                   jujucharms  209  ubuntu
flannel                         waiting          0  flannel                jujucharms  146  ubuntu
kubeapi-load-balancer           maintenance      1  kubeapi-load-balancer  jujucharms  162  ubuntu  exposed
kubernetes-master               maintenance      2  kubernetes-master      jujucharms  219  ubuntu
kubernetes-worker               waiting        0/1  kubernetes-worker      jujucharms  239  ubuntu  exposed

Unit                      Workload     Agent       Machine  Public address  Ports  Message
aws-integrator/0*         active       idle        0        18.104.22.168          ready
easyrsa/0*                maintenance  executing   1        22.214.171.124         (install) installing charm
etcd/0*                   maintenance  executing   2        126.96.36.199          (install) installing charm
etcd/1                    maintenance  executing   3        188.8.131.52           (install) installing charm
etcd/2                    maintenance  executing   4        184.108.40.206         (install) installing charm
kubeapi-load-balancer/0*  maintenance  executing   5        220.127.116.11         (install) installing charm
kubernetes-master/0       maintenance  executing   6        18.104.22.168          (install) installing charm
kubernetes-master/1*      maintenance  executing   7        22.214.171.124         (install) installing charm
kubernetes-worker/0       waiting      allocating  8        126.96.36.199          waiting for machine

Machine  State    DNS             Inst id              Series  AZ          Message
0        started  188.8.131.52    i-083ce279733998d59  bionic  us-east-1a  running
1        started  184.108.40.206  i-04828688ddfdb0c6c  bionic  us-east-1b  running
2        started  220.127.116.11  i-03d910e892e7c09f6  bionic  us-east-1a  running
3        started  18.104.22.168   i-00adeecd668174ee0  bionic  us-east-1b  running
4        started  22.214.171.124  i-032875fd24a1c1e78  bionic  us-east-1c  running
5        started  126.96.36.199   i-0008405049b9bed6d  bionic  us-east-1d  running
6        started  188.8.131.52    i-003abf7f3612a2f18  bionic  us-east-1b  running
7        started  184.108.40.206  i-0abe01060e8179618  bionic  us-east-1a  running
8        pending  220.127.116.11  i-0d493a35776b9217d  bionic  us-east-1e  running
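Rather than re-reading the status table until everything settles, a script can look at the machine-readable form of the same data. The sketch below assumes the layout that juju status --format=json is understood to produce (applications, then units, then workload-status and its current field) and only looks at principal units; subordinates such as flannel are not covered. The function name pending_units and the trimmed sample are illustrative only.

```python
def pending_units(status):
    """Map unit name -> workload state for every unit not yet 'active'.

    `status` is the parsed output of `juju status --format=json`.
    Assumes the applications -> units -> workload-status -> current layout.
    """
    pending = {}
    for app in status.get("applications", {}).values():
        for unit_name, unit in app.get("units", {}).items():
            state = unit.get("workload-status", {}).get("current", "unknown")
            if state != "active":
                pending[unit_name] = state
    return pending


# Trimmed sample modelled on the status table shown above.
SAMPLE_STATUS = {
    "applications": {
        "aws-integrator": {
            "units": {
                "aws-integrator/0": {"workload-status": {"current": "active"}}
            }
        },
        "etcd": {
            "units": {
                "etcd/0": {"workload-status": {"current": "maintenance"}}
            }
        },
        "kubernetes-worker": {
            "units": {
                "kubernetes-worker/0": {"workload-status": {"current": "waiting"}}
            }
        },
    }
}

if __name__ == "__main__":
    for unit, state in sorted(pending_units(SAMPLE_STATUS).items()):
        print(f"{unit}: {state}")
```

An empty result means every principal unit reports an active workload, which is a reasonable signal that the deployment has settled.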
Once the installation has finished, all GPGPU resources are properly configured and available to the Kubernetes operator. You can check this with:
$ kubectl get no -o wide -L cuda,gpu
Leveraging GPGPU resources in your Kubernetes cluster is automatic and easy to do when using the Charmed Distribution of Kubernetes or MicroK8s.
What do you think? We’d love to hear about your use cases and how CDK and MicroK8s helped with your GPGPU-sensitive workloads.