This repository has been archived by the owner on May 27, 2024. It is now read-only.

Merge branch 'update-readme' into 'main'
Update RELEASE and README for v0.6.0

See merge request nvidia/kubernetes/gpu-feature-discovery!115
klueska committed Jun 6, 2022
2 parents 679c3fc + c960fcb commit 416d12c
Showing 2 changed files with 95 additions and 58 deletions.
README.md: 150 changes (94 additions, 56 deletions)
@@ -5,26 +5,27 @@

## Table of Contents

- [NVIDIA GPU feature discovery](#nvidia-gpu-feature-discovery)
* [Overview](#overview)
* [Beta Version](#beta-version)
* [Prerequisites](#prerequisites)
* [Quick Start](#quick-start)
+ [Node Feature Discovery (NFD)](#node-feature-discovery-nfd)
+ [Preparing your GPU Nodes](#preparing-your-gpu-nodes)
+ [Deploy NVIDIA GPU Feature Discovery (GFD)](#deploy-nvidia-gpu-feature-discovery-gfd)
- [Daemonset](#daemonset)
- [Job](#job)
+ [Verifying Everything Works](#verifying-everything-works)
* [The GFD Command line interface](#the-gfd-command-line-interface)
* [Generated Labels](#generated-labels)
+ [MIG 'single' strategy](#mig-single-strategy)
+ [MIG 'mixed' strategy](#mig-mixed-strategy)
* [Deployment via `helm`](#deployment-via-helm)
+ [Installing via `helm install`from the `gpu-feature-discovery` `helm` repository](#installing-via-helm-install-from-the-gpu-feature-discovery-helm-repository)
- [Overview](#overview)
- [Beta Version](#beta-version)
- [Prerequisites](#prerequisites)
- [Quick Start](#quick-start)
* [Node Feature Discovery (NFD)](#node-feature-discovery-nfd)
* [Preparing your GPU Nodes](#preparing-your-gpu-nodes)
* [Deploy NVIDIA GPU Feature Discovery (GFD)](#deploy-nvidia-gpu-feature-discovery-gfd)
+ [Daemonset](#daemonset)
+ [Job](#job)
* [Verifying Everything Works](#verifying-everything-works)
- [The GFD Command line interface](#the-gfd-command-line-interface)
- [Generated Labels](#generated-labels)
* [MIG 'single' strategy](#mig-single-strategy)
* [MIG 'mixed' strategy](#mig-mixed-strategy)
- [Deployment via `helm`](#deployment-via-helm)
+ [Installing via `helm install`](#installing-via-helm-install)
- [Deploying Standalone](#deploying-standalone)
- [Deploying as a subchart of the NVIDIA device plugin](#deploying-as-a-subchart-of-the-nvidia-device-plugin)
+ [Deploying via `helm install` with a direct URL to the `helm` package](#deploying-via-helm-install-with-a-direct-url-to-the-helm-package)
* [Building and running locally with Docker](#building-and-running-locally-with-docker)
* [Building and running locally on your native machine](#building-and-running-locally-on-your-native-machine)
- [Building and running locally with Docker](#building-and-running-locally-with-docker)
- [Building and running locally on your native machine](#building-and-running-locally-on-your-native-machine)

## Overview

@@ -74,7 +75,7 @@
The following command will deploy NFD with the minimum required set of
parameters to run `gpu-feature-discovery`.

```shell
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/gpu-feature-discovery/v0.5.0/deployments/static/nfd.yaml
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/gpu-feature-discovery/v0.6.0/deployments/static/nfd.yaml
```
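
As a quick sanity check (not part of the upstream instructions, and assuming
only standard `kubectl`), you can confirm that NFD is populating node labels
before moving on:

```shell
# NFD publishes its labels under the feature.node.kubernetes.io/ prefix.
kubectl get nodes --show-labels | tr ',' '\n' | grep "feature.node.kubernetes.io" | head
```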

**Note:** This is a simple static daemonset meant to demonstrate the basic
Expand All @@ -96,7 +97,7 @@ or as a Job.
#### Daemonset

```shell
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/gpu-feature-discovery/v0.5.0/deployments/static/gpu-feature-discovery-daemonset.yaml
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/gpu-feature-discovery/v0.6.0/deployments/static/gpu-feature-discovery-daemonset.yaml
```
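
If you want to confirm the daemonset produced labels, a convenience check like
the following works (it assumes `jq` is installed; the full set of labels is
described under "Generated Labels"):

```shell
# GFD publishes its labels under the nvidia.com/ prefix (e.g. nvidia.com/gpu.count).
kubectl get nodes -o json \
  | jq '.items[] | {node: .metadata.name, labels: (.metadata.labels | with_entries(select(.key | startswith("nvidia.com"))))}'
```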

**Note:** This is a simple static daemonset meant to demonstrate the basic
Expand All @@ -111,7 +112,7 @@ node you want to label:

```shell
$ export NODE_NAME=<your-node-name>
$ curl https://raw.githubusercontent.com/NVIDIA/gpu-feature-discovery/v0.5.0/deployments/static/gpu-feature-discovery-job.yaml.template \
$ curl https://raw.githubusercontent.com/NVIDIA/gpu-feature-discovery/v0.6.0/deployments/static/gpu-feature-discovery-job.yaml.template \
| sed "s/NODE_NAME/${NODE_NAME}/" > gpu-feature-discovery-job.yaml
$ kubectl apply -f gpu-feature-discovery-job.yaml
```
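
If you have several nodes to label this way, the same template can be applied
in a loop; the node names below are placeholders for your own:

```shell
# Hypothetical example: run the labeling Job once per node.
for NODE_NAME in node-a node-b node-c; do
    curl -s https://raw.githubusercontent.com/NVIDIA/gpu-feature-discovery/v0.6.0/deployments/static/gpu-feature-discovery-job.yaml.template \
        | sed "s/NODE_NAME/${NODE_NAME}/" \
        | kubectl apply -f -
done
```
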
@@ -267,7 +268,7 @@
The preferred method to deploy `gpu-feature-discovery` is as a daemonset using `helm`.
Instructions for installing `helm` can be found
[here](https://helm.sh/docs/intro/install/).

The `helm` chart for the latest release of GFD (`v0.5.0`) includes a number
The `helm` chart for the latest release of GFD (`v0.6.0`) includes a number
of customizable values. The most commonly overridden ones are:

```
Expand All @@ -278,7 +279,7 @@ of customizable values. The most commonly overridden ones are:
migStrategy:
pass the desired strategy for labeling MIG devices on GPUs that support it
[none | single | mixed] (default "none")
nfd.deploy:
nfd.deploy, nfd.enabled:
When set to true, deploy NFD as a subchart with all of the proper
parameters set for it (default "true")
runtimeClassName:
Expand All @@ -293,48 +294,86 @@ Please take a look in the following `values.yaml` files to see the full set of
overridable parameters for both the top-level `gpu-feature-discovery` chart and
the `node-feature-discovery` subchart.

* https://github.com/NVIDIA/gpu-feature-discovery/blob/v0.5.0/deployments/helm/gpu-feature-discovery/values.yaml
* https://github.com/NVIDIA/gpu-feature-discovery/blob/v0.5.0/deployments/helm/gpu-feature-discovery/charts/node-feature-discovery/values.yaml
* https://github.com/NVIDIA/gpu-feature-discovery/blob/v0.6.0/deployments/helm/gpu-feature-discovery/values.yaml
* https://github.com/NVIDIA/gpu-feature-discovery/blob/v0.6.0/deployments/helm/gpu-feature-discovery/charts/node-feature-discovery/values.yaml
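
If you prefer the command line to the browser, the same defaults can be dumped
locally with `helm show values` once the `nvgfd` repository from the next
section has been added:

```shell
# Write the chart's default values to a file for review or editing.
helm show values nvgfd/gpu-feature-discovery --version 0.6.0 > gfd-values.yaml
```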

#### Installing via `helm install` from the `gpu-feature-discovery` `helm` repository
#### Installing via `helm install`

The preferred method of deployment is with `helm install` via the
`gpu-feature-discovery` `helm` repository.
Starting with `v0.6.0` there are two ways to deploy `gpu-feature-discovery` via
`helm`:
1. As a standalone chart _without_ configuration file or GPU sharing support
1. As a subchart of the NVIDIA device plugin _with_ configuration file and GPU sharing support

This repository can be installed as follows:
##### Deploying Standalone

When deploying standalone, begin by setting up GFD's `helm` repository and
updating it as follows:
```shell
$ helm repo add nvgfd https://nvidia.github.io/gpu-feature-discovery
$ helm repo update
```

Once this repo is updated, you can begin installing packages from it to deploy
the `gpu-feature-discovery` daemonset and (optionally) the
`node-feature-discovery` daemonset. Below are some examples of deploying these
components with the various flags from above.
Then verify that the latest release (`v0.6.0`) of `gpu-feature-discovery` is available:
```
$ helm search repo nvgfd --devel
NAME                          CHART VERSION  APP VERSION  DESCRIPTION
nvgfd/gpu-feature-discovery   0.6.0          0.6.0        A Helm chart for ...
```

**Note:** Since this is a pre-release version, you will need to pass the
`--devel` flag to `helm search repo` in order to see this release listed.
Once this repo is updated, you can begin installing packages from it to deploy
the `gpu-feature-discovery` helm chart.

Using the default values for all flags:
```shell
$ helm install \
    --version=0.5.0 \
    --generate-name \
    nvgfd/gpu-feature-discovery
```

The most basic installation command without any options is then:
```shell
$ helm upgrade -i nvgfd nvgfd/gpu-feature-discovery \
    --version 0.6.0 \
    --namespace gpu-feature-discovery \
    --create-namespace
```
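
Once the release is installed, its status can be checked with standard `helm`
and `kubectl` commands (output will vary per cluster):

```shell
helm list -n gpu-feature-discovery
kubectl get pods -n gpu-feature-discovery
```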

Disabling auto-deployment of NFD and running with a MIG strategy of 'mixed' in
the default namespace.
```shell
$ helm install \
    --version=0.5.0 \
    --generate-name \
    --set nfd.deploy=false \
    --set migStrategy=mixed \
    --set namespace=default \
    nvgfd/gpu-feature-discovery
```

```shell
$ helm upgrade -i nvgfd nvgfd/gpu-feature-discovery \
    --version=0.6.0 \
    --set allowDefaultNamespace=true \
    --set nfd.enabled=false \
    --set migStrategy=mixed
```
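
To double-check which overrides actually landed on a release, `helm get values`
prints the user-supplied values (using the release name `nvgfd` from the
commands above):

```shell
helm get values nvgfd
```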

**Note:** You only need to pass the `--devel` flag to `helm search repo`
and the `--version` flag to `helm upgrade -i` if this is a pre-release
version (e.g. `<version>-rc.1`). Full releases are listed without these flags.

##### Deploying as a subchart of the NVIDIA device plugin

Starting with `v0.12.0` of the NVIDIA device plugin, a configuration file can
be used to configure the plugin. The same configuration file can also be used
to configure GFD. The GFD-specific options are specified in the `gfd` section,
as seen below:
```yaml
version: v1
flags:
  migStrategy: "none"
  failOnInitError: true
  nvidiaDriverRoot: "/"
  plugin:
    passDeviceSpecs: false
    deviceListStrategy: "envvar"
    deviceIDStrategy: "uuid"
  gfd:
    oneshot: false
    noTimestamp: false
    outputFile: /etc/kubernetes/node-feature-discovery/features.d/gfd
    sleepInterval: 60s
```

The fields in this section map to the command line flags and environment
variables available for configuring GFD.
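
As an illustration of that mapping, the `gfd` options above correspond to
command line flags and environment variables roughly as follows (names taken
from the CLI section of this README; treat them as indicative, not exhaustive):

```shell
# Flag form (illustrative):
gpu-feature-discovery --oneshot=false --no-timestamp=false \
    --output-file=/etc/kubernetes/node-feature-discovery/features.d/gfd \
    --sleep-interval=60s

# Environment variable form (illustrative):
export GFD_ONESHOT=false
export GFD_NO_TIMESTAMP=false
export GFD_OUTPUT_FILE=/etc/kubernetes/node-feature-discovery/features.d/gfd
export GFD_SLEEP_INTERVAL=60s
```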

For details on how to deploy GFD as part of the plugin please refer to:
* https://github.com/NVIDIA/k8s-device-plugin/tree/v0.12.0/#deploying-with-gpu-feature-discovery-for-automatic-node-labels
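
For orientation, deploying GFD through the device plugin's chart looks roughly
like this (the `nvdp` repository and the `gfd.enabled` value come from the
device plugin's documentation linked above; defer to that page for the
authoritative steps):

```shell
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
    --version=0.12.0 \
    --namespace nvidia-device-plugin \
    --create-namespace \
    --set gfd.enabled=true
```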

#### Deploying via `helm install` with a direct URL to the `helm` package

If you prefer not to install from the `gpu-feature-discovery` `helm` repo, you can
Expand All @@ -344,20 +383,19 @@ they use direct URLs to the `helm` package instead of the `helm` repo.

Using the default values for the flags:
```shell
$ helm install \
    --generate-name \
    https://nvidia.github.com/gpu-feature-discovery/stable/gpu-feature-discovery-0.5.0.tgz
```

```shell
$ helm upgrade -i nvgfd \
    --set allowDefaultNamespace=true \
    https://nvidia.github.io/gpu-feature-discovery/stable/gpu-feature-discovery-0.6.0.tgz
```

Disabling auto-deployment of NFD and running with a MIG strategy of 'mixed' in
the default namespace.
```shell
$ helm install \
    --generate-name \
    --set nfd.deploy=false \
    --set migStrategy=mixed \
    --set namespace=default \
    https://nvidia.github.com/gpu-feature-discovery/stable/gpu-feature-discovery-0.5.0.tgz
```

```shell
$ helm upgrade -i nvgfd \
    --set nfd.deploy=false \
    --set migStrategy=mixed \
    --set allowDefaultNamespace=true \
    https://nvidia.github.io/gpu-feature-discovery/stable/gpu-feature-discovery-0.6.0.tgz
```

## Building and running locally with Docker
RELEASE.md: 3 changes (1 addition, 2 deletions)
@@ -8,12 +8,11 @@ Publishing the container is automated through gitlab-ci and only requires one to
Publishing the helm chart is currently manual, and we should move to an automated process ASAP

# Release Process Checklist
- [ ] Update the README to replace occurrences of the old version (e.g. `v0.1.0`) with the new version
- [ ] Update the README to replace occurrences of the old version (e.g. `v0.6.0`) with the new version
- [ ] Commit, Tag and Push to Gitlab
- [ ] Build a new helm package with `helm package ./deployments/helm/gpu-feature-discovery`
- [ ] Switch to the `gh-pages` branch and move the newly generated package to the `stable` helm repo (see the sketch after this checklist)
- [ ] Run the `./build-index.sh` script to rebuild the indices for each repo
- [ ] Commit and push the `gh-pages` branch to GitHub
- [ ] Wait for the [CI job associated with your tag](https://gitlab.com/nvidia/kubernetes/gpu-feature-discovery/-/pipelines) to complete
- [ ] Update the [README on dockerhub](https://hub.docker.com/r/nvidia/gpu-feature-discovery) with the latest tag information
- [ ] Create a [new release](https://github.com/NVIDIA/gpu-feature-discovery/releases) on GitHub with the changelog
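
A condensed sketch of the manual chart-publishing steps above (directory layout
and commit message are illustrative):

```shell
# Build the chart package from the release commit.
helm package ./deployments/helm/gpu-feature-discovery

# Publish it to the stable repo served from the gh-pages branch.
git checkout gh-pages
mv gpu-feature-discovery-*.tgz stable/
./build-index.sh
git add stable/
git commit -m "Add gpu-feature-discovery chart for the new release"
git push origin gh-pages
```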
