From 11b602ab76f8badd294fc1662dbb9015ff72b39d Mon Sep 17 00:00:00 2001
From: Daz Wilkin <DazWilkin@users.noreply.github.com>
Date: Thu, 18 Feb 2021 10:33:26 -0800
Subject: [PATCH] Simplify matrix; Use all versions 1.16+ for all distros
 (#244)

* Simplify matrix; Use all versions 1.16+ for all distros

* Remove non-existent webhook test

* Remove `microk8s.enable helm3` appears redundant and fails

* Link Kubernetes and crictl for K3s|MicroK8s

* Per: https://github.com/deislabs/akri/pull/206#issuecomment-778459680

* Typo

* fix matrix and crictl path

* Consistency and double-quote to expand `${PWD}`

* Temporarily (!) block K3s 1.16 due to hard-coded device plugs (https://github.com/k3s-io/k3s/issues/1390)

* Test mapping K3s 1.16 device plugins to default K8s location

* Create kubectl path in expected location if not present

* Consistent w/ MicroK8s step

* Wait until cluster has stabilized before wrangling device-plugins

* Initial documentation for `run-test-cases.yml`

* Correct indentation

Co-authored-by: bfjelds <bfjelds@microsoft.com>
---
 workflows-run-test-cases.md | 120 ++++++++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)
 create mode 100644 workflows-run-test-cases.md

diff --git a/workflows-run-test-cases.md b/workflows-run-test-cases.md
new file mode 100644
index 0000000..c8707c0
--- /dev/null
+++ b/workflows-run-test-cases.md
@@ -0,0 +1,120 @@
+# Test K3s, Kubernetes (Kubeadm) and MicroK8s
+
+File: `/.github/workflows/run-test-cases.yml`
+
+A GitHub workflow that:
+
++ runs Python-based end-to-end [tests](#Tests);
++ through 5 different Kubernetes [versions](#Versions): 1.16, 1.17, 1.18, 1.19, 1.20;
++ on 3 different Kubernetes distros: [K3s](https://k3s.io), [Kubernetes (Kubeadm)](https://kubernetes.io/docs/reference/setup-tools/kubeadm/), [MicroK8s](https://microk8s.io).
+
+## Tests
+
+|Name|File|Documentation|
+|----|----|-----------|
+|end-to-end|`/test/run-end-to-end.py`|TBD|
+
+## Versions
+
+K3s version 1.16 creates [Device Plugin](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/) sockets at `/var/lib/rancher/k3s/agent/kubelet/device-plugins`, whereas Kubernetes expects these sockets to be created at `/var/lib/kubelet/device-plugins`.
+
+See K3s issue: [Compatibility with Device Plugins #1390](https://github.com/k3s-io/k3s/issues/1390)
+
+The fix for K3s 1.16 is to create a symbolic link at the Kubernetes-expected location that points to the K3s location. This is added as an exception to the workflow step for K3s:
+
+```bash
+if [ "${{ matrix.kube.runtime }}" == "K3s-1.16" ]; then
+  sudo mkdir -p /var/lib/kubelet
+  if [ -d /var/lib/kubelet/device-plugins ]; then
+    sudo rm -rf /var/lib/kubelet/device-plugins
+  fi
+  sudo ln -s /var/lib/rancher/k3s/agent/kubelet/device-plugins /var/lib/kubelet/device-plugins
+fi
+```
+
+This issue was addressed in K3s version 1.17.
+
+## Jobs|Steps
+
+The workflow comprises two jobs (`build-containers` and `test-cases`).
+
+## `build-containers`
+
+`build-containers` builds container images for the Akri 'controller' and 'agent' based upon the commit that triggers the workflow. Once built, these images are shared with the `test-cases` job, using the GitHub Action [upload-artifact](https://github.com/actions/upload-artifact).
+
+## `test-cases`
+
+`test-cases` uses a GitHub [strategy](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstrategy) to run its steps across the different Kubernetes distros and versions summarized at the top of this document.
+
+New Kubernetes distro versions may be added to the job by adding entries to `jobs.test-cases.strategy.matrix.kube`. Each array entry must include:
+
+|Property|Description|
+|--------|-----------|
+|`runtime`|A unique identifier for this distro-version pair|
+|`version`|A distro-specific unique identifier for the Kubernetes version|
+|`crictl`|A reference to the release of [`cri-tools`](https://github.com/kubernetes-sigs/cri-tools) including `crictl` that will be used|
+
+Notes:
+
++ `runtime` is used by subsequent steps as a way to determine the distro, e.g. `startsWith(matrix.kube.runtime, 'K3s')`
++ `version` is used by each distro to determine which binary, snap etc. to install. Refer to each distro's documentation to determine the value required
++ `crictl` is used by `K3s` and `MicroK8s` to determine which version of `crictl` must be installed. `Kubeadm` includes `crictl` and so this variable is left as `UNUSED` for this distro.
+
+### Distro installation and Akri container images
+
+Each distro has an installation step and a step to import the Akri `controller` and `agent` images created by the `build-containers` job.
+
+The installation steps are identified by:
+
+```YAML
+if: startsWith(matrix.kube.runtime, ${DISTRO})
+```
+
+The installation steps closely follow the installation instructions provided for each distro. For `K3s` and `MicroK8s`, the step includes installation of `cri-tools` so that `crictl` is available.
+
+The container image import steps are identified by:
+
+```YAML
+if: (startsWith(github.event_name, 'pull_request')) && (startsWith(matrix.kube.runtime, ${DISTRO}))
+```
+
+### Helm and state
+
+In order to pass state between the workflow and the Python end-to-end test scripts, temporary (`/tmp`) files are used:
+
+|File|Description|
+|----|-----------|
+|`agent_log.txt`|Filename used by workflow to persist the Agent's log|
+|`controller_log.txt`|Filename used by workflow to persist the Controller's log|
+|`cri_args_to_test.txt`|`crictl` configuration that is passed to the `helm install` command|
+|`extra_helm_args.txt`|Additional configuration that is passed to the `helm install` command|
+|`helm_chart_location.txt`|Path to the Helm Chart|
+|`kubeconfig_path_to_test.txt`|Path to `kubectl` cluster configuration file|
+|`runtime_cmd_to_test.txt`|Location of `kubectl` binary|
+|`sleep_duration.txt`|Optional: contains the number of seconds to pause|
+|`version_to_test.txt`|Akri version to test|
+
+
+If you review `/test/run-end-to-end.py` and `/test/shared_test_code.py`, you will see these files referenced.
+
+```Python3
+AGENT_LOG_PATH = "/tmp/agent_log.txt"
+CONTROLLER_LOG_PATH = "/tmp/controller_log.txt"
+KUBE_CONFIG_PATH_FILE = "/tmp/kubeconfig_path_to_test.txt"
+RUNTIME_COMMAND_FILE = "/tmp/runtime_cmd_to_test.txt"
+HELM_CRI_ARGS_FILE = "/tmp/cri_args_to_test.txt"
+VERSION_FILE = "/tmp/version_to_test.txt"
+SLEEP_DURATION_FILE = "/tmp/sleep_duration.txt"
+EXTRA_HELM_ARGS_FILE = "/tmp/extra_helm_args.txt"
+HELM_CHART_LOCATION = "/tmp/helm_chart_location.txt"
+```
+
+### Tests
+
+Of all the steps, only one is needed to run the Python end-to-end script.
+
+stdout|stderr from the script can be logged to the workflow.
+
+### Upload
+
+Once the end-to-end script is complete, the workflow uses the GitHub Action [upload-artifact](https://github.com/actions/upload-artifact) again to upload `/tmp/agent_log.txt` and `/tmp/controller_log.txt` so that these remain available (for download) once the workflow completes.
\ No newline at end of file