Introduce the native CLI
This command line tool overhauls the initial setup for Kubernetes
deployments.

The new tool gives us more information when something doesn't work out
as intended and brings the cluster setup considerably closer to what
one would expect in a production-grade system.

The nativelink image and worker containers are now fully built
in the cluster via Tekton Pipelines. Rebuilds may be triggered with curl
requests instead of the old `nix run .#xxx.copyTo` workflow. This makes
the setup more generic and provides clearer pointers on how to bring the
system into continuously updating production workflows.
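
A rebuild request of the kind described above might look roughly like the
following sketch. The payload fields `flakeOutput` and `imageTagOverride`, the
`eventlistener` gateway, and port 8080 are taken from the updated
`01_operations.sh` scripts in this commit. The function names
`rebuild_payload` and `trigger_rebuild` are hypothetical helpers introduced
only for illustration.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print the JSON body for one rebuild request.
# $1: flake output, e.g. "./src_root#image"
# $2: image tag override (defaults to "local")
# Note: values are inserted verbatim, so they must not contain quotes.
rebuild_payload() {
  printf '{"flakeOutput": "%s", "imageTagOverride": "%s"}' "$1" "${2:-local}"
}

# Hypothetical helper: send one rebuild request to the EventListener.
# $1: EventListener address
# $2: flake output
trigger_rebuild() {
  curl --fail --silent --show-error \
    -H 'Content-Type: application/json' \
    -d "$(rebuild_payload "$2")" \
    "http://$1:8080"
}

# Usage (requires a running cluster with the gateway deployed):
#   EVENTLISTENER=$(kubectl get gtw eventlistener \
#     -o=jsonpath='{.status.addresses[0].value}')
#   trigger_rebuild "${EVENTLISTENER}" "./src_root#image"
```

Keeping payload construction separate from the `curl` call makes it possible
to inspect the request body without a cluster.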

The `native` tool is technically fully self-contained. The examples
still make use of some local paths, but it's now possible to set up the
cluster and deploy NativeLink in it without cloning the nativelink
repository. This requires slightly modified `01_operations.sh` scripts
which we'll add as a new example.
aaronmondal committed Apr 15, 2024
1 parent 4082759 commit 52ff5d2
Showing 33 changed files with 2,340 additions and 197 deletions.
2 changes: 2 additions & 0 deletions .github/styles/config/vocabularies/TraceMachina/accept.txt
@@ -27,3 +27,5 @@ rebase
remoteable
Chromium
namespace
Pulumi
Tekton
3 changes: 1 addition & 2 deletions .github/workflows/lre.yaml
@@ -67,8 +67,7 @@ jobs:
- name: Start Kubernetes cluster (Infra)
run: >
nix develop --impure --command
bash -c "./deployment-examples/kubernetes/00_infra.sh"
nix run .#native up
- name: Start Kubernetes cluster (Operations)
run: >
1 change: 1 addition & 0 deletions .gitignore
@@ -14,3 +14,4 @@ result
.bazelrc.user
MODULE.bazel.lock
trivy-results.sarif
Pulumi.dev.yaml
21 changes: 21 additions & 0 deletions .golangci.yaml
@@ -0,0 +1,21 @@
---
linters:
enable-all: true
disable:
# Deprecated.
- nosnakecase
- interfacer
- exhaustivestruct
- ifshort
- deadcode
- varcheck
- golint
- maligned
- scopelint
- structcheck

# Allow all packages for now.
- depguard

# TODO(aaronmondal): Fix these at some point.
- exhaustruct
11 changes: 11 additions & 0 deletions Pulumi.yaml
@@ -0,0 +1,11 @@
---
name: nativelink
org: TraceMachina
runtime: go
description: The development cluster for NativeLink.
organization:
pulumi:tags:
company: "Trace Machina, Inc."
backend:
# Only intended to run locally.
url: file://~
1 change: 0 additions & 1 deletion deployment-examples/chromium/00_infra.sh

This file was deleted.

33 changes: 23 additions & 10 deletions deployment-examples/chromium/01_operations.sh
@@ -7,16 +7,29 @@ set -xeuo pipefail

SRC_ROOT=$(git rev-parse --show-toplevel)

kubectl apply -f ${SRC_ROOT}/deployment-examples/chromium/gateway.yaml
EVENTLISTENER=$(kubectl get gtw eventlistener -o=jsonpath='{.status.addresses[0].value}')

# The image for the scheduler and CAS.
nix run .#image.copyTo \
docker://localhost:5001/nativelink:local \
-- \
--dest-tls-verify=false
curl -v \
-H 'content-Type: application/json' \
-d '{
"flakeOutput": "./src_root#image",
"imageTagOverride": "local"
}' \
http://${EVENTLISTENER}:8080

# Wrap it with nativelink to turn it into a worker.
nix run .#nativelink-worker-siso-chromium.copyTo \
docker://localhost:5001/nativelink-worker-siso-chromium:local \
-- \
--dest-tls-verify=false
# Wrap it with nativelink to turn it into a worker.
curl -v \
-H 'content-Type: application/json' \
-d '{
"flakeOutput": "./src_root#nativelink-worker-siso-chromium",
"imageTagOverride": "local"
}' \
http://${EVENTLISTENER}:8080

# Wait for the pipelines to finish.
kubectl wait \
--for=condition=Succeeded \
--timeout=30m \
pipelinerun \
-l tekton.dev/pipeline=rebuild-nativelink
11 changes: 10 additions & 1 deletion deployment-examples/chromium/README.md
@@ -19,16 +19,25 @@ In this example we're using `kind` to set up the cluster `cilium` to provide a
First set up a local development cluster:

```bash
./00_infra.sh
native up
```

> [!TIP]
> The `native up` command uses Pulumi under the hood. You can view and delete
> the stack with `pulumi stack` and `pulumi destroy`.
Next start a few standard deployments. This part also builds the remote
execution containers and makes them available to the cluster:

```bash
./01_operations.sh
```

> [!TIP]
> The operations invoke cluster-internal Tekton Pipelines to build and push the
> `nativelink` and worker images. You can view the state of the pipelines with
> `tkn pr ls` and `tkn pr logs`/`tkn pr logs --follow`.
Finally, deploy NativeLink:

```bash
131 changes: 0 additions & 131 deletions deployment-examples/kubernetes/00_infra.sh

This file was deleted.

51 changes: 32 additions & 19 deletions deployment-examples/kubernetes/01_operations.sh
@@ -7,22 +7,35 @@ set -xeuo pipefail

SRC_ROOT=$(git rev-parse --show-toplevel)

kubectl apply -f ${SRC_ROOT}/deployment-examples/kubernetes/gateway.yaml

# The image for the scheduler and CAS.
nix run .#image.copyTo \
docker://localhost:5001/nativelink:local \
-- \
--dest-tls-verify=false

# The worker image for C++ actions.
nix run .#nativelink-worker-lre-cc.copyTo \
docker://localhost:5001/nativelink-worker-lre-cc:local \
-- \
--dest-tls-verify=false

# The worker image for Java actions.
nix run .#nativelink-worker-lre-java.copyTo \
docker://localhost:5001/nativelink-worker-lre-java:local \
-- \
--dest-tls-verify=false
EVENTLISTENER=$(kubectl get gtw eventlistener -o=jsonpath='{.status.addresses[0].value}')

curl -v \
-H 'content-Type: application/json' \
-d '{
"flakeOutput": "./src_root#image",
"imageTagOverride": "local"
}' \
http://${EVENTLISTENER}:8080

curl -v \
-H 'content-Type: application/json' \
-d '{
"flakeOutput": "./src_root#nativelink-worker-lre-cc",
"imageTagOverride": "local"
}' \
http://${EVENTLISTENER}:8080

curl -v \
-H 'content-Type: application/json' \
-d '{
"flakeOutput": "./src_root#nativelink-worker-lre-java",
"imageTagOverride": "local"
}' \
http://${EVENTLISTENER}:8080

# Wait for the pipelines to finish.
kubectl wait \
--for=condition=Succeeded \
--timeout=30m \
pipelinerun \
-l tekton.dev/pipeline=rebuild-nativelink
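
The three requests in the script above differ only in the flake output, so
they could be driven from a list. This is a minimal sketch, assuming the same
payload shape and EventListener endpoint as the script. `request_bodies` and
`post_all` are hypothetical names, not part of the repository.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print one JSON request body per flake output,
# one body per line. Values are inserted verbatim (no JSON escaping).
request_bodies() {
  for output in "$@"; do
    printf '{"flakeOutput": "%s", "imageTagOverride": "local"}\n' "${output}"
  done
}

# Hypothetical helper: post one request per flake output.
# $1: EventListener address; remaining arguments: flake outputs.
post_all() {
  address="$1"
  shift
  request_bodies "$@" | while IFS= read -r body; do
    curl -v \
      -H 'Content-Type: application/json' \
      -d "${body}" \
      "http://${address}:8080"
  done
}

# Usage (requires a running cluster with the gateway deployed):
#   EVENTLISTENER=$(kubectl get gtw eventlistener \
#     -o=jsonpath='{.status.addresses[0].value}')
#   post_all "${EVENTLISTENER}" \
#     "./src_root#image" \
#     "./src_root#nativelink-worker-lre-cc" \
#     "./src_root#nativelink-worker-lre-java"
```

The `kubectl wait` step from the script would still follow unchanged, since
the requests only start the pipeline runs.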
11 changes: 10 additions & 1 deletion deployment-examples/kubernetes/README.md
@@ -9,16 +9,25 @@ In this example we're using `kind` to set up the cluster `cilium` to provide a
First set up a local development cluster:

```bash
./00_infra.sh
native up
```

> [!TIP]
> The `native up` command uses Pulumi under the hood. You can view and delete
> the stack with `pulumi stack` and `pulumi destroy`.
Next start a few standard deployments. This part also builds the remote
execution containers and makes them available to the cluster:

```bash
./01_operations.sh
```

> [!TIP]
> The operations invoke cluster-internal Tekton Pipelines to build and push the
> `nativelink` and worker images. You can view the state of the pipelines with
> `tkn pr ls` and `tkn pr logs`/`tkn pr logs --follow`.
Finally, deploy NativeLink:

```bash
24 changes: 0 additions & 24 deletions deployment-examples/kubernetes/gateway.yaml

This file was deleted.
