docs(archive): Preserve discussion for PR #4974 (#4982)
…ency graph
Closes the dev/prod parity loop: same workloads from the same git ref
reconcile into both substrates via ArgoCD, with sync-wave annotations
making the dependency order explicit so things land in the right
sequence on both clusters.
Three additions:
1. `full-ai-cluster/dev-cluster/` — k3d + Docker Desktop-based local
cluster matching prod's substrate shape:
- `k3d-config.yaml`: 1 server + 2 agents, K3S with the same Cilium
takeover flags as prod, local Docker registry at localhost:5000,
LoadBalancer port forwards for ArgoCD UI
- `up.sh`: end-to-end bring-up — k3d cluster, Cilium chicken-and-egg
install via Helm, ArgoCD install via Helm, root App-of-Apps
pointing at this repo's k8s/applications/ at any git ref
(default `main`; pass a PR branch to dev-test before merging)
- `down.sh`: idempotent teardown — cluster + registry + kubectl
context cleanup
- `README.md`: dev/prod parity table, multi-cluster patterns,
dev image push workflow, future ApplicationSet multi-cluster path
2. `k8s/applications/argocd/Application.yaml` — ArgoCD self-management.
Adopts the existing installation from the K3S bootstrap manifests
so subsequent chart upgrades land via git → ArgoCD instead of
requiring a bootstrap-manifest edit + K3S server restart. Sync
wave -90 (earliest non-bootstrap wave).
3. `dev-cluster/SYNC-WAVES.md` — documents the per-app dependency
graph and per-app sync-wave assignment. Annotation applied to
all 34 existing Applications:
| Wave | Applications | Notes |
|---|---|---|
| -90 | argocd | self-management |
| -80 | cilium | CNI adoption |
| -70 | cert-manager | |
| -60 | vault | |
| -50 | spire | |
| -45 | trust-manager | |
| -40 | external-secrets | |
| -30 | sealed-secrets | |
| -25 | open-policy-agent | OPA must precede policy-using apps |
| -20 | node-feature-discovery | |
| -15 | longhorn | storage class precedes PVC users |
| -10 | hat-system | CRDs precede HatBinding workloads |
| 0 | observability core, data planes, runtime | |
| 10 | hindsight / orleans / temporal | need data planes up |
| 20 | hermes | needs Vault secret synced + Hindsight + OZ |
| 30 | gitlab / forgejo | source-of-truth services last |
| 50 | ollama / vllm / deepseek-coder / qwen-coder | GPU; manual |
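To make the ordering contract concrete, here is a minimal sketch of a lint helper that checks whether one app is assigned a lower wave than another (ArgoCD syncs lower waves first). The function names `wave_of` and `precedes` are hypothetical and not part of this PR; the wave numbers are a subset copied from the assignment above.

```shell
#!/usr/bin/env bash
# Hypothetical lint helper, not part of the PR.
# Wave numbers are a subset of the assignment listed in the commit message.
wave_of() {
  case "$1" in
    argocd)           echo -90 ;;
    cilium)           echo -80 ;;
    cert-manager)     echo -70 ;;
    vault)            echo -60 ;;
    external-secrets) echo -40 ;;
    longhorn)         echo -15 ;;
    hindsight)        echo 10 ;;
    hermes)           echo 20 ;;
    *) echo "unknown app: $1" >&2; return 1 ;;
  esac
}

# Succeeds when $1 has a strictly lower wave than $2,
# i.e. $1 would be synced before $2. Returns 2 for unknown apps.
precedes() {
  local a b
  a=$(wave_of "$1") || return 2
  b=$(wave_of "$2") || return 2
  [ "$a" -lt "$b" ]
}
```

For example, `precedes vault hermes` succeeds, while `precedes hermes vault` fails, which is the kind of regression such a check could flag on a feature branch.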
Why now: Aaron flagged that the App-of-Apps would parallel-reconcile
everything by default, breaking ordering. The dev cluster catches this
class of issue on a feature branch before it touches prod — exactly
the loop dev/prod parity is supposed to close.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e352507c70
    # Skip apps that don't belong in dev (GPU stack, Longhorn).
    # These reconcile in prod but are excluded here via the
    # exclude glob.
    exclude: '{longhorn/**,ollama/**,vllm/**,deepseek-coder/**,qwen-coder/**}'
Exclude prod-only Cilium app from dev App-of-Apps
This root Application includes all full-ai-cluster/k8s/applications/**/Application.yaml except Longhorn/GPU, so it will reconcile full-ai-cluster/k8s/applications/cilium/Application.yaml after bootstrap and overwrite the dev bootstrap settings with prod Cilium values (k8sServiceHost: control-plane.zeta.local, ipam.mode: cluster-pool). The same script bootstraps Cilium with k3d-specific values (k3d-zeta-dev-server-0, ipam.mode=kubernetes), so a normal ArgoCD sync can flip the cluster to an invalid control-plane endpoint and break networking on local k3d.
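One possible way to address this, sketched under assumptions: build the `exclude` glob from a single list so that prod-only apps such as `cilium` can be dropped from the dev root Application in one place. The helper name `build_exclude_glob` and the `DEV_EXCLUDED_APPS` array are hypothetical; only the glob shape and the five existing entries come from the reviewed snippet, and adding `cilium` is the reviewer's suggestion rather than a confirmed fix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join app directory names into the brace-glob form
# used by the root Application's directory exclude field.
build_exclude_glob() {
  local glob="" app
  for app in "$@"; do
    # Prefix a comma for every entry except the first.
    glob+="${glob:+,}${app}/**"
  done
  printf '{%s}\n' "$glob"
}

# Existing dev exclusions plus cilium, as the review comment suggests.
DEV_EXCLUDED_APPS=(longhorn ollama vllm deepseek-coder qwen-coder cilium)
build_exclude_glob "${DEV_EXCLUDED_APPS[@]}"
```

Whether exclusion or a dev-specific values overlay is the better fix depends on how the bootstrap-installed Cilium is meant to be managed afterwards; the sketch only covers the exclusion route.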
    fi

    # ── Step 3: ArgoCD ────────────────────────────────────────────
    if ! kubectl get ns argocd >/dev/null 2>&1; then
Check ArgoCD health instead of namespace existence
Step 3 treats argocd namespace presence as proof ArgoCD is installed. If a prior run created the namespace but helm install failed (or the release/CRDs were later removed), rerunning skips installation and then step 4 applies an argoproj.io/v1alpha1 Application, which fails without ArgoCD CRDs/controllers. This makes the script non-idempotent in common partial-failure recovery paths.
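A hedged sketch of the kind of check this comment asks for: treat ArgoCD as installed only when the Helm release is in the deployed state and the `Application` CRD is present. It assumes the release is named `argocd` in namespace `argocd`, which the reviewed snippet does not confirm; `argocd_installed` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Hypothetical replacement for the namespace-existence test in step 3.
# Assumption: Helm release "argocd" in namespace "argocd".
argocd_installed() {
  local release
  # `helm list --deployed --short` prints the release name only when
  # the release exists and its status is "deployed".
  release=$(helm list --namespace argocd --deployed \
              --filter '^argocd$' --short 2>/dev/null)
  [ "$release" = "argocd" ] || return 1
  # The Application CRD must exist before step 4 can apply the root app.
  kubectl get crd applications.argoproj.io >/dev/null 2>&1
}
```

In `up.sh` this could replace the guard as `if ! argocd_installed; then ... fi`, so a namespace left behind by a failed install no longer causes the install step to be skipped.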
Pull request overview
This PR introduces ArgoCD sync-wave ordering across the full-ai-cluster/k8s/applications/** App-of-Apps set, adds a local k3d “dev cluster” bring-up/tear-down workflow intended to mirror prod reconciliation, and also adds PR discussion archive documents (including PR #4974).
Changes:
- Add `argocd.argoproj.io/sync-wave` annotations to many ArgoCD `Application` manifests and introduce an ArgoCD self-management `Application`.
- Add a local dev-cluster toolkit (`up.sh`/`down.sh`, k3d config, and supporting docs) for end-to-end reconciliation against the repo's `k8s/applications/` directory.
- Add `docs/pr-discussions/` archive entries for PRs #4974–#4976 and update `docs/BACKLOG.md`.
Reviewed changes
Copilot reviewed 45 out of 45 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| full-ai-cluster/k8s/applications/weaviate/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/vllm/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/vault/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/trust-manager/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/temporal/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/tempo/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/spire/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/sealed-secrets/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/redis/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/qwen-coder/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/oz/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/orleans/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/open-policy-agent/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/ollama/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/node-feature-discovery/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/nats/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/mimir/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/longhorn/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/loki/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/kube-prometheus-stack/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/hindsight/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/hermes/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/hat-system/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/gitlab/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/forgejo/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/external-secrets/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/deepseek-coder/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/dapr/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/cockroachdb/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/cilium/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/cert-manager/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/argocd/Application.yaml | Add ArgoCD self-management Application (early wave). |
| full-ai-cluster/k8s/applications/argo-workflows/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/argo-rollouts/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/alloy/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/dev-cluster/up.sh | New script to create k3d cluster, install Cilium/ArgoCD, and apply root App-of-Apps. |
| full-ai-cluster/dev-cluster/SYNC-WAVES.md | Document intended sync-wave dependency ordering rationale. |
| full-ai-cluster/dev-cluster/README.md | Dev-cluster usage and rationale documentation. |
| full-ai-cluster/dev-cluster/k3d-config.yaml | k3d cluster configuration intended to mirror prod substrate. |
| full-ai-cluster/dev-cluster/down.sh | Tear-down script for the dev cluster and registry. |
| full-ai-cluster/dev-cluster/DOCKER-DESKTOP.md | Docker Desktop sizing/config guidance for running the dev cluster. |
| docs/pr-discussions/PR-4976-feat-substrate-max-addison-personas-onboarding-doc-manifesto.md | Preserve PR discussion archive for PR #4976. |
| docs/pr-discussions/PR-4975-backlog-b-0728-destructive-tool-authoring-contract-rails-per.md | Preserve PR discussion archive for PR #4975. |
| docs/pr-discussions/PR-4974-feat-tools-flash-usb-ts-hardening-runtime-nonce-responsibili.md | Preserve PR discussion archive for PR #4974. |
| docs/BACKLOG.md | Add B-0721 to the generated backlog index. |
    #!/usr/bin/env bash
    # full-ai-cluster/dev-cluster/up.sh
    #
    # Bring up the local dev cluster end-to-end:
    # 1. Create the k3d cluster from k3d-config.yaml

    if ! kubectl get ns argocd >/dev/null 2>&1; then
      echo "Installing ArgoCD ..."
      kubectl create namespace argocd

    project: default
    source:
      repoURL: https://github.com/Lucent-Financial-Group/Zeta
      targetRevision: ${GIT_REF}
      path: full-ai-cluster/k8s/applications
      directory:
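The `targetRevision: ${GIT_REF}` line in the fragment above is what lets the same root Application point at `main` or at a PR branch. A minimal sketch of how such a manifest could be rendered with a default ref follows. The function name `render_root_app`, the Application name `zeta-dev-root`, the `recurse: true` setting, and the in-cluster destination are assumptions for illustration; only the repo URL, path, project, and the `main` default come from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical renderer for the root App-of-Apps manifest.
# The first argument is the git ref; it defaults to "main".
render_root_app() {
  local git_ref="${1:-main}"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: zeta-dev-root        # assumed name, not from the PR
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/Lucent-Financial-Group/Zeta
    targetRevision: ${git_ref}
    path: full-ai-cluster/k8s/applications
    directory:
      recurse: true          # assumed; needed to reach per-app subdirectories
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
EOF
}
```

A dev-test run against a branch might then look like `render_root_app my-pr-branch | kubectl apply -f -`, with `my-pr-branch` standing in for whatever ref is under review.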

    Same git ref, multiple destinations. The dev/prod parity stays
    clean because the spec carries no environment-specific bits;
    overlays handle that.

    # `k8s/applications/` directory into both clusters. The only
    # environment-specific deltas live under `dev-cluster/overlays/`
    # (e.g., Longhorn replaced by local-path-provisioner; GPU
    # device plugin disabled). Everything else — Cilium, ArgoCD,

    # Ignore the resources-finalizer the bootstrap installer
    # already added — preserves the existing release.
This PR is a 'blob' that mixes unrelated changes (PR preservation, ArgoCD changes, and a new dev cluster workflow). Please decompose this into smaller, atomic pull requests.
This PR preserves the discussion for PR #4974.