docs(archive): Preserve discussion for PR #4974 #4982

Closed
AceHack wants to merge 7 commits into main from lior/archive-pr-4974

AceHack commented May 25, 2026

This PR preserves the discussion for PR #4974.

Lior and others added 7 commits May 25, 2026 12:41
…ency graph

Closes the dev/prod parity loop: same workloads from the same git ref
reconcile into both substrates via ArgoCD, with sync-wave annotations
making the dependency order explicit so things land in the right
sequence on both clusters.

Three additions:

1. `full-ai-cluster/dev-cluster/` — k3d + Docker Desktop-based local
   cluster matching prod's substrate shape:
   - `k3d-config.yaml`: 1 server + 2 agents, K3S with the same Cilium
     takeover flags as prod, local Docker registry at localhost:5000,
     LoadBalancer port forwards for ArgoCD UI
   - `up.sh`: end-to-end bring-up — k3d cluster, Cilium chicken-and-egg
     install via Helm, ArgoCD install via Helm, root App-of-Apps
     pointing at this repo's k8s/applications/ at any git ref
     (default `main`; pass a PR branch to dev-test before merging)
   - `down.sh`: idempotent teardown — cluster + registry + kubectl
     context cleanup
   - `README.md`: dev/prod parity table, multi-cluster patterns,
     dev image push workflow, future ApplicationSet multi-cluster path

2. `k8s/applications/argocd/Application.yaml` — ArgoCD self-management.
   Adopts the existing installation from the K3S bootstrap manifests
   so subsequent chart upgrades land via git → ArgoCD instead of
   requiring a bootstrap-manifest edit + K3S server restart. Sync
   wave -90 (earliest non-bootstrap wave).

3. `dev-cluster/SYNC-WAVES.md` — documents the per-app dependency
   graph and per-app sync-wave assignment. Annotation applied to
   all 34 existing Applications:

     Wave -90  argocd                    self-management
     Wave -80  cilium                    CNI adoption
     Wave -70  cert-manager
     Wave -60  vault
     Wave -50  spire
     Wave -45  trust-manager
     Wave -40  external-secrets
     Wave -30  sealed-secrets
     Wave -25  open-policy-agent         OPA must precede policy-using apps
     Wave -20  node-feature-discovery
     Wave -15  longhorn                  storage class precedes PVC users
     Wave -10  hat-system                CRDs precede HatBinding workloads
     Wave   0  observability core, data planes, runtime
     Wave  10  hindsight / orleans / temporal (need data planes up)
     Wave  20  hermes (needs Vault secret synced + Hindsight + OZ)
     Wave  30  gitlab / forgejo (source-of-truth services last)
     Wave  50  ollama / vllm / deepseek-coder / qwen-coder (GPU; manual)

Why now: Aaron flagged that the App-of-Apps would parallel-reconcile
everything by default, breaking ordering. The dev cluster catches this
class of issue on a feature branch before it touches prod — exactly
the loop dev/prod parity is supposed to close.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
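
For illustration, the per-app annotation described in the commit message above could be rendered as follows. This is a minimal sketch, not the manifests from the PR: the `render_app` helper is hypothetical and the manifest is trimmed to metadata only. The annotation key `argocd.argoproj.io/sync-wave` and the wave numbers come from the commit message and the review summary; the value has to be a quoted string because Kubernetes annotation values are strings.

```bash
#!/usr/bin/env bash
# Sketch only: print the metadata of a child Application carrying a
# sync-wave annotation. render_app is a hypothetical helper; the real
# manifests also need a spec (source, destination, syncPolicy).
render_app() {
  local name="$1" wave="$2"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "${wave}"
EOF
}

# Lower waves are applied first, so argocd (-90) precedes cert-manager (-70).
render_app argocd -90
render_app cert-manager -70
```

Whether the parent App-of-Apps waits for each wave to become healthy before moving on depends on how Application health is assessed in the installed ArgoCD version, which is worth verifying on the dev cluster.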
Copilot AI review requested due to automatic review settings May 25, 2026 19:50

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e352507c70

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

# Skip apps that don't belong in dev (GPU stack, Longhorn).
# These reconcile in prod but are excluded here via the
# exclude glob.
exclude: '{longhorn/**,ollama/**,vllm/**,deepseek-coder/**,qwen-coder/**}'

P1: Exclude prod-only Cilium app from dev App-of-Apps

This root Application includes all full-ai-cluster/k8s/applications/**/Application.yaml except Longhorn/GPU, so it will reconcile full-ai-cluster/k8s/applications/cilium/Application.yaml after bootstrap and overwrite the dev bootstrap settings with prod Cilium values (k8sServiceHost: control-plane.zeta.local, ipam.mode: cluster-pool). The same script bootstraps Cilium with k3d-specific values (k3d-zeta-dev-server-0, ipam.mode=kubernetes), so a normal ArgoCD sync can flip the cluster to an invalid control-plane endpoint and break networking on local k3d.
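
One possible way to address this is to add `cilium/**` to the exclude glob. The sketch below builds that glob from a list of app directory names; `build_exclude` is a hypothetical helper, and excluding Cilium outright is an assumption about the fix. Overriding the Cilium values through a dev overlay would be an alternative.

```bash
#!/usr/bin/env bash
# Sketch only: build the brace-style exclude glob used by the root
# Application's directory source. build_exclude is a hypothetical helper.
build_exclude() {
  local IFS=,            # "${parts[*]}" joins array elements with a comma
  local parts=()
  local app
  for app in "$@"; do
    parts+=("${app}/**")
  done
  printf '{%s}\n' "${parts[*]}"
}

# Existing dev exclusions plus cilium (the addition is an assumption).
build_exclude longhorn ollama vllm deepseek-coder qwen-coder cilium
```

The printed string can then be used as the value of `exclude:` in the root Application, in place of the hard-coded glob.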

fi

# ── Step 3: ArgoCD ────────────────────────────────────────────
if ! kubectl get ns argocd >/dev/null 2>&1; then

P2: Check ArgoCD health instead of namespace existence

Step 3 treats argocd namespace presence as proof ArgoCD is installed. If a prior run created the namespace but helm install failed (or the release/CRDs were later removed), rerunning skips installation and then step 4 applies an argoproj.io/v1alpha1 Application, which fails without ArgoCD CRDs/controllers. This makes the script non-idempotent in common partial-failure recovery paths.
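
A stricter guard could look like the sketch below. It is not the script's actual code: `argocd_installed` is a hypothetical helper, and the Helm release name `argocd` is an assumption (the namespace `argocd` is taken from the script). It treats ArgoCD as installed only when the Helm release exists and the `Application` CRD is established.

```bash
#!/usr/bin/env bash
# Sketch only: decide whether ArgoCD is really installed, rather than
# trusting that the namespace exists.
argocd_installed() {
  # 1. The Helm release must exist (release name "argocd" is an assumption).
  helm status argocd --namespace argocd >/dev/null 2>&1 || return 1
  # 2. The Application CRD must be registered and established, otherwise
  #    applying an argoproj.io/v1alpha1 Application in step 4 would fail.
  kubectl wait --for=condition=Established \
    crd/applications.argoproj.io --timeout=60s >/dev/null 2>&1
}

# Possible use in step 3 (chart reference "argo/argo-cd" assumes a Helm
# repo alias named "argo" has been added):
#
#   if ! argocd_installed; then
#     helm upgrade --install argocd argo/argo-cd \
#       --namespace argocd --create-namespace
#   fi
```

Using `helm upgrade --install` instead of `helm install` would also let a rerun recover from a half-finished release, since it installs when the release is missing and upgrades when it already exists.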

Copilot AI left a comment

Pull request overview

This PR introduces ArgoCD sync-wave ordering across the full-ai-cluster/k8s/applications/** App-of-Apps set, adds a local k3d “dev cluster” bring-up/tear-down workflow intended to mirror prod reconciliation, and also adds PR discussion archive documents (including PR #4974).

Changes:

  • Add argocd.argoproj.io/sync-wave annotations to many ArgoCD Application manifests and introduce an ArgoCD self-management Application.
  • Add a local dev-cluster toolkit (up.sh/down.sh, k3d config, and supporting docs) for end-to-end reconciliation against the repo’s k8s/applications/ directory.
  • Add docs/pr-discussions/ archive entries for PRs #4974 through #4976 and update docs/BACKLOG.md.

Reviewed changes

Copilot reviewed 45 out of 45 changed files in this pull request and generated 6 comments.

Summary per file:

| File | Description |
| --- | --- |
| full-ai-cluster/k8s/applications/weaviate/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/vllm/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/vault/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/trust-manager/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/temporal/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/tempo/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/spire/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/sealed-secrets/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/redis/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/qwen-coder/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/oz/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/orleans/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/open-policy-agent/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/ollama/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/node-feature-discovery/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/nats/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/mimir/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/longhorn/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/loki/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/kube-prometheus-stack/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/hindsight/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/hermes/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/hat-system/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/gitlab/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/forgejo/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/external-secrets/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/deepseek-coder/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/dapr/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/cockroachdb/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/cilium/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/cert-manager/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/argocd/Application.yaml | Add ArgoCD self-management Application (early wave). |
| full-ai-cluster/k8s/applications/argo-workflows/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/argo-rollouts/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/k8s/applications/alloy/Application.yaml | Add sync-wave annotation for ordering under App-of-Apps. |
| full-ai-cluster/dev-cluster/up.sh | New script to create k3d cluster, install Cilium/ArgoCD, and apply root App-of-Apps. |
| full-ai-cluster/dev-cluster/SYNC-WAVES.md | Document intended sync-wave dependency ordering rationale. |
| full-ai-cluster/dev-cluster/README.md | Dev-cluster usage and rationale documentation. |
| full-ai-cluster/dev-cluster/k3d-config.yaml | k3d cluster configuration intended to mirror prod substrate. |
| full-ai-cluster/dev-cluster/down.sh | Tear-down script for the dev cluster and registry. |
| full-ai-cluster/dev-cluster/DOCKER-DESKTOP.md | Docker Desktop sizing/config guidance for running the dev cluster. |
| docs/pr-discussions/PR-4976-feat-substrate-max-addison-personas-onboarding-doc-manifesto.md | Preserve PR discussion archive for PR #4976. |
| docs/pr-discussions/PR-4975-backlog-b-0728-destructive-tool-authoring-contract-rails-per.md | Preserve PR discussion archive for PR #4975. |
| docs/pr-discussions/PR-4974-feat-tools-flash-usb-ts-hardening-runtime-nonce-responsibili.md | Preserve PR discussion archive for PR #4974. |
| docs/BACKLOG.md | Add B-0721 to the generated backlog index. |

Comment on lines +1 to +5
#!/usr/bin/env bash
# full-ai-cluster/dev-cluster/up.sh
#
# Bring up the local dev cluster end-to-end:
# 1. Create the k3d cluster from k3d-config.yaml
Comment on lines +70 to +72
if ! kubectl get ns argocd >/dev/null 2>&1; then
echo "Installing ArgoCD ..."
kubectl create namespace argocd
Comment on lines +94 to +99
project: default
source:
repoURL: https://github.com/Lucent-Financial-Group/Zeta
targetRevision: ${GIT_REF}
path: full-ai-cluster/k8s/applications
directory:
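
For context, the fragment above appears to come from the root App-of-Apps manifest that `up.sh` templates with `${GIT_REF}`. A minimal sketch of that templating follows. The `render_root_app` helper, the Application name `root`, the `recurse: true` setting, and the `destination` block are assumptions; the `repoURL`, `path`, `project`, and `exclude` values are taken from the snippets quoted in this review.

```bash
#!/usr/bin/env bash
# Sketch only: render the root App-of-Apps at a chosen git ref.
# Default ref is main; pass a PR branch name as $1 to dev-test it.
GIT_REF="${1:-main}"

render_root_app() {
  local ref="$1"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root               # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/Lucent-Financial-Group/Zeta
    targetRevision: ${ref}
    path: full-ai-cluster/k8s/applications
    directory:
      recurse: true        # assumption
      exclude: '{longhorn/**,ollama/**,vllm/**,deepseek-coder/**,qwen-coder/**}'
  destination:             # assumption: in-cluster destination
    server: https://kubernetes.default.svc
    namespace: argocd
EOF
}

render_root_app "${GIT_REF}"
```

In a real bring-up script the rendered manifest would typically be piped to `kubectl apply -f -` rather than printed.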

Same git ref, multiple destinations. The dev/prod parity stays
clean because the spec carries no environment-specific bits;
overlays handle that.
Comment on lines +17 to +20
# `k8s/applications/` directory into both clusters. The only
# environment-specific deltas live under `dev-cluster/overlays/`
# (e.g., Longhorn replaced by local-path-provisioner; GPU
# device plugin disabled). Everything else — Cilium, ArgoCD,
Comment on lines +52 to +53
# Ignore the resources-finalizer the bootstrap installer
# already added — preserves the existing release.

AceHack commented May 25, 2026

This PR is a 'blob' that mixes unrelated changes (PR preservation, ArgoCD changes, and a new dev cluster workflow). Please decompose this into smaller, atomic pull requests.


AceHack commented May 27, 2026

This PR is a duplicate of #4980 and has been decomposed into #5441 and #5442. Closing this PR.

AceHack closed this May 27, 2026