diff --git a/memory/persona/max/PERSONA.md b/memory/persona/max/PERSONA.md
index 48082b1b30..67ab9b1b63 100644
--- a/memory/persona/max/PERSONA.md
+++ b/memory/persona/max/PERSONA.md
@@ -46,6 +46,49 @@ Operationally:
 - The home-tier substrate that the federated peer mesh (B-0727) operates at; Max's org-design assumes the home/business profile as primary
 - The C# / F# operator collaboration substrate (B-0724) — once landed, the polyglot pattern proves CRD-as-canonical-contract with two implementations in different languages
 
+## Current focus — tier-2 Docker Desktop dev-experience workstream (added 2026-05-25)
+
+Aaron 2026-05-25 added Max's primary near-term workstream: **own the tier-2 Docker Desktop + Kubernetes dev-experience** for the Zeta cluster substrate. This is the middle tier in the three-tier testing story from [B-0780](../../../docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md):
+
+| Tier | Owner | Substrate |
+|---|---|---|
+| 1 — pure-code (no Docker, no K8s) | Aaron + Otto | F# Local Loop tests |
+| **2 — Docker-observable (Docker Desktop + native multi-node kind)** | **Max** | This workstream |
+| 3 — full CI in real cluster | Aaron + Otto + the iter-3 NixOS cluster | Already shipping per B-0754 |
+
+Max's contract: **touch the Docker Desktop GUI only where the API/CLI demonstrably can't do it.** Everything else (clusters, app deploys, port-forwards, kubectl, helm, argo, observability stacks) gets scripted or skill-encoded. If Max finds himself clicking a button twice, that's a signal to encode the next click as a skill or script.
+
+### Sub-scopes Max owns within tier-2
+
+- **Argo CD sync-wave debugging** — the App-of-Apps composition pattern (B-0780 Component 3) makes sync-wave ordering the primary failure surface during tier-2 bring-up. Max becomes the human who can read an Argo CD sync-wave failure trace and pin the root cause in minutes; pattern encoded at `.claude/skills/argocd-sync-wave-debug/SKILL.md`.
+- **Observability — OTel auto-instrumentation matching the CNI mesh shape** — production cluster will use Cilium + Hubble + OTel; Docker Desktop tier doesn't ship Cilium by default. Substrate-design choice between full Cilium (Shape A), thinner eBPF + OTel-collector (Shape B), or both-gated (Shape C); default to Shape B per simplest-first, promote when Shape B demonstrably misses prod bugs.
+- **30+ chart coverage matrix** — production cluster runs 30+ charts (cockroachdb, redis, nats, temporal, orleans, dapr, opa, longhorn, vllm, argo-{cd,rollouts,workflows}, loki / mimir / tempo, spire, etc.). Max maintains a three-column matrix (single-node DD kind / multi-node DD kind / cluster-only) at `docs/dev-environments/docker-desktop-chart-matrix.md` so future operators (and `zeta dev up` profile defaults) know which charts run where.
+- **CI testing on kind / k3d + GitHub workflows** — Max owns `.github/workflows/tier-2-*.yml` (per-PR on kind + nightly full profile + separate multi-cluster federation workflow). Tier-2-in-CI is the substrate that catches "works on my laptop, breaks in CI" before tier-3 (real cluster) bothers running.
+- **`zeta dev up` developer-facing surface** — single command brings cluster substrate to ready state on his laptop in time comparable to `docker-compose up` (target: under 5min cold for 3-node DD kind + `data` profile; under 1min warm). Flags: `--single-node` for fast iteration; `--nodes N` to drive DD's settings API; `--profile minimal | data | observability | full | ` for chart subset selection.
+
+### Topology substrate (corrected 2026-05-25 — Docker Desktop ships native multi-node kind)
+
+Docker Desktop's native cluster-provisioning UI exposes **kind** as a first-class provisioner with a 1–10 node slider + version picker (current: K8s 1.34.3). Tier-2 = **kind via DD's native provisioner** (NOT bare kind / k3d running on top of DD's Docker engine — that earlier framing was outdated). Max picks node count via DD UI slider OR programmatically via DD's settings API / CLI. **Default = 3-node kind** because consensus-quorum testing is the highest-value tier-2 capability that tier-1 can't deliver.
+
+Multi-node ≠ multi-cluster. Multi-node (3 nodes in one DD-managed kind cluster) covers ~95% of consensus-quorum testing (CockroachDB Raft, etcd quorum, Longhorn 3-replica, NATS R3, Argo CD HA, anti-affinity, pod-disruption budgets). **Multi-cluster federation / Cilium clustermesh / multi-region** is the remaining ~5%, lives in CI by default plus locally-runnable script for debugging only — NOT always-on in DD. Skill: `.claude/skills/tier-2-federation-debug/SKILL.md`.
+
+### Touch ID / biometrics integration Max gets to use
+
+Zeta has a Touch ID + PAM integration for sudo and admin operations, canonical pattern at [`full-ai-cluster/tools/zflash-setup.ts`](../../../full-ai-cluster/tools/zflash-setup.ts). When AI agents need to do anything privileged on Max's macOS workstation (installing Docker Desktop, enabling Kubernetes, mounting disks, etc.), the pattern is: AI announces → invokes via expect wrapper → Max taps fingerprint sensor → command runs with elevated privilege. **Max does not type passwords for admin operations**; if an AI agent reaches for a password prompt, that's a signal to extend the Touch ID pattern instead. Skill candidate: `tools/dev/zfingerprint.ts` — thin wrapper generalizing the zflash pattern for any Max-side privileged operation.
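Reviewer sketch (not part of the patch): the `zeta dev up` flags and the 1–10 node range described above could be validated along the following lines. This is a minimal illustration under stated assumptions, not the project's implementation. The names `parseDevUpArgs` and `DevUpOptions` are hypothetical, and treating `data` as the default profile is an assumption; the text only names it in the cold-start target.

```typescript
// Hypothetical sketch: none of these names come from the Zeta repo.
interface DevUpOptions {
  nodes: number;   // kind node count to request from Docker Desktop
  profile: string; // chart subset; custom profile names pass through unchanged
}

const MIN_NODES = 1;     // lower bound of the DD node slider, per the text
const MAX_NODES = 10;    // upper bound of the DD node slider, per the text
const DEFAULT_NODES = 3; // default chosen for consensus-quorum testing

function parseDevUpArgs(argv: string[]): DevUpOptions {
  let nodes = DEFAULT_NODES;
  let profile = "data"; // assumed default, not confirmed by the source

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--single-node") {
      nodes = 1;
    } else if (arg === "--nodes") {
      const raw = argv[++i];
      const n = Number(raw);
      // Rejects a missing value, non-integers, and anything outside 1..10.
      if (!Number.isInteger(n) || n < MIN_NODES || n > MAX_NODES) {
        throw new Error(
          `--nodes expects an integer from ${MIN_NODES} to ${MAX_NODES}, got ${raw}`,
        );
      }
      nodes = n;
    } else if (arg === "--profile") {
      const raw = argv[++i];
      if (!raw) throw new Error("--profile expects a name");
      profile = raw;
    } else {
      throw new Error(`unknown flag: ${arg}`);
    }
  }
  return { nodes, profile };
}
```

Keeping the parser a pure function means the range check can be unit-tested without Docker Desktop running; whatever drives DD's settings would consume the returned object afterwards.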
+
+### Skills-and-scripts encoding contract (load-bearing)
+
+Every Docker Desktop / Kubernetes / dev-experience interaction Max performs ends as one of: a TypeScript script under `tools/dev/` (per Rule 0 — TS not bash; Bun runtime); a Claude Code skill under `.claude/skills//SKILL.md`; or a backlog row under `docs/backlog/P*/B-NNNN-*.md` for substantive new substrate. Rule of thumb: if Max teaches the AI something about Docker Desktop UX twice, that's a skill or script. Nothing gets lost in chat.
+
+### Composes with the tier-2 workstream
+
+- [B-0780](../../../docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md) — tier-2's parent substrate; Max's workstream IS tier-2
+- B-0759 — first-time-CLI-user persona Max's `zeta dev up` UX serves
+- B-0770 — Comet Pro IP-KVM substrate that makes local tty1 access load-bearing (which is why iter-4 needs password + SSH key, not just SSH key)
+- B-0776 — simplest-first plugin sequence the chart matrix backs
+- [B-0786](../../../docs/backlog/P2/B-0786-feature-flags-substrate-openfeature-as-operator-contract-flipt-as-simplest-first-backend-aaron-mika-2026-05-25.md) — "simplest first; add complexity only when simple shape demonstrably doesn't fit" discipline Max applies at every backend / topology / profile decision
+- B-0789 (forthcoming) — iter-4 forge-integrated cluster bring-up; provides the password + SSH substrate Max uses to bring up his own dev cluster nodes
+
 ## How agents work with Max
 
 - **Welcoming-but-honest review** — Max is new to K8s + the operator pattern; he'll be resistant at first to the ceremony (per Aaron: *"he will be resistant probably like most devs at first until he internlizes is worth"*). Frame feedback constructively + name the WHY (declarative state convergence, idempotent reconcile, CRD-as-typed-API) without selling
diff --git a/memory/persona/max/STARTING-POINT.md b/memory/persona/max/STARTING-POINT.md
index 99ecf64c1a..4c9fc625f0 100644
--- a/memory/persona/max/STARTING-POINT.md
+++ b/memory/persona/max/STARTING-POINT.md
@@ -66,6 +66,52 @@ Glass-halo discipline (`.claude/rules/glass-halo-bidirectional.md`) is the frame
 4. **Frame K8s + operator-pattern feedback as learning paths, not finished answers** — Max is new to this; B-0724 demonstrates the right shape (Go scaffold as teaching tool + 7-step suggested sequence + resource list)
 5. **Don't pace him** — Aaron's parallel-tracks framing is real; Max sets his own velocity
 
+## Current focus — tier-2 Docker Desktop dev-experience (added 2026-05-25)
+
+Beyond the agentic-organization design + hat-system substrate, Max's near-term workstream (added by Aaron 2026-05-25) is **owning the tier-2 Docker Desktop + Kubernetes dev-experience** for the Zeta cluster substrate. Full scope + sub-scopes are documented in [`PERSONA.md`](PERSONA.md) under "Current focus — tier-2 Docker Desktop dev-experience workstream"; this section names the cold-boot reading list for an AI collaborating with Max on this workstream.
+
+### Cold-boot reading list (in order)
+
+1. [`CLAUDE.md`](../../../CLAUDE.md) — repo bootstream, conventions, governance pointers
+2. [`AGENTS.md`](../../../AGENTS.md) — cross-cutting governance
+3. [`.claude/rules/`](../../../.claude/rules/) — auto-loaded behavioral rules. Especially: [`rule-0-no-sh-files.md`](../../../.claude/rules/rule-0-no-sh-files.md), [`dont-ask-permission.md`](../../../.claude/rules/dont-ask-permission.md), [`claim-acquire-before-worktree-work.md`](../../../.claude/rules/claim-acquire-before-worktree-work.md), [`zeta-expected-branch.md`](../../../.claude/rules/zeta-expected-branch.md)
+4. [`docs/backlog/P1/B-0780-*.md`](../../../docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md) — tier-2's parent substrate; Max's workstream IS tier-2
+5. `docs/backlog/P1/B-0759-*.md` — first-time-CLI-user persona Max's `zeta dev up` UX serves
+6. [`full-ai-cluster/tools/zflash-setup.ts`](../../../full-ai-cluster/tools/zflash-setup.ts) — canonical Touch ID + PAM + sudo-elevation pattern Max gets to use for all privileged macOS operations
+7. [`full-ai-cluster/usb-nixos-installer/zeta-install.sh`](../../../full-ai-cluster/usb-nixos-installer/zeta-install.sh) — zero-typing install pattern Max should emulate at Docker-Desktop scope
+8. The "simplest first; add complexity only when simple shape demonstrably doesn't fit" feedback memory at `~/.claude/projects/.../memory/feedback_simplest_first_*` (Aaron-Mika 2026-05-25) — the substrate-engineering discipline Max applies at every backend / topology / profile decision
+
+### Disciplines that apply to the tier-2 workstream
+
+- **Substrate-or-it-didn't-happen** — chat doesn't count; commit + push
+- **Simplest first** — pick the simplest tool / shape that fits known requirements; promote only when simple shape demonstrably fails
+- **No directives** — Aaron's only directive is that there are no directives; Max's input is framing, not orders
+- **Glass halo** — log substrate-honestly; surface gaps; don't hide failures
+- **Verify before deferring** — if something looks broken, check it before classifying it as "someone else's problem"
+- **Skills-and-scripts encoding contract** — every Docker Desktop interaction Max performs ends as a TS script (per Rule 0), a Claude Code skill, or a backlog row. Nothing gets lost in chat
+- **Touch ID over passwords** — for any privileged macOS operation, use the zflash Touch ID pattern; never reach for a password prompt
+
+### Concrete first deliverables for the tier-2 workstream (in order of value-per-effort)
+
+1. **Read the cold-boot list above** + write a short observation note to Max on what's already-substrate vs gap
+2. **Author `.claude/skills/docker-desktop-tier-2/SKILL.md`** — initial skill covering: install Docker Desktop, enable Kubernetes via the native kind provisioner, set node count via DD settings API, verify `kubectl` works
+3. **Author `tools/dev/docker-desktop-k8s-enable.ts`** — TS script that programmatically configures DD's native kind provisioner (settings API; edits `~/Library/Group Containers/group.com.docker/settings.json` where API doesn't cover). Documents any GUI-only steps as sibling `.md` with screenshots
+4. **Author `tools/dev/zfingerprint.ts`** — thin wrapper around the zflash Touch ID + expect pattern, generalized for any Max-side privileged operation (not just USB flashing)
+5. **File backlog row B-NNNN** — Docker Desktop tier-2 dev-experience substrate (composes with B-0780). Use the agent-roster ID allocation discipline (`git ls-tree origin/main -- docs/backlog/` to find current top + `gh pr list --search "B-NNNN"` to check in-flight). Row's acceptance criteria are the skills + scripts to ship over the coming ticks
+
+### Updated success metrics (first 30 days of the tier-2 workstream)
+
+- `zeta dev up` defaults to DD-managed 3-node kind via DD's settings API; cold under 5 min, warm under 1 min
+- `--single-node` and `--nodes N` flags drive DD settings programmatically
+- Chart coverage matrix with three columns (single-node / multi-node-DD / cluster-only) for every chart
+- Profiles (`minimal` / `data` / `observability` / `full` / custom) wired with documented resource requirements
+- CockroachDB 3-node consensus + Argo CD HA leader-election + Longhorn 3-replica all green in tier-2
+- OTel Shape B traces matching prod shape
+- GitHub workflows: per-PR `data` profile on kind; nightly `full` profile; separate `federation` workflow runs multi-cluster kind matrix
+- Skills shipped: `tier-2-dd-kind/`, `tier-2-profiles/`, `argocd-sync-wave-debug/`, `tier-2-observability/`, `tier-2-ci-kind-k3d/`, `tier-2-federation-debug/`
+- Zero passwords typed by Max for admin operations on his Mac (everything via Touch ID)
+- Every Docker Desktop GUI click Max made at least twice has been encoded as either a script or a documented "GUI-only — here's why" comment
+
 ## Composes with
 
 - [`PERSONA.md`](PERSONA.md) — fuller persona context