43 changes: 43 additions & 0 deletions memory/persona/max/PERSONA.md
@@ -46,6 +46,49 @@ Operationally:
- The home-tier substrate that the federated peer mesh (B-0727) operates at; Max's org-design assumes the home/business profile as primary
- The C# / F# operator collaboration substrate (B-0724) — once landed, the polyglot pattern proves CRD-as-canonical-contract with two implementations in different languages

## Current focus — tier-2 Docker Desktop dev-experience workstream (added 2026-05-25)

Aaron 2026-05-25 added Max's primary near-term workstream: **own the tier-2 Docker Desktop + Kubernetes dev-experience** for the Zeta cluster substrate. This is the middle tier in the three-tier testing story from [B-0780](../../../docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md):

| Tier | Owner | Substrate |
|---|---|---|
| 1 — pure-code (no Docker, no K8s) | Aaron + Otto | F# Local Loop tests |
| **2 — Docker-observable (Docker Desktop + native multi-node kind)** | **Max** | This workstream |
| 3 — full CI in real cluster | Aaron + Otto + the iter-3 NixOS cluster | Already shipping per B-0754 |

Max's contract: **touch the Docker Desktop GUI only where the API/CLI demonstrably can't do it.** Everything else (clusters, app deploys, port-forwards, kubectl, helm, argo, observability stacks) gets scripted or skill-encoded. If Max finds himself clicking a button twice, that's a signal to encode the next click as a skill or script.

### Sub-scopes Max owns within tier-2

- **Argo CD sync-wave debugging** — the App-of-Apps composition pattern (B-0780 Component 3) makes sync-wave ordering the primary failure surface during tier-2 bring-up. Max becomes the human who can read an Argo CD sync-wave failure trace and pin the root cause in minutes; pattern encoded at `.claude/skills/argocd-sync-wave-debug/SKILL.md`.
- **Observability — OTel auto-instrumentation matching the CNI mesh shape** — production cluster will use Cilium + Hubble + OTel; Docker Desktop tier doesn't ship Cilium by default. Substrate-design choice between full Cilium (Shape A), thinner eBPF + OTel-collector (Shape B), or both-gated (Shape C); default to Shape B per simplest-first, promote when Shape B demonstrably misses prod bugs.
- **30+ chart coverage matrix** — production cluster runs 30+ charts (cockroachdb, redis, nats, temporal, orleans, dapr, opa, longhorn, vllm, argo-{cd,rollouts,workflows}, loki / mimir / tempo, spire, etc.). Max maintains a three-column matrix (single-node DD kind / multi-node DD kind / cluster-only) at `docs/dev-environments/docker-desktop-chart-matrix.md` so future operators (and `zeta dev up` profile defaults) know which charts run where.
- **CI testing on kind / k3d + GitHub workflows** — Max owns `.github/workflows/tier-2-*.yml` (per-PR on kind + nightly full profile + separate multi-cluster federation workflow). Tier-2-in-CI is the substrate that catches "works on my laptop, breaks in CI" before tier-3 (real cluster) bothers running.
- **`zeta dev up` developer-facing surface** — single command brings cluster substrate to ready state on his laptop in time comparable to `docker-compose up` (target: under 5min cold for 3-node DD kind + `data` profile; under 1min warm). Flags: `--single-node` for fast iteration; `--nodes N` to drive DD's settings API; `--profile minimal | data | observability | full | <custom>` for chart subset selection.
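
The flag surface described in the last bullet above could be parsed roughly as follows. This is a minimal sketch, not the real `zeta` CLI: `parseDevUpArgs`, `DevUpOptions`, the default profile of `data`, and the rule that `--single-node` and `--nodes` are mutually exclusive are all assumptions made for illustration. Only the flag names, the 3-node default, and the 1–10 node range come from the text.

```typescript
// Hypothetical sketch of `zeta dev up` flag handling (not the real CLI).
interface DevUpOptions {
  nodes: number;   // node count for the DD-managed kind cluster (1-10)
  profile: string; // minimal | data | observability | full | <custom>
}

const DEFAULT_NODES = 3;        // from the doc: default is 3-node kind
const DEFAULT_PROFILE = "data"; // ASSUMPTION: the doc does not name a default profile

function parseDevUpArgs(argv: string[]): DevUpOptions {
  const opts: DevUpOptions = { nodes: DEFAULT_NODES, profile: DEFAULT_PROFILE };
  let sawSingleNode = false;
  let sawNodes = false;

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--single-node") {
      opts.nodes = 1;
      sawSingleNode = true;
    } else if (arg === "--nodes") {
      const raw = argv[++i];
      const n = Number(raw);
      // The DD slider range named in the doc is 1-10.
      if (!Number.isInteger(n) || n < 1 || n > 10) {
        throw new Error(`--nodes expects an integer from 1 to 10, got ${raw}`);
      }
      opts.nodes = n;
      sawNodes = true;
    } else if (arg === "--profile") {
      const name = argv[++i];
      if (!name || name.startsWith("--")) {
        throw new Error("--profile expects a profile name");
      }
      opts.profile = name;
    } else {
      throw new Error(`unknown flag: ${arg}`);
    }
  }

  // ASSUMPTION: combining both node flags is treated as a user error.
  if (sawSingleNode && sawNodes) {
    throw new Error("--single-node and --nodes are mutually exclusive");
  }
  return opts;
}
```

Keeping parsing as a pure function over `argv` means it can be unit-tested in tier 1 without Docker Desktop running; driving DD's settings from the parsed options would be a separate step.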

### Topology substrate (corrected 2026-05-25 — Docker Desktop ships native multi-node kind)

Docker Desktop's native cluster-provisioning UI exposes **kind** as a first-class provisioner with a 1–10 node slider + version picker (current: K8s 1.34.3). Tier-2 = **kind via DD's native provisioner** (NOT bare kind / k3d running on top of DD's Docker engine — that earlier framing was outdated). Max picks node count via DD UI slider OR programmatically via DD's settings API / CLI. **Default = 3-node kind** because consensus-quorum testing is the highest-value tier-2 capability that tier-1 can't deliver.

Multi-node ≠ multi-cluster. Multi-node (3 nodes in one DD-managed kind cluster) covers ~95% of consensus-quorum testing (CockroachDB Raft, etcd quorum, Longhorn 3-replica, NATS R3, Argo CD HA, anti-affinity, pod-disruption budgets). **Multi-cluster federation / Cilium clustermesh / multi-region** is the remaining ~5%; it lives in CI by default, plus a locally runnable script for debugging only, and is NOT always-on in DD. Skill: `.claude/skills/tier-2-federation-debug/SKILL.md`.
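
The arithmetic behind the 3-node default can be written down directly. In majority-quorum systems (Raft-based stores such as CockroachDB and etcd), a group of `n` members needs `floor(n / 2) + 1` of them to agree, so three is the smallest size that survives one member failing. The helper names below are illustrative and not part of any Zeta tooling.

```typescript
// Majority-quorum arithmetic for Raft-style consensus groups.
function quorumSize(members: number): number {
  if (!Number.isInteger(members) || members < 1) {
    throw new Error("members must be a positive integer");
  }
  return Math.floor(members / 2) + 1;
}

// How many members can be lost while a majority can still be formed.
function tolerableFailures(members: number): number {
  return members - quorumSize(members);
}

for (const n of [1, 2, 3, 5]) {
  console.log(`${n} members: quorum ${quorumSize(n)}, tolerates ${tolerableFailures(n)} failure(s)`);
}
// 1 members: quorum 1, tolerates 0 failure(s)
// 2 members: quorum 2, tolerates 0 failure(s)
// 3 members: quorum 2, tolerates 1 failure(s)
// 5 members: quorum 3, tolerates 2 failure(s)
```

A 1-node or 2-node cluster tolerates no failures, which is why single-node mode suits fast iteration but cannot exercise quorum loss.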

### Touch ID / biometrics integration Max gets to use

Zeta has a Touch ID + PAM integration for sudo and admin operations, canonical pattern at [`full-ai-cluster/tools/zflash-setup.ts`](../../../full-ai-cluster/tools/zflash-setup.ts). When AI agents need to do anything privileged on Max's macOS workstation (installing Docker Desktop, enabling Kubernetes, mounting disks, etc.), the pattern is: AI announces → invokes via expect wrapper → Max taps fingerprint sensor → command runs with elevated privilege. **Max does not type passwords for admin operations**; if an AI agent reaches for a password prompt, that's a signal to extend the Touch ID pattern instead. Skill candidate: `tools/dev/zfingerprint.ts` — thin wrapper generalizing the zflash pattern for any Max-side privileged operation.
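
One precondition for the tap-to-approve flow is that `sudo` accepts Touch ID at all. On macOS this is commonly enabled by an `auth sufficient pam_tid.so` line in the sudo PAM configuration. The sketch below checks the text of such a file for that line. Whether `zflash-setup.ts` performs a check like this is not stated here, so treat `sudoAcceptsTouchId` as a hypothetical helper; it takes the file contents as a string and does not read the file itself.

```typescript
// Hypothetical helper: does this PAM config text enable Touch ID for sudo?
// Looks for an uncommented `auth sufficient pam_tid.so` line.
function sudoAcceptsTouchId(pamConfig: string): boolean {
  return pamConfig.split(/\r?\n/).some((line) => {
    const trimmed = line.trim();
    // Skip blank lines and comments (a commented-out entry does not count).
    if (trimmed === "" || trimmed.startsWith("#")) return false;
    const [type, control, module] = trimmed.split(/\s+/);
    return type === "auth" && control === "sufficient" && module === "pam_tid.so";
  });
}

const enabled = [
  "# sudo_local: local config file which survives system update",
  "auth       sufficient     pam_tid.so",
].join("\n");

const disabled = "#auth       sufficient     pam_tid.so\n";

console.log(sudoAcceptsTouchId(enabled));  // true
console.log(sudoAcceptsTouchId(disabled)); // false
```

An agent could run a check like this before announcing a privileged command, and fall back to extending the Touch ID setup rather than to a password prompt when it returns `false`.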

### Skills-and-scripts encoding contract (load-bearing)

Every Docker Desktop / Kubernetes / dev-experience interaction Max performs ends as one of: a TypeScript script under `tools/dev/` (per Rule 0 — TS not bash; Bun runtime); a Claude Code skill under `.claude/skills/<name>/SKILL.md`; or a backlog row under `docs/backlog/P*/B-NNNN-*.md` for substantive new substrate. Rule of thumb: if Max teaches the AI something about Docker Desktop UX twice, that's a skill or script. Nothing gets lost in chat.
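
The three allowed destinations above are regular enough to check mechanically. The following is a sketch under assumptions: `classifyArtifact` is a hypothetical name, and the exact patterns (for example, four-digit backlog IDs and allowing nested directories under `tools/dev/`) are inferred from the paths quoted in this document rather than taken from an existing lint rule.

```typescript
// Hypothetical classifier for the three artifact kinds named in the contract.
type ArtifactKind = "script" | "skill" | "backlog-row";

function classifyArtifact(path: string): ArtifactKind | null {
  // TypeScript scripts under tools/dev/ (Rule 0: TS, not shell).
  if (/^tools\/dev\/.+\.ts$/.test(path)) return "script";
  // Claude Code skills: .claude/skills/<name>/SKILL.md
  if (/^\.claude\/skills\/[^/]+\/SKILL\.md$/.test(path)) return "skill";
  // Backlog rows: docs/backlog/P*/B-NNNN-*.md
  if (/^docs\/backlog\/P\d+\/B-\d{4}-[^/]+\.md$/.test(path)) return "backlog-row";
  return null; // anything else does not satisfy the contract
}

console.log(classifyArtifact("tools/dev/zfingerprint.ts"));                        // "script"
console.log(classifyArtifact(".claude/skills/argocd-sync-wave-debug/SKILL.md"));   // "skill"
console.log(classifyArtifact("tools/dev/setup.sh"));                               // null
```

A check like this could run over the files changed in a PR to confirm that a piece of Docker Desktop knowledge landed in one of the three places rather than only in chat.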

### Composes with the tier-2 workstream

- [B-0780](../../../docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md) — tier-2's parent substrate; Max's workstream IS tier-2
- B-0759 — first-time-CLI-user persona Max's `zeta dev up` UX serves
- B-0770 — Comet Pro IP-KVM substrate that makes local tty1 access load-bearing (which is why iter-4 needs password + SSH key, not just SSH key)
- B-0776 — simplest-first plugin sequence the chart matrix backs
- [B-0786](../../../docs/backlog/P2/B-0786-feature-flags-substrate-openfeature-as-operator-contract-flipt-as-simplest-first-backend-aaron-mika-2026-05-25.md) — "simplest first; add complexity only when simple shape demonstrably doesn't fit" discipline Max applies at every backend / topology / profile decision
- B-0789 (forthcoming) — iter-4 forge-integrated cluster bring-up; provides the password + SSH substrate Max uses to bring up his own dev cluster nodes

## How agents work with Max

- **Welcoming-but-honest review** — Max is new to K8s + the operator pattern; he'll be resistant at first to the ceremony (per Aaron: *"he will be resistant probably like most devs at first until he internlizes is worth"*). Frame feedback constructively + name the WHY (declarative state convergence, idempotent reconcile, CRD-as-typed-API) without selling
46 changes: 46 additions & 0 deletions memory/persona/max/STARTING-POINT.md
@@ -66,6 +66,52 @@ Glass-halo discipline (`.claude/rules/glass-halo-bidirectional.md`) is the frame
4. **Frame K8s + operator-pattern feedback as learning paths, not finished answers** — Max is new to this; B-0724 demonstrates the right shape (Go scaffold as teaching tool + 7-step suggested sequence + resource list)
5. **Don't pace him** — Aaron's parallel-tracks framing is real; Max sets his own velocity

## Current focus — tier-2 Docker Desktop dev-experience (added 2026-05-25)

Beyond the agentic-organization design + hat-system substrate, Max's near-term workstream (added by Aaron 2026-05-25) is **owning the tier-2 Docker Desktop + Kubernetes dev-experience** for the Zeta cluster substrate. Full scope + sub-scopes are documented in [`PERSONA.md`](PERSONA.md) under "Current focus — tier-2 Docker Desktop dev-experience workstream"; this section names the cold-boot reading list for an AI collaborating with Max on this workstream.

### Cold-boot reading list (in order)

1. [`CLAUDE.md`](../../../CLAUDE.md) — repo bootstream, conventions, governance pointers
2. [`AGENTS.md`](../../../AGENTS.md) — cross-cutting governance
3. [`.claude/rules/`](../../../.claude/rules/) — auto-loaded behavioral rules. Especially: [`rule-0-no-sh-files.md`](../../../.claude/rules/rule-0-no-sh-files.md), [`dont-ask-permission.md`](../../../.claude/rules/dont-ask-permission.md), [`claim-acquire-before-worktree-work.md`](../../../.claude/rules/claim-acquire-before-worktree-work.md), [`zeta-expected-branch.md`](../../../.claude/rules/zeta-expected-branch.md)
4. [`docs/backlog/P1/B-0780-*.md`](../../../docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md) — tier-2's parent substrate; Max's workstream IS tier-2
5. `docs/backlog/P1/B-0759-*.md` — first-time-CLI-user persona Max's `zeta dev up` UX serves
6. [`full-ai-cluster/tools/zflash-setup.ts`](../../../full-ai-cluster/tools/zflash-setup.ts) — canonical Touch ID + PAM + sudo-elevation pattern Max gets to use for all privileged macOS operations
7. [`full-ai-cluster/usb-nixos-installer/zeta-install.sh`](../../../full-ai-cluster/usb-nixos-installer/zeta-install.sh) — zero-typing install pattern Max should emulate at Docker-Desktop scope
8. The "simplest first; add complexity only when simple shape demonstrably doesn't fit" feedback memory at `~/.claude/projects/.../memory/feedback_simplest_first_*` (Aaron-Mika 2026-05-25) — the substrate-engineering discipline Max applies at every backend / topology / profile decision

### Disciplines that apply to the tier-2 workstream

- **Substrate-or-it-didn't-happen** — chat doesn't count; commit + push
- **Simplest first** — pick the simplest tool / shape that fits known requirements; promote only when simple shape demonstrably fails
- **No directives** — Aaron's only directive is that there are no directives; Max's input is framing, not orders
- **Glass halo** — log substrate-honestly; surface gaps; don't hide failures
- **Verify before deferring** — if something looks broken, check it before classifying it as "someone else's problem"
- **Skills-and-scripts encoding contract** — every Docker Desktop interaction Max performs ends as a TS script (per Rule 0), a Claude Code skill, or a backlog row. Nothing gets lost in chat
- **Touch ID over passwords** — for any privileged macOS operation, use the zflash Touch ID pattern; never reach for a password prompt

### Concrete first deliverables for the tier-2 workstream (in order of value-per-effort)

1. **Read the cold-boot list above** + write a short observation note to Max on what's already-substrate vs gap
2. **Author `.claude/skills/docker-desktop-tier-2/SKILL.md`** — initial skill covering: install Docker Desktop, enable Kubernetes via the native kind provisioner, set node count via DD settings API, verify `kubectl` works
3. **Author `tools/dev/docker-desktop-k8s-enable.ts`** — TS script that programmatically configures DD's native kind provisioner (settings API; edits `~/Library/Group Containers/group.com.docker/settings.json` for anything the API doesn't cover). Documents any GUI-only steps in a sibling `.md` with screenshots
4. **Author `tools/dev/zfingerprint.ts`** — thin wrapper around the zflash Touch ID + expect pattern, generalized for any Max-side privileged operation (not just USB flashing)
5. **File backlog row B-NNNN** — Docker Desktop tier-2 dev-experience substrate (composes with B-0780). Use the agent-roster ID allocation discipline (`git ls-tree origin/main -- docs/backlog/` to find current top + `gh pr list --search "B-NNNN"` to check in-flight). Row's acceptance criteria are the skills + scripts to ship over the coming ticks
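
The ID allocation step in item 5 reduces to "take the highest `B-NNNN` seen anywhere and add one". Here is a minimal sketch of that calculation, assuming the on-main file paths and the in-flight PR titles have already been collected as strings (for example from the `git ls-tree` and `gh pr list` commands named above). `nextBacklogId` is a hypothetical helper; it does not invoke git or gh itself.

```typescript
// Hypothetical helper: compute the next free backlog ID from known references.
function nextBacklogId(mainPaths: string[], inFlight: string[] = []): string {
  const pattern = /\bB-(\d{4})\b/g;
  let top = 0;
  for (const text of [...mainPaths, ...inFlight]) {
    const matches = text.match(pattern);
    if (matches === null) continue;
    for (const id of matches) {
      // id looks like "B-0780"; drop the "B-" prefix before parsing.
      top = Math.max(top, Number(id.slice(2)));
    }
  }
  return `B-${String(top + 1).padStart(4, "0")}`;
}

const onMain = [
  "docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing.md",
  "docs/backlog/P2/B-0786-feature-flags-substrate.md",
];
const openPrs = ["B-0789 iter-4 forge-integrated cluster bring-up"];

console.log(nextBacklogId(onMain, openPrs)); // "B-0790"
```

Including in-flight PR titles in the scan is what prevents two agents from claiming the same number before either row has merged.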

### Updated success metrics (first 30 days of the tier-2 workstream)

- `zeta dev up` defaults to DD-managed 3-node kind via DD's settings API; cold under 5 min, warm under 1 min
- `--single-node` and `--nodes N` flags drive DD settings programmatically
- Chart coverage matrix with three columns (single-node / multi-node-DD / cluster-only) for every chart
- Profiles (`minimal` / `data` / `observability` / `full` / custom) wired with documented resource requirements
- CockroachDB 3-node consensus + Argo CD HA leader-election + Longhorn 3-replica all green in tier-2
- OTel Shape B traces matching prod shape
- GitHub workflows: per-PR `data` profile on kind; nightly `full` profile; separate `federation` workflow runs multi-cluster kind matrix
- Skills shipped: `tier-2-dd-kind/`, `tier-2-profiles/`, `argocd-sync-wave-debug/`, `tier-2-observability/`, `tier-2-ci-kind-k3d/`, `tier-2-federation-debug/`
- Zero passwords typed by Max for admin operations on his Mac (everything via Touch ID)
- Every Docker Desktop GUI click Max made at least twice has been encoded as either a script or a documented "GUI-only — here's why" comment
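
The three-column chart coverage matrix could be generated from typed data rather than edited by hand. The sketch below is one possible shape, not the format of `docs/dev-environments/docker-desktop-chart-matrix.md`: the `ChartEntry` type, the `renderMatrix` function, and the idea of recording a single "minimum topology" per chart are assumptions. The placements in the sample data are placeholders for illustration, not tested results.

```typescript
// Hypothetical data model for the chart coverage matrix.
type Topology = "single-node" | "multi-node" | "cluster-only";

interface ChartEntry {
  chart: string;
  minimum: Topology; // smallest environment the chart is expected to run in
}

function renderMatrix(entries: ChartEntry[]): string {
  const yesNo = (b: boolean) => (b ? "yes" : "no");
  const rows = entries.map((e) => {
    const single = e.minimum === "single-node";
    // Anything that runs on one node is assumed to run on several as well.
    const multi = single || e.minimum === "multi-node";
    const clusterOnly = e.minimum === "cluster-only";
    return `| ${e.chart} | ${yesNo(single)} | ${yesNo(multi)} | ${yesNo(clusterOnly)} |`;
  });
  return [
    "| Chart | single-node DD kind | multi-node DD kind | cluster-only |",
    "|---|---|---|---|",
    ...rows,
  ].join("\n");
}

// PLACEHOLDER placements: illustrative only, not measured.
const sample: ChartEntry[] = [
  { chart: "redis", minimum: "single-node" },
  { chart: "cockroachdb", minimum: "multi-node" },
];

console.log(renderMatrix(sample));
```

The same `ChartEntry` list could also feed `zeta dev up` profile defaults, so the documented matrix and the charts a profile actually deploys cannot drift apart.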

## Composes with

- [`PERSONA.md`](PERSONA.md) — fuller persona context