feat(ai-cluster): cookie-cutter node provisioning via disko + Longhorn multi-disk#4950
Conversation
…n multi-disk
Adds declarative disk partitioning + multi-disk Longhorn wiring + a
worker-template host so adding a new identical box to the cluster is
six per-host edits and `disko + nixos-install` — no hand-partitioning,
no per-host shell scripts.
What lands:
- `nixos/modules/disko-shapes/2nvme.nix` — cookie-cutter disko shape
for boxes with 2 equal NVMes. Uses `size = "100%"` for the Longhorn
partitions so the shape works on any disk size (handles the "1TB is
never really 1TB" reality). Layout: nvme0 = 1G ESP + 256G root + rest
Longhorn; nvme1 = whole disk Longhorn. OS + bootloader live on nvme0
only — bootloader-on-wrong-disk failure mode from prior manual install
is structurally impossible.
- `nixos/modules/longhorn-disks.nix` — declarative wiring between the
shape's mountpoints and Longhorn's node-level data-path catalog.
Takes `zeta.longhorn.dataDisks` list, ensures mount dirs exist with
the perms Longhorn expects, emits a NodeDiskCatalog YAML at
/etc/longhorn/node-disks.yaml for the cluster-side patch Job, and
adds the `zeta.io/longhorn-disks=N` K3S node label so the scheduler
can target high-capacity nodes.
- `nixos/hosts/worker-template/default.nix` — composes the shape +
longhorn module + existing k3s-agent/gpu/docker modules. Six clearly-
marked PLACEHOLDER blocks (hostName, hostId, nvme0 by-id, nvme1
by-id, network, SSH key). Cookie-cutter from here per new box.
- `flake.nix` — adds disko input + `worker-template` nixosConfiguration
+ exposes the new modules via `nixosModules.{disko-shape-2nvme,
longhorn-disks}` for external reuse.
- `usb-nixos-installer/.../configuration.nix` — bakes disko into the
installer ISO so installs don't need network access just to fetch
disko itself.
- `PROVISIONING.md` — end-to-end cookie-cutter workflow (copy template
→ edit 6 lines → flake.nix entry → boot USB → `disko --mode disko`
+ `nixos-install` → done). Includes disk-failure recovery paths and
multi-shape extension guide for future hardware classes.
Provisioning a new identical box: ~10 minutes wall-clock, six lines
of human edits, declarative end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1f1d1e3699
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| 4. Longhorn DaemonSet pod schedules → reads `/etc/longhorn/node-disks.yaml` | ||
| → patches the Longhorn Node CR to add both data paths |
There was a problem hiding this comment.
Remove unsupported auto-patch step from provisioning workflow
This step states that a Longhorn DaemonSet will read /etc/longhorn/node-disks.yaml and patch Node CRs automatically, but this commit does not add any manifest or script under full-ai-cluster/k8s/ that consumes that file (and nixos/modules/longhorn-disks.nix still documents the patch job as TODO/manual). As written, operators can complete the runbook believing both data disks are active while Longhorn still uses default disk config, which can misreport usable capacity and scheduling behavior immediately after node bring-up.
Useful? React with 👍 / 👎.
| # disko-shapes/ modules. Pre-staged on the ISO so installs | ||
| # don't need network access just to fetch disko itself. | ||
| # Invocation: | ||
| # disko --mode disko --flake /mnt/etc/zeta/full-ai-cluster#<host> | ||
| disko |
There was a problem hiding this comment.
Synchronize disko package change with root USB installer copy
Adding disko only to full-ai-cluster/usb-nixos-installer breaks the documented "byte-identical" relationship with /workspace/Zeta/usb-nixos-installer, whose nixos/installer/configuration.nix still lacks this package. That creates divergent installer behavior: users building from the root usb-nixos-installer/ path (explicitly supported in its README) will not have offline disko, so the documented provisioning flow can fail on air-gapped installs.
Useful? React with 👍 / 👎.
…ory (#4951) Adds precise hardware mapping, a guided install script, and an inventory-capture tool. Four composing pieces: 1. Node Feature Discovery (NFD) — labels every node with detailed hardware features (CPU model, AVX-512, PCI vendors, kernel modules, storage class). Lands as an ArgoCD Application using the upstream kubernetes-sigs Helm chart, with `worker` tolerating everything so it discovers features on control-plane + tainted GPU nodes alike. PCI source plugin tuned to whitelist network/display/NVMe/ accelerator classes for compact labels. 2. hwloc / lstopo — added to common.nix so every cluster node has the topology tool installed; also added to the USB installer ISO so the operator can inspect NUMA / PCI / GPU layout BEFORE picking disk-by-id paths for disko. Composes with NFD: NFD labels are kubectl-queryable; lstopo XML is diff-stable for catching silent hardware drift. 3. tools/cluster-inventory/capture.sh — pulls NFD labels and lstopo XML from every node (via kubectl debug, with ssh fallback), renders SVG topology diagrams, and lands artifacts under full-ai-cluster/docs/cluster-hardware/<node>/. README documents query patterns, the cross-platform rescue substrate via Paragon FS drivers (extFS / NTFS for Mac, ExtFS for Windows — any cluster disk mounts read+write on any maintainer machine), and recommended cadence. 4. zeta-install — guided installer script baked into the USB ISO via writeShellScriptBin. Walks through: enumerate internal NVMes, confirm boot disk, type WIPE to confirm, wipe + partition + format + mount per the 2-NVMe shape, clone Zeta, run nixos-install for the chosen host attribute. /etc/zeta-install.md runbook on the USB updated to point at zeta-install as the default path. Composes with PR #4950 disko cookie-cutter: zeta-install does the manual-shape path the script encodes; once #4950 lands, the disko flake invocation becomes the alternative path documented in the runbook (for hosts that prefer the disko module over the imperative sgdisk sequence). Both end on the same disk layout. Note: zeta-install.sh lives at usb-nixos-installer/zeta-install.sh (not bin/) because root .gitignore blocks bin/ globally for .NET build outputs. configuration.nix readFile path matches. Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…4956) * ci(ai-cluster): workflow that builds full-ai-cluster installer ISO Sibling to .github/workflows/build-installer-iso.yml (which targets the root flake's infra/nixos/hosts/installer/). This new workflow targets full-ai-cluster/flake.nix's installer-iso package, which includes the recent USB additions: - disko (declarative partitioning; PR #4950) - hwloc / lstopo (hardware topology; PR #4951) - zeta-install helper script (PR #4951) - Paragon cross-platform rescue docs (PR #4951) Triggers on PR + push to main when full-ai-cluster/flake.{nix,lock}, full-ai-cluster/usb-nixos-installer/**, or full-ai-cluster/nixos/modules/disko-shapes/** changes. Plus workflow_dispatch for manual runs. Uploads the ISO as a workflow artifact (90 day retention) so anyone reviewing the PR or running the workflow can download it and `dd` to a USB stick without needing Nix installed locally. Discipline mirrors build-installer-iso.yml: - ubuntu-24.04 (pinned) - third-party actions SHA-pinned - permissions: contents: read - concurrency cancel-in-progress only on PRs - no github.event.* values interpolated into run: lines The repo currently carries two parallel installer substrates (infra/ vs full-ai-cluster/). Consolidating them into a single canonical path is a separate larger PR — this workflow just makes the AI-cluster ISO downloadable from CI in the meantime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): align workflow header comment with actual paths filter Copilot flagged that the header comment says triggers ONLY on the 3 paths but the actual paths: list also includes `full-ai-cluster/nixos/modules/disko-shapes/**`. Rewriting the comment as a bullet list that matches the paths verbatim — easier to keep in sync as the list grows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… — hats become negotiated fork structure ON TOP of reference stack — deterministic + declarative + GitOps + AI-native + human-native (#5004) Aaron 2026-05-25, continuing the ACE+fork-negotiation arc after B-0741: 'hats become our negoated fork structure on top of a referece k8s local stack in zeta so anyone can use the reference stack and negoate back hats and new cluster primitives / charts ontology negoation, ace can distribute the reference stack itself as PoC that it has reliable AI control over all the package managers deterministicly and declarative / desired state way for easy git ops ai native human native understanding.' Operational anchor for B-0741. B-0741 = WHAT the primitive is; B-0742 = HOW it's empirically demonstrated via reference-cluster-as-Ace- package. Three substantive claims: 1. full-ai-cluster/ IS the reference k8s local stack. Inventory of existing PR-landed substrate (PR #4930 hat-system + #4950 disko + #4951 NFD/lstopo/zeta-install + #4953 dev-cluster + ArgoCD + Cilium + cert-manager + Vault + SPIRE + Trust Manager + ESO + B-0737 zflash + Determinate Systems Nix installer). 2. Hats become the negotiated fork structure ON TOP of the reference stack. Forks declare delta via hat-ontology; cross-fork negotiation maps capabilities (B-0741 surface 2). Worked example: LFG trading-bot-driver hat + Healthcare-fork hipaa-data-handler hat. 3. Ace distributes the reference stack as PoC of reliable AI control over all PMs. Single 'ace install zeta/reference-cluster' dispatches across Nix + ArgoCD + helm + kustomize + native k8s + brew + apt + mise + DeterminateSystems Nix installer. Deterministic (Nix flake.lock + ArgoCD pins). Declarative + desired-state (GitOps). AI-native (markdown + JSON-LD). Human-native (readable, reviewable). Six independently-shippable scope items: reference-stack inventory doc; hat-as-fork-structure spec; Ace cluster-distribution scope extension to B-0288; determinism PoC (N=3+ machines); cross-PM dispatch PoC; desired- state-enforcement drift-recovery PoC. Composes with: B-0741 (abstract primitive) + B-0731 (hat ontology) + B-0247/B-0287/B-0288 (Ace PM existing substrate) + B-0727 (4-tier federation) + B-0726 (Reticulum) + B-0628/B-0638/B-0634/B-0703 (governance + negotiation + signature + consensus) + B-0732 (leverage- class safety) + B-0737 (zflash bring-up). P2 — high-value PoC anchoring B-0741 abstract primitive; not P1 because full-ai-cluster substrate just landed this round + needs to stabilize before distribution layer ships. Closing arc of today's 2026-05-25 substrate cascade (B-0728 → ... → B-0742): destructive-tool authoring contract through reference-stack PoC. Co-authored-by: Claude <noreply@anthropic.com>
Summary
Declarative end-to-end node provisioning: copy template → edit six lines → boot USB →
disko + nixos-install→ cluster member. No hand-partitioning, no per-host shell scripts. Adding a new identical box is ~10 minutes wall-clock.What lands
nixos/modules/disko-shapes/2nvme.nix— cookie-cutter shape for boxes with 2 equal NVMes. Usessize = "100%"for Longhorn partitions so the shape works at any disk size (handles "1TB is never really 1TB"). Layout: nvme0 = 1G ESP + 256G root + rest Longhorn; nvme1 = whole-disk Longhorn. OS + bootloader live on nvme0 only — the bootloader-on-untouched-disk failure mode from manual install is structurally impossible sincedisko --mode diskowipes both drives first and only one disk has an ESP.nixos/modules/longhorn-disks.nix— wires the shape's mountpoints to Longhorn's node-level data-path catalog. Takeszeta.longhorn.dataDiskslist, ensures mount dirs exist, emits/etc/longhorn/node-disks.yamlfor the cluster-side patch Job, adds thezeta.io/longhorn-disks=NK3S node label.nixos/hosts/worker-template/default.nix— composes shape + longhorn module + k3s-agent/gpu/docker. Six clearly-marked PLACEHOLDER blocks (hostName, hostId, twoby-iddisk symlinks, network, SSH key).flake.nix— disko input +worker-templatenixosConfiguration + new modules exposed undernixosModules.{disko-shape-2nvme, longhorn-disks}.usb-nixos-installer/.../configuration.nix— bakes disko into the ISO so the installer doesn't need network access to fetch it.PROVISIONING.md— end-to-end cookie-cutter workflow + disk-failure recovery + multi-shape extension guide.Why this is the right shape for symmetric 2×NVMe
Discussed in conversation:
Future-shape extension
disko-shapes/2nvme.nixis one shape. Future hardware classes get their own file (4nvme.nix,nvme-plus-sata.nix, etc.) matching the samezeta.diskooptions pattern. The Longhorn module is shape-agnostic — takes a list of mount paths, doesn't care how many disks contributed them.Test plan
nix flake checkfromfull-ai-cluster/passes afternix flake updateadds the disko inputnix build .#installer-isosucceeds (disko gets baked in)disko --mode disko --flake .#worker-template(with placeholder disk IDs replaced) wipes + partitions + formats + mounts both drives without errorsnixos-install --flake .#worker-templatecompleteslsblkshows the expected layout;mount | grep longhornshows both data pathskubectl get nodesshows the new node;kubectl -n longhorn-system get nodes.longhorn.io <hostname> -o yamlshows bothdisksentriesOpen follow-ups (not blockers for this PR)
/etc/longhorn/node-disks.yamland patch the Longhorn Node CR automatically (documented inlonghorn-disks.nixTODO)worker-gpu-01.nix, etc.) as physical boxes come online — those are cookie-cutter copies of the template, one PR per box🤖 Generated with Claude Code