docs(research): cluster bare-metal substrate architecture decision (NixOS + bare-metal k8s + Argo CD; no hypervisor for primary stack)#4808

Merged
AceHack merged 1 commit into main from otto/research-cluster-bare-metal-architecture-nixos-no-hypervisor-2026-05-24 on May 24, 2026

AceHack commented May 24, 2026

Summary

Architecture decision record for the bare-metal substrate layer below Kubernetes in the framework's basement cluster build (20 GPUs + 20 phones via Cellhasher + Pi cluster + AI hats).

Primary stack DECIDED

| Layer | Choice |
| --- | --- |
| Host OS | NixOS 24.11+ (flake-based) |
| Hypervisor | None for primary stack (bare-metal direct) |
| GitOps | Argo CD (Aaron preference over Flux) |
| Container runtime | containerd |
| CNI | Cilium (eBPF) |
| CSI | Longhorn over local NVMe + ZFS-on-root |
| GPU device plugin | NVIDIA k8s device plugin |
| Boot loader | systemd-boot |
| Provisioning | nixos-anywhere via SSH + iPXE |

Deferred (backlog)

  • Talos Linux as alternative for k8s control-plane subset
  • KubeVirt as k8s extension for VM workloads if needed
  • Proxmox for separate experimental tier (outside framework DST)
  • k3s vs kubeadm decision
  • MIG slicing strategy (hardware-dependent)

Rejected with reasoning

Guix System / Ubuntu/Debian/Fedora / Fedora CoreOS / Flatcar / Bottlerocket / Proxmox primary / ESXi / XCP-ng / Harvester / Flux — each with explicit reasoning.

Heterogeneous compute architecture

Three node classes via NixOS per-node-class modules from one flake:

  • GPU compute nodes (k8s workers)
  • Phone orchestrator (Cellhasher management; NOT k8s worker — phones are workload-substrate)
  • Pi cluster + AI hats (k8s optional; direct hardware access for Hailo/Coral/Edge TPU)
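
The three node classes above can be sketched as a small data model. This is an illustrative Python sketch only, not the NixOS modules from the flake. The class names (`gpu-compute`, `phone-orchestrator`, `pi-ai-hat`), the `K8sRole` enum, and the `k8s_workers` helper are hypothetical names introduced here to show which classes join Kubernetes.

```python
from dataclasses import dataclass
from enum import Enum


class K8sRole(Enum):
    WORKER = "worker"      # always joins the cluster as a worker
    OPTIONAL = "optional"  # may join, depending on the AI workload
    NONE = "none"          # never joins; managed outside Kubernetes


@dataclass(frozen=True)
class NodeClass:
    name: str
    k8s_role: K8sRole
    accelerators: tuple[str, ...]


# Hypothetical identifiers; the three entries mirror the list above.
NODE_CLASSES = (
    NodeClass("gpu-compute", K8sRole.WORKER, ("nvidia-gpu",)),
    NodeClass("phone-orchestrator", K8sRole.NONE, ()),
    NodeClass("pi-ai-hat", K8sRole.OPTIONAL, ("hailo", "coral", "edge-tpu")),
)


def k8s_workers(classes, include_optional=False):
    """Return the names of node classes that should join Kubernetes."""
    wanted = {K8sRole.WORKER}
    if include_optional:
        wanted.add(K8sRole.OPTIONAL)
    return [c.name for c in classes if c.k8s_role in wanted]
```

Calling `k8s_workers(NODE_CLASSES)` selects only the GPU class. Passing `include_optional=True` also selects the Pi class, and the phone orchestrator is excluded in both cases.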

Framework alignment

Maps each architecture choice to specific framework disciplines:

  • DST → NixOS reproducibility
  • Substrate-or-it-didn't-happen → NixOS as full state
  • Glass-halo bidirectional → Argo CD GitOps + Cilium eBPF
  • NCI floor at OS scope → NixOS atomic rollback
  • m/acc-multi-oracle → heterogeneous compute orchestration per class

8 open architecture questions captured

k3s vs kubeadm / Pi hardware specs / GPU class / storage backplane / network fabric / PXE infra / secret management / observability stack.

Test plan

  • CI green (lint only — no source changes)

…cision (NixOS + bare-metal k8s + Argo CD; no hypervisor for primary stack)

Architecture decisions for basement cluster build:
- 20 GPUs + 20 phones via Cellhasher + Pi cluster + AI hats

Aaron's substrate-engineering authority calls captured (verbatim
quotes preserved):

PRIMARY STACK DECIDED:
- NixOS 24.11+ flake-based (declarative OS; DST-aligned)
- Argo CD for GitOps (over Flux despite Flux being lighter — explicit
  operator preference)
- Bare-metal Kubernetes (no hypervisor for primary stack)
- containerd / Cilium CNI / Longhorn CSI + ZFS / NVIDIA k8s device
  plugin / systemd-boot / nixos-anywhere provisioning

DEFERRED (backlog):
- Talos Linux as alternative for k8s control-plane subset
- KubeVirt as k8s extension for VM workloads if needed
- Proxmox for separate experimental tier (outside framework DST)
- k3s vs kubeadm decision

REJECTED (with reasoning):
- Guix System: FSF free-software-fundamentalism may block NVIDIA proprietary drivers
- Ubuntu/Debian/Fedora: mutable; not DST-aligned
- Fedora CoreOS / Silverblue / Flatcar / Bottlerocket: less expressive
  than Nix; container-host shape only
- Proxmox primary: imperative web-UI breaks DST + 3 layers vs 1
- ESXi / XCP-ng / Harvester: see body
- Flux: operator preference for Argo

HETEROGENEOUS COMPUTE ARCHITECTURE:
- GPU compute nodes (NVIDIA + k8s workers)
- Phone-orchestrator node (Cellhasher management; NOT k8s worker —
  phones are workload-substrate, not k8s control plane)
- Pi-cluster + AI hats (may or may not run k8s depending on AI workload)

Maps each architecture choice to specific framework substrate-engineering
disciplines: DST, substrate-or-it-didn't-happen, glass-halo, NCI floor,
additive-not-zero-sum, m/acc-multi-oracle, bandwidth-served falsifier.

8 open architecture questions captured for future decision.

Authored via git plumbing fallback.
Copilot AI review requested due to automatic review settings May 24, 2026 01:57
AceHack enabled auto-merge (squash) May 24, 2026 01:57
AceHack merged commit f11f66a into main May 24, 2026
26 of 27 checks passed
AceHack deleted the otto/research-cluster-bare-metal-architecture-nixos-no-hypervisor-2026-05-24 branch May 24, 2026 01:59
AceHack added a commit that referenced this pull request May 24, 2026
… Manager + k3d + Headscale + lend-resources pattern) (#4809)

* docs(research): bundle-file dev-PC substrate architecture (Nix + Home Manager + k3d + Headscale + lend-resources pattern)

Sibling to PR #4808 (cluster substrate). Per Aaron 2026-05-24 'yes
bundle-file it (shadow*)' confirmation.

PRIMARY STACK DECIDED (lightweight-first per Aaron-stated principle
'Lets do whatever is lightweigh now and ease into more heavy weight stuff'):

LAYER 1 — Reproducible dev-PC substrate (Nix):
- macOS: Determinate Systems Nix installer + nix-darwin + Home Manager
- Linux: Nix package manager + Home Manager (on existing distro)
- Windows: WSL2 + Nix in WSL2 + Home Manager
- One flake repo covers cluster + dev PCs + every user's home directory

LAYER 2 — Local k8s for testing:
- k3d (lighter than kind) on each dev PC for manifest testing + GitOps
  practice WITHOUT touching production cluster

LAYER 3 — Background service (lend-resources pattern):
- Aaron framing: 'Dev boxes can be like lending resources to cluster'
- Lightweight Bun/Node daemon polling NATS queue for opt-in work
- NOT first-class k8s nodes (avoid trust-boundary issues)
- Heavier alternative (k3s agent, Liqo federation) deferred
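
A minimal sketch of the opt-in polling idea in Layer 3, under stated assumptions. It is written in Python rather than the Bun/Node daemon named above, and a standard-library `queue.Queue` stands in for the NATS queue so the example runs on its own. The `LendingWorker` class, the `class:payload` job format, and the allowed workload classes are all hypothetical.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class LendingWorker:
    """Takes jobs from a queue only while the dev box has opted in."""
    jobs: "queue.Queue[str]"
    opted_in: bool = False
    # Hypothetical workload classes this dev box agrees to run.
    allowed_classes: frozenset = frozenset({"lint", "unit-test"})
    done: list = field(default_factory=list)
    skipped: list = field(default_factory=list)

    def poll_once(self) -> bool:
        """Handle at most one job; return True only if it was run."""
        if not self.opted_in:
            return False  # never touch the queue without opt-in
        try:
            job = self.jobs.get_nowait()
        except queue.Empty:
            return False
        job_class = job.split(":", 1)[0]
        if job_class not in self.allowed_classes:
            self.skipped.append(job)  # recorded, not executed
            return False
        self.done.append(job)  # stand-in for actually running the job
        return True
```

The worker is a plain queue consumer, not a Kubernetes node, which matches the trust-boundary point above: the dev box decides what it accepts and can stop at any time by clearing `opted_in`.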

LAYER 4 — Network substrate (Headscale + Tailscale):
- Aaron framing: 'Tailscale is good but we also want headscale'
- Tailscale clients on each device
- Self-hosted Headscale control plane (sovereignty over user/device/ACL
  state; no commercial dependency; free at any node count)
- DERP relay optional for NAT-traversal fallback

DEFERRED (heavyweight ease-into-later):
- Liqo federation
- KubeFed v2
- k3s agent per dev PC
- Custom DERP relays
- Native Nix on Windows (when it ships)
- Full NixOS desktop on dev Linux box

5 open questions captured: Headscale deployment location, background-
service queue tech, authentication boundary, lending workload-class
restrictions, Addison's preferences (pending direct articulation per
observation-not-fact consent discipline).

Maps each choice to framework discipline (DST, glass-halo, NCI floor,
m/acc-multi-oracle, bandwidth-served, additive, Aaron lightweight-first
principle, Addison observation-not-fact discipline).

Composes with cluster substrate archive + Addison consent archive +
9 framework rules.

Authored via git plumbing fallback.

* fix(PR #4809): correct impossible decision timestamp + consent-file date-prefix

Two factual corrections caught by Codex P2 + Copilot:

1. Line 3: "Date decided: 2026-05-24 (~03:30Z)" was ~1.5h in
   the future relative to commit time (02:03Z). Corrected to
   ~02:03Z matching `gh pr view 4809 --json commits` last
   committed date.

2. Line 4: consent-file reference
   `addison-consent-pattern-observation-not-fact-discipline-aaron-otto.md`
   missing date prefix; actual file on disk is
   `2026-05-24-addison-consent-pattern-observation-not-fact-discipline-aaron-otto.md`.
   Added date prefix; reference now resolves.
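
The two corrections above can be checked mechanically. The following Python sketch is illustrative only and is not the reviewers' tooling. The helper names `decided_after_commit` and `has_date_prefix` are hypothetical, and the timestamps are the ones quoted in item 1.

```python
import re
from datetime import datetime, timezone

# Files in this archive are expected to start with a YYYY-MM-DD- prefix.
DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}-")


def decided_after_commit(decided: datetime, committed: datetime) -> bool:
    """True when the recorded decision time is later than the commit time."""
    return decided > committed


def has_date_prefix(filename: str) -> bool:
    """True when the filename starts with a YYYY-MM-DD- date prefix."""
    return DATE_PREFIX.match(filename) is not None


committed = datetime(2026, 5, 24, 2, 3, tzinfo=timezone.utc)   # 02:03Z
decided = datetime(2026, 5, 24, 3, 30, tzinfo=timezone.utc)    # ~03:30Z
skew = decided - committed  # 1 h 27 min, i.e. roughly 1.5 h in the future
```

With these values `decided_after_commit(decided, committed)` is true, which is the condition the fix removes, and `has_date_prefix` rejects the original consent-file reference while accepting the corrected one.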

Mechanical fixes only.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
AceHack review requested due to automatic review settings May 24, 2026 02:20
AceHack added a commit that referenced this pull request May 24, 2026
…cision (NixOS + bare-metal k8s + Argo CD; no hypervisor for primary stack) (#4808)
