fix(iter-5.5.1 fix-fwd PR #5388 ALIGNMENT): nixpkgs bun → mise (canonical .mise.toml SSoT); linux.sh NixOS detection; zeta-install.sh invokes tools/setup/install.sh (operator default per 'this is our default')#5389
Merged
AceHack merged 1 commit intoMay 27, 2026
Conversation
…e (canonical .mise.toml single source of truth); linux.sh handles NixOS via /etc/NIXOS detection; zeta-install.sh Step 6.95a invokes tools/setup/install.sh (THE default entry per operator framing) Operator catch (verbatim): > "future mise we already do this we've drifed for nixos for some > reason for bun" > "our install.sh for mac and linux this is our default" Substrate-honest finding: .mise.toml at repo root pins bun = "1.3" + node + dotnet + python + java + uv + actionlint + shellcheck + markdownlint-cli2 for ALL Zeta contexts (dev laptops + CI runners + devcontainers per GOVERNANCE §24 three-way-parity). Earlier draft of this PR added `bun` via nixpkgs systemPackages which DRIFTED cluster nodes off the canonical .mise.toml-pinned bun = "1.3". Alignment fix in 3 surfaces: 1. common.nix — `bun` removed from systemPackages; replaced with `mise` (canonical runtime version manager). mise then installs bun + all other .mise.toml-pinned runtimes for the zeta user. Single source of truth across dev + cluster. 2. tools/setup/linux.sh — added NixOS detection via /etc/NIXOS marker file. Skips the apt step entirely (NixOS handles system packages via common.nix systemPackages declaratively); proceeds to mise + downstream runtime setup. Net behavior: tools/setup/ install.sh now works as the canonical entry on macOS + Debian/ Ubuntu + NixOS — three-way parity extended to NixOS per operator's "our install.sh for mac and linux this is our default" framing. 3. zeta-install.sh Step 6.95a — replaces inline `bun install --global` with invocation of `tools/setup/install.sh` from the pre-cloned Zeta repo. Order rearranged: repo clone (was 6.95d) moved into 6.95a-bootstrap so .mise.toml is readable when install.sh fires. Then claude-code install uses mise-managed bun via shim PATH from `mise activate bash`. Composes with B-0824 (Ace package-manager-of-package-managers) and the iter-5.5.0-tooling-zoo memory entry: extending install.sh's three-way-parity to NixOS is one slice of the Ace meta-PM substrate operating on today's bash-glue layer (per memory/feedback_iter550_install_time_tooling_zoo...). 3 substantive sub-fixes: - common.nix profile.d: now sources `mise activate bash` for shim PATH AND adds ~/.bun/bin (where bun --global writes claude binary) - zeta-install.sh 6.95a-bootstrap: clones repo + invokes install.sh - zeta-install.sh 6.95a-claude: installs claude-code via shim- activated bun - zeta-install.sh 6.95d: now empty (clone moved up); kept as comment for narrative continuity 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…tion (~30 sec) complementing B-0831 QEMU full-install (~15 min); "easy dockerfile" per operator framing (Aaron 2026-05-27) (#5390) Operator framing 2026-05-27 (verbatim): > "we should add docker based nixos install.sh testing so we can > iterate quick that's an easy dockerfile" Direct response after PR #5389 (iter-5.5.1 alignment fix-fwd for PR #5388) where the operator named the iteration-cost problem: every install.sh / linux.sh / mise.sh change today requires a full ISO build + USB flash + physical install (~30 min cycle); Docker testing of just the script on NixOS userspace gives seconds-per- iteration. 3-phase plan: - Phase 1: tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile (10-line FROM nixos/nix + touch /etc/NIXOS + COPY . . + RUN tools/setup/install.sh) + TS wrapper bun tools/ci/docker-nixos-install-sh-test.ts per Rule 0 - Phase 2: GitHub Actions integration with path-filter scoped to tools/setup/** + full-ai-cluster/nixos/modules/common.nix - Phase 3: documented Docker-vs-QEMU coverage matrix; both run on install-substrate PRs (Docker fast, QEMU end-to-end) Empirical case for the substrate: iter-5.4 cascade has produced 8 distinct bugs (Bug 1-8) all caught only after operator USB flash; Docker harness would have caught Bug 5 (gh not in systemPackages), Bug 7 (NetBIOS conflict with smbd), Bug 8 (credential persistence gap) AT WRITE TIME. Composes with: B-0831 cascade #6 (complementary; full-VM scope at minutes-per-cycle); B-0835 install bug cluster (this row catches future Bug-N at write-time); B-0848 (node-local Claude install validation); B-0824 (Ace package-manager — Docker NixOS test is one slice of three-way-parity); GOVERNANCE §24 (three-way parity extended to NixOS-via-Docker validation surface). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude <noreply@anthropic.com>
This was referenced May 27, 2026
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…lane enable — claude service auto-starts on reboot using persisted iter-5.5.0 device-code credentials (#5392) Operator framing 2026-05-27 (verbatim): > "so our usb after gh and claude device code login it should reboot > with a claude service using my gh login" Direct composition with iter-5.5.0 install-time substrate (PR #5388 + #5389) which guarantees these paths exist post-install: /home/zeta/.config/claude/ device-code creds persisted /home/zeta/.config/gh/ gh device-code creds persisted /home/zeta/Zeta/ pre-cloned repo /home/zeta/.bun/bin/claude bun-installed claude binary /home/zeta/.local/share/mise/shims/ mise-managed runtimes 3 files: 1. full-ai-cluster/nixos/modules/zeta-otto.nix (NEW) - systemd service unit (User=zeta, Restart=always, MemoryMax=4G, CPUQuota=200%, configurable options) - Deliberately NOT After=k3s.service — Otto runs regardless of k3s state (otherwise can't repair k3s when broken) - ExecStart = wrapper script that loops: claude --print "<<autonomous-loop>>" then sleeps tickIntervalSec - Operator-tunable options: zeta.otto.{enable,user,group,home, tickIntervalSec,memoryMax,cpuQuota,restartSec} - /etc/zeta-otto-status.txt operator hint file 2. full-ai-cluster/nixos/modules/common.nix - Import ./zeta-otto.nix (module load) - Module disabled by default (zeta.otto.enable = false); nodes opt-in explicitly 3. full-ai-cluster/nixos/hosts/control-plane/configuration.nix - zeta.otto.enable = true (opt-in for control-plane) Per .claude/rules/non-coercion-invariant.md HC-8: operator authority preserved + revokable via `systemctl disable zeta-otto`. Per mechanical-authorization-check: zeta.otto.enable IS operator- explicit authorization (declarative config commit). Per tick-must-never-stop: systemd Restart=always ensures tick at strongest scope (kernel-managed; survives crashes). Composes with: B-0848 (node-local Claude — this row's Phase 1 IS the systemd deployment shape for the operational substrate), B-0847 (per-AI GitHub identity — Phase 4 of both rows align), B-0796 (Twilio out-of-band sibling), iter-5.5.0 substrate (PR #5388 + #5389 the credential persistence layer this row consumes). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude <noreply@anthropic.com>
3 tasks
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…aces zeta-otto.nix; multi-vendor multi-persona scaffold for ≥3-systemd-agents-on-bootup target (Aaron 2026-05-27) (#5394) Operator framing 2026-05-27 (verbatim): > "we should end up shipping with one service per surface i think > outside k8s and have at least 3 different vendors" > "so they can fix each other and the k8s cluster even when it's > down." > "the mutual repair is critical too becasue of you can see your > own future self boot script failures" > "yeah lets move all forward however and i can do as many > iterations testing as possible before we move to pc two we should > have three systemd agents and the cluster running on bootup" Refactor: 1. zeta-otto.nix DELETED — superseded by parameterized module 2. zeta-ai-agent.nix NEW — parameterized over persona/binary/configDir - Default personas: otto/alexa/riven/vera/lior (matches .claude/rules/agent-roster-reference-card.md canonical roster) - Per-persona enable: zeta.aiAgents.personas.<n>.enable = true - Each enabled persona → systemd unit zeta-<persona>.service - Naming + service-config + ExecStart loop pattern preserved from Phase 1 zeta-otto.nix 3. common.nix — imports zeta-ai-agent.nix (was zeta-otto.nix) 4. control-plane/configuration.nix — uses new option: zeta.aiAgents.personas.otto.enable = true (was zeta.otto.enable) - alexa/lior/vera/riven enable lines commented + tagged with pending sub-row IDs (B-0850.3a-3d) 5. /etc/zeta-ai-agents-status.txt operator hint (was zeta-otto- status.txt) — now lists ALL enabled personas + vendors Architectural justification (per B-0850 Phase 3 memory): - ≥3 vendor floor for BFT margin (f=1 fault tolerance) - Mutual repair handles self-modification-safety: when Otto's own systemd unit update breaks, OTHER agents (Alexa/Riven/Vera/Lior) detect via journalctl + restart via systemctl per repair-policy framework (B-0850 Phase 2 pending) - "Control plane outside the control plane" — all agents run as systemd units OUTSIDE k8s; can repair cluster from outside the failure domain when k3s/Cilium/etc is broken Next sub-rows to file + implement (B-0850 Phase 3 sub-targets): - B-0850.3a: zeta-install.sh extension for Alexa (Kiro/Qwen) - B-0850.3b: zeta-install.sh extension for Riven (Grok/Grok-Build) - B-0850.3c: zeta-install.sh extension for Vera (Codex/OpenAI) - B-0850.3d: zeta-install.sh extension for Lior (Gemini CLI) - B-0850.3e: per-vendor credential persistence (extends iter-5.5.0) - B-0850.3f: mutual-repair authorization (Phase 2 composes) - B-0850.3g: vendor-outage detection + auto-failover-tick - B-0850.3h: Docker harness test for each vendor's install path (composes with B-0849) Composes with: iter-5.5.0 substrate (PR #5388 + #5389 credential persistence layer); B-0848 (node-local Claude); B-0847 (per-AI GitHub identity Phase 4); B-0796 (Twilio out-of-band sibling); B-0703 multi-oracle BFT (consensus at multi-AI scope); B-0849 (Docker harness for per-vendor install validation); B-0824 (Ace multi-PM substrate composes with multi-vendor-multi-AI). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude <noreply@anthropic.com>
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…ration (~30-60 sec) for install.sh + mise + bun + iter-5.5.0; complements B-0831 QEMU (Aaron 2026-05-27) (#5393) * feat(B-0849 Phase 1): Docker NixOS install.sh test harness — fast iteration (~30-60 sec) for tools/setup/install.sh + linux.sh + mise.sh + iter-5.5.0 substrate; complements B-0831 QEMU full-install (~15 min) Operator framing 2026-05-27 (verbatim): > "we should add docker based nixos install.sh testing so we can > iterate quick that's an easy dockerfile" 2 files: 1. tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile (NEW) - Base: nixos/nix:latest (nix-the-package-manager + /nix/store) - touch /etc/NIXOS marker (triggers linux.sh's NixOS-detection branch per iter-5.5.1 PR #5389) - Enable nix flakes - COPY tools/setup + .mise.toml + package.json + bun.lock - RUN tools/setup/install.sh (exercises full install dispatcher) - Validate bun installed via mise + matches .mise.toml pin - Validate claude-code installable via bun install --global (exercises iter-5.5.0 substrate Step 6.95a) - Validate gh installable via nix-shell (Bug 5 fix substrate) - Final success marker echo 2. tools/ci/docker-nixos-install-sh-test.ts (NEW) - TS wrapper per Rule 0 (TS-over-bash for DST + cross-platform) - Wraps `docker build` with exit-code mapping + log capture + timeout enforcement (default 600s) + build-context discipline - --keep-image flag for inspection mode - DOCKER_BUILD_TIMEOUT_SEC + DOCKER_LOG_OUT_PATH env overrides - Exit codes: 0 success / 1 build-failed / 2 usage-error / 124 timeout - Standard tools/ci/ TS-wrapper pattern (matches qemu-boot-test.ts + qemu-full-install-test.ts conventions) What this catches (per the B-0849 row's empirical case): - iter-5.4 cascade Bug 5 (gh not in systemPackages) — Docker would have caught at write-time - iter-5.4 cascade Bug 7 (NetBIOS / Samba config) — partially (only install.sh path; full systemd-config needs QEMU) - iter-5.5.0 Bug 8 (credential persistence gap) — Docker would have caught the install.sh credential-copy logic if added (B-0850 Phase 2 substrate) - iter-5.5.1 alignment (bun → mise) — Docker validates the entire install.sh dispatcher + mise + bun flow Complementary to B-0831 cascade #6 (QEMU full-install ~15 min) at the cycle-time vs scope tradeoff: | Surface | Validates | Cycle time | |---|---|---| | Operator USB | End-to-end + reboot | ~30+ min | | B-0831 QEMU | End-to-end virtualized | ~15 min | | B-0849 Docker (this PR) | install.sh on NixOS userspace | ~30-60 sec | Composes with: B-0824 (Ace meta-PM — Docker test is one slice of three-way-parity validation extended to NixOS), B-0831 cascade #6 (QEMU complementary surface), B-0835 install bug cluster (Docker catches future Bug-N at write-time), B-0848 + B-0850 (node-local Claude + systemd service — install validation is prerequisite for Phase 1 of both). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(PR-5393 Copilot): remove unused 'join' import — only 'resolve' is used 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(PR-5393 Copilot 8 findings): centralized spawnDocker helper + .dockerignore + dirname() + nixos/nix pin + bun version exact match + name attribution removed 8 substantive Copilot findings (besides earlier `join` unused), all real + fixed: 1. Name attribution "Aaron" in header → "operator" (per .github/copilot-instructions.md no-name-attribution rule) 2. spawnSync("docker", ...) needs sonarjs/no-os-command-from-path suppression → centralized spawnDocker() helper with the suppression + rationale comment (matches tools/ci/audit-installer-iso-content.ts:186-194 pattern) 3. maxBuffer not set → spawnDocker sets maxBuffer 64 MiB (docker build --progress=plain is verbose; Node default 1 MiB overruns) 4. Build context = repo root + no .dockerignore → would send gigabytes of references/upstreams/ (per references-upstreams-* rule). NEW .dockerignore at repo root excludes: - references/upstreams/ (gigabytes per the rule) - node_modules/, .git/, bin/, obj/, target/, dist/, build/ - .DS_Store, .vscode/, .idea/, *.log - *.iso, *.qcow2, *.img, *.vmdk 5. logPath.substring(lastIndexOf("/")) breaks on Windows → use path.dirname() for cross-platform support 6. Second spawnSync for `docker rmi` needs same suppression → uses centralized spawnDocker helper (single point) 7. nixos/nix:latest is non-deterministic → pin to nixos/nix:2.31.2 with bump-procedure comment pointing at .claude/rules/dep-pin-search-first-authority.md 8. bun version check `^1\.` too loose vs .mise.toml pin "1.3" → tightened to `^1\.3\.` (matches 1.3.x but rejects 1.4+) Composes with: B-0849 row (this PR implements Phase 1), B-0849 row operator-named .dockerignore discipline composes with references-upstreams-not-our-code-search-excludes rule, dep-pin- search-first-authority rule (bump procedure for nixos/nix base image + bun pin tightening). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(PR-5393 Copilot 10th finding): add set -o pipefail to bun install RUN so pipeline exit-status propagates (was masked by tail) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(PR-5393 Copilot 4 more findings): ENV PATH for mise+bun across Docker layers (P0) + Dockerfile pin comment uses date not 'training data' (P2) + log default to .tools/ gitignored path (P1) + (meta) PR description note for .dockerignore P0 — install.sh exports PATH only in its own process; Docker doesn't persist exports across RUN layers. Without ENV PATH, validation RUNs would fail because mise/bun aren't on PATH in fresh layer shells. Added ENV PATH=/root/.bun/bin:/root/.local/share/mise/shims: /root/.local/bin:/usr/local/bin:/usr/bin:/bin AFTER install.sh runs + before validation RUNs. P2 — 'as of training data' comment replaced with actionable bump procedure: visit Docker Hub nixos/nix tags page, select current latest stable, ideally pin by digest. Pin selected 2026-05-27. P1 — default log path .docker-test-log lands in repo root unignored (would show as untracked + risk accidental commit). Changed to .tools/docker-nixos-install-sh-test.log (.tools/ is already gitignored at line 50 of .gitignore). Operator-override via DOCKER_LOG_OUT_PATH env var unchanged. Meta — PR description out of date (says 2 files; actual is 3 with .dockerignore). PR body update on next push will fix. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(PR-5393 lint tsc): strict typecheck fixes — coerce stdout/stderr buffer types to string + assert NodeJS.ErrnoException for .code access Two TS2 errors caught by lint (tsc tools): - TS2365 line 148: stdout+stderr was string|NonSharedBuffer; .toString() coerce - TS2339 line 154: result.error.code not on base Error; assert NodeJS.ErrnoException Verified locally: bun x tsc --noEmit -p tsconfig.json passes clean. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(PR-5393 Copilot 2 more findings): docstring drift fix + digest-pin nixos/nix base image P1 (docstring drift): DOCKER_LOG_OUT_PATH docstring still said '.docker-test-log' but actual default changed to '.tools/docker-nixos-install-sh-test.log' (per earlier P1 fix). Updated docstring to match + explain gitignored rationale. P1 (mutable tag): nixos/nix:2.31.2 pinned by tag but tags can be re-pushed upstream (mutable). Digest-pin to content-addressed sha256 makes CI failures attributable to actual upstream change rather than mystery base-image flake. Got digest via Docker Hub registry API directly (no docker pull required): sha256:29fc5fe207f159ceb0143c25c19c774062fee02ce5eda118f3067547b3054894 Bump procedure (per dep-pin rule): repeat curl + digest fetch. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude <noreply@anthropic.com>
This was referenced May 27, 2026
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…pic + Google diversity; @google/gemini-cli install + interactive auth login + zeta-lior.service enabled on control-plane (#5397) Operator authorization 2026-05-27: "yeah lets move all forward however and i can do as many iterations testing as possible before we move to pc two we should have three systemd agents and the cluster running on bootup" + "drive forward with whatever interests you most". Implements B-0850 Phase 3d — Lior persona (Google Gemini CLI) as the 2nd vendor toward the ≥3-systemd-agents target. Phase 3a (Alexa/Kiro) + Phase 3b (Riven/Grok) + Phase 3c (Vera/Codex) sub-rows remain pending; 3 of 4 still gives BFT margin once any one of the remaining 3 lands. 3 file changes: 1. zeta-install.sh Step 6.95a-gemini: bun install --global @google/gemini-cli (after the existing claude install). WebSearch verified at implementation time per dep-pin-search-first-authority rule — @google/gemini-cli is npm-published + bun-compat. 2. zeta-install.sh Step 6.95b-gemini: interactive `gemini auth login` prompt mirroring the claude login pattern. Supports OAuth via browser OR API key from AI Studio. Credentials persist to ~/.config/gemini/ with chown + chmod -R go-rwx (parallel to claude credential restriction). 3. zeta-ai-agent.nix: removed the lior assertion block (was blocking flake eval when zeta.aiAgents.enable.lior = true; B-0850.3d substrate now ships so assertion no longer applies). 4. control-plane/configuration.nix: zeta.aiAgents.enable.lior = true (was commented as pending). Two personas now enabled on control- plane: otto + lior. One more vendor (3c Vera/Codex easiest next since OpenAI's codex CLI is also npm-installable) gets to the ≥3-vendor BFT floor. Composes with: PR #5388 + #5389 iter-5.5.0 substrate (credential persistence + Zeta repo pre-clone); PR #5392 + #5394 + #5395 B-0850 Phase 1 + Phase 3 refactor; B-0848 (node-local Claude agent); B-0847 (per-AI GitHub identity Phase 4 align); B-0796 (Twilio out- of-band sibling). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude <noreply@anthropic.com>
AceHack
pushed a commit
that referenced
this pull request
May 27, 2026
…hropic + Google + OpenAI); @openai/codex install + `codex login --device-auth` + zeta-vera.service enabled (Aaron 2026-05-27) Per operator authorization "drive forward with whatever interests you most" + the ≥3-systemd-agents-on-bootup target named earlier 2026-05-27. This PR hits the ≥3-vendor BFT floor: otto → Anthropic Claude (PR #5392) lior → Google Gemini (PR #5397) vera → OpenAI Codex (THIS PR) With ≥3 vendors enabled, the cluster control-plane satisfies the fault-tolerance property Aaron named: f=1 BFT margin for vendor- outage resilience + self-modification-safety (any one AI's self- update breaks the other two can detect + repair). Stacked on PR #5397 (Phase 3d Lior/Gemini) to avoid merge conflicts; will rebase cleanly when #5397 merges first. 3 file changes: 1. zeta-install.sh Step 6.95a-codex: bun install --global @openai/ codex (WebSearch verified per dep-pin discipline; codex CLI is bun-compat npm package). 2. zeta-install.sh Step 6.95b-codex: interactive `codex login --device-auth`. This is the CLEANEST device-flow shape across the 3 vendors — prints URL + one-time code; pastes into ANY browser; no local browser handoff required (headless-friendly). Credentials cache at ~/.codex/auth.json (NOT ~/.config/codex/ — codex uses its own dotdir convention). 3. zeta-ai-agent.nix: removed vera assertion (substrate shipped). control-plane/configuration.nix: zeta.aiAgents.enable.vera = true. Composes with: PR #5397 (B-0850 Phase 3d Lior — sibling 2nd vendor); PRs #5388 + #5389 (iter-5.5.0 credential persistence); PRs #5392 + #5394 + #5395 (B-0850 Phase 1 + 3 refactor); B-0848 node-local Claude; B-0847 per-AI GitHub identity; B-0703 multi- oracle BFT (consensus at multi-AI scope — now operational at substrate-control-plane scope). Sources at PR open time (WebSearch per dep-pin-search-first- authority): - https://www.npmjs.com/package/@openai/codex - https://developers.openai.com/codex/auth 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…hropic + Google + OpenAI); @openai/codex install + device-flow auth + control-plane enable (Aaron 2026-05-27) (#5398) * feat(B-0850 Phase 3c): Vera/Codex 3rd vendor — hits ≥3 BFT floor (Anthropic + Google + OpenAI); @openai/codex install + `codex login --device-auth` + zeta-vera.service enabled (Aaron 2026-05-27) Per operator authorization "drive forward with whatever interests you most" + the ≥3-systemd-agents-on-bootup target named earlier 2026-05-27. This PR hits the ≥3-vendor BFT floor: otto → Anthropic Claude (PR #5392) lior → Google Gemini (PR #5397) vera → OpenAI Codex (THIS PR) With ≥3 vendors enabled, the cluster control-plane satisfies the fault-tolerance property Aaron named: f=1 BFT margin for vendor- outage resilience + self-modification-safety (any one AI's self- update breaks the other two can detect + repair). Stacked on PR #5397 (Phase 3d Lior/Gemini) to avoid merge conflicts; will rebase cleanly when #5397 merges first. 3 file changes: 1. zeta-install.sh Step 6.95a-codex: bun install --global @openai/ codex (WebSearch verified per dep-pin discipline; codex CLI is bun-compat npm package). 2. zeta-install.sh Step 6.95b-codex: interactive `codex login --device-auth`. This is the CLEANEST device-flow shape across the 3 vendors — prints URL + one-time code; pastes into ANY browser; no local browser handoff required (headless-friendly). Credentials cache at ~/.codex/auth.json (NOT ~/.config/codex/ — codex uses its own dotdir convention). 3. zeta-ai-agent.nix: removed vera assertion (substrate shipped). control-plane/configuration.nix: zeta.aiAgents.enable.vera = true. Composes with: PR #5397 (B-0850 Phase 3d Lior — sibling 2nd vendor); PRs #5388 + #5389 (iter-5.5.0 credential persistence); PRs #5392 + #5394 + #5395 (B-0850 Phase 1 + 3 refactor); B-0848 node-local Claude; B-0847 per-AI GitHub identity; B-0703 multi- oracle BFT (consensus at multi-AI scope — now operational at substrate-control-plane scope). Sources at PR open time (WebSearch per dep-pin-search-first- authority): - https://www.npmjs.com/package/@openai/codex - https://developers.openai.com/codex/auth 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(PR-5398 Copilot 4 findings — P0+P1+P1+P2): per-persona invocationArgs (claude --print / gemini -p / codex exec) + pipefail covers tail -5 for all 3 vendor installs + browser wording P0 (critical) — zeta-ai-agent.nix ExecStart was hardcoded to `${binary} --print "<<autonomous-loop>>"` for ALL personas, but: - claude uses --print ✓ - gemini uses -p (NOT --print) - codex uses `exec` SUBCOMMAND (no --print flag) Enabling lior or vera would create services with broken ExecStart. Fix: per-persona `invocationArgs` field in the persona registry. ExecStart uses `${cfg.home}/.bun/bin/${persona.binary} ${persona.invocationArgs}`. Per-persona values: - otto: [ "--print" "<<autonomous-loop>>" ] - lior: [ "-p" "<<autonomous-loop>>" ] - vera: [ "exec" "<<autonomous-loop>>" ] - alexa + riven: [ ] placeholder per their sub-rows P1 — Gemini bun install pipefail masked by tail -5 outside bash -c. Same root cause as the earlier P1 on claude install (Copilot found + I fixed only inside bash -c which doesn't cover outer pipeline). Real fix: move tail -5 INSIDE bash -c so set -o pipefail covers it. P1 — Same fix for codex bun install. P2 — codex device-flow prompt said "visit on this Mac browser" but codex device-auth is browser-agnostic ("visit on ANY browser on ANY device"). Note on the literal <<autonomous-loop>> sentinel: it's a Claude Code convention; gemini + codex will see it as a literal prompt and respond conversationally. Acceptable for first ship; per-vendor prompt mapping is B-0850 Phase 3.x future work. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude <noreply@anthropic.com>
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…ntry — routes by environment; SHORTER path than B-0854 Ace migration (Aaron 2026-05-27) (#5423) * feat(B-0857 P2): install.sh becomes universal Unix-like-OS entry — routes by environment; replaces zeta-install.sh on the short-path BEFORE B-0854 Ace migration (Aaron 2026-05-27) Operator: "when are we moving to install.sh over zeta-install.sh? the universall install surface for unix like oses?" Filed immediately per Aaron 2026-05-27 separation-of-concerns discipline ("recording row exists is critical for deferring work to reliably happen"). Implementation defers until current cred-persistence + cosign + self-register stack lands + next USB test validates. 10 sub-rows B-0857.1-10 enumerated. Key insight: this row is SHORTER than B-0854 (Ace migration) — imperative-bash unification of the existing entry point doesn't need Ace package work + doesn't block B-0854's longer-horizon declarative work. Audit sub-row B-0857.1 verifies PR #5389 commit-message claim that zeta-install.sh Step 6.95a invokes tools/setup/install.sh (grep of current file finds NO invocation — either drifted out or integration at higher abstraction layer; small bounded audit can ship quickly). Composes with B-0854 (Ace migration; long horizon) + B-0852 (cred-persistence; OS-agnostic) + B-0855 (self-register fix; OS-agnostic) + B-0853 (cosign verify; OS-agnostic) + B-0833 (installer creds discipline). Per Rule 0: install-graph carve-out preserved at tools/setup/. * docs(B-0857): correct framing — install.sh is universal BUILD MACHINE entry (not dev-env); Zeta cluster IS a build-machine cluster (Aaron 2026-05-27 Turn 2 sharpening) Operator caught my Turn 1 framing error: "tools/setup/install.sh has never been universal dev entry it's also unversal build machine and the zeta cluster IS a build machine cluster." The substrate-honest reading: install.sh is the universal BUILD-MACHINE entry — not "dev env" + "node install" as two separate things. The Zeta cluster IS a build-machine cluster (cluster nodes aren't deployment targets; they're build machines participating in the same build infrastructure as dev laptops). Therefore install.sh ALREADY applies operationally to both surfaces; the migration is recognizing that + factoring zeta-install.sh as the bootstrap-from-USB phase that prepares the build machine for install.sh to take over post-boot. Two-turn operator framing preserved in row body. Current-state table + routing table re-labeled as "build machine" surface. Phase distinction sharpened: zeta-install.sh = "turn this hardware into a NixOS-booting build machine"; install.sh = "configure runtime on this build machine" (same on laptop OR cluster node). This is the SAME ROW (B-0857 P2 deferred); no scope change. Just framing correction so future-Otto cold-boots don't inherit the dev-env-vs-cluster-node mental model that doesn't match the substrate-engineering reality. * docs(B-0857): Turn 3 sharpening — no distinction between build machines + prod when prod self-updates; install.sh is the universal machine entry (Aaron 2026-05-27 Turn 3) Operator Turn 3 supersedes Turn 2 framing: "there is no distinction between build machies and prod when prod can update itself" The substrate-honest reading: when production can self-update (mise + flake-lock pull + nixos-rebuild / deploy-rs), the build-machine-vs-prod distinction COLLAPSES. Same machine. Same install.sh. The whole cluster + every dev laptop is one self-updating organism running the same install/update entry. install.sh is therefore the universal Unix-like-OS install + self-update entry — the only operational machine-substrate-entry. Build / prod / dev are NOT different categories at the install-substrate scope; they're the SAME category (machines participating in Zeta) under different operational windows (first-install vs steady-state-update). Composes with iter-6.x distro-upgrade substrate (B-0800-B-0805) — those auto-upgrade rows are the SAME entry path; install.sh handles both first-install + stay-current via routing. Same row scope as Turn 2 fix; further framing sharpening. Future-Otto cold-boots inherit the unified-machine-entry model rather than the build-vs-prod mental model. --------- Co-authored-by: Lior <lior@zeta.dev>
6 tasks
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…invocation PRESENT (zeta-install.sh:1097) + corrects B-0857 row body authoring error (#5426) * docs(B-0857.1): audit verifies PR #5389 Step 6.95a invokes tools/setup/install.sh — integration PRESENT at zeta-install.sh:1097-1099; B-0857 row body corrected Sub-row audit per B-0857 implementation order step 1 ("audit current state"). Result: PR #5389's commit-message claim VERIFIED PRESENT on origin/main 0b61405; no drift; no repair needed. **The integration**: zeta-install.sh:1090-1100 Step 6.95a-bootstrap invokes \`tools/setup/install.sh\` via: sudo HOME="$ZETA_HOME" -u "#$ZETA_UID" \\ bash -c "cd $ZETA_HOME/Zeta && tools/setup/install.sh" Dispatch chain: install.sh → linux.sh (detects /etc/NIXOS) → common/mise.sh (reads .mise.toml, installs pinned runtimes). This extends GOVERNANCE §24 three-way-parity (dev + CI + devcontainer) to NixOS cluster nodes via the same canonical entry. **B-0857 row body correction**: The B-0857 row (#5423) body contained "grep of current zeta-install.sh finds NO actual invocation. Either drifted out or the integration is at a higher abstraction layer." This was an authoring error — the grep produces 9 matches; line 1097 is the load-bearing one. The authoring step skipped the verify-by-grep that this sub-row commits to. This is a substrate-drift catch caught at sub-row audit scope rather than at row-authoring scope. The B-0857.1 sub-row IS the corrective mechanism the parent B-0857 row called for; the audit found the row's own framing was the drift, not the integration substrate. Row body now reads: "Audit verified (B-0857.1, 2026-05-27): integration IS present at full-ai-cluster/usb-nixos-installer/zeta-install.sh:1097-1099 inside Step 6.95a-bootstrap; no drift; no repair needed." **Status**: closed at landing (no implementation work needed; substrate is correct). Composes with: B-0857 (parent — this corrects parent's body); PR #5389 (audited substrate); \`.claude/rules/grep-substrate-anchors-before-razor-as-metaphysical.md\` (sibling discipline: verify before asserting); \`.claude/rules/verify-existing-substrate-before-authoring.md\` (the discipline the B-0857 authoring step skipped; this audit catches the result); \`.claude/rules/blocked-green-ci-investigate-threads.md\` verify-before-fix discipline; \`.claude/rules/refresh-before-decide.md\` (underlying invariant at substrate-authoring scope). Per .claude/rules/non-coercion-invariant.md HC-8: substrate-honesty preserved; correction is additive (per retraction-native discipline) not erasing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(B-0857.1 CI): regen BACKLOG.md + MD032 blank-line + tsc strict-mode narrowing fix-fwd for B-0852.2a/2b/10 discriminated unions (3 CI failures resolved) Three CI failures on PR #5426 resolved in single fix-pass: 1. **check docs/BACKLOG.md generated-index drift**: regen via `BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts` to include new B-0857.1 sub-row entry. 2. **lint (markdownlint) MD032/blanks-around-lists** at line 60 of B-0857.1 sub-row: blank line inserted before ordered list per markdownlint canonical rule. 3. **lint (tsc tools)** type errors in B-0852.2a/2b/10 substrate from just-merged PRs #5421/#5418/#5425: discriminated-union narrowing pattern `if (!(x instanceof Buffer))` doesn't narrow under tsc strict mode (bun test passed because bun's TS is more lenient). Substrate-honest fix: switch all narrowing to the discriminant-property check `if ("error" in x)` which TS strict mode narrows correctly. Files changed: - `tools/installer/zeta-creds-envelope.ts` (4 occurrences in parseEnvelope: salt/iv/tag/ciphertext) - `tools/installer/zeta-cred-handlers.ts` (1 occurrence in resolveBakeCred) - `tools/installer/zeta-cred-handlers.test.ts` (replaceAll: 4+ occurrences in resolveValueSource test variants) Fix is functionally equivalent — both `instanceof Buffer` and `"error" in x` correctly distinguish the union at runtime; the difference is only in tsc's ability to narrow. All 36 tests still pass under bun test (verified pre-commit). This is fix-fwd to my own substrate (#5421 envelope + #5418 handlers + #5425 CLI rebase) discovered when CI ran on the chained-off #5426 PR. Tsc errors didn't surface on the source PRs because they used the same narrowing pattern that bun tolerates but tsc rejects under strict mode. Composes with: B-0857.1 (this PR's primary scope; sub-row audit); B-0852.2a/2b/10 (the substrate this fixes); PR #5421/#5425/#5418 (the originating PRs); `.claude/rules/blocked-green-ci-investigate-threads.md` (verify-then-fix discipline applied to CI failure investigation); `.claude/rules/refresh-before-decide.md` (raw CI output read before acting); `.claude/rules/holding-without-named-dependency-is-standing-by-failure.md` counter-with-escalation (CI failure IS named-dep + bounded work). Per .claude/rules/agent-worktree-hygiene-never-hold-main-...: isolated worktree at /private/tmp/zeta-b0857-1-audit-0817z; never touched operator's primary checkout. Per .claude/rules/non-coercion-invariant.md HC-8: substrate-honesty preserved — fix-fwd to my own substrate; correction is additive. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Lior <lior@zeta.dev> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix-forward for PR #5388 which merged BEFORE the alignment fix landed. Operator caught the drift:
PR #5388 added `bun` via nixpkgs systemPackages on cluster nodes — DRIFTED from the canonical `.mise.toml` (line 33: `bun = "1.3"`) used everywhere else (dev laptops + CI runners + devcontainers per GOVERNANCE §24 three-way-parity).
3-surface alignment
common.nix — `bun` removed; replaced with `mise` (canonical runtime version manager). mise then installs bun + all other .mise.toml-pinned runtimes for the zeta user.
tools/setup/linux.sh — added NixOS detection via `/etc/NIXOS` marker file. Skips apt step (NixOS handles system packages via common.nix systemPackages declaratively); proceeds to mise + downstream runtime setup. Three-way-parity extended to NixOS per operator framing.
zeta-install.sh Step 6.95a — replaces inline `bun install --global` with invocation of `tools/setup/install.sh` from the pre-cloned Zeta repo. Order rearranged: repo clone (was 6.95d) moved into 6.95a-bootstrap so .mise.toml is readable when install.sh fires. Then claude-code install uses mise-managed bun via shim PATH from `mise activate bash`.
Composes with
Test plan
🤖 Generated with Claude Code