feat(B-0850 Phase 1): zeta-otto systemd unit NixOS module + control-plane enable — claude service auto-starts on reboot using persisted iter-5.5.0 device-code creds (Aaron 2026-05-27)#5392

Merged
AceHack merged 1 commit into main from feat-b0850-1-zeta-otto-systemd-unit-nixos-module-autostart-on-reboot-2026-05-27-0050z
May 27, 2026

Conversation

@AceHack AceHack commented May 27, 2026

Summary

Aaron 2026-05-27 (verbatim):

"so our usb after gh and claude device code login it should reboot with a claude service using my gh login"

Direct composition with iter-5.5.0 substrate (PR #5388 + #5389) which persists creds + pre-clones repo + installs claude. This PR adds the systemd unit so claude auto-starts on reboot AS A SERVICE.

3 files

  1. full-ai-cluster/nixos/modules/zeta-otto.nix (NEW) — systemd unit (User=zeta, Restart=always, MemoryMax=4G, CPUQuota=200%); loops claude per tickIntervalSec; deliberately NOT After=k3s.service (Otto must run regardless of k3s state per the "control plane outside the control plane" pattern)
  2. common.nix — import the new module (disabled by default)
  3. control-plane/configuration.nix — `zeta.otto.enable = true` opt-in

Operator usage

```bash
systemctl status zeta-otto # current state
journalctl -u zeta-otto -f # live logs
systemctl restart zeta-otto # restart
systemctl disable zeta-otto # stop auto-start at boot (NCI HC-8 revocable); add --now to also stop the running unit
```

Operator-tunable options

  • `zeta.otto.enable` — opt-in per node
  • `zeta.otto.tickIntervalSec` (default 60) — autonomous-loop cadence
  • `zeta.otto.memoryMax` (default 4G) — resource bound
  • `zeta.otto.cpuQuota` (default 200%) — CPU quota
  • `zeta.otto.restartSec` (default 30) — restart backoff

Composes with

B-0848 (node-local Claude — this PR IS systemd deployment shape) · B-0847 (per-AI GitHub identity — Phase 4 aligns) · B-0796 (Twilio out-of-band sibling) · PRs #5388 + #5389 (iter-5.5.0 credential persistence layer this consumes) · B-0850 (this PR is the row's Phase 1)

🤖 Generated with Claude Code

…lane enable — claude service auto-starts on reboot using persisted iter-5.5.0 device-code credentials

Operator framing 2026-05-27 (verbatim):

  > "so our usb after gh and claude device code login it should reboot
  > with a claude service using my gh login"

Direct composition with iter-5.5.0 install-time substrate (PR #5388
+ #5389) which guarantees these paths exist post-install:

  /home/zeta/.config/claude/   device-code creds persisted
  /home/zeta/.config/gh/       gh device-code creds persisted
  /home/zeta/Zeta/             pre-cloned repo
  /home/zeta/.bun/bin/claude   bun-installed claude binary
  /home/zeta/.local/share/mise/shims/  mise-managed runtimes
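
A preflight check for the paths listed above might look like the following. This is a minimal sketch, not part of the PR: `missing_paths` is a hypothetical helper, and the default home directory `/home/zeta` is taken from the list above.

```bash
#!/usr/bin/env bash
# List which of the expected post-install paths are missing under a home dir.
# missing_paths is a hypothetical helper, not something the PR ships.
missing_paths() {
  local home="$1" rel
  for rel in .config/claude .config/gh Zeta .bun/bin/claude .local/share/mise/shims; do
    # Print only the paths that do not exist.
    [ -e "$home/$rel" ] || echo "$home/$rel"
  done
}

# Check the service user's home; prints nothing when every path exists.
missing_paths "${1:-/home/zeta}"
```

An empty result suggests the install-time substrate is in place. Any printed path would be worth fixing before enabling the service.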

3 files:

1. full-ai-cluster/nixos/modules/zeta-otto.nix (NEW)
   - systemd service unit (User=zeta, Restart=always, MemoryMax=4G,
     CPUQuota=200%, configurable options)
   - Deliberately NOT After=k3s.service — Otto runs regardless of
     k3s state (otherwise can't repair k3s when broken)
   - ExecStart = wrapper script that loops:
     claude --print "<<autonomous-loop>>" then sleeps tickIntervalSec
   - Operator-tunable options: zeta.otto.{enable,user,group,home,
     tickIntervalSec,memoryMax,cpuQuota,restartSec}
   - /etc/zeta-otto-status.txt operator hint file

2. full-ai-cluster/nixos/modules/common.nix
   - Import ./zeta-otto.nix (module load)
   - Module disabled by default (zeta.otto.enable = false); nodes
     opt-in explicitly

3. full-ai-cluster/nixos/hosts/control-plane/configuration.nix
   - zeta.otto.enable = true (opt-in for control-plane)
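
The wrapper-script loop described in item 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual script: `run_ticks` and its `max_ticks` argument are hypothetical names, added so the loop can be exercised without running forever. The real unit loops indefinitely around `claude --print "<<autonomous-loop>>"`.

```bash
#!/usr/bin/env bash
# Sketch of a tick loop: run a command, ignore its failure, sleep, repeat.
# run_ticks and max_ticks are hypothetical; max_ticks=0 means loop forever.
run_ticks() {
  local interval="$1" max_ticks="$2"
  shift 2
  local n=0
  while [ "$max_ticks" -eq 0 ] || [ "$n" -lt "$max_ticks" ]; do
    # "|| true" keeps one failed tick from ending the loop.
    "$@" 2>&1 || true
    n=$((n + 1))
    sleep "$interval"
  done
  echo "ticks=$n"
}

# Bounded demo with a command that always fails; prints "ticks=3".
run_ticks 0 3 false
```

In the service shape the PR describes, the interval would come from `tickIntervalSec`, and systemd's `Restart=always` would cover the case where the wrapper itself exits.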

Per .claude/rules/non-coercion-invariant.md HC-8: operator authority
preserved + revocable via `systemctl disable zeta-otto`. Per
mechanical-authorization-check: zeta.otto.enable IS operator-explicit
authorization (declarative config commit). Per tick-must-never-stop:
systemd Restart=always keeps the tick running at the strongest
available scope (supervised by systemd as PID 1, not by the kernel;
the service is restarted after crashes).

Composes with: B-0848 (node-local Claude — this row's Phase 1 IS
the systemd deployment shape for the operational substrate), B-0847
(per-AI GitHub identity — Phase 4 of both rows align), B-0796
(Twilio out-of-band sibling), iter-5.5.0 substrate (PR #5388 +
#5389 the credential persistence layer this row consumes).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 03:04
@AceHack AceHack enabled auto-merge (squash) May 27, 2026 03:04
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@AceHack AceHack merged commit 314344f into main May 27, 2026
30 checks passed
@AceHack AceHack deleted the feat-b0850-1-zeta-otto-systemd-unit-nixos-module-autostart-on-reboot-2026-05-27-0050z branch May 27, 2026 03:08
Copilot AI left a comment

Pull request overview

Adds a NixOS module to run “zeta-otto” as a persistent systemd service (outside Kubernetes) and enables it on the control-plane host, relying on the iter-5.5.0 install substrate for persisted gh/claude credentials and a pre-cloned repo.

Changes:

  • Introduces zeta.otto.* NixOS module options and a zeta-otto systemd unit that loops claude --print on a tick interval.
  • Imports the new module into the shared cluster baseline.
  • Opts the control-plane host into running the service at boot.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| full-ai-cluster/nixos/modules/zeta-otto.nix | New NixOS module defining zeta-otto systemd service + options + an operator hint file. |
| full-ai-cluster/nixos/modules/common.nix | Imports the new zeta-otto module into the baseline module set. |
| full-ai-cluster/nixos/hosts/control-plane/configuration.nix | Enables zeta.otto.enable = true on the control-plane node. |

    # pre-clones Zeta repo to /home/zeta/Zeta + installs claude-code via
    # mise-managed bun at /home/zeta/.bun/bin/claude.
    #
    # Operator framing (Aaron 2026-05-27):

Comment on lines +75 to +77

    # systemd service unit. Composes with the iter-5.5.0 install-time
    # credential persistence + repo pre-clone substrate (PR #5388 +
    # #5389) which guarantees these paths exist post-install:

Comment on lines +104 to +107

    # system PATH (where gh + kubectl + helm + etc. live)
    Environment = [
    "HOME=${cfg.home}"
    "PATH=${cfg.home}/.bun/bin:${cfg.home}/.local/share/mise/shims:/run/current-system/sw/bin:/usr/bin:/bin"

    systemctl status zeta-otto # current state
    journalctl -u zeta-otto -f # live logs
    systemctl restart zeta-otto # restart
    systemctl disable zeta-otto # stop auto-start at boot (operator override)

Comment on lines +22 to +24

    # B-0850 Phase 1: enable Otto systemd service on control-plane.
    # Operator framing 2026-05-27: "so our usb after gh and claude device
    # code login it should reboot with a claude service using my gh login".

Comment on lines +28 to +29

    # auto-starts on first boot AS A SERVICE. Operator can disable via
    # `systemctl disable zeta-otto` (NCI HC-8 revocable consent).

Comment on lines +127 to +128

    while true; do
    ${cfg.home}/.bun/bin/claude --print "<<autonomous-loop>>" 2>&1 || true

AceHack added a commit that referenced this pull request May 27, 2026
…pic + Google diversity; @google/gemini-cli install + interactive auth login + zeta-lior.service enabled on control-plane (#5397)

Operator authorization 2026-05-27: "yeah lets move all forward however
and i can do as many iterations testing as possible before we move to
pc two we should have three systemd agents and the cluster running on
bootup" + "drive forward with whatever interests you most".

Implements B-0850 Phase 3d — Lior persona (Google Gemini CLI) as the
2nd vendor toward the ≥3-systemd-agents target. Phase 3a (Alexa/Kiro)
+ Phase 3b (Riven/Grok) + Phase 3c (Vera/Codex) sub-rows remain
pending; 3 of 4 still gives BFT margin once any one of the remaining
3 lands.

4 changes across 3 files:

1. zeta-install.sh Step 6.95a-gemini: bun install --global
   @google/gemini-cli (after the existing claude install). WebSearch
   verified at implementation time per dep-pin-search-first-authority
   rule — @google/gemini-cli is npm-published + bun-compat.

2. zeta-install.sh Step 6.95b-gemini: interactive `gemini auth login`
   prompt mirroring the claude login pattern. Supports OAuth via
   browser OR API key from AI Studio. Credentials persist to
   ~/.config/gemini/ with chown + chmod -R go-rwx (parallel to
   claude credential restriction).

3. zeta-ai-agent.nix: removed the lior assertion block (was blocking
   flake eval when zeta.aiAgents.enable.lior = true; B-0850.3d
   substrate now ships so assertion no longer applies).

4. control-plane/configuration.nix: zeta.aiAgents.enable.lior = true
   (was commented as pending). Two personas now enabled on control-
   plane: otto + lior. One more vendor (3c Vera/Codex easiest next
   since OpenAI's codex CLI is also npm-installable) gets to the
   ≥3-vendor BFT floor.
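
The credential restriction mentioned in item 2 (`chmod -R go-rwx`) can be illustrated as below. This is a sketch under assumptions: `restrict_creds` is a hypothetical helper, the `credentials.json` file name is a placeholder rather than the Gemini CLI's real file layout, and the `chown` step is left out because it normally needs root and a real target user.

```bash
#!/usr/bin/env bash
# Restrict a credential directory to its owner, mirroring "chmod -R go-rwx".
# restrict_creds is a hypothetical helper, not the installer's actual code.
restrict_creds() {
  local dir="$1"
  mkdir -p "$dir"
  # Remove all group and other permissions, recursively.
  chmod -R go-rwx "$dir"
}

# Demo on a throwaway directory rather than a real ~/.config/gemini/.
demo_dir="$(mktemp -d)/gemini"
mkdir -p "$demo_dir"
echo '{}' > "$demo_dir/credentials.json"   # placeholder file name (assumption)
restrict_creds "$demo_dir"
ls -ld "$demo_dir" "$demo_dir/credentials.json"
```

After the call, the directory and file should be readable only by their owner.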

Composes with: PR #5388 + #5389 iter-5.5.0 substrate (credential
persistence + Zeta repo pre-clone); PR #5392 + #5394 + #5395 B-0850
Phase 1 + Phase 3 refactor; B-0848 (node-local Claude agent);
B-0847 (per-AI GitHub identity Phase 4 align); B-0796 (Twilio out-
of-band sibling).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack pushed a commit that referenced this pull request May 27, 2026
…hropic + Google + OpenAI); @openai/codex install + `codex login --device-auth` + zeta-vera.service enabled (Aaron 2026-05-27)

Per operator authorization "drive forward with whatever interests you
most" + the ≥3-systemd-agents-on-bootup target named earlier
2026-05-27. This PR hits the ≥3-vendor BFT floor:

  otto  → Anthropic Claude   (PR #5392)
  lior  → Google Gemini      (PR #5397)
  vera  → OpenAI Codex       (THIS PR)

With ≥3 vendors enabled, the cluster control-plane satisfies the
fault-tolerance property Aaron named: f=1 BFT margin for vendor-outage
resilience + self-modification-safety (if any one AI's self-update
breaks it, the other two can detect + repair).
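
The arithmetic behind a three-agent floor can be sketched as below. This is a generic illustration, not the project's B-0703 definition, and both function names are hypothetical. Under simple majority agreement, n agents tolerate floor((n-1)/2) failed agents. Classical Byzantine agreement protocols need n ≥ 3f+1, so they tolerate floor((n-1)/3). Which bound applies to the f=1 figure above depends on the fault model intended, which the source does not spell out.

```bash
#!/usr/bin/env bash
# Fault-tolerance arithmetic for n agents (integer division rounds down).
# Both helpers are hypothetical names used only for this illustration.
majority_tolerance()  { echo $(( ($1 - 1) / 2 )); }   # crash faults, majority vote
byzantine_tolerance() { echo $(( ($1 - 1) / 3 )); }   # classical n >= 3f+1 bound

for n in 3 4; do
  echo "n=$n majority_f=$(majority_tolerance "$n") byzantine_f=$(byzantine_tolerance "$n")"
done
# prints:
#   n=3 majority_f=1 byzantine_f=0
#   n=4 majority_f=1 byzantine_f=1
```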

Stacked on PR #5397 (Phase 3d Lior/Gemini) to avoid merge conflicts;
will rebase cleanly when #5397 merges first.

3 file changes:

1. zeta-install.sh Step 6.95a-codex: bun install --global @openai/
   codex (WebSearch verified per dep-pin discipline; codex CLI is
   bun-compat npm package).

2. zeta-install.sh Step 6.95b-codex: interactive `codex login
   --device-auth`. This is the CLEANEST device-flow shape across
   the 3 vendors: it prints a URL + one-time code that the operator
   enters in ANY browser; no local browser handoff required
   (headless-friendly). Credentials cache at ~/.codex/auth.json
   (NOT ~/.config/codex/; codex uses its own dotdir convention).

3. zeta-ai-agent.nix: removed vera assertion (substrate shipped).
   control-plane/configuration.nix: zeta.aiAgents.enable.vera = true.

Composes with: PR #5397 (B-0850 Phase 3d Lior — sibling 2nd
vendor); PRs #5388 + #5389 (iter-5.5.0 credential persistence);
PRs #5392 + #5394 + #5395 (B-0850 Phase 1 + 3 refactor); B-0848
node-local Claude; B-0847 per-AI GitHub identity; B-0703 multi-
oracle BFT (consensus at multi-AI scope — now operational at
substrate-control-plane scope).

Sources at PR open time (WebSearch per dep-pin-search-first-
authority):
- https://www.npmjs.com/package/@openai/codex
- https://developers.openai.com/codex/auth

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…hropic + Google + OpenAI); @openai/codex install + device-flow auth + control-plane enable (Aaron 2026-05-27) (#5398)

* feat(B-0850 Phase 3c): Vera/Codex 3rd vendor — hits ≥3 BFT floor (Anthropic + Google + OpenAI); @openai/codex install + `codex login --device-auth` + zeta-vera.service enabled (Aaron 2026-05-27)

* fix(PR-5398 Copilot 4 findings — P0+P1+P1+P2): per-persona invocationArgs (claude --print / gemini -p / codex exec) + pipefail covers tail -5 for all 3 vendor installs + browser wording

P0 (critical) — zeta-ai-agent.nix ExecStart was hardcoded to
`${binary} --print "<<autonomous-loop>>"` for ALL personas, but:
  - claude uses --print ✓
  - gemini uses -p (NOT --print)
  - codex uses `exec` SUBCOMMAND (no --print flag)
Enabling lior or vera would create services with broken ExecStart.

Fix: per-persona `invocationArgs` field in the persona registry.
ExecStart uses `${cfg.home}/.bun/bin/${persona.binary} ${persona.invocationArgs}`.
Per-persona values:
  - otto: [ "--print" "<<autonomous-loop>>" ]
  - lior: [ "-p" "<<autonomous-loop>>" ]
  - vera: [ "exec" "<<autonomous-loop>>" ]
  - alexa + riven: [ ] placeholder per their sub-rows
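
The per-persona mapping above can be sketched as a small lookup. This is an illustration only: `persona_argv` is a hypothetical shell helper, whereas the real mapping lives in the Nix persona registry as `invocationArgs`. The binary names (claude, gemini, codex) and flags follow the list above.

```bash
#!/usr/bin/env bash
# Map a persona name to its CLI binary and arguments, one argument per line.
# persona_argv is a hypothetical helper; the real registry is a Nix attrset.
persona_argv() {
  local prompt='<<autonomous-loop>>'
  case "$1" in
    otto) printf '%s\n' claude --print "$prompt" ;;
    lior) printf '%s\n' gemini -p "$prompt" ;;
    vera) printf '%s\n' codex exec "$prompt" ;;
    # alexa and riven have empty placeholders in the source, so they fail here.
    *)    echo "no invocation defined for persona: $1" >&2; return 1 ;;
  esac
}

# Show the argument vector that a vera service would run.
persona_argv vera
```

Keeping the arguments as separate items, rather than one pre-joined string, avoids quoting surprises if a prompt ever contains spaces.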

P1 — Gemini bun install pipefail masked by tail -5 outside bash -c.
Same root cause as the earlier P1 on the claude install: Copilot found
it, and I fixed it only inside bash -c, which doesn't cover the outer
pipeline. Real fix: move tail -5 INSIDE bash -c so set -o pipefail
covers it.

P1 — Same fix for codex bun install.
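
The pipefail scoping issue behind both P1 findings can be reproduced in isolation. In this sketch, `false` stands in for a failing install command and `pipeline_status` is a hypothetical helper. Neither is taken from zeta-install.sh.

```bash
#!/usr/bin/env bash
# Show why "set -o pipefail" inside bash -c does not protect an outer pipe.
# pipeline_status is a hypothetical helper: it runs a snippet in a fresh
# bash (pipefail off by default) and prints that snippet's exit status.
pipeline_status() {
  local status=0
  bash -c "$1" >/dev/null 2>&1 || status=$?
  echo "$status"
}

# tail is OUTSIDE the inner bash -c: the pipeline reports tail's status.
outside=$(pipeline_status 'bash -c "set -o pipefail; false" | tail -5')

# tail is INSIDE bash -c: pipefail now covers the whole pipeline.
inside=$(pipeline_status 'set -o pipefail; false | tail -5')

echo "outside=$outside inside=$inside"   # prints: outside=0 inside=1
```

The first form hides the failure because the pipeline's status is that of `tail`. The second surfaces it, which is the shape the fix describes.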

P2 — codex device-flow prompt said "visit on this Mac browser" but
codex device-auth is browser-agnostic ("visit on ANY browser on
ANY device").

Note on the literal <<autonomous-loop>> sentinel: it's a Claude
Code convention; gemini + codex will see it as a literal prompt
and respond conversationally. Acceptable for first ship; per-vendor
prompt mapping is B-0850 Phase 3.x future work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…ture — extends B-0850 (Mika ferry; Aaron 2026-05-27) (#5400)

* feat(B-0851): persona-first guard-post assignment + rotation architecture — extends B-0850 multi-vendor systemd substrate (Mika ferry; Aaron 2026-05-27)

Mika compressed framing (verbatim preserved at memory/persona/mika/
conversations/2026-05-27-...):

  > "Everything is Persona-first."
  > 1. Persona is the primary decision
  > 2. Persona constrains Model Line + Harnesses
  > 3. Tier choice AFTER persona + model line
  > 4. Harness LAST (compatible with model line + persona preferences)
  >
  > Rotation: ≥3 active guard posts always; persona / model line /
  > tier / harness ALL rotate; nothing locked to physical post.

Aaron operator clarification: "guard post is the systemd for each
node outside k8s" — confirms per-node ≥3 floor scope.

B-0850 Phase 1 + 3 substrate (PRs #5392+#5394+#5395+#5397+#5398) is
a VALID FIRST INSTANTIATION of persona-first architecture (default
scheduler = "static; same vendor; no rotation"; default ≥3 floor =
"3 enabled personas per node"). This row captures the architectural
target the Mika ferry names.

10 sub-row implementation slices:

- B-0851.1: persona-preferences-as-declaration (acceptable model
  lines + harnesses + min tier per persona)
- B-0851.2: guard-post-abstraction (decouple systemd unit name from
  persona name; zeta-guard-post-1/2/3.service)
- B-0851.3: scheduler primitive (NixOS module; per-tick assignment
  of guard-post → (persona, model line, tier, harness))
- B-0851.4: tier modeling (fast/medium/high per vendor's model-line
  catalog)
- B-0851.5: harness compat matrix (which harnesses each persona+
  model-line combo supports)
- B-0851.6: rotation policy (operator-config interval + dimensions +
  algorithm)
- B-0851.7: per-node ≥3 floor as guard-post count (migrate from per-
  persona-enable)
- B-0851.8: substrate continuity across rotation (per-persona memory
  inheritance survives vendor change)
- B-0851.9: failover semantics (vendor outage → re-assign per
  preferences; composes B-0703 multi-oracle BFT)
- B-0851.10: persona-vs-instance distinction (logical identity vs
  per-tick operational instance)
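
One possible shape for the rotation in B-0851.6 is a plain round-robin over personas, sketched below. This is an assumption for illustration only: the row leaves the rotation algorithm to operator config, `assign_posts` is a hypothetical name, and the sketch ignores model line, tier, and harness entirely. The `zeta-guard-post-N` names follow B-0851.2.

```bash
#!/usr/bin/env bash
# Round-robin sketch: assign three guard posts to personas for a given tick.
# assign_posts is hypothetical; the real scheduler is unspecified in the source.
assign_posts() {
  local tick="$1"; shift
  local personas=("$@") post idx
  for post in 1 2 3; do
    # Shift the starting persona by one position on every tick.
    idx=$(( (tick + post - 1) % ${#personas[@]} ))
    echo "zeta-guard-post-$post ${personas[$idx]}"
  done
}

assign_posts 0 otto lior vera   # post-1 otto, post-2 lior, post-3 vera
assign_posts 1 otto lior vera   # post-1 lior, post-2 vera, post-3 otto
```

With at least three personas, each tick places a distinct persona on every post, so no post stays locked to one persona.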

Composes with: B-0850 (parent — this extends), B-0703 multi-oracle
BFT, B-0824 Ace meta-PM (selection-authority same shape), B-0847
per-AI GitHub identity, B-0848 node-local Claude, B-0796 Twilio out-
of-band (voice is a harness type).

Does NOT replace B-0850. Refactor path; operator picks sub-row
priority order. Current shipped B-0850 satisfies ≥3 BFT floor +
format-test target; B-0851 extends toward Mika's preference-based
scheduler with rotation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(PR-5400 CI): regen MEMORY.md (1439 entries) + markdownlint MD032 blank lines around lists in B-0851 row Mika ferry quote block

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>