feat(B-0852.4a+4d): NixOS module zeta-creds-restore.nix + wire into cluster common.nix imports — last gate for end-to-end USB cred-persistence test (Aaron 2026-05-27 USB priority) #5476

Merged
AceHack merged 3 commits into main from feat/b-0852-4a-4d-nixos-module-plus-common-nix-wire-2026-05-27
May 27, 2026

@AceHack AceHack commented May 27, 2026

Summary

Two commits bundled — the NixOS module + the common.nix import — together completing the end-to-end USB cred-persistence chain.

Commit 1 (B-0852.4a): `full-ai-cluster/nixos/modules/zeta-creds-restore.nix` — systemd service `zeta-creds-restore.service` that decrypts `/esp/zeta-creds.enc` at boot (via B-0852.2b restore CLI), populates per-cred files, fires before B-0855.1 `zeta-self-register.service`. Two passphrase modes (file / interactive); disabled by default; opt-in per host config.

Commit 2 (B-0852.4d): adds `./zeta-creds-restore.nix` to `full-ai-cluster/nixos/modules/common.nix` imports list right after `./zeta-self-register.nix` — every cluster node now inherits the module surface; per-host opt-in via `zeta.credsRestore.enable = true;`.

End-to-end USB test path now complete

  1. Reflash USB with ISO carrying these changes
  2. Boot, run installer with ZETA_CREDS_PICKER=1 + ZETA_CREDS_PASSPHRASE=...
  3. Step 6.95-picker writes `/esp/zeta-creds.enc` (B-0852.3a; PR #5450, "feat(B-0852.3a): interactive cred-picker + zeta-install.sh Step 6.94 integration", in flight)
  4. Operator enables `zeta.credsRestore.enable = true;` in host config + pre-stages `/run/zeta-creds-passphrase`
  5. Reboot → `zeta-creds-restore.service` fires → blob decrypted → per-cred files populated
  6. `zeta-self-register.service` fires next per B-0855.1 ordering
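
Step 4 is the only step that touches host configuration. A minimal sketch of that opt-in, assuming the host file is an ordinary NixOS module that already pulls in `common.nix` (the file layout is an assumption; the option names `enable` and `passphraseMode` are the ones named in this PR and its follow-up commit):

```nix
# Hypothetical host config fragment, e.g. one file per cluster node.
{ ... }:
{
  zeta.credsRestore = {
    # The module ships disabled; each host opts in explicitly.
    enable = true;
    # "file" is described as the default mode; it is spelled out here only
    # to make the pre-staging requirement visible.
    passphraseMode = "file";
  };
}
```

With file mode the operator still has to place the passphrase at `/run/zeta-creds-passphrase` before the unit starts. Because `/run` is a tmpfs, that staging has to happen on every boot.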

Test plan

  • `nix-instantiate --parse` on both files → PARSE OK
  • Module disabled by default (opt-in via host config)
  • AgencySignature v1 trailers on both commits
  • Per .claude/rules/agent-worktree-hygiene-...: isolated worktree

🤖 Generated with Claude Code

Lior and others added 2 commits May 27, 2026 10:29
…ypt from ESP via systemd service (Aaron 2026-05-27 USB push; sibling to zeta-self-register.nix per B-0855.1)

Implements the boot-time consumer for the install-time picker (B-0852.3a
PR #5450). Composes with zeta-self-register.service which already
declares `after = "zeta-creds-restore.service"` per B-0855.1 module —
the dependency was wired upstream; this row makes the target service
actually exist.

**Module: full-ai-cluster/nixos/modules/zeta-creds-restore.nix**

NixOS module providing systemd service `zeta-creds-restore.service`:

- Disabled by default (`zeta.credsRestore.enable = false`); opt-in
  per host config (matches zeta-self-register sibling pattern)
- Ordering: `wantedBy=multi-user.target`, `after=local-fs.target` +
  `wants=local-fs.target` (ESP mounted before fire); B-0855.1 enforces
  `after=zeta-creds-restore.service` from its side
- ConditionPathExists guard: blob + USB UUID + restore CLI + bun shim
  must all exist (clean skip when picker wasn't run at install)
- Two passphrase modes (operator-configurable):
  - **file** (default): read from /run/zeta-creds-passphrase
    (operator pre-stages); deleted by ExecStopPost
  - **interactive**: systemd-ask-password on tty1 (300s timeout);
    writes zeta-readable temp file; deleted by ExecStopPost
- Invokes B-0852.2b restore CLI as zeta user via sudo with proper
  HOME + PATH + --target-root=/
- Optional --persona passthrough for per-persona-scoped creds
- Restart=on-failure with 30s backoff (per .claude/rules/non-coercion-invariant.md
  HC-8: required-cred failure surfaces honestly)
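
The shape described by the bullets above can be sketched as a small NixOS module. This is not the module from this PR: it is a reduced illustration under stated assumptions. The `blobPath` option name is hypothetical (only the default path appears in the text), the guard list is shortened to the blob alone, and the restore CLI invocation is replaced by a placeholder.

```nix
# Sketch only: not the module from this PR.
{ config, lib, ... }:
let
  cfg = config.zeta.credsRestore;
in
{
  options.zeta.credsRestore = {
    enable = lib.mkEnableOption "boot-time restore of the encrypted cred blob";

    passphraseMode = lib.mkOption {
      type = lib.types.enum [ "file" "interactive" ];
      default = "file";
      description = "Where the unit obtains the passphrase.";
    };

    # Hypothetical option name; only the default path appears in the PR text.
    blobPath = lib.mkOption {
      type = lib.types.str;
      default = "/esp/zeta-creds.enc";
      description = "Encrypted cred blob written by the install-time picker.";
    };
  };

  config = lib.mkIf cfg.enable {
    systemd.services.zeta-creds-restore = {
      wantedBy = [ "multi-user.target" ];
      after = [ "local-fs.target" ];
      wants = [ "local-fs.target" ];

      # A failed condition skips the unit cleanly instead of marking it failed.
      unitConfig.ConditionPathExists = [ cfg.blobPath ];

      serviceConfig = {
        Type = "oneshot";
        Restart = "on-failure";
        RestartSec = 30;
      };

      # Placeholder body; the real unit invokes the B-0852.2b restore CLI.
      script = ''
        echo "would restore creds from ${cfg.blobPath}"
      '';
    };
  };
}
```

Wrapping the whole `config` in `lib.mkIf cfg.enable` is what makes importing the module harmless on hosts that have not opted in: the options exist, but no unit is generated.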

**Verification**: `nix-instantiate --parse` returns PARSE OK.

**What this unblocks for operator's USB test**:

End-to-end persist → restore → use chain now possible on real USB:
1. Operator reflashes USB
2. Boots, runs installer with ZETA_CREDS_PICKER=1 + ZETA_CREDS_PASSPHRASE=...
3. Picker writes /esp/zeta-creds.enc (B-0852.3a / PR #5450)
4. Operator enables zeta.credsRestore.enable=true + passphraseMode in
   host common.nix (B-0852.4d wiring; next sub-row)
5. Reboot → systemd fires zeta-creds-restore.service → blob decrypts →
   per-cred files populated in /home/zeta
6. zeta-self-register.service fires next per B-0855.1 ordering

Composes:
- B-0852.1 crypto (PR #5413; decrypt envelope)
- B-0852.2a envelope (PR #5421; parse blob format)
- B-0852.2b restore CLI (PR #5425; the binary this module wraps)
- B-0852.3a picker (PR #5450; produces the blob)
- B-0852.4 row (PR #5454; this is sub-row 4a)
- B-0852.5 manifest (PR #5414; drives per-cred path resolution)
- B-0855.1 zeta-self-register.nix (the sibling module that already
  expects this service to exist)
- B-0857 install.sh universal entry (install-time companion)

Sub-row status (per B-0852.4 row):
- 4b: interactive mode implemented in this commit (both modes ship together)
- 4c: file mode implemented in this commit (the default mode)
- 4d: wire into common.nix (next sub-row; simple imports-list addition)
- 4e: empirical USB end-to-end test (validates full chain on hardware)

Per .claude/rules/agent-worktree-hygiene-never-hold-main-...: isolated
worktree at /private/tmp/zeta-b0852-4a-module-1250z; operator primary
checkout untouched.

Per .claude/rules/non-coercion-invariant.md HC-8: operator authority
over creds preserved; passphrase NEVER logged; interactive prompt
operator-driven; file-mode operator-staged; failure surfaces via
journalctl + restart policy.

Per .claude/rules/methodology-hard-limits.md: clinical/security floor
operative; cred-restore is purely defensive operator-data-recovery
substrate; no offensive use.

Heartbeat-via-commit per CLAUDE.md (PR #5451): this commit IS the
externalized counter tick; AgencySignature v1 trailer below; named
bounded-wait is #5450 build-iso completion.

Agency-Signature-Version: 1
Agent: Otto
Agent-Runtime: Claude Code (auto mode)
Agent-Model: claude-opus-4-7
Credential-Identity: aaron-otto-vscode
Credential-Mode: operator-authorized
Human-Review: pre-merge-pending
Human-Review-Evidence: operator-direction-2026-05-27-usb-push-keep-pushing-forward
Action-Mode: substrate-implementation
Task: B-0852.4a

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…imports — last gate before end-to-end USB test (Aaron 2026-05-27 USB priority)

Adds `./zeta-creds-restore.nix` to `full-ai-cluster/nixos/modules/common.nix`
imports list right after `./zeta-self-register.nix` — matches the
ordering B-0855.1 documents (zeta-self-register declares
`after = "zeta-creds-restore.service"`; both share import position).

Disabled-by-default (per the module's mkEnableOption); host configs
opt in via `zeta.credsRestore.enable = true;` AND operator pre-stages
a passphrase source. Imported here so every cluster-node type
(control-plane / worker-gpu) inherits the same module surface; the
opt-in flip lives at host-config level not common.nix level.
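
The wiring itself is a one-line addition. A sketch of the relevant part of `common.nix`, assuming the file's other imports and settings are left out:

```nix
# Sketch of the imports list in full-ai-cluster/nixos/modules/common.nix.
# Any other imports and settings in the real file are omitted.
{ ... }:
{
  imports = [
    ./zeta-self-register.nix
    ./zeta-creds-restore.nix
  ];
}
```

The position of an entry in `imports` is a readability choice only. Boot ordering between the two units comes from the `after` declaration on the zeta-self-register side, not from the order of this list.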

Composes:
- B-0852.4a (this PR's earlier commit ef45b4f) — the module file itself
- B-0852.3a picker (PR #5450) — install-time blob writer
- B-0852.4 row (PR #5454 merged) — substrate-engineering parent
- B-0855.1 zeta-self-register.nix — already declares `after = "zeta-creds-restore.service"`
- iter-5.5.0 install flow — picker writes blob during install; module restores at boot

**Empirical USB test path now complete end-to-end**:
1. Reflash USB with ISO carrying these changes
2. Boot, run installer with ZETA_CREDS_PICKER=1 + ZETA_CREDS_PASSPHRASE=...
3. Step 6.95-picker writes /esp/zeta-creds.enc (B-0852.3a)
4. Operator enables `zeta.credsRestore.enable = true;` in host config
   + pre-stages /run/zeta-creds-passphrase
5. Reboot → zeta-creds-restore.service fires → blob decrypted →
   per-cred files populated in /home/zeta
6. zeta-self-register.service fires next per B-0855.1 ordering

Verification:
- `nix-instantiate --parse full-ai-cluster/nixos/modules/common.nix` → PARSE OK
- `nix-instantiate --parse full-ai-cluster/nixos/modules/zeta-creds-restore.nix` → PARSE OK

Per .claude/rules/non-coercion-invariant.md HC-8: opt-in default
preserves operator authority over per-host enablement; importing
the module surface doesn't activate it.

Per .claude/rules/agent-worktree-hygiene-never-hold-main-...: isolated
worktree at /private/tmp/zeta-b0852-4a-module-1250z; operator primary
checkout untouched.

Agency-Signature-Version: 1
Agent: Otto
Agent-Runtime: Claude Code (auto mode)
Agent-Model: claude-opus-4-7
Credential-Identity: aaron-otto-vscode
Credential-Mode: operator-authorized
Human-Review: pre-merge-pending
Human-Review-Evidence: operator-direction-2026-05-27-back-to-usb-after-heartbeat-iteration
Action-Mode: substrate-implementation-final-usb-gate
Task: B-0852.4d

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 14:30
@AceHack AceHack enabled auto-merge (squash) May 27, 2026 14:30
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copilot AI left a comment

Pull request overview

This PR completes the NixOS-side of the USB credential persistence chain by introducing a new zeta-creds-restore module (boot-time decrypt + restore from /esp/zeta-creds.enc) and wiring it into the cluster-wide common.nix module import set so every node has the same opt-in surface.

Changes:

  • Adds full-ai-cluster/nixos/modules/zeta-creds-restore.nix, defining zeta.credsRestore.* options and a zeta-creds-restore.service oneshot unit.
  • Imports the new module from full-ai-cluster/nixos/modules/common.nix so it’s available across cluster node types.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| full-ai-cluster/nixos/modules/zeta-creds-restore.nix | New NixOS module + systemd unit to restore credentials from ESP at boot. |
| full-ai-cluster/nixos/modules/common.nix | Adds the new module to the shared cluster module import list. |

Comment thread full-ai-cluster/nixos/modules/zeta-creds-restore.nix Outdated
Comment thread full-ai-cluster/nixos/modules/zeta-creds-restore.nix
Comment thread full-ai-cluster/nixos/modules/zeta-creds-restore.nix Outdated
…0 ExecStopPost-never-fires + P1 USB UUID newline trim

3 Copilot threads on PR #5476:

**P0 (@180): sudo -u ${cfg.user} can't write to /etc paths.**
The default cred manifest includes /etc/zeta/operator-authorized-keys
+ /etc/ssh/ssh_host_* (root-owned paths zeta user can't write).
Fix: run the restore CLI as root directly (remove the `sudo -u zeta`
privilege drop). After the restore, run
`find ${cfg.home} -user root -exec chown zeta:users` to hand ownership
of the user-facing creds (~/.config/gh, ~/.config/claude, ~/.gemini,
~/.codex) back to zeta. The operator's pre-existing configs (already
zeta-owned) are left untouched by the `-user root` filter.

**P0 (@189): RemainAfterExit=true + Type=oneshot means
ExecStopPost never fires on successful boot.**
The unit stays "active" after ExecStart returns; systemd doesn't
treat that as a "stop" event so ExecStopPost is skipped. Passphrase
cleanup never runs. Fix: move cleanup to bash EXIT trap inside
ExecStart — fires on ANY exit path (success or failure), unaffected
by RemainAfterExit semantics. Removed standalone ExecStopPost.

**P1 (@140): USB_UUID trailing newline from cat.**
`cat /etc/zeta/usb-uuid` includes trailing \n if file ends with one.
Fix: `tr -d '[:space:]' < ${cfg.usbUuidPath}` strips all whitespace
(safer than just newlines; covers \r\n + leading whitespace too).
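
The three fixes above can be sketched together in one unit body. This is an illustration under assumptions, not the merged code: it presumes the module's options (`cfg.home`, `cfg.usbUuidPath`, both named in this commit message) are already declared elsewhere, the shell variable name `passphrase_file` is hypothetical, the `{} +` terminator on `find` is one possible choice, and the restore CLI call is elided.

```nix
# Sketch only: assumes zeta.credsRestore options are declared by the real module.
{ config, lib, ... }:
let
  cfg = config.zeta.credsRestore;
in
{
  systemd.services.zeta-creds-restore = lib.mkIf cfg.enable {
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };

    # Runs as root (no User= and no sudo), so /etc paths are writable.
    script = ''
      # Hypothetical variable name; the path is the file-mode location.
      passphrase_file=/run/zeta-creds-passphrase

      # Cleanup lives in an EXIT trap, so it runs on success and on failure
      # and does not depend on ExecStopPost.
      trap 'rm -f "$passphrase_file"' EXIT

      # Strip all whitespace, including a trailing newline or a stray \r.
      USB_UUID="$(tr -d '[:space:]' < ${cfg.usbUuidPath})"

      # Restore CLI invocation elided; it would consume USB_UUID and the
      # passphrase file here.

      # Hand root-created files under the home directory back to the user;
      # files that were already zeta-owned do not match the -user filter.
      find ${cfg.home} -user root -exec chown zeta:users {} +
    '';
  };
}
```

Putting the cleanup inside the script keeps the passphrase lifetime tied to the restore process itself, which is the property the ExecStopPost approach was meant to provide but could not while the unit stays active.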

Per .claude/rules/blocked-green-ci-investigate-threads.md verify-then-fix:
each Copilot finding read against actual file content; all 3 real
findings; bundled fix with rationale per finding.

Verification: `nix-instantiate --parse full-ai-cluster/nixos/modules/zeta-creds-restore.nix`
returns PARSE OK.

Per .claude/rules/non-coercion-invariant.md HC-8: operator authority
preserved (chown only touches root-owned files; pre-existing
zeta-owned files untouched).

Agency-Signature-Version: 1
Agent: Otto
Agent-Runtime: Claude Code (auto mode)
Agent-Model: claude-opus-4-7
Credential-Identity: aaron-otto-vscode
Credential-Mode: operator-authorized
Human-Review: pre-merge-pending
Human-Review-Evidence: copilot-3-findings-on-pr-5476-2-p0-1-p1
Action-Mode: substrate-fix-fwd-security-plus-correctness
Task: B-0852.4a

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AceHack AceHack merged commit fd0ca0c into main May 27, 2026
30 checks passed
@AceHack AceHack deleted the feat/b-0852-4a-4d-nixos-module-plus-common-nix-wire-2026-05-27 branch May 27, 2026 14:40
AceHack added a commit that referenced this pull request May 27, 2026
…ault-on with interactive prompt across all hosts via common.nix — closes the loop on operator's 'don't re-enter credentials over and over' pain point at INSTALLED-system-boot scope (#5640)

The B-0852 install-side substrate cascade (PRs #5637 + #5638 + #5639)
closes the 3 preconditions that gate the install-time cred-blob picker.
But the install-side picker only WRITES the blob; the operator pain
point ("don't re-enter credentials over and over everytime") closes
when the INSTALLED system READS the blob at boot + restores creds
WITHOUT operator intervention beyond a single passphrase prompt.

The zeta-creds-restore.nix NixOS module is already fully implemented
(PR #5476 et al; 214 lines; ConditionPathExists guards + scrypt+HKDF
decrypt via zeta-creds-restore.ts + writes per-cred paths). What was
missing: the module is opt-in (lib.mkEnableOption defaults to false)
and no host config opts in.

This commit flips the default at the common.nix import scope (every
host inherits the default-on) via lib.mkDefault on BOTH options:

  zeta.credsRestore = {
    enable = lib.mkDefault true;
    passphraseMode = lib.mkDefault "interactive";
  };

WHY SAFE: the module's systemd unit has ConditionPathExists checks
for blob + uuid + script + bun shim. On any host without a written
cred-blob (e.g., first install before Step 6.95-picker has fired,
OR fresh install where operator skipped Step 6.56 passphrase prompt),
the unit no-ops cleanly. On hosts WITH a blob, systemd-ask-password
prompts on tty1 for the passphrase at each boot.

WHY INTERACTIVE (not file): file mode requires operator to pre-stage
/run/zeta-creds-passphrase before service start, which is NOT
auto-restore. Interactive mode prompts the operator once at boot via
tty1 (well-established systemd pattern). Per-host opt-out path: set
zeta.credsRestore.passphraseMode = "file" + stage
/run/zeta-creds-passphrase via a separate mechanism (for headless /
cluster scenarios).

WHY mkDefault (not direct assignment): per-host configs can override
with a plain assignment, with no conflicting-definition error and no
lib.mkForce; the common module sets the FLEET DEFAULT; individual hosts
(e.g., headless control-plane-with-pre-staged-key, worker that should
never restore) can override per their substrate.
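
A per-host override under this scheme might look like the following sketch. The host file is hypothetical; the option names and values are the ones used in this commit message.

```nix
# Hypothetical host config for a headless node.
{ ... }:
{
  # A plain assignment outranks the lib.mkDefault value set in common.nix.
  zeta.credsRestore.passphraseMode = "file";

  # A worker that should never restore would instead set:
  # zeta.credsRestore.enable = false;
}
```

Had common.nix used direct assignments, the same host file would define the option a second time at equal priority and evaluation would stop on the conflict.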

END-TO-END OPERATOR EXPERIENCE (post all 4 PRs):

1. First install: boot live-USB; zeta-install.sh runs
   - Step 6.56: passphrase prompt (PR #5638)
   - iter-4.2: captures USB UUID -> /etc/zeta/usb-uuid (PR #5637)
   - Step 6.95-picker: auto-fires; writes /esp/zeta-creds.enc (PR #5639)
   - Operator goes through gh + claude + gemini + codex device-flow
     logins ONCE (existing behavior)
   - Picker bundles them into the encrypted blob

2. Subsequent boots: NixOS boots installed system
   - zeta-creds-restore.service fires (this PR + module substrate)
   - systemd-ask-password prompts on tty1 ONCE
   - Operator types the SAME passphrase they used at Step 6.56
   - Restore CLI decrypts blob + writes
     /home/zeta/.config/{gh,claude,gemini,codex} per declarative
     manifest (zeta-creds-manifest.ts)
   - Operator NEVER re-enters gh/claude/gemini/codex device-flow

That IS the operator's "don't re-enter credentials over and over"
solution.

LOCAL VALIDATION:
- Nix syntax: trivial change; the option assignments use the
  module's own option declarations + lib.mkDefault helper
- No new module dependencies

COMPOSES WITH:
- PR #5637 (B-0852.3a-prep USB UUID capture)
- PR #5638 (B-0852.3b passphrase prompt + unset-after-picker)
- PR #5639 (B-0852.3c default-flip with 4-path opt-out)
- full-ai-cluster/nixos/modules/zeta-creds-restore.nix (existing
  module substrate; no changes needed)
- tools/installer/zeta-creds-restore.ts (existing TS impl)
- B-0855.1 zeta-self-register (declares After =
  "zeta-creds-restore.service"; ordering preserved)
- full-ai-cluster/INJECTION-POINTS.md (catalog; in-flight item #5
  now operator-facing end-to-end)

Per operator 2026-05-27 multi-message direction + the operator-
blocking pain point + the "i'm counting on you to keep the backlog
going even when i'm not here" framing: this PR ships during operator-
offline window per the autonomous-loop's actual purpose.

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>

2 participants