feat(b-0852.4): flip zeta.credsRestore default-on with interactive mode via common.nix — closes the loop on 'don't re-enter credentials over and over' at installed-system-boot scope #5640

Merged
AceHack merged 1 commit into main from feat/b-0852.4-enable-zeta-creds-restore-default-with-interactive-passphrase-mode-2026-05-27
May 27, 2026

@AceHack AceHack commented May 27, 2026

Summary

Closes the loop on the operator pain point named 2026-05-27. The B-0852 install-side cascade (PRs #5637 + #5638 + #5639) writes the encrypted cred-blob; this PR enables the installed system to READ it at every subsequent boot.

What changed

Single 2-line addition to common.nix (+ multi-line comment update):

```nix
zeta.credsRestore = {
  enable = lib.mkDefault true;
  passphraseMode = lib.mkDefault "interactive";
};
```

Inherits across all hosts that import common.nix (control-plane, worker-gpu, worker-template, future configs).

Why safe

The module's systemd.services.zeta-creds-restore unit has ConditionPathExists guards for blob + uuid + script + bun shim. On any host without a written cred-blob (first install before the picker fires, OR a fresh install where the operator skipped Step 6.56), the unit no-ops cleanly: systemd treats an unmet condition as a skip, not as a failure.
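
The guard behaviour can be modelled as a small pure function. This is an illustrative sketch, not the module's code: `evaluateConditions` is a hypothetical name, and treating `/etc/zeta/usb-uuid` as the uuid guard path is an assumption. It mirrors systemd's rule that every plain `ConditionPathExists=` entry must hold, and that an unmet condition skips the unit.

```typescript
// Illustrative model of systemd's ConditionPathExists= semantics.
type UnitOutcome = "started" | "skipped";

function evaluateConditions(
  requiredPaths: readonly string[],
  existingPaths: ReadonlySet<string>,
): UnitOutcome {
  // All plain conditions are ANDed; one missing path skips the unit.
  const allPresent = requiredPaths.every((p) => existingPaths.has(p));
  return allPresent ? "started" : "skipped";
}

// Assumed guard list (two of the four guards, for brevity).
const guards = ["/esp/zeta-creds.enc", "/etc/zeta/usb-uuid"];

const freshHost = new Set<string>(["/etc/zeta/usb-uuid"]);
const hostWithBlob = new Set<string>([
  "/esp/zeta-creds.enc",
  "/etc/zeta/usb-uuid",
]);

const freshOutcome = evaluateConditions(guards, freshHost);
const blobOutcome = evaluateConditions(guards, hostWithBlob);
```

The same rule means a guard pointing at a path that never exists on the installed system makes the unit skip silently on every boot, so the guard paths have to match the installed system's mount layout.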

On hosts WITH a blob, systemd-ask-password prompts on tty1 at boot for the passphrase. Operator types ONCE per boot; restore CLI decrypts blob + writes /home/zeta/.config/{gh,claude,gemini,codex} per the declarative manifest.

Why interactive (not file)

  • File mode requires operator to pre-stage /run/zeta-creds-passphrase BEFORE service start — not auto-restore
  • Interactive mode prompts the operator ONCE at boot via tty1 (well-established systemd pattern) — IS auto-restore

Per-host opt-out path preserved: set zeta.credsRestore.passphraseMode = "file" for headless / cluster scenarios where tty1 prompting is inappropriate.

Why mkDefault

Per-host configs can override with a plain assignment, without lib.mkForce and without conflicting-definition errors; the common module sets the fleet default; individual hosts can override per their substrate.
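
Why a plain per-host assignment wins over `lib.mkDefault` can be shown with a simplified model of the module system's override priorities. This is a sketch, not Nix's implementation: `resolveOption`, `Definition`, and the host file name are hypothetical. The priority numbers are the ones nixpkgs uses (`mkForce` = 50, plain assignment = 100, `mkDefault` = 1000), where the lowest number wins.

```typescript
// Simplified model of NixOS module-system override priorities.
// Lower number wins: mkForce = 50, plain assignment = 100, mkDefault = 1000.
const PRIORITY = { mkForce: 50, plain: 100, mkDefault: 1000 } as const;

interface Definition<T> {
  source: string;
  priority: number;
  value: T;
}

function resolveOption<T>(defs: readonly Definition<T>[]): T {
  const winning = Math.min(...defs.map((d) => d.priority));
  const winners = defs.filter((d) => d.priority === winning);
  const [head] = winners;
  if (head === undefined) throw new Error("option has no definition");
  // Two different values at the same winning priority cannot be merged.
  if (winners.some((d) => d.value !== head.value)) {
    throw new Error("conflicting definition values");
  }
  return head.value;
}

const fleetDefault: Definition<string> = {
  source: "common.nix",
  priority: PRIORITY.mkDefault,
  value: "interactive",
};
// Hypothetical headless host that opts into file mode with a plain assignment.
const headlessHost: Definition<string> = {
  source: "hosts/headless.nix",
  priority: PRIORITY.plain,
  value: "file",
};

const resolvedMode = resolveOption([fleetDefault, headlessHost]);
```

Had common.nix used a plain assignment instead, both definitions would sit at priority 100 with different values and evaluation would stop with a conflict.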

End-to-end operator experience (post all 4 PRs)

| Step | What happens | PR |
| --- | --- | --- |
| First install: boot live-USB | zeta-install.sh runs | |
| Step 6.56 | Passphrase prompt | #5638 |
| iter-4.2 | USB UUID captured to /etc/zeta/usb-uuid | #5637 |
| Step 6.95-picker | Auto-fires; writes /esp/zeta-creds.enc | #5639 |
| Operator | Goes through gh/claude/gemini/codex device-flow logins ONCE | |
| Picker | Bundles them into the encrypted blob | |
| Subsequent boots | zeta-creds-restore.service fires | THIS PR |
| systemd-ask-password | Prompts on tty1 ONCE | THIS PR |
| Restore CLI | Decrypts blob + writes /home/zeta/.config/{gh,claude,gemini,codex} | THIS PR |
| Operator | NEVER re-enters gh/claude/gemini/codex device-flow | CLOSES LOOP |

Composes with

Test plan

🤖 Generated with Claude Code

…ault-on with interactive prompt across all hosts via common.nix — closes the loop on operator's 'don't re-enter credentials over and over' pain point at INSTALLED-system-boot scope

The B-0852 install-side substrate cascade (PRs #5637 + #5638 + #5639)
closes the 3 preconditions that gate the install-time cred-blob picker.
But the install-side picker only WRITES the blob; the operator pain
point ("don't re-enter credentials over and over everytime") closes
when the INSTALLED system READS the blob at boot + restores creds
WITHOUT operator intervention beyond a single passphrase prompt.

The zeta-creds-restore.nix NixOS module is already fully implemented
(PR #5476 et al; 214 lines; ConditionPathExists guards + scrypt+HKDF
decrypt via zeta-creds-restore.ts + writes per-cred paths). What was
missing: the module is opt-in (lib.mkEnableOption defaults to false)
and no host config opts in.

This commit flips the default at the common.nix import scope (every
host inherits the default-on) via lib.mkDefault on BOTH options:

  zeta.credsRestore = {
    enable = lib.mkDefault true;
    passphraseMode = lib.mkDefault "interactive";
  };

WHY SAFE: the module's systemd unit has ConditionPathExists checks
for blob + uuid + script + bun shim. On any host without a written
cred-blob (e.g., first install before Step 6.95-picker has fired,
OR fresh install where operator skipped Step 6.56 passphrase prompt),
the unit no-ops cleanly. On hosts WITH a blob, systemd-ask-password
prompts on tty1 for the passphrase at each boot.

WHY INTERACTIVE (not file): file mode requires operator to pre-stage
/run/zeta-creds-passphrase before service start, which is NOT
auto-restore. Interactive mode prompts the operator once at boot via
tty1 (well-established systemd pattern). Per-host opt-out path: set
zeta.credsRestore.passphraseMode = "file" + arrange /run/passphrase
staging via separate mechanism (for headless / cluster scenarios).

WHY mkDefault (not direct assignment): per-host configs can override
with a plain assignment and no conflicting-definition error; the
common module sets the FLEET DEFAULT;
individual hosts (e.g., headless control-plane-with-pre-staged-key,
worker that should never restore) can override per their substrate.

END-TO-END OPERATOR EXPERIENCE (post all 4 PRs):

1. First install: boot live-USB; zeta-install.sh runs
   - Step 6.56: passphrase prompt (PR #5638)
   - iter-4.2: captures USB UUID -> /etc/zeta/usb-uuid (PR #5637)
   - Step 6.95-picker: auto-fires; writes /esp/zeta-creds.enc (PR #5639)
   - Operator goes through gh + claude + gemini + codex device-flow
     logins ONCE (existing behavior)
   - Picker bundles them into the encrypted blob

2. Subsequent boots: NixOS boots installed system
   - zeta-creds-restore.service fires (this PR + module substrate)
   - systemd-ask-password prompts on tty1 ONCE
   - Operator types the SAME passphrase they used at Step 6.56
   - Restore CLI decrypts blob + writes
     /home/zeta/.config/{gh,claude,gemini,codex} per declarative
     manifest (zeta-creds-manifest.ts)
   - Operator NEVER re-enters gh/claude/gemini/codex device-flow

That IS the operator's "don't re-enter credentials over and over"
solution.

LOCAL VALIDATION:
- Nix syntax: trivial change; the option assignments use the
  module's own option declarations + lib.mkDefault helper
- No new module dependencies

COMPOSES WITH:
- PR #5637 (B-0852.3a-prep USB UUID capture)
- PR #5638 (B-0852.3b passphrase prompt + unset-after-picker)
- PR #5639 (B-0852.3c default-flip with 4-path opt-out)
- full-ai-cluster/nixos/modules/zeta-creds-restore.nix (existing
  module substrate; no changes needed)
- tools/installer/zeta-creds-restore.ts (existing TS impl)
- B-0855.1 zeta-self-register (declares After =
  "zeta-creds-restore.service"; ordering preserved)
- full-ai-cluster/INJECTION-POINTS.md (catalog; in-flight item #5
  now operator-facing end-to-end)

Per operator 2026-05-27 multi-message direction + the operator-
blocking pain point + the "i'm counting on you to keep the backlog
going even when i'm not here" framing: this PR ships during operator-
offline window per the autonomous-loop's actual purpose.

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 21:00
@AceHack AceHack enabled auto-merge (squash) May 27, 2026 21:00
@chatgpt-codex-connector
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@AceHack AceHack merged commit 72fea50 into main May 27, 2026
30 of 31 checks passed
@AceHack AceHack deleted the feat/b-0852.4-enable-zeta-creds-restore-default-with-interactive-passphrase-mode-2026-05-27 branch May 27, 2026 21:03

Copilot AI left a comment

Pull request overview

Flips zeta.credsRestore defaults in the shared NixOS common.nix so the boot-time credential restore service is enabled fleet-wide with interactive (systemd-ask-password on tty1) passphrase mode. Pairs with the install-side PRs (#5637/#5638/#5639) so that on every subsequent boot the operator types the passphrase once and the restore CLI rehydrates client creds from the encrypted USB-ESP blob.

Changes:

  • Set zeta.credsRestore.enable = lib.mkDefault true and passphraseMode = lib.mkDefault "interactive" in common.nix, inheriting to all hosts importing it.
  • Expand the import-block comment to document the default-on rationale, ConditionPathExists no-op safety, opt-out paths, and composition with B-0855.1 self-register ordering.

Comment thread full-ai-cluster/nixos/modules/common.nix
Comment thread full-ai-cluster/nixos/modules/common.nix
AceHack added a commit that referenced this pull request May 27, 2026
… + B-0852 cred-blob substrate + subsequent-boot restore — reflects PRs #5635 + #5637 + #5638 + #5639 + #5640 substrate now operator-facing (#5641)

Operator pain point context: PROVISIONING.md was written before the
B-0852 cred-blob substrate + B-0857.2 cluster-type menu landed; the
doc still described the bare HOST free-text prompt + had nothing
about the cred-blob substrate. Operators reading this doc had no way
to discover the new substrate they were going to encounter on next
install.

THIS COMMIT adds two new sub-sections to "Step 4: boot the box on
the USB":

### Interactive zeta-install.sh flow (Step 4 expansion)

Documents the 7 prompts in order that the operator sees when running
zeta-install.sh interactively:

1. iter-5.3 initial password (Step 6.55)
2. B-0852.3b cred-blob passphrase (Step 6.56; default-on per
   B-0852.3c since 2026-05-27)
3. iter-5.2 hostname injection (Step 6.6)
4. iter-5.1 WiFi persistence (Step 6.7)
5. iter-5.4.0 homelab gh-auth (Step 6.8)
6. Cluster-type menu (Step 6 host-attribute selection; B-0857.2
   per PR #5635 since 2026-05-27) — numbered menu with lspci-based
   hardware-detection suggested default
7. Step 6.95-picker cred-blob picker (B-0852.3c default-on with
   4-path opt-out)

### Subsequent-boot credential restore (B-0852.4 — operator pain
point closure)

Documents the systemd-ask-password prompt on tty1 at each boot +
the zeta-creds-restore.service flow + per-host opt-out paths
(enable=false; passphraseMode=file for headless).

Composes with:
- PR #5635 (cluster-type menu + lspci hardware detection)
- PR #5637 (B-0852.3a-prep USB UUID capture)
- PR #5638 (B-0852.3b passphrase prompt + unset-after-picker)
- PR #5639 (B-0852.3c picker default-on with 4-path opt-out)
- PR #5640 (B-0852.4 restore-service default-on with interactive
  mode; CLOSES THE LOOP at installed-system-boot scope)
- full-ai-cluster/INJECTION-POINTS.md (sibling injection-points
  catalog; this commit is the operator-facing PROVISIONING-side
  documentation that the catalog references at architectural scope)

Doc-only commit; no code paths touched; no zeta-install.sh edits
(stays out of the way of the 3 in-flight zeta-install.sh PRs).

Per operator 2026-05-27 multi-message direction: ships during
operator-offline window per the autonomous-loop's purpose.

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…memory classification; 24 unit tests; Rule 0 TS-over-bash discipline (#5642)

Extracts inline lspci heuristic from zeta-install.sh (PR #5635) into
testable TS module. Extends scope: detects storage (NVMe/SSD/HDD +
count), CPU (nproc + vendor_id), memory (GB). --suggested-host flag
outputs one of control-plane / worker-gpu / worker-template for bash
$(...) capture. 24 unit tests; pure-logic exports (no I/O during
tests). Does NOT yet modify zeta-install.sh (stays out of way of
in-flight #5638 + #5640). Follow-up commit will replace inline lspci
block with bun-invoke.
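
A minimal sketch of what a pure-logic suggested-host function might look like. The real heuristic is not shown in this commit message, so everything here is an assumption: the function names, the vendor list, and especially the rule that a GPU suggests worker-gpu while anything else falls back to worker-template. Only the three host names come from the text above.

```typescript
type HostKind = "control-plane" | "worker-gpu" | "worker-template";

// True when any lspci line describes a display-class device from NVIDIA or AMD.
function hasGpuFromKnownVendor(lspciOutput: string): boolean {
  const displayClass =
    /(VGA compatible controller|3D controller|Display controller)/;
  const gpuVendor = /(NVIDIA|Advanced Micro Devices)/i;
  return lspciOutput
    .split("\n")
    .some((line) => displayClass.test(line) && gpuVendor.test(line));
}

// Hypothetical rule: a matching GPU suggests worker-gpu; otherwise worker-template.
// This sketch never suggests control-plane automatically.
function suggestHost(lspciOutput: string): HostKind {
  return hasGpuFromKnownVendor(lspciOutput) ? "worker-gpu" : "worker-template";
}
```

Because the function takes the `lspci` text as an argument and never runs a subprocess itself, unit tests can feed it canned output.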

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack pushed a commit that referenced this pull request May 27, 2026
…opilot threads on #5644)

P1 — producer-side path mismatch ALSO needs fixing
  Prior commit fixed CONSUMER (restore service) but PRODUCER
  (Step 6.95-picker) was writing to /esp/zeta-creds.enc — which
  doesn't correspond to any mount. Target ESP is mounted at
  /mnt/boot during install (zeta-install.sh:226). Blob was
  landing on live USB rootfs, not target ESP. Reboot lost it.

  Fix: picker --output /esp/zeta-creds.enc → /mnt/boot/zeta-creds.enc.
  Producer now writes to target ESP mount, disko remounts as /boot
  post-reboot. Same physical file at two mount paths bridges the
  install-vs-installed boundary.

P2 — option doc style: strip PR-review history attribution
  Prior commit included "caught by Copilot review on PR #5640" in
  option doc. Repo convention: code/current-state docs use
  role-neutral present-tense contract text; PR-review history lives
  in commit messages + history surfaces.

  Fix: rewrite doc as present-tense contract for the option (what
  it configures + install-vs-installed mount convention for
  operators using non-default ESP layouts).

Validation: bash -n OK.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…inding on #5640 — restore service never fired) (#5644)

* fix(b-0852.4): correct cred-blob default path /esp → /boot to match installed-system ESP mount (Copilot finding on #5640)

PR #5640 shipped credsRestore.enable=true with blobPath defaulting to
`/esp/zeta-creds.enc`. Copilot review flagged the substrate-honest bug:

- At INSTALL-TIME (live USB), zeta-install.sh's Step 6.95-picker writes
  the blob to `/esp/zeta-creds.enc` because the live installer mounts
  the target ESP at `/esp`
- POST-REBOOT, disko (`disko-shapes/2nvme.nix`) mounts the SAME ESP
  partition at `/boot` per `mountpoint = "/boot"`
- The blob is the same physical file on the same ESP partition, but
  the mount path differs by context

The restore service runs POST-REBOOT, where the file is at
`/boot/zeta-creds.enc` — NOT `/esp/zeta-creds.enc`. So:

- ConditionPathExists = "/esp/zeta-creds.enc" always evaluates FALSE
  on the installed system (`/esp` doesn't exist post-reboot)
- systemd silently skips the unit (condition unmet)
- restore-from-cred-blob NEVER FIRES on any installed node
- creds are never restored at boot
- operator has to manually re-enter every credential each reboot —
  which is exactly the pain point the whole B-0852 cascade was
  designed to solve

This commit changes the default to `/boot/zeta-creds.enc` so the
service can actually find the blob it's supposed to decrypt. Also
expands the option description to explain the install-vs-installed
mount-path distinction so future maintainers don't reintroduce the
same confusion.

No changes to zeta-install.sh: the install-time write to
`/esp/zeta-creds.enc` is correct for the install-time context;
disko's later remount-as-/boot is what makes the file accessible
at the new path.

Validation:
- `nix-instantiate --parse zeta-creds-restore.nix` parses clean
  (no syntax change; only literal value + description text)
- Substrate-honest: this is a single-line semantic fix; the
  multi-line description expansion is wake-time substrate for the
  next maintainer who edits this module

Composes with #5640 (the row that surfaced the issue), #5643
(passphrase-env supersede), and the B-0852 cred-persistence cascade
(#5635 + #5637 + #5638 + #5639 + #5640 + #5641 + #5642).

Addresses CRITICAL Copilot finding on #5640.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fixup(b-0852.4): producer-side /esp → /mnt/boot + clean option doc (Copilot threads on #5644)

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…consumer paths (catches PR #5640/#5644 bug class at CI time) (#5649)

Adds CROSS_FILE_ASSERTIONS class to audit-installer-substrate.ts. Each
assertion names a pair of files that MUST agree on a shared contract
(producer/consumer paths, shared filenames, shared env var names), with
producer/consumer extract functions + a consumerEquivalence transform
that maps producer state to expected consumer state.

First assertion: cred-blob path producer/consumer consistency.

- Producer: zeta-install.sh Step 6.95-picker --output path
  (must be on /mnt/boot/ since that's where the target ESP is
  mounted during install per zeta-install.sh Step 5)
- Consumer: zeta-creds-restore.nix blobPath default
  (must be on /boot/ since that's where disko remounts the same
  ESP post-reboot per disko-shapes/2nvme.nix)
- Equivalence: /mnt/boot/<file> ↔ /boot/<file>

Bug class this catches: producer writes one path, consumer reads
another, ConditionPathExists silently evaluates false, restore
service never fires, operator wonders why creds keep re-prompting.
Surfaced empirically by Copilot review on PR #5640 + PR #5644 —
took manual code review to spot the install-vs-installed mount-path
mismatch. This audit makes the check automatic.

Also asserts that producer is ON /mnt/boot/ (not e.g. /esp/ which
doesn't exist on the live USB at all). A producer path off
/mnt/boot/ returns a sentinel that can never match consumer,
surfacing the producer-on-wrong-mount bug via the same cross-file
fail path.
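
The equivalence and the off-mount sentinel can be sketched as follows. This is an assumption-laden illustration rather than the audit's code: `consumerEquivalence` is named above, but its signature, the `pathsAgree` helper, and the sentinel string are hypothetical.

```typescript
// Sketch of the producer -> consumer path equivalence for the cred-blob.
const PRODUCER_MOUNT = "/mnt/boot/"; // target ESP during install
const CONSUMER_MOUNT = "/boot/"; // same ESP on the installed system
// Hypothetical sentinel that no real consumer path can equal.
const OFF_MOUNT_SENTINEL = "<producer-not-on-/mnt/boot>";

// Maps a producer path to the path the consumer is expected to read.
function consumerEquivalence(producerPath: string): string {
  if (!producerPath.startsWith(PRODUCER_MOUNT)) return OFF_MOUNT_SENTINEL;
  return CONSUMER_MOUNT + producerPath.slice(PRODUCER_MOUNT.length);
}

function pathsAgree(producerPath: string, consumerPath: string): boolean {
  return consumerEquivalence(producerPath) === consumerPath;
}

const aligned = pathsAgree("/mnt/boot/zeta-creds.enc", "/boot/zeta-creds.enc");
const bothOnEsp = pathsAgree("/esp/zeta-creds.enc", "/esp/zeta-creds.enc");
```

Note that `bothOnEsp` is false even though the two strings are identical: the sentinel makes a producer on the wrong mount fail the check regardless of what the consumer reads.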

New exit code 4 for cross-file-mismatch isolated failures (mixed
failure classes still exit 1 prioritized). Audit output groups all
failure kinds in a single FAIL report so CI logs show everything
that needs fixing in one pass.

Substrate-honest: this PR depends on PR #5644 landing first (which
moves both sides to the correct paths). Open after #5644 merges;
without it, the audit correctly identifies current main as failing.

Composes with:
- PR #5640 (the row that introduced the restore-service default-on
  with bad paths)
- PR #5644 (the fix-fwd that aligns both sides to /mnt/boot ↔ /boot)
- B-0852 cred-persistence cascade

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack pushed a commit that referenced this pull request May 27, 2026
…D018 fix (Copilot 6 threads on #5648)

Comprehensive accuracy rewrite addressing all 6 Copilot findings:

1. "no more re-entering" overclaim — passphraseMode=interactive
   DOES prompt every boot via systemd-ask-password. Reframed
   accurately: N per-tool login flows → ONE cred-blob passphrase.
   The improvement is atomicity, not zero typing.

2. Install log lines mismatch — restored to match actual zeta-install.sh
   output (Step 6.56 + Step 6.95-picker actual strings).

3. /boot path correctness — preserved (#5644 already fixed
   producer/consumer alignment to /mnt/boot ↔ /boot).

4. Manifest coverage — included gemini + codex paths
   (~/.gemini/oauth_creds.json, ~/.codex/auth.json) plus the
   full default-manifest table.

5. Second-reboot expectation — corrected: interactive mode prompts
   every boot by design. Operator who wants no-prompt-at-boot can
   switch to passphraseMode="file" (with security tradeoff named).

6. Filename reference — zeta-creds-cli.ts → zeta-creds-manifest.ts
   (actual canonical location of defaultManifest).

Also fixes MD018 lint failure: line "#5639 + #5640 + #5643 + #5644 +"
was flagged as an ATX heading missing the space after # because # was
at column 1. Replaced the line-wrapped PR-number prose with the
default-manifest table (more useful + no MD018 trigger).
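
For illustration, the pattern behind MD018 can be approximated with a one-line regular expression. This is a rough sketch, not markdownlint's implementation (it does not skip fenced code blocks, for example), and `findMissingSpaceAtx` is a hypothetical name.

```typescript
// Returns 1-based numbers of lines that start with one or more '#'
// followed directly by a character that is neither '#' nor whitespace.
function findMissingSpaceAtx(markdown: string): number[] {
  const pattern = /^#+[^#\s]/;
  return markdown
    .split("\n")
    .map((line, i) => (pattern.test(line) ? i + 1 : 0))
    .filter((n) => n > 0);
}

const flagged = findMissingSpaceAtx(
  ["# Title", "cascade (PRs #5635 + #5637 +", "#5639 + #5640 + #5643 + #5644 +"].join("\n"),
);
```

Only the third line is flagged: a PR number in the middle of a line is harmless, but the same token wrapped to column 1 looks like a heading with its space missing.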

Composes with:
- B-0852 cred-persistence cascade (PRs that ACTUALLY ship: #5635,
  #5637, #5639, #5640, #5641, #5642, #5644, #5645, #5646, #5648,
  #5649, #5650; #5638 + #5643 were superseded → closed without merge)
- common.nix passphraseMode=interactive default (PR #5640)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…o-end verification checklist for operator) (#5648)

* docs(provisioning): add cred-restore smoke-test section — first-boot + post-reboot + second-reboot verification + troubleshooting table (B-0852 end-to-end)

The B-0852 cred-persistence cascade (PRs #5635 + #5637 + #5638 +
#5639 + #5640 + #5643 + #5644 + #5646) closes the operator's
'don't re-enter creds over and over' pain point. This docs addition
gives operators a concrete checklist to verify the full path works
after a fresh USB install:

- First-boot verification: what install log lines to look for
- Post-reboot verification: systemctl + ls + auth-status commands
- Second-reboot verification: confirm no re-entry needed
- Troubleshooting table: 4 common symptoms with likely causes

Closes the gap between 'cascade is shipped' and 'operator can
confirm cascade works on their hardware'. The operator no longer
has to figure out which systemd unit to query or which paths to
check — the checklist names them.

Composes with:
- PROVISIONING.md (existing operator-facing install doc)
- B-0852 cred-persistence substrate
- The audit-extension PR (separate; catches drift at CI time)

Substrate-honest scope: this is operator docs, not a TS tool. A
follow-on TS smoke-test runner (run on the installed system to
auto-verify the checklist) is a candidate for follow-up work but
out of scope for this commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fixup(docs): rewrite cred-restore smoke-test section for accuracy + MD018 fix (Copilot 6 threads on #5648)

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…re the just-written blob at install time (operator catches bad blob BEFORE reboot, not at first boot) (#5655)

Adds opt-in --verify flag to zeta-creds-picker.ts. When set, after
zeta-creds-persist succeeds, the picker spawns zeta-creds-restore.ts
with --dry-run + the same passphrase source + a tmpdir as
--target-root. If restore-dry-run exits 0, the blob is confirmed
cryptographically valid + manifest-parseable. If non-zero, the
operator sees an actionable error at install time + can re-run the
picker to retry.

Operator-experience improvement: without --verify, a corrupt blob
(wrong passphrase captured, disk write error, persist bug) only
surfaces at first reboot when zeta-creds-restore.service fails its
ConditionPathExists or scrypt-decrypt step. At that point the
operator must reboot back into the live USB + re-run the install.
With --verify, the same failure surfaces SECONDS after persist,
inside the running install flow, with the live USB still mounted.

New exit code 5 for verify-failed (distinct from persist-failed=4).

API addition:
- PickerArgs gains `verify: boolean` (default false; opt-in)
- New export buildVerifyArgs(parsed, tmpTargetRoot) — pure
  composer of the restore-CLI argv list; testable in isolation
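
A hedged sketch of such a pure composer, using only the flags named in this commit message (`--dry-run`, `--target-root`, `--passphrase-file`, `--persona`). The `PickerArgs` fields other than `verify` are assumptions, and how the blob path reaches the restore CLI is left out because the message does not say.

```typescript
// Hypothetical subset of PickerArgs; only `verify` is named in the commit message.
interface PickerArgs {
  verify: boolean;
  passphraseFile?: string;
  persona?: string;
}

// Composes the restore-CLI argv for a dry-run against a throwaway target root.
// Pure: no spawning, no filesystem access.
function buildVerifyArgs(parsed: PickerArgs, tmpTargetRoot: string): string[] {
  const args = ["--dry-run", "--target-root", tmpTargetRoot];
  if (parsed.passphraseFile !== undefined) {
    args.push("--passphrase-file", parsed.passphraseFile);
  }
  if (parsed.persona !== undefined) {
    args.push("--persona", parsed.persona);
  }
  return args;
}

const minimalArgs = buildVerifyArgs({ verify: true }, "/tmp/verify-root");
```

Keeping the argv construction separate from the spawn call is what makes it testable in isolation: the tests listed below only need to compare string arrays.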

Tests added (3 new + 2 parseArgs-extension):
- --verify flag default false
- --verify flag parsed when passed
- buildVerifyArgs composes restore-CLI args with --dry-run + tmpdir
- buildVerifyArgs propagates --passphrase-file when picker used file
- buildVerifyArgs propagates --persona when set

21 pass / 0 fail (was 16; +5).

Substrate-honest scope: opt-in only. Future PR can flip default-on
after operator empirical testing confirms verify doesn't introduce
new failure modes (e.g., tmpdir permission, restore-CLI changes).
zeta-install.sh Step 6.95-picker currently does NOT pass --verify;
that flip can land in a follow-up after operator tests.

Composes with:
- B-0852 cred-persistence cascade (#5635 + #5637 + #5639 + #5640 +
  #5642 + #5644 + #5645 + #5646 + #5648 + #5649 + #5650)
- tools/installer/zeta-creds-restore.ts (existing --dry-run mode)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>