
fix(b-0852.3b-supersede): narrow passphrase env-exposure window (supersedes #5638) #5643

Closed
AceHack wants to merge 2 commits into main from fix/pr-5638-passphrase-env-exposure-narrow-window-2026-05-27

Conversation

Member

AceHack commented May 27, 2026

Summary

Supersedes #5638 with both Copilot findings addressed from the start.

Finding 1 (P1): passphrase exported into installer env for ~750 lines, visible via /proc/<installer-pid>/environ
Finding 2 (P1): unset fired only inside the picker-ran branch; passphrase remained live when picker was skipped

Fix

  • Step 6.56: capture into NON-EXPORTED shell variable ZETA_CREDS_PASSPHRASE_VAL (not in /proc/.../environ)
  • Step 6.95: inline-env-set ZETA_CREDS_PASSPHRASE="$ZETA_CREDS_PASSPHRASE_VAL" sudo --preserve-env=ZETA_CREDS_PASSPHRASE, which sets the variable in the sudo subprocess's environment only
  • Post-picker block: unset ZETA_CREDS_PASSPHRASE_VAL fires UNCONDITIONALLY after if/else, in BOTH branches

Test plan

  • `bash -n` syntax check passed
  • Docker harness (`bun tools/ci/docker-nixos-install-sh-test.ts`) passed in 22s
  • grep audit: no `export ZETA_CREDS_PASSPHRASE*` anywhere in zeta-install.sh
  • Tree-count canary 61 (clean; no corruption)
  • Operator can verify by booting USB + entering passphrase + checking `grep -a CREDS /proc/$(pgrep -f zeta-install)/environ` returns empty during install
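The grep audit in the test plan can be sketched as a small reusable check. This is an illustration only: the function name `audit_no_export` and the sample files are hypothetical, and the real audit may use a different pattern.

```bash
#!/usr/bin/env bash
# audit_no_export FILE: succeed only if FILE contains no
# `export ZETA_CREDS_PASSPHRASE...` statement. (Hypothetical helper.)
audit_no_export() {
  # grep exits 0 when it finds a match, so the result is inverted.
  ! grep -nE 'export[[:space:]]+ZETA_CREDS_PASSPHRASE' "$1"
}

# Two throwaway sample scripts: one clean, one that exports the secret.
tmp=$(mktemp -d)
printf 'ZETA_CREDS_PASSPHRASE_VAL="$input"\n' > "$tmp/good.sh"
printf 'export ZETA_CREDS_PASSPHRASE="$input"\n' > "$tmp/bad.sh"

audit_no_export "$tmp/good.sh" && echo "good.sh: clean"
audit_no_export "$tmp/bad.sh"  || echo "bad.sh: export found"
rm -rf "$tmp"
```

The pattern matches both `ZETA_CREDS_PASSPHRASE` and `ZETA_CREDS_PASSPHRASE_VAL`, since the second name starts with the first.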

Closes

Closes #5638 with both findings addressed. PR #5638 will be closed with cross-reference.

🤖 Generated with Claude Code

…-exported shell var ZETA_CREDS_PASSPHRASE_VAL + inline-set for sudo only + unconditional unset (supersedes #5638)

Addresses both Copilot findings on #5638:

Finding 1 (P1 — wider env-exposure window than necessary):
PR #5638's Step 6.56 exported ZETA_CREDS_PASSPHRASE into the installer
process env at line ~574 and left it there for ~750 lines until the
Step 6.95 picker invocation. During that window the passphrase was
readable via /proc/<installer-pid>/environ by root and by any process
running under the same UID.

Fix: capture the passphrase into a NON-EXPORTED shell variable
ZETA_CREDS_PASSPHRASE_VAL at Step 6.56. Bash shell variables (no
'export' keyword) live in the shell's own variable table but are NOT
copied into /proc/<pid>/environ. At Step 6.95 invocation use inline
env-set syntax: `ZETA_CREDS_PASSPHRASE="$ZETA_CREDS_PASSPHRASE_VAL"
sudo --preserve-env=...` which sets the env var ONLY in the sudo
subprocess (visible to the picker via --passphrase-env reference)
without touching the parent installer shell's env.
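A minimal sketch of the two-stage pattern described above, under stated assumptions: `printenv` stands in for `sudo --preserve-env=ZETA_CREDS_PASSPHRASE ...` so the example runs without privileges, and the passphrase value is a placeholder. On Linux, /proc/<pid>/environ reflects the environment a process was started with, so the sketch checks what a child process actually inherits.

```bash
#!/usr/bin/env bash
# Stage 1: non-exported shell variable. It lives in the shell's own
# variable table and is not passed to child processes.
ZETA_CREDS_PASSPHRASE_VAL='example-passphrase'   # placeholder value

# A plain child process does not inherit the non-exported variable.
# printenv exits non-zero when the name is absent from its environment.
plain_child=$(printenv ZETA_CREDS_PASSPHRASE_VAL || echo 'absent')

# Stage 2: inline assignment. The variable is placed in the environment
# of this one command only (printenv here; sudo in the real script).
inline_child=$(ZETA_CREDS_PASSPHRASE="$ZETA_CREDS_PASSPHRASE_VAL" printenv ZETA_CREDS_PASSPHRASE)

# The parent shell itself never gained ZETA_CREDS_PASSPHRASE.
parent_after=${ZETA_CREDS_PASSPHRASE-unset}

echo "plain child sees _VAL:        $plain_child"
echo "inline child sees passphrase: $inline_child"
echo "parent after the call:        $parent_after"
```

The inline form keeps the secret's environment lifetime equal to the lifetime of the single command that needs it.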

Finding 2 (P1 — unset only in picker-ran branch):
PR #5638's `unset ZETA_CREDS_PASSPHRASE` happened only inside the
picker-ran branch of the if/else. If the operator entered a
passphrase but the picker was skipped (ZETA_CREDS_PICKER=0,
/etc/zeta/no-picker present, /etc/zeta/usb-uuid missing), the
exported passphrase remained live in the installer env until process
exit.

Fix: restructure — move the `unset ZETA_CREDS_PASSPHRASE_VAL`
OUTSIDE the if/else block so it fires UNCONDITIONALLY after the
picker block, in BOTH the picker-ran AND picker-skipped branches.
The shell-var-not-env-var distinction from Finding 1's fix means
even before this unset the passphrase wasn't in /proc/.../environ,
but `unset` still scrubs it from the shell's own variable table —
matters for any later `set` / `declare -p` invocation.
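The restructuring can be illustrated with a reduced control-flow sketch. The gate condition, the function name `run_picker_step`, and the use of `env` in place of the sudo/picker call are all assumptions made for the example; the real conditions live in zeta-install.sh.

```bash
#!/usr/bin/env bash
# run_picker_step ENABLED: hypothetical stand-in for the picker block.
run_picker_step() {
  local picker_enabled=$1
  ZETA_CREDS_PASSPHRASE_VAL='example-passphrase'   # placeholder, not exported

  if [ "$picker_enabled" = "1" ] && [ -n "$ZETA_CREDS_PASSPHRASE_VAL" ]; then
    # Picker-ran branch: `env` stands in for sudo + picker.
    ZETA_CREDS_PASSPHRASE="$ZETA_CREDS_PASSPHRASE_VAL" env >/dev/null
    echo "picker ran"
  else
    echo "picker skipped"
  fi

  # Placed after the if/else, so it runs on both paths.
  unset ZETA_CREDS_PASSPHRASE_VAL
}

for enabled in 1 0; do
  run_picker_step "$enabled"
  if [ -n "${ZETA_CREDS_PASSPHRASE_VAL+x}" ]; then
    echo "still set"
  else
    echo "scrubbed"
  fi
done
```

Keeping the `unset` outside the conditional means a newly added skip condition cannot reintroduce the leak.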

Substrate-honest:
- Both findings are CORRECT — the supersede addresses both
- Validation: bash -n OK; docker harness 22s SUCCESS
- Audit confirmed via grep: no 'export ZETA_CREDS_PASSPHRASE' or
  'export ZETA_CREDS_PASSPHRASE_VAL' anywhere in zeta-install.sh
- The constitutional rail at line 452 still holds verbatim:
  'secrets shouldn't transit non-operator surfaces; operator-typed
  at install time is the safest path'

Closes #5638 with both findings addressed from the start.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 21:11
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.


Copilot AI left a comment


Pull request overview

Updates the NixOS USB installer (zeta-install.sh) to reduce the credential-blob passphrase’s exposure by keeping it out of the parent installer environment and ensuring it’s scrubbed consistently after the picker step.

Changes:

  • Add Step 6.56 passphrase prompt that stores the value in a non-exported shell variable (ZETA_CREDS_PASSPHRASE_VAL).
  • Update the Step 6.95 picker gate + invocation to inline-set ZETA_CREDS_PASSPHRASE only for the sudo subprocess (and gate on ZETA_CREDS_PASSPHRASE_VAL).
  • Unset ZETA_CREDS_PASSPHRASE_VAL unconditionally after the picker block (runs in both picker-ran and picker-skipped branches).

Comment thread full-ai-cluster/usb-nixos-installer/zeta-install.sh Outdated
Comment thread full-ai-cluster/usb-nixos-installer/zeta-install.sh
…-label refs + align picker doc to ZETA_CREDS_PASSPHRASE_VAL (Copilot threads on #5643)

P1 — hard-coded line numbers drift
  Prior commit's Step 6.56 doc block referenced "line 574 /
  passphrase exposure window" and "line 1321 / unset only in
  picker-ran branch". Those line refs will drift as the script
  evolves. Repo convention: reference step labels (6.56,
  6.95-picker) and/or describe issues semantically.

  Fix: rewrite Step 6.56 doc as semantic two-step-lifecycle
  description (Step 6.56 → Step 6.95-picker → Step 6.95 post-picker).
  Names what each step does + why; no line numbers.

P2 — picker doc + SECURITY comment mismatched on var name
  Step 6.95-picker doc still referenced "ZETA_CREDS_PASSPHRASE
  (PR #5638 closes this via Step 6.56 prompt)" as the precondition,
  but the actual gate keys on ZETA_CREDS_PASSPHRASE_VAL after the
  supersede. SECURITY comment also referenced only the env-var
  side of the discipline.

  Fix: align doc to reference ZETA_CREDS_PASSPHRASE_VAL + expand
  SECURITY block to name the full 2-stage discipline (non-exported
  shell var in parent, inline-set into sudo subprocess only).

Validation: bash -n OK.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…inding on #5640 — restore service never fired) (#5644)

* fix(b-0852.4): correct cred-blob default path /esp → /boot to match installed-system ESP mount (Copilot finding on #5640)

PR #5640 shipped credsRestore.enable=true with blobPath defaulting to
`/esp/zeta-creds.enc`. Copilot review flagged the substrate-honest bug:

- At INSTALL-TIME (live USB), zeta-install.sh's Step 6.95-picker writes
  the blob to `/esp/zeta-creds.enc` because the live installer mounts
  the target ESP at `/esp`
- POST-REBOOT, disko (`disko-shapes/2nvme.nix`) mounts the SAME ESP
  partition at `/boot` per `mountpoint = "/boot"`
- The blob is the same physical file on the same ESP partition, but
  the mount path differs by context

The restore service runs POST-REBOOT, where the file is at
`/boot/zeta-creds.enc` — NOT `/esp/zeta-creds.enc`. So:

- ConditionPathExists = "/esp/zeta-creds.enc" always evaluates FALSE
  on the installed system (`/esp` doesn't exist post-reboot)
- systemd silently skips the unit (condition unmet)
- restore-from-cred-blob NEVER FIRES on any installed node
- creds are never restored at boot
- operator has to manually re-enter every credential each reboot —
  which is exactly the pain point the whole B-0852 cascade was
  designed to solve
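The silent skip can be imitated in plain shell. This is an analogy for a unit guarded by `ConditionPathExists=`, not systemd itself; the function name and the temporary directory standing in for the installed root are hypothetical.

```bash
#!/usr/bin/env bash
# start_if_path_exists PATH: mimic a unit guarded by ConditionPathExists=PATH.
start_if_path_exists() {
  if [ -e "$1" ]; then
    echo "restore: started"
  else
    # An unmet condition is a skip, not a failure, so nothing alerts the operator.
    echo "restore: skipped"
  fi
  return 0
}

root=$(mktemp -d)                 # stand-in for the installed system's /
mkdir -p "$root/boot"
: > "$root/boot/zeta-creds.enc"   # the blob as seen after reboot

start_if_path_exists "$root/esp/zeta-creds.enc"    # old default
start_if_path_exists "$root/boot/zeta-creds.enc"   # corrected default
rm -rf "$root"
```

Both calls return success, which is why the wrong default produced no visible error.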

This commit changes the default to `/boot/zeta-creds.enc` so the
service can actually find the blob it's supposed to decrypt. Also
expands the option description to explain the install-vs-installed
mount-path distinction so future maintainers don't reintroduce the
same confusion.

No changes to zeta-install.sh: the install-time write to
`/esp/zeta-creds.enc` is correct for the install-time context;
disko's later remount-as-/boot is what makes the file accessible
at the new path.

Validation:
- `nix-instantiate --parse zeta-creds-restore.nix` parses clean
  (no syntax change; only literal value + description text)
- Substrate-honest: this is a single-line semantic fix; the
  multi-line description expansion is wake-time substrate for the
  next maintainer who edits this module

Composes with #5640 (the row that surfaced the issue), #5643
(passphrase-env supersede), and the B-0852 cred-persistence cascade
(#5635 + #5637 + #5638 + #5639 + #5640 + #5641 + #5642).

Addresses CRITICAL Copilot finding on #5640.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fixup(b-0852.4): producer-side /esp → /mnt/boot + clean option doc (Copilot threads on #5644)

P1 — producer-side path mismatch ALSO needs fixing
  Prior commit fixed CONSUMER (restore service) but PRODUCER
  (Step 6.95-picker) was writing to /esp/zeta-creds.enc — which
  doesn't correspond to any mount. Target ESP is mounted at
  /mnt/boot during install (zeta-install.sh:226). Blob was
  landing on live USB rootfs, not target ESP. Reboot lost it.

  Fix: picker --output /esp/zeta-creds.enc → /mnt/boot/zeta-creds.enc.
  Producer now writes to target ESP mount, disko remounts as /boot
  post-reboot. Same physical file at two mount paths bridges the
  install-vs-installed boundary.
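The producer/consumer path relationship can be sketched as a prefix mapping. It assumes the target root is mounted at /mnt during install (consistent with the ESP at /mnt/boot mentioned above); the helper name `installed_path` is hypothetical.

```bash
#!/usr/bin/env bash
# installed_path INSTALL_PATH [TARGET_ROOT]: translate an install-time path
# into the path the same file has after reboot, by stripping the target root.
installed_path() {
  local path=$1 root=${2:-/mnt}
  case "$path" in
    "$root"/*)
      printf '%s\n' "${path#"$root"}"
      ;;
    *)
      # Anything outside the target root lands on the live USB rootfs.
      echo "not under $root: $path" >&2
      return 1
      ;;
  esac
}

installed_path /mnt/boot/zeta-creds.enc
installed_path /esp/zeta-creds.enc || echo "would not survive the reboot"
```

A check like this in the installer would reject an output path that does not sit under the target root.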

P2 — option doc style: strip PR-review history attribution
  Prior commit included "caught by Copilot review on PR #5640" in
  option doc. Repo convention: code/current-state docs use
  role-neutral present-tense contract text; PR-review history lives
  in commit messages + history surfaces.

  Fix: rewrite doc as present-tense contract for the option (what
  it configures + install-vs-installed mount convention for
  operators using non-default ESP layouts).

Validation: bash -n OK.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
Member Author

AceHack commented May 27, 2026

Superseded by #5650 after #5644 merged + this branch went DIRTY (overlapping picker-block edit). New branch off current-main; same 2 Copilot findings addressed; preserves #5644 substrate.

AceHack closed this May 27, 2026
auto-merge was automatically disabled May 27, 2026 21:27

Pull request was closed

AceHack pushed a commit that referenced this pull request May 27, 2026
…D018 fix (Copilot 6 threads on #5648)

Comprehensive accuracy rewrite addressing all 6 Copilot findings:

1. "no more re-entering" overclaim — passphraseMode=interactive
   DOES prompt every boot via systemd-ask-password. Reframed
   accurately: N per-tool login flows → ONE cred-blob passphrase.
   The improvement is atomicity, not zero typing.

2. Install log lines mismatch — restored to match actual zeta-install.sh
   output (Step 6.56 + Step 6.95-picker actual strings).

3. /boot path correctness — preserved (#5644 already fixed
   producer/consumer alignment to /mnt/boot ↔ /boot).

4. Manifest coverage — included gemini + codex paths
   (~/.gemini/oauth_creds.json, ~/.codex/auth.json) plus the
   full default-manifest table.

5. Second-reboot expectation — corrected: interactive mode prompts
   every boot by design. Operator who wants no-prompt-at-boot can
   switch to passphraseMode="file" (with security tradeoff named).

6. Filename reference — zeta-creds-cli.ts → zeta-creds-manifest.ts
   (actual canonical location of defaultManifest).

Also fixes MD018 lint failure: line "#5639 + #5640 + #5643 + #5644 +"
was being parsed as an ATX heading because # was at column 1. Replaced
the line-wrapped PR-number prose with the default-manifest table
(more useful + no MD018 trigger).
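A rough way to spot such lines before the linter does is sketched below. The regular expression only approximates MD018 (a `#` run at column 1 followed directly by a non-space character) and is not markdownlint's own implementation; the function name and most of the sample text are illustrative.

```bash
#!/usr/bin/env bash
# find_md018_candidates FILE: list lines that start with 1-6 '#' characters
# followed immediately by a character that is neither '#' nor whitespace.
find_md018_candidates() {
  grep -nE '^#{1,6}[^#[:space:]]' "$1"
}

doc=$(mktemp)
cat > "$doc" <<'EOF'
## Smoke test
The cascade (PRs #5635 + #5637 +
#5639 + #5640 + #5643 + #5644 +
#5646) restores credentials.
EOF

# Lines 3 and 4 are reported; the real heading on line 1 is not.
find_md018_candidates "$doc"
rm -f "$doc"
```

Re-wrapping the prose so no line begins with `#`, or moving the numbers into a table as the commit did, removes the false heading.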

Composes with:
- B-0852 cred-persistence cascade (PRs that ACTUALLY ship: #5635,
  #5637, #5639, #5640, #5641, #5642, #5644, #5645, #5646, #5648,
  #5649, #5650; #5638 + #5643 were superseded → closed without merge)
- common.nix passphraseMode=interactive default (PR #5640)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…ep 6.56 cred-blob prompt + non-exported var + unconditional unset + comment alignment (supersedes #5643 which went DIRTY post-#5644 merge) (#5650)

#5643 (passphrase env-exposure supersede) went DIRTY when #5644
(blob-path fix-fwd) merged because #5644 modified the same picker
block. This is the supersede-via-new-branch reland off current main.

What this PR does (cumulative of #5643 + #5643 fixup):

Step 6.56 (NEW — cred-blob passphrase prompt):
  - Operator-typed passphrase captured into NON-EXPORTED shell
    variable ZETA_CREDS_PASSPHRASE_VAL
  - Bash shell variables without `export` live in shell's own
    variable table but are NOT copied into /proc/<pid>/environ
  - Same operator-typed-once-on-console pattern as iter-5.3
    password (constitutional rail line 452)
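A prompt of this kind might look like the sketch below. The function name and prompt text are placeholders rather than the actual strings in zeta-install.sh, and the here-string only simulates the operator typing at the console.

```bash
#!/usr/bin/env bash
# prompt_cred_passphrase: read one line into a non-exported shell variable.
prompt_cred_passphrase() {
  # -r keeps backslashes literal; -s suppresses echo when stdin is a terminal;
  # IFS= preserves leading and trailing whitespace in the passphrase.
  IFS= read -r -s -p 'Cred-blob passphrase: ' ZETA_CREDS_PASSPHRASE_VAL
  echo >&2   # finish the line after the silent prompt
}

# Non-interactive stand-in for the operator's input.
prompt_cred_passphrase <<< 'example-passphrase'

# The value is held by the shell but is not handed to child processes.
printenv ZETA_CREDS_PASSPHRASE_VAL >/dev/null || echo "not visible to child processes"

unset ZETA_CREDS_PASSPHRASE_VAL
```

Because the variable is never exported, only the shell that ran the prompt holds the value until it is unset.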

Step 6.95-picker (modified):
  - Gate check uses ZETA_CREDS_PASSPHRASE_VAL (non-exported var)
  - sudo invocation uses inline env-set:
    `ZETA_CREDS_PASSPHRASE="$ZETA_CREDS_PASSPHRASE_VAL" sudo
    --preserve-env=ZETA_CREDS_PASSPHRASE ...` — exports env var
    only into sudo subprocess
  - `unset ZETA_CREDS_PASSPHRASE_VAL` moved OUTSIDE if/else block
    so it fires in BOTH picker-ran AND picker-skipped branches
    (prior bug: unset only in picker-ran branch left passphrase
    live when picker was skipped)

Comment alignment (from #5643 fixup):
  - Step 6.56 doc: semantic two-step-lifecycle description (no
    hard-coded line numbers — would drift as script evolves)
  - Step 6.95-picker doc: references ZETA_CREDS_PASSPHRASE_VAL
    as the precondition (matches actual gate check)
  - SECURITY block: documents the full 2-stage discipline
    (non-exported parent shell var + inline-set into sudo only)

Preserves #5644's substrate (picker --output /mnt/boot/zeta-creds.enc
+ all the mount-path comment block) intact — this commit only
touches the env-var lifecycle, not the path lifecycle.

Validation:
- bash -n syntax check passed
- bun tools/ci/audit-installer-substrate.ts PASS
- Docker harness passed in 15s

Closes #5643 (the DIRTY supersede) with both Copilot findings
addressed + #5644's path substrate preserved.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 27, 2026
…o-end verification checklist for operator) (#5648)

* docs(provisioning): add cred-restore smoke-test section — first-boot + post-reboot + second-reboot verification + troubleshooting table (B-0852 end-to-end)

The B-0852 cred-persistence cascade (PRs #5635 + #5637 + #5638 +
#5639 + #5640 + #5643 + #5644 + #5646) closes the operator's
'don't re-enter creds over and over' pain point. This docs addition
gives operators a concrete checklist to verify the full path works
after a fresh USB install:

- First-boot verification: what install log lines to look for
- Post-reboot verification: systemctl + ls + auth-status commands
- Second-reboot verification: confirm no re-entry needed
- Troubleshooting table: 4 common symptoms with likely causes

Closes the gap between 'cascade is shipped' and 'operator can
confirm cascade works on their hardware'. The operator no longer
has to figure out which systemd unit to query or which paths to
check — the checklist names them.

Composes with:
- PROVISIONING.md (existing operator-facing install doc)
- B-0852 cred-persistence substrate
- The audit-extension PR (separate; catches drift at CI time)

Substrate-honest scope: this is operator docs, not a TS tool. A
follow-on TS smoke-test runner (run on the installed system to
auto-verify the checklist) is a candidate for follow-up work but
out of scope for this commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fixup(docs): rewrite cred-restore smoke-test section for accuracy + MD018 fix (Copilot 6 threads on #5648)

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>