From 0d4c27a1f7d15a5c39211e31a2e1205a47784a0b Mon Sep 17 00:00:00 2001
From: Lior
Date: Tue, 26 May 2026 00:19:12 -0400
Subject: [PATCH] =?UTF-8?q?feat(B-0789=20iter-4.2):=20zflash=20auto-inject?=
 =?UTF-8?q?=20of=20SSH=20pubkey=20to=20boot=20USB=20ESP=20+=20zeta-install?=
 =?UTF-8?q?.sh=20USB=20probe=20=E2=86=92=20zero-typing=20SSH=20access=20on?=
 =?UTF-8?q?=20first=20cluster=20boot?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The maintainer's actually-usable iter-4 path (v1 was scaffolding-only;
this PR is the workflow Aaron will test against).

Per Aaron's 2026-05-26 discipline signals:

1. "we can do what's going to make cluster setup eaiser for me and not
   users if that's ssh lets do that first cause we want to get ai
   running the cluster asap" - iter-4 authorized
2. "i can wait for 4.2 or whatever version before we try again" -
   downgraded v1 to scaffolding; this PR is what Aaron flashes
3. "--no-creds is basically useless right?" - opt-out removed from
   recommended path (kept as --no-inject escape hatch only)
4. "whenever i have to ferry commands by reading and typing i'm going
   to avoid it like the plague and try to get like pictures and auto
   run and short commands pre built in" - design discipline: ALL
   diagnostics auto-fire in-place + are photo-friendly; zero
   operator-typed commands beyond `zflash`

Files:

- full-ai-cluster/tools/flash-usb.ts: added `--no-eject` flag (4 lines)
  so zflash can do post-flash ESP-mount-and-write before the USB
  ejects. Allowlist updated per the Copilot P0 catch about
  destructive-tool flag validation. Help text mentions iter-4.2 use
  case
- full-ai-cluster/tools/zflash.ts: extended with post-flash macOS-side
  ESP-mount-and-write step:
  * Default reads ~/.ssh/id_ed25519.pub
  * --ssh-key overrides
  * --no-inject opt-out (escape hatch only)
  * Re-scans external disks post-flash (flash-usb's single-USB-only
    requirement guarantees exactly one external disk)
  * Identifies FAT/EFI partition via `diskutil list` regex match
    (DOS_FAT / EFI / MS-DOS / FAT16 / FAT32 / Windows_FAT)
  * Mounts via `diskutil mount`; gets mount point from `diskutil info`
  * Writes /zeta-authorized-keys.pub via `sudo tee` (stdin avoids
    shell-quoting hazards)
  * Unmounts + ejects when done
  * dumpDiagnostics() helper auto-runs on any failure path: diskutil
    list external + mounted /Volumes/* + "what to do next"
    suggestions. Compact + photo-friendly per the design discipline
- full-ai-cluster/usb-nixos-installer/zeta-install.sh: added step 6.5
  pre-install pubkey probe + injection:
  * Try 1: scan /iso /run /mnt /boot for zeta-authorized-keys.pub via
    `sudo find -maxdepth 5`
  * Try 2: probe USB partitions (/dev/sd? /dev/nvme?n? /dev/vd?
    /dev/mmcblk?, minus install targets) via vfat-readonly mount +
    file existence check. Partition suffix handling: 1/2 on sd/vd;
    p1/p2 on nvme/mmcblk
  * If found: writes operator-ssh-keys.nix with valid ssh-* lines from
    the file BEFORE nixos-install
  * If not found: diagnostics auto-fire (external block devices,
    install targets, full lsblk, "what to do next") + falls back to
    v1 stub
  * Post-install credentials echo branches on INJECT_OK: success path
    says "SSH works immediately"; fallback keeps v1 manual-edit +
    nixos-rebuild instructions
  * shellcheck clean (fixed SC2261 redundant stderr redirect)
- docs/backlog/P1/B-0789-*.md: updated iter-4.2 acceptance to reflect
  what shipped:
  [x] flash-usb.ts --no-eject;
  [x] zflash.ts ESP inject;
  [x] zeta-install.sh probe + inject + branched credentials echo;
  [ ] maintainer flashes + tests on PC;
  [ ] if failure: photo-driven fix-forward workflow per the
      maintainer's explicit design choice

Composes with PR #5080 (iter-4 v1 scaffolding: initial-password.nix +
operator-ssh-keys.nix stub + per-host imports) which this PR builds on.

The zero-typing target: `zflash` → boot USB on PC → install → SSH-able
as zeta@ from the maintainer's Mac using the existing ~/.ssh/id_ed25519
key. Failure path: photo of auto-diagnostics → AI fixes-forward.

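Illustrative sketch only, not code from this PR: the partition-suffix
rule and the "valid ssh-* lines" filter described above could look
roughly like the following. The helper names `partitionPath` and
`validSshKeyLines` are hypothetical, and the real logic lives in shell
inside zeta-install.sh; TypeScript is used here only to match the
tools/ directory.

```typescript
// Hypothetical helper (not from zeta-install.sh): build a partition device
// path from a whole-disk path and a partition number.
function partitionPath(disk: string, index: number): string {
  // Disk names ending in a digit (nvme0n1, mmcblk0) take a "p" separator;
  // sdX and vdX take the bare partition number.
  const needsP = /\d$/.test(disk);
  return `${disk}${needsP ? "p" : ""}${index}`;
}

// Hypothetical filter: keep only lines shaped like "ssh-<type> <base64> [comment]".
// Blank lines, comments, and anything else are dropped.
function validSshKeyLines(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => /^ssh-[a-z0-9-]+ [A-Za-z0-9+/=]+( .*)?$/.test(line));
}

console.log(partitionPath("/dev/sda", 1));     // /dev/sda1
console.log(partitionPath("/dev/nvme0n1", 2)); // /dev/nvme0n1p2
```

The filter is deliberately strict about the `ssh-` prefix because the
commit message only mentions ssh-* lines; whether other key families
should pass is left as an open assumption.
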
Co-Authored-By: Claude
---
 ...ubstrate-for-cluster-bringup-2026-05-26.md |  29 +-
 full-ai-cluster/tools/flash-usb.ts            |  28 +-
 full-ai-cluster/tools/zflash.ts               | 278 +++++++++++++++++-
 .../usb-nixos-installer/zeta-install.sh       | 140 ++++++++-
 4 files changed, 434 insertions(+), 41 deletions(-)

diff --git a/docs/backlog/P1/B-0789-iter4-ssh-key-and-hashedpassword-substrate-for-cluster-bringup-2026-05-26.md b/docs/backlog/P1/B-0789-iter4-ssh-key-and-hashedpassword-substrate-for-cluster-bringup-2026-05-26.md
index 01ca4f7df6..2050cc0cd0 100644
--- a/docs/backlog/P1/B-0789-iter4-ssh-key-and-hashedpassword-substrate-for-cluster-bringup-2026-05-26.md
+++ b/docs/backlog/P1/B-0789-iter4-ssh-key-and-hashedpassword-substrate-for-cluster-bringup-2026-05-26.md
@@ -92,25 +92,30 @@ The maintainer 2026-05-26: *"i can wait for 4.2 or whatever version before we tr
 ### iter-4.2 acceptance (target the maintainer will actually test against)
-- [ ] `full-ai-cluster/tools/zflash.ts` extended (or new sibling `zflash-creds.ts`) with post-flash macOS-side ESP-mount-and-write step:
+Note: the maintainer 2026-05-26 *"--no-creds is basically useless right?"* signal removed the opt-out flag from the original design. The default behavior IS the new behavior; opt-out (renamed `--no-inject`) exists only as an escape hatch for the operator who explicitly wants the old flash-only flow without the pubkey-write step.
+
+- [x] `full-ai-cluster/tools/flash-usb.ts` extended with `--no-eject` flag so zflash can do the ESP-mount-and-write before the USB ejects (4-line change; allowlist + skip-eject branch)
+- [x] `full-ai-cluster/tools/zflash.ts` extended with post-flash macOS-side ESP-mount-and-write step:
   - Default reads `~/.ssh/id_ed25519.pub`
   - `--ssh-key ` overrides
-  - `--no-creds` opts out (preserves current zflash behavior)
-  - Mounts the FAT / ESP partition of the flashed USB via `diskutil mount`
-  - Writes `/Volumes/