ci(B-0831 layer-1): extend audit-installer-substrate with iter-5.4 sentinels#5365
Merged
AceHack merged 2 commits intoMay 27, 2026
Conversation
…ntinels (gh auth setup-git, ssh-key stderr-capture, self-reg flow, ClusterNode YAML schema, MAC parsing) Layer 1 of a 4-layer CI testing approach for the iter-5.4 substrate (B-0812 self-registration + B-0813 cluster reconciliation + B-0835 bug fixes Bug 2a + 2b on #5364): Layer 1 (THIS PR) — source-level sentinel audit (cheap; catches regression) Layer 2 (next PR) — behavioral test with mock gh shim on PATH Layer 3 (B-0833 Approach A) — mock GH device-code endpoint Layer 4 (B-0831 cascade #6) — QEMU full-install + cluster auto-join This layer extends the existing REQUIRED_SENTINELS for full-ai-cluster/usb-nixos-installer/zeta-install.sh with 14 new substrings, organized into 3 groups: (a) iter-5.4 flow anchors (5 sentinels): - "Step 6.8: iter-5.4.0 homelab gh-auth + operator pubkey copy" - "Step 6.9: iter-5.4.1 self-registration commit+push" - "gh auth login" - "gh ssh-key list" - "gh repo clone Lucent-Financial-Group/Zeta" (b) Bug 2a + 2b fix-regression catches (3 sentinels): - "gh auth setup-git" — Bug 2a fix; presence catches removal - "SSH_KEY_ERR_FILE" — Bug 2b fix; presence catches stderr-capture removal - "admin:public_key" — Bug 2b fix; presence catches scope-recovery message removal (c) ClusterNode YAML schema sentinels (5 sentinels — catches the Copilot findings on #5352 where spec.role was scalar, spec.maintainer was at wrong path, spec.storage was a sibling instead of under hardware block): - "apiVersion: zeta.lucent-financial-group.com/v1" - "kind: ClusterNode" - " roles:" — spec.roles is ARRAY per B-0813 - " registration:" — spec.registration block per B-0813 - " hardware:" — spec.hardware block per B-0813 (d) Hardware-probe sentinels (catches MAC parsing regression from #5352): - "/proc/cpuinfo" — CPU_MODEL extraction - "link/ether" — MAC parses field after link/ether (not before) (e) Self-reg branch-shape sentinel: - "register-${NODE_HOSTNAME}-" — iter-5.4.1 branch name pattern Composes with: - PR #5364 (Bug 2a + 2b fixes that this audit will catch if regressed) - PR #5352 (iter-5.4.1 Copilot findings that this audit will catch) - PR #5354 (Bug 1 hostname symlink fix — already covered by existing sentinels) - B-0831 (cascade #6 full-install QEMU test; this is layer 1 of that work) - B-0833 (interactive-login vs baked-in-keys tension; layer 3 of the cascade) Why source-level + cheap-first: - Workflow build-ai-cluster-iso.yml runs `bun tools/ci/audit-installer-substrate.ts` on every PR touching the installer surface - Source-level catches substrate-regression at PR-author-time (seconds) - vs Layer 4 QEMU full-install (~minutes; expensive; flaky) - Layer 1 is the inner loop; Layers 2-4 are the outer loops Per `.claude/rules/verify-existing-substrate-before-authoring.md`: substrate-inventory pass found `tools/ci/audit-installer-substrate.ts` already has the REQUIRED_SENTINELS pattern for iter-4.2 + iter-5.1 + iter-5.2 + iter-5.2.2; this PR extends with iter-5.4 sentinels rather than minting parallel substrate. Verified: `bun tools/ci/audit-installer-substrate.ts` exits 0 ("PASS — 10 required files + 5 sentinel-file assertions OK") with the extended sentinel list against the current installer script at origin/main HEAD (commit 19d9617 from #5364). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Pull request overview
Extends the source-level CI sentinel audit for the AI-cluster installer substrate to cover the iter-5.4.0/5.4.1 GitHub auth + self-registration flows, so text-level regressions (dropped commands / dropped YAML schema anchors) are caught quickly in CI.
Changes:
- Added iter-5.4 sentinel substrings for
zeta-install.shcovering gh auth, ssh-key retrieval, repo clone, and registration-branch shape. - Added schema/hardware-probe sentinels to catch regressions in ClusterNode YAML composition and MAC parsing.
- Updated the sentinel rationale string to reflect the newly-audited substrate.
…o parenthesis closes on first line Copilot finding on the audit-installer-substrate.ts iter-5.4 sentinel addition: the comment 'iter-5.4.1 YAML schema sentinels (catches the Copilot findings from #5352' opened a parenthesis on line 98 that didn't close until line 100 ('block)'). To a code-reader scanning line 98, the sentence reads as unfinished. Fix: restructure as 'sentinels. Each catches a specific Copilot finding on PR #5352: ...' — no multi-line parenthesis; each schema-correction is a complete clause.
auto-merge was automatically disabled
May 27, 2026 00:45
Pull Request is not mergeable
AceHack
added a commit
that referenced
this pull request
May 27, 2026
…low (asserts logical relationships between Bug 2a + 2b fix elements, ClusterNode YAML schema, iter-5.4.1 cascade gating) (#5367) Layer 2a of the 4-layer CI testing approach for iter-5.4 substrate: Layer 1 (#5365) — source-level sentinel audit (substring presence) Layer 2a (THIS PR) — structural-behavioral test (logical relationships) Layer 2b (future PR) — true mock-gh shim execution (refactor iter-5.4 into sourceable bash function; test against mock gh on PATH with success/scope-error/empty modes) Layer 3 (B-0833 App A) — mock GH device-code endpoint Layer 4 (B-0831) — QEMU full-install + cluster auto-join What this layer catches that Layer 1 doesn't: 1. `gh auth setup-git` is INSIDE the SUCCESS branch of `if gh auth login; then` (not just present somewhere in the script — placement matters; if setup-git ended up outside the success branch, it'd run on auth failure too). 2. setup-git is called BEFORE the ssh-key fetch (ordering matters — the git credential helper must be wired before any git push attempt). 3. SSH_KEY_ERR_FILE is wired AS the stderr redirect to `gh ssh-key list` (Bug 2b: if the file is created but not used as stderr, scope-error discrimination silently fails). 4. 3 distinct WARN paths exist (scope-error, empty-no-keys, pipe-broke) with their substrate-honest recovery messages (recovery commands for scope-error; settings/keys URL for empty-no-keys). 5. GH_AUTH_OK=1 is set EXACTLY ONCE — in the success branch of gh auth login (not in any failure or skip path). 6. iter-5.4.1 self-reg is gated on `GH_AUTH_OK = 1` (cascade-skip discipline; runs only if iter-5.4.0 succeeded). 7. iter-5.4.1 subshell uses `set +e` + the subshell wrapper closes with `|| true` (Copilot finding on #5352 — outer set -euo pipefail would propagate subshell failure out of the install). 8. ClusterNode YAML schema sentinels (catches the 3 Copilot findings on #5352 — spec.role was scalar instead of array; spec.maintainer was at flat path instead of nested under spec.registration; spec.storage was sibling of hardware instead of nested under it). 9. MAC parsing extracts the field AFTER `link/ether` (prior bug was `$(NF-2)` extracting `brd` instead of the MAC). 10. Self-reg branch name shape matches `register-<HOSTNAME>-<UTCTS>` (catches accidental rename that would break the cluster-side ArgoCD pattern watching register-* branches). Test approach: parse zeta-install.sh as text; extract iter-5.4.0 and iter-5.4.1 blocks by step-header boundaries; assert regex relationships within each block. 23 tests, 35 expect() calls, ~150ms runtime. Layer 2b deferred: requires refactoring iter-5.4.0 + iter-5.4.1 into a sourceable bash function so we can mock `gh` on PATH and assert behavior across the 4 modes (success/scope-error/empty/pipe-broke). That's a bigger refactor — separate PR. Structural-behavioral catches the same failure modes at much lower cost as the inner-loop test. Composes with: - PR #5364 (Bug 2a + 2b fixes — this layer asserts the fixes' STRUCTURE not just their presence) - PR #5352 (Copilot YAML schema findings — this layer asserts the schema corrections held) - PR #5365 (Layer 1 sentinels — composes; same workflow runs both) - B-0831 (cascade #6 full-install QEMU — this is layer 2a) - B-0833 (interactive-login vs baked-in-keys tension — layer 3 of cascade) Wired into .github/workflows/build-ai-cluster-iso.yml as a fast preflight (runs BEFORE the ~15-min Nix build; fails fast if iter-5.4 substrate has regressed structurally). Verified locally: $ bun test tools/ci/test-iter-54-install-flow.test.ts bun test v1.3.13 23 pass 0 fail 35 expect() calls 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Lior <lior@zeta.dev>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Layer 1 of 4-layer CI testing approach for iter-5.4 substrate
Aaron asked: "yeah push forward a bit maybe create some more ci tests how do you want to test the gh login flow?"
The 4-layer plan:
What this PR adds
Extends `REQUIRED_SENTINELS` for `full-ai-cluster/usb-nixos-installer/zeta-install.sh` with 14 new substrings:
(a) iter-5.4 flow anchors
(b) Bug 2a + 2b fix-regression catches (PR #5364)
(c) ClusterNode YAML schema sentinels (PR #5352 Copilot findings)
(d) Hardware-probe sentinels (MAC parsing regression catch)
(e) Self-reg branch shape
Verified
```
$ bun tools/ci/audit-installer-substrate.ts
audit-installer-substrate: PASS — 10 required files + 5 sentinel-file assertions OK
```
Runs in the existing `build-ai-cluster-iso.yml` workflow on every PR touching the installer surface.
Composes with
Substrate-honest framing
Layer 1 doesn't test BEHAVIOR — only that the substrate is PRESENT. A future Aaron-edit that accidentally removes `gh auth setup-git` would be caught by this layer; an edit that changes `gh auth setup-git` to `gh auth setup-git --hostname github.com` would still pass (substring match). Layer 2 (mock-gh shim) catches behavioral regressions; this layer is the cheapest first line of defense.
🤖 Generated with Claude Code