Skip to content

fix(B-0835 Bug 2a + 2b): iter-5.4 install — gh auth setup-git + ssh-key scope discrimination#5364

Merged
AceHack merged 1 commit into
mainfrom
fix-b0835-bug2ab-gh-auth-setup-git-ssh-key-scope-handling-otto-cli-2026-05-26
May 27, 2026
Merged

fix(B-0835 Bug 2a + 2b): iter-5.4 install — gh auth setup-git + ssh-key scope discrimination#5364
AceHack merged 1 commit into
mainfrom
fix-b0835-bug2ab-gh-auth-setup-git-ssh-key-scope-handling-otto-cli-2026-05-26

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 27, 2026

Empirical anchor — 2026-05-26 2nd physical hardware-support test

Aaron's screen photo (verbatim console output from re-flashed USB run after Bug 1 + Bug 3b fixes landed):

```
[iter-5.4.0] Run gh auth login now? [Y/n]: Y
[iter-5.4.0] running 'gh auth login' (interactive)...
! First copy your one-time code: D30B-468F
Open this URL to continue in your web browser: https://github.com/login/device
■ Authentication complete.
! Authentication credentials saved in plain text
■ Logged in as AceHack
[iter-5.4.0] gh auth login: SUCCESS
[iter-5.4.0] fetching operator's SSH pubkeys via 'gh ssh-key list'...
[iter-5.4.0] WARN: 'gh ssh-key list' failed; no keys written
[iter-5.4.0] (gh auth succeeded but the user has no SSH keys
[iter-5.4.0] registered with GitHub, OR the jq/tee pipe broke)
[iter-5.4.1] ── self-registration commit+push (B-0812) ──
[iter-5.4.1] maintainer: AceHack
[iter-5.4.1] node-name: node-efe404
Switched to a new branch 'register-node-efe404-20260527T0005332'
Username for 'https://github.com': acehack
Password for 'https://acehack@github.com':
```

Two sub-bugs surfaced (both new — beyond Bug 1 / Bug 3 already fixed this session).

Bug 2a — CRITICAL — git push prompts HTTPS basic-auth despite gh auth login

Root cause: `gh auth login` stores token in gh config but does NOT configure git's credential helper. Without setup-git, `git push` goes through the default credential-store chain which doesn't know about gh's token.

Fix: insert `gh auth setup-git` immediately after successful `gh auth login` in zeta-install.sh Step 6.8. Configures `credential.helper` to delegate to `gh auth git-credential` so all github.com git operations automatically use the gh token. Failure is non-fatal (warning only).

Bug 2b — degraded — gh ssh-key list returns empty / fails

Root cause discrimination: `gh auth login` default scopes (`repo, read:org, workflow, gist`) do NOT include `admin:public_key` or `read:public_key` required by `gh ssh-key list`. Empty result could also mean operator has no SSH keys at GitHub.

Fix: capture stderr from `gh ssh-key list`; if empty result + stderr mentions scope, print substrate-honest recovery commands (`gh auth refresh -s admin:public_key` + populate + rebuild). If empty without scope-error, point to https://github.com/settings/keys.

Defers opt-in `--with-ssh-key-scope` flag to future B-NNNN (security tradeoff: don't ask for elevated scope by default).

Files

  • `full-ai-cluster/usb-nixos-installer/zeta-install.sh` — `gh auth setup-git` after login; stderr-capturing ssh-key-list with 3-way discrimination (success / empty-with-scope-error / empty-no-scope-error / pipe-broke)
  • `docs/backlog/P1/B-0835-*.md` — Bug 2a + 2b verbatim empirical anchors + fix specs + acceptance criteria for 3rd physical test

Acceptance for next physical test cycle

  • iter-5.4.1 `git push` completes silently without basic-auth prompt
  • Self-registration PR URL is printed + browseable on github.com
  • If operator has SSH keys: writes operator-authorized-keys with key count
  • If operator has no SSH keys at GH: substrate-honest WARN points to settings/keys
  • If scope-error: substrate-honest WARN provides recovery commands

Composes with

  • B-0835 (this row — Bug 2a + 2b empirical anchors land in body)
  • B-0812 iter-5.4.1 self-registration (the step Bug 2a blocks)
  • B-0813 iter-5.4.2 ArgoCD reconciliation (downstream of self-reg)
  • B-0834 install log preservation (would have diagnosed Bug 2a faster — composes)
  • B-0833 auth tension (Bug 2a is concrete instance of the interactive-login vs token-baked tension)

Substrate-honest framing

This is a continuation of the autonomous-loop physical-test fix cycle. Per Aaron's "great iteration we learned a lot" the loop is: test → bug → fix → re-flash → re-test. Bug 1 + Bug 3a + Bug 3b shipped in prior PRs this session; Bug 2 was diagnosis-dependent; the 2nd test surfaced it as two distinct sub-bugs (2a + 2b) with concrete fix paths.

Per `.claude/rules/verify-existing-substrate-before-authoring.md`: substrate-inventory pass found B-0835 already names "gh login not respected" at Bug 2 scope; this PR extends with 2 specific sub-bugs rather than minting parallel substrate.

🤖 Generated with Claude Code

…ey scope discrimination

Empirical anchor 2026-05-26 (2nd physical hardware-support test):
the iter-5.4 install reached the iter-5.4.1 self-registration step
but `git push -u origin <branch>` prompted the operator for HTTPS
basic-auth ("Password for 'https://acehack@github.com':") despite
`gh auth login` succeeding via device flow moments earlier.
Simultaneously, `gh ssh-key list` failed with a substrate-honest
WARN but no actionable recovery path.

Two sub-bug fixes:

Bug 2a — CRITICAL — git push prompts HTTPS basic-auth despite gh auth login:
  Root cause: `gh auth login` stores token in gh config but does NOT
  configure git's credential helper. Without setup-git, git push goes
  through the default credential-store chain which doesn't know about
  gh's token.
  Fix: insert `gh auth setup-git` immediately after successful
  `gh auth login` in zeta-install.sh Step 6.8. Configures
  credential.helper to delegate to `gh auth git-credential` so all
  github.com git operations automatically use the gh token.
  Failure is non-fatal (warning only).

Bug 2b — degraded — gh ssh-key list returns empty / fails:
  Root cause discrimination: `gh auth login` default scopes
  (`repo, read:org, workflow, gist`) do NOT include `admin:public_key`
  or `read:public_key` required by `gh ssh-key list`. Empty result
  could also mean operator has no SSH keys at GitHub.
  Fix: capture stderr from `gh ssh-key list`; if empty result +
  stderr mentions scope, print substrate-honest recovery commands
  (`gh auth refresh -s admin:public_key` + populate + rebuild). If
  empty without scope-error, point to https://github.com/settings/keys.
  Defers opt-in `--with-ssh-key-scope` flag to future B-NNNN
  (security tradeoff: don't ask for elevated scope by default).

B-0835 backlog row updated with verbatim console output from Aaron's
2nd physical test + the two sub-bug fix paths + acceptance criteria
for 3rd physical test.

Composes with:
- B-0835 Bug 1 + Bug 3 (already fixed via prior PRs)
- B-0812 iter-5.4.1 self-registration (the step Bug 2a blocks)
- B-0813 iter-5.4.2 ArgoCD reconciliation (downstream of self-reg)
- B-0834 install log preservation (would have diagnosed Bug 2a faster)
- B-0833 auth tension (Bug 2a is concrete instance of the
  interactive-login vs token-baked tension Aaron named)

Per .claude/rules/dep-pin-search-first-authority.md +
.claude/rules/verify-existing-substrate-before-authoring.md:
substrate-inventory pass found B-0835 already names "gh login not
respected" at Bug 2 scope; this PR extends with 2 specific sub-bugs
empirically anchored from tonight's test rather than minting parallel
substrate.

Empirical photo of Aaron's screen attached separately in PR
discussion; line "Password for 'https://acehack@github.com':" is
the load-bearing canary signal for Bug 2a presence.
Copilot AI review requested due to automatic review settings May 27, 2026 00:33
@AceHack AceHack enabled auto-merge (squash) May 27, 2026 00:33
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@AceHack AceHack merged commit 19d9617 into main May 27, 2026
30 of 31 checks passed
@AceHack AceHack deleted the fix-b0835-bug2ab-gh-auth-setup-git-ssh-key-scope-handling-otto-cli-2026-05-26 branch May 27, 2026 00:36
AceHack added a commit that referenced this pull request May 27, 2026
…low (asserts logical relationships between Bug 2a + 2b fix elements, ClusterNode YAML schema, iter-5.4.1 cascade gating) (#5367)

Layer 2a of the 4-layer CI testing approach for iter-5.4 substrate:

  Layer 1 (#5365)        — source-level sentinel audit (substring presence)
  Layer 2a (THIS PR)     — structural-behavioral test (logical relationships)
  Layer 2b (future PR)   — true mock-gh shim execution (refactor iter-5.4
                            into sourceable bash function; test against
                            mock gh on PATH with success/scope-error/empty
                            modes)
  Layer 3 (B-0833 App A) — mock GH device-code endpoint
  Layer 4 (B-0831)       — QEMU full-install + cluster auto-join

What this layer catches that Layer 1 doesn't:

1. `gh auth setup-git` is INSIDE the SUCCESS branch of
   `if gh auth login; then` (not just present somewhere in the script
   — placement matters; if setup-git ended up outside the success
   branch, it'd run on auth failure too).

2. setup-git is called BEFORE the ssh-key fetch (ordering matters —
   the git credential helper must be wired before any git push attempt).

3. SSH_KEY_ERR_FILE is wired AS the stderr redirect to `gh ssh-key list`
   (Bug 2b: if the file is created but not used as stderr, scope-error
   discrimination silently fails).

4. 3 distinct WARN paths exist (scope-error, empty-no-keys, pipe-broke)
   with their substrate-honest recovery messages (recovery commands for
   scope-error; settings/keys URL for empty-no-keys).

5. GH_AUTH_OK=1 is set EXACTLY ONCE — in the success branch of
   gh auth login (not in any failure or skip path).

6. iter-5.4.1 self-reg is gated on `GH_AUTH_OK = 1` (cascade-skip
   discipline; runs only if iter-5.4.0 succeeded).

7. iter-5.4.1 subshell uses `set +e` + the subshell wrapper closes
   with `|| true` (Copilot finding on #5352 — outer set -euo pipefail
   would propagate subshell failure out of the install).

8. ClusterNode YAML schema sentinels (catches the 3 Copilot findings
   on #5352 — spec.role was scalar instead of array; spec.maintainer
   was at flat path instead of nested under spec.registration;
   spec.storage was sibling of hardware instead of nested under it).

9. MAC parsing extracts the field AFTER `link/ether` (prior bug was
   `$(NF-2)` extracting `brd` instead of the MAC).

10. Self-reg branch name shape matches `register-<HOSTNAME>-<UTCTS>`
    (catches accidental rename that would break the cluster-side
    ArgoCD pattern watching register-* branches).

Test approach: parse zeta-install.sh as text; extract iter-5.4.0 and
iter-5.4.1 blocks by step-header boundaries; assert regex relationships
within each block. 23 tests, 35 expect() calls, ~150ms runtime.

Layer 2b deferred: requires refactoring iter-5.4.0 + iter-5.4.1 into a
sourceable bash function so we can mock `gh` on PATH and assert behavior
across the 4 modes (success/scope-error/empty/pipe-broke). That's a
bigger refactor — separate PR. Structural-behavioral catches the same
failure modes at much lower cost as the inner-loop test.

Composes with:

- PR #5364 (Bug 2a + 2b fixes — this layer asserts the fixes' STRUCTURE
  not just their presence)
- PR #5352 (Copilot YAML schema findings — this layer asserts the
  schema corrections held)
- PR #5365 (Layer 1 sentinels — composes; same workflow runs both)
- B-0831 (cascade #6 full-install QEMU — this is layer 2a)
- B-0833 (interactive-login vs baked-in-keys tension — layer 3 of cascade)

Wired into .github/workflows/build-ai-cluster-iso.yml as a fast
preflight (runs BEFORE the ~15-min Nix build; fails fast if iter-5.4
substrate has regressed structurally).

Verified locally:

  $ bun test tools/ci/test-iter-54-install-flow.test.ts
  bun test v1.3.13
   23 pass
   0 fail
   35 expect() calls

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Lior <lior@zeta.dev>
@AceHack AceHack review requested due to automatic review settings May 27, 2026 00:54
AceHack added a commit that referenced this pull request May 27, 2026
…ntinels (#5365)

* ci(B-0831 layer-1): extend audit-installer-substrate with iter-5.4 sentinels (gh auth setup-git, ssh-key stderr-capture, self-reg flow, ClusterNode YAML schema, MAC parsing)

Layer 1 of a 4-layer CI testing approach for the iter-5.4 substrate
(B-0812 self-registration + B-0813 cluster reconciliation + B-0835 bug
fixes Bug 2a + 2b on #5364):

  Layer 1 (THIS PR) — source-level sentinel audit (cheap; catches regression)
  Layer 2 (next PR) — behavioral test with mock gh shim on PATH
  Layer 3 (B-0833 Approach A) — mock GH device-code endpoint
  Layer 4 (B-0831 cascade #6) — QEMU full-install + cluster auto-join

This layer extends the existing REQUIRED_SENTINELS for
full-ai-cluster/usb-nixos-installer/zeta-install.sh with 14 new
substrings, organized into 3 groups:

(a) iter-5.4 flow anchors (5 sentinels):
  - "Step 6.8: iter-5.4.0 homelab gh-auth + operator pubkey copy"
  - "Step 6.9: iter-5.4.1 self-registration commit+push"
  - "gh auth login"
  - "gh ssh-key list"
  - "gh repo clone Lucent-Financial-Group/Zeta"

(b) Bug 2a + 2b fix-regression catches (3 sentinels):
  - "gh auth setup-git"     — Bug 2a fix; presence catches removal
  - "SSH_KEY_ERR_FILE"      — Bug 2b fix; presence catches stderr-capture removal
  - "admin:public_key"      — Bug 2b fix; presence catches scope-recovery message removal

(c) ClusterNode YAML schema sentinels (5 sentinels — catches the Copilot
findings on #5352 where spec.role was scalar, spec.maintainer was at
wrong path, spec.storage was a sibling instead of under hardware block):
  - "apiVersion: zeta.lucent-financial-group.com/v1"
  - "kind: ClusterNode"
  - "  roles:"             — spec.roles is ARRAY per B-0813
  - "  registration:"      — spec.registration block per B-0813
  - "  hardware:"          — spec.hardware block per B-0813

(d) Hardware-probe sentinels (catches MAC parsing regression from #5352):
  - "/proc/cpuinfo"   — CPU_MODEL extraction
  - "link/ether"      — MAC parses field after link/ether (not before)

(e) Self-reg branch-shape sentinel:
  - "register-${NODE_HOSTNAME}-" — iter-5.4.1 branch name pattern

Composes with:
  - PR #5364 (Bug 2a + 2b fixes that this audit will catch if regressed)
  - PR #5352 (iter-5.4.1 Copilot findings that this audit will catch)
  - PR #5354 (Bug 1 hostname symlink fix — already covered by existing sentinels)
  - B-0831 (cascade #6 full-install QEMU test; this is layer 1 of that work)
  - B-0833 (interactive-login vs baked-in-keys tension; layer 3 of the cascade)

Why source-level + cheap-first:
  - Workflow build-ai-cluster-iso.yml runs `bun tools/ci/audit-installer-substrate.ts`
    on every PR touching the installer surface
  - Source-level catches substrate-regression at PR-author-time (seconds)
  - vs Layer 4 QEMU full-install (~minutes; expensive; flaky)
  - Layer 1 is the inner loop; Layers 2-4 are the outer loops

Per `.claude/rules/verify-existing-substrate-before-authoring.md`:
substrate-inventory pass found `tools/ci/audit-installer-substrate.ts`
already has the REQUIRED_SENTINELS pattern for iter-4.2 + iter-5.1 +
iter-5.2 + iter-5.2.2; this PR extends with iter-5.4 sentinels rather
than minting parallel substrate.

Verified: `bun tools/ci/audit-installer-substrate.ts` exits 0
("PASS — 10 required files + 5 sentinel-file assertions OK") with the
extended sentinel list against the current installer script at
origin/main HEAD (commit 19d9617 from #5364).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(#5365 Copilot): reflow iter-5.4.1 YAML schema sentinels comment so parenthesis closes on first line

Copilot finding on the audit-installer-substrate.ts iter-5.4 sentinel
addition: the comment 'iter-5.4.1 YAML schema sentinels (catches the
Copilot findings from #5352' opened a parenthesis on line 98 that
didn't close until line 100 ('block)'). To a code-reader scanning
line 98, the sentence reads as unfinished.

Fix: restructure as 'sentinels. Each catches a specific Copilot
finding on PR #5352: ...' — no multi-line parenthesis; each
schema-correction is a complete clause.

---------

Co-authored-by: Lior <lior@zeta.dev>
AceHack added a commit that referenced this pull request May 27, 2026
… (4a) + DISCIPLINE (4b) per operator 2026-05-26 (#5370)

* docs(B-0841): productize Shortform-equivalent features — Zeta already does this internally for substrate-engineering; 5-PR Kirsanov session today IS the working demonstration

Operator 2026-05-26: 'we should offer shortform.com like features'

Empirical anchor: today's 5 PRs (#5364-#5368 + #5369 pending) across
the Kirsanov YouTube channel substrate-capture are structurally
identical to what Shortform offers as a paid service:
- Verbatim transcript preservation (mirror-tier discipline)
- Composition map (cross-substrate-engineering linkage)
- Substrate-honest synthesis sections
- Cross-reference graph (composes_with topology)
- Per-source companion backlog rows

4-phase substrate-engineering target:

Phase 1: Catalog the framework's existing Shortform-equivalent
  substrate (docs/research/, ip-questionable/, today's 5 PRs)

Phase 2: Generalize beyond substrate-engineering scope —
  tools/shortform/generate-deep-guide.ts for arbitrary topics

Phase 3: Browser-extension equivalent via peer-call infrastructure —
  bun tools/peer-call/shortform-guide.ts <URL>

Phase 4: Monetization / external-publishing substrate (composes with
  Aurora B-0825 + DePIN B-0826 + cash-register-that-keeps-giving-gifts
  PR #2822 + ip-questionable _ip_risk_acceptance pattern at scale)

P2 priority — operator-suggestion; framework already does the work
internally; productization is forward-facing. Per 'backlog rows land
immediately; decompose later' discipline.

Composes with B-0839 (Kirsanov channel demonstration), B-0840
(thermal-forgetting / root-axiom-update — applies to deep-guide
retention), B-0825 (Aurora), B-0826 (DePIN), B-0648 (cross-substrate-
triangulation for multi-AI deep-guide synthesis).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

* feat(B-0841): split Phase 4 into 4a (sell OUTPUTS) + 4b (sell DISCIPLINE itself) per operator 2026-05-26 'we can sell that too to others eventually'

Phase 4a = consumer-scope productization of OUTPUTS (Shortform-
equivalent hosted deep-guides; Aurora B-0825 / DePIN B-0826 / cash-
register PR #2822 composition)

Phase 4b = B2B-scope productization of the DISCIPLINE ITSELF
(substrate-engineering as service for other companies / projects /
individuals doing substrate-engineering on their own substrate).
Customer-facing shape: Zeta runtime + skill catalog + discipline
training + customer-owned _*_acceptance blocks + customer-owned
ip-questionable-equivalent folders + customer-owned composes_with
graph + periodic substrate-engineering audits + multi-AI cluster.

Phases NOT mutually exclusive. 4a productizes OUTPUTS; 4b productizes
the DISCIPLINE. Framework's substrate-engineering work IS the moat;
OUTPUTS are downstream. Both ship in parallel as bandwidth allows.

---------

Co-authored-by: Lior <lior@zeta.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant