@@ -0,0 +1,99 @@
---
pr_number: 5343
title: "docs(backlog): B-0831 \u2014 CI cascade #6 full-install + cluster-auto-join (eliminate routine human physical USB test)"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-26T22:53:48Z"
merged_at: "2026-05-26T22:58:58Z"
closed_at: "2026-05-26T22:58:58Z"
head_ref: "otto/b-0831-ci-cascade-6-full-install-cluster-auto-join-no-human-test-2026-05-26"
base_ref: "main"
archived_at: "2026-05-27T19:30:35Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5343: docs(backlog): B-0831 — CI cascade #6 full-install + cluster-auto-join (eliminate routine human physical USB test)

## PR description

## Summary

Files B-0831 as P1 substrate-engineering target capturing operator direction 2026-05-26: "zflash is the thing plus cluster auto joining after boot from iso use we want that in ci not needing human to test everytime."

## 3-slice decomposition

| Slice | Scope | Latency cost |
|---|---|---|
| 1 | Full-install-in-QEMU: boot installer ISO → first-boot service fires → greedy N-disk install → reboot → verify login banner | +5-10 min PR-build |
| 2 | Cluster-auto-join verification via mock cluster control-plane (capture + verify B-0812 self-registration payload) | +<1 min |
| 3 | ArgoCD reconciliation verification (most coupled to live cluster state; deferrable to push-to-main only) | TBD; possibly push-only |

Each slice ships independently. Overall acceptance: human physical-USB-test is no longer the routine gate for substrate landings.
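
As an illustration of Slice 1's last step (verify login banner), a CI job could poll the captured serial-console log for a known string. This is a minimal sketch under assumptions: the helper name `wait_for_banner`, the log path, and the banner text are hypothetical and do not come from the B-0831 row.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a serial-console log until a banner string appears.
# Arguments: <log file> <banner text> [timeout in seconds, default 600]
wait_for_banner() {
  local log="$1" banner="$2" timeout="${3:-600}" waited=0
  while :; do
    # Fixed-string match so the banner text is never treated as a regex.
    if [ -f "$log" ] && grep -Fq -- "$banner" "$log"; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

A workflow step might call `wait_for_banner serial.log "login:" 600` after starting QEMU with its serial output redirected to `serial.log`; both the file name and the banner string here are placeholders.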

## What remains valuable for physical test

- Real-hardware quirks (BIOS/UEFI variants; motherboard NICs; SAS controllers) that QEMU doesn't emulate
- Periodic sanity-checks the maintainer chooses to do
- First-time-on-new-hardware validation

## Test plan

- [x] markdownlint clean (B-0831 row + BACKLOG.md regenerated)
- [x] No code changes (backlog row only)
- [x] Composes_with cross-refs to all relevant rows + skills + workflow files
- [x] Substrate-honest scope assessment (L effort; phased; latency trade-off named)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Reviews

### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T22:56:09Z)

## Pull request overview

Adds a new P1 backlog row (B-0831) capturing the planned CI “cascade #6” work to validate a full installer run in QEMU plus post-boot cluster auto-join, with the goal of eliminating routine physical USB testing as the substrate gate.

**Changes:**
- Adds new backlog row **B-0831** describing a 3-slice CI verification plan (full install, mock join verification, optional ArgoCD reconciliation verification).
- Updates `docs/BACKLOG.md` index to include **B-0831**.

### Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| ---- | ----------- |
| docs/backlog/P1/B-0831-ci-cascade-6-full-install-plus-cluster-auto-join-eliminate-routine-human-physical-usb-test-aaron-2026-05-26.md | New backlog row defining the problem statement, slices, acceptance criteria, and cross-references for CI cascade #6. |
| docs/BACKLOG.md | Adds B-0831 to the P1 backlog index list. |

## Review threads

### Thread 1: docs/backlog/P1/B-0831-ci-cascade-6-full-install-plus-cluster-auto-join-eliminate-routine-human-physical-usb-test-aaron-2026-05-26.md:71 (resolved)

**@copilot-pull-request-reviewer** (2026-05-26T22:56:08Z):

The cluster registration path is inconsistent with the established per-maintainer shape (e.g., `maintainers/<operator>/cluster-nodes/<hostname>/...` in B-0812/B-0794). `maintainers/cluster-nodes/` suggests a top-level directory that doesn't exist and would mislead future implementation/testing work.

### Thread 2: docs/backlog/P1/B-0831-ci-cascade-6-full-install-plus-cluster-auto-join-eliminate-routine-human-physical-usb-test-aaron-2026-05-26.md:86 (resolved)

**@copilot-pull-request-reviewer** (2026-05-26T22:56:08Z):

This reference to ArgoCD watching `maintainers/cluster-nodes/` should match the globbed path used elsewhere (`maintainers/*/cluster-nodes/**` per B-0813). As written, it implies a different tree shape than the rest of the cluster-node registration substrate.
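
The path shape raised in Threads 1 and 2 could be guarded by a small check in the mock-join test. The sketch below assumes the per-maintainer layout quoted above (`maintainers/<operator>/cluster-nodes/<hostname>/...`); the function name and the example file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check: does a repo-relative path follow the per-maintainer
# cluster-node layout maintainers/<operator>/cluster-nodes/<hostname>/<file...>?
is_cluster_node_path() {
  local path="$1"
  # [^/]+ keeps <operator> and <hostname> to exactly one path segment each.
  printf '%s\n' "$path" |
    grep -Eq '^maintainers/[^/]+/cluster-nodes/[^/]+/.+$'
}
```

With this check, `maintainers/alice/cluster-nodes/node-01/registration.json` passes, while a top-level `maintainers/cluster-nodes/node-01/registration.json` is rejected.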

### Thread 3: docs/backlog/P1/B-0831-ci-cascade-6-full-install-plus-cluster-auto-join-eliminate-routine-human-physical-usb-test-aaron-2026-05-26.md:102 (resolved)

**@copilot-pull-request-reviewer** (2026-05-26T22:56:08Z):

Markdown continuation line starts with `+` inside a `-` bullet, which triggers markdownlint list-style parsing issues in this repo. Also, `<10` is easy to misread as an HTML tag in Markdown renderers; wording it as "under" avoids that ambiguity.

### Thread 4: docs/backlog/P1/B-0831-ci-cascade-6-full-install-plus-cluster-auto-join-eliminate-routine-human-physical-usb-test-aaron-2026-05-26.md:174 (resolved)

**@copilot-pull-request-reviewer** (2026-05-26T22:56:09Z):

`(<1 min ...)` can be misinterpreted as an HTML tag in Markdown renderers; using "under 1 min" keeps the meaning while avoiding rendering ambiguity.

## General comments

### @chatgpt-codex-connector (2026-05-26T22:53:54Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).
@@ -0,0 +1,34 @@
---
pr_number: 5344
title: "feat(broadcast): add local broadcast schema contract"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-26T22:57:31Z"
merged_at: "2026-05-26T23:00:30Z"
closed_at: "2026-05-26T23:00:30Z"
head_ref: "claim/codex-b0213-broadcast-bus-schema-ttl-receipts-20260526"
base_ref: "main"
archived_at: "2026-05-27T19:30:35Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5344: feat(broadcast): add local broadcast schema contract

## PR description

## Summary
- Add a structured schema contract for the local `~/.local/share/zeta-broadcasts` markdown bus.
- Define default TTL/staleness handling and read-receipt shape for B-0213 before runner wiring.
- Release the Codex claim file in this PR branch per the git-native claim protocol.

## Tests
- `bun test tools/broadcast-local/schema.test.ts`
- `git diff --check origin/main...HEAD`

B-0213 slice: schema/TTL/receipts only; ask/offer matching, priority interrupt behavior, conflict detection, and history remain follow-up wiring work.
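
For readers unfamiliar with TTL-based staleness, the core comparison can be sketched as follows. This is not the schema in `tools/broadcast-local/`; the function name, the epoch-seconds inputs, and the strict "older than TTL" rule are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical staleness rule: a broadcast is stale once its age exceeds its TTL.
# Arguments: <created epoch seconds> <ttl seconds> [now epoch seconds]
is_broadcast_stale() {
  local created="$1" ttl="$2" now="${3:-$(date +%s)}"
  [ $((now - created)) -gt "$ttl" ]
}
```

Passing `now` explicitly keeps the rule deterministic in tests; omitting it falls back to the current clock.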

## General comments

### @chatgpt-codex-connector (2026-05-26T22:57:36Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).
@@ -0,0 +1,101 @@
---
pr_number: 5345
title: "docs(backlog): B-0832 \u2014 installer nmtui WiFi rescan/refresh (empirical from physical hardware-support test 2026-05-26; 20+ overlapping networks)"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-26T22:58:25Z"
merged_at: "2026-05-26T23:00:21Z"
closed_at: "2026-05-26T23:00:21Z"
head_ref: "otto/b-0832-nmtui-wifi-refresh-rescan-overlapping-networks-installer-first-boot-2026-05-26"
base_ref: "main"
archived_at: "2026-05-27T19:30:34Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5345: docs(backlog): B-0832 — installer nmtui WiFi rescan/refresh (empirical from physical hardware-support test 2026-05-26; 20+ overlapping networks)

## PR description

## Summary

First empirical UX feedback from operator's physical hardware-support test 2026-05-26 — validates B-0831's reframing of physical-test as first-class hardware-compatibility-matrix substrate.

## Issue

Operator framing: "in the network manager i can refresh wifi connections if i don't see mine initially i have like 20 overlapping networks in my location so i was unable to select the one i wanted but moving foward but we need some sort of way to refresh thoughs?"

The installer's zeta-first-boot service auto-launches nmtui when no ethernet is detected. In dense-WiFi environments the initial scan may miss the target SSID; nmtui has no obvious rescan path.

## 3-layer mitigation (smallest first)

| Approach | Scope | Code change |
|---|---|---|
| A | Documentation banner before nmtui launch (F5 rescan + Esc re-launch paths) | Banner text in zeta-first-boot.sh |
| B | Pre-scan + post-nmtui re-launch loop in zeta-first-boot.sh | Small loop addition |
| C | Bypass nmtui entirely; prompt-driven nmcli flow | Larger refactor; 0-human-typing-aligned |

P2 priority — UX friction, not hard blocker (operator continued the test via "moving forward" workaround).
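
Approach B's rescan loop could look roughly like the sketch below. It is an assumption-laden illustration, not the contents of `zeta-first-boot.sh`: the helper name `wait_for_ssid`, the attempt count, and the delay are hypothetical. It relies only on `nmcli device wifi rescan` and `nmcli -t -f SSID device wifi list`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rescan WiFi until the target SSID shows up.
# Arguments: <ssid> [attempts, default 5] [delay seconds, default 3]
wait_for_ssid() {
  local target="$1" attempts="${2:-5}" delay="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    # Request a fresh scan; NetworkManager may refuse if one just ran, so
    # a failure here is ignored and the current scan results are used.
    nmcli device wifi rescan >/dev/null 2>&1 || true
    # Terse output prints one SSID per line; -Fx requires an exact match.
    # Note: terse mode may backslash-escape ':' and '\' inside SSIDs.
    if nmcli -t -f SSID device wifi list 2>/dev/null | grep -Fxq -- "$target"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

A first-boot script could run this before and after each nmtui session, re-launching nmtui only while the operator's SSID is still missing.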

## Empirical anchor — B-0831 validation

This row IS what B-0831 predicted: physical hardware-support test surfaces real-world issues that CI emulation cannot reproduce. QEMU has no concept of dense-WiFi channel-contention. The substrate-engineering value of physical-as-hardware-support-test is now empirically validated within one tick of B-0831 landing.

## Test plan

- [x] markdownlint clean
- [x] BACKLOG.md regenerated
- [x] Composes_with B-0754 (zero-typing first-boot scope) + B-0831 (CI cascade #6 + physical-as-hardware-support-test substrate)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Reviews

### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T23:01:01Z)

## Pull request overview

Adds a new backlog row (B-0832) capturing empirical operator feedback from a physical hardware-support test: dense WiFi environments can cause `nmtui` to miss the target SSID on initial scan, and the installer needs a visible rescan/refresh path.

**Changes:**
- Adds `docs/backlog/P2/B-0832...md` describing the issue and outlining three mitigation approaches (A banner, B rescan/relaunch loop, C `nmcli` flow).
- Regenerates/updates `docs/BACKLOG.md` to include the new B-0832 entry.

### Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| ---- | ----------- |
| docs/backlog/P2/B-0832-installer-nmtui-wifi-rescan-refresh-button-overlapping-networks-empirical-aaron-2026-05-26-physical-hardware-support-test.md | New P2 backlog row documenting the dense-WiFi `nmtui` rescan UX gap and candidate mitigations. |
| docs/BACKLOG.md | Adds the generated index entry for B-0832 under P2. |

## Review threads

### Thread 1: docs/backlog/P2/B-0832-installer-nmtui-wifi-rescan-refresh-button-overlapping-networks-empirical-aaron-2026-05-26-physical-hardware-support-test.md:20 (resolved)

**@copilot-pull-request-reviewer** (2026-05-26T23:01:00Z):

P1 (xref): This row references `B-0831` in `composes_with`, but there is no `B-0831` row/file anywhere in the repo (search across `docs/backlog/**` and `docs/BACKLOG.md` finds none). Either add the missing `B-0831` row in this PR, or change/remove the reference to the correct existing backlog ID to avoid broken cross-links.

### Thread 2: docs/backlog/P2/B-0832-installer-nmtui-wifi-rescan-refresh-button-overlapping-networks-empirical-aaron-2026-05-26-physical-hardware-support-test.md:119 (resolved)

**@copilot-pull-request-reviewer** (2026-05-26T23:01:00Z):

P1 (xref): This bullet points to `B-0831`, but `B-0831` does not exist in the current repo (no per-row file and not present in `docs/BACKLOG.md`). Update this to the correct existing backlog ID or add the missing row so the "Composes with" section stays linkable/accurate.

## General comments

### @chatgpt-codex-connector (2026-05-26T22:58:31Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).

### @AceHack (2026-05-26T23:02:02Z)

Both threads resolved no-op as stale false positives. The B-0831 row landed via PR #5343 (merge commit `1072f569`); Copilot reviewed PR #5345 before #5343 merged. Cross-refs are valid on current `main`:

```
$ git ls-tree -r origin/main -- docs/backlog/ | grep B-0831
100644 blob 38ea4ac78fdc... docs/backlog/P1/B-0831-ci-cascade-6-full-install-plus-cluster-auto-join-eliminate-routine-human-physical-usb-test-aaron-2026-05-26.md
```

Per `.claude/rules/blocked-green-ci-investigate-threads.md` stale-but-fresh-looking-findings subsection: these were TRUE at thread-filing time but became STALE by review-resolution time. No-op resolution.
@@ -0,0 +1,76 @@
---
pr_number: 5346
title: "docs(backlog): B-0833 \u2014 installer interactive-login vs baked-in-keys CI-test tension (resolve without shipping credentials on ISO)"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-26T23:01:21Z"
merged_at: "2026-05-26T23:05:54Z"
closed_at: "2026-05-26T23:05:54Z"
head_ref: "otto/b-0833-interactive-login-vs-baked-in-keys-ci-test-tension-aaron-2026-05-26"
base_ref: "main"
archived_at: "2026-05-27T19:30:33Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5346: docs(backlog): B-0833 — installer interactive-login vs baked-in-keys CI-test tension (resolve without shipping credentials on ISO)

## PR description

## Summary

Per operator 2026-05-26 from physical hardware-support test: "in the automated tests i see a tention between interactive login and baked in keys we probably are going to have to resolve this i would love if interactive device login didn't need to be human tested everytime but this is hard to test"

## The tension

| Mode | Security | Testability |
|---|---|---|
| Interactive login (gh auth login device-code) | NO credentials on ISO; aligned with B-0794 homelab-mode | Hard to test in CI without human |
| Baked-in keys | VIOLATES: ISO is publicly downloadable | Easy to test |

## 4-approach scoping

| # | Approach | Phase | Code cost |
|---|---|---|---|
| A | Mock GH device-code endpoint in CI | Proper coverage (Phase 1) | ~200 LOC TS mock server |
| B | Test-only ephemeral GH App with OIDC-minted tokens | Proper coverage (Phase 1) | GH App + OIDC trust setup |
| C | Skip auth in cascade #6 phase 1; layered tests | Immediate (Phase 0) | --skip-gh-auth flag |
| D | Manual auth-only physical test | Residual (steady-state) | Operator-cadence discipline |

Likely landing: C first + A or B follow-up + D as residual.
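
Approach C's `--skip-gh-auth` flag might be wired in along these lines. This is a sketch under assumptions: only the flag name comes from the table above; the function names, the `SKIP_GH_AUTH` variable, and the placement of `gh auth login` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical flag handling for a test-only auth skip.
SKIP_GH_AUTH=0

parse_args() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --skip-gh-auth) SKIP_GH_AUTH=1 ;;
      *) ;; # other installer flags are out of scope for this sketch
    esac
  done
}

run_auth_step() {
  if [ "$SKIP_GH_AUTH" -eq 1 ]; then
    # Leave a visible trace so a skipped auth step is never silent in logs.
    echo "gh auth: skipped (--skip-gh-auth)"
    return 0
  fi
  # Interactive login; no credentials are read from the ISO.
  gh auth login
}
```

Printing an explicit "skipped" line keeps the skip visible in captured CI logs.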

## 5 HARD LIMITS (non-negotiable per methodology-hard-limits + B-0794)

1. NO real GitHub PATs on ISO (publicly downloadable)
2. NO operator SSH private keys on ISO (gh ssh-key list reads PUBLIC only)
3. NO long-lived credentials in CI (ephemeral or mock only)
4. NO test credentials work against real GH API (mock-scoped)
5. Audit trail for every CI auth test
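
Limits 1 and 2 lend themselves to an automated check on the unpacked ISO tree. The sketch below is illustrative only: the function name is hypothetical, and the patterns (classic `ghp_` tokens, fine-grained `github_pat_` tokens, PEM or OpenSSH private-key headers) are a non-exhaustive assumption about what to look for.

```shell
#!/usr/bin/env bash
# Hypothetical guard: fail if a directory tree contains text that looks like
# a GitHub token or a private key. Patterns are illustrative, not exhaustive.
scan_for_credentials() {
  local root="$1" hits
  hits="$(grep -rlE \
    -e 'ghp_[A-Za-z0-9]{36}' \
    -e 'github_pat_[A-Za-z0-9_]{20,}' \
    -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
    -- "$root" 2>/dev/null || true)"
  if [ -n "$hits" ]; then
    echo "possible credentials found in:" >&2
    printf '%s\n' "$hits" >&2
    return 1
  fi
  return 0
}
```

A public key file passes the scan, while any file carrying a private-key header or token-shaped string fails it.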

## Test plan

- [x] markdownlint clean
- [x] BACKLOG.md regenerated
- [x] Composes_with cross-refs to B-0794 + B-0831 + B-0812 + B-0813 + methodology-hard-limits + classifier-bypass-research

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Reviews

### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T23:03:12Z)

## Pull request overview

Adds a new P1 backlog row (B-0833) documenting the security vs CI-testability tension for installer GitHub authentication (interactive device-code login vs baked-in credentials), and updates the generated backlog index to include the new row.

**Changes:**
- Added backlog row B-0833 describing four resolution approaches (mock endpoint, ephemeral GH App, layered tests with auth skip, and periodic manual auth testing) plus non-negotiable security limits.
- Regenerated `docs/BACKLOG.md` to include B-0833 in the P1 section.

### Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| ---- | ----------- |
| docs/backlog/P1/B-0833-installer-interactive-login-vs-baked-in-keys-ci-test-tension-resolve-without-shipping-credentials-aaron-2026-05-26.md | New backlog item capturing constraints and candidate approaches for CI-testing installer auth without shipping credentials. |
| docs/BACKLOG.md | Index update to list the new B-0833 row under P1. |
@@ -0,0 +1,63 @@
---
pr_number: 5347
title: "docs(backlog): B-0834 \u2014 installer preserve install log to file (failures + warnings scroll past too fast; 3rd empirical anchor in same physical test session)"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-26T23:04:48Z"
merged_at: "2026-05-26T23:06:48Z"
closed_at: "2026-05-26T23:06:48Z"
head_ref: "otto/b-0834-installer-preserve-failures-warnings-log-scrollback-empirical-2026-05-26"
base_ref: "main"
archived_at: "2026-05-27T19:30:32Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5347: docs(backlog): B-0834 — installer preserve install log to file (failures + warnings scroll past too fast; 3rd empirical anchor in same physical test session)

## PR description

## Summary

Per operator 2026-05-26: "i got some failures and warings on install of nixos not sure if it matters it scrolled by to faster have gh login this is exactly what i'm hoping you can log and test in ci"

## Two observations packed into one report

1. Install failures + warnings scrolled past faster than human read speed
2. gh login not reached; the scroll-past blocks diagnosis

## 2-approach scoping

| Approach | Scope | Code change |
|---|---|---|
| A (preferred) | tee install output to /tmp/zeta-install-*.log + copy to /mnt/var/log/zeta-install.log on completion | Small exec redirect at top of zeta-install.sh |
| B (upgrade) | script(1) wrapper records full session (ANSI + timing; replayable) | Wrapper script |

P2 priority — diagnostic enabler, not hard install blocker.
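
Approach A could be sketched as two small helpers. This is an illustration under assumptions, not the actual `zeta-install.sh` change: the function names, the timestamp format in the `/tmp` file name, and the `INSTALL_LOG` variable are hypothetical, and the `exec > >(tee ...)` idiom is bash-specific.

```shell
#!/usr/bin/env bash
# Hypothetical helper: from this point on, mirror stdout and stderr of the
# calling script into a log file while still printing to the console.
start_install_log() {
  INSTALL_LOG="${1:-/tmp/zeta-install-$(date +%Y%m%dT%H%M%S).log}"
  exec > >(tee -a "$INSTALL_LOG") 2>&1
}

# Hypothetical helper: copy the log onto the install target so it survives
# the reboot. <target_root> is the mounted target filesystem (default /mnt).
preserve_install_log() {
  local src="$1" target_root="${2:-/mnt}"
  mkdir -p "$target_root/var/log"
  cp -- "$src" "$target_root/var/log/zeta-install.log"
}
```

Calling `start_install_log` at the top of the script and `preserve_install_log "$INSTALL_LOG"` on completion, or from an exit trap so that failures are also captured, would cover both halves of the approach.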

## The operator-side analog to B-0831

B-0831 cascade #6 captures full serial console as workflow-artifact in CI. This row is the OPERATOR-SIDE analog: preserve the log on the install target so operator can review post-failure on real hardware, BEFORE B-0831 lands.

## 3 empirical anchors in 1 test session

| Row | Anchor |
|---|---|
| B-0832 | nmtui WiFi rescan needed (dense-WiFi 20+ networks) |
| B-0833 | interactive-login vs baked-in-keys CI-test tension |
| B-0834 (this PR) | install log scroll-past-too-fast |

Strong validation of B-0831's reframing within minutes of its own landing: physical-test-as-first-class-hardware-compatibility-matrix-substrate produces real-world substrate-engineering targets that CI emulation cannot reproduce.

## Test plan

- [x] markdownlint clean
- [x] BACKLOG.md regenerated
- [x] Composes_with B-0754 + B-0831 + B-0832 + B-0833 + zeta-install.sh + zeta-first-boot.sh

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## General comments

### @chatgpt-codex-connector (2026-05-26T23:04:54Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).