From 0802f7df531dfcfffe81ce305db5ccc1baa92b13 Mon Sep 17 00:00:00 2001 From: Lior Date: Tue, 26 May 2026 19:04:35 -0400 Subject: [PATCH] =?UTF-8?q?docs(backlog):=20B-0834=20=E2=80=94=20installer?= =?UTF-8?q?=20preserve=20install=20log=20to=20file=20(failures=20+=20warni?= =?UTF-8?q?ngs=20scroll=20past=20too=20fast;=20empirical=20from=20physical?= =?UTF-8?q?=20hardware-support=20test;=203rd=20empirical=20anchor=20in=20s?= =?UTF-8?q?ame=20test=20session=20as=20B-0832=20+=20B-0833)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per operator 2026-05-26: "i got some failures and warings on install of nixos not sure if it matters it scrolled by to faster have gh login this is exactly what i'm hoping you can log and test in ci" Two observations: 1. Install failures + warnings scroll past faster than human read speed 2. gh login not reached; scroll-past blocks diagnosis Files B-0834 as P2 substrate-engineering target. 2-approach scoping: - Approach A (preferred): tee install output to /tmp/zeta-install-*.log + copy to /mnt/var/log/zeta-install.log on completion - Approach B (upgrade path): script(1) wrapper records full session including ANSI + timing (replayable via scriptreplay) P2 priority — diagnostic enabler, not hard install blocker. This row is the OPERATOR-SIDE analog to B-0831's CI-SIDE cascade #6 serial-console-as-workflow-artifact. Both solve the same fundamental problem (preserve install output for later review) at different surfaces (real-hardware operator vs QEMU CI). Both compose; both land separately. 
THREE empirical anchors in ONE physical hardware-support test session (B-0832 nmtui WiFi rescan + B-0833 interactive-login vs baked-keys tension + this row B-0834 log preservation) is strong validation of B-0831's reframing: physical-test-as-first-class-hardware- compatibility-matrix-substrate produces real-world substrate- engineering targets that CI emulation cannot reproduce within minutes of B-0831's own landing. Co-Authored-By: Claude Opus 4.7 --- docs/BACKLOG.md | 1 + ...s-scrollback-empirical-aaron-2026-05-26.md | 147 ++++++++++++++++++ 2 files changed, 148 insertions(+) create mode 100644 docs/backlog/P2/B-0834-installer-preserve-install-log-to-file-failures-warnings-scrollback-empirical-aaron-2026-05-26.md diff --git a/docs/BACKLOG.md b/docs/BACKLOG.md index 81cad9314d..7f904be1e3 100644 --- a/docs/BACKLOG.md +++ b/docs/BACKLOG.md @@ -767,6 +767,7 @@ are closed (status: closed in frontmatter)._ - [ ] **[B-0828](backlog/P2/B-0828-multi-ai-shared-checkout-convention-human-maintainer-surface-plus-always-up-to-date-main-for-society-aaron-2026-05-26.md)** Multi-AI shared-checkout convention — human-maintainer surface + always-up-to-date-with-main for society - [ ] **[B-0829](backlog/P2/B-0829-schemas-as-rows-cluster-fork-as-trust-boundary-fsharp-type-providers-from-live-cluster-foundation-for-runme-bcl-ontology-kestrel-aaron-2026-05-26.md)** Schemas-as-rows + cluster-fork-as-trust-boundary + F# type providers from live cluster — foundation layer for Runme BCL ontology capability - [ ] **[B-0832](backlog/P2/B-0832-installer-nmtui-wifi-rescan-refresh-button-overlapping-networks-empirical-aaron-2026-05-26-physical-hardware-support-test.md)** installer nmtui WiFi step needs visible rescan/refresh path — empirical from operator's physical hardware-support test 2026-05-26 (20+ overlapping networks; target SSID not initially in scan list) (Aaron 2026-05-26) +- [ ] 
**[B-0834](backlog/P2/B-0834-installer-preserve-install-log-to-file-failures-warnings-scrollback-empirical-aaron-2026-05-26.md)** installer must preserve install log to file — failures + warnings scroll past faster than operator can read (empirical from 2026-05-26 physical hardware-support test; gh login not reached) (Aaron 2026-05-26) ## P3 — convenience / deferred diff --git a/docs/backlog/P2/B-0834-installer-preserve-install-log-to-file-failures-warnings-scrollback-empirical-aaron-2026-05-26.md b/docs/backlog/P2/B-0834-installer-preserve-install-log-to-file-failures-warnings-scrollback-empirical-aaron-2026-05-26.md new file mode 100644 index 0000000000..e9ca382350 --- /dev/null +++ b/docs/backlog/P2/B-0834-installer-preserve-install-log-to-file-failures-warnings-scrollback-empirical-aaron-2026-05-26.md @@ -0,0 +1,147 @@ +--- +id: B-0834 +priority: P2 +status: open +title: installer must preserve install log to file — failures + warnings scroll past faster than operator can read (empirical from 2026-05-26 physical hardware-support test; gh login not reached) (Aaron 2026-05-26) +effort: S +ask: aaron 2026-05-26 +created: 2026-05-26 +last_updated: 2026-05-26 +depends_on: + - B-0754 +composes_with: + - B-0831 + - B-0832 + - B-0833 +tags: [installer, first-boot, logging, operator-ux, physical-hardware-support-test, empirical-anchor, scrollback] +--- + +## Problem + +Empirical from operator's 2026-05-26 physical hardware-support test +(third empirical-anchor in the same test session after B-0832 nmtui +WiFi rescan + B-0833 interactive-vs-baked-keys auth tension): + +Operator framing: *"i got some failures and warings on install of +nixos not sure if it matters it scrolled by to faster have gh login +this is exactly what i'm hoping you can log and test in ci"* + +Two observations packed into one report: + +1. 
**Install failures + warnings scrolled past too fast for operator + to read** — `zeta-install` / `nixos-install` / underlying + nix-build output flows directly to the terminal; under load + (parallel derivation builds + parallel disko operations + parallel + nixpkgs evaluation), output rate exceeds human reading speed; ANY + warnings or recoverable failures get lost in the scroll +2. **gh login not reached** — install presumably stalled OR failed + before reaching the `gh auth login` step in `zeta-install.sh`; the + scroll-past-too-fast issue blocks operator's ability to diagnose + +The operator's correct framing: *"this is exactly what i'm hoping +you can log and test in ci"* — B-0831 cascade #6 phase 1 already +plans to capture full serial console as workflow-artifact. This row +is the OPERATOR-SIDE analog: preserve the log on the install target +so operator can review after the fact, BEFORE B-0831 cascade #6 +lands. + +## Proposed mitigation + +Two layered approaches: + +### Approach A — `tee` install output to log file (smallest fix) + +Modify `zeta-install.sh` to `tee` all output to a log file in /mnt +(install target) AND to a log file in /tmp (live ISO): + +```bash +# At top of zeta-install.sh: +LOG_FILE="${LOG_FILE:-/tmp/zeta-install-$(date -u +%Y%m%dT%H%M%SZ).log}" +exec > >(tee -a "$LOG_FILE") 2>&1 +``` + +After install completes (success OR failure), copy the log to: + +- `/mnt/var/log/zeta-install.log` (preserved on installed system; available + after first boot of installed system) +- `/tmp/zeta-install-.log` (available on live ISO until + reboot; operator can `cat` after `Ctrl-C` to abort + diagnose) + +Operator can then: + +- During install: `Ctrl-Z` install → background → `tail -f + /tmp/zeta-install-*.log | less` (scrollable) +- After failure: `cat /tmp/zeta-install-*.log | less` +- After successful install + boot: `journalctl -u zeta-first-boot + --boot=-1` OR `cat /var/log/zeta-install.log` + +### Approach B — `script` command wraps zeta-install 
entirely + +Use `script(1)` to record the full session (input + output + timing): + +```bash +# Wrapper that records everything: +script -q /tmp/zeta-install-session.typescript -c '/run/current-system/sw/bin/zeta-install ' +``` + +This captures even ANSI escape sequences + timing data (replayable +via `scriptreplay`). Heavier than `tee` but captures more (TUI +interactions like nmtui screen states). + +Approach A is preferred (simpler; smaller code change; sufficient +for the immediate diagnostic need); Approach B is the upgrade path +if Approach A misses anything. + +## Acceptance + +- Install output is `tee`'d to `/tmp/zeta-install-.log` on + the live ISO from the moment `zeta-first-boot` fires +- On install completion (success OR failure), log is copied to + `/mnt/var/log/zeta-install.log` (if `/mnt` is mounted at that point) +- Pre-failure scroll-past failures + warnings are now PRESERVED IN THE + LOG FILE; operator can review via `less /tmp/zeta-install-*.log` +- Log location named in the existing install-banner text so operator + sees WHERE the log is BEFORE it scrolls past +- Empirical re-validation: rerun the physical hardware-support test; + if failures recur, operator can `cat` the log and surface specifics + +## Composes with + +- `full-ai-cluster/usb-nixos-installer/zeta-install.sh` (the script + this row modifies) +- `full-ai-cluster/usb-nixos-installer/zeta-first-boot.sh` (wrapper + that calls zeta-install; could set `LOG_FILE` env var) +- B-0754 (zero-typing first-boot scope; this row extends with + preserved-log substrate) +- B-0831 (CI cascade #6 phase 1 captures full serial console; this row + is the OPERATOR-SIDE analog — log preserved on disk so operator can + review post-failure, BEFORE B-0831 cascade #6 lands) +- B-0832 (sibling empirical anchor: nmtui WiFi rescan; same physical + test session) +- B-0833 (sibling empirical anchor: interactive-login vs baked-in-keys + CI-test tension; same physical test session — this row's failure + blocked 
reaching the gh-login phase) +- The 2026-05-26 physical hardware-support test (3rd empirical anchor + in the same test session; validates B-0831 reframing of + physical-as-hardware-support-test producing empirically-anchored + substrate-engineering targets within minutes) + +## Substrate-honest framing + +Three empirical anchors in one physical hardware-support test session +(B-0832 + B-0833 + this row B-0834) is STRONG validation of B-0831's +reframing: physical-test-as-first-class-hardware-compatibility-matrix- +substrate produces real-world substrate-engineering targets that CI +emulation cannot reproduce. + +The fix is small (tee output to log file) and immediate-value +(operator can diagnose the install failure that prompted this row, +once Approach A lands). P2 priority because it's a diagnostic +enabler, not a hard install blocker. + +This row also demonstrates the **operator-side analog** pattern to +the CI-side B-0831 cascade #6: both substrate-engineering targets +solve the same fundamental problem (preserve install output for +later review) at different surfaces (operator-physical-test on real +hardware vs CI-automated-test in QEMU). Both are valuable; both +land separately; both compose.
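
Review note on Approach A: a minimal sketch of how the tee-then-copy flow might look, under stated assumptions. The function names `run_logged` and `preserve_log` are hypothetical and do not appear in `zeta-install.sh`. The sketch pipes the install command through `tee` instead of using the `exec > >(tee ...)` redirect shown in the patch. This is a deliberate variant: with a plain pipeline, `tee` has finished writing before the copy step runs, so the copied log should not be missing its last lines.

```bash
#!/usr/bin/env bash
# Sketch only. run_logged and preserve_log are illustrative names,
# not functions that exist in zeta-install.sh.

# Run a command with stdout and stderr shown on the terminal and appended
# to a log file. Returns the exit status of the command, not of tee.
run_logged() {
  local log_file=$1
  shift
  # Name the log location up front so the operator sees it before output scrolls.
  echo "Install log: $log_file"
  "$@" 2>&1 | tee -a "$log_file"
  return "${PIPESTATUS[0]}"
}

# Copy the log onto the install target, but only if the target root is
# actually a mount point (the install may fail before /mnt is mounted).
preserve_log() {
  local log_file=$1
  local target_root=$2
  if command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$target_root"; then
    mkdir -p "$target_root/var/log" &&
      cp "$log_file" "$target_root/var/log/zeta-install.log"
  else
    echo "target $target_root is not mounted; log stays at $log_file" >&2
    return 1
  fi
}

# Possible wiring (assumption: install_steps stands in for the real install body):
#   LOG_FILE="${LOG_FILE:-/tmp/zeta-install-$(date -u +%Y%m%dT%H%M%SZ).log}"
#   run_logged "$LOG_FILE" install_steps
#   status=$?
#   preserve_log "$LOG_FILE" /mnt
#   exit "$status"
```

Because the copy runs whether the install succeeded or failed, the same path covers both acceptance cases. The original exit status is kept so callers such as `zeta-first-boot.sh` can still detect a failed install.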