Skip to content

feat(B-0849 Phase 1): Docker NixOS install.sh test harness — fast iteration (~30-60 sec) for install.sh + mise + bun + iter-5.5.0; complements B-0831 QEMU (Aaron 2026-05-27)#5393

Merged
AceHack merged 7 commits into
mainfrom
feat-b0849-1-docker-nixos-install-sh-test-harness-implementation-2026-05-27-0136z
May 27, 2026
Merged

feat(B-0849 Phase 1): Docker NixOS install.sh test harness — fast iteration (~30-60 sec) for install.sh + mise + bun + iter-5.5.0; complements B-0831 QEMU (Aaron 2026-05-27)#5393
AceHack merged 7 commits into
mainfrom
feat-b0849-1-docker-nixos-install-sh-test-harness-implementation-2026-05-27-0136z

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 27, 2026

Summary

Aaron 2026-05-27: "we should add docker based nixos install.sh testing so we can iterate quick that's an easy dockerfile"

Implements B-0849 Phase 1 — bounded-iteration test harness for the install.sh / linux.sh / mise.sh substrate.

3 files

  1. tools/ci/dockerfiles/nixos-install-sh-test/Dockerfilenixos/nix:2.31.2 pinned base + /etc/NIXOS marker + run install.sh + validate bun (1.3.x exact pin)/claude/gh
  2. tools/ci/docker-nixos-install-sh-test.ts — TS wrapper per Rule 0 with exit-code mapping + log capture (default .tools/docker-nixos-install-sh-test.log) + timeout + centralized spawnDocker helper (sonarjs suppression + 64 MiB maxBuffer)
  3. .dockerignore — NEW; excludes references/upstreams/ (gigabytes per the rule), node_modules/, .git/, build outputs, IDE scratch, ISO/qcow2 artifacts. Affects ALL docker builds run from repo root — substrate-honest scope flag.

Validation coverage

Layer Check
linux.sh NixOS detection touch /etc/NIXOS makes linux.sh route to mise.sh
mise install nix-shell bootstraps mise + reads .mise.toml
bun via mise bun --version matches .mise.toml pin 1.3 EXACTLY
claude-code via bun set -o pipefail + bun install --global @anthropic-ai/claude-code
gh via nix nix-shell -p gh install + version check

Cycle-time tradeoff

Surface Validates Cycle
Operator USB End-to-end + reboot ~30+ min
B-0831 QEMU End-to-end virtualized ~15 min
B-0849 Docker (THIS PR) install.sh on NixOS userspace ~30-60 sec

Usage

```bash
bun tools/ci/docker-nixos-install-sh-test.ts # default 600s timeout
bun tools/ci/docker-nixos-install-sh-test.ts --keep-image # inspect after
DOCKER_BUILD_TIMEOUT_SEC=900 bun tools/ci/docker-nixos-install-sh-test.ts
```

Composes with

B-0824 · B-0831 · B-0835 · B-0848 + B-0850

Copilot review responses

10 findings across 2 review batches all addressed: unused import, name attribution, spawnDocker centralization, .dockerignore for repo root, dirname() cross-platform, nixos/nix pin, bun version exact match, pipefail propagation, ENV PATH for mise across Docker layers, gitignored default log path. All threads resolved.

🤖 Generated with Claude Code

…ration (~30-60 sec) for tools/setup/install.sh + linux.sh + mise.sh + iter-5.5.0 substrate; complements B-0831 QEMU full-install (~15 min)

Operator framing 2026-05-27 (verbatim):

  > "we should add docker based nixos install.sh testing so we can
  > iterate quick that's an easy dockerfile"

2 files:

1. tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile (NEW)
   - Base: nixos/nix:latest (nix-the-package-manager + /nix/store)
   - touch /etc/NIXOS marker (triggers linux.sh's NixOS-detection
     branch per iter-5.5.1 PR #5389)
   - Enable nix flakes
   - COPY tools/setup + .mise.toml + package.json + bun.lock
   - RUN tools/setup/install.sh (exercises full install dispatcher)
   - Validate bun installed via mise + matches .mise.toml pin
   - Validate claude-code installable via bun install --global
     (exercises iter-5.5.0 substrate Step 6.95a)
   - Validate gh installable via nix-shell (Bug 5 fix substrate)
   - Final success marker echo

2. tools/ci/docker-nixos-install-sh-test.ts (NEW)
   - TS wrapper per Rule 0 (TS-over-bash for DST + cross-platform)
   - Wraps `docker build` with exit-code mapping + log capture +
     timeout enforcement (default 600s) + build-context discipline
   - --keep-image flag for inspection mode
   - DOCKER_BUILD_TIMEOUT_SEC + DOCKER_LOG_OUT_PATH env overrides
   - Exit codes: 0 success / 1 build-failed / 2 usage-error /
     124 timeout
   - Standard tools/ci/ TS-wrapper pattern (matches qemu-boot-test.ts
     + qemu-full-install-test.ts conventions)

What this catches (per the B-0849 row's empirical case):

- iter-5.4 cascade Bug 5 (gh not in systemPackages) — Docker would
  have caught at write-time
- iter-5.4 cascade Bug 7 (NetBIOS / Samba config) — partially (only
  install.sh path; full systemd-config needs QEMU)
- iter-5.5.0 Bug 8 (credential persistence gap) — Docker would have
  caught the install.sh credential-copy logic if added (B-0850
  Phase 2 substrate)
- iter-5.5.1 alignment (bun → mise) — Docker validates the entire
  install.sh dispatcher + mise + bun flow

Complementary to B-0831 cascade #6 (QEMU full-install ~15 min) at
the cycle-time vs scope tradeoff:

| Surface | Validates | Cycle time |
|---|---|---|
| Operator USB | End-to-end + reboot | ~30+ min |
| B-0831 QEMU | End-to-end virtualized | ~15 min |
| B-0849 Docker (this PR) | install.sh on NixOS userspace | ~30-60 sec |

Composes with: B-0824 (Ace meta-PM — Docker test is one slice of
three-way-parity validation extended to NixOS), B-0831 cascade #6
(QEMU complementary surface), B-0835 install bug cluster (Docker
catches future Bug-N at write-time), B-0848 + B-0850 (node-local
Claude + systemd service — install validation is prerequisite for
Phase 1 of both).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 03:16
@AceHack AceHack enabled auto-merge (squash) May 27, 2026 03:16
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Comment thread tools/ci/docker-nixos-install-sh-test.ts Fixed
… used

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
auto-merge was automatically disabled May 27, 2026 03:19

Pull Request is not mergeable

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a fast-iteration Docker-based NixOS userspace harness intended to validate tools/setup/install.sh’s NixOS detection path (via /etc/NIXOS) plus key post-install tooling checks, providing a quicker feedback loop than the existing QEMU full-install test.

Changes:

  • Introduces a nixos/nix-based Dockerfile that runs tools/setup/install.sh and validates bun + claude-code + gh availability.
  • Adds a Bun/TypeScript wrapper to run docker build with log capture, timeout, and optional image retention.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

File Description
tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile Docker build recipe to exercise install.sh on NixOS userspace and validate bun/claude-code/gh.
tools/ci/docker-nixos-install-sh-test.ts Bun/TS wrapper that invokes docker build, captures logs, enforces a timeout, and cleans up the image.

Comment thread tools/ci/docker-nixos-install-sh-test.ts
Comment thread tools/ci/docker-nixos-install-sh-test.ts Outdated
Comment thread tools/ci/docker-nixos-install-sh-test.ts
Comment thread tools/ci/docker-nixos-install-sh-test.ts Outdated
Comment thread tools/ci/docker-nixos-install-sh-test.ts
Comment thread tools/ci/docker-nixos-install-sh-test.ts
Comment thread tools/ci/docker-nixos-install-sh-test.ts Outdated
Comment thread tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile Outdated
Comment thread tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile Outdated
Comment thread tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile Outdated
Lior and others added 2 commits May 26, 2026 23:23
…ckerignore + dirname() + nixos/nix pin + bun version exact match + name attribution removed

8 substantive Copilot findings (besides earlier `join` unused), all
real + fixed:

1. Name attribution "Aaron" in header → "operator" (per
   .github/copilot-instructions.md no-name-attribution rule)

2. spawnSync("docker", ...) needs sonarjs/no-os-command-from-path
   suppression → centralized spawnDocker() helper with the
   suppression + rationale comment (matches
   tools/ci/audit-installer-iso-content.ts:186-194 pattern)

3. maxBuffer not set → spawnDocker sets maxBuffer 64 MiB (docker
   build --progress=plain is verbose; Node default 1 MiB overruns)

4. Build context = repo root + no .dockerignore → would send
   gigabytes of references/upstreams/ (per references-upstreams-*
   rule). NEW .dockerignore at repo root excludes:
     - references/upstreams/ (gigabytes per the rule)
     - node_modules/, .git/, bin/, obj/, target/, dist/, build/
     - .DS_Store, .vscode/, .idea/, *.log
     - *.iso, *.qcow2, *.img, *.vmdk

5. logPath.substring(lastIndexOf("/")) breaks on Windows → use
   path.dirname() for cross-platform support

6. Second spawnSync for `docker rmi` needs same suppression →
   uses centralized spawnDocker helper (single point)

7. nixos/nix:latest is non-deterministic → pin to nixos/nix:2.31.2
   with bump-procedure comment pointing at
   .claude/rules/dep-pin-search-first-authority.md

8. bun version check `^1\.` too loose vs .mise.toml pin "1.3" →
   tightened to `^1\.3\.` (matches 1.3.x but rejects 1.4+)

Composes with: B-0849 row (this PR implements Phase 1), B-0849 row
operator-named .dockerignore discipline composes with
references-upstreams-not-our-code-search-excludes rule, dep-pin-
search-first-authority rule (bump procedure for nixos/nix base
image + bun pin tightening).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
… RUN so pipeline exit-status propagates (was masked by tail)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 03:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comment thread tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile
Comment thread tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile Outdated
Comment thread tools/ci/docker-nixos-install-sh-test.ts
Comment thread .dockerignore
Lior and others added 2 commits May 26, 2026 23:28
…cker layers (P0) + Dockerfile pin comment uses date not 'training data' (P2) + log default to .tools/ gitignored path (P1) + (meta) PR description note for .dockerignore

P0 — install.sh exports PATH only in its own process; Docker doesn't
persist exports across RUN layers. Without ENV PATH, validation
RUNs would fail because mise/bun aren't on PATH in fresh layer
shells. Added ENV PATH=/root/.bun/bin:/root/.local/share/mise/shims:
/root/.local/bin:/usr/local/bin:/usr/bin:/bin AFTER install.sh
runs + before validation RUNs.

P2 — 'as of training data' comment replaced with actionable bump
procedure: visit Docker Hub nixos/nix tags page, select current
latest stable, ideally pin by digest. Pin selected 2026-05-27.

P1 — default log path .docker-test-log lands in repo root unignored
(would show as untracked + risk accidental commit). Changed to
.tools/docker-nixos-install-sh-test.log (.tools/ is already
gitignored at line 50 of .gitignore). Operator-override via
DOCKER_LOG_OUT_PATH env var unchanged.

Meta — PR description out of date (says 2 files; actual is 3 with
.dockerignore). PR body update on next push will fix.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…buffer types to string + assert NodeJS.ErrnoException for .code access

Two TS2 errors caught by lint (tsc tools):
- TS2365 line 148: stdout+stderr was string|NonSharedBuffer; .toString() coerce
- TS2339 line 154: result.error.code not on base Error; assert NodeJS.ErrnoException

Verified locally: bun x tsc --noEmit -p tsconfig.json passes clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 03:32
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread tools/ci/docker-nixos-install-sh-test.ts Outdated
Comment thread tools/ci/dockerfiles/nixos-install-sh-test/Dockerfile Outdated
…n nixos/nix base image

P1 (docstring drift): DOCKER_LOG_OUT_PATH docstring still said
'.docker-test-log' but actual default changed to
'.tools/docker-nixos-install-sh-test.log' (per earlier P1 fix).
Updated docstring to match + explain gitignored rationale.

P1 (mutable tag): nixos/nix:2.31.2 pinned by tag but tags can be
re-pushed upstream (mutable). Digest-pin to content-addressed
sha256 makes CI failures attributable to actual upstream change
rather than mystery base-image flake. Got digest via Docker Hub
registry API directly (no docker pull required):
  sha256:29fc5fe207f159ceb0143c25c19c774062fee02ce5eda118f3067547b3054894
Bump procedure (per dep-pin rule): repeat curl + digest fetch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@AceHack AceHack merged commit a1b43f3 into main May 27, 2026
32 checks passed
@AceHack AceHack deleted the feat-b0849-1-docker-nixos-install-sh-test-harness-implementation-2026-05-27-0136z branch May 27, 2026 03:47
AceHack added a commit that referenced this pull request May 27, 2026
…ll.sh test on PRs touching install substrate (#5396)

Implements [B-0849](docs/backlog/P2/B-0849-...) Phase 2 — wires the
Docker harness from Phase 1 (PR #5393) into CI so install.sh /
linux.sh / mise.sh bugs are caught at PR time vs reboot time.

Workflow shape mirrors build-ai-cluster-iso.yml (canonical install-
test workflow pattern):
- Runner pinned ubuntu-24.04 (NOT -latest)
- Third-party actions SHA-pinned with vX.Y.Z comments
- permissions: contents: read
- Concurrency: workflow-scoped, cancel-in-progress for PRs
- No github.event.* values in run: lines (per security-guidance)

Path triggers:
- tools/setup/** (install dispatcher + per-OS scripts + common/)
- .mise.toml (pinned runtime versions)
- full-ai-cluster/nixos/modules/common.nix (systemd + bun PATH)
- tools/ci/dockerfiles/nixos-install-sh-test/** (the Dockerfile)
- tools/ci/docker-nixos-install-sh-test.ts (the TS wrapper)
- .dockerignore (affects all docker builds from repo root)
- package.json + bun.lock (TS wrapper deps)

Job timeout 15 min (cold-cache install.sh + mise + bun + claude +
gh nix-shell ~5-10 min upper bound; warm ~60-120 sec). DOCKER_BUILD_
TIMEOUT_SEC bumped to 900s for cold-cache headroom.

Upload-artifact (always) preserves the test log for 7 days for
post-failure diagnostic per `.claude/rules/blocked-green-ci-investigate-
threads.md` verify-before-fix discipline.

Composes with: B-0849 Phase 1 (PR #5393 — the Dockerfile + TS
wrapper), B-0831 cascade #5 QEMU boot test (complementary cycle-time
vs scope), iter-5.5.0 substrate, B-0835 install bug cluster.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants