ci: comprehensive install cache + retry + ubuntu-24.04 bump (fixes mise/bun 502 class)#80

Merged
AceHack merged 1 commit into main from
fix/ci-cache-mise-on-lint-jobs-2026-04-28
Apr 28, 2026

AceHack commented Apr 28, 2026

Summary

Three structural fixes for the PR #23 mise+bun-1.3.13 transient 502 failure class, addressing Aaron's 2026-04-28 directives in sequence.

What changed

1. Comprehensive install cache (lint jobs)

lint-shell, lint-workflows, lint-markdown previously had NO cache step (unlike build-test at line 156+). Now all three cache everything tools/setup/install.sh writes:

~/.local/bin/mise        (the mise binary itself)
~/.local/share/mise      (mise runtimes — bun/dotnet/python/uv/java)
~/.cache/mise            (mise download cache)
~/.dotnet/tools          (dotnet global tools)
~/.elan                  (Lean toolchain)
~/.config/zeta           (managed shellenv)
tools/tla, tools/alloy   (verifier jars)

Cache key: install-${{ runner.os }}-${{ runner.arch }}-${{ hashFiles('.mise.toml', 'tools/setup/**', 'global.json') }}. The hash covers .mise.toml (runtime versions), tools/setup/** (the install logic itself), and global.json, so a change to any of them invalidates the cache → the vanilla install path is re-tested whenever the install discipline changes.
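To see why editing any hashed input forces a cache miss, the key's shape can be approximated locally. This is a minimal sketch, not the PR's code: `install_cache_key` is a hypothetical helper, it assumes `sha256sum` (GNU coreutils) is on the PATH, and GitHub's `hashFiles()` computes its digest differently, so the value will not match the real key.

```shell
# Sketch only: mimics the *shape* of the cache key, not GitHub's actual
# hashFiles() digest. install_cache_key is a hypothetical helper.
install_cache_key() {
  local os="$1" arch="$2" root="$3"
  local digest
  # Hash every input file, then hash the sorted list of per-file hashes,
  # so a change to any single file changes the final digest.
  digest=$(
    cd "$root" &&
    {
      for f in .mise.toml global.json; do
        if [ -f "$f" ]; then printf '%s\n' "$f"; fi
      done
      if [ -d tools/setup ]; then find tools/setup -type f; fi
    } | LC_ALL=C sort | while IFS= read -r f; do sha256sum "$f"; done \
      | sha256sum | cut -d' ' -f1
  )
  printf 'install-%s-%s-%s\n' "$os" "$arch" "$digest"
}
```

Calling `install_cache_key Linux X64 .` from the repo root before and after touching a file under tools/setup/ should print two different keys, which is the invalidation behavior described above.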

2. Retry layer (install step)

3-attempt retry with 10s/30s backoff. CI-only (dev runs stay interactive — user decides). Mise's internal 3-attempt retry was exhausted on PR #23's bun-1.3.13 download; wrapping at the install.sh layer catches the case where mise itself gives up.
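The retry shape described here can be sketched as a small wrapper function. This is an illustration under stated assumptions rather than the exact step in gate.yml: `retry_install` and the `SLEEP_CMD` override are hypothetical names, and the backoff formula is the one quoted in the review hunk (`attempt * 20 - 10`).

```shell
# Sketch of a 3-attempt retry with 10s/30s backoff. retry_install and
# SLEEP_CMD are hypothetical names introduced for illustration.
retry_install() {
  local attempt backoff
  for attempt in 1 2 3; do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -eq 3 ]; then
      echo "command failed after 3 attempts: $*" >&2
      return 1
    fi
    backoff=$((attempt * 20 - 10))   # attempt 1 -> 10s, attempt 2 -> 30s
    echo "attempt ${attempt} failed; retrying in ${backoff}s..." >&2
    # SLEEP_CMD lets a caller swap out the real sleep (for example in tests).
    "${SLEEP_CMD:-sleep}" "$backoff"
  done
}
```

In a workflow step this would be invoked as `retry_install ./tools/setup/install.sh`. Keeping the wrapper in the CI step rather than inside install.sh is what leaves dev runs interactive.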

3. Ubuntu 24.04 bump

Per Otto-247 WebSearch verification (April 2026): ubuntu-latest = ubuntu-24.04 since Jan 2025 rollout. ubuntu-22.04 is now LTS-2 stale.

Bumped: gate.yml (×6 lint jobs), resume-diff.yml, scorecard.yml, memory-index-duplicate-lint.yml, budget-snapshot-cadence.yml. Stays on stock GitHub-hosted runner image — no custom pre-installed bun — preserving Aaron's "vanilla ubuntu so we test do our install scripts work on vanalla and deve machines."
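A quick local check can confirm that no workflow still pins the old image. The following is a minimal sketch and not part of this PR: `find_stale_runner_pins` is a hypothetical helper, and it only matches the literal `runs-on: ubuntu-22.04` form, so matrix-based or expression-based runner selections would be missed.

```shell
# Sketch: list workflow files that still pin the old runner image.
# find_stale_runner_pins is a hypothetical helper, not part of the PR.
find_stale_runner_pins() {
  local dir="${1:-.github/workflows}" old="${2:-ubuntu-22.04}"
  # -r: recurse; -l: print only file names; -F: match a fixed string.
  # grep exits 1 when nothing matches; that is treated as "no stale pins".
  grep -rlF -- "runs-on: ${old}" "$dir" || true
}
```

Running `find_stale_runner_pins` from the repo root prints one path per offending file and prints nothing when every workflow has been bumped.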

Dev ↔ CI parity

install.sh runs on both surfaces; cache restores state similar to a dev's already-bootstrapped local env; cache key on tools/setup/** + .mise.toml matches what a dev's environment depends on. install.sh stays idempotent so:

  • cache hit = fast no-op (script detects everything installed, exits)
  • cache miss = full vanilla install (which IS the install-script validation Aaron wants)
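The hit/miss behavior above depends on each install step checking state before doing work. A minimal sketch of that pattern follows; `ensure_tool` is a hypothetical helper, and the real tools/setup/install.sh may detect installed state in a different way.

```shell
# Sketch of an idempotent install step. ensure_tool is a hypothetical
# helper; the real tools/setup/install.sh may detect state differently.
ensure_tool() {
  local name="$1"
  shift
  # Cache hit (or already-bootstrapped dev machine): nothing to do.
  if command -v "$name" >/dev/null 2>&1; then
    echo "skip: ${name} already installed"
    return 0
  fi
  # Cache miss: run the installer command passed as the remaining arguments.
  echo "install: ${name}"
  "$@"
}
```

A call such as `ensure_tool mise install_mise` (where `install_mise` is a placeholder for whatever actually installs the tool) becomes a fast no-op after a cache restore and a full install on a cold runner.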

Test plan

Composes with

🤖 Generated with Claude Code

…26-04-28)

Three structural fixes for the PR #23 mise+bun-1.3.13 502 transient
class, addressing Aaron 2026-04-28 directives:

  "is there not a way to fix this?" (don't default to rerun)
  "we want to use stock and we better not be using that old
   version of ubuntu"
  "can you cache and retry?"
  "we want to make sure dev seutp and build machine setup are as
   close to the same a possible"
  "why not cache the whole install/setup"

1. **Comprehensive install cache** on lint-shell, lint-workflows,
   lint-markdown jobs (previously uncached). Caches everything
   tools/setup/install.sh writes:
     ~/.local/bin/mise (the mise binary)
     ~/.local/share/mise (mise runtimes — bun/dotnet/python/uv/java)
     ~/.cache/mise (mise download cache)
     ~/.dotnet/tools (dotnet global tools)
     ~/.elan (Lean toolchain)
     ~/.config/zeta (managed shellenv)
     tools/tla, tools/alloy (verifier jars)
   Cache key hashes BOTH .mise.toml AND tools/setup/** so install
   logic changes invalidate cache → vanilla install path gets
   re-tested whenever discipline changes.

2. **Retry layer** on the install step (CI-only — dev runs stay
   interactive). Three attempts with 10s/30s backoff. Mise's
   internal 3-attempt retry was exhausted on PR #23's bun download;
   wrapping at the install.sh layer catches the case where mise
   itself gives up. Same shape across all 3 lint jobs.

3. **Ubuntu 24.04 bump** on every workflow that pinned ubuntu-22.04
   (gate.yml lint jobs ×6, resume-diff.yml, scorecard.yml,
   memory-index-duplicate-lint.yml, budget-snapshot-cadence.yml).
   ubuntu-latest = ubuntu-24.04 since Jan 2025 per Otto-247 WebSearch
   verification; 22.04 is now LTS-2 stale. Stays on stock GitHub-
   hosted runner image (no custom pre-installed bun) per Aaron's
   "we want to use stock" + "vanilla ubuntu so we test do our install
   scripts work on vanalla and deve machines."

Dev↔CI parity: install.sh runs on both surfaces; cache restores
state similar to a dev's already-bootstrapped local env; cache key
on tools/setup/** + .mise.toml matches what a dev's environment
depends on. install.sh stays idempotent so cache hit = fast no-op,
cache miss = full vanilla install (which is the install-script
validation Aaron wants).

Composes with PR #75 curl_fetch helper (downstream curl retries),
PR #76 + #79 markdownlint carve-outs (verbatim ferry preservation),
Otto-247 version-currency, Otto-235 4-shell portability, Otto-341
mechanism-over-vigilance, and `feedback_structural_fix_beats_process_discipline_velocity_multiplier_aaron_2026_04_28.md`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 28, 2026 05:39
@AceHack AceHack enabled auto-merge (squash) April 28, 2026 05:39
@AceHack AceHack merged commit 2791c28 into main Apr 28, 2026
18 checks passed
@AceHack AceHack deleted the fix/ci-cache-mise-on-lint-jobs-2026-04-28 branch April 28, 2026 05:41

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2b37210da7

Comment on lines +286 to +288
tools/tla
tools/alloy
key: install-${{ runner.os }}-${{ runner.arch }}-${{ hashFiles('.mise.toml', 'tools/setup/**', 'global.json') }}

P2: Stop caching tracked verifier source directories

This cache entry includes tools/tla and tools/alloy, which are not just tool outputs—they also contain tracked repo sources (for example specs/*.tla and AlloyRunner.java). Since the key only hashes .mise.toml, tools/setup/**, and global.json, a PR that edits files under those directories (without changing setup files) can have its checked-out files overwritten by a restored cache before lint executes, causing CI to analyze stale content. Cache only the downloaded jar targets (for example tools/tla/tla2tools.jar and tools/alloy/alloy.jar) or expand the key to include those tracked directories’ contents.
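The risk flagged in this comment can be checked mechanically by looking for git-tracked files that live under a cached path. This is a minimal sketch under stated assumptions: `tracked_in_cache_paths` is a hypothetical helper that reads a tracked-file list on stdin (for example the output of `git ls-files`) and takes the cache paths as arguments.

```shell
# Sketch: flag cache paths that also contain git-tracked files.
# tracked_in_cache_paths is a hypothetical helper. It reads tracked file
# names on stdin and takes the cached directories as arguments.
tracked_in_cache_paths() {
  local file path
  while IFS= read -r file; do
    for path in "$@"; do
      case "$file" in
        # Print any tracked file that sits below a cached directory.
        "$path"/*) printf '%s\n' "$file" ;;
      esac
    done
  done
}
```

Running `git ls-files | tracked_in_cache_paths tools/tla tools/alloy` from the repo root lists the tracked sources a restored cache could overwrite. Empty output would suggest the directory holds only downloaded artifacts and is safe to cache whole.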

Copilot AI left a comment

Pull request overview

Updates the GitHub Actions CI configuration to reduce transient install failures (mise/bun 5xx class) by caching toolchain state, adding retries around tools/setup/install.sh, and standardizing runners on Ubuntu 24.04.

Changes:

  • Bump runs-on from ubuntu-22.04 to ubuntu-24.04 across several workflows.
  • Add an install-output cache + 3-attempt retry wrapper around tools/setup/install.sh for lint jobs in gate.yml.
  • Update related workflow header comments to match the runner bump.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| .github/workflows/gate.yml | Switch lint jobs to ubuntu-24.04; add cache + retry wrapper around install.sh for shellcheck/actionlint/markdownlint jobs. |
| .github/workflows/resume-diff.yml | Switch runner to ubuntu-24.04 and update runner-pin comment. |
| .github/workflows/scorecard.yml | Switch scorecard runner to ubuntu-24.04. |
| .github/workflows/memory-index-duplicate-lint.yml | Switch runner to ubuntu-24.04 and update header comment. |
| .github/workflows/budget-snapshot-cadence.yml | Switch runner to ubuntu-24.04. |

Comment on lines +266 to +267
# "we want to make sure dev seutp and build machine setup
# are as close to the same a possible"
Comment on lines +17 to +19
# work; bumped from 22.04 on 2026-04-28 per Aaron's "we better
# not be using that old version of ubuntu" + Otto-247 version-
# currency WebSearch — ubuntu-latest = 24.04 since Jan 2025).
Comment on lines +306 to +311
set -euo pipefail
for attempt in 1 2 3; do
  if ./tools/setup/install.sh; then exit 0; fi
  [ "$attempt" = "3" ] && { echo "install.sh failed after 3 attempts"; exit 1; }
  backoff=$((attempt * 20 - 10))
  echo "install.sh attempt $attempt failed; retrying in ${backoff}s..." >&2
# - concurrency: workflow-scoped; cancel-in-progress for PR
# events.
# - Runner digest-pinned (ubuntu-22.04).
# - Runner digest-pinned (ubuntu-24.04).
Comment on lines +262 to +289
- name: Cache install.sh outputs (mise runtimes + dotnet tools + verifier jars)
  # Comprehensive cache of everything `tools/setup/install.sh`
  # writes — keeps dev laptops and CI runners as close to the
  # same state as possible per Aaron's 2026-04-28 directive:
  # "we want to make sure dev seutp and build machine setup
  # are as close to the same a possible"
  # "why not cache the whole install/setup"
  # Without comprehensive caching every CI run hits CDNs cold
  # (mise.run + GitHub releases for bun/shellcheck/actionlint +
  # NuGet + Lean toolchain). Transient 502s become avoidable
  # failures rather than first-run-only cost.
  # Cache key hashes BOTH .mise.toml (runtime versions) AND
  # tools/setup/** (install logic itself) so changes to either
  # invalidate cache → vanilla install path gets re-tested
  # whenever the install discipline changes.
  uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
  with:
    path: |
      ~/.local/bin/mise
      ~/.local/share/mise
      ~/.cache/mise
      ~/.dotnet/tools
      ~/.elan
      ~/.config/zeta
      tools/tla
      tools/alloy
    key: install-${{ runner.os }}-${{ runner.arch }}-${{ hashFiles('.mise.toml', 'tools/setup/**', 'global.json') }}

Comment on lines +279 to +287
path: |
  ~/.local/bin/mise
  ~/.local/share/mise
  ~/.cache/mise
  ~/.dotnet/tools
  ~/.elan
  ~/.config/zeta
  tools/tla
  tools/alloy
AceHack added a commit that referenced this pull request Apr 28, 2026
…Otto-357 strengthen (#83)

* tick-history: 2026-04-28T05:44Z — PR #80 MERGED + #81 retry-bump + #82 Otto-357 strengthen + 3 conflict resolutions

* fix(pr-83): reconcile verify-don't-parrot streak count — 4 ticks running (was inconsistent 3 vs 4)

PR #83 review thread (P2 copilot): the row described the streak count as
both "3 ticks running" early and "4 ticks running" later. The conflict
was a scope mismatch — the early count was meant to be cumulative
ticks-of-discipline-applied (4, matching the observations enumeration),
but I'd written it as 3 from an older draft state.

Reconciled to a single 4-count framing that explicitly references the
observations column (which enumerates the 4 distinct verifications
applied this tick: cron-id verify / AUTONOMOUS-LOOP.md grep / CronList
freshness / retry-3-failed-on-#23 sourcing).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>