tools: lint/runner-version-freshness.sh — structural enforcement for Otto-213 stale-version lesson#360

Merged
AceHack merged 3 commits into main from
tools/lint-runner-version-freshness-otto-214
Apr 25, 2026

@AceHack AceHack commented Apr 24, 2026

Summary

Otto-214 ships the tooling-level enforcement I proposed in Otto-213. Memory alone was not sufficient to stop the "write a stale version number" recurrence pattern (I did it 3× this session alone); this lint is the detector for a CI-fail gate, so the failure stops compounding.

What the lint does

  • Walks .github/workflows/*.yml
  • Extracts runs-on: + os: matrix lines
  • Fails (exit 2) on stale-label hits: ubuntu-22.04, macos-14, macos-15, windows-2022, etc.
  • Warns (exit 3) if the allow-list itself is stale (>30 days since LAST_VERIFIED)
  • Prints the canonical allow-list + authoritative GitHub docs URL on failure
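
The detection pass described above could look roughly like the following. This is a minimal sketch, not the shipped script: `STALE_LABELS` and `scan_workflow` are illustrative names, the label list is only the subset quoted in this description, and the multi-line matrix list form and comment handling are ignored here.

```shell
# Sketch only: STALE_LABELS and scan_workflow are illustrative names,
# not taken from runner-version-freshness.sh.
STALE_LABELS="ubuntu-22.04 macos-14 macos-15 windows-2022"

# Print "lineno:text" for each runs-on:/os: line of file $1 that names a
# stale label. Returns 2 on any hit (the failure code used in this PR),
# otherwise 0.
scan_workflow() {
  local file=$1 label hits
  hits=$(
    for label in $STALE_LABELS; do
      # -F matches the label literally, so "." is not a regex wildcard.
      grep -nE '^[[:space:]]*(runs-on|os):' "$file" \
        | grep -F -- "$label" || true
    done | sort -t: -k1,1n -u   # one entry per line number
  )
  if [ -n "$hits" ]; then
    printf 'STALE RUNNER LABEL(S) in %s:\n%s\n' "$file" "$hits"
    return 2
  fi
  return 0
}
```

A caller would loop over `.github/workflows/*.yml`, call `scan_workflow` on each file, and remember whether any call returned 2.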

Allow-list verified 2026-04-24 via the standard-runners docs.

First-run output on current main

13 stale-label hits across 3 workflows:

STALE RUNNER LABEL(S) in .github/workflows/codeql.yml:
  108:    runs-on: ubuntu-22.04
  265:    runs-on: ubuntu-22.04
STALE RUNNER LABEL(S) in .github/workflows/gate.yml:
  [8 hits including matrix + comment-block]
STALE RUNNER LABEL(S) in .github/workflows/github-settings-drift.yml:
  54:    runs-on: ubuntu-22.04

gate.yml hits are cleaned up by PR #359 (already in queue). codeql.yml + github-settings-drift.yml need separate follow-up PRs.

Sequencing — detect-only first, enforce when baseline green

Same pattern as audit-cross-platform-parity.sh (FACTORY-HYGIENE row #51): ship the detector, clean up the existing baseline, THEN wire enforcement into gate.yml lint job. Premature enforcement blocks every current PR. Concrete plan:

  1. This PR — ships the lint tool. Not wired into CI yet.
  2. PR #359 (ci: 4-runner PR-gate matrix + delete redundant nightly (Otto-210..213)): clears gate.yml stale refs (already in queue)
  3. Follow-up PRs — clear codeql.yml + github-settings-drift.yml stale refs
  4. Enforcement PR — wire runner-version-freshness.sh into gate.yml lint (...) job chain after baseline is clean

Portability

  • macOS default-bash-3.x compatible (no mapfile)
  • GNU date + BSD date fallback for the LAST_VERIFIED age check
  • Shellcheck clean (SC2001 note acknowledged as intentional)
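
As an illustration of the "no mapfile" point, a while-read loop fed by a here-document works on bash 3.x and keeps the loop in the current shell. This is a sketch under assumptions: `list_workflows` and `count_workflows` are hypothetical helper names, and counting stands in for whatever per-file work the real script does.

```shell
# Print one workflow path per line (sorted for stable output).
list_workflows() {
  find "$1" -type f -name '*.yml' | sort
}

# Iterate with while-read instead of `mapfile -t` (bash 4+ only).
# The here-document keeps the loop in the current shell, so the counter
# assigned inside it is still visible after the loop ends.
count_workflows() {
  local n=0 path
  while IFS= read -r path; do
    [ -n "$path" ] || continue   # an empty listing still yields one blank line
    n=$((n + 1))
  done <<EOF
$(list_workflows "$1")
EOF
  printf '%s\n' "$n"
}
```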

Composes with

  • Otto-213 version-numbers-always-websearch memory (the durable lesson this tool structurally enforces)
  • Otto-212 use-latest-tags + security-hygiene directive
  • Otto-210/211 macOS-is-free + M1-not-Intel corrections
  • FACTORY-HYGIENE row #43 safe-pattern compliance

Test plan

  • Detects all 13 current stale refs on main (exit 2)
  • Prints canonical allow-list + docs URL on failure
  • Shellcheck clean
  • Portable to bash 3.x (macOS default)
  • CI wire-up: separate follow-up after baseline green

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 24, 2026 11:45
@AceHack AceHack enabled auto-merge (squash) April 24, 2026 11:45

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3fd4541701

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tools/lint/runner-version-freshness.sh Outdated
Comment thread tools/lint/runner-version-freshness.sh Outdated

Copilot AI left a comment

Pull request overview

Adds a new lint script to detect stale GitHub Actions runner labels in workflow YAMLs, with an allow-list that’s periodically re-verified and a distinct warning mode when the allow-list itself is stale.

Changes:

  • Introduces tools/lint/runner-version-freshness.sh to scan .github/workflows/* for stale runs-on / matrix os labels.
  • Encodes an allow-list + LAST_VERIFIED timestamp and emits a warning exit code when the allow-list is older than 30 days.
  • Emits actionable output including line hits, canonical labels, and the authoritative GitHub docs URL.

Comment thread tools/lint/runner-version-freshness.sh
Comment thread tools/lint/runner-version-freshness.sh Outdated
Comment thread tools/lint/runner-version-freshness.sh Outdated
Comment thread tools/lint/runner-version-freshness.sh
AceHack and others added 2 commits April 25, 2026 00:56
…Otto-213 durable lesson

Otto-214 implementation of the tooling-level enforcement
I proposed in Otto-213. Memory alone was not sufficient to
stop the "write a stale version number" recurrence
pattern; this script is the detector for a CI-fail gate.

Behavior:

- Walks .github/workflows/*.yml files
- Extracts runs-on: + os: matrix lines
- Fails (exit 2) if any line references a STALE runner
  version (ubuntu-22.04, macos-14, macos-15, windows-2022,
  ubuntu-20.04, macos-13, macos-15-intel, etc.)
- Warns (exit 3) if the allow-list itself is stale (>30
  days since LAST_VERIFIED)
- Prints the canonical list of ALLOWED labels on failure
  + the authoritative GitHub docs URL for re-verification

Allow-list verified 2026-04-24 via
https://docs.github.com/en/actions/how-tos/write-workflows/choose-where-workflows-run/choose-the-runner-for-a-job#standard-github-hosted-runners-for-public-repositories
exact quote "Use of the standard GitHub-hosted runners
is free and unlimited on public repositories."

First-run detects 13 stale-label hits across codeql.yml,
gate.yml, github-settings-drift.yml (plus stale comment-
block references in gate.yml from the pre-correction
history). These will be cleaned up by PR #359 for
gate.yml; codeql.yml + github-settings-drift.yml need
separate follow-up PRs.

Does NOT wire into gate.yml automatically — separate
step to add the lint check after the baseline is green.
Premature enforcement would block every current PR.
Sequencing: (1) this PR ships the tool; (2) follow-up
PRs clean up existing stale refs (gate.yml already
covered by #359; others queued); (3) once baseline is
clean, add to gate.yml lint job.

Composes with:

- Otto-213 version-numbers-require-websearch memory
- Otto-212 use-latest-tags + security-hygiene directive
- Otto-210/211 macOS-is-free + M1-not-Intel corrections
- FACTORY-HYGIENE row #43 safe-pattern compliance
- Analogous pattern to audit-cross-platform-parity.sh
  (detect-only-first, enforce-when-baseline-green)

Test plan:

- Runs clean when no stale labels present
- Exits 2 with clear message when stale labels present
- Warns when allow-list >30 days old
- Shellcheck clean (SC2001 note acknowledged; the
  non-bash-4 sed-style substitution is intentional for
  macOS default-bash-3.x compatibility per FACTORY-
  HYGIENE row #51 cross-platform parity)
- Portable: no mapfile (bash 4+ only); uses while-read
  loop pattern that works in bash 3.x

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e + comment-strip + rolling-alias forbidden + warn-only exit

Six Codex findings on tools/lint/runner-version-freshness.sh:

P0 (line 133) — regex-metachar escape:
`stale_pattern` was built from raw label strings; `.` in
ubuntu-22.04 was a regex wildcard, producing false matches/
misses. Added `escape_for_regex` helper that escapes . + *
? ( ) [ ] { } | \ / before alternation.
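
A helper of this kind might look like the sketch below. The name `escape_for_regex` comes from the commit message; the body and the `build_alternation` wrapper are assumptions. Unlike the helper described above, this version leaves `/` alone, since grep needs no escaping for it.

```shell
# Backslash-escape ERE metacharacters in $1 so the label matches literally.
# "]" comes first inside the bracket expression so it is taken literally.
escape_for_regex() {
  printf '%s\n' "$1" | sed 's/[][\.+*?(){}|^$]/\\&/g'
}

# Join the escaped labels with "|" to form an ERE alternation
# (hypothetical helper, not named in the PR).
build_alternation() {
  local alt='' label esc
  for label in "$@"; do
    esc=$(escape_for_regex "$label")
    alt="${alt:+$alt|}$esc"
  done
  printf '%s\n' "$alt"
}
```

With the escaped pattern, `ubuntu-22x04` no longer matches the `ubuntu-22.04` entry, which is the false match the finding describes.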

P0 (line 149) — BSD-grep portability:
`\b` word-boundary is not portable to BSD grep (macOS default):
POSIX ERE leaves `\b` undefined. Replaced with explicit
non-word boundaries: `([^A-Za-z0-9_]|^)` start +
`([^A-Za-z0-9_]|$)` end, expressed without backrefs so it
works in both GNU and BSD grep.
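
The explicit-boundary form can be tried in isolation. A sketch, assuming a hypothetical `has_label` wrapper that receives an already regex-escaped label:

```shell
# Return 0 when the escaped label $1 occurs in line $2 with a non-word
# character (or line start/end) on both sides. Avoids \b entirely.
has_label() {
  printf '%s\n' "$2" \
    | grep -Eq "(^|[^A-Za-z0-9_])($1)([^A-Za-z0-9_]|\$)"
}

has_label 'ubuntu-22\.04' '    runs-on: ubuntu-22.04' && echo "hit"
has_label 'macos-14' 'runs-on: macos-145'            || echo "no hit"
```

One consequence of treating `-` as a boundary: `macos-15` also matches inside `macos-15-intel`. That is harmless for the stale list quoted in this PR, where both labels are stale, but it is worth keeping in mind when reusing the pattern.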

P1 (line 149-1) — exclude comments:
Stale-label-in-comment was triggering false positives. Added
a comment-stripping pre-filter (`grep -vE '^[[:space:]]*#'`)
so YAML comments are excluded from the scan.

P1 (line 149-2) — explicit-file-not-found masking:
`grep ... 2>/dev/null || true` silently swallowed missing-
file errors and reported 'ok' for nothing-actually-linted.
Added an explicit `[ ! -r "$file" ]` precheck that fails
loud (exit 2) rather than passing silent.

P1 (line 73) — rolling-aliases forbidden by convention:
ALLOWED_LABELS included ubuntu-latest / windows-latest /
macos-latest, contradicting the repo convention of pinned
major-OS-version labels. Removed from ALLOWED_LABELS, added
a separate ROLLING_ALIASES forbidden list, added a
distinct error-class scan ('ROLLING-ALIAS RUNNER LABEL') so
contributors get a different error message than for
stale-version pins. Same fail=1 flag, different operator
message.

P2 (line 179) — warn-only exit on stale freshness:
Header documents this as warning-only; code exited 3 (which
some CI configurations treat as failure). Updated to exit 0
on stale-freshness-only path; warning is still printed to
stderr. Stale-version-detection still exit 2 (a real failure).

Smoke-test note: the new script now flags ubuntu-22.04 in
gate.yml as stale (real finding) — exit 2 with the expected
output. gate.yml's own runner-pin upgrade is out of scope
for this PR; will land separately.
@AceHack AceHack force-pushed the tools/lint-runner-version-freshness-otto-214 branch from 3fd4541 to 55eeb4c Compare April 25, 2026 04:59

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55eeb4ce6b

Comment thread tools/lint/runner-version-freshness.sh Outdated
Comment thread tools/lint/runner-version-freshness.sh Outdated
…ping

Two more substantive Codex findings:

P1 (line 183) — quoted matrix entries missed:
The matrix-entry prefilter was `^[[:space:]]*-[[:space:]]+`
which only matched bare `- <label>`. Common YAML syntax
`- "ubuntu-22.04"` or `- 'macos-15'` was being missed.
Updated prefilter to `^[[:space:]]*-[[:space:]]+(['\"]?)`
which optionally consumes a leading single or double quote.
Smoke-tested with mixed quoting + matrix block: catches both
forms now.
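
To see the optional-quote handling at work, the sketch below extracts the label from a matrix list entry. `matrix_entry_label` is a hypothetical helper, and the capture group plus trailing-quote handling are additions for illustration; the PR only quotes the prefilter prefix.

```shell
# Print the label of a YAML list entry such as `- ubuntu-24.04`,
# `- "ubuntu-22.04"` or `- 'macos-15'`; print nothing for other lines.
matrix_entry_label() {
  printf '%s\n' "$1" | sed -n -E \
    "s/^[[:space:]]*-[[:space:]]+['\"]?([^'\"[:space:]]+)['\"]?[[:space:]]*\$/\1/p"
}

matrix_entry_label '  - "ubuntu-22.04"'   # prints ubuntu-22.04
matrix_entry_label "  - 'macos-15'"       # prints macos-15
```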

P2 (line 179) — trailing inline comments not stripped:
`runs-on: ubuntu-24.04 # was ubuntu-22.04` was falsely
flagging `ubuntu-22.04` in the trailing comment. Added a
second sed pass: `sed -E 's/[[:space:]]+#.*$//'` strips
everything after the first ` #` (YAML-spec comment-start
sentinel with required leading space). Conservative: doesn't
handle `#` inside quoted strings (rare in workflow YAML).
Smoke-tested: trailing comments correctly stripped.
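
Both comment filters quoted in these commit messages can be combined into one pre-filter. A minimal sketch, assuming a hypothetical wrapper name `strip_yaml_comments`; the two expressions themselves are the ones quoted above:

```shell
# Drop full-line comments, then strip trailing inline comments.
# A "#" without leading whitespace (e.g. inside `a#b`) is left alone.
strip_yaml_comments() {
  grep -vE '^[[:space:]]*#' "$1" | sed -E 's/[[:space:]]+#.*$//'
}
```

As the commit message notes, this stays conservative: a ` #` inside a quoted string would also be cut.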
Copilot AI review requested due to automatic review settings April 25, 2026 05:05
@AceHack AceHack merged commit d9f80b5 into main Apr 25, 2026
15 checks passed
@AceHack AceHack deleted the tools/lint-runner-version-freshness-otto-214 branch April 25, 2026 05:08

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 391b045ab4

Comment thread tools/lint/runner-version-freshness.sh
Comment thread tools/lint/runner-version-freshness.sh

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 6 comments.

Comment thread tools/lint/runner-version-freshness.sh
Comment thread tools/lint/runner-version-freshness.sh
Comment thread tools/lint/runner-version-freshness.sh
Comment thread tools/lint/runner-version-freshness.sh
Comment thread tools/lint/runner-version-freshness.sh
Comment thread tools/lint/runner-version-freshness.sh
AceHack added a commit that referenced this pull request Apr 25, 2026
…+ allow-list scope)

P0 (line 187) — set -e + grep -v abort:
`grep -vE ... | sed ...` aborts under set -euo pipefail
when grep -v outputs nothing (every line is a comment).
Added `|| true` to neutralise the exit 1.

P1 (line 195) — validate against allow-list, not stale-subset:
Old code only flagged labels in STALE_LABELS. A label like
`ubuntu-30.04` invented after this script's last refresh
would silently pass. Added a third scan: extract `runs-on:
<value>` and flag anything not in (ALLOWED ∪ ROLLING) and
not in expression form `${{ ... }}`. Distinct error class
'NOT-ON-ALLOW-LIST RUNNER LABEL'.
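
The allow-list check can be sketched as a small classifier. Assumptions: the list contents below are only the labels mentioned elsewhere in this thread, `runs_on_value` and `classify_runner` are hypothetical names, and exact string comparison is used here instead of the script's regex scan.

```shell
ALLOWED_LABELS="ubuntu-24.04 ubuntu-24.04-arm macos-26"       # illustrative subset
ROLLING_ALIASES="ubuntu-latest windows-latest macos-latest"

# Extract the value of a `runs-on:` line, without surrounding quotes.
runs_on_value() {
  printf '%s\n' "$1" | sed -n -E \
    "s/^[[:space:]]*runs-on:[[:space:]]*['\"]?([^'\"]*)['\"]?[[:space:]]*\$/\1/p"
}

# Print expression | allowed | rolling | unknown for one runner value.
# Only "unknown" would be reported as NOT-ON-ALLOW-LIST.
classify_runner() {
  local l
  case $1 in
    *'${{'*) echo expression; return 0 ;;
  esac
  for l in $ALLOWED_LABELS; do
    [ "$1" = "$l" ] && { echo allowed; return 0; }
  done
  for l in $ROLLING_ALIASES; do
    [ "$1" = "$l" ] && { echo rolling; return 0; }
  done
  echo unknown
}

classify_runner "$(runs_on_value '    runs-on: ubuntu-30.04')"   # prints unknown
```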

P1 (line 174) — env-error vs stale-label exit code:
Unreadable files set fail=1 → exit 2 (stale labels). That
mixed environment errors with content findings. Split into
`env_error` (exit 1) vs `fail` (exit 2). Header exit-code
contract updated.

P1 (line 142) — convention: cd to REPO_ROOT:
Other tools/lint/*.sh scripts establish REPO_ROOT and cd
there. Aligned: `git rev-parse --show-toplevel` + cd to
REPO_ROOT before file discovery.

P1 (line 9) — code comments explain code, not history:
Removed Otto-213 / Otto-214 lineage tokens from the header.
Replaced 'Otto-213 durable compounding-failure mitigation'
with the structural rationale (training-data version
numbers decay; structural lint enforces).

P1 (line 237) — exit-code contract reconciliation:
Earlier reviewer wanted warn-only path to exit 0 (so
freshness warning doesn't fail CI). This reviewer pointed
out the header still claimed exit 3 for warn. Updated
header explicitly: 'exit 0 even with freshness warning;
warning is on stderr, non-fatal'. Removed exit 3 from
exit-code contract; now only 0/1/2.
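
The reconciled 0/1/2 contract could be expressed as one decision at the end of the script. A sketch, using the flag names from these commit messages (`env_error`, `fail`, `warn`); the function name and the precedence of environment errors over content findings are assumptions the PR does not state.

```shell
# Map the three flags (each 0 or 1) to the final status:
#   1 = environment error, 2 = stale/forbidden label found, 0 = clean.
# A freshness warning goes to stderr and never changes the status.
final_status() {
  local env_error=$1 fail=$2 warn=$3
  if [ "$warn" = 1 ]; then
    echo "warning: allow-list older than 30 days, re-verify" >&2
  fi
  if [ "$env_error" = 1 ]; then return 1; fi
  if [ "$fail" = 1 ]; then return 2; fi
  return 0
}
```

The script would end with `final_status "$env_error" "$fail" "$warn"; exit $?`.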

P2 (line 188) — guard comment-stripping pipeline:
Same as P0 fix above (`|| true` neutralises grep -v's
no-match exit 1).

P2 (line 116) — UTC vs local-time TZ skew:
`now_epoch` was UTC but BSD `date -j -f` defaults to local
time. Forced `TZ=UTC` on both branches so age_days computes
in a single timezone (no ±1 day skew across DST/timezones).
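
The single-timezone age calculation might be sketched as follows. `to_epoch_utc` and `age_days` are hypothetical names; the GNU-then-BSD fallback order and the fixed midnight time are assumptions.

```shell
# Convert YYYY-MM-DD to epoch seconds at 00:00:00 UTC.
# Tries GNU date first, then falls back to BSD date (macOS).
to_epoch_utc() {
  if TZ=UTC date -d "$1 00:00:00" +%s 2>/dev/null; then
    return 0
  fi
  TZ=UTC date -j -f '%Y-%m-%d %H:%M:%S' "$1 00:00:00" +%s
}

# Whole days between date $1 and "now" ($2, epoch seconds; defaults to
# the current time). Both sides are epoch seconds, so no TZ skew remains.
age_days() {
  local past now
  past=$(to_epoch_utc "$1") || return 1
  now=${2:-$(date +%s)}
  echo $(( (now - past) / 86400 ))
}

age_days 2026-04-24   # days since the allow-list was last verified
```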
AceHack added a commit that referenced this pull request Apr 25, 2026
P0 (Copilot L265) / P1 (Codex L185) — `warn` unbound under `set -u`:
  Initialize `warn=0` alongside `fail=0` and `env_error=0`. Without this,
  `_verify_age_ok` returning success leaves `warn` unset; the final
  `[ "$warn" = "1" ]` check then aborts with "unbound variable" — turning
  passing-lint into env-error. Real regression.

P1 (Copilot L56) — `cd "$REPO_ROOT"` before consuming `$@`:
  Normalize CLI args to absolute paths BEFORE the chdir into REPO_ROOT, so
  paths given relative to the caller's cwd survive. Without this,
  `script.sh ./foo.yml` from outside REPO_ROOT errors after the cd.
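
  The ordering issue can be reproduced with a small sketch. `abs_path` and `lint_from_root` are hypothetical names, and the root directory is passed in explicitly here instead of being taken from `git rev-parse --show-toplevel`.

```shell
# Make $1 absolute relative to the caller's current directory.
abs_path() {
  case $1 in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s/%s\n' "$PWD" "$1" ;;
  esac
}

# Usage: lint_from_root ROOT FILE...
# Runs in a subshell, so the cd does not leak into the caller.
lint_from_root() (
  root=$1; shift
  # Rewrite every argument to an absolute path BEFORE changing directory.
  for arg in "$@"; do
    set -- "$@" "$(abs_path "$arg")"
    shift
  done
  cd "$root" || exit 1
  for f in "$@"; do
    [ -r "$f" ] || { echo "unreadable: $f" >&2; exit 1; }
    echo "would lint: $f"
  done
)
```

  If the `cd` ran first, `./foo.yml` would be resolved against the root directory and reported as unreadable.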

P1 (Copilot L213) — `|| true` masking real `sed` failures:
  Group only the `grep` with `|| true` (`{ grep -vE ... || true; } | sed`)
  so a real sed failure (missing tool / unsupported -E) still surfaces;
  before, `|| true` covered the whole `cmd1 | sed ...` pipeline and
  swallowed any failure, hiding legitimate environment errors.
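
  The difference between the two shapes shows up under `pipefail`. A bash sketch, in which `broken_sed` is a hypothetical stand-in for a sed stage that fails for a real reason:

```shell
# Stand-in for a failing sed stage (hypothetical helper).
broken_sed() { cat >/dev/null; return 3; }

# Old shape: `|| true` applies to the whole pipeline, so the stage
# failure disappears along with grep's harmless exit 1.
masked() (
  set -o pipefail
  printf '# only comments\n' | grep -vE '^[[:space:]]*#' | broken_sed || true
)

# New shape: only grep's "no lines selected" status is neutralised;
# the failing stage still determines the pipeline status.
grouped() (
  set -o pipefail
  { printf '# only comments\n' | grep -vE '^[[:space:]]*#' || true; } | broken_sed
)

masked;  echo "masked status:  $?"
grouped || echo "grouped status: $?"
```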

P1 (Codex L244 / Copilot L245) — escape ALLOWED_LABELS for ERE:
  Labels like `ubuntu-24.04` contain `.` (ERE wildcard); without escaping,
  typos like `ubuntu-24x04` would slip through as "allow-listed". Reuses
  `escape_for_regex` for ALLOWED + ROLLING lists.

P1 (Copilot L252) — extend NOT-ON-ALLOW-LIST scan to matrix entries:
  Previously skipped `runs-on: ${{ matrix.os }}` workflows entirely. Now
  validates matrix list entries against the allow-list using a
  line-number-aware exclude prefix `(^|^[0-9]+:)` (grep -n prepends
  `<linenum>:` which would otherwise break the `^` anchor in the exclude
  filter). Smoke-tested: matrix entries `macos-26` / `ubuntu-24.04` /
  `ubuntu-24.04-arm` correctly excluded as allow-listed; existing stale
  `ubuntu-22.04` findings unchanged.
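
  The line-number-aware prefix can be exercised on its own. A sketch, assuming the input contains only a matrix `os:` block, a pre-escaped `ALLOWED_RE` built from the labels named above, and a hypothetical function name:

```shell
ALLOWED_RE='ubuntu-24\.04|ubuntu-24\.04-arm|macos-26'   # illustrative subset

# Print "lineno:entry" for list entries that are NOT allow-listed.
# `grep -n` prefixes each line with "<linenum>:", so the exclude filter
# accepts that prefix via (^|^[0-9]+:) before the entry itself.
unlisted_matrix_entries() {
  grep -nE '^[[:space:]]*-[[:space:]]+' "$1" \
    | grep -vE "(^|^[0-9]+:)[[:space:]]*-[[:space:]]+['\"]?($ALLOWED_RE)['\"]?[[:space:]]*\$"
}
```

  With a plain `^[[:space:]]*-` anchor in the second grep, nothing would ever be excluded, because every numbered line starts with digits instead of whitespace.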

Verified: clean lint pass on `.github/workflows/codeql.yml`, stale-label
detection unchanged on `.github/workflows/gate.yml`, relative-path arg
from a subdirectory now resolves correctly.
AceHack added a commit that referenced this pull request Apr 25, 2026
…indings (#432)

* drain(#360 follow-up: 8 Codex P0/P1/P2 findings on shell portability + allow-list scope)

* drain(#360 follow-up): fix Codex P0/P1/P2 + Copilot findings

2 participants