Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b9da6fe814
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Updates tools/lint/runner-version-freshness.sh to tighten and clarify runner-label validation for GitHub Actions workflows, improving portability and distinguishing stale-label findings from environment/usage errors.
Changes:
- Refines allow-list-based validation (including a new NOT-ON-ALLOW-LIST scan) and updates exit-code semantics/documentation.
- Improves shell-portability/robustness (UTC handling for
date, safer comment-stripping underset -euo pipefail). - Establishes a repo-root working directory invariant before default workflow discovery.
…+ allow-list scope)
P0 (line 187) — set -e + grep -v abort:
`grep -vE ... | sed ...` aborts under set -euo pipefail
when grep -v outputs nothing (every line is a comment).
Added `|| true` to neutralise the exit 1.
P1 (line 195) — validate against allow-list, not stale-subset:
Old code only flagged labels in STALE_LABELS. A label like
`ubuntu-30.04` invented after this script's last refresh
would silently pass. Added a third scan: extract `runs-on:
<value>` and flag anything not in (ALLOWED ∪ ROLLING) and
not in expression form `${{ ... }}`. Distinct error class
'NOT-ON-ALLOW-LIST RUNNER LABEL'.
P1 (line 174) — env-error vs stale-label exit code:
Unreadable files set fail=1 → exit 2 (stale labels). That
mixed environment errors with content findings. Split into
`env_error` (exit 1) vs `fail` (exit 2). Header exit-code
contract updated.
P1 (line 142) — convention: cd to REPO_ROOT:
Other tools/lint/*.sh scripts establish REPO_ROOT and cd
there. Aligned: `git rev-parse --show-toplevel` + cd to
REPO_ROOT before file discovery.
P1 (line 9) — code comments explain code, not history:
Removed Otto-213 / Otto-214 lineage tokens from the header.
Replaced 'Otto-213 durable compounding-failure mitigation'
with the structural rationale (training-data version
numbers decay; structural lint enforces).
P1 (line 237) — exit-code contract reconciliation:
Earlier reviewer wanted warn-only path to exit 0 (so
freshness warning doesn't fail CI). This reviewer pointed
out the header still claimed exit 3 for warn. Updated
header explicitly: 'exit 0 even with freshness warning;
warning is on stderr, non-fatal'. Removed exit 3 from
exit-code contract; now only 0/1/2.
P2 (line 188) — guard comment-stripping pipeline:
Same as P0 fix above (`|| true` neutralises grep -v's
no-match exit 1).
P2 (line 116) — UTC vs local-time TZ skew:
`now_epoch` was UTC but BSD `date -j -f` defaults to local
time. Forced `TZ=UTC` on both branches so age_days computes
in a single timezone (no ±1 day skew across DST/timezones).
P0 (Copilot L265) / P1 (Codex L185) — `warn` unbound under `set -u`:
Initialize `warn=0` alongside `fail=0` and `env_error=0`. Without this,
`_verify_age_ok` returning success leaves `warn` unset; the final
`[ "$warn" = "1" ]` check then aborts with "unbound variable" — turning
passing-lint into env-error. Real regression.
P1 (Copilot L56) — `cd "$REPO_ROOT"` before consuming `$@`:
Normalize CLI args to absolute paths BEFORE the chdir into REPO_ROOT, so
paths given relative to the caller's cwd survive. Without this,
`script.sh ./foo.yml` from outside REPO_ROOT errors after the cd.
P1 (Copilot L213) — `|| true` masking real `sed` failures:
Group only the `grep` with `|| true` (`{ grep -vE ... || true; } | sed`)
so a real sed failure (missing tool / unsupported -E) still surfaces;
before, pipefail propagated through `cmd1 | sed ... || true`, hiding
legitimate environment errors.
P1 (Codex L244 / Copilot L245) — escape ALLOWED_LABELS for ERE:
Labels like `ubuntu-24.04` contain `.` (ERE wildcard); without escaping,
typos like `ubuntu-24x04` would slip through as "allow-listed". Reuses
`escape_for_regex` for ALLOWED + ROLLING lists.
P1 (Copilot L252) — extend NOT-ON-ALLOW-LIST scan to matrix entries:
Previously skipped `runs-on: ${{ matrix.os }}` workflows entirely. Now
validates matrix list entries against the allow-list using a
line-number-aware exclude prefix `(^|^[0-9]+:)` (grep -n prepends
`<linenum>:` which would otherwise break the `^` anchor in the exclude
filter). Smoke-tested: matrix entries `macos-26` / `ubuntu-24.04` /
`ubuntu-24.04-arm` correctly excluded as allow-listed; existing stale
`ubuntu-22.04` findings unchanged.
Verified: clean lint pass on `.github/workflows/codeql.yml`, stale-label
detection unchanged on `.github/workflows/gate.yml`, relative-path arg
from a subdirectory now resolves correctly.
b9da6fe to
98ce441
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98ce4412cd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| | grep -nE "${matrix_prefix}[A-Za-z][A-Za-z0-9._-]*[0-9][A-Za-z0-9._-]*" \ | ||
| | grep -vE "${matrix_prefix_ln}(${rolling_or_allowed})(['\"]?)([[:space:]]|$|#)" \ | ||
| | grep -vE "${matrix_prefix_ln}(${stale_pattern})" \ |
There was a problem hiding this comment.
Scope matrix allow-list scan to runner labels only
The new matrix_unknown matcher currently inspects any list item that starts with a letter and contains a digit, regardless of whether it is a runner label, and then treats matches as fatal (fail=1). In practice this flags common non-runner matrices (for example matrix.dotnet: [dotnet-9.0]) as NOT-ON-ALLOW-LIST and exits 2 even when runs-on is valid, so legitimate workflows can start failing CI after this change. Restricting this path to runner-related matrix fields (or making it truly advisory) would avoid these false positives.
Useful? React with 👍 / 👎.
| | grep -nE 'runs-on:[[:space:]]*[A-Za-z0-9_-][A-Za-z0-9._-]*' \ | ||
| | grep -vE 'runs-on:[[:space:]]*\$\{\{' \ | ||
| | grep -vE "runs-on:[[:space:]]*(${rolling_or_allowed})($|[[:space:]]|#)" \ |
There was a problem hiding this comment.
Include quoted
runs-on scalars in allow-list validation
The scalar_unknown regex only matches unquoted scalar values after runs-on:, so quoted values are skipped by the new not-on-allow-list check. As a result, a workflow using runs-on: "ubuntu-30.04" passes this guard unless the label is already in STALE_LABELS, which defeats the purpose of catching unknown future pins. Supporting an optional quote in this matcher would close that bypass.
Useful? React with 👍 / 👎.
…ll portability) (#439) Otto-268 backfill task: drain-log for PR #432 covering 7 substantive shell-portability findings — including a P0 regression (`warn` unbound under `set -u`) caught independently by both Codex and Copilot. Per Otto-250 training-signal discipline. Pattern observations capture four load-bearing patterns: cross-reviewer convergence as a quality signal; `set -euo pipefail` interaction with shell quirks as a recurring class (3 of 7 findings); ERE-metachar escape uniformity across label arrays; grep -n line-number prefix breaking `^` anchors in subsequent greps as a reusable pattern.
…reword) Otto-268 backfill: drain-log for PR #435 (drain follow-up to #148: why-the-factory-is-different live-lock cadence claim + grammar), covering 3 threads across 2 waves with a clean self-induced-cascade pattern. Per Otto-250 training-signal discipline. Pattern observations capture four load-bearing patterns: 1. Cross-reviewer convergence on Wave 1 (Codex P2 + Copilot P1 flagging the same missing-FACTORY-HYGIENE-row) raised quality signal — same shape as #432's `warn` unbound finding. 2. Self-induced cascade: my Wave-1 fix introduced the Wave-2 finding (claim "separate BACKLOG items" implied plural; actual BACKLOG state is one row with multiple sub-items). Pattern: when fixing a claim, verify the new claim is also accurate against current-state. 3. Reword-option-(a)-vs-(b) decision template generalizes: when doc asserts X but X doesn't exist, prefer reword-to-current-truth over add-the-thing-asserted (unless thing is small + isolated). 4. PR-mechanics: 4 of 7 cascade-PRs in this session (#135, #231, #432, #435) went through wave-1 + wave-2 cascade pattern; the reviewer-cascade is a consistent property of the merge-trigger surface, not a per-PR oddity. Closes the session-drain-log backfill (Otto-268) for the major PRs drained in this session: #135 / #235 / #432 / #434 / #195 / #219 / #206 / #377 / #231 / #85 / #435 (11 PRs total covered across drain logs #437-#447).
…reword) (#447) * hygiene(#268): pr-preservation drain-log for #435 (live-lock cadence reword) Otto-268 backfill: drain-log for PR #435 (drain follow-up to #148: why-the-factory-is-different live-lock cadence claim + grammar), covering 3 threads across 2 waves with a clean self-induced-cascade pattern. Per Otto-250 training-signal discipline. Pattern observations capture four load-bearing patterns: 1. Cross-reviewer convergence on Wave 1 (Codex P2 + Copilot P1 flagging the same missing-FACTORY-HYGIENE-row) raised quality signal — same shape as #432's `warn` unbound finding. 2. Self-induced cascade: my Wave-1 fix introduced the Wave-2 finding (claim "separate BACKLOG items" implied plural; actual BACKLOG state is one row with multiple sub-items). Pattern: when fixing a claim, verify the new claim is also accurate against current-state. 3. Reword-option-(a)-vs-(b) decision template generalizes: when doc asserts X but X doesn't exist, prefer reword-to-current-truth over add-the-thing-asserted (unless thing is small + isolated). 4. PR-mechanics: 4 of 7 cascade-PRs in this session (#135, #231, #432, #435) went through wave-1 + wave-2 cascade pattern; the reviewer-cascade is a consistent property of the merge-trigger surface, not a per-PR oddity. Closes the session-drain-log backfill (Otto-268) for the major PRs drained in this session: #135 / #235 / #432 / #434 / #195 / #219 / #206 / #377 / #231 / #85 / #435 (11 PRs total covered across drain logs #437-#447). * drain(#447 follow-up): fix #435 drain-log Reviewer field + stable-identifier xref Codex P2 + Copilot threads on #447 caught: - Thread 1.2 missing the `Reviewer:` field even though the drain-log schema (intro paragraph) declares per-thread reviewer authorship. Added `Reviewer: copilot-pull-request-reviewer`. - Stale `docs/BACKLOG.md lines 1313-1328` citation: those lines now contain the Server Meshing section; the live-lock-smell cadence row drifted to ~L1452 in the P1 tooling section. Replaced with the stable identifier (heading text 'Live-lock smell cadence (round 44 auto-loop-46 absorb, landed as `tools/audit/ live-lock-audit.sh` + hygiene-history log)') so future readers don't chase a moving line-number target. Same stable-identifier-vs-line-number-xref pattern flagged on #423's `near line 4167` finding. Documented in `_patterns.md` — line numbers decay on every adjacent edit; stable identifiers decay only on rename. Adopting heading text as the stable cite. The bare `:111`/`:113` thread location format (Otto-250 file:line shape conformance) is the broader Otto-268-wave divergence documented in PR #467 known-divergence section — deferred to maintainer review per that framing.
Codex post-merge review on #360 surfaced 8 findings (P0 + 5×P1 + 2×P2). Addressed in this follow-up.
TZ=UTCon BSD date branchSmoke-test confirmed: gate.yml's ubuntu-22.04 pins are now flagged twice (STALE + NOT-ON-ALLOW-LIST), exit 2.