Skip to content

ci(lean-proof): add CI gate for DbspChainRule.lean type-check (closes A1+A2 → A-with-CI)#1394

Merged
AceHack merged 2 commits intomainfrom
ci/lean-proof-build-job-2026-05-03
May 3, 2026
Merged

ci(lean-proof): add CI gate for DbspChainRule.lean type-check (closes A1+A2 → A-with-CI)#1394
AceHack merged 2 commits intomainfrom
ci/lean-proof-build-job-2026-05-03

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 3, 2026

Summary

P0 outstanding item from #1383 math-proofs honest assessment. Adds .github/workflows/lean-proof.yml to type-check the Lean 4 / Mathlib formalization (tools/lean4/Lean4/DbspChainRule.lean) on every relevant PR + push to main.

What this closes

  • A1 (chain_rule_proposition_3_2): was A-grade hand-run; now A-grade-with-CI
  • A2 (Dop_LTI_commute): was A-grade hand-run; now A-grade-with-CI

Peer reviewers expect "runs in CI" as the line for an A-grade artifact. This workflow is the gate.

Workflow design

  • Path-filter at trigger: only runs on tools/lean4/** changes (out-of-band from gate.yml because Mathlib cache is multi-GB)
  • Caches tools/lean4/.lake keyed on lean-toolchain + lakefile.toml + lake-manifest.json
  • Caches ~/.elan with same key shape as gate.yml's elan cache
  • lake exe cache get downloads Mathlib pre-built oleans from S3 (skips multi-hour build)
  • lake env lean Lean4/DbspChainRule.lean type-checks the proof
  • Timeout 30 min (cold-cache headroom; warm cache runs ~2-5 min)

Verified locally: lake env lean returns silent + exit 0 on current proof.

Two corrections to #1383 (verify-then-claim sweep)

  1. Semgrep is already in CI — original B4 grading was wrong. Verified gate.yml line ~422 has lint (semgrep) running with --error --metrics=off. Effective grade upgraded to A; outstanding-work item closed.

  2. Outstanding-work matrix updated with Status column: "Done" (Semgrep + A4 registry rows docs(research): registry rows for the 5 in-CI TLA+ specs (closes A4 external-citation-fidelity gap) #1393 + peer-review email docs(research): peer-review outreach email template + recipient ranking 2026-05-03 #1387), "In flight" (Lean CI — this PR), "open" (Stryker + B1 + B2 + C-grade).

Test plan

  • §33 archive-header lint passes
  • Workflow YAML safe-pattern compliant per FACTORY-HYGIENE row deps: bump actions/setup-node from 6.0.0 to 6.4.0 #43 (SHA-pinned actions, contents:read perms, concurrency group, pinned runs-on, path-filter)
  • lake env lean verified locally (silent + exit 0)
  • CI runs the workflow on this PR (the workflow's own first fire)

🤖 Generated with Claude Code

… A1+A2 → A-with-CI)

Per the math-proofs honest assessment (#1383) outstanding-work
matrix, the Lean proof of DBSP chain rule (Prop 3.2 +
Dop_LTI_commute) was A-grade today hand-run but missing CI
gating. Peer reviewers expect "runs in CI" as the line for an
A-grade artifact.

This PR ships `.github/workflows/lean-proof.yml`:

- Path-filter: only runs on `tools/lean4/**` changes (or workflow
  itself). Out-of-band from gate.yml because Mathlib's lake cache
  is multi-GB and Lean toolchain setup is heavier than the
  existing build-and-test surface.
- Caches `tools/lean4/.lake` keyed on lean-toolchain +
  lakefile.toml + lake-manifest.json (busts cleanly on either
  change).
- Caches `~/.elan` with same key shape as gate.yml's elan cache
  (shared cache space).
- Uses `lake exe cache get` to download Mathlib pre-built oleans
  from S3 — skips the multi-hour Mathlib build.
- Type-checks via `lake env lean Lean4/DbspChainRule.lean`.

Verified locally: `lake env lean` returns silent + exit 0
(success) on the current proof.

Two corrections to #1383 captured in same PR:

1. **Semgrep is already in CI** — original B4 grading was wrong.
   Verified by grep: `.github/workflows/gate.yml` line ~422
   has `lint (semgrep)` job running
   `semgrep --config .semgrep.yml --error --metrics=off`.
   Effective grade upgraded to A; outstanding-work item closed.
   This illustrates the verify-then-claim discipline.

2. **Outstanding-work matrix updated** with Status column
   tracking each item: "Done" (Semgrep, A4 registry rows
   #1393, peer-review email #1387), "In flight" (Lean CI —
   this PR), "open" (Stryker, B1, B2, C-grade items).

Composes with verification-registry.md (rows for both A1+A2
benefit) + the verification-drift-auditor skill (every CI fire
is implicit drift-check) + #1383 (the assessment doc this
upgrades).

Workflow safe-pattern compliant per FACTORY-HYGIENE row #43:
SHA-pinned actions, contents:read permissions, concurrency
group, pinned runs-on, path-filter narrows trigger.
Copilot AI review requested due to automatic review settings May 3, 2026 12:18
@AceHack AceHack enabled auto-merge (squash) May 3, 2026 12:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a dedicated GitHub Actions workflow to type-check the Lean 4 formalization under tools/lean4, moving the chain-rule proof artifacts from hand-run verification toward the repository’s CI-backed verification posture. It also updates the math-proofs assessment document to reflect the corrected Semgrep status and the current outstanding-work matrix.

Changes:

  • Adds .github/workflows/lean-proof.yml to install the shared toolchain, warm Lean/Mathlib caches, and run lake env lean for DbspChainRule.lean.
  • Updates the research assessment to reclassify Semgrep as already CI-gated and to add status values to the outstanding-work matrix.
  • Records related status changes for verification follow-up items tied to recent PRs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
.github/workflows/lean-proof.yml New path-filtered workflow for Lean proof type-checking with cache setup and shared installer usage.
docs/research/2026-05-03-math-proofs-honest-assessment.md Updates Semgrep grading and revises the outstanding-work matrix with status annotations.

Comment thread .github/workflows/lean-proof.yml
Comment thread docs/research/2026-05-03-math-proofs-honest-assessment.md Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5cc5add60

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/lean-proof.yml
…pe outstanding-work row

Three review threads on #1394 addressed:

1. + 3. **GITHUB_TOKEN missing**: mise's aqua: backend uses the
   GitHub API for release-tag lookups during install.sh; without
   GITHUB_TOKEN it hits the unauthenticated 60/hr rate limit and
   fails. Added workflow-level env: with secrets.GITHUB_TOKEN —
   same pattern as gate.yml. Inherits contents:read, no
   escalation.

2. **Outstanding-work row duplicated**: the Peer-review email
   draft row was inserted twice (Done + empty status). Removed
   the empty-status duplicate; single Done row remains.
@AceHack AceHack merged commit 3966f7e into main May 3, 2026
26 checks passed
@AceHack AceHack deleted the ci/lean-proof-build-job-2026-05-03 branch May 3, 2026 12:26
AceHack added a commit that referenced this pull request May 3, 2026
…6-05-03 EOD progress (#1402)

Reflects substantive progress this session across the math-proofs
honest assessment matrix. Key state changes:

**P0 items — 3 of 3 closed:**
- Lean lake-build CI job ✓ (#1394)
- A4 registry rows ✓ (#1393)
- Peer-review email draft ✓ (#1387)
- Stryker B3 → partial (config-fix #1395; CI wire deferred to
  follow-up substantial-design)

**P1 items — significant progress:**
- Alloy B2 → A ✓ (#1396 — silent-no-op was the failure mode;
  spec-path fixed)
- Semgrep B4 → A ✓ (verify-then-claim correction; was already
  in CI)
- B1 4 deferred specs → 2 of 4 done:
  - DbspSpec ✓ #1397 (1M states / 11s)
  - CircuitRegistration ✓ #1401 (B-0180 closed; 3538 states / <1s)
  - SpineAsyncProtocol B-0179 still open (counterexample inv.)
  - SpineMergeInvariants B-0181 still open (counterexample inv.)

**Sibling work tracked:**
- Phase 0 substrate-discovery PoC ✓ (#1392 — 4.0 MB AOT binary
  on osx-arm64; cross-platform CI matrix)
- 3 broken-spec backlog rows filed (#1398 → B-0179 + B-0180 +
  B-0181); B-0180 closed (#1401)
- `.ts/.sh` parity bug in `tools/backlog/generate-index.ts`
  closed ✓ (#1400 — both generators byte-identical)

This update is bounded substrate work documenting the actual
state of the matrix; doesn't add new work, just captures
completion. Future matrix re-grades happen as work-items land
(per the assessment doc's audit-trail discipline).

Composes with #1383 (the original assessment) + every PR
referenced above.

§33 archive-header lint passes.
AceHack added a commit that referenced this pull request May 3, 2026
…ster + cache-clobber discipline encoded (#1408)

Substantial multi-tick session shard. 18 PRs touched (#1383 + #1387
+ #1392-#1407 inclusive); 14 merged + 4 in-flight as of shard time.

**Math-proofs assessment progress** (#1383 outstanding-work matrix):
- A1+A2 → A-with-CI ✓ (#1394 Lean lake-build workflow)
- A4 registry rows ✓ (#1393)
- B1 → 2 of 4 deferred specs in CI ✓ (#1397 DbspSpec + #1401
  CircuitRegistration B-0180 closed)
- B2 Alloy → A ✓ (#1396 silent-no-op spec-path fix)
- B4 Semgrep → A ✓ (correction)
- Peer-review email template ✓ (#1387)
- Phase 0 substrate-discovery PoC ✓ (#1392)
- Stryker config-fix ✓ (#1395; CI wire deferred)
- 3 broken-spec backlog rows filed ✓ (#1398)

**Cache-clobber silent-bug class discovered + fully encoded:**
B-0180 fix passing locally + failing CI → verify-then-claim
identified gate.yml + low-memory.yml caching whole tools/tla and
tools/alloy directories. Fix cluster: #1403 (gate.yml) + #1404
(low-memory.yml + audit-ci-cache-paths.ts) + #1406 (CI lint gate)
+ #1407 (memory file + bug-locus disambiguation per Aaron's 'real
github bug?' question — answer: usage-bug, not tool-bug).

**Other substrate work:** #1399 BACKLOG.md regen, #1400 .ts/.sh
parity bug, #1402 assessment matrix doc update, #1405 B-0182
backlog row (Linux-only formal verification — orthogonal-axes
split per Aaron 2026-05-03).

**Discipline lessons captured:** chat-is-assertion-channel,
substrate-corrections-cluster, search-first-before-architectural-
expansion, verify-then-claim CI fidelity, documentation-is-
current-state-not-history.

Carved sentence: 'When a lucky catch surfaces a class of bug,
build the structural fix that eliminates the luck — audit + lint
gate + carved-sentence rule + memory file.'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants