Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 12 additions & 5 deletions docs/research/2026-05-03-math-proofs-honest-assessment.md
Original file line number Diff line number Diff line change
Expand Up @@ -283,22 +283,29 @@ to publishable form:

---

## Outstanding-work matrix (post-this-assessment, with status updates)
## Outstanding-work matrix (post-this-assessment, with status updates 2026-05-03 EOD)

| Work item | Grade upgrade | Effort | Priority | Status |
|---|---|---|---|---|
| Lean lake-build CI job | A1, A2 → A-with-CI | 1 day | P0 | **In flight (this PR ships `.github/workflows/lean-proof.yml`)** |
| Stryker CI + kill-rate publish | B3 → A | 1 day | P0 | open |
| Lean lake-build CI job | A1, A2 → A-with-CI | 1 day | P0 | **Done (PR #1394, 2026-05-03 — `.github/workflows/lean-proof.yml` shipped; runs on `tools/lean4/**` changes; `lake exe cache get` for Mathlib oleans + `lake env lean` type-check)** |
| Stryker CI + kill-rate publish | B3 → A | 1 day | P0 | **Partial (PR #1395 fixed stale `stryker-config.json` paths; CI workflow design + kill-rate publication target deferred to follow-up — substantial-design item)** |
| Semgrep CI | B4 → A | 0.5 day | P1 | **Already done (verify-then-claim correction; see B4 section)** |
| 4 deferred TLA+ specs into CI | B1 → A | 2 days | P1 | open |
| Alloy CI hook | B2 → A | 0.5 day | P1 | open |
| 4 deferred TLA+ specs into CI | B1 → A | 2 days | P1 | **Partial (2 of 4 done): DbspSpec ✓ #1397; CircuitRegistration ✓ #1401 (B-0180 Safety invariant defined). Remaining: B-0179 SpineAsyncProtocol counterexample + B-0181 SpineMergeInvariants counterexample — both need TTrace investigation (substantial)** |
| Alloy CI hook | B2 → A | 0.5 day | P1 | **Done (PR #1396, 2026-05-03 — `tests/Tests.FSharp/Formal/Alloy.Runner.Tests.fs` spec-path fixed; Spine + InfoTheoreticSharder specs now actually validate, not silent-no-op)** |
| `.cfg` for 4 C1 specs | C1 → B | 2 days | P2 | open |
| TLA+ spec for `Recursive.fs` LFP | C2 → B | 2-3 days | P2 | open |
| TLA+ spec for WDC protocol | C2 → B | 3-5 days | P1 | open |
| 15 FsCheck properties (C3) | C3 → B | 3 days | P2 | open |
| `chain_rule_poly` (3-group) | C4 → A | research | P3 | open |
| Registry rows for A4 specs | external-fidelity claim | 1 day | P0 | **Done (PR #1393, 2026-05-03)** |
| Peer-review email draft | publishability | 2 hours | P0 | **Done (PR #1387, 2026-05-03)** |
| Phase 0 PoC (substrate-discovery toolchain) | substrate-discovery | 2-3 ticks | (sibling) | **Done (PR #1392, 2026-05-03 — 4.0 MB single AOT binary on osx-arm64; full IVM tick path validates; cross-platform CI matrix runs)** |
| 3 broken-spec follow-ups (B-0179 + B-0180 + B-0181) | spec-fix tracking | varies | P2 | **Filed (PR #1398, 2026-05-03); B-0180 closed (PR #1401); B-0179 + B-0181 remain open** |
| `.ts/.sh` parity bug in `tools/backlog/generate-index.ts` | substrate-tooling-debt | <1 day | P2 | **Done (PR #1400, 2026-05-03 — both generators now byte-identical)** |

**Net P0 progress:** Lean CI ✓, A4 registry rows ✓, peer-review email ✓ (3 of 3 P0 items closed). Stryker B3 partial (config-fix done; CI wire deferred to follow-up substantial-design). All P0 items have movement.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fix incorrect P0 closure count in progress summary

The summary says "3 of 3 P0 items closed," but this table now has four P0 rows (Lean lake-build CI job, Stryker CI + kill-rate publish, Registry rows for A4 specs, Peer-review email draft) and one of them is explicitly still partial/open (Stryker). This makes the status math internally inconsistent and can mislead planning decisions based on this matrix; please either report 3 of 4 or clarify why Stryker is excluded from the denominator.

Useful? React with 👍 / 👎.


**Net P1 progress:** Alloy B2 ✓, Semgrep B4 ✓ (correction). B1 4 deferred specs at 2/4 done (CircuitRegistration + DbspSpec); the other 2 need counterexample investigation. WDC protocol spec + Recursive.fs LFP spec remain open (substantial multi-day each).
Comment on lines +306 to +308

---

Expand Down
Loading