diff --git a/.github/workflows/stryker-mutation.yml b/.github/workflows/stryker-mutation.yml new file mode 100644 index 000000000..09c5e51d4 --- /dev/null +++ b/.github/workflows/stryker-mutation.yml @@ -0,0 +1,95 @@ +name: stryker-mutation + +# Runs Stryker.NET mutation testing against `src/Core/Core.fsproj` +# using `tests/Tests.FSharp/Tests.FSharp.fsproj` as the kill-rate +# oracle. Per the math-proofs honest assessment +# (docs/research/2026-05-03-math-proofs-honest-assessment.md) the +# Stryker artifact is B3-grade today (config exists, runs locally, +# threshold-break at 50% wired) but was MISSING CI gating + a +# published kill-rate metric. This workflow closes the gap by: +# +# 1. Wiring Stryker into CI on src/Core/** path-filtered triggers +# so kill-rate signal lands on every behaviorally-relevant PR +# 2. Uploading the HTML report + json result as workflow artifacts +# so the kill-rate metric is verifiable from the run page +# 3. Applying the threshold-break gate (50%) as a workflow failure +# via Stryker's exit-code semantics +# +# Path-filtered + out-of-band from gate.yml because mutation testing +# is slow (15-30 min per run on the current Core.fsproj surface) and +# would block the existing fast PR-loop. Same pattern as +# lean-proof.yml — formal-verification-grade workflows run on their +# own cadence, not in the gate-blocking path. +# +# Composes with: +# - stryker-config.json (the artifact this runs against) +# - tools/setup/manifests/dotnet-tools (dotnet-stryker installed +# globally by tools/setup/install.sh) +# - docs/research/2026-05-03-math-proofs-honest-assessment.md +# (this workflow closes the B3 -> A upgrade with-CI line) +# +# Safe-pattern compliance (per FACTORY-HYGIENE row #43): +# - SHA-pinned actions/checkout + actions/upload-artifact +# - Explicit minimum permissions: contents: read +# - No user-authored context; purely repo-file reads +# - Concurrency group + pinned runs-on +# - Path-filter narrows the trigger surface +# +# Note: this workflow uses only github.run_id (immutable internal +# identifier) and github.ref (validated branch name) in run: / +# concurrency: contexts — no untrusted user-authored inputs (PR +# title, comment body, etc.) are interpolated. + +on: + pull_request: + paths: + - "src/Core/**" + - "tests/Tests.FSharp/**" + - "stryker-config.json" + - ".github/workflows/stryker-mutation.yml" + push: + branches: [main] + paths: + - "src/Core/**" + - "tests/Tests.FSharp/**" + - "stryker-config.json" + - ".github/workflows/stryker-mutation.yml" + workflow_dispatch: {} + +permissions: + contents: read + +env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + +concurrency: + group: stryker-mutation-${{ github.ref }} + cancel-in-progress: true + +jobs: + mutation-test: + name: Stryker.NET mutation test (Core.fsproj) + runs-on: ubuntu-24.04 + timeout-minutes: 60 + + steps: + - name: Checkout + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 + + - name: Install toolchain via three-way-parity script (GOVERNANCE §24) + run: ./tools/setup/install.sh + + - name: Restore + build (Release) + run: dotnet build Zeta.sln -c Release + + - name: Run Stryker + run: dotnet stryker + + - name: Upload Stryker reports + if: always() + uses: actions/upload-artifact@9eaf0eba75d52b5e72c7a193fc2887e6caf95df0 # v5.1.0 + with: + name: stryker-reports-${{ github.run_id }} + path: StrykerOutput/ + if-no-files-found: warn + retention-days: 90 diff --git a/docs/research/2026-05-03-math-proofs-honest-assessment.md b/docs/research/2026-05-03-math-proofs-honest-assessment.md index 726a03f69..10e3155e7 100644 --- a/docs/research/2026-05-03-math-proofs-honest-assessment.md +++ b/docs/research/2026-05-03-math-proofs-honest-assessment.md @@ -288,7 +288,7 @@ to publishable form: | Work item | Grade upgrade | Effort | Priority | Status | |---|---|---|---|---| | Lean lake-build CI job | A1, A2 → A-with-CI | 1 day | P0 | **Done (PR #1394, 2026-05-03 — `.github/workflows/lean-proof.yml` shipped; runs on `tools/lean4/**` changes; `lake exe cache get` for Mathlib oleans + `lake env lean` type-check)** | -| Stryker CI + kill-rate publish | B3 → A | 1 day | P0 | **Partial (PR #1395 fixed stale `stryker-config.json` paths; CI workflow design + kill-rate publication target deferred to follow-up — substantial-design item)** | +| Stryker CI + kill-rate publish | B3 → A | 1 day | P0 | **Done (PR #1395 fixed `stryker-config.json` paths; this PR adds `.github/workflows/stryker-mutation.yml` with src/Core/** path-filter trigger, threshold-break gate at 50%, and HTML+json reports uploaded as 90-day artifacts — kill-rate metric verifiable from every CI run page)** | | Semgrep CI | B4 → A | 0.5 day | P1 | **Already done (verify-then-claim correction; see B4 section)** | | 4 deferred TLA+ specs into CI | B1 → A | 2 days | P1 | **Done (4 of 4): DbspSpec ✓ #1397; CircuitRegistration ✓ #1401 (B-0180 Safety operator); SpineAsyncProtocol ✓ #1411 (B-0179 CHECK_DEADLOCK FALSE for bounded-protocol terminal state); SpineMergeInvariants ✓ this PR (B-0181 Cascade downstream-room precondition + state constraint + WF on Cascade)** | | Alloy CI hook | B2 → A | 0.5 day | P1 | **Done (PR #1396, 2026-05-03 — `tests/Tests.FSharp/Formal/Alloy.Runner.Tests.fs` spec-path fixed; Spine + InfoTheoreticSharder specs now actually validate, not silent-no-op)** | @@ -303,7 +303,7 @@ to publishable form: | 3 broken-spec follow-ups (B-0179 + B-0180 + B-0181) | spec-fix tracking | varies | P2/P1 | **All 3 closed: B-0180 ✓ #1401 (Safety operator); B-0179 ✓ #1411 (CHECK_DEADLOCK FALSE); B-0181 ✓ this PR (Cascade downstream-room precondition; promoted P2→P1 — same author-time class as B-0184 Spine.als under-specified action)** | | `.ts/.sh` parity bug in `tools/backlog/generate-index.ts` | substrate-tooling-debt | <1 day | P2 | **Done (PR #1400, 2026-05-03 — both generators now byte-identical)** | -**Net P0 progress:** Lean CI ✓, A4 registry rows ✓, peer-review email ✓ (3 of 3 P0 items closed). Stryker B3 partial (config-fix done; CI wire deferred to follow-up substantial-design). All P0 items have movement. +**Net P0 progress:** Lean CI ✓, A4 registry rows ✓, peer-review email ✓, Stryker B3 ✓ (4 of 4 P0 items closed). **Net P1 progress:** Alloy B2 ✓, Semgrep B4 ✓ (correction), B1 ✓ (all 4 deferred TLA+ specs in CI). WDC protocol spec + Recursive.fs LFP spec remain open (substantial multi-day each). diff --git a/docs/research/proof-tool-coverage.md b/docs/research/proof-tool-coverage.md index a42a055ae..5b7f3535f 100644 --- a/docs/research/proof-tool-coverage.md +++ b/docs/research/proof-tool-coverage.md @@ -206,7 +206,7 @@ modes, chain rule pipeline. | **Eldarica / Spacer (Horn)** | Recursive-program invariant synthesis — could auto-derive loop invariants for `BalancedSpine.compactLevel` | | **Liquid Haskell / LiquidF#** | ~~Refinement types inline in F# — catches `arr.[i]` out-of-bounds *at compile time* over the whole codebase~~ **Round-35 Hold: tool dormant.** No currently-maintained F#-native refinement checker; F7 (the Microsoft Research ancestor) last shipped 2012. See `docs/research/liquidfsharp-findings.md`. Successor path: F\* extraction to F# (Assess, TECH-RADAR round 35). | | **Hypothesis-style coverage-guided fuzz** | Deeper counter-example minimisation than FsCheck's generic shrinker; catches concurrency bugs via state-space exploration | -| **Mutation testing (Stryker)** | Already configured via `stryker-config.json`, but **not yet run in CI** and no coverage target published — unknown whether our 471 tests survive a realistic mutant kill rate | +| **Mutation testing (Stryker)** | Configured via `stryker-config.json` and run in CI via `.github/workflows/stryker-mutation.yml` — path-filtered to `src/Core/**` + `tests/Tests.FSharp/**`; threshold-break at 50% gates the workflow; HTML + json reports uploaded as 90-day artifacts on every run. Kill-rate trend observable from the workflow run page. | | **CodeQL** | Data-flow / taint analysis (untrusted `File.ReadAllBytes` → parser) — config deferred, listed as P0 in `BACKLOG.md` | | **Semgrep** | Pattern-level anti-patterns; 12 rules shipped but only run externally (`TECH-RADAR.md` says "runs externally"); **not part of CI** |