7 changes: 7 additions & 0 deletions tools/substrate-claim-checker/README.md
@@ -97,6 +97,13 @@ specifically; instances marked count-drift in the memo's
"Recurring sub-classes" section are this tool's primary regression
suite.

Frozen on-disk fixtures live in `fixtures/`. Each fixture is a minimal
markdown file reproducing one historical drift instance, paired with
a regression test in `fixtures.test.ts`. The seed fixture is a single
count-drift case from PR #1259; additional sub-classes land
incrementally per B-0170.4. See `fixtures/README.md` for the index
and the "adding a new fixture" procedure.

## Hooks integration (planned, not v0)

Per the verify-then-claim memo's mechanization-path section:
37 changes: 37 additions & 0 deletions tools/substrate-claim-checker/fixtures.test.ts
@@ -0,0 +1,37 @@
/**
 * Eval-set fixture regression tests for substrate-claim-checker (B-0170).
 *
 * Each test runs an existing `check-*.ts` against a frozen historical
 * drift fixture under `fixtures/` and asserts that the checker still
 * detects the empirical drift that prompted the fixture's capture.
 *
 * Run with `bun test tools/substrate-claim-checker/fixtures.test.ts`.
 *
 * Adding fixtures: see `fixtures/README.md`.
 */

import { describe, expect, test } from "bun:test";
import { join } from "node:path";
import { checkFile as checkCounts } from "./check-counts.ts";

const fixtures = join(import.meta.dir, "fixtures");

describe("eval-set fixtures / count drift", () => {
  test("count-drift-9-vs-15.md — claim '9 drift instances' vs 15-row table is detected", () => {
    const result = checkCounts(join(fixtures, "count-drift-9-vs-15.md"));
    expect(result.ok).toBe(true);
    // Per PR #3611 review threads (chatgpt-codex-connector + copilot):
    // assert exact finding count and pin the body claim's line so a
    // regression in body-claim detection cannot be masked by an
    // HTML-comment match. The fixture's provenance comment is
    // intentionally worded to not restate the `<number> <noun>` pair
    // from the body claim.
    expect(result.findings.length).toBe(1);
    const finding = result.findings[0]!;
    expect(finding.line).toBe(24);
    expect(finding.claimedCount).toBe(9);
    expect(finding.actualCount).toBe(15);
    expect(finding.claim).toContain("drift instances");
    expect(finding.claimIsMinimum).toBe(false);
  });
});
46 changes: 46 additions & 0 deletions tools/substrate-claim-checker/fixtures/README.md
@@ -0,0 +1,46 @@
# substrate-claim-checker eval-set fixtures

Frozen historical drift instances from the verify-then-claim discipline
memo's body table — the canonical eval-set for B-0170.

Each fixture is a small, self-contained markdown file that demonstrates
ONE drift sub-class as it actually surfaced in shipped substrate. The
fixtures are NOT pristine examples; each preserves enough of the
original PR's substance that the checker's behaviour against the
historical case stays reproducible as regression coverage.

## Why on-disk fixtures (not inline test strings)

Inline test strings in `*.test.ts` files cover the synthetic-case axis
(does the checker work on toy inputs?). Frozen fixtures cover the
empirical-case axis (does the checker still catch the actual drift the
PR encountered?). Both axes matter; this directory holds the empirical
axis.

## Fixture index

| Sub-class | Fixture | Anchor PR / history | Expected finding |
|---|---|---|---|
| count drift | `count-drift-9-vs-15.md` | PR #1259 (`review(pr-1257-postmerge): verify-then-claim count drift (9→18+) frontmatter + body + MEMORY.md`) | "9 drift instances" claim vs 15-row table |

Add a new row when a new fixture lands.

## Adding a new fixture

1. Pick a historical drift instance from the verify-then-claim memo's
   body table (canonical).
2. Create the smallest markdown file that reproduces the drift pattern.
   The fixture body should be runnable through the existing checker
   without external dependencies.
3. Document the anchor PR (or memo cite) inline at the top of the
   fixture as an HTML comment so the provenance survives.
4. Add a regression test in `fixtures.test.ts` that invokes the
   relevant `check-*.ts` and asserts on the finding count + shape.
5. Update the fixture index above.
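
Step 5 is the easiest one to forget. The sketch below shows one way a guard could compare the fixture files on disk against the index table. Both `listIndexedFixtures` and `missingFromIndex` are hypothetical helpers, not part of substrate-claim-checker, and the parser assumes the four-column layout used in the index above.

```typescript
// Hypothetical helpers (not part of substrate-claim-checker).
// Assumed column order: Sub-class | Fixture | Anchor PR / history | Expected finding

// Extract the fixture file names listed in the "Fixture index" table.
function listIndexedFixtures(readme: string): string[] {
  const fixtures: string[] = [];
  for (const rawLine of readme.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("|")) continue; // not a table row
    // A leading "|" yields an empty first cell, so the Fixture column is index 2.
    const cells = line.split("|").map((cell) => cell.trim());
    const fixtureCell = cells[2] ?? "";
    // Only accept a cell that is exactly one backticked *.md name;
    // this skips the header row and the separator row.
    const match = /^`([^`]+\.md)`$/.exec(fixtureCell);
    if (match && match[1]) fixtures.push(match[1]);
  }
  return fixtures;
}

// Return the on-disk fixture names that have no row in the index.
function missingFromIndex(onDisk: string[], readme: string): string[] {
  const indexed = new Set(listIndexedFixtures(readme));
  return onDisk.filter((file) => !indexed.has(file));
}
```

The `onDisk` list could come from `readdirSync` in `node:fs`, filtered to `*.md` and excluding `README.md`. A non-empty result from `missingFromIndex` would mean a fixture landed without its index row.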

## Composes with

- `tools/substrate-claim-checker/check-counts.ts` and siblings — the
  checkers the fixtures regress against
- B-0170 (the parent backlog row this directory contributes to)
- The verify-then-claim memo's body table (the canonical eval-set)
42 changes: 42 additions & 0 deletions tools/substrate-claim-checker/fixtures/count-drift-9-vs-15.md
@@ -0,0 +1,42 @@
<!--
Eval-set fixture for substrate-claim-checker (B-0170).

Reproduces the count-drift pattern surfaced in PR #1259
`review(pr-1257-postmerge): verify-then-claim count drift
(9→18+) frontmatter + body + MEMORY.md` — the body narrative
claims a smaller count than the body table actually records.
The exact values are visible inline below.

The fixture is intentionally minimal: only the count claim,
the table, and enough framing for the checker to find the
nearest-table within its 50-line window.

NOTE: this comment intentionally avoids restating the exact
`<number> <noun>` pair from the body claim. Restating it would
produce a spurious second matching claim from the HTML
provenance and let regressions in body-claim detection slip
past `fixtures.test.ts` (per PR #3611 review threads from
chatgpt-codex-connector + copilot-pull-request-reviewer).
-->

# Drift catalogue (eval-set fixture)

The catalogue tracks 9 drift instances across the substrate.

| # | Instance | Sub-class |
|---|---|---|
| 1 | row-1 | count |
| 2 | row-2 | count |
| 3 | row-3 | existence |
| 4 | row-4 | path-form |
| 5 | row-5 | cross-surface |
| 6 | row-6 | count |
| 7 | row-7 | convention |
| 8 | row-8 | semantic-equivalence |
| 9 | row-9 | empirical-output |
| 10 | row-10 | self-recursive |
| 11 | row-11 | count |
| 12 | row-12 | existence |
| 13 | row-13 | path-form |
| 14 | row-14 | cross-surface |
| 15 | row-15 | convention |
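
A minimal sketch of the kind of check this fixture exercises: find a `<number> <noun>` claim, locate the nearest table within a 50-line window after it, and compare the claimed number against the table's data rows. This is not the real `check-counts.ts`. The function name `findCountDrift`, the regexes, and the scan strategy are all assumptions. Only the field names (`line`, `claim`, `claimedCount`, `actualCount`) are taken from the assertions in `fixtures.test.ts`.

```typescript
// Hypothetical sketch of count-drift detection. NOT the real check-counts.ts:
// the function name, regexes, and scan strategy are assumptions.
interface CountFinding {
  line: number;         // 1-based line number of the claim
  claim: string;        // the matched "<number> <noun>" text
  claimedCount: number; // number stated in the prose
  actualCount: number;  // data rows in the nearest following table
}

// A markdown table separator row such as "|---|---|---|".
const SEPARATOR_ROW = /^\|(\s*:?-+:?\s*\|)+$/;

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function findCountDrift(markdown: string, noun: string, windowLines = 50): CountFinding[] {
  const lines = markdown.split("\n");
  const claimRe = new RegExp(`\\b(\\d+)\\s+${escapeRegExp(noun)}\\b`);
  const findings: CountFinding[] = [];

  for (let i = 0; i < lines.length; i++) {
    const match = claimRe.exec(lines[i] ?? "");
    if (!match) continue;
    const claimedCount = Number(match[1]);

    // Look for the nearest separator row within the window after the claim.
    // The header row sits at i + 1 at the earliest, so start at i + 2.
    const limit = Math.min(lines.length - 1, i + windowLines);
    for (let j = i + 2; j <= limit; j++) {
      if (!SEPARATOR_ROW.test((lines[j] ?? "").trim())) continue;

      // Count consecutive data rows below the separator.
      let actualCount = 0;
      for (let k = j + 1; k < lines.length && (lines[k] ?? "").trim().startsWith("|"); k++) {
        actualCount++;
      }
      if (actualCount !== claimedCount) {
        findings.push({ line: i + 1, claim: match[0], claimedCount, actualCount });
      }
      break; // only the nearest table is compared
    }
  }
  return findings;
}
```

The sketch scans every line, HTML comments included. A provenance comment that restated the exact claim within the window would therefore yield a second finding, which is the masking risk the fixture's NOTE guards against.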