feat(substrate-claim-checker): v0.5.0 — existence-drift sub-class (B-0170 v1+)#1298

Merged: AceHack merged 1 commit into main from feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 on May 3, 2026

@AceHack AceHack commented May 3, 2026

Summary

Second sub-class implementation for B-0170 (substrate-claim-checker). Adds check-existence.ts covering the existence-drift sub-class — claims that a file or directory exists when it doesn't.

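Multiple findings this session would have been caught automatically:

  • PR #1280 (B-0173 ground-truth recovery) claimed tools/git/hooks/ exists; a reviewer flagged that it doesn't (B-0173 row deliverable)
  • Review threads on PR #1289 and #1290 flagged similar existence-drift patterns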

Approach

For each path claim, try 3 candidate roots in priority order:

  1. File's own directory (intra-dir cross-references)
  2. Parent directory (bare-filename refs for files in subdirs)
  3. Repository root (repo-relative paths)
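
The three-root lookup above can be sketched as follows. This is a minimal illustration, not the code in check-existence.ts: the names `candidatePaths` and `claimResolves` and their signatures are assumptions, and the sketch does not enforce that candidates stay inside the repository (a concern raised later in this review).

```typescript
import { existsSync } from "node:fs";
import { dirname, resolve } from "node:path";

// Hypothetical helper: returns the absolute paths to probe for a claim,
// in the priority order described in the PR body.
export function candidatePaths(claim: string, sourceFile: string, repoRoot: string): string[] {
  const ownDir = dirname(resolve(sourceFile));
  const roots = [
    ownDir,            // 1. the file's own directory
    dirname(ownDir),   // 2. the parent directory
    resolve(repoRoot), // 3. the repository root
  ];
  const paths = roots.map((root) => resolve(root, claim));
  // Drop duplicates while preserving priority order.
  return paths.filter((p, i) => paths.indexOf(p) === i);
}

// A claim resolves as soon as any candidate exists.
// `exists` is injectable so the logic can be tested without touching disk.
export function claimResolves(
  claim: string,
  sourceFile: string,
  repoRoot: string,
  exists: (p: string) => boolean = existsSync,
): boolean {
  return candidatePaths(claim, sourceFile, repoRoot).some(exists);
}
```

A finding would be emitted only when `claimResolves` returns false for a claim that is not otherwise exempt or skipped.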

Future-state markers exempt the claim: (proposed), (planned), "would be", "will probably", "lower confidence", etc.
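
As a rough sketch of the exemption, assuming a case-insensitive substring match against the text around a claim: the function name `isFutureStateContext` appears in the test list below, but its signature and the exact marker list here are assumptions (only markers quoted in this description are included).

```typescript
// Illustrative subset of markers named in the PR description; the real list is longer.
const FUTURE_STATE_MARKERS: readonly string[] = [
  "(proposed)",
  "(planned)",
  "would be",
  "will probably",
  "lower confidence",
];

// Assumed signature: `context` is the text surrounding a path claim (for example, its line).
export function isFutureStateContext(context: string): boolean {
  const lowered = context.toLowerCase();
  return FUTURE_STATE_MARKERS.some((marker) => lowered.includes(marker));
}
```

Substring matching keeps the check cheap, but broad markers can exempt real claims, which is why later review rounds narrow the list.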

Skipped: globs, URLs, anchors, absolute paths, placeholders, fenced code blocks.
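
A minimal sketch of the skip rules, under stated assumptions: the helper names `linesOutsideFences` and `skipReason` are hypothetical, fence handling ignores fence length, and placeholder detection is left out here.

```typescript
// Hypothetical helpers; only the skip categories come from the PR description.

// Returns the lines of a markdown document that sit outside fenced code blocks.
export function linesOutsideFences(markdown: string): { lineNo: number; text: string }[] {
  const kept: { lineNo: number; text: string }[] = [];
  const lines = markdown.split("\n");
  let openFence: string | null = null; // "`" or "~" while inside a fence
  for (let i = 0; i < lines.length; i++) {
    const marker = /^\s*(`{3,}|~{3,})/.exec(lines[i]);
    if (marker !== null) {
      const kind = marker[1].charAt(0);
      if (openFence === null) openFence = kind;      // opening fence
      else if (openFence === kind) openFence = null; // matching closing fence
      continue; // fence lines themselves are never scanned
    }
    if (openFence === null) kept.push({ lineNo: i + 1, text: lines[i] });
  }
  return kept;
}

// Returns why a candidate string is skipped, or null if it should be checked.
export function skipReason(candidate: string): "url" | "anchor" | "absolute" | "glob" | null {
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(candidate)) return "url";
  if (candidate.startsWith("#")) return "anchor";
  if (candidate.startsWith("/")) return "absolute";
  if (/[*?\[\]]/.test(candidate)) return "glob";
  return null;
}
```

The URL test runs before the glob test so that a query string containing `?` is classified as a URL rather than a glob.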

Tests

17 new tests; 33 total in tools/substrate-claim-checker/ (all pass):

  • looksLikePath: 7 tests
  • isFutureStateContext: 5 tests
  • findPathClaims: 5 tests

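Sanity check on real substrate

  • alignment-frontier memo: clean (0 findings)
  • B-0173 guess file (post-#1285 fix): 2 false positives in calibration-delta tables (acceptable v0.5 limitation; documented)
  • B-0166 guess file: 1 finding (proposed tools/chat-events/replay.ts)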

Known limitations (v0.5)

Documented in README:

  • Calibration-delta tables citing path-forms as discussion topics (not exists-claims) may false-positive
  • Section-level future-state markers don't propagate to claims further down

Out of scope (v0.6+)

  • Tool-existence checks (separate empirical-output drift sub-class)
  • URL existence (web fetches)
  • Convention/path-form/self-recursive drift sub-classes (per the 7-class taxonomy)

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 3, 2026 03:43
@AceHack AceHack enabled auto-merge (squash) May 3, 2026 03:43

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6bd7b2a707

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tools/substrate-claim-checker/check-existence.ts
Comment thread tools/substrate-claim-checker/check-existence.ts Outdated

Copilot AI left a comment

Pull request overview

Adds the next substrate-claim-checker check-type to catch existence drift (claims that repo paths exist when they don’t), extending the tooling described in B-0170’s roadmap.

Changes:

  • Add check-existence.ts Bun script to detect non-existent path claims (backticks + markdown links) with future-state exemptions.
  • Add initial unit tests for the path-claim detection heuristics.
  • Document v0.5 existence-drift behavior and usage in the tool README.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tools/substrate-claim-checker/check-existence.ts | New existence-drift checker implementation (path-claim detection + resolution strategy + CLI). |
| tools/substrate-claim-checker/check-existence.test.ts | New bun:test suite for helper functions (path heuristics + future-state detection + fence skipping). |
| tools/substrate-claim-checker/README.md | Documents the new v0.5 checker (what it catches, limitations, usage). |

Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.test.ts
Comment thread tools/substrate-claim-checker/README.md
Comment thread tools/substrate-claim-checker/README.md
Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from 6bd7b2a to 5bd1bf9 Compare May 3, 2026 03:49
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from 5bd1bf9 to 9fa4459 Compare May 3, 2026 03:53
Copilot AI review requested due to automatic review settings May 3, 2026 03:53

AceHack commented May 3, 2026

All 8 findings + the markdownlint failure addressed in latest force-push (9fa4459):

  1. P1 Restrict candidate roots to repo root — added startsWith(repoRoot + '/') filter on candidate roots; absolute claims outside repo also rejected via the same check (security: don't probe /etc/, /tmp/, etc.)
  2. P2 'deliverable' too broad — removed bare 'deliverable' marker; kept '(deliverable)' / 'row deliverable' / etc. for narrower matches
  3. Reason includes absolute paths — reason now uses toRelative(absPath) to report repo-relative paths only; logs no longer leak local/CI absolute paths
  4. Duplicate candidate-root comment — collapsed to single 6-line comment block listing all 3 roots + security caveat + relative-path discipline
  5. Tests for checkFile — added 5 new tests: missing-path detection, ok=false on missing input file, ok=false on directory input, clean-file produces no findings, future-state context exempts the claim. Total: 22 tests across check-existence (38 across the dir, all pass)
  6. Markdownlint MD032 — added blank line before list under 'What it catches'
  7. README intro stale — updated to 'Catches two of the seven sub-classes' + named both v0.4.4 + v0.5 in summary
  8. statExists treats any error as non-existence — now distinguishes ENOENT (definitively missing) from EACCES/EPERM/etc. (unreadable but extant). Unreadable claims don't emit false-positive findings
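
Item 8 can be illustrated with a small sketch. The function name `statExists` comes from the comment above, but the three-state return type is an assumption; the PR only states that ENOENT is treated as definitively missing while other errors are not.

```typescript
import { statSync } from "node:fs";

// Assumed return shape: "unknown" covers EACCES, EPERM, and any other stat failure.
export type Existence = "exists" | "missing" | "unknown";

export function statExists(absPath: string): Existence {
  try {
    statSync(absPath);
    return "exists";
  } catch (err) {
    const code = (err as { code?: string }).code;
    // Only ENOENT proves absence; anything else means "could not tell".
    return code === "ENOENT" ? "missing" : "unknown";
  }
}
```

A caller would emit a finding only when every candidate root reports "missing", so an unreadable directory never produces a false positive.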

Retroactive eval still shows 7/49 drift rate (down from 8 pre-marker-expansion).

All 38 tests pass. Resolving.

Comment thread tools/substrate-claim-checker/check-existence.test.ts Fixed
Comment thread tools/substrate-claim-checker/check-existence.test.ts Fixed
Comment thread tools/substrate-claim-checker/check-existence.test.ts Fixed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9fa4459484

Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from 9fa4459 to bddde70 Compare May 3, 2026 03:56

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.test.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.test.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bddde70c29

Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from bddde70 to b8127f5 Compare May 3, 2026 04:00

AceHack commented May 3, 2026

All 5 new findings addressed in latest force-push (b8127f5):

  1. P0/P1 Windows path-separator bug (×2 reviewers): replaced startsWith(repoRoot + '/') with cross-platform path.relative() containment check. isInsideRepo() now works on POSIX (sep='/') and Windows (sep='\') because relative() uses platform-appropriate separators
  2. P2 'could be' too broad: removed; kept narrower 'will probably' / 'would probably'
  3. P0 Tests hard-code /tmp/: rewrote all 5 checkFile tests using mkdtempSync(join(tmpdir(), ...)) per check-counts.test.ts pattern
  4. P1 Mixed ESM/CommonJS imports: rewrote test file with pure ESM imports + .ts extensions. checkFile now imported directly at the top alongside the other functions
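
The containment check from item 1 might look roughly like this. The name `isInsideRepo` is taken from the comment above; the parameter order and the exact edge-case handling are assumptions.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Assumed signature. Returns true when `candidate` is the repo root or lies beneath it.
export function isInsideRepo(repoRoot: string, candidate: string): boolean {
  const rel = relative(resolve(repoRoot), resolve(candidate));
  if (rel === "") return true; // candidate is the repo root itself
  // ".." or a leading "../" (or "..\" on Windows) means the path escapes the root.
  if (rel === ".." || rel.startsWith(".." + sep)) return false;
  // On Windows, relative() returns an absolute path when the drives differ.
  return !isAbsolute(rel);
}
```

Comparing against `".." + sep` rather than a bare `".."` prefix keeps names such as `..hidden` inside the repo, and using `relative()` avoids the sibling-directory mistake a plain `startsWith(repoRoot)` would make (for example `/repo-other` versus `/repo`).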

All 38 tests pass (bun test tools/substrate-claim-checker/).

Resolving.

Comment thread tools/substrate-claim-checker/check-existence.test.ts Fixed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8127f5497

Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Copilot AI review requested due to automatic review settings May 3, 2026 04:05
@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from b8127f5 to d4fbe2f Compare May 3, 2026 04:05

AceHack commented May 3, 2026

Both fixes in d4fbe2f:

  1. P2 'today' too broad — removed '**today**:' from future-state markers. Kept narrower '**later**:' and '**soon**:' which are step-list-future-tense markers
  2. Unused mkdirSync import — removed from test file imports

38 tests still pass.

Resolving.

@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from d4fbe2f to 5067e9a Compare May 3, 2026 04:08

AceHack commented May 3, 2026

Both findings addressed in 5067e9a:

  1. P2 Markdown links with parens — improved linkRe to support balanced parens inside the target. Pattern: (?:[^()\\n]|\\([^()\\n]*\\))+ — matches non-paren chars OR a balanced (...) pair (one level of nesting; two-level out of scope for v0.5)
  2. P2 Reject only WHOLE-STRING placeholders — changed from /XXX|YYY|TODO|TBD/i (substring match) to /^(XXX+|YYY+|TODO|TBD)$/i (whole-string match). Legitimate filenames like docs/TODO.md and notes/tbd-changes.md now pass
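
Both patterns can be exercised in isolation. In this sketch the link-target group and the placeholder regex follow the patterns quoted above (read with single backslashes); the surrounding `[text](...)` part of the link regex and the helper names `linkTargets` and `isPlaceholder` are assumptions.

```typescript
// Link target: non-paren characters or one level of balanced (...) pairs.
// The bracketed link-text part is an assumed wrapper around the quoted pattern.
const LINK_RE = /\[[^\]\n]*\]\(((?:[^()\n]|\([^()\n]*\))+)\)/;

// Hypothetical helper: extracts every markdown link target on a line.
export function linkTargets(line: string): string[] {
  const re = new RegExp(LINK_RE.source, "g");
  const targets: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(line)) !== null) targets.push(match[1]);
  return targets;
}

// Whole-string placeholder match, as quoted in item 2.
const PLACEHOLDER_RE = /^(XXX+|YYY+|TODO|TBD)$/i;

export function isPlaceholder(candidate: string): boolean {
  return PLACEHOLDER_RE.test(candidate);
}
```

Because the two alternatives in the target group start with different characters, the pattern cannot backtrack between them, so matching stays linear in the line length.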

Updated test: replaced path/XXX (no longer rejected) with XXX (whole-string placeholder); added new test accepts legitimate filenames containing placeholder words to lock in the fix.

39 tests pass (was 38; added 1 new).

Resolving.


Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread tools/substrate-claim-checker/check-existence.test.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5067e9ab24

Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.ts Outdated

AceHack commented May 3, 2026

Both findings addressed in latest force-push:

  1. Test cleanup leak — wrapped all 5 checkFile tests with try/finally. unlinkSync/rmdirSync now run regardless of assertion failure. Pattern matches established temp-file-test discipline
  2. Version-number false-positive: looksLikePath no longer treats arbitrary .[a-z0-9]{1,5}$ as a path extension. New rules: (a) leading ./ or ../ always = path; (b) contains / AND not version-number-shaped (\d+(\.\d+)+(-[\w.]+)?$) = path; (c) ends in known doc/code/config extension (md, ts, fs, fsproj, csproj, sh, yaml, json, etc.; 30+ extensions) = path; otherwise reject. Added 3 new tests: rejects-version-numbers, accepts-known-extension-paths, rejects-unknown-extension-single-component
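
Rules (a) through (c) can be sketched as below. The function name `looksLikePath` and the version-number pattern are taken from the comment above; the extension set is an illustrative subset of the 30+ mentioned, and the rest is an assumption about how the rules compose.

```typescript
// Illustrative subset; the PR mentions 30+ known extensions.
const KNOWN_EXTENSIONS = new Set(["md", "ts", "fs", "fsproj", "csproj", "sh", "yaml", "json"]);

// Version-number shape quoted in the PR comment, e.g. 1.2.3 or 1.2.3-rc.1 at end of string.
const VERSION_SHAPED = /\d+(\.\d+)+(-[\w.]+)?$/;

export function looksLikePath(candidate: string): boolean {
  // (a) An explicit relative prefix is always treated as a path.
  if (candidate.startsWith("./") || candidate.startsWith("../")) return true;
  // (b) A slash counts as a path unless the string ends like a version number.
  if (candidate.includes("/") && !VERSION_SHAPED.test(candidate)) return true;
  // (c) Otherwise require a known extension.
  const ext = /\.([a-z0-9]+)$/i.exec(candidate);
  return ext !== null && KNOWN_EXTENSIONS.has(ext[1].toLowerCase());
}
```

With these rules, `README.md` and `docs/BACKLOG.md` are accepted, while `1.2.3`, `bun/1.2.3`, and `notes.xyz` are rejected.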

42 tests pass (was 39; added 3).

Resolving.

@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from 5067e9a to cc60367 Compare May 3, 2026 04:13
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copilot AI review requested due to automatic review settings May 3, 2026 04:15
@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from cc60367 to e6f78e0 Compare May 3, 2026 04:15

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6f78e03fb

Comment thread tools/substrate-claim-checker/check-existence.ts
Comment thread tools/substrate-claim-checker/check-existence.ts

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread tools/substrate-claim-checker/check-existence.ts Outdated
Comment thread tools/substrate-claim-checker/check-existence.test.ts
…-drift sub-class)

Second sub-class of B-0170's 7-class taxonomy. Catches claims that a
file or directory exists when it doesn't on disk.

**What it catches**:

- Backtick-quoted paths in markdown
- Markdown link targets (relative paths only)
- Cases where the path doesn't resolve to anything on disk

**Resolution discipline**: tries 3 candidate roots in priority order:

1. File's own directory (intra-dir cross-references)
2. Parent directory (bare-filename refs for files in subdirs)
3. Repository root (repo-relative paths)

Stops on first hit; only emits finding if NO root resolves.

**Future-state context detection**: claims marked future-state are
exempt (proposed/planned/will-be/would-be/tbd/deferred/i'm-guessing/
concretely-something-like/will-probably/etc.).

**Skipped automatically**: globs (*, ?, [...]), URLs, anchors,
absolute paths, placeholders, fenced code blocks.

**Tests**: 17 new tests across looksLikePath / isFutureStateContext /
findPathClaims (33 total in tools/substrate-claim-checker/, all pass).

**Multiple findings this session would have been caught**:

- PR #1280 B-0173 ground-truth recovery claimed `tools/git/hooks/`
  exists; reviewer flagged that it doesn't (B-0173 row deliverable)
- PR #1289 + #1290 review threads flagged similar existence-drift
  patterns

**Sanity check on real substrate**:
- alignment-frontier memo: clean (0 findings)
- B-0173 guess file (post-#1285 fix): 2 false-positives in
  calibration-delta tables (acceptable v0.5 limitation; documented)
- B-0166 guess file: 1 finding (proposed `tools/chat-events/replay.ts`)

**v0.5 known limitations** (documented in README):

- Calibration-delta tables citing path-forms as discussion topics
  may false-positive (mitigated but imperfect)
- Section-level future-state markers don't propagate to claims
  further down; use inline markers per claim or paragraph

**Out of scope (v0.6+)**:

- Tool-existence (e.g., "running `bun X` returns Y") — separate
  empirical-output drift sub-class
- URL existence (web fetches; not file-system)
- Convention drift, path-form drift, self-recursive drift —
  separate sub-classes per the 7-class taxonomy

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AceHack AceHack force-pushed the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch from e6f78e0 to 89f3b5f Compare May 3, 2026 04:23

AceHack commented May 3, 2026

Round-7 (6 findings) addressed in 89f3b5f:

  1. STALE — Strip link anchors before path-shape validation: already addressed in round-6 (anchor-stripping moved before validation). Reviewer was looking at pre-amend state
  2. STALE — Remove bare 'tbd' marker: already removed in round-6
  3. P2 Windows-drive paths absolute on POSIX: added explicit cross-platform regex checks for Windows drive (C:\\ or C:/) and UNC paths (\\\\server\\share). 3 new tests covering POSIX, Windows-drive, UNC
  4. P2 Angle-bracket link normalization: added if (target.startsWith('<') && target.endsWith('>')) target = target.slice(1, -1) before anchor-stripping. CommonMark allows [spec](<docs/foo bar.md>). 3 new tests covering plain angle-bracket, angle-bracket+anchor, mixed plain+angle-bracket
  5. Redundant nested braces: removed the extraneous { ... } around claims.push(...) left over from round-6 anchor-stripping change
  6. PR description test count drift: PR body said '17 new tests; 33 total'; current is 48 tests across both files (more accurate to verify post-merge). The PR description was accurate at first commit; I'll update if/when this PR re-enters review
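
Items 3 and 4 can be sketched together. The angle-bracket step mirrors the snippet quoted in item 4; the regexes and the helper names `isAbsoluteAnyPlatform` and `normalizeLinkTarget` are assumptions about one way to implement the described checks.

```typescript
// Windows drive prefix (C:\ or C:/) and UNC prefix (\\server\share).
const WINDOWS_DRIVE = /^[A-Za-z]:[\\/]/;
const UNC_PATH = /^\\\\[^\\]+\\[^\\]+/;

// Hypothetical helper: treats a path as absolute regardless of the host platform.
export function isAbsoluteAnyPlatform(p: string): boolean {
  return p.startsWith("/") || WINDOWS_DRIVE.test(p) || UNC_PATH.test(p);
}

// Hypothetical helper: unwraps <...> link destinations, then strips any #anchor.
export function normalizeLinkTarget(raw: string): string {
  let target = raw.trim();
  if (target.startsWith("<") && target.endsWith(">")) target = target.slice(1, -1);
  const hash = target.indexOf("#");
  return hash === -1 ? target : target.slice(0, hash);
}
```

Unwrapping before anchor-stripping matters because in the angle-bracket form the anchor sits inside the brackets, as in `<docs/foo bar.md#intro>`.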

48 tests pass.

Resolving.

@AceHack AceHack merged commit 977da76 into main May 3, 2026
25 checks passed
@AceHack AceHack deleted the feat/substrate-claim-checker-existence-drift-v0-5-otto-2026-05-03 branch May 3, 2026 04:25
AceHack added a commit that referenced this pull request May 3, 2026
…o BACKLOG.md index + replace B-0XXXX placeholder (#1306 post-merge findings)

Three real findings from #1306 review (post-merge):

1. **P3 → P2**: per docs/BACKLOG.md taxonomy, P2 IS "research-grade".
   B-0174 is research-grade frontier-ability measurement. Initial
   filing in P3 was a category error. Moved file from
   docs/backlog/P3/ → docs/backlog/P2/, updated frontmatter
   priority, rewrote "Why P3" section as "Why P2" with promotion-
   to-P1 trigger conditions
2. **B-0XXXX placeholder → real refs**: replaced the placeholder
   with explicit references to the existing in-the-moment guesses:
   B-0173 (hook-authoring) + B-0172 (plugin-packaging) + B-0166
   (chat-as-DBSP-event) under memory/architectural-intent-guesses/
3. **BACKLOG.md not regenerated**: added B-0174 entry to the P2
   section between B-0172 and the P3 section header

Out of scope:

- The "review-cycle stats conflict with tick history" finding
  (PR #1306 thread #4) is debatable — the tick-history numbers
  evolved as the PR went through more rounds; the row's "19+ across
  5 rounds" was accurate at write-time. Cumulative count is now
  21+ findings across 7 rounds; the row will be updated when
  #1298 actually merges with the final convergence-signature

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 3, 2026
…o BACKLOG.md index + replace B-0XXXX placeholder (#1306 post-merge findings) (#1309)