rule(verify-reviewer-findings): extend blocked-green-ci rule with verify-before-fix discipline #3721

Merged
AceHack merged 2 commits into main from rule/verify-reviewer-findings-otto-cli-2026-05-16
May 16, 2026
Conversation

Member

@AceHack AceHack commented May 16, 2026

Extends .claude/rules/blocked-green-ci-investigate-threads.md with verify-before-fix discipline + a suspect-by-default Copilot finding list.

Empirical evidence from the 2026-05-16 autonomous session: 4 confirmed false positives on Copilot's table double-pipe (||) hallucination class. Threshold for entry: 2+ FPs across distinct PRs.

Also captures the stale-but-fresh-looking finding class (sibling-PR self-healing, write-time-accurate prose) — resolve no-op.

Discovered while authoring: my new check-shard-before-push.ts helper flags bullet-continuation lines as MD032 false positives. Filing as next-tick fix for the helper itself.

Co-Authored-By: Claude <noreply@anthropic.com>

…ify-before-fix discipline

Extends `.claude/rules/blocked-green-ci-investigate-threads.md` with a
composes-with section on verifying reviewer findings before applying
fixes. Captures empirical evidence from the 2026-05-16 autonomous
session:

1. Verification anchors: direct line-level awk inspection; gh api +
   git log for cross-reference claims; local lint/build re-run.

2. Suspect-by-default Copilot finding classes: table double-pipe (||)
   hallucination — 4 confirmed FPs in one session (PR #3685, #3690,
   #3699-era, #3709), all verified by direct awk as single-| rows.

3. Stale-but-fresh-looking findings: parent-tick links to shard files
   in sibling PRs (true at filing-time, self-healed by review-time);
   "X-status vs Y-status inconsistency" prose observations (accurate
   at write-time but underlying state moved). Resolve no-op.

Threshold for adding a Copilot finding to the suspect-by-default list:
two-or-more across distinct PRs.
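
The line-level check described in item 2 can be sketched in TypeScript as an alternative to the awk inspection. This is a minimal illustration, not the rule file's actual command: `hasDoublePipe` and the sample table are hypothetical names and data introduced here.

```typescript
// Hypothetical helper (not part of the repo): report whether the 1-indexed
// line `lineNumber` of a Markdown document really contains a doubled pipe.
function hasDoublePipe(markdown: string, lineNumber: number): boolean {
  const lines = markdown.split(/\r?\n/);
  const line = lines[lineNumber - 1];
  if (line === undefined) {
    throw new RangeError(`line ${lineNumber} is out of range (1..${lines.length})`);
  }
  return line.includes("||");
}

// Sample table whose rows use single pipes only.
const sample = [
  "| Col A | Col B |",
  "| ----- | ----- |",
  "| one   | two   |",
].join("\n");

console.log(hasDoublePipe(sample, 3)); // false
```

If the cited row comes back `false`, the finding belongs to the hallucination class and the thread can be resolved without a fix.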

Markdownlint clean on the rule file. (The new check-shard-before-push.ts
helper flagged 3 false-positive MD032s on bullet-continuation lines —
filing as next-tick fix for the helper itself.)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 16, 2026 03:55
@AceHack AceHack enabled auto-merge (squash) May 16, 2026 03:56
Copilot AI left a comment

Pull request overview

This PR updates the factory rule for diagnosing “BLOCKED with green required CI” to add a verify-before-fix discipline for reviewer-thread findings, including an empirical “suspect-by-default” list for known false-positive classes and guidance for handling stale-but-previously-true findings.

Changes:

  • Add a “verify-before-fix” step after surfacing unresolved review threads to reduce wasted fixes/regressions from hallucinated findings.
  • Introduce an empirical suspect-by-default list (currently the table-pipe false-positive class) with a threshold for inclusion.
  • Document how to resolve stale-but-fresh-looking findings as no-ops when the underlying state self-heals.

Comment thread .claude/rules/blocked-green-ci-investigate-threads.md Outdated
Comment thread .claude/rules/blocked-green-ci-investigate-threads.md Outdated
P0 (line 35): awk one-liner used `<N>` as a literal placeholder; if copied
verbatim, awk treats `N` as uninitialized (defaults to 0) and prints
nothing. Show `-v N=22` (literal value substitution) + explain the gotcha.

P1 (line 38): `git log <PR-cited-PR>` doesn't work — git log expects
refs/commits/paths, not PR numbers. Replace with three concrete runnable
forms:
  - gh api repos/<owner>/<repo>/pulls/<N> → metadata
  - gh pr view <N> --json commits,mergeCommit → commits via API
  - git log --grep '#<N>' → local-repo merge-commit by PR-number

Both fixes preserve the intent (verification anchors) while making the
commands directly runnable.

Co-Authored-By: Claude <noreply@anthropic.com>
@AceHack AceHack merged commit 444580f into main May 16, 2026
25 checks passed
@AceHack AceHack deleted the rule/verify-reviewer-findings-otto-cli-2026-05-16 branch May 16, 2026 04:02
AceHack added a commit that referenced this pull request May 16, 2026
…er-bug (#3724) (#3725)

GraphQL reset at 03:55:31Z. 3 real Copilot findings this tick:
- PR #3720 (0350Z shard): B-0545 stale ref — peer-Otto landed sweep
- PR #3721 (rule extension): 2 unrunnable command examples (awk N, git log on PR num)
- Helper bullet-continuation MD032 FP (discovered tick 19): isContinuationLine fix in PR #3724

3/3 findings real this tick. Copilot's overall accuracy is high; the table-pipe || class remains the only confirmed 2+-occurrence FP.

Dogfood loop: helper merged tick 17 → used in tick 19 on rule file →
caught helper's own bug → fix landed tick 20 (this).

Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 16, 2026
…ion (PR #3721) (#3722)

* shard(tick): 2026-05-16T03:54Z — verify-reviewer-findings rule extension (PR #3721)

PR #3716 (helper) + PR #3717 (0344Z) merged. Extended
blocked-green-ci-investigate-threads.md with 3 composes-with sections:
verification anchors + suspect-by-default Copilot finding list (table-pipe
|| hallucination, 4-FP entry) + stale-but-fresh-looking finding class.

Self-discovery: running the just-merged check-shard-before-push.ts on the
rule file surfaced a helper bug — checkMd032 flags bullet-continuation
lines as false-positive paragraphs. Filing as next-tick fix.
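
One way such a check could tell bullet-continuation lines apart from paragraphs is sketched below. This is an assumption-laden illustration, not the helper's actual code: `LIST_ITEM` and `continuationLineNumbers` are hypothetical names, and the sketch treats any blank line as ending the list item, which is stricter than CommonMark.

```typescript
// Hypothetical sketch: a bullet or ordered-list marker followed by text.
const LIST_ITEM = /^\s*(?:[-*+]|\d+[.)])\s+\S/;

// Return the 1-indexed numbers of lines that continue a list item, i.e.
// indented non-blank lines directly following an item or another continuation.
function continuationLineNumbers(lines: string[]): number[] {
  const result: number[] = [];
  let insideItem = false;
  lines.forEach((line, index) => {
    if (line.trim() === "") {
      insideItem = false; // simplification: a blank line ends the item
      return;
    }
    if (LIST_ITEM.test(line)) {
      insideItem = true; // a new list item starts here
      return;
    }
    if (insideItem && /^\s+\S/.test(line)) {
      result.push(index + 1); // indented text under an item: continuation
      return;
    }
    insideItem = false; // unindented text: an ordinary paragraph
  });
  return result;
}

const excerpt = [
  "1. Verification anchors: direct line-level awk inspection; gh api +",
  "   git log for cross-reference claims; local lint/build re-run.",
  "",
  "Plain paragraph.",
];
console.log(continuationLineNumbers(excerpt)); // [ 2 ]
```

A blank-line check in the MD032 style could skip the returned line numbers rather than treating them as paragraph text adjacent to a list.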

Also self-bite caught during shard authoring: 2 MD038 violations in the
prose describing the helper bug (literal trailing-space-in-backticks
examples). Rewrote the prose without the literal-trigger pattern. The
self-check IS doing its job.

GraphQL exhausted mid-tick after PR-3721 REST creation. PR #3721 NOT
armed this tick; will arm post-reset at 03:55:31Z next tick.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(pr-3722): Copilot P1 — double-backtick code span for regex with literal backtick

The inline code span containing the structural-marker regex contained a
literal backtick inside the character class. Single-backtick delimiters
ended early at the inner backtick, leaving the closing delimiter
unmatched and tripping MD038 / breaking parse.

Switch to a double-backtick code span (``/^[#>*\-|`]/``) which can carry
single inner backticks since the closing run is two consecutive
backticks.
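
The delimiter choice can be sketched as a small function, assuming the CommonMark rule that a code span closes only on a backtick run of the same length as the opening run. `wrapCodeSpan` is a hypothetical name, not part of the helper.

```typescript
// Hypothetical helper: wrap `content` in an inline code span whose delimiter
// is one backtick longer than the longest backtick run inside the content.
function wrapCodeSpan(content: string): string {
  const runs: string[] = content.match(/`+/g) ?? [];
  const longest = runs.reduce((max, run) => Math.max(max, run.length), 0);
  const fence = "`".repeat(longest + 1);
  // Pad with a space when the content itself starts or ends with a backtick,
  // so the inner backtick does not merge into the delimiter run.
  const pad = content.startsWith("`") || content.endsWith("`") ? " " : "";
  return fence + pad + content + pad + fence;
}

// The structural-marker regex from this fix carries one literal backtick.
console.log(wrapCodeSpan("/^[#>*\\-|`]/")); // ``/^[#>*\-|`]/``
```

Choosing "longest run plus one" is sufficient rather than minimal; any delimiter length that differs from every inner run would also work.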

Verified: bun tools/hygiene/check-shard-before-push.ts ok;
markdownlint-cli2 ok (exit 0).

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>