fix(hooks): #170 — gate post-preflight capture reminder on implementation intent#570
Conversation
|
Warning Review limit reached
More reviews will be available in 3 minutes and 43 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughThis PR implements a conditional gate for the post-preflight capture reminder by detecting code-implementation intent in user prompts. Intent signals are persisted via session state between two hooks: the UserPromptSubmit hook writes a boolean classification, and the PostToolUse hook reads it to decide whether to emit the Step 5.6 disambiguation reminder. ChangesImplementation-intent gate on post-preflight capture reminder (
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/test_post_preflight_capture_reminder.py (1)
149-149: 💤 Low valueOptional: chained comparison style could be clearer.
The assertion
read_intent(sid) == has_implementation_signal(prompt) is Trueis correct but relies on Python's chained comparison semantics ((a == b) and (b is True)). Consider splitting for clarity:expected = has_implementation_signal(prompt) assert expected is True assert read_intent(sid) is TrueThis applies to line 157 as well.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_post_preflight_capture_reminder.py` at line 149, The chained assertion uses ambiguous comparison semantics; change the assertion using read_intent(sid) and has_implementation_signal(prompt) into two explicit checks: evaluate expected = has_implementation_signal(prompt) then assert expected is True and assert read_intent(sid) is True (repeat the same refactor for the similar assertion at line 157) so the intent of has_implementation_signal(prompt) and read_intent(sid) is unambiguous.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/hooks/_prompt_intent_state.py`:
- Around line 7-9: Update the module docstring in
scripts/hooks/_prompt_intent_state.py (lines describing the persisted predicate)
to accurately state that the pre-hook writes the boolean returned by
has_implementation_signal(prompt) (from scripts/hooks/preflight_reminder.py)
keyed by session_id, rather than should_fire_preflight(prompt), so the docstring
matches the actual contract read by the capture hook.
- Around line 70-80: read_intent currently returns a persisted "fire" value
regardless of file age; update read_intent to enforce the same TTL used by
_sweep_stale: after obtaining the path via state_path(session_id) and before
loading JSON, stat the file's mtime and compare to time.time() against
_TTL_SECONDS, and if the file is older than _TTL_SECONDS return None (treat as
absent/stale); otherwise proceed to read and validate the JSON as before,
catching OSError/JSONDecodeError/ValueError and returning None on error.
---
Nitpick comments:
In `@tests/test_post_preflight_capture_reminder.py`:
- Line 149: The chained assertion uses ambiguous comparison semantics; change
the assertion using read_intent(sid) and has_implementation_signal(prompt) into
two explicit checks: evaluate expected = has_implementation_signal(prompt) then
assert expected is True and assert read_intent(sid) is True (repeat the same
refactor for the similar assertion at line 157) so the intent of
has_implementation_signal(prompt) and read_intent(sid) is unambiguous.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a702a41a-b56e-4517-8c88-54ffd431b9d2
📒 Files selected for processing (6)
.github/workflows/test-mcp-regression.ymlscripts/hooks/_prompt_intent_state.pyscripts/hooks/post_preflight_capture_reminder.pyscripts/hooks/preflight_intent.pyscripts/hooks/preflight_reminder.pytests/test_post_preflight_capture_reminder.py
…tion intent The PostToolUse capture reminder (added #168) fired its disambiguation question on EVERY preflight that surfaced ≥1 decision — including read-only prompts where no refinement is possible (#170 spam). Gate it on implementation intent. Design (operator-confirmed): suppress the reminder ONLY when the originating prompt carries no implementation signal at all; keep the user-disambiguation for every implementation-intent prompt — honoring #175 (the agent/lexical scan can't reliably judge "compatible vs refinement"; "add tests for X" must reach the user). Mechanism: - NEW `has_implementation_signal(prompt)` in preflight_intent.py = verb-regex OR indirect-intent-phrase, INDEPENDENT of SKIP_PATTERNS. `should_fire_preflight` refactored to delegate (`not skip-listed AND has_implementation_signal`). Crucial: `should_fire_preflight` skip-lists `(fix|add|update|write).*test`, so it returns False for "add tests for X" — gating on it would lexically suppress exactly the #175-forbidden case. `has_implementation_signal` ignores skip patterns, so "add tests for X" (verb "add") still reaches the user. - `_prompt_intent_state.py` (new): session-scoped state file (system tempdir, sanitized session_id, 24h TTL sweep — GC across session boundaries, no new SessionEnd wiring). UserPromptSubmit hook persists has_implementation_signal; PostToolUse capture hook suppresses iff `read_intent(...) is False` (absent → fires; missed capture is irreversible). Tests (tests/test_post_preflight_capture_reminder.py, wired into test-mcp-regression.yml): suppress-on-read-only, fire-on-impl-intent, end-to-end add-tests-still-fires (#175 guard), default-fire-on-absent-state, write-side persistence, TTL sweep, state round-trip. Revises #170 acceptance (b): "add tests for X" is NOT lexically suppressed — it fires and the user one-click-dismisses (the lexical-suppress approach is what Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
6bafe39 to
543f5fc
Compare
Closes #170. Supersedes the closed #535.
Problem
The PostToolUse capture reminder (added #168) fires its
AskUserQuestiondisambiguation on everypreflightthat surfaces ≥1 decision — including read-only/debug prompts where no refinement is possible. That's the #170 spam.The #170 ↔ #175 conflict, and the resolution
#170's original approaches (verb-list / topic-similarity gates) tried to lexically decide "is this a refinement?" — exactly the unreliable judging #175 closed ("a lexical scan can't tell a compatible 'add tests for X' from a refinement smuggled under 'add'"). Lexically suppressing on compatibility is what #535 died on.
Resolution (operator-confirmed): impl-intent gate only. Separate the two concerns:
Mechanism
has_implementation_signal(prompt)inpreflight_intent.py= verb-regex OR indirect-intent-phrase, independent ofSKIP_PATTERNS.should_fire_preflightrefactored to delegate (not skip-listed AND has_implementation_signal).should_fire_preflight: the latter skip-lists(fix|add|update|write).*test, so it returnsFalsefor "add tests for X". Gating on it would lexically suppress the feat(skill): user-disambiguation question before Step 5.6 contradiction capture #175-forbidden case.has_implementation_signalignores skip-patterns → "add tests for X" (verb "add") still reaches the user. (This was caught by the failing TDD test — the audit/implement loop working.)_prompt_intent_state.py(new): session-scoped handoff file (system tempdir, sanitizedsession_id, 24h TTL sweep → GC across session boundaries with no new SessionEnd wiring). The UserPromptSubmit hook persistshas_implementation_signal; the PostToolUse capture hook suppresses iffread_intent(...) is False(absent/unreadable → fires; missed capture is irreversible).Acceptance (issue #170)
add tests for X→ fires + user disambiguates (acceptance (b) revised — lexical suppression rejected per feat(skill): user-disambiguation question before Step 5.6 contradiction capture #175); read-only → suppressed.Tests
tests/test_post_preflight_capture_reminder.py, wired intotest-mcp-regression.yml(the recurring CI-wiring gap; #562/#402/#170 each needed it): suppress-on-read-only, fire-on-impl-intent, end-to-end add-tests-fires (#175 guard), default-fire-on-absent, write-side persistence, TTL sweep, state round-trip. ruff + mypy clean; existingtest_preflight_hook.py/test_preflight_intent.pypass unchanged.Summary by CodeRabbit
Improvements
Tests
Chores