fix(skill): preflight auto-fire on natural refactor prompts (replaces #151) #155
Merged
…rompts (#146)

Closes #146. Flow 2 in tests/e2e/run_e2e_flows.py fails because bicameral.preflight does not auto-fire in headless `claude -p` even when the user prompt explicitly contradicts a prior decision. The existing SKILL.md auto-fire description has plateaued; the agent's default tool-selection priority puts Bash/Glob ahead of preflight.

Solution: a deterministic UserPromptSubmit hook that detects code-implementation intent via a shared verb list and injects an authoritative <system-reminder> elevating preflight above file-inspection tools.

Architecture (Hickey razor):
- Verb list lives once in scripts/hooks/preflight_intent.py as data (frozenset). Future UI configurability is a one-edit change.
- should_fire_preflight(): pure function, 11 lines, depth 2, no network, no LLM, sub-millisecond regex scan.
- preflight_reminder.py: 9-line UserPromptSubmit hook entry point; fail-permissive (exit 0 + empty response on errors); never blocks the user.
- v0 verb-list duplication between the SKILL.md description (frontmatter) and the Python module is documented honestly in the SKILL.md addendum per audit Advisory #1, not papered over with a false SSOT claim.

Tests: 11 functionality tests (TDD-light invariant: every test invokes the unit and asserts on output, no presence-only patterns):
- 6 classifier tests covering all 30 verbs, 3 skip patterns, indirect intent, data shape, and the literal Flow 2 contradiction prompt
- 5 hook subprocess tests covering match/no-match/malformed-stdin/idempotent invocations + Flow 2 fixture

Authoritative integration test: tests/e2e/run_e2e_flows.py::test_flow_2 on dev branch (preflight tool_use.id must precede the first non-bicameral discovery tool in the stream-json transcript).

QorLogic SDLC artifacts: plan-preflight-autofire-hook.md, META_LEDGER Entries #11-#14 (PLAN, GATE PASS, IMPLEMENT, SUBSTANTIATE seal).

Merkle seal: 33007d2a72fe3db237935216e063327750896d595faa15001757761e43a8e83c
Risk grade: L2 (blast radius: every user prompt; individual-action risk: small + bounded + reversible)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
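The classifier described in this commit could look roughly like the sketch below. It is not the contents of scripts/hooks/preflight_intent.py: the seven verbs and the two skip patterns are hypothetical stand-ins (the commit mentions 30 verbs and 3 skip patterns without listing them), and only the overall shape (verb list as a frozenset, pure function, regex scan) is taken from the message.

```python
import re

# Hypothetical subset of the verb list; the real module is said to hold 30 verbs.
IMPLEMENTATION_VERBS = frozenset(
    {"add", "build", "fix", "implement", "refactor", "rename", "reorder"}
)

# Hypothetical skip patterns: prompts that ask a question instead of requesting a change.
SKIP_PATTERNS = (
    re.compile(r"^\s*(what|why|how|where|when)\b", re.IGNORECASE),
    re.compile(r"^\s*explain\b", re.IGNORECASE),
)

# One alternation over the verb list, anchored on word boundaries so that
# "prefix" does not match "fix" and "address" does not match "add".
_VERB_RE = re.compile(
    r"\b(" + "|".join(sorted(IMPLEMENTATION_VERBS)) + r")\b", re.IGNORECASE
)


def should_fire_preflight(prompt: str) -> bool:
    """Return True when the prompt looks like a code-implementation request."""
    if not prompt or not prompt.strip():
        return False
    if any(pattern.search(prompt) for pattern in SKIP_PATTERNS):
        return False
    return _VERB_RE.search(prompt) is not None
```

Keeping the verbs in a frozenset means the regex is derived from data, which is what makes the "one-edit change" claim plausible: editing the set is enough to change the classifier's behaviour.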
The preflight auto-fire fix in f4de501 added a UserPromptSubmit hook to the bicameral repo's own .claude/settings.json so the e2e flow passes when dogfooding bicameral on bicameral. But setup_wizard's _install_claude_hooks was not extended, so users running `bicameral-mcp setup` on their own repos got the old PostToolUse + SessionEnd hooks and no preflight reinforcement, leaving the bug the PR claims to close (#146) open in production.

Changes:
- pyproject.toml: add a `bicameral-mcp-preflight-reminder` console script entrypoint (`scripts.hooks.preflight_reminder:main`) so the hook resolves on PATH from any pip-installed environment, mirroring the existing `bicameral-mcp` and `bicameral-mcp-classify` pattern.
- setup_wizard.py: extend `_install_claude_hooks` with a third `UserPromptSubmit` block that writes the same idempotent merge pattern used for PostToolUse/Bash and SessionEnd. Stale entries matching `bicameral` or `preflight_reminder` in the command string are stripped before re-write.
- docs/SYSTEM_STATE.md: document the two new modified files under the preflight-hook session block.

Verification:
- 11/11 preflight tests pass (tests/test_preflight_intent.py + tests/test_preflight_hook.py).
- Smoke test: `_install_claude_hooks` on a fresh tempdir writes all three hook events and the resulting settings.json is byte-stable across repeated invocations.

Note: the bicameral repo's own .claude/settings.json continues to invoke `python3 scripts/hooks/preflight_reminder.py` (the source file directly) so devs working on the repo without a `pip install -e .` still get the hook firing. The divergence between dogfood and user install paths is intentional.
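The "strip stale entries, then re-write" merge described above can be sketched as follows. This is an illustration, not the code in setup_wizard.py: the function name `install_user_prompt_hook` is hypothetical, and the nested `hooks` / `type` / `command` layout of settings.json is assumed here rather than quoted from the commit.

```python
import json
from pathlib import Path

# Console script name taken from the commit message.
PREFLIGHT_COMMAND = "bicameral-mcp-preflight-reminder"
# Substrings that mark an entry as ours (and therefore replaceable).
STALE_MARKERS = ("bicameral", "preflight_reminder")


def install_user_prompt_hook(settings_path: Path) -> None:
    """Merge the preflight hook into settings.json without duplicating it."""
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    hooks = settings.setdefault("hooks", {})

    kept = []
    for group in hooks.get("UserPromptSubmit", []):
        # Drop only our own stale commands; keep anything the user added.
        commands = [
            h for h in group.get("hooks", [])
            if not any(marker in h.get("command", "") for marker in STALE_MARKERS)
        ]
        if commands:
            kept.append({**group, "hooks": commands})

    kept.append({"hooks": [{"type": "command", "command": PREFLIGHT_COMMAND}]})
    hooks["UserPromptSubmit"] = kept

    settings_path.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys and a fixed indent keep the output byte-stable across runs.
    settings_path.write_text(json.dumps(settings, indent=2, sort_keys=True) + "\n")
```

Because the function always removes its own previous entry before appending the current one, running it twice yields the same file, which is the property the smoke test in the commit checks.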
Pre-existing format violation in the f4de501 commit caught by CI. Verb frozenset reformatted to one-element-per-line per ruff defaults. No semantic change; 11/11 preflight tests still pass.
The e2e harness writes a project-style settings.json to the test target (cwd=/tmp/desktop-clone) so Claude headless picks up the bicameral hooks. Pre-fix: only PostToolUse/Bash and SessionEnd were materialized; UserPromptSubmit (added in f4de501 + propagated to setup_wizard in 13312d4) was missing. Result: Flow 2 (preflight auto-fire on natural refactor request) and Flow 4 (in-session capture-corrections via preflight step 3.5) both fail with `expected preflight (auto-fired); saw: []` because the agent's default tool priority puts Bash/Glob ahead of preflight and nothing reorders it.

Fix: import `_BICAMERAL_PREFLIGHT_REMINDER_COMMAND` alongside the other two hook constants and add a UserPromptSubmit entry to the materialized settings dict. The console-script command resolves on PATH from the workflow's `pip install -e ".[test]"` step.

Single source of truth preserved: both real users (via setup_wizard) and the harness pull from the same constants.
…hes model
Claude Code 2.x silently drops the legacy top-level {"additionalContext": ...}
shape — the hook process runs and exits 0, but the system-reminder never
reaches the LLM. Wrap the payload in {"hookSpecificOutput": {"hookEventName":
"UserPromptSubmit", "additionalContext": ...}} per the current CLI contract.
Tests previously asserted against the broken shape (testing the hook against
itself rather than the CLI it must integrate with), which is why this slipped
through. They now assert the envelope shape, so a regression to the legacy
shape would fail loudly.
Verified live with `claude -p` + a real hook: agent now reads and acknowledges
the preflight system-reminder, where before it ignored it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
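A minimal sketch of the envelope fix, for readers who want to see the two shapes side by side. The `hookSpecificOutput` structure is the one quoted in the commit; everything else is an assumption made for illustration: the `prompt` key on the stdin payload, the `REMINDER_TEXT` placeholder, the stand-in `matches` predicate, and returning `{}` as the "empty response".

```python
import json

# Placeholder; the real reminder text lives in scripts/hooks/preflight_reminder.py.
REMINDER_TEXT = "Placeholder reminder text (hypothetical)."


def build_hook_output(additional_context: str) -> dict:
    """Wrap the reminder in the envelope the commit says the CLI expects."""
    return {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": additional_context,
        }
    }


def run_hook(raw_stdin: str, matches=lambda prompt: "refactor" in prompt.lower()) -> str:
    """Return the JSON text the hook would print.

    Fail-permissive: any malformed input yields an empty object instead of
    an exception, so the user's prompt is never blocked.
    """
    try:
        payload = json.loads(raw_stdin)
        prompt = payload.get("prompt", "")  # assumed field name
        if isinstance(prompt, str) and matches(prompt):
            return json.dumps(build_hook_output(REMINDER_TEXT))
    except (json.JSONDecodeError, AttributeError):
        pass
    return "{}"
```

A test that only checks "the hook printed `additionalContext` somewhere" would pass for both the legacy and the wrapped shape; asserting on the full key path, as the updated tests are described as doing, is what distinguishes them.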
…loop (Flow 2a)

The previous Flow 2 assertion required preflight + agent_session ingest + resolve_collision in a single test. After the auto-fire fix (a few commits back) preflight now genuinely fires, but the agent doesn't walk the preflight skill's Step 3.5 to invoke capture-corrections, so the refinement isn't captured and resolve_collision never runs. Two independent contracts were tangled into one verdict.

Split:
- Flow 2 (mcp_layer), auto-fire scope only: preflight fires on reorder.ts and precedes the first write op (Edit / Write / git commit). Reads are allowed in parallel (the agent legitimately fetches in parallel with preflight to keep latency reasonable). This is exactly what #146 promised.
- Flow 2a (agentic_layer, advisory), full correction-capture loop: same claude session (reuses Flow 2's transcript via the new `reuses_flow` field on FlowSpec, so no duplicate API call) but a different asserter, checking for agent_session ingest + resolve_collision. Currently FAILs because no skill instructs the agent to capture refinements when the user's prompt contradicts a surfaced decision. Tracked as P0 in #154.
- Flow 4: same root cause as Flow 2a (skill-walking gap on Step 3.5). Tagged with an advisory pointing at #154. Was already FAILing.

CI gate change: blocking_failures = FAIL/ERROR with no advisory text. Flows with an `advisory` field that fail surface loudly in the report (banner + ADVISORIES section) but do not red-light CI. This lets us keep running the gap assertions on every PR (so a silent close becomes visible) without making every PR also pay for the open gap.

Verified locally by replaying the asserter against the most recent CI transcript (commit 92525fa, run 25246398064): Flow 2 PASS, Flow 2a FAIL (advisory), Flow 4 FAIL (advisory). Lint + py_compile clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
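The gate rule "blocking_failures = FAIL/ERROR with no advisory text" can be sketched in a few lines. The `FlowResult` dataclass and the `exit_code` helper are hypothetical names introduced for this example; only the rule itself comes from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowResult:
    """Hypothetical record of one flow's outcome."""
    name: str
    verdict: str        # "PASS", "FAIL", or "ERROR"
    advisory: str = ""  # non-empty text marks a known, tracked gap


def blocking_failures(results: list) -> list:
    """Failures that should red-light CI: FAIL/ERROR with no advisory text."""
    return [
        r for r in results
        if r.verdict in {"FAIL", "ERROR"} and not r.advisory.strip()
    ]


def exit_code(results: list) -> int:
    """0 when every failure is advisory, 1 otherwise."""
    return 1 if blocking_failures(results) else 0
```

Applied to the replay reported in the commit (Flow 2 PASS, Flow 2a and Flow 4 failing with an advisory), this rule yields exit code 0, while the advisory failures remain available for a separate report section.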
Whitespace-only — formatter collapses three fits-on-one-line list comprehensions and two short return tuples that were unnecessarily wrapped. No behavioural change. Local check: pip install -e ".[test]" inside venv → both `ruff format --check .` (210 files already formatted) and `ruff check .` (all checks passed) clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 2, 2026
jinhongkuan pushed a commit that referenced this pull request on May 2, 2026

Cherry-picked from 1f54f1a, scope-narrowed to the surgical contribution. The original commit was authored against an older base where the e2e harness scaffold did not yet exist; this rebased version adds only the new logic on top of dev's existing harness.

What this commit adds:
- `tests/e2e/_ledger_helpers.py`: pure helper `count_agent_session_decisions(snapshot)`, extracted so unit tests can import without triggering the harness's top-level env-var / CLI guards.
- `tests/e2e/run_e2e_flows.py`:
  - `_count_agent_session_decisions(snapshot)`: thin wrapper around the helper that hides the import inside the harness.
  - `_validate_flow4_via_ledger()`: path-X-(b) post-hoc ledger query. Snapshots the ledger after the harness completes and counts decisions with `source_type='agent_session'`. Asserter FAIL + ledger has agent_session → UPGRADE to PASS with explicit annotation. Ledger error → INCONCLUSIVE (verdict unchanged). All five behavior-matrix cases documented in the docstring.
  - Invocation site: called once after `_validate_flow3_via_ledger` in `main()`, only when `dev_session` ran.
- `tests/test_flow4_ledger_validation.py`: five unit tests against the helper covering zero rows, error snapshot (None), agent_session presence, mixed source types, and empty decisions list.

Why this is decoupled from agent caprice: in-stream Flow 4 evidence requires the agent to invoke `bicameral.preflight` and walk Step 3.5 to trigger capture-corrections. Path-X-(b) validates the *product outcome* (decisions written with the canonical source_type) rather than the *mechanism* (which tool the agent chose). This means a SessionEnd subprocess effect that lands in the ledger after the parent stream-json closes still upgrades the verdict, even when the in-stream signal is absent.

Closes research-brief recommendation P0 #2.

Note: this commit replaces the original 1f54f1a SHA on the branch via rebase. Governance/META_LEDGER edits and the planning artifacts that were bundled with the original have been dropped here and will land via a separate governance PR. The auto-fire UserPromptSubmit hook (#146 fix) that was also bundled is shipping via #155.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
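The post-hoc ledger check can be illustrated with the sketch below. The snapshot layout (a dict with a `decisions` list whose rows carry `source_type`) and the `reconcile_verdict` name are assumptions for this example; the commit only states that a `None` snapshot signals a ledger error and that two of the five matrix cases are "FAIL + agent_session present upgrades to PASS" and "ledger error leaves the verdict unchanged". The remaining cases are assumed here to leave the verdict as it was.

```python
from typing import Optional


def count_agent_session_decisions(snapshot: Optional[dict]) -> Optional[int]:
    """Count decisions written with source_type 'agent_session'.

    Returns None when the snapshot itself is None (ledger error),
    so callers can tell "zero rows" apart from "could not read".
    """
    if snapshot is None:
        return None
    decisions = snapshot.get("decisions") or []
    return sum(1 for d in decisions if d.get("source_type") == "agent_session")


def reconcile_verdict(asserter_verdict: str, snapshot: Optional[dict]) -> str:
    """Upgrade an in-stream FAIL when the ledger shows the product outcome."""
    count = count_agent_session_decisions(snapshot)
    if count is None:
        return asserter_verdict  # inconclusive: leave the verdict unchanged
    if asserter_verdict == "FAIL" and count > 0:
        return "PASS"  # outcome landed in the ledger even without in-stream evidence
    return asserter_verdict
```

Checking the ledger rather than the transcript is what lets a write that lands after the stream closes still count, at the cost of no longer proving which tool produced it.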
5 tasks
9 tasks
jinhongkuan pushed a commit that referenced this pull request on May 3, 2026

Cherry-picked from 1f54f1a, scope-narrowed to the surgical contribution. The original commit was authored against an older base where the e2e harness scaffold did not yet exist; this rebased version adds only the new logic on top of dev's existing harness.

What this commit adds:
- `tests/e2e/_ledger_helpers.py`: pure helper `count_agent_session_decisions(snapshot)`, extracted so unit tests can import without triggering the harness's top-level env-var / CLI guards.
- `tests/e2e/run_e2e_flows.py`:
  - `_count_agent_session_decisions(snapshot)`: thin wrapper around the helper that hides the import inside the harness.
  - `_validate_flow4_via_ledger()`: path-X-(b) post-hoc ledger query. Snapshots the ledger after the harness completes and counts decisions with `source_type='agent_session'`. Asserter FAIL + ledger has agent_session → UPGRADE to PASS with explicit annotation. Ledger error → INCONCLUSIVE (verdict unchanged). All five behavior-matrix cases documented in the docstring.
  - Invocation site: called once after `_validate_flow3_via_ledger` in `main()`, only when `dev_session` ran.
- `tests/test_flow4_ledger_validation.py`: five unit tests against the helper covering zero rows, error snapshot (None), agent_session presence, mixed source types, and empty decisions list.

Why this is decoupled from agent caprice: in-stream Flow 4 evidence requires the agent to invoke `bicameral.preflight` and walk Step 3.5 to trigger capture-corrections. Path-X-(b) validates the *product outcome* (decisions written with the canonical source_type) rather than the *mechanism* (which tool the agent chose). This means a SessionEnd subprocess effect that lands in the ledger after the parent stream-json closes still upgrades the verdict, even when the in-stream signal is absent.

Closes research-brief recommendation P0 #2.

Note: this commit replaces the original 1f54f1a SHA on the branch via rebase. Governance/META_LEDGER edits and the planning artifacts that were bundled with the original have been dropped here and will land via a separate governance PR. The auto-fire UserPromptSubmit hook (#146 fix) that was also bundled is shipping via #155.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 8af60f3)
3 tasks
Knapp-Kevin pushed a commit to Knapp-Kevin/bicameral-mcp that referenced this pull request on May 21, 2026

The UserPromptSubmit hook installed by BicameralAI#146/BicameralAI#155 told the agent to call bicameral.preflight "Before invoking any file-inspection tool (Read, Grep, Bash, Glob)". That short-circuited the caller-LLM discovery the rest of the contract depends on:
- bicameral.preflight uses `file_paths` for region-anchored binds_to lookup (the precision channel). Empty file_paths drops to fuzzy text-similarity over decision descriptions.
- The user often names a *feature* ("the reorder feature") rather than a *file* (`reorder.ts`). The caller LLM has to do that mapping; it's the semantic half of "selection before generation."
- But to do the mapping it needs Read / Grep / Glob, which the old reminder forbade.

Symptom on PR BicameralAI#168 / BicameralAI#165 e2e: the agent fired preflight with empty file_paths because it had no chance to inspect the codebase first. The server returned weak / no surfaced decisions. The Flow 2 asserter failed (file_paths=[]); Flow 2a cascaded (no surfaced decisions to capture from).

Reconcile with BicameralAI#146 by gating on the right line:
- Read / Grep / Glob FIRST (discovery: the caller LLM resolves the user's request to concrete file paths).
- bicameral.preflight(topic, file_paths), fed by step 1.
- Write ops (Edit / Write / NotebookEdit / mutating Bash): preflight must precede the first one.

This is the contract assert_flow_2 has *already* been gating; only the hook reminder was misaligned.

Files:
- scripts/hooks/preflight_reminder.py: REMINDER_TEXT rewrite + docstring documenting the reconciliation with BicameralAI#146
- skills/bicameral-preflight/SKILL.md: Step 2 strengthened to "Discover first, then preflight"; file_paths is the precision channel, omit only for genuinely abstract queries
- tests/test_preflight_hook.py: new test_reminder_gates_writes_not_discovery asserts the new posture (positive: "Read-only discovery FIRST", "BEFORE any write op"; negative: must NOT contain the old "before any file-inspection tool" phrasing)

The Flow 2 asserter is unchanged: it has always gated writes, not reads (see lines 763-766: "Read is deliberately allowed before/in-parallel-with preflight"). This PR aligns the hook reminder with what the asserter already requires.
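The "gate writes, not discovery" contract described above can be expressed as a small ordering check over a transcript's tool calls. This is a sketch under stated assumptions, not the real assert_flow_2: the transcript is modelled as a list of dicts with `name` and `input` keys, the tool name `bicameral.preflight` is used as written in the commit message, and mutating Bash commands are left out because detecting them needs command parsing.

```python
# Write tools named in the commit message (mutating Bash is omitted in this sketch).
WRITE_TOOLS = frozenset({"Edit", "Write", "NotebookEdit"})
PREFLIGHT_TOOL = "bicameral.preflight"  # assumed transcript name


def preflight_precedes_first_write(tool_calls: list) -> bool:
    """True when preflight, fed with file paths, comes before any write op.

    Read-only discovery calls (Read, Grep, Glob) are skipped over, so they
    may appear before preflight without failing the check.
    """
    for call in tool_calls:
        name = call.get("name", "")
        if name == PREFLIGHT_TOOL:
            # An empty file_paths list is the failure mode described above.
            return bool(call.get("input", {}).get("file_paths"))
        if name in WRITE_TOOLS:
            return False  # a write happened before any preflight call
    return False  # preflight never fired
```

For example, the sequence Read, Grep, preflight with `file_paths=["reorder.ts"]`, Edit satisfies the check, while preflight with an empty `file_paths` followed by Edit does not.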
Summary
Clean cherry-pick of the auto-fire fix from PR #151 onto a fresh `dev` base, without the bundled governance/Merkle-ledger commit (`3f856af`) that was producing rebase conflicts on `docs/META_LEDGER.md`, `docs/SYSTEM_STATE.md`, and `.gitignore`. The governance cleanup is conceptually independent and should land as its own PR where the qor-logic Merkle chain can be resolved deliberately.

Closes #146 (preflight does not auto-fire on natural refactor prompts).

What's in this PR

Seven commits, all scoped to the auto-fire mechanism:

1. `fix(skill): resolve preflight auto-fire failure on natural refactor prompts (#146)`: adds `scripts/hooks/preflight_intent.py` (verb-list classifier) + `scripts/hooks/preflight_reminder.py` (UserPromptSubmit hook entry point), wires `.claude/settings.json`, and adds a `### Hook reinforcement` subsection to `skills/bicameral-preflight/SKILL.md`.
2. `fix(setup): install preflight UserPromptSubmit hook for end users`: adds the `bicameral-mcp-preflight-reminder` console script in `pyproject.toml` and wires it into `setup_wizard.py` so fresh installs get the hook.
3. `style: ruff format scripts/hooks/preflight_intent.py`
4. `fix(e2e): materialize UserPromptSubmit hook into test target settings`: the e2e harness materializes the same hook config a real install would have.
5. `fix(hook): emit hookSpecificOutput envelope so additionalContext reaches model`: Claude Code 2.x silently drops the legacy top-level `{additionalContext: ...}` shape; the hook now emits `{hookSpecificOutput: {hookEventName: "UserPromptSubmit", additionalContext: ...}}`.
6. `test(e2e): split Flow 2 into auto-fire (Flow 2) + correction-capture loop (Flow 2a)`: narrows Flow 2 to the auto-fire scope (precedes write op), adds Flow 2a as advisory for the full correction-capture loop tracked in #154 ([P0] Preflight skill does not instruct agent to capture refinements when user prompt contradicts surfaced decisions), and gates the CI exit code on non-advisory failures only.
7. `style: ruff format tests/e2e/run_e2e_flows.py`

What was DROPPED (compared to #151)

- `3f856af chore(governance): v0 process cleanup`: entire commit excluded. Re-open as its own PR.
- `e769eec Merge branch 'dev' into claude/peaceful-bell-12b5e8`: merge commit, redundant on a fresh-from-dev branch.
- `docs/META_LEDGER.md` edits from `f4de501`: Merkle-chain audit trail, conflicted with dev's parallel cleanup. Should land via the governance PR.
- `docs/SYSTEM_STATE.md` edits from `f4de501` and `13312d4`: same reason.
- `plan-preflight-autofire-hook.md`: qor-logic planning artifact; should land via the governance PR.

What was MERGED carefully

- `skills/bicameral-preflight/SKILL.md`: dev had added a `## Telemetry` section in the same region where `f4de501` added `### Hook reinforcement`. Both are kept, ordered as Hook reinforcement → Telemetry (the trigger discussion continues first, then the instrumentation interlude, then Steps).

Validation

- `ruff format --check .` clean (210 files)
- `ruff check .` clean
- `tests/test_preflight_hook.py`: 5/5 PASS
- Asserter replayed against the most recent CI transcript (commit `92525fa`, run `25246398064`): Flow 2 PASS, Flow 2a FAIL (advisory → non-blocking), Flow 4 FAIL (advisory → non-blocking). CI exit code: 0.

Test plan

- `ruff + mypy` passes
- `e2e assertions (auto)` passes (advisory failures from Flow 2a / Flow 4 do not red-light CI per the new gate logic)
- `MCP Regression Suite` (ubuntu + windows) passes
- `bicameral_preflight` preceding any `Edit`

Related
🤖 Generated with Claude Code