feat(preflight): eliminate silent graph-expansion fallbacks (#243) by silongtan · Pull Request #294 · BicameralAI/bicameral-mcp

silongtan · 2026-05-10T04:33:27Z

Summary

Closes #243 (P0). Two-piece fix on top of #173/#174's graph-expansion work.

PR #174 closed the recall ceiling but introduced two silent fallback paths in `_region_anchored_preflight`: when `ctx.code_graph` was absent or when the expander raised, the response shape was byte-identical to "expansion ran and matched zero", so the caller couldn't tell that recall was degraded. `RealCodeLocatorAdapter` already raised a loud `RuntimeError` at the adapter layer, but the preflight handler swallowed it at DEBUG.

This PR makes both ends loud:

Piece A (commit 3c9730f): handler-level loud signal — sources_chained tag + WARN log + telemetry counter.
Piece B (commit d136637): server boot refuses to start with a broken index — fail-loud at startup so silent fallback can't accumulate hours of degraded recall in production.

Phase 2 spec was posted on #243 for signoff before any code landed (per Kevin's #87 ruling). All four open questions defaulted to recommended.

Piece A — Loud handler-level fallback signal

`handlers/preflight.py` now distinguishes three fallback reasons (previously conflated into a single `if expander is not None:` skip):

| Code path | New `fallback_reason` | Signals fired |
|---|---|---|
| `ctx.code_graph is None` | `"absent"` | response tag + telemetry |
| `code_graph` set but no `expand_file_paths_via_graph` | `"missing_method"` | response tag + telemetry |
| expander raised | `"exception:<type>"` | response tag + telemetry + WARN log |
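A minimal sketch of how the three reasons in the table could be told apart. The helper names `classify_expander` and `expand_or_fallback` are hypothetical, and the sketch assumes the expander is a plain synchronous callable; the actual handler code may be structured differently.

```python
from typing import Any, Callable, List, Optional, Tuple


def classify_expander(
    code_graph: Any,
) -> Tuple[Optional[Callable[..., Any]], Optional[str]]:
    """Return (expander, fallback_reason); exactly one of the two is None."""
    if code_graph is None:
        return None, "absent"
    expander = getattr(code_graph, "expand_file_paths_via_graph", None)
    if not callable(expander):
        return None, "missing_method"
    return expander, None


def expand_or_fallback(
    code_graph: Any, file_paths: List[str]
) -> Tuple[List[str], Optional[str]]:
    """Return (paths, fallback_reason). On any fallback the input paths come back unchanged."""
    expander, reason = classify_expander(code_graph)
    if expander is None:
        return list(file_paths), reason
    try:
        return list(expander(file_paths)), None
    except Exception as exc:  # hypothetical: any expander failure counts as a fallback
        return list(file_paths), f"exception:{type(exc).__name__}"
```

A `None` reason means expansion actually ran, which is what keeps "ran and matched zero" distinguishable from "never ran".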

Three additive signals fire when any of the above occurs (the WARN log applies to the exception case only):

1. Response field: `sources_chained` includes `"graph_unavailable"`. Additive (never replaces existing `"region"` / `"graph"` tags). Bare tag per signoff Q2: the granular reason flows through telemetry, not the response shape, which keeps the response stable.
2. Log level: the exception case is bumped from `logger.debug` → `logger.warning`, with a stable `[preflight:fallback]` substring + exception type for grep-friendly production logs.
3. Telemetry counter: new `preflight_telemetry.write_fallback_event(reason, session_id)`, modeled on `write_ingest_refusal_event` (#216). Emits a `graph_expansion_fallback` row to `~/.bicameral/preflight_events.jsonl`. Gated on `BICAMERAL_TELEMETRY=preflight`.
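For illustration, the three signals could be emitted roughly as below. This is a sketch under stated assumptions, not the shipped code: the row fields (`event`, `reason`, `session_id`, `ts`), the exact-match check on the environment variable, the `path` parameter, and the `signal_fallback` helper are all hypothetical.

```python
import json
import logging
import os
import time
from pathlib import Path
from typing import List

logger = logging.getLogger("preflight")
EVENTS_PATH = Path.home() / ".bicameral" / "preflight_events.jsonl"


def write_fallback_event(reason: str, session_id: str, path: Path = EVENTS_PATH) -> bool:
    """Append one JSONL row; return False when telemetry is not enabled."""
    # Assumption: the gate is an exact match on the env var value.
    if os.environ.get("BICAMERAL_TELEMETRY") != "preflight":
        return False
    row = {
        "event": "graph_expansion_fallback",
        "reason": reason,
        "session_id": session_id,
        "ts": time.time(),  # hypothetical field
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(row) + "\n")
    return True


def signal_fallback(
    sources_chained: List[str], reason: str, session_id: str, path: Path = EVENTS_PATH
) -> List[str]:
    """Tag the response, warn on the exception case, and record telemetry."""
    if "graph_unavailable" not in sources_chained:
        sources_chained.append("graph_unavailable")  # additive, never replaces tags
    if reason.startswith("exception:"):
        logger.warning("[preflight:fallback] graph expansion failed: %s", reason)
    write_fallback_event(reason, session_id, path)
    return sources_chained
```

Keeping the reason out of `sources_chained` matches the bare-tag decision: callers see one stable tag, while the JSONL row carries the detail.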

Skill update (skills/bicameral-preflight/SKILL.md) renders a one-line recall-degraded note to the agent when the tag is present:

Note: structural-neighbor lookup was unavailable this call — recall may be reduced until the symbol index is rebuilt. Decisions bound to files that import these may not have surfaced.

Piece B — Eager startup init + fail-loud

adapters/code_locator.py:

Singleton-by-REPO_PATH cache via _INSTANCE_CACHE. Path.resolve() on the key so symlink + relative-path callers cache-hit consistently. Multi-repo correctness preserved (per signoff Q1).
New reset_code_locator_cache() test-only hook, mirroring adapters.ledger.reset_ledger_singleton.
New async def initialize() wraps sync _ensure_initialized() in loop.run_in_executor(None, ...) so cold-init doesn't block the event loop. Idempotent on already-initialized adapters.
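The adapter changes above can be sketched as follows. `LocatorSketch` is a hypothetical stand-in for `RealCodeLocatorAdapter`, the `index_ok` flag simulates the index check, and `get_code_locator` takes the repo path as an explicit argument here only to keep the example self-contained (the real function is called without arguments).

```python
import asyncio
from pathlib import Path
from typing import Dict


class LocatorSketch:
    """Hypothetical stand-in for RealCodeLocatorAdapter."""

    def __init__(self, repo_path: str, index_ok: bool = True) -> None:
        self.repo_path = repo_path
        self._index_ok = index_ok  # simulates "index present and non-empty"
        self._initialized = False

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return  # idempotent on already-initialized adapters
        if not self._index_ok:
            raise RuntimeError("symbol index is empty")
        self._initialized = True

    async def initialize(self) -> None:
        # Run the blocking init in the default executor so the event loop stays free.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, self._ensure_initialized)


_INSTANCE_CACHE: Dict[str, LocatorSketch] = {}


def get_code_locator(repo_path: str) -> LocatorSketch:
    # Resolve first so symlinked and relative spellings share one cache key.
    key = str(Path(repo_path).resolve())
    if key not in _INSTANCE_CACHE:
        _INSTANCE_CACHE[key] = LocatorSketch(key)
    return _INSTANCE_CACHE[key]


def reset_code_locator_cache() -> None:
    """Test-only hook: drop every cached adapter."""
    _INSTANCE_CACHE.clear()
```

An exception raised inside the executor is re-raised at the `await`, which is why the `RuntimeError` from `_ensure_initialized()` still reaches the caller.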

server.py:serve_stdio():

Calls await get_code_locator().initialize() between dashboard sidecar start and consent-notice block.
Fail-loud per signoff Q3 — explicit except RuntimeError as exc: re-raises after printing an actionable stderr message ("Run: python -m code_locator index <repo>"). Outer try/finally still runs SERVER_SHUTDOWN audit emit.

Files

| File | Δ | Role |
|---|---|---|
| `handlers/preflight.py` | +70 / −22 | Three-reason classifier, loud signals, `sources_chained` tag |
| `preflight_telemetry.py` | +39 | `write_fallback_event(reason, session_id)` |
| `skills/bicameral-preflight/SKILL.md` | +15 | `graph_unavailable` agent-facing render |
| `adapters/code_locator.py` | +50 | Singleton cache, reset hook, `async initialize()` |
| `server.py` | +28 | Eager startup hook, fail-loud on `RuntimeError` |
| `tests/test_preflight_graph_expansion.py` | +330 | 8 new tests (4 Piece A + 4 Piece B) |
| `CHANGELOG.md` | +2 | Unreleased entries (Added + Changed) |

Tests

| # | Test | Piece |
|---|---|---|
| 1 | `test_preflight_fallback_absent_code_graph_tags_graph_unavailable` | A |
| 2 | `test_preflight_fallback_expander_raises_warns_and_tags` (asserts WARN log via `caplog`) | A |
| 3 | `test_preflight_successful_expansion_does_not_tag_graph_unavailable` (regression guard) | A |
| 4 | `test_preflight_empty_file_paths_does_not_tag_graph_unavailable` (distinguishes never-attempted from attempted-and-fell-back) | A |
| 5a | `test_get_code_locator_returns_same_instance_per_repo_path` (singleton + reset across two REPO_PATHs) | B |
| 5b | `test_initialize_succeeds_when_index_present` (idempotent on already-initialized) | B |
| 6 | `test_initialize_fails_loudly_when_index_empty` (RuntimeError propagates through async wrapper) | B |
| 7 | `test_serve_stdio_refuses_boot_on_empty_index` (boot-path level: empty index aborts boot) | B |

Existing tests use containment assertions ("region" in sources_chained) not exact list equality, so the additive "graph_unavailable" tag won't break them.
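A small illustration of why the additive tag is safe for containment-style assertions. The two lists are made-up example values, not captured responses.

```python
# Hypothetical sources_chained values before and after this PR on a fallback call.
old_shape = ["region", "graph"]
new_shape = ["region", "graph", "graph_unavailable"]

# Containment assertions, as used by the existing tests, still hold.
assert "region" in new_shape
assert "graph" in new_shape

# An exact-equality assertion against the old shape would have broken.
assert new_shape != old_shape
```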

Local verification

✅ ruff check + format + mypy all green on touched files
✅ Singleton + reset_code_locator_cache smoke test (4 assertions: cache hit, distinct on new path, fresh after reset, second call cached again)
✅ Async initialize() smoke test (re-raises stubbed RuntimeError; idempotent no-op on _initialized=True adapter)
✅ bicameral.link_commit clean on both commits — 0 drift, 0 pending checks
⏳ Full ledger-touching test run pending CI (4 of 8 new tests need surrealdb via the integration_env fixture)

Refs

Closes #243 (P0). Parent: #173 / PR #174. Plan signoff via issue-243 comment.

🤖 Generated with Claude Code

PR #174 closed the recall ceiling but introduced two silent fallback paths in `_region_anchored_preflight`: when `ctx.code_graph` was absent OR when the expander raised, the response shape was byte-identical to "expansion ran and matched zero", so the caller couldn't tell recall was degraded.

Three additive signals now surface every fallback (per Phase 2 spec posted on #243, all four open questions defaulted to recommended):

1. Response field: `sources_chained` includes `"graph_unavailable"`. Additive (never replaces existing `"region"` / `"graph"` tags). Bare tag: the granular reason flows through telemetry, not the response shape, per signoff Q2.
2. Log level: exception case bumped from `logger.debug` → `logger.warning` with stable `[preflight:fallback]` substring + exception type for grep-friendly production logs.
3. Telemetry counter: new `preflight_telemetry.write_fallback_event(reason, session_id)` modeled on `write_ingest_refusal_event` (#216). Emits a `graph_expansion_fallback` row to the existing `~/.bicameral/preflight_events.jsonl` substrate. Reasons are a controlled enum: `"absent"`, `"missing_method"`, `"exception:<type>"`. Gated on `BICAMERAL_TELEMETRY=preflight`.

The fallback case classifier in `_region_anchored_preflight` distinguishes three reasons (was conflated into a single `if expander is not None:` skip in the pre-#243 code):

- `code_graph is None` → "absent"
- `code_graph` set but no `expand_file_paths_via_graph` → "missing_method"
- expander raised → "exception:<type>"

Skill update (`skills/bicameral-preflight/SKILL.md`) renders a one-line recall-degraded note to the agent when the tag is present:

> Note: structural-neighbor lookup was unavailable this call —
> recall may be reduced until the symbol index is rebuilt. Decisions
> bound to files that import these may not have surfaced.

The skill treats `"graph_unavailable"` as advisory: it doesn't block the preflight surface, and direct-pin matches are unaffected.

Tests
-----

4 new cases in `tests/test_preflight_graph_expansion.py`:

- test_preflight_fallback_absent_code_graph_tags_graph_unavailable: ctx with code_graph=None → response carries the tag, telemetry counter reason="absent"
- test_preflight_fallback_expander_raises_warns_and_tags: stub expander raises RuntimeError → response carries the tag, `caplog` captures WARN-level log with `[preflight:fallback]` substring, telemetry counter reason="exception:RuntimeError"
- test_preflight_successful_expansion_does_not_tag_graph_unavailable: regression guard; the clean expansion path must NOT carry the tag (no false alarms)
- test_preflight_empty_file_paths_does_not_tag_graph_unavailable: empty file_paths short-circuits before the expansion check; the "expansion was never attempted" case is distinguishable from "attempted-and-fell-back"

Existing tests use containment assertions (`"region" in sources_chained`) not exact list equality, so additive `"graph_unavailable"` doesn't break them.

What's NOT in this PR
---------------------

Piece B (eager symbol-index initialization at server startup) is the follow-up commit on this branch. Lands separately so the response-shape change can ship without the adapter-lifecycle change. After both pieces land, the telemetry counter shipped here gives ongoing visibility into how often fallback engages in production.

Refs #243 (parent #173 / PR #174). Plan signoff via #243 (comment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…Piece B)

Pre-fix, the code-locator adapter had two cooperating problems that made silent fallback the default:

1. `get_code_locator()` returned a FRESH `RealCodeLocatorAdapter` per call. Caching was absent.
2. `_ensure_initialized()` was lazy: the first tool call paid the index-build cost AND could race the index check on concurrent dispatch (e.g. preflight + bind landing in parallel after server boot).

Together: every silent fallback in the production runtime was "hot" because the adapter was being rebuilt + rechecked on every call. Piece A (#283 commit 3c9730f) made the fallback loud at the response layer; Piece B closes the upstream cause.

Three changes
-------------

adapters/code_locator.py

- Singleton-by-REPO_PATH cache via `_INSTANCE_CACHE: dict[str, RealCodeLocatorAdapter]`. Path resolved through `Path.resolve()` so symlink + relative-path callers cache-hit consistently. Multi-repo correctness preserved (any test that swaps REPO_PATH mid-process gets a fresh adapter for the new path).
- New `reset_code_locator_cache()` test-only hook, mirroring `adapters.ledger.reset_ledger_singleton`.
- New `async def RealCodeLocatorAdapter.initialize()`: wraps sync `_ensure_initialized()` in `loop.run_in_executor(None, ...)` so the cold-init path doesn't block the event loop. Idempotent on already-initialized adapters.

server.py

- `serve_stdio()` calls `await get_code_locator().initialize()` between the dashboard sidecar start and the consent-notice block.
- **Fail-loud per #243 phase-2 signoff Q3**: explicit `except RuntimeError as exc:` re-raises after printing an actionable stderr message (`"Run: python -m code_locator index <repo>"`). The outer try/finally still runs the `SERVER_SHUTDOWN` audit emit, so operators get a clean event AND a clear actionable error. No more silent degradation.

tests/test_preflight_graph_expansion.py (4 new tests)

- test_get_code_locator_returns_same_instance_per_repo_path (singleton + reset behavior across two REPO_PATHs)
- test_initialize_succeeds_when_index_present (idempotent on already-initialized adapter)
- test_initialize_fails_loudly_when_index_empty (RuntimeError from `_ensure_initialized` propagates through the async wrapper and doesn't get swallowed)
- test_serve_stdio_refuses_boot_on_empty_index (boot-path level: with everything else stubbed healthy, an empty index aborts `serve_stdio()` with the expected RuntimeError)

Local smoke tests
-----------------

- Singleton + reset_code_locator_cache: 4 assertions pass (cache hit on same path, distinct instance on new path, fresh after reset, second call after reset stays cached)
- Async `initialize()`: re-raises RuntimeError on stubbed `_ensure_initialized` failure; idempotent no-op on already-initialized adapter
- ruff check + ruff format --check + mypy all green on touched files

What's NOT in this PR
---------------------

Nothing. Piece A (commit 3c9730f) and Piece B (this commit) together close #243's full scope. PR will open with both pieces. Telemetry counter shipped in Piece A gives ongoing production visibility into how often fallback engages post-merge.

Refs #243 (parent #173 / PR #174). Plan signoff via #243 (comment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-10T04:33:35Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



silongtan and others added 2 commits May 9, 2026 22:29

silongtan temporarily deployed to ci-test May 10, 2026 04:33 — with GitHub Actions Inactive

silongtan temporarily deployed to production May 10, 2026 04:33 — with GitHub Actions Inactive

silongtan had a problem deploying to recording-approval May 10, 2026 04:33 — with GitHub Actions Failure

jinhongkuan merged commit 119cd89 into dev May 10, 2026
8 of 9 checks passed

silongtan mentioned this pull request May 11, 2026

M6 preflight handler retrieval: by-design split (handler structural, skill-layer covered by #306) #58

Closed

silongtan deleted the 243-preflight-eliminate-fallbacks branch May 16, 2026 02:35

jinhongkuan mentioned this pull request May 16, 2026

release: v0.15.0 — PII archive, hard-delete remove_decision, schema v17→v24 chain #388

Merged

9 tasks
