feat(v0.4.23): caller-LLM-driven ingest retrieval + search_hint recall booster #33
Addresses the BM25 vocab-mismatch problem that surfaced after v0.4.20's honest PENDING projection. Decisions whose description didn't lexically overlap with real code identifier vocabulary were binding to whatever file incidentally shared a keyword.

Two changes, both within the deterministic-retrieval moat:

**Lever 1: caller-LLM retrieval is the new default**
- skills/bicameral-ingest/SKILL.md restructured: Step 2 now instructs the caller LLM to use validate_symbols + search_code + get_neighbors to resolve explicit code_regions BEFORE ingesting. Step 3 leads with internal format (explicit regions) as preferred; natural format is the fallback for truly abstract decisions.
- No server code changes: the server already accepted internal-format payloads. The skill just stops discouraging that path.

**Lever 2: search_hint BM25 recall booster**
- IngestMapping.search_hint, IngestDecision.search_hint (both optional). Query-only metadata: synonyms, domain vocab, likely identifier names the description wouldn't contain literally.
- adapters.code_locator.ground_mappings concatenates "description search_hint" as the BM25 query when the hint is non-empty. Strictly additive: omitted hint = pre-v0.4.23 behavior.
- search_hint never lands on intent.description; never surfaces in briefs, status, or gap-judge. Humans see clean decision text; BM25 sees the widened query.

**Guarantee preserved**: retrieval remains deterministic at runtime. Caller LLM does the expensive lookup at ingest time (when it has full codebase context); server-side BM25 fallback is only consulted for abstract decisions. Tech moat intact.

Tests: 8 new in test_v0423_search_hint.py covering propagation through _normalize_payload, BM25 query construction, and backward compatibility (no hint = bare description). Full regression: 34/34 pass across natural-format + L1-wiring + resolve_compliance + search_hint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VERSION: 0.4.22 → 0.4.23
RECOMMENDED_VERSION: 0.4.22 → 0.4.23

CHANGELOG entry covers the skill-level flip (caller-LLM resolves code_regions explicitly) and the search_hint recall booster, plus the upgrade note on pre-existing false-positive bindings persisting until bicameral.reset + re-ingest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🧹 Nitpick comments (2)

skills/bicameral-ingest/SKILL.md (1)

161-279: Skill guidance reads well; grounds the Lever 1 / Lever 2 split clearly.

One micro-nit (non-blocking): the example at line 232 puts `search_hint` alongside explicit `code_regions`, while the contract doc and this skill both note `search_hint` is only consulted when `code_regions` is empty. The example's own prose at lines 277-278 correctly says to treat it as a safety net for future re-grounding, but a casual reader may miss that and assume it's active. Consider adding a one-line parenthetical to the code block, e.g. `// optional, consulted only if code_regions is ever cleared / re-grounded`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/bicameral-ingest/SKILL.md` around lines 161 - 279, Add a clarifying parenthetical to the internal-format example to indicate that search_hint is only used when code_regions is empty or re-grounded; update the example block that contains mappings[].code_regions and the search_hint field (the payload example with mappings, symbols, code_regions, and search_hint) to include a one-line note such as "(optional, consulted only if code_regions is later cleared or during re-grounding)" beside the search_hint so readers don't assume it is active when explicit code_regions are provided.

adapters/code_locator.py (1)
312-324: Stage-2 fused re-search drops `search_hint`.

Inside `_ground_single`, when Stage 1 fuzzy-matching produced symbols, line 319 re-invokes `self.search_code(description, symbol_ids=...)` with the bare `description` rather than the widened `bm25_query` used at the top of `ground_mappings`. The result is that the `search_hint` recall booster is applied only to the initial BM25 pass (whose hits are then ignored down this branch because `fused` is replaced) and not to the symbol-seeded RRF fusion re-search, which is exactly the path `search_hint` would most plausibly help.

Consider threading the same widened query down:

♻️ Proposed fix

     def _ground_single(
         self,
         description: str,
         db,
         max_files: int,
         fuzzy_threshold: int,
         max_symbols: int,
         hits: list[dict] | None = None,
         mapping_symbol_names: list[str] | None = None,
    +    bm25_query: str | None = None,
     ) -> list[dict]:
    @@
    +    search_query = bm25_query or description
    @@
         if matched_ids:
    -        fused = self.search_code(description, symbol_ids=sorted(matched_ids))
    +        fused = self.search_code(search_query, symbol_ids=sorted(matched_ids))
         else:
             fused = hits

And pass `bm25_query=bm25_query` from the `_ground_single` call site in `ground_mappings`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@adapters/code_locator.py` around lines 312 - 324, _stage-2 fused re-search in _ground_single calls self.search_code(description, symbol_ids=...) and thus drops the widened bm25_query (and associated search_hint) used higher in ground_mappings; modify the code so the same bm25_query is threaded through: update the call sites so ground_mappings passes bm25_query into _ground_single, and inside _ground_single call self.search_code(..., symbol_ids=sorted(matched_ids), bm25_query=bm25_query) (or pass along whatever parameter name search_code expects) so the search_hint/boosting is applied to the symbol-seeded RRF re-search as well.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a7887c10-9417-4dca-87e2-81388ac45d3a
📒 Files selected for processing (8)
- CHANGELOG.md
- RECOMMENDED_VERSION
- adapters/code_locator.py
- contracts.py
- handlers/ingest.py
- pyproject.toml
- skills/bicameral-ingest/SKILL.md
- tests/test_v0423_search_hint.py
Three-round audit cycle (VETO -> VETO -> PASS) for Notion ingest + cache contract migration.

Plan ships across five phases:
- Phase 0: cache contract migration (schema v1->v2, schema_version table, callable migration dispatch, upsert_canonical_extraction)
- Phase 0.5: worker-task lifecycle pattern + Slack reference wiring (closes the v0 dormant-Slack-worker gap)
- Phase 1: Notion API client + property serializer (internal-integration auth, no OAuth router)
- Phase 2: Notion ingest worker (per-database watermark, peer-authored team_event)
- Phase 3: Notion task registration on lifespan

META_LEDGER entries #29-#33 capture: round-1 VETO (4 missing/undeclared symbols), round-2 VETO (1 wrong-call-shape for decrypt_token), round-3 PASS, IMPLEMENT, and SUBSTANTIATION.

SHADOW_GENOME #7 addendum extends the PARALLEL_STRUCTURE_ASSUMED detection heuristic with three new in-sketch checks: signature, type-boundary, helper-symmetry. The two VETOs in this session are the empirical justification.

SYSTEM_STATE.md adds the Priority C v1 section: schema state (v2), architectural properties achieved, audit cycle outcomes, implementation deviations from plan.

Merkle seal: SHA256(content_hash + previous_hash) = dcb619104e6d88b97a04689093b80b9f03825f9a24bac3c3b9ab3d0107ff24d7
(content_hash 9f003c40..., previous_hash 6f4f8f8f... = Priority C v0 SEAL at Entry #28).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First-round PASS audit cycle for the real heuristic+LLM extractor.

Plan ships across six phases:
- Phase 0: cache contract evolution
- Phase 1: deterministic Stage 1 classifier
- Phase 2: trigger rules schema
- Phase 3: real Anthropic SDK Stage 2
- Phase 4: pipeline integration
- Phase 5: corpus learner option-c

META_LEDGER entries #34-#36 capture: round-1 PASS audit, IMPLEMENT, and SUBSTANTIATION. Three audit advisories (extract() boundary, TeamServerRules typo, corpus learner table-source) all addressed inline during implementation.

A proactive QorLogic Fixer code-quality sweep before commit produced 2 MED + 2 LOW findings; both MEDs landed (fail-soft on non-text content blocks; v2->v3 backfill integration test) with one surfacing a real defect (the migration's TYPE string was rejecting reads on pre-v3 rows with NONE classifier_version; corrected to TYPE option<string>).

SYSTEM_STATE.md adds the Priority C v1.1 section: schema state (v4), architectural properties achieved (heuristic-first determinism + LLM-only-when-needed + rule-version-driven cache invalidation + all four "dynamic" angles wired), audit cycle outcomes.

Merkle seal: SHA256(content_hash + previous_hash) = b37003661820e2ef80591b9d0cfdeac3df092d6d9b4b5d87e3036e7ccf37d95b
(content_hash e8b1b6b6..., previous_hash dcb61910... = Priority C v1 SEAL at Entry #33).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Fixes the BM25 vocab-mismatch problem that became visible after v0.4.20
made grounding status honest. Ingest-time retrieval now defaults to
caller-LLM-resolved explicit `code_regions` via the existing MCP
retrieval tools; server-side BM25 becomes the fallback for abstract
decisions. For the fallback path, a new optional `search_hint` field
widens the BM25 query with caller-supplied synonyms.
Both changes stay within the existing deterministic-retrieval tech moat:
the server still makes no LLM calls at runtime.
Motivation
Real failure from 2026-04-20: 12 dispatcher-subscription-status decisions
ingested against an Accountable branch. BM25 bound every "dispatch"
decision to `use-toast.ts:dispatch` (React toast reducer), "active"
decisions to `AcquisitionFunnel.tsx:ActiveUser`, and "source" decisions
to `PresentationSlideshow.tsx:SlideSource`. Under v0.4.19 these would
have silently auto-promoted to REFLECTED; under v0.4.20 they stay at
PENDING forever (honest but unactionable).
Changes
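Lever 1: caller-LLM retrieval is the new default
- `skills/bicameral-ingest/SKILL.md` Step 2 now instructs the caller
  LLM to use `validate_symbols` + `search_code` + `get_neighbors` to
  resolve explicit `code_regions` BEFORE ingesting.
- Step 3 leads with internal format (explicit regions) as preferred;
  natural format is the fallback for abstract decisions.
- No server code changes: the server already accepted internal-format
  payloads.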
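
As a rough illustration of the Lever 1 flow, the sketch below assembles a payload in which one decision carries explicit regions and another is left for the server-side fallback. Only the field names `mappings`, `description`, `code_regions`, and `search_hint` come from this PR; the region keys (`file_path`, `symbol`, `start_line`, `end_line`), the helper names, and all sample values are assumptions made up for the example, not the actual contract.

```python
from typing import Any, Optional


def build_mapping(
    description: str,
    code_regions: list[dict[str, Any]],
    search_hint: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble one mapping entry (illustrative shape, not the real contract)."""
    mapping: dict[str, Any] = {
        "description": description,
        "code_regions": list(code_regions),
    }
    # The hint is optional; leave the key out entirely when it is not supplied.
    if search_hint:
        mapping["search_hint"] = search_hint
    return mapping


def needs_server_fallback(mapping: dict[str, Any]) -> bool:
    """True when no explicit regions were resolved before ingest."""
    return not mapping.get("code_regions")


# Hypothetical region, as if the caller LLM had resolved it before ingesting.
resolved = build_mapping(
    "Dispatchers with a lapsed subscription lose dispatch access",
    code_regions=[{
        "file_path": "src/example/subscription.py",  # placeholder path
        "symbol": "SubscriptionStatus",              # placeholder symbol
        "start_line": 10,
        "end_line": 42,
    }],
)

# Abstract decision: nothing to point at, so it carries a hint instead.
abstract = build_mapping(
    "Billing changes must be auditable",
    code_regions=[],
    search_hint="audit_log ledger billing_event",
)

payload = {"mappings": [resolved, abstract]}
```

In this sketch only the second mapping would reach the BM25 fallback, which is the case the skill now treats as the exception rather than the default.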
Lever 2: `search_hint` recall booster
- `IngestMapping.search_hint: str` and `IngestDecision.search_hint: str`
  are optional caller-supplied synonym / identifier-name hints.
- `ground_mappings` concatenates `description + search_hint` as the
  BM25 query when the hint is non-empty. Strictly additive.
- `search_hint` never lands on `intent.description` and never
  surfaces in briefs / status / gap-judge responses.
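
A minimal sketch of the query widening described above, assuming the hint is simply appended after the description with a single space. The function names and the toy token-overlap check are hypothetical; the overlap check stands in for lexical matching and is not BM25 scoring or the actual `ground_mappings` code.

```python
import re
from typing import Optional


def build_bm25_query(description: str, search_hint: Optional[str] = None) -> str:
    """Return the query text: description alone, or description plus hint."""
    hint = (search_hint or "").strip()
    # An empty or missing hint leaves the query exactly as it was before.
    return f"{description} {hint}" if hint else description


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; underscores act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


description = "Lapsed dispatchers lose access"
hint = "subscription status billing"

# Toy stand-in for identifier vocabulary in an index (made-up names).
identifier_tokens = tokens("subscription_status billing_state")

bare = tokens(build_bm25_query(description))
widened = tokens(build_bm25_query(description, hint))

print(sorted(bare & identifier_tokens))     # no shared vocabulary
print(sorted(widened & identifier_tokens))  # the hint supplies the overlap
```

The description string itself is never modified, which mirrors the requirement that the hint stays query-only metadata.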
Test plan
- 8 new tests in `tests/test_v0423_search_hint.py` (propagation,
  query construction, backward compat, span-text pollution guard)
- 34/34 pass across `test_v0416_natural_format_fields`,
  `test_v0423_search_hint`, `test_phase1_l1_wiring`,
  `test_resolve_compliance`
Upgrade notes
Pre-existing false-positive bindings from BM25 auto-grounding persist
in the graph. To clean up: `bicameral.reset` → re-ingest under the
new skill defaults. Targeted edge pruning is tracked for a future
release.
Next up
This is an incremental improvement within the current architecture.
A separate plan document is coming for the structural refactor that
collapses `intent → maps_to → symbol → implements → region` into
`intent → binds_to → region` directly (symbol becomes a
retrieval-tier-only primitive). That's v0.5.0 territory; decided to
ship this today so you're unblocked on the dispatcher ingest while the
refactor is scoped.
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
- Added an optional `search_hint` parameter to enhance code search accuracy during ingestion, allowing callers to provide additional context for better symbol location resolution.
Documentation
Tests
Chores