feat(v0.4.23): caller-LLM-driven ingest retrieval + search_hint recall booster#33

Merged
jinhongkuan merged 2 commits into main from jin/ingest-retrieval-search-hint
Apr 21, 2026

@jinhongkuan jinhongkuan commented Apr 21, 2026

Contributor

Summary

Fixes the BM25 vocab-mismatch problem that became visible after v0.4.20
made grounding status honest. Ingest-time retrieval now defaults to
caller-LLM-resolved explicit code_regions via the existing MCP
retrieval tools; server-side BM25 becomes the fallback for abstract
decisions. For the fallback path, a new optional search_hint field
widens the BM25 query with caller-supplied synonyms.

Both changes stay within the existing deterministic-retrieval tech moat:
the server still makes no LLM calls at runtime.

Motivation

Real failure from 2026-04-20: 12 dispatcher-subscription-status decisions
were ingested against an Accountable branch. BM25 bound every "dispatch"
decision to use-toast.ts:dispatch (a React toast reducer), "active"
decisions to AcquisitionFunnel.tsx:ActiveUser, and "source" decisions to
PresentationSlideshow.tsx:SlideSource. Under v0.4.19 these would
have silently auto-promoted to REFLECTED; under v0.4.20 they stay at
PENDING forever (honest but unactionable).
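
The failure mode can be reproduced with a toy lexical scorer. The sketch below is not BM25 and not the project's retriever: it ranks candidates by plain token overlap, which is enough to show why a description that shares only an incidental keyword binds to the wrong symbol. The decision text, the second candidate path, and the tokenizer are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase, splitting on camelCase boundaries and non-alphanumerics."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return {t for t in re.split(r"[^a-z0-9]+", spaced.lower()) if t}


def rank(query: str, candidates: list[str]) -> list[tuple[str, int]]:
    """Rank candidates by shared-token count (toy stand-in for BM25)."""
    q = tokenize(query)
    scored = [(c, len(q & tokenize(c))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    "use-toast.ts:dispatch",                        # from the incident above
    "billing/subscriptionStatus.ts:getPlanState",   # hypothetical intended target
]
# Hypothetical decision text: it says "dispatch" but never "subscription".
decision = "Dispatch jobs only to drivers whose account is in good standing"
ranking = rank(decision, candidates)
```

Here the toast reducer wins on the single shared token "dispatch", while the intended file scores zero because the decision never uses its vocabulary.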

Changes

Lever 1 — caller-LLM retrieval is the new default

  • skills/bicameral-ingest/SKILL.md Step 2 now instructs the caller
    LLM to use validate_symbols + search_code + get_neighbors to
    resolve explicit code_regions BEFORE ingesting.
  • Step 3 leads with internal format (explicit regions) as preferred;
    natural format is the fallback for abstract decisions.
  • No server-side code change. The server already accepted internal-
    format payloads.
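
A rough sketch of the format choice Step 3 describes. The field names `mappings`, `code_regions`, and `search_hint` appear in this PR; the shape of a region entry (`file`, `start_line`, `end_line`), the `decisions` key, and the helper `build_payload` are assumptions for illustration, not the server's schema.

```python
from typing import Any


def build_payload(
    description: str,
    resolved_regions: list[dict[str, Any]],
    search_hint: str = "",
) -> dict[str, Any]:
    """Prefer internal format when regions were resolved; else natural format."""
    if resolved_regions:
        # Internal format: the caller LLM already resolved explicit regions,
        # so no server-side BM25 grounding is needed for this decision.
        return {
            "mappings": [
                {"description": description, "code_regions": resolved_regions}
            ]
        }
    # Natural format: abstract decision, grounded by the BM25 fallback.
    decision: dict[str, Any] = {"description": description}
    hint = search_hint.strip()
    if hint:
        decision["search_hint"] = hint
    return {"decisions": [decision]}


internal = build_payload(
    "Gate dispatch on subscription status",
    [{"file": "dispatcher.py", "start_line": 10, "end_line": 42}],  # hypothetical
)
natural = build_payload("Prefer clarity over cleverness", [], " readability lint ")
```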

Lever 2 — search_hint recall booster

  • IngestMapping.search_hint: str and IngestDecision.search_hint: str
    — optional caller-supplied synonym / identifier-name hints.
  • ground_mappings concatenates description + search_hint as the
    BM25 query when the hint is non-empty. Strictly additive.
  • Query-only metadata: never stored on intent.description, never
    surfaces in briefs / status / gap-judge responses.
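
The query construction described above can be sketched as follows. The helper name `build_bm25_query` and the sample strings are hypothetical; the behavior (join with a space, ignore an empty or whitespace-only hint, leave the stored description untouched) follows the description in this PR.

```python
def build_bm25_query(description: str, search_hint: str = "") -> str:
    """Widen the BM25 query with the hint; an empty hint changes nothing."""
    hint = search_hint.strip()
    return f"{description} {hint}" if hint else description


description = "Gate dispatch on subscription status"
query = build_bm25_query(description, "  SubscriptionStatus isActive ")
# The description itself is never modified; only the query is widened.
stored_description = description
```

Because the hint only ever feeds the query string, omitting it reproduces the earlier behavior exactly.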

Test plan

  • 8 new tests in tests/test_v0423_search_hint.py (propagation,
    query construction, backward compat, span-text pollution guard)
  • Regression: 34/34 pass across test_v0416_natural_format_fields,
    test_v0423_search_hint, test_phase1_l1_wiring,
    test_resolve_compliance

Upgrade notes

Pre-existing false-positive bindings from BM25 auto-grounding persist
in the graph. To clean up: bicameral.reset → re-ingest under the
new skill defaults. Targeted edge pruning tracked for a future
release.

Next up

This is an incremental improvement within the current architecture.
A separate plan document is coming for the structural refactor that
collapses intent → maps_to → symbol → implements → region into
intent → binds_to → region directly (symbol becomes a retrieval-tier-only
primitive). That's v0.5.0 territory — decided to ship this today so
you're unblocked on the dispatcher ingest while the refactor is scoped.
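
To make the planned shape concrete, here is a small sketch of collapsing the two-hop path into a direct edge. It illustrates the idea only and is not the v0.5.0 design: the edge-tuple representation, the function `collapse_to_binds`, and the sample node ids are all assumptions.

```python
Edge = tuple[str, str, str]  # (source, relation, target)


def collapse_to_binds(edges: list[Edge]) -> list[Edge]:
    """Rewrite intent -maps_to-> symbol -implements-> region as intent -binds_to-> region."""
    implements: dict[str, list[str]] = {}
    for src, rel, dst in edges:
        if rel == "implements":
            implements.setdefault(src, []).append(dst)

    direct: list[Edge] = []
    for src, rel, dst in edges:
        if rel != "maps_to":
            continue
        for region in implements.get(dst, []):
            edge = (src, "binds_to", region)
            if edge not in direct:  # skip duplicates reached via several symbols
                direct.append(edge)
    return direct


edges = [
    ("intent:1", "maps_to", "sym:a"),
    ("sym:a", "implements", "region:x"),
    ("sym:a", "implements", "region:y"),
    ("intent:2", "maps_to", "sym:b"),  # symbol with no region: yields no edge
]
collapsed = collapse_to_binds(edges)
```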

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added optional search_hint parameter to enhance code search accuracy during ingestion, allowing callers to provide additional context for better symbol location resolution.
  • Documentation

    • Updated ingest workflow guidance to emphasize explicit symbol resolution using validation tools before ingestion.
  • Tests

    • Added comprehensive test coverage for search hint propagation and BM25 query construction.
  • Chores

    • Bumped version to v0.4.23.

jinhongkuan and others added 2 commits April 20, 2026 17:33
Addresses the BM25 vocab-mismatch problem that surfaced after v0.4.20's
honest PENDING projection. Decisions whose description didn't lexically
overlap with real code identifier vocabulary were binding to whatever
file incidentally shared a keyword.

Two changes, both within the deterministic-retrieval moat:

**Lever 1 — caller-LLM retrieval is the new default**
- skills/bicameral-ingest/SKILL.md restructured: Step 2 now instructs
  the caller LLM to use validate_symbols + search_code + get_neighbors
  to resolve explicit code_regions BEFORE ingesting. Step 3 leads with
  internal format (explicit regions) as preferred; natural format is
  the fallback for truly abstract decisions.
- No server code changes — the server already accepted internal-format
  payloads. The skill just stops discouraging that path.

**Lever 2 — search_hint BM25 recall booster**
- IngestMapping.search_hint, IngestDecision.search_hint (both optional).
  Query-only metadata: synonyms, domain vocab, likely identifier names
  the description wouldn't contain literally.
- adapters.code_locator.ground_mappings concatenates
  "description search_hint" as the BM25 query when the hint is non-empty.
  Strictly additive: omitted hint = pre-v0.4.23 behavior.
- search_hint never lands on intent.description; never surfaces in
  briefs, status, or gap-judge. Humans see clean decision text; BM25
  sees the widened query.

**Guarantee preserved**: retrieval remains deterministic at runtime.
Caller LLM does the expensive lookup at ingest time (when it has full
codebase context); server-side BM25 fallback is only consulted for
abstract decisions. Tech moat intact.

Tests: 8 new in test_v0423_search_hint.py — propagation through
_normalize_payload, BM25 query construction, backward compatibility
(no hint = bare description). Full regression: 34/34 pass across
natural-format + L1-wiring + resolve_compliance + search_hint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

VERSION: 0.4.22 → 0.4.23
RECOMMENDED_VERSION: 0.4.22 → 0.4.23

CHANGELOG entry covers the skill-level flip (caller-LLM resolves
code_regions explicitly) and the search_hint recall booster, plus the
upgrade note on pre-existing false-positive bindings persisting until
bicameral.reset + re-ingest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 21, 2026

Walkthrough

Added optional search_hint field to IngestMapping and IngestDecision contracts, with propagation through the ingest handler and integration into BM25 query construction in the grounding adapter. Updated skill documentation to reflect MCP-driven grounding workflow, incremented package version to 0.4.23, and added comprehensive regression tests.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Version Updates: CHANGELOG.md, RECOMMENDED_VERSION, pyproject.toml | Bumped version from 0.4.22 to 0.4.23 with changelog entry describing search_hint feature and MCP-driven grounding workflow updates. |
| Contract Definitions: contracts.py | Added optional search_hint: str field (default empty) to both IngestMapping and IngestDecision models to carry BM25 query-only metadata through ingest pipeline. |
| Handler Logic: handlers/ingest.py | Updated _normalize_payload to propagate search_hint from natural-format decisions into generated mappings during normalization. |
| Adapter & Grounding: adapters/code_locator.py | Modified ground_mappings() to conditionally concatenate trimmed search_hint to description for BM25 query construction; used in token counting and search operations only, not stored in output. |
| Skill Documentation: skills/bicameral-ingest/SKILL.md | Restructured ingest workflow to emphasize MCP tool-driven grounding; added symbol hypothesis generation, validate_symbols, search_code, and optional get_neighbors steps; introduced search_hint as standardized BM25 booster; clarified internal vs. natural format selection criteria. |
| Test Suite: tests/test_v0423_search_hint.py | Added 203 lines of regression tests validating search_hint propagation through normalization, BM25 query construction with conditional hint concatenation, backward compatibility (absent/empty hints), and skipping of grounding when code_regions present. |
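
The contract and handler rows above can be pictured with a minimal sketch. These dataclasses are stand-ins, not the definitions in contracts.py (the real models may be declared differently and carry more fields), and `normalize` is a hypothetical simplification of `_normalize_payload`.

```python
from dataclasses import dataclass, field


@dataclass
class IngestDecision:
    description: str
    search_hint: str = ""  # optional, query-only metadata


@dataclass
class IngestMapping:
    description: str
    code_regions: list[dict] = field(default_factory=list)
    search_hint: str = ""  # carried along for the BM25 fallback only


def normalize(decisions: list[IngestDecision]) -> list[IngestMapping]:
    """Turn natural-format decisions into mappings, propagating the hint."""
    return [
        IngestMapping(description=d.description, search_hint=d.search_hint)
        for d in decisions
    ]


mappings = normalize([
    IngestDecision("Gate dispatch on subscription status", "SubscriptionStatus"),
    IngestDecision("Prefer clarity over cleverness"),  # no hint supplied
])
```

Defaulting the field to an empty string is what keeps payloads written before v0.4.23 valid without changes.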

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

  • bicameral-mcp#9: Modifies the grounding pipeline with tiered coverage and reuse logic; overlaps with this PR's changes to ground_mappings() and BM25 query construction.
  • bicameral-mcp#3: Introduced the original IngestMapping/IngestDecision models and _normalize_payload handler; this PR extends those same contracts and propagation paths with the new search_hint field.

Poem

🐰 A hint in the query, a whisper so fine,
To boost the retrieval when symbols align,
Version bumped up to point-two-three,
The MCP tools dance in harmony!
Through contracts and handlers it flows like a stream,
Grounding made better—a rabbit's sweet dream. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 78.57% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main changes: caller-LLM-driven ingest retrieval and the search_hint recall booster feature added in v0.4.23. |


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
skills/bicameral-ingest/SKILL.md (1)

161-279: Skill guidance reads well; grounds the Lever 1 / Lever 2 split clearly.

One micro-nit (non-blocking): example at line 232 puts search_hint alongside explicit code_regions, while the contract doc and this skill both note search_hint is only consulted when code_regions is empty. The example's own prose at lines 277-278 correctly says to treat it as a safety net for future re-grounding — but a casual reader may miss that and assume it's active. Consider adding a one-line parenthetical to the code block, e.g. // optional — consulted only if code_regions is ever cleared / re-grounded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/bicameral-ingest/SKILL.md` around lines 161 - 279, Add a clarifying
parenthetical to the internal-format example to indicate that search_hint is
only used when code_regions is empty or re-grounded; update the example block
that contains mappings[].code_regions and the search_hint field (the payload
example with mappings, symbols, code_regions, and search_hint) to include a
one-line note such as "(optional — consulted only if code_regions is later
cleared or during re-grounding)" beside the search_hint so readers don't assume
it is active when explicit code_regions are provided.
adapters/code_locator.py (1)

312-324: Stage-2 fused re-search drops search_hint.

Inside _ground_single, when Stage 1 fuzzy-matching produced symbols, line 319 re-invokes self.search_code(description, symbol_ids=...) with the bare description rather than the widened bm25_query used at the top of ground_mappings. The result is that the search_hint recall booster is applied only to the initial BM25 pass (whose hits are then ignored down this branch because fused is replaced) and not to the symbol-seeded RRF fusion re-search — which is exactly the path search_hint would most plausibly help.

Consider threading the same widened query down:

♻️ Proposed fix
     def _ground_single(
         self,
         description: str,
         db,
         max_files: int,
         fuzzy_threshold: int,
         max_symbols: int,
         hits: list[dict] | None = None,
         mapping_symbol_names: list[str] | None = None,
+        bm25_query: str | None = None,
     ) -> list[dict]:
@@
+        search_query = bm25_query or description
@@
                 if matched_ids:
-                    fused = self.search_code(description, symbol_ids=sorted(matched_ids))
+                    fused = self.search_code(search_query, symbol_ids=sorted(matched_ids))
                 else:
                     fused = hits

And pass bm25_query=bm25_query from the _ground_single call site in ground_mappings.
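
One self-contained way to check the proposed fix is a stub locator that records every query it receives. The class below is a hypothetical stand-in, not adapters/code_locator.py: `StubLocator`, its simplified signatures, and the fake hit data are assumptions. It only demonstrates that threading `bm25_query` into `_ground_single` makes the symbol-seeded re-search use the widened query.

```python
from typing import Optional


class StubLocator:
    """Records queries so the Stage-2 re-search input can be inspected."""

    def __init__(self) -> None:
        self.queries: list[str] = []

    def search_code(self, query: str,
                    symbol_ids: Optional[list[int]] = None) -> list[dict]:
        self.queries.append(query)
        return [{"query": query, "symbol_ids": symbol_ids or []}]

    def _ground_single(self, description: str, matched_ids: set[int],
                       hits: list[dict],
                       bm25_query: Optional[str] = None) -> list[dict]:
        # Fall back to the bare description when no widened query was passed.
        search_query = bm25_query or description
        if matched_ids:
            return self.search_code(search_query, symbol_ids=sorted(matched_ids))
        return hits

    def ground(self, description: str, search_hint: str,
               matched_ids: set[int]) -> list[dict]:
        hint = search_hint.strip()
        bm25_query = f"{description} {hint}" if hint else description
        hits = self.search_code(bm25_query)  # initial BM25 pass
        return self._ground_single(description, matched_ids, hits,
                                   bm25_query=bm25_query)


locator = StubLocator()
fused = locator.ground("gate dispatch", "SubscriptionStatus", {2, 1})
```

With the widened query threaded through, both recorded queries include the hint; without the extra parameter the second one would be the bare description.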

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/code_locator.py` around lines 312 - 324, the Stage-2 fused re-search
in _ground_single calls self.search_code(description, symbol_ids=...) and thus
drops the widened bm25_query (and associated search_hint) used higher in
ground_mappings; modify the code so the same bm25_query is threaded through:
update the call site so ground_mappings passes bm25_query into _ground_single,
and inside _ground_single call self.search_code(bm25_query or description,
symbol_ids=sorted(matched_ids)) so the search_hint boost is applied to the
symbol-seeded RRF re-search as well.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a7887c10-9417-4dca-87e2-81388ac45d3a

📥 Commits

Reviewing files that changed from the base of the PR and between fe2ba95 and c7a8141.

📒 Files selected for processing (8)
  • CHANGELOG.md
  • RECOMMENDED_VERSION
  • adapters/code_locator.py
  • contracts.py
  • handlers/ingest.py
  • pyproject.toml
  • skills/bicameral-ingest/SKILL.md
  • tests/test_v0423_search_hint.py

@jinhongkuan jinhongkuan merged commit 08ea09a into main Apr 21, 2026
2 checks passed
Knapp-Kevin added a commit that referenced this pull request May 2, 2026
Three-round audit cycle (VETO -> VETO -> PASS) for Notion ingest +
cache contract migration. Plan ships across five phases:

- Phase 0 — cache contract migration (schema v1->v2, schema_version
  table, callable migration dispatch, upsert_canonical_extraction)
- Phase 0.5 — worker-task lifecycle pattern + Slack reference wiring
  (closes the v0 dormant-Slack-worker gap)
- Phase 1 — Notion API client + property serializer (internal-
  integration auth, no OAuth router)
- Phase 2 — Notion ingest worker (per-database watermark, peer-
  authored team_event)
- Phase 3 — Notion task registration on lifespan

META_LEDGER entries #29-#33 capture: round-1 VETO (4 missing/
undeclared symbols), round-2 VETO (1 wrong-call-shape for
decrypt_token), round-3 PASS, IMPLEMENT, and SUBSTANTIATION.

SHADOW_GENOME #7 addendum extends the PARALLEL_STRUCTURE_ASSUMED
detection heuristic with three new in-sketch checks: signature,
type-boundary, helper-symmetry. The two VETOs in this session are
the empirical justification.

SYSTEM_STATE.md adds the Priority C v1 section: schema state (v2),
architectural properties achieved, audit cycle outcomes,
implementation deviations from plan.

Merkle seal: SHA256(content_hash + previous_hash) =
dcb619104e6d88b97a04689093b80b9f03825f9a24bac3c3b9ab3d0107ff24d7
(content_hash 9f003c40..., previous_hash 6f4f8f8f... = Priority C v0
SEAL at Entry #28).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added a commit that referenced this pull request May 2, 2026
First-round PASS audit cycle for the real heuristic+LLM extractor.
Plan ships across six phases (Phase 0 cache contract evolution; Phase
1 deterministic Stage 1 classifier; Phase 2 trigger rules schema;
Phase 3 real Anthropic SDK Stage 2; Phase 4 pipeline integration;
Phase 5 corpus learner option-c).

META_LEDGER entries #34-#36 capture: round-1 PASS audit, IMPLEMENT,
and SUBSTANTIATION. Three audit advisories (extract() boundary,
TeamServerRules typo, corpus learner table-source) all addressed
inline during implementation.

A proactive QorLogic Fixer code-quality sweep before commit produced
2 MED + 2 LOW findings; both MEDs landed (fail-soft on non-text
content blocks; v2->v3 backfill integration test) with one surfacing
a real defect (the migration's TYPE string was rejecting reads on
pre-v3 rows with NONE classifier_version; corrected to TYPE
option<string>).

SYSTEM_STATE.md adds the Priority C v1.1 section: schema state (v4),
architectural properties achieved (heuristic-first determinism +
LLM-only-when-needed + rule-version-driven cache invalidation + all
four "dynamic" angles wired), audit cycle outcomes.

Merkle seal: SHA256(content_hash + previous_hash) =
b37003661820e2ef80591b9d0cfdeac3df092d6d9b4b5d87e3036e7ccf37d95b
(content_hash e8b1b6b6..., previous_hash dcb61910... = Priority C
v1 SEAL at Entry #33).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added two commits that referenced this pull request May 3, 2026.
Their messages are identical to those of the two May 2, 2026 commits above
(Notion ingest + cache contract migration; heuristic+LLM extractor).