fix: compound token extraction for grounding recall (9%→14%) #26

Merged
silongtan merged 1 commit into main from silong/p0-compound-token-recall
Apr 18, 2026

Conversation

@silongtan silongtan commented Apr 18, 2026

Collaborator

Summary

  • The tokenizer in _ground_single used re.findall(r"[a-zA-Z]{4,}"), which matches only runs of letters and therefore splits on underscores and dots, destroying compound identifiers (decrease_stock →
    ["decrease", "stock"]) before they reached validate_symbols
  • Added a _COMPOUND_RE regex that extracts snake_case and dotted.name identifiers intact and places them in the token list ahead of the word-level tokens
  • 6 lines changed in one file (adapters/code_locator.py)
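
A minimal sketch of the before/after tokenization described above, for readers who have not opened the diff. The two patterns are the ones quoted in this PR; the function names `tokenize_before` and `tokenize_after` and the way the length filter is applied are assumptions for illustration, not the actual code in adapters/code_locator.py.

```python
import re

# Word-level pattern quoted in the summary: runs of 4+ ASCII letters.
_WORD_RE = re.compile(r"[a-zA-Z]{4,}")

# Compound pattern as reported in the review walkthrough: two or more
# identifier parts joined by "_" or ".".
_COMPOUND_RE = re.compile(r"[A-Za-z]\w*(?:[_.][A-Za-z]\w*)+")


def tokenize_before(text):
    """Old behavior: letters only, so compounds are split into fragments."""
    return _WORD_RE.findall(text)


def tokenize_after(text):
    """New behavior (sketch): compounds first, word tokens as fallback."""
    # Assumption: the ">= 4 chars" rule is a simple length filter.
    compounds = [m for m in _COMPOUND_RE.findall(text) if len(m) >= 4]
    return compounds + _WORD_RE.findall(text)


text = "call decrease_stock inside transaction.atomic"
print(tokenize_before(text))
print(tokenize_after(text))
```

With this input the old tokenizer yields only fragments such as "decrease" and "stock", while the sketched new tokenizer puts "decrease_stock" and "transaction.atomic" at the front of the list, followed by the same word tokens.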

Results

| Metric | Before | After |
|---|---|---|
| Recall | 9.3% | 13.9% (+50%) |
| MRR@3 | 0.577 | 0.592 (held) |
| Saleor recall | 13.6% | 27.3% (2x) |
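
The parenthesized figures are relative changes, not percentage-point differences. A quick arithmetic check using only the numbers reported in the table (the helper name `relative_change` is hypothetical):

```python
def relative_change(before, after):
    """Relative change of `after` with respect to `before`."""
    return (after - before) / before


recall_gain = relative_change(9.3, 13.9)   # aggregate recall
saleor_ratio = 27.3 / 13.6                 # Saleor recall, after / before
mrr_delta = 0.592 - 0.577                  # absolute MRR@3 difference

print(f"recall: {recall_gain:+.1%}")       # +49.5%, reported as +50%
print(f"saleor: {saleor_ratio:.2f}x")      # 2.01x, reported as 2x
print(f"mrr@3:  {mrr_delta:+.3f}")
```

In absolute terms, aggregate recall moved by 4.6 percentage points.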

Key wins: decrease_stock, update_order_status, transaction.atomic now match their exact symbols instead of generic fragments like Stock or Order.

Test plan

  • 29/29 unit tests pass
  • Eval harness: recall 9.3% → 13.9%, MRR@3 held at 0.59
  • No regressions in hit rate or grounding rate
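
The eval harness itself is not part of this PR. For readers unfamiliar with the two headline metrics, here is a sketch under the assumption that they follow the standard definitions; the function names and signatures are hypothetical and may not match the harness.

```python
def recall(expected, retrieved):
    """Fraction of expected symbols that appear among the retrieved ones."""
    expected = set(expected)
    if not expected:
        return 0.0
    return len(expected & set(retrieved)) / len(expected)


def mrr_at_k(ranked_lists, relevant_sets, k=3):
    """Mean reciprocal rank of the first relevant hit within the top k."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked[:k], start=1):
            if item in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


# One query hits at rank 2, the other misses within the top 3.
print(mrr_at_k([["a", "b", "c"], ["x", "y", "z"]], [{"b"}, {"q"}]))  # 0.25
print(recall({"decrease_stock", "Order"}, ["decrease_stock"]))       # 0.5
```

Under these definitions, recall can rise while MRR@3 stays flat when the newly matched symbols tend to land at similar ranks to the existing hits.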

PR 2: Parent Repo (bicameral)

Title: mcp: bump submodule — compound token recall fix

Body:

Summary

  • Bumps pilot/mcp submodule to include compound token extraction fix
  • See BicameralAI/bicameral-mcp#<PR_NUMBER> for details

Summary by CodeRabbit

  • Improvements
    • Enhanced code location detection to better recognize compound identifiers (such as names with underscores and dots), improving search and navigation accuracy.

… recall

The tokenizer in _ground_single stripped underscores and dots, destroying
compound identifiers like decrease_stock and transaction.atomic before they
reached validate_symbols. Extract compounds first via regex, then append
word tokens as fallback. Aggregate recall: 9.3% → 13.9%, MRR@3 held at 0.59.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f3e3623b-e82a-40e5-9205-040fab719518

📥 Commits

Reviewing files that changed from the base of the PR and between 8c73e68 and ba35e4b.

📒 Files selected for processing (1)
  • adapters/code_locator.py

📝 Walkthrough

Modified tokenization in _ground_single() to extract compound tokens (containing underscores or dots) using a new regex pattern, concatenating them before word tokens for fuzzy validation and ranking.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Tokenization Enhancement (adapters/code_locator.py) | Added _COMPOUND_RE regex and enhanced _ground_single() tokenization to extract compound tokens matching `[A-Za-z]\w*(?:[_.][A-Za-z]\w*)+` (≥4 chars) before word tokens for ranking. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 Hop, skip, through underscores we go,
Compound tokens dancing in a row,
With dots and dashes, fuzzy and fine,
Four chars or more—such code divine!
A rabbit's extraction, swift and neat,
Makes symbol grounding oh-so sweet!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: improving compound token extraction to increase grounding recall from 9% to 14%. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

@silongtan silongtan merged commit f649280 into main Apr 18, 2026
1 check passed
jinhongkuan added a commit that referenced this pull request Apr 30, 2026
…iation seal

Reality matches Promise. Three changes (2 repo files + 2 deferred external
gh actions) land per Entry #24 audit blueprint 1:1; 0 new tests (acknowledged
advisory — manual verification mitigates); Section 4 razor clean.

Audit verdict: PASS, L1 (Entry #24 chain hash 1de1fac7).
Implementation: Entry #25 chain hash 51c8a45c.
Merkle seal: efd0304b2f0e0b3ca28aa4620c2b8ea2eda5ab9e2828ca852ab9f3c5adda6eb5

Architectural decision recorded: bicameral-mcp#135's auto-resolve direction
abandoned (no caller LLM in hook context, MCP sampling not viable in Claude
Code's main chat). Resolution path = dashboard tooltip → /bicameral-sync.
The tooltip surfaces the pending state; the human in their session is the
qualified judge.

Plan addition tracking (Entry #24 preconditions, final state):
  ✅ #2 — SKILL.md tooltip note (delivered in IMPL, sealed here)
  🟡 #1 — PR description manual verification step (composed in /qor-document)
  🟡 #3 - #135 close comment README/docs deferral (composed in /qor-document)

Surfaced for follow-up (not blocking):
  bicameral-mcp#125 scope should be widened — 7 skills under
  pilot/mcp/.claude/skills/ are absent from the canonical pilot/mcp/skills/
  location claimed by pilot/mcp/CLAUDE.md.

Spec correction queued (post-merge gh action):
  bicameral#108 Flow 1 step 3 claims IngestResponse.supersession_candidates
  exists when it does not; collision detection lives caller-side via
  bicameral-context-sentry skill, surfaces via
  bicameral.preflight.unresolved_collisions.

Capability shortfalls (carried, no regression vs Entry #23): qor/scripts/
runtime helpers absent (gate artifacts not written), tools/reliability/
validators absent (Steps 4.6–4.8 skipped), agent-teams not declared,
codex-plugin not declared (solo audit/seal), intent_lock capture skipped.

Refs #135.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jinhongkuan added a commit that referenced this pull request May 2, 2026
Resolves four conflicts:
- .gitignore: keeps both qor:seed block and .claude/worktrees ignore.
- docs/META_LEDGER.md: takes dev's chain (#1-#26) as canonical. This
  branch's parallel #7-#14 entries (sealed against the obsolete Entry #6
  base before dev added #7-#26) are not folded in here — they need to be
  re-authored against dev's #26 seal via /qor-meta-log-decision in a
  follow-up. Their session content is preserved in commit messages
  (f4de501, 3f856af) and in docs/SYSTEM_STATE.md.
- docs/SYSTEM_STATE.md: keeps both the v0 process cleanup / preflight
  hook session blocks from this branch and dev's #124 implementation
  block.
- skills/bicameral-preflight/SKILL.md: keeps both the new "Hook
  reinforcement" subsection (this branch) and the Telemetry section
  (dev). The earlier removal of Telemetry in this branch's f4de501 was
  authored against a stale base; restoring it here.