fix: FC-2 follow-ups — symbol seeding, team-mode test unwrap, step-1 parser by silongtan · Pull Request #24 · BicameralAI/bicameral-mcp

silongtan · 2026-04-17T00:04:57Z

Summary

F1: _ground_single now unions LLM-extracted mapping["symbols"] IDs with regex-token IDs before graph channel seeding — fixes 0% multi-region recall@files by giving
the graph channel correct seeds instead of NL-fuzzy false positives
F2: _dump_graph test helper unwraps TeamWriteAdapter via getattr(ledger, "_inner", ledger) so phase3 integration tests pass in team mode — fixes 6 pre-existing
failures
F3: _extract_step1_excerpt targets ### 1. specifically instead of first ### N. header — fixes 1 pre-existing failure after SKILL.md added Step 0

Test plan

pytest tests/ — 435 passed, 0 failed (was 7 failures before)
Run eval_code_locator.py against Accountable to measure multi-region recall@files improvement from F1
Verify in CI (surrealdb + team mode)

Summary by CodeRabbit

Improvements
- Enhanced symbol identification process to support direct name-based lookups, improving accuracy of code location matching.
Tests
- Updated test infrastructure to ensure compatibility with enhanced parameter handling.
- Improved test utilities to support wrapped implementation patterns.

coderabbitai · 2026-04-17T00:05:13Z

Warning

Rate limit exceeded

@silongtan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 42 minutes and 13 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 42 minutes and 13 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 69088c5b-efff-4929-a088-9c57e2bc7775

📥 Commits

Reviewing files that changed from the base of the PR and between e7cb1ae and 30d03e1.

📒 Files selected for processing (4)

adapters/code_locator.py
tests/_extract_headless.py
tests/test_coverage_loop.py
tests/test_phase3_integration.py

📝 Walkthrough

Walkthrough

The PR adds symbol name-based ID seeding to the code locator grounding process, refactors step extraction logic to use direct regex matching, updates test mocks to accept additional parameters, and improves ledger client handling in integration tests.

Changes

Cohort / File(s)	Summary
Grounding Symbol Resolution `adapters/code_locator.py`	Added optional `mapping_symbol_names` parameter to `_ground_single()` enabling name-based ID lookups via database; failures logged at debug level. Updated `ground_mappings()` to pass symbol names for early ID seeding during grounding tiers.
Test Mock Updates `tests/test_coverage_loop.py`	Updated `fake_ground_single()` test doubles across multiple cases to accept `**kwargs` parameter, allowing them to tolerate extra parameters passed by production `_ground_single()` calls.
Test Utility Refactoring `tests/_extract_headless.py`, `tests/test_phase3_integration.py`	Refactored `_extract_step1_excerpt()` to locate Step 1 via direct regex matching instead of header collection. Enhanced `_dump_graph()` to resolve inner ledger clients and ensure connectivity before accessing SurrealDB client.

Sequence Diagram

sequenceDiagram
    participant Ground as Ground<br/>Mappings
    participant GS as _ground_single<br/>(with mapping_symbol_names)
    participant DB as Database<br/>(lookup_by_name)
    participant Results as Matched<br/>IDs

    Ground->>GS: Invoke with optional symbol names
    alt When mapping_symbol_names provided
        GS->>DB: lookup_by_name(symbol_name)
        DB-->>GS: Row IDs or error
        GS->>Results: Add resolved IDs to matched_ids
        GS->>GS: Log failures at debug level
    else No symbol names
        GS->>Results: Continue with existing flow
    end
    GS-->>Ground: Return grounded results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A symbol by name is found with grace,
Through database lookups at steady pace,
Mocks bend to kwargs, extraction refined,
Ledger clients unwrapped, all aligned!
Code grows stronger with each careful bind. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 58.82% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes three distinct changes (F1, F2, F3) that address follow-ups to a prior issue, making it specific and informative for scanning history.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch silong/fc2-followup-f1-f2-f3

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

tests/_extract_headless.py (1)
144-151: LGTM — Step 1 targeting is correct.

Anchoring on ### 1. with _STEP_HEADER_RE as the terminator properly handles the new Step 0 in SKILL.md and preserves the fallback path. Optional: hoist step1_re to module scope next to _STEP_HEADER_RE to avoid recompiling on every call (minor; re.compile is internally cached anyway).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/_extract_headless.py` around lines 144 - 151, The regular expression
step1_re is being compiled inside the function each call; hoist its definition
to module scope next to the existing _STEP_HEADER_RE so it is compiled once and
reused, then update the function to reference the module-level step1_re
variable; keep the same pattern r"^###\s+1\.\s+" and the existing use of
step1_re.search(body) and _STEP_HEADER_RE.search(body, ...).
adapters/code_locator.py (1)
223-230: Seeding symbol IDs by name looks correct; minor robustness nits.

The union into matched_ids before graph seeding is the right fix for the 0% multi-region recall issue. A couple of small, optional improvements:

mapping_symbol_names is passed directly from mapping.get("symbols"), which may contain empty strings, None, or duplicates (e.g., from _build_payload_from_real_code when symbol_name is missing). Consider filtering to avoid unnecessary DB calls.

Broad except Exception matches the style elsewhere in this file (Ruff BLE001 is flagged but consistent); fine to keep.
♻️ Optional filter
 if mapping_symbol_names:
-    for name in mapping_symbol_names:
+    for name in {n for n in mapping_symbol_names if isinstance(n, str) and n}:
         try:
             rows = db.lookup_by_name(name)
             for row in rows:
                 matched_ids.add(row["id"])
         except Exception as exc:
             logger.debug("[ground] symbol name lookup failed for '%s': %s", name, exc)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/code_locator.py` around lines 223 - 230, mapping_symbol_names coming
from mapping.get("symbols") may include empty strings, None, or duplicates
causing unnecessary db.lookup_by_name calls; before iterating in the block that
updates matched_ids, sanitize the list by removing falsy values and
deduplicating (e.g., convert to a set after filtering) so you only call
db.lookup_by_name for unique, non-empty symbol names; keep the remaining
try/except and matched_ids.add(row["id"]) logic unchanged and reference
mapping_symbol_names, mapping.get("symbols"), db.lookup_by_name, and matched_ids
when making the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@adapters/code_locator.py`:
- Around line 223-230: mapping_symbol_names coming from mapping.get("symbols")
may include empty strings, None, or duplicates causing unnecessary
db.lookup_by_name calls; before iterating in the block that updates matched_ids,
sanitize the list by removing falsy values and deduplicating (e.g., convert to a
set after filtering) so you only call db.lookup_by_name for unique, non-empty
symbol names; keep the remaining try/except and matched_ids.add(row["id"]) logic
unchanged and reference mapping_symbol_names, mapping.get("symbols"),
db.lookup_by_name, and matched_ids when making the change.

In `@tests/_extract_headless.py`:
- Around line 144-151: The regular expression step1_re is being compiled inside
the function each call; hoist its definition to module scope next to the
existing _STEP_HEADER_RE so it is compiled once and reused, then update the
function to reference the module-level step1_re variable; keep the same pattern
r"^###\s+1\.\s+" and the existing use of step1_re.search(body) and
_STEP_HEADER_RE.search(body, ...).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f6160d02-31bd-418e-99be-841b21feebc9

📥 Commits

Reviewing files that changed from the base of the PR and between d6018ef and e7cb1ae.

📒 Files selected for processing (4)

adapters/code_locator.py
tests/_extract_headless.py
tests/test_coverage_loop.py
tests/test_phase3_integration.py

F1: _ground_single now accepts mapping_symbol_names and unions LLM-extracted symbol IDs with regex-token IDs before graph seeding. Fixes 0% multi-region recall@files. F2: _dump_graph unwraps TeamWriteAdapter via getattr(ledger, "_inner", ledger) so phase3 integration tests work in team mode. F3: _extract_step1_excerpt targets "### 1." specifically instead of the first numbered step header (was grabbing Step 0 after SKILL.md added boundary detection). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The fake_ground_single mocks in test_coverage_loop.py need **kwargs to accept the new mapping_symbol_names kwarg passed by ground_mappings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…cameral-sync Scope-cut from #135's original L2 proposal (--auto-resolve-trivial flag on link_commit). Design enumeration produced 7 options; all required either an LLM in the deterministic core (violating the "selection over generation" guardrail) or trivial-cases enumeration with non-zero false-positive risk. Cut: accept the architectural limit. Post-commit hook stays sync-only. Resolution path = dashboard tooltip on status === 'pending' rows → user runs /bicameral-sync in their Claude Code session. No code is auto-resolved. assets/dashboard.html: renderStateCell() ternary at line 455 → if/else if. New 'pending' branch attaches tooltip text "Pending compliance — run /bicameral-sync in your Claude Code session to resolve." Reuses existing data-tip CSS pattern (lines 187–198, hover transitions). Static string literal — no esc() needed (no HTML special chars). skills/bicameral-dashboard/SKILL.md: One bullet under Notes documenting the tooltip nudge contract. Per pilot/mcp/CLAUDE.md "tool changes ship with skill updates" rule (UI behavior changed; tool response shape unchanged). Section 4 razor: renderStateCell 19 LOC (cap 40), nesting 1 (cap 3), nested ternaries 0. Replaced ternary with if/else if — improves razor score, doesn't degrade it. Verification: manual (no automated test added — dashboard.html has zero existing test infrastructure; UI test harness absent; PR description includes manual verification step). Acknowledged advisory in Entry #24 audit. Refs #135 (close post-merge with scope-cut comment). Refs BicameralAI/bicameral#108 (Flow 3 spec edit, post-merge gh action). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…iation seal Reality matches Promise. Three changes (2 repo files + 2 deferred external gh actions) land per Entry #24 audit blueprint 1:1; 0 new tests (acknowledged advisory — manual verification mitigates); Section 4 razor clean. Audit verdict: PASS, L1 (Entry #24 chain hash 1de1fac7). Implementation: Entry #25 chain hash 51c8a45c. Merkle seal: efd0304b2f0e0b3ca28aa4620c2b8ea2eda5ab9e2828ca852ab9f3c5adda6eb5 Architectural decision recorded: bicameral-mcp#135's auto-resolve direction abandoned (no caller LLM in hook context, MCP sampling not viable in Claude Code's main chat). Resolution path = dashboard tooltip → /bicameral-sync. The tooltip surfaces the pending state; the human in their session is the qualified judge. Plan addition tracking (Entry #24 preconditions, final state): ✅ #2 — SKILL.md tooltip note (delivered in IMPL, sealed here) 🟡 #1 — PR description manual verification step (composed in /qor-document) 🟡 #3 — #135 close comment README/docs deferral (composed in /qor-document) Surfaced for follow-up (not blocking): bicameral-mcp#125 scope should be widened — 7 skills under pilot/mcp/.claude/skills/ are absent from the canonical pilot/mcp/skills/ location claimed by pilot/mcp/CLAUDE.md. Spec correction queued (post-merge gh action): bicameral#108 Flow 1 step 3 claims IngestResponse.supersession_candidates exists when it does not; collision detection lives caller-side via bicameral-context-sentry skill, surfaces via bicameral.preflight.unresolved_collisions. Capability shortfalls (carried, no regression vs Entry #23): qor/scripts/ runtime helpers absent (gate artifacts not written), tools/reliability/ validators absent (Steps 4.6–4.8 skipped), agent-teams not declared, codex-plugin not declared (solo audit/seal), intent_lock capture skipped. Refs #135. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…cameral-sync Scope-cut from #135's original L2 proposal (--auto-resolve-trivial flag on link_commit). Design enumeration produced 7 options; all required either an LLM in the deterministic core (violating the "selection over generation" guardrail) or trivial-cases enumeration with non-zero false-positive risk. Cut: accept the architectural limit. Post-commit hook stays sync-only. Resolution path = dashboard tooltip on status === 'pending' rows → user runs /bicameral-sync in their Claude Code session. No code is auto-resolved. assets/dashboard.html: renderStateCell() ternary at line 455 → if/else if. New 'pending' branch attaches tooltip text "Pending compliance — run /bicameral-sync in your Claude Code session to resolve." Reuses existing data-tip CSS pattern (lines 187–198, hover transitions). Static string literal — no esc() needed (no HTML special chars). skills/bicameral-dashboard/SKILL.md: One bullet under Notes documenting the tooltip nudge contract. Per pilot/mcp/CLAUDE.md "tool changes ship with skill updates" rule (UI behavior changed; tool response shape unchanged). Section 4 razor: renderStateCell 19 LOC (cap 40), nesting 1 (cap 3), nested ternaries 0. Replaced ternary with if/else if — improves razor score, doesn't degrade it. Verification: manual (no automated test added — dashboard.html has zero existing test infrastructure; UI test harness absent; PR description includes manual verification step). Acknowledged advisory in Entry #24 audit. Refs #135 (close post-merge with scope-cut comment). Refs BicameralAI/bicameral#108 (Flow 3 spec edit, post-merge gh action). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit febb0aa)

coderabbitai Bot reviewed Apr 17, 2026

View reviewed changes

silongtan and others added 2 commits April 16, 2026 20:15

fix: update coverage loop test mocks for new mapping_symbol_names param

30d03e1

The fake_ground_single mocks in test_coverage_loop.py need **kwargs to accept the new mapping_symbol_names kwarg passed by ground_mappings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

silongtan force-pushed the silong/fc2-followup-f1-f2-f3 branch from e7cb1ae to 30d03e1 Compare April 17, 2026 00:22

silongtan merged commit 8d8cf30 into main Apr 17, 2026
1 check passed

This was referenced Apr 17, 2026

fix: Stage 1 symbol ranking by fuzzy score — MRR@3 0.39→0.58 #25

Merged

fix: compound token extraction for grounding recall (9%→14%) #26

Merged

fix: structural grounding pipeline fixes — stage reorder, eval matching, raised caps #27

Merged

jinhongkuan mentioned this pull request Apr 30, 2026

triage(#135): dashboard tooltip → /bicameral-sync (scope-cut from auto-resolve) #138

Merged

5 tasks

jinhongkuan mentioned this pull request Apr 30, 2026

release: v0.13.6 (triage) #140

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: FC-2 follow-ups — symbol seeding, team-mode test unwrap, step-1 parser #24

fix: FC-2 follow-ups — symbol seeding, team-mode test unwrap, step-1 parser #24
silongtan merged 2 commits into
mainfrom
silong/fc2-followup-f1-f2-f3

silongtan commented Apr 17, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 17, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

silongtan commented Apr 17, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

silongtan commented Apr 17, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 17, 2026 •

edited

Loading