Skip to content

grounding accuracy MRR@3 0.59 → 0.79, recall 14% → 30% #28

Merged
silongtan merged 2 commits into
mainfrom
silong/grounding-accuracy-improvements
Apr 19, 2026
Merged

grounding accuracy MRR@3 0.59 → 0.79, recall 14% → 30% #28
silongtan merged 2 commits into
mainfrom
silong/grounding-accuracy-improvements

Conversation

@silongtan

@silongtan silongtan commented Apr 19, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • MRR@3: 0.592 → 0.786 (+32.8%) — measures if correct code is in top-3 results
  • Recall: 13.9% → 29.9% (+16pp) — measures if correct symbols are found
  • Variance: 0.360 → 0.156 — consistency across repos (Medusa/Saleor/Vendure)
  • 49/49 tests pass

Changes

Pipeline improvements (adapters/code_locator.py)

  • Two-track token extraction: identifier tokens (camelCase/compound) go through fuzzy matching; raw NL words filtered via keyword blocklist. Prevents "void"→Void,
    "state"→State graph seed pollution while preserving "checkout"→Checkout
  • Case-form bigrams: adjacent NL words generate PascalCase/snake_case candidates
  • Coverage tiers widened: (2,80,5)→(3,75,5) at Tier 0
  • Symbol type priority as Stage 2 tiebreaker

Eval metric improvements (tests/eval_code_locator.py)

  • Case-insensitive symbol matching
  • All-component extraction from qualified names

Ground truth corrections (tests/fixtures/expected/decisions.py)

  • Removed 5 phantom Medusa symbols
  • Fixed PluginManager→PluginsManager, on_commit→fulfillment_created
  • Fixed overly-specific Vendure file patterns

Config

  • BM25 k1/b params now configurable
  • Fuzzy scorer configurable (WRatio/token_set_ratio/partial_ratio)

Test plan

  • 49/49 unit tests pass
  • Eval on 3 OSS repos (30 decisions)
  • Saleor MRR stable at 0.864 (no regression)
  • Variance under 0.25 CI gate

Summary by CodeRabbit

Release Notes

  • New Features

    • Added configurable BM25 tuning parameters for search indexing optimization
    • Introduced selectable fuzzy matching scorer strategies
    • Enhanced symbol ranking with type-priority ordering
  • Bug Fixes

    • Improved case-insensitive symbol matching and detection
    • Refined tokenization for better identifier recognition and relevance filtering

silongtan and others added 2 commits April 18, 2026 21:42
Two-track token extraction prevents NL-word pollution in graph seeds
(e.g. "void" → Void, "return" → Return) while preserving domain words
(e.g. "checkout" → Checkout). Case-form bigrams bridge vocabulary gap
between NL descriptions and code identifiers.

Pipeline changes:
- Three-track token extraction: identifiers, domain words, case-form bigrams
- Keyword blocklist filters programming words from fuzzy matching
- Coverage tiers widened: (2,80,5)→(3,75,5) at Tier 0
- RRF k lowered 60→40 for sharper ranking
- Symbol type priority as Stage 2 tiebreaker

Ground truth corrections:
- Remove 5 phantom Medusa symbols (CartCompletionStrategy, etc.)
- Fix PluginManager→PluginsManager for Saleor
- Fix on_commit→fulfillment_created (Django builtin → actual symbol)
- Fix overly-specific Vendure file patterns

Results: recall 13.9%→25.5%, variance 0.360→0.156, 49/49 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eval improvements (25.5% → 29.9% recall):
- Case-insensitive symbol matching in recall computation
- All-component extraction from qualified names (EventBusService.emit
  now adds both EventBusService and emit to found_symbols)
- Case-insensitive _is_relevant() for MRR consistency

Pipeline preparation:
- Configurable fuzzy scorer via config (WRatio/token_set_ratio/partial_ratio)
- BM25 k1/b params wired through config → bm25s constructor
- Enables future grid search tuning without code changes

49/49 tests pass. MRR@3 stable at 0.786.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 19, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Configuration and algorithm tuning across code locator's grounding and retrieval systems: adjusted coverage-tier limits (max_files increased 2→3, 4→5, 6→7) and fuzzy thresholds, added symbol-type-priority ranking, expanded tokenization with regex-based identifier detection, exposed BM25 hyperparameters (k1, b) and RRF parameter (rrf_k) as configurable options, added pluggable fuzzy scorer selection, and made test symbol matching case-insensitive with corresponding fixture and assertion updates.

Changes

Cohort / File(s) Summary
Configuration & Runtime
code_locator/config.py, code_locator_runtime.py
Added BM25 tuning parameters bm25_k1 (1.5) and bm25_b (0.75); adjusted RRF parameter rrf_k default from 60 to 40. Updated rebuild_index() to explicitly pass these hyperparameters to BM25 initialization.
Symbol Grounding & Ranking
adapters/code_locator.py
Tightened coverage-tier thresholds (max_files: 2→3, 4→5, 6→7; fuzzy scores: 80→75, 70→65, 60→55). Added _SYMBOL_TYPE_PRIORITY mapping with _type_priority() helper; extended per-file symbol ranking to include symbol type priority as third sort key. Expanded _ground_single tokenization with regex-based identifier detection, keyword filtering, and three-track token generation (compound/camelCase identifiers, domain words, and case-form bigrams).
BM25 Retrieval
code_locator/retrieval/bm25s_client.py
Updated index() method signature to accept optional k1 and b BM25 tuning parameters (defaults 1.5 and 0.75); passed parameters to bm25s.BM25 constructor.
Fuzzy Matching
code_locator/tools/validate_symbols.py
Refactored Stage 2 of _fuzzy_match to select scorer dynamically from config (WRatio, token_set_ratio, or partial_ratio), defaulting to WRatio. Score now computes maximum of selected scorer's outputs for both name and qualified_name.
Test Evaluation & Fixtures
tests/eval_code_locator.py, tests/fixtures/expected/decisions.py
Made symbol matching case-insensitive in _is_relevant() and recall computation; expanded found_symbols collection to include all dot-separated symbol parts. Updated decision fixture expectations for symbol mappings (removed CartCompletionStrategy, JobSchedulerService, PaymentSessionService, PluginManager, WebhookEndpoint; renamed PluginManagerPluginsManager; adjusted file pattern matching for Vendure fixtures).
Test Coverage Loop
tests/test_coverage_loop.py
Updated tier progression expectations: strict tier 0 now triggers at max_f == 3 (was 2), tier 1 at max_f == 5 (was 4), full sequence [3, 5, 7] (was [2, 4, 6]). Adjusted all related assertions for tiers_tried and tier stamping tests.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Possibly related PRs

Suggested reviewers

  • jinhongkuan

Poem

🐰 Hops through the thresholds, tightening the knot,
Three token tracks dance where once was just thought,
BM25's parameters now tuned just right,
Symbols ranked by type—what a glorious sight!
Case-insensitive matching, fuzzy and fleet,
Config and testing complete the retreat.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 68.42% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly reflects the main achievement of the PR: significant improvements in grounding accuracy metrics (MRR@3 from 0.59 to 0.79, recall from 14% to 30%), which aligns with the PR objectives that demonstrate these exact metric improvements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch silong/grounding-accuracy-improvements

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (6)
tests/eval_code_locator.py (1)

129-133: Minor consistency nit: denominator could use lowercased set for symmetry.

If expected_symbols ever contains case-variant duplicates (e.g., {"Foo", "foo"}) — unlikely with the current fixtures — the numerator collapses after .lower() while the denominator uses the raw set cardinality, slightly underreporting recall. Using len(expected_lower) would make the metric consistent with the case-insensitive intent. Not a bug with the current fixture contents; flagging only for robustness.

♻️ Optional tweak
-        recall = matched_count / len(expected_symbols) if expected_symbols else 0
+        recall = matched_count / len(expected_lower) if expected_lower else 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/eval_code_locator.py` around lines 129 - 133, The recall calculation
uses len(expected_symbols) as the denominator while the numerator is derived
from expected_lower (case-folded), which miscounts when expected_symbols has
case-variant duplicates; update the recall computation to use
len(expected_lower) (i.e., recall = matched_count / len(expected_lower) if
expected_lower else 0) so the denominator matches the case-insensitive sets used
to compute matched_count (references: expected_lower, found_lower,
matched_count, recall).
adapters/code_locator.py (3)

152-155: Annotate _SYMBOL_TYPE_PRIORITY as ClassVar (RUF012).

Ruff flags this as a mutable class attribute. The mapping is effectively constant; annotating with ClassVar documents that intent and silences the lint without behavior change.

♻️ Proposed annotation
+from typing import ClassVar
@@
-    _SYMBOL_TYPE_PRIORITY = {
+    _SYMBOL_TYPE_PRIORITY: ClassVar[dict[str, int]] = {
         "class": 0, "interface": 1, "type_alias": 2,
         "function": 3, "method": 4, "variable": 5,
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/code_locator.py` around lines 152 - 155, Annotate the class-level
mapping _SYMBOL_TYPE_PRIORITY with ClassVar to mark it as an immutable class
attribute: add "from typing import ClassVar" to imports (or include ClassVar in
the existing typing import) and change the declaration to use ClassVar[dict[str,
int]] (or ClassVar[Mapping[str, int]] if Mapping is preferred) for the variable
type; keep the existing mapping values and name _SYMBOL_TYPE_PRIORITY unchanged
so the lint (RUF012) is satisfied without behavioral change.

208-210: Dead helper — _type_priority is defined but never called.

The sort lambda at lines 351-355 inlines self._SYMBOL_TYPE_PRIORITY.get(r["type"], 3) directly (which is correct, since the row is already in hand — calling this helper would add an unnecessary db.lookup_by_id per symbol). Either remove the helper or route the sort key through it; leaving it as-is invites future callers to pick the DB-fetching version by mistake.

♻️ Suggested removal
-    def _type_priority(self, sid: int, db) -> int:
-        row = db.lookup_by_id(sid)
-        return self._SYMBOL_TYPE_PRIORITY.get(row["type"], 3) if row else 3
-
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/code_locator.py` around lines 208 - 210, _remove the dead helper
_type_priority which performs an unnecessary db lookup; delete the method
definition and any unused imports so callers use the inline expression
self._SYMBOL_TYPE_PRIORITY.get(r["type"], 3) (or, if you intended a reusable
helper, replace _type_priority(sid, db) with a non-DB version like
_type_priority_from_row(row) that reads row["type"]). Ensure no remaining
references to _type_priority exist and adjust tests/usages accordingly.

234-267: Tokenization looks correct — one small heads-up on acronym handling.

Three-track extraction with de-dup is clean. One heuristic wrinkle: str.capitalize() lowercases the tail, so acronyms from the description — "JWT permissions" — produce "JwtPermissions" / "jwt_permissions" rather than "JWTPermissions". That's still a plausible symbol candidate but won't match true all-caps acronym-prefixed class names like JWTHandler. If grounding acronyms becomes a recall gap in follow-up evals, consider preserving the original casing of already-upper tokens (e.g., w if w.isupper() else w.capitalize()). Not blocking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/code_locator.py` around lines 234 - 267, The pascal-case
construction in the case_forms loop lowercases acronym tails via
str.capitalize(), causing all-caps tokens like "JWT" to become "JwtPermissions"
and miss true symbols like "JWTPermissions"; update the loop in
adapters/code_locator.py (the nl_words → case_forms section) to preserve
already-uppercase words when building pascal (e.g., use the original token if
token.isupper() else token.capitalize()) and similarly ensure snake uses
lowercased tokens for separation; keep the other heuristics and the length check
intact.
code_locator/retrieval/bm25s_client.py (1)

68-68: Protocol and concrete-class signatures now diverge — consider keeping them in sync.

BM25Search.index(self, repo_path, output_dir) (per code_locator/retrieval/bm25_protocol.py) no longer matches this concrete implementation, which now also accepts symbol_db, k1, and b. Because all extras have defaults, callers using the protocol type still work, so this is cosmetic — but documenting the tuning knobs at the protocol level would help future BM25 backends (e.g. the mentioned "zoekt") honor them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code_locator/retrieval/bm25s_client.py` at line 68, The BM25 protocol
signature and docs need to match the concrete implementation: update
BM25Search.index in bm25_protocol.py to accept the same optional parameters
(symbol_db=None, k1: float = 1.5, b: float = 0.75) and add a short docstring
describing each tuning knob so other backends (e.g., zoekt) can implement them
consistently; ensure the parameter names and default values exactly match the
concrete method index in bm25s_client.py.
code_locator/tools/validate_symbols.py (1)

80-85: Hoist _SCORERS to module or class scope.

The dict is rebuilt on every call to _fuzzy_match. Lifting it to a module-level constant (or class attribute) keeps the scorer resolution centralized and avoids the per-call dict allocation. Functionally fine as-is.

♻️ Proposed refactor
 from rapidfuzz import fuzz

+_SCORERS = {
+    "WRatio": fuzz.WRatio,
+    "token_set_ratio": fuzz.token_set_ratio,
+    "partial_ratio": fuzz.partial_ratio,
+}
+
 # JSON Schema for tool parameter validation
@@
-        _SCORERS = {
-            "WRatio": fuzz.WRatio,
-            "token_set_ratio": fuzz.token_set_ratio,
-            "partial_ratio": fuzz.partial_ratio,
-        }
-        scorer = _SCORERS.get(self.config.fuzzy_scorer, fuzz.WRatio)
+        scorer = _SCORERS.get(self.config.fuzzy_scorer, fuzz.WRatio)

One silent-fallback concern worth considering: an unknown/typo'd fuzzy_scorer value in config currently silently reverts to WRatio. Consider logging a warning so a misconfigured env var (CODE_LOCATOR_FUZZY_SCORER=token_sort_ratio) doesn't quietly run with the wrong scorer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code_locator/tools/validate_symbols.py` around lines 80 - 85, Hoist the
_SCORERS dict out of _fuzzy_match into module scope (e.g. TOP_LEVEL _SCORERS
constant) or a class attribute so it isn’t recreated per call, then have
_fuzzy_match look up scorer = _SCORERS.get(self.config.fuzzy_scorer,
fuzz.WRatio); additionally, detect when the get() falls back (i.e.
self.config.fuzzy_scorer not a key) and emit a warning via the existing logger
(or logging.getLogger(__name__)) indicating the unknown fuzzy_scorer and that
WRatio is being used as a fallback; keep symbol names _SCORERS, _fuzzy_match,
and self.config.fuzzy_scorer exactly as in the diff so the changes are easy to
locate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@adapters/code_locator.py`:
- Around line 152-155: Annotate the class-level mapping _SYMBOL_TYPE_PRIORITY
with ClassVar to mark it as an immutable class attribute: add "from typing
import ClassVar" to imports (or include ClassVar in the existing typing import)
and change the declaration to use ClassVar[dict[str, int]] (or
ClassVar[Mapping[str, int]] if Mapping is preferred) for the variable type; keep
the existing mapping values and name _SYMBOL_TYPE_PRIORITY unchanged so the lint
(RUF012) is satisfied without behavioral change.
- Around line 208-210: _remove the dead helper _type_priority which performs an
unnecessary db lookup; delete the method definition and any unused imports so
callers use the inline expression self._SYMBOL_TYPE_PRIORITY.get(r["type"], 3)
(or, if you intended a reusable helper, replace _type_priority(sid, db) with a
non-DB version like _type_priority_from_row(row) that reads row["type"]). Ensure
no remaining references to _type_priority exist and adjust tests/usages
accordingly.
- Around line 234-267: The pascal-case construction in the case_forms loop
lowercases acronym tails via str.capitalize(), causing all-caps tokens like
"JWT" to become "JwtPermissions" and miss true symbols like "JWTPermissions";
update the loop in adapters/code_locator.py (the nl_words → case_forms section)
to preserve already-uppercase words when building pascal (e.g., use the original
token if token.isupper() else token.capitalize()) and similarly ensure snake
uses lowercased tokens for separation; keep the other heuristics and the length
check intact.

In `@code_locator/retrieval/bm25s_client.py`:
- Line 68: The BM25 protocol signature and docs need to match the concrete
implementation: update BM25Search.index in bm25_protocol.py to accept the same
optional parameters (symbol_db=None, k1: float = 1.5, b: float = 0.75) and add a
short docstring describing each tuning knob so other backends (e.g., zoekt) can
implement them consistently; ensure the parameter names and default values
exactly match the concrete method index in bm25s_client.py.

In `@code_locator/tools/validate_symbols.py`:
- Around line 80-85: Hoist the _SCORERS dict out of _fuzzy_match into module
scope (e.g. TOP_LEVEL _SCORERS constant) or a class attribute so it isn’t
recreated per call, then have _fuzzy_match look up scorer =
_SCORERS.get(self.config.fuzzy_scorer, fuzz.WRatio); additionally, detect when
the get() falls back (i.e. self.config.fuzzy_scorer not a key) and emit a
warning via the existing logger (or logging.getLogger(__name__)) indicating the
unknown fuzzy_scorer and that WRatio is being used as a fallback; keep symbol
names _SCORERS, _fuzzy_match, and self.config.fuzzy_scorer exactly as in the
diff so the changes are easy to locate.

In `@tests/eval_code_locator.py`:
- Around line 129-133: The recall calculation uses len(expected_symbols) as the
denominator while the numerator is derived from expected_lower (case-folded),
which miscounts when expected_symbols has case-variant duplicates; update the
recall computation to use len(expected_lower) (i.e., recall = matched_count /
len(expected_lower) if expected_lower else 0) so the denominator matches the
case-insensitive sets used to compute matched_count (references: expected_lower,
found_lower, matched_count, recall).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6bb0a65c-4ad5-4cb6-a0b7-7154001e85d6

📥 Commits

Reviewing files that changed from the base of the PR and between f06b5df and 6e466f1.

📒 Files selected for processing (8)
  • adapters/code_locator.py
  • code_locator/config.py
  • code_locator/retrieval/bm25s_client.py
  • code_locator/tools/validate_symbols.py
  • code_locator_runtime.py
  • tests/eval_code_locator.py
  • tests/fixtures/expected/decisions.py
  • tests/test_coverage_loop.py

@silongtan silongtan merged commit eca0c8d into main Apr 19, 2026
1 check passed
Knapp-Kevin added a commit that referenced this pull request May 2, 2026
Three-round audit cycle (VETO -> VETO -> PASS) for Notion ingest +
cache contract migration. Plan ships across five phases:

- Phase 0 — cache contract migration (schema v1->v2, schema_version
  table, callable migration dispatch, upsert_canonical_extraction)
- Phase 0.5 — worker-task lifecycle pattern + Slack reference wiring
  (closes the v0 dormant-Slack-worker gap)
- Phase 1 — Notion API client + property serializer (internal-
  integration auth, no OAuth router)
- Phase 2 — Notion ingest worker (per-database watermark, peer-
  authored team_event)
- Phase 3 — Notion task registration on lifespan

META_LEDGER entries #29-#33 capture: round-1 VETO (4 missing/
undeclared symbols), round-2 VETO (1 wrong-call-shape for
decrypt_token), round-3 PASS, IMPLEMENT, and SUBSTANTIATION.

SHADOW_GENOME #7 addendum extends the PARALLEL_STRUCTURE_ASSUMED
detection heuristic with three new in-sketch checks: signature,
type-boundary, helper-symmetry. The two VETOs in this session are
the empirical justification.

SYSTEM_STATE.md adds the Priority C v1 section: schema state (v2),
architectural properties achieved, audit cycle outcomes,
implementation deviations from plan.

Merkle seal: SHA256(content_hash + previous_hash) =
dcb619104e6d88b97a04689093b80b9f03825f9a24bac3c3b9ab3d0107ff24d7
(content_hash 9f003c40..., previous_hash 6f4f8f8f... = Priority C v0
SEAL at Entry #28).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added a commit that referenced this pull request May 3, 2026
QorLogic SDLC governance trail for the Priority C v0 implementation
that landed in commits 1-4 of this PR.

Includes:

- docs/research-brief-priority-c-selective-ingest-2026-05-02.md (v3) —
  research substrate. v1 was rejected for INVARIANT_FROM_IMPLEMENTATION
  (treating v0 agent-fetches-only code state as product principle); v2
  added playbook substrate; v3 narrowed to Slack-first + team-server
  + CocoIndex-conditional after operator dialogue clarified "no managed
  backend" = "no human-ops-tax architecture," not "no backend."
- plan-priority-c-team-server-slack-v0.md (437 LOC) — the L3 plan
  with five phases (Phase 5 deferred per "if we can manage it"
  feasibility caveat).
- docs/SHADOW_GENOME.md Failure Entry #6 + addendum — captures the
  framing-error pattern AND the "anti-goals must be parsed by their
  load-bearing keyword" lesson; symmetric to v0-code-as-principle.
- docs/META_LEDGER.md Entries #27 (IMPLEMENT) + #28 (SEAL).
  Predecessor: efd0304b (#135-triage seal on dev).
  Implement chain: 211ffb9e.
  Substantiation seal: 6f4f8f8f1d63ad82b952a3c6aff270d30584e08b0572077ff685e84ce453f6c2
- docs/SYSTEM_STATE.md — Priority C v0 section appended; documents
  schema additions, architectural properties achieved, audit advisory
  disposition, Phase 5 deferred state, and the qor-logic-internal
  steps skipped for downstream-project rationale.
- .agent/staging/AUDIT_REPORT.md — PASS verdict, three non-blocking
  advisories all addressed at implement-time.

Verdict: REALITY = PROMISE for Phases 1-4. Phase 5 (CocoIndex #136)
explicitly deferred per plan slip-independence design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added a commit that referenced this pull request May 3, 2026
Three-round audit cycle (VETO -> VETO -> PASS) for Notion ingest +
cache contract migration. Plan ships across five phases:

- Phase 0 — cache contract migration (schema v1->v2, schema_version
  table, callable migration dispatch, upsert_canonical_extraction)
- Phase 0.5 — worker-task lifecycle pattern + Slack reference wiring
  (closes the v0 dormant-Slack-worker gap)
- Phase 1 — Notion API client + property serializer (internal-
  integration auth, no OAuth router)
- Phase 2 — Notion ingest worker (per-database watermark, peer-
  authored team_event)
- Phase 3 — Notion task registration on lifespan

META_LEDGER entries #29-#33 capture: round-1 VETO (4 missing/
undeclared symbols), round-2 VETO (1 wrong-call-shape for
decrypt_token), round-3 PASS, IMPLEMENT, and SUBSTANTIATION.

SHADOW_GENOME #7 addendum extends the PARALLEL_STRUCTURE_ASSUMED
detection heuristic with three new in-sketch checks: signature,
type-boundary, helper-symmetry. The two VETOs in this session are
the empirical justification.

SYSTEM_STATE.md adds the Priority C v1 section: schema state (v2),
architectural properties achieved, audit cycle outcomes,
implementation deviations from plan.

Merkle seal: SHA256(content_hash + previous_hash) =
dcb619104e6d88b97a04689093b80b9f03825f9a24bac3c3b9ab3d0107ff24d7
(content_hash 9f003c40..., previous_hash 6f4f8f8f... = Priority C v0
SEAL at Entry #28).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant