v22→v23 migration: backfill decision_level for legacy decisions#371
Merged
Conversation
The v8→v9 migration added the decision_level field (DEFAULT NONE) but did not classify existing rows. The #340 auto-classify heuristic only runs on newly ingested decisions, so all pre-#340 rows remain NONE (unclassified). Per the tolerant policy, NONE is treated as L3 — this silently excludes legacy decisions from the codegenome identity graph. This migration applies the same deterministic heuristic used by _classify_decision_level at ingest time: 1. Has binds_to edge → L2 (architecture, code-grounded). 2. source_type ∈ {transcript, notion, slack, document} → L1. 3. source_type ∈ {implementation_choice, agent_session} → L3. 4. Remaining → L2 (safe default — enters identity graph). Idempotent: only touches rows WHERE decision_level IS NONE. Works around SurrealDB v2 embedded quirk where ->binds_to->code_region IS NOT EMPTY returns True for all rows regardless of actual edge presence. Uses binds_to edge table directly to identify bound decisions. Includes 8 sociable tests over memory:// covering all classification paths, priority ordering, idempotency, and preservation of existing levels. Co-Authored-By: Jin Hong Kuan <jin@bicameral-ai.com>
Contributor
Author
|
Prompt hidden (unlisted session) |
Contributor
Author
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
Co-Authored-By: Jin Hong Kuan <jin@bicameral-ai.com>
SurrealDB v2 re-validates entire records on UPDATE. Legacy fixture rows (v3-era) have NONE in required typed fields (created_at, feature_hint, etc.) that were added by later schema versions. Bulk UPDATEs fail on these rows even though the migration only touches decision_level. Switch to per-row UPDATEs with try/except: classify in Python, UPDATE individually, skip rows that fail re-validation. Skipped rows remain NONE (treated as L3 by tolerant policy — same as current behavior). Co-Authored-By: Jin Hong Kuan <jin@bicameral-ai.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The v8→v9 migration added
decision_levelto the decision table (DEFAULT NONE) but did not classify existing rows. The #340 auto-classify heuristic only runs on newly ingested decisions, so all pre-#340 rows remainNONE(unclassified). Per the tolerant policy inhandlers/bind.py:298,NONEis treated as L3 — silently excluding legacy decisions from the codegenome identity graph.This PR adds a v22→v23 corrective migration that retroactively classifies all
NONErows using the same deterministic heuristic as_classify_decision_level(fromledger/adapter.py:90-105):binds_toedge → L2 (architecture, code-grounded)The migration is idempotent (only touches
WHERE decision_level IS NONE) and non-destructive. Decisions that already have an explicit level are untouched.SurrealDB v2 quirk documented:
->binds_to->code_region IS NOT EMPTYreturnsTruefor all rows in embedded mode regardless of actual edge presence. Step 1 works around this by querying thebinds_toedge table directly.Review & Testing Checklist for Human
WHERE decision_level IS NONEguard handles this, but worth verifying against your production data)python -m pytest tests/test_v23_decision_level_backfill.py -vagainst a clone with your production.bicameral/ledger.dbto confirm the migration doesn't error on your real dataNotes
IS NOT EMPTYquirk on graph traversal is a significant footgun — any future migration or query that relies on it will silently match all rows. The workaround (querying the edge table directly) is reliable.execute()returnsNonefor bulk UPDATEs in SurrealDB v2 embedded, so migration log counts for steps 2-4 will show 0. Step 1 (per-row UPDATE) counts correctly.Link to Devin session: https://app.devin.ai/sessions/c38802aa5aac42ecb0ce7685d9c2d951
Requested by: @jinhongkuan