v22→v23 migration: backfill decision_level for legacy decisions#371

Merged
jinhongkuan merged 3 commits into dev from devin/1778887249-backfill-decision-level
May 15, 2026

@devin-ai-integration

Contributor

Summary

The v8→v9 migration added decision_level to the decision table (DEFAULT NONE) but did not classify existing rows. The #340 auto-classify heuristic only runs on newly ingested decisions, so all pre-#340 rows remain NONE (unclassified). Per the tolerant policy in handlers/bind.py:298, NONE is treated as L3 — silently excluding legacy decisions from the codegenome identity graph.

This PR adds a v22→v23 corrective migration that retroactively classifies all NONE rows using the same deterministic heuristic as _classify_decision_level (from ledger/adapter.py:90-105):

  1. Has binds_to edge → L2 (architecture, code-grounded)
  2. source_type ∈ {transcript, notion, slack, document} → L1 (product commitment)
  3. source_type ∈ {implementation_choice, agent_session} → L3 (technical detail)
  4. Remaining → L2 (safe default — enters identity graph)
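The priority order above can be sketched as a small pure function. This is an illustrative sketch only, not the code in ledger/adapter.py: the function name `classify_legacy_decision`, its parameters, and the two constant names are hypothetical, and only the source_type sets and the resulting levels come from the list above.

```python
from typing import Optional

# Source types taken from the heuristic list in this PR description.
L1_SOURCES = frozenset({"transcript", "notion", "slack", "document"})
L3_SOURCES = frozenset({"implementation_choice", "agent_session"})


def classify_legacy_decision(source_type: Optional[str], has_binds_to: bool) -> str:
    """Return "L1", "L2", or "L3" for an unclassified decision (hypothetical helper)."""
    # Step 1: a binds_to edge wins over everything else.
    if has_binds_to:
        return "L2"
    # Step 2: product-facing sources are product commitments.
    if source_type in L1_SOURCES:
        return "L1"
    # Step 3: implementation-facing sources are technical detail.
    if source_type in L3_SOURCES:
        return "L3"
    # Step 4: anything else falls back to L2 so it enters the identity graph.
    return "L2"
```

Because step 1 is checked first, a bound decision ingested from Slack is classified L2, not L1, which is the priority question raised in the review checklist.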

The migration is idempotent (it only touches rows WHERE decision_level IS NONE) and non-destructive. Decisions that already have an explicit level are untouched.
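The idempotency guard can be illustrated over plain in-memory rows. This is a minimal sketch under stated assumptions: `backfill_unclassified` and the `classify` callback are hypothetical names, rows are plain dicts, and Python `None` stands in for SurrealDB `NONE`.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]


def backfill_unclassified(rows: List[Row], classify: Callable[[Row], str]) -> int:
    """Classify only rows whose decision_level is unset; return how many changed."""
    changed = 0
    for row in rows:
        # Guard: rows with an explicit level are never overwritten.
        if row.get("decision_level") is not None:
            continue
        row["decision_level"] = classify(row)
        changed += 1
    return changed
```

A second run finds no unset rows and changes nothing, so re-running the backfill is safe.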

SurrealDB v2 quirk documented: ->binds_to->code_region IS NOT EMPTY returns True for all rows in embedded mode regardless of actual edge presence. Step 1 works around this by querying the binds_to edge table directly.
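A rough sketch of the workaround: collect the set of bound decision ids from the rows of the binds_to edge table instead of relying on graph traversal. The helper name `bound_decision_ids` is hypothetical, and the sketch assumes edges were created as decision->binds_to->code_region, so that each edge row's `in` field holds the decision id (SurrealDB edge records carry `in` and `out` fields). The actual query used by the migration is not shown in this PR description.

```python
from typing import Any, Dict, Iterable, Set


def bound_decision_ids(edge_rows: Iterable[Dict[str, Any]]) -> Set[str]:
    """Return ids of decisions that have at least one binds_to edge.

    edge_rows is assumed to be the result of reading the binds_to edge
    table directly (for example, selecting its `in` field).
    """
    bound: Set[str] = set()
    for edge in edge_rows:
        source = edge.get("in")
        # Ignore malformed edge rows with no source record.
        if source is not None:
            bound.add(str(source))
    return bound
```

A decision is then treated as bound only if its id appears in this set, which avoids the traversal expression that matches every row.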

Review & Testing Checklist for Human

  • Verify the heuristic priority matches your intent: bound decisions always get L2 even if their source_type would suggest L1 — is that correct for your data?
  • Check that existing ledgers with manually classified decisions won't be overwritten (the WHERE decision_level IS NONE guard handles this, but worth verifying against your production data)
  • Run python -m pytest tests/test_v23_decision_level_backfill.py -v against a clone with your production .bicameral/ledger.db to confirm the migration doesn't error on your real data

Notes

  • The SurrealDB v2 IS NOT EMPTY quirk on graph traversal is a significant footgun — any future migration or query that relies on it will silently match all rows. The workaround (querying the edge table directly) is reliable.
  • execute() returns None for bulk UPDATEs in SurrealDB v2 embedded, so migration log counts for steps 2-4 will show 0. Step 1 (per-row UPDATE) counts correctly.
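Since a logged 0 can mean either "no rows matched" or "the driver did not report a count", a log helper could make that distinction explicit. This is only a sketch of one option: `describe_count` is a hypothetical helper, and the assumption that a non-None result would be a list of affected rows is mine, not something this PR states.

```python
from typing import Any


def describe_count(result: Any) -> str:
    """Format a migration-step count for logging (hypothetical helper)."""
    # The PR notes that bulk UPDATEs return None in embedded mode,
    # so None is reported as "unavailable" rather than as zero rows.
    if result is None:
        return "count unavailable"
    # Assumption: when rows are returned, they arrive as a list.
    if isinstance(result, list):
        return f"{len(result)} rows"
    return "count unavailable"
```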

Link to Devin session: https://app.devin.ai/sessions/c38802aa5aac42ecb0ce7685d9c2d951
Requested by: @jinhongkuan

The v8→v9 migration added the decision_level field (DEFAULT NONE) but
did not classify existing rows. The #340 auto-classify heuristic only
runs on newly ingested decisions, so all pre-#340 rows remain NONE
(unclassified). Per the tolerant policy, NONE is treated as L3 — this
silently excludes legacy decisions from the codegenome identity graph.

This migration applies the same deterministic heuristic used by
_classify_decision_level at ingest time:

  1. Has binds_to edge → L2 (architecture, code-grounded).
  2. source_type ∈ {transcript, notion, slack, document} → L1.
  3. source_type ∈ {implementation_choice, agent_session} → L3.
  4. Remaining → L2 (safe default — enters identity graph).

Idempotent: only touches rows WHERE decision_level IS NONE.

Works around SurrealDB v2 embedded quirk where
->binds_to->code_region IS NOT EMPTY returns True for all rows
regardless of actual edge presence. Uses binds_to edge table directly
to identify bound decisions.

Includes 8 sociable tests over memory:// covering all classification
paths, priority ordering, idempotency, and preservation of existing
levels.

Co-Authored-By: Jin Hong Kuan <jin@bicameral-ai.com>
@devin-ai-integration

Contributor Author

Prompt hidden (unlisted session)

@devin-ai-integration

Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

Co-Authored-By: Jin Hong Kuan <jin@bicameral-ai.com>
SurrealDB v2 re-validates entire records on UPDATE. Legacy fixture
rows (v3-era) have NONE in required typed fields (created_at,
feature_hint, etc.) that were added by later schema versions.
Bulk UPDATEs fail on these rows even though the migration only
touches decision_level.

Switch to per-row UPDATEs with try/except: classify in Python,
UPDATE individually, skip rows that fail re-validation. Skipped
rows remain NONE (treated as L3 by tolerant policy — same as
current behavior).

Co-Authored-By: Jin Hong Kuan <jin@bicameral-ai.com>
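
The per-row approach described in the commit message above (classify in Python, UPDATE individually, skip rows that fail re-validation) might look roughly like the following. All names here are hypothetical: `backfill_per_row`, the `classify` callback, and the `update_one` callback stand in for the migration's real functions and database call, which are not shown in this PR description.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def backfill_per_row(
    rows: Iterable[Row],
    classify: Callable[[Row], str],
    update_one: Callable[[str, str], None],
) -> Tuple[int, List[str]]:
    """Update rows one at a time; return (updated count, skipped row ids)."""
    updated = 0
    skipped: List[str] = []
    for row in rows:
        level = classify(row)
        try:
            # update_one is assumed to raise when the database rejects the
            # record, e.g. when re-validation fails on a legacy row.
            update_one(row["id"], level)
        except Exception:
            # Skipped rows keep their unset level, as the commit message says.
            skipped.append(row["id"])
            continue
        updated += 1
    return updated, skipped
```

Counting in Python this way also gives an accurate tally of updated and skipped rows, independent of what the database call returns.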