feat(ledger_locator): move state to ~/.bicameral/projects/<id>/ (#368) by jinhongkuan · Pull Request #408 · BicameralAI/bicameral-mcp

jinhongkuan · 2026-05-19T02:57:53Z

Summary

Adds the ledger_locator/ module: deterministic resolver for ledger / code-graph / bm25 / watermark / transcript-queue / operator-config paths, keyed off sha256(git rev-parse --git-common-dir)[:16] so worktrees of the same clone share one state bag at ~/.bicameral/projects/<id>/.
Delegates every existing call site (ledger adapter, code-locator runtime + config, events materializer + transcript queue, setup_wizard) to the locator. Splits <repo>/.bicameral/config.yaml per R4: team-identity keys stay committed, per-operator keys (telemetry, channel, guided, signer/render attribution, query timeouts, rate limits, team.role) move to ~/.bicameral/projects/<id>/operator.yaml. context.py readers route per-key via _CONFIG_KEY_ROUTING.
Ships bicameral-mcp migrate-state (one-shot, idempotent, archive-on-collision, R4 config-yaml partition) + bicameral-mcp gc (orphan project-dir reclaim). bicameral-update skill now runs migrate-state --auto post-upgrade.
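
The project-id derivation in the first bullet can be sketched as follows. This is a minimal illustration, not the module's actual code: the function names `git_common_dir`, `project_id_from_common_dir`, and `project_dir_for` are hypothetical, and resolving the common dir to an absolute path before hashing is an assumption (git may print a path relative to the working directory, such as `.git`).

```python
import hashlib
import subprocess
from pathlib import Path


def project_id_from_common_dir(common_dir: Path) -> str:
    """Hash a git common-dir path down to a 16-hex-character project id."""
    digest = hashlib.sha256(str(common_dir).encode("utf-8")).hexdigest()
    return digest[:16]


def git_common_dir(repo_path: Path) -> Path:
    """Ask git for the common dir; linked worktrees of one clone share it."""
    out = subprocess.run(
        ["git", "rev-parse", "--git-common-dir"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()
    # Assumption: anchor a relative answer (e.g. ".git") at repo_path so that
    # every worktree hashes the same absolute path.
    return (repo_path / out).resolve()


def project_dir_for(repo_path: Path) -> Path:
    """Return ~/.bicameral/projects/<id>/ for the clone that owns repo_path."""
    project_id = project_id_from_common_dir(git_common_dir(repo_path))
    return Path.home() / ".bicameral" / "projects" / project_id
```

Because the hash input is the common dir rather than the working tree, a checkout created with `git worktree add` would land in the same project directory as its parent clone under these assumptions.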

Linked issues

Closes BicameralAI/bicameral-daemon#2

Linked decisions

Closes decision:ko8efq3z1zwhbof7kecq — Name "Ledger Locator"
Closes decision:c2eqcwimhe4lpaexrddw — Supported environments scope lock
Closes decision:fi1def9bci6s6fcflc2p — Branch isolation stays logical (locator carries the rationale forward)
Closes decision:rfbnlw7ghe175iu42u6b — Project identity via git common-dir hash
Closes decision:5nr66wvmapjpt58rrji8 — R4 config split (team-identity vs per-operator)
Closes decision:ew9rgegdlblexsraesss — Delete resolve_config_path(); wizard onboarding via git show HEAD:.bicameral/config.yaml
Closes decision:6c20xahdyxk3suzav4pj — Explicit VCS contract (ProjectIdResolutionError names "git only" assumption)
Closes decision:ogdfx014sqgc6fi6ky1a — Reuse _resolve_authoritative_ref() for the divergence guard
Refs decision:e3xz4c4ji4x7lm3lvq4k — Defer ephemeral environments (one-line notice in migrate-state success summary; full support deferred to v0.16.1/v0.17)

Plan / Audit / Seal

Plan: thoughts/shared/plans/2026-05-16-ledger-locator-and-migration.md (R4-bis)
Audit: R4 audit verdict PASS (R1 PASS → R3 scope expansion → R4 VETO → R4-bis incorporated all three V1/V2/V3 corrections)
Seal: pending squash-merge to dev

Test plan

pytest tests/test_ledger_locator.py tests/test_ledger_locator_origin_guard.py tests/test_ledger_locator_vcs_contract.py -q — Phase 1 (locator + origin guard + VCS contract): 19/19
pytest tests/test_ledger_adapter_uses_locator.py tests/test_code_locator_runtime_uses_locator.py tests/test_code_locator_config_none_safe.py -q — Phase 2A/2B delegation: 9/9
pytest tests/test_setup_wizard_omits_state_env_vars.py tests/test_config_split.py tests/test_setup_wizard_git_native.py tests/test_run_config_wizard.py -q — Phase 2C wizard split + git-native onboarding + two-pane editor: 16/16
pytest tests/test_migrate_state.py -q — Phase 3 migration CLI (12 tests incl. byte-equality, idempotency, archive-on-collision, dry-run, partial state, --auto, default archive dir, bm25/watermark/transcript queues, legacy user-global ledger, R4 config-yaml partition): 12/12
pytest tests/test_gc.py -q — Phase 4 orphan reclaim CLI: 5/5
pytest tests/test_setup_wizard*.py tests/test_v0410_guided_mode.py tests/test_signer_email_fallback.py tests/test_context_ingest_rate_limit.py tests/test_preflight_attribution_redaction.py tests/test_preflight_render_source_attribution.py -q — regression on every existing test that touches a routed key: 145/145
ruff check ledger_locator cli/migrate_state.py cli/gc.py setup_wizard.py context.py server.py tests/test_*.py — clean

🤖 Generated with Claude Code

R4 retracts R3's `resolve_config_path()` (filesystem-topology inference for primary-worktree convergence) and replaces it with five amendments:

1. DELETE `resolve_config_path()` from the locator. Runtime readers in `context.py` revert to direct `<repo>/.bicameral/config.yaml` access; wizard uses `git show HEAD:.bicameral/config.yaml` for onboarding detection. No filesystem-topology inference: the concept ports cleanly across all 9 deployment shapes catalogued in the Topology Problem Notion page (worktree, submodule, bare-repo, sparse checkout, devcontainer, Codespaces, CI, --separate-git-dir, non-git VCS).
2. CONFIG SPLIT: team-identity keys stay at `<repo>/.bicameral/config.yaml` (git-committed); per-operator keys move to `~/.bicameral/projects/<id>/operator.yaml` (per-machine). Routing table `context._CONFIG_KEY_ROUTING` is the single source of truth.
3. EXPLICIT VCS CONTRACT: structured `ProjectIdResolutionError` from `ledger_locator/_project_id.py::common_dir_for` when `git rev-parse` fails, with verbatim "bicameral currently supports git only".
4. REUSE `_resolve_authoritative_ref()` for divergence guard.
5. DEFER full ephemeral-environment support; R4 adds only a one-line notice in `migrate-state` post-flight summary.

Backed by 9 ratified bicameral decisions + 14 explicitly-rejected alternatives (see bicameral ledger 2026-05-18 session).

Audit chain:
- R4 (2026-05-18) VETO: V1 `_edit_config_interactive` wrong function name; V2 stale context.py reader lines + bogus 412-429 entry; V3 missing tests for solo-mode short-circuit + `run_config_wizard` editor. Gate: .qor/gates/2026-05-18T2334-r4audit/audit.json (local).
- R4-bis (2026-05-18) PASS after plan-text fixes. Gate: .qor/gates/2026-05-18T2338-r4bis/audit.json (local).

META_LEDGER entries #51 (VETO) and #52 (PASS) recorded. Phase 1 implementation lands in the following commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…368)

Phase 1 of the Ledger Locator plan (R4-bis PASS).

Adds:
- `resolve_operator_config_path()` returns `<STATE_ROOT>/<project-id>/operator.yaml` for the R4 config split (decision:5nr66wvmapjpt58rrji8). Per-operator keys (telemetry, channel, guided, signer_email_fallback, render_source_attribution, rate-limit knobs, query timeouts) will land here in Phase 2; the locator anchor lands first so consumers have a stable resolution function to import.
- Explicit VCS contract in `_project_id.py::common_dir_for` (decision:6c20xahdyxk3suzav4pj). When `git rev-parse --git-common-dir` fails, the raised `ProjectIdResolutionError` now names the assumption verbatim: "bicameral currently supports git only; non-git VCSes are not yet implemented." Forces future ports to jj/sapling/fossil to be a deliberate locator amendment rather than an accidental success on a misclassified VCS.

Phase 1 does NOT include:
- `resolve_config_path()`: explicitly rejected per R4 (decision:ew9rgegdlblexsraesss; see superseded R3 design at decision:6z39wrjpmmg9vhm8i6t4). Config readers will read `<repo>/.bicameral/config.yaml` directly in Phase 2.
- Phase 2 call-site delegation (context.py per-key routing, setup_wizard.py config split, git-show-HEAD onboarding detection).
- Phase 3 migrate-state CLI / Phase 4 gc CLI.

Tests (14 passing):
- 8 existing in `test_ledger_locator.py` + 3 in `test_ledger_locator_origin_guard.py` preserved.
- 2 new operator-config tests: path under project dir + stability across `git worktree add` checkouts.
- 3 new VCS-contract tests in `test_ledger_locator_vcs_contract.py`: verifies the verbatim error message surfaces from `resolve_ledger_url`, `common_dir_for`, and `project_id_for`.

ruff: clean. Plan amendment + R4-bis audit gate in the preceding commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
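
The explicit VCS contract described in this commit might look roughly like the sketch below. Only the error class name, the function name `common_dir_for`, and the quoted notice come from the commit message; the constructor arguments, the message prefix, and the handling of a missing `git` binary are assumptions made for illustration.

```python
import subprocess
from pathlib import Path

# Wording quoted from the commit message above.
GIT_ONLY_NOTICE = (
    "bicameral currently supports git only; "
    "non-git VCSes are not yet implemented."
)


class ProjectIdResolutionError(RuntimeError):
    """Structured failure: keeps the repo path and git's own diagnostic."""

    def __init__(self, repo_path: Path, detail: str) -> None:
        self.repo_path = repo_path
        self.detail = detail
        super().__init__(
            f"cannot resolve project id for {repo_path}: {detail}. {GIT_ONLY_NOTICE}"
        )


def common_dir_for(repo_path: Path) -> Path:
    """Return the git common dir, or raise with the git-only notice."""
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "--git-common-dir"],
            cwd=repo_path,
            capture_output=True,
            text=True,
        )
    except OSError as exc:
        # Assumption: a missing git binary or a missing directory is reported
        # the same way as a failed rev-parse.
        raise ProjectIdResolutionError(repo_path, str(exc)) from exc
    if proc.returncode != 0:
        detail = proc.stderr.strip() or "git rev-parse failed"
        raise ProjectIdResolutionError(repo_path, detail)
    return (repo_path / proc.stdout.strip()).resolve()
```

Raising a dedicated exception type, rather than letting `CalledProcessError` escape, lets callers such as a runtime bootstrap catch exactly this condition and decide whether to fall back or surface the message.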

…rk, transcripts) (#368)

Phase 2A of the Ledger Locator plan (R4-bis PASS). Adds four R3 locator functions for project-scoped derived state, sibling to ledger.db and code-graph.db under `~/.bicameral/projects/<id>/`. These enable Phase 2B to delegate the corresponding call sites in events/, code_locator/, and scripts/hooks/ to the locator.

Added:
- `resolve_bm25_index_path()`: derived from code-graph; sibling to it.
- `resolve_watermark_path()`: replaces `events/materializer.py`'s `local_dir/"watermark"`. Fixes per-worktree re-replay of peer events.
- `resolve_pending_transcripts_dir()`: replaces `events/transcript_queue.py:_pending_root`. Fixes per-worktree invisibility of SessionEnd-hook transcripts.
- `resolve_processed_transcripts_dir()`: sibling to pending.

Refactor:
- Extracted `_resolved_project_dir(repo_path)` private helper. The resolve-repo + assert-origin pattern was duplicated across 3 public resolvers; adding 4 more would push it to 7. Helper centralizes the pipeline (repo → project-id → origin-guard) so each public resolver is now a one-line return. Net diff is line-neutral for existing resolvers; new ones get 2 lines each instead of 5.

Tests (17 passing, +3 from Phase 1):
- `test_resolves_derived_state_paths_under_project_dir`: all four new paths share the same project dir as code-graph.db.
- `test_derived_state_paths_stable_across_worktrees`: paths are identical across `git worktree add` checkouts (the whole point of project-scoping derived state).
- `test_derived_state_paths_have_no_env_override`: unrelated env overrides (SURREAL_URL, CODE_LOCATOR_SQLITE_DB) do not leak into these paths; they always resolve to the project dir.

ruff: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
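
The refactor above (one private helper, one-line public resolvers) can be illustrated with the following sketch. Several parts are assumptions rather than the module's real code: the `state_root` parameter, the `_common_dir` stand-in for the git call, the `OriginMismatchError` name, and the idea that the origin guard compares the common dir against `origin.txt`. The file and directory names are the ones listed elsewhere in this PR.

```python
import hashlib
from pathlib import Path


class OriginMismatchError(RuntimeError):
    """Hypothetical error: the project dir is already claimed by another clone."""


def _common_dir(repo_path: Path) -> Path:
    # Stand-in for `git rev-parse --git-common-dir`; assumes a plain clone.
    return (repo_path / ".git").resolve()


def _resolved_project_dir(repo_path: Path, state_root: Path) -> Path:
    """repo -> project id -> origin guard, shared by every public resolver."""
    common_dir = _common_dir(repo_path)
    project_id = hashlib.sha256(str(common_dir).encode("utf-8")).hexdigest()[:16]
    project_dir = state_root / project_id
    project_dir.mkdir(parents=True, exist_ok=True)
    origin_file = project_dir / "origin.txt"
    if not origin_file.exists():
        # First visit: record which clone owns this state bag.
        origin_file.write_text(str(common_dir), encoding="utf-8")
    elif origin_file.read_text(encoding="utf-8").strip() != str(common_dir):
        raise OriginMismatchError(f"{project_dir} belongs to a different clone")
    return project_dir


# Each public resolver is a one-line return on top of the helper.
def resolve_code_graph_path(repo_path: Path, state_root: Path) -> Path:
    return _resolved_project_dir(repo_path, state_root) / "code-graph.db"


def resolve_bm25_index_path(repo_path: Path, state_root: Path) -> Path:
    return _resolved_project_dir(repo_path, state_root) / "bm25_index.pkl"


def resolve_watermark_path(repo_path: Path, state_root: Path) -> Path:
    return _resolved_project_dir(repo_path, state_root) / "watermark"


def resolve_pending_transcripts_dir(repo_path: Path, state_root: Path) -> Path:
    return _resolved_project_dir(repo_path, state_root) / "pending-transcripts"


def resolve_processed_transcripts_dir(repo_path: Path, state_root: Path) -> Path:
    return _resolved_project_dir(repo_path, state_root) / "processed-transcripts"
```

Keeping the guard inside the helper means a new resolver cannot accidentally skip it; adding a path is a single extra function.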

…ase 2B-i) (#368)

Phase 2B (call-site delegation), part i: the three load-bearing state-path call sites stop computing paths inline and delegate to the Ledger Locator. Phase 2B-ii (events/materializer, transcript_queue) and Phase 2B-iii (setup_wizard env writes) follow in subsequent commits.

Delegations:
- `ledger/adapter.py::_default_db_url` now returns `ledger_locator.resolve_ledger_url()`. The hard-coded `~/.bicameral/ledger.db` literal is gone; the SURREAL_URL env override flows through the locator's single resolution path. Implements decision:ko8efq3z1zwhbof7kecq + decision:c2eqcwimhe4lpaexrddw at the adapter boundary.
- `code_locator_runtime.ensure_runtime_env` calls `resolve_code_graph_path()` instead of computing `<repo>/.bicameral/code-graph.db` from the REPO_PATH env. The vestigial `_default_cache_root` helper is deleted. Outside a git repo the locator's `ProjectIdResolutionError` is caught and the env is left unset; the None-safe `code_locator.config.resolve_paths()` then handles direct-construction fallback (or raises, see below).
- `code_locator_runtime.rebuild_index` line 277: `bm25_path` now comes from `resolve_bm25_index_path()`. Removes the implicit "bm25 lives next to sqlite_db" coupling: both paths come from the locator's project dir independently, matching the R3 plan's "one bag of state" intent.
- `code_locator/config.py`: `sqlite_db` default is `None` (was the literal `~/.bicameral/code-graph.db`). `resolve_paths()` is None-safe and defers to `resolve_code_graph_path()` when set to None. Outside a git repo with no `CODE_LOCATOR_SQLITE_DB` override, the locator's `ProjectIdResolutionError` propagates verbatim ("bicameral currently supports git only"). Per decision:c2eqcwimhe4lpaexrddw, behavior is undefined in unsupported environments; naming the problem is better than writing to a hardcoded fallback that drifts from the canonical layout (and isn't Windows-friendly to begin with).

Tests (11 new + 17 retained = 28 passing):
- `test_ledger_adapter_uses_locator.py`: 3 tests cover the adapter default, SURREAL_URL override, and the regression guard against the legacy un-project-scoped path.
- `test_code_locator_runtime_uses_locator.py`: 3 tests cover the env pre-population, the setdefault preservation, and the silent outside-git fallback.
- `test_code_locator_config_none_safe.py`: 5 tests cover load_config end-to-end, direct-construction None-safety, env-var precedence, the outside-git error propagation, and the env-override escape hatch for test fixtures.

Section 4 razor (advisory from R4 audit): the resolve-repo + assert-origin pattern was duplicated in 3 public resolvers; adding 4 more (Phase 2A) would push it to 7. Extracted `_resolved_project_dir` helper to centralize the pipeline. Net effect is line-neutral on the existing resolvers and removes ~12 lines of duplication.

Outside scope: the broader pytest suite has 41 pre-existing failures unrelated to this commit (verified: `test_v0417_jargon_hygiene` fails on the clean dev tree too). They will be addressed separately.

ruff: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
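
The None-safe `resolve_paths()` behavior described for `code_locator/config.py` could be sketched as below. The `CodeLocatorConfig` dataclass, the injected `locate` callable, and the exact precedence (env override first, then an explicit value, then the locator) are assumptions for illustration; the commit message only states that `sqlite_db` defaults to `None`, that `CODE_LOCATOR_SQLITE_DB` acts as an override, and that the locator's error propagates outside git.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class CodeLocatorConfig:
    # None means "ask the locator"; there is no hard-coded fallback path.
    sqlite_db: Optional[str] = None


def resolve_paths(
    config: CodeLocatorConfig, locate: Callable[[], Path]
) -> CodeLocatorConfig:
    """Fill in sqlite_db from the env override or, failing that, the locator."""
    override = os.environ.get("CODE_LOCATOR_SQLITE_DB")
    if override:
        # Assumed precedence: the env var wins (escape hatch for test fixtures).
        config.sqlite_db = override
    elif config.sqlite_db is None:
        # Outside a git repo the locator raises; the error is not swallowed here.
        config.sqlite_db = str(locate())
    return config
```

In the real module `locate` would presumably be `resolve_code_graph_path`; passing it in as a parameter here just keeps the sketch free of a git dependency.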

…e 2B-ii) (#368)

Phase 2B (call-site delegation), part ii: events/materializer.py and events/transcript_queue.py stop computing their state paths inline and delegate to the Ledger Locator. The watermark and pending/processed transcript queues now live under the locator's project dir, shared across worktrees of one repo, ending the v0.15.x failure modes where peer JSONL events re-replayed per-worktree and SessionEnd-hook transcripts written from worktree A were invisible to worktree B's drain loop.

Delegations:
- `events/materializer.EventMaterializer`: `local_dir` parameter dropped from the positional contract; replaced by an optional `repo_path` (locator-scoping) and a keyword-only `watermark_override` (tests-only escape hatch). Production callers pass `repo_path` or nothing; `watermark_override` exists so test fixtures don't have to git-init tmp_path-derived dirs. Internal mkdir on the watermark parent is preserved as belt-and-braces.
- `events/transcript_queue._pending_root` / `_processed_root`: signature unchanged (still take `repo_path: str`); body delegates to `resolve_pending_transcripts_dir` / `resolve_processed_transcripts_dir`. Every caller through this module transparently moves to the project-scoped layout; no caller changes needed for downstream code.
- `adapters/ledger.py:117` (team-mode wiring): pass `repo_path` to `EventMaterializer`; drop the local watermark-dir computation.
- `handlers/reset.py::_replay_events_into_ledger`: use `resolve_watermark_path()` for the "{}" reset write; drop the local-dir mkdir / inline watermark path.

Tests (51 passing: 28 prior + 13 transcript-queue regression):
- `tests/test_session_end_queue_writer.py`: `_make_repo` now git-inits the tmp_path fixture; assertions on pending/processed paths use `_pending_root` / `_processed_root` instead of literal `<repo>/.bicameral/...` paths. The writer subprocess (cwd=repo) and CLI archiver (cwd=repo) both transparently use the locator now.
- `tests/test_team_event_replay.py`, `tests/test_team_adapter_with_backend.py`, `tests/test_team_round_trip_local_folder.py`, `tests/_replay_helpers.py`: pass `watermark_override=local_dir/"watermark"` to `EventMaterializer` so fixtures keep their per-test watermark location without git-init'ing every tmp_path.

ruff: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…, #368)

Implements the operator-facing pieces of the R4 ledger-locator amendment:

- `_build_config` no longer writes SURREAL_URL / CODE_LOCATOR_SQLITE_DB to .mcp.json; the locator picks both up at runtime (decision:5nr66wvmapjpt58rrji8).
- `context._CONFIG_KEY_ROUTING` is the single source of truth partitioning team-identity keys (mode, team.backend/folder/remote_root, ingest_max_bytes) from per-operator keys (telemetry, channel, guided, signer_email_fallback, render_source_attribution, team.role, ingest_rate_limit_*, query_timeout_*). All 10 context.py readers route per-key via `_config_path_for_key`; falls back to `<repo>/.bicameral/config.yaml` when the locator can't resolve a project id (non-git tmpdir → preserves v0.15.x behavior for legacy tests).
- `_write_collaboration_config` writes BOTH files atomically (operator first → config second, both via temp+rename). On rename failure of the second file, the just-renamed operator file is rolled back and both temps unlinked, so neither destination ends up half-written. Accepts a test-only `operator_path` override; falls back to single-file legacy layout when the locator can't resolve.
- `run_setup` detects committed team/solo config via `git show HEAD:.bicameral/config.yaml` and auto-joins; falls through to the prompt flow on no-commit / non-team / parse failure (decision:ew9rgegdlblexsraesss). Divergence guard: when HEAD lacks the file but the default branch (via `_resolve_authoritative_ref`) has it, prompt the operator to merge-first before persisting a fresh setup (decision:ogdfx014sqgc6fi6ky1a).
- `run_config_wizard` reads from both files via the locator, tags each prompt with `[team]` or `[your machine]`, writes back via the same atomic two-file split as `_write_collaboration_config`.

Tests (14 new, 0 broken):
- test_setup_wizard_omits_state_env_vars.py: 3 tests
- test_config_split.py: 5 tests (incl. rollback-on-failure + routing-table-covers-every-key + reads-route-per-key)
- test_setup_wizard_git_native.py: 6 tests (team/solo auto-join, divergence guard, _read_committed_config unit coverage)
- test_run_config_wizard.py: 2 tests (reads from both, writes to routed)

125 existing tests across setup_wizard + locator + context pass. ruff clean across the diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
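
The two-file write with rollback described for `_write_collaboration_config` follows a common temp+rename pattern. The sketch below shows one way it might work; `write_both`, `_write_temp`, and the injectable `replace` parameter are hypothetical names, and restoring the previous operator file by rewriting its bytes is an assumption about how the rollback is done.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, List


def _write_temp(dest: Path, text: str) -> Path:
    """Write text to a temp file next to dest so the later rename stays on one filesystem."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, name = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".", suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return Path(name)


def write_both(
    operator_path: Path,
    operator_text: str,
    config_path: Path,
    config_text: str,
    replace: Callable = os.replace,  # injectable so a test can simulate failure
) -> None:
    """Operator file first, config second; undo the first if the second fails."""
    previous = operator_path.read_bytes() if operator_path.exists() else None
    temps: List[Path] = []
    operator_renamed = False
    try:
        operator_tmp = _write_temp(operator_path, operator_text)
        temps.append(operator_tmp)
        config_tmp = _write_temp(config_path, config_text)
        temps.append(config_tmp)
        replace(operator_tmp, operator_path)
        operator_renamed = True
        replace(config_tmp, config_path)
    except OSError:
        if operator_renamed:
            # Roll back the operator file that was already renamed into place.
            if previous is None:
                operator_path.unlink(missing_ok=True)
            else:
                operator_path.write_bytes(previous)
        for tmp in temps:
            tmp.unlink(missing_ok=True)
        raise
```

Each individual rename is atomic on POSIX filesystems, but the pair is not; the rollback step is what keeps the two destinations consistent with each other when the second rename fails.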

Closes out #368: the state move actually moves, and orphans get cleanable.

`cli/migrate_state.py` (Phase 3)
- Moves project-scoped state from `<repo>/.bicameral/` into the locator-resolved project dir at `~/.bicameral/projects/<id>/`: ledger.db, code-graph.db (+ shm/wal), bm25_index.pkl, watermark, pending-transcripts/, processed-transcripts/.
- Also picks up the v0.15.x user-global `~/.bicameral/ledger.db` on the first project that runs migrate-state, then leaves subsequent projects alone.
- R4: partitions a pre-split `<repo>/.bicameral/config.yaml` per the `context._CONFIG_KEY_ROUTING` table: team-identity keys stay in the committed file, per-operator keys move to operator.yaml under the project dir. Merges with any pre-existing operator.yaml (existing values win). Unknown keys stay in config.yaml with a warning.
- Idempotent (`Nothing to migrate.` exit 0 on second run), archives on byte-different destination collisions to `~/.bicameral/archive/<project-id>/<name>.<iso8601>.bak` (default; CLI override via `--archive-dir`), de-dupes on byte-identical collisions, cleans empty source dirs after success.
- `--dry-run` enumerates the plan without writing. `--auto` skips the pre-execute confirm prompt.
- R4 deferred-ephemeral notice printed in every success summary (decision:e3xz4c4ji4x7lm3lvq4k).
- Wired into server.py as `migrate-state` + `migrate-ledger` (alias per the issue verbiage).

`cli/gc.py` (Phase 4)
- Scans `~/.bicameral/projects/<id>/origin.txt` for each project dir, classifying as live / orphan / unreadable.
- Default: list. `--delete` prompts per orphan / unreadable dir; `--yes` skips the prompt. Empty `origin.txt` and missing `origin.txt` both classify as `unreadable` so the operator can choose to reclaim.
- Wired into server.py as the `gc` subparser.

`skills/bicameral-update/SKILL.md`
- New Step 3.5: after `bicameral.update(action="apply", ...)`, run `bicameral-mcp migrate-state --auto` before reporting "update complete." Surface stderr verbatim and abort the flow on non-zero exit, with a `--dry-run` offer as the fallback.

Tests (17 new, 0 broken):
- test_migrate_state.py: 12 tests: full-layout move, idempotent re-run, collision archives, dry-run, missing-source no-op, partial state, auto-flag skips prompts, default archive dir under home, bm25+watermark+transcript-queue explicit coverage, legacy user-global ledger first-project + already-claimed, R4 config.yaml partition
- test_gc.py: 5 tests: list-only spares live, delete with per-item prompts, --yes skips prompts, empty origin.txt → unreadable, empty state root

145 tests across Phase 1–4 pass. ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
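
The config.yaml partition step of `migrate-state` can be sketched with a small routing table. The table below is an illustrative subset built from keys named in this PR, using flattened dotted keys for brevity; the real `_CONFIG_KEY_ROUTING` structure, the `partition_config` name, and the warning text are assumptions.

```python
from typing import Any, Dict, List, Tuple

TEAM, OPERATOR = "team", "operator"

# Illustrative subset only; the real routing table lives in context.py.
ROUTING: Dict[str, str] = {
    "mode": TEAM,
    "team.backend": TEAM,
    "ingest_max_bytes": TEAM,
    "telemetry": OPERATOR,
    "channel": OPERATOR,
    "guided": OPERATOR,
    "team.role": OPERATOR,
}


def partition_config(
    legacy: Dict[str, Any], existing_operator: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any], List[str]]:
    """Split a pre-split config into (team, operator, warnings)."""
    team: Dict[str, Any] = {}
    operator: Dict[str, Any] = {}
    warnings: List[str] = []
    for key, value in legacy.items():
        route = ROUTING.get(key)
        if route == OPERATOR:
            operator[key] = value
        else:
            if route is None:
                # Unknown keys stay in the committed file, with a warning.
                warnings.append(f"unknown key {key!r} left in config.yaml")
            team[key] = value
    # Values already present in operator.yaml win over migrated ones.
    operator.update(existing_operator)
    return team, operator, warnings
```

For example, a legacy file with `mode`, `telemetry`, and an unrecognized key would keep `mode` and the unrecognized key in config.yaml, move `telemetry` to operator.yaml, and emit one warning.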

coderabbitai · 2026-05-19T02:58:01Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3acae8e9-8350-4ffe-b45a-ddd850a5a933

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


CI `ruff + mypy` failed `ruff format --check` on 10 files in the diff. Running `ruff format` and re-staging — pure whitespace / line-wrap reflow, no semantics changed; 25/25 tests across gc + migrate_state + run_config_wizard + setup_wizard_git_native still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jinhongkuan and others added 7 commits May 18, 2026 17:03

jinhongkuan added the flow:feature Standard feature/fix PR targeting BicameralAI/dev (the default flow) label May 19, 2026

jinhongkuan temporarily deployed to ci-test May 19, 2026 02:57 — with GitHub Actions Inactive

jinhongkuan requested a deployment to recording-approval May 19, 2026 02:57 — with GitHub Actions Waiting

jinhongkuan temporarily deployed to production May 19, 2026 02:57 — with GitHub Actions Inactive

jinhongkuan temporarily deployed to production May 19, 2026 03:16 — with GitHub Actions Inactive

jinhongkuan requested a deployment to recording-approval May 19, 2026 03:16 — with GitHub Actions Waiting

jinhongkuan temporarily deployed to ci-test May 19, 2026 03:16 — with GitHub Actions Inactive

jinhongkuan merged commit 2569f30 into dev May 19, 2026
10 of 11 checks passed

jinhongkuan mentioned this pull request May 19, 2026

fix(diagnose,reset): unbreak in-agent recovery path (#410) #412

Merged

4 tasks

jinhongkuan merged 8 commits into dev from feat/368-ledger-locator-r4-phase1


1 participant
