feat(ledger_locator): move state to ~/.bicameral/projects/<id>/ (#368) #408
Merged
R4 retracts R3's `resolve_config_path()` (filesystem-topology inference
for primary-worktree convergence) and replaces it with five amendments:
1. DELETE `resolve_config_path()` from the locator. Runtime readers in
`context.py` revert to direct `<repo>/.bicameral/config.yaml` access;
wizard uses `git show HEAD:.bicameral/config.yaml` for onboarding
detection. No filesystem-topology inference: the concept ports cleanly
across all 9 deployment shapes catalogued in the Topology Problem
Notion page (worktree, submodule, bare-repo, sparse checkout,
devcontainer, Codespaces, CI, --separate-git-dir, non-git VCS).
2. CONFIG SPLIT: team-identity keys stay at
`<repo>/.bicameral/config.yaml` (git-committed); per-operator keys move
to `~/.bicameral/projects/<id>/operator.yaml` (per-machine). Routing
table `context._CONFIG_KEY_ROUTING` is the single source of truth.
3. EXPLICIT VCS CONTRACT: structured `ProjectIdResolutionError` from
`ledger_locator/_project_id.py::common_dir_for` when `git rev-parse`
fails, with verbatim "bicameral currently supports git only".
4. REUSE `_resolve_authoritative_ref()` for divergence guard.
5. DEFER full ephemeral-environment support; R4 adds only a one-line
notice in `migrate-state` post-flight summary.
Backed by 9 ratified bicameral decisions + 14 explicitly-rejected
alternatives (see bicameral ledger 2026-05-18 session).
Audit chain:
- R4 (2026-05-18) VETO: V1 `_edit_config_interactive` wrong function
name; V2 stale context.py reader lines + bogus 412-429 entry; V3
missing tests for solo-mode short-circuit + `run_config_wizard` editor.
Gate: .qor/gates/2026-05-18T2334-r4audit/audit.json (local).
- R4-bis (2026-05-18) PASS after plan-text fixes.
Gate: .qor/gates/2026-05-18T2338-r4bis/audit.json (local).
META_LEDGER entries #51 (VETO) and #52 (PASS) recorded.
Phase 1 implementation lands in the following commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…368)
Phase 1 of the Ledger Locator plan (R4-bis PASS).
Adds:
- `resolve_operator_config_path()` returns
`<STATE_ROOT>/<project-id>/operator.yaml` for the R4 config split
(decision:5nr66wvmapjpt58rrji8). Per-operator keys (telemetry, channel,
guided, signer_email_fallback, render_source_attribution, rate-limit
knobs, query timeouts) will land here in Phase 2; the locator anchor
lands first so consumers have a stable resolution function to import.
- Explicit VCS contract in `_project_id.py::common_dir_for`
(decision:6c20xahdyxk3suzav4pj). When `git rev-parse --git-common-dir`
fails, the raised `ProjectIdResolutionError` now names the assumption
verbatim: "bicameral currently supports git only; non-git VCSes are not
yet implemented." Forces future ports to jj/sapling/fossil to be a
deliberate locator amendment rather than an accidental success on a
misclassified VCS.
Phase 1 does NOT include:
- `resolve_config_path()`: explicitly rejected per R4
(decision:ew9rgegdlblexsraesss; see superseded R3 design at
decision:6z39wrjpmmg9vhm8i6t4). Config readers will read
`<repo>/.bicameral/config.yaml` directly in Phase 2.
- Phase 2 call-site delegation (context.py per-key routing,
setup_wizard.py config split, git-show-HEAD onboarding detection).
- Phase 3 migrate-state CLI / Phase 4 gc CLI.
Tests (14 passing):
- 8 existing in `test_ledger_locator.py` + 3 in
`test_ledger_locator_origin_guard.py` preserved.
- 2 new operator-config tests: path under project dir + stability across
`git worktree add` checkouts.
- 3 new VCS-contract tests in `test_ledger_locator_vcs_contract.py`:
verifies the verbatim error message surfaces from `resolve_ledger_url`,
`common_dir_for`, and `project_id_for`.
ruff: clean. Plan amendment + R4-bis audit gate in the preceding commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
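The VCS contract and project-id derivation described in this commit can be sketched as follows. This is a minimal illustration, not the code in `_project_id.py`: the function names and the verbatim message come from the commit text, while the exact bytes that get hashed (here, the resolved absolute common-dir path), the error's base class, and the handling of a missing `git` binary are assumptions.

```python
import hashlib
import subprocess
from pathlib import Path

# Verbatim wording quoted in the commit message.
GIT_ONLY_MESSAGE = (
    "bicameral currently supports git only; "
    "non-git VCSes are not yet implemented."
)


class ProjectIdResolutionError(RuntimeError):
    """Raised when no git common dir can be resolved (base class is assumed)."""


def common_dir_for(repo_path: str) -> Path:
    """Return the absolute git common dir for repo_path, or raise."""
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "--git-common-dir"],
            cwd=repo_path,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        # Covers: not a git repo, git not installed, repo_path missing.
        raise ProjectIdResolutionError(
            f"{GIT_ONLY_MESSAGE} (path: {repo_path})"
        ) from exc
    common = Path(proc.stdout.strip())
    if not common.is_absolute():
        # git may print a path relative to the working directory.
        common = Path(repo_path) / common
    return common.resolve()


def project_id_for(repo_path: str) -> str:
    """First 16 hex chars of sha256 over the common dir (hash input assumed)."""
    digest = hashlib.sha256(str(common_dir_for(repo_path)).encode("utf-8"))
    return digest.hexdigest()[:16]
```

Because every linked worktree of one clone reports the same common dir, this scheme would give them the same id, which is the property the worktree-stability tests check.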
…rk, transcripts) (#368)
Phase 2A of the Ledger Locator plan (R4-bis PASS). Adds four R3 locator
functions for project-scoped derived state, sibling to ledger.db and
code-graph.db under `~/.bicameral/projects/<id>/`. These enable Phase 2B
to delegate the corresponding call sites in events/, code_locator/, and
scripts/hooks/ to the locator.
Added:
- `resolve_bm25_index_path()`: derived from code-graph; sibling to it.
- `resolve_watermark_path()`: replaces `events/materializer.py`'s
`local_dir/"watermark"`. Fixes per-worktree re-replay of peer events.
- `resolve_pending_transcripts_dir()`: replaces
`events/transcript_queue.py:_pending_root`. Fixes per-worktree
invisibility of SessionEnd-hook transcripts.
- `resolve_processed_transcripts_dir()`: sibling to pending.
Refactor:
- Extracted `_resolved_project_dir(repo_path)` private helper. The
resolve-repo + assert-origin pattern was duplicated across 3 public
resolvers; adding 4 more would push it to 7. Helper centralizes the
pipeline (repo → project-id → origin-guard) so each public resolver is
now a one-line return. Net diff is line-neutral for existing resolvers;
new ones get 2 lines each instead of 5.
Tests (17 passing, +3 from Phase 1):
- `test_resolves_derived_state_paths_under_project_dir`: all four new
paths share the same project dir as code-graph.db.
- `test_derived_state_paths_stable_across_worktrees`: paths are identical
across `git worktree add` checkouts (the whole point of project-scoping
derived state).
- `test_derived_state_paths_have_no_env_override`: unrelated env
overrides (SURREAL_URL, CODE_LOCATOR_SQLITE_DB) do not leak into these
paths; they always resolve to the project dir.
ruff: clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
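The helper extraction described above might look roughly like this. It is a sketch under stated assumptions: the resolver names and file names come from this PR, but the explicit `repo_path` argument on the public resolvers, the stand-in `project_id_for`, the `OriginMismatchError` class, and the read-only `origin.txt` comparison are all hypothetical simplifications.

```python
import hashlib
from pathlib import Path

# Layout from the PR description: ~/.bicameral/projects/<id>/
STATE_ROOT = Path.home() / ".bicameral" / "projects"


class OriginMismatchError(RuntimeError):
    """Hypothetical error: the project dir is claimed by another origin."""


def project_id_for(repo_path: str) -> str:
    # Stand-in only. The real locator hashes the git common dir so that all
    # worktrees of one clone share an id; this sketch hashes the string.
    return hashlib.sha256(repo_path.encode("utf-8")).hexdigest()[:16]


def _assert_origin(project_dir: Path, repo_path: str) -> None:
    # Assumed guard: if origin.txt exists and is non-empty, it must match.
    origin_file = project_dir / "origin.txt"
    if origin_file.is_file():
        recorded = origin_file.read_text(encoding="utf-8").strip()
        if recorded and recorded != repo_path:
            raise OriginMismatchError(f"{project_dir} belongs to {recorded}")


def _resolved_project_dir(repo_path: str) -> Path:
    """Shared pipeline: repo -> project id -> origin guard."""
    project_dir = STATE_ROOT / project_id_for(repo_path)
    _assert_origin(project_dir, repo_path)
    return project_dir


# With the pipeline centralized, each public resolver is a one-line return.
def resolve_code_graph_path(repo_path: str) -> Path:
    return _resolved_project_dir(repo_path) / "code-graph.db"


def resolve_bm25_index_path(repo_path: str) -> Path:
    return _resolved_project_dir(repo_path) / "bm25_index.pkl"


def resolve_watermark_path(repo_path: str) -> Path:
    return _resolved_project_dir(repo_path) / "watermark"


def resolve_pending_transcripts_dir(repo_path: str) -> Path:
    return _resolved_project_dir(repo_path) / "pending-transcripts"


def resolve_processed_transcripts_dir(repo_path: str) -> Path:
    return _resolved_project_dir(repo_path) / "processed-transcripts"
```

Keeping the guard inside the helper means a new resolver cannot forget it, which is the duplication the refactor note is about.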
…ase 2B-i) (#368)
Phase 2B (call-site delegation), part i: the three load-bearing
state-path call sites stop computing paths inline and delegate to the
Ledger Locator. Phase 2B-ii (events/materializer, transcript_queue) and
Phase 2B-iii (setup_wizard env writes) follow in subsequent commits.
Delegations:
- `ledger/adapter.py::_default_db_url` now returns
`ledger_locator.resolve_ledger_url()`. The hard-coded
`~/.bicameral/ledger.db` literal is gone; the SURREAL_URL env override
flows through the locator's single resolution path. Implements
decision:ko8efq3z1zwhbof7kecq + decision:c2eqcwimhe4lpaexrddw at the
adapter boundary.
- `code_locator_runtime.ensure_runtime_env` calls
`resolve_code_graph_path()` instead of computing
`<repo>/.bicameral/code-graph.db` from the REPO_PATH env. The
vestigial `_default_cache_root` helper is deleted. Outside a git repo
the locator's `ProjectIdResolutionError` is caught and the env is
left unset — the None-safe `code_locator.config.resolve_paths()` then
handles direct-construction fallback (or raises, see below).
- `code_locator_runtime.rebuild_index` line 277: `bm25_path` now comes
from `resolve_bm25_index_path()`. Removes the implicit "bm25 lives
next to sqlite_db" coupling — both paths come from the locator's
project dir independently, matching the R3 plan's "one bag of state"
intent.
- `code_locator/config.py`: `sqlite_db` default is `None` (was the
literal `~/.bicameral/code-graph.db`). `resolve_paths()` is None-safe
and defers to `resolve_code_graph_path()` when set to None. Outside
a git repo with no `CODE_LOCATOR_SQLITE_DB` override, the locator's
`ProjectIdResolutionError` propagates verbatim ("bicameral currently
supports git only"). Per decision:c2eqcwimhe4lpaexrddw, behavior is
undefined in unsupported environments; naming the problem is better
than writing to a hardcoded fallback that drifts from the canonical
layout (and isn't Windows-friendly to begin with).
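The None-safe default described for `code_locator/config.py` can be illustrated with a small sketch. The names `sqlite_db`, `resolve_paths`, `resolve_code_graph_path`, and `CODE_LOCATOR_SQLITE_DB` come from the commit text; the `CodeLocatorConfig` dataclass, the injected resolver parameter, and the exact precedence order (env override, then explicit value, then locator) are assumptions made so the example is self-contained.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Mapping, Optional


@dataclass
class CodeLocatorConfig:
    # None means "no explicit path; ask the locator at resolve time".
    sqlite_db: Optional[str] = None


def resolve_paths(
    config: CodeLocatorConfig,
    resolve_code_graph_path: Callable[[], Path],
    environ: Mapping[str, str] = os.environ,
) -> Path:
    """Pick the code-graph DB path without any hard-coded fallback."""
    override = environ.get("CODE_LOCATOR_SQLITE_DB")
    if override:
        # Escape hatch, e.g. for test fixtures outside a git repo.
        return Path(override)
    if config.sqlite_db is not None:
        return Path(config.sqlite_db)
    # The locator may raise outside a git repo; the error is deliberately
    # not caught here so the caller sees the "git only" message.
    return resolve_code_graph_path()
```

The point of the design is the last line: with no override and no explicit value, a failure is surfaced rather than papered over with a path that drifts from the canonical layout.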
Tests (11 new + 17 retained = 28 passing):
- `test_ledger_adapter_uses_locator.py` — 3 tests cover the adapter
default, SURREAL_URL override, and the regression guard against the
legacy un-project-scoped path.
- `test_code_locator_runtime_uses_locator.py` — 3 tests cover the env
pre-population, the setdefault preservation, and the silent
outside-git fallback.
- `test_code_locator_config_none_safe.py` — 5 tests cover load_config
end-to-end, direct-construction None-safety, env-var precedence, the
outside-git error propagation, and the env-override escape hatch for
test fixtures.
Section 4 razor (advisory from R4 audit): the resolve-repo + assert-
origin pattern was duplicated in 3 public resolvers; adding 4 more
(Phase 2A) would push it to 7. Extracted `_resolved_project_dir`
helper to centralize the pipeline. Net effect is line-neutral on the
existing resolvers and removes ~12 lines of duplication.
Outside scope: the broader pytest suite has 41 pre-existing failures
unrelated to this commit (verified: `test_v0417_jargon_hygiene` fails
on the clean dev tree too). They will be addressed separately.
ruff: clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e 2B-ii) (#368)
Phase 2B (call-site delegation), part ii: events/materializer.py and
events/transcript_queue.py stop computing their state paths inline and
delegate to the Ledger Locator. The watermark and pending/processed
transcript queues now live under the locator's project dir — shared
across worktrees of one repo, ending the v0.15.x failure modes where
peer JSONL events re-replayed per-worktree and SessionEnd-hook
transcripts written from worktree A were invisible to worktree B's
drain loop.
Delegations:
- `events/materializer.EventMaterializer`: `local_dir` parameter dropped
from the positional contract; replaced by an optional `repo_path`
(locator-scoping) and a keyword-only `watermark_override` (tests-only
escape hatch). Production callers pass `repo_path` or nothing;
`watermark_override` exists so test fixtures don't have to git-init
tmp_path-derived dirs. Internal mkdir on the watermark parent is
preserved as belt-and-braces.
- `events/transcript_queue._pending_root` / `_processed_root`: signature
unchanged (still take `repo_path: str`); body delegates to
`resolve_pending_transcripts_dir` / `resolve_processed_transcripts_dir`.
Every caller through this module transparently moves to the
project-scoped layout — no caller changes needed for downstream code.
- `adapters/ledger.py:117` (team-mode wiring): pass `repo_path` to
`EventMaterializer`; drop the local watermark-dir computation.
- `handlers/reset.py::_replay_events_into_ledger`: use
`resolve_watermark_path()` for the "{}" reset write; drop the
local-dir mkdir / inline watermark path.
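A rough sketch of the constructor contract described for `EventMaterializer`: an optional `repo_path`, plus a keyword-only `watermark_override` for tests. Only those two parameter names and the parent-directory mkdir come from the commit text. The injected `resolve_watermark_path` parameter is a hypothetical stand-in for the locator import, and the real class presumably takes other arguments that are omitted here.

```python
from pathlib import Path
from typing import Callable, Optional


class EventMaterializer:
    """Models only the watermark-path part of the constructor."""

    def __init__(
        self,
        repo_path: Optional[str] = None,
        *,  # everything after this marker must be passed by keyword
        watermark_override: Optional[Path] = None,
        resolve_watermark_path: Optional[Callable[[Optional[str]], Path]] = None,
    ) -> None:
        if watermark_override is not None:
            # Tests-only escape hatch: no git repo required.
            self.watermark_path = Path(watermark_override)
        elif resolve_watermark_path is not None:
            # Production path: project-scoped, shared across worktrees.
            self.watermark_path = resolve_watermark_path(repo_path)
        else:
            raise ValueError("sketch needs a resolver or an override")
        # Belt-and-braces: make sure the parent directory exists.
        self.watermark_path.parent.mkdir(parents=True, exist_ok=True)
```

Making the override keyword-only means an old call that still passes `local_dir` as a second positional argument fails with a `TypeError` instead of silently being treated as an override.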
Tests (51 passing — 28 prior + 13 transcript-queue regression):
- `tests/test_session_end_queue_writer.py`: `_make_repo` now git-inits
the tmp_path fixture; assertions on pending/processed paths use
`_pending_root` / `_processed_root` instead of literal `<repo>/.bicameral/...`
paths. The writer subprocess (cwd=repo) and CLI archiver (cwd=repo)
both transparently use the locator now.
- `tests/test_team_event_replay.py`,
`tests/test_team_adapter_with_backend.py`,
`tests/test_team_round_trip_local_folder.py`,
`tests/_replay_helpers.py`: pass `watermark_override=local_dir/"watermark"`
to `EventMaterializer` so fixtures keep their per-test watermark
location without git-init'ing every tmp_path.
ruff: clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, #368)
Implements the operator-facing pieces of the R4 ledger-locator amendment:
- `_build_config` no longer writes SURREAL_URL / CODE_LOCATOR_SQLITE_DB
to .mcp.json; the locator picks both up at runtime
(decision:5nr66wvmapjpt58rrji8).
- `context._CONFIG_KEY_ROUTING` is the single source of truth
partitioning team-identity keys (mode, team.backend/folder/remote_root,
ingest_max_bytes) from per-operator keys (telemetry, channel, guided,
signer_email_fallback, render_source_attribution, team.role,
ingest_rate_limit_*, query_timeout_*). All 10 context.py readers route
per-key via `_config_path_for_key`; falls back to
`<repo>/.bicameral/config.yaml` when the locator can't resolve a project
id (non-git tmpdir → preserves v0.15.x behavior for legacy tests).
- `_write_collaboration_config` writes BOTH files atomically (operator
first → config second, both via temp+rename). On rename failure of the
second file, the just-renamed operator file is rolled back and both
temps unlinked, so neither destination ends up half-written. Accepts a
test-only `operator_path` override; falls back to single-file legacy
layout when the locator can't resolve.
- `run_setup` detects committed team/solo config via
`git show HEAD:.bicameral/config.yaml` and auto-joins; falls through to
the prompt flow on no-commit / non-team / parse failure
(decision:ew9rgegdlblexsraesss). Divergence guard: when HEAD lacks the
file but the default branch (via `_resolve_authoritative_ref`) has it,
prompt the operator to merge-first before persisting a fresh setup
(decision:ogdfx014sqgc6fi6ky1a).
- `run_config_wizard` reads from both files via the locator, tags each
prompt with `[team]` or `[your machine]`, writes back via the same
atomic two-file split as `_write_collaboration_config`.
Tests (14 new, 0 broken):
- test_setup_wizard_omits_state_env_vars.py: 3 tests
- test_config_split.py: 5 tests (incl. rollback-on-failure +
routing-table-covers-every-key + reads-route-per-key)
- test_setup_wizard_git_native.py: 6 tests (team/solo auto-join,
divergence guard, _read_committed_config unit coverage)
- test_run_config_wizard.py: 2 tests (reads from both, writes to routed)
125 existing tests across setup_wizard + locator + context pass.
ruff clean across the diff.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
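The two-file write with rollback described for `_write_collaboration_config` could be sketched as below. The ordering (operator first, config second, both via temp + rename) and the cleanup on failure follow the commit text; the function name `write_both`, the `.tmp` suffix, the injectable `replace` hook, and the meaning of "rolled back" (restore the previous bytes, or delete if the file did not exist before) are assumptions.

```python
import os
from pathlib import Path
from typing import Callable, List, Optional


def _write_temp(dest: Path, text: str) -> Path:
    """Write text next to dest so the later rename stays on one filesystem."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".tmp")
    tmp.write_text(text, encoding="utf-8")
    return tmp


def write_both(
    operator_path: Path,
    operator_text: str,
    config_path: Path,
    config_text: str,
    *,
    replace: Callable[[Path, Path], None] = os.replace,  # hook for tests
) -> None:
    # Remember the old operator file so a failed second rename can undo it.
    previous: Optional[bytes] = (
        operator_path.read_bytes() if operator_path.exists() else None
    )
    temps: List[Path] = []
    operator_renamed = False
    try:
        op_tmp = _write_temp(operator_path, operator_text)
        temps.append(op_tmp)
        cfg_tmp = _write_temp(config_path, config_text)
        temps.append(cfg_tmp)
        replace(op_tmp, operator_path)  # operator first
        operator_renamed = True
        replace(cfg_tmp, config_path)  # config second
    except OSError:
        if operator_renamed:
            if previous is None:
                operator_path.unlink(missing_ok=True)
            else:
                operator_path.write_bytes(previous)
        for tmp in temps:
            tmp.unlink(missing_ok=True)
        raise
```

This is not a true cross-file transaction, since a crash between the two renames would still leave only the operator file updated; the rollback only covers a rename that fails while the process is alive.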
Closes out #368: the state move actually moves, and orphans get cleanable.
`cli/migrate_state.py` (Phase 3)
- Moves project-scoped state from `<repo>/.bicameral/` into the locator-
resolved project dir at `~/.bicameral/projects/<id>/`:
ledger.db, code-graph.db (+ shm/wal), bm25_index.pkl, watermark,
pending-transcripts/, processed-transcripts/.
- Also picks up the v0.15.x user-global `~/.bicameral/ledger.db` on the
first project that runs migrate-state, then leaves subsequent projects
alone.
- R4: partitions a pre-split `<repo>/.bicameral/config.yaml` per the
`context._CONFIG_KEY_ROUTING` table — team-identity keys stay in the
committed file, per-operator keys move to operator.yaml under the
project dir. Merges with any pre-existing operator.yaml (existing
values win). Unknown keys stay in config.yaml with a warning.
- Idempotent (`Nothing to migrate.` exit 0 on second run), archives on
byte-different destination collisions to
`~/.bicameral/archive/<project-id>/<name>.<iso8601>.bak` (default; CLI
override via `--archive-dir`), de-dupes on byte-identical collisions,
cleans empty source dirs after success.
- `--dry-run` enumerates the plan without writing. `--auto` skips the
pre-execute confirm prompt.
- R4 deferred-ephemeral notice printed in every success summary
(decision:e3xz4c4ji4x7lm3lvq4k).
- Wired into server.py as `migrate-state` + `migrate-ledger` (alias per
the issue verbiage).
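The R4 config partition that `migrate-state` performs can be sketched as a pure function over dictionaries. The key names and the three rules (per-operator keys move, existing operator values win, unknown keys stay in config.yaml with a warning) come from this PR. The shape of the routing table, the prefix matching for `ingest_rate_limit_*` / `query_timeout_*`, the dotted-key flattening, and the function names are assumptions; the real `context._CONFIG_KEY_ROUTING` may be structured differently.

```python
from typing import Any, Dict, List, Optional, Tuple

TEAM, OPERATOR = "team", "operator"

# Hypothetical stand-in for context._CONFIG_KEY_ROUTING.
ROUTING: Dict[str, str] = {
    "mode": TEAM,
    "team.backend": TEAM,
    "team.folder": TEAM,
    "team.remote_root": TEAM,
    "ingest_max_bytes": TEAM,
    "telemetry": OPERATOR,
    "channel": OPERATOR,
    "guided": OPERATOR,
    "signer_email_fallback": OPERATOR,
    "render_source_attribution": OPERATOR,
    "team.role": OPERATOR,
}
OPERATOR_PREFIXES = ("ingest_rate_limit_", "query_timeout_")


def route(key: str) -> Optional[str]:
    """Return TEAM, OPERATOR, or None for a key the table does not know."""
    if key in ROUTING:
        return ROUTING[key]
    if key.startswith(OPERATOR_PREFIXES):
        return OPERATOR
    return None


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn {'team': {'role': 'x'}} into {'team.role': 'x'}."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, dotted + "."))
        else:
            flat[dotted] = value
    return flat


def partition_config(
    config: Dict[str, Any], existing_operator: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any], List[str]]:
    """Split a pre-split config into (team, operator, unknown-key names)."""
    team: Dict[str, Any] = {}
    moved: Dict[str, Any] = {}
    unknown: List[str] = []
    for key, value in flatten(config).items():
        destination = route(key)
        if destination == OPERATOR:
            moved[key] = value
        else:
            team[key] = value  # unknown keys stay in the committed file
            if destination is None:
                unknown.append(key)  # caller would warn about these
    # Existing operator.yaml values win over values being migrated.
    operator = {**moved, **existing_operator}
    return team, operator, unknown
```

Returning flat dotted keys keeps the sketch short; a real implementation would presumably re-nest them before writing YAML.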
`cli/gc.py` (Phase 4)
- Scans `~/.bicameral/projects/<id>/origin.txt` for each project dir,
classifying as live / orphan / unreadable.
- Default: list. `--delete` prompts per orphan / unreadable dir;
`--yes` skips the prompt. Empty `origin.txt` and missing `origin.txt`
both classify as `unreadable` so the operator can choose to reclaim.
- Wired into server.py as the `gc` subparser.
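The live / orphan / unreadable classification used by `gc` might be sketched like this. The three labels, the `origin.txt` file, and the rule that an empty or missing `origin.txt` counts as `unreadable` come from the commit text. What `origin.txt` contains and what makes a project "live" are not stated in this PR; the sketch assumes it stores a filesystem path and that the project is live while that path still exists.

```python
from pathlib import Path
from typing import Dict


def classify(project_dir: Path) -> str:
    """Classify one project dir as 'live', 'orphan', or 'unreadable'."""
    origin_file = project_dir / "origin.txt"
    try:
        origin = origin_file.read_text(encoding="utf-8").strip()
    except (OSError, UnicodeDecodeError):
        # Missing file, permission problem, or undecodable content.
        return "unreadable"
    if not origin:
        # Empty origin.txt is treated the same as a missing one.
        return "unreadable"
    # Assumption: origin.txt holds a path; a vanished path means orphan.
    return "live" if Path(origin).exists() else "orphan"


def scan(state_root: Path) -> Dict[str, str]:
    """Map each project id under state_root to its classification."""
    return {
        entry.name: classify(entry)
        for entry in sorted(state_root.iterdir())
        if entry.is_dir()
    }
```

Listing is side-effect free in this sketch, which matches the default (non-`--delete`) behavior described above.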
`skills/bicameral-update/SKILL.md`
- New Step 3.5: after `bicameral.update(action="apply", ...)`, run
`bicameral-mcp migrate-state --auto` before reporting "update
complete." Surface stderr verbatim and abort the flow on non-zero
exit, with a `--dry-run` offer as the fallback.
Tests (17 new, 0 broken):
- test_migrate_state.py — 12 tests: full-layout move, idempotent re-run,
collision archives, dry-run, missing-source no-op, partial state,
auto-flag skips prompts, default archive dir under home,
bm25+watermark+transcript-queue explicit coverage, legacy user-global
ledger first-project + already-claimed, R4 config.yaml partition
- test_gc.py — 5 tests: list-only spares live, delete with per-item
prompts, --yes skips prompts, empty origin.txt → unreadable, empty
state root
145 tests across Phase 1–4 pass. ruff clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI `ruff + mypy` failed `ruff format --check` on 10 files in the diff. Running `ruff format` and re-staging — pure whitespace / line-wrap reflow, no semantics changed; 25/25 tests across gc + migrate_state + run_config_wizard + setup_wizard_git_native still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `ledger_locator/` module: deterministic resolver for ledger / code-graph / bm25 / watermark / transcript-queue / operator-config paths, keyed off `sha256(git rev-parse --git-common-dir)[:16]` so worktrees of the same clone share one state bag at `~/.bicameral/projects/<id>/`.
- `<repo>/.bicameral/config.yaml` per R4: team-identity keys stay committed, per-operator keys (`telemetry`, `channel`, `guided`, signer/render attribution, query timeouts, rate limits, `team.role`) move to `~/.bicameral/projects/<id>/operator.yaml`. `context.py` readers route per-key via `_CONFIG_KEY_ROUTING`.
- `bicameral-mcp migrate-state` (one-shot, idempotent, archive-on-collision, R4 config-yaml partition) + `bicameral-mcp gc` (orphan project-dir reclaim).
- `bicameral-update` skill now runs `migrate-state --auto` post-upgrade.
Linked issues
Closes BicameralAI/bicameral-daemon#2
Linked decisions
- Closes decision:ko8efq3z1zwhbof7kecq - Name "Ledger Locator"
- Closes decision:c2eqcwimhe4lpaexrddw - Supported environments scope lock
- Closes decision:fi1def9bci6s6fcflc2p - Branch isolation stays logical (locator carries the rationale forward)
- Closes decision:rfbnlw7ghe175iu42u6b - Project identity via git common-dir hash
- Closes decision:5nr66wvmapjpt58rrji8 - R4 config split (team-identity vs per-operator)
- Closes decision:ew9rgegdlblexsraesss - Delete `resolve_config_path()`; wizard onboarding via `git show HEAD:.bicameral/config.yaml`
- Closes decision:6c20xahdyxk3suzav4pj - Explicit VCS contract (`ProjectIdResolutionError` names "git only" assumption)
- Closes decision:ogdfx014sqgc6fi6ky1a - Reuse `_resolve_authoritative_ref()` for the divergence guard
- Refs decision:e3xz4c4ji4x7lm3lvq4k - Defer ephemeral environments (one-line notice in migrate-state success summary; full support deferred to v0.16.1/v0.17)
Plan / Audit / Seal
- `thoughts/shared/plans/2026-05-16-ledger-locator-and-migration.md` (R4-bis)
- `dev`
Test plan
- `pytest tests/test_ledger_locator.py tests/test_ledger_locator_origin_guard.py tests/test_ledger_locator_vcs_contract.py -q` - Phase 1 (locator + origin guard + VCS contract): 19/19
- `pytest tests/test_ledger_adapter_uses_locator.py tests/test_code_locator_runtime_uses_locator.py tests/test_code_locator_config_none_safe.py -q` - Phase 2A/2B delegation: 9/9
- `pytest tests/test_setup_wizard_omits_state_env_vars.py tests/test_config_split.py tests/test_setup_wizard_git_native.py tests/test_run_config_wizard.py -q` - Phase 2C wizard split + git-native onboarding + two-pane editor: 16/16
- `pytest tests/test_migrate_state.py -q` - Phase 3 migration CLI (12 tests incl. byte-equality, idempotency, archive-on-collision, dry-run, partial state, --auto, default archive dir, bm25/watermark/transcript queues, legacy user-global ledger, R4 config-yaml partition): 12/12
- `pytest tests/test_gc.py -q` - Phase 4 orphan reclaim CLI: 5/5
- `pytest tests/test_setup_wizard*.py tests/test_v0410_guided_mode.py tests/test_signer_email_fallback.py tests/test_context_ingest_rate_limit.py tests/test_preflight_attribution_redaction.py tests/test_preflight_render_source_attribution.py -q` - regression on every existing test that touches a routed key: 145/145
- `ruff check ledger_locator cli/migrate_state.py cli/gc.py setup_wizard.py context.py server.py tests/test_*.py` - clean
🤖 Generated with Claude Code