Merged
63 changes: 63 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,69 @@
All notable changes to bicameral-mcp are tracked here. Format loosely follows
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## 0.4.11 — 2026-04-14 — Latent Drift Fix (Range-Diff Sweep + Distinct Counters)

Fixes a class of "invisible drift" where decisions silently went stale
because `link_commit` only swept files in HEAD's own diff. After a
gap of N commits without a bicameral invocation, drift introduced by
intermediate commits stayed hidden until someone happened to re-edit
the same files. Now `link_commit` sweeps every file touched between
the last sync cursor and HEAD, so dark-period drift surfaces on the
next call.
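
As a rough illustration of why a head-only sweep misses dark-period drift, the toy model below treats each commit as a full snapshot and compares the two enumeration strategies. The file names, the `history` list, and the `diff_names` helper are hypothetical and stand in for `git show --name-only` and `git diff --name-only`; this is a sketch, not the project's code.

```python
# Toy model (hypothetical): each commit is a full snapshot {path: content}.
Snapshot = dict[str, str]


def diff_names(base: Snapshot, head: Snapshot) -> list[str]:
    """Paths added, removed, or modified between two snapshots."""
    paths = base.keys() | head.keys()
    return sorted(p for p in paths if base.get(p) != head.get(p))


history = [
    {"calendar.py": "v1", "billing.py": "v1"},  # commit N (last synced)
    {"calendar.py": "v2", "billing.py": "v1"},  # N+1: drift lands here, no tool run
    {"calendar.py": "v2", "billing.py": "v2"},  # N+2: HEAD, touches another file
]

# Head-only: compare HEAD with its parent, like `git show <head> --name-only`.
head_only = diff_names(history[-2], history[-1])

# Range: compare the sync cursor with HEAD, like `git diff --name-only N..HEAD`.
range_diff = diff_names(history[0], history[-1])

print(head_only)   # calendar.py is missing from this list
print(range_diff)  # calendar.py shows up here
```

In this model the head-only list omits `calendar.py`, so a decision grounded in that file would never be re-checked until someone edits it again.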

### Fixed

- **Latent drift via head-only sweep**. `ingest_commit` previously
enumerated changed files via `git show <head> --name-only`, which
only sees the head commit's own diff. Drift introduced by commits
N+1..N+4 stayed invisible if no bicameral tool ran during that
window and the next run targeted commit N+5, whose own diff
didn't re-touch the drifted files. Fix: when the sync cursor lags
HEAD, run `git diff --name-only last_synced..HEAD` and sweep every
file in the range. New `sweep_scope` field on `LinkCommitResponse`
reports `head_only` (first sync, or fallback) vs `range_diff`
(default after first sync) vs `range_truncated` (range exceeded
cap; sweep was partial). Range cap defaults to 200 files; remainder
catches up on next sync.
- **Inflated drift counters via per-(region, intent) counting**.
`decisions_drifted` and `decisions_reflected` previously incremented
once per `(region, intent)` pair that flipped — a decision with N
regions all flipping in the same sweep counted as N. Witnessed on
the Accountable demo where one Google Calendar decision flipped 4
regions and the counter reported `decisions_drifted=4` while only
1 distinct intent was actually drifted. Fix: dedupe by `intent_id`
via sets; counters now report the number of distinct decisions
whose status flipped, matching what users mentally expect from
"how many decisions just changed status."
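
The counter change can be illustrated with a small sketch. The `flips` tuples and the `intent_gcal` id are made-up data that mirror the four-region case described above; only the set-based dedupe reflects what the fix does.

```python
# Hypothetical flip records: (region_id, intent_id, old_status, new_status).
flips = [
    (f"region_{i}", "intent_gcal", "reflected", "drifted") for i in range(4)
]

# Old behavior: one increment per (region, intent) pair that flipped.
per_pair_count = sum(
    1 for _, _, old, new in flips if new == "drifted" and old != "drifted"
)

# New behavior: collect intent ids in a set so each decision counts once.
flipped_to_drifted = {
    intent_id
    for _, intent_id, old, new in flips
    if new == "drifted" and old != "drifted"
}
distinct_count = len(flipped_to_drifted)

print(per_pair_count, distinct_count)
```

With four regions attached to a single intent, the per-pair tally is 4 while the distinct tally is 1.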

### Added

- **`LinkCommitResponse.sweep_scope`** —
`Literal["head_only", "range_diff", "range_truncated"]`. Tells the
caller whether this sweep saw HEAD-only files or the full
`last_synced..HEAD` range. A "backlog sweep" after a dark period
reports `range_diff` with a large `range_size`, so a UI can frame
"47 decisions drifted" as "first scan after 6 weeks" instead of
"what the hell happened today."
- **`LinkCommitResponse.range_size`** — number of files swept this
run. Zero for the `no_changes` and `already_synced` fast paths.
- **`get_changed_files_in_range(base_sha, head_sha, repo_path)`** in
`ledger/status.py`. Runs `git diff --name-only base..head`. Returns
`None` (sentinel) when the diff fails (force-push, shallow clone,
unreachable base SHA) so the caller can fall back to head-only
scope without crashing.
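
The interaction between the three scope values and the `None` sentinel can be sketched as a pure function. `choose_sweep` and its parameters are hypothetical names for illustration; the real logic lives inline in `ingest_commit` and calls git directly rather than taking precomputed lists.

```python
from typing import Optional

MAX_SWEEP_FILES = 200  # cap stated in this changelog entry


def choose_sweep(
    last_synced: str,
    head: str,
    range_files: Optional[list[str]],  # None means the range diff failed
    head_files: list[str],
    cap: int = MAX_SWEEP_FILES,
) -> tuple[list[str], str]:
    """Return (files_to_sweep, sweep_scope) for one sync."""
    if not last_synced or last_synced == head:
        # First-ever sync, or cursor already at HEAD.
        return head_files, "head_only"
    if range_files is None:
        # Unreachable base (force-push, shallow clone): fall back.
        return head_files, "head_only"
    if len(range_files) > cap:
        # Partial sweep; the caller is told via the scope value.
        return range_files[:cap], "range_truncated"
    return range_files, "range_diff"


files, scope = choose_sweep("abc123", "def456", ["a.py", "b.py"], ["b.py"])
print(scope, len(files))
```

Keeping `None` distinct from an empty list matters here: an empty list is a successful diff with nothing to sweep, while `None` triggers the head-only fallback.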

### Migration

No schema changes. Existing `LinkCommitResponse` consumers see two
new optional fields with sane defaults (`sweep_scope="head_only"`,
`range_size=0`) — backward compatible. The semantic shift is in the
counter values: deployments that scraped the old per-region counts
will see smaller numbers in `decisions_drifted` /
`decisions_reflected` because the same flip is now counted once per
intent instead of once per region. The new behavior matches what the
field name implies; the old behavior was a bug.
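
A consumer that wants to use the new fields while tolerating older servers might read them with the documented defaults, roughly as below. `describe_sweep` and the message wording are hypothetical; the field names and default values are the ones listed above.

```python
def describe_sweep(response: dict) -> str:
    """Frame a drift count using the sweep provenance fields."""
    # Older payloads omit both fields, so fall back to the documented defaults.
    scope = response.get("sweep_scope", "head_only")
    size = response.get("range_size", 0)
    drifted = response.get("decisions_drifted", 0)

    if scope == "range_truncated":
        return f"{drifted} decisions drifted (partial sweep of {size} files)"
    if scope == "range_diff":
        return f"{drifted} decisions drifted across {size} files since last sync"
    return f"{drifted} decisions drifted in the latest commit"


old_payload = {"decisions_drifted": 1}
new_payload = {"decisions_drifted": 47, "sweep_scope": "range_diff", "range_size": 120}
print(describe_sweep(old_payload))
print(describe_sweep(new_payload))
```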

## 0.4.10 — 2026-04-14 — Guided Mode (Always-On Hints)

Reframes v0.4.9's tester mode. **`action_hints` now fire whenever
27 changes: 24 additions & 3 deletions contracts.py
@@ -74,14 +74,35 @@ class DecisionMatch(BaseModel):


class LinkCommitResponse(BaseModel):
"""Returned by /link_commit and embedded in /search_decisions + /detect_drift."""
"""Returned by /link_commit and embedded in /search_decisions + /detect_drift.

v0.4.11 (latent drift fix):
- ``decisions_reflected`` and ``decisions_drifted`` count **distinct
intent_ids** that flipped this sweep, not (region, intent) pairs.
A decision with 5 regions all flipping to drifted now reports 1,
not 5. Matches user mental model.
- ``sweep_scope`` and ``range_size`` describe what the sweep covered.
``head_only`` is the v0.4.10 behavior — only files in the HEAD
commit's own diff. ``range_diff`` is the v0.4.11 default — files
changed between ``last_synced_commit`` and HEAD. ``range_truncated``
means the range exceeded ``MAX_SWEEP_FILES`` (200) so we capped it
and only swept the first chunk; the remainder will catch up on
next sync.
"""
commit_hash: str
synced: bool # False = new work done; True = fast-path
reason: Literal["new_commit", "already_synced", "no_changes"]
regions_updated: int = 0
decisions_reflected: int = 0 # pending → reflected this run
decisions_drifted: int = 0 # reflected → drifted this run
decisions_reflected: int = 0 # distinct intents that flipped to reflected
decisions_drifted: int = 0 # distinct intents that flipped to drifted
undocumented_symbols: list[str] = []
# v0.4.11: sweep scope provenance.
sweep_scope: Literal[
"head_only", # files in HEAD commit only (first sync, or fallback)
"range_diff", # files in last_synced..HEAD range (default after first sync)
"range_truncated", # range exceeded MAX_SWEEP_FILES; capped
] = "head_only"
range_size: int = 0 # number of files swept in this run


class ActionHint(BaseModel):
2 changes: 2 additions & 0 deletions handlers/link_commit.py
@@ -231,6 +231,8 @@ async def handle_link_commit(ctx, commit_hash: str = "HEAD") -> LinkCommitRespon
decisions_reflected=result.get("decisions_reflected", 0),
decisions_drifted=result.get("decisions_drifted", 0),
undocumented_symbols=result.get("undocumented_symbols", []),
sweep_scope=result.get("sweep_scope", "head_only"),
range_size=result.get("range_size", 0),
)
_store_sync_cache(ctx, commit_hash, response)
return response
87 changes: 78 additions & 9 deletions ledger/adapter.py
@@ -40,7 +40,20 @@
upsert_sync_state,
)
from .schema import init_schema, migrate
from .status import compute_content_hash, derive_status, get_changed_files, resolve_head
from .status import (
compute_content_hash,
derive_status,
get_changed_files,
get_changed_files_in_range,
resolve_head,
)


# v0.4.11: cap for range sweep. If the diff between last_synced and HEAD
# spans more files than this, we sweep the first MAX_SWEEP_FILES and report
# `sweep_scope="range_truncated"` so the caller knows the sweep was partial.
# The next link_commit will pick up the remainder.
_MAX_SWEEP_FILES = 200

logger = logging.getLogger(__name__)

@@ -232,10 +245,55 @@ async def ingest_commit(
"decisions_reflected": 0,
"decisions_drifted": 0,
"undocumented_symbols": [],
"sweep_scope": "head_only",
"range_size": 0,
}

# Get changed files from this commit
changed_files = get_changed_files(commit_hash, repo_path)
# v0.4.11: determine sweep scope. The pre-v0.4.11 behavior was always
# head-only (`git show HEAD --name-only`), which missed every file
# drifted between last_synced and HEAD. Now the default is range_diff
# — sweep every file touched since the cursor — falling back to
# head_only when there's no cursor or the range is unreachable.
last_synced = (state or {}).get("last_synced_commit", "") or ""
sweep_scope: str = "head_only"
changed_files: list[str] = []

if last_synced and last_synced != commit_hash:
range_files = get_changed_files_in_range(
last_synced, commit_hash, repo_path,
)
if range_files is None:
# Range unreachable (force-push, shallow clone, rebase
# discarded the base). Fall back to head-only — partial
# but better than crashing. The next sync after a real
# commit will recover.
logger.warning(
"[link_commit] range %s..%s unreachable, falling "
"back to head-only sweep",
last_synced[:8], commit_hash[:8],
)
changed_files = get_changed_files(commit_hash, repo_path)
sweep_scope = "head_only"
else:
changed_files = range_files
sweep_scope = "range_diff"
if len(changed_files) > _MAX_SWEEP_FILES:
logger.warning(
"[link_commit] range sweep capped at %d files "
"(would have swept %d). Remainder will catch up "
"on next sync.",
_MAX_SWEEP_FILES, len(changed_files),
)
changed_files = changed_files[:_MAX_SWEEP_FILES]
sweep_scope = "range_truncated"
else:
# First-ever sync (no cursor) OR same SHA as cursor.
# Head-only is the right scope here.
changed_files = get_changed_files(commit_hash, repo_path)
sweep_scope = "head_only"

range_size = len(changed_files)

if not changed_files:
# Only advance the sync cursor on authoritative refs — pollution guard
if is_authoritative:
@@ -248,14 +306,20 @@ async def ingest_commit(
"decisions_reflected": 0,
"decisions_drifted": 0,
"undocumented_symbols": [],
"sweep_scope": sweep_scope,
"range_size": 0,
}

# Find all code_regions for changed files
regions = await get_regions_for_files(self._client, changed_files)

regions_updated = 0
decisions_reflected = 0
decisions_drifted = 0
# v0.4.11: track distinct intent_ids that flipped, not (region, intent)
# pairs. A decision with N regions all flipping in the same sweep
# used to inflate the counter N times — now it's counted once. Matches
# what users expect from "how many decisions just changed status."
flipped_to_reflected: set[str] = set()
flipped_to_drifted: set[str] = set()
undocumented_symbols: list[str] = []

for region in regions:
@@ -318,10 +382,13 @@ async def ingest_commit(
old_status = intent.get("status", "ungrounded")
if is_authoritative:
await update_intent_status(self._client, intent_id, new_status)
# v0.4.11: dedupe by intent_id. A decision with multiple
# regions all flipping in the same sweep is one flipped
# decision, not N. Sets collapse the duplicates.
if new_status == "reflected" and old_status != "reflected":
decisions_reflected += 1
flipped_to_reflected.add(intent_id)
elif new_status == "drifted" and old_status != "drifted":
decisions_drifted += 1
flipped_to_drifted.add(intent_id)

# Flag as undocumented if no intents mapped
intents = [i for i in (region.get("intents") or []) if i is not None]
@@ -339,9 +406,11 @@ async def ingest_commit(
"commit_hash": commit_hash,
"reason": "new_commit",
"regions_updated": regions_updated,
"decisions_reflected": decisions_reflected,
"decisions_drifted": decisions_drifted,
"decisions_reflected": len(flipped_to_reflected),
"decisions_drifted": len(flipped_to_drifted),
"undocumented_symbols": list(set(undocumented_symbols)),
"sweep_scope": sweep_scope,
"range_size": range_size,
}

async def backfill_empty_hashes(
44 changes: 44 additions & 0 deletions ledger/status.py
@@ -212,6 +212,50 @@ def get_changed_files(commit_hash: str, repo_path: str) -> list[str]:
return []


def get_changed_files_in_range(
base_sha: str,
head_sha: str,
repo_path: str,
) -> list[str] | None:
"""Return files touched between ``base_sha`` and ``head_sha``.

v0.4.11 (latent drift fix): when ``link_commit`` runs after a gap of
multiple commits since the last sync, sweeping only ``HEAD --name-only``
misses every file drifted by intermediate commits. This helper runs
``git diff --name-only base..head`` to enumerate the full set of files
touched across the gap, so the drift sweep covers everything that
needs re-checking.

Returns:
- ``list[str]`` of changed file paths (possibly empty when no
files differ between the two refs)
- ``None`` when the diff failed (force-push, shallow clone,
unreachable base SHA, etc.) — caller should fall back to
``get_changed_files(head_sha, repo_path)`` for head-only scope.

The ``None`` sentinel matters: empty list means "ran successfully,
no files differ" while ``None`` means "the range is unreachable."
"""
try:
result = subprocess.run(
["git", "diff", "--name-only", f"{base_sha}..{head_sha}"],
cwd=Path(repo_path).resolve(),
capture_output=True,
text=True,
timeout=30,
)
if result.returncode != 0:
logger.warning(
"[status] git diff %s..%s failed: %s",
base_sha[:8], head_sha[:8], result.stderr[:200],
)
return None
return [f.strip() for f in result.stdout.strip().splitlines() if f.strip()]
except (subprocess.TimeoutExpired, FileNotFoundError) as e:
logger.warning("[status] git diff range error: %s", e)
return None


def resolve_head(repo_path: str) -> str | None:
"""Return current HEAD SHA."""
try:
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "bicameral-mcp"
version = "0.4.10"
version = "0.4.11"
description = "Decision ledger MCP server — ingests meeting transcripts, maps decisions to code, tracks drift"
readme = "README.md"
requires-python = ">=3.10"