Merged
75 changes: 75 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,81 @@
All notable changes to bicameral-mcp are tracked here. Format loosely follows
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## v0.15.0 — Preflight telemetry capture loop (pieces 1–4) — built via [QorLogic SDLC](https://github.com/MythologIQ-Labs-LLC/qor-logic)

First slice of the failure-mode triage workflow from #65. Adds a local-only,
**default-off** capture loop that records `bicameral.preflight` events plus
downstream tool engagement, attributable per call via a new `preflight_id`.
The data is for self-triage of false fires and silent misses; it never leaves
the user's machine and is not part of the existing PostHog relay path.
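
As an illustration of the per-call attribution described above, the sketch below joins engagement rows to preflight events on `preflight_id`. The row layout is an assumption: the field names are borrowed from the keyword arguments passed to `write_preflight_event` and `write_engagement` in this PR, and `join_by_preflight_id` is a hypothetical helper, not part of `preflight_telemetry.py`.

```python
import json
from collections import defaultdict


def join_by_preflight_id(preflight_lines, engagement_lines):
    """Group engagement rows under the preflight event that shares their id.

    Hypothetical helper: both arguments are iterables of JSONL strings.
    """
    engagements = defaultdict(list)
    for line in engagement_lines:
        row = json.loads(line)
        # Rows without an explicit id are skipped in this sketch.
        if row.get("preflight_id"):
            engagements[row["preflight_id"]].append(row)

    joined = []
    for line in preflight_lines:
        event = json.loads(line)
        joined.append(
            {"event": event, "engagements": engagements.get(event["preflight_id"], [])}
        )
    return joined


# Made-up rows for illustration only.
preflight_lines = [
    json.dumps({"preflight_id": "p-1", "fired": True, "surfaced_ids": ["d-1"], "reason": "fired"}),
    json.dumps({"preflight_id": "p-2", "fired": False, "surfaced_ids": [], "reason": "no_matches"}),
]
engagement_lines = [
    json.dumps({"preflight_id": "p-1", "tool": "bicameral.bind", "decision_id": "d-1"}),
]

joined = join_by_preflight_id(preflight_lines, engagement_lines)
```

In this toy data, a fired event with no engagement rows, or an unfired event followed by engagement, is the kind of row a later triage step might want to look at.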

### Added

- **New module: `preflight_telemetry.py`** (top-level, sibling of
  `telemetry.py` — they are independent capture systems). Provides:
  - `_get_or_create_salt()` — per-install salt at `~/.bicameral/salt`,
    `os.urandom(32)`, mode `0o600` on POSIX. Race-safe init: `os.O_EXCL`
    create with a `FileExistsError` fallback that reads the winner's
    bytes (audit MF1 inline fix).
  - `hash_topic(topic)` and `hash_file_paths(paths)` — salted SHA-256
    truncated to 16 hex chars (64 bits). `hash_file_paths` is
    order-independent so `["a.py","b.py"]` and `["b.py","a.py"]` collide
    by design.
  - `new_preflight_id()` — fresh UUIDv4.
  - `write_preflight_event(...)` — JSONL append at
    `~/.bicameral/preflight_events.jsonl`, mode `0o600`.
  - `write_engagement(...)` — JSONL append at
    `~/.bicameral/engagements.jsonl`, mode `0o600`. Falls back to
    subset-match attribution against recent preflight events when no
    explicit `preflight_id` is supplied.
  - `_maybe_rotate(path)` — rotates at 50 MB or 30 days, keeps the most
    recent 5 rotations. Uses `os.replace` (atomic on Windows + POSIX).
- **`preflight_id` plumb-through** — new optional `str | None` field on
`PreflightResponse`, `LinkCommitResponse`, `BindResponse`, and
`RatifyResponse`. The `update.py` handler returns dicts and now adds a
`preflight_id` key to every return shape (audit S3 — 11 sites). Each
affected handler (`handle_link_commit`, `handle_bind`, `handle_ratify`,
`handle_update`) gains a keyword-only `preflight_id: str | None = None`
parameter.
- **MCP tool inputSchema** — `preflight_id` (optional string) added to
`bicameral.preflight`, `bicameral.link_commit`, `bicameral.bind`,
`bicameral.update`, `bicameral.ratify`. Existing skills that don't pass
it keep working unchanged.
- **Tests** — `tests/test_preflight_telemetry.py` (19 cases covering
salt, hash, writers, rotation, race-loser MF1) and
`tests/test_preflight_id_plumbing.py` (9 cases covering the response
field on each affected handler).
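
The salt and hashing behaviour listed above can be sketched as follows. This is a minimal illustration, not the module's code: the function names, the explicit `salt` argument, and the encoding of paths (sorted, newline-joined) are assumptions made here.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def get_or_create_salt(path: Path) -> bytes:
    """Create a 32-byte salt once; a caller that loses the race reads the file.

    Simplification: a loser that reads before the winner has finished
    writing could see a short file. This sketch does not handle that.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return path.read_bytes()
    with os.fdopen(fd, "wb") as handle:
        salt = os.urandom(32)
        handle.write(salt)
    return salt


def salted_prefix(value: str, salt: bytes) -> str:
    """Salted SHA-256, truncated to 16 hex characters."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()[:16]


def hash_paths_unordered(paths: list[str], salt: bytes) -> str:
    """Sort before hashing so the result does not depend on input order."""
    return salted_prefix("\n".join(sorted(paths)), salt)


# Demo in a throwaway directory rather than ~/.bicameral.
demo_salt_path = Path(tempfile.mkdtemp()) / "salt"
salt = get_or_create_salt(demo_salt_path)
same_salt = get_or_create_salt(demo_salt_path)  # second call reads the file
```

Sorting before hashing is one simple way to get the order independence the changelog describes; the real module may encode the path set differently.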

### Privacy stance

- **Opt-in.** Default is OFF. Set `BICAMERAL_PREFLIGHT_TELEMETRY=1` to
capture; unsetting it makes every writer a no-op.
- **Hashed by default.** Topic and file_paths are stored as 16-char
salted SHA-256 prefixes. Set `BICAMERAL_PREFLIGHT_TELEMETRY_RAW=1` to
additionally store plaintext — separate, explicit opt-in.
- **`surfaced_ids` are written raw.** They are opaque ledger
`decision_id` strings, already non-PII. Hashing them would defeat the
triage join with `failure_review.jsonl` (the only useful join).
Documented as an invariant in the module docstring.
- **Local-only.** All files live under `~/.bicameral/`, mode `0o600`.
Data never leaves the machine; this is a separate path from the
PostHog relay in `telemetry.py`.
- **Bounded retention.** 50 MB rolling cap per file; 30-day mtime
ceiling; keep last 5 rotations.

### Out of scope (deferred to follow-up plans)

- **Piece 5 — SessionEnd reconciliation skill** (#65-pt2). Reads the
JSONL files, classifies entries as `suspected_miss` /
`suspected_false_fire` / `normal`, writes `failure_review.jsonl`.
- **Piece 6 — Triage CLI + redaction** (#65-pt3). `bicameral-mcp triage`
CLI for labeling failure rows; promotion to
`tests/eval/real_dataset.jsonl` requires explicit redaction.

### Closes

#65 (pieces 1–4 only — pieces 5–6 tracked separately)

## v0.14.0 — Local-only telemetry counters + usage summary + first-boot consent — built via [QorLogic SDLC](https://github.com/MythologIQ-Labs-LLC/qor-logic)

Privacy-first observability foundation. Adds a local-only counter sink
7 changes: 4 additions & 3 deletions consent.py
@@ -27,9 +27,10 @@
 import logging
 import os
 import sys
-from datetime import datetime, timezone
+from collections.abc import Callable
+from datetime import UTC, datetime
 from pathlib import Path
-from typing import Any, Callable
+from typing import Any

 logger = logging.getLogger(__name__)

@@ -70,7 +71,7 @@ def write_consent(telemetry: bool, *, via: str) -> None:
     record: dict[str, Any] = {
         "telemetry": "enabled" if telemetry else "disabled",
         "policy_version": POLICY_VERSION,
-        "acknowledged_at": datetime.now(timezone.utc).isoformat(),
+        "acknowledged_at": datetime.now(UTC).isoformat(),
         "acknowledged_via": via,
     }
     _CONSENT_FILE.parent.mkdir(parents=True, exist_ok=True)
11 changes: 11 additions & 0 deletions contracts.py
@@ -319,6 +319,10 @@ class LinkCommitResponse(BaseModel):
     # ``pending_compliance_checks`` before the response is sent. Zero
     # when ``codegenome.enhance_drift`` is disabled.
     auto_resolved_count: int = 0
+    # #65 — preflight telemetry plumb-through. When the caller passed a
+    # preflight_id (from a prior bicameral.preflight call), the response
+    # echoes it so downstream telemetry rows can be attributed.
+    preflight_id: str | None = None


 class ActionHint(BaseModel):
@@ -645,6 +649,9 @@ class PreflightResponse(BaseModel):
     context_pending_ready: list[BriefDecision] = [] # context_pending with ≥1 confirmed context_for
     sync_metrics: SyncMetrics | None = None # V1 A3 — catch-up wall times
     product_stage: str | None = None # shown once per device; wait-time expectation-setting
+    # #65 — opaque per-call id for the preflight telemetry capture loop.
+    # None when telemetry is disabled (BICAMERAL_PREFLIGHT_TELEMETRY != 1).
+    preflight_id: str | None = None


 # ── Tool 10: /bicameral_judge_gaps ───────────────────────────────────
@@ -709,6 +716,8 @@ class RatifyResponse(BaseModel):
     was_new: bool # True if this call set the signoff; False if already set
     signoff: dict
     projected_status: Literal["reflected", "drifted", "pending", "ungrounded"]
+    # #65 — preflight telemetry plumb-through.
+    preflight_id: str | None = None


 # ── Tool: bicameral.resolve_collision ────────────────────────────────────────
@@ -823,6 +832,8 @@ class BindResponse(BaseModel):

     bindings: list[BindResult]
     sync_metrics: SyncMetrics | None = None # V1 A3 — write-barrier hold time
+    # #65 — preflight telemetry plumb-through.
+    preflight_id: str | None = None


 # ── Session-start banner ─────────────────────────────────────────────
24 changes: 23 additions & 1 deletion handlers/bind.py
@@ -6,11 +6,17 @@

 from contracts import BindResponse, BindResult, PendingComplianceCheck, SyncMetrics
 from handlers.sync_middleware import repo_write_barrier
+from preflight_telemetry import telemetry_enabled, write_engagement

 logger = logging.getLogger(__name__)


-async def handle_bind(ctx, bindings: list[dict]) -> BindResponse:
+async def handle_bind(
+    ctx,
+    bindings: list[dict],
+    *,
+    preflight_id: str | None = None,
+) -> BindResponse:
     """Create decision→code_region bindings from caller-LLM-supplied locations.

     For each binding:
@@ -32,6 +38,22 @@ async def handle_bind(ctx, bindings: list[dict]) -> BindResponse:
     async with repo_write_barrier(ctx) as timing:
         response = await _do_bind(ctx, bindings)
     response.sync_metrics = SyncMetrics(barrier_held_ms=timing.held_ms)
+    response.preflight_id = preflight_id
+
+    if telemetry_enabled():
+        # One row per bind call (not per binding) — the call is the unit of
+        # engagement. decision_id is the first binding's id when present;
+        # file_paths is the union of file paths across the call.
+        first_decision = (str(bindings[0].get("decision_id") or "") if bindings else None) or None
+        file_paths = [str(b.get("file_path") or "") for b in (bindings or []) if b.get("file_path")]
+        write_engagement(
+            session_id=str(getattr(ctx, "session_id", "unknown") or "unknown"),
+            tool="bicameral.bind",
+            decision_id=first_decision,
+            preflight_id=preflight_id,
+            file_paths=file_paths or None,
+        )
+
     return response


30 changes: 29 additions & 1 deletion handlers/link_commit.py
@@ -31,6 +31,7 @@
 import uuid

 from contracts import LinkCommitResponse, PendingComplianceCheck
+from preflight_telemetry import telemetry_enabled, write_engagement


 def _is_ephemeral_commit(commit_hash: str, repo_path: str, authoritative_ref: str = "") -> bool:
@@ -440,7 +441,12 @@ async def _run_continuity_pass(ctx, pending: list[PendingComplianceCheck]) -> li
     return resolutions


-async def handle_link_commit(ctx, commit_hash: str = "HEAD") -> LinkCommitResponse:
+async def handle_link_commit(
+    ctx,
+    commit_hash: str = "HEAD",
+    *,
+    preflight_id: str | None = None,
+) -> LinkCommitResponse:
     # v0.4.8: short-circuit if we've already synced this SHA within this
     # MCP call. Returns the FULL cached response from the first sync so
     # downstream consumers (search/drift's ``sync_status``) see real
@@ -451,6 +457,18 @@ async def handle_link_commit(ctx, commit_hash: str = "HEAD") -> LinkCommitRespon
             "[link_commit] sync dedup: %s already synced in this call",
             commit_hash,
         )
+        # Echo preflight_id into the cached response so the engagement row
+        # (and downstream consumers) sees the caller-supplied id.
+        if preflight_id is not None:
+            cached = cached.model_copy(update={"preflight_id": preflight_id})
+        if telemetry_enabled():
+            write_engagement(
+                session_id=str(getattr(ctx, "session_id", "unknown") or "unknown"),
+                tool="bicameral.link_commit",
+                decision_id=None,
+                preflight_id=preflight_id,
+                file_paths=None,
+            )
         return cached

     # Self-heal legacy regions with empty content_hash from pre-v0.4.5
@@ -549,9 +567,19 @@ async def handle_link_commit(ctx, commit_hash: str = "HEAD") -> LinkCommitRespon
         ephemeral=is_ephemeral,
         continuity_resolutions=continuity_resolutions,
         auto_resolved_count=auto_resolved_count,
+        preflight_id=preflight_id,
     )
     _store_sync_cache(ctx, commit_hash, response)

+    if telemetry_enabled():
+        write_engagement(
+            session_id=str(getattr(ctx, "session_id", "unknown") or "unknown"),
+            tool="bicameral.link_commit",
+            decision_id=None,
+            preflight_id=preflight_id,
+            file_paths=None,
+        )
+
     try:
         from dashboard.server import notify_dashboard

60 changes: 59 additions & 1 deletion handlers/preflight.py
@@ -41,6 +41,11 @@
 )
 from handlers.action_hints import generate_hints_from_findings
 from handlers.analysis import _to_brief_decision
+from preflight_telemetry import (
+    new_preflight_id,
+    telemetry_enabled,
+    write_preflight_event,
+)

 logger = logging.getLogger(__name__)

@@ -295,28 +300,55 @@ async def handle_preflight(
     """Pre-flight context check. Gates output by ``ctx.guided_mode``."""
     guided_mode = bool(getattr(ctx, "guided_mode", False))

+    # #65 — generate the per-call preflight_id once, when telemetry is enabled.
+    # Stable across the preflight → downstream-tool engagement chain.
+    pid: str | None = new_preflight_id() if telemetry_enabled() else None
+    session_id = str(getattr(ctx, "session_id", "unknown") or "unknown")
+
     # Explicit mute via env var — one-line off-switch for the session.
     if os.getenv("BICAMERAL_PREFLIGHT_MUTE", "").strip().lower() in (
         "1",
         "true",
         "yes",
         "on",
     ):
+        if pid is not None:
+            write_preflight_event(
+                session_id=session_id,
+                preflight_id=pid,
+                topic=topic,
+                file_paths=file_paths or [],
+                fired=False,
+                surfaced_ids=[],
+                reason="preflight_disabled",
+            )
         return PreflightResponse(
             topic=topic,
             fired=False,
             reason="preflight_disabled",
             guided_mode=guided_mode,
+            preflight_id=pid,
         )

     # Per-session dedup — same topic within 5 min is silenced.
     if _check_dedup(ctx, topic):
         logger.debug("[preflight] dedup hit for topic: %r", topic[:60])
+        if pid is not None:
+            write_preflight_event(
+                session_id=session_id,
+                preflight_id=pid,
+                topic=topic,
+                file_paths=file_paths or [],
+                fired=False,
+                surfaced_ids=[],
+                reason="recently_checked",
+            )
         return PreflightResponse(
             topic=topic,
             fired=False,
             reason="recently_checked",
             guided_mode=guided_mode,
+            preflight_id=pid,
         )

     # V1 A3: time the call locally so the metric reflects THIS handler's catch-up.
@@ -385,7 +417,7 @@ async def handle_preflight(
     fired = bool(region_matches or unresolved_collisions or context_pending_ready or guided_mode)
     action_hints = generate_hints_from_findings([], drift_candidates, [], guided_mode)

-    return PreflightResponse(
+    response = PreflightResponse(
         topic=topic,
         fired=fired,
         reason="fired" if fired else "no_matches", # type: ignore[arg-type]
@@ -400,4 +432,30 @@ async def handle_preflight(
         context_pending_ready=context_pending_ready,
         sync_metrics=sync_metrics,
         product_stage=_PRODUCT_STAGE_MSG if _should_show_product_stage() else None,
+        preflight_id=pid,
     )
+
+    # #65 — capture-loop event. surfaced_ids is the union of decision_ids the
+    # response is steering the agent toward, used for triage joins.
+    if pid is not None:
+        surfaced_ids: list[str] = []
+        for d in decisions:
+            if d.decision_id:
+                surfaced_ids.append(d.decision_id)
+        for d in unresolved_collisions:
+            if d.decision_id and d.decision_id not in surfaced_ids:
+                surfaced_ids.append(d.decision_id)
+        for d in context_pending_ready:
+            if d.decision_id and d.decision_id not in surfaced_ids:
+                surfaced_ids.append(d.decision_id)
+        write_preflight_event(
+            session_id=session_id,
+            preflight_id=pid,
+            topic=topic,
+            file_paths=file_paths or [],
+            fired=fired,
+            surfaced_ids=surfaced_ids,
+            reason=response.reason,
+        )
+
+    return response