feat: local telemetry counters + usage_summary + first-boot consent (v0.14.0) #95
…v0.14.0)

Privacy-first observability foundation. Authored via QorLogic SDLC (plan → audit → implement → substantiate). Builds on the dev branch post-merge with main's v0.13.x telemetry refactor.

Closes BicameralAI#39: Local-only counter sink at ~/.bicameral/counters.jsonl. Records only {tool_name, delta=1, ts}; mode 0o600 on POSIX; thread-safe; no network egress. Always-on alongside the network relay (counters are local introspection, distinct from outbound telemetry). Kill-switch: BICAMERAL_LOCAL_COUNTERS=0. New module local_counters.py with increment(tool_name) and read_counters() API.

Closes BicameralAI#42: bicameral.usage_summary MCP tool. Aggregates ingest/bind call counts (from BicameralAI#39's counters file) plus decision counts by status (from ledger) and cosmetic-drift percentage (from compliance_check verdicts) over a configurable window. Returns counts and floats only: no event rows, no user content. New module handlers/usage_summary.py.

Adjacent to BicameralAI#39: consent.py owns ~/.bicameral/consent.json, the telemetry_allowed() predicate (single source of truth gating the relay), and the notify_if_first_run() non-blocking notice. Marker has an acknowledged_via field distinguishing "wizard" from "first_boot_notice" for future audit. POLICY_VERSION constant re-fires the notice for everyone if the telemetry policy ever changes.

telemetry.send_event:
- now uses consent.telemetry_allowed() as the single gating predicate
- always increments the local counter before the relay path (wrapped in try/except; failure cannot affect the caller or the relay)

setup_wizard._select_telemetry:
- writes the consent marker on every answer (wizard, non-interactive default, both)
- raises OSError on marker write failure, which guarantees a "no" answer cannot silently leave telemetry on

server.serve_stdio:
- calls consent.notify_if_first_run() once at startup, never blocking

CI: BICAMERAL_SKIP_CONSENT_NOTICE=1 added to test job env.

tests/conftest.py: session-scoped autouse fixture reroutes ~/.bicameral/ to a per-session tmp dir; stdlib only.

Tests: 23 pass, 1 skipped (POSIX-only file mode).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
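As an illustration of the counter sink this commit message describes, here is a minimal sketch and not the code from local_counters.py. The file path, the record keys, the 0o600 mode, and the BICAMERAL_LOCAL_COUNTERS kill-switch come from the message above; the optional `path` parameter, the epoch-seconds format of `ts`, and the per-tool totals returned by `read_counters` are assumptions made so the example is self-contained.

```python
import json
import os
import threading
import time
from pathlib import Path
from typing import Dict, Optional

# Path taken from the commit message; everything else in this block is illustrative.
COUNTERS_PATH = Path.home() / ".bicameral" / "counters.jsonl"
_LOCK = threading.Lock()


def _enabled() -> bool:
    # Kill-switch named in the commit message: BICAMERAL_LOCAL_COUNTERS=0 disables writes.
    return os.environ.get("BICAMERAL_LOCAL_COUNTERS", "1") != "0"


def increment(tool_name: str, path: Optional[Path] = None) -> None:
    """Append one {tool_name, delta, ts} record. No network access is involved."""
    if not _enabled():
        return
    target = path or COUNTERS_PATH  # `path` is a hypothetical parameter for testing
    # Assumption: ts is stored as epoch seconds; the source does not specify the format.
    record = {"tool_name": tool_name, "delta": 1, "ts": time.time()}
    line = json.dumps(record) + "\n"
    with _LOCK:  # serialise writers within this process
        target.parent.mkdir(parents=True, exist_ok=True)
        # O_APPEND keeps the file append-only; 0o600 applies at creation time on POSIX.
        fd = os.open(target, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        with os.fdopen(fd, "a", encoding="utf-8") as fh:
            fh.write(line)


def read_counters(path: Optional[Path] = None) -> Dict[str, int]:
    """Sum delta per tool_name. The return shape is an assumption of this sketch."""
    target = path or COUNTERS_PATH
    totals: Dict[str, int] = {}
    if not target.exists():
        return totals
    with target.open(encoding="utf-8") as fh:
        for raw in fh:
            raw = raw.strip()
            if not raw:
                continue
            rec = json.loads(raw)
            name = rec["tool_name"]
            totals[name] = totals.get(name, 0) + int(rec["delta"])
    return totals
```

Writing one JSON object per line means a reader can aggregate with a single pass and a partially written file loses at most its last line. The lock only covers threads inside one process; whether the real module also guards against concurrent processes is not stated in the source.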
bf71c03 to 895f8a3
Rebased onto current What changed
Status against
Adds mandatory PR labels mirroring the target branch:
- flow:feature (green): standard PR to dev (default flow)
- flow:release (blue): periodic dev→main release PR
- flow:hotfix (red): emergency direct-to-main fix bypassing dev

The base branch alone can't disambiguate `--base main` PRs, which can be either release or hotfix: different processes, different review tiers. The labels make the lane visible in `gh pr list` output and give a clean audit trail of historical hotfixes via `--label flow:hotfix --state closed`.

Distinct from the existing `merged-to-dev` label (post-merge status): flow:* labels are pre-merge intent.

Labels created in BicameralAI/bicameral-mcp; retroactively applied to the open PR backlog (BicameralAI#85, BicameralAI#86, BicameralAI#93, BicameralAI#95, BicameralAI#99). PR BicameralAI#96 left unlabeled until @silongtan confirms the targeting question raised in that PR. PR BicameralAI#99 (this dev-cycle policy's companion) will land the matching Dependabot auto-label so future bumps arrive pre-tagged.
…#93)

* docs: development cycle reference + demos/guides/training scaffolding

- docs/DEV_CYCLE.md: full lifecycle reference (issue → branch → PR → dev → release PR → main → tag → GitHub Release). Covers labels/milestones, PR body conventions, CI gates, squash-vs-merge policy, CHANGELOG flip pattern, documentation matrix per release, hotfix path, roles, and four demo storyboards for headline functionality.
- docs/demos/README.md: demo authoring rules, template, four-row index matching DEV_CYCLE.md §12.
- docs/guides/README.md: user-guide template + authoring rules. Pairs with DEV_CYCLE.md §8 documentation matrix.
- docs/training/README.md: training-doc template for concept-level teaching (vs. tool reference). Distinguishes when a topic warrants training over a guide.

Intent: codify the dev cycle so contributors and the release manager have a single source of truth, and pre-stage the index/template files so future features have somewhere to land their docs without re-deciding structure.

Per DEV_CYCLE.md change protocol, amendments to the doc require the docs:dev-cycle label.

* docs(dev-cycle): expand §4.5 CI gates with two-tier model

Replaces the three-line CI gates section with a tiered breakdown:

- Tier 1 (PR → dev), fast gates blocking every PR: lint, type check, regression on Linux + Windows matrix, schema persistence, module import smoke, secret scan, pip check, merged-to-dev label automation.
- Tier 2 (release PR → main), release-quality gates inheriting Tier 1 plus: full regression w/ slow markers, blocking preflight eval, schema migration validation, performance regression, security scan, CHANGELOG enforcement, version monotonicity, MCP protocol live smoke, issue auto-close + label-strip on merge.

Includes a "why the split" rationale table and a three-phase implementation roadmap. Calls out which gates exist today vs which are aspirational, so reviewers don't assume the doc reflects current enforcement.
§6.4 pre-release checklist annotated with the corresponding Tier 2 CI gates so the manual checklist and automated gates stay in sync as Phase 2 lands.

Phase 1 priority items (per recent triage):

- Windows test job: three of the last four bugs (#67, #68, #74) were Windows-only.
- merged-to-dev auto-labeller: addresses the manual labeling problem surfaced in PR-A audit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(dev-cycle): §4.1.1 flow:* PR labels (feature/release/hotfix)

Adds mandatory PR labels mirroring the target branch:

- flow:feature (green): standard PR to dev (default flow)
- flow:release (blue): periodic dev→main release PR
- flow:hotfix (red): emergency direct-to-main fix bypassing dev

The base branch alone can't disambiguate `--base main` PRs, which can be either release or hotfix: different processes, different review tiers. The labels make the lane visible in `gh pr list` output and give a clean audit trail of historical hotfixes via `--label flow:hotfix --state closed`.

Distinct from the existing `merged-to-dev` label (post-merge status): flow:* labels are pre-merge intent.

Labels created in BicameralAI/bicameral-mcp; retroactively applied to the open PR backlog (#85, #86, #93, #95, #99). PR #96 left unlabeled until @silongtan confirms the targeting question raised in that PR. PR #99 (this dev-cycle policy's companion) will land the matching Dependabot auto-label so future bumps arrive pre-tagged.

* docs(dev-cycle): §2.1.1/§2.1.2 issue priority + state labels

Adds two new label axes for issues:

- Priority (mandatory after triage, one of P0/P1/P2/P3): replaces the [P0]/[P1]/[P2] title-prefix convention some issues currently use. Calibration heuristics included; P0 explicitly rare.
- State (optional, orthogonal to priority): triage / blocked / parked. triage is the default on file; parked is maintainer-only.
State labels never replace priority; both axes coexist.

Also moves the existing risk:L* axis off issues and onto PRs in the doc text: risk is a property of the change being designed, knowable only after planning, so it doesn't make sense as an issue label. PR review tiers in §4.4 already consume risk:L*; this change just makes the doc internally consistent.

Labels created in BicameralAI/bicameral-mcp:

- P0 (red), P1 (orange), P2 (yellow), P3 (grey)
- parked (purple), blocked (dark grey), triage (light grey)

Retroactive application:

- #39 → P0 (had [P0] prefix)
- #42 → P1 (had [P1] prefix)
- #44 → P2 (had [P2] prefix)
- #87, #89, #50, #23 → triage (unlabeled or speculative)

Bulk priority triage of remaining issues left to maintainers.

* docs(dev-cycle): parked supersedes priority (not orthogonal)

Maintainer correction to §2.1.2: parked + Px is redundant. parked already encodes "not on the priority axis"; adding a priority label on top clutters the label list without adding signal. Issue #50 demonstrates the cleanup (P3 removed; parked stands alone).

triage and blocked still coexist with priority as before; those are genuinely orthogonal states. Only parked is the exception.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Issue BicameralAI#44 (P2) plan, targeting v0.14.0. Three architectural calls:

- D1. Skill-side, not server-side: preserves the local-first / LLM-free-server anti-goal in docs/CONCEPT.md.
- D2. Caching is already free via Phase 4's compliance_check writes (semantic_status + evidence_refs persisted).
- D3-D4. Reuses existing typed contracts: no new fields, no new tools, no schema changes. The judge maps to the existing two-axis output (verdict + semantic_status).
- D5. The rubric is data (text in SKILL.md), not Python code. The LLM follows it during the uncertain-band sub-protocol.
- D6. Five exit criteria, four CI-checkable + one operator QC pass.

Two phases:

- Phase 1: M3 benchmark corpus extension. Every uncertain case gains expected_judge: {verdict, semantic_status}. Pure data.
- Phase 2: Skill rubric. bicameral-sync SKILL.md gains an Uncertain-band sub-protocol section + paired training doc in docs/training/cosmetic-vs-semantic.md.

Risk grade L1 (skill rubric + docs + test data; no production code paths). Telemetry surface (acceptance criterion 3 of BicameralAI#44) deferred to a follow-up gated on PR BicameralAI#95 landing, flagged in plan Open Question 3.

Branched off BicameralAI/dev post-Phase-4 seal (200dbd5).
Triage release per DEV_CYCLE §10.5. Restores Guided-mode post-commit hook behavior (#124) and ships event vocabulary extension for cross-author replay (#97), alongside earlier carry-forward fixes (#74 Windows ingest, #95 telemetry counters + first-boot consent, #98 RFC docs). Full triage provenance and §10.5.3 adaptation notes in PR #128. CHANGELOG headline reworked: replaces the cherry-picked v0.14.0 dev-side heading with a v0.13.5 triage heading covering all 5 commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Privacy-first observability foundation. Adds a local-only counter sink alongside the existing network relay, the bicameral.usage_summary MCP tool that reads from it, and a non-blocking first-boot notice so users see the telemetry policy on the first boot of an upgraded binary.

- local_counters.py: append-only JSONL at ~/.bicameral/counters.jsonl. No network egress. Mode 0o600 on POSIX.
- bicameral.usage_summary MCP tool: aggregate counts/percentages over N days.

Depends on
This PR targets dev and assumes #94 (chore: merge main into dev) lands first. The integration site is telemetry.send_event (introduced on main in v0.12.0); without #94 dev would still call the older record_event and the wiring would not match.

Architecture
consent.telemetry_allowed() is the single source of truth: returns True when env BICAMERAL_TELEMETRY != "0" AND (marker missing OR marker.telemetry == "enabled"). Missing marker preserves default-on for upgraders who haven't seen the notice yet.

First-boot notice (non-blocking)
server.serve_stdio calls consent.notify_if_first_run() once at startup. The notice fires on first boot of an upgraded binary (no marker, or stale policy_version) and surfaces via:

- notifications/message (when an active session is available; surfaced by a harness like Claude Code)

After emit, the marker is stamped (acknowledged_via="first_boot_notice", policy_version=1) so the notice does not repeat. If the telemetry policy ever changes, bumping POLICY_VERSION re-fires the notice for everyone.

Server is never blocked: notify_if_first_run is best-effort, wrapped in try/except. Marker write failure logs at debug and continues.

bicameral.usage_summary (#42)

Returns the schema specified in the issue:
{
  "period_days": 7,
  "ingest_calls": int,            # from local counters
  "bind_calls_total": int,        # from local counters
  "decisions_ingested": int,      # from ledger
  "decisions_ungrounded": int,
  "decisions_pending": int,
  "decisions_reflected": int,
  "decisions_drifted": int,
  "reflected_pct": float,
  "drift_pct": float,
  "cosmetic_drift_pct": float,
  "error_rate": float,
}

Privacy: aggregates only; no event rows, no session IDs, no user content.
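To make the "from local counters" fields concrete, here is a rough sketch of how call counts could be aggregated over the `period_days` window. It is not the code in handlers/usage_summary.py. The record keys (`tool_name`, `delta`, `ts`) come from the PR; the epoch-seconds `ts` format, the `count_tool_calls` helper, and the tool names `bicameral.ingest` and `bicameral.bind` are assumptions for illustration.

```python
import time
from typing import Any, Dict, Iterable, Mapping, Optional

SECONDS_PER_DAY = 86_400


def count_tool_calls(
    records: Iterable[Mapping[str, Any]],
    period_days: int,
    now: Optional[float] = None,
) -> Dict[str, int]:
    """Sum delta per tool_name for records whose ts lies inside the window.

    Hypothetical helper: assumes ts is epoch seconds.
    """
    if period_days <= 0:
        return {}  # zero-day window short-circuits: nothing to aggregate
    now = time.time() if now is None else now
    cutoff = now - period_days * SECONDS_PER_DAY
    totals: Dict[str, int] = {}
    for rec in records:
        ts = float(rec["ts"])
        if ts < cutoff or ts > now:
            continue  # outside the requested window
        name = str(rec["tool_name"])
        totals[name] = totals.get(name, 0) + int(rec["delta"])
    return totals


# Example with fixed timestamps so the result does not depend on the clock.
NOW = 1_000_000.0
RECORDS = [
    {"tool_name": "bicameral.ingest", "delta": 1, "ts": NOW - 3_600},
    {"tool_name": "bicameral.bind", "delta": 1, "ts": NOW - 2 * SECONDS_PER_DAY},
    {"tool_name": "bicameral.ingest", "delta": 1, "ts": NOW - 10 * SECONDS_PER_DAY},
]
calls = count_tool_calls(RECORDS, period_days=7, now=NOW)
summary_fragment = {
    "period_days": 7,
    "ingest_calls": calls.get("bicameral.ingest", 0),   # assumed tool name
    "bind_calls_total": calls.get("bicameral.bind", 0),  # assumed tool name
}
```

Only integers leave the function, which matches the aggregates-only privacy note: no individual record or timestamp is returned to the caller.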
Wizard hard-fail
setup_wizard._select_telemetry now writes the consent marker on every answer (interactive or auto-yes). On marker write failure (OSError from disk full / RO home / permission denied), the wizard exits non-zero, which guarantees a "no" answer never silently leaves telemetry on.

Test plan
- tests/test_local_counters.py (7 tests): file creation, append, aggregation, network independence, concurrent writes, env kill-switch
- tests/test_consent_notice.py (12 tests, 1 skipped on Windows): telemetry_allowed() predicate truth table, marker shape, file mode, notice emit/suppress/re-fire, env-var skip, write-failure swallow, relay blocked when consent disabled
- tests/test_usage_summary.py (5 tests): zero-days short-circuit, decision aggregation, cosmetic_drift_pct, empty ledger, tool-call counts from local file
- tests/conftest.py: session-scoped autouse fixture isolates ~/.bicameral/ to tmp dir; stdlib only
- BICAMERAL_SKIP_CONSENT_NOTICE: '1' added to test job env

PR-B total: 23 pass, 1 skipped (POSIX-only file mode test on Windows host).
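The predicate truth table mentioned in the test plan follows from the rule stated under Architecture. The sketch below restates that rule as a pure function; it is an illustration, not the code in consent.py. The real predicate presumably reads the marker file and the process environment itself, so the `marker` and `env` parameters here are hypothetical, and the value `"disabled"` is an assumed stand-in for any marker value other than `"enabled"`.

```python
import os
from typing import Any, Mapping, Optional


def telemetry_allowed(
    marker: Optional[Mapping[str, Any]],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """True when BICAMERAL_TELEMETRY != "0" AND (marker missing OR telemetry == "enabled")."""
    env = os.environ if env is None else env
    if env.get("BICAMERAL_TELEMETRY") == "0":
        return False  # the env override wins regardless of the marker
    if marker is None:
        return True  # missing marker keeps default-on for upgraders
    return marker.get("telemetry") == "enabled"


# (env, marker, expected) rows derived from the rule above.
TRUTH_TABLE = [
    ({}, None, True),
    ({}, {"telemetry": "enabled"}, True),
    ({}, {"telemetry": "disabled"}, False),  # "disabled" is an assumed value
    ({"BICAMERAL_TELEMETRY": "0"}, None, False),
    ({"BICAMERAL_TELEMETRY": "0"}, {"telemetry": "enabled"}, False),
    ({"BICAMERAL_TELEMETRY": "1"}, {"telemetry": "enabled"}, True),
]
```

Passing the environment and marker in as arguments keeps each row of the table a one-line check with no filesystem or process state involved.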
Closes
Closes #39
Closes #42
🤖 Authored via QorLogic SDLC (plan → audit → implement → substantiate).