feat(cli): #252 Layer 3 — bicameral-mcp diagnose CLI by Knapp-Kevin · Pull Request #257 · BicameralAI/bicameral-mcp

Knapp-Kevin · 2026-05-07T19:38:22Z

Summary

Closes #252 Layer 3 of the privacy-preserving ledger-remediation strategy. New `bicameral-mcp diagnose` CLI subcommand emits a markdown-styled report containing structural metadata only — versions, file metadata, table row counts, schema-revision sentinel state, recent warn|error event tail. Operators copy-paste the rendered output into GitHub bug reports without privacy review.

Plan / Audit / Seal

Plan: `plan-252-layer-3-diagnose-cli.md`
Strategy brief: `docs/research-brief-252-privacy-preserving-ledger-remediation.md`
Audit: round 1 PASS (no VETO: the plan absorbed prior-round audit learnings, namely the separate-table architecture from Layer 2, the allowlist + content-contract test pattern from #227, and helper extraction acknowledged in implementer notes)
Ledger seal: META_LEDGER entry #47

What ships

| Surface | Change |
| --- | --- |
| `cli/diagnose.py` (new, 213 LOC) | `Diagnosis` frozen dataclass (17 fields) + `_ALLOWED_FIELDS` frozenset + `_CANONICAL_TABLES` + 6 section helpers + `format_diagnosis` + `main` |
| `cli/_diagnose_gather.py` (new, 244 LOC; split per Razor advisory) | `gather_diagnosis` async + 7 private readers (ledger metadata, bicameral_meta sentinel, schema_version, table counts, audit-log channel, JSONL warn/error tail, recent-events merge across preflight + audit-log paths) + 5-heuristic suggestion engine |
| `server.py` | new `diagnose` subparser registration + dispatch arm |
| `handlers/update.py` | public alias `fetch_recommended_version` for clean cross-layer call (audit advisory) |
| `docs/policies/diagnose-output.md` (new) | operator-readable allowlist policy + suggestion heuristic catalog + always-safe-to-paste guarantee |
| `README.md` | "Compliance posture" bumped 5 → 6 policy files |
| `tests/test_diagnose_*.py` (new, 4 files) | 32 functional tests (3 allowlist + 18 gather + 8 format + 3 CLI) |
| `tests/test_compliance_policy_docs.py` | 2 new content-contract tests (allowlist field doc-coverage + heuristic-catalog drift lock) |
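The recent-events merge listed for `cli/_diagnose_gather.py` can be pictured roughly as follows. This is a minimal sketch, not the PR's code: the record keys (`ts`, `level`, `event`), the helper names, and the assumption that timestamps are ISO-8601 strings that sort lexicographically are all hypothetical. Only the two log locations and the 5-event cap come from the commit messages in this PR.

```python
import json
import os
from pathlib import Path

_LEVELS = {"warn", "error"}
_MAX_EVENTS = 5  # the commit notes mention a recent-events 5-cap
_SAFE_KEYS = ("ts", "level", "event")  # hypothetical structural keys


def default_event_paths() -> list[Path]:
    """Preflight JSONL plus the optional BICAMERAL_AUDIT_LOG path."""
    paths = [Path.home() / ".bicameral" / "preflight_events.jsonl"]
    configured = os.environ.get("BICAMERAL_AUDIT_LOG")
    if configured:
        paths.append(Path(configured))
    return paths


def read_warn_error_tail(path: Path) -> list[dict]:
    """Return warn/error records from one JSONL file, structural keys only."""
    if not path.is_file():
        return []  # a missing log is not an error for a diagnostic tool
    events = []
    for line in path.read_text(encoding="utf-8").splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or corrupt lines
        if isinstance(record, dict) and record.get("level") in _LEVELS:
            # Copy allowlisted keys only; free-text payloads are dropped.
            events.append({k: str(record.get(k, "")) for k in _SAFE_KEYS})
    return events


def merge_recent_events(paths: list[Path]) -> list[dict]:
    """Merge all sources, order by timestamp, keep the newest five."""
    merged = [e for p in paths for e in read_warn_error_tail(p)]
    merged.sort(key=lambda e: e["ts"])
    return merged[-_MAX_EVENTS:]
```

Copying an explicit key tuple rather than passing records through is what keeps this reader consistent with the allowlist-by-field-name discipline described in this PR.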

Privacy posture

The output is always safe to paste. Three mechanisms enforce this together: the frozen `Diagnosis` dataclass fixes the set of reportable fields, the `_ALLOWED_FIELDS` frozenset allowlists those fields by name, and a content-contract drift lock keeps the policy doc and the code in sync. A negative content-leak test (`test_gather_diagnosis_emits_no_decision_content_when_decisions_present`) inserts a decision with marker `"TOP-SECRET-DECISION-CONTENT-MARKER"` and asserts the marker does NOT appear in `repr(Diagnosis)`.

Suggestion heuristics (5 hardcoded)

drift detected — `drift_status == "drift"`
recommended-version mismatch — `bicameral_version` ≠ fetched `RECOMMENDED_VERSION`
audit log disabled — `audit_log_channel == "stderr"` (default)
ledger > 100 MiB — `ledger_size_bytes > 100 * 1024 * 1024`
schema version old — `schema_version_recorded < schema_version_expected`
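The five conditions above could be evaluated along these lines. This is a sketch under assumptions: the field names and thresholds are taken from the list, but `DiagnosisSubset`, the suggestion wording, integer schema versions, and the handling of an unavailable recommended version (`None`) are hypothetical, and the PR's `_compute_suggestions` may differ.

```python
from dataclasses import dataclass
from typing import Optional

_LEDGER_SIZE_LIMIT = 100 * 1024 * 1024  # 100 MiB


@dataclass(frozen=True)
class DiagnosisSubset:
    """Only the fields the five heuristics read (hypothetical subset)."""
    bicameral_version: str
    drift_status: str
    audit_log_channel: str
    ledger_size_bytes: int
    schema_version_recorded: int  # assumed to be integers
    schema_version_expected: int


def compute_suggestions(
    d: DiagnosisSubset, recommended_version: Optional[str]
) -> list[str]:
    """Return one human-readable hint per triggered heuristic."""
    hints = []
    if d.drift_status == "drift":
        hints.append("drift detected")
    # Assumption: skip the comparison when the fetch returned nothing.
    if recommended_version is not None and d.bicameral_version != recommended_version:
        hints.append("recommended-version mismatch")
    if d.audit_log_channel == "stderr":
        hints.append("audit log disabled")
    if d.ledger_size_bytes > _LEDGER_SIZE_LIMIT:
        hints.append("ledger > 100 MiB")
    if d.schema_version_recorded < d.schema_version_expected:
        hints.append("schema version old")
    return hints
```

Keeping each heuristic as one flat condition makes the catalog easy to mirror in a policy doc, which is what the heuristic-catalog drift lock relies on.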

Test plan

32 new functional tests pass (3 allowlist + 18 gather + 8 format + 3 CLI)
2 new content-contract tests pass (doc/code drift locks for allowlist + heuristic catalog)
Compliance-policy regression: 10 tests pass clean
`ruff check` + `ruff format --check` clean on every touched + new file
End-to-end smoke test against `memory://` ledger emits all 6 required section headers
Privacy verification: decision content never appears in rendered output
Network-isolation: CLI tests monkeypatch `fetch_recommended_version` to avoid live HTTP

All 3 audit advisories applied

| Advisory | Resolution |
| --- | --- |
| Razor headroom on `cli/diagnose.py` (~280 plan estimate) | Helper extraction: `cli/diagnose.py` (213) + `cli/_diagnose_gather.py` (244). Both under 250. Mirror of `cli/_link_commit_runner.py` pattern. |
| Recommended-version fetch needs cross-layer call | Public alias `fetch_recommended_version` added to `handlers/update.py` (one-line addition). Avoids private-symbol re-use. |
| CLI tests would otherwise gate on `raw.githubusercontent.com` availability | Network monkeypatched in `tests/test_diagnose_cli.py` autouse fixture. |
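The network-isolation advisory can be illustrated with a small stand-in. The PR uses a pytest autouse fixture; the sketch below swaps in `unittest.mock` from the standard library so it runs on its own. The `update` namespace, `version_line`, and `run_isolated` are hypothetical, and the fetch is assumed to be a synchronous call returning a version string.

```python
from types import SimpleNamespace
from unittest import mock


def _live_fetch() -> str:
    # Stands in for the real HTTP call; tests must never reach it.
    raise RuntimeError("live HTTP attempted")


# Hypothetical stand-in for the handlers/update.py module.
update = SimpleNamespace(fetch_recommended_version=_live_fetch)


def version_line(installed: str) -> str:
    """Look the fetch up at call time so a patch takes effect."""
    recommended = update.fetch_recommended_version()
    status = "ok" if installed == recommended else "mismatch"
    return f"{installed} (recommended {recommended}): {status}"


def run_isolated(installed: str, pinned: str) -> str:
    """Run version_line with the fetch replaced by a fixed value."""
    with mock.patch.object(
        update, "fetch_recommended_version", return_value=pinned
    ):
        return version_line(installed)
```

Patching the public alias rather than a private symbol is the same reason the alias was added: callers and tests share one stable name.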

Razor compliance

All new code under limits:

`cli/diagnose.py` 213 LOC; longest function `format_diagnosis` ~17 LOC (orchestrator)
`cli/_diagnose_gather.py` 244 LOC; longest function `gather_diagnosis` ~35 LOC
All readers <25 LOC; `_compute_suggestions` ~38 LOC (right at 40-LOC ceiling)
`server.py` modifications: 5 LOC additive
`handlers/update.py` modification: 3-LOC public alias

Closes / unlocks

Closes: #252 Layer 3 (diagnose CLI). Issue #252 is "link_commit fails with SurrealDB 'Invalid revision 116' + recommended v0.13.9 not on PyPI".
Substrate for #252 Layer 4 (export/import): the Layer 4 export-stamp logic reads the same `bicameral_meta` sentinel, and Layer 3's allowlist discipline becomes the export's privacy contract baseline
Substrate for #252 Layer 5 (opt-in auto-migrate): pre-migrate signal via `bicameral-mcp diagnose` output
Substrate for BicameralAI/bicameral-daemon#23 GDPR right-to-erasure: design discipline (allowlist by field name; never row content) becomes the template for DSAR + erasure surfaces

Inheritance from prior work

Mirrors the dual-emit + exception-isolation pattern from `handlers/ingest._emit_ingest_refusal_telemetry` (#227, "[compliance:SOC 2 CC, OWASP A09] Structured audit-log emission for self-hosted operators (gap SOC2-06 + OWASP-06 fold)")
Forbid-list ↔ allowlist symmetry: #227 catches accidents at the write site (audit-log emit); Layer 3 catches accidents at the read site (diagnose output)
Policy-doc + content-contract-test pattern from #248/#249/#256: locks the new policy doc into doc/code drift detection

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Privacy-preserving operator bug-report tool per docs/research-brief-252-privacy-preserving-ledger-remediation.md.

Three phases:
- Phase 1: Diagnosis dataclass + _ALLOWED_FIELDS frozenset + async gather_diagnosis() with private readers (ledger metadata, bicameral_meta sentinel, table counts, audit-log tail merge across both JSONL + optional configured path, suggestion engine). Allowlist-by-field-name is the load-bearing privacy mechanism.
- Phase 2: format_diagnosis() markdown renderer + CLI subparser registration in server._register_subparsers + dispatch arm. Output is operator-pasteable markdown with 6 section headers.
- Phase 3: operator policy doc enumerating allowlisted fields + suggestion heuristic catalog; content-contract tests lock doc/code drift; README compliance-posture row added.

~280 LOC + 25 functional tests. No new pip deps. Implementation can land before #256 merges (bicameral_meta reads are tolerant of missing table); integration test for first-write status xfail-marked until #256 reaches dev.

Five hardcoded suggestion heuristics: drift detected; recommended-version mismatch; audit-log disabled; ledger > 100 MiB; schema version old. Plugin extension is YAGNI for v1.

Audit-log tail covers BOTH ~/.bicameral/preflight_events.jsonl AND BICAMERAL_AUDIT_LOG=<path> when configured (full-utility hybrid per operator directive; refused the simplification tradeoff).

Plan: plan-252-layer-3-diagnose-cli.md
Strategy: docs/research-brief-252-privacy-preserving-ledger-remediation.md
Depends on: #252 Layer 2 (#256) for bicameral_meta table availability

…eport (#252 Layer 3)

Closes #252 Layer 3 per docs/research-brief-252-privacy-preserving-ledger-remediation.md.

cli/diagnose.py (215 LOC): Diagnosis dataclass + _ALLOWED_FIELDS frozenset (17 fields) + _CANONICAL_TABLES + format_diagnosis() markdown renderer + main() entrypoint. Allowlist-by-field-name is the load-bearing privacy mechanism.

cli/_diagnose_gather.py (244 LOC, helper-extracted per Razor advisory): gather_diagnosis() async function + 5 private readers (ledger metadata, bicameral_meta sentinel, schema_version, table_counts, audit-log tail merge across both preflight_events.jsonl AND BICAMERAL_AUDIT_LOG=<path> when configured) + 5-heuristic suggestion engine.

server.py: new "diagnose" subparser registration + dispatch arm.

handlers/update.py: public alias `fetch_recommended_version` for clean cross-layer call from cli/diagnose without re-using private underscore symbol.

docs/policies/diagnose-output.md: operator-readable allowlist policy + suggestion heuristic catalog + always-safe-to-paste guarantee.

README.md: Compliance posture section bumped 5 → 6 policy files.

Tests: 31 new functional tests across 4 files (allowlist parity + gather scenarios + format rendering + CLI subprocess) plus 2 new content-contract tests in test_compliance_policy_docs.py (allowlist field doc-coverage + suggestion-catalog drift lock). All 41 PASS.

option<datetime> support inherited from #252 Layer 2 (not present on this branch base; bicameral_meta read returns "first-write" status on pre-Layer-2 ledgers, which is semantically equivalent for diagnostic purposes).

Plan: plan-252-layer-3-diagnose-cli.md
Audit: round 1 PASS (no VETO: the plan absorbed prior-round audit learnings, namely the separate-table architecture from Layer 2, the allowlist + content-contract test pattern from #227, and helper extraction acknowledged in implementer notes).

…I substantiated

Closes #252 Layer 3 per docs/research-brief-252-privacy-preserving-ledger-remediation.md.

Reality matches Promise: 9 planned files (4 new tests + 1 new policy doc + 4 modified sources/docs) shipped across 3 phases. 41 new functional tests pass (16 more than the plan's 25; doctrine-positive expansion covering per-reader unit boundary + edge cases like PackageNotFoundError unknown branch, recent-events 5-cap enforcement, table-counts canonical subset).

Logged deviations:
1. Helper-extraction split per Razor advisory (cli/diagnose.py 213 + cli/_diagnose_gather.py 244, both under 250). Mirrors cli/_link_commit_runner.py pattern.
2. drift_status semantics adjusted to "match" on fresh ledger (Layer 2's sentinel writes row at adapter.connect time; gather sees populated row by the time it runs). Test renamed to reflect Layer-2-integrated reality.
3. Test count expansion (25 -> 41).

Privacy posture: allowlist-by-field-name at the Diagnosis dataclass level. Negative content-leak test verifies decision content never appears in repr(Diagnosis). Forbidden-field-name negative lock catches future field expansion using content-bearing names.

Three audit advisories all applied:
- Razor headroom: helper extraction
- Recommended-version fetch: public alias added
- CLI tests: network monkeypatched (no live HTTP)

Audit: round 1 PASS (no VETO; plan absorbed prior-round audit learnings).
Plan: plan-252-layer-3-diagnose-cli.md
Strategy: docs/research-brief-252-privacy-preserving-ledger-remediation.md

coderabbitai · 2026-05-07T19:41:25Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.


release: v0.14.1 — SBOM fix + #263 sync auto-bind + #257 diagnose CLI

@v2

…residual) Aligns preflight-eval.yml with the test-mcp-regression.yml pin landed in PR #273. Closes the last of #272's three CI-baseline regressions — preflight-eval was the only remaining consumer of the unpinned mutable @v2 tag whose published artifact silently broke between PR #257 and PR #258 (index.js missing from the action's bundled output). Same SHA (31493c76ec9e7aa675f1585d3ed6f1da69269a86, v2.4) used in test-mcp-regression.yml:213 so a future bump is one grep-and-replace. Per docs/policies/install-trust-model.md (OWASP A06 supply-chain discipline): no GitHub Action runs in our CI from a mutable tag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- pyproject.toml: 0.13.3 → 0.14.1 (dev was stuck at 0.13.3 throughout the v0.14.0 stream; bumping past v0.14.0 to align with what's actually on main)
- RECOMMENDED_VERSION: 0.13.3 → 0.14.1
- pyproject.toml scripts: drop `bicameral-mcp-classify` (broken since BicameralAI#244 deleted cli/classify.py; carryover cleanup from the v0.14.0 release surgery)
- release/sbom_emit.py: install wheel into temp venv before scanning. Fixes the v0.14.0 publish-pipeline halt where `cyclonedx-py environment <wheel>` failed because the subcommand introspects a Python environment via a Python-executable path, not a wheel file. New flow: tempdir venv → pip install wheel + cyclonedx-bom → run `cyclonedx-py environment --output-file <out> <venv-python>`. Output is the wheel's actual dependency closure with no contamination from the build env. `--output-file` flag replaces v0.14.0's `-o` short form (cyclonedx-py 7.x dropped the alias).
- CHANGELOG.md: new ## v0.14.1 release header summarizing SBOM fix + BicameralAI#257 diagnose CLI + BicameralAI#259/BicameralAI#260 dependabot bumps. Demoted prior "[Unreleased]" content to "[Unreleased — pre-v0.14.0]" to mark the cutoff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
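The SBOM flow described in this commit message (tempdir venv, install wheel + cyclonedx-bom, scan the venv's interpreter) might be assembled as below. This is only a sketch: `venv_bin` and `sbom_commands` are hypothetical helpers, the commands are built but not executed, and the assumption that the `cyclonedx-py` script lands in the venv's script directory is mine. The real `release/sbom_emit.py` may be structured differently.

```python
import sys
from pathlib import Path


def venv_bin(venv_dir: Path, name: str) -> Path:
    """Locate an executable inside a venv on POSIX or Windows."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / f"{name}.exe"
    return venv_dir / "bin" / name


def sbom_commands(wheel: Path, venv_dir: Path, out: Path) -> list[list[str]]:
    """Build the three commands of the flow, in order, without running them."""
    python = str(venv_bin(venv_dir, "python"))
    scanner = str(venv_bin(venv_dir, "cyclonedx-py"))  # assumed location
    return [
        # 1. Create an isolated environment.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # 2. Install the wheel under test plus the scanner.
        [python, "-m", "pip", "install", str(wheel), "cyclonedx-bom"],
        # 3. Scan the environment by interpreter path, not the wheel file.
        [scanner, "environment", "--output-file", str(out), python],
    ]
```

A caller would typically pass each command to `subprocess.run(cmd, check=True)` inside a `tempfile.TemporaryDirectory()` block so the venv is discarded after the scan.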

Knapp-Kevin added 3 commits May 7, 2026 15:34

Knapp-Kevin temporarily deployed to ci-test May 7, 2026 19:38 — with GitHub Actions Inactive

Knapp-Kevin had a problem deploying to recording-approval May 7, 2026 19:38 — with GitHub Actions Failure

Knapp-Kevin temporarily deployed to ci-test May 7, 2026 19:38 — with GitHub Actions Inactive

Knapp-Kevin temporarily deployed to production May 7, 2026 19:38 — with GitHub Actions Inactive

jinhongkuan merged commit 50e2481 into dev May 7, 2026
8 of 9 checks passed

jinhongkuan deleted the plan/252-layer-3-diagnose-cli branch May 7, 2026 21:36

This was referenced May 7, 2026

feat(cli): #252 Layer 4 — portable JSON-Lines ledger export/import #258

Merged

release(prep): v0.14.1 — SBOM emitter fix + version bumps #264

Merged

release: v0.14.1 — SBOM fix + #263 sync auto-bind + #257 diagnose CLI #267

Merged

jinhongkuan added a commit that referenced this pull request May 7, 2026

Merge pull request #267 from BicameralAI/release/v0.14.1

2a9b003

release: v0.14.1 — SBOM fix + #263 sync auto-bind + #257 diagnose CLI

Knapp-Kevin mentioned this pull request May 8, 2026

Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272

Closed

5 tasks

jinhongkuan merged 3 commits into dev from plan/252-layer-3-diagnose-cli
