feat(cli): #252 Layer 3 — bicameral-mcp diagnose CLI #257

Merged
jinhongkuan merged 3 commits into dev from plan/252-layer-3-diagnose-cli on May 7, 2026

@Knapp-Kevin

Collaborator

Summary

Closes #252 Layer 3 of the privacy-preserving ledger-remediation strategy. New `bicameral-mcp diagnose` CLI subcommand emits a markdown-styled report containing structural metadata only — versions, file metadata, table row counts, schema-revision sentinel state, recent warn|error event tail. Operators copy-paste the rendered output into GitHub bug reports without privacy review.
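
To make the shape of the report concrete, here is a minimal sketch of a renderer that emits six markdown sections from structural metadata only. `MiniDiagnosis`, `format_mini_diagnosis`, the field subset, and the section titles are all assumptions made for illustration; the real `Diagnosis` has 17 fields and its headers are not listed in this PR.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MiniDiagnosis:
    """Hypothetical, trimmed-down stand-in for the real 17-field Diagnosis."""
    bicameral_version: str
    ledger_size_bytes: int
    table_counts: Tuple[Tuple[str, int], ...]
    drift_status: str
    recent_events: Tuple[str, ...]
    suggestions: Tuple[str, ...]


def format_mini_diagnosis(d: MiniDiagnosis) -> str:
    """Render one markdown section per category; section titles are illustrative."""
    lines = ["## Versions", f"- bicameral-mcp: {d.bicameral_version}", ""]
    lines += ["## Ledger file", f"- size_bytes: {d.ledger_size_bytes}", ""]
    lines += ["## Table row counts"]
    lines += [f"- {name}: {count}" for name, count in d.table_counts]
    lines += ["", "## Schema sentinel", f"- drift_status: {d.drift_status}", ""]
    lines += ["## Recent warn/error events"]
    # Fall back to an explicit placeholder so an empty section is still visible.
    lines += [f"- {event}" for event in d.recent_events] or ["- (none)"]
    lines += ["", "## Suggestions"]
    lines += [f"- {s}" for s in d.suggestions] or ["- (none)"]
    return "\n".join(lines) + "\n"
```

Because the renderer only reads named fields of the dataclass, anything that is not a field cannot reach the output.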

Plan / Audit / Seal

What ships

| Surface | Change |
| --- | --- |
| `cli/diagnose.py` (new, 213 LOC) | `Diagnosis` frozen dataclass (17 fields) + `_ALLOWED_FIELDS` frozenset + `_CANONICAL_TABLES` + 6 section helpers + `format_diagnosis` + `main` |
| `cli/_diagnose_gather.py` (new, 244 LOC; split per Razor advisory) | `gather_diagnosis` async + 7 private readers (ledger metadata, bicameral_meta sentinel, schema_version, table counts, audit-log channel, JSONL warn/error tail, recent-events merge across preflight + audit-log paths) + 5-heuristic suggestion engine |
| `server.py` | new `diagnose` subparser registration + dispatch arm |
| `handlers/update.py` | public alias `fetch_recommended_version` for clean cross-layer call (audit advisory) |
| `docs/policies/diagnose-output.md` (new) | operator-readable allowlist policy + suggestion heuristic catalog + always-safe-to-paste guarantee |
| `README.md` | "Compliance posture" bumped 5 → 6 policy files |
| `tests/test_diagnose_*.py` (new, 4 files) | 32 functional tests (3 allowlist + 18 gather + 8 format + 3 CLI) |
| `tests/test_compliance_policy_docs.py` | 2 new content-contract tests (allowlist field doc-coverage + heuristic-catalog drift lock) |
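
The recent-events merge across the preflight and audit-log paths could look roughly like the sketch below. The record keys (`ts`, `level`, `event`), the helper names, and the sort-by-timestamp step are assumptions; the 5-event cap follows the cap mentioned in the commit notes.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional

MAX_RECENT_EVENTS = 5  # cap taken from the commit notes; everything else is illustrative


def _read_warn_error(path: Optional[Path]) -> List[Dict[str, str]]:
    """Return structural fields of warn/error records; tolerate a missing file."""
    if path is None or not path.is_file():
        return []
    events = []
    for raw in path.read_text(encoding="utf-8").splitlines():
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the whole report
        if isinstance(record, dict) and record.get("level") in {"warn", "error"}:
            # Copy only the structural keys; free-text payloads are never passed on.
            events.append({
                "ts": str(record.get("ts", "")),
                "level": record["level"],
                "event": str(record.get("event", "")),
            })
    return events


def recent_events(preflight: Optional[Path], audit_log: Optional[Path]) -> List[Dict[str, str]]:
    """Merge both sources, order by timestamp string, and keep the newest few."""
    merged = _read_warn_error(preflight) + _read_warn_error(audit_log)
    merged.sort(key=lambda e: e["ts"])
    return merged[-MAX_RECENT_EVENTS:]
```

Sorting on the raw timestamp string only works when both files use the same sortable format such as ISO 8601, which is an assumption here.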

Privacy posture

The output is always safe to paste. Three mechanisms enforce this together: the allowlist at the `Diagnosis` dataclass level, the `_ALLOWED_FIELDS` frozenset, and the content-contract drift lock. A negative content-leak test (`test_gather_diagnosis_emits_no_decision_content_when_decisions_present`) inserts a decision containing the marker `"TOP-SECRET-DECISION-CONTENT-MARKER"` and asserts that the marker does not appear in the `repr()` of the resulting `Diagnosis`.

Suggestion heuristics (5 hardcoded)

  1. drift detected — `drift_status == "drift"`
  2. recommended-version mismatch — `bicameral_version` ≠ fetched `RECOMMENDED_VERSION`
  3. audit log disabled — `audit_log_channel == "stderr"` (default)
  4. ledger > 100 MiB — `ledger_size_bytes > 100 * 1024 * 1024`
  5. schema version old — `schema_version_recorded < schema_version_expected`
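
The five conditions above can be sketched as a single pure function. `SuggestionInputs`, `compute_suggestions`, the `None` convention for a failed version fetch, and the message wording are assumptions; only the comparisons themselves come from the list.

```python
from dataclasses import dataclass
from typing import List, Optional

LEDGER_SIZE_LIMIT_BYTES = 100 * 1024 * 1024  # 100 MiB, as in heuristic 4


@dataclass(frozen=True)
class SuggestionInputs:
    """Hypothetical container for the fields the five heuristics read."""
    drift_status: str
    bicameral_version: str
    recommended_version: Optional[str]  # assumption: None when the fetch failed
    audit_log_channel: str
    ledger_size_bytes: int
    schema_version_recorded: int
    schema_version_expected: int


def compute_suggestions(d: SuggestionInputs) -> List[str]:
    """Apply the five checks in list order; message wording is illustrative."""
    suggestions = []
    if d.drift_status == "drift":
        suggestions.append("Schema drift detected.")
    if d.recommended_version is not None and d.bicameral_version != d.recommended_version:
        suggestions.append(
            f"Installed {d.bicameral_version}, recommended {d.recommended_version}."
        )
    if d.audit_log_channel == "stderr":
        suggestions.append("Audit log is on the default stderr channel.")
    if d.ledger_size_bytes > LEDGER_SIZE_LIMIT_BYTES:
        suggestions.append("Ledger is larger than 100 MiB.")
    if d.schema_version_recorded < d.schema_version_expected:
        suggestions.append("Recorded schema version is older than expected.")
    return suggestions
```

Keeping the function free of I/O means each heuristic can be unit-tested by constructing an input value, with no ledger or network involved.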

Test plan

  • 32 new functional tests pass (3 allowlist + 18 gather + 8 format + 3 CLI)
  • 2 new content-contract tests pass (doc/code drift locks for allowlist + heuristic catalog)
  • Compliance-policy regression: 10 tests pass clean
  • `ruff check` + `ruff format --check` clean on every touched + new file
  • End-to-end smoke test against `memory://` ledger emits all 6 required section headers
  • Privacy verification: decision content never appears in rendered output
  • Network-isolation: CLI tests monkeypatch `fetch_recommended_version` to avoid live HTTP
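
As an illustration of the doc/code drift locks mentioned in the test plan, a content-contract check might compare the allowlist against field names that appear in backticks in the policy document. The helper name `undocumented_fields` and the backtick convention are assumptions, not the repository's actual test.

```python
import re
from typing import FrozenSet, Set


def undocumented_fields(allowed: FrozenSet[str], policy_markdown: str) -> Set[str]:
    """Return allowlisted field names that the policy doc never mentions in backticks."""
    documented = set(re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", policy_markdown))
    return set(allowed) - documented


# Hypothetical allowlist subset and policy excerpt.
allowed = frozenset({"bicameral_version", "drift_status", "ledger_size_bytes"})
policy = "- `bicameral_version`: installed version\n- `drift_status`: sentinel state\n"
missing = undocumented_fields(allowed, policy)
```

A test would then assert that the returned set is empty, so adding a field without documenting it fails CI.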

All 3 audit advisories applied

| Advisory | Resolution |
| --- | --- |
| Razor headroom on `cli/diagnose.py` (~280 plan estimate) | Helper extraction: `cli/diagnose.py` (213) + `cli/_diagnose_gather.py` (244). Both under 250. Mirror of `cli/_link_commit_runner.py` pattern. |
| Recommended-version fetch needs cross-layer call | Public alias `fetch_recommended_version` added to `handlers/update.py` (one-line addition). Avoids private-symbol re-use. |
| CLI tests would otherwise gate on `raw.githubusercontent.com` availability | Network monkeypatched in `tests/test_diagnose_cli.py` autouse fixture. |
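
The network-isolation advisory can be illustrated with a small sketch. It uses `unittest.mock.patch.object` from the standard library rather than pytest's `monkeypatch`, and the `update` namespace, `version_line`, and `run_isolated` are hypothetical stand-ins for the real module and CLI path.

```python
import types
from unittest import mock

# Stand-in for handlers/update.py; the real module is not imported here.
update = types.SimpleNamespace()


def _live_fetch() -> str:
    """Placeholder for the real HTTP fetch; tests must never reach it."""
    raise RuntimeError("live HTTP is not allowed in tests")


update.fetch_recommended_version = _live_fetch


def version_line(installed: str) -> str:
    """Look the function up at call time so a patch on the namespace takes effect."""
    recommended = update.fetch_recommended_version()
    status = "ok" if installed == recommended else "mismatch"
    return f"installed={installed} recommended={recommended} ({status})"


def run_isolated(installed: str, fake_recommended: str) -> str:
    """Swap the fetch for a canned value for the duration of one call."""
    with mock.patch.object(
        update, "fetch_recommended_version", return_value=fake_recommended
    ):
        return version_line(installed)
```

The same idea carries over to a pytest autouse fixture: the patch is applied before every test in the module and removed afterwards, so no test depends on an external host being reachable.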

Razor compliance

All new code under limits:

  • `cli/diagnose.py` 213 LOC; longest function `format_diagnosis` ~17 LOC (orchestrator)
  • `cli/_diagnose_gather.py` 244 LOC; `gather_diagnosis` ~35 LOC
  • All readers <25 LOC; `_compute_suggestions` ~38 LOC (longest in the module, just under the 40-LOC ceiling)
  • `server.py` modifications: 5 LOC additive
  • `handlers/update.py` modification: 3-LOC public alias
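
For context on the small `server.py` change, a subparser registration plus dispatch arm can be sketched with `argparse` as follows. `build_parser`, `run_diagnose`, `dispatch`, and the fallback return code are assumptions for illustration and do not reproduce the real `server.py`.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Register a `diagnose` subcommand next to whatever else the CLI exposes."""
    parser = argparse.ArgumentParser(prog="bicameral-mcp")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("diagnose", help="print a paste-safe diagnostic report")
    return parser


def run_diagnose() -> int:
    """Placeholder for the real entrypoint, which would gather and render the report."""
    print("(diagnosis report would be rendered here)")
    return 0


def dispatch(argv: Optional[Sequence[str]] = None) -> int:
    """Route the parsed subcommand to its handler and return an exit code."""
    args = build_parser().parse_args(argv)
    if args.command == "diagnose":
        return run_diagnose()
    return 1  # placeholder: the real CLI has other arms and a default behavior
```

Calling `dispatch(["diagnose"])` runs the placeholder handler; the additive shape (one `add_parser` call plus one `if` arm) is consistent with the few-line change reported above.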

Closes / unlocks

Inheritance from prior work

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Privacy-preserving operator bug-report tool per
docs/research-brief-252-privacy-preserving-ledger-remediation.md.

Three phases:
- Phase 1: Diagnosis dataclass + _ALLOWED_FIELDS frozenset + async
  gather_diagnosis() with private readers (ledger metadata, bicameral_meta
  sentinel, table counts, audit-log tail merge across both JSONL +
  optional configured path, suggestion engine). Allowlist-by-field-name
  is the load-bearing privacy mechanism.
- Phase 2: format_diagnosis() markdown renderer + CLI subparser
  registration in server._register_subparsers + dispatch arm. Output
  is operator-pasteable markdown with 6 section headers.
- Phase 3: operator policy doc enumerating allowlisted fields +
  suggestion heuristic catalog; content-contract tests lock doc/code
  drift; README compliance-posture row added.

~280 LOC + 25 functional tests. No new pip deps. Implementation can
land before #256 merges (bicameral_meta reads are tolerant of missing
table); integration test for first-write status xfail-marked until
#256 reaches dev.

Five hardcoded suggestion heuristics: drift detected; recommended-
version mismatch; audit-log disabled; ledger > 100 MiB; schema version
old. Plugin extension is YAGNI for v1.

Audit-log tail covers BOTH ~/.bicameral/preflight_events.jsonl AND
BICAMERAL_AUDIT_LOG=<path> when configured (full-utility hybrid per
operator directive — refused the simplification tradeoff).

Plan: plan-252-layer-3-diagnose-cli.md
Strategy: docs/research-brief-252-privacy-preserving-ledger-remediation.md
Depends on: #252 Layer 2 (#256) for bicameral_meta table availability
…eport (#252 Layer 3)

Closes #252 Layer 3 per
docs/research-brief-252-privacy-preserving-ledger-remediation.md.

cli/diagnose.py (215 LOC): Diagnosis dataclass + _ALLOWED_FIELDS
frozenset (17 fields) + _CANONICAL_TABLES + format_diagnosis()
markdown renderer + main() entrypoint. Allowlist-by-field-name is
the load-bearing privacy mechanism.

cli/_diagnose_gather.py (244 LOC, helper-extracted per Razor advisory):
gather_diagnosis() async function + 5 private readers (ledger metadata,
bicameral_meta sentinel, schema_version, table_counts, audit-log tail
merge across both preflight_events.jsonl AND BICAMERAL_AUDIT_LOG=<path>
when configured) + 5-heuristic suggestion engine.

server.py: new "diagnose" subparser registration + dispatch arm.

handlers/update.py: public alias `fetch_recommended_version` for
clean cross-layer call from cli/diagnose without re-using private
underscore symbol.

docs/policies/diagnose-output.md: operator-readable allowlist policy
+ suggestion heuristic catalog + always-safe-to-paste guarantee.

README.md: Compliance posture section bumped 5 → 6 policy files.

Tests: 31 new functional tests across 4 files (allowlist parity +
gather scenarios + format rendering + CLI subprocess) plus 2 new
content-contract tests in test_compliance_policy_docs.py (allowlist
field doc-coverage + suggestion-catalog drift lock). All 41 PASS.

option<datetime> support inherited from #252 Layer 2 (not present on
this branch base; bicameral_meta read returns "first-write" status
on pre-Layer-2 ledgers — semantically equivalent for diagnostic
purposes).

Plan: plan-252-layer-3-diagnose-cli.md
Audit: round 1 PASS (no VETO — plan absorbed prior-round audit
learnings: separate-table architecture from Layer 2; allowlist +
content-contract test pattern from #227; helper-extraction
acknowledged in implementer notes).
…I substantiated

Closes #252 Layer 3 per
docs/research-brief-252-privacy-preserving-ledger-remediation.md.

Reality matches Promise: 9 planned files (4 new tests + 1 new policy
doc + 4 modified sources/docs) shipped across 3 phases. 41 new
functional tests pass (16 more than the plan's 25 — doctrine-positive
expansion covering per-reader unit boundary + edge cases like
PackageNotFoundError unknown branch, recent-events 5-cap enforcement,
table-counts canonical subset).

Logged deviations:
1. Helper-extraction split per Razor advisory (cli/diagnose.py 213 +
   cli/_diagnose_gather.py 244, both under 250). Mirrors
   cli/_link_commit_runner.py pattern.
2. drift_status semantics adjusted to "match" on fresh ledger (Layer 2's
   sentinel writes row at adapter.connect time; gather sees populated
   row by the time it runs). Test renamed to reflect Layer-2-integrated
   reality.
3. Test count expansion (25 -> 41).

Privacy posture: allowlist-by-field-name at the Diagnosis dataclass
level. Negative content-leak test verifies decision content never
appears in repr(Diagnosis). Forbidden-field-name negative lock catches
future field expansion using content-bearing names.

Three audit advisories all applied:
- Razor headroom: helper extraction
- Recommended-version fetch: public alias added
- CLI tests: network monkeypatched (no live HTTP)

Audit: round 1 PASS (no VETO — plan absorbed prior-round audit
learnings).

Plan: plan-252-layer-3-diagnose-cli.md
Strategy: docs/research-brief-252-privacy-preserving-ledger-remediation.md
@coderabbitai

coderabbitai Bot commented May 7, 2026


Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ef45a56-a224-4cdc-98e2-582a8fd2439a


@jinhongkuan jinhongkuan merged commit 50e2481 into dev May 7, 2026
8 of 9 checks passed
@jinhongkuan jinhongkuan deleted the plan/252-layer-3-diagnose-cli branch May 7, 2026 21:36
jinhongkuan added a commit that referenced this pull request May 7, 2026
release: v0.14.1 — SBOM fix + #263 sync auto-bind + #257 diagnose CLI
Knapp-Kevin added a commit that referenced this pull request May 14, 2026
…residual)

Aligns preflight-eval.yml with the test-mcp-regression.yml pin landed in
PR #273. Closes the last of #272's three CI-baseline regressions —
preflight-eval was the only remaining consumer of the unpinned mutable
@v2 tag whose published artifact silently broke between PR #257 and
PR #258 (index.js missing from the action's bundled output).

Same SHA (31493c76ec9e7aa675f1585d3ed6f1da69269a86, v2.4) used in
test-mcp-regression.yml:213 so a future bump is one grep-and-replace.

Per docs/policies/install-trust-model.md (OWASP A06 supply-chain
discipline): no GitHub Action runs in our CI from a mutable tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin pushed a commit to Knapp-Kevin/bicameral-mcp that referenced this pull request May 21, 2026
- pyproject.toml: 0.13.3 → 0.14.1 (dev was stuck at 0.13.3 throughout
  the v0.14.0 stream; bumping past v0.14.0 to align with what's actually
  on main)
- RECOMMENDED_VERSION: 0.13.3 → 0.14.1
- pyproject.toml scripts: drop `bicameral-mcp-classify` (broken since
  BicameralAI#244 deleted cli/classify.py — carryover cleanup from the v0.14.0
  release surgery)
- release/sbom_emit.py: install wheel into temp venv before scanning.
  Fixes the v0.14.0 publish-pipeline halt where
  `cyclonedx-py environment <wheel>` failed because the subcommand
  introspects a Python environment via a Python-executable path, not
  a wheel file. New flow: tempdir venv → pip install wheel + cyclonedx-bom
  → run `cyclonedx-py environment --output-file <out> <venv-python>`.
  Output is the wheel's actual dependency closure with no contamination
  from the build env. `--output-file` flag replaces v0.14.0's `-o` short
  form (cyclonedx-py 7.x dropped the alias).
- CHANGELOG.md: new ## v0.14.1 release header summarizing SBOM fix +
  BicameralAI#257 diagnose CLI + BicameralAI#259/BicameralAI#260 dependabot bumps. Demoted prior
  "[Unreleased]" content to "[Unreleased — pre-v0.14.0]" to mark the
  cutoff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 participants