triage: dev → main (README restructure + #272 Fix 3 + SECURITY.md) by jinhongkuan · Pull Request #282 · BicameralAI/bicameral-mcp

jinhongkuan · 2026-05-08T04:37:52Z

Summary

Triages 10 commits from dev onto main — every commit from e661bd8 onwards (the README restructure base) up through the latest Flow 5 prompt disambiguation. Cherry-picked rather than merge-tracked because main has independent hotfix history (#267-#269 SBOM hotfixes) that's already in dev via prior triage; this avoids re-merging the merge commits.

Commits in order

dc380a7 docs(README): restructure for getting-started clarity + add hero image
5d1dca7 chore(repo): untrack plan-*.md and add to .gitignore
279e9ee docs(security): add SECURITY.md — enables GitHub Security tab
b1e9879 docs(README): trim to test-out path; add star CTA + logo
d786914 fix(ci): SHA-pin test-summary action + rename eval_decision_relevance attr (Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272)
6cab533 fix(skill): ingest advises on ratify instead of auto-prompting (Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272 Fix 3)
46d8fad style: ruff format assert_flow_1 ratify_note line
d1364a1 fix(e2e): re-anchor flow-1 prompt with 'decision ledger' to trigger ingest
3975447 style(e2e): refresh stale Flow 5 no-ratify message (Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272 Fix 3)
c2cf52a fix(e2e): disambiguate flow-5 prompt — 'decisions we've tracked in our ledger'

Highlights

README: hero image, restructured Quickstart, star CTA, logo
SECURITY.md: enables the GitHub Security tab
Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272 Fix 3: ingest skill now advises on ratification rather than auto-firing an AskUserQuestion gate (saves Flow 1's tool-call budget; ratify moves to Flow 5)
CI hardening: SHA-pinned phoenix-actions/test-reporting-summary
e2e: Flow 1 + Flow 5 prompt anchors tightened so the agent reaches bicameral.ingest / bicameral.history before exhausting budget on auto-memory + working-tree reads

What this does NOT include

Older dev-only work that was never triaged (sim_* scripts, release/sbom_emit.py refactors) — out of scope; left on dev

Test plan

CI: MCP Regression Suite green on triage-from-dev
CI: ruff + mypy green
CI: e2e Flow 1 + Flow 5 green (the new prompt anchors are the load-bearing change here)
CI: TruffleHog clean

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Supply-chain protections: signed artifacts and SBOM verification surfaced in docs
Documentation
- Reworked README with clearer privacy/compliance posture and telemetry opt-out
- Added SECURITY.md detailing telemetry, vulnerability reporting, and release verification
Changes
- Ingest/ratify flow is advisory by default (ratify optional; direct sign-off still supported)
- E2E assertions updated to tolerate advisory-only ratification
- CI workflow stability improved by pinning a test-summary action
Chores
- Ignore local plan files via .gitignore entries

Visible changes when a user lands on the README: 1. Hero image (`assets/bicameral-hero.png`) at the very top — visual without/with comparison from the landing-page asset bundle. 2. Quickstart immediately after the one-line value prop (was buried below "The Problem" and "How It Feels"). User goes from "what is this" to "type these three lines" without scrolling. 3. Compliance section trimmed from a 12-line policy-file enumeration near the top to a 5-line "we take privacy seriously" paragraph at the bottom. The full posture stays linkable via docs/policies/. 4. pipx dropped from the install path. The two paths are now uv (recommended) and plain pip (fallback). uv was already preferred per #199's 3-path resolve order; pipx was middle ground that doesn't pull its weight in a top-level README. Section order: Hero → title → 1-liner → Quickstart → How It Feels → The Problem → Core Concepts → What setup installs → Slash Commands → MCP Tools Reference → Configuration → Local Development → Telemetry → Contributing → Privacy & Compliance → License 312 lines (was 376). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

27 in-progress planning artifacts (`plan-114-*.md`, `plan-codegenome-*.md`, `plan-A-*.md`, etc.) were tracked in the repo. They're working memory between author and reviewers during a feature; once the feature merges, the PR description + CHANGELOG carry the durable record. Keeping these in the public-ish repo: - adds 27 markdown files to the wheel's source-distribution surface for no end-user benefit - couples planning vocabulary to release artifacts (e.g. plan-codegenome files describe v1 work that #246 reverted; the plan stays useful as reference but doesn't belong on `main`) - creates churn pressure to mark plans as "done" or "superseded" instead of just letting them rest as the author's working notes This commit: 1. Adds `plan-*.md` pattern to `.gitignore` with a one-paragraph comment on the policy. 2. `git rm --cached` on all 27 currently-tracked `plan-*.md` files — they remain on local disk for the author's reference, just no longer tracked. After merge, anyone with a checkout will keep their local `plan-*.md` files; new plans drafted in-tree will be untracked by default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a top-level SECURITY.md so GitHub auto-creates a "Security" tab in the repo nav bar — the closest GitHub-native surface to a "button that pulls up our SBOM/privacy statement." Contents: - **Privacy posture** — explicit "all data on your laptop" + telemetry opt-out + pointer to docs/policies/ for the full posture - **Software supply chain** — table of signed artifacts on each release (CycloneDX SBOM, Rekor attestation, hooks-manifest sigs, skills-manifest sigs, release-tag-commit sig); pointer to the release-evidence verification procedure; mention of GitHub's auto-SPDX SBOM under Insights → Dependency graph - **Supported versions** — only latest minor, ~30-day backport window for critical fixes - **Reporting a vulnerability** — GitHub Security Advisories preferred; jin@bicameral-ai.com fallback with `[security]` subject prefix; 3-day ack target, 30-day patch target for critical - **Scope** — what's in (server, skills, hooks, release pipeline) and what's out (third-party deps, host vulns, local-attack scenarios already covered by host-trust model) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Sections removed (PM/dev evaluating Bicameral wants the test-out path, not the why/how-internals): - "The Problem" (long context narrative) - "Core Concepts" (two-axis model, link_commit, collab modes) - "Removing Bicameral" (post-test concern) - "Configuration" env-var table - "Local Development" - "Contributing" - "Telemetry" subsection (folded into Privacy & Compliance one-liner) What stays — the path from "land on the page" to "type three commands and see something work": - Hero image - Star-on-GitHub CTA (animated SVG, adapted from cocoindex's design with metadata strings updated; visual mechanism is generic) - Logo (small, inline-right of title) - Title + badges + 1-liner - Quickstart (uv | pip | Windows) - How It Feels (preflight render + dashboard screenshot) - Slash Commands - What `setup` writes (trust signal — what hits disk) - MCP Tools Reference (collapsed by default) - Privacy & Compliance (concise) - License 312 → 152 lines. Visuals from landing-page borrowed: assets/logo.png (was landing-page/logo.png), assets/bicameral-hero.png from landing-page/output/imagegen/. Star CTA SVG saved as assets/star-on-github.svg. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… attr (#272) Closes 2 of 3 dev-baseline regressions tracked in #272 (the third — Flow 1 e2e expectation post-#263 auto-bind — is deferred for product-level discussion since it touches the ingest skill choreography). ## test-summary/action SHA-pin The `test-summary/action@v2` mutable tag was repointed on 2026-05-07 22:09 UTC from a `dist`-targeted release (with bundled `index.js`) to the new v2.5 release that targets `main` (no bundled output). Every PR opened after that point fails the Test Summary step with `File not found: index.js`, marking MCP Regression Suite jobs red on both ubuntu and windows. Pin to v2.4 commit `31493c76ec9e7aa675f1585d3ed6f1da69269a86` (the last `dist`-targeted release with the bundled artifact). Aligns with OWASP-03 / `docs/policies/install-trust-model.md` discipline (do not trust mutable action tags for security-load-bearing CI). ## tests/eval_decision_relevance.py:166 attribute rename `IngestResponse` schema field is `pending_grounding_decisions` per `contracts.py:571` and `handlers/ingest.py:695`. The eval script was still reading the legacy internal name `ungrounded_decisions` directly off the response, raising `AttributeError` and failing the M1 adversarial corpus eval step (technically `continue-on-error: true`, but worth fixing independently since the eval result was lost on every run). Keeps the JSON-output key `ungrounded_decisions` unchanged so downstream consumers (M1 trend reports, etc.) see no change. Both fixes verified locally: yaml.safe_load on the workflow + IngestResponse field introspection confirms the schema. Refs #272.

…Fix 3) Closes the third regression deferred from PR #273. Post-#263 (sync auto-bind step 1.5), Flow 1's e2e exhausted budget on ~41 non-bicameral Read/Grep/validate_symbols calls before reaching the ingest skill's auto-fired ratify AskUserQuestion gate, so the agent never invoked ratify. Per #108 Flow 1 spec discussion: this is both a regression AND a canonical flow change. Auto-bind stays (deterministic, useful), but ratify drops out of the auto-prompt path and becomes advisory text. Ratification belongs to Flow 5 (PM Friday review) and to direct user requests like "sign these off" / "ratify all". Skill changes (skills/bicameral-ingest/SKILL.md Step 7): - Replace AskUserQuestion ratify gate with a one-block advisory: "○ N decisions captured as proposals — drift tracking activates after ratification. Run `bicameral.ratify` when ready, or revisit them in your next history review (Flow 5)." - Add explicit "Direct user request shortcut": if the prompt asked for sign-off in the same turn, ratify directly without the round-trip. - Update the example at the bottom to match. Test changes: - tests/e2e/prompts/flow-1-ingest.md: drop "sign them off on our end" so the prompt exercises the new advisory path. Direct sign-off requests are covered by Flow 5 (history + ratify). - tests/e2e/run_e2e_flows.py:assert_flow_1: drop the ratify requirement; accept [ingest, bind?] as the canonical Flow 1 signature. Update Flow 5's stale comment about Flow 1 pre-ratifying. - tests/test_e2e_asserters.py: invert test_flow1_fails_without_ratify → test_flow1_passes_without_ratify (advisory-only is now the expected behavior). Doesn't touch sync skill #263 — that auto-bind change is preserved intentionally; this PR fixes the test/spec contract that conflicted with it. Refs #272 #108 #263. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ngest The previous fix dropped "sign them off on our end" to defeat the ratify auto-call; in doing so, it also removed the bicameral disambiguator. The remaining "please log these on our end" landed inside the auto-memory trigger surface (the runner's ~/.claude/projects/.../memory/ directory), so the agent wrote four memory files instead of invoking bicameral.ingest — Flow 1 went 0/0 on bicameral calls and the cascading flows (3, 5) then had nothing to assert against. Replace "log these on our end" with "log these to our decision ledger" — same intent, but "decision ledger" is an unambiguous bicameral signal the auto-memory skill does not match. Still no "sign them off" / "ratify" phrasing, so the new advisory-only Step 7 contract is preserved. Refs #272 #108. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The no-ratify success branch in assert_flow_5 still claimed "Flow 1 ratified its 3 seeds" — but per #272 Fix 3, Flow 1 now leaves seeds as `proposed` (advisory-only ingest). The docstring was updated in 2252c82; this catches the trailing f-string that was missed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…r ledger' Same auto-memory / working-tree ambiguity that bit Flow 1 (#272 Fix 3 follow-up): Flow 5's prompt 'all the things we're tracking' was readable as either bicameral decisions OR files modified in the cwd's working tree. With Flow 1 no longer auto-ratifying (advisory-only Step 7), the agent's mental model on the new ledger state shifted: it pulled the auto-memory index and the working-tree diff first, then never reached bicameral.history. Tighten the prompt: - 'all the things we're tracking' → 'the decisions we've tracked in our ledger' (unambiguous bicameral signal — auto-memory and the working tree are not 'our ledger'). - 'still on the to-do pile that hasn't moved' → 'still on the proposed pile that hasn't been ratified' (uses the ledger's `signoff.state` vocabulary; aligns with the new advisory-only ingest contract where Flow 1 leaves seeds as `proposed` for Flow 5 to walk). The asserter (history mandatory; ratify conditional on proposals) is unchanged. Now the asserter's load-bearing 'if proposals exist' branch is the live path: Flow 1 leaves 3 proposals → Flow 5 ratifies → drift tracking activates, exactly as the #108 spec describes. Refs #272 #108. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-08T04:38:06Z

Warning

Rate limit exceeded

@jinhongkuan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 46 minutes and 43 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bdf5e54d-e6d6-4c22-8a16-071a7224445f

📥 Commits

Reviewing files that changed from the base of the PR and between c20954e and cdc1f19.

📒 Files selected for processing (2)

tests/e2e/prompts/flow-3-commit-sync.md
tests/e2e/run_e2e_flows.py

📝 Walkthrough

Walkthrough

This PR consolidates project planning artifacts, pivots the SessionEnd capture flow to queue pending transcripts, changes ingest ratification from mandatory to advisory-only, reorganizes README and security documentation, and updates e2e test assertions to match the new ingest behavior. The core narrative is SessionEnd hook refactoring coupled with ingest flow simplification and significant documentation hygiene work.

Changes

SessionEnd Queue Pivot + Ingest Flow Refactor + Plan Consolidation

Layer / File(s)	Summary
Plan File Cleanup `plan-124-post-commit-hook-fix.md`, `plan-156b-preflight-queue-drain.md`, `plan-187-gap-brief-unification.md`	Removes obsolete/completed plan files (124, 156b) and archives 187 for future phases.
Transcript Queue Data Model `events/transcript_queue.py`	New module implementing pending/processed transcript queue with three operations: write-pending (fail-soft), list-pending-FIFO (by mtime), and archive-processed (idempotent overwrite).
SessionEnd Queue Writer Hook `scripts/hooks/session_end_queue_writer.py`	New fail-soft SessionEnd hook script that parses JSON stdin, validates inputs (session_id, transcript_path, cwd), and writes to pending queue while always exiting zero on error.
Transcript Archive CLI Helper `scripts/hooks/transcript_archive.py`	New CLI to safely archive pending transcripts by basename, validate format to prevent traversal, and return distinct exit codes for error vs missing file.
SessionEnd Command Wiring `setup_wizard.py`, `.claude/settings.json`	Updated to invoke `session_end_queue_writer.py` instead of legacy `claude -p ... --auto-ingest` command; preserves `BICAMERAL_SESSION_END_RUNNING` guard.
Capture Corrections Skill Queue Draining `skills/bicameral-capture-corrections/SKILL.md`	Added Step 0 to drain pending-transcripts FIFO via `transcript_archive.py`; updated narrative to reflect queue-in-SessionEnd, drain-in-next-session flow.
SessionEnd Queue Tests & Drift `tests/test_session_end_queue_writer.py`, test/drift assertions	Expanded coverage: queue writing success, fail-soft no-ops, FIFO ordering by mtime, archive idempotency; updated drift tests to validate new queue-writer command and remove `claude -p`/`--auto-ingest` assertions.
Ingest Skill Ratification Flow `skills/bicameral-ingest/SKILL.md`	Changed from mandatory `AskUserQuestion` ratification step to advisory-only step (v0.14.2+); preserves direct-request shortcut to `bicameral.ratify` when user explicitly requests sign-off.
E2E Prompt Wording `tests/e2e/prompts/flow-1-ingest.md`, `tests/e2e/prompts/flow-5-history.md`	Updated Flow 1 and Flow 5 prompts to use "decision ledger" and "proposed pile" terminology, aligning with advisory-only ratification model.
E2E Flow Assertions `tests/e2e/run_e2e_flows.py`, `tests/eval_decision_relevance.py`, `tests/test_e2e_asserters.py`	Updated Flow 1 asserter to tolerate missing `bicameral_ratify` (advisory-only), Flow 3 commit detection improved via regex, Flow 5 assertions clarified; eval harness updated to use `result.pending_grounding_decisions`.
README Restructure `README.md`	Streamlined hero section, removed Problem/How-It-Feels blocks, reorganized MCP Tools Reference into "Decision Ledger" and "Code Locator" categories, added "What `setup` writes to your repo" section, consolidated "Privacy & Compliance" section.
Security & Privacy Documentation `SECURITY.md`	New comprehensive documentation of local-only operation, telemetry scope (tool name, version, duration, error flag, counts) with `BICAMERAL_TELEMETRY=0` opt-out, supply-chain protections (cosign/SBOM/Rekor), version maintenance policy, vulnerability reporting channels.
GitHub Actions Workflow `.github/workflows/test-mcp-regression.yml`	SHA-pinned `test-summary/action` from `@v2` to v2.4 commit with explanatory comments about repointing/incident context.
Gitignore Plan Exclusion `.gitignore`	Added `plan-*.md` patterns to exclude internal planning artifacts from version control.
Plan Document References `plan-*.md` (114, 156, 190, 197, 199–200, 205, 216, 218, 227, 252, 48, A–F, codegenome phases)	Documentation-only specifications for future planned features, architectures, and multi-phase implementations (no executable code changes in this diff layer).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

test(e2e): Flow 3 agent flake — 'agent did NOT commit' despite explicit prompt instruction #197: Flow 3 prompt-clarity plan and e2e harness updates addressing the "agent did NOT commit" failure.
CodeGenome Phase 4: Semantic Drift Evaluation in resolve_compliance (M3) #61: CodeGenome Phase 4 semantic-drift and resolve_compliance plans referenced by plan documents in this PR.
BicameralAI/bicameral-daemon#34: Compliance research brief and related documentation updates align with issue #205.

Possibly related PRs

BicameralAI/bicameral-mcp#29: Related prior workflow edits that also modified .github/workflows/test-mcp-regression.yml.
BicameralAI/bicameral-mcp#55: Prior SessionEnd hook wiring and auto-ingest behavior related to the SessionEnd changes here.
BicameralAI/bicameral-mcp#165: Overlapping SessionEnd/auto-capture flow changes.

Suggested labels

enhancement

Suggested reviewers

Knapp-Kevin

Poem

🐰 A rabbit's note on queues and plans:
Hooks now tuck transcripts in a patient queue,
Ratify is gentle — advisory, not a cue,
Docs trimmed and pinned, tests tuned to the new,
A tidy hop forward — congrats from this crew! 🥕

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch triage-from-dev

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Line 111: Update the summary text that currently reads "<summary><strong>13
tools across two categories</strong></summary>" to reflect the actual count of
tools; change it to "<summary><strong>15 tools across two
categories</strong></summary>" (or optionally "<summary><strong>15 tools across
two categories (13 Decision Ledger + 2 Code Locator)</strong></summary>") so the
summary matches the listed totals.

In `@skills/bicameral-ingest/SKILL.md`:
- Around line 823-827: The fenced block starting with triple backticks around
the decision-note example is missing a language tag (causing MD040); update the
opening fence in that block (the ``` that begins the example containing "○ N
decisions captured as proposals ...") to include a language identifier such as
text (i.e., change ``` to ```text) so the markdown linter recognizes the code
fence.

In `@tests/eval_decision_relevance.py`:
- Line 166: The JSON key "ungrounded_decisions" is semantically stale and should
match the response attribute name; update the dict construction where the output
is built to replace the "ungrounded_decisions" key with
"pending_grounding_decisions" so it serializes
result.pending_grounding_decisions (i.e., use the attribute
result.pending_grounding_decisions as the value under the key
"pending_grounding_decisions") to keep naming consistent for downstream
consumers.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eee1f34f-01fa-4bc0-9860-2d00577dc0cf

📥 Commits

Reviewing files that changed from the base of the PR and between 18e1414 and c2cf52a.

⛔ Files ignored due to path filters (3)

assets/bicameral-hero.png is excluded by !**/*.png
assets/logo.png is excluded by !**/*.png
assets/star-on-github.svg is excluded by !**/*.svg

📒 Files selected for processing (37)

.github/workflows/test-mcp-regression.yml
.gitignore
README.md
SECURITY.md
plan-114-grounding-lint.md
plan-124-post-commit-hook-fix.md
plan-156-sessionend-queue-pivot.md
plan-156b-preflight-queue-drain.md
plan-187-gap-brief-unification.md
plan-190-compliance-event.md
plan-197-flow3-prompt-clarity.md
plan-199-windows-install-and-uv.md
plan-200-config-yaml-redaction.md
plan-200-skills-audit-hardening.md
plan-205-compliance-research-brief.md
plan-216-ingest-size-and-rate-limit.md
plan-218-cosign-hooks-manifest-and-sbom.md
plan-227-structured-audit-log-emission.md
plan-252-layer-2-schema-revision-sentinel.md
plan-252-layer-3-diagnose-cli.md
plan-48-pre-push-drift-hook.md
plan-A-ingest-followups-230-232-233.md
plan-B-preflight-attribution-regex-209.md
plan-C-soc2-03-signed-tags-and-release-evidence.md
plan-D-compliance-stance-docs-220-225-226.md
plan-E-owasp-03-and-05-install-update-trust-model.md
plan-F-llm-06-skills-manifest-signing.md
plan-codegenome-llm-drift-judge.md
plan-codegenome-phase-1-2.md
plan-codegenome-phase-3.md
plan-codegenome-phase-4.md
skills/bicameral-ingest/SKILL.md
tests/e2e/prompts/flow-1-ingest.md
tests/e2e/prompts/flow-5-history.md
tests/e2e/run_e2e_flows.py
tests/eval_decision_relevance.py
tests/test_e2e_asserters.py

💤 Files with no reviewable changes (27)

plan-227-structured-audit-log-emission.md
plan-197-flow3-prompt-clarity.md
plan-B-preflight-attribution-regex-209.md
plan-codegenome-phase-3.md
plan-114-grounding-lint.md
plan-156b-preflight-queue-drain.md
plan-E-owasp-03-and-05-install-update-trust-model.md
plan-codegenome-phase-4.md
plan-187-gap-brief-unification.md
plan-200-config-yaml-redaction.md
plan-codegenome-llm-drift-judge.md
plan-218-cosign-hooks-manifest-and-sbom.md
plan-124-post-commit-hook-fix.md
plan-C-soc2-03-signed-tags-and-release-evidence.md
plan-216-ingest-size-and-rate-limit.md
plan-205-compliance-research-brief.md
plan-48-pre-push-drift-hook.md
plan-A-ingest-followups-230-232-233.md
plan-252-layer-2-schema-revision-sentinel.md
plan-199-windows-install-and-uv.md
plan-200-skills-audit-hardening.md
plan-codegenome-phase-1-2.md
plan-F-llm-06-skills-manifest-signing.md
plan-D-compliance-stance-docs-220-225-226.md
plan-156-sessionend-queue-pivot.md
plan-252-layer-3-diagnose-cli.md
plan-190-compliance-event.md

coderabbitai · 2026-05-08T04:42:40Z


 <details>
-<summary><strong>13 tools across three categories</strong></summary>
+<summary><strong>13 tools across two categories</strong></summary>


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tool count in summary appears inconsistent with the tables

Line 111 says “13 tools across two categories,” but the section currently lists 15 total (13 Decision Ledger + 2 Code Locator). Please update the summary line to avoid reader confusion.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` at line 111, Update the summary text that currently reads "<summary>13 tools across two categories</summary>" to reflect the actual count of tools; change it to "<summary>15 tools across two categories</summary>" (or optionally "<summary>15 tools across two categories (13 Decision Ledger + 2 Code Locator)</summary>") so the summary matches the listed totals.

coderabbitai · 2026-05-08T04:42:40Z

 ```
-AskUserQuestion({
-  question: "Captured N decisions as proposals — ratify now to start drift tracking:",
-  header: "Ratify",
-  multiSelect: false,
-  options: [
-    { label: "Ratify all N (recommended)",
-      description: "Drift tracking starts immediately on all N decisions." },
-    { label: "Pick which to ratify",
-      description: "Use the Other field to specify decision numbers (e.g. '1 3 5')." },
-    { label: "Skip — review later",
-      description: "All stay as proposals. They'll surface as stale after inactivity." }
-  ]
-})
+○ N decisions captured as proposals — drift tracking activates after ratification.
+  Run `bicameral.ratify` when ready, or revisit them in your next history review
+  (Flow 5: "show me what's tracked" → ratify what's ready).
 ```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced example block

The code fence opened at Line 823 is missing a language identifier, which triggers MD040.

Suggested patch

-``` +```text ○ N decisions captured as proposals — drift tracking activates after ratification. Run `bicameral.ratify` when ready, or revisit them in your next history review (Flow 5: "show me what's tracked" → ratify what's ready).

</details>  <details> <summary>📝 Committable suggestion</summary> > ‼️ **IMPORTANT** > Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements. ```suggestion

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 823-823: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/bicameral-ingest/SKILL.md` around lines 823 - 827, The fenced block starting with triple backticks around the decision-note example is missing a language tag (causing MD040); update the opening fence in that block (the ``` that begins the example containing "○ N decisions captured as proposals ...") to include a language identifier such as text (i.e., change ``` to ```text) so the markdown linter recognizes the code fence.

coderabbitai · 2026-05-08T04:42:40Z

        "grounded_pct": round(stats.grounded_pct, 4),
        "grounding_deferred": stats.grounding_deferred,
-        "ungrounded_decisions": list(result.ungrounded_decisions),
+        "ungrounded_decisions": list(result.pending_grounding_decisions),


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the return type of handle_ingest exposes pending_grounding_decisions # and check whether ungrounded_decisions still exists on that type. rg -n "pending_grounding_decisions\|ungrounded_decisions" --type py -C 3

Repository: BicameralAI/bicameral-mcp

Length of output: 51

🏁 Script executed:

git ls-files --cached | head -20

Repository: BicameralAI/bicameral-mcp

Length of output: 907

🏁 Script executed:

find . -name "*.py" -type f | grep -E "(eval_decision|handle_ingest)" | head -20

Repository: BicameralAI/bicameral-mcp

Length of output: 104

🏁 Script executed:

cd tests && ls -la *.py | head -20

Repository: BicameralAI/bicameral-mcp

Length of output: 1568

🏁 Script executed:

cat -n tests/eval_decision_relevance.py | sed -n '140,175p'

Repository: BicameralAI/bicameral-mcp

Length of output: 1709

🏁 Script executed:

rg -n "def handle_ingest|class.*Result|pending_grounding_decisions|ungrounded_decisions" -A 2 tests/eval_decision_relevance.py

Repository: BicameralAI/bicameral-mcp

Length of output: 249

🏁 Script executed:

rg -n "handle_ingest" --type py

Repository: BicameralAI/bicameral-mcp

Length of output: 16489

🏁 Script executed:

head -500 handlers/ingest.py | tail -100

Repository: BicameralAI/bicameral-mcp

Length of output: 4313

🏁 Script executed:

grep -n "async def handle_ingest" handlers/ingest.py

Repository: BicameralAI/bicameral-mcp

Length of output: 98

🏁 Script executed:

sed -n '456,550p' handlers/ingest.py

Repository: BicameralAI/bicameral-mcp

Length of output: 4884

🏁 Script executed:

grep -n "class IngestResponse\|IngestResponse =" handlers/ingest.py | head -5

Repository: BicameralAI/bicameral-mcp

Length of output: 51

🏁 Script executed:

rg -n "class IngestResponse" --type py

Repository: BicameralAI/bicameral-mcp

Length of output: 119

🏁 Script executed:

sed -n '564,650p' contracts.py

Repository: BicameralAI/bicameral-mcp

Length of output: 2920

🏁 Script executed:

rg "ungrounded_decisions" --type py handlers/ contracts.py

Repository: BicameralAI/bicameral-mcp

Length of output: 170

🏁 Script executed:

# Check git history or comments about this field change rg -B 3 -A 3 "pending_grounding_decisions" handlers/ingest.py | head -40

Repository: BicameralAI/bicameral-mcp

Length of output: 347

Rename stale JSON key to match attribute name.

The JSON output key "ungrounded_decisions" is out of sync with the response field pending_grounding_decisions. The attribute exists on the response object (no runtime risk), but the key name mismatch creates semantic confusion—downstream consumers will receive data about decisions pending grounding (deferred, not yet grounded) under a key implying definitive ungroundedness.

Suggested fix

- "ungrounded_decisions": list(result.pending_grounding_decisions), + "pending_grounding_decisions": list(result.pending_grounding_decisions),

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"ungrounded_decisions": list(result.pending_grounding_decisions),

"pending_grounding_decisions": list(result.pending_grounding_decisions),

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/eval_decision_relevance.py` at line 166, The JSON key "ungrounded_decisions" is semantically stale and should match the response attribute name; update the dict construction where the output is built to replace the "ungrounded_decisions" key with "pending_grounding_decisions" so it serializes result.pending_grounding_decisions (i.e., use the attribute result.pending_grounding_decisions as the value under the key "pending_grounding_decisions") to keep naming consistent for downstream consumers.

…er-as-verdict Per BicameralAI/bicameral#108, Flow 3's `git commit` is the *trigger* for the V1 lifecycle, not the assertion. The previous design conflated the two: the per-flow asserter FAILed on missing commit, and a buggy substring check (`"git commit" in cmd`) silently rejected the agent's `git -C <path> commit` form that `claude` produces when cwd doesn't match the prompt's bare path. That masked the real signal — the post-hoc ledger validator's "no compliance_check rows" message — and reported a regex bug as a sync-chain failure. Changes: - New `_command_runs_git_commit()` helper using a regex that handles `-C <path>`, `--git-dir=<path>`, &&-chained sequences, and rejects `commit-tree` plumbing. - `assert_flow_3` now always returns True; body text describes the precondition (met / unmet) for the report. Per-flow no longer gates. - `_validate_flow3_via_ledger` becomes the single source of verdict truth: no commit → SKIP (advisory, non-blocking) commit + verdict written → PASS (full V1 lifecycle) commit + cc rows only → PASS (degraded, advisory: #135 gap) commit + neither → FAIL (no advisory — real sync-chain bug) - `commit_note` no longer derives from `flow3.verdict`, breaking the precondition→assertion leak. Replayed against PR #282's failing CI artifact: the agent's commit is now correctly detected, and the validator surfaces the real upstream bug (cc_delta == 0 despite a successful commit + 3 subsequent bicameral calls). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The old prompt asked the agent to add a comment above the CherryPickResult enum (line 30 of cherry-pick.ts), which sits OUTSIDE the cherry-pick decision's bound code region (the cherryPick function body, lines 142-183). Verified against PR #282's CI ledger artifact: code_region row for app/src/lib/git/cherry-pick.ts: symbol: cherryPick start_line: 142 end_line: 183 pinned_commit: 192ef5730e84c9337a1e07a04e56cba18a0b2766 ← Flow 3's commit link_commit DID fire post-Flow-3 (the pinned_commit set on the region proves ensure_ledger_synced → link_commit ran), but the bound region's content hash didn't change because the edit was 100+ lines above. With no drift detected, link_commit correctly wrote ZERO compliance_check rows — and the harness's success signal (cc_delta > 0) never fired. The new prompt edits inside the cherryPick function body (line 146, above the empty-set early-return), which is well inside the bound region. This produces the drift the V1 lifecycle is designed to detect, and the existing validator's `cc_delta > 0` branch will fire correctly (PASS-degraded with #135 advisory, since headless can't run resolve_compliance). Drive-by: use the full path `app/src/lib/git/cherry-pick.ts` in both the edit and commit commands so the agent doesn't have to defensively rewrite to `git -C <repo> commit` (the failing CI's agent did exactly this when faced with the bare relative path against cwd=/tmp/desktop-clone). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/e2e/run_e2e_flows.py (1)
1004-1026: 💤 Low value

Regex correctly detects git commit across documented patterns; two edge cases identified for future-proofing.

The regex reliably handles the three patterns documented in the code — bare git commit, git -C <path> commit, git --git-dir=<path> commit — and chained variants like git status && git commit, while the trailing (?=\s|$) lookahead correctly excludes plumbing such as git commit-tree. Two edge cases fall outside the current pattern:

(?=\s|$) does not match commands where commit is followed directly by shell punctuation (git commit;, git commit) in a subshell, git commit&& without a space). A lookahead like (?![\w-]) would accept these while still rejecting commit-tree/committer.

The flag-arg clause [^\s-]\S* consumes only unquoted tokens, so quoted paths with spaces (git -C "/tmp/has space" commit) will not match.

These gaps are unlikely to occur with the Claude Code CLI patterns currently tested. No changes needed for the current scope.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/run_e2e_flows.py` around lines 1004 - 1026, The _GIT_COMMIT_RE
currently misses two edge cases: commands where "commit" is followed immediately
by shell punctuation (e.g. "git commit;", "git commit)") and flag arguments that
are quoted tokens with spaces (e.g. git -C "/tmp/has space" commit); update the
_GIT_COMMIT_RE pattern used by _command_runs_git_commit to (1) use a lookaround
that forbids word/ hyphen chars after "commit" (e.g. (?![\w-]) or equivalent)
instead of (?=\s|$) so punctuation is accepted, and (2) broaden the flag-arg
clause to accept quoted tokens (e.g. allow "(\"[^\"]+\"|'[^']+'|[^\s-]\S*)" as
the optional flag argument) so quoted paths are matched; keep the rest of the
pattern logic and ensure _command_runs_git_commit still returns
bool(_GIT_COMMIT_RE.search(cmd or "")).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/run_e2e_flows.py`:
- Around line 357-361: The multi-line list comprehension for commit_calls should
be collapsed into a single-line comprehension to satisfy ruff formatting: locate
the comprehension that constructs commit_calls (uses bash_calls and calls
_command_runs_git_commit) and rewrite it so the conditional is on the same line
as the for clause (e.g., commit_calls = [c for c in bash_calls if
_command_runs_git_commit((c.get("input") or {}).get("command", ""))]); apply the
same single-line change to the identical comprehension later in the file (the
one around lines 1049–1053) to fix both ruff failures.

---

Nitpick comments:
In `@tests/e2e/run_e2e_flows.py`:
- Around line 1004-1026: The _GIT_COMMIT_RE currently misses two edge cases:
commands where "commit" is followed immediately by shell punctuation (e.g. "git
commit;", "git commit)") and flag arguments that are quoted tokens with spaces
(e.g. git -C "/tmp/has space" commit); update the _GIT_COMMIT_RE pattern used by
_command_runs_git_commit to (1) use a lookaround that forbids word/ hyphen chars
after "commit" (e.g. (?![\w-]) or equivalent) instead of (?=\s|$) so punctuation
is accepted, and (2) broaden the flag-arg clause to accept quoted tokens (e.g.
allow "(\"[^\"]+\"|'[^']+'|[^\s-]\S*)" as the optional flag argument) so quoted
paths are matched; keep the rest of the pattern logic and ensure
_command_runs_git_commit still returns bool(_GIT_COMMIT_RE.search(cmd or "")).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ada8ff59-745f-468d-8013-07e895ff7e3f

📥 Commits

Reviewing files that changed from the base of the PR and between c2cf52a and c20954e.

📒 Files selected for processing (1)

tests/e2e/run_e2e_flows.py

Drive-by formatter fix on c20954e — ruff prefers single-line list comprehensions when they fit within line-length. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Patch on v0.14.1 — README + SECURITY surface, ingest skill softening, Flow 3 e2e hardening. See CHANGELOG.md for the full per-section detail. Pre-release checklist (per docs/RELEASE_EVIDENCE_PROCEDURE.md): - All PRs in window (BicameralAI#282) have approving CI; review approval pending - No force-pushes to main since v0.14.1 - CHANGELOG draft ready (this commit) Tag + GitHub Release to be cut by the maintainer after this PR merges (triggers .github/workflows/publish.yml on `release:` event). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jinhongkuan and others added 10 commits May 7, 2026 21:29

style: ruff format assert_flow_1 ratify_note line

46d8fad

jinhongkuan had a problem deploying to production May 8, 2026 04:37 — with GitHub Actions Failure

jinhongkuan temporarily deployed to ci-test May 8, 2026 04:37 — with GitHub Actions Inactive

jinhongkuan had a problem deploying to recording-approval May 8, 2026 04:37 — with GitHub Actions Failure

ci: re-trigger workflows on fix/272 head

4329982

jinhongkuan enabled auto-merge (rebase) May 8, 2026 04:40

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

jinhongkuan had a problem deploying to production May 9, 2026 02:17 — with GitHub Actions Failure

jinhongkuan had a problem deploying to recording-approval May 9, 2026 02:17 — with GitHub Actions Failure

jinhongkuan temporarily deployed to ci-test May 9, 2026 02:17 — with GitHub Actions Inactive

jinhongkuan temporarily deployed to ci-test May 9, 2026 02:23 — with GitHub Actions Inactive

jinhongkuan temporarily deployed to production May 9, 2026 02:23 — with GitHub Actions Inactive

jinhongkuan temporarily deployed to ci-test May 9, 2026 02:23 — with GitHub Actions Inactive

jinhongkuan had a problem deploying to recording-approval May 9, 2026 02:23 — with GitHub Actions Failure

coderabbitai Bot reviewed May 9, 2026

View reviewed changes

Comment thread tests/e2e/run_e2e_flows.py

style(e2e): ruff format Flow 3 asserter list comprehensions

cdc1f19

Drive-by formatter fix on c20954e — ruff prefers single-line list comprehensions when they fit within line-length. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jinhongkuan temporarily deployed to production May 9, 2026 02:30 — with GitHub Actions Inactive

jinhongkuan had a problem deploying to recording-approval May 9, 2026 02:30 — with GitHub Actions Failure

jinhongkuan temporarily deployed to ci-test May 9, 2026 02:30 — with GitHub Actions Inactive

jinhongkuan disabled auto-merge May 9, 2026 02:56

jinhongkuan merged commit eb8ade8 into main May 9, 2026
6 of 7 checks passed

jinhongkuan mentioned this pull request May 9, 2026

chore(release): v0.14.2 — README + SECURITY surface, ingest softening, Flow 3 e2e fix #287

Merged

10 tasks

Knapp-Kevin mentioned this pull request May 14, 2026

fix(ci): SHA-pin test-summary/action in preflight-eval workflow (#272 residual) #317

Merged

2 tasks

devin-ai-integration Bot mentioned this pull request May 15, 2026

fix(infra): always-run schema CI, diagnose row probe, env-var truthy parity (#122, #301, #232) #353

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

triage: dev → main (README restructure + #272 Fix 3 + SECURITY.md)#282

triage: dev → main (README restructure + #272 Fix 3 + SECURITY.md)#282
jinhongkuan merged 14 commits into
mainfrom
triage-from-dev

jinhongkuan commented May 8, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 8, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	"ungrounded_decisions": list(result.pending_grounding_decisions),
	"pending_grounding_decisions": list(result.pending_grounding_decisions),

Conversation

jinhongkuan commented May 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Commits in order

Highlights

What this does NOT include

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jinhongkuan commented May 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 8, 2026 •

edited

Loading