triage: dev → main (README restructure + #272 Fix 3 + SECURITY.md)#282
Conversation
Visible changes when a user lands on the README: 1. Hero image (`assets/bicameral-hero.png`) at the very top — visual without/with comparison from the landing-page asset bundle. 2. Quickstart immediately after the one-line value prop (was buried below "The Problem" and "How It Feels"). User goes from "what is this" to "type these three lines" without scrolling. 3. Compliance section trimmed from a 12-line policy-file enumeration near the top to a 5-line "we take privacy seriously" paragraph at the bottom. The full posture stays linkable via docs/policies/. 4. pipx dropped from the install path. The two paths are now uv (recommended) and plain pip (fallback). uv was already preferred per #199's 3-path resolve order; pipx was middle ground that doesn't pull its weight in a top-level README. Section order: Hero → title → 1-liner → Quickstart → How It Feels → The Problem → Core Concepts → What setup installs → Slash Commands → MCP Tools Reference → Configuration → Local Development → Telemetry → Contributing → Privacy & Compliance → License 312 lines (was 376). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
27 in-progress planning artifacts (`plan-114-*.md`, `plan-codegenome-*.md`, `plan-A-*.md`, etc.) were tracked in the repo. They're working memory between author and reviewers during a feature; once the feature merges, the PR description + CHANGELOG carry the durable record. Keeping these in the public-ish repo: - adds 27 markdown files to the wheel's source-distribution surface for no end-user benefit - couples planning vocabulary to release artifacts (e.g. plan-codegenome files describe v1 work that #246 reverted; the plan stays useful as reference but doesn't belong on `main`) - creates churn pressure to mark plans as "done" or "superseded" instead of just letting them rest as the author's working notes This commit: 1. Adds `plan-*.md` pattern to `.gitignore` with a one-paragraph comment on the policy. 2. `git rm --cached` on all 27 currently-tracked `plan-*.md` files — they remain on local disk for the author's reference, just no longer tracked. After merge, anyone with a checkout will keep their local `plan-*.md` files; new plans drafted in-tree will be untracked by default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a top-level SECURITY.md so GitHub auto-creates a "Security" tab in the repo nav bar — the closest GitHub-native surface to a "button that pulls up our SBOM/privacy statement." Contents: - **Privacy posture** — explicit "all data on your laptop" + telemetry opt-out + pointer to docs/policies/ for the full posture - **Software supply chain** — table of signed artifacts on each release (CycloneDX SBOM, Rekor attestation, hooks-manifest sigs, skills-manifest sigs, release-tag-commit sig); pointer to the release-evidence verification procedure; mention of GitHub's auto-SPDX SBOM under Insights → Dependency graph - **Supported versions** — only latest minor, ~30-day backport window for critical fixes - **Reporting a vulnerability** — GitHub Security Advisories preferred; jin@bicameral-ai.com fallback with `[security]` subject prefix; 3-day ack target, 30-day patch target for critical - **Scope** — what's in (server, skills, hooks, release pipeline) and what's out (third-party deps, host vulns, local-attack scenarios already covered by host-trust model) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sections removed (PM/dev evaluating Bicameral wants the test-out path, not the why/how-internals): - "The Problem" (long context narrative) - "Core Concepts" (two-axis model, link_commit, collab modes) - "Removing Bicameral" (post-test concern) - "Configuration" env-var table - "Local Development" - "Contributing" - "Telemetry" subsection (folded into Privacy & Compliance one-liner) What stays — the path from "land on the page" to "type three commands and see something work": - Hero image - Star-on-GitHub CTA (animated SVG, adapted from cocoindex's design with metadata strings updated; visual mechanism is generic) - Logo (small, inline-right of title) - Title + badges + 1-liner - Quickstart (uv | pip | Windows) - How It Feels (preflight render + dashboard screenshot) - Slash Commands - What `setup` writes (trust signal — what hits disk) - MCP Tools Reference (collapsed by default) - Privacy & Compliance (concise) - License 312 → 152 lines. Visuals from landing-page borrowed: assets/logo.png (was landing-page/logo.png), assets/bicameral-hero.png from landing-page/output/imagegen/. Star CTA SVG saved as assets/star-on-github.svg. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… attr (#272) Closes 2 of 3 dev-baseline regressions tracked in #272 (the third — Flow 1 e2e expectation post-#263 auto-bind — is deferred for product-level discussion since it touches the ingest skill choreography). ## test-summary/action SHA-pin The `test-summary/action@v2` mutable tag was repointed on 2026-05-07 22:09 UTC from a `dist`-targeted release (with bundled `index.js`) to the new v2.5 release that targets `main` (no bundled output). Every PR opened after that point fails the Test Summary step with `File not found: index.js`, marking MCP Regression Suite jobs red on both ubuntu and windows. Pin to v2.4 commit `31493c76ec9e7aa675f1585d3ed6f1da69269a86` (the last `dist`-targeted release with the bundled artifact). Aligns with OWASP-03 / `docs/policies/install-trust-model.md` discipline (do not trust mutable action tags for security-load-bearing CI). ## tests/eval_decision_relevance.py:166 attribute rename `IngestResponse` schema field is `pending_grounding_decisions` per `contracts.py:571` and `handlers/ingest.py:695`. The eval script was still reading the legacy internal name `ungrounded_decisions` directly off the response, raising `AttributeError` and failing the M1 adversarial corpus eval step (technically `continue-on-error: true`, but worth fixing independently since the eval result was lost on every run). Keeps the JSON-output key `ungrounded_decisions` unchanged so downstream consumers (M1 trend reports, etc.) see no change. Both fixes verified locally: yaml.safe_load on the workflow + IngestResponse field introspection confirms the schema. Refs #272.
…Fix 3) Closes the third regression deferred from PR #273. Post-#263 (sync auto-bind step 1.5), Flow 1's e2e exhausted budget on ~41 non-bicameral Read/Grep/validate_symbols calls before reaching the ingest skill's auto-fired ratify AskUserQuestion gate, so the agent never invoked ratify. Per #108 Flow 1 spec discussion: this is both a regression AND a canonical flow change. Auto-bind stays (deterministic, useful), but ratify drops out of the auto-prompt path and becomes advisory text. Ratification belongs to Flow 5 (PM Friday review) and to direct user requests like "sign these off" / "ratify all". Skill changes (skills/bicameral-ingest/SKILL.md Step 7): - Replace AskUserQuestion ratify gate with a one-block advisory: "○ N decisions captured as proposals — drift tracking activates after ratification. Run `bicameral.ratify` when ready, or revisit them in your next history review (Flow 5)." - Add explicit "Direct user request shortcut": if the prompt asked for sign-off in the same turn, ratify directly without the round-trip. - Update the example at the bottom to match. Test changes: - tests/e2e/prompts/flow-1-ingest.md: drop "sign them off on our end" so the prompt exercises the new advisory path. Direct sign-off requests are covered by Flow 5 (history + ratify). - tests/e2e/run_e2e_flows.py:assert_flow_1: drop the ratify requirement; accept [ingest, bind?] as the canonical Flow 1 signature. Update Flow 5's stale comment about Flow 1 pre-ratifying. - tests/test_e2e_asserters.py: invert test_flow1_fails_without_ratify → test_flow1_passes_without_ratify (advisory-only is now the expected behavior). Doesn't touch sync skill #263 — that auto-bind change is preserved intentionally; this PR fixes the test/spec contract that conflicted with it. Refs #272 #108 #263. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ngest The previous fix dropped "sign them off on our end" to defeat the ratify auto-call; in doing so, it also removed the bicameral disambiguator. The remaining "please log these on our end" landed inside the auto-memory trigger surface (the runner's ~/.claude/projects/.../memory/ directory), so the agent wrote four memory files instead of invoking bicameral.ingest — Flow 1 went 0/0 on bicameral calls and the cascading flows (3, 5) then had nothing to assert against. Replace "log these on our end" with "log these to our decision ledger" — same intent, but "decision ledger" is an unambiguous bicameral signal the auto-memory skill does not match. Still no "sign them off" / "ratify" phrasing, so the new advisory-only Step 7 contract is preserved. Refs #272 #108. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The no-ratify success branch in assert_flow_5 still claimed "Flow 1 ratified its 3 seeds" — but per #272 Fix 3, Flow 1 now leaves seeds as `proposed` (advisory-only ingest). The docstring was updated in 2252c82; this catches the trailing f-string that was missed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r ledger' Same auto-memory / working-tree ambiguity that bit Flow 1 (#272 Fix 3 follow-up): Flow 5's prompt 'all the things we're tracking' was readable as either bicameral decisions OR files modified in the cwd's working tree. With Flow 1 no longer auto-ratifying (advisory-only Step 7), the agent's mental model on the new ledger state shifted: it pulled the auto-memory index and the working-tree diff first, then never reached bicameral.history. Tighten the prompt: - 'all the things we're tracking' → 'the decisions we've tracked in our ledger' (unambiguous bicameral signal — auto-memory and the working tree are not 'our ledger'). - 'still on the to-do pile that hasn't moved' → 'still on the proposed pile that hasn't been ratified' (uses the ledger's `signoff.state` vocabulary; aligns with the new advisory-only ingest contract where Flow 1 leaves seeds as `proposed` for Flow 5 to walk). The asserter (history mandatory; ratify conditional on proposals) is unchanged. Now the asserter's load-bearing 'if proposals exist' branch is the live path: Flow 1 leaves 3 proposals → Flow 5 ratifies → drift tracking activates, exactly as the #108 spec describes. Refs #272 #108. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR consolidates project planning artifacts, pivots the SessionEnd capture flow to queue pending transcripts, changes ingest ratification from mandatory to advisory-only, reorganizes README and security documentation, and updates e2e test assertions to match the new ingest behavior. The core narrative is SessionEnd hook refactoring coupled with ingest flow simplification and significant documentation hygiene work. ChangesSessionEnd Queue Pivot + Ingest Flow Refactor + Plan Consolidation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@README.md`:
- Line 111: Update the summary text that currently reads "<summary><strong>13
tools across two categories</strong></summary>" to reflect the actual count of
tools; change it to "<summary><strong>15 tools across two
categories</strong></summary>" (or optionally "<summary><strong>15 tools across
two categories (13 Decision Ledger + 2 Code Locator)</strong></summary>") so the
summary matches the listed totals.
In `@skills/bicameral-ingest/SKILL.md`:
- Around line 823-827: The fenced block starting with triple backticks around
the decision-note example is missing a language tag (causing MD040); update the
opening fence in that block (the ``` that begins the example containing "○ N
decisions captured as proposals ...") to include a language identifier such as
text (i.e., change ``` to ```text) so the markdown linter recognizes the code
fence.
In `@tests/eval_decision_relevance.py`:
- Line 166: The JSON key "ungrounded_decisions" is semantically stale and should
match the response attribute name; update the dict construction where the output
is built to replace the "ungrounded_decisions" key with
"pending_grounding_decisions" so it serializes
result.pending_grounding_decisions (i.e., use the attribute
result.pending_grounding_decisions as the value under the key
"pending_grounding_decisions") to keep naming consistent for downstream
consumers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: eee1f34f-01fa-4bc0-9860-2d00577dc0cf
⛔ Files ignored due to path filters (3)
assets/bicameral-hero.pngis excluded by!**/*.pngassets/logo.pngis excluded by!**/*.pngassets/star-on-github.svgis excluded by!**/*.svg
📒 Files selected for processing (37)
.github/workflows/test-mcp-regression.yml.gitignoreREADME.mdSECURITY.mdplan-114-grounding-lint.mdplan-124-post-commit-hook-fix.mdplan-156-sessionend-queue-pivot.mdplan-156b-preflight-queue-drain.mdplan-187-gap-brief-unification.mdplan-190-compliance-event.mdplan-197-flow3-prompt-clarity.mdplan-199-windows-install-and-uv.mdplan-200-config-yaml-redaction.mdplan-200-skills-audit-hardening.mdplan-205-compliance-research-brief.mdplan-216-ingest-size-and-rate-limit.mdplan-218-cosign-hooks-manifest-and-sbom.mdplan-227-structured-audit-log-emission.mdplan-252-layer-2-schema-revision-sentinel.mdplan-252-layer-3-diagnose-cli.mdplan-48-pre-push-drift-hook.mdplan-A-ingest-followups-230-232-233.mdplan-B-preflight-attribution-regex-209.mdplan-C-soc2-03-signed-tags-and-release-evidence.mdplan-D-compliance-stance-docs-220-225-226.mdplan-E-owasp-03-and-05-install-update-trust-model.mdplan-F-llm-06-skills-manifest-signing.mdplan-codegenome-llm-drift-judge.mdplan-codegenome-phase-1-2.mdplan-codegenome-phase-3.mdplan-codegenome-phase-4.mdskills/bicameral-ingest/SKILL.mdtests/e2e/prompts/flow-1-ingest.mdtests/e2e/prompts/flow-5-history.mdtests/e2e/run_e2e_flows.pytests/eval_decision_relevance.pytests/test_e2e_asserters.py
💤 Files with no reviewable changes (27)
- plan-227-structured-audit-log-emission.md
- plan-197-flow3-prompt-clarity.md
- plan-B-preflight-attribution-regex-209.md
- plan-codegenome-phase-3.md
- plan-114-grounding-lint.md
- plan-156b-preflight-queue-drain.md
- plan-E-owasp-03-and-05-install-update-trust-model.md
- plan-codegenome-phase-4.md
- plan-187-gap-brief-unification.md
- plan-200-config-yaml-redaction.md
- plan-codegenome-llm-drift-judge.md
- plan-218-cosign-hooks-manifest-and-sbom.md
- plan-124-post-commit-hook-fix.md
- plan-C-soc2-03-signed-tags-and-release-evidence.md
- plan-216-ingest-size-and-rate-limit.md
- plan-205-compliance-research-brief.md
- plan-48-pre-push-drift-hook.md
- plan-A-ingest-followups-230-232-233.md
- plan-252-layer-2-schema-revision-sentinel.md
- plan-199-windows-install-and-uv.md
- plan-200-skills-audit-hardening.md
- plan-codegenome-phase-1-2.md
- plan-F-llm-06-skills-manifest-signing.md
- plan-D-compliance-stance-docs-220-225-226.md
- plan-156-sessionend-queue-pivot.md
- plan-252-layer-3-diagnose-cli.md
- plan-190-compliance-event.md
|
|
||
| <details> | ||
| <summary><strong>13 tools across three categories</strong></summary> | ||
| <summary><strong>13 tools across two categories</strong></summary> |
There was a problem hiding this comment.
Tool count in summary appears inconsistent with the tables
Line 111 says “13 tools across two categories,” but the section currently lists 15 total (13 Decision Ledger + 2 Code Locator). Please update the summary line to avoid reader confusion.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@README.md` at line 111, Update the summary text that currently reads
"<summary><strong>13 tools across two categories</strong></summary>" to reflect
the actual count of tools; change it to "<summary><strong>15 tools across two
categories</strong></summary>" (or optionally "<summary><strong>15 tools across
two categories (13 Decision Ledger + 2 Code Locator)</strong></summary>") so the
summary matches the listed totals.
| ``` | ||
| AskUserQuestion({ | ||
| question: "Captured N decisions as proposals — ratify now to start drift tracking:", | ||
| header: "Ratify", | ||
| multiSelect: false, | ||
| options: [ | ||
| { label: "Ratify all N (recommended)", | ||
| description: "Drift tracking starts immediately on all N decisions." }, | ||
| { label: "Pick which to ratify", | ||
| description: "Use the Other field to specify decision numbers (e.g. '1 3 5')." }, | ||
| { label: "Skip — review later", | ||
| description: "All stay as proposals. They'll surface as stale after inactivity." } | ||
| ] | ||
| }) | ||
| ○ N decisions captured as proposals — drift tracking activates after ratification. | ||
| Run `bicameral.ratify` when ready, or revisit them in your next history review | ||
| (Flow 5: "show me what's tracked" → ratify what's ready). | ||
| ``` |
There was a problem hiding this comment.
Add a language tag to the fenced example block
The code fence opened at Line 823 is missing a language identifier, which triggers MD040.
Suggested patch
-```
+```text
○ N decisions captured as proposals — drift tracking activates after ratification.
Run `bicameral.ratify` when ready, or revisit them in your next history review
(Flow 5: "show me what's tracked" → ratify what's ready).</details>
<!-- suggestion_start -->
<details>
<summary>📝 Committable suggestion</summary>
> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 823-823: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@skills/bicameral-ingest/SKILL.md` around lines 823 - 827, The fenced block
starting with triple backticks around the decision-note example is missing a
language tag (causing MD040); update the opening fence in that block (the ```
that begins the example containing "○ N decisions captured as proposals ...") to
include a language identifier such as text (i.e., change ``` to ```text) so the
markdown linter recognizes the code fence.
| "grounded_pct": round(stats.grounded_pct, 4), | ||
| "grounding_deferred": stats.grounding_deferred, | ||
| "ungrounded_decisions": list(result.ungrounded_decisions), | ||
| "ungrounded_decisions": list(result.pending_grounding_decisions), |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the return type of handle_ingest exposes pending_grounding_decisions
# and check whether ungrounded_decisions still exists on that type.
rg -n "pending_grounding_decisions\|ungrounded_decisions" --type py -C 3Repository: BicameralAI/bicameral-mcp
Length of output: 51
🏁 Script executed:
git ls-files --cached | head -20Repository: BicameralAI/bicameral-mcp
Length of output: 907
🏁 Script executed:
find . -name "*.py" -type f | grep -E "(eval_decision|handle_ingest)" | head -20Repository: BicameralAI/bicameral-mcp
Length of output: 104
🏁 Script executed:
cd tests && ls -la *.py | head -20Repository: BicameralAI/bicameral-mcp
Length of output: 1568
🏁 Script executed:
cat -n tests/eval_decision_relevance.py | sed -n '140,175p'Repository: BicameralAI/bicameral-mcp
Length of output: 1709
🏁 Script executed:
rg -n "def handle_ingest|class.*Result|pending_grounding_decisions|ungrounded_decisions" -A 2 tests/eval_decision_relevance.pyRepository: BicameralAI/bicameral-mcp
Length of output: 249
🏁 Script executed:
rg -n "handle_ingest" --type pyRepository: BicameralAI/bicameral-mcp
Length of output: 16489
🏁 Script executed:
head -500 handlers/ingest.py | tail -100Repository: BicameralAI/bicameral-mcp
Length of output: 4313
🏁 Script executed:
grep -n "async def handle_ingest" handlers/ingest.pyRepository: BicameralAI/bicameral-mcp
Length of output: 98
🏁 Script executed:
sed -n '456,550p' handlers/ingest.pyRepository: BicameralAI/bicameral-mcp
Length of output: 4884
🏁 Script executed:
grep -n "class IngestResponse\|IngestResponse =" handlers/ingest.py | head -5Repository: BicameralAI/bicameral-mcp
Length of output: 51
🏁 Script executed:
rg -n "class IngestResponse" --type pyRepository: BicameralAI/bicameral-mcp
Length of output: 119
🏁 Script executed:
sed -n '564,650p' contracts.pyRepository: BicameralAI/bicameral-mcp
Length of output: 2920
🏁 Script executed:
rg "ungrounded_decisions" --type py handlers/ contracts.pyRepository: BicameralAI/bicameral-mcp
Length of output: 170
🏁 Script executed:
# Check git history or comments about this field change
rg -B 3 -A 3 "pending_grounding_decisions" handlers/ingest.py | head -40Repository: BicameralAI/bicameral-mcp
Length of output: 347
Rename stale JSON key to match attribute name.
The JSON output key "ungrounded_decisions" is out of sync with the response field pending_grounding_decisions. The attribute exists on the response object (no runtime risk), but the key name mismatch creates semantic confusion—downstream consumers will receive data about decisions pending grounding (deferred, not yet grounded) under a key implying definitive ungroundedness.
Suggested fix
- "ungrounded_decisions": list(result.pending_grounding_decisions),
+ "pending_grounding_decisions": list(result.pending_grounding_decisions),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "ungrounded_decisions": list(result.pending_grounding_decisions), | |
| "pending_grounding_decisions": list(result.pending_grounding_decisions), |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/eval_decision_relevance.py` at line 166, The JSON key
"ungrounded_decisions" is semantically stale and should match the response
attribute name; update the dict construction where the output is built to
replace the "ungrounded_decisions" key with "pending_grounding_decisions" so it
serializes result.pending_grounding_decisions (i.e., use the attribute
result.pending_grounding_decisions as the value under the key
"pending_grounding_decisions") to keep naming consistent for downstream
consumers.
…er-as-verdict
Per BicameralAI/bicameral#108, Flow 3's `git commit` is the *trigger* for
the V1 lifecycle, not the assertion. The previous design conflated the two:
the per-flow asserter FAILed on missing commit, and a buggy substring check
(`"git commit" in cmd`) silently rejected the agent's `git -C <path> commit`
form that `claude` produces when cwd doesn't match the prompt's bare path.
That masked the real signal — the post-hoc ledger validator's "no
compliance_check rows" message — and reported a regex bug as a sync-chain
failure.
Changes:
- New `_command_runs_git_commit()` helper using a regex that handles
`-C <path>`, `--git-dir=<path>`, &&-chained sequences, and rejects
`commit-tree` plumbing.
- `assert_flow_3` now always returns True; body text describes the
precondition (met / unmet) for the report. Per-flow no longer gates.
- `_validate_flow3_via_ledger` becomes the single source of verdict truth:
no commit → SKIP (advisory, non-blocking)
commit + verdict written → PASS (full V1 lifecycle)
commit + cc rows only → PASS (degraded, advisory: #135 gap)
commit + neither → FAIL (no advisory — real sync-chain bug)
- `commit_note` no longer derives from `flow3.verdict`, breaking the
precondition→assertion leak.
Replayed against PR #282's failing CI artifact: the agent's commit is now
correctly detected, and the validator surfaces the real upstream bug
(cc_delta == 0 despite a successful commit + 3 subsequent bicameral calls).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The old prompt asked the agent to add a comment above the CherryPickResult enum (line 30 of cherry-pick.ts), which sits OUTSIDE the cherry-pick decision's bound code region (the cherryPick function body, lines 142-183). Verified against PR #282's CI ledger artifact: code_region row for app/src/lib/git/cherry-pick.ts: symbol: cherryPick start_line: 142 end_line: 183 pinned_commit: 192ef5730e84c9337a1e07a04e56cba18a0b2766 ← Flow 3's commit link_commit DID fire post-Flow-3 (the pinned_commit set on the region proves ensure_ledger_synced → link_commit ran), but the bound region's content hash didn't change because the edit was 100+ lines above. With no drift detected, link_commit correctly wrote ZERO compliance_check rows — and the harness's success signal (cc_delta > 0) never fired. The new prompt edits inside the cherryPick function body (line 146, above the empty-set early-return), which is well inside the bound region. This produces the drift the V1 lifecycle is designed to detect, and the existing validator's `cc_delta > 0` branch will fire correctly (PASS-degraded with #135 advisory, since headless can't run resolve_compliance). Drive-by: use the full path `app/src/lib/git/cherry-pick.ts` in both the edit and commit commands so the agent doesn't have to defensively rewrite to `git -C <repo> commit` (the failing CI's agent did exactly this when faced with the bare relative path against cwd=/tmp/desktop-clone). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/e2e/run_e2e_flows.py (1)
1004-1026: 💤 Low valueRegex correctly detects
git commitacross documented patterns; two edge cases identified for future-proofing.The regex reliably handles the three patterns documented in the code — bare
git commit,git -C <path> commit,git --git-dir=<path> commit— and chained variants likegit status && git commit, while the trailing(?=\s|$)lookahead correctly excludes plumbing such asgit commit-tree. Two edge cases fall outside the current pattern:
(?=\s|$)does not match commands wherecommitis followed directly by shell punctuation (git commit;,git commit)in a subshell,git commit&&without a space). A lookahead like(?![\w-])would accept these while still rejectingcommit-tree/committer.- The flag-arg clause
[^\s-]\S*consumes only unquoted tokens, so quoted paths with spaces (git -C "/tmp/has space" commit) will not match.These gaps are unlikely to occur with the Claude Code CLI patterns currently tested. No changes needed for the current scope.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/run_e2e_flows.py` around lines 1004 - 1026, The _GIT_COMMIT_RE currently misses two edge cases: commands where "commit" is followed immediately by shell punctuation (e.g. "git commit;", "git commit)") and flag arguments that are quoted tokens with spaces (e.g. git -C "/tmp/has space" commit); update the _GIT_COMMIT_RE pattern used by _command_runs_git_commit to (1) use a lookaround that forbids word/ hyphen chars after "commit" (e.g. (?![\w-]) or equivalent) instead of (?=\s|$) so punctuation is accepted, and (2) broaden the flag-arg clause to accept quoted tokens (e.g. allow "(\"[^\"]+\"|'[^']+'|[^\s-]\S*)" as the optional flag argument) so quoted paths are matched; keep the rest of the pattern logic and ensure _command_runs_git_commit still returns bool(_GIT_COMMIT_RE.search(cmd or "")).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/e2e/run_e2e_flows.py`:
- Around line 357-361: The multi-line list comprehension for commit_calls should
be collapsed into a single-line comprehension to satisfy ruff formatting: locate
the comprehension that constructs commit_calls (uses bash_calls and calls
_command_runs_git_commit) and rewrite it so the conditional is on the same line
as the for clause (e.g., commit_calls = [c for c in bash_calls if
_command_runs_git_commit((c.get("input") or {}).get("command", ""))]); apply the
same single-line change to the identical comprehension later in the file (the
one around lines 1049–1053) to fix both ruff failures.
---
Nitpick comments:
In `@tests/e2e/run_e2e_flows.py`:
- Around line 1004-1026: The _GIT_COMMIT_RE currently misses two edge cases:
commands where "commit" is followed immediately by shell punctuation (e.g. "git
commit;", "git commit)") and flag arguments that are quoted tokens with spaces
(e.g. git -C "/tmp/has space" commit); update the _GIT_COMMIT_RE pattern used by
_command_runs_git_commit to (1) use a lookaround that forbids word/ hyphen chars
after "commit" (e.g. (?![\w-]) or equivalent) instead of (?=\s|$) so punctuation
is accepted, and (2) broaden the flag-arg clause to accept quoted tokens (e.g.
allow "(\"[^\"]+\"|'[^']+'|[^\s-]\S*)" as the optional flag argument) so quoted
paths are matched; keep the rest of the pattern logic and ensure
_command_runs_git_commit still returns bool(_GIT_COMMIT_RE.search(cmd or "")).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ada8ff59-745f-468d-8013-07e895ff7e3f
📒 Files selected for processing (1)
tests/e2e/run_e2e_flows.py
Drive-by formatter fix on c20954e — ruff prefers single-line list comprehensions when they fit within line-length. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Patch on v0.14.1 — README + SECURITY surface, ingest skill softening, Flow 3 e2e hardening. See CHANGELOG.md for the full per-section detail. Pre-release checklist (per docs/RELEASE_EVIDENCE_PROCEDURE.md): - All PRs in window (BicameralAI#282) have approving CI; review approval pending - No force-pushes to main since v0.14.1 - CHANGELOG draft ready (this commit) Tag + GitHub Release to be cut by the maintainer after this PR merges (triggers .github/workflows/publish.yml on `release:` event). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Triages 10 commits from
devontomain— every commit frome661bd8onwards (the README restructure base) up through the latest Flow 5 prompt disambiguation. Cherry-picked rather than merge-tracked becausemainhas independent hotfix history (#267-#269SBOM hotfixes) that's already in dev via prior triage; this avoids re-merging the merge commits.Commits in order
dc380a7docs(README): restructure for getting-started clarity + add hero image5d1dca7chore(repo): untrack plan-*.md and add to .gitignore279e9eedocs(security): add SECURITY.md — enables GitHub Security tabb1e9879docs(README): trim to test-out path; add star CTA + logod786914fix(ci): SHA-pin test-summary action + rename eval_decision_relevance attr (Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272)6cab533fix(skill): ingest advises on ratify instead of auto-prompting (Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272 Fix 3)46d8fadstyle: ruff format assert_flow_1 ratify_note lined1364a1fix(e2e): re-anchor flow-1 prompt with 'decision ledger' to trigger ingest3975447style(e2e): refresh stale Flow 5 no-ratify message (Dev CI baseline regressions blocking #258 merge: test-summary action + M1 eval AttributeError + e2e Flow 1 #272 Fix 3)c2cf52afix(e2e): disambiguate flow-5 prompt — 'decisions we've tracked in our ledger'Highlights
AskUserQuestiongate (saves Flow 1's tool-call budget; ratify moves to Flow 5)phoenix-actions/test-reporting-summarybicameral.ingest/bicameral.historybefore exhausting budget on auto-memory + working-tree readsWhat this does NOT include
devTest plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Changes
Chores