release: v0.13.6 (triage)#140
Conversation
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughThis PR introduces a comprehensive telemetry consent system with persistent markers, local tool-call counters, new link_commit CLI command, event sourcing extensions for decision ratification/supersession, cross-platform file locking, and refactored handlers delegating write operations to ledger adapters. Includes 200+ lines of test coverage and CI updates. Changes
Sequence Diagram(s)sequenceDiagram
participant Tool as Tool/Skill
participant Send as telemetry.send_event()
participant LC as local_counters
participant Consent as consent
participant Relay as Relay/Network
Tool->>Send: send_event(skill, payload)
Send->>LC: increment(tool_name)
LC->>LC: append to counters.jsonl
Send->>Consent: telemetry_allowed()
alt Consent Disabled
Consent-->>Send: False
Send-->>Tool: return (no relay)
else Consent Enabled
Consent-->>Send: True
Send->>Relay: POST event
Relay-->>Send: ack/error
Send-->>Tool: return
end
sequenceDiagram
participant Handler as Handler (ratify/supersede)
participant Adapter as TeamWriteAdapter
participant EventWriter as EventFileWriter
participant Materializer as Materializer
participant Ledger2 as Ledger (Replay)
Handler->>Adapter: apply_ratify() / apply_supersede()
Adapter->>Adapter: compute canonical_id
Adapter->>EventWriter: emit decision_ratified/superseded.completed
EventWriter->>EventWriter: exclusive lock, append JSON
Adapter->>Adapter: forward to inner adapter (UPDATE state)
Materializer->>Materializer: replay_new_events()
Materializer->>Materializer: find by canonical_id
alt Decision Found
Materializer->>Ledger2: apply_ratify() / apply_supersede()
Ledger2->>Ledger2: UPDATE signoff state
else Not Found
Materializer->>Materializer: log warning, skip
end
sequenceDiagram
participant CLI as CLI/Git Hook
participant LCRunner as invoke_link_commit()
participant Handler as handle_link_commit()
participant Ledger as Ledger
CLI->>LCRunner: invoke_link_commit(commit_hash)
LCRunner->>LCRunner: check ledger.db exists
alt Ledger Missing
LCRunner-->>CLI: None (graceful skip)
else Ledger Present
LCRunner->>Handler: handle_link_commit(ctx, commit_hash)
Handler->>Ledger: check/update compliance
Ledger-->>Handler: result
Handler-->>LCRunner: LinkCommitResponse
LCRunner-->>CLI: response
CLI->>CLI: serialize to JSON / exit 0
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 13 minutes and 9 seconds.Comment |
Carry-forward commits (already on triage-from-dev pre-this-cycle)The PR description only enumerates the 3 commits I cherry-picked this cycle. But because
These are why the PR diff is +2678/-62 across 30 files rather than ~+832/-28 across 4 files (which would be just my cycle's picks). Each of these has its own dev-sha provenance baked in via Closes (carry-forward): #74, #80, #95, #97 (already merged on dev), #124 (this is its first ride to main; PR #134 was the dev landing). If the maintainer prefers to split this — i.e. ship only my cycle's 3 commits and hold the other 5 for a separate release PR — they can reset |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
setup_wizard.py (1)
529-570:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't persist telemetry as enabled when the prompt is cancelled.
questionary.select(...).ask()can returnNoneon cancellation/EOF, and the current fallback writestelemetry=Trueanyway. That silently opts the user in without an explicit answer. Abort the wizard or default toFalseinstead.Suggested fix
- choice = result if result is not None else True - write_consent(telemetry=choice, via="wizard") - return choice + if result is None: + raise RuntimeError("Telemetry consent prompt was cancelled") + write_consent(telemetry=result, via="wizard") + return result🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@setup_wizard.py` around lines 529 - 570, The prompt currently treats a cancelled questionary.select (.ask() returning None) as True and persists telemetry enabled; change the fallback so a None result is treated as False (do not enable telemetry) or abort; specifically, in _select_telemetry() update the handling of result from questionary.select to set choice = result if result is not None else False and then call write_consent(telemetry=choice, via="wizard") and return choice (leave the non-interactive branch that auto-writes telemetry=True unchanged).
🧹 Nitpick comments (4)
docs/BACKLOG.md (1)
38-39: ⚡ Quick winUse role/artifact-based gating instead of person-specific wording.
“Jin's call” is likely to become stale; prefer a role or decision artifact reference so this remains actionable over time.
Suggested doc tweak
- to the full rewrite. Decision blocked on Jin's call about team - sync as a v1.0.0 goal. + to the full rewrite. Decision blocked pending architecture + decision (owner: release manager/architecture lead) on whether + team sync is a v1.0.0 goal.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/BACKLOG.md` around lines 38 - 39, Replace the person-specific phrase "Jin's call" with a role- or artifact-based gate so the backlog item remains actionable; for example change it to "engineering lead approval", "product manager decision", or reference a decision artifact like "pending ADR/RFC: team-sync-as-v1.0.0-goal" and ensure the sentence explicitly cites that artifact or role (e.g., "Decision blocked pending engineering lead approval or ADR 'team-sync-v1'") so future readers know who or what finalizes the decision.telemetry.py (1)
117-125: ⚡ Quick winNarrow the local-counter fallback.
Catching
Exceptionhere will also hide programming bugs inlocal_counters, so a broken counter path only shows up as a debug log. If this must stay non-fatal, catch the specific import/IO failures you expect.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@telemetry.py` around lines 117 - 125, The current block catches Exception broadly which can mask programming errors in local_counters; replace the broad except with a narrow catch for the expected import/IO failures (e.g., ImportError, ModuleNotFoundError, OSError) around the import and increment call for local_counters/_local_increment and keep the non-fatal logger.debug for those specific exceptions; refer to symbols local_counters, _local_increment, skill_name, properties and logger.debug when making the change.scripts/sim_issue_108_flows.py (1)
828-843: ⚡ Quick winGuard script execution behind
__main__to avoid import side effects.Running flows at import time makes this module unsafe to import from tests or other scripts.
💡 Proposed fix
-asyncio.run(main()) - -print("\n\n=== REPORT ===\n") -overall = "PASS" if all(v == "PASS" for _, v, _ in RESULTS) else "PARTIAL/FAIL" -for flow_id, verdict, body in RESULTS: - print(f"\n## {flow_id} — {verdict}\n") - print(body) - print() - -print("\n=== SUMMARY ===\n") -print(f"{'Flow':<10} {'Verdict':<8}") -print(f"{'-' * 10} {'-' * 8}") -for flow_id, verdict, _ in RESULTS: - print(f"{flow_id:<10} {verdict:<8}") -print(f"\nOverall: {overall}") +if __name__ == "__main__": + asyncio.run(main()) + + print("\n\n=== REPORT ===\n") + overall = "PASS" if all(v == "PASS" for _, v, _ in RESULTS) else "PARTIAL/FAIL" + for flow_id, verdict, body in RESULTS: + print(f"\n## {flow_id} — {verdict}\n") + print(body) + print() + + print("\n=== SUMMARY ===\n") + print(f"{'Flow':<10} {'Verdict':<8}") + print(f"{'-' * 10} {'-' * 8}") + for flow_id, verdict, _ in RESULTS: + print(f"{flow_id:<10} {verdict:<8}") + print(f"\nOverall: {overall}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/sim_issue_108_flows.py` around lines 828 - 843, The top-level execution block currently runs asyncio.run(main()) and prints RESULTS at import time, causing side effects; wrap the asyncio.run(main()) call and all subsequent reporting/printing logic (the lines referencing asyncio.run(main()), RESULTS, overall, and the summary print loop) inside an if __name__ == "__main__": guard so importing the module won't execute flows—keep references to the existing main coroutine and RESULTS list unchanged but only invoke them under that guard.tests/test_consent_notice.py (1)
196-200: ⚡ Quick winAssert the local counter increment that the test claims.
The comment says local counters should still increment, but the test only checks that relay sending is blocked. Please assert the counter value too.
💡 Proposed test addition
# Counter should still increment locally. import local_counters importlib.reload(local_counters) + counts = local_counters.read_counters() + assert counts.get("bicameral-ingest", 0) >= 1 # Relay was NOT called (consent denied). assert sent == []🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_consent_notice.py` around lines 196 - 200, The test claims the local counter should increment but never checks it: before performing the action that triggers the local increment, snapshot the local counter value from the local_counters module (e.g. local_counters.counter or a get_count()/COUNTER symbol in that module), then after importlib.reload(local_counters) assert the new value equals the prior value + 1; keep the existing relay assertion (sent == []) and place this counter assertion just after the reload so it verifies the local increment occurred while relay sending remained blocked.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 68-70: The "### Closes" heading followed by a line starting with
"#39, `#42`." is being parsed as a markdown heading; change the reference line so
it does not start with a hash: under the "### Closes" heading replace the raw
"#39, `#42`." line with a plain text or bullet-style entry such as "Closes: `#39`,
`#42`" or "- Closes `#39` and `#42`" so markdownlint no longer treats it as a heading.
In `@consent.py`:
- Around line 102-109: The conversion of marker.get("policy_version", 0) to int
in _should_notify can raise ValueError/TypeError for malformed markers and cause
silent suppression of the notification; update _should_notify to safely parse
the stored policy_version by wrapping the int conversion in a try/except (catch
ValueError and TypeError) and treating invalid values as 0 (or otherwise forcing
re-notification) so the function returns True when the marker is corrupted;
reference the _should_notify function, the marker variable and POLICY_VERSION
when making this change.
In `@events/materializer.py`:
- Around line 97-146: The current handlers for etype
"decision_ratified.completed" and "decision_superseded.completed" call
find_decision_by_canonical_id and simply log+continue when canonical IDs are
unresolved, which lets the consumer advance the watermark (f.tell) and
permanently drop the event; change the logic in materializer.py so that when
find_decision_by_canonical_id returns None you do NOT advance the offset but
instead buffer the event for retry (e.g., push the payload and metadata into an
in-memory/DB unresolved_events queue keyed by canonical_id or re-enqueue the
message) and return without calling any watermark-advancing code; ensure the
same approach is applied to both inner_adapter.apply_ratify and
inner_adapter.apply_supersede paths so unresolved peer events are retried once
canonical IDs are available.
In `@handlers/usage_summary.py`:
- Around line 84-98: The query in handlers/usage_summary.py is counting a
verdict value "cosmetic_autopass" that doesn't exist in the schema
(compliance_check.verdict is "compliant | drifted | not_relevant"), so cosmetic
will always be zero; update the query and counts to use the schema-valid verdict
(e.g., replace "cosmetic_autopass" with "not_relevant") and adjust the variable
references (cc_rows, cc_counts, cosmetic, drift_total,
base["cosmetic_drift_pct"]) accordingly so cosmetic_drift_pct is computed from
the actual verdict values present in compliance_check.
In `@ledger/adapter.py`:
- Around line 1120-1181: The write paths use caller-controlled record ids
directly in SurrealQL f-strings causing injection; before constructing any query
validate/sanitize record ids using the existing _validated_record_id() helper
(or equivalent) and replace f-string-inlined ids with validated values or
parameterized placeholders. Apply this to apply_ratify(decision_id),
apply_supersede(new_id, old_id), and in the downstream helpers
update_decision_status(...) and relate_supersedes(...) so every occurrence that
builds SQL like f"UPDATE {id} ..." or f"SELECT ... FROM {id}" first calls
_validated_record_id(id) and then uses that result (or a bound parameter) when
composing the query. Ensure both new_id and old_id are validated in
apply_supersede and propagate validated ids into calls to
relate_supersedes/update_decision_status.
In `@local_counters.py`:
- Around line 34-41: Currently _COUNTERS_FILE is resolved at import time using
Path.home(), which prevents later HOME/USERPROFILE changes from taking effect;
replace the module-level _COUNTERS_FILE constant with a function (e.g.,
_counters_file()) that computes and returns Path.home() / ".bicameral" /
"counters.jsonl" at call time and update all uses to call that function instead
(ensure callers that write create parent directory as needed); leave _LOCK and
_enabled() as-is.
In `@scripts/sim_issue_108_flows.py`:
- Line 29: The script currently injects a machine-specific absolute path via
sys.path.insert(0, "/Users/jinhongkuan/github/bicameral/pilot/mcp"); remove this
hardcoded insertion and replace it with a portable solution: either install the
mcp package into the environment or compute the project-relative path at runtime
(e.g., derive the repo root from __file__ / pathlib and append the relative
"mcp" directory) and add that computed path to sys.path only when necessary;
update the script's import approach so imports succeed without hardcoded
filesystem paths and ensure any alternative respects CI/dev environments.
In `@server.py`:
- Around line 705-724: The smoke test fails because the new Tool named
"bicameral.usage_summary" was added but the static EXPECTED_TOOL_NAMES used by
run_smoke_test() wasn't updated; update the EXPECTED_TOOL_NAMES list (the
constant referenced by run_smoke_test) to include the string
"bicameral.usage_summary" so the test reflects the newly registered Tool and
re-run the smoke tests.
- Around line 870-873: The bicameral.usage_summary branch returns before the
auto-sync, risking stale data; call ensure_ledger_synced(ctx) before invoking
handle_usage_summary or move the entire block below the existing auto-sync path
so reads occur after the usual sync; specifically, insert a call to
ensure_ledger_synced(ctx) just before the await handle_usage_summary(ctx, ...)
(or relocate the if name == "bicameral.usage_summary" block to follow the
auto-sync logic) so the usage_summary reflects the current ledger state.
---
Outside diff comments:
In `@setup_wizard.py`:
- Around line 529-570: The prompt currently treats a cancelled
questionary.select (.ask() returning None) as True and persists telemetry
enabled; change the fallback so a None result is treated as False (do not enable
telemetry) or abort; specifically, in _select_telemetry() update the handling of
result from questionary.select to set choice = result if result is not None else
False and then call write_consent(telemetry=choice, via="wizard") and return
choice (leave the non-interactive branch that auto-writes telemetry=True
unchanged).
---
Nitpick comments:
In `@docs/BACKLOG.md`:
- Around line 38-39: Replace the person-specific phrase "Jin's call" with a
role- or artifact-based gate so the backlog item remains actionable; for example
change it to "engineering lead approval", "product manager decision", or
reference a decision artifact like "pending ADR/RFC: team-sync-as-v1.0.0-goal"
and ensure the sentence explicitly cites that artifact or role (e.g., "Decision
blocked pending engineering lead approval or ADR 'team-sync-v1'") so future
readers know who or what finalizes the decision.
In `@scripts/sim_issue_108_flows.py`:
- Around line 828-843: The top-level execution block currently runs
asyncio.run(main()) and prints RESULTS at import time, causing side effects;
wrap the asyncio.run(main()) call and all subsequent reporting/printing logic
(the lines referencing asyncio.run(main()), RESULTS, overall, and the summary
print loop) inside an if __name__ == "__main__": guard so importing the module
won't execute flows—keep references to the existing main coroutine and RESULTS
list unchanged but only invoke them under that guard.
In `@telemetry.py`:
- Around line 117-125: The current block catches Exception broadly which can
mask programming errors in local_counters; replace the broad except with a
narrow catch for the expected import/IO failures (e.g., ImportError,
ModuleNotFoundError, OSError) around the import and increment call for
local_counters/_local_increment and keep the non-fatal logger.debug for those
specific exceptions; refer to symbols local_counters, _local_increment,
skill_name, properties and logger.debug when making the change.
In `@tests/test_consent_notice.py`:
- Around line 196-200: The test claims the local counter should increment but
never checks it: before performing the action that triggers the local increment,
snapshot the local counter value from the local_counters module (e.g.
local_counters.counter or a get_count()/COUNTER symbol in that module), then
after importlib.reload(local_counters) assert the new value equals the prior
value + 1; keep the existing relay assertion (sent == []) and place this counter
assertion just after the reload so it verifies the local increment occurred
while relay sending remained blocked.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 682c31fa-41d3-4fd4-99c5-fb6db66c4fb9
📒 Files selected for processing (30)
.github/workflows/test-mcp-regression.ymlCHANGELOG.mdassets/dashboard.htmlcli/_link_commit_runner.pycli/link_commit_cli.pyconsent.pydocs/BACKLOG.mdevents/materializer.pyevents/team_adapter.pyevents/writer.pyhandlers/ratify.pyhandlers/resolve_collision.pyhandlers/usage_summary.pyledger/adapter.pyledger/queries.pylocal_counters.pyscripts/sim_issue_108_flows.pyserver.pysetup_wizard.pyskills/bicameral-capture-corrections/SKILL.mdskills/bicameral-dashboard/SKILL.mdtelemetry.pytests/conftest.pytests/test_consent_notice.pytests/test_event_writer.pytests/test_hook_command_registration.pytests/test_link_commit_cli.pytests/test_local_counters.pytests/test_team_event_replay.pytests/test_usage_summary.py
| ### Closes | ||
|
|
||
| #39, #42. |
There was a problem hiding this comment.
Fix the closes reference formatting.
#39 / #42 at the start of the line is parsed as a heading, which is what markdownlint is warning about. Make it a bullet or plain text so the release note renders and lints cleanly.
Suggested fix
-#39, `#42`.
+Closes `#39`, `#42`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ### Closes | |
| #39, #42. | |
| ### Closes | |
| Closes `#39`, `#42`. |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 70-70: No space after hash on atx style heading
(MD018, no-missing-space-atx)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CHANGELOG.md` around lines 68 - 70, The "### Closes" heading followed by a
line starting with "#39, `#42`." is being parsed as a markdown heading; change the
reference line so it does not start with a hash: under the "### Closes" heading
replace the raw "#39, `#42`." line with a plain text or bullet-style entry such as
"Closes: `#39`, `#42`" or "- Closes `#39` and `#42`" so markdownlint no longer treats it
as a heading.
| def _should_notify() -> bool: | ||
| """True iff the notice has not been emitted for the current policy version.""" | ||
| if os.getenv("BICAMERAL_SKIP_CONSENT_NOTICE", "").strip() == "1": | ||
| return False | ||
| marker = read_consent() | ||
| if marker is None: | ||
| return True | ||
| return int(marker.get("policy_version", 0)) < POLICY_VERSION |
There was a problem hiding this comment.
Handle malformed policy_version without aborting notification logic.
At Line 109, int(marker.get("policy_version", 0)) can raise ValueError/TypeError for a corrupted or legacy marker. That gets swallowed upstream and silently prevents re-notification/re-stamping.
💡 Proposed fix
def _should_notify() -> bool:
@@
marker = read_consent()
if marker is None:
return True
- return int(marker.get("policy_version", 0)) < POLICY_VERSION
+ try:
+ marker_version = int(marker.get("policy_version", 0))
+ except (TypeError, ValueError):
+ return True
+ return marker_version < POLICY_VERSION🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@consent.py` around lines 102 - 109, The conversion of
marker.get("policy_version", 0) to int in _should_notify can raise
ValueError/TypeError for malformed markers and cause silent suppression of the
notification; update _should_notify to safely parse the stored policy_version by
wrapping the int conversion in a try/except (catch ValueError and TypeError) and
treating invalid values as 0 (or otherwise forcing re-notification) so the
function returns True when the marker is corrupted; reference the _should_notify
function, the marker variable and POLICY_VERSION when making this change.
| # Cosmetic drift: count compliance_check verdicts of cosmetic_autopass | ||
| # over total drift verdicts in the window. | ||
| try: | ||
| cc_rows = await client.query( | ||
| "SELECT verdict, count() AS n FROM compliance_check " | ||
| f"WHERE checked_at > <datetime>'{cutoff}' " | ||
| "AND verdict IN ['drifted', 'cosmetic_autopass'] GROUP BY verdict" | ||
| ) | ||
| cc_counts = { | ||
| r.get("verdict"): int(r.get("n", 0)) for r in (cc_rows or []) | ||
| } | ||
| cosmetic = cc_counts.get("cosmetic_autopass", 0) | ||
| drift_total = cosmetic + cc_counts.get("drifted", 0) | ||
| if drift_total > 0: | ||
| base["cosmetic_drift_pct"] = round(cosmetic / drift_total, 4) |
There was a problem hiding this comment.
cosmetic_drift_pct is computed from a verdict value not present in schema.
The query uses cosmetic_autopass, but ledger/schema.py defines compliance_check.verdict as compliant | drifted | not_relevant. As written, cosmetic count will stay zero and the metric is wrong.
💡 Proposed alignment with current schema
try:
cc_rows = await client.query(
"SELECT verdict, count() AS n FROM compliance_check "
f"WHERE checked_at > <datetime>'{cutoff}' "
- "AND verdict IN ['drifted', 'cosmetic_autopass'] GROUP BY verdict"
+ "AND verdict IN ['drifted', 'not_relevant'] GROUP BY verdict"
)
cc_counts = {
r.get("verdict"): int(r.get("n", 0)) for r in (cc_rows or [])
}
- cosmetic = cc_counts.get("cosmetic_autopass", 0)
+ cosmetic = cc_counts.get("not_relevant", 0)🧰 Tools
🪛 Ruff (0.15.12)
[error] 88-90: Possible SQL injection vector through string-based query construction
(S608)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@handlers/usage_summary.py` around lines 84 - 98, The query in
handlers/usage_summary.py is counting a verdict value "cosmetic_autopass" that
doesn't exist in the schema (compliance_check.verdict is "compliant | drifted |
not_relevant"), so cosmetic will always be zero; update the query and counts to
use the schema-valid verdict (e.g., replace "cosmetic_autopass" with
"not_relevant") and adjust the variable references (cc_rows, cc_counts,
cosmetic, drift_total, base["cosmetic_drift_pct"]) accordingly so
cosmetic_drift_pct is computed from the actual verdict values present in
compliance_check.
| # ── Decision signoff write path (#97 event vocabulary) ──────────── | ||
| # Both methods are idempotent so the materializer can replay them | ||
| # safely. Handlers do their own pre-write idempotency / collision | ||
| # checks; the adapter just performs the write and re-projects status. | ||
|
|
||
| async def apply_ratify(self, decision_id: str, signoff: dict) -> str: | ||
| """Write a ratify/reject signoff and re-project the decision's status. | ||
|
|
||
| Idempotent. Returns the projected decision status after the write. | ||
| """ | ||
| await self._ensure_connected() | ||
| await self._client.query( | ||
| f"UPDATE {decision_id} SET signoff = $signoff", | ||
| {"signoff": signoff}, | ||
| ) | ||
| projected = await project_decision_status(self._client, decision_id) | ||
| await update_decision_status(self._client, decision_id, projected) | ||
| return projected | ||
|
|
||
| async def apply_supersede( | ||
| self, | ||
| new_id: str, | ||
| old_id: str, | ||
| signer: str = "", | ||
| signoff_note: str = "", | ||
| superseded_at: str = "", | ||
| session_id: str = "", | ||
| ) -> dict: | ||
| """Write the supersedes edge and freeze the old decision's signoff. | ||
|
|
||
| Idempotent: ``relate_supersedes`` upserts the edge and the signoff | ||
| UPDATE is a full overwrite. Returns ``{"old_status": "superseded"}``. | ||
| """ | ||
| await self._ensure_connected() | ||
| await relate_supersedes( | ||
| self._client, | ||
| new_id, | ||
| old_id, | ||
| confidence=1.0, | ||
| reason=( | ||
| f"human-confirmed supersession via resolve_collision session={session_id}" | ||
| ), | ||
| ) | ||
| rows = await self._client.query(f"SELECT signoff FROM {old_id} LIMIT 1") | ||
| old_signoff: dict = {} | ||
| if rows and isinstance(rows[0], dict): | ||
| old_signoff = rows[0].get("signoff") or {} | ||
| await self._client.execute( | ||
| f"UPDATE {old_id} SET signoff = $s", | ||
| { | ||
| "s": { | ||
| **old_signoff, | ||
| "state": "superseded", | ||
| "superseded_by": new_id, | ||
| "superseded_at": superseded_at, | ||
| "session_id": session_id, | ||
| "signer": signer, | ||
| "note": signoff_note, | ||
| } | ||
| }, | ||
| ) | ||
| return {"old_status": "superseded"} |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Confirm the new adapter methods are the only new write-path callers and
# that record-id validation is present where needed.
rg -nP 'apply_ratify\(|apply_supersede\(|relate_supersedes\(' ledger handlers events
rg -nP '_validated_record_id\(' ledgerRepository: BicameralAI/bicameral-mcp
Length of output: 1535
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check current imports in adapter.py
echo "=== Current imports in ledger/adapter.py (first 50 lines) ==="
head -50 ledger/adapter.py | cat -n
echo -e "\n=== Check if _validated_record_id is imported ==="
rg 'from .queries import|_validated_record_id' ledger/adapter.py | head -20Repository: BicameralAI/bicameral-mcp
Length of output: 1877
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Read the exact code in apply_ratify and apply_supersede
echo "=== apply_ratify and apply_supersede code (lines 1120-1181) ==="
sed -n '1120,1181p' ledger/adapter.py | cat -nRepository: BicameralAI/bicameral-mcp
Length of output: 2943
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check what _validated_record_id does
echo "=== _validated_record_id implementation ==="
sed -n '1479,1510p' ledger/queries.py | cat -nRepository: BicameralAI/bicameral-mcp
Length of output: 1471
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if relate_supersedes uses validation
echo "=== relate_supersedes implementation (around line 1310) ==="
sed -n '1310,1350p' ledger/queries.py | cat -nRepository: BicameralAI/bicameral-mcp
Length of output: 1809
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check how project_decision_status and update_decision_status handle the id
echo "=== project_decision_status implementation ==="
rg -A 20 'async def project_decision_status' ledger/queries.py | head -30
echo -e "\n=== update_decision_status implementation ==="
rg -A 20 'async def update_decision_status' ledger/queries.py | head -30Repository: BicameralAI/bicameral-mcp
Length of output: 1848
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for safe patterns with WHERE and bound parameters
echo "=== Examples of WHERE id = \$param patterns ==="
rg 'WHERE.*id\s*=\s*\$' ledger/queries.py | head -10
echo -e "\n=== General patterns with bound params ==="
rg 'UPDATE.*WHERE' ledger/queries.py | head -10Repository: BicameralAI/bicameral-mcp
Length of output: 552
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there are any existing safe record-id handling patterns in adapter.py
echo "=== How other adapter methods handle record IDs ==="
rg -B 2 -A 2 'SELECT|UPDATE' ledger/adapter.py | grep -E '(SELECT|UPDATE|\$)' | head -20Repository: BicameralAI/bicameral-mcp
Length of output: 716
Guard the new write paths against record-id injection.
apply_ratify() and apply_supersede() forward caller-controlled ids directly into SurrealQL strings, creating a write-path injection surface. The issue extends to their downstream dependencies: update_decision_status() (called from apply_ratify) and relate_supersedes() (called from apply_supersede) both use the same unsafe pattern. The codebase already validates record IDs with _validated_record_id() and uses bound parameters elsewhere (e.g., WHERE canonical_id = $cid). Apply the same validation before query construction in all affected methods.
Suggested fix scope
from .queries import (
decision_exists,
delete_binds_to_edge,
+ _validated_record_id,
find_subject_identities_for_decision,Apply validation in apply_ratify(), apply_supersede(), update_decision_status(), and relate_supersedes() before any query building.
🧰 Tools
🪛 Ruff (0.15.12)
[error] 1132-1132: Possible SQL injection vector through string-based query construction
(S608)
[error] 1163-1163: Possible SQL injection vector through string-based query construction
(S608)
[error] 1168-1168: Possible SQL injection vector through string-based query construction
(S608)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ledger/adapter.py` around lines 1120 - 1181, The write paths use
caller-controlled record ids directly in SurrealQL f-strings causing injection;
before constructing any query validate/sanitize record ids using the existing
_validated_record_id() helper (or equivalent) and replace f-string-inlined ids
with validated values or parameterized placeholders. Apply this to
apply_ratify(decision_id), apply_supersede(new_id, old_id), and in the
downstream helpers update_decision_status(...) and relate_supersedes(...) so
every occurrence that builds SQL like f"UPDATE {id} ..." or f"SELECT ... FROM
{id}" first calls _validated_record_id(id) and then uses that result (or a bound
parameter) when composing the query. Ensure both new_id and old_id are validated
in apply_supersede and propagate validated ids into calls to
relate_supersedes/update_decision_status.
| _COUNTERS_FILE = Path.home() / ".bicameral" / "counters.jsonl" | ||
| _OFF_VALUES = frozenset({"0", "false", "no", "off"}) | ||
| _LOCK = threading.Lock() | ||
|
|
||
|
|
||
| def _enabled() -> bool: | ||
| val = os.getenv("BICAMERAL_LOCAL_COUNTERS", "1").strip().lower() | ||
| return val not in _OFF_VALUES |
There was a problem hiding this comment.
Resolve the counters path at call time, not import time.
Path.home() is captured once here, so later HOME/USERPROFILE overrides and the new temp-dir test isolation never take effect unless the whole module is reloaded. That makes the sink stick to the wrong user directory in long-lived processes and breaks the isolation these tests are trying to guarantee.
Suggested fix
-_COUNTERS_FILE = Path.home() / ".bicameral" / "counters.jsonl"
+def _counters_file() -> Path:
+ return Path.home() / ".bicameral" / "counters.jsonl"
...
- _COUNTERS_FILE.parent.mkdir(parents=True, exist_ok=True)
+ counters_file = _counters_file()
+ counters_file.parent.mkdir(parents=True, exist_ok=True)
...
- with _open_for_append_secure(_COUNTERS_FILE) as f:
+ with _open_for_append_secure(counters_file) as f:
...
- if not _COUNTERS_FILE.exists():
+ counters_file = _counters_file()
+ if not counters_file.exists():
...
- with open(_COUNTERS_FILE, "rb") as f:
+ with open(counters_file, "rb") as f:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@local_counters.py` around lines 34 - 41, Currently _COUNTERS_FILE is resolved
at import time using Path.home(), which prevents later HOME/USERPROFILE changes
from taking effect; replace the module-level _COUNTERS_FILE constant with a
function (e.g., _counters_file()) that computes and returns Path.home() /
".bicameral" / "counters.jsonl" at call time and update all uses to call that
function instead (ensure callers that write create parent directory as needed);
leave _LOCK and _enabled() as-is.
| Tool( | ||
| name="bicameral.usage_summary", | ||
| description=( | ||
| "Aggregate operational readout — counts and percentages over the last N days. " | ||
| "Returns ingest_calls, bind_calls_total, decision counts by status, " | ||
| "reflected/drift/cosmetic_drift percentages, and error_rate. " | ||
| "Privacy-preserving: aggregates only, no event rows, no user content. " | ||
| "Read-only over the local ledger plus the local-only counters file." | ||
| ), | ||
| inputSchema={ | ||
| "type": "object", | ||
| "properties": { | ||
| "days": { | ||
| "type": "integer", | ||
| "description": "Window size in days (default 7). Pass 0 for tool-call counts only.", | ||
| "default": 7, | ||
| }, | ||
| }, | ||
| }, | ||
| ), |
There was a problem hiding this comment.
Add bicameral.usage_summary to the smoke-test expectation.
run_smoke_test() still compares the registered tools against the old EXPECTED_TOOL_NAMES list, so this new tool will make the smoke test fail until the list is updated.
Suggested fix
"bicameral.feedback",
+ "bicameral.usage_summary",
"validate_symbols",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server.py` around lines 705 - 724, The smoke test fails because the new Tool
named "bicameral.usage_summary" was added but the static EXPECTED_TOOL_NAMES
used by run_smoke_test() wasn't updated; update the EXPECTED_TOOL_NAMES list
(the constant referenced by run_smoke_test) to include the string
"bicameral.usage_summary" so the test reflects the newly registered Tool and
re-run the smoke tests.
| if name == "bicameral.usage_summary": | ||
| from handlers.usage_summary import handle_usage_summary | ||
| data = await handle_usage_summary(ctx, days=int(arguments.get("days", 7))) | ||
| return [TextContent(type="text", text=json.dumps(data, indent=2))] |
There was a problem hiding this comment.
Run the usual sync before aggregating usage metrics.
This handler returns before the auto-sync block, so bicameral.usage_summary can report stale ledger state right after a commit or ingest. Either move it below the sync path or call ensure_ledger_synced(ctx) here so the readout matches the rest of the server tools.
Suggested fix
if name == "bicameral.usage_summary":
+ from handlers.sync_middleware import ensure_ledger_synced
+ await ensure_ledger_synced(ctx)
from handlers.usage_summary import handle_usage_summary
data = await handle_usage_summary(ctx, days=int(arguments.get("days", 7)))
return [TextContent(type="text", text=json.dumps(data, indent=2))]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server.py` around lines 870 - 873, The bicameral.usage_summary branch returns
before the auto-sync, risking stale data; call ensure_ledger_synced(ctx) before
invoking handle_usage_summary or move the entire block below the existing
auto-sync path so reads occur after the usual sync; specifically, insert a call
to ensure_ledger_synced(ctx) just before the await handle_usage_summary(ctx,
...) (or relocate the if name == "bicameral.usage_summary" block to follow the
auto-sync logic) so the usage_summary reflects the current ledger state.
…cameral-sync Scope-cut from #135's original L2 proposal (--auto-resolve-trivial flag on link_commit). Design enumeration produced 7 options; all required either an LLM in the deterministic core (violating the "selection over generation" guardrail) or trivial-cases enumeration with non-zero false-positive risk. Cut: accept the architectural limit. Post-commit hook stays sync-only. Resolution path = dashboard tooltip on status === 'pending' rows → user runs /bicameral-sync in their Claude Code session. No code is auto-resolved. assets/dashboard.html: renderStateCell() ternary at line 455 → if/else if. New 'pending' branch attaches tooltip text "Pending compliance — run /bicameral-sync in your Claude Code session to resolve." Reuses existing data-tip CSS pattern (lines 187–198, hover transitions). Static string literal — no esc() needed (no HTML special chars). skills/bicameral-dashboard/SKILL.md: One bullet under Notes documenting the tooltip nudge contract. Per pilot/mcp/CLAUDE.md "tool changes ship with skill updates" rule (UI behavior changed; tool response shape unchanged). Section 4 razor: renderStateCell 19 LOC (cap 40), nesting 1 (cap 3), nested ternaries 0. Replaced ternary with if/else if — improves razor score, doesn't degrade it. Verification: manual (no automated test added — dashboard.html has zero existing test infrastructure; UI test harness absent; PR description includes manual verification step). Acknowledged advisory in Entry #24 audit. Refs #135 (close post-merge with scope-cut comment). Refs BicameralAI/bicameral#108 (Flow 3 spec edit, post-merge gh action). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit febb0aa)
The simulation (scripts/sim_issue_108_flows.py) walks all six canonical flows from BicameralAI/bicameral#108 against the live bicameral-mcp implementation on dev. All 6 PASS post-#135-triage merge: Flow 1 PASS ingest → ratify; supersession_candidates absent (corrected) Flow 2 PASS region-anchored preflight (current contract; topic-BM25 removed) Flow 3 PASS full V1 path: ingest→ratify→bind→commit→link_commit→reflect Flow 3a PASS branch ephemeral; switch-to-main → drifted (no phantom reflect) Flow 4 PASS capture-corrections; agent_session source round-trips Flow 5 PASS history exposes both axes (status × signoff_state) Two spec drifts surfaced and fixed forward: 1. Flow 2 step 1 — spec said "BM25 search on the topic". Reality: v0.10.0 removed topic-BM25 from handle_preflight (see docs/preflight-failure-scenarios.md §intro). Current behaviour is region-anchored lookup via file_paths + HITL surfacing (unresolved_collisions, context_pending_ready). The caller LLM reads bicameral.history() and reasons over it for topic-relevance. Spec text correction queued as post-merge gh issue edit on #108. 2. Flow 4 step 3 — spec said source="conversation". Implementation's _SOURCE_TYPE_MAP (handlers/history.py) does NOT include "conversation" — it falls through to "manual". Canonical value for AI-surfaced session decisions is "agent_session". This commit corrects the capture-corrections skill (which was instructing callers to use the silently-broken "conversation" value) to use "agent_session". Spec text correction queued as post-merge gh issue edit on #108. Both spec corrections are external gh actions (gh issue edit) that fire post-merge once this PR lands on dev — same pattern as #135 triage. Closes the original ask in this session: validate #108 flows end-to-end on dev. Triage #135 (PR #138, merged eaf97e2) corrected the supersession_candidates wording and added the out-of-session committer paragraph to Flow 3; this PR closes the remaining gaps. Refs #108. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit 2503fe6)
Two fixes for CI: - Apply ruff format (formatting drift on long f-strings + dict trailing commas). - Update top-of-file docstring Flow 4 description to match the agent_session correction in the function body (was still "source=conversation" — stale). Verified locally: python3 -m ruff format --check scripts/sim_issue_108_flows.py → 1 file already formatted python3 -m ruff check scripts/sim_issue_108_flows.py → All checks passed! python3 scripts/sim_issue_108_flows.py → all 6 flows PASS Adaptation: scripts/sim_issue_108_flows.py — additional line-wraps applied on triage-from-dev because this branch's pyproject.toml omits a custom line-length (defaults to ruff's 88), whereas dev has line-length=100. Cherry-picked from dev's format pass (d3fb58c) plus mechanical re-wrap to satisfy triage-from-dev's stricter default. No semantic change. Per DEV_CYCLE.md §10.5.3 adaptation clause. (cherry picked from commit d3fb58c) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace machine-specific absolute path with __file__-relative resolution so the simulation script runs on any developer machine or CI environment. Addresses CodeRabbit review on PR #140. Verified: python3 -m ruff format --check scripts/sim_issue_108_flows.py → already formatted python3 -m ruff check scripts/sim_issue_108_flows.py → all checks passed Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Triage release per DEV_CYCLE §10.5. Forwards three commits from dev: - feat(#135): dashboard tooltip nudges out-of-session committers to /bicameral-sync - feat(#108): end-to-end sim + capture-corrections skill correction - style(#108): ruff format scripts/sim_issue_108_flows.py + docstring sync Real bug fix: capture-corrections skill was instructing callers to use source="conversation" but _SOURCE_TYPE_MAP has no such entry, so it silently fell through to "manual". Skill now uses canonical "agent_session" value; end-to-end simulation confirms round-trip. Full triage provenance and §10.5.3 adaptation note in PR #140. CHANGELOG headline adds v0.13.6 entry above v0.13.5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c4c0f62 to
ad3e440
Compare
Update — branch rebased onto current
|
| File / line | Status | Reasoning |
|---|---|---|
scripts/sim_issue_108_flows.py:29 (hardcoded path) |
✅ Fixed in 6163002 |
In-scope: code introduced by this triage. Replaced absolute path with pathlib.Path(__file__).resolve().parents[1]. |
CHANGELOG.md:70 (#39, \#42`.` parsed as heading) |
✅ Resolved by rebase | The flagged line was inside the stale v0.14.0 entry that the rebase replaced with v0.13.5 + v0.13.6 entries. Section now uses Closes [#135](...) bullet form. |
consent.py:109 (malformed policy_version aborts notify) |
⏭️ Out of scope | Code already on main since v0.13.5 (PR #128). Triage forwards specific commits per §10.5.1 — new fixes belong on dev, not in a triage release. Filing as a follow-up. |
events/materializer.py:97-146 (drops unresolved peer events) |
⏭️ Out of scope | Same — already on main from v0.13.5's #97 work. Real correctness concern; filing for dev. |
handlers/usage_summary.py:88-98 (cosmetic_autopass not in schema) |
⏭️ Out of scope | Same — already on main. CodeRabbit is right that the verdict literal is wrong (compliance_check.verdict is compliant | drifted | not_relevant); will file. |
ledger/adapter.py:1132-1181 (record-id injection in apply_ratify / apply_supersede) |
⏭️ Out of scope | Same — already on main from v0.13.5's #97 work. Will file as a security-track follow-up to apply _validated_record_id() consistently. |
local_counters.py:34-41 (resolves Path.home() at import time) |
⏭️ Out of scope | Same — already on main from v0.13.5's #95 work. Affects test isolation; will file. |
server.py:705-724 (bicameral.usage_summary missing from EXPECTED_TOOL_NAMES) |
⏭️ Out of scope | Same — already on main. Trivial one-liner; will file. |
server.py:870-873 (no ensure_ledger_synced() before usage_summary) |
⏭️ Out of scope | Same — already on main. Real staleness concern; will file. |
The five "out of scope" items are real findings against code already shipped in v0.13.5. Per §10.5.1, triage releases forward dev commits with minimal adaptation rather than introducing new fixes — those belong on dev. I'll open a tracking issue listing them so they don't fall through.
Replace machine-specific absolute path with __file__-relative resolution so the simulation script runs on any developer machine or CI environment. Addresses CodeRabbit review on PR #140. Verified: python3 -m ruff format --check scripts/sim_issue_108_flows.py → already formatted python3 -m ruff check scripts/sim_issue_108_flows.py → all checks passed Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Triage release per
docs/DEV_CYCLE.md§10.5.4. Three commits forwarded fromdevtomainviatriage-from-dev. Patch version bumps; minor stays pinned to v0.13.x.Cherry-picked commits
febb0aa76213662503fe695edbb7source="conversation"→"agent_session"is a real bug fix (callers were using a value silently mapped to "manual"); sim is test infrastructure with zero API surfaced3fb58cc4c0f62All three picks used
git cherry-pick -xto record dev-sha provenance. The third commit carries anAdaptation:trailer per §10.5.3 because triage-from-dev'spyproject.tomldoesn't customizeline-length, defaulting to ruff's 88 (vs dev's 100). No semantic change — purely additional line-wraps.Closes (auto-close on merge)
Skipped commits (not triage-eligible per §10.5.1)
eaf97e29bea6e4dev → mainreleaseThe release manager (this PR's merger) decides whether to expand scope before merging.
Verification on triage-from-dev
Expected:
(Confirmed locally on
c4c0f62.)Skill fix verification:
grep -c 'source="conversation"' skills/bicameral-capture-corrections/SKILL.md→ 0 (was 2 on bb76ad5).grep -c 'source="agent_session"'→ 2.Risk
L1 — UI feature + skill text fix + test-infra additions. No new code paths, no schema changes, no contract changes, no API surface changes. The §10.5.3 adaptation is mechanical (line-wrap re-flow only).
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
bicameral.usage_summaryreporting tool with decision and call metricslink_commitCLI subcommand for decision linkingBug Fixes
/bicameral-syncDocumentation
Tests