feat(privacy): preflight render-attribution + bypass-tracking config knobs (#200 Phase 3) #208
Merged
…knobs (#200 Phase 3)

Closes A4 of #200's preflight findings via two new deterministic config gates that filter what data flows from server to agent / disk.

Deterministic gate #1: render_source_attribution
- New BicameralContext.render_source_attribution field (modes: full, redacted (default), hidden) read from .bicameral/config.yaml at config-load.
- New handlers.preflight._apply_attribution_policy applies the mode to DecisionMatch.source_ref before the response leaves the server. The agent receives pre-filtered output and renders verbatim.
  - full: legacy passthrough
  - redacted: replace `[A-Z][a-z]+` (names) with <NAME_REDACTED>, `\d{4}-\d{2}-\d{2}` (dates) with <DATE_REDACTED>; preserves structural shape so operator sees "<NAME_REDACTED> review · <NAME_REDACTED>, <DATE_REDACTED>" instead of full attribution
  - hidden: blank source_ref entirely

Deterministic gate #2: preflight_bypass_tracking
- New BicameralContext.preflight_bypass_tracking field (modes: enabled (default), disabled).
- handlers.record_bypass.handle_record_bypass short-circuits BEFORE the preflight_telemetry.write_bypass_event call when ctx setting is disabled. Returns recorded=False, reason="tracking_disabled". When disabled, ~/.bicameral/preflight_events.jsonl gets no writes; engine's recency read sees no events → no escalation drop, which matches the user's privacy choice.

Config schema:
- context.py refactored to share a generic _read_yaml_string_field helper across the three Phase 2/3 config readers (signer_email_fallback, render_source_attribution, preflight_bypass_tracking). Each is a fail-soft string-with-valid-set read with documented privacy-positive defaults.
- setup_wizard.py writes both new fields to fresh .bicameral/config.yaml with their defaults.

SKILL.md updates:
- skills/bicameral-preflight/SKILL.md: telemetry transparency note added above HITL prompts; render_source_attribution doc inserted inline ("the deterministic gate is the config field, not this instruction"); preflight_bypass_tracking note appended to the bypass-semantics list.

Tests (5 new):
- tests/test_preflight_render_source_attribution.py: 3 tests for full/redacted/hidden modes against the pure helper.
- tests/test_preflight_bypass_tracking.py: 2 tests for enabled/disabled paths through the handler with monkeypatched preflight_telemetry.

10/10 Phase 2+3 tests pass; ruff/mypy/format clean.

Phase 3 of plan-200-skills-audit-hardening (PR C of three; completes the plan). After merge, the queued workstream is the e2e security audit across all #205 compliance standards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
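The three modes of gate #1 can be sketched as a pure string helper. This is a minimal illustration under stated assumptions, not the code in `handlers.preflight`: the function name `apply_attribution_policy`, its signature, and the choice to raise on an unknown mode are hypothetical; only the two patterns, the placeholders, and the mode names come from the commit message.

```python
import re

# Patterns as quoted in the commit message (v1 behavior).
NAME_RE = re.compile(r"[A-Z][a-z]+")
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def apply_attribution_policy(source_ref: str, mode: str) -> str:
    """Filter a source_ref string according to the configured mode.

    Hypothetical stand-in for the server-side helper: the real one is
    described as operating on DecisionMatch.source_ref.
    """
    if mode == "full":
        # Legacy passthrough: return the attribution unchanged.
        return source_ref
    if mode == "hidden":
        # Blank the attribution entirely.
        return ""
    if mode == "redacted":
        # Swap capitalized words and ISO dates for placeholders while
        # keeping separators, so the structural shape survives.
        without_names = NAME_RE.sub("<NAME_REDACTED>", source_ref)
        return DATE_RE.sub("<DATE_REDACTED>", without_names)
    # Assumption: reject unknown modes here; the commit says invalid
    # values are handled fail-soft earlier, at config load.
    raise ValueError(f"unknown render_source_attribution mode: {mode!r}")


if __name__ == "__main__":
    ref = "Design review · Ian, 2026-03-12"
    for mode in ("full", "redacted", "hidden"):
        print(mode, "->", repr(apply_attribution_policy(ref, mode)))
```

Because the filtering happens before the response leaves the server, the agent never sees the unfiltered string, which is what makes the gate deterministic rather than instruction-dependent.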
…egex overbroad (#200 Phase 3 fixup)

The v1 redaction regex `re.compile(r'\b[A-Z][a-z]+\b')` replaces ALL capitalized words, including meaningful structural tokens like "Sprint", "Linear", "Slack", "Notion". Result on e2e Flow 3: agent loses source_ref parseability and the downstream binding+commit chain breaks (post-hoc ledger query shows zero compliance_check rows; the agent made only 1 bicameral call instead of the expected preflight+link_commit chain).

The deterministic gate is still in place. Users who want privacy-positive rendering can flip `render_source_attribution: redacted` or `hidden` in `.bicameral/config.yaml`. The default flips to `redacted` once the regex is refined to match only true name/date patterns without stripping platform/tool tokens; tracked in #209.

Concrete example of the v1 overreach:
  Input:  "Sprint 14 architecture review · Ian, 2026-03-12"
  Output: "<NAME_REDACTED> 14 <NAME_REDACTED> <NAME_REDACTED> · <NAME_REDACTED>, <DATE_REDACTED>"

The agent can't parse "what kind of source is this" from the redacted form, so downstream reasoning fails.

PR #208's commit added an inline note in `context.py` explaining the v1 default tradeoff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
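The overreach is easy to reproduce in isolation. The sketch below uses the v1 pattern exactly as quoted in the commit message; the helper name `redact_names_v1` and the sample strings are hypothetical and chosen only to show that a tool token and a person's name collapse to the same placeholder.

```python
import re

# v1 pattern as quoted in the fixup commit message.
V1_NAME_RE = re.compile(r"\b[A-Z][a-z]+\b")


def redact_names_v1(text: str) -> str:
    """Replace every capitalized word with the name placeholder."""
    return V1_NAME_RE.sub("<NAME_REDACTED>", text)


if __name__ == "__main__":
    # Hypothetical source_ref strings built from the tool tokens the
    # commit message lists as casualties of the pattern.
    for ref in ("Slack thread · Ian", "Linear ticket · Ian", "Notion doc · Ian"):
        print(repr(ref), "->", repr(redact_names_v1(ref)))
```

After redaction the three inputs differ only in their lowercase noun, so the platform that identified the source kind is gone along with the person's name.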
Phase 3 (PR C, FINAL) of plan-200-skills-audit-hardening. Closes the preflight A4 findings via two new deterministic config gates.
Deterministic gate #1: `render_source_attribution` (`.bicameral/config.yaml`)
- `full` (legacy verbatim), `redacted` (default; name + date placeholders), `hidden` (blank source_ref)
- `handlers.preflight._apply_attribution_policy`: agent receives pre-filtered output

Deterministic gate #2: `preflight_bypass_tracking` (`.bicameral/config.yaml`)
- `enabled` (default, pre-#200 behavior), `disabled` (no JSONL write)
- `handlers.record_bypass.handle_record_bypass`: short-circuits before disk write

Config schema unification:
- `context.py` refactored to share a generic `_read_yaml_string_field` helper across all three Phase 2/3 readers (`signer_email_fallback`, `render_source_attribution`, `preflight_bypass_tracking`).

SKILL.md updates: telemetry transparency note added above HITL prompts; config-field docs inserted inline; `preflight_bypass_tracking` note appended to bypass-semantics list.

Refs #200 (A4 finding on preflight); refs BicameralAI/bicameral-daemon#34 (governance lift; this PR adopts max-deterministic-where-tractable).
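The fail-soft config read and the gate #2 short-circuit can be sketched together. This is a rough illustration under stated assumptions: `read_string_field`, `Context`, and the injected `write_event` callable are hypothetical names, the helper takes an already-parsed mapping rather than reading YAML itself, and the `{"recorded": True}` return on the enabled path is a guess. Only the field name, its two modes, and the `recorded=False, reason="tracking_disabled"` result come from the PR.

```python
from dataclasses import dataclass
from typing import AbstractSet, Any, Callable, Dict, Mapping


def read_string_field(
    config: Mapping[str, Any], key: str, valid: AbstractSet[str], default: str
) -> str:
    """Fail-soft read: any missing, non-string, or unknown value yields the default."""
    value = config.get(key)
    if isinstance(value, str) and value in valid:
        return value
    return default


@dataclass
class Context:
    # Hypothetical stand-in for BicameralContext, with one field only.
    preflight_bypass_tracking: str = "enabled"


def handle_record_bypass(
    ctx: Context, event: Dict[str, Any], write_event: Callable[[Dict[str, Any]], None]
) -> Dict[str, Any]:
    """Record a bypass event unless tracking is disabled in config."""
    if ctx.preflight_bypass_tracking == "disabled":
        # Short-circuit BEFORE any write, so nothing reaches disk.
        return {"recorded": False, "reason": "tracking_disabled"}
    write_event(event)
    return {"recorded": True}  # assumed shape for the enabled path


if __name__ == "__main__":
    parsed = {"preflight_bypass_tracking": "disabled"}  # e.g. loaded from config.yaml
    mode = read_string_field(
        parsed, "preflight_bypass_tracking", {"enabled", "disabled"}, "enabled"
    )
    written = []
    print(handle_record_bypass(Context(mode), {"kind": "bypass"}, written.append))
    print("events written:", len(written))
```

Injecting the writer as a parameter mirrors how the PR's tests monkeypatch `preflight_telemetry`: the disabled path can be verified by asserting the writer was never called.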
10/10 Phase 2+3 tests pass; ruff/mypy/format clean. Final of three planned PRs from plan-200; plan-200-skills-audit-hardening fully closed when this lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generated with Claude Code.