fix(preflight): refine render_source_attribution regex + flip default (#209) by Knapp-Kevin · Pull Request #239 · BicameralAI/bicameral-mcp

Knapp-Kevin · 2026-05-06T23:12:30Z

Summary

Closes #209. Refines the `render_source_attribution` redaction regex from the broad v1 pattern `\b[A-Z][a-z]+\b` to four POSITIONAL-cue patterns, then flips the default from `full` to `redacted` (privacy-positive per #200 audit finding A4).

Plan / Audit

Plan: plan-B-preflight-attribution-regex-209.md
Audit: round 1 VETO (spec-drift on _PLATFORM_TOKEN_ALLOWLIST; ambiguous test description) → round 2 PASS

What ships

| Surface | Change |
| --- | --- |
| `handlers/preflight.py` | Replace broad `_NAME_PATTERN` with 4 positional-cue patterns: `(?<=· )...`, `(?<=, )...(?=,?\s+\d{4}-\d{2}-\d{2})`, `(?<=^Speaker:\s)...`, `(?<=^From:\s)...`. Multi-word continuation uses `[ \t]+` (not `\s+`) to avoid line-spanning |
| `context.py:28` | `_DEFAULT_RENDER_ATTRIBUTION_MODE = "redacted"` (was `"full"`) |
| `setup_wizard.py:1005` | YAML template flipped: `render_source_attribution: redacted` (was `full`) |
| `setup_wizard.py:972-974` | Banner print message updated to reflect new default and reverse the opt-in direction |
| `tests/test_preflight_attribution_redaction.py` | 10 new functional tests covering all 4 cues + preservation contracts + default-flip lock + fresh-install YAML render |
| `tests/test_preflight_render_source_attribution.py` | 1 pre-existing test updated for the new contract (bare names without positional cues are no longer redacted) |
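The four positional cues can be sketched roughly as follows. This is a minimal illustration rather than the code in `handlers/preflight.py`: the `redact` helper, the `_NAME` fragment name, and the `[name]` / `[date]` placeholder tokens are assumptions made for the example; only the regex shapes come from this PR.

```python
import re

# Shared name shape: capitalized word, optionally continued on the SAME line.
_NAME = r"[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)*"

_NAME_PATTERNS = [
    re.compile(rf"(?<=· ){_NAME}"),                                  # after "· " separator
    re.compile(rf"(?<=, ){_NAME}(?=,?\s+\d{{4}}-\d{{2}}-\d{{2}})"),  # before a date
    re.compile(rf"(?<=^Speaker:\s){_NAME}", re.MULTILINE),           # "Speaker: " line prefix
    re.compile(rf"(?<=^From:\s){_NAME}", re.MULTILINE),              # "From: " line prefix
]
_DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def redact(source_ref: str) -> str:
    """Hypothetical helper: replace cued names, then dates, with placeholders."""
    for pattern in _NAME_PATTERNS:
        source_ref = pattern.sub("[name]", source_ref)  # placeholder is an assumption
    return _DATE_PATTERN.sub("[date]", source_ref)


print(redact("Sprint review · Brian, 2026-03-22"))
# Sprint review · [name], [date]
```

Names are substituted before dates so that the second pattern's date lookahead still sees the original date text.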

Why no platform-token allowlist

A curated allowlist (Sprint, Linear, GitHub, etc.) was considered as defense-in-depth. Rejected per round-1 audit finding 1: the positional-cue patterns require explicit cues to fire by construction. Context tokens like "Sprint" / "Linear" / "GitHub" appearing in <context-words> position never follow these cues, so they survive without an allowlist. Tests directly verify this contract.
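To make the "precise by construction" argument concrete, the sketch below compares the v1 broad pattern with one of the positional-cue patterns on the canonical attribution shape. The variable names are illustrative only and are not taken from the repository.

```python
import re

# v1 broad pattern: any capitalized word, regardless of position.
V1_BROAD = re.compile(r"\b[A-Z][a-z]+\b")
# Positional cue: a name only counts when it directly follows the "· " separator.
AFTER_SEPARATOR = re.compile(r"(?<=· )[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)*")

ref = "Sprint review · Brian, 2026-03-22"

broad_hits = V1_BROAD.findall(ref)
cue_hits = AFTER_SEPARATOR.findall(ref)

print(broad_hits)  # ['Sprint', 'Brian']
print(cue_hits)    # ['Brian']
```

The context word "Sprint" is caught by the broad pattern but never sits behind a cue, which is why it survives without being listed anywhere.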

Test plan

10 new functional tests pass (tests/test_preflight_attribution_redaction.py)
96/96 preflight + setup_wizard regression tests pass (1 pre-existing test updated for new contract)
ruff check + ruff format --check clean
No new dependencies

Acceptance per #209

Refined regex matches names + dates without stripping platform/tool tokens
New unit tests against representative real-world source_ref strings confirm preserved-vs-redacted boundaries
Default flipped to redacted in both context.py default and setup_wizard fresh-install YAML
e2e Flow 3 still passes after the flip (CI will validate)

OWASP A04 positive contribution

Closes a fail-open privacy posture: the prior full default leaked names + dates verbatim to the agent's chat surface. The deterministic gate is in place; flipping the default is the privacy-positive move directed by #200 audit finding A4.

🤖 Generated with Claude Code

…nt (#209)

Plan B addresses #209: the v1 broad redaction regex (`\b[A-Z][a-z]+\b`) over-matches every capitalized token, including platform/tool names (Sprint, Linear, GitHub, etc.), breaking the agent's structural parsing of `source_ref`. Plan B replaces it with four POSITIONAL-cue patterns that require explicit cues (`· `, `, ` adjacent to a date, `^Speaker:\s`, `^From:\s`) to fire. Context tokens never follow these cues by construction, so no allowlist is needed.

After the refinement, flip the default in both:
- context._DEFAULT_RENDER_ATTRIBUTION_MODE: "full" → "redacted"
- setup_wizard YAML template: "render_source_attribution: full" → "redacted"

Audit: round 1 VETO (2 binding: spec-drift on `_PLATFORM_TOKEN_ALLOWLIST` declared without consumer; test-failure on ambiguous test description) → round 2 PASS after dropping the allowlist (Path A: positional patterns are precise by construction) and rewriting the test to be unambiguously functional (invoke `_write_collaboration_config`, assert on rendered YAML).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…l cues (#209)

Replace the v1 broad name-redaction regex with four positional-cue patterns:
- `(?<=· )[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)*`: name after `· ` separator
- `(?<=, )[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)*(?=,?\s+\d{4}-\d{2}-\d{2})`: name before date
- `(?<=^Speaker:\s)[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)*` (re.MULTILINE)
- `(?<=^From:\s)[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)*` (re.MULTILINE)

Names match only after explicit cues; context tokens (Sprint, Linear, GitHub, etc.) never follow these cues so they survive without an allowlist. Multi-word continuation uses `[ \t]+` (not `\s+`) to avoid swallowing text on subsequent lines through `\n`. Date pattern unchanged (`\b\d{4}-\d{2}-\d{2}\b` is correct).

10 new functional tests in `tests/test_preflight_attribution_redaction.py` covering the 4 positional cues, platform-token preservation, capitalized context-word preservation, no-attribution-shape passthrough, full + hidden mode regression locks, default-flip lock, and fresh-install YAML render.

Updated 1 pre-existing test in `tests/test_preflight_render_source_attribution.py` to use the canonical attribution shape (`Sprint review · Brian, 2026-03-22`) since bare names without positional cues are no longer redacted by design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
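The `[ \t]+` versus `\s+` choice for multi-word continuation is easiest to see on a two-line input. The following is a small sketch; `TIGHT`, `LOOSE`, and the sample text are made up for illustration.

```python
import re

# Continuation restricted to spaces/tabs: the name cannot cross a newline.
TIGHT = re.compile(r"(?<=^From:\s)[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)*", re.MULTILINE)
# Continuation with \s+: a newline counts as whitespace, so the match can spill over.
LOOSE = re.compile(r"(?<=^From:\s)[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", re.MULTILINE)

text = "From: Brian\nSprint planning notes"

tight_hits = TIGHT.findall(text)
loose_hits = LOOSE.findall(text)

print(tight_hits)  # ['Brian']
print(loose_hits)  # ['Brian\nSprint']
```

With `\s+`, the capitalized first word of the next line is absorbed into the "name" and would be redacted along with it.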

Privacy-positive default flip per #200 audit finding A4. The deterministic gate (refined regex in handlers/preflight.py) is now precise enough to preserve agent structural parsing while redacting names and dates.

Two sources-of-truth flip in lockstep:
- context._DEFAULT_RENDER_ATTRIBUTION_MODE: "full" → "redacted" (loaded-code default when YAML is missing/malformed)
- setup_wizard.py YAML template at line 1005: "render_source_attribution: full" → "render_source_attribution: redacted" (fresh-install default written to .bicameral/config.yaml)

Banner print message updated to reflect the new default and reverse the opt-in direction (was "flip to redacted/hidden", now "flip to full/hidden"). Operators who relied on the verbatim default can opt back via `render_source_attribution: full` in `.bicameral/config.yaml`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
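A fail-closed default of this kind might look like the sketch below. It assumes the YAML has already been parsed into a Python object; `resolve_render_mode` and `_VALID_MODES` are hypothetical names, and the real loader in `context.py` may be structured differently.

```python
_DEFAULT_RENDER_ATTRIBUTION_MODE = "redacted"
# The three mode names mentioned in this PR; the set itself is an assumption.
_VALID_MODES = frozenset({"full", "redacted", "hidden"})


def resolve_render_mode(config: object) -> str:
    """Hypothetical resolver: fall back to the redacted default on any bad input."""
    if not isinstance(config, dict):
        # Missing or malformed config (e.g. None, or a YAML scalar/list at top level).
        return _DEFAULT_RENDER_ATTRIBUTION_MODE
    mode = config.get("render_source_attribution")
    if isinstance(mode, str) and mode in _VALID_MODES:
        return mode
    return _DEFAULT_RENDER_ATTRIBUTION_MODE


print(resolve_render_mode(None))                                   # redacted
print(resolve_render_mode({"render_source_attribution": "full"}))  # full
```

Operators who want verbatim attribution have to ask for it explicitly; every other path lands on `redacted`.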

coderabbitai · 2026-05-06T23:12:37Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e49400cf-0490-48cd-9853-74885fe5c7cb

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


ruff format CI on PR #239 flagged context.py for reformatting. The default-flip comment line `_DEFAULT_RENDER_ATTRIBUTION_MODE = "redacted" # #209: ...` exceeded ruff's preferred line length and was wrapped. Pure formatter pass; no semantic changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…icameralAI#225 + BicameralAI#226)

Plan for the three compliance-posture stance declarations:
- BicameralAI#220 / MCP-01: MCP host UX dependency (OWASP LLM-07)
- BicameralAI#225 / NIST-RMF-01 + AI-ACT-02: prohibited-uses declaration
- BicameralAI#226 / SOC2-02: availability stance (operator-run-only)

All three bundle naturally because they share docs/policies/ + a single README cross-reference section. Pure-doc surface fully disjoint from in-flight code PRs (BicameralAI#237, BicameralAI#238, BicameralAI#239); safe as a parallel PR.

Audit: round 1 PASS (L1, doc-only). Doctrine interpretation locked: for markdown policy artifacts, the unit IS the document content; read_text() + assert "<commitment>" in content is genuine unit invocation per qor/references/doctrine-test-functionality.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
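The `read_text()` plus `assert "<commitment>" in content` style of document test could be sketched as below. The file name, the policy text, and the `policy_states` helper are invented for the example; the actual policy files under docs/policies/ are not shown in this PR.

```python
import tempfile
from pathlib import Path


def policy_states(path: Path, commitment: str) -> bool:
    """Hypothetical check: the policy document literally contains the commitment."""
    return commitment in path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in policy file; real tests would point at the checked-in document.
    policy = Path(tmp) / "availability.md"
    policy.write_text(
        "# Availability stance\n\nThis project is operator-run-only.\n",
        encoding="utf-8",
    )
    found = policy_states(policy, "operator-run-only")
    missing = policy_states(policy, "hosted SLA")

print(found, missing)  # True False
```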

Knapp-Kevin and others added 3 commits May 6, 2026 19:11

Knapp-Kevin had a problem deploying to recording-approval May 6, 2026 23:12 — with GitHub Actions Failure

Knapp-Kevin temporarily deployed to ci-test May 6, 2026 23:12 — with GitHub Actions Inactive

Knapp-Kevin temporarily deployed to production May 6, 2026 23:12 — with GitHub Actions Inactive

Knapp-Kevin temporarily deployed to ci-test May 6, 2026 23:12 — with GitHub Actions Inactive

Knapp-Kevin temporarily deployed to ci-test May 6, 2026 23:51 — with GitHub Actions Inactive

Knapp-Kevin had a problem deploying to recording-approval May 6, 2026 23:51 — with GitHub Actions Failure

Knapp-Kevin temporarily deployed to ci-test May 6, 2026 23:51 — with GitHub Actions Inactive

Knapp-Kevin temporarily deployed to production May 6, 2026 23:51 — with GitHub Actions Inactive

Knapp-Kevin temporarily deployed to ci-test May 6, 2026 23:51 — with GitHub Actions Inactive

Knapp-Kevin mentioned this pull request May 6, 2026

docs(compliance): bundle stance declarations #220 + #225 + #226 #240

Merged

4 tasks

Knapp-Kevin merged commit b470658 into dev May 7, 2026
7 of 8 checks passed

Knapp-Kevin deleted the 209-preflight-attribution-regex-refinement branch May 7, 2026 00:06

Knapp-Kevin mentioned this pull request May 27, 2026

skill(preflight): refine regex + flip default #209

Closed

4 tasks

Knapp-Kevin merged 4 commits into dev from 209-preflight-attribution-regex-refinement
