Skip to content

Storage abstraction, soft delete, secure deployment, README reframe#1

Merged
cmeans merged 7 commits into
mainfrom
storage-abstraction-soft-delete-secure-deployment
Mar 20, 2026
Merged

Storage abstraction, soft delete, secure deployment, README reframe#1
cmeans merged 7 commits into
mainfrom
storage-abstraction-soft-delete-secure-deployment

Conversation

@cmeans
Copy link
Copy Markdown
Owner

@cmeans cmeans commented Mar 20, 2026

Summary

Major update that reflects the project's evolution from a monitoring tool to a portable knowledge + awareness layer.

Storage abstraction

  • Store protocol extracted from concrete implementation
  • AwarenessStore renamed to SQLiteStore (backward-compat alias preserved)
  • Collator depends on protocol — future backends (Postgres, vector) swap without changing consumers

Soft delete

  • delete_entry tool with dry-run by default (bulk deletes require confirm=True)
  • 30-day trash retention, auto-purged by existing expiry mechanism
  • restore_entry and get_deleted tools for recovery

Secure deployment

  • AWARENESS_MOUNT_PATH env var: secret path prefix, everything else gets 404
  • SecretPathMiddleware for streamable-http transport
  • Docker Compose with named Cloudflare Tunnel (stable URL) + quick tunnel option
  • Documented Cloudflare WAF rule for edge-level blocking

Documentation reframe

  • README rewritten: leads with portable knowledge store, not monitoring. Comparison table vs Mem0/Zep/platform memory. New architecture diagram with security layer.
  • PoC Demo Guide rewritten: secure deployment walkthrough (Cloudflare account → domain → tunnel → WAF → secret path). Security considerations with honest limitations and gotchas.
  • CLAUDE.md updated with Store protocol, soft delete, deployment info.

What prompted this

During live testing, Claude.ai was asked to "save your knowledge about me" — it wrote 39 tagged knowledge entries (personal, projects, infrastructure, health, finances) using learn_pattern. This demonstrated that the system is becoming a portable knowledge store, not just a monitoring service. The README reframe reflects this.

Test plan

  • 124 tests passing
  • ruff check clean
  • mypy strict clean
  • Verified secret path: /mcp returns 404, /<secret>/mcp returns 406
  • Verified Cloudflare WAF: unauthorized paths return 403 at edge
  • Verified Claude.ai reads knowledge store through secured endpoint
  • Review README framing and tone
  • Review PoC guide for completeness and accuracy
  • Review security considerations for anything missed

🤖 Generated with Claude Code

cmeans and others added 7 commits March 20, 2026 10:03
…README

Storage layer:
- Extract Store protocol from concrete implementation
- Rename AwarenessStore → SQLiteStore (backward-compat alias preserved)
- Collator depends on protocol, not implementation — future backends swap cleanly

Soft delete:
- delete_entry tool: soft-deletes to trash (30-day retention, auto-purged)
- Bulk deletes require confirm=True (dry-run by default to prevent accidental wipes)
- restore_entry tool: recover from trash by ID
- get_deleted tool: list trash contents

Secure deployment:
- Secret path auth via AWARENESS_MOUNT_PATH env var (SecretPathMiddleware)
- Docker Compose with named Cloudflare Tunnel + quick tunnel option
- Bind mount for data persistence (~/ instead of Docker volume)
- Cloudflare WAF rule blocks requests without secret path at the edge

Documentation:
- README reframed: leads with portable knowledge store, not monitoring
- Comparison table vs platform memory and Mem0/Zep
- PoC demo guide rewritten with secure deployment walkthrough
- Security considerations section with honest limitations
- CLAUDE.md updated with new architecture and deployment info

124 tests, all passing. Lint and type checks clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename PoC Demo Guide to Deployment Guide, update all references
- Add platform note (Fedora Linux, macOS likely works, Windows untested)
- Add reverse proxy flexibility note (Cloudflare is just one option)
- Fix "production-ready" claim to "suitable for personal use and testing"
- Fix leaked secret in example output
- Add free tier caveat for Claude.ai
- Update memory instruction to knowledge-first framing with platform tagging
- Add issue templates: bug report, platform test report, feature request
- Rework simulate_edge.py to demo all data types (status, knowledge, context, preferences)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the entries table schema, all 6 entry types with their data
field structures, indexes, lifecycle rules (upsert, soft delete,
auto-purge, staleness), hard delete warning, and SQLite configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cmeans cmeans merged commit cc7b008 into main Mar 20, 2026
5 checks passed
@cmeans cmeans deleted the storage-abstraction-soft-delete-secure-deployment branch March 20, 2026 16:53
cmeans added a commit that referenced this pull request Mar 23, 2026
…traint

- Add HNSW index to _create_tables (finding #1: non-Alembic installs)
- Document VECTOR(768) hardcoded dimension in data-dictionary (finding #2)
- Clarify embedding-on-write is synchronous, background planned for Phase 2
- Update test count to 294 in CHANGELOG and README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans added a commit that referenced this pull request Mar 24, 2026
…ount

- hint now passes source/tags/entry_type/since/until filters to
  semantic_search instead of searching all entries (finding #3)
- Test count updated to 315 in CHANGELOG and README (finding #1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans pushed a commit that referenced this pull request Apr 7, 2026
- #1: Specify two-step re-init (synthetic initialize + replay original)
  using stored registry metadata (capabilities, client_info, protocol_version)
- #2: Add session_redirects table for old→new session_id mapping with
  5-min grace period. Middleware transparently rewrites request headers.
- #3: Clarify sliding-window TTL — touch() extends expires_at
- #4: Document HAProxy stick table gap after rotation (self-healing)
- #5: Note deliberate LOGGED deviation from issue with rationale
- #6: Specify lookup() filters expired sessions (no re-init for zombies)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans pushed a commit that referenced this pull request Apr 20, 2026
Addresses QA round-1 finding #1 on PR #333. The actually-merged
mcp-clipboard PRs are:

  - #88 (merged 2026-04-20T01:31:47Z) — original env-routing hardening.
    #87 does not exist in that repo.
  - #92 (merged 2026-04-20T20:26:12Z) — comment-escape cascade.
    #90 was superseded and closed without merging.

No code change; CHANGELOG narrative only. PR body updated in parallel
with the same corrections (including the pull_request / pull_request_target
trigger correction for pr-labels.yml). Symmetric hardening follow-up
for pr-labels.yml filed as #334 (P3, non-blocking).

Refs #332.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmeans pushed a commit that referenced this pull request Apr 21, 2026
…GELOG

Round-2 fixes against PR #343 QA. Two substantive findings + one
observation:

  - (#3, applied) docker-compose group now restricted to minor+patch
    update types (matching the pip policy). Ollama/Postgres/Cloudflared
    minor and patch bumps still batch into one PR per week; major
    version bumps like pgvector pg17->pg18 (a data-migration event)
    land as their own individual PRs where they can get the review
    attention they need.
  - (#1, resolved as a side effect) CHANGELOG rewritten to describe
    the actual policy: pip and docker-compose are grouped minor+patch
    by design; github-actions groups all update types (action major
    bumps are mechanical). Round-1 CHANGELOG claimed "minor+patch per
    ecosystem" which was inaccurate for two of the four.

Also reframes the CHANGELOG opening from "add" to "replaced the
existing 2-ecosystem config with a 4-ecosystem weekly rotation" since
.github/dependabot.yml already existed on main with pip +
github-actions. PR title + body + step-8 expected-diff stats get the
same treatment in parallel (not in this commit).

Refs #343.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 21, 2026
…334) (#351)

Closes [#334](#334). Also
closes CodeQL alerts #1, #2, #3 (three flags of
`actions/missing-workflow-permissions` in `ci.yml`).

## Summary

Two workflow-hardening fixes bundled because they're the same theme
(least-privilege + contributor-controlled-input discipline) and both
surfaced from the same security review:

### 1. `pr-labels.yml` — cascade the `#333` env-routing pattern (closes
#334)

Three job steps in `pr-labels.yml` (`on-push`, `on-unlabel`, `on-label`)
previously inlined contributor-visible fields as `${{ ... }}`
expressions inside `run:` bodies:

```yaml
# Before
run: |
  PR=${{ github.event.pull_request.number }}
  REPO=${{ github.repository }}
  HEAD_SHA=${{ github.event.pull_request.head.sha }}
  # ... shell script references PR / REPO / HEAD_SHA
```

Now they route through step-level `env:` and are referenced as shell
variables:

```yaml
# After
env:
  PR: ${{ github.event.pull_request.number }}
  REPO: ${{ github.repository }}
  HEAD_SHA: ${{ github.event.pull_request.head.sha }}
run: |
  # ... shell script references "$PR" / "$REPO" / "$HEAD_SHA"
```

**Not a currently-exploitable bug.** The `on: pull_request:` trigger
means fork PRs get a read-only `GITHUB_TOKEN` — the `gh pr edit
--add-label` / `--remove-label` calls would be rejected from a fork PR
regardless of what `PR`/`REPO`/`HEAD_SHA` contained. And all three
values are typed (numeric PR, repo name validated by GHA, hex SHA) —
none come from user-authored text like titles or bodies.

**Why change it anyway**, per #334's rationale:

- **Trigger-drift risk.** If `pr-labels.yml` ever switches to
`pull_request_target` (to allow label automation on fork PRs), the same
injection class that #333 closed on `pr-labels-ci.yml` reappears — and
now the hardening would already be in place.
- **Parameterization-drift risk.** A future maintainer adding a
contributor-authored string field (label name, PR title fragment, branch
name) to a `run:` block won't be prompted to route via `env:` first
because the file already establishes the inline `${{ ... }}` style as
"fine here."
- **Cascade consistency.** `pr-labels-ci.yml` uses env-routing since
#333; having the sibling workflow use a different style is a readability
cost for anyone auditing the repo.

### 2. `ci.yml` — add workflow-level `permissions: contents: read`
(closes CodeQL #1/#2/#3)

`ci.yml` had no `permissions:` block at workflow or job level, so all
three jobs (`lint`, `typecheck`, `test`) inherited whatever repo-level
default `GITHUB_TOKEN` scope is configured. CodeQL flagged this three
times (one per job).

Fix: declare `permissions: contents: read` at the workflow level. Every
job inherits read-only content access, which is sufficient for lint /
typecheck / pytest / codecov. No job actually needs write access to
anything.

## Audit sweep results

While touching workflow files, checked all six for missing
`permissions:`:

| Workflow | Had `permissions:`? | This PR's action |
|----------|---------------------|------------------|
| `ci.yml` | No (CodeQL flagged 3x) | Added `contents: read` at workflow
level |
| `docker-publish.yml` | Yes, line 23 | No change |
| `docker-smoke.yml` | Yes, line 40 (from #350) | No change |
| `pr-labels-ci.yml` | Yes, line 35 (from #333) | No change |
| `pr-labels.yml` | Yes, line 26 | No change to permissions block;
env-routing changes only |
| `qa-gate.yml` | Yes, line 24 | No change |

`ci.yml` was the last gap. Sweep is complete.

## Scope

- `.github/workflows/ci.yml` — `+7 lines` (permissions block with inline
rationale comment)
- `.github/workflows/pr-labels.yml` — `+10, -11` (three `run:` bodies
lose two shell-assignment lines each; three `env:` blocks gain two-three
entries each; explanatory comment added in the `on-unlabel` case)
- `CHANGELOG.md` — `+4 lines` (new `### Security` subsection under
`[Unreleased]`)

No source, no tests, no migrations.

## References

- Closes [#334](#334)
- Closes CodeQL alerts #1, #2, #3
(`actions/missing-workflow-permissions` on `ci.yml:27/41/53`)
- Cascade source: PR
[#333](#333) (same pattern
for `pr-labels-ci.yml`, which closed #332)
- Related CodeQL alerts not addressed by this PR: #5/#6/#7/#8 (OAuth
clear-text logging in `oauth.py` and `oauth_proxy.py`) — separate audit
PR, coming next. #4 (socket bind in tests) — dismiss via UI.

## QA

### Prerequisites

None. Pure workflow-YAML changes.

### Automated checks

- `lint`, `typecheck`, `test (3.10/3.11/3.12)` — none touch YAML, should
remain green.
- `CodeQL (actions)` — will re-scan `ci.yml` and `pr-labels.yml` on this
PR. Expected outcome: alerts #1/#2/#3 flip to "fixed" on merge; no new
alerts introduced.
- `docker-smoke` — not triggered (no changes under `Dockerfile` /
`pyproject.toml` / `uv.lock` / `.dockerignore`).

### Manual tests

1. - [x] **Both workflow files parse.**
     ```
python3 -c "import yaml; [yaml.safe_load(open(f)) for f in
['.github/workflows/ci.yml', '.github/workflows/pr-labels.yml']];
print('OK')"
     ```
     Expected: `OK`.

2. - [x] **`ci.yml` now has `permissions: contents: read`.**
     ```
     grep -A1 '^permissions:' .github/workflows/ci.yml
     ```
     Expected: `permissions:` header followed by `  contents: read`.

3. - [x] **No contributor-controlled inputs in `pr-labels.yml` `run:`
bodies.**
     ```
awk '/^[[:space:]]+run: \|/,/^[[:space:]]+-
name:|^[[:space:]]{2,6}[a-z-]+:$/' .github/workflows/pr-labels.yml |
grep -nE '\$\{\{ *github\.(event|repository|head_ref)' || echo "(none —
good)"
     ```
Expected: `(none — good)`. All `github.event.*` / `github.repository`
references are now in `env:` blocks (and in job-level `if:`
conditionals, which is safe context).

4. - [x] **All six workflows now have `permissions:`.**
     ```
     for f in .github/workflows/*.yml; do
if ! grep -q '^permissions:\|^ permissions:\|^ permissions:' "$f"; then
         echo "$f: MISSING permissions"
       fi
     done
     echo "(if no 'MISSING' lines above, sweep is complete)"
     ```
     Expected: no `MISSING` lines.

5. - [x] **Label automation still functions on this PR.** When I push,
`pr-labels.yml`'s `on-push` should reset labels to `Awaiting CI` and
strip any stale QA labels. When `Dev Active` is removed, `on-unlabel`
should promote to `Ready for QA` after CI passes. Empirically validated
if the label transitions on this PR itself behave identically to recent
merged PRs (self-test).

6. - [x] **`permissions: contents: read` doesn't break anything.** Lint
/ typecheck / pytest / codecov upload only need read access to
`GITHUB_TOKEN` — none of them push labels, create comments, or mutate
repo state. If any of the existing CI checks start failing on this PR
with "resource not accessible" errors, that's a signal the permissions
block is too tight (unlikely, but the empirical test is: does this PR's
CI go green?).

7. - [x] **Diff review.**
     ```
     git diff --stat origin/main
     ```
Expected: `.github/workflows/ci.yml` (+7),
`.github/workflows/pr-labels.yml` (+10, -11), `CHANGELOG.md` (+4).
Nothing else.

### Acceptance

- ✅ `#334` — symmetric env-routing cascade landed in `pr-labels.yml`
- ✅ CodeQL `#1`, `#2`, `#3` — `ci.yml` now has explicit `permissions:`
- ☐ CodeQL re-scan confirmation — post-merge, the three alerts flip from
Open → Fixed automatically on the next `Analyze (actions)` run against
`main`

Post-merge, also worth a look at CodeQL's /security/code-scanning
dashboard to confirm Open count drops from 8 → 5 (just the four
OAuth-logging + the one test-file socket-bind remaining).

Co-authored-by: cmeans-claude-dev[bot] <3223881+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 21, 2026
…ng flags (#5/#6/#7/#8) (#352)

Closes CodeQL alerts **#5, #6, #7, #8, #9**
(`py/clear-text-logging-sensitive-data` in `oauth.py` +
`oauth_proxy.py`).

## Round 3 — IP-header-chain redesign

The round-1 `_format_ip_header_chain` helper (see description below in
§"The fix") didn't break CodeQL's dataflow — the analyzer tracked
`header_chain` through the helper call and CodeQL #9 duplicated #8's
flag on the new form. Round-3 commit `e31561c` **removes the helper**
and instead splits the log by config path: the default case logs a
module-level string literal (no taint flow), the override case logs only
the count + env-var pointer (names never reach the sink). Both #8 and #9
close through code change — no UI dismissal.

This is the code-improvement path per the user preference for a
"reasonable wording change that would make CodeQL happy without
sacrificing log usefulness" rather than inline suppression.

## The problem

CodeQL flagged four log statements in the OAuth paths:

| Alert | File | Line | What it logs |
|-------|------|------|--------------|
| #5 | `oauth.py` | 81 | `discovery_url` + exception when OIDC discovery
fails |
| #6 | `oauth.py` | 85 | `self.issuer` when falling back to default JWKS
path |
| #7 | `oauth_proxy.py` | 225 | `discovery_url` + exception in the
proxy's discovery path |
| #8 | `oauth_proxy.py` | 312 | IP-header-name chain config |

CodeQL's `py/clear-text-logging-sensitive-data` uses dataflow from
"sensitive sources" (values tracked as credential-adjacent) to log
sinks. In our case, all four sites log values that are **not actually
secrets today** — `self.issuer` is a config URL, header names aren't
credentials — but CodeQL taints conservatively because the values flow
from constructor parameters in auth-adjacent contexts.

## The fix — defensive URL redaction, not suppression

Two new helpers:

### `safe_url_for_log(url)` (in `helpers.py`)

Strips credential-carrying URL components before logging:

- **`userinfo`** — RFC 3986 lets URLs embed `user:password@host`. An
OAuth issuer URL misconfigured that way would leak credentials through
every log line that mentioned it.
- **`query string`** — OAuth authorization-code flow passes `code`,
`state`, and PKCE verifiers here. Anything on the query string is a
candidate to be secret.
- **`fragment`** — OAuth implicit flow (deprecated but still
occasionally in the wild) puts access tokens in the fragment.

Returns `scheme://host[:port]/path` — preserves the operator-meaningful
portion so an operator reading a log can still identify the endpoint,
while removing anything that could leak a credential.

Defensive rather than currently-exploitable: **our issuer URLs today are
plain `https://provider/...` strings with no userinfo/query/fragment**.
The helper ensures a future misconfiguration, provider change, or
redirect can't change that. Unparseable or empty inputs return
`"<redacted url>"` — a log helper should never itself become a failure
source.

**Applied at every URL-logging site in the OAuth paths**, not just the
four CodeQL flagged — consistency:

- `oauth.py:_discover_oidc_config` — 4 log statements (discovered JWKS
URI, discovered userinfo endpoint, discovery-request failure,
discovery-fallback warning)
- `oauth_proxy.py:discover_oidc_endpoints` — 2 log statements (discovery
failure, discovered-endpoints summary)

### IP-header-chain log split by config path (in `oauth_proxy.py`)

The round-1 approach wrapped the IP-header-chain log in a
`_format_ip_header_chain` helper to try to break CodeQL's taint-flow
through function indirection. That didn't work — CodeQL #9 duplicated
#8's flag on the helper-call form. Round 3 takes a different approach:
**split the log statement by whether defaults are in use, and remove the
taint-flow path entirely**:

- **Default branch** (`ip_headers is None`): log a module-level literal
`_DEFAULT_IP_HEADER_DISPLAY = "CF-Connecting-IP → X-Real-IP → ASGI
client"`. Source-literal string; no contributor-controlled value flows
into the sink.
- **Override branch** (env-configured `ip_headers`): log only the
*count* of custom entries and a pointer to
`AWARENESS_OAUTH_PROXY_IP_HEADERS` so operators can inspect the env var
directly. The names themselves never reach a log sink.

Trade-off: operators running with a non-default header chain don't see
the names in logs — they see the count + env-var name. Small loss of log
convenience for the minority-override case; full operator clarity for
the default case (which most deployments run). Closes CodeQL #8 and #9
cleanly without UI dismissal.

## Testing

11 new unit tests in `tests/test_helpers.py::TestSafeUrlForLog`:

- `test_plain_https_preserved` — unaltered URLs round-trip
- `test_strips_userinfo` — `user:password@host` → `host`
- `test_strips_query_string` — `?code=secret&state=xyz` → `` (empty)
- `test_strips_fragment` — `#access_token=secret` → `` (empty)
- `test_strips_all_credential_carrying_parts_at_once` — composite test
- `test_preserves_non_default_port` — `:8443` survives
- `test_empty_string_returns_placeholder` — `""` → `"<redacted url>"`
- `test_unparseable_string_returns_placeholder` — `"not a url"` →
`"<redacted url>"`
- `test_no_netloc_returns_placeholder` — `"/path/only"` → `"<redacted
url>"`
- `test_path_preserved_exactly` — deep paths with dots/dashes unchanged

**All existing 120 OAuth tests (`tests/test_oauth.py` +
`tests/test_oauth_proxy.py`) continue to pass** against the refactored
code — no behavior regression, only log-output change.

## Scope

- `src/mcp_awareness/helpers.py` — `+42` (new `safe_url_for_log` helper
+ docstring)
- `src/mcp_awareness/oauth.py` — `+10, -4` (import + 4 log-site changes)
- `src/mcp_awareness/oauth_proxy.py` — `+34, -10` (import +
`_DEFAULT_IP_HEADER_DISPLAY` module constant + 3 log-site changes,
including the split-by-config-path IP-header-chain log added in round 3)
- `tests/test_helpers.py` — `+69` (import + 11 new test cases in
`TestSafeUrlForLog` class)
- `CHANGELOG.md` — `+3` (new `### Security` entry under `[Unreleased]`)

No migrations. No source API surface changed. No log-message semantics
changed beyond the URL-component redaction.

## References

- Closes CodeQL alerts #5, #6, #7 (direct — URL now redacted before
flowing to log sink)
- #8 and #9 also closed — the round-3 split-by-config-path log statement
removes taint-flow entirely; no UI dismissal required
- Session context: #334 hardening cascade and ci.yml permissions closed
alerts #1/#2/#3 via PR #351. This PR addresses Group B of yesterday's
CodeQL audit.

## QA

### Prerequisites

None. Pure code changes, covered by new unit tests and existing
integration tests.

### Automated checks

- `lint` / `typecheck` — clean locally
- `test (3.10/3.11/3.12)` — `tests/test_helpers.py::TestSafeUrlForLog`
adds 11 cases; existing OAuth tests continue to pass
- `CodeQL (python)` — expected outcome: #5, #6, #7, #8, #9 all flip Open
→ Fixed on merge.
- `codecov/patch` — new helper is covered by the 11 new test cases;
should be 100% on touched lines
- `docker-smoke` — **not triggered** (no `Dockerfile` / `pyproject.toml`
/ `uv.lock` / `.dockerignore` change)

### Manual tests

1. - [x] **All tests pass locally.**
     ```
python -m pytest tests/test_helpers.py tests/test_oauth.py
tests/test_oauth_proxy.py -q
     ```
     Expected: all pass, including the 11 new `TestSafeUrlForLog` cases.

2. - [x] **`safe_url_for_log` correctly strips each sensitive
component.**
     ```
     python -c "
     from mcp_awareness.helpers import safe_url_for_log as f
assert f('https://user:secret@host/path') == 'https://host/path',
'userinfo'
assert f('https://host/path?code=secret') == 'https://host/path',
'query'
assert f('https://host/path#token=secret') == 'https://host/path',
'fragment'
assert f('https://u:p@host:8443/oauth?c=1#t=2') ==
'https://host:8443/oauth', 'composite'
     assert f('') == '<redacted url>', 'empty'
     assert f('not a url') == '<redacted url>', 'unparseable'
     print('all manual assertions pass')
     "
     ```
     Expected: `all manual assertions pass`.

3. - [x] **Log-output semantics unchanged for normal URLs.** For any
sensible OIDC issuer URL
(`https://api.workos.com/user_management/client_X`), the log now reads
identically to before — same host, same path. Verify by eyeballing an
OAuth auth flow locally and confirming `Discovered JWKS URI: <url>` /
`OAuth proxy: discovered endpoints — authorize=<url>, …` log lines
render as before when URLs don't contain credentials.

4. - [x] **No contributor-controlled inputs are now logged.** Grep for
any remaining `%s` + URL arg that isn't routed through the helper in the
OAuth paths:
     ```
grep -nE
'logger\.(info|warning|error).*%s.*(discovery_url|self\.issuer|userinfo_endpoint|authorization_endpoint|token_endpoint|registration_endpoint|header_chain|jwks)'
src/mcp_awareness/oauth.py src/mcp_awareness/oauth_proxy.py | grep -v
safe_url_for_log | grep -v _format_ip_header_chain || echo "(none —
good)"
     ```
Expected: `(none — good)`. Every URL-logging site has been routed
through a redactor.

5. - [x] **CHANGELOG entry documents the defensive rationale.**
     ```
     grep -nE '(credential|redact)' CHANGELOG.md | head -3
     ```
Expected: the new `### Security` entry mentions credential-carrying
components and redaction.

6. - [x] **Diff review.**
     ```
     git diff --stat origin/main
     ```
Expected: 5 files changed. `CHANGELOG.md` (+3),
`src/mcp_awareness/helpers.py` (+42), `src/mcp_awareness/oauth.py` (+10,
-4), `src/mcp_awareness/oauth_proxy.py` (+34, -10),
`tests/test_helpers.py` (+69). Nothing else.

7. - [ ] **(Post-merge) Confirm CodeQL alerts flip.** After merge,
`Analyze (python)` re-runs on `main`. At
https://github.com/cmeans/mcp-awareness/security/code-scanning:
- Alerts #5, #6, #7 flip Open → Fixed (each was
`clear-text-logging-sensitive-data` in an `oauth.py`/`oauth_proxy.py`
file — the URL arg is now routed through `safe_url_for_log`, which
CodeQL should see as a sanitizer).
- Alerts #8 and #9 both flip Open → Fixed — the round-3
split-by-config-path log statement removes the taint-flow path entirely.

### Acceptance

- ✅ CodeQL #5 — `oauth.py:82` discovery-request-failed log: URL now
`safe_url_for_log(discovery_url)`.
- ✅ CodeQL #6 — `oauth.py:91` default-JWKS-path log: URL now
`safe_url_for_log(self.issuer)`.
- ✅ CodeQL #7 — `oauth_proxy.py:229` discovery-failed log: URL now
`safe_url_for_log(discovery_url)`.
- ✅ CodeQL #8 — `oauth_proxy.py:330-345` header-chain log: split into
default (logs literal `_DEFAULT_IP_HEADER_DISPLAY` constant — no taint
flow) and override (logs count + env-var pointer — header names never
reach the log sink).
- ✅ CodeQL #9 — same fix. The round-1 `_format_ip_header_chain` helper
triggered a duplicate flag; removing the helper and splitting by config
path resolves both #8 and #9.

---------

Co-authored-by: cmeans-claude-dev[bot] <3223881+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 22, 2026
#377)

Closes R2 of #359 (RLS harness coverage extension tracking). Closes
#362.

## Summary

Complements R1 (#372) and R3 (merged in #373) by covering the execution
contexts the request-path `rls_store` fixture doesn't reach: the
`_do_cleanup` daemon thread, the `upsert_embedding` pool path used by
`server._embedding_pool`, and Postgres's transaction-local `set_config`
semantics plus the combined pool/Postgres contract.

Test-only change — no production code modified.

**What the tests actually catch (honest framing, per QA round-1
feedback):** the background-thread and pool tests verify that
regressions dropping *all* of the layered cross-tenant defenses (SQL
`WHERE owner_id = %s`, `_set_rls_context`, and the pool role's
`BYPASSRLS` / fixture `NOBYPASSRLS` re-entry) are caught in aggregate.
They do **not** each isolate one layer — that's defense-in-depth doing
its job, and the module docstring now says so explicitly. The two
`test_set_config_is_local_…` tests are the ones that isolate a single
guarantee (Postgres's transaction-local `set_config` semantic) from
everything else, using a raw `psycopg.connect` with no pool involved.

## Scope

**2 files changed, +518, -0** relative to origin/main (`git diff
--shortstat origin/main` → `2 files changed, 518 insertions(+)`).

| File | ± | Purpose |
|---|---|---|
| `tests/test_rls_background.py` | +517, -0 (new) | 11 tests across 3
classes |
| `CHANGELOG.md` | +1, -0 | `### Security` bullet under `[Unreleased]`
(pure append to existing section) |

## Test inventory (11 tests, 3 classes)

### `TestRLSBackgroundCleanup` (4 tests)

1. `test_cleanup_isolates_expired_deletions_per_owner` — alice opts in,
bob does not; both have expired entries. After `_do_cleanup`, alice's
expired entries are gone, bob's remain. Exercises the full cleanup call
path (owner enumeration + per-owner DELETE) under a `NOBYPASSRLS`
request-path fixture.
2. `test_cleanup_skips_owners_without_preference` — cleanup is a no-op
for owners who haven't set `auto_cleanup=true`.
3. `test_cleanup_preserves_non_expired_entries_for_opted_in_owner` —
only rows with `expires <= now` are deleted; future-dated entries
survive.
4. `test_cleanup_expired_background_thread_preserves_isolation` — runs
cleanup through the spawned daemon thread (via `_cleanup_expired()`) and
verifies isolation holds. Exercises the real threaded path rather than
the synchronous call.

### `TestRLSBackgroundEmbedding` (2 tests)

5. `test_upsert_embedding_respects_owner_isolation` — alice's embedding
is not visible to bob via `get_entries_without_embeddings`. Covers the
full upsert call path including both the `WHERE owner_id = %s` SQL
filter and RLS policies.
6. `test_upsert_embedding_from_worker_thread_preserves_isolation` —
submits `upsert_embedding` via a `ThreadPoolExecutor`, same pattern as
`server._embedding_pool`.

### `TestRLSPoolGuarantees` (5 tests)

7.
**`test_set_config_is_local_true_does_not_persist_across_transactions`**
— direct Postgres check on a raw `psycopg.connect` (no pool):
`set_config(..., true)` is reverted at COMMIT. This is the test that
isolates the transaction-local semantic from `psycopg_pool`'s `RESET
ALL` check-in reset.
8. **`test_set_config_is_local_false_persists_across_transactions`** —
direct Postgres check counterpart: `set_config(..., false)` does persist
across transactions. Together with #7, these prove the `is_local` flag
is what's producing the behavior, not some ambient reset.
9. `test_pool_checkout_does_not_see_prior_rls_context` — after a store
operation, a fresh pool checkout sees no `app.current_user`. Verifies
the *combined* pool+Postgres contract, not either layer alone.
10. `test_rls_context_cleared_after_exception_rollback` — an exception
inside a store-style transaction + pool check-in cleanup combine to
leave no residue. Same combined-contract pattern as #9.
11. `test_concurrent_owners_do_not_cross_contaminate` — two threads on
different owners; each lands writes correctly and cannot see the other's
data. Verifies the full call path under real concurrency (the pool
physically hands out distinct connections per thread, which makes
`app.current_user`-based cross-contamination impossible on its own; this
test proves the broader call path is also clean).

## What changed vs. round-1 reviewed head `263250e2`

1. Replaced the round-1
`test_rls_context_does_not_persist_between_transactions` with **two new
direct-Postgres tests** (#7 and #8 above). Those two actually fail when
you flip `true` ↔ `false` in the test — meta-verified in-session before
pushing. The old test is now renamed
`test_pool_checkout_does_not_see_prior_rls_context` and its docstring is
honest about being a combined-contract check.
2. Softened the docstrings of the remaining combined-contract tests
(#10, #11) so they no longer claim to isolate a layer they don't
isolate.
3. Module docstring rewritten to describe the defense-in-depth design
accurately: cleanup and embedding are doubly scoped (SQL `WHERE owner_id
= %s` + `_set_rls_context`), pool tests split into two camps
(direct-Postgres isolation vs. combined-contract).
4. CHANGELOG bullet updated to reflect the new test count (11, not 9)
and to describe what the tests verify without overclaiming.

## Runtime cost

11 tests, ~2.6 s against the shared testcontainers Postgres. Full pytest
suite went from 989 (pre-R2) to **1000 passing** on this branch; 7
skipped unchanged.

## References

- Parent tracking: #359
- Closes: #362
- Prior R-series: #372 (R1, merged), #373 (R3, merged 2026-04-22)
- R4 (hypothesis fuzz): #364 — remaining sub-PR
- Round-1 QA review: flagged three inaccurate meta-verifications; this
revision addresses each one

## QA

### Prerequisites

Docker (testcontainers spins up Postgres). Same as any other pytest run.
The test file is self-contained; uses the shared `pg_dsn` fixture from
`conftest.py`.

### Manual tests

1. - [ ] **Run the new test file in isolation:** `python -m pytest
tests/test_rls_background.py -v`. Expected: `11 passed in ~2.6s`.

2. - [ ] **Meta-verify #7 is load-bearing** — flip `true` to `false` in
`test_set_config_is_local_true_does_not_persist_across_transactions`.
Re-run the test. Expected: **FAIL** with `AssertionError:
set_config(..., true) survived COMMIT: got 'alice'`. Revert after
verification. (Verified by Dev in-session on `e1708d7`.)

3. - [ ] **Meta-verify #8 is load-bearing** — flip `false` to `true` in
`test_set_config_is_local_false_persists_across_transactions`. Expected:
**FAIL** with `AssertionError: assert '' == 'alice'`. Revert. (Verified
by Dev in-session on `e1708d7`.)

4. - [ ] **Defense-in-depth framing (not a failure scenario).** Per QA
round-1 finding: a single-layer regression (e.g., commenting out only
`self._set_rls_context(...)` inside `cleanup_expired`) does **not**
cause tests #1#6 to fail — the redundant `WHERE owner_id = %s` in the
SQL files and the pool role's `BYPASSRLS` still enforce isolation on
their own. That's the point: the tests verify the *aggregate* contract,
not any individual layer. To cause a failure you'd need to drop both the
SQL filter and `_set_rls_context`.

5. - [ ] **Scope** — `git diff --stat origin/main` shows exactly
`CHANGELOG.md` (+1, -0) and `tests/test_rls_background.py` (+517, -0);
net: 2 files, +518, -0.

### Acceptance

- ☐ CI green on all matrix entries
- ✅ 1000/1000 tests passing locally (989 → 1000, +11 new)
- ✅ `ruff check`, `ruff format --check`, `mypy` all clean
- ✅ Single-concern: background-thread + pool coverage only
- ✅ Module docstring accurately describes what tests catch
(defense-in-depth aggregate, not single-layer)
- ✅ Two new tests (`test_set_config_is_local_true/false_…`) directly
codify Postgres's `is_local` semantic independently of `psycopg_pool` —
meta-verified to fail when the `is_local` flag is flipped
- ✅ Deferred refactor explicitly noted (shared helpers module —
follow-up)
- ✅ Bot commit identity verified (`272174644+cmeans-claude-dev[bot]`,
author + committer); push attributed to bot (`e1708d7`, verified via `gh
api repos/.../activity`)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: cmeans-claude-dev[bot] <272174644+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 24, 2026
Closes #363.

## Summary

Adds static analysis to CI with two layers:

1. **Community Semgrep packs** — `p/python` and `p/owasp-top-ten` via
the `semgrep/semgrep:1.161.0` container. Broad coverage for Python
security anti-patterns without us re-deriving the wheel.
2. **Three project-specific custom rules** under `.semgrep/`, each
targeting a regression class we've identified:
- `awareness-sql-missing-owner-id` — every SQL statement that touches
`entries`, `reads`, `actions`, or `embeddings` must include `owner_id`.
Codifies the R1–R4 RLS harness's runtime contract in static form.
- `no-credential-identifier-in-logs` — rejects `logger.*` / `ctx.*`
calls that interpolate variables named `token`, `bearer`,
`authorization`, `password`, `secret`, `api_key`, etc. Zero current
sites; preventive.
- `no-sql-string-interpolation` — bans f-strings, `.format()`, and
concatenation as arguments to `cursor.execute` / `conn.execute` /
`executemany`. Codifies the parameterized-query convention the project
already follows.

## Scope

**8 files changed, +391, -1** (`git diff --shortstat origin/main`).

| File | ± | Purpose |
|---|---|---|
| `.github/workflows/semgrep.yml` | +60, -0 (new) | CI job running packs
+ `.semgrep/` rules |
| `.semgrep/awareness-sql-owner-scope.yml` | +65, -0 (new) | Custom rule
#1 |
| `.semgrep/no-token-in-logs.yml` | +64, -0 (new) | Custom rule #2 |
| `.semgrep/no-sql-string-interpolation.yml` | +66, -0 (new) | Custom
rule #3 (includes `paths.exclude` for `alembic/versions/**` and
`benchmarks/**` — see "Round 2 fix" below) |
| `docs/security/semgrep-rules.md` | +113, -0 (new) | Per-rule
documentation + exception/suppression discipline |
| `CONTRIBUTING.md` | +12, -0 | § "Security scanners" gains Semgrep
entry + § "Semgrep failure policy" section |
| `CHANGELOG.md` | +6, -0 | `### Security` bullet under `[Unreleased]` |
| `src/mcp_awareness/oauth_proxy.py` | +5, -1 | Inline `# nosemgrep`
suppression on a false-positive `python-logger-credential-disclosure`
finding, with a diff-visible justification comment |

Per-file additions: `60+65+64+66+113+12+6+5 = 391` — matches the summary
line.

## Round 2 fix (commit `51390e3`)

Round 1 (`b80b9cf`) CI failed with 5 findings from the
`no-sql-string-interpolation` rule — all in operational scripts outside
the request path:

- `alembic/versions/d5e8a3b17f92_add_embeddings_table.py`,
`f1a2b3c4d5e6_add_owner_id_and_users.py`,
`g2b3c4d5e6f7_add_rls_policies.py`, `j5e6f7g8h9i0_force_rls.py` —
migrations emit DDL via `op.execute(f"ALTER TABLE {table} ...")`
iterating a hard-coded Python-literal table list. Controlled input, zero
user data path.
- `benchmarks/semantic_search_bench.py:241,267` — ephemeral throwaway
databases dropped via `f'DROP DATABASE IF EXISTS "{db_name}"'`.
Developer-supplied name in a local benchmark script, not user input.

Root cause of the CI miss: pre-landing scan ran on `src/` + `tests/`
only (matching how the other linters are scoped) instead of the whole
repo. Fix: added `paths.exclude` to the rule for `alembic/versions/**`
and `benchmarks/**`. Documented in the rule's header comment. A
contributor introducing user-input SQL into those directories would
bypass the rule — but that would stand out in review, and the cost of
false positives on every migration PR isn't worth that marginal
coverage.

Round 2 scan (`semgrep --config p/python --config p/owasp-top-ten
--config .semgrep/ --error --oss-only` over the **whole repo**): exit 0,
no findings.

## Rule selection rationale (what's dropped from the issue's proposal)

The issue proposed 3-4 custom rules as candidates. A pre-implementation
audit of the codebase changed the viable set:

**Kept:**
- SQL owner-scoping guard (issue's rule 1) — the audit confirmed 52/66
SQL files already conform; the 14 exceptions are `users` + `sessions`
tables with their own keys, not owner-scoped. Strongly expressible in
Semgrep's generic mode.

**Dropped:**
- *Tool handlers must `await ctx.info(...)` for audit visibility*
(issue's rule 2). Audit: **0 of 32** tool handlers in `tools.py`
currently do this. The convention is aspirational, not established. A
rule that fires 0 times today and requires retrofitting 32 handlers
before it's useful is scope creep; the retrofit is its own design
decision.
- *New `postgres_store` function with `owner_id` must be in the RLS
harness coverage list* (issue's rule 4, already flagged in the issue as
*"may defer to a lint check"*). Audit confirmed: `test_rls.py` uses a
**per-method** pattern (42 hardcoded tests), not parametrization — so
Semgrep can't express the rule at all (needs cross-file function→test
mapping). Defer to a standalone check if ever needed; out of scope for
Semgrep.

**Added (not in issue, justified by audit):**
- Credential-identifier-in-logs prevention (pattern-based, not taint
mode).
- SQL-string-interpolation ban.

Net: 3 custom rules that each catch a real regression class, zero
retrofit burden, all expressible in Semgrep. Issue's "3-4 custom rules"
target met.

## Pre-landing preflight

Ran the exact CI command locally: \`semgrep --config p/python --config
p/owasp-top-ten --config .semgrep/ --error src/ tests/\`. Result: **one
finding before suppression; clean after.** The finding was Semgrep's
`python-logger-credential-disclosure` firing on `oauth_proxy.py` line
252:

> `logger.info("OAuth proxy: discovered endpoints — authorize=%s,
token=%s, register=%s", safe_url_for_log(...), ...)`

False positive: the format string contains the word *"token"* as a label
for the OIDC `token_endpoint` URL, not a credential value; values pass
through `safe_url_for_log()` which redacts embedded credentials.
Suppressed with a `# nosemgrep` comment on the `logger.info(` line plus
a multi-line justification comment directly above. `CONTRIBUTING.md`
documents the suppression-with-justification discipline.

All three custom rules were tested locally against both positive
fixtures (synthetic bad examples — each rule fires) and negative
fixtures (realistic benign code — no rule fires).

## References

- Closes: #363
- Related: #358 (SHA-pinning — the Semgrep container is tag-pinned to
`1.161.0`, which matches the pattern for third-party images with
explicit Dependabot-visible version numbers); #367/#369/#370 (sibling
security-tooling PRs just landed as #380, #382, #385); #359 / R1–R4 RLS
harness (the SQL owner-scope rule codifies its contract statically)
- Semgrep docs: <https://semgrep.dev/docs>
- `docs/security/semgrep-rules.md` — per-rule documentation

## QA

### Prerequisites

- `pip install semgrep` if verifying locally (CI runs in the
`semgrep/semgrep:1.161.0` container, no local install needed)

### Manual tests

1. - [x] **CI `semgrep / scan` job runs on this PR.** In the PR's Checks
tab, a new `semgrep / scan` job appears alongside existing jobs. Runs to
completion and passes (no findings after the in-tree suppression).

2. - [x] **Local repro matches CI.** `semgrep --config p/python --config
p/owasp-top-ten --config .semgrep/ --error --oss-only src/ tests/`.
Expected: exit code 0, no findings.

3. - [x] **Custom rule: SQL owner-scoping — positive case fires.** Write
a throwaway `src/mcp_awareness/sql/test_bad.sql`:
   \`\`\`sql
   /* name: test_bad */
   /* mode: literal */
   SELECT id FROM entries WHERE source = %s;
   \`\`\`
Run \`semgrep --config .semgrep/ src/mcp_awareness/sql/test_bad.sql\`.
Expected: `awareness-sql-missing-owner-id` fires with the blocking
severity. Remove the file when done.

4. - [x] **Custom rule: SQL owner-scoping — negative case clean.**
Replace the SELECT with `SELECT id FROM entries WHERE owner_id = %s AND
source = %s;` and rerun. Expected: no finding.

5. - [x] **Custom rule: credential-identifier-in-logs — positive case
fires.** In a throwaway `/tmp/bad_log.py`:
   \`\`\`python
   import logging
   logger = logging.getLogger(__name__)
   def bad(token): logger.info(f"got {token}")
   \`\`\`
Run \`semgrep --config .semgrep/no-token-in-logs.yml /tmp/bad_log.py\`.
Expected: `no-credential-identifier-in-logs` fires.

6. - [x] **Custom rule: SQL-string-interpolation — positive case
fires.** In a throwaway `/tmp/bad_sql.py`:
   \`\`\`python
   def bad(cur, x): cur.execute(f"SELECT * FROM t WHERE id = '{x}'")
   \`\`\`
Run \`semgrep --config .semgrep/no-sql-string-interpolation.yml
/tmp/bad_sql.py\`. Expected: `no-sql-string-interpolation` fires.

7. - [x] **Custom rule docs render on the PR diff.** Review
`docs/security/semgrep-rules.md` — each of the three rules has a
*what/why/suppression* section; the "Running locally" section's command
line matches what CI runs.

### Acceptance

- ☐ CI green across all matrix entries including the new `semgrep /
scan` job
- ✅ `ruff check src/ tests/`, `ruff format --check src/ tests/`, `mypy
src/mcp_awareness/` clean locally
- ✅ Full `semgrep --config p/python --config p/owasp-top-ten --config
.semgrep/ --error` clean locally (one false positive suppressed with
diff-visible justification)
- ✅ Each custom rule tested positively (synthetic bad fixture fires) AND
negatively (realistic benign code doesn't fire)
- ✅ Single-concern: Semgrep CI + custom rules + docs only
- ✅ Issue's 3-4 target met with 3 rules that each catch a real
regression class; dropped rules explicitly justified
- ✅ Failure policy + suppression discipline documented in
CONTRIBUTING.md and docs/security/semgrep-rules.md
- ✅ CHANGELOG `### Security` bullet under `[Unreleased]` references #363
- ✅ Bot commit identity verified (`272174644+cmeans-claude-dev[bot]`,
author + committer)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: cmeans-claude-dev[bot] <272174644+cmeans-claude-dev[bot]@users.noreply.github.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 24, 2026
Addresses QA round 1 findings on PR #393.

Finding #1 (substantive — undercount): compose_summary used
len(fired_intentions) for the summary line. fired_intentions is
capped at 10 by generate_briefing, so any backlog of 11+ fired
intentions silently reported as "10 intentions ready" regardless
of the actual count. Reproduced in QA session with 15 seeded:
summary said "10" while evaluation.intentions_fired correctly said
15. Impact at deploy time: the PR #392 deployer note flags 20+
accumulated fired handoffs on the production owner; every post-deploy
briefing would have under-reported by 10. Fix: pull the total from
briefing["evaluation"]["intentions_fired"], fall back to list length
only when evaluation block is absent.

Finding #2 (substantive — test coverage gap): the cap test asserted
`sum(f.urgency == "high") == 3` which would also pass if the sort
key flipped sign (high-urgency entries still in the top 10, just at
slots 7-9). Replaced with a strict slot check —
`[f.urgency for f in fired[:3]] == ["high", "high", "high"]` plus
an `all(... "normal")` for slots 3-9. A sign flip on the urgency
sort key now fails the test immediately.

Also adds test_briefing_summary_reports_total_fired_count_not_capped
as an explicit regression guard for finding #1 — seeds 15 fired,
asserts the summary contains "15 intentions ready" and not
"10 intentions ready".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant