Skip to content

Increase health check start_period and timeout#5

Merged
cmeans merged 1 commit into
mainfrom
health-check-fix
Mar 21, 2026
Merged

Increase health check start_period and timeout#5
cmeans merged 1 commit into
mainfrom
health-check-fix

Conversation

@cmeans
Copy link
Copy Markdown
Owner

@cmeans cmeans commented Mar 21, 2026

Summary

  • start_period 5s → 15s — gives the server more time to boot before health checks count
  • timeout 5s → 10s — prevents intermittent timeout failures during normal operation

Observed the container reporting unhealthy on startup due to health check timeouts, which caused the tunnel dependency to fail.

🤖 Generated with Claude Code

start_period 5s→15s, timeout 5s→10s. Prevents false unhealthy status
during container startup when the server is slow to accept connections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cmeans cmeans merged commit 59f5532 into main Mar 21, 2026
5 checks passed
@cmeans cmeans deleted the health-check-fix branch March 21, 2026 22:33
cmeans pushed a commit that referenced this pull request Mar 31, 2026
Move the deliver_at timestamp comparison from Python-side list
comprehension to a SQL WHERE clause, avoiding fetching all pending
intentions when only ripe ones are needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans pushed a commit that referenced this pull request Mar 31, 2026
Move the deliver_at timestamp comparison from Python-side list
comprehension into the SQL WHERE clause, avoiding fetching all pending
intentions when only a subset have fired.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans pushed a commit that referenced this pull request Apr 7, 2026
- #1: Specify two-step re-init (synthetic initialize + replay original)
  using stored registry metadata (capabilities, client_info, protocol_version)
- #2: Add session_redirects table for old→new session_id mapping with
  5-min grace period. Middleware transparently rewrites request headers.
- #3: Clarify sliding-window TTL — touch() extends expires_at
- #4: Document HAProxy stick table gap after rotation (self-healing)
- #5: Note deliberate LOGGED deviation from issue with rationale
- #6: Specify lookup() filters expired sessions (no re-init for zombies)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans pushed a commit that referenced this pull request Apr 10, 2026
Add a cross-cutting "Data sovereignty policy" section that governs all
inference calls awareness makes (embedding, extraction, future LLM
features). Amends the earlier round-1 response to the sourcing concern
with a more complete framework.

Core framework:

  safe = (we-control-it) OR (contract-protects-it)

Two trust anchors:
- Trust-anchor-B: operator/user controls the machine running the model.
  Network location is irrelevant — Ollama on a user-owned cloud VM is
  B, Ollama on an untrusted shared workstation is not. "Cloud
  deployment" does NOT imply "cloud inference" — awareness running in
  AWS can use Ollama in the same VPC and still be trust-anchor-B.
- Trust-anchor-C: enterprise-tier contract (zero-retention, BAA,
  no-training) binds the third party running the model.

Unprotected providers (consumer-tier APIs, unknown endpoints) get a
log warning. Hard-block is not enforced; informed consent is the
mechanism. Our canonical hosted deployments must use B or C
exclusively; other operators are free to configure however they want,
with the warning visible.

Per-entry sensitivity routing (opt-in):
- Reserved tag `sensitive` — always routes to trust-anchor-B
- Operator-configurable additional tags via
  AWARENESS_LOCAL_ONLY_TAGS env var (deployment-level floor)
- User-preference additions via users.preferences.local_only_tags
  (additive, cannot remove from operator's floor)
- Three tradeoffs documented: quality degradation, availability
  degradation in pure-cloud (silent fallback to FTS-only), search
  consistency drift

Sovereignty benchmark (release criterion):
- Quantitative comparison of retrieval quality with sensitivity
  routing on vs off
- Hard release criterion for any trust-anchor-C path shipped as a
  default; Layer 1+2 with B-only defaults can ship without it

Consent surface (three complementary mechanisms):
- get_info (#235) exposes extraction/embedding providers with B/C/U
  classification — always on, on-demand
- First-time-seen briefing notice (one-shot per config change)
- Recurring briefing warning only when unprotected provider detected
  — silent on all-protected state, consistent with awareness alert
  philosophy

Layer 2/3 updates:
- Layer 2 references the sovereignty policy; future cloud embedding
  providers (e.g. #111) must be enterprise-tier to qualify as
  trust-anchor-C
- Layer 3 "cloud extractor privacy" risk replaced with a pointer to
  the policy; initial release still ships trust-anchor-B candidates
  only (cloud integration is real work, out of scope for experimental
  release), but trust-anchor-C extraction is acceptable in principle

Phase 1 additions:
- Sovereignty framework scaffolding (env var + user preference
  parsing, routing helper, trust-anchor classification helper) is
  implemented in Phase 1 even though it's a no-op until cloud
  providers are added (all current providers are trust-anchor-B)
- get_info exposes provider trust-anchor classifications
- First-time-seen briefing notice plumbing
- Unprotected provider detection plumbing (tested with fixtures)

Acceptance criteria updated across Layer 1, Layer 2, Layer 3 to
reflect sovereignty framework requirements.

This amendment adds a new Problem statement #5 (data sovereignty
undefined) and a new top-level "Data sovereignty policy" section
between "Layer scoping and user-facing releases" and "Design — three
independent layers," positioned so the technical layer designs can
reference it.

Refs PR #241 round-2 discussion on cloud extractor privacy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 10, 2026
* docs: hybrid retrieval + multilingual design plan

Three-layer design closing the framing of #195 (chunked storage) in
favor of a lower-blast-radius approach:

- Layer 1 (#238): hybrid retrieval via vector + Postgres FTS + RRF,
  with per-entry language regconfig and lingua-py detection
- Layer 2 (#239): swap default embedding model to bge-m3 for
  multilingual cross-lingual vector space (benchmark-gated)
- Layer 3 (#240): proposition extraction (Dense X Retrieval) for
  sub-document semantic granularity, experimental follow-on

Cross-language unified memory as the novel differentiator.
~35 languages at launch (28 Postgres built-ins + CJK via pgroonga).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: amend hybrid retrieval design after round-1 QA review

Address all six substantive findings and sourcing amendment from PR
#241 round-1 review.

Context section:
- Document the 500-char content truncation in embeddings.py:212-217
  as the literal dilution mechanism (not "averaged dilution")
- Explain that hybrid retrieval is complementary at the data level,
  not just the algorithm level

Layer scoping:
- Clarify Layer 1 alone is lexical cross-language, not semantic
- User-facing v1 = Phase 1 + 3 (dilution fix, 35-language lexical)
- User-facing v2 = adds Phase 2 (cross-language semantic via model
  swap) — this is the first "multilingual" release
- Layer 1 standalone ships as the dilution-bug fix, not marketed as
  multilingual

Layer 1:
- Add Phase 1.0 schema verification task for PG17 generated-column +
  regconfig-from-another-column pattern, with trigger-based fallback
  documented
- Add write-time regconfig validation step: startup cache of
  pg_ts_config, pre-INSERT validation, fall through to 'simple' +
  report_alert if missing, cache refresh on alert miss. Prevents
  INSERT failures when a user's preferred language regconfig is
  not loaded on the database cluster.
- Add investigation task: lift or remove the 500-char content
  truncation as part of Layer 1 (potentially highest-leverage
  one-line change in the entire effort)
- Expose `language` parameter explicitly on write tools (not just
  implicit via entry.data.language) for bilingual users who need
  per-entry control
- Commit to `search` rename with `semantic_search` as deprecated
  alias for one release
- Add FTS weight validation benchmark task

Layer 2 — full rewrite of the model section:
- Drop bge-m3 (BAAI / Chinese-sourced — user preference to avoid
  Chinese models for defaults)
- Primary candidate: intfloat/multilingual-e5-large (Microsoft /
  intfloat, MIT, 100+ languages, 1024 dim)
- Fallback candidate: granite-embedding:278m (IBM, Apache 2.0,
  ~12 languages, 768 dim — no schema migration needed, lower risk
  if e5-large benchmarks borderline)
- Document e5 prefix convention ("query: " / "passage: ") as a
  critical implementation requirement with test-enforced symmetry
- Expand benchmark gate: add cross-lingual smoke tests (EN→non-EN,
  non-EN→EN, non-EN→non-EN-same-lang) before default swap

Layer 3 — full rewrite of candidates and risks:
- Drop qwen2.5 (Alibaba / Chinese-sourced)
- Primary default: phi3.5 (Microsoft, MIT, JSON-reliable)
- Low-resource alternative: gemma2:2b (Google)
- High-quality alternative: mistral-nemo (Mistral AI, France)
- Avoid llama3.2:3b as default (license MAU clause)
- AWARENESS_PROPOSITION_EXTRACTION_MODEL env var configurable
- Add three missing risks: non-determinism of LLM extraction (tests
  flaky by construction), backref staleness on entry edits
  (committed to mark-stale + background re-extract), cloud extractor
  privacy hazard (default must be local-runnable; cloud extractors
  out of scope for initial release)
- Expand prompt guidance and JSON reliability as primary selection
  criterion

Other:
- Split acceptance criteria per layer (Layer 1 ACs / Layer 2 ACs /
  Layer 3 ACs) so gating is explicit
- Add merge checklist: close #195, update #239 and #240 bodies
- Note GIN + RLS multi-tenant scaling as a future concern
- Update risks section with upgrade to Layer 1 Low → Low-Medium and
  Layer 3 Medium → Medium-High after QA review
- Resolve six of the seven open questions

Refs PR #241 round-1 QA review, sourcing amendment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: data sovereignty policy (PR #241 round-2 amendment)

Add a cross-cutting "Data sovereignty policy" section that governs all
inference calls awareness makes (embedding, extraction, future LLM
features). Amends the earlier round-1 response to the sourcing concern
with a more complete framework.

Core framework:

  safe = (we-control-it) OR (contract-protects-it)

Two trust anchors:
- Trust-anchor-B: operator/user controls the machine running the model.
  Network location is irrelevant — Ollama on a user-owned cloud VM is
  B, Ollama on an untrusted shared workstation is not. "Cloud
  deployment" does NOT imply "cloud inference" — awareness running in
  AWS can use Ollama in the same VPC and still be trust-anchor-B.
- Trust-anchor-C: enterprise-tier contract (zero-retention, BAA,
  no-training) binds the third party running the model.

Unprotected providers (consumer-tier APIs, unknown endpoints) get a
log warning. Hard-block is not enforced; informed consent is the
mechanism. Our canonical hosted deployments must use B or C
exclusively; other operators are free to configure however they want,
with the warning visible.

Per-entry sensitivity routing (opt-in):
- Reserved tag `sensitive` — always routes to trust-anchor-B
- Operator-configurable additional tags via
  AWARENESS_LOCAL_ONLY_TAGS env var (deployment-level floor)
- User-preference additions via users.preferences.local_only_tags
  (additive, cannot remove from operator's floor)
- Three tradeoffs documented: quality degradation, availability
  degradation in pure-cloud (silent fallback to FTS-only), search
  consistency drift

Sovereignty benchmark (release criterion):
- Quantitative comparison of retrieval quality with sensitivity
  routing on vs off
- Hard release criterion for any trust-anchor-C path shipped as a
  default; Layer 1+2 with B-only defaults can ship without it

Consent surface (three complementary mechanisms):
- get_info (#235) exposes extraction/embedding providers with B/C/U
  classification — always on, on-demand
- First-time-seen briefing notice (one-shot per config change)
- Recurring briefing warning only when unprotected provider detected
  — silent on all-protected state, consistent with awareness alert
  philosophy

Layer 2/3 updates:
- Layer 2 references the sovereignty policy; future cloud embedding
  providers (e.g. #111) must be enterprise-tier to qualify as
  trust-anchor-C
- Layer 3 "cloud extractor privacy" risk replaced with a pointer to
  the policy; initial release still ships trust-anchor-B candidates
  only (cloud integration is real work, out of scope for experimental
  release), but trust-anchor-C extraction is acceptable in principle

Phase 1 additions:
- Sovereignty framework scaffolding (env var + user preference
  parsing, routing helper, trust-anchor classification helper) is
  implemented in Phase 1 even though it's a no-op until cloud
  providers are added (all current providers are trust-anchor-B)
- get_info exposes provider trust-anchor classifications
- First-time-seen briefing notice plumbing
- Unprotected provider detection plumbing (tested with fixtures)

Acceptance criteria updated across Layer 1, Layer 2, Layer 3 to
reflect sovereignty framework requirements.

This amendment adds a new Problem statement #5 (data sovereignty
undefined) and a new top-level "Data sovereignty policy" section
between "Layer scoping and user-facing releases" and "Design — three
independent layers," positioned so the technical layer designs can
reference it.

Refs PR #241 round-2 discussion on cloud extractor privacy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: sovereignty policy refinements from round-2 QA review

Round-3 amendment addressing the four substantive items and two
observations QA raised on the new Data sovereignty policy section.

Trust-anchor classification mechanism (QA Sub-1):
- Add "Trust-anchor classification mechanism (operator self-
  certification)" subsection. Trust-anchor-C cannot be technically
  verified; design uses operator self-certification with
  auditability.
- Default classification is U (unprotected) for any provider neither
  auto-classified via URL allowlist nor explicitly asserted via env
  var. "Failure to classify defaults to unprotected" — safer default.
- Operator self-certification env vars:
  AWARENESS_EMBEDDING_TRUST_ANCHOR=C
  AWARENESS_EMBEDDING_CONTRACT_REFERENCE=<doc URL>
  (and parallel extraction env vars)
- Contract reference is documentation-only, logged at startup and
  surfaced via get_info for audit.
- Operator override for user-owned URLs outside the B allowlist via
  AWARENESS_EMBEDDING_TRUST_ANCHOR=B + TRUST_ANCHOR_REFERENCE — this
  is how "cloud deployment with trust-anchor-B" is expressed.

Trust-anchor-B URL allowlist (QA Sub-2):
- Add "Automatic trust-anchor-B classification (URL allowlist)"
  subsection with the full allowlist spec, critically including:
  * Tailscale CGNAT range 100.64.0.0/10 (most important entry for
    users running Ollama on Tailscale-connected nodes)
  * host.docker.internal (critical for Docker deployments)
  * RFC1918 ranges, IPv6 ULA, *.local, *.lan/*.home/*.internal
- Document what's deliberately NOT in the allowlist (public DNS
  names, ad-hoc port forwards) and why.
- Custom user-owned cloud URLs (e.g., ollama.example.com → GCE)
  require operator override — cannot be classified from URL alone.

Deployment-time mismatch warning (QA Sub-3):
- Add "Deployment-time mismatch warning" subsection.
- Startup check: if AWARENESS_LOCAL_ONLY_TAGS is configured but no
  trust-anchor-B target is available, fire a persistent startup
  alert (level=warning, alert_id=sovereignty-degraded-deployment).
- Dismissable only via acted_on so operators must explicitly
  acknowledge the tradeoff.
- Explicitly distinct from the per-call unprotected-provider warning
  — this is a deployment-time configuration consistency check.

#235 bundled into Phase 1 (QA Sub-4):
- User confirmed: #235 get_info tool is now bundled into Phase 1
  scope because the Phase 1 sovereignty consent surface depends on it
- Added as explicit Phase 1 migration plan step 15 with scope:
  version/uptime/features, provider classification display, first-
  time-seen briefing plumbing, recurring unprotected warning
  plumbing
- Rationale documented inline: rather than split sovereignty across
  two phases, bundle #235 so the framework ships with its consent
  surface intact
- Merge checklist updated to note #235 bundling in the issue body
  follow-up

Observations addressed:
- Policy extraction: added post-Phase-1 follow-up to extract the
  sovereignty policy to docs/policy/data-sovereignty.md once it
  stabilizes through implementation
- User-floor opt-out: added one-paragraph "Operator-floor consequence
  for users" note in the sensitivity routing section — users cannot
  opt out of operator-configured local-only tags even with informed
  consent; relevant for managed multi-tenant deployments
- text_hash language consideration: added as Layer 2 follow-up in
  post-Phase-1 section

Layer 3 risk bullet updated to point at the sovereignty policy
instead of the superseded "local only, period" framing.

Refs PR #241 round-2 QA review, sovereignty policy refinement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: promote data_sovereignty to first-class entry field

Round-4 amendment: decouple the sovereignty control signal from
language-coupled plain-text tags. The `sensitive` tag was an English-
biased default that forced non-English users to either memorize an
English word or discover the operator tag-trigger configuration. A
control signal has no business being expressed in natural language.

New primary mechanism: `data_sovereignty` field on entries
- Type: Literal["strict"] — enum-style for forward compatibility with
  future values (audit, public, etc.) that might land later
- Stored in entry.data JSONB payload
- Absent means "no constraint"; "strict" means "must use trust-
  anchor-B for all inference, even if C is available"
- Language-independent by design
- Write-tool API: every write tool (remember, add_context,
  learn_pattern, update_entry, report_alert, report_status, remind)
  gains an optional data_sovereignty parameter
- Read-tool filter: get_knowledge gains a data_sovereignty filter
- get_info exposes aggregate counts of strict vs unconstrained

Tag-trigger convenience layer (demoted from primary to sugar):
- AWARENESS_SOVEREIGNTY_STRICT_TAGS env var — default EMPTY, no
  English bias in shipped config
- users.preferences.sovereignty_strict_tags — user additions
- At write time, if any entry tag matches the combined set, coerce
  data_sovereignty="strict" before storing
- Inference time reads only the field (single dict lookup) — tag-
  trigger resolution is entirely at write time

Renamed throughout:
- "Per-entry sensitivity routing" → "Per-entry sovereignty routing"
- "sensitive entries" → "strict entries"
- "AWARENESS_LOCAL_ONLY_TAGS" → "AWARENESS_SOVEREIGNTY_STRICT_TAGS"
- "users.preferences.local_only_tags" → "users.preferences
  .sovereignty_strict_tags"
- "The three tradeoffs of sensitivity routing" → "The three
  tradeoffs of sovereignty routing"

Why "data_sovereignty" (verbose) over "local_only" (shorter):
"local" is ambiguous (local process? local host? local network?
local to what?), while data_sovereignty ties directly to the policy
terminology. LLMs in agent contexts see the Literal schema and can
translate user intent into the appropriate value — the verbosity at
the API boundary simplifies into user-friendly presentation at the
agent boundary.

Phase 1 migration plan updated:
- data_sovereignty field implementation (field, write-tool params,
  read-tool filter, aggregate counts in get_info)
- Tag-trigger convenience layer with empty default
- Write-time coercion from tag match to field
- Inference-time check reads the field, not the tags

Acceptance criteria updated:
- data_sovereignty field round-trips through storage and read
- get_knowledge filter by data_sovereignty
- get_info aggregate counts
- Tag-trigger convenience layer
- Empty default for AWARENESS_SOVEREIGNTY_STRICT_TAGS verified
- Deployment-time mismatch warning on configured sovereignty without
  trust-anchor-B available
- Unknown values on write return structured errors listing valid
  values

Remaining "sensitive" occurrences in the doc are deliberate: one in
the "Why a field instead of a tag" rationale, others as English
example values in illustrative tag-trigger strings.

Refs PR #241 round-3 discussion on language support for reserved
control tags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: fix round-3 QA findings (doc drift + sticky semantics + backfill)

Round-4 QA found one doc-drift blocker and two specification items.
All three addressed in this commit:

1. Doc drift (blocker — fixed here):
   - Three "sensitivity routing" leftovers renamed to "sovereignty
     routing" per project convention (lines 280, 290, 776 in the
     prior state)
   - zero remaining occurrences of "sensitivity routing" in the doc

2. update_entry sticky semantics (was a "specify before code" item,
   now locked in the design):
   - New "Sticky semantics: update_entry never silently weakens
     sovereignty" subsection added to the per-entry sovereignty
     routing section
   - Rule: once data_sovereignty="strict" is set, tag edits alone
     cannot clear it. Only an explicit data_sovereignty=None (or
     future "unset" sentinel) on a subsequent update can clear the
     strict constraint.
   - Rationale: sovereignty is a *promise*, tags are *labels*,
     editing a label should not silently break a promise
   - Concrete update_entry algorithm documented (explicit param
     wins; otherwise preserve existing; tag-add can coerce to
     strict, tag-remove never re-evaluates)
   - Phase 1 step 14 adds the sticky semantics implementation
     requirement with explicit test coverage for three paths:
     tag-add coerces, tag-remove preserves, explicit clear works
   - New Layer 1 AC

3. Write-time-only coercion limitation + backfill CLI (was a
   "specify before code" item, now locked in the design):
   - New "Write-time-only coercion: limitation and backfill"
     subsection documenting the gap between operator/user mental
     model ("I added a strict tag, all matching entries are now
     strict") and implementation behavior (forward-only)
   - Backfill CLI tool specified: mcp-awareness-sovereignty-backfill
     with --dry-run and --owner-id flags
   - Idempotent, additive-only (never removes strict from an entry)
   - Phase 1 step 14 adds the backfill CLI as an implementation
     requirement
   - New Layer 1 AC covering the backfill tool

Per QA feedback: "these are substantive but they're 'specify before
code' items, not 'fix before merge' items the way the doc drift is."
I'm addressing them in the doc anyway because the user's direction
was "nail this down best we can, then start work" — locking the
specification in markdown now is cheaper than discovering it
mid-implementation.

The additive-only semantics of the backfill CLI is explicitly
consistent with the update_entry sticky semantics: neither mechanism
can downgrade an entry's sovereignty without explicit intent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: deployment compatibility + interface localization framing

Round-5 amendment addressing two questions the user raised after
round-4 signoff:

1. Interface localization: are we consistent about "multilingual"
   when tools, prompts, errors are all English?
2. AWS / EKS deployment compatibility: does any design decision
   preclude managed-Postgres deployment?

Both answers are captured in a new "Deployment compatibility and
interface localization" subsection, inserted between the Risks
section and the Out of scope section.

Deployment compatibility answer:

Phase 1 and Phase 2 work unchanged on AWS RDS, Aurora PostgreSQL-
compatible, GCP Cloud SQL, and Azure Database. All required Postgres
features (pgvector, HNSW, FTS, regconfig, generated columns, RLS,
JSONB, logical WAL) are stock Postgres available on managed services.

Phase 3 (pgroonga CJK support) is incompatible with managed Postgres
— pgroonga is not on any managed provider's supported extension list
as of this design's date. Deployment matrix documents the tradeoff:

- RDS/Aurora + Ollama or enterprise OpenAI = works, no pgroonga CJK,
  falls back to 'simple' regconfig for CJK languages
- Self-managed Postgres on EC2/EKS/GKE/AKS with custom image =
  works, full feature set including pgroonga

Graceful degradation: the sovereignty policy's unsupported-language
fallback (missing-ts-config alert + 'simple' regconfig) means CJK
users on managed Postgres get degraded but working retrieval. Vector
branch still handles cross-lingual retrieval via Layer 2. The
managed awareness cloud offering (future) will have to choose
between RDS simplicity and self-managed full feature set; documented
as informational, no decision made.

Interface localization answer (Option D, per user direction):

Dual-lane framing. The interface (tool names, docstrings, prompts,
errors, CLI help) stays English like MCP itself, like SQL keywords,
like Python syntax. The data (entries, search, briefings about
entries) stays native. Two specific exceptions where interface
English bleeds into user-facing content enough to justify
localization:

1. Briefing template wrapper strings ("Status summary:", etc.) —
   headings that mix with the user's content in generated briefings
2. instructions.md — server instructions read by the client LLM at
   connection time, which shape agent behavior

These two hooks are tracked in new issue #242 as a separate
follow-up, not Phase 1 scope. #242 depends on Phase 1 landing first
(get_info + users.preferences.language) and can ship in parallel
with Layer 2 or Layer 3.

Translation pipeline per user: LLM-generated initially, community
contributions via Issues and PRs welcome. Quality floor: no partial
translations — a language has both briefing + instructions or falls
back entirely to English.

Merge checklist updated:
- New item: update issue #111 body to require enterprise-tier
  (trust-anchor-C) for any cloud embedding provider shipped as a
  default
- New item: cross-link #242 (interface localization follow-up)

Refs PR #241 round-5 discussion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: fix round-5 QA findings (wal_level + observations)

Round-6 fix addressing the one substantive doc-drift item and all
three observations from round-5 QA review.

Substantive (blocker — fixed):

- Remove wal_level=logical from the managed Postgres compatibility
  list. It is NOT required for hybrid retrieval, sovereignty, or
  anything in Phase 1+2. It is part of the standard awareness
  Postgres parameter set for Debezium CDC readiness — a separate
  forward-looking concern unrelated to this design doc. Leaving it
  in the compatibility list as a "required feature" was misleading
  — an operator reading the list would reasonably conclude they
  need to configure it on their RDS parameter group just to deploy
  hybrid retrieval.
- Replaced with a "Not required for hybrid retrieval" clarifying
  paragraph explicitly stating wal_level=logical is part of
  awareness's broader parameter set (cited docs/data-dictionary.md)
  but operators deploying only hybrid retrieval do not need it.

Observation 1 (expand Phase 1.0 verification to managed Postgres):

- Added an explicit verification step: run the same schema
  verification against AWS RDS Postgres 17 (and Aurora
  PostgreSQL-compatible if accessible) to confirm the generated
  tsvector + per-row regconfig pattern works on managed providers,
  not just on the docker pg17 image. RDS has subtle differences
  from upstream (parameter group defaults, extension versioning,
  no superuser, role/permission model) and the deployment
  compatibility claim in the new subsection is only as good as the
  verification that proves it.
- If Aurora access isn't available to the verifier, document the
  gap as an open verification item rather than assuming it works.
- Fallback semantics also extended: "if the generated-column
  approach fails on PG17 (docker image or managed provider), the
  fallback is a BEFORE INSERT/UPDATE trigger" — same mechanism,
  clearer scope.

Observation 2 (deployment matrix #111 forward reference):

- Two rows in the deployment matrix reference "enterprise-tier
  OpenAI embeddings" which is tracked in #111 (still open, out of
  scope for this design). Changed the "Works?" column from ✅ to
  "⏳ once #111 ships and is configured as trust-anchor-C per the
  sovereignty policy" for both rows, with a link to #111.
- Makes it explicit that the matrix is forward-looking for those
  rows, not a claim about the current state.

Observation 3 (users.preferences.language overload):

- Not a design doc change — this is an implementation concern for
  issue #242 (interface localization follow-up). Posted a
  dedicated implementation note comment on #242 flagging the
  overload (content_language vs interface_language), presenting
  two implementation paths (single field with documented
  limitation, or split fields with backwards-compat fallback),
  and recommending the split.
- The point of the flag is to make the implementer decide
  deliberately rather than stumble into the overload by reusing
  the field name without thinking about it.

Refs PR #241 round-5 QA review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: cmeans-claude-dev[bot] <3223881+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 21, 2026
…334) (#351)

Closes [#334](#334). Also
closes CodeQL alerts #1, #2, #3 (three flags of
`actions/missing-workflow-permissions` in `ci.yml`).

## Summary

Two workflow-hardening fixes bundled because they're the same theme
(least-privilege + contributor-controlled-input discipline) and both
surfaced from the same security review:

### 1. `pr-labels.yml` — cascade the `#333` env-routing pattern (closes
#334)

Three job steps in `pr-labels.yml` (`on-push`, `on-unlabel`, `on-label`)
previously inlined contributor-visible fields as `${{ ... }}`
expressions inside `run:` bodies:

```yaml
# Before
run: |
  PR=${{ github.event.pull_request.number }}
  REPO=${{ github.repository }}
  HEAD_SHA=${{ github.event.pull_request.head.sha }}
  # ... shell script references PR / REPO / HEAD_SHA
```

Now they route through step-level `env:` and are referenced as shell
variables:

```yaml
# After
env:
  PR: ${{ github.event.pull_request.number }}
  REPO: ${{ github.repository }}
  HEAD_SHA: ${{ github.event.pull_request.head.sha }}
run: |
  # ... shell script references "$PR" / "$REPO" / "$HEAD_SHA"
```

**Not a currently-exploitable bug.** The `on: pull_request:` trigger
means fork PRs get a read-only `GITHUB_TOKEN` — the `gh pr edit
--add-label` / `--remove-label` calls would be rejected from a fork PR
regardless of what `PR`/`REPO`/`HEAD_SHA` contained. And all three
values are typed (numeric PR, repo name validated by GHA, hex SHA) —
none come from user-authored text like titles or bodies.

**Why change it anyway**, per #334's rationale:

- **Trigger-drift risk.** If `pr-labels.yml` ever switches to
`pull_request_target` (to allow label automation on fork PRs), the same
injection class that #333 closed on `pr-labels-ci.yml` reappears — and
now the hardening would already be in place.
- **Parameterization-drift risk.** A future maintainer adding a
contributor-authored string field (label name, PR title fragment, branch
name) to a `run:` block won't be prompted to route via `env:` first
because the file already establishes the inline `${{ ... }}` style as
"fine here."
- **Cascade consistency.** `pr-labels-ci.yml` uses env-routing since
#333; having the sibling workflow use a different style is a readability
cost for anyone auditing the repo.

### 2. `ci.yml` — add workflow-level `permissions: contents: read`
(closes CodeQL #1/#2/#3)

`ci.yml` had no `permissions:` block at workflow or job level, so all
three jobs (`lint`, `typecheck`, `test`) inherited whatever repo-level
default `GITHUB_TOKEN` scope is configured. CodeQL flagged this three
times (one per job).

Fix: declare `permissions: contents: read` at the workflow level. Every
job inherits read-only content access, which is sufficient for lint /
typecheck / pytest / codecov. No job actually needs write access to
anything.

## Audit sweep results

While touching workflow files, checked all six for missing
`permissions:`:

| Workflow | Had `permissions:`? | This PR's action |
|----------|---------------------|------------------|
| `ci.yml` | No (CodeQL flagged 3x) | Added `contents: read` at workflow
level |
| `docker-publish.yml` | Yes, line 23 | No change |
| `docker-smoke.yml` | Yes, line 40 (from #350) | No change |
| `pr-labels-ci.yml` | Yes, line 35 (from #333) | No change |
| `pr-labels.yml` | Yes, line 26 | No change to permissions block;
env-routing changes only |
| `qa-gate.yml` | Yes, line 24 | No change |

`ci.yml` was the last gap. Sweep is complete.

## Scope

- `.github/workflows/ci.yml` — `+7 lines` (permissions block with inline
rationale comment)
- `.github/workflows/pr-labels.yml` — `+10, -11` (three `run:` bodies
lose two shell-assignment lines each; three `env:` blocks gain two-three
entries each; explanatory comment added in the `on-unlabel` case)
- `CHANGELOG.md` — `+4 lines` (new `### Security` subsection under
`[Unreleased]`)

No source, no tests, no migrations.

## References

- Closes [#334](#334)
- Closes CodeQL alerts #1, #2, #3
(`actions/missing-workflow-permissions` on `ci.yml:27/41/53`)
- Cascade source: PR
[#333](#333) (same pattern
for `pr-labels-ci.yml`, which closed #332)
- Related CodeQL alerts not addressed by this PR: #5/#6/#7/#8 (OAuth
clear-text logging in `oauth.py` and `oauth_proxy.py`) — separate audit
PR, coming next. #4 (socket bind in tests) — dismiss via UI.

## QA

### Prerequisites

None. Pure workflow-YAML changes.

### Automated checks

- `lint`, `typecheck`, `test (3.10/3.11/3.12)` — none touch YAML, should
remain green.
- `CodeQL (actions)` — will re-scan `ci.yml` and `pr-labels.yml` on this
PR. Expected outcome: alerts #1/#2/#3 flip to "fixed" on merge; no new
alerts introduced.
- `docker-smoke` — not triggered (no changes under `Dockerfile` /
`pyproject.toml` / `uv.lock` / `.dockerignore`).

### Manual tests

1. - [x] **Both workflow files parse.**
     ```
python3 -c "import yaml; [yaml.safe_load(open(f)) for f in
['.github/workflows/ci.yml', '.github/workflows/pr-labels.yml']];
print('OK')"
     ```
     Expected: `OK`.

2. - [x] **`ci.yml` now has `permissions: contents: read`.**
     ```
     grep -A1 '^permissions:' .github/workflows/ci.yml
     ```
     Expected: `permissions:` header followed by `  contents: read`.

3. - [x] **No contributor-controlled inputs in `pr-labels.yml` `run:`
bodies.**
     ```
awk '/^[[:space:]]+run: \|/,/^[[:space:]]+-
name:|^[[:space:]]{2,6}[a-z-]+:$/' .github/workflows/pr-labels.yml |
grep -nE '\$\{\{ *github\.(event|repository|head_ref)' || echo "(none —
good)"
     ```
Expected: `(none — good)`. All `github.event.*` / `github.repository`
references are now in `env:` blocks (and in job-level `if:`
conditionals, which is safe context).

4. - [x] **All six workflows now have `permissions:`.**
     ```
     for f in .github/workflows/*.yml; do
if ! grep -q '^permissions:\|^ permissions:\|^ permissions:' "$f"; then
         echo "$f: MISSING permissions"
       fi
     done
     echo "(if no 'MISSING' lines above, sweep is complete)"
     ```
     Expected: no `MISSING` lines.

5. - [x] **Label automation still functions on this PR.** When I push,
`pr-labels.yml`'s `on-push` should reset labels to `Awaiting CI` and
strip any stale QA labels. When `Dev Active` is removed, `on-unlabel`
should promote to `Ready for QA` after CI passes. Empirically validated
if the label transitions on this PR itself behave identically to recent
merged PRs (self-test).

6. - [x] **`permissions: contents: read` doesn't break anything.** Lint
/ typecheck / pytest / codecov upload only need read access to
`GITHUB_TOKEN` — none of them push labels, create comments, or mutate
repo state. If any of the existing CI checks start failing on this PR
with "resource not accessible" errors, that's a signal the permissions
block is too tight (unlikely, but the empirical test is: does this PR's
CI go green?).

7. - [x] **Diff review.**
     ```
     git diff --stat origin/main
     ```
Expected: `.github/workflows/ci.yml` (+7),
`.github/workflows/pr-labels.yml` (+10, -11), `CHANGELOG.md` (+4).
Nothing else.

### Acceptance

- ✅ `#334` — symmetric env-routing cascade landed in `pr-labels.yml`
- ✅ CodeQL `#1`, `#2`, `#3` — `ci.yml` now has explicit `permissions:`
- ☐ CodeQL re-scan confirmation — post-merge, the three alerts flip from
Open → Fixed automatically on the next `Analyze (actions)` run against
`main`

Post-merge, also worth a look at CodeQL's /security/code-scanning
dashboard to confirm Open count drops from 8 → 5 (just the four
OAuth-logging + the one test-file socket-bind remaining).

Co-authored-by: cmeans-claude-dev[bot] <3223881+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 21, 2026
…ng flags (#5/#6/#7/#8) (#352)

Closes CodeQL alerts **#5, #6, #7, #8, #9**
(`py/clear-text-logging-sensitive-data` in `oauth.py` +
`oauth_proxy.py`).

## Round 3 — IP-header-chain redesign

The round-1 `_format_ip_header_chain` helper (see description below in
§"The fix") didn't break CodeQL's dataflow — the analyzer tracked
`header_chain` through the helper call and CodeQL #9 duplicated #8's
flag on the new form. Round-3 commit `e31561c` **removes the helper**
and instead splits the log by config path: the default case logs a
module-level string literal (no taint flow), the override case logs only
the count + env-var pointer (names never reach the sink). Both #8 and #9
close through code change — no UI dismissal.

This is the code-improvement path per the user preference for a
"reasonable wording change that would make CodeQL happy without
sacrificing log usefulness" rather than inline suppression.

## The problem

CodeQL flagged four log statements in the OAuth paths:

| Alert | File | Line | What it logs |
|-------|------|------|--------------|
| #5 | `oauth.py` | 81 | `discovery_url` + exception when OIDC discovery
fails |
| #6 | `oauth.py` | 85 | `self.issuer` when falling back to default JWKS
path |
| #7 | `oauth_proxy.py` | 225 | `discovery_url` + exception in the
proxy's discovery path |
| #8 | `oauth_proxy.py` | 312 | IP-header-name chain config |

CodeQL's `py/clear-text-logging-sensitive-data` uses dataflow from
"sensitive sources" (values tracked as credential-adjacent) to log
sinks. In our case, all four sites log values that are **not actually
secrets today** — `self.issuer` is a config URL, header names aren't
credentials — but CodeQL taints conservatively because the values flow
from constructor parameters in auth-adjacent contexts.

## The fix — defensive URL redaction, not suppression

Two new helpers:

### `safe_url_for_log(url)` (in `helpers.py`)

Strips credential-carrying URL components before logging:

- **`userinfo`** — RFC 3986 lets URLs embed `user:password@host`. An
OAuth issuer URL misconfigured that way would leak credentials through
every log line that mentioned it.
- **`query string`** — OAuth authorization-code flow passes `code`,
`state`, and PKCE verifiers here. Anything on the query string is a
candidate to be secret.
- **`fragment`** — OAuth implicit flow (deprecated but still
occasionally in the wild) puts access tokens in the fragment.

Returns `scheme://host[:port]/path` — preserves the operator-meaningful
portion so an operator reading a log can still identify the endpoint,
while removing anything that could leak a credential.

Defensive rather than currently-exploitable: **our issuer URLs today are
plain `https://provider/...` strings with no userinfo/query/fragment**.
The helper ensures a future misconfiguration, provider change, or
redirect can't change that. Unparseable or empty inputs return
`"<redacted url>"` — a log helper should never itself become a failure
source.

**Applied at every URL-logging site in the OAuth paths**, not just the
four CodeQL flagged — consistency:

- `oauth.py:_discover_oidc_config` — 4 log statements (discovered JWKS
URI, discovered userinfo endpoint, discovery-request failure,
discovery-fallback warning)
- `oauth_proxy.py:discover_oidc_endpoints` — 2 log statements (discovery
failure, discovered-endpoints summary)

### IP-header-chain log split by config path (in `oauth_proxy.py`)

The round-1 approach wrapped the IP-header-chain log in a
`_format_ip_header_chain` helper to try to break CodeQL's taint-flow
through function indirection. That didn't work — CodeQL #9 duplicated
#8's flag on the helper-call form. Round 3 takes a different approach:
**split the log statement by whether defaults are in use, and remove the
taint-flow path entirely**:

- **Default branch** (`ip_headers is None`): log a module-level literal
`_DEFAULT_IP_HEADER_DISPLAY = "CF-Connecting-IP → X-Real-IP → ASGI
client"`. Source-literal string; no contributor-controlled value flows
into the sink.
- **Override branch** (env-configured `ip_headers`): log only the
*count* of custom entries and a pointer to
`AWARENESS_OAUTH_PROXY_IP_HEADERS` so operators can inspect the env var
directly. The names themselves never reach a log sink.

Trade-off: operators running with a non-default header chain don't see
the names in logs — they see the count + env-var name. Small loss of log
convenience for the minority-override case; full operator clarity for
the default case (which most deployments run). Closes CodeQL #8 and #9
cleanly without UI dismissal.

## Testing

11 new unit tests in `tests/test_helpers.py::TestSafeUrlForLog`:

- `test_plain_https_preserved` — unaltered URLs round-trip
- `test_strips_userinfo` — `user:password@host` → `host`
- `test_strips_query_string` — `?code=secret&state=xyz` → `` (empty)
- `test_strips_fragment` — `#access_token=secret` → `` (empty)
- `test_strips_all_credential_carrying_parts_at_once` — composite test
- `test_preserves_non_default_port` — `:8443` survives
- `test_empty_string_returns_placeholder` — `""` → `"<redacted url>"`
- `test_unparseable_string_returns_placeholder` — `"not a url"` →
`"<redacted url>"`
- `test_no_netloc_returns_placeholder` — `"/path/only"` → `"<redacted
url>"`
- `test_path_preserved_exactly` — deep paths with dots/dashes unchanged

**All existing 120 OAuth tests (`tests/test_oauth.py` +
`tests/test_oauth_proxy.py`) continue to pass** against the refactored
code — no behavior regression, only log-output change.

## Scope

- `src/mcp_awareness/helpers.py` — `+42` (new `safe_url_for_log` helper
+ docstring)
- `src/mcp_awareness/oauth.py` — `+10, -4` (import + 4 log-site changes)
- `src/mcp_awareness/oauth_proxy.py` — `+34, -10` (import +
`_DEFAULT_IP_HEADER_DISPLAY` module constant + 3 log-site changes,
including the split-by-config-path IP-header-chain log added in round 3)
- `tests/test_helpers.py` — `+69` (import + 11 new test cases in
`TestSafeUrlForLog` class)
- `CHANGELOG.md` — `+3` (new `### Security` entry under `[Unreleased]`)

No migrations. No source API surface changed. No log-message semantics
changed beyond the URL-component redaction.

## References

- Closes CodeQL alerts #5, #6, #7 (direct — URL now redacted before
flowing to log sink)
- #8 and #9 also closed — the round-3 split-by-config-path log statement
removes taint-flow entirely; no UI dismissal required
- Session context: #334 hardening cascade and ci.yml permissions closed
alerts #1/#2/#3 via PR #351. This PR addresses Group B of yesterday's
CodeQL audit.

## QA

### Prerequisites

None. Pure code changes, covered by new unit tests and existing
integration tests.

### Automated checks

- `lint` / `typecheck` — clean locally
- `test (3.10/3.11/3.12)` — `tests/test_helpers.py::TestSafeUrlForLog`
adds 11 cases; existing OAuth tests continue to pass
- `CodeQL (python)` — expected outcome: #5, #6, #7, #8, #9 all flip Open
→ Fixed on merge.
- `codecov/patch` — new helper is covered by the 11 new test cases;
should be 100% on touched lines
- `docker-smoke` — **not triggered** (no `Dockerfile` / `pyproject.toml`
/ `uv.lock` / `.dockerignore` change)

### Manual tests

1. - [x] **All tests pass locally.**
     ```
python -m pytest tests/test_helpers.py tests/test_oauth.py
tests/test_oauth_proxy.py -q
     ```
     Expected: all pass, including the 11 new `TestSafeUrlForLog` cases.

2. - [x] **`safe_url_for_log` correctly strips each sensitive
component.**
     ```
     python -c "
     from mcp_awareness.helpers import safe_url_for_log as f
assert f('https://user:secret@host/path') == 'https://host/path',
'userinfo'
assert f('https://host/path?code=secret') == 'https://host/path',
'query'
assert f('https://host/path#token=secret') == 'https://host/path',
'fragment'
assert f('https://u:p@host:8443/oauth?c=1#t=2') ==
'https://host:8443/oauth', 'composite'
     assert f('') == '<redacted url>', 'empty'
     assert f('not a url') == '<redacted url>', 'unparseable'
     print('all manual assertions pass')
     "
     ```
     Expected: `all manual assertions pass`.

3. - [x] **Log-output semantics unchanged for normal URLs.** For any
sensible OIDC issuer URL
(`https://api.workos.com/user_management/client_X`), the log now reads
identically to before — same host, same path. Verify by eyeballing an
OAuth auth flow locally and confirming `Discovered JWKS URI: <url>` /
`OAuth proxy: discovered endpoints — authorize=<url>, …` log lines
render as before when URLs don't contain credentials.

4. - [x] **No contributor-controlled inputs are now logged.** Grep for
any remaining `%s` + URL arg that isn't routed through the helper in the
OAuth paths:
     ```
grep -nE
'logger\.(info|warning|error).*%s.*(discovery_url|self\.issuer|userinfo_endpoint|authorization_endpoint|token_endpoint|registration_endpoint|header_chain|jwks)'
src/mcp_awareness/oauth.py src/mcp_awareness/oauth_proxy.py | grep -v
safe_url_for_log | grep -v _format_ip_header_chain || echo "(none —
good)"
     ```
Expected: `(none — good)`. Every URL-logging site has been routed
through a redactor.

5. - [x] **CHANGELOG entry documents the defensive rationale.**
     ```
     grep -nE '(credential|redact)' CHANGELOG.md | head -3
     ```
Expected: the new `### Security` entry mentions credential-carrying
components and redaction.

6. - [x] **Diff review.**
     ```
     git diff --stat origin/main
     ```
Expected: 5 files changed. `CHANGELOG.md` (+3),
`src/mcp_awareness/helpers.py` (+42), `src/mcp_awareness/oauth.py` (+10,
-4), `src/mcp_awareness/oauth_proxy.py` (+34, -10),
`tests/test_helpers.py` (+69). Nothing else.

7. - [ ] **(Post-merge) Confirm CodeQL alerts flip.** After merge,
`Analyze (python)` re-runs on `main`. At
https://github.com/cmeans/mcp-awareness/security/code-scanning:
- Alerts #5, #6, #7 flip Open → Fixed (each was
`clear-text-logging-sensitive-data` in an `oauth.py`/`oauth_proxy.py`
file — the URL arg is now routed through `safe_url_for_log`, which
CodeQL should see as a sanitizer).
- Alerts #8 and #9 both flip Open → Fixed — the round-3
split-by-config-path log statement removes the taint-flow path entirely.

### Acceptance

- ✅ CodeQL #5 — `oauth.py:82` discovery-request-failed log: URL now
`safe_url_for_log(discovery_url)`.
- ✅ CodeQL #6 — `oauth.py:91` default-JWKS-path log: URL now
`safe_url_for_log(self.issuer)`.
- ✅ CodeQL #7 — `oauth_proxy.py:229` discovery-failed log: URL now
`safe_url_for_log(discovery_url)`.
- ✅ CodeQL #8 — `oauth_proxy.py:330-345` header-chain log: split into
default (logs literal `_DEFAULT_IP_HEADER_DISPLAY` constant — no taint
flow) and override (logs count + env-var pointer — header names never
reach the log sink).
- ✅ CodeQL #9 — same fix. The round-1 `_format_ip_header_chain` helper
triggered a duplicate flag; removing the helper and splitting by config
path resolves both #8 and #9.

---------

Co-authored-by: cmeans-claude-dev[bot] <3223881+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cmeans-claude-dev cmeans-claude-dev Bot mentioned this pull request Apr 22, 2026
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 22, 2026
Patch release stamping six PRs merged to `main` since v0.18.1 on
2026-04-20.

## Summary

Two-file diff:

- `pyproject.toml` — `version` bump `0.18.1` → `0.18.2`
- `CHANGELOG.md` — `[Unreleased]` renamed to `[0.18.2] - 2026-04-21`;
new empty `[Unreleased]` section seeded; comparison-link footer updated

## Why patch

- No new MCP tools, no changed tool signatures, no resource changes.
- No breaking config, no migration, no data-format change.
- `requires-python = ">=3.10"` floor unchanged in `pyproject.toml`.
- Dockerfile base bump (3.12 → 3.13) is runtime-transparent to image
consumers; CI matrix widening (3.13, 3.14) is pure infra.
- OAuth log-redaction is security-hardening with no behavior change on
the happy path.
- `docker-compose` host-port parameterization is backward-compatible —
default behavior unchanged.

Textbook patch bump for a 0.x project.

## Included PRs

| PR | Title | Kind |
|---|---|---|
| [#351](#351) | ci: cascade
env-routing to `pr-labels.yml` + workflow permissions | Security |
| [#352](#352) | fix(oauth):
redact URLs in log output (CodeQL #5-#9) | Security |
| [#350](#350) | ci: add
`docker-smoke` workflow — build + import smoke on Dockerfile PRs | Added
|
| [#353](#353) |
chore(compose): parameterize host port in `docker-compose.yaml` |
Changed |
| [#354](#354) | ci: extend
Python test matrix to include 3.13 and 3.14 | Added |
| [#355](#355) |
chore(docker): bump base image from `python:3.12-slim` to `3.13-slim` |
Changed |

All six merged via their own QA-Approved cycles — nothing in this
release bypasses the standard pipeline.

## What's unchanged

- `docker-compose.yaml` — uses `:latest`, no version bump needed
- `README.md` — tool count (32) and text-mode content unchanged; no
update needed
- `uv.lock` — no dep changes in any of the six PRs

## QA

Lightweight per project convention — all substantive code was tested in
its own PR. Review-only checks:

1. - [x] **`pyproject.toml` version** is `0.18.2`. Verify line 3:
`version = "0.18.2"`.
2. - [x] **CHANGELOG** — `[0.18.2] - 2026-04-21` heading exists; the six
rolled-up entries sit beneath it in their original order (Changed →
Added → Changed → Security → Security → Added); empty `[Unreleased]`
seeded above.
3. - [x] **Comparison links** — `[0.18.2]: …v0.18.1...v0.18.2` added;
`[Unreleased]` now points at `v0.18.2...HEAD`.
4. - [ ] **Scope** — `git diff --stat origin/main` shows exactly
`CHANGELOG.md` (+4, -1) and `pyproject.toml` (+1, -1). Nothing else.
5. - [x] **No accidental content drift in rolled-up entries** — diff
between this branch's `[0.18.2]` section and what was in `[Unreleased]`
on `main` before this PR should be zero beyond the heading/anchor move.

### Acceptance

- ✅ CI green
- ☐ Merge + tag (Dev authorization, executed post-merge)

## Merge + tag (Dev post-merge action)

After merge, Dev runs:

\`\`\`
git checkout main && git pull --ff-only origin main
git tag -a v0.18.2 -m "v0.18.2 — CI matrix widening (3.13/3.14),
Dockerfile to python:3.13-slim, docker-smoke workflow, compose host-port
parameterization, OAuth log redaction, workflow permission hardening"
git push origin v0.18.2
\`\`\`

The tag triggers \`docker-publish.yml\` to build and publish the
\`:v0.18.2\` + \`:latest\` images.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: cmeans-claude-dev[bot] <3223881+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 22, 2026
Closes medium-severity gap #5 from the 2026-04-21 contribution-safety
audit (`audit-contribution-safety-2026-04-21`). Third of three planned
Phase 2 PRs; docs was
[#357](#357) (merged),
ruleset+UI changes handled out-of-band (see
`decision-ruleset-main-protection-2026-04-21` in awareness).

## Summary

Moving major-version tags on third-party GitHub Actions (`@v6`, `@v3`,
etc.) are an unreviewed supply-chain channel — the upstream repo owner
can move the tag to any commit at any time, including after a
compromise. Pinning to a full 40-char commit SHA freezes the code that
actually runs in CI to what was reviewed at this commit.

Each pinned line carries a trailing `# <action> pinned to full commit
SHA — <version>` comment so:
- The human-readable version stays visible to reviewers.
- Dependabot's `github-actions` ecosystem (configured in
`.github/dependabot.yml`) continues to open update PRs — it respects SHA
pins and updates both the SHA and the version comment in lockstep.

## Scope

**4 files changed, +25, -7** (verified via `git diff --shortstat
origin/main`).

| File | ± | Notes |
|---|---|---|
| `.github/workflows/ci.yml` | +2, -1 | codecov/codecov-action |
| `.github/workflows/docker-smoke.yml` | +4, -2 |
docker/setup-buildx-action, docker/build-push-action |
| `.github/workflows/docker-publish.yml` | +8, -4 |
docker/setup-buildx-action, docker/login-action (×2),
docker/build-push-action |
| `CHANGELOG.md` | +11, -0 | `### Security` entry under `[Unreleased]` |

**7 `uses:` lines touched, 6 distinct action@tag pairs**
(`docker/login-action@v4` is invoked twice in `docker-publish.yml`, once
for GHCR and once for Docker Hub; both pinned to the same SHA).

## The six pinned pairs

Each SHA was resolved from the upstream tag via `gh api
repos/<action>/git/refs/tags/<tag>` at PR creation time, cross-verified
that the exact commit also carries a specific-version tag (`vX.Y.Z`).

| Action | Old | New SHA | Version |
|---|---|---|---|
| `codecov/codecov-action` | `@v6` |
`57e3a136b779b570ffcdbf80b3bdc90e7fab3de2` | v6.0.0 |
| `docker/setup-buildx-action` | `@v3` |
`8d2750c68a42422c14e847fe6c8ac0403b4cbd6f` | v3.12.0 |
| `docker/build-push-action` | `@v6` |
`10e90e3645eae34f1e60eeb005ba3a3d33f178e8` | v6.19.2 |
| `docker/setup-buildx-action` | `@v4` |
`4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd` | v4.0.0 |
| `docker/login-action` | `@v4` |
`4907a6ddec9925e35a0a9e82d7399ccc52663121` | v4.1.0 |
| `docker/build-push-action` | `@v7` |
`bcafcacb16a39f128d818304e6c9c0c18556b85f` | v7.1.0 |

## What's unchanged

- `actions/checkout` and `actions/setup-python` — first-party GitHub
actions; different trust boundary (GitHub itself is the upstream).
Tag-pinned `@v6` as before.
- Action *versions* are unchanged — the SHAs resolve to the exact same
commits `@v3`/`@v4`/`@v6`/`@v7` currently point to. No behavior change
expected.
- Inconsistency in `docker-publish.yml` vs `docker-smoke.yml` (buildx v3
vs v4; build-push v6 vs v7) is preserved. Bundling version alignment
with SHA-pinning would mix concerns; Dependabot can propose alignment as
its own PR.

## Risks and unknowns

- **Dependabot update noise.** Once this lands, Dependabot may open six
update PRs (one per action family) for the next patch/minor release.
That's the expected, correct behavior — the github-actions ecosystem is
configured with "group all" policy in `dependabot.yml`, so noise is
bounded.
- **Upstream repo-renames break SHA refs eventually.** If e.g.
`docker/build-push-action` is ever renamed or deleted, `@<sha>` won't
resolve. Low probability; detection is a CI failure, not silent
breakage.
- **`codecov/codecov-action@57e3a136…` resolves to v6.0.0, not a later
v6.x.** That's what the floating `@v6` tag currently points to, so no
behavior change from main — we just froze the current state. If codecov
publishes v6.1.0 later, Dependabot will propose the update.

## References

- Audit report: `audit-contribution-safety-2026-04-21` (id `0c79b026`)
- Originating handoff: awareness intention `598956c6` (claude.ai,
2026-04-19, state: active)
- Earlier Phase 2 PR:
[#357](#357) (docs template
+ CONTRIBUTING + README, merged)
- Ruleset changes (not a PR):
`decision-ruleset-main-protection-2026-04-21` (id `dedf7748`)

## QA

### Prerequisites

None. Workflow-file change; CI executes it automatically.

### Automated checks

- **`docker-smoke` — load-bearing.** Exercises the two SHA-pinned
actions in `docker-smoke.yml` (`setup-buildx@8d2750c6…`,
`build-push@10e90e36…`) against the current Dockerfile. Green smoke =
the pinned SHAs resolve and behave identically to the `@v3`/`@v6` tags
they replaced.
- **`test (3.10–3.14)`, `lint`, `typecheck`, `codecov/patch`** —
`codecov/patch` is the first end-to-end exercise of
`codecov/codecov-action@57e3a136…` in this PR's CI run. If codecov
upload behaves differently than before, that's the signal.
- **`docker-publish`** — NOT triggered by this PR (trigger is `push:
tags`). The pinned SHAs in `docker-publish.yml` will get their real
exercise on the next release tag push.
- **CodeQL (actions)** — re-scans the three workflow files. SHA-pins
shouldn't introduce new taint-flow sites; clean scan expected.
- **`license/cla`, `QA Gate`** — unchanged.

### Manual tests

1. - [x] **Diff matches the scope claim.** `git diff --stat
origin/main..security/sha-pin-third-party-actions` shows exactly the
four files above at `+25, -7`. Nothing else.

2. - [x] **Every new `uses:` line is a full-SHA pin (40 hex chars).**
Spot-check:
     ```
grep -E "uses: .*@[0-9a-f]{40}"
.github/workflows/{ci,docker-smoke,docker-publish}.yml
     ```
Expected: 7 matches, all with the trailing `# <action> pinned to full
commit SHA — <version>` comment one line above.

3. - [x] **Each SHA matches a specific-version tag on the upstream
action.** Optional spot-check for any one action:
     ```
     gh api repos/docker/build-push-action/tags?per_page=100 \
--jq '[.[] | select(.commit.sha ==
"bcafcacb16a39f128d818304e6c9c0c18556b85f")] | map(.name)'
     ```
Expected: `["v7.1.0", "v7"]` — the SHA carries both the specific-version
tag and the floating major-version tag.

4. - [x] **First-party `actions/*` untouched.** `grep -E "uses:
actions/" .github/workflows/ci.yml` still shows `@v6` for `checkout` and
`setup-python` — no scope creep into first-party.

5. - [x] **`docker-smoke` CI job is green on this PR.** Verify in the
Actions tab. This is the load-bearing integration test for the buildx +
build-push pins.

### Acceptance

- ✅ CI green (all automated checks)
- ✅ `docker-smoke` green — **load-bearing**
- ✅ 7 `uses:` lines SHA-pinned across 6 distinct action@tag pairs
- ✅ Each pin carries a human-readable version comment
- ✅ No behavior change — SHAs resolve to same commits as the tags they
replaced
- ✅ First-party `actions/*` unchanged
- ✅ Dependabot continues to own update proposals (github-actions
ecosystem)
- ✅ Single-concern; no source / test / compose changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: cmeans-claude-dev[bot] <3223881+cmeans-claude-dev[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: cmeans-claude-dev[bot] <272174644+cmeans-claude-dev[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant