chore: 2026-05-03 Bucket A audit (#1733): 21 mechanical fixes by Aureliolo · Pull Request #1744 · Aureliolo/synthorg

Aureliolo · 2026-05-03T20:43:16Z

Closes #1733.

Bundles every Bucket A item from the 2026-05-03 codebase audit (mechanical / pattern-replace / add-a-call / add-a-constraint) into one mega-PR. Per the issue, this is one squash-merge unit; the commit log inside the branch keeps each topic independently reviewable.

Acceptance items (21/21 RESOLVED)

Sub-section 1 — Audit-chain + security misroutes

Add design specification, license, and project setup #2 SSRF audit-chain misrouting — record() and update_status() now log under SECURITY_* constants; the resolution branch picks SECURITY_SSRF_VIOLATION_ALLOWED / SECURITY_SSRF_VIOLATION_DENIED. Failed resolutions emit a dedicated SECURITY_SSRF_VIOLATION_RESOLUTION_FAILED so SIEM filters cannot misclassify them as actual decisions. The two API_* constants moved off the audit chain were dropped.
Implement agent-to-agent messaging with channels and topics #10 Rate limits — added oauth.initiate (10 req / 60s per user) and settings.import (5 req / 3600s per user) to RATE_LIMIT_POLICIES; both routes now apply per_op_rate_limit_from_policy.
Integration test: multi-agent delegation with task decomposition #30 TSA URLs http→https — DigiCert and Sectigo defaults switched.

Sub-section 2 — Concurrency + lifecycle locks

Implement model routing engine (role-based, cost-based, fallback chains) #6 Race conditions — DelegationCircuitBreaker (sync threading.RLock for the hot path), TrustService (asyncio.Lock across the read-modify-write of _trust_states + _change_history), and the in-memory memory backend (_store_lock separate from _connect_lock).
Implement per-call cost tracking and usage logging #7 Lifecycle locks — installed _lifecycle_lock on Worker and converted ApprovalTimeoutScheduler.start() to async with the canonical pattern. ApprovalTimeoutScheduler.stop() now accepts a timeout argument and sets the unrestartable _stop_failed flag on drain timeout so a subsequent start() raises rather than spawning a duplicate task.
Implement retry logic, rate limiting, and provider error handling #9 tar.extractfile leaks — both sites wrapped in with so the file handle closes deterministically.

Sub-section 3 — Persistence + migrations + N+1

Evaluate and implement message bus (asyncio queues vs external broker) #8 Migration parity — added CHECK (json_valid(...)) constraints on provider_audit_events.payload and the three nullable preset_overrides JSON columns via Atlas migration 20260503181821_json_check_constraints.sql. Postgres schema unchanged; JSONB already enforces validity.
Implement personality injection and system prompt construction #13 N+1 batched — added save_many to ApprovalRepository (SQLite + Postgres). ApprovalStore collects expired items into to_persist then issues one batched write; expire-callback failures log at ERROR.
End-to-end integration test: single agent receives and completes a task #24 Dual-backend conformance — extended the approval conformance with save_many (round-trip / empty / upsert / duplicate-id batch). The audit's "9 missing repos" was wrong: investigation showed all nine are already covered under tests/conformance/persistence/{test_auth_repositories,test_connection_repositories,test_ontology_repositories}.py.

Sub-section 4 — Settings wiring + Clock seam

Implement tool permission checking based on role and access level #16 BackupService + ApprovalTimeoutScheduler — wired into the lifespan startup / shutdown. Investigation confirmed 11 of the 13 audit-flagged "dead" settings are already consumed; only BackupService and ApprovalTimeoutScheduler were unwired and now are.
Implement loop prevention (depth limits, dedup, circuit breakers, ancestry) #17 Clock seam — injected clock: Clock | None = None into 14 time-reading sites: a2a/well_known.py, api/controllers/{health,events}.py, api/services/idempotency_service.py, api/state.py (AppState.clock), communication/bus/{memory,_nats_state,_nats_receive}.py, engine/workflow/ceremony_scheduler.py, integrations/health/checks/database.py, memory/retrieval/hierarchical/workers.py (3 worker classes), meta/validation/ci_validator.py, security/uncertainty.py, tools/sandbox/docker_sandbox.py.

Sub-section 5 — Pydantic + pagination + magic numbers + Prometheus

Implement built-in tools: file system operations #18 extra="forbid" sweep — 489 ConfigDicts hardened across the listed domains; 46 carve-outs preserved on classes that declare @computed_field (Pydantic v2 model_dump() includes computed values, so strict-extra would reject the round trip).
Evaluate and set up Python package manager (uv/poetry/pip-tools) #19 Mandatory limit — limit: int = 100 (was int | None = None) on every persistence list / query / list_by_* method (40 sites across 26 files); the missing-limit check was removed.
Implement parallel agent execution with async coordination #22 Magic numbers — extracted module-level constants in engine/routing/scorer.py, engine/quality/graders/heuristic.py, and templates/model_matcher.py.
Implement hierarchical delegation (task flows down, results flow up) #12 Prometheus tool_name — bounded via _LabelSnapshot.tool_names, refreshed on each PrometheusCollector.refresh(). validate_tool_name rejects unknown values; the registry-fetch path narrows its swallow to AttributeError / TypeError / ValueError and logs at ERROR.

Sub-section 6 — Frontend + docs + consolidation

Implement single-task execution lifecycle (assign, execute, complete) #21 Cobra Long + Example — filled on 14 commands.
Implement core CLI commands (company create/start/stop, agent list/hire/fire) #28 REJECTION_REASON_REQUIRED — extracted to web/src/pages/approvals/errors.ts; 3 inline copies replaced.
R4 Normalisation helpers — strip_trailing_slash, normalize_optional_string, normalize_path in core/normalization.py; 17 call sites migrated.
R8 Web dispatch consolidation — getNodeLabel extracted to web/src/pages/org/node-utils.ts; budget.ts replaced two parallel switch statements with a DIMENSION_RESOLVERS strategy table.
R9 Tag dedup — deduplicate_tags in synthorg.memory.utils; MemoryMetadata, MemoryQuery, and ProcedureLearningRecord now call the shared helper.
R11 Wrapper removal — deleted _group_records_by_agent; callers import group_by_agent directly.

Pre-PR review — agent findings addressed in this PR

21 review agents ran on this branch. Critical / Major / Medium findings all fixed in the final commit:

C1 / scheduler stop_failed — drain timeout now sets _stop_failed; outer _try_stop budget exceeds the inner so the unrestartable guard fires before cancellation.
C2 / circuit-breaker TOCTOU — check() runs the entire OPEN-branch decision under _state_lock; the old split between get_state() and the post-hoc cooldown read is gone.
C3 / SSRF success-event misrouting — failure paths route to a dedicated SECURITY_SSRF_VIOLATION_RESOLUTION_FAILED.
M1 / trust read outside lock — apply_trust_change holds _state_lock from the read through the change_history append.
M2 / silent expire-callback failure — logs at ERROR with structured fields.
M3 / mock-spec gate — lifecycle test mocks now declare spec=.
M4 / Prometheus snapshot fixture — promoted to session scope so cross-file ordering under xdist loadfile cannot leave an empty snapshot in place.
M5 / stale module docstring — updated to mention the new failure event.
Md1–Md7 — narrowed exception types in _fetch_tool_names; split backup-manifest extraction into archive-corrupt / json-parse-failed / schema-validation-failed / io-error categories; lifecycle startup distinguishes RuntimeError; SSE revalidate iteration cap raised with rationale; save_many empty-input post-condition asserted; _list_from_repo_locked docstring expanded; CLAUDE.md and docs/reference/conventions.md updated to reflect the audit-imposed conventions.

False-positive findings (PEP 758 except A, B: flagged as a syntax error in three agents, "missing limit arg" in one, "missing Postgres peer migration" in one) were classified INVALID with reasoning recorded in _audit/pre-pr-review/triage.md.

New regression coverage

tests/unit/security/timeout/test_scheduler_lifecycle_locks.py — concurrent start() consolidates on a single task, stop(timeout=...) sets _stop_failed on drain timeout, start() after _stop_failed raises RuntimeError, stop(timeout=0) rejects.
tests/unit/communication/loop_prevention/test_circuit_breaker.py — TOCTOU regression: the OPEN branch acquires _state_lock exactly once, and a cooldown elapsing mid-check still produces a stable result.
tests/conformance/persistence/test_json_constraints_sqlite.py — SQLite-only IntegrityError checks for the new CHECK json_valid constraints (Postgres arm auto-skips since JSONB validates implicitly).
tests/conformance/persistence/test_approval_repository.py — added save_many duplicate-id-within-batch test (settles to last) and post-condition assertion on the empty-batch no-op.

Verification

uv run ruff check src/ tests/ — clean
uv run ruff format src/ tests/ — clean
uv run mypy src/ tests/ — clean (3,630 source files)
uv run python -m pytest tests/ -m unit -n 8 — 26,460 passed, 16 skipped (platform-only)
npm --prefix web run lint — clean
npm --prefix web run type-check — clean
npm --prefix web run test — 2,994 passed
go -C cli vet ./... — clean
go -C cli test ./... — clean
go -C cli build ./... — clean

Notes for the reviewer

Issue Audit 2026-05-03: batched mechanical fixes (Bucket A) #1733 framed the work as 6 PR-sized buckets but explicitly opted for one squash-merge. Sub-section commits are listed above for navigation.
Audit-correction calls (already noted in the issue body): Implement tool permission checking based on role and access level #16 (11 of 13 settings already wired; only 2 bootstrap hooks were truly missing), End-to-end integration test: single agent receives and completes a task #24 (the 9 listed repos already had dual-backend conformance), and the per-section path corrections.

github-actions · 2026-05-03T20:43:27Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-05-03T20:45:29Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9c626035-a888-4bf9-a3c7-3a9c748b8447

📥 Commits

Reviewing files that changed from the base of the PR and between cc7b04d and 1c4bb1f.

📒 Files selected for processing (2)

src/synthorg/integrations/mcp_catalog/in_memory_installations.py
tests/conformance/persistence/test_mcp_installations_repository.py

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Lighthouse Site
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use from __future__ import annotations. Python 3.14 has PEP 649 native lazy annotations.

Use PEP 758 except syntax: except A, B: (no parens) when not binding to a name; ruff enforces this on 3.14. as exc requires parens (except (A, B) as exc:).

Type hints: all public functions must have type hints; mypy strict mode is enforced.

Docstrings must use Google style and are required on public classes and functions; ruff D rules enforce this.

Code comments explain WHY only, never origin/review/issue context. Forbidden: reviewer citations, in-code issue/PR back-references, cryptic internal-taxonomy shorthand, migration/rebrand framing, round/iteration narrative, self-evident restatements.

Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves. Never mix static config and mutable runtime fields in one model.

Pydantic v2 conventions: ConfigDict(frozen=True, allow_inf_nan=False) everywhere; extra="forbid" on every model that does not need to round-trip through model_dump() (request DTOs always); @computed_field for derived values; NotBlankStr for identifier/name fields.

Args models at every system boundary: every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event must declare a typed Pydantic args model and be validated before dispatch.

Every entry-point that ingests a dict payload from an external source must call parse_typed() from synthorg.api.boundary with a Pydantic model class or TypeAdapter. The boundary label MUST be a hardcoded LiteralString -- never user-controlled.

Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in. Wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError/RecursionError) so one failure doesn't unwind the group.

Time injection (Clock seam): classes that read time or sleep cooperatively must take `clock: Clock | ...

Files:

tests/conformance/persistence/test_mcp_installations_repository.py
src/synthorg/integrations/mcp_catalog/in_memory_installations.py

**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

No default may privilege a single region, currency, or locale. Every user-facing format resolves from: user/company setting -> browser/system -> neutral fallback.

Currency: never hardcode ISO 4217 codes or symbols. Backend: DEFAULT_CURRENCY from synthorg.budget.currency or the runtime budget.currency setting. Frontend: DEFAULT_CURRENCY from @/utils/currencies or useSettingsStore().currency.

Field naming: no _usd suffix on money fields anywhere. The type carries money semantics; the value is in the operator's configured currency.

Timezone: store UTC; render via Intl without passing timeZone (browser tz wins).

Files:

tests/conformance/persistence/test_mcp_installations_repository.py
src/synthorg/integrations/mcp_catalog/in_memory_installations.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.

Every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare the interface it stands for via spec=ConcreteClass (Protocol or class). A pre-commit gate (scripts/check_mock_spec.py) blocks new bare-call sites.

Time-driven tests: import FakeClock from tests._shared.fake_clock and inject it into the class under test. Use await clock.advance_async(seconds) to drive cooperative tasks. FakeClock-first: patch time.monotonic() / asyncio.sleep() globals only for legacy code paths that don't have a Clock seam.

Coverage: 80% minimum (enforced in CI; benchmarks are excluded via --ignore=tests/benchmarks/ in coverage runs).

Async: asyncio_mode = "auto"; no manual @pytest.mark.asyncio needed.

Timeout: 30 seconds per test (global in pyproject.toml; do not add per-file markers).

Parallelism: pytest-xdist via -n 8, distribution --dist=loadfile (default in pyproject.toml addopts). ALWAYS include -n 8 when running pytest locally.

Isolation regression gate: the affected-tests pre-push runner runs the affected subset twice via pytest-repeat (--count 2 -x) after the primary green pass. Opt out via SYNTHORG_SKIP_ISOLATION_GATE=1 for emergency pushes only.

Never use monkeypatch.setattr(module.logger, "info", spy). Use a context manager that wraps direct setattr + try/finally del proxy.<level> instead -- see _logger_info_spy in tests/unit/settings/test_service.py for the canonical pattern.

Parametrize: Prefer @pytest.mark.parametrize for testing similar cases.

Property-based testing: Python uses Hypothesis; React uses fast-check; Go uses testing.F fuzz functions. CI runs 10 deterministic examples per property test. When Hypothesis finds a failure, fix the underlying bug and add an explicit @example(...) decorator so the case is permanently covered.

Flaky tests: NEVER skip, dismiss, or ignore flaky tests; always fix th...

Files:

tests/conformance/persistence/test_mcp_installations_repository.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/conformance/persistence/test_mcp_installations_repository.py

**/*.{py,ts,tsx,md,yml,yaml}

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, large/medium/small as aliases.

Files:

tests/conformance/persistence/test_mcp_installations_repository.py
src/synthorg/integrations/mcp_catalog/in_memory_installations.py

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every business-logic module has from synthorg.observability import get_logger then logger = get_logger(__name__). Variable name is always logger. Carve-outs are documented in the module docstring.

Never use import logging / logging.getLogger() / print() in application code. Exception: observability/{setup,sinks,syslog_handler,http_handler,otlp_handler}.py for handler-construction/bootstrap code.

Event names: always import constants from synthorg.observability.events.<domain>; never use string literals.

Structured kwargs: always logger.info(EVENT, key=value); never logger.info("msg %s", val).

All error paths log at WARNING or ERROR with context before raising.

State transitions: every hop on a status enum (including non-terminal hops like PENDING -> RUNNING) logs at INFO using a domain-scoped *_STATUS_TRANSITIONED constant carrying from_status/to_status/domain id, AFTER the persistence write succeeds.

DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, re-exports do NOT need logging.

Secret-log redaction (SEC-1): never call any logger severity with error=str(exc); use structured logging with error_type=type(exc).__name__ and error=safe_error_description(exc). All five log methods are enforced unconditionally.

Monetary models: every cost-bearing Pydantic model carries currency: CurrencyCode; aggregation sites enforce a same-currency invariant (mixing raises MixedCurrencyAggregationError, HTTP 409).

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService), never directly into repositories. Services centralize audit logging and cross-repo orchestration; repositories must not log mutations themselves.

For every mutable setting: DB > env (SYNTHORG_<NAMESPACE>_<KEY>) > YAML > code default, resolved through SettingsService / ConfigResolver. First-cold-read emits one INFO settings.value.resolved car...

Files:

src/synthorg/integrations/mcp_catalog/in_memory_installations.py

src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/integrations/mcp_catalog/in_memory_installations.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. The design spec is the starting point for architecture, data models, and behavior. If implementation deviates from the spec, alert the user and explain why; every deviation needs explicit user approval.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts. At every phase of planning and implementation, be critical and actively look for ways to improve the design; surface improvements as suggestions, not silent changes.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: Pluggable subsystems: follow protocol + strategy + factory + config discriminator pattern with safe defaults. Services (which wrap repositories) are a distinct pattern.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: Commits: `<type>: <description>`. Types: feat, fix, refactor, docs, test, chore, perf, ci. Enforced by commitizen (commit-msg hook).

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: Branches: `<type>/<slug>` from main.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: After finishing an issue implementation: always create a feature branch (`<type>/<slug>`), commit, and push; do NOT create a PR automatically. Do NOT leave work uncommitted on main; branch, commit, push immediately after finishing.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: NEVER create a PR directly: `gh pr create` is blocked by hookify. ALWAYS use `/pre-pr-review` to create PRs; it runs automated checks + review agents + fixes before creating the PR. For trivial/docs-only changes: `/pre-pr-review quick` skips agents but still runs automated checks.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:28:52.071Z
Learning: Fix everything valid, never skip: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes), fix them all. No deferring, no "out of scope" skipping.

🔇 Additional comments (3)

src/synthorg/integrations/mcp_catalog/in_memory_installations.py (1)

62-95: list_items now correctly enforces the shared pagination contract and stable ordering.

This aligns in-memory behavior with durable backends (limit=100, offset=0, shared validation, deterministic (installed_at, catalog_entry_id) ordering), which is exactly what conformance needs.

tests/conformance/persistence/test_mcp_installations_repository.py (2)

95-124: Good conformance update for list_items pagination semantics.

The test now exercises the new API and verifies deterministic page windows correctly.

126-150: Excellent contract lock for invalid pagination inputs across backends.

The parametrized QueryError assertions (including bool cases) are a strong guard against silent coercion regressions.

Walkthrough

The PR applies broad mechanical changes across the codebase: it tightens many Pydantic models to use ConfigDict(..., extra="forbid"); standardizes list/query pagination to limit: int = 100 in protocols and repositories; injects Clock/SystemClock and replaces direct time/time.monotonic() uses; adds clock-aware components and deterministic-clock test utilities; introduces batch approval APIs (save_many, expire_if_pending) and DB/SQLite implementations; adds locks and concurrency hardening (circuit breaker, trust service, worker, in-memory store, timeout scheduler); adds normalization helpers and migrates trailing-slash handling; updates SSRF audit event routing and per-op rate limits; expands CLI help; adds SQLite JSON CHECK constraints and migration; adds Prometheus tool-name bounding/validation; and includes numerous test updates and new tests.

codspeed-hq · 2026-05-03T20:46:05Z

Merging this PR will not alter performance

✅ 54 untouched benchmarks

_{Comparing chore/audit-bucket-a (1c4bb1f) with main (de0a1e1)¹}

No successful run was found on main (70bffe9) during the generation of this report, so de0a1e1 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report. ↩

gemini-code-assist

Code Review

This pull request implements a broad set of improvements including Pydantic model hardening, enhanced CLI documentation, and improved testability through a centralized clock seam. It also addresses concurrency issues, introduces batched persistence for approvals, and adds database-level JSON validation for SQLite. Critical feedback was provided regarding the use of obsolete Python 2 exception handling syntax in several files, which would cause syntax errors in Python 3, and an omission of the strict configuration flag on one of the updated models.

gemini-code-assist · 2026-05-03T20:51:01Z

+        except MemoryError, RecursionError:
+            raise


The except MemoryError, RecursionError: syntax is from Python 2 and is a SyntaxError in Python 3. To catch multiple exceptions, you should use a tuple: except (MemoryError, RecursionError):.

except (MemoryError, RecursionError): raise

gemini-code-assist · 2026-05-03T20:51:01Z

        except MemoryError, RecursionError:
            raise


The except MemoryError, RecursionError: syntax is from Python 2 and is a SyntaxError in Python 3. To catch multiple exceptions, you should use a tuple: except (MemoryError, RecursionError):.

Suggested change

except MemoryError, RecursionError:

raise

except (MemoryError, RecursionError):

raise

gemini-code-assist · 2026-05-03T20:51:01Z

        except MemoryError, RecursionError:
            raise


The except MemoryError, RecursionError: syntax is from Python 2 and is a SyntaxError in Python 3. To catch multiple exceptions, you should use a tuple: except (MemoryError, RecursionError):.

Suggested change

except MemoryError, RecursionError:

raise

except (MemoryError, RecursionError):

raise

gemini-code-assist · 2026-05-03T20:51:01Z

+        return frozenset(registry.list_tools())
+    except MemoryError, RecursionError:
+        raise
+    except AttributeError, TypeError, ValueError:


The except AttributeError, TypeError, ValueError: syntax is from Python 2 and is a SyntaxError in Python 3. To catch multiple exceptions, you should use a tuple: except (AttributeError, TypeError, ValueError):.

except (AttributeError, TypeError, ValueError):

gemini-code-assist · 2026-05-03T20:51:01Z

    """

-    model_config = ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid")
+    model_config = ConfigDict(frozen=True, allow_inf_nan=False)


This ConfigDict is missing extra="forbid", which is being added to most other models in this pull request as part of the Pydantic hardening effort. If this model is not intended to be an exception (e.g., for use with @computed_field), extra="forbid" should be added for consistency and to prevent unexpected fields.

Suggested change

model_config = ConfigDict(frozen=True, allow_inf_nan=False)

model_config = ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid")

codecov · 2026-05-03T20:53:31Z

Codecov Report

❌ Patch coverage is 87.32694% with 119 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.69%. Comparing base (6ddbb86) to head (1c4bb1f).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
...g/communication/loop_prevention/circuit_breaker.py	73.68%	16 Missing and 4 partials ⚠️
src/synthorg/backup/retention.py	37.50%	14 Missing and 1 partial ⚠️
src/synthorg/backup/service_archive.py	12.50%	14 Missing ⚠️
src/synthorg/persistence/sqlite/approval_repo.py	50.00%	12 Missing and 1 partial ⚠️
src/synthorg/workers/worker.py	0.00%	13 Missing ⚠️
src/synthorg/persistence/postgres/approval_repo.py	50.00%	11 Missing and 1 partial ⚠️
src/synthorg/api/approval_store.py	35.29%	11 Missing ⚠️
src/synthorg/observability/prometheus_collector.py	41.66%	6 Missing and 1 partial ⚠️
src/synthorg/a2a/well_known.py	66.66%	3 Missing ⚠️
src/synthorg/api/lifecycle.py	40.00%	3 Missing ⚠️
... and 6 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1744      +/-   ##
==========================================
- Coverage   84.72%   84.69%   -0.04%     
==========================================
  Files        1789     1790       +1     
  Lines      102395   102642     +247     
  Branches     8991     9008      +17     
==========================================
+ Hits        86758    86934     +176     
- Misses      13451    13518      +67     
- Partials     2186     2190       +4

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 22

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (19)

src/synthorg/persistence/user_protocol.py (1)

189-200: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update list_by_user docstring to match the new bounded default.

The signature now enforces limit: int = 100, but the docstring still documents limit=None fetch-all behavior.

Suggested docstring correction

     async def list_by_user(
         self,
         user_id: NotBlankStr,
         *,
         limit: int = 100,
         offset: int = 0,
     ) -> tuple[ApiKey, ...]:
-        """List API keys belonging to a user, optionally paginated.
+        """List API keys belonging to a user with bounded pagination.
@@
-            limit: Maximum keys to return; ``None`` (default) preserves
-                fetch-all semantics.
+            limit: Maximum keys to return. Defaults to ``100``.
             offset: Keys to skip; applied independently of *limit*.
-                ``offset > 0`` with ``limit=None`` yields a tail
-                window starting after *offset* rows.
+                ``offset > 0`` returns a window starting after *offset*
+                rows.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/user_protocol.py` around lines 189 - 200, The
docstring for list_by_user no longer matches the function signature: update the
docstring for the function list_by_user (returning tuple[ApiKey, ...]) to remove
references to limit=None and the "fetch-all semantics" and instead document that
limit defaults to 100 (a bounded maximum number of keys returned), explain that
offset skips rows and that providing a different integer limit overrides the
default, and ensure the Args section and examples (if any) reflect the new
bounded-default behavior.

src/synthorg/persistence/cost_record_protocol.py (1)

37-40: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Docstring still describes obsolete None-limit behavior.

Line [29] makes limit a non-nullable int defaulting to 100, but Lines [37]-[40] still document fetch-all semantics via None.

Suggested patch

-            limit: Maximum rows to return; ``None`` (default) preserves
-                fetch-all semantics.
-            offset: Rows to skip before applying *limit*; ignored when
-                *limit* is ``None``.
+            limit: Maximum rows to return (default: 100).
+            offset: Rows to skip before applying *limit*.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/cost_record_protocol.py` around lines 37 - 40,
Update the docstring to remove the obsolete "None" fetch-all semantics and
accurately reflect that parameter limit is a non-nullable int defaulting to 100
and not optional; describe that offset is applied only when limit is provided
(i.e., when limit > 0) and adjust wording to state the default value (100)
rather than "None preserves fetch-all". Locate and change the wording around the
limit/offset parameter descriptions (references: the limit and offset parameters
in this module's function/method signature in cost_record_protocol.py) so the
docs match the implementation.

src/synthorg/persistence/task_protocol.py (1)

54-56: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

list_tasks docstring is out of sync with the new limit contract.

Line [45] now sets limit: int = 100, but Lines [54]-[56] still describe None as a supported mode.

Suggested patch

-            limit: Maximum rows to return.  ``None`` means "no
-                repository-level cap" (the caller remains free to
-                impose a safety cap above).
+            limit: Maximum rows to return (default: 100).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/task_protocol.py` around lines 54 - 56, The
docstring for list_tasks is outdated: the parameter `limit` is no longer allowed
to be None and now defaults to an int (100); update the docstring on
`list_tasks` to describe `limit: int = 100` as the default maximum rows to
return and remove or replace the text that says "``None`` means 'no
repository-level cap'"; instead document that callers may pass an integer to
control paging and that higher-level callers can still impose additional caps,
and ensure the parameter description matches the function signature and default
behavior.

src/synthorg/memory/backends/inmemory/adapter.py (1)

178-178: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Re-validate connection state after waiting for _store_lock.

_require_connected() at Line 178 runs before the await at Line 191. If this call waits on the lock, disconnect() can run meanwhile, and the code still writes at Line 207 while disconnected.

💡 Proposed fix

     async def store(
         self,
         agent_id: NotBlankStr,
         request: MemoryStoreRequest,
     ) -> NotBlankStr:
@@
-        self._require_connected()
+        self._require_connected()
         memory_id = NotBlankStr(str(uuid.uuid4()))
@@
         entry = MemoryEntry(
@@
         )
         async with self._store_lock:
+            # Re-check after potential lock wait to avoid writes post-disconnect.
+            self._require_connected()
             agent_store = self._store.setdefault(str(agent_id), {})
             # Prune expired entries before checking quota.
             _prune_expired(agent_store)

Also applies to: 191-207

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/memory/backends/inmemory/adapter.py` at line 178, The code calls
_require_connected() before awaiting _store_lock which can allow disconnect() to
run while waiting; after acquiring the _store_lock (i.e., immediately after the
await that obtains the lock in the same method), re-check connection state by
calling _require_connected() (or checking the same connected flag used by
disconnect()) again and abort/raise if disconnected so subsequent writes (the
code that uses the protected store later in this method) cannot proceed when
disconnected; apply the same post-lock revalidation pattern to the other block
covering the await and following write operations.

src/synthorg/persistence/sqlite/user_repo.py (2)

685-689: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update stale docstring for limit semantics.

The docstring still says None preserves fetch-all behavior, but the signature is now limit: int = 100 and always applies SQL LIMIT.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/user_repo.py` around lines 685 - 689, The
docstring for the parameter `limit` in
src/synthorg/persistence/sqlite/user_repo.py is stale: update the `limit`
description to reflect the current signature `limit: int = 100` and state that
SQL LIMIT is always applied (default 100) rather than `None` meaning fetch-all;
also adjust the `offset` description to remove the clause about being ignored
when `limit` is `None` and instead explain its interaction with the applied
LIMIT. Locate the parameter docs for `limit` and `offset` in the method
docstring (search for "limit:" and "offset:") and revise the wording to mention
the default value 100 and that LIMIT is always enforced.
678-700: ⚠️ Potential issue | 🟠 Major

list_by_user() needs limit validation, correct error handling, and docstring updates.

Line 700 binds int(limit) directly to SQLite. Negative values trigger LIMIT -1 behavior, which returns all rows and bypasses the new pagination invariant.

Three fixes required:
Validate limit before SQL binding: Reject limit <= 0 with a check and raise QueryError (not ValueError), matching the persistence error hierarchy.
Add logging before raising: All error paths must log at WARNING with context before raising. Use a pattern like:
if limit <= 0:
    msg = f"limit must be positive, got {limit}"
    logger.warning(PERSISTENCE_API_KEY_LIST_FAILED, user_id=user_id, error=msg)
    raise QueryError(msg)
Fix stale docstring (lines 685–689): Still claims "None (default) preserves fetch-all semantics" but signature is now limit: int = 100. Update to reflect bounded-by-default contract.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/user_repo.py` around lines 678 - 700, In
list_by_user, validate the limit before building the SQL: if int(limit) <= 0,
log a warning with PERSISTENCE_API_KEY_LIST_FAILED and user_id/error context
(use the existing logger) and raise QueryError (not ValueError); then proceed to
bind the validated limit and offset to the SQL. Also update the docstring for
list_by_user to remove the stale "None (default) preserves fetch-all semantics"
wording and state that the default is 100 and must be a positive integer.

src/synthorg/persistence/sqlite/mcp_installation_repo.py (1)

107-120: ⚠️ Potential issue | 🟠 Major

Validate limit parameter before passing to SQLite LIMIT clause.

Line 119 forwards int(limit) without validation. In SQLite, a negative LIMIT value removes the upper bound on returned rows, defeating pagination. Other repositories in this codebase (ssrf_violation_repo, escalation_repo) enforce limit > 0 before binding. Enforce the same here, and update the docstring to acknowledge the limit parameter.

Proposed fix

     async def list_all(
         self,
         *,
         limit: int = 100,
         offset: int = 0,
     ) -> tuple[McpInstallation, ...]:
-        """List all recorded installations, oldest-first."""
+        """List recorded installations (default limit 100), oldest-first."""
+        if limit <= 0:
+            msg = "limit must be positive"
+            raise ValueError(msg)
         sql = (
             "SELECT catalog_entry_id, connection_name, installed_at "
             "FROM mcp_installations "
             "ORDER BY installed_at ASC, catalog_entry_id ASC"
         )
         params: tuple[object, ...] = ()
         effective_offset = max(0, int(offset))
         sql += " LIMIT ? OFFSET ?"
         params = (int(limit), effective_offset)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py` around lines 107 -
120, The list method that returns tuple[McpInstallation, ...] should validate
the limit before binding to the SQLite LIMIT clause: ensure limit is converted
to int and constrained to >0 (e.g., raise or coerce to a positive minimum) or
only append " LIMIT ? OFFSET ?" when int(limit) > 0, and bind params accordingly
(use effective_offset for OFFSET as already computed); update the method
docstring to state that limit must be a positive integer (or that non-positive
means "no upper bound" is not allowed) and mirror the validation behavior used
in ssrf_violation_repo/escalation_repo to prevent passing a negative LIMIT to
SQLite.

src/synthorg/observability/otlp_trace_handler.py (1)

195-199: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Normalize first, then suffix-check to avoid duplicate /v1/traces.

If base_endpoint is configured as .../v1/traces/, this currently resolves to .../v1/traces/v1/traces.

Suggested fix

 def _resolve_traces_endpoint(base_endpoint: str) -> str:
     """Append ``/v1/traces`` to *base_endpoint* if not already present."""
-    if base_endpoint.endswith(_TRACES_ENDPOINT_SUFFIX):
-        return base_endpoint
-    return strip_trailing_slash(base_endpoint) + _TRACES_ENDPOINT_SUFFIX
+    normalized = strip_trailing_slash(base_endpoint)
+    if normalized.endswith(_TRACES_ENDPOINT_SUFFIX):
+        return normalized
+    return normalized + _TRACES_ENDPOINT_SUFFIX

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/otlp_trace_handler.py` around lines 195 - 199,
Normalize base_endpoint by stripping the trailing slash before
checking/appending the traces suffix to avoid duplicating "/v1/traces": call
strip_trailing_slash(base_endpoint) into a local variable, check if that
normalized value endswith _TRACES_ENDPOINT_SUFFIX and return it if so, otherwise
return the normalized value + _TRACES_ENDPOINT_SUFFIX; update the implementation
of _resolve_traces_endpoint to use this normalized variable instead of testing
base_endpoint directly.

src/synthorg/api/controllers/setup_agents.py (1)

494-503: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard untyped model/tier values before string access.

agent is persisted JSON; if model is None/non-dict or tier is non-string, this path can raise while building summaries.

Suggested fix

-    model = agent.get("model", {})
+    model_raw = agent.get("model")
+    model = model_raw if isinstance(model_raw, dict) else {}
+    tier = normalize_optional_string(agent.get("tier")) or "medium"
     return SetupAgentSummary(
         name=name,
         role=role,
         department=department,
         level=normalize_optional_string(agent.get("level")),  # type: ignore[arg-type]
         model_provider=normalize_optional_string(model.get("provider")),
         model_id=normalize_optional_string(model.get("model_id")),
-        tier=(agent.get("tier") or "").strip() or "medium",  # type: ignore[arg-type]
+        tier=tier,
         personality_preset=normalize_optional_string(agent.get("personality_preset")),
     )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/api/controllers/setup_agents.py` around lines 494 - 503, The
code assumes agent["model"] is a dict and agent["tier"] is a string; guard both
before calling .get or .strip to avoid exceptions when persisted JSON contains
null/non-dict or non-string values: in the block building the SetupAgentSummary
(see variable model and call to normalize_optional_string(model.get("provider"))
and normalize_optional_string(model.get("model_id"))), coerce or replace model
with an empty dict when it is None or not a mapping, and for tier ensure you
convert agent.get("tier") to a string (or fallback to "") before calling
.strip() and default to "medium"; keep using normalize_optional_string for
personality_preset and level as before.

src/synthorg/security/trust/service.py (1)

299-306: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Race condition: check_decay modifies _trust_states outside the lock.

The lock is added to guard _trust_states and _change_history against concurrent apply_trust_change / evaluate_agent calls for the same agent. However, check_decay performs a read-modify-write on _trust_states (lines 300-306) without acquiring _state_lock. A concurrent apply_trust_change can interleave between the read at line 300 and the assignment at line 306, causing one update to be lost.

🔒 Proposed fix to guard the decay-check write

         result = await self.evaluate_agent(agent_id, snapshot)
 
         # Update decay check timestamp *after* evaluation
         key = str(agent_id)
-        state = self._trust_states.get(key)
-        if state is not None:
-            now = datetime.now(UTC)
-            updated = state.model_copy(
-                update={"last_decay_check_at": now},
-            )
-            self._trust_states[key] = updated
+        now = datetime.now(UTC)
+        async with self._state_lock:
+            state = self._trust_states.get(key)
+            if state is not None:
+                updated = state.model_copy(
+                    update={"last_decay_check_at": now},
+                )
+                self._trust_states[key] = updated
 
         return result

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/security/trust/service.py` around lines 299 - 306, check_decay
currently reads and then writes self._trust_states without holding
self._state_lock, allowing races with apply_trust_change/evaluate_agent; wrap
the read-modify-write of the state in check_decay with the same lock used
elsewhere (self._state_lock) so the sequence key = str(agent_id); state =
self._trust_states.get(key); updated = state.model_copy(...);
self._trust_states[key] = updated is performed while holding the lock, ensuring
last_decay_check_at is updated atomically and avoiding lost updates.

src/synthorg/persistence/connection_protocol.py (1)

36-42: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Docstring contract is stale after the limit=100 signature change

list_all() no longer defaults to limit=None, but the docstring still documents that behavior. Please update this text to match the new bounded default contract.

Suggested docstring fix

-        Sorted by ``name`` ascending; pass ``limit=None`` (default) to
-        retain the legacy fetch-all semantics, or set both to consume
-        a paginated window. ``limit <= 0`` returns ``()``; negative
+        Sorted by ``name`` ascending. Set ``limit``/``offset`` to
+        consume a paginated window. ``limit <= 0`` returns ``()``; negative
         ``offset`` is treated as ``0``.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/connection_protocol.py` around lines 36 - 42, The
docstring for list_all() is stale: update it to reflect the new default bounded
behavior (default limit=100 rather than limit=None), state that passing
limit=None retains unbounded legacy semantics if still supported, keep the notes
that limit <= 0 returns () and negative offset is treated as 0, and confirm
sorting by name ascending; edit the docstring in the list_all() definition to
clearly document the new default and the semantics of None, limit, and offset.

src/synthorg/a2a/well_known.py (1)

136-137: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Pass app_state.clock into cache helpers to complete the clock seam.

The helpers accept clock, but controller call sites still use the module default. This bypasses the app-level injected clock and makes cache timing behavior drift from the configured runtime clock.

♻️ Suggested patch

-        cached = await _get_cached_card(company_cache_key, ttl)
+        cached = await _get_cached_card(
+            company_cache_key,
+            ttl,
+            clock=app_state.clock,
+        )
@@
             await _put_cached_card(
                 cache_key,
                 card_data,
                 ttl,
                 fingerprint=id_fp,
+                clock=app_state.clock,
             )
@@
-        cached = await _get_cached_card(agent_cache_key, ttl)
+        cached = await _get_cached_card(
+            agent_cache_key,
+            ttl,
+            clock=app_state.clock,
+        )
@@
             await _put_cached_card(
                 agent_cache_key,
                 card_data,
                 ttl,
                 fingerprint=agent_fp,
+                clock=app_state.clock,
             )

As per coding guidelines, "Time injection (Clock seam): classes that read time or sleep cooperatively take clock: Clock | None = None defaulting to SystemClock(); tests inject FakeClock."

Also applies to: 174-179, 229-230, 284-289

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/a2a/well_known.py` around lines 136 - 137, Controller call sites
are calling cache helpers without the injected clock, bypassing the app-level
clock seam; update each call to pass app_state.clock into the helper so timing
honors the injected Clock. Concretely, change calls like
_get_cached_card(company_cache_key, ttl) to include the clock argument (e.g.
_get_cached_card(company_cache_key, ttl, clock=app_state.clock)) and do the same
for the other helpers used in this module (e.g. _set_cached_card,
_get_cached_mappings, _set_cached_mappings) using the keyword parameter name the
helper expects so tests can inject FakeClock.

src/synthorg/persistence/sqlite/session_repo.py (1)

145-160: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard non-positive limit values before executing the list queries.

Line 145 and Line 167 switched to an always-on LIMIT, but these paths no longer short-circuit invalid limits. Add an early limit <= 0 return to keep pagination behavior consistently bounded and predictable.

Suggested patch

@@
     async def list_by_user(
         self,
         user_id: str,
         *,
         limit: int = 100,
         offset: int = 0,
     ) -> tuple[Session, ...]:
         """List active (non-expired, non-revoked) sessions for a user."""
+        if limit <= 0:
+            return ()
         now = format_iso_utc(datetime.now(UTC))
@@
     async def list_all(
         self,
         *,
         limit: int = 100,
         offset: int = 0,
     ) -> tuple[Session, ...]:
         """List all active (non-expired, non-revoked) sessions."""
+        if limit <= 0:
+            return ()
         now = format_iso_utc(datetime.now(UTC))

Also applies to: 167-181

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/session_repo.py` around lines 145 - 160, The
method that lists active sessions (takes parameters user_id, limit, offset and
returns tuple[Session, ...]) must guard against non-positive limits: before
building or executing the SQL (i.e. before constructing sql, params,
effective_offset and calling self._db.execute), check if int(limit) <= 0 and
immediately return an empty tuple; apply the same guard to the corresponding
listing method for the other path referenced around lines 167–181 so both code
paths short-circuit on non-positive limits to keep pagination predictable.

src/synthorg/persistence/postgres/session_repo.py (1)

127-144: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a non-positive limit guard for parity with other paginated repos.

Line 127 and Line 155 now enforce paginated SQL by default, but invalid non-positive limits still flow into query construction. Short-circuiting limit <= 0 keeps behavior deterministic and backend-consistent.

Suggested patch

@@
     async def list_by_user(
         self,
         user_id: str,
         *,
         limit: int = 100,
         offset: int = 0,
     ) -> tuple[Session, ...]:
         """List active (non-expired, non-revoked) sessions for a user."""
+        if limit <= 0:
+            return ()
         dict_row = self._dict_row
@@
     async def list_all(
         self,
         *,
         limit: int = 100,
         offset: int = 0,
     ) -> tuple[Session, ...]:
         """List all active (non-expired, non-revoked) sessions."""
+        if limit <= 0:
+            return ()
         dict_row = self._dict_row

Also applies to: 155-171

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/postgres/session_repo.py` around lines 127 - 144,
Add a guard that returns an empty tuple when limit is non-positive to match
other paginated repos: in the method that lists active (non-expired,
non-revoked) sessions (the function taking parameters limit and offset and using
variables now, sql, params, effective_offset), check if int(limit) <= 0 and
immediately return tuple() before constructing the SQL/params; do the same in
the similar paginated block around lines 155–171 (the other query that appends
"LIMIT %s OFFSET %s") to ensure deterministic behavior when limit is
non-positive.

src/synthorg/persistence/postgres/repositories.py (1)

224-253: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove None bypass paths so pagination caps are always enforced.

Both methods moved to limit: int = 100, but legacy if limit is not None branching still allows a caller to pass None and skip LIMIT, which reintroduces unbounded reads.

Suggested fix

@@
-        if limit is not None:
-            query += " LIMIT %s"
-            params.append(int(limit))
-        if offset:
-            query += " OFFSET %s"
-            params.append(int(offset))
+        if not isinstance(limit, int) or isinstance(limit, bool) or limit < 1:
+            msg = f"limit must be a positive integer, got {limit!r}"
+            raise QueryError(msg)
+        effective_offset = max(0, int(offset))
+        query += " LIMIT %s OFFSET %s"
+        params.extend([limit, effective_offset])
@@
-        if limit is not None and (
-            not isinstance(limit, int) or isinstance(limit, bool) or limit < 1
-        ):
+        if not isinstance(limit, int) or isinstance(limit, bool) or limit < 1:
             msg = f"limit must be a positive integer, got {limit!r}"
             logger.warning(
                 PERSISTENCE_MESSAGE_HISTORY_FAILED,
                 channel=channel,
                 error=msg,
             )
             raise QueryError(msg)
@@
-        if limit is not None:
-            sql += " LIMIT %s"
-            params.append(limit)
+        sql += " LIMIT %s"
+        params.append(limit)

Also applies to: 620-643

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/postgres/repositories.py` around lines 224 - 253,
The pagination code currently permits callers to bypass LIMIT by checking "if
limit is not None", so a caller passing None can trigger unbounded reads; change
the logic in the tasks listing method (the block building query using
self._TASK_COLUMNS, clauses, params, and variables limit/offset) to always
append " LIMIT %s" and add int(limit) to params (remove the if limit is not None
guard), and always append " OFFSET %s" when offset is provided while still
casting offset with int(offset); apply the same change to the other similar
method referenced (the second occurrence around lines 620-643) so pagination
caps are always enforced.

src/synthorg/persistence/postgres/ontology_entity_repo.py (1)

226-231: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Drop legacy None fallback to keep the new limit=100 contract consistent.

After moving to limit: int = 100, keeping limit is None branches (None -> 1000) leaves a hidden widened-page path and inconsistent behavior.

Suggested fix

-        effective_limit = 1000 if limit is None else int(limit)
+        if not isinstance(limit, int) or isinstance(limit, bool) or limit < 1:
+            msg = f"limit must be a positive integer, got {limit!r}"
+            raise OntologyError(msg)
+        effective_limit = int(limit)
@@
-        effective_limit = 1000 if limit is None else int(limit)
+        if not isinstance(limit, int) or isinstance(limit, bool) or limit < 1:
+            msg = f"limit must be a positive integer, got {limit!r}"
+            raise OntologyError(msg)
+        effective_limit = int(limit)

Also applies to: 263-270

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/postgres/ontology_entity_repo.py` around lines 226 -
231, The code keeps a legacy None fallback (effective_limit = 1000 if limit is
None else int(limit)) which contradicts the new signature limit: int = 100;
change the logic in the method (the list function around the shown signature,
e.g., list_entities) to simply compute effective_limit = int(limit) (and remove
any checks for limit is None), and repeat the same removal of the None-fallback
in the other similar block referenced (lines ~263-270) so both methods honour
the new limit=100 contract consistently.

src/synthorg/persistence/sqlite/ontology_entity_repo.py (1)

225-226: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clamp limit to a positive bounded value before applying SQL LIMIT.

Line 225 and Line 259 currently accept negative limits (int(limit)), which can defeat the bounded-pagination guarantee.

💡 Proposed fix

-        effective_limit = 1000 if limit is None else int(limit)
+        raw_limit = 1000 if limit is None else int(limit)
+        effective_limit = min(1000, max(1, raw_limit))
         effective_offset = max(0, int(offset))
...
-        effective_limit = 1000 if limit is None else int(limit)
+        raw_limit = 1000 if limit is None else int(limit)
+        effective_limit = min(1000, max(1, raw_limit))
         effective_offset = max(0, int(offset))

Also applies to: 259-260

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ontology_entity_repo.py` around lines 225 -
226, Clamp the computed effective_limit to a positive bounded range instead of
accepting negative values: replace the current assignments that set
effective_limit = 1000 if limit is None else int(limit) with logic that converts
limit to int and then applies min(max(1, int(limit)), 1000) (so limit is forced
into [1,1000]); keep effective_offset = max(0, int(offset)) as-is. Apply this
same change at both occurrences where effective_limit is computed (the two spots
using the effective_limit variable around the lines referenced) so SQL LIMIT
never receives a negative value.

src/synthorg/persistence/sqlite/repositories.py (1)

608-623: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Dead code: limit is not None checks are now unreachable.

With limit: int = 100, the parameter can never be None at runtime (callers passing None would need # type: ignore). The checks on lines 611 and 621 are now dead code paths.

♻️ Suggested cleanup

     async def get_history(
         self,
         channel: str,
         *,
         limit: int = 100,
     ) -> tuple[Message, ...]:
         """Retrieve message history for a channel, newest first."""
-        if limit is not None and limit < 1:
+        if limit < 1:
             msg = f"limit must be a positive integer, got {limit}"
             raise QueryError(msg)
         sql = """\
 SELECT id, timestamp, sender, "to", type, priority,
        channel, content, attachments, metadata
 FROM messages
 WHERE channel = ?
 ORDER BY timestamp DESC"""
         params: list[object] = [channel]
-        if limit is not None:
-            sql += " LIMIT ?"
-            params.append(limit)
+        sql += " LIMIT ?"
+        params.append(limit)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/repositories.py` around lines 608 - 623, The
checks for "limit is not None" are dead because the signature already declares
limit: int = 100; remove those redundant None guards: drop the "if limit is not
None and limit < 1" branch in favor of a single validation that raises
QueryError when limit < 1, and always append the " LIMIT ?" to the SQL and
params (using the limit variable) instead of guarding with "if limit is not
None"; update the function that builds the query (the message-history retrieval
routine that constructs sql, params and may raise QueryError) accordingly so
only the numeric validation remains and the LIMIT is unconditionally applied.

src/synthorg/persistence/sqlite/connection_repo.py (1)

231-241: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Vestigial None checks after type change.

Since limit is now int = 100 (non-nullable), the if limit is not None checks on lines 235, 239, 272, and 279 are always true and can be simplified. The code is correct but carries dead branches.

♻️ Optional cleanup

 async def list_all(
     self,
     *,
     limit: int = 100,
     offset: int = 0,
 ) -> tuple[Connection, ...]:
     """List all connections, sorted by name for determinism."""
-    if limit is not None and limit <= 0:
+    if limit <= 0:
         return ()
     sql = f"SELECT {_SELECT_COLS} FROM connections ORDER BY name ASC"  # noqa: S608
-    params: tuple[object, ...] = ()
-    if limit is not None:
-        sql += " LIMIT ? OFFSET ?"
-        params = (int(limit), max(0, int(offset)))
+    sql += " LIMIT ? OFFSET ?"
+    params = (int(limit), max(0, int(offset)))

Apply the same pattern to list_by_type.

Also applies to: 268-281

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/connection_repo.py` around lines 231 - 241,
Remove the dead None-checks for the non-nullable limit parameter: in the
connection listing function that takes limit: int = 100 and offset: int = 0 (the
block building sql and params with "LIMIT ? OFFSET ?", variables sql and
params), eliminate the surrounding if limit is not None and always append "
LIMIT ? OFFSET ?" and set params = (int(limit), max(0, int(offset))). Apply the
same simplification to the corresponding list_by_type implementation so both
functions no longer perform redundant None checks on limit.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cli/cmd/config.go`:
- Around line 181-184: The help text for "config list" uses "config-file" but
the code path resolveSource() emits "config", so update the help string to match
resolveSource() (or alternatively change resolveSource() to return "config-file"
if you prefer that label). Locate the help text shown in the diff and replace
"config-file" with "config" (or update resolveSource() accordingly), ensuring
all mentions and tests use the same label; verify by running the CLI help and
any tests that assert the source label.
- Around line 61-66: The help text for the "config show" command contradicts
runConfigShow's behavior—update the Long description string in the command
definition so it accurately states that when the config file is absent the
command returns "Not initialized" and does not render built-in defaults (or,
alternatively, change runConfigShow to render defaults); specifically edit the
Long string near the command declaration to reflect the actual behavior observed
in runConfigShow and ensure the wording references the "Not initialized" output
exactly so docs and runtime match.

In `@src/synthorg/api/approval_store.py`:
- Around line 571-585: The _compute_expiration method reads datetime.now(UTC)
directly, breaking fake-clock testing; update ApprovalStore to accept an
injected clock (clock: Clock | None = None defaulting to SystemClock()) stored
on self (e.g., self._clock) and replace the datetime.now(UTC) call in
_compute_expiration (and any other expiration checks like
_check_expiration_locked) with the seamable clock's now() (or now(UTC)) call so
time is driven via the injected Clock implementation in tests.
- Around line 280-284: The current call to self._repo.list_items(...) passes
status=status which pre-filters results and prevents lazy expiration from
promoting PENDING items to EXPIRED; remove status from the repo query (call
list_items without the status arg or pass None) so you first retrieve the
candidate items (repo_items), run the lazy expiration/transition logic (the
existing expire/refresh code in this class), then apply the final status filter
(e.g., ApprovalStatus.EXPIRED) in-memory; update references to
repo_items/list_items and the ApprovalStatus check accordingly.
- Around line 289-300: The loop currently updates the in-memory cache
(self._items[item.id] = checked) before awaiting the durable write
self._repo.save_many(to_persist); change the flow so you collect to_persist in
the loop but do not mutate self._items there, then if to_persist: await
self._repo.save_many(to_persist) and only after that iterate over to_persist (or
the original repo_items/to_persist mapping) to assign self._items[item.id] =
checked for each persisted checked item; keep the same status/risk_level
filtering logic and ensure get()/other readers only see the promoted EXPIRED
state after save_many completes.

In `@src/synthorg/backup/retention.py`:
- Around line 174-176: The manifest loading blocks currently only catch
json.JSONDecodeError which misses non-UTF8 failures; update both exception
handlers (the one that reads manifest_path.read_text(...) and the one that calls
json.loads(raw)) to also catch UnicodeDecodeError so the prune() run won't crash
— i.e. change the handler to catch (json.JSONDecodeError, UnicodeDecodeError) as
exc and keep the existing error logging/handling around
BackupManifest.model_validate(data) and the archive loader's json parse.

In `@src/synthorg/memory/consolidation/models.py`:
- Line 107: Restore strict extra handling on the ConsolidationResult model by
updating its model_config to include extra="forbid" (e.g., change
ConfigDict(frozen=True, allow_inf_nan=False) to ConfigDict(frozen=True,
allow_inf_nan=False, extra="forbid")), ensuring the ConsolidationResult class
enforces rejection of unknown keys consistent with surrounding models and
documented behavior.

In `@src/synthorg/observability/prometheus_collector.py`:
- Around line 117-146: The _fetch_tool_names function currently only handles
AttributeError/TypeError/ValueError and lets other exceptions escape the
TaskGroup; update its exception handling so non-fatal registry failures don't
cancel sibling tasks by catching all Exceptions (while still re-raising critical
errors like MemoryError and RecursionError), e.g. change the except clauses to
first re-raise on (MemoryError, RecursionError) and then catch Exception as e
for the rest, log via logger.exception(METRICS_SCRAPE_FAILED,
component="tool_registry") including the exception, and return None so the
snapshot rebuild can proceed.

In `@src/synthorg/persistence/postgres/org_fact_repo.py`:
- Line 473: The list_by_category method accepts a raw limit and binds int(limit)
directly; normalize it the same way as query() by clamping to the range 1..100
before binding (e.g. convert to int then set limit = max(1, min(limit, 100))).
Update the code in list_by_category to use the normalized limit variable in the
SQL bind and any subsequent logic so negative, zero, or oversized values are
prevented, mirroring the behavior of query().

In `@src/synthorg/persistence/sqlite/hr_repositories.py`:
- Line 113: The code binds the `limit` parameter directly into the SQL "LIMIT ?"
clause (see the `limit: int = 100` parameter and the "LIMIT ?" binding around
lines 135-137) without validating it; add repository-boundary validation to
ensure `limit` is an integer and within an acceptable range (e.g., >0) before
binding, raising a clear error (ValueError or the repo's standard validation
exception) if it is invalid so the SQLite path mirrors Postgres rejection
behavior.

In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py`:
- Line 110: The docstring "List all recorded installations, oldest-first." is
outdated because the listing method now applies a default limit=100; update the
docstring in src/synthorg/persistence/sqlite/mcp_installation_repo.py for the
listing method to state that it returns installations oldest-first but is
bounded by a default limit (limit=100) and mention the limit parameter (and its
default) so callers know results are paginated/limited.

In `@src/synthorg/persistence/sqlite/org_fact_repo.py`:
- Line 491: Validate the incoming limit parameter before it's used by converting
to int and enforcing a sane positive range (e.g., ensure int_limit = int(limit);
if int_limit <= 0 raise ValueError or normalize to 1, and optionally cap with a
MAX_LIMIT constant like 1000) so negative or extremely large values cannot
produce unbounded SQLite reads or undefined Postgres behavior; apply the same
check/normalization in both org_fact_repo implementations where the
function/method accepts the limit parameter so behavior is identical across
backends.

In `@src/synthorg/templates/model_matcher.py`:
- Around line 361-365: Update the comment above _compute_score to use the
current constant names: replace `_TIER_BASE`, `_HEADROOM_MAX`, and
`_PRIORITY_MAX` with `_TIER_BASE_SCORE`, `_HEADROOM_MAX_BONUS`, and
`_PRIORITY_MAX_BONUS` (keep `_HEADROOM_RATIO_CAP` as-is), so the docstring
matches the actual symbols used in the module (_TIER_BASE_SCORE,
_HEADROOM_MAX_BONUS, _PRIORITY_MAX_BONUS, _HEADROOM_RATIO_CAP, and
_compute_score).

In `@tests/conformance/persistence/test_approval_repository.py`:
- Around line 300-316: The test currently calls save_many with a single-item
tuple which delegates back to save(), so change
test_save_many_upserts_existing_rows to pass multiple items to repo.save_many to
exercise the batched/ executemany path: after creating and saving original (via
_approval_repo and _make_item), build an additional item (e.g., other =
_make_item(approval_id="approval-batch-other")) and call await
repo.save_many((updated, other)); keep the existing fetch/assert for original.id
and adjust or ignore assertions for the extra item as needed so len(items) > 1
triggers the real batch upsert path in save_many.

In `@tests/conformance/persistence/test_json_constraints_sqlite.py`:
- Around line 55-83: Update the test
test_preset_overrides_default_models_rejects_non_json to parametrize the
invalid-JSON insertion across the three nullable override columns
("default_models", "supported_auth_types", "candidate_urls") using
`@pytest.mark.parametrize`; inside the test use the existing conn and the
_attempt_insert helper but make the helper accept the target column name and
insert the string "not-json" into that column while leaving the other override
columns as NULL, then assert aiosqlite.IntegrityError for each case to ensure
each CHECK constraint is exercised.

In `@tests/unit/api/controllers/test_sse_revalidate.py`:
- Around line 61-67: Replace the real SystemClock with the test FakeClock
(import FakeClock from tests._shared.fake_clock) and inject it into the tested
app_state/stream so the SSE revalidation test no longer depends on wall-clock
timing; instantiate FakeClock instead of SystemClock, set app_state.clock =
fake_clock (or pass fake_clock into the class under test), and drive time
deterministically in the test by advancing fake_clock (instead of relying on
asyncio.wait_for delays); apply the same change to the other occurrence around
lines 164-170 to make both revocation/revalidation deadlines deterministic.

In `@tests/unit/api/services/test_idempotency_service.py`:
- Around line 179-225: Replace the bespoke _DeterministicClock and
_install_deterministic_clock with the shared FakeClock from
tests._shared.fake_clock: import FakeClock, create an instance and return it for
the test to inject into the service constructor (instead of
_DeterministicClock), and monkeypatch svc_mod.asyncio.sleep to point at the
FakeClock.sleep method so the service's polling loop advances the fake time;
remove the custom clock class and helper and update references to use the
returned FakeClock instance.

In `@tests/unit/api/test_app.py`:
- Around line 526-528: The AsyncMock assignments bypass the spec on mock_sched;
replace the bare AsyncMock()s with AsyncMock instances that include the specific
method signature spec so they respect the ApprovalTimeoutScheduler interface —
e.g., set mock_sched.start = AsyncMock(spec=ApprovalTimeoutScheduler.start) and
mock_sched.stop = AsyncMock(spec=ApprovalTimeoutScheduler.stop) (keeping
mock_sched = MagicMock(spec=ApprovalTimeoutScheduler) as-is) so both async
methods are interface-checked.

In `@tests/unit/observability/conftest.py`:
- Around line 90-110: The fixture _seed_prometheus_label_snapshot currently uses
scope="session" which allows the mutable _LabelSnapshot to be shared and leaked
between tests; change the fixture to run per test (use scope="function" or omit
scope) so update_label_snapshot(...) is applied fresh for each test and the
teardown call to _reset_label_snapshot_for_tests() runs after each test; ensure
the fixture still yields between update_label_snapshot and
_reset_label_snapshot_for_tests and keep references to _LabelSnapshot,
update_label_snapshot, _reset_label_snapshot_for_tests and
PrometheusCollector.refresh() so readers can locate the change.

In `@tests/unit/observability/test_prometheus_collector_new_metrics.py`:
- Around line 17-22: Remove the local autouse snapshot fixture in this test file
(the one using _LabelSnapshot and calling _reset_label_snapshot_for_tests in
teardown); it resets the process-global Prometheus label snapshot after each
test and can poison other tests. Delete the fixture block (and any references to
it) so the shared session-level seeding remains authoritative; if you still need
explicit cleanup here, replace per-test teardown with a non-autouse helper that
calls _reset_label_snapshot_for_tests only when explicitly invoked, and leave
update_label_snapshot/status_class usage unchanged.

In `@tests/unit/security/timeout/test_scheduler_lifecycle_locks.py`:
- Around line 29-31: The test creates bare AsyncMock instances for
store.list_items and store.save_if_pending which violates the rule that all
mocks must declare the interface; update these to
AsyncMock(spec=ConcreteStoreClass) (or AsyncMock(spec=type(store)) if the
concrete class is available) so the mock enforces the store interface and method
signature, and apply the same change to the other bare AsyncMock added at the
later location (the other call at line ~48) so all AsyncMock() usages in this
test use a spec-bound concrete store/interface class (e.g., ApprovalStore or the
actual store class used by the code under test).
- Around line 67-72: The test currently only checks scheduler.is_running and
scheduler._task but may not catch brief duplicate task creations; patch or
monkeypatch asyncio.create_task (or wrap the scheduler.create_task call) in the
test around the calls to scheduler.start() to count invocations and assert it
was called exactly once, then proceed to assert scheduler.is_running and that
scheduler._task is the same task instance to verify the lifecycle lock behavior;
reference the calls to scheduler.start(), scheduler._task and
asyncio.create_task when locating where to add the mock/patch and the assertion.

---

Outside diff comments:
In `@src/synthorg/a2a/well_known.py`:
- Around line 136-137: Controller call sites are calling cache helpers without
the injected clock, bypassing the app-level clock seam; update each call to pass
app_state.clock into the helper so timing honors the injected Clock. Concretely,
change calls like _get_cached_card(company_cache_key, ttl) to include the clock
argument (e.g. _get_cached_card(company_cache_key, ttl, clock=app_state.clock))
and do the same for the other helpers used in this module (e.g.
_set_cached_card, _get_cached_mappings, _set_cached_mappings) using the keyword
parameter name the helper expects so tests can inject FakeClock.

In `@src/synthorg/api/controllers/setup_agents.py`:
- Around line 494-503: The code assumes agent["model"] is a dict and
agent["tier"] is a string; guard both before calling .get or .strip to avoid
exceptions when persisted JSON contains null/non-dict or non-string values: in
the block building the SetupAgentSummary (see variable model and call to
normalize_optional_string(model.get("provider")) and
normalize_optional_string(model.get("model_id"))), coerce or replace model with
an empty dict when it is None or not a mapping, and for tier ensure you convert
agent.get("tier") to a string (or fallback to "") before calling .strip() and
default to "medium"; keep using normalize_optional_string for personality_preset
and level as before.

In `@src/synthorg/memory/backends/inmemory/adapter.py`:
- Line 178: The code calls _require_connected() before awaiting _store_lock
which can allow disconnect() to run while waiting; after acquiring the
_store_lock (i.e., immediately after the await that obtains the lock in the same
method), re-check connection state by calling _require_connected() (or checking
the same connected flag used by disconnect()) again and abort/raise if
disconnected so subsequent writes (the code that uses the protected store later
in this method) cannot proceed when disconnected; apply the same post-lock
revalidation pattern to the other block covering the await and following write
operations.

In `@src/synthorg/observability/otlp_trace_handler.py`:
- Around line 195-199: Normalize base_endpoint by stripping the trailing slash
before checking/appending the traces suffix to avoid duplicating "/v1/traces":
call strip_trailing_slash(base_endpoint) into a local variable, check if that
normalized value endswith _TRACES_ENDPOINT_SUFFIX and return it if so, otherwise
return the normalized value + _TRACES_ENDPOINT_SUFFIX; update the implementation
of _resolve_traces_endpoint to use this normalized variable instead of testing
base_endpoint directly.

In `@src/synthorg/persistence/connection_protocol.py`:
- Around line 36-42: The docstring for list_all() is stale: update it to reflect
the new default bounded behavior (default limit=100 rather than limit=None),
state that passing limit=None retains unbounded legacy semantics if still
supported, keep the notes that limit <= 0 returns () and negative offset is
treated as 0, and confirm sorting by name ascending; edit the docstring in the
list_all() definition to clearly document the new default and the semantics of
None, limit, and offset.

In `@src/synthorg/persistence/cost_record_protocol.py`:
- Around line 37-40: Update the docstring to remove the obsolete "None"
fetch-all semantics and accurately reflect that parameter limit is a
non-nullable int defaulting to 100 and not optional; describe that offset is
applied only when limit is provided (i.e., when limit > 0) and adjust wording to
state the default value (100) rather than "None preserves fetch-all". Locate and
change the wording around the limit/offset parameter descriptions (references:
the limit and offset parameters in this module's function/method signature in
cost_record_protocol.py) so the docs match the implementation.

In `@src/synthorg/persistence/postgres/ontology_entity_repo.py`:
- Around line 226-231: The code keeps a legacy None fallback (effective_limit =
1000 if limit is None else int(limit)) which contradicts the new signature
limit: int = 100; change the logic in the method (the list function around the
shown signature, e.g., list_entities) to simply compute effective_limit =
int(limit) (and remove any checks for limit is None), and repeat the same
removal of the None-fallback in the other similar block referenced (lines
~263-270) so both methods honour the new limit=100 contract consistently.

In `@src/synthorg/persistence/postgres/repositories.py`:
- Around line 224-253: The pagination code currently permits callers to bypass
LIMIT by checking "if limit is not None", so a caller passing None can trigger
unbounded reads; change the logic in the tasks listing method (the block
building query using self._TASK_COLUMNS, clauses, params, and variables
limit/offset) to always append " LIMIT %s" and add int(limit) to params (remove
the if limit is not None guard), and always append " OFFSET %s" when offset is
provided while still casting offset with int(offset); apply the same change to
the other similar method referenced (the second occurrence around lines 620-643)
so pagination caps are always enforced.

In `@src/synthorg/persistence/postgres/session_repo.py`:
- Around line 127-144: Add a guard that returns an empty tuple when limit is
non-positive to match other paginated repos: in the method that lists active
(non-expired, non-revoked) sessions (the function taking parameters limit and
offset and using variables now, sql, params, effective_offset), check if
int(limit) <= 0 and immediately return tuple() before constructing the
SQL/params; do the same in the similar paginated block around lines 155–171 (the
other query that appends "LIMIT %s OFFSET %s") to ensure deterministic behavior
when limit is non-positive.

In `@src/synthorg/persistence/sqlite/connection_repo.py`:
- Around line 231-241: Remove the dead None-checks for the non-nullable limit
parameter: in the connection listing function that takes limit: int = 100 and
offset: int = 0 (the block building sql and params with "LIMIT ? OFFSET ?",
variables sql and params), eliminate the surrounding if limit is not None and
always append " LIMIT ? OFFSET ?" and set params = (int(limit), max(0,
int(offset))). Apply the same simplification to the corresponding list_by_type
implementation so both functions no longer perform redundant None checks on
limit.

In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py`:
- Around line 107-120: The list method that returns tuple[McpInstallation, ...]
should validate the limit before binding to the SQLite LIMIT clause: ensure
limit is converted to int and constrained to >0 (e.g., raise or coerce to a
positive minimum) or only append " LIMIT ? OFFSET ?" when int(limit) > 0, and
bind params accordingly (use effective_offset for OFFSET as already computed);
update the method docstring to state that limit must be a positive integer (or
that non-positive means "no upper bound" is not allowed) and mirror the
validation behavior used in ssrf_violation_repo/escalation_repo to prevent
passing a negative LIMIT to SQLite.

In `@src/synthorg/persistence/sqlite/ontology_entity_repo.py`:
- Around line 225-226: Clamp the computed effective_limit to a positive bounded
range instead of accepting negative values: replace the current assignments that
set effective_limit = 1000 if limit is None else int(limit) with logic that
converts limit to int and then applies min(max(1, int(limit)), 1000) (so limit
is forced into [1,1000]); keep effective_offset = max(0, int(offset)) as-is.
Apply this same change at both occurrences where effective_limit is computed
(the two spots using the effective_limit variable around the lines referenced)
so SQL LIMIT never receives a negative value.

In `@src/synthorg/persistence/sqlite/repositories.py`:
- Around line 608-623: The checks for "limit is not None" are dead because the
signature already declares limit: int = 100; remove those redundant None guards:
drop the "if limit is not None and limit < 1" branch in favor of a single
validation that raises QueryError when limit < 1, and always append the " LIMIT
?" to the SQL and params (using the limit variable) instead of guarding with "if
limit is not None"; update the function that builds the query (the
message-history retrieval routine that constructs sql, params and may raise
QueryError) accordingly so only the numeric validation remains and the LIMIT is
unconditionally applied.

In `@src/synthorg/persistence/sqlite/session_repo.py`:
- Around line 145-160: The method that lists active sessions (takes parameters
user_id, limit, offset and returns tuple[Session, ...]) must guard against
non-positive limits: before building or executing the SQL (i.e. before
constructing sql, params, effective_offset and calling self._db.execute), check
if int(limit) <= 0 and immediately return an empty tuple; apply the same guard
to the corresponding listing method for the other path referenced around lines
167–181 so both code paths short-circuit on non-positive limits to keep
pagination predictable.

In `@src/synthorg/persistence/sqlite/user_repo.py`:
- Around line 685-689: The docstring for the parameter `limit` in
src/synthorg/persistence/sqlite/user_repo.py is stale: update the `limit`
description to reflect the current signature `limit: int = 100` and state that
SQL LIMIT is always applied (default 100) rather than `None` meaning fetch-all;
also adjust the `offset` description to remove the clause about being ignored
when `limit` is `None` and instead explain its interaction with the applied
LIMIT. Locate the parameter docs for `limit` and `offset` in the method
docstring (search for "limit:" and "offset:") and revise the wording to mention
the default value 100 and that LIMIT is always enforced.
- Around line 678-700: In list_by_user, validate the limit before building the
SQL: if int(limit) <= 0, log a warning with PERSISTENCE_API_KEY_LIST_FAILED and
user_id/error context (use the existing logger) and raise QueryError (not
ValueError); then proceed to bind the validated limit and offset to the SQL.
Also update the docstring for list_by_user to remove the stale "None (default)
preserves fetch-all semantics" wording and state that the default is 100 and
must be a positive integer.

In `@src/synthorg/persistence/task_protocol.py`:
- Around line 54-56: The docstring for list_tasks is outdated: the parameter
`limit` is no longer allowed to be None and now defaults to an int (100); update
the docstring on `list_tasks` to describe `limit: int = 100` as the default
maximum rows to return and remove or replace the text that says "``None`` means
'no repository-level cap'"; instead document that callers may pass an integer to
control paging and that higher-level callers can still impose additional caps,
and ensure the parameter description matches the function signature and default
behavior.

In `@src/synthorg/persistence/user_protocol.py`:
- Around line 189-200: The docstring for list_by_user no longer matches the
function signature: update the docstring for the function list_by_user
(returning tuple[ApiKey, ...]) to remove references to limit=None and the
"fetch-all semantics" and instead document that limit defaults to 100 (a bounded
maximum number of keys returned), explain that offset skips rows and that
providing a different integer limit overrides the default, and ensure the Args
section and examples (if any) reflect the new bounded-default behavior.

In `@src/synthorg/security/trust/service.py`:
- Around line 299-306: check_decay currently reads and then writes
self._trust_states without holding self._state_lock, allowing races with
apply_trust_change/evaluate_agent; wrap the read-modify-write of the state in
check_decay with the same lock used elsewhere (self._state_lock) so the sequence
key = str(agent_id); state = self._trust_states.get(key); updated =
state.model_copy(...); self._trust_states[key] = updated is performed while
holding the lock, ensuring last_decay_check_at is updated atomically and
avoiding lost updates.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

coderabbitai · 2026-05-03T20:58:37Z

+@pytest.fixture(autouse=True, scope="session")
+def _seed_prometheus_label_snapshot() -> Iterator[None]:
+    """Seed the Prometheus label snapshot for all observability tests.
+
+    ``record_tool_invocation`` validates the ``tool_name`` label
+    against the snapshot maintained by ``PrometheusCollector.refresh()``.
+    Unit tests that never invoke ``refresh()`` would otherwise see an
+    empty snapshot and reject every recording call.  Session-scoping
+    keeps the snapshot populated even when individual files install
+    their own (now redundant) seed-and-reset fixtures, so cross-file
+    test ordering under xdist ``loadfile`` can no longer leave an
+    empty snapshot in place between files on the same worker.
+    """
+    update_label_snapshot(
+        _LabelSnapshot(
+            tool_names=frozenset({"web_search", "calculator", "t"}),
+            tool_names_seeded=True,
+        ),
+    )
+    yield
+    _reset_label_snapshot_for_tests()


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Seed this global snapshot per test, not per session.

_LabelSnapshot is mutable process-global state. With scope="session", any test that calls update_label_snapshot() or PrometheusCollector.refresh() can leak its snapshot into later tests, so pass/fail depends on execution order.

Suggested fix

-@pytest.fixture(autouse=True, scope="session") +@pytest.fixture(autouse=True) def _seed_prometheus_label_snapshot() -> Iterator[None]:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@pytest.fixture(autouse=True, scope="session")

def _seed_prometheus_label_snapshot() -> Iterator[None]:

"""Seed the Prometheus label snapshot for all observability tests.

``record_tool_invocation`` validates the ``tool_name`` label

against the snapshot maintained by ``PrometheusCollector.refresh()``.

Unit tests that never invoke ``refresh()`` would otherwise see an

empty snapshot and reject every recording call. Session-scoping

keeps the snapshot populated even when individual files install

their own (now redundant) seed-and-reset fixtures, so cross-file

test ordering under xdist ``loadfile`` can no longer leave an

empty snapshot in place between files on the same worker.

"""

update_label_snapshot(

_LabelSnapshot(

tool_names=frozenset({"web_search", "calculator", "t"}),

tool_names_seeded=True,

),

)

yield

_reset_label_snapshot_for_tests()

`@pytest.fixture`(autouse=True)

def _seed_prometheus_label_snapshot() -> Iterator[None]:

"""Seed the Prometheus label snapshot for all observability tests.

``record_tool_invocation`` validates the ``tool_name`` label

against the snapshot maintained by ``PrometheusCollector.refresh()``.

Unit tests that never invoke ``refresh()`` would otherwise see an

empty snapshot and reject every recording call. Session-scoping

keeps the snapshot populated even when individual files install

their own (now redundant) seed-and-reset fixtures, so cross-file

test ordering under xdist ``loadfile`` can no longer leave an

empty snapshot in place between files on the same worker.

"""

update_label_snapshot(

_LabelSnapshot(

tool_names=frozenset({"web_search", "calculator", "t"}),

tool_names_seeded=True,

),

)

yield

_reset_label_snapshot_for_tests()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/observability/conftest.py` around lines 90 - 110, The fixture _seed_prometheus_label_snapshot currently uses scope="session" which allows the mutable _LabelSnapshot to be shared and leaked between tests; change the fixture to run per test (use scope="function" or omit scope) so update_label_snapshot(...) is applied fresh for each test and the teardown call to _reset_label_snapshot_for_tests() runs after each test; ensure the fixture still yields between update_label_snapshot and _reset_label_snapshot_for_tests and keep references to _LabelSnapshot, update_label_snapshot, _reset_label_snapshot_for_tests and PrometheusCollector.refresh() so readers can locate the change.

coderabbitai · 2026-05-03T20:58:37Z

+from synthorg.observability.prometheus_labels import (
+    _LabelSnapshot,
+    _reset_label_snapshot_for_tests,
+    status_class,
+    update_label_snapshot,
+)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the local snapshot fixture; its teardown can poison later test files.

This autouse fixture resets the process-global Prometheus label snapshot to bootstrap mode after each test here. Since the shared fixture only seeds once per session, later observability files on the same worker can inherit the empty snapshot and start rejecting tool_name labels based on file order.

Suggested cleanup

from synthorg.observability.prometheus_labels import ( - _LabelSnapshot, - _reset_label_snapshot_for_tests, status_class, - update_label_snapshot, ) @@ -@pytest.fixture(autouse=True) -def _seed_tool_name_snapshot() -> Generator[None]: - """Seed the prometheus label snapshot with bounded tool-name values. - - record_tool_invocation now bounds ``tool_name`` against the - snapshot maintained by PrometheusCollector.refresh(); these unit - tests never invoke refresh() so we seed manually. Reset on - teardown so cross-file tests start from a clean snapshot. - """ - update_label_snapshot( - _LabelSnapshot( - tool_names=frozenset({"web_search", "calculator", "t"}), - tool_names_seeded=True, - ), - ) - yield - _reset_label_snapshot_for_tests()

Also applies to: 27-43

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/observability/test_prometheus_collector_new_metrics.py` around lines 17 - 22, Remove the local autouse snapshot fixture in this test file (the one using _LabelSnapshot and calling _reset_label_snapshot_for_tests in teardown); it resets the process-global Prometheus label snapshot after each test and can poison other tests. Delete the fixture block (and any references to it) so the shared session-level seeding remains authoritative; if you still need explicit cleanup here, replace per-test teardown with a non-autouse helper that calls _reset_label_snapshot_for_tests only when explicitly invoked, and leave update_label_snapshot/status_class usage unchanged.

coderabbitai · 2026-05-03T20:58:37Z

+    store.list_items = AsyncMock(return_value=())
+    store.save_if_pending = AsyncMock(
+        side_effect=lambda item: ApprovalItem(


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not introduce new unspecced AsyncMock call sites.

Line 29, Line 30, and Line 48 add bare AsyncMock(...); these should be spec-bound so the tests enforce interface contracts.

💡 Proposed fix

def _make_store() -> MagicMock: from synthorg.approval.protocol import ApprovalStoreProtocol store = MagicMock(spec=ApprovalStoreProtocol) - store.list_items = AsyncMock(return_value=()) + store.list_items = AsyncMock( + spec=ApprovalStoreProtocol.list_items, + return_value=(), + ) store.save_if_pending = AsyncMock( + spec=ApprovalStoreProtocol.save_if_pending, side_effect=lambda item: ApprovalItem( id=item.id, action_type=item.action_type, title=item.title, description=item.description, @@ def _make_checker() -> MagicMock: from synthorg.security.timeout.timeout_checker import TimeoutChecker checker = MagicMock(spec=TimeoutChecker) - checker.check_and_resolve = AsyncMock(side_effect=lambda item: (item, None)) + checker.check_and_resolve = AsyncMock( + spec=TimeoutChecker.check_and_resolve, + side_effect=lambda item: (item, None), + ) return checker

As per coding guidelines tests/**/*.py: “Every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare the interface it stands for via spec=ConcreteClass.”

Also applies to: 48-48

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/security/timeout/test_scheduler_lifecycle_locks.py` around lines 29 - 31, The test creates bare AsyncMock instances for store.list_items and store.save_if_pending which violates the rule that all mocks must declare the interface; update these to AsyncMock(spec=ConcreteStoreClass) (or AsyncMock(spec=type(store)) if the concrete class is available) so the mock enforces the store interface and method signature, and apply the same change to the other bare AsyncMock added at the later location (the other call at line ~48) so all AsyncMock() usages in this test use a spec-bound concrete store/interface class (e.g., ApprovalStore or the actual store class used by the code under test).

coderabbitai · 2026-05-03T20:58:37Z

+            await asyncio.gather(scheduler.start(), scheduler.start())
+            assert scheduler.is_running
+            # Both calls converged on the same task instance.
+            first_task = scheduler._task
+            assert first_task is not None
+        finally:


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The concurrency assertion does not prove single task spawn.

This test currently validates “running + task exists,” but it can still pass if duplicate tasks were briefly created. Assert asyncio.create_task call count (or equivalent) to verify the lifecycle lock behavior directly.

💡 Proposed fix

-from unittest.mock import AsyncMock, MagicMock +from unittest.mock import AsyncMock, MagicMock, patch ... scheduler = ApprovalTimeoutScheduler( approval_store=_make_store(), timeout_checker=_make_checker(), interval_seconds=60.0, ) try: - await asyncio.gather(scheduler.start(), scheduler.start()) + with patch("asyncio.create_task", wraps=asyncio.create_task) as create_task: + await asyncio.gather(scheduler.start(), scheduler.start()) + assert create_task.call_count == 1 assert scheduler.is_running # Both calls converged on the same task instance. first_task = scheduler._task assert first_task is not None

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/security/timeout/test_scheduler_lifecycle_locks.py` around lines 67 - 72, The test currently only checks scheduler.is_running and scheduler._task but may not catch brief duplicate task creations; patch or monkeypatch asyncio.create_task (or wrap the scheduler.create_task call) in the test around the calls to scheduler.start() to count invocations and assert it was called exactly once, then proceed to assert scheduler.is_running and that scheduler._task is the same task instance to verify the lifecycle lock behavior; reference the calls to scheduler.start(), scheduler._task and asyncio.create_task when locating where to add the mock/patch and the assertion.

The wrapper was a one-line passthrough to budget._aggregation.group_by_agent. Callers now import group_by_agent directly, removing a layer of indirection that obscured the call graph without adding any value. Refs #1733 (R11)

MemoryMetadata and MemoryQuery had identical post-init dedup validators; ProcedureLearningRecord had a similar but cap-truncated variant. Extract the dedup primitive to synthorg.memory.utils.deduplicate_tags so all three share one implementation. Procedural model still composes its own truncation on top. Refs #1733 (R9)

Add three new helpers to synthorg.core.normalization: - strip_trailing_slash(url): replaces inline url.rstrip('/') at A2A agent cards, OAuth redirect, OTLP handlers, telemetry emitter, ntfy adapter, provider probing. - normalize_optional_string(raw): replaces (raw.strip() or None) if raw else None at setup-agents helpers. - normalize_path(path): replaces (path or '').rstrip('/') or '/' at CSRF middleware. 17 inline call sites migrated. Sites with non-canonical shapes (citation normalizer's conditional strip, validate_subworkflow's elif-chain, mem0 sparse_search's isinstance ternary) left untouched. Refs #1733 (R4)

Extract getNodeLabel + node-data shape guards from useOrgChartSelection into a shared web/src/pages/org/node-utils.ts. OrgChartFilter previously had a similar but cast-based copy; now both call the guard-protected version, so a record with a blank/whitespace name can never surface as an empty UI label. Replace the two parallel switch statements in computeCostBreakdown with a DIMENSION_RESOLVERS strategy table keyed by BreakdownDimension, so the key/label resolution stays exhaustively typed against the union and each dimension's logic lives in one place. Refs #1733 (R8)

The audit-chain RFC 3161 timestamping defaults shipped with bare http:// URLs across three sources of truth: settings/definitions/observability.py registry entries, settings/bridge_configs.py Pydantic field defaults, and audit_chain/config.py's _DEFAULT_PRESET_URLS map. Both endpoints serve TLS today; switch all three to https:// so a fresh-install operator does not silently round-trip signed audit events over an unencrypted hop. Both definitions are restart_required=True, so existing operators keep their current YAML/DB value; only fresh installs and reset-to-default flows pick up the new scheme. Refs #1733 (#30)

The SsrfViolationService recorded mutations on the api.* event namespace which excludes them from the signed audit chain. Switch to SECURITY_SSRF_VIOLATION_RECORDED for create and route the resolution event to SECURITY_SSRF_VIOLATION_ALLOWED / _DENIED based on the new status so a single audit-chain reader can reconstruct the WHO+WHEN of allow vs deny without unpacking the payload. The two read-side events (API_SSRF_VIOLATION_LISTED, API_SSRF_VIOLATION_FETCH_FAILED) stay on the api.* namespace because list/fetch failures carry no audit-chain implication. The two now-unused mutation constants are dropped from observability/events/api.py. Refs #1733 (#2)

Two write endpoints had no per-op rate limit despite being expensive: /oauth/initiate kicks off an external authorization flow, and /settings/security/import validates and persists a full SecurityConfig payload. Add policies oauth.initiate (10 req/60s/user) and settings.import (5 req/3600s/user) to RATE_LIMIT_POLICIES, and attach per_op_rate_limit_from_policy guards to both routes. Refs #1733 (#10)

The 'Rejection requires a reason for the approval record. Provide a brief explanation.' literal lived inline at three sites: the approval detail drawer's inline field error + toast, and the batch reject toast on ApprovalsPage. Move to web/src/pages/approvals/errors.ts so a future copy tweak lands in one place. Refs #1733 (#28)

tarfile.TarFile.extractfile() returns a file-like object that must be closed; on CPython the GC closes eventually, but on PyPy or fast restart paths the file descriptor can outlive the request. Both manifest readers now wrap the returned object in a 'with' block so the fd is released as soon as the read completes, before the JSON-validation step. Refs #1733 (#9)

Extract weight/threshold literals to named constants in three scoring modules without changing their values. The constants document intent, let docstrings reference symbols rather than restate numbers, and prepare the way for future tuning experiments to land as single-line diffs. - engine/routing/scorer.py: PRIMARY_SKILL_WEIGHT (0.4), SECONDARY_SKILL_WEIGHT (0.2), TAG_MATCH_BONUS (0.1), ROLE_MATCH_BONUS (0.2), SENIORITY_ALIGNMENT_BONUS (0.2). - engine/quality/graders/heuristic.py: PASS_GRADE (0.8), FAIL_GRADE (0.3), CONFIDENCE_CEILING (0.9), CONFIDENCE_BIAS (0.1). - templates/model_matcher.py: TIER_BASE_SCORE (0.5), HEADROOM_MAX_BONUS (0.25), PRIORITY_MAX_BONUS (0.25), HEADROOM_RATIO_CAP (2.0), BALANCED_PARTIAL_CREDIT (0.125). Refs #1733 (#22)

record_tool_invocation accepted free-form tool_name, leaving the metric vulnerable to cardinality explosion if a buggy caller fabricated names. Add validate_tool_name() that consults the same process-global label snapshot used by validate_agent_id and validate_workflow_definition_id; the snapshot's tool_names set is refreshed from app_state.tool_registry.list_tools() on every scrape. Plugin-loaded tools are picked up the next time refresh() runs, so the bound stays in lockstep with the registry without requiring runtime callers to manage the allowlist. Refs #1733 (#12)

Three hot-path classes mutated shared state without synchronisation: - DelegationCircuitBreaker (loop_prevention/circuit_breaker.py): _pairs and _dirty are mutated by sync get_state() and record_delegation() called from many concurrent async paths. Add a threading.RLock guarding the read-modify-write regions; RLock so a future caller that re-enters via get_state() inside record_delegation() does not deadlock. - TrustService.apply_trust_change / evaluate_agent (security/trust/service.py): the setdefault().append() pattern on _change_history plus the dict assignment to _trust_states formed a multi-step read-modify-write that two concurrent same-agent calls could interleave, dropping one update or producing a phantom _change_history entry that no _trust_states row backs. Add an asyncio.Lock and hold it across both writes. - InMemoryBackend.store (memory/backends/inmemory/adapter.py): the setdefault() / capacity check / assign chain could silently exceed max_memories_per_agent under concurrent load. Add a separate _store_lock (kept distinct from _connect_lock so connect/disconnect do not serialise hot-path traffic). Refs #1733 (#6)

Two service classes had no lifecycle synchronisation despite owning async start/stop methods: - Worker (workers/worker.py): an in-place runner where run() is the loop body itself, so the lock guards only the _running flag transition (acquire briefly to check-and-set, release before the loop body, re-acquire in the finally to clear). Holding it across the whole run() body would deadlock a second concurrent caller. - ApprovalTimeoutScheduler (security/timeout/scheduler.py): start() was sync; converted to async. Added _lifecycle_lock held across the full body of both start() and stop() so two concurrent start() callers cannot both pass the is_running guard and spawn duplicate scheduler tasks. Added _stop_failed unrestartable flag per the canonical pattern in docs/reference/lifecycle-sync.md. Updated callers in api/lifecycle.py and the test suite to await scheduler.start(). Refs #1733 (#7)

Add Long descriptions to all top-level commands missing them (start, status, stop, version, uninstall, update) and their config subcommands (show, unset, list, path, edit; get + set already had Long fields). Add Example blocks to the four config subcommands that lacked them (show, unset, list, path, edit, get, set) and to doctor report. Existing test suite continues to pass. Refs #1733 (#21)

…JSONB provider_audit_events.payload and preset_overrides.{default_models, supported_auth_types,candidate_urls} are JSONB on Postgres (which implicitly validates JSON shape) but plain TEXT on SQLite, so a buggy caller could land malformed JSON on the SQLite side without warning. SQLite has no JSONB type, so add CHECK (json_valid(col)) constraints on the TEXT columns to match the implicit Postgres validation. The nullable preset_overrides columns guard with 'IS NULL OR json_valid' because SQLite's json_valid() returns 0 for NULL. The single Atlas migration captures both schema deltas in one revision so the check-single-migration-per-pr hook stays satisfied. Refs #1733 (#8)

ApprovalStore.list_items previously called _check_expiration_locked per item, each issuing a save round-trip when the item transitioned PENDING to EXPIRED. Worst case K simultaneous expiries produced K UPDATE round-trips. Add save_many to the ApprovalRepository protocol and implement on both backends (executemany under one transaction; constraint violations roll back the whole batch). Split the per-item check into a pure _compute_expiration plus a side-effect block that fans out a single save_many and an audit pass after the DB write. The list path is split into _list_from_repo_locked (batched) and _list_from_cache_locked (per-item) so the orchestration body stays under the 10-branch complexity ceiling. Refs #1733 (#13)

…licy The app() factory previously hard-coded approval_timeout_scheduler=None with a comment noting auto-creation from settings was 'not yet wired', which made security.timeout_check_interval_seconds dead at the runtime. Wire a default scheduler now: WaitForeverPolicy is the safe production default (the scheduler runs the periodic scan and emits TIMEOUT_WAITING events, but never auto-decides pending approvals); operators swap in DenyOnTimeout / Tiered / EscalationChain via the security.* settings. BackupService bootstrap was already complete (build_backup_service factory, lifecycle.py start/stop hooks); this commit completes the matching ApprovalTimeoutScheduler half so all 13 audit-flagged settings now have a live consumer at runtime. Refs #1733 (#16)

Replace bare time.monotonic / time.time / asyncio.get_event_loop().time with a Clock-injected seam in classes flagged by the audit so tests can drive virtual time without monkey-patching modules. Each affected class adds an optional clock: Clock | None = None constructor parameter that defaults to SystemClock; module-level helpers without a class accept the clock as a keyword. Sites covered: - a2a/well_known.py module-level cache helpers - api/controllers/health.py via app_state.clock - api/controllers/events.py SSE keepalive + revalidate timer - api/services/idempotency_service.py poll deadline - api/state.py AppState.clock attribute (new) - api/app.py instantiates AppState.clock = SystemClock by default - communication/bus/memory.py InProcessMessageBus - communication/bus/_nats_state.py state.clock field - communication/bus/_nats_receive.py reads state.clock - engine/workflow/ceremony_scheduler.py - integrations/health/checks/database.py - memory/retrieval/hierarchical/workers.py three workers - meta/validation/ci_validator.py - security/uncertainty.py - tools/sandbox/docker_sandbox.py The idempotency-service test that previously monkey-patched svc_mod.time.monotonic now constructs a _DeterministicClock and passes it through the new clock= constructor kwarg. Refs #1733 (#17)

Sweep ALL ConfigDict(frozen=True, allow_inf_nan=False) instances in the listed domains (a2a, api/controllers, approval, backup, client, communication, config, engine, hr, memory, providers, security) and add extra='forbid' so unknown fields surface as ValidationError at ingestion rather than silently passing. Carve-out: 46 ConfigDict instances on 37 classes that declare @computed_field keep their lenient extra-handling, because pydantic v2's model_dump() includes computed-field values by default and the forbid contract would reject them on round-trip via ModelClass( **other.model_dump()). Net: 489 ConfigDict instances now reject extras (535 - 46 carve-outs). Test fix for the SSE-revalidate fake (which now needs an app_state.clock attribute) lands in the same commit. Refs #1733 (#18)

The audit (#19) flagged 12 protocols whose list_* / query / list_by_* methods accepted limit: int | None = None, allowing callers to inadvertently issue 'fetch all' queries that hit the database hard on large tables. Sweep src/synthorg/persistence/ and replace every 'limit: int | None = None' with 'limit: int = 100' (40 method signatures across 26 files). The default 100 mirrors the existing caps documented on the most-paginated methods so callers that omit limit keep working; callers that explicitly pass None now surface a TypeError, which is the intended contract shift. The full unit suite continues to pass (26454 passed) and the conformance suite is green. Refs #1733 (#19)

Audit (#24) flagged 9 repositories as needing new dual-backend conformance tests but every listed repo already has parametrised sqlite + postgres coverage: connection_repo, connection_secret_repo, oauth_state_repo, webhook_receipt_repo: tests/conformance/persistence/test_connection_repositories.py lockout_repo, refresh_repo, session_repo: tests/conformance/persistence/test_auth_repositories.py ontology_drift_repo, ontology_entity_repo: tests/conformance/persistence/test_ontology_repositories.py 74 conformance tests across the 3 files run against both backends via the parametrised backend fixture in conformance/persistence/ conftest.py (sqlite + testcontainers Postgres). #24's premise was a miscount; the existing coverage already satisfies the dual-backend parity contract for all 9 listed repos. What this commit adds is conformance for the new save_many method landed in #13: 3 new tests (round-trip, empty no-op, upsert semantics) so the new protocol method also runs on both backends. Refs #1733 (#24)

After making limit required on persistence list methods, several call sites still had if limit is not None branches that became unreachable once the parameter type changed from int | None to int. The hr lifecycle event protocol was outside src/synthorg/persistence/ so the prior batched rewrite missed it; the protocol now matches concrete impls. The task engine wrapper still exposes limit=None for legacy fetch-all semantics; it now translates that into the safety cap before reaching the repository so the protocol-level int requirement holds.

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/synthorg/persistence/postgres/mcp_installation_repo.py (1)
133-159: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Default limit=100 introduces silent truncation for no-arg callers.

On Line 133, list_all() now caps results by default. Cross-file, src/synthorg/integrations/connections/catalog.py (Line 99-Line 103) calls await self._repo.list_all() with no pagination, which now returns only the first page when rows exceed 100. Please either preserve full-scan semantics for list_all() or update full-scan call sites to explicit pagination loops.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/postgres/mcp_installation_repo.py` around lines 133
- 159, The new list_all() method now defaults limit=100 which silently truncates
results for callers like self._repo.list_all() in
src/synthorg/integrations/connections/catalog.py; fix by either restoring
full-scan semantics in list_all() (e.g., change the signature so limit defaults
to None/unbounded, update validate_pagination_args and the SQL generation to
omit LIMIT/OFFSET when limit is None) or update the caller (the call site using
self._repo.list_all()) to perform explicit pagination loops using limit/offset
until no more rows; relevant symbols: list_all, validate_pagination_args,
PERSISTENCE_MCP_INSTALLATION_LIST_FAILED, and the caller self._repo.list_all().

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/api/approval_store.py`:
- Around line 655-681: The helper _fire_expire_callback currently logs
API_APPROVAL_EXPIRE_CALLBACK_FAILED with logger.error while
_check_expiration_locked emits the same event at WARNING; make the severities
consistent by changing the logging call in _fire_expire_callback to use
logger.warning (keeping the same structured fields: approval_id, error_type,
error via safe_error_description) so both expiration paths emit the same
WARNING-level event for API_APPROVAL_EXPIRE_CALLBACK_FAILED.
- Around line 315-375: The loop in list_items() is holding self._lock across
unbounded repo scans and then performing durable repo I/O (self._repo.save_many)
and callback dispatch (_fire_expire_callback) while still holding the lock,
blocking get()/save()/save_if_pending(); fix by confining the lock to only cache
reads/writes and computing expirations: iterate pages under the lock to build a
list of candidate ApprovalItem ids and computed checked items (via
_compute_expiration) and update the in-memory cache mapping (_items) only for
identity/fast updates, then release self._lock and perform durable writes
(self._repo.save_many) and fire callbacks (_fire_expire_callback) outside the
lock; alternatively, process in bounded batches (use page_size) so each batch
does: compute expirations under lock, release lock, await save_many, update
cache and emit callbacks, then re-acquire for next batch to avoid holding the
lock during long repo I/O or callback work.

In `@src/synthorg/engine/health/models.py`:
- Around line 100-123: The current __init__ logic wraps metadata in
MappingProxyType before Pydantic validation, so callers can still receive a
plain mutable dict; remove or stop performing the metadata freeze in __init__
and instead implement an `@field_validator`("metadata", mode="after") that:
accepts the validated value, if it's a Mapping deep-copies it
(copy.deepcopy(dict(value))) and returns MappingProxyType(...) to make it
read-only, otherwise returns the value unchanged; update/remove the
metadata-handling block in __init__ to avoid double-wrapping or pre-validation
mutation and ensure the field description and typing for metadata remain
unchanged.

In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py`:
- Around line 114-116: The docstring in MCPInstallationRepo that claims callers
must loop with offset instead of passing a larger limit is inaccurate: the
implementation currently accepts any limit >= 1 (with a default of 100). Fix by
either (A) updating the docstring for the method that accepts the limit/offset
parameters to state that limit defaults to 100 but callers may pass larger
limits (i.e., no enforced max), or (B) enforce a max by adding validation in the
same method (raise ValueError or clamp) when limit > 100 and update the
docstring to document the enforced max; reference the MCPInstallationRepo class
and its method that takes parameters named limit and offset when making the
change.

In `@tests/unit/communication/loop_prevention/test_circuit_breaker.py`:
- Around line 486-490: Replace the time.sleep-based handshake between the main
thread and the sibling thread with a deterministic synchronization primitive:
create a threading.Event (or Barrier) that the sibling thread in
_mutate_in_sibling sets once it has observed the contended acquire (i.e., after
it attempts to acquire and detects the lock is held inside its __enter__/acquire
path), and have the main thread wait on that event before proceeding with
check(); then assert in the test that the event was set (verifying the sibling
observed the lock) and clear/cleanup the event before allowing __exit__ to run.
Ensure you update both occurrences around Thread(target=_mutate_in_sibling, ...)
(lines near check() / __enter__/__exit__) so the sibling signals the event and
the main thread waits deterministically instead of using time.sleep.

---

Outside diff comments:
In `@src/synthorg/persistence/postgres/mcp_installation_repo.py`:
- Around line 133-159: The new list_all() method now defaults limit=100 which
silently truncates results for callers like self._repo.list_all() in
src/synthorg/integrations/connections/catalog.py; fix by either restoring
full-scan semantics in list_all() (e.g., change the signature so limit defaults
to None/unbounded, update validate_pagination_args and the SQL generation to
omit LIMIT/OFFSET when limit is None) or update the caller (the call site using
self._repo.list_all()) to perform explicit pagination loops using limit/offset
until no more rows; relevant symbols: list_all, validate_pagination_args,
PERSISTENCE_MCP_INSTALLATION_LIST_FAILED, and the caller self._repo.list_all().

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6356d35a-bcb4-48ce-8b92-c19b8caa0063

📥 Commits

Reviewing files that changed from the base of the PR and between 2ceba78 and 4f8a47e.

📒 Files selected for processing (9)

scripts/mock_spec_baseline.txt
src/synthorg/api/approval_store.py
src/synthorg/engine/health/models.py
src/synthorg/observability/events/persistence.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/postgres/user_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/sqlite/user_repo.py
tests/unit/communication/loop_prevention/test_circuit_breaker.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Lighthouse Dashboard
GitHub Check: Lighthouse Site
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (8)

**/*.{py,ts,tsx,go}

📄 CodeRabbit inference engine (CLAUDE.md)

Comments explain WHY only, never origin/review/issue context. A code comment answers: why is this code shaped this way? Forbidden: reviewer citations, issue/PR back-references, cryptic internal-taxonomy shorthand, migration/rebrand framing, and self-evident restatements. Required: hidden constraints, subtle invariants, workarounds with stable bug-tracker URLs, and why non-obvious choices were made.

Files:

src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/user_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py
tests/unit/communication/loop_prevention/test_circuit_breaker.py
src/synthorg/persistence/postgres/user_repo.py
src/synthorg/observability/events/persistence.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

No from __future__ import annotations — Python 3.14+ has PEP 649 native lazy annotations.

Use PEP 758 except syntax: except A, B: (no parens) when not binding to a name; ruff enforces this on 3.14. Use as exc with parens: except (A, B) as exc:.

All public functions require type hints; mypy strict mode enforced. Docstrings: Google style, required on public classes/functions (ruff D rules enforced).

Create new objects, never mutate existing ones. Use frozen Pydantic models for config/identity; for non-Pydantic registries use copy.deepcopy() at construction + MappingProxyType wrapping; deepcopy at system boundaries (tool execution, provider serialization, persistence).

Frozen Pydantic v2 models: ConfigDict(frozen=True, allow_inf_nan=False) everywhere; extra='forbid' on every model not needing model_dump() round-trip; request DTOs always forbid extra; ~489 total ConfigDicts with forbid; ~46 classes carrying @computed_field are exceptions.

Use NotBlankStr from core.types for identifier/name fields in all Pydantic models.

Every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event declares a typed Pydantic args model and is validated before dispatch. See docs/reference/conventions.md §9 for the inventory.

Every entry-point ingesting a dict payload from external sources (MCP handler args, JWT decode, WebSocket control message, audit-chain payload, A2A JSON-RPC params, settings security import) calls parse_typed() from synthorg.api.boundary with a Pydantic model class or TypeAdapter. Validates, emits API_BOUNDARY_VALIDATION_FAILED on failure, re-raises ValidationError. The boundary label MUST be hardcoded LiteralString, never user-controlled. Phase 3 lint: scripts/check_boundary_typed.py.

Prefer asyncio.TaskGroup for fan-out/fan-in. Wrap independent task bodies in async def helpers catching Exception (re-raise only MemoryError/RecursionError) so one failure doesn't unwind the group. See docs/r...

Files:

src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/user_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py
tests/unit/communication/loop_prevention/test_circuit_breaker.py
src/synthorg/persistence/postgres/user_repo.py
src/synthorg/observability/events/persistence.py

src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/persistence/ is the ONLY place permitted to import aiosqlite, sqlite3, psycopg, or psycopg_pool, or emit raw SQL DDL/DML keywords. Enforced by scripts/check_persistence_boundary.py (pre-push + CI). Per-line opt-out: # lint-allow: persistence-boundary -- <required justification>.

Datetime marshalling: round-trip ISO 8601 timestamps via parse_iso_utc/format_iso_utc from synthorg.persistence._shared (both reject naive datetimes); use normalize_utc for relaxed coercion on already-typed datetime inputs. See docs/reference/persistence-boundary.md §'Shared helpers'.

Files:

src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/user_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/user_repo.py

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService), never directly into repositories. Services centralize audit logging and cross-repo orchestration; repositories MUST NOT log mutations themselves. Enforced by scripts/check_persistence_boundary.py.

New settings register in src/synthorg/settings/definitions/<namespace>.py and are consumed via ConfigResolver.get_*. Direct os.environ.get(...) reads outside startup forbidden. Exceptions: init-time only (DB credentials, secrets—env-only, no registry) and read-only post-init (log dir, NATS URL, worker count—read_only_post_init=True). See docs/reference/configuration-precedence.md.

All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code. RetryConfig and RateLimiterConfig set per-provider in ProviderConfig. Retryable errors: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately. RetryExhaustedError signals all retries failed.

Monetary models: every cost-bearing Pydantic model carries currency: CurrencyCode; aggregation sites enforce same-currency invariant (mixing raises MixedCurrencyAggregationError, HTTP 409).

Files:

src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/user_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py
src/synthorg/persistence/postgres/user_repo.py
src/synthorg/observability/events/persistence.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, config. Use generic names: example-provider, example-large-001, example-medium-001, large/medium/small. Tests use test-provider, test-small-001. Vendor names only in: (1) .claude/ files, (2) third-party imports, (3) provider presets (src/synthorg/providers/presets.py—user-facing runtime data), (4) provider logo assets (web/public/provider-logos/*.svg).

Files:

src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/user_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py
src/synthorg/persistence/postgres/user_repo.py
src/synthorg/observability/events/persistence.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/user_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py
src/synthorg/persistence/postgres/user_repo.py
src/synthorg/observability/events/persistence.py

src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS prevention): silent clients closed with code 1008 once exceeding api.ws_frame_timeout_seconds (default 30s). Wraps socket.receive_text() in asyncio.wait_for(...). WebSocket revalidation sliding window: failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds default 60s, api.ws_revalidation_max_failures default 5). Flaky persistence cannot indefinitely keep stale-auth connections alive; once window saturates, socket closes with code 4011.

Files:

src/synthorg/api/approval_store.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow. Mock-spec gate (#1604): every Mock()/AsyncMock()/MagicMock() MUST declare interface via spec=ConcreteClass. Pre-commit gate scripts/check_mock_spec.py blocks new bare calls. Regenerate baseline only via uv run python scripts/check_mock_spec.py --update. Without spec= the mock silently absorbs every attribute access, hiding production code renames/drops.

Time-driven tests: import FakeClock from tests._shared.fake_clock (NOT from rollout-subsystem paths) and inject into class under test. FakeClock.sleep advances virtual time AND yields once via asyncio.sleep(0) so cancellation propagates. For blocking tasks, use await clock.advance_async(seconds). FakeClock-first: patch time.monotonic()/asyncio.sleep() globals only for legacy code paths lacking Clock seam.

Async tests: asyncio_mode = 'auto' in pyproject.toml; no manual @pytest.mark.asyncio needed. Timeout: 30 seconds global; non-default overrides like timeout(60) allowed. Parallelism: pytest-xdist via -n 8, distribution --dist=loadfile (default). ALWAYS include -n 8 locally, never sequential. CI uses -n auto. loadfile prevents ProactorEventLoop socket-finaliser exhaustion on Python 3.14 + Windows.

Isolation regression gate: affected-tests pre-push runner (scripts/run_affected_tests.py) runs affected subset twice via pytest-repeat (--count 2 -x) after primary green pass. Catches module-level state leaks at PR time. Opt out via SYNTHORG_SKIP_ISOLATION_GATE=1 emergency-pushes only.

Logger spying: never use monkeypatch.setattr(module.logger, 'info', spy). BoundLoggerLazyProxy serves log methods via __getattr__ (per-call rebind). monkeypatch.setattr snapshots the bound method and permanently caches it, shadowing __getattr__. Use context manager wrapping direct setattr + try/finally del proxy.<level>. See _logger_info_spy in `tests/unit/settings/test...

Files:

tests/unit/communication/loop_prevention/test_circuit_breaker.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/communication/loop_prevention/test_circuit_breaker.py

src/synthorg/observability/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every business-logic module has from synthorg.observability import get_logger then logger = get_logger(__name__). Variable name always logger. Carve-outs (e.g. meta/mcp/handlers/common_logging.py) documented in module docstring. Never use bare import logging / logging.getLogger() / print() in application code (exception: observability/{setup,sinks,syslog_handler,http_handler,otlp_handler}.py for bootstrap).

Files:

src/synthorg/observability/events/persistence.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts. At every phase of planning and implementation, be critical: actively look for design improvements in the spirit of robustness, correctness, simplicity, future-proofing. Surface improvements as suggestions, not silent changes; the user decides.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Prioritize issues by dependency order, not priority labels; unblocked dependencies come first.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: If implementation deviates from the spec (better approach found, scope evolved, etc.), alert the user and explain why; the user decides whether to proceed or update the spec. When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality. Do NOT silently diverge.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Never use `cd` in Bash commands; working directory is already project root. Use absolute paths or run commands directly. Exception: `bash -c 'cd <dir> && <cmd>'` is safe (child process, no cwd side effects). Never use Bash to write/modify files; use Write or Edit tools. Forbidden: `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c open(...)`, `tee` (read-only inspection like piping to stdout is fine).

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Commits follow `<type>: <description>` format (types: feat, fix, refactor, docs, test, chore, perf, ci). Enforced by commitizen (commit-msg hook). Signed commits required on `main` via branch protection; all commits GPG/SSH signed. Exception: GitHub App-signed commits from `synthorg-repo-bot` also satisfy `required_signatures` (used by release.yml, dev-release.yml, auto-rollover.yml, graduate.yml, cla.yml).

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Branches: `<type>/<slug>` from main. Pre-commit hooks: trailing-whitespace, end-of-file-fixer, check-yaml/toml/json, check-merge-conflict, check-added-large-files, no-commit-to-branch (main), ruff check+format, gitleaks, hadolint, golangci-lint + go vet (CLI conditional on cli/**/*.go), no-em-dashes, no-redundant-timeout, check-single-migration-per-pr, check-no-modify-migration (bypass: SYNTHORG_MIGRATION_SQUASH=1), no-release-please-token, workflow-shell-git-commits (scoped to .github/workflows/*.yml). eslint-web runs at pre-push only (TS project-graph boot is 15-30s).

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Hookify rules committed in `.claude/hookify.*.md`: block-pr-create (no gh pr create, use /pre-pr-review), block-double-push (throttle second push within 5min when open PR exists; override via .claude/state/allow-double-push.flag—one-shot, model cannot create), enforce-parallel-tests (-n 8 with pytest), no-cd-prefix, no-local-coverage (--cov blocked locally, CI handles).

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Merge strategy: squash merge. PR body becomes squash commit message on main. Trailers (e.g. `Release-As`, `Closes `#N``) must be in PR body to land in final commit. Preserve existing `Closes `#NNN`` references; never remove unless explicitly asked.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: After finishing an issue implementation: create feature branch (`<type>/<slug>`), commit, push; do NOT create PR automatically. Do NOT leave work uncommitted on main; branch, commit, push immediately after finishing.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: NEVER create PR directly; `gh pr create` blocked by hookify. ALWAYS use `/pre-pr-review` (runs automated checks + review agents + fixes before PR creation). For trivial/docs-only: `/pre-pr-review quick` skips agents but runs checks. After PR exists, use `/aurelio-review-pr` for external feedback. Fix everything valid, never skip; no deferring, no out-of-scope skipping.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Test regression (MANDATORY): When tests fail due to timeout/slowness/xdist contention: NEVER delete/skip tests or mark xfail to 'fix' slowness; NEVER use --no-verify to bypass pre-push hooks; NEVER modify tests/baselines/unit_timing.json (baseline updates require explicit user approval, enforced by scripts/check_no_edit_baseline.sh PreToolUse hook). Run: `uv run python -m pytest tests/unit/ -m unit -n 8 --durations=50 --durations-min=0.5 -q --no-header` to identify slow tests. Compare against baseline. If suite time exceeds baseline * 1.3: this is a SOURCE CODE REGRESSION, not test bug. Fix source code, not tests. pytest_sessionfinish hook warns if regression detected; trust the warning.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Property-based testing: Python uses Hypothesis (given decorators). CI runs 10 deterministic examples per property test (derandomize=True, no flakes). When Hypothesis finds failure, it is a REAL BUG: read shrunk example, fix bug, add explicit example(...) decorator for permanent coverage. See docs/reference/claude-reference.md §'Property-based Testing: Deep Dive' for profile catalog, local fuzzing commands, failure-handling workflow.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Ghost-wired settings: `scripts/check_setting_to_startup_trace.py` (pre-push + CI) flags settings whose consuming service is never instantiated at boot. Per-setting opt-out via `# lint-allow: bootstrap-wiring -- <reason>` on `_r.register(...)` closing line; `read_only_post_init=True` settings skipped automatically. See docs/reference/configuration-precedence.md.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Telemetry (Product): opt-in, off by default. Every event property must be explicitly listed in `_ALLOWED_PROPERTIES` keyed by event type; unknown keys raise `PrivacyViolationError` and are dropped. Never bypass the scrubber. See docs/reference/telemetry.md for enable flags, 4-step environment resolution chain, forbidden key patterns, Docker daemon enrichment, add-new-property checklist.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Regional Defaults (MANDATORY): No default may privilege a single region, currency, or locale. Every user-facing format resolves from: user/company setting -> browser/system -> neutral fallback. Enforced by `scripts/check_web_design_system.py` (web edits), `scripts/check_backend_regional_defaults.py` (backend edits), `scripts/check_forbidden_literals.py` (pre-push + CI). Per-line opt-out: `# lint-allow: regional-defaults`. See docs/reference/regional-defaults.md for resolution chain, allowlisted files, monetary-model inventory.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Persistence Boundary (MANDATORY): `src/synthorg/persistence/` is the only place importing DB drivers or emitting raw SQL. Every durable feature defines repository Protocol in `persistence/<domain>_protocol.py`, concrete impls under `persistence/{sqlite,postgres}/`, exposed on `PersistenceBackend`. Controllers/APIs access persistence through service layers, never directly into repositories. Services centralize audit logging; repositories MUST NOT log mutations. Enforced by `scripts/check_persistence_boundary.py` (pre-push + CI). Per-line opt-out: `# lint-allow: persistence-boundary -- <required justification>`. See docs/reference/persistence-boundary.md for three sanctioned exception categories, in-memory fallback rules, migration-hash guardrails.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-03T23:43:39.047Z
Learning: Configuration Precedence (MANDATORY): For every mutable setting: DB > env (SYNTHORG_<NAMESPACE>_<KEY>) > YAML > code default, resolved through `SettingsService`/`ConfigResolver`. First-cold-read emits one INFO `settings.value.resolved` carrying source + yaml_path; subsequent reads at DEBUG. Sanctioned exceptions: init-time only (DB credentials—env-only, no registry) and read-only post-init (log dir, NATS URL, worker count—`read_only_post_init=True`). Direct `os.environ.get(...)` outside startup forbidden. See docs/reference/configuration-precedence.md for full matrix, exception registry, migration recipe.

🔇 Additional comments (5)

src/synthorg/persistence/postgres/user_repo.py (2)

42-42: Shared pagination validator import is appropriate.

Using the shared validator here keeps pagination validation/logging behavior consistent with the persistence layer.

514-531: Bounded pagination update is correct and consistent.

limit=100 default, validate_pagination_args(...), and unconditional LIMIT/OFFSET produce clear, deterministic pagination behavior.

src/synthorg/persistence/sqlite/user_repo.py (2)

41-41: Using the shared pagination validator is the right consolidation.

This keeps validation and observability behavior centralized instead of duplicating checks in-repo.

679-707: Pagination semantics and SQL construction look good.

The method now clearly enforces bounded pagination and applies LIMIT/OFFSET consistently with validated inputs.

src/synthorg/observability/events/persistence.py (1)

120-122: Looks good — new persistence failure event is well-scoped and consistent.

- approval_store: per-page chunked list path releases the store lock during repo I/O, save_many, and callback dispatch. Concurrent get() / save() callers no longer block on an unbounded scan. Pure compute factored into _compute_page helper. - approval_store: align _check_expiration_locked callback failure to ERROR severity matching _fire_expire_callback (operationally meaningful side-effect failures must alert regardless of which expiration path fired). - engine/health/models: replace __init__ pre-validation wrap with @field_validator(mode='after'). Pydantic 2.x unwraps generic Mapping annotations into a plain dict during validation, silently discarding the pre-validation MappingProxyType wrap. - sqlite/mcp_installation_repo: docstring no longer claims callers must loop with offset; reflects actual behavior (no max enforced). - test_circuit_breaker: replace time.sleep(0.05) handshake with a deterministic threading.Event; the sibling sets the event when its non-blocking acquire fails (i.e. once it has observed the contended acquire), so the regression cannot pass without exercising the interleaving.

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

tests/unit/communication/loop_prevention/test_circuit_breaker.py (1)

299-300: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the repository double specced end-to-end.

MagicMock(spec=CircuitBreakerStateRepository) already gives you interface-checked async members. Rebinding save / load_all with bare AsyncMock() drops that protection again and is why this file now needs new suppressions in scripts/mock_spec_baseline.txt. Configure the existing mocked methods instead of replacing them.

Suggested fix

         config = CircuitBreakerConfig(bounce_threshold=1, cooldown_seconds=10)
         repo = MagicMock(spec=CircuitBreakerStateRepository)
-        repo.save = AsyncMock()
+        repo.save.return_value = None
         cb = DelegationCircuitBreaker(config, state_repo=repo)
@@
         )
         record = CircuitBreakerStateRecord(
@@
         )
         repo = MagicMock(spec=CircuitBreakerStateRepository)
-        repo.load_all = AsyncMock(return_value=(record,))
+        repo.load_all.return_value = (record,)
@@
         )
         repo = MagicMock(spec=CircuitBreakerStateRepository)
-        repo.load_all = AsyncMock(return_value=(record,))
+        repo.load_all.return_value = (record,)

#!/bin/bash
set -euo pipefail

# Expect three bare AsyncMock rebinds in the edited test file.
rg -n 'repo\.(save|load_all)\s*=\s*AsyncMock\(' tests/unit/communication/loop_prevention/test_circuit_breaker.py

# Expect matching suppression entries that were added for this file.
rg -n 'tests/unit/communication/loop_prevention/test_circuit_breaker.py:(300:20|324:24|362:24)' scripts/mock_spec_baseline.txt

As per coding guidelines, Every Mock() / AsyncMock() / MagicMock() in tests MUST declare the interface via spec=ConcreteClass (Protocol or class); a pre-commit gate blocks new bare-call sites.

Also applies to: 323-324, 361-362

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/communication/loop_prevention/test_circuit_breaker.py` around
lines 299 - 300, The test replaces spec-checked members on the repo mock (repo =
MagicMock(spec=CircuitBreakerStateRepository)) with bare AsyncMock instances
which removes the spec; instead of rebinding repo.save or repo.load_all to
AsyncMock(), configure the existing spec-checked mock members (e.g., call
repo.save.configure_mock(return_value=...), repo.save.side_effect=..., or set
repo.save.return_value = <awaitable> and similarly for repo.load_all) so you
preserve the CircuitBreakerStateRepository spec on repo and remove the new
bare-AsyncMock assignments.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/api/approval_store.py`:
- Around line 326-354: When persisting a non-empty to_persist batch, the code
only applies cache_updates (which _compute_page currently fills only for rows
that flipped to EXPIRED), leaving sibling rows stale in self._items; change the
post-save cache update to refresh the entire page slice in self._items instead
of only cache_updates. Concretely: after await self._repo.save_many(to_persist)
and inside async with self._lock replace the single
self._items.update(cache_updates) with an update that writes the full page
mapping returned by _compute_page (or fetch the authoritative page rows from the
repo if needed) so every row from that page overwrites the existing entries;
apply the same change to the analogous block around the other occurrence (the
396-409 region) to ensure list_items(), get(), and save_if_pending() never see
stale siblings.
- Around line 314-369: The repo-scan path in _list_from_repo can repopulate
_items after a concurrent clear() because it does repo I/O outside the lock and
updates the cache without checking generation; mirror the save() pattern by
capturing the current self._generation before doing I/O (around the start of the
while loop), and after the await self._repo.save_many(...) or before applying
cache_updates/assigning self._items in the async with self._lock block, compare
the captured generation to self._generation and skip applying cache_updates /
skipping the per-item assignment if they differ; ensure this guard is applied
both for the to_persist branch (before self._items.update(cache_updates)) and
the else branch (before populating self._items from page) so in-flight scans
cannot resurrect entries after clear().

In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py`:
- Around line 109-117: The change introducing a default limit=100 on the
repository list_all method causes callers like self._catalog.list_all in
src/synthorg/integrations/connections/mcp_service.py to see truncated results
and wrong totals; fix by either (A) adding an explicit unbounded API (e.g.,
list_all_unbounded or accept limit=None to mean no limit) in
mcp_installation_repo and update callers to use it, or (B) update callers of
_catalog.list_all (in mcp_service.py) to perform iterative pagination using the
limit/offset parameters (call list_all in a loop, accumulating results until a
shorter-than-limit page is returned) and derive totals from the accumulated set;
reference list_all, _catalog.list_all, and the calling code in mcp_service.py
when making the change.
- Around line 130-131: The list_all method can raise sqlite3/aiosqlite errors
during the read without logging or normalizing them; wrap the async block that
calls self._db.execute(...) and await cursor.fetchall() in a try/except that
catches sqlite3.Error and aiosqlite.Error (or a general DB error), log the
failure using the same logger pattern and constant name
PERSISTENCE_MCP_INSTALLATION_LIST_FAILED with contextual info (limit/offset),
and then raise QueryError so the error path matches save/delete; reference the
list_all method, self._db.execute, PERSISTENCE_MCP_INSTALLATION_LIST_FAILED, and
QueryError when making the change.

---

Duplicate comments:
In `@tests/unit/communication/loop_prevention/test_circuit_breaker.py`:
- Around line 299-300: The test replaces spec-checked members on the repo mock
(repo = MagicMock(spec=CircuitBreakerStateRepository)) with bare AsyncMock
instances which removes the spec; instead of rebinding repo.save or
repo.load_all to AsyncMock(), configure the existing spec-checked mock members
(e.g., call repo.save.configure_mock(return_value=...),
repo.save.side_effect=..., or set repo.save.return_value = <awaitable> and
similarly for repo.load_all) so you preserve the CircuitBreakerStateRepository
spec on repo and remove the new bare-AsyncMock assignments.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3467a4dc-b62a-4e3c-938e-18efcdd2a346

📥 Commits

Reviewing files that changed from the base of the PR and between 4f8a47e and c56b1d9.

📒 Files selected for processing (5)

scripts/mock_spec_baseline.txt
src/synthorg/api/approval_store.py
src/synthorg/engine/health/models.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
tests/unit/communication/loop_prevention/test_circuit_breaker.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Web Assets (melange)
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: CLI Bench Regression
GitHub Check: Lighthouse Dashboard
GitHub Check: Lighthouse Site
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (9)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

No from __future__ import annotations in Python code; use Python 3.14 PEP 649 native lazy annotations

Use PEP 758 except syntax: except A, B: (no parens) when not binding to a name; as exc requires parens except (A, B) as exc:

All public functions must have type hints; mypy strict mode is enforced

Docstrings must be Google style and are required on all public classes and functions; ruff D rules enforce this

Classes that read time or sleep must accept clock: Clock | None = None defaulting to SystemClock(); tests inject FakeClock from tests._shared.fake_clock

Line length: 88 (enforced by ruff); functions must be < 50 lines; files must be < 800 lines

Comments explain WHY only, never origin/review/issue context; forbidden: reviewer citations, in-code issue/PR back-references, cryptic internal-taxonomy shorthand, migration/rebrand framing, round/iteration narrative, self-evident restatements

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
tests/unit/communication/loop_prevention/test_circuit_breaker.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Create new objects, never mutate existing ones; use frozen Pydantic models for config/identity; for non-Pydantic registries use copy.deepcopy() at construction and MappingProxyType wrapping; deepcopy at system boundaries

Separate config from runtime state: use frozen models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves. Never mix static config and mutable runtime fields in one model

Pydantic v2 conventions: ConfigDict(frozen=True, allow_inf_nan=False) everywhere; extra="forbid" on models that do not need to round-trip through model_dump(); use @computed_field for derived values; use NotBlankStr from core.types for identifier/name fields

Every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event must declare a typed Pydantic args model and be validated before dispatch

Every entry-point that ingests a dict payload from an external source (MCP handler args, JWT decode, WebSocket control message, audit-chain payload, A2A JSON-RPC params, settings security import) must call parse_typed() from synthorg.api.boundary; the boundary label must be a hardcoded LiteralString

Prefer asyncio.TaskGroup for fan-out/fan-in; wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError / RecursionError) so one failure doesn't unwind the group

Async start() / stop() services own a dedicated self._lifecycle_lock; timed-out stops mark the service unrestartable

Wrap attacker-controllable strings at LLM call sites via wrap_untrusted() from synthorg.engine.prompt_safety; append untrusted_content_directive(tags) to the enclosing system prompt

Never call lxml.html.fromstring on attacker input; use HTMLParseGuard from synthorg.tools.html_parse_guard

Cross-cutting subsystems follow protocol + strategy + factory + config discriminator pattern with safe defaults; services (which wrap repositories) are a di...

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py

src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Repository CRUD vocabulary: use save(entity) -> None (insert-or-update, idempotent), get(id) -> Entity | None (None on miss, never raises), delete(id) -> bool (True on removal, False if absent), list_items(...) -> tuple[Entity, ...] (paginated/filtered), and query(...) -> tuple[Entity, ...] (always return tuples, never lists)

Round-trip ISO 8601 timestamps through parse_iso_utc / format_iso_utc from synthorg.persistence._shared (both reject naive datetimes); use normalize_utc for relaxed coercion on already-typed datetime inputs

src/synthorg/persistence/ is the ONLY place that may import aiosqlite, sqlite3, psycopg, or psycopg_pool, or emit raw SQL DDL/DML keywords in string literals

Every durable feature MUST define a repository Protocol in persistence/<domain>_protocol.py, concrete impls under persistence/{sqlite,postgres}/, and be exposed on PersistenceBackend

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every business-logic module has from synthorg.observability import get_logger then logger = get_logger(__name__); variable name is always logger

Never use import logging / logging.getLogger() / print() in application code (exception: observability setup/handler files)

Always import log event constants from synthorg.observability.events.<domain>; never use string literals for event names

Always use structured logging: logger.info(EVENT, key=value); never use format strings like logger.info("msg %s", val)

All error paths must log at WARNING or ERROR with context before raising

Every status transition (including non-terminal hops like PENDING -> RUNNING) must log at INFO using a domain-scoped *_STATUS_TRANSITIONED constant carrying from_status / to_status / domain id, AFTER the persistence write succeeds

DEBUG logging is for object creation, internal flow, and entry/exit of key functions; pure data models, enums, re-exports do NOT need logging

Every settings read emits one INFO settings.value.resolved event on first cold read per process, carrying source + yaml_path; subsequent reads stay at DEBUG

Never call any logger severity (exception / warning / error / info / debug) with error=str(exc); use error_type=type(exc).__name__ and error=safe_error_description(exc), keeping severity appropriate to context

Controllers and API endpoints access persistence through domain-scoped service layers (e.g., ArtifactService, WorkflowService), never directly into repositories; services centralize audit logging and cross-repo orchestration; repositories must not log mutations

For every mutable setting: DB > env (SYNTHORG_<NAMESPACE>_<KEY>) > YAML > code default, resolved through SettingsService / ConfigResolver; first cold-read emits one INFO settings.value.resolved event; subsequent reads stay at DEBUG

Two sanctioned exceptions to configuration precedence: init-time only (DB credentials, bootstrap secrets — env-...

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py

{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
tests/unit/communication/loop_prevention/test_circuit_breaker.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py

{web/src,src/synthorg}/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

No default may privilege a single region, currency, or locale; every user-facing format resolves from: user/company setting → browser/system → neutral fallback; never hardcode ISO 4217 codes, BCP 47 tags, or call bare .toLocaleString()

Store UTC; render via Intl without passing timeZone (browser tz wins); always use Intl for date/number formats, no hand-rolled templates

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py

{src/synthorg,web/src}/**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Currency: never hardcode ISO 4217 codes or symbols; backend: use DEFAULT_CURRENCY from synthorg.budget.currency or runtime budget.currency setting; frontend: use DEFAULT_CURRENCY from @/utils/currencies or useSettingsStore().currency

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/engine/health/models.py
src/synthorg/api/approval_store.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never delete, skip, or mark tests as xfail to fix slowness or timeouts; identify slow tests via --durations, compare against tests/baselines/unit_timing.json, and fix source code regressions, not tests

Every Mock() / AsyncMock() / MagicMock() in tests MUST declare the interface via spec=ConcreteClass (Protocol or class); a pre-commit gate blocks new bare-call sites

Use FakeClock from tests._shared.fake_clock (NOT from rollout-subsystem paths) and inject it into the class under test; patch time.monotonic() / asyncio.sleep() globals only for legacy code paths without a Clock seam

Coverage minimum is 80% (enforced in CI; benchmarks excluded via --ignore=tests/benchmarks/)

Use asyncio_mode = "auto" in test configuration; no manual @pytest.mark.asyncio needed

Global test timeout is 30 seconds per test (set in pyproject.toml); do not add per-file @pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed

Always run pytest with -n 8 for parallelism via pytest-xdist with --dist=loadfile (default); never run tests sequentially. CI uses -n auto

Prefer @pytest.mark.parametrize for testing similar cases

Files:

tests/unit/communication/loop_prevention/test_circuit_breaker.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/communication/loop_prevention/test_circuit_breaker.py

src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS prevention): silent clients are closed with policy code 1008 once they exceed api.ws_frame_timeout_seconds (default 30s) without sending a frame

WebSocket revalidation sliding window: use _SlidingWindowRateLimiter for persistence-backend failures during periodic revalidation (default 60s window, 5 failures max); once the window saturates, close the socket with code 4011

Files:

src/synthorg/api/approval_store.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue; do not silently diverge from the design spec — alert the user and explain why, and update the spec if the user approves the deviation

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts; be critical and look for ways to improve the design; surface improvements as suggestions, not silent changes

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Prioritize issues by dependency order, not priority labels; unblocked dependencies come first

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Never use `cd` in Bash commands; the working directory is already set to the project root. Use absolute paths or run commands directly. Use `bash -c "cd <dir> && <cmd>"` for tools without a `-C` flag

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Never use Bash to write or modify files; use provided tools (Write or Edit). Do not use `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c "open(...).write(...)"`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Commits follow `<type>: <description>` format; types: feat, fix, refactor, docs, test, chore, perf, ci; enforced by commitizen via commit-msg hook

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: All commits must be GPG/SSH signed; GitHub App-signed commits from `synthorg-repo-bot` also satisfy `required_signatures`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Branches follow `<type>/<slug>` naming convention from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Block `cd` prefix in Bash commands; use absolute paths or run commands directly with `-C` flag when available; use `bash -c "cd <dir> && <cmd>"` for tools without `-C`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Never create PRs directly via `gh pr create`; always use `/pre-pr-review` to create PRs; it runs automated checks + review agents + fixes before creating the PR; for trivial/docs-only changes use `/pre-pr-review quick`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: After finishing an issue implementation, always create a feature branch, commit, and push; do NOT create a PR automatically; do NOT leave work uncommitted on main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:10:49.568Z
Learning: Fix everything valid when review agents find issues (including pre-existing issues in surrounding code, suggestions, and adjacent findings); never defer or skip as out-of-scope

🔇 Additional comments (5)

src/synthorg/engine/health/models.py (4)

7-14: LGTM!

Import additions are correct and minimal for the new immutability implementation. The noqa: TC003 annotation on Mapping is appropriate since it's used as a runtime Pydantic field annotation.

61-62: LGTM!

Adding extra="forbid" aligns with the PR's objective to harden Pydantic models and matches the coding guideline requiring extra="forbid" on models that do not need to round-trip through model_dump().

100-103: LGTM!

The type annotation change to Mapping[str, object] with MappingProxyType({}) default correctly expresses the read-only contract. The field description accurately documents the runtime immutability guarantee.

105-119: LGTM!

The @field_validator("metadata", mode="after") approach correctly addresses the Pydantic v2 behavior where generic Mapping[...] annotations are unwrapped to dict during validation. The implementation:

Deep-copies to prevent input aliasing (caller can't mutate original dict to affect ticket)

Wraps in MappingProxyType to prevent post-construction mutation (ticket.metadata['key'] = ...)

Uses dict(value) to uniformly handle both dict and MappingProxyType inputs

The docstring accurately explains the rationale.

src/synthorg/api/approval_store.py (1)

90-100: Clock-driven expiration is now consistent across both paths.

Nice change: get() and repo-backed list_items() now share the same injected clock, so FakeClock can drive lazy expiration deterministically in both flows.

Also applies to: 624-687

coderabbitai · 2026-05-04T00:18:36Z

+        limit: int = 100,
        offset: int = 0,
    ) -> tuple[McpInstallation, ...]:
-        """List all recorded installations, oldest-first."""
+        """Return up to ``limit`` recorded installations, oldest-first.
+
+        ``limit`` defaults to 100 (matches the protocol-wide pagination
+        floor) and accepts any positive integer; no upper bound is
+        enforced. Callers may either pass a larger ``limit`` or loop
+        with ``offset`` for cursor-style pagination.


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Default limit=100 now truncates callers that assume full-set reads.

src/synthorg/integrations/connections/mcp_service.py (Line 52 in the provided snippet) still does tuple(await self._catalog.list_all()) and derives total from that complete result. With this change, results/totals become wrong once installations exceed 100. Please migrate those call sites to iterative pagination (fetch-until-short-page) or provide an explicit unbounded path.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py` around lines 109 - 117, The change introducing a default limit=100 on the repository list_all method causes callers like self._catalog.list_all in src/synthorg/integrations/connections/mcp_service.py to see truncated results and wrong totals; fix by either (A) adding an explicit unbounded API (e.g., list_all_unbounded or accept limit=None to mean no limit) in mcp_installation_repo and update callers to use it, or (B) update callers of _catalog.list_all (in mcp_service.py) to perform iterative pagination using the limit/offset parameters (call list_all in a loop, accumulating results until a shorter-than-limit page is returned) and derive totals from the accumulated set; reference list_all, _catalog.list_all, and the calling code in mcp_service.py when making the change.

- approval_store: capture _generation under the lock before any repo I/O in _list_from_repo; per-page cache update now skips when the generation no longer matches (concurrent clear() landed mid-scan). Mirrors the same guard save() already applies; without it an in-flight scan could repopulate _items after a clear. - approval_store: refresh the entire page slice in _items, not just the EXPIRED transitions. _compute_page now returns page_cache (every row) so non-expired siblings whose authoritative repo state has drifted from the cache get refreshed, preventing stale cached copies from leaking into later get()/save_if_pending() decisions. - sqlite/mcp_installation_repo: list_all wraps the read in try/except to log PERSISTENCE_MCP_INSTALLATION_LIST_FAILED and raise QueryError on sqlite/aiosqlite errors, matching save/delete error pattern. - postgres/mcp_installation_repo: list_all read failures normalize to QueryError and use PERSISTENCE_MCP_INSTALLATION_LIST_FAILED for parity with sqlite (was using MCP_SERVER_INSTALL_FAILED with bare re-raise).

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/synthorg/persistence/sqlite/mcp_installation_repo.py (1)

130-150: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Broaden the normalization to cover row mapping.

This only normalizes driver failures. If NotBlankStr(...) or coerce_row_timestamp(...) rejects a persisted row, the method still leaks a raw exception and skips the PERSISTENCE_MCP_INSTALLATION_LIST_FAILED log.

Suggested fix

         try:
             async with self._db.execute(sql, (limit, offset)) as cursor:
                 rows = await cursor.fetchall()
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+            return tuple(
+                McpInstallation(
+                    catalog_entry_id=NotBlankStr(row[0]),
+                    connection_name=(NotBlankStr(row[1]) if row[1] else None),
+                    installed_at=coerce_row_timestamp(row[2]),
+                )
+                for row in rows
+            )
+        except MemoryError, RecursionError:
+            raise
+        except Exception as exc:
             msg = "Failed to list mcp installations"
             logger.warning(
                 PERSISTENCE_MCP_INSTALLATION_LIST_FAILED,
                 limit=limit,
                 offset=offset,
                 error_type=type(exc).__name__,
                 error=safe_error_description(exc),
             )
             raise QueryError(msg) from exc
-        return tuple(
-            McpInstallation(
-                catalog_entry_id=NotBlankStr(row[0]),
-                connection_name=(NotBlankStr(row[1]) if row[1] else None),
-                installed_at=coerce_row_timestamp(row[2]),
-            )
-            for row in rows
-        )

As per coding guidelines: "All error paths log at WARNING or ERROR with context before raising."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py` around lines 130 -
150, The row mapping can raise errors (from NotBlankStr or coerce_row_timestamp)
that bypass the existing sqlite exception handler and skip logging; wrap the
tuple comprehension in a try/except that catches Exception (or the specific
validation errors), log PERSISTENCE_MCP_INSTALLATION_LIST_FAILED with the same
context (limit, offset, error_type, error using safe_error_description) and then
raise QueryError(msg) from the caught exception; update the code around
McpInstallation creation (the generator/tuple building) so mapping occurs inside
that try block and reference NotBlankStr, coerce_row_timestamp,
PERSISTENCE_MCP_INSTALLATION_LIST_FAILED and QueryError in your changes.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/api/approval_store.py`:
- Around line 339-359: The current flow builds to_persist from page (via
_compute_page) and then calls self._repo.save_many(to_persist), which can
blindly upsert stale rows; change the repository boundary to perform
compare-and-set semantics so expirations are applied only when the current DB
state matches the expected prior state (e.g., expire only if status == PENDING
or matches the snapshot version), by replacing the blind save_many call with a
conditional bulk update API on self._repo (or augment save_many) that returns
which rows were actually updated; then use that returned set to drive cache
refresh, audit events and callbacks so side effects run only for rows that truly
transitioned. Ensure list_items, _compute_page and to_persist remain but rely on
the repo’s conditional update to prevent overwriting newer states.
- Around line 327-345: The offset-based scan in the loop uses repo_status = None
if status in {None, ApprovalStatus.EXPIRED} else status which pushes
status=PENDING into self._repo.list_items and causes missing rows when
_compute_page() moves PENDING → EXPIRED between pages; stop pushing PENDING into
the repo query so the offset scan sees a stable set: only apply repo-side status
pushdown for terminal statuses (e.g. ApprovalStatus.EXPIRED) and treat PENDING
the same as None (repo_status=None) in the list_items call, or switch this
method to keyset pagination; update the code around repo_status, the while loop
and the call to self._repo.list_items (and any logic relying on offset
increments) to ensure PENDING is not passed to list_items.

In `@src/synthorg/persistence/postgres/mcp_installation_repo.py`:
- Around line 155-175: The code currently only wraps execute/fetchall in the
try/except so deserialization errors from _row_to_installation escape without
logging; move the conversion into the same error-handling path or add a new
try/except around the tuple(_row_to_installation(row) ...) that mirrors the
existing except: block: catch all non-memory/recursion exceptions, call
logger.warning with PERSISTENCE_MCP_INSTALLATION_LIST_FAILED and the same
context (limit, offset, error_type, error, backend="postgres"), then raise
QueryError(msg) from exc; reference _row_to_installation,
PERSISTENCE_MCP_INSTALLATION_LIST_FAILED, QueryError and logger.warning when
implementing.

---

Outside diff comments:
In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py`:
- Around line 130-150: The row mapping can raise errors (from NotBlankStr or
coerce_row_timestamp) that bypass the existing sqlite exception handler and skip
logging; wrap the tuple comprehension in a try/except that catches Exception (or
the specific validation errors), log PERSISTENCE_MCP_INSTALLATION_LIST_FAILED
with the same context (limit, offset, error_type, error using
safe_error_description) and then raise QueryError(msg) from the caught
exception; update the code around McpInstallation creation (the generator/tuple
building) so mapping occurs inside that try block and reference NotBlankStr,
coerce_row_timestamp, PERSISTENCE_MCP_INSTALLATION_LIST_FAILED and QueryError in
your changes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 260b7a6f-21d0-481e-b325-24c42d98dd69

📥 Commits

Reviewing files that changed from the base of the PR and between c56b1d9 and e155e08.

📒 Files selected for processing (3)

src/synthorg/api/approval_store.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: CLI Test (windows-latest)
GitHub Check: Build Web Assets (melange)
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Lighthouse Site
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (5)

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every cost-bearing Pydantic model carries currency: CurrencyCode; aggregation sites enforce a same-currency invariant (mixing raises MixedCurrencyAggregationError, HTTP 409)

src/synthorg/persistence/ is the ONLY place that may import aiosqlite, sqlite3, psycopg, or psycopg_pool, or emit raw SQL DDL/DML keywords in string literals

Direct os.environ.get(...) reads in application code outside startup are forbidden. New settings register in src/synthorg/settings/definitions/<namespace>.py and are consumed via ConfigResolver.get_*

No from __future__ import annotations: Python 3.14 has PEP 649

Use PEP 758 except syntax: except A, B: (no parens) when not binding to a name; as exc requires parens (except (A, B) as exc:)

Type hints: all public functions, mypy strict mode

Docstrings: Google style, required on public classes / functions (ruff D rules)

Config vs runtime state: frozen models for config/identity; separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves. Never mix static config and mutable runtime fields in one model

Pydantic v2 conventions: ConfigDict(frozen=True, allow_inf_nan=False) everywhere; extra="forbid" on every model that does not need to round-trip through model_dump(); @computed_field for derived values; NotBlankStr from core.types for identifier / name fields

Args models at every system boundary: every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event declares a typed Pydantic args model and is validated before dispatch

Every entry-point that ingests a dict payload from an external source calls parse_typed() from synthorg.api.boundary. The helper validates and emits API_BOUNDARY_VALIDATION_FAILED on failure; the boundary label MUST be a hardcoded LiteralString

Async concurrency: prefer asyncio.TaskGroup for fan-out / fan-in. Wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError...

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/api/approval_store.py

src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every durable feature MUST define a repository Protocol in persistence/<domain>_protocol.py, concrete impls under persistence/{sqlite,postgres}/, and be exposed on PersistenceBackend

Repository CRUD vocabulary: persistence repositories use save(entity) -> None (insert-or-update, idempotent), get(id) -> Entity | None (None on miss), delete(id) -> bool (True on removal, False if absent), list_items(...) -> tuple[Entity, ...] (paginated / filtered), and query(...) -> tuple[Entity, ...]

Datetime marshalling in persistence: round-trip ISO 8601 timestamps through parse_iso_utc / format_iso_utc from synthorg.persistence._shared (both reject naive datetimes); use normalize_utc for relaxed coercion on already-typed datetime inputs

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py

@(src|tests)/**/*.@(py|ts|tsx)

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/api/approval_store.py

src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/api/approval_store.py

src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS prevention): silent clients are closed with policy code 1008 once they exceed api.ws_frame_timeout_seconds (default 30s) without sending a frame

WebSocket revalidation sliding window: persistence-backend failures during periodic revalidation are tracked via a _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds default 60s, api.ws_revalidation_max_failures default 5)

Files:

src/synthorg/api/approval_store.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Always read the relevant `docs/design/` page (linked via DESIGN_SPEC.md) before implementing any feature or planning any issue

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: If implementation deviates from the design spec, alert the user and explain why; the user decides whether to proceed or update the spec. Do not silently diverge; every deviation needs explicit user approval

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: When a spec deviation is approved, update the relevant `docs/design/` page to reflect the new reality

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: At every phase of planning and implementation, be critical: actively look for ways to improve the design; surface improvements as suggestions, not silent changes; the user decides

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Use D2 (`\`\`\`d2`) for architecture diagrams, nested container layouts, and complex entity relationships

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Use Mermaid (`\`\`\`mermaid`) for flowcharts, sequence diagrams, simple hierarchies, and pipelines

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Never use `\`\`\`text` blocks with ASCII/Unicode box-drawing characters for diagrams

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: No default may privilege a single region, currency, or locale. Every user-facing format resolves from: user/company setting -> browser/system -> neutral fallback

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Never hardcode ISO 4217 currency codes or symbols. Backend: use `DEFAULT_CURRENCY` from `synthorg.budget.currency` or the runtime `budget.currency` setting. Frontend: use `DEFAULT_CURRENCY` from `@/utils/currencies` or `useSettingsStore().currency`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: No `_usd` suffix on money fields anywhere. The type carries money semantics; the value is in the operator's configured currency

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Store UTC datetimes; render via `Intl` without passing `timeZone` (browser tz wins)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Always use `Intl` for date/number formatting; no hand-rolled templates

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Use metric only for units. Use International/British English UI default (`colour`, `behaviour`, `organise`, `centred`, `analyse`, `cancelled`); document deviations

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Controllers and API endpoints access persistence through domain-scoped service layers (e.g. `ArtifactService`, `WorkflowService`, `MemoryService`, etc.), never directly into repositories

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Services centralize audit logging and cross-repo orchestration; repositories must not log mutations themselves

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: When adding a migration, read `docs/guides/persistence-migrations.md` first. Do not hand-edit SQL or `atlas.sum`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Immutability: create new objects, never mutate existing ones. Frozen Pydantic models for config/identity; for non-Pydantic registries use `copy.deepcopy()` at construction + `MappingProxyType` wrapping; deepcopy at system boundaries

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Pluggable subsystems: cross-cutting subsystems follow protocol + strategy + factory + config discriminator with safe defaults. Services (which wrap repositories) are a distinct pattern

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Validate at system boundaries (user input, external APIs, config files)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Never delete tests, skip tests, or mark them `xfail` to fix slowness. Never modify `tests/baselines/unit_timing.json`; baseline updates require explicit user approval

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: When tests fail due to timeout or slowness, run: `uv run python -m pytest tests/unit/ -m unit -n 8 --durations=50 --durations-min=0.5 -q --no-header` to identify slow tests and compare against baseline

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Coverage: 80% minimum (enforced in CI; benchmarks are excluded via `--ignore=tests/benchmarks/`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Parallelism: `pytest-xdist` via `-n 8`, distribution `--dist=loadfile`. ALWAYS include `-n 8` when running pytest locally

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Isolation regression gate: the affected-tests pre-push runner runs the affected subset twice via `pytest-repeat` (`--count 2 -x`). Opt out via `SYNTHORG_SKIP_ISOLATION_GATE=1` for emergency pushes only

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Git commits: `<type>: <description>`. Types: feat, fix, refactor, docs, test, chore, perf, ci. Enforced by commitizen (commit-msg hook)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Signed commits: required on `main` via branch protection. All commits must be GPG/SSH signed. Exception: GitHub App-signed commits from `synthorg-repo-bot` satisfy `required_signatures`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Pre-commit hooks enforce: ruff check+format, mypy, no persistence boundary violations, no bare mocks in tests, no forbidden literals (currency, locale), no editing migrations, no double-push throttle violations

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Pre-push hooks enforce: mypy type-check (affected modules only), pytest unit tests (affected modules only), golangci-lint + go vet (CLI), eslint-web (dashboard), setting-to-startup-trace lint

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Merge strategy: squash merge. PR body becomes the squash commit message on main. Trailers (e.g. `Release-As`, `Closes `#N``) must be in the PR body to land in the final commit

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: After finishing an issue implementation, create a feature branch (`<type>/<slug>`), commit, and push; do NOT create a PR automatically

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: NEVER create a PR directly: `gh pr create` is blocked by hookify. ALWAYS use `/pre-pr-review` to create PRs; it runs automated checks + review agents + fixes before creating the PR

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T00:40:02.900Z
Learning: Comment what stays: hidden constraints, subtle invariants, workarounds for specific upstream bugs (with stable bug-tracker URL), and why a non-obvious choice was made

- approval_repo (sqlite + postgres): add expire_if_pending(ids) -> tuple[NotBlankStr, ...] compare-and-set method that flips rows still PENDING to EXPIRED via UPDATE ... WHERE id IN (...) AND status='pending' RETURNING id, returning the ids actually updated. Replaces the blind save_many upsert in the lazy-expire path so a concurrent save() that wrote a newer terminal status (APPROVED, REJECTED, CANCELLED) between page read and persist cannot be clobbered. - approval_store: _list_from_repo now uses expire_if_pending and filters cache writes, audit events, callbacks, and the response to actually-transitioned ids only. Lost-race rows are evicted from cache (next get() refetches authoritative state) and dropped from the response (surfacing them as EXPIRED would leak stale data). - approval_store: drop status=PENDING from the repo-side pushdown. PENDING cannot be pushed down because per-page expiration removes rows from the filtered set as the iterator advances; offset += 100 would then skip rows that should still be visible. - mcp_installation_repo (sqlite + postgres): pull row deserialization inside the same try/except as execute/fetchall so malformed persisted rows surface under PERSISTENCE_MCP_INSTALLATION_LIST_FAILED + QueryError envelope, not as raw exceptions escaping the persistence boundary. - conformance/persistence/test_approval_repository: add tests for expire_if_pending (compare-and-set semantics, empty input no-op, unknown ids no-op).

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/synthorg/api/approval_store.py (1)

689-699: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Use compare-and-set for the scalar lazy-expire path too.

get() still reaches this branch, and self._repo.save(expired) can overwrite a concurrent APPROVED/REJECTED decision back to EXPIRED if that decision lands after the repo read but before Line 698. The batch path already fixed this race with expire_if_pending(...); this scalar path needs the same protection, and the transition logs/callbacks should only run when that compare-and-set actually succeeds.

Suggested fix

         if (
             item.status == ApprovalStatus.PENDING
             and item.expires_at is not None
             and self._clock.now() >= item.expires_at
         ):
             expired = item.model_copy(
                 update={"status": ApprovalStatus.EXPIRED},
             )
             if self._repo is not None:
-                await self._repo.save(expired)
+                updated = await self._repo.expire_if_pending((item.id,))
+                if not updated:
+                    refreshed = await self._repo.get(item.id)
+                    if refreshed is not None:
+                        self._items[item.id] = refreshed
+                        return refreshed
+                    self._items.pop(item.id, None)
+                    return item
             self._items[item.id] = expired

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/api/approval_store.py` around lines 689 - 699, The scalar
lazy-expire branch must use a compare-and-set like the batch path to avoid
overwriting concurrent decisions: instead of unconditionally calling
self._repo.save(expired) and assigning self._items[item.id], call the repository
compare-and-set helper (expire_if_pending(...))—passing item.id, expected
ApprovalStatus.PENDING and the new ApprovalStatus.EXPIRED (and any expires_at
check) —and only update the in-memory cache (self._items[item.id] = expired) and
invoke logs/callbacks when that expire_if_pending call reports success; if
self._repo is None (in-memory mode) implement the same CAS by re-checking
current self._items[item.id].status before replacing so the transition happens
only when the status was still PENDING.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/api/approval_store.py`:
- Around line 399-434: The current logic in expire_if_pending() evicts
lost_race_ids from the in-memory cache and omits them from the final result;
instead, when the caller requested an unfiltered listing (status is None),
refetch those lost_race_ids from the authoritative repo after the cache update
and reapply the original page filter before extending result. Concretely: after
the async with self._lock block and before result.extend(...), if the original
status filter is None and lost_race_ids is non-empty, call the store/repo read
used elsewhere (the same code path that populates page_result) to fetch the
latest rows for lost_race_ids, apply the same filtering/paging logic that
produced page_result, merge those rows into page_result (or directly into
result) excluding actually_expired_ids, and ensure self._items is updated with
any freshly read rows only if self._generation == captured_generation; keep the
existing eviction for filtered listings. Use the existing names attempted_ids,
lost_race_ids, page_cache, page_result, captured_generation, self._items, and
actually_expired_ids to locate where to insert this refetch and reapply step.

In `@src/synthorg/persistence/sqlite/approval_repo.py`:
- Around line 295-297: The except block that catches (sqlite3.Error,
aiosqlite.Error) currently suppresses any error from await self._db.rollback(),
which hides rollback failures; change this to explicitly catch rollback
exceptions, log the rollback exception (including exception details) and then
either chain it to the original exception (raise original_exception from
rollback_exc) or re-raise the original after logging so the rollback failure is
not swallowed; locate the handler around await self._db.rollback() in the method
that performs DB operations and replace contextlib.suppress(sqlite3.Error,
aiosqlite.Error) with explicit exception handling that logs the rollback failure
and preserves/chains the original exception.

In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py`:
- Around line 130-155: The deserialization loop that constructs McpInstallation
(using NotBlankStr and coerce_row_timestamp) can raise non-SQLite exceptions
that currently bypass the sqlite-only except and escape without logging or
wrapping; wrap the deserialization in a try/except that catches Exception (in
addition to sqlite3/aiosqlite errors) and, in that handler, log
PERSISTENCE_MCP_INSTALLATION_LIST_FAILED with limit/offset/error_type/error
using logger and safe_error_description(exc), then raise QueryError("Failed to
list mcp installations") from exc so all failures (DB or malformed rows) are
logged and wrapped consistently.

---

Outside diff comments:
In `@src/synthorg/api/approval_store.py`:
- Around line 689-699: The scalar lazy-expire branch must use a compare-and-set
like the batch path to avoid overwriting concurrent decisions: instead of
unconditionally calling self._repo.save(expired) and assigning
self._items[item.id], call the repository compare-and-set helper
(expire_if_pending(...))—passing item.id, expected ApprovalStatus.PENDING and
the new ApprovalStatus.EXPIRED (and any expires_at check) —and only update the
in-memory cache (self._items[item.id] = expired) and invoke logs/callbacks when
that expire_if_pending call reports success; if self._repo is None (in-memory
mode) implement the same CAS by re-checking current self._items[item.id].status
before replacing so the transition happens only when the status was still
PENDING.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8c1a7d7c-90fe-449f-9445-484235606832

📥 Commits

Reviewing files that changed from the base of the PR and between e155e08 and 9ed706f.

📒 Files selected for processing (7)

src/synthorg/api/approval_store.py
src/synthorg/persistence/approval_protocol.py
src/synthorg/persistence/postgres/approval_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
tests/conformance/persistence/test_approval_repository.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Web Assets (melange)
GitHub Check: Test (Python 3.14)
GitHub Check: Dashboard Test
GitHub Check: Lighthouse Site
GitHub Check: CLI Bench Regression
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use from __future__ import annotations in Python files; Python 3.14 has PEP 649 native lazy annotations

Use PEP 758 except syntax: except A, B: (no parens) when not binding to a name; as exc requires parens (except (A, B) as exc:)

All public functions must have type hints; mypy strict mode is enforced

Docstrings must use Google style and are required on public classes and functions (ruff D rules enforce this)

Every Pydantic model must be frozen by default via ConfigDict(frozen=True, ...) unless documented otherwise; mutations go through model_copy(update=...), never direct attribute assignment

Use ConfigDict(frozen=True, allow_inf_nan=False) on every Pydantic model; extra="forbid" on every model that does not need to round-trip through model_dump()

Use NotBlankStr from core.types for identifier and name fields in Pydantic models

Every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event must declare a typed Pydantic args model and validate it before dispatch

Every entry-point that ingests a dict payload from an external source must call parse_typed() from synthorg.api.boundary with a hardcoded LiteralString boundary label (never user-controlled)

Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in; wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError / RecursionError) so one failure doesn't unwind the group

Classes that read time or sleep cooperatively must take clock: Clock | None = None defaulting to SystemClock() from synthorg.core.clock; tests inject FakeClock

Wrap attacker-controllable strings at LLM call sites via wrap_untrusted() from synthorg.engine.prompt_safety; append untrusted_content_directive(tags) to the enclosing system prompt

Never call lxml.html.fromstring on attacker input; use HTMLParseGuard from synthorg.tools.html_parse_guard instead (SEC-1)

Handle errors explicitly; never swallow. Domain...

Files:

src/synthorg/persistence/approval_protocol.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
tests/conformance/persistence/test_approval_repository.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/approval_repo.py
src/synthorg/api/approval_store.py
src/synthorg/persistence/sqlite/approval_repo.py

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Domain error class naming: error classes in domain modules use <Domain><Condition>Error and inherit from DomainError (or a domain-scoped intermediate that itself inherits DomainError). Bare Exception / RuntimeError at domain boundaries is forbidden

Every business-logic module must import from synthorg.observability import get_logger then logger = get_logger(__name__). Variable name is always logger

Never use import logging / logging.getLogger() / print() in application code (exception: observability/{setup,sinks,syslog_handler,http_handler,otlp_handler}.py for handler-construction / bootstrap code)

Event names: always import constants from synthorg.observability.events.<domain>; never use string literals

Structured kwargs in logging: always logger.info(EVENT, key=value); never logger.info("msg %s", val)

All error paths must log at WARNING or ERROR with context before raising

State transitions: every hop on a status enum (including non-terminal hops like PENDING -> RUNNING) must log at INFO using a domain-scoped *_STATUS_TRANSITIONED constant carrying from_status / to_status / domain id, AFTER the persistence write succeeds

DEBUG logging for object creation, internal flow, entry/exit of key functions. Pure data models, enums, re-exports do NOT need logging

Never call any logger severity method with error=str(exc); use structured logging with error_type=type(exc).__name__ and error=safe_error_description(exc) from synthorg.observability

Controllers and API endpoints must access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, etc.), never directly into repositories

src/synthorg/persistence/ is the ONLY place that may import aiosqlite, sqlite3, psycopg, or psycopg_pool, or emit raw SQL DDL/DML keywords in string literals

New settings register in src/synthorg/settings/definitions/<namespace>.py and are consumed via ConfigResolver.get_*. Direct `os.environ.get...

Files:

src/synthorg/persistence/approval_protocol.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/approval_repo.py
src/synthorg/api/approval_store.py
src/synthorg/persistence/sqlite/approval_repo.py

src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Repository CRUD vocabulary: persistence repositories use save(entity) -> None, get(id) -> Entity | None, delete(id) -> bool, list_items(...) -> tuple[Entity, ...], and query(...) -> tuple[Entity, ...]. Query methods always return tuples, never lists

Datetime marshalling in persistence: round-trip ISO 8601 timestamps through parse_iso_utc / format_iso_utc from synthorg.persistence._shared (both reject naive datetimes); use normalize_utc for relaxed coercion on already-typed datetime inputs

Repository services must not log mutations themselves; audit logging is centralized in service layers

Every durable feature MUST define a repository Protocol in persistence/<domain>_protocol.py, concrete impls under persistence/{sqlite,postgres}/, and be exposed on PersistenceBackend

Files:

src/synthorg/persistence/approval_protocol.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/approval_repo.py
src/synthorg/persistence/sqlite/approval_repo.py

src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/persistence/approval_protocol.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/approval_repo.py
src/synthorg/api/approval_store.py
src/synthorg/persistence/sqlite/approval_repo.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every Mock() / AsyncMock() / MagicMock() in tests MUST declare the interface via spec=ConcreteClass (Protocol or class). A pre-commit gate blocks new bare-call sites. Pre-existing sites frozen in scripts/mock_spec_baseline.txt. Regenerate via uv run python scripts/check_mock_spec.py --update

Time-driven tests: import FakeClock from tests._shared.fake_clock and inject it. Prefer FakeClock over patching time.monotonic() / asyncio.sleep() globals when the class accepts a clock= parameter

Coverage: 80% minimum (enforced in CI; benchmarks excluded via --ignore=tests/benchmarks/)

Async tests: asyncio_mode = "auto"; no manual @pytest.mark.asyncio needed

Test timeout: 30 seconds per test (global in pyproject.toml); non-default overrides like timeout(60) allowed

Test parallelism: pytest-xdist via -n 8, distribution --dist=loadfile (default in pyproject.toml addopts). ALWAYS include -n 8 when running pytest locally; never run tests sequentially

Isolation regression gate: affected-tests pre-push runner runs the affected subset twice via pytest-repeat (--count 2 -x). Opt out via SYNTHORG_SKIP_ISOLATION_GATE=1 for emergency pushes only

Never use monkeypatch.setattr(module.logger, "info", spy) to spy on logger; use context manager wrapping with direct setattr + try/finally del proxy.<level>. Canonical pattern: _logger_info_spy in tests/unit/settings/test_service.py

Parametrize: prefer @pytest.mark.parametrize for testing similar cases

Property-based testing: Python uses Hypothesis; when Hypothesis finds a failure, it is a real bug. Read the shrunk example, fix the underlying bug, add an explicit @example(...) decorator for permanent coverage

Flaky tests: NEVER skip, dismiss, or ignore. Always fix them fully and fundamentally. Use FakeClock injection when the class accepts clock=; patch globals only for legacy code paths without Clock seam

Files:

tests/conformance/persistence/test_approval_repository.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/conformance/persistence/test_approval_repository.py

src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS prevention): silent clients closed with policy code 1008 once they exceed api.ws_frame_timeout_seconds (default 30s) without sending a frame

WebSocket revalidation sliding window: persistence-backend failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds default 60s, api.ws_revalidation_max_failures default 5). Once saturated, socket closes with code 4011

Files:

src/synthorg/api/approval_store.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. Do NOT silently diverge from the design spec; every deviation needs explicit user approval, with the user deciding whether to proceed or update the spec.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: Commits: `<type>: <description>`. Types: feat, fix, refactor, docs, test, chore, perf, ci. Enforced by commitizen (commit-msg hook). Signed commits required on `main` via branch protection

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: Pre-push hooks: mypy type-check (affected modules only) + pytest unit tests (affected modules only) + golangci-lint + go vet + go test (CLI, conditional) + eslint-web (web dashboard) + `orphan-fixtures` (opt-in) + `setting-to-startup-trace` (conditional). Foundational module changes trigger full runs

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: Never create a PR directly via `gh pr create` (blocked by hookify). ALWAYS use `/pre-pr-review` to create PRs; it runs automated checks + review agents + fixes before creating the PR. For trivial/docs-only changes: `/pre-pr-review quick` skips agents but still runs checks

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: After finishing an issue implementation: create a feature branch (`<type>/<slug>`), commit, and push; do NOT create a PR automatically. Do NOT leave work uncommitted on main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts. At every phase, be critical and actively look for ways to improve the design; surface improvements as suggestions, not silent changes

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: Prioritize issues by dependency order, not priority labels; unblocked dependencies come first

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: NEVER use `cd` in Bash commands; working directory is already set to project root. Use absolute paths or run commands directly. Exception: `bash -c "cd <dir> && <cmd>"` is safe (child process, no cwd side effects)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:11:36.073Z
Learning: NEVER use Bash to write or modify files; use the Write or Edit tools. Forbidden: `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c "open(...).write(...)"`, `tee` for file creation/modification. Read-only inspection via piping to stdout is fine

🔇 Additional comments (1)

src/synthorg/persistence/postgres/mcp_installation_repo.py (1)

144-180: list_all now correctly enforces pagination inputs and normalizes failures.

Good update: Line 144 validates inputs at the boundary, and Lines 155-180 consistently log PERSISTENCE_MCP_INSTALLATION_LIST_FAILED and raise QueryError for DB/deserialization failures.

- approval_store: refetch lost-race rows from the repo and apply the caller's filters before extending the result. Previously dropped them entirely, which under-reported rows for unfiltered (status= None) listings when a concurrent save() raced our compare-and-set. Refetched rows also land in the cache so the next get() returns authoritative state. - sqlite/approval_repo.expire_if_pending: replace contextlib.suppress on the rollback path with explicit try/except that logs the rollback failure as a separate structured event before re-raising the original. Suppressing rollback diagnostics left the shared aiosqlite.Connection in an unknown state with no trace of why subsequent writes might start failing. - sqlite/mcp_installation_repo.list_all: broaden the read-path except from (sqlite3.Error, aiosqlite.Error) to Exception (with MemoryError/RecursionError carve-out) so NotBlankStr / coerce_row_ timestamp failures on a malformed persisted row also surface under PERSISTENCE_MCP_INSTALLATION_LIST_FAILED + QueryError, matching the postgres impl's pattern.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py`:
- Around line 106-112: The method name list_all on the SQLite repository doesn't
match the project's CRUD vocabulary; rename the method to list_items with the
same signature and return type (async def list_items(self, *, limit: int = 100,
offset: int = 0) -> tuple[McpInstallation, ...]) and update all call sites and
any interface/protocol definitions that reference list_all to use list_items
instead, ensuring the implementation body (in mcp_installation_repo.py) is
unchanged other than the name so behavior remains identical.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 308b11f7-4136-464c-82aa-6e99e2a4aaef

📥 Commits

Reviewing files that changed from the base of the PR and between 9ed706f and 0cfa0df.

📒 Files selected for processing (3)

src/synthorg/api/approval_store.py
src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Test (Python 3.14)
GitHub Check: Dashboard Test
GitHub Check: Build Web Assets (melange)
GitHub Check: CLI Bench Regression
GitHub Check: Lighthouse Site
GitHub Check: Lighthouse Dashboard
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (9)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Python version requirement: 3.14+ (PEP 649 native lazy annotations); no from __future__ import annotations

Use PEP 758 except syntax: except A, B: (no parens) when not binding; except (A, B) as exc: when binding

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/api/approval_store.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

All public functions require type hints; use mypy strict mode

Use Google-style docstrings required on public classes and functions (ruff D rules enforced)

Create new objects instead of mutating existing ones; use frozen Pydantic models for config/identity; use copy.deepcopy() + MappingProxyType for non-Pydantic registries

Use Pydantic v2 with ConfigDict(frozen=True, allow_inf_nan=False) on all models; use extra="forbid" on models not needing round-trip; use NotBlankStr for identifier/name fields

Every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event must declare a typed Pydantic args model and validate before dispatch

Every entry-point ingesting dict payload from external sources must call parse_typed() from synthorg.api.boundary with a hardcoded LiteralString boundary label

Prefer asyncio.TaskGroup for fan-out/fan-in; wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError/RecursionError)

Classes that read time or sleep must take clock: Clock | None = None parameter and inject FakeClock in tests

Async start()/stop() services must own a dedicated self._lifecycle_lock; timed-out stops mark the service unrestartable

Wrap attacker-controllable strings at LLM call sites via wrap_untrusted() from synthorg.engine.prompt_safety; append untrusted_content_directive(tags) to the system prompt

Never call lxml.html.fromstring on attacker input; use HTMLParseGuard from synthorg.tools.html_parse_guard

Cross-cutting subsystems follow protocol + strategy + factory + config discriminator with safe defaults

Line length: 88 characters (ruff enforced); functions: < 50 lines; files: < 800 lines

Handle errors explicitly, never swallow; domain error families register in EXCEPTION_HANDLERS for correct status codes

Every Pydantic model is ConfigDict(frozen=True, ...) unless documented otherwise; mutations use model_copy(update=...), never direct assign...

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/api/approval_store.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/api/approval_store.py

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Domain error classes use <Domain><Condition>Error naming and inherit from DomainError (or domain-scoped intermediate); bare Exception/RuntimeError forbidden at domain boundaries

Every business-logic module has from synthorg.observability import get_logger then logger = get_logger(__name__)

Never use import logging, logging.getLogger(), or print() in application code (exception: handler-construction code in observability setup files)

Event names always import constants from synthorg.observability.events.<domain>; never use string literals

Always use structured logging: logger.info(EVENT, key=value); never use logger.info("msg %s", val)

All error paths log at WARNING or ERROR with context before raising

Every status enum hop logs at INFO using a domain-scoped *_STATUS_TRANSITIONED constant AFTER the persistence write succeeds

DEBUG logging for object creation, internal flow, entry/exit of key functions; pure data models, enums, re-exports do not need logging

For every mutable setting: DB > env (SYNTHORG_<NAMESPACE>_<KEY>) > YAML > code default, resolved through SettingsService/ConfigResolver

Direct os.environ.get(...) reads in application code outside startup are forbidden; use ConfigResolver.get_*

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/api/approval_store.py

src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Repository CRUD vocabulary: save(entity) -> None, get(id) -> Entity | None, delete(id) -> bool, list_items(...) -> tuple[Entity, ...], query(...) -> tuple[Entity, ...]

Round-trip ISO 8601 timestamps through parse_iso_utc/format_iso_utc from synthorg.persistence._shared; use normalize_utc for relaxed coercion

Every durable feature MUST define a repository Protocol in persistence/<domain>_protocol.py, concrete impls under persistence/{sqlite,postgres}/, and expose on PersistenceBackend

Repositories MUST NOT log mutations themselves (enforced by scripts/check_persistence_boundary.py)

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py

{src,tests,web,cli}/**/*.{py,ts,tsx,js,go}

📄 CodeRabbit inference engine (CLAUDE.md)

Comments explain WHY only, never origin/review/issue context; forbidden: reviewer citations, issue back-references, cryptic taxonomy shorthand, migration/rebrand framing, round/iteration narrative, self-evident restatements

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/api/approval_store.py

{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project code, docstrings, comments, tests, or config examples; use generic names (example-provider, example-large-001, etc.)

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/api/approval_store.py

{src,web}/**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

No default may privilege a single region, currency, or locale; every user-facing format resolves from: user/company setting -> browser/system -> neutral fallback

Currency: never hardcode ISO 4217 codes/symbols. Backend: use DEFAULT_CURRENCY from synthorg.budget.currency or runtime budget.currency. Frontend: use DEFAULT_CURRENCY from @/utils/currencies or useSettingsStore().currency

No _usd suffix on money fields; the type carries money semantics; the value is in the operator's configured currency

Locale: never hardcode BCP 47 tags or call bare .toLocaleString()/.toLocaleDateString()/.toLocaleTimeString(). Use helpers in @/utils/format (frontend) or system locale (backend)

Timezone: store UTC; render via Intl without passing timeZone (browser tz wins)

Date/number format: always via Intl; no hand-rolled templates

Units: metric only. Spelling: International/British English UI default (colour, behaviour, organise, centred, analyse, cancelled); document deviations

Files:

src/synthorg/persistence/sqlite/approval_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/api/approval_store.py

src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout: close silent clients after api.ws_frame_timeout_seconds (default 30s) with policy code 1008

WebSocket revalidation uses _SlidingWindowRateLimiter instead of reset-on-success streak counter to track persistence failures

Files:

src/synthorg/api/approval_store.py

src/synthorg/{api,controllers}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (not directly into repositories); services centralize audit logging and cross-repo orchestration

Files:

src/synthorg/api/approval_store.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Read the relevant `docs/design/` page before implementing any feature or planning any issue; the design spec is the starting point for architecture, data models, and behavior

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: When implementation deviates from the spec, alert the user with explanation before proceeding; every deviation needs explicit user approval

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: At every phase of planning and implementation, be critical and actively look for ways to improve the design; surface improvements as suggestions, prioritize issues by dependency order

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Always use `uv` (not `pip`, `poetry`, or `venv`) for dependency management; use `uv sync --group docs` to install docs toolchain

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Git commits: `<type>: <description>`. Types: feat, fix, refactor, docs, test, chore, perf, ci (enforced by commitizen)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: All commits must be GPG/SSH signed (exception: GitHub App-signed commits from `synthorg-repo-bot` via Git Data API)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: No `cd` in Bash commands; working directory is project root. Exception: `bash -c "cd <dir> && <cmd>"` is safe (child process, no cwd side effects)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Never use Bash to write or modify files; use Write or Edit tools. Forbidden: `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c` writes, `tee`. Read-only/inspection OK

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: `scripts/check_setting_to_startup_trace.py` (pre-push + CI) flags settings whose consuming service is never instantiated at boot; per-setting opt-out via `# lint-allow: bootstrap-wiring -- <reason>`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: After finishing an issue implementation: always create a feature branch (`<type>/<slug>`), commit, and push; do NOT create a PR automatically

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Do NOT leave work uncommitted on main; branch, commit, push immediately after finishing

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Never create a PR directly via `gh pr create` (blocked by hookify); ALWAYS use `/pre-pr-review` to create PRs with automated checks + review agents

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: For trivial/docs-only changes: `/pre-pr-review quick` skips agents but still runs automated checks

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: After the PR exists, use `/aurelio-review-pr` to handle external reviewer feedback

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T01:36:53.660Z
Learning: Fix everything valid from review agents (including pre-existing issues in surrounding code); never skip or defer

🔇 Additional comments (13)

src/synthorg/persistence/sqlite/mcp_installation_repo.py (2)

119-123: Pagination argument validation is correctly centralized.

Good use of validate_pagination_args(...) with the domain event constant; this keeps invalid-input handling/logging consistent.

130-161: Unified failure envelope for DB + deserialization paths looks solid.

Wrapping fetch and row materialization in the same QueryError/event path is the right persistence-boundary behavior.

src/synthorg/persistence/sqlite/approval_repo.py (4)

6-19: LGTM!

Clean import organization: TYPE_CHECKING guard for the Sequence type hint and NotBlankStr addition align with the new method signatures.

207-268: LGTM!

Well-structured batch upsert implementation:

Empty input short-circuits correctly

Single-item delegation to save() preserves per-item error context in ConstraintViolationError

executemany with single commit amortizes I/O

Error handling maps to domain errors with structured logging before raising

270-324: LGTM – past review comment addressed.

The rollback failure is now explicitly logged at ERROR with structured context (phase="rollback") rather than suppressed, while the original exception remains chained on QueryError. The compare-and-set semantics with UPDATE...WHERE...AND status='pending' RETURNING id correctly implements the protocol contract.

102-120: LGTM!

PEP 758 except syntax at line 111 (except TypeError, KeyError:) is correctly applied for the row-id fallback extraction.

src/synthorg/api/approval_store.py (7)

85-100: LGTM!

Clock seam correctly injected per coding guidelines: optional clock parameter defaults to SystemClock(), enabling deterministic testing with FakeClock.

265-461: LGTM!

The repo-backed listing path is well-designed:

Per-page chunking keeps lock-hold duration short

Generation guard prevents cache resurrection after clear()

expire_if_pending CAS semantics prevent clobbering concurrent decisions

Lost-race rows are refetched and refiltered to avoid under-reporting

Status filter correctly excludes PENDING/EXPIRED from repo pushdown to avoid offset-based pagination drift

The complexity pragmas are appropriate given the interdependent sequential logic.

383-398: LGTM!

Correct PEP 758 syntax (except MemoryError, RecursionError: raise) and proper error logging with context before re-raising the batch-expiry exception.

463-500: LGTM!

Clean pure helper that separates computation from I/O. page_cache correctly captures the entire page (not just EXPIRED transitions) so stale non-expired siblings don't outlive a fresh repo read.

502-526: LGTM!

Cache-only listing path appropriately uses per-item _check_expiration_locked since there's no batch endpoint to amortize. Lock expectation is correctly documented in the method name.

695-762: LGTM – past review comments addressed.

Clock seam threaded through (self._clock.now() at line 715) and callback failure severity aligned to ERROR matching _fire_expire_callback, ensuring alerting isn't path-sensitive.

764-808: LGTM!

_compute_expiration: Correctly marked as "pure" (no store mutations); uses clock seam

_fire_expire_callback: Best-effort callback dispatch with proper PEP 758 exception handling and ERROR-level failure logging

coderabbitai · 2026-05-04T01:43:51Z

    async def list_all(
        self,
        *,
-        limit: int | None = None,
+        limit: int = 100,
        offset: int = 0,
    ) -> tuple[McpInstallation, ...]:
-        """List all recorded installations, oldest-first."""
+        """Return up to ``limit`` recorded installations, oldest-first.


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Align repository CRUD naming with list_items.

This repository method remains list_all(...), but the persistence CRUD contract requires list_items(...). Please align protocol + concrete repos (and call sites) to the standard vocabulary.

As per coding guidelines: src/synthorg/persistence/**/*.py: "Repository CRUD vocabulary: save(entity) -> None, get(id) -> Entity | None, delete(id) -> bool, list_items(...) -> tuple[Entity, ...], query(...) -> tuple[Entity, ...]".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/persistence/sqlite/mcp_installation_repo.py` around lines 106 - 112, The method name list_all on the SQLite repository doesn't match the project's CRUD vocabulary; rename the method to list_items with the same signature and return type (async def list_items(self, *, limit: int = 100, offset: int = 0) -> tuple[McpInstallation, ...]) and update all call sites and any interface/protocol definitions that reference list_all to use list_items instead, ensuring the implementation body (in mcp_installation_repo.py) is unchanged other than the name so behavior remains identical.

Round-10 head review: rename list_all to list_items on the McpInstallationRepository protocol and all four implementations (sqlite, postgres, in_memory, plus the mcp_protocol.py mirror) so the persistence CRUD vocabulary matches the canonical contract spelled out in CLAUDE.md (save, get, delete, list_items, query). Aligns the in_memory impl's limit=int|None=None outlier to the standard limit=int=100 signature used by the durable backends; the protocol body itself previously had no pagination params at all and now matches. Updated callers: conformance test (4 sites plus 2 test method renames), unit test (1 site), integration test (1 site). 26499 unit tests pass. Plus a transient CLA workflow flake on the prior head (action verified the signature, then died on a GitHub API socket-hang-up post-success); reran the failed job, now green.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

tests/conformance/persistence/test_mcp_installations_repository.py (1)

101-123: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add invalid-pagination conformance cases for list_items().

This only proves the happy path. An implementation that silently coerces limit=0, offset=-1, or booleans can still pass, even though the new repository contract says bad pagination inputs should raise QueryError. Lock that behavior in here so all backends stay aligned.

Suggested test shape

+from synthorg.core.persistence_errors import QueryError
@@
     async def test_list_items_pagination(self, backend: PersistenceBackend) -> None:
         # Insert with monotonically increasing installed_at so the
         # deterministic ORDER BY installed_at, catalog_entry_id places
         # the rows in a known order.
@@
         assert [r.catalog_entry_id for r in page_two] == ["cat_pag_2", "cat_pag_3"]
         assert [r.catalog_entry_id for r in page_three] == ["cat_pag_4"]
+
+    `@pytest.mark.parametrize`(
+        ("limit", "offset"),
+        [
+            (0, 0),
+            (-1, 0),
+            (1, -1),
+            (True, 0),
+            (1, False),
+        ],
+    )
+    async def test_list_items_rejects_invalid_pagination(
+        self,
+        backend: PersistenceBackend,
+        limit: object,
+        offset: object,
+    ) -> None:
+        with pytest.raises(QueryError):
+            await backend.mcp_installations.list_items(
+                limit=limit,  # type: ignore[arg-type]
+                offset=offset,  # type: ignore[arg-type]
+            )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/conformance/persistence/test_mcp_installations_repository.py` around
lines 101 - 123, Extend the test_list_items_pagination to include
negative/zero/boolean pagination cases that must raise QueryError: call
backend.mcp_installations.list_items with limit=0, offset=-1, limit=True or
offset=False (and any other invalid types your contract cares about) and assert
that each call raises QueryError; locate the test in test_list_items_pagination
and add these assertions referencing backend.mcp_installations.list_items and
the QueryError exception to lock the required invalid-input behavior.

src/synthorg/integrations/mcp_catalog/in_memory_installations.py (1)

58-81: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject invalid pagination inputs here instead of silently coercing them.

max(0, int(...)) makes limit=0, offset=-1, and even True/False look valid in the in-memory repo, while the updated list_items() contract now expects invalid pagination to fail with QueryError. That leaves this implementation out of sync with SQLite/Postgres and can hide real bugs in tests and no-persistence deployments.

Suggested fix

+from synthorg.observability.events.persistence import (
+    PERSISTENCE_MCP_INSTALLATION_LIST_FAILED,
+)
+from synthorg.persistence._shared.pagination import validate_pagination_args
+
     async def list_items(
         self,
         *,
         limit: int = 100,
         offset: int = 0,
     ) -> tuple[McpInstallation, ...]:
         """List installations ordered by ``installed_at, catalog_entry_id`` ASC.
@@
         ``limit`` defaults to the protocol-wide pagination floor;
         callers needing more must loop with ``offset`` or pass a
         larger ``limit`` explicitly.
         """
+        validate_pagination_args(
+            limit,
+            offset,
+            event=PERSISTENCE_MCP_INSTALLATION_LIST_FAILED,
+            backend="in_memory",
+        )
         rows = tuple(
             sorted(
                 self._store.values(),
                 key=lambda i: (i.installed_at, i.catalog_entry_id),
             ),
         )
-        effective_offset = max(0, int(offset))
-        return rows[effective_offset : effective_offset + max(0, int(limit))]
+        return rows[offset : offset + limit]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/integrations/mcp_catalog/in_memory_installations.py` around
lines 58 - 81, list_items currently coerces invalid pagination via max(0,
int(...)) which hides bad inputs; instead validate offset and limit explicitly
in list_items (the method on this in-memory repo that reads self._store and
returns tuple[McpInstallation,...]) and raise QueryError for invalid values —
e.g. ensure offset and limit are integers (reject booleans), offset >= 0, and
limit > 0 (or whatever the protocol requires) and if validation fails raise
QueryError with a clear message so behavior matches the SQLite/Postgres
implementations.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/synthorg/integrations/mcp_catalog/in_memory_installations.py`:
- Around line 58-81: list_items currently coerces invalid pagination via max(0,
int(...)) which hides bad inputs; instead validate offset and limit explicitly
in list_items (the method on this in-memory repo that reads self._store and
returns tuple[McpInstallation,...]) and raise QueryError for invalid values —
e.g. ensure offset and limit are integers (reject booleans), offset >= 0, and
limit > 0 (or whatever the protocol requires) and if validation fails raise
QueryError with a clear message so behavior matches the SQLite/Postgres
implementations.

In `@tests/conformance/persistence/test_mcp_installations_repository.py`:
- Around line 101-123: Extend the test_list_items_pagination to include
negative/zero/boolean pagination cases that must raise QueryError: call
backend.mcp_installations.list_items with limit=0, offset=-1, limit=True or
offset=False (and any other invalid types your contract cares about) and assert
that each call raises QueryError; locate the test in test_list_items_pagination
and add these assertions referencing backend.mcp_installations.list_items and
the QueryError exception to lock the required invalid-input behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d168a1f3-c8cb-4531-ac3b-3c5013e3e20c

📥 Commits

Reviewing files that changed from the base of the PR and between 0cfa0df and cc7b04d.

📒 Files selected for processing (8)

src/synthorg/integrations/mcp_catalog/in_memory_installations.py
src/synthorg/integrations/mcp_catalog/installations.py
src/synthorg/persistence/mcp_protocol.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
tests/conformance/persistence/test_mcp_installations_repository.py
tests/integration/integrations/test_controllers.py
tests/unit/integrations/test_mcp_catalog.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Backend
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: Lighthouse Site
GitHub Check: Lighthouse Dashboard
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

No from __future__ import annotations in Python 3.14+ code; PEP 649 provides native lazy annotations.

Use PEP 758 except syntax: except A, B: (no parens) when not binding to a name; except (A, B) as exc: when binding. Ruff enforces this on Python 3.14.

All public functions and classes must have type hints; mypy runs in strict mode. Docstrings are required on public classes and functions using Google style; ruff enforces D rules.

Create new objects instead of mutating existing ones. Use frozen Pydantic models for config/identity; for non-Pydantic registries use copy.deepcopy() at construction and MappingProxyType wrapping. Use deepcopy() at system boundaries (tool execution, provider serialization, persistence).

Separate config from runtime state: use frozen models for config/identity and separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves. Never mix static config and mutable runtime fields in one model.

Pydantic v2 conventions: use ConfigDict(frozen=True, allow_inf_nan=False) on all models; use extra='forbid' on every model that does not need to round-trip through model_dump(); use @computed_field for derived values; use NotBlankStr from core.types for identifier/name fields.

Every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event must declare a typed Pydantic args model and be validated before dispatch.

Prefer asyncio.TaskGroup for fan-out/fan-in concurrent work. Wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError/RecursionError) so one failure does not unwind the group.

Classes that read time or sleep cooperatively must take clock: Clock | None = None defaulting to SystemClock() from synthorg.core.clock; tests inject FakeClock. Never patch time.monotonic() / asyncio.sleep() globals if a Clock seam is available.

Async services own a dedicated self._lifecycle_lock for synchronized start()/`sto...

Files:

src/synthorg/persistence/mcp_protocol.py
src/synthorg/integrations/mcp_catalog/installations.py
tests/conformance/persistence/test_mcp_installations_repository.py
src/synthorg/integrations/mcp_catalog/in_memory_installations.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
tests/unit/integrations/test_mcp_catalog.py
src/synthorg/persistence/postgres/mcp_installation_repo.py
tests/integration/integrations/test_controllers.py

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every entry-point that ingests a dict payload from an external source (MCP handler args, JWT decode, WebSocket control message, audit-chain payload, A2A JSON-RPC params, settings security import) must call parse_typed() from synthorg.api.boundary to validate and emit API_BOUNDARY_VALIDATION_FAILED on failure.

Comments explain WHY only, never origin/review/issue context. Forbidden: reviewer citations, issue/PR back-references, cryptic internal-taxonomy shorthand without explanation, migration/rebrand framing, round/iteration narrative. Allowed: hidden constraints, subtle invariants, workarounds for upstream bugs (with stable bug-tracker URL), and why a non-obvious choice was made.

Every business-logic module must import from synthorg.observability import get_logger and assign logger = get_logger(__name__). Variable name is always logger. Never use bare import logging or logging.getLogger() in application code.

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService), never directly into repositories. Services centralize audit logging and cross-repo orchestration; repositories must not log mutations.

Direct os.environ.get(...) reads in application code outside startup are forbidden; use ConfigResolver.get_*() instead.

No default may privilege a single region, currency, or locale. Currency: never hardcode ISO 4217 codes or symbols; use DEFAULT_CURRENCY from synthorg.budget.currency (backend) or @/utils/currencies (frontend). Locale: never hardcode BCP 47 tags or call bare .toLocaleString(); use helpers in @/utils/format. Timezone: store UTC; render via Intl without passing timeZone. Monetary models: every cost-bearing Pydantic model carries currency: CurrencyCode; aggregation enforces same-currency invariants.

Files:

src/synthorg/persistence/mcp_protocol.py
src/synthorg/integrations/mcp_catalog/installations.py
src/synthorg/integrations/mcp_catalog/in_memory_installations.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py

src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/persistence/ is the only place that may import aiosqlite, sqlite3, psycopg, or psycopg_pool, or emit raw SQL DDL/DML keywords in string literals. Enforced by scripts/check_persistence_boundary.py.

Repository CRUD vocabulary: use save(entity) -> None (insert-or-update, idempotent), get(id) -> Entity | None (None on miss, never raises), delete(id) -> bool (True on removal, False if absent), list_items(...) -> tuple[Entity, ...] (paginated/filtered), and query(...) -> tuple[Entity, ...]. Query methods always return tuples, never lists.

Datetime marshalling in persistence: round-trip ISO 8601 timestamps through parse_iso_utc/format_iso_utc from synthorg.persistence._shared (both reject naive datetimes); use normalize_utc for relaxed coercion on already-typed datetime inputs.

Files:

src/synthorg/persistence/mcp_protocol.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py

src/synthorg/persistence/**/*_protocol.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every durable feature must define a repository Protocol in persistence/<domain>_protocol.py, with concrete implementations under persistence/{sqlite,postgres}/, exposed on PersistenceBackend.

Files:

src/synthorg/persistence/mcp_protocol.py

src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/persistence/mcp_protocol.py
src/synthorg/integrations/mcp_catalog/installations.py
src/synthorg/integrations/mcp_catalog/in_memory_installations.py
src/synthorg/persistence/sqlite/mcp_installation_repo.py
src/synthorg/persistence/postgres/mcp_installation_repo.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: use @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow. Every Mock() / AsyncMock() / MagicMock() must declare the interface via spec=ConcreteClass (enforced by scripts/check_mock_spec.py). Use shared mocks from conftest.py (e.g. mock_dispatcher).

Time-driven tests: import FakeClock from tests._shared.fake_clock and inject it via clock= parameter. Use FakeClock.sleep (yields via asyncio.sleep(0)) and await clock.advance_async(seconds) to drive tasks. Never patch time.monotonic() / asyncio.sleep() globals if a Clock seam is available.

Coverage minimum: 80% (enforced in CI; benchmarks excluded via --ignore=tests/benchmarks/). Timeout: 30 seconds per test (global; non-default overrides like timeout(60) allowed). Parallelism: always use -n 8 with pytest-xdist; distribution via --dist=loadfile (default).

Flaky tests must be fixed fundamentally, never skipped or dismissed. Use FakeClock for timing-sensitive tests; use asyncio.Event().wait() instead of asyncio.sleep(large_number) for indefinite blocks (cancellation-safe, no timing assumptions).

Files:

tests/conformance/persistence/test_mcp_installations_repository.py
tests/unit/integrations/test_mcp_catalog.py
tests/integration/integrations/test_controllers.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/conformance/persistence/test_mcp_installations_repository.py
tests/unit/integrations/test_mcp_catalog.py
tests/integration/integrations/test_controllers.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:04:16.772Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts. At every phase of planning and implementation, actively look for ways to improve the design; surface improvements as suggestions and let the user decide.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:04:16.772Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts. After finishing implementation, always create a feature branch, commit, and push; do NOT create a PR automatically. Use `/pre-pr-review` to run automated checks + agents before PR creation (or `/pre-pr-review quick` for trivial/docs-only changes). After the PR exists, use `/aurelio-review-pr` for external reviewer feedback.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-04T02:04:16.772Z
Learning: Fix everything valid during code review, never skip. When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and adjacent findings), fix them all; no deferring or out-of-scope skipping.

Round-11 head review (outside-diff-range): - in_memory_installations.list_items: replace silent max(0, int(...)) coercion with the shared validate_pagination_args helper so invalid pagination inputs (limit=0, offset<0, non-int, bool) raise QueryError. Aligns the in-memory shim with the sqlite/postgres impls; previously a no-persistence test or headless dev app could mask a real bug that the durable backends catch. - conformance test: add parametrized test_list_items_rejects_invalid_pagination covering limit=0, offset=-1, limit=-1, limit=True (bool subtype), offset=False; runs against every backend on PersistenceBackend so the QueryError contract stays locked across sqlite + postgres + future backends. 26499 unit tests pass; 29 MCP-install tests pass.

## Highlights > _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._ ### What you'll notice - Release notes now include AI-generated Highlights and commit subject/body in dev releases. ### What's new - Trace added to detect ghost-wired settings for better linting feedback. ### Under the hood - Docker publish step now retries transient GHCR errors to improve release stability. - Codebase audit incorporated 21 mechanical fixes from recent reviews to improve quality. - Added false-positive prevention rules to codebase audit prompts to reduce errors. - Enhanced codebase audit skill based on insights from latest audits. - Web component tightened async-leak detection and improved jsdom Storage timer handling.  :robot: I have created a release *beep* *boop* --- ## [0.7.9](v0.7.8...v0.7.9) (2026-05-04) ### Features * **ci:** promote AI Highlights to release body, add commit subject/body to dev releases ([#1743](#1743)) ([70bffe9](70bffe9)) * **lint:** bootstrap-wiring trace for ghost-wired settings ([#1742](#1742)) ([6ddbb86](6ddbb86)), closes [#1737](#1737) ### Bug Fixes * **ci:** retry transient GHCR errors in docker publish + retag ([#1741](#1741)) ([18e5349](18e5349)) ### Maintenance * 2026-05-03 Bucket A audit ([#1733](#1733)): 21 mechanical fixes ([#1744](#1744)) ([334262b](334262b)) * add 5 FP-prevention rules to /codebase-audit agent prompts ([#1734](#1734)) ([8f33ad6](8f33ad6)) * improve codebase-audit skill from 2026-05-03 lessons ([#1732](#1732)) ([7281cd7](7281cd7)) * **web:** tighten async-leak ceiling, bypass jsdom Storage timer dispatch ([#1728](#1728)) ([de0a1e1](de0a1e1)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Aureliolo temporarily deployed to atlas May 3, 2026 20:43 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 3, 2026 20:44 — with GitHub Actions Inactive

gemini-code-assist Bot reviewed May 3, 2026

View reviewed changes

coderabbitai Bot requested changes May 3, 2026

View reviewed changes

Aureliolo added 22 commits May 3, 2026 23:33

refactor: remove _group_records_by_agent wrapper

4bb3831

The wrapper was a one-line passthrough to budget._aggregation.group_by_agent. Callers now import group_by_agent directly, removing a layer of indirection that obscured the call graph without adding any value. Refs #1733 (R11)

Aureliolo temporarily deployed to cloudflare-preview May 3, 2026 23:44 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 3, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 4, 2026 00:09 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 4, 2026 00:10 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 4, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 4, 2026 00:38 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 4, 2026 00:39 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 4, 2026

View reviewed changes

Comment thread src/synthorg/api/approval_store.py Outdated

Comment thread src/synthorg/api/approval_store.py

Comment thread src/synthorg/persistence/postgres/mcp_installation_repo.py Outdated

Aureliolo temporarily deployed to atlas May 4, 2026 01:10 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 4, 2026 01:11 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 4, 2026

View reviewed changes

Comment thread src/synthorg/api/approval_store.py

Comment thread src/synthorg/persistence/sqlite/approval_repo.py

Comment thread src/synthorg/persistence/sqlite/mcp_installation_repo.py

Aureliolo temporarily deployed to atlas May 4, 2026 01:35 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 4, 2026 01:36 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 4, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 4, 2026 02:03 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 4, 2026 02:04 — with GitHub Actions Inactive

coderabbitai Bot reviewed May 4, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 4, 2026 02:27 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 4, 2026 02:28 — with GitHub Actions Inactive

Aureliolo merged commit 334262b into main May 4, 2026
92 checks passed

Aureliolo deleted the chore/audit-bucket-a branch May 4, 2026 02:45

Aureliolo temporarily deployed to cloudflare-preview May 4, 2026 02:45 — with GitHub Actions Inactive

synthorg-repo-bot Bot mentioned this pull request May 3, 2026

chore(main): release 0.7.9 #1731

Merged

	model_config = ConfigDict(frozen=True, allow_inf_nan=False)
	model_config = ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid")

Conversation

Aureliolo commented May 3, 2026

Acceptance items (21/21 RESOLVED)

Sub-section 1 — Audit-chain + security misroutes

Sub-section 2 — Concurrency + lifecycle locks

Sub-section 3 — Persistence + migrations + N+1

Sub-section 4 — Settings wiring + Clock seam

Sub-section 5 — Pydantic + pagination + magic numbers + Prometheus

Sub-section 6 — Frontend + docs + consolidation

Pre-PR review — agent findings addressed in this PR

New regression coverage

Verification

Notes for the reviewer

Uh oh!

github-actions Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

Scanned Files

Uh oh!

coderabbitai Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Uh oh!

codspeed-hq Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 3, 2026 •

edited

Loading

coderabbitai Bot commented May 3, 2026 •

edited

Loading

codspeed-hq Bot commented May 3, 2026 •

edited

Loading

codecov Bot commented May 3, 2026 •

edited

Loading