perf: composite indexes + cursor pagination + clock seam + SSE rate-limit + Ollama sanitization + retry-after web client + WS reconnect jitter #1822

Merged
Aureliolo merged 21 commits into main from perf/performance-data-integrity on May 9, 2026

Conversation

@Aureliolo
Owner

Summary

Implements all 18 perf / data-integrity findings from the 2026-05-05 audit, plus 20 follow-up findings surfaced by the pre-PR review pipeline.

Closes #1780

What changed

Persistence (3 composite indexes)

  • cost_records (agent_id, timestamp DESC) and (task_id, timestamp DESC)
  • decision_records (task_id, recorded_at, id)
  • Atlas migrations on both SQLite and Postgres backends; EXPLAIN-asserting integration tests under tests/integration/persistence/test_perf_indices_{sqlite,postgres}.py
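
A minimal sketch of what an EXPLAIN-asserting check might look like on the SQLite side. The table columns beyond those named above, the index name idx_cost_records_agent_ts, and the explain_plan helper are assumptions made for illustration; they are not taken from the project's schema or test files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical minimal schema; the real table has more columns.
    CREATE TABLE cost_records (
        id INTEGER PRIMARY KEY,
        agent_id TEXT NOT NULL,
        task_id TEXT NOT NULL,
        timestamp TEXT NOT NULL
    );
    -- Composite index matching the (agent_id, timestamp DESC) shape.
    CREATE INDEX idx_cost_records_agent_ts
        ON cost_records (agent_id, timestamp DESC);
    """
)


def explain_plan(sql: str, params: tuple) -> str:
    """Return the detail column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


plan = explain_plan(
    "SELECT * FROM cost_records WHERE agent_id = ? "
    "ORDER BY timestamp DESC LIMIT 50",
    ("agent-1",),
)
print(plan)
```

If the planner can use the composite index for both the filter and the ordering, the plan names the index and contains no temporary B-tree sort step, which is the property such a test would pin.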

Cursor pagination on 7 list endpoints -- BREAKING wire-format change

  • GET /providers (was Mapping[str, ProviderResponse])
  • GET /providers/{name}/models
  • GET /setup/agents (was wrapped in SetupAgentsListResponse)
  • GET /setup/personality-presets (was wrapped in PersonalityPresetsListResponse)
  • GET /scaling/strategies
  • GET /scaling/signals
  • GET /settings/observability/sinks (now typed SinkInfoResponse, was dict[str, Any])
  • All return PaginatedResponse[T] with HMAC-signed cursor + bounded limit, matching the existing WorkflowController.list_workflows shape
  • ProviderResponse gains an optional name field for paginated entries
  • Wrapper DTOs (SetupAgentsListResponse, PersonalityPresetsListResponse) removed -- pre-alpha posture, no compat shims

Clock seam injection (3 services)

  • LiteLLMDriver (OAuth credential cache TTL)
  • FineTuneOrchestrator (WS progress throttle)
  • HttpAnalyticsEmitter (flush throttle bookkeeping)
  • All accept clock: Clock | None = None, default SystemClock(); new FakeClock-driven unit tests cover TTL boundaries, throttle intervals, and flush timestamp propagation

SSE rate-limit + concurrency cap on /events/stream

  • New policy events.stream: (60, 60) registered alongside the existing per-op rate-limit table
  • per_op_concurrency("events.stream", max_inflight=4, key="user") mirrors the pull_model SSE template

Ollama error sanitization

  • New _sanitize_ollama_error() helper redacts POSIX paths, Windows paths, and host:port tokens before forwarding errors via SSE / structured logs; non-string error payloads collapse to a generic message and the original type is logged at WARN
  • Adversarial-input parametrised test covers paths, host:port, oversize, integer / null inputs, and a benign control case
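
The redaction behaviour described above might be sketched like this. The regular expressions, placeholder strings, generic message, and length cap below are illustrative assumptions and are not the patterns used by the project's _sanitize_ollama_error(); the WARN-level logging of the original type is omitted.

```python
import re

# Hypothetical patterns; the real helper's expressions may differ.
_WINDOWS_PATH = re.compile(r"[A-Za-z]:\\\S*")
_POSIX_PATH = re.compile(r"(?<!\w)(?:/[\w.-]+)+")
_HOST_PORT = re.compile(r"(?:[\w-]+\.)*[\w-]+:\d{2,5}")

GENERIC_MESSAGE = "Model pull failed"  # assumed wording
MAX_LEN = 200  # assumed cap for oversize payloads


def sanitize_error(error: object) -> str:
    """Redact paths and host:port tokens; collapse non-strings to a generic message."""
    if not isinstance(error, str):
        return GENERIC_MESSAGE
    text = _WINDOWS_PATH.sub("<path>", error)
    text = _POSIX_PATH.sub("<path>", text)
    text = _HOST_PORT.sub("<host>", text)
    return text[:MAX_LEN]


for sample in (
    "open /usr/share/ollama/models/blob: permission denied",
    "dial tcp ollama-internal.local:11434: connection refused",
    "model not found",
    None,
):
    print(repr(sanitize_error(sample)))
```

The benign control case matters as much as the adversarial ones: a message with no path or host token should pass through unchanged so that users still see a useful error.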

Web 429 / Retry-After

  • New web/src/utils/fetch-with-retry.ts wraps window.fetch with bounded retries honouring Retry-After (RFC 9110 delta-seconds + HTTP-date), capped at MAX_RETRY_AFTER_MS=5000, with explicit idempotent: boolean opt-in for non-GET methods
  • parseRetryAfterMs extracted to web/src/utils/retry-after.ts so the axios interceptor and the new helper share one parser; emits a structured warn on malformed input
  • Wired into pullModel SSE (web/src/api/endpoints/providers.ts) and callServerLogout (web/src/utils/app-version.ts)
  • AbortSignal short-circuits during retry sleep so caller cancellation lands promptly

WS reconnect jitter (±20%)

  • web/src/stores/websocket.ts multiplies the deterministic exponential delay by a uniform [0.8, 1.2) multiplier, clamped to >= 1ms; constants WS_RECONNECT_JITTER_MIN/MAX live in utils/constants.ts
  • Test parametrised across lower-bound / midpoint / upper-bound Math.random values
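
The jitter arithmetic can be sketched as below, again in Python for illustration rather than as a copy of web/src/stores/websocket.ts. The base delay, the maximum delay, and the function name are assumptions; only the [0.8, 1.2) multiplier range and the 1 ms floor come from the description above.

```python
import random
from collections.abc import Callable

WS_RECONNECT_JITTER_MIN = 0.8
WS_RECONNECT_JITTER_MAX = 1.2


def reconnect_delay_ms(
    attempt: int,
    base_ms: int = 1000,  # assumed base delay
    max_ms: int = 30000,  # assumed ceiling on the exponential delay
    rand: Callable[[], float] = random.random,
) -> int:
    """Exponential backoff multiplied by a uniform jitter factor, floored at 1 ms."""
    deterministic = min(base_ms * 2**attempt, max_ms)
    span = WS_RECONNECT_JITTER_MAX - WS_RECONNECT_JITTER_MIN
    multiplier = WS_RECONNECT_JITTER_MIN + rand() * span
    return max(1, round(deterministic * multiplier))
```

Passing the random source as a parameter lets a test pin the lower bound, midpoint, and upper bound, in the same spirit as the parametrised Math.random cases mentioned above. The jitter spreads reconnect attempts so that many clients dropped at the same moment do not all retry in lockstep.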

Test plan

  • uv run mypy src/ tests/ - PASS (3678 source files)
  • uv run ruff check src/ tests/ + ruff format --check - PASS
  • uv run python -m pytest tests/unit/api tests/unit/providers tests/unit/memory/embedding tests/unit/meta/telemetry -m unit -n 8 - 4467 passed, 3 skipped (POSIX-only)
  • uv run python -m pytest tests/integration/persistence/test_perf_indices_sqlite.py -m integration -n 0 - 3 passed
  • npm --prefix web run lint - PASS (zero warnings)
  • npm --prefix web run test (vitest) - 3038 passed (250 files)
  • atlas migrate validate --env sqlite + --env postgres - clean on both
  • All convention gates pass: check_list_pagination, check_dual_backend_test_parity, check_no_em_dashes, check_no_review_origin_in_code, check_no_migration_framing, check_logger_exception_str_exc, check_no_magic_numbers (baseline regenerated for shifted line numbers), check_dto_forbid_extra

Review coverage

18 review agents ran in parallel:

  • Always-on: docs-consistency, comment-quality-rot, mini-pass missing-logger, mini-pass missing-event-constants, mini-pass missing-state-transition-log, mini-pass race-conditions
  • Conditional: python-reviewer, code-reviewer, pr-test-analyzer, silent-failure-hunter, type-design-analyzer, logging-audit, resilience-audit, conventions-enforcer, security-reviewer, frontend-reviewer, api-contract-drift, persistence-reviewer, test-quality-reviewer, async-concurrency-reviewer
  • Issue-resolution: all 7 audit tasks confirmed RESOLVED at confidence 100

25 findings total: 20 valid + 5 false-positives (logged in _audit/pre-pr-review/triage.md). All 20 valid items addressed in the final fixup commit on this branch.

Breaking changes

The seven paginated endpoints changed their response shape from ApiResponse[Mapping[str, T]] / ApiResponse[ListResponse] to PaginatedResponse[T] ({ data: T[], pagination: { limit, next_cursor, has_more } }). Web clients walk pages via paginateAll(); MSW handlers return the new wire envelope. Pre-alpha posture allows the change without compat shims.

@github-actions
Contributor

github-actions Bot commented May 8, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Contributor

coderabbitai Bot commented May 8, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 25566eeb-2d59-4016-ab1c-721210f56ecb

📥 Commits

Reviewing files that changed from the base of the PR and between 05827f8 and ece7ed5.

📒 Files selected for processing (5)
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/settings/definitions/api.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/core/conftest.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Deploy Preview
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Build Backend
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: Socket Security: Pull Request Alerts
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

No default may privilege a region, currency, or locale. Resolution: user/company → browser/system → neutral fallback. Use International/British English UI default (e.g. colour, behaviour, organise, centred, analyse).

Files:

  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/core/conftest.py
  • src/synthorg/settings/definitions/api.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit / integration / e2e / slow. Every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt. Without spec= mocks silently absorb every attribute access. Enforced by scripts/check_mock_spec.py. Use mock_dispatcher from tests/conftest.py for shared mocks.
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Never use monkeypatch.setattr(module.logger, ...) antipattern; the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead (see _logger_info_spy in tests/unit/settings/test_service.py).
Prefer @pytest.mark.parametrize for similar test cases.

Files:

  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/core/conftest.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/core/conftest.py
**/{src,tests,docs,web}/**/*.{py,md,mdx,yaml,yml,json}

📄 CodeRabbit inference engine (CLAUDE.md)

No em-dashes in code, config, or documentation. Use -- (two hyphens). Enforced by pre-commit hook.

Files:

  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/core/conftest.py
  • src/synthorg/settings/definitions/api.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
tests/**/conftest.py

📄 CodeRabbit inference engine (CLAUDE.md)

Always use parallelism: pytest-xdist -n 8 --dist=loadfile (never worksteal on Python 3.14 + Windows ProactorEventLoop). Unit tests run under WindowsSelectorEventLoopPolicy (set by tests/unit/conftest.py). Tool tests driving real asyncio.create_subprocess_exec override back to default policy in tests/unit/tools/conftest.py.

Files:

  • tests/unit/core/conftest.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: All cost-bearing Pydantic models must carry currency: CurrencyCode. Mixing currencies in aggregations raises MixedCurrencyAggregationError (HTTP 409, error code 4007). Aggregations call assert_currencies_match() from synthorg.budget.currency before reducing. Per-line opt-out: # lint-allow: currency-aggregation -- <reason>.
Never use unguarded sum(), math.fsum(), statistics.mean(), statistics.fmean() (including bare-name imports) over .cost, .amount, .total_cost, .usd, or .eur fields without asserting currency invariants. Enforced by scripts/check_currency_aggregation_invariant.py. Per-line opt-out: # lint-allow: currency-aggregation -- <reason> (mandatory non-empty reason).
src/synthorg/persistence/ is the only place that may import aiosqlite, sqlite3, psycopg, psycopg_pool, or emit raw SQL DDL/DML. Every durable feature defines a Protocol in persistence/<domain>_protocol.py with concrete impls under persistence/{sqlite,postgres}/ exposed on PersistenceBackend. Controllers and API endpoints access persistence through domain-scoped service layers; services centralize audit logging; repositories must not log mutations. Per-line opt-out: # lint-allow: persistence-boundary -- <reason>. Enforced by scripts/check_persistence_boundary.py.
Provide type hints on all public functions. mypy strict enforcement required.
Use Google-style docstrings on public classes and functions. Enforced by ruff D rules.
Never mutate objects; create new objects via model_copy(update=...) or copy.deepcopy(). Frozen Pydantic for config/identity; MappingProxyType for non-Pydantic registries; deepcopy at system boundaries.
Separate frozen config models from mutable-via-copy runtime models; never mix in one model.
Use Pydantic v2 with ConfigDict(frozen=True, allow_inf_nan=False) everywhere. Apply extra="forbid" on every model that doesn't round-trip through model_dump() (every API-boundary DTO with Request/Response/Sna...

Files:

  • src/synthorg/settings/definitions/api.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/settings/definitions/api.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes socket with code 4011.

Files:

  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/core/conftest.py
  • src/synthorg/settings/definitions/api.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
🔇 Additional comments (11)
tests/unit/core/conftest.py (1)

104-111: Good stabilisation of AgentIdentityFactory.skills generation.

Explicitly binding skills = SkillSetFactory removes the flaky overlap path and keeps fixture generation deterministic in CI.

tests/unit/api/controllers/test_setup_personality.py (3)

41-57: Good hardening of persistence verification.

Capturing the updated agent name and validating against an explicitly bounded fetch (limit=100) makes this test much less brittle under pagination changes.


121-134: Strong contract assertions for paginated response shape.

The added checks on both data and pagination appropriately pin the new PaginatedResponse wire format.


142-144: Nice fail-fast guard before payload iteration.

Asserting success before traversing body["data"] improves diagnostics when the endpoint regresses.

src/synthorg/settings/definitions/api.py (1)

31-77: Well-structured tuning constants with clear documentation.

The new Final[int] constants are properly placed in the allowlisted settings/definitions/ module, with clear docstrings explaining purpose and rationale for each value. The separation of rate-limit defaults, inflight caps, and the priority fallback sentinel is clean.

src/synthorg/api/controllers/scaling.py (2)

194-211: Priority fallback constant properly extracted.

The magic number 999 has been replaced with the imported SCALING_STRATEGY_PRIORITY_FALLBACK constant from settings, addressing the concern from a previous review. The pagination implementation with cursor/limit and deterministic name-based sorting is correct.


260-326: Signal pagination implementation looks correct.

The deduplication logic (via seen set) preserves signal uniqueness, sorting by name ensures deterministic cursor-based pagination, and the fallback to decision history on cold start is a reasonable UX choice.

src/synthorg/api/controllers/memory.py (4)

152-220: Threshold resolution implementation addresses prior review feedback.

The _FineTuneThresholds model with frozen=True, extra="forbid" and the _resolve_fine_tune_thresholds helper correctly:

  • Fall back to imported defaults when SettingsService is unavailable
  • Reject non-positive values (line 201: value >= 1)
  • Enforce the cross-field invariant min_docs_recommended >= min_docs_required (lines 208-215)

The PEP 758 comma-separated except clause at line 194 is valid for Python 3.14+.


262-265: Policy-based concurrency limiting is a clean abstraction.

Switching from hardcoded per_op_concurrency(max_inflight=1) to per_op_concurrency_from_policy("memory.fine_tune", key="user") allows operators to tune inflight caps via the settings registry without code changes.


841-852: Document count boundaries are now inclusive as expected.

Line 841 uses < for the hard requirement floor (fail if below minimum), and line 847 uses <= for the warn band (warn at or below recommended), which matches the documented behaviour for the memory.fine_tune_min_docs_recommended setting.


930-968: Batch size recommendation correctly uses resolved default.

The default_batch_size parameter is now resolved from settings at the API boundary and passed in, with the imported constant serving only as the function's offline/test fallback. The return paths at lines 947 and 953 correctly use the resolved default when CUDA is unavailable or no VRAM tier matches.


Walkthrough

This pull request implements cursor-based pagination across 7 API list endpoints, centralizes Retry-After parsing and adds fetchWithRetryAfter, applies policy-driven rate limits and inflight concurrency guards, injects Clock dependencies for deterministic timing, creates composite DB indexes and migrations for pagination performance, sanitizes Ollama pull-stream errors, and updates tests and MSW mocks to use paginated envelopes.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements cursor-based pagination across several API controllers, including providers, scaling, settings, and setup, while updating the frontend to handle these paginated responses. It also introduces a shared fetchWithRetryAfter utility for the web client, adds performance indices to the database for cost and decision records, and improves backend testability by injecting a Clock service into various components. Review feedback identifies a missing import for PaginatedResponse in the providers controller, suggests adding explicit type parameters to paginated responses for better type safety, and recommends a more robust regex for host-port redaction in Ollama error messages to prevent internal topology leakage.

Comment on lines 22 to 24
from synthorg.api.dto import (
    DEFAULT_LIMIT,
    ApiResponse,
Contributor


high

The PaginatedResponse type is used in the new paginated endpoints (e.g., list_providers and list_models) but is missing from the synthorg.api.dto imports. This will lead to a NameError at runtime.

from synthorg.api.dto import (
    DEFAULT_LIMIT,
    ApiResponse,
    PaginatedResponse,

Comment thread src/synthorg/api/controllers/scaling.py Outdated
Comment on lines +179 to +186
return PaginatedResponse(
    data=(),
    pagination=PaginationMeta(
        limit=limit,
        next_cursor=None,
        has_more=False,
    ),
)
Contributor


medium

For consistency with other paginated endpoints in this PR and to support better static type checking, the PaginatedResponse return should include the explicit type parameter [ScalingStrategyResponse].

Suggested change
- return PaginatedResponse(
-     data=(),
-     pagination=PaginationMeta(
-         limit=limit,
-         next_cursor=None,
-         has_more=False,
-     ),
- )
+ return PaginatedResponse[ScalingStrategyResponse](
+     data=(),
+     pagination=PaginationMeta(
+         limit=limit,
+         next_cursor=None,
+         has_more=False,
+     ),
+ )

Comment thread src/synthorg/api/controllers/scaling.py Outdated
Comment on lines +280 to +287
return PaginatedResponse(
    data=(),
    pagination=PaginationMeta(
        limit=limit,
        next_cursor=None,
        has_more=False,
    ),
)
Contributor


medium

For consistency with other paginated endpoints in this PR and to support better static type checking, the PaginatedResponse return should include the explicit type parameter [ScalingSignalResponse].

Suggested change
- return PaginatedResponse(
-     data=(),
-     pagination=PaginationMeta(
-         limit=limit,
-         next_cursor=None,
-         has_more=False,
-     ),
- )
+ return PaginatedResponse[ScalingSignalResponse](
+     data=(),
+     pagination=PaginationMeta(
+         limit=limit,
+         next_cursor=None,
+         has_more=False,
+     ),
+ )

Comment on lines +45 to +48
# Dotted ``host:port`` tokens (``ollama-internal.local:11434``).
_OLLAMA_HOST_PORT: Final[re.Pattern[str]] = re.compile(
    r"(?:[\w-]+\.){1,}[\w-]+:\d{2,5}",
)
Contributor


medium (security)

The _OLLAMA_HOST_PORT regular expression requires at least one dot in the hostname, which means it will fail to redact common local identifiers like localhost or single-label internal hostnames (e.g., ollama:11434). This could lead to accidental leakage of internal topology in error messages forwarded to clients.

Suggested change
- # Dotted ``host:port`` tokens (``ollama-internal.local:11434``).
- _OLLAMA_HOST_PORT: Final[re.Pattern[str]] = re.compile(
-     r"(?:[\w-]+\.){1,}[\w-]+:\d{2,5}",
- )
+ # Dotted or single-label ``host:port`` tokens (``localhost:11434``, ``ollama.local:11434``).
+ _OLLAMA_HOST_PORT: Final[re.Pattern[str]] = re.compile(
+     r"(?:[\w-]+\.)*[\w-]+:\d{2,5}",
+ )
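
The difference between the two patterns in this suggestion can be checked directly. The snippet copies both expressions verbatim from the comment above; the constant names, sample strings, and the redacted_by helper are made up for the demonstration.

```python
import re

# Patterns copied from the review comment; the names are illustrative.
OLD_HOST_PORT = re.compile(r"(?:[\w-]+\.){1,}[\w-]+:\d{2,5}")
NEW_HOST_PORT = re.compile(r"(?:[\w-]+\.)*[\w-]+:\d{2,5}")

SAMPLES = (
    "ollama-internal.local:11434",
    "localhost:11434",
    "ollama:11434",
    "10.0.0.5:11434",
)


def redacted_by(pattern: re.Pattern[str], text: str) -> bool:
    """Report whether the pattern would find something to redact in text."""
    return pattern.search(text) is not None


for sample in SAMPLES:
    print(sample, redacted_by(OLD_HOST_PORT, sample), redacted_by(NEW_HOST_PORT, sample))
```

The original pattern misses the single-label hosts, as the comment says. One trade-off worth keeping in mind is that the broadened pattern also matches clock-like tokens such as 12:30, so benign text containing times would be redacted as well.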

@codspeed-hq

codspeed-hq Bot commented May 8, 2026

Merging this PR will not alter performance

✅ 54 untouched benchmarks


Comparing perf/performance-data-integrity (ece7ed5) with main (d01e624) [1]


Footnotes

  1. No successful run was found on main (b7b9a59) during the generation of this report, so d01e624 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/controllers/events.py`:
- Around line 534-537: The hardcoded max_inflight=4 on the per_op_concurrency
decorator for the "events.stream" endpoint should be replaced with a
settings-backed constant: define (or add) a setting like EVENTS_SSE_MAX_INFLIGHT
in the settings definitions module for the events namespace, load it via the
existing settings API (or import the constant) and pass that value to
per_op_concurrency instead of the literal 4, providing a sensible
default/fallback if the setting is absent so behavior is unchanged until
operators tune it.

In `@src/synthorg/api/controllers/setup.py`:
- Around line 442-449: list_agents() currently reorders the returned list by
name (see the sorted(...) assigned to ordered) while write endpoints still use
positional agent_index against the persisted array, which breaks correctness;
either stop reordering here by returning summaries in their original storage
order (replace the sorted(...) usage and pass summaries directly into
paginate_cursor) or else add and return a stable identifier on each summary
(e.g., agent_id) and update all write handlers that accept agent_index to
resolve by that stable ID instead of array position (choose one approach and
implement it consistently across list_agents(), paginate_cursor call sites, and
PUT/POST handlers).

In `@src/synthorg/providers/management/local_models.py`:
- Around line 46-48: The _OLLAMA_HOST_PORT regex only matches dotted domain
names and misses common host:port forms (localhost, IPv4, bracketed IPv6);
update the pattern used in _OLLAMA_HOST_PORT (and the other similar occurrence
around the same block) to also match localhost, IPv4 addresses, and bracketed
IPv6 literals before the colon and port. Use a single hardened regex that
alternates bracketed IPv6, IPv4, literal "localhost", or standard hostnames
followed by :<port> (2–5 digits), and replace both occurrences of the older
pattern with this new regex so all host:port forms are redacted.
- Around line 186-188: The current guard uses `if error:` which skips falsey
error values; change the presence check to treat any `error` field as terminal
by replacing that conditional with a presence test (e.g., `if "error" in data:`)
so the code path that calls `_sanitize_ollama_error(error)` and subsequent
terminal-event handling always runs when the `error` key exists (use `error =
data.get("error")` then `if "error" in data:` to keep the same `error`
variable).

In `@tests/integration/persistence/test_perf_indices_postgres.py`:
- Around line 131-151: The test currently seeds only one row per task_id so the
planner can satisfy the ORDER BY without the new index; update the fixture
seeding in the loop that calls
postgres_backend.decision_records.append_with_next_version (and/or _TASKS) to
insert multiple rows per task_id with different recorded_at timestamps (e.g.,
several iterations per task) so the SELECT in _explain_plan (the SQL querying
"SELECT * FROM decision_records WHERE task_id = %s ORDER BY recorded_at ASC, id
ASC LIMIT 50") must rely on the ordering index; ensure the inserted recorded_at
values are distinct and ordered to make the planner choose the pinned plan.

In `@tests/integration/persistence/test_perf_indices_sqlite.py`:
- Around line 125-143: The seed creates only one row per task_id which makes the
ORDER BY recorded_at ASC, id ASC non-discriminative and can cause the planner to
not require the composite index; update the seeding loop that calls
on_disk_backend.decision_records.append_with_next_version (and uses
_TASKS/_AGENTS/_BASE) to insert multiple rows per task_id (e.g., multiple
iterations per task with distinct recorded_at timestamps and distinct record_id
values) so that ORDER BY recorded_at, id yields different ordering within a task
and the _explain_plan call against the decision_records query can reliably
detect index usage.

In `@tests/unit/api/controllers/test_settings_sinks.py`:
- Around line 164-165: Replace the runtime skip with a failing assertion so
missing sinks fail the test: instead of calling pytest.skip when the variable
full has fewer than two items, assert that len(full) >= 2 (with a clear message
like "need at least two sinks for cursor round-trip") so the test fails loudly;
update the check around the existing `full` variable and remove the
`pytest.skip` call to enforce the precondition.

In `@tests/unit/api/controllers/test_setup.py`:
- Around line 1573-1609: The assertions after the strict equality in
test_list_presets_round_trip_with_cursor are redundant; keep the strict equality
check (assert collected == full) and remove the subsequent duplicate/gap checks
that traverse the collection again (the assertions that build names and check
len(set(names)) and len(collected) == len(full)); update the test by deleting
the two extra assertions and the intermediate names list so the function
(test_list_presets_round_trip_with_cursor) relies solely on the existing
collected == full check.

In `@web/src/mocks/handlers/setup.ts`:
- Around line 115-116: The paginated mock uses
emptyPaginatedEnvelope<SetupAgentSummary>() which only types items and can drift
from the actual endpoint contract; replace it with the endpoint-typed helper
paginatedFor to tie the mock to the real response type (e.g.,
paginatedFor<typeof getAgents>()) and do the same for listPersonalityPresets
(and the other occurrences around lines 153-154) so the MSW handlers use
paginatedFor/successFor/voidSuccess helpers and stay 1:1 with the client
endpoints.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b91c23a1-0d1c-4f0c-af65-f9b6e8e47ded

📥 Commits

Reviewing files that changed from the base of the PR and between 11aeafe and e2bf3fb.

⛔ Files ignored due to path filters (2)
  • src/synthorg/persistence/postgres/revisions/atlas.sum is excluded by !**/*.sum
  • src/synthorg/persistence/sqlite/revisions/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (57)
  • docs/reference/conventions.md
  • scripts/loop_bound_init_baseline.txt
  • scripts/mock_spec_baseline.txt
  • scripts/no_magic_numbers_baseline.txt
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_models.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/persistence/postgres/revisions/20260508131900_perf_indices_cost_decision.sql
  • src/synthorg/persistence/postgres/schema.sql
  • src/synthorg/persistence/sqlite/revisions/20260508131842_perf_indices_cost_decision.sql
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/providers/management/local_models.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/test_dto_forbid_extra.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/providers/management/test_local_models.py
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/client.ts
  • web/src/api/endpoints/providers.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/api/endpoints/setup.ts
  • web/src/api/types/providers.ts
  • web/src/api/types/setup.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/stores/websocket.ts
  • web/src/utils/app-version.ts
  • web/src/utils/constants.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
💤 Files with no reviewable changes (3)
  • web/src/api/types/setup.ts
  • src/synthorg/api/controllers/setup_models.py
  • tests/unit/api/test_dto_forbid_extra.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: Deploy Preview
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Type Check
  • GitHub Check: Dashboard Test
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (10)
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/mocks/handlers/settings.ts
  • web/src/utils/constants.ts
  • web/src/api/endpoints/settings.ts
  • web/src/utils/app-version.ts
  • web/src/mocks/handlers/index.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/utils/retry-after.ts
  • web/src/stores/websocket.ts
  • web/src/api/endpoints/setup.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/api/types/providers.ts
  • web/src/api/endpoints/providers.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/api/client.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
web/src/mocks/handlers/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/mocks/handlers/**/*.ts: Mirror every exported endpoint in web/src/api/endpoints/*.ts with a 1:1 default happy-path MSW handler in web/src/mocks/handlers/; boot test-setup with onUnhandledRequest: 'error' and override per-case via server.use(...), never vi.mock('@/api/endpoints/*')
Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types

Files:

  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/providers.ts
web/src/utils/constants.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Keep the WebSocket wire protocol constants (WS_PROTOCOL_VERSION, WS_MAX_MESSAGE_SIZE, WS_HEARTBEAT_INTERVAL_MS, WS_PONG_TIMEOUT_MS, LOG_SANITIZE_MAX_LENGTH) in web/src/utils/constants.ts in lockstep with src/synthorg/api/ws_models.py / src/synthorg/api/controllers/ws.py; bump protocol version on both sides together for breaking payload changes

Files:

  • web/src/utils/constants.ts
web/src/stores/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/stores/**/*.ts: All store mutation actions (create / update / delete) must follow the stores/connections/crud-actions.ts pattern: wrap API calls in try/catch, success updates state + emits success toast, failure logs + emits error toast + returns sentinel (null for entity, false for delete); callers MUST NOT wrap store mutation calls in try/catch
List-read store actions must set error: string | null on the store instead of toasting; use opaque cursor-based pagination via PaginationMeta, keep nextCursor + hasMore in state (not offset arithmetic), and early-return when !hasMore || !nextCursor
Always capture previous synchronously in optimistic mutations and restore in the catch block
Any new Zustand store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global afterEach in test-setup.tsx
Store files over ~600 lines must be sliced into packages with one of two aggregation patterns: package-internal index.ts or sibling .ts aggregator

Files:

  • web/src/stores/websocket.ts
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Read the relevant docs/design/ page before implementing or planning; deviations require explicit user approval and design page updates
Present every implementation plan to the user for accept/deny before coding; surface improvements as suggestions, prioritize by dependency order

Files:

  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/scaling.py
  • tests/unit/providers/management/test_local_models.py
  • src/synthorg/providers/management/dtos.py
  • tests/integration/api/controllers/test_providers.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/controllers/events.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • src/synthorg/api/controllers/setup.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • src/synthorg/meta/telemetry/emitter.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/rate_limits/test_policies.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: NEVER import aiosqlite, sqlite3, psycopg, or psycopg_pool; emit raw SQL DDL/DML ONLY in src/synthorg/persistence/
Define persistence Protocols in persistence/<domain>_protocol.py and concrete implementations under persistence/{sqlite,postgres}/, exposed on PersistenceBackend
No hardcoded numeric thresholds/weights/limits/timeouts/scoring policies; all live in src/synthorg/settings/definitions/<namespace>.py
Never mutate objects; create new objects via model_copy(update=...) or copy.deepcopy(). Use frozen Pydantic for config/identity; MappingProxyType for non-Pydantic registries
Pydantic v2 mandatory: ConfigDict(frozen=True, allow_inf_nan=False) for all config models; use @computed_field for derived values; use NotBlankStr from core.types for identifiers/names
Use parse_typed() from synthorg.api.boundary with hardcoded LiteralString boundary label at every dict ingestion from external sources (MCP args, JWT decode, WebSocket control, audit-chain payload, A2A JSON-RPC, settings security import)
Wrap attacker-controllable strings via wrap_untrusted() from synthorg.engine.prompt_safety; append untrusted_content_directive(tags) (SEC-1)
Never call lxml.html.fromstring on attacker input; use HTMLParseGuard (SEC-1)
Domain error families must register a base-class entry in EXCEPTION_HANDLERS (src/synthorg/api/exception_handlers.py); use <Domain><Condition>Error inheriting from DomainError
Use <Domain><Condition>Error inheriting from DomainError; direct bases of Exception, RuntimeError, LookupError, PermissionError, ValueError, TypeError, KeyError, IndexError, AttributeError, OSError, IOError are forbidden in src/synthorg/
Repository CRUD methods: save(entity) -> None (idempotent), get(id) -> Entity | None, delete(id) -> bool, list_items(...) -> tuple[Entity, ...], query(...) -> tuple[Entity, ...]; queries always return tuples
Classes that read time or sleep take `cl...

Files:

  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/api/**/*.py: Use extra="forbid" on every API-boundary DTO with Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffix
Controllers and API endpoints access persistence through domain-scoped service layers; services centralize audit logging; repositories must not log mutations themselves

Files:

  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Every Mock() / AsyncMock() / MagicMock() MUST declare spec=ConcreteClass to prevent silent attribute absorption
Use mock_dispatcher from tests/conftest.py (AsyncMock(spec=NotificationDispatcher)) for shared mocks
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter; FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0)
Markers: @pytest.mark.unit / integration / e2e / slow; pytest timeout 30s global; non-default like timeout(60) allowed; run with -n 8 --dist=loadfile (xdist)
Coverage minimum 80% (CI enforced); benchmarks excluded; use -n 8 --dist=loadfile always
Never skip/xfail/dismiss flaky tests; fix fundamentally; use FakeClock-first, asyncio.Event().wait() for blocking
Never use real vendor names (Anthropic, OpenAI, Claude, GPT); use example-provider, example-{large,medium,small}-001 in project code; tests use test-provider, test-small-001
Hypothesis: CI runs 10 deterministic examples (derandomize=True); failures are real bugs—fix bug and add @example(...) decorator; use property-based tests for invariant validation

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/rate_limits/test_policies.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/rate_limits/test_policies.py
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

CI runs scripts/generate_runtime_stats.py and scripts/inject_runtime_stats.py BEFORE zensical build to keep numeric claims fresh; HTML comments are stripped by renderer

Files:

  • docs/reference/conventions.md
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Use markdown tables for tabular data; never use text fences with ASCII box-drawing

Files:

  • docs/reference/conventions.md
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/scaling.py
  • tests/unit/providers/management/test_local_models.py
  • src/synthorg/providers/management/dtos.py
  • tests/integration/api/controllers/test_providers.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/controllers/events.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • src/synthorg/api/controllers/setup.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • src/synthorg/meta/telemetry/emitter.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/rate_limits/test_policies.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
🪛 OpenGrep (1.20.0)
tests/integration/persistence/test_perf_indices_sqlite.py

[ERROR] 77-77: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)


[ERROR] 78-78: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)

🔇 Additional comments (52)
web/src/utils/constants.ts (1)

12-23: Reconnect jitter constants are well centralized and clearly documented.

This is a clean, testable way to keep reconnect jitter behavior consistent across store logic and tests.

web/src/stores/websocket.ts (1)

232-249: Jittered reconnect scheduling is implemented correctly and defensively.

The exponential-delay + jitter computation is straightforward, and the post-rounding >=1ms clamp is a solid safeguard for future tuning.
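
For readers who want the shape of that computation, here is a minimal sketch in Python (the store itself is TypeScript). The function name, the base/cap values, and the symmetric jitter ratio are all assumptions for illustration, not the constants from web/src/utils/constants.ts.

```python
import random
from typing import Callable


def reconnect_delay_ms(
    attempt: int,
    *,
    base_ms: int = 1000,          # hypothetical base delay
    max_ms: int = 30_000,         # hypothetical cap on the exponential term
    jitter_ratio: float = 0.25,   # hypothetical +/- 25% jitter window
    rand: Callable[[], float] = random.random,
) -> int:
    """Exponential backoff with symmetric jitter, clamped to at least 1 ms."""
    exponential = min(base_ms * (2 ** attempt), max_ms)
    # Map rand() in [0, 1) onto a factor in [1 - ratio, 1 + ratio).
    factor = 1 + (rand() * 2 - 1) * jitter_ratio
    # Clamp after rounding so a tiny base or large negative jitter never yields 0.
    return max(1, round(exponential * factor))
```

Injecting `rand` mirrors the way the tests parameterize Math.random(): a fixed value makes the delay deterministic.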

web/src/__tests__/stores/websocket.test.ts (1)

425-475: Reconnect timing tests are now deterministic and validate jitter bounds well.

Good coverage improvement: the timer advancement and Math.random() parameterization make the jitter behavior explicit and reliable.

src/synthorg/persistence/sqlite/revisions/20260508131842_perf_indices_cost_decision.sql (1)

1-6: Composite index migration looks correct.

The added index set matches the intended query shapes and naming consistency for SQLite migration parity.

src/synthorg/persistence/sqlite/schema.sql (1)

108-112: Schema index additions are aligned and consistent.

These index definitions match the migration and expected pagination query patterns for SQLite.

Also applies to: 648-649

tests/integration/persistence/test_perf_indices_sqlite.py (1)

84-117: Cost-records planner assertions are well-targeted.

These cases seed enough rows per predicate to make the composite index choice meaningful.
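
A planner assertion of this kind can be sketched roughly as follows. This is a standalone illustration using the stdlib sqlite3 module directly; the table columns, the index name, and the helper `plan_for` are assumptions and are not taken from the repository's schema or test file.

```python
import sqlite3


def plan_for(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return the EXPLAIN QUERY PLAN detail text for a parameterized query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); the detail names the index used.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; the real cost_records schema has more columns.
conn.execute(
    "CREATE TABLE cost_records ("
    "id INTEGER PRIMARY KEY, agent_id TEXT NOT NULL, "
    "task_id TEXT NOT NULL, timestamp TEXT NOT NULL)"
)
# Hypothetical index name for the (agent_id, timestamp DESC) composite.
conn.execute(
    "CREATE INDEX idx_cost_records_agent_ts "
    "ON cost_records (agent_id, timestamp DESC)"
)
conn.executemany(
    "INSERT INTO cost_records (agent_id, task_id, timestamp) VALUES (?, ?, ?)",
    [(f"agent-{i % 5}", f"task-{i}", f"2026-05-08T00:00:{i:02d}Z") for i in range(50)],
)

plan = plan_for(
    conn,
    "SELECT id FROM cost_records WHERE agent_id = ? "
    "ORDER BY timestamp DESC LIMIT 10",
    ("agent-1",),
)
```

Values are bound through placeholders rather than interpolated into the SQL string, and the test then only needs to check that the index name appears in `plan`.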

src/synthorg/persistence/postgres/revisions/20260508131900_perf_indices_cost_decision.sql (1)

1-6: Postgres migration index set is consistent and complete.

The three composite indexes match the intended query patterns and backend parity.

src/synthorg/persistence/postgres/schema.sql (1)

94-97: Schema composite indexes are correctly defined for Postgres.

These additions align with the migration and expected keyset-pagination access paths.

Also applies to: 663-664

tests/integration/persistence/test_perf_indices_postgres.py (1)

85-123: Cost-records EXPLAIN assertions are solid.

The setup and plan checks clearly validate the intended composite index usage.

src/synthorg/providers/drivers/litellm_driver.py (1)

46-46: LGTM! Clock seam correctly implemented.

The clock injection follows the repository's standard pattern: optional parameter with SystemClock() default, stored on self._clock, and used for monotonic time reads in credential cache TTL checks. This preserves production behavior while enabling deterministic testing.

Also applies to: 142-143, 158-158, 184-184

src/synthorg/meta/telemetry/emitter.py (1)

16-16: LGTM! Clock seam correctly implemented for flush throttling.

The HttpAnalyticsEmitter now accepts an injectable clock and uses it for _last_flush_at initialization and updates. This follows the standard clock seam pattern and enables deterministic testing of flush interval logic.

Also applies to: 109-110, 114-114, 137-137, 240-240

tests/unit/meta/telemetry/test_emitter.py (1)

384-432: LGTM! Clock seam tests are thorough and correct.

The new TestEmitterClockSeam class validates both initialization and flush-time updates of _last_flush_at using FakeClock. The tests properly advance monotonic time and verify the emitter tracks it correctly. Cleanup with finally blocks ensures no resource leaks.

tests/unit/providers/drivers/test_litellm_auth.py (1)

164-257: LGTM! Comprehensive clock seam tests with excellent boundary coverage.

The new TestLiteLLMDriverCredentialCacheClock class validates credential caching behavior across the TTL window. The test_cache_boundary_at_exactly_ttl test is particularly valuable—it locks in the exclusive upper bound (< comparison) at the exact TTL boundary, ensuring a future change to <= would surface as a test failure rather than a silent off-by-one bug.
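
The pattern under test can be sketched as below, assuming a simplified cache class. `CredentialCache`, this `FakeClock`, and the `Clock` protocol are hypothetical stand-ins written for this illustration; they are not the repository's LiteLLMDriver or tests._shared.fake_clock implementations.

```python
from __future__ import annotations

import time
from typing import Callable, Protocol


class Clock(Protocol):
    def monotonic(self) -> float: ...


class SystemClock:
    def monotonic(self) -> float:
        return time.monotonic()


class FakeClock:
    """Test double: time only moves when advance() is called."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def monotonic(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


class CredentialCache:
    def __init__(self, ttl_seconds: float, *, clock: Clock | None = None) -> None:
        self._ttl = ttl_seconds
        # Production callers omit clock and get the real monotonic clock.
        self._clock = clock if clock is not None else SystemClock()
        self._value: str | None = None
        self._fetched_at: float | None = None

    def get(self, fetch: Callable[[], str]) -> str:
        now = self._clock.monotonic()
        # Exclusive upper bound: an entry aged exactly ttl is already stale.
        fresh = self._fetched_at is not None and now - self._fetched_at < self._ttl
        if not fresh:
            self._value = fetch()
            self._fetched_at = now
        return self._value
```

With a fake clock, a test can step to just under the TTL, then to exactly the TTL, and observe that the second step triggers a refetch.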

src/synthorg/memory/embedding/fine_tune_orchestrator.py (1)

15-15: LGTM! Clock seam implementation with excellent documentation.

The clock injection follows the standard pattern and is correctly used in the progress throttle callback. The detailed comment at lines 601-609 is particularly valuable—it explains the clock binding outside the closure and clarifies thread-safety for both production (SystemClock.monotonic → thread-safe time.monotonic) and tests (FakeClock must be invoked from the test thread).

Also applies to: 80-89, 95-95, 601-610, 635-635

scripts/loop_bound_init_baseline.txt (1)

32-32: LGTM! Baseline correctly updated for line number shift.

The _op_lock initialization moved from line 97 to line 99 in fine_tune_orchestrator.py due to the clock parameter and field additions. This baseline update is accurate.

tests/unit/memory/embedding/test_fine_tune_orchestrator.py (1)

201-259: LGTM! Clock seam test validates throttling behavior correctly.

The new TestProgressThrottleClockSeam.test_throttle_uses_injected_clock test advances FakeClock through the throttle window in precise increments (0%, 50%, 110%) and verifies emission counts at each step. The test correctly uses await asyncio.sleep(0) to yield control and allow call_soon_threadsafe callbacks to propagate.

src/synthorg/api/rate_limits/policies.py (1)

91-92: events.stream policy registration looks consistent.

This new default is correctly shaped and integrated with the existing registry pattern.

web/src/api/types/providers.ts (1)

79-85: Provider name optional field update is clear and safe.

The documentation and type shape are aligned with the list/single endpoint contract.

tests/unit/api/rate_limits/test_controller_coverage.py (1)

143-147: Coverage map extension for events.stream is correct.

Good addition—this prevents silent guard drift on the SSE endpoint.

tests/unit/api/rate_limits/test_policies.py (1)

154-158: Pinned default for events.stream is a good regression guard.

This keeps policy tuning changes explicit and test-visible.

tests/unit/providers/management/test_local_models.py (1)

303-343: Sanitizer test matrix is solid.

Good coverage of adversarial/error-shape cases and output constraints.

src/synthorg/providers/management/dtos.py (2)

418-425: LGTM: Well-documented optional field addition.

The inline documentation clearly explains when name is populated (paginated lists) vs omitted (single-resource GET-by-path), and the type annotation correctly uses NotBlankStr | None per project conventions.


600-620: LGTM: Signature change correctly threads the provider name.

The keyword-only name parameter prevents accidental positional passing, and the updated docstring clearly documents when callers should provide it (paginated lists) vs omit it (single-resource GETs).

web/src/utils/retry-after.ts (1)

34-62: LGTM: Robust Retry-After parsing with proper error handling.

The implementation correctly handles both RFC 9110 formats (delta-seconds and HTTP-date), falls back to the RFC 9457 error_detail.retry_after field, and logs malformed inputs as warnings for operational visibility. The budget enforcement via the DO_NOT_RETRY sentinel is the right choice—it allows callers to surface 429s instead of hammering the backend with artificially shortened retries.
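
As a rough illustration of the two header formats, here is a Python sketch (the utility itself is TypeScript). The function name, the budget value, and the sentinel object are assumptions; the fallback to the response body's error_detail.retry_after field and the warning log are omitted.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DO_NOT_RETRY = object()  # hypothetical sentinel: wait exceeds the retry budget


def parse_retry_after_ms(header, *, now: datetime, max_wait_ms: int = 60_000):
    """Parse a Retry-After header into milliseconds.

    Returns 0 for a missing or malformed header, DO_NOT_RETRY when the
    requested wait is larger than max_wait_ms (an assumed budget).
    """
    if header is None:
        return 0
    value = header.strip()
    if value.isascii() and value.isdigit():
        # delta-seconds form, e.g. "120"
        wait_ms = int(value) * 1000
    else:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return 0
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        # Dates in the past mean "retry now", not a negative wait.
        wait_ms = max(0, int((when - now).total_seconds() * 1000))
    return DO_NOT_RETRY if wait_ms > max_wait_ms else wait_ms
```

Passing `now` explicitly keeps the HTTP-date branch deterministic in tests.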

web/src/api/client.ts (1)

14-18: LGTM: Clean refactoring to shared retry utilities.

The axios interceptor now uses the centralized parseRetryAfterMs and constants from @/utils/retry-after, eliminating duplication without changing behavior. The retry decision flow (idempotency check, budget enforcement, retry-count tracking) remains intact.

Also applies to: 150-158

web/src/utils/fetch-with-retry.ts (2)

59-69: LGTM: Comprehensive idempotency-key detection.

The helper correctly handles all three RequestInit.headers formats (Headers object, array of tuples, plain object) and performs case-insensitive matching per HTTP header conventions.


95-129: LGTM: Solid retry loop with proper abort signal handling.

The helper correctly:

  • Checks abort signal both before and after the retry sleep (lines 118-119, 122-123), ensuring cancellation is observed promptly.
  • Short-circuits when the server requests a wait exceeding the budget (line 115).
  • Enforces the retry budget via MAX_RATE_LIMIT_RETRIES (line 109).
  • Supports testability via injected sleep and fetchImpl (lines 100-101).
web/src/__tests__/utils/fetch-with-retry.test.ts (1)

1-206: LGTM: Comprehensive test coverage for retry behavior.

The test suite thoroughly validates:

  • Idempotency gating (method-based, header-based, explicit opt-in).
  • Retry budget enforcement and exhaustion.
  • Malformed Retry-After handling (treated as immediate retry with 0ms wait).
  • Abort signal short-circuiting (both before and during the retry sleep).

The parameterized test (lines 144-164) efficiently covers multiple HTTP methods, and the abort tests (lines 166-204) ensure cancellation is observed without issuing unnecessary fetch calls.

web/src/utils/app-version.ts (1)

122-135: LGTM: Correct use of retry-after for idempotent logout.

The change to fetchWithRetryAfter with idempotent: true is appropriate: logout is replay-safe on the server (a second call has no observable side effect). The existing AbortController timeout (lines 117-119) ensures the boot flow isn't blocked by an unbounded retry loop.


tests/unit/api/controllers/test_providers.py (2)

20-27: LGTM: Test updated for paginated response contract.

The assertions now correctly validate the PaginatedResponse envelope: data is an empty list, pagination.has_more is False, and pagination.next_cursor is None on an empty collection.


29-31: LGTM: Tampered cursor test validates error handling.

The new test ensures that a malformed or tampered cursor query parameter returns HTTP 400, preventing silent failures or security issues from unparseable cursors.

web/src/mocks/handlers/index.ts (1)

53-53: Good central re-export for paginated mocks.

Exposing emptyPaginatedEnvelope from the index keeps handler usage consistent across stories/tests as cursor-paginated endpoints expand.

tests/integration/api/controllers/test_providers.py (1)

78-83: Pagination-aware provider assertion looks correct.

Using a name-indexed lookup over body["data"] correctly matches the new paginated contract and still verifies DB override precedence.

web/src/__tests__/stores/sinks.test.ts (1)

9-17: Nice test fixture migration to paginated envelopes.

The local paginatedSinks helper and updated MSW responses keep these store tests aligned with the new PaginatedResponse contract.

Also applies to: 49-50, 69-70, 118-119

docs/reference/conventions.md (1)

54-58: Clear contract update for cursor pagination.

This wording makes the next_cursor traversal model and “no total on wire” behavior explicit for consumers.

web/src/mocks/handlers/helpers.ts (1)

158-176: Useful helper addition for paginated MSW wire bodies.

emptyPaginatedEnvelope<T> cleanly fills the gap where paginatedFor isn’t applicable and keeps envelope shape centralized.

web/src/mocks/handlers/settings.ts (1)

67-67: Sinks mock now correctly returns paginated envelope.

Good switch to emptyPaginatedEnvelope<SinkInfo>() for parity with the cursor-paginated endpoint contract.

web/src/api/endpoints/scaling.ts (1)

43-50: Cursor pagination integration is well applied in both list endpoints.

Both functions now correctly page through /scaling/strategies and /scaling/signals via paginateAll + unwrapPaginated.

Also applies to: 71-78

web/src/api/endpoints/settings.ts (1)

51-58: listSinks pagination migration looks correct.

The cursor loop and paginated unwrap align with the new wire contract and keep caller-facing behavior unchanged.

web/src/mocks/handlers/providers.ts (3)

151-154: LGTM!

The handler correctly returns an empty paginated envelope for the now cursor-paginated /providers endpoint, matching the backend change to PaginatedResponse[ProviderResponse].


193-195: LGTM!

The handler correctly returns an empty paginated envelope for the paginated /providers/:name/models endpoint.


345-382: LGTM!

The new fixture builders (buildProviderAuditEvent, buildRateLimitsConfig, buildPresetOverride) follow the established pattern with sensible defaults and override support. They will help tests construct realistic mock data.

src/synthorg/api/controllers/setup_personality.py (1)

137-183: LGTM!

The pagination implementation is clean and follows the established cursor pagination pattern:

  • Accepts cursor and limit parameters with appropriate types
  • Sorts presets by name for stable ordering before pagination
  • Uses paginate_cursor with the cursor secret for HMAC-signed cursors
  • Returns properly typed PaginatedResponse[PersonalityPresetInfoResponse]
src/synthorg/api/controllers/providers.py (2)

206-229: LGTM!

The list_providers endpoint correctly implements cursor pagination:

  • Stable ordering by provider name (sorted(providers))
  • Passes the name kwarg to to_provider_response for the paginated list context
  • Uses paginate_cursor with the cursor secret

257-325: LGTM!

The list_models endpoint correctly implements cursor pagination with stable ordering by model id. The graceful fallback to static model data when capability enrichment fails (via except* on retry/rate-limit exhaustion) is preserved.

src/synthorg/api/controllers/scaling.py (2)

155-207: LGTM!

The list_strategies endpoint correctly implements cursor pagination:

  • Returns properly structured empty PaginatedResponse with PaginationMeta when scaling service is unavailable
  • Sorts strategies by name for stable keyset ordering
  • Uses the standard paginate_cursor helper

256-322: LGTM!

The list_signals endpoint correctly implements cursor pagination while preserving the existing signal aggregation logic (live context fallback to decision history). Signals are sorted by name before pagination for stable ordering.

tests/unit/api/controllers/test_setup.py (2)

718-734: LGTM!

The test correctly validates the new paginated response shape with body["data"] and body["pagination"] structure. The tampered cursor rejection test is a good addition for security coverage.


761-790: LGTM!

The cursor pagination round-trip test properly verifies that walking pages with limit=1 collects all agents without duplicates or gaps. This ensures the keyset pagination implementation is correct.
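
For context, the idea of an HMAC-signed keyset cursor and a limit=1 page walk can be sketched as follows. The helper names (`encode_cursor`, `decode_cursor`, `paginate_by_name`) and the cursor layout are assumptions made for this illustration; the repository's paginate_cursor helper may be shaped differently.

```python
import base64
import hashlib
import hmac
import json

_SIG_LEN = hashlib.sha256().digest_size  # 32-byte signature prefix


def encode_cursor(after: str, secret: bytes) -> str:
    """Sign the last-seen key so clients cannot forge or edit the cursor."""
    payload = json.dumps({"after": after}).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(sig + payload).decode()


def decode_cursor(cursor: str, secret: bytes) -> str:
    """Return the last-seen key, raising ValueError for a tampered cursor."""
    # Bad base64 padding raises binascii.Error, a ValueError subclass.
    raw = base64.urlsafe_b64decode(cursor.encode())
    sig, payload = raw[:_SIG_LEN], raw[_SIG_LEN:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid cursor")
    return json.loads(payload)["after"]


def paginate_by_name(names, *, cursor, limit, secret):
    """Keyset pagination over names in sorted order."""
    ordered = sorted(names)
    after = decode_cursor(cursor, secret) if cursor else None
    remaining = [n for n in ordered if after is None or n > after]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    next_cursor = encode_cursor(page[-1], secret) if has_more else None
    return page, next_cursor, has_more
```

In a controller, the ValueError from `decode_cursor` would be mapped to an HTTP 400, which is the behavior the tampered-cursor tests check.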

src/synthorg/api/controllers/settings.py (3)

151-179: LGTM!

The new typed Pydantic response models (SinkRotationResponse, SinkInfoResponse) provide strong typing for the paginated sink list endpoint. The models correctly use frozen=True, allow_inf_nan=False, and extra="forbid" as required by coding guidelines, and use NotBlankStr for identifier fields.


181-209: LGTM!

The _sink_to_response helper cleanly converts SinkConfig to the typed API response model, handling the console vs file sink identifier derivation and optional rotation settings.


539-600: LGTM!

The list_sinks endpoint correctly implements cursor pagination:

  • Accepts cursor and limit parameters
  • Orders sinks by identifier for stable keyset ordering
  • Uses paginate_cursor with the cursor secret
  • Returns strongly-typed PaginatedResponse[SinkInfoResponse]
  • Falls back to defaults-only list on configuration errors

Comment thread src/synthorg/api/controllers/events.py Outdated
Comment thread src/synthorg/api/controllers/setup.py Outdated
Comment thread src/synthorg/providers/management/local_models.py
Comment thread src/synthorg/providers/management/local_models.py Outdated
Comment thread tests/integration/persistence/test_perf_indices_postgres.py Outdated
Comment thread tests/integration/persistence/test_perf_indices_sqlite.py Outdated
Comment thread tests/unit/api/controllers/test_settings_sinks.py Outdated
Comment thread tests/unit/api/controllers/test_setup.py
Comment thread web/src/mocks/handlers/setup.ts Outdated
@codecov

codecov Bot commented May 8, 2026

Codecov Report

❌ Patch coverage is 83.88889% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.76%. Comparing base (b7b9a59) to head (ece7ed5).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/synthorg/api/controllers/settings.py | 71.05% | 10 Missing and 1 partial ⚠️ |
| src/synthorg/api/controllers/scaling.py | 16.66% | 10 Missing ⚠️ |
| src/synthorg/api/controllers/providers.py | 70.58% | 5 Missing ⚠️ |
| src/synthorg/api/controllers/memory.py | 91.89% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@           Coverage Diff            @@
##             main    #1822    +/-   ##
========================================
  Coverage   84.76%   84.76%            
========================================
  Files        1798     1798            
  Lines      104306   104461   +155     
  Branches     9128     9146    +18     
========================================
+ Hits        88415    88547   +132     
- Misses      13676    13695    +19     
- Partials     2215     2219     +4     


@Aureliolo Aureliolo force-pushed the perf/performance-data-integrity branch from e2bf3fb to c8cf96d Compare May 8, 2026 18:58
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 8, 2026 19:00 — with GitHub Actions Inactive

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/api/controllers/settings.py (1)

319-320: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use DEFAULT_LIMIT instead of bare 50 for consistency.

Line 320 uses a bare numeric literal 50 while line 544 uses DEFAULT_LIMIT. This is inconsistent and violates the magic numbers guideline.

Proposed fix
     @get()
     async def list_all_settings(
         self,
         state: State,
         cursor: CursorParam = None,
-        limit: CursorLimit = 50,
+        limit: CursorLimit = DEFAULT_LIMIT,
     ) -> PaginatedResponse[SettingEntry]:

As per coding guidelines: "Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/settings.py` around lines 319 - 320, Replace the
hard-coded default 50 for the `limit` parameter in the function signature (the
`cursor: CursorParam = None, limit: CursorLimit = 50,` declaration) with the
module constant DEFAULT_LIMIT used elsewhere; import DEFAULT_LIMIT from the
appropriate settings definitions module (the same definitions namespace where
DEFAULT_LIMIT is defined and referenced on line 544) and set `limit: CursorLimit
= DEFAULT_LIMIT` so the function uses the centralized limit constant instead of
a magic number.
♻️ Duplicate comments (1)
web/src/mocks/handlers/setup.ts (1)

115-116: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Keep these paginated mocks tied to the endpoint return types.

The handlers use emptyPaginatedEnvelope<T>(), which only constrains the item DTO. Envelope drift will not be caught when getAgents() or listPersonalityPresets() change. Wire these handlers through the endpoint-typed helpers (paginatedFor<typeof getAgents>(emptyPage<SetupAgentSummary>()) and paginatedFor<typeof listPersonalityPresets>(emptyPage<PersonalityPresetInfo>())) so the mocks stay 1:1 with the client contracts.

As per coding guidelines: "Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types".

♻️ Proposed fix
 import type {
   completeSetup,
   createAgent,
   createCompany,
+  getAgents,
   getAvailableLocales,
   getNameLocales,
   getSetupStatus,
+  listPersonalityPresets,
   listTemplates,
   randomizeAgentName,
   saveNameLocales,
   updateAgentModel,
   updateAgentName,
   updateAgentPersonality,
 } from '@/api/endpoints/setup'
   http.get('/api/v1/setup/agents', () =>
-    HttpResponse.json(emptyPaginatedEnvelope<SetupAgentSummary>()),
+    HttpResponse.json(
+      paginatedFor<typeof getAgents>(emptyPage<SetupAgentSummary>()),
+    ),
   ),
   http.get('/api/v1/setup/personality-presets', () =>
-    HttpResponse.json(emptyPaginatedEnvelope<PersonalityPresetInfo>()),
+    HttpResponse.json(
+      paginatedFor<typeof listPersonalityPresets>(
+        emptyPage<PersonalityPresetInfo>(),
+      ),
+    ),
   ),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web/src/mocks/handlers/setup.ts` around lines 115 - 116, The paginated MSW
handlers currently use emptyPaginatedEnvelope<T>(), which only types the items
and can drift from endpoint contracts; update each handler to use the
endpoint-typed helpers instead (e.g., replace usages like
http.get('/api/v1/setup/agents',
HttpResponse.json(emptyPaginatedEnvelope<SetupAgentSummary>())) with
paginatedFor<typeof getAgents>(emptyPage<SetupAgentSummary>()) and similarly use
paginatedFor<typeof listPersonalityPresets>(emptyPage<PersonalityPresetInfo>())
for the presets handler) so the mock envelopes are tied 1:1 to the getAgents and
listPersonalityPresets endpoint return types.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/controllers/memory.py`:
- Around line 67-70: Move the magic-number tuning knobs out of the controller:
remove _DEFAULT_BATCH_SIZE, _MIN_DOCS_REQUIRED, and _MIN_DOCS_RECOMMENDED from
src/synthorg/api/controllers/memory.py and add them as named constants in the
appropriate settings/definitions module (e.g.,
src/synthorg/settings/definitions/<relevant_namespace>.py) following the
existing pattern for thresholds; then import those constants into memory.py and
replace the local literals/usages (also adjust the other occurrences referenced
around lines 147-149) so all business-logic numeric thresholds live in
settings/definitions and the controller only imports them.

In `@src/synthorg/api/controllers/settings.py`:
- Around line 189-195: The code wraps an empty string in NotBlankStr when
sink.file_path is None for non-CONSOLE sinks, causing validation errors; change
the identifier computation so that for non-CONSOLE sinks you fallback to a
non-empty token derived from the sink type instead of "" (e.g., use
sink.file_path or fallback to sink.sink_type.name/value). Update the expression
that assigns identifier (currently using CONSOLE_SINK_ID, SinkType.CONSOLE,
sink.file_path) so it becomes: CONSOLE_SINK_ID if sink.sink_type ==
SinkType.CONSOLE else (sink.file_path or sink.sink_type.name), then pass that
into NotBlankStr when constructing SinkInfoResponse.

In `@src/synthorg/api/rate_limits/policies.py`:
- Around line 104-105: The hard-coded numeric defaults for the "events.stream"
policy (and the other literals around lines 224-231) must be moved into the
settings definitions module and referenced from the registry; create appropriate
constants in src/synthorg/settings/definitions/events.py (e.g.
EVENTS_STREAM_RATE, EVENTS_STREAM_CONCURRENCY, etc.), export them, then replace
the numeric literals in src/synthorg/api/rate_limits/policies.py (the
"events.stream" key and the other affected entries) to reference those constants
instead of embedding 60, 4, 2, etc.; ensure import paths and names match the new
definitions so the registry reads values from settings/definitions rather than
hard-coded numbers.

In `@src/synthorg/providers/management/local_models.py`:
- Around line 68-75: The helper _sanitize_ollama_error currently logs
PROVIDER_MODEL_PULL_FAILED when raw is not a string, causing duplicate logs
because _parse_pull_line also logs the same event; remove the logger.warning
call from _sanitize_ollama_error so it becomes side-effect free (just return the
fallback "Pull failed" for non-string raw), and instead add the error_type
context to the existing logging call in _parse_pull_line (use type(raw).__name__
as error_type) so the caller emits the single, contextualized
PROVIDER_MODEL_PULL_FAILED log.
- Around line 37-44: The POSIX and Windows path regexes (_OLLAMA_PATH_POSIX and
_OLLAMA_PATH_WIN) stop at the first space and thus leak parts of paths like
"C:\Program Files\..." or "/var/lib/ollama/model cache/..."; update both
patterns to include space characters inside path segment character classes (e.g.
allow \s or literal space alongside \w.\- and backslash where applicable) so
each pattern continues consuming segments that contain spaces, and add
regression tests that assert full-redaction for paths with spaces (one POSIX
example like "/var/lib/ollama/model cache/file.bin" and one Windows example like
"C:\Program Files\Ollama\token.json") to prevent regressions.
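
Both `local_models.py` findings can be illustrated with one small sketch: a sanitizer that has no logging side effects and path patterns whose directory segments may contain single spaces. The patterns below are hypothetical stand-ins, not the project's `_OLLAMA_PATH_POSIX` / `_OLLAMA_PATH_WIN`, and the `<path>` placeholder is an assumption. One known limitation: a space inside the final path component is not consumed, because the end of a path is ambiguous in free text.

```python
import re

# Directory segments are runs of [\w.-] tokens separated by single spaces and
# must end with a separator; the final component is a single token.
_PATH_WIN = re.compile(r"[A-Za-z]:\\(?:[\w.\-]+(?: [\w.\-]+)*\\)*[\w.\-]+")
_PATH_POSIX = re.compile(r"(?<![\w/])/(?:[\w.\-]+(?: [\w.\-]+)*/)*[\w.\-]+")
_FALLBACK = "Pull failed"


def sanitize_error(raw: object) -> str:
    """Redact filesystem paths from an error message.

    The helper does no logging. The caller is expected to emit the single
    failure log, adding type(raw).__name__ as context when raw is not a str.
    """
    if not isinstance(raw, str):
        return _FALLBACK
    redacted = _PATH_WIN.sub("<path>", raw)
    return _PATH_POSIX.sub("<path>", redacted)
```

The two examples from the review comment make natural regression cases: each should be replaced by the placeholder in full, with no directory names left behind.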

In `@tests/integration/persistence/test_perf_indices_sqlite.py`:
- Around line 68-79: The helper _explain_plan currently interpolates raw SQL
into f-strings (EXPLAIN QUERY PLAN {sql} and ANALYZE {analyze}); tighten this by
enforcing an allowlist: define a small set of permitted SQL statements or
prefixes (e.g., allowed_sql_prefixes or allowed_statements) and a small set of
permitted analyze targets (allowed_analyze_tokens), validate that the incoming
sql string matches one of the allowed entries/prefixes and that analyze (if
provided) is in allowed_analyze_tokens, and only then build the EXPLAIN/ANALYZE
statements; if validation fails, raise ValueError. Ensure you still pass
parameters via db.execute(cursor) using the existing params tuple and stop using
raw interpolation for unvalidated inputs in _explain_plan, referencing the
function name _explain_plan, variables sql and analyze, and the db.execute call
to locate where to apply the checks.

In `@tests/unit/meta/telemetry/test_emitter.py`:
- Around line 425-432: The test currently only closes em._client, which leaves
the emitter's background flush task running because _enqueue() may have started
_flush_task and _closed is never set; update the cleanup in the finally block to
call the emitter's proper close method (await em.aclose() or whatever public
close/async-close method exists) so that _closed is set and _flush_task is
cancelled before closing em._client, ensuring no periodic flush task is leaked
(refer to _enqueue(), _flush_task, and _closed to locate the relevant teardown).

In `@web/src/__tests__/utils/fetch-with-retry.test.ts`:
- Around line 144-205: Add two regression tests in fetch-with-retry.test.ts: (1)
a case where the input to fetchWithRetryAfter is a Request object (not a URL
string) to ensure idempotency detection and retry logic still read the method
from the Request; construct a Request with method 'GET'/'PUT' etc., mock
fetchImpl to return a 429 then 200 and assert expected status and call counts.
(2) a case exercising the built-in/default sleep path (i.e. do not pass a custom
sleep) and ensure AbortController aborting while the internal timer is waiting
short-circuits the retry: mock fetchImpl to return 429 with a Retry-After
header, start the call with controller.signal, abort the controller during the
timer interval (e.g. via setTimeout in the test) and assert the response stays
429 and only the initial fetch ran. Ensure both tests reference
fetchWithRetryAfter and use makeResponse/fetchImpl mocks consistent with
existing tests.

In `@web/src/mocks/handlers/scaling.ts`:
- Around line 17-19: Replace uses of the untyped envelope helper
emptyPaginatedEnvelope<T>() in the MSW handlers with the typed paginatedFor
helper so mocks remain tied to the endpoint contracts: for the
'/api/v1/scaling/strategies' handler call paginatedFor<typeof
getScalingStrategies>(emptyPage<ScalingStrategyResponse>()) and for the signals
handler call paginatedFor<typeof
getScalingSignals>(emptyPage<ScalingSignalResponse>()), ensuring you import
paginatedFor and emptyPage and reference getScalingStrategies/getScalingSignals
so the mock envelope type matches the endpoint return type.

In `@web/src/mocks/handlers/settings.ts`:
- Around line 66-67: Replace the untyped envelope helper in the MSW handler for
the GET '/api/v1/settings/observability/sinks' route: instead of returning
emptyPaginatedEnvelope<SinkInfo>(), return a typed paginated envelope tied to
the actual client return type by using paginatedFor<typeof
listSinks>(emptyPage<SinkInfo>()). This ensures the handler's shape follows
listSinks' contract and prevents envelope drift; update the handler that
currently calls emptyPaginatedEnvelope<SinkInfo>() to use paginatedFor and
emptyPage as described.

In `@web/src/utils/fetch-with-retry.ts`:
- Around line 55-79: The idempotency check currently only inspects init via
methodOf and hasIdempotencyKey, which misses when the caller passed a Request
object; update methodOf and hasIdempotencyKey (or add overloads) to accept the
original input parameter (which may be a Request or string) and prefer
extracting method and headers from a Request input when present, falling back to
init; then update isRetriable to pass both input and init (instead of only init)
so it correctly detects request.method and request.headers for retriability
(also apply the same change to the similar logic around lines 95-103).
- Around line 49-53: defaultSleep currently ignores AbortSignal so if an
AbortSignal fires during the timer the sleep still waits to expiry; update
defaultSleep to accept an optional AbortSignal parameter, create the timeout
with window.setTimeout, attach an abort listener that clears the timer and
rejects the promise (with an AbortError or DOMException) immediately, and remove
the abort listener on resolve; then update every caller (e.g., places that call
defaultSleep in fetchWithRetry and any other uses) to pass the request/operation
signal so sleeps are cancelled immediately when aborted.

In `@web/src/utils/retry-after.ts`:
- Around line 55-56: The log call is writing an attacker-controlled value
`trimmed` directly into logs; wrap the header value with the sanitizer before
logging. Update the `log.warn('Malformed Retry-After header', { value: trimmed
})` call to use `sanitizeForLog(trimmed)` (import `sanitizeForLog` if missing)
so it becomes `log.warn('Malformed Retry-After header', { value:
sanitizeForLog(trimmed) })`, leaving the surrounding logic (and the `return 0`)
unchanged.

---

Outside diff comments:
In `@src/synthorg/api/controllers/settings.py`:
- Around line 319-320: Replace the hard-coded default 50 for the `limit`
parameter in the function signature (the `cursor: CursorParam = None, limit:
CursorLimit = 50,` declaration) with the module constant DEFAULT_LIMIT used
elsewhere; import DEFAULT_LIMIT from the appropriate settings definitions module
(the same definitions namespace where DEFAULT_LIMIT is defined and referenced on
line 544) and set `limit: CursorLimit = DEFAULT_LIMIT` so the function uses the
centralized limit constant instead of a magic number.

---

Duplicate comments:
In `@web/src/mocks/handlers/setup.ts`:
- Around line 115-116: The paginated MSW handlers currently use
emptyPaginatedEnvelope<T>(), which only types the items and can drift from
endpoint contracts; update each handler to use the endpoint-typed helpers
instead (e.g., replace usages like http.get('/api/v1/setup/agents',
HttpResponse.json(emptyPaginatedEnvelope<SetupAgentSummary>())) with
paginatedFor<typeof getAgents>(emptyPage<SetupAgentSummary>()) and similarly use
paginatedFor<typeof listPersonalityPresets>(emptyPage<PersonalityPresetInfo>())
for the presets handler) so the mock envelopes are tied 1:1 to the getAgents and
listPersonalityPresets endpoint return types.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 01e3cf30-093f-4d41-975d-b5b1345e306e

📥 Commits

Reviewing files that changed from the base of the PR and between e2bf3fb and c8cf96d.

⛔ Files ignored due to path filters (2)
  • src/synthorg/persistence/postgres/revisions/atlas.sum is excluded by !**/*.sum
  • src/synthorg/persistence/sqlite/revisions/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (59)
  • docs/reference/conventions.md
  • scripts/loop_bound_init_baseline.txt
  • scripts/mock_spec_baseline.txt
  • scripts/no_magic_numbers_baseline.txt
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_models.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/persistence/postgres/revisions/20260508131900_perf_indices_cost_decision.sql
  • src/synthorg/persistence/postgres/schema.sql
  • src/synthorg/persistence/sqlite/revisions/20260508131842_perf_indices_cost_decision.sql
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/providers/management/local_models.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/test_dto_forbid_extra.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/providers/management/test_local_models.py
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/client.ts
  • web/src/api/endpoints/providers.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/api/endpoints/setup.ts
  • web/src/api/types/providers.ts
  • web/src/api/types/setup.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/stores/websocket.ts
  • web/src/utils/app-version.ts
  • web/src/utils/constants.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
💤 Files with no reviewable changes (3)
  • tests/unit/api/test_dto_forbid_extra.py
  • web/src/api/types/setup.ts
  • src/synthorg/api/controllers/setup_models.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Build Backend
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (12)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit / integration / e2e / slow.
Mock-spec gate: every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt; regenerate via uv run python scripts/check_mock_spec.py --update. Without spec= mocks silently absorb every attribute access.
Shared mocks: use mock_dispatcher from tests/conftest.py (AsyncMock(spec=NotificationDispatcher)).
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Logger spying antipattern: never monkeypatch.setattr(module.logger, "info", spy); the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead.
Parametrize: prefer @pytest.mark.parametrize for similar cases.
Property-based: Hypothesis (Python), fast-check (React), testing.F (Go). CI runs 10 deterministic examples (derandomize=True). Hypothesis failures are real bugs: fix the bug and add an @example(...) decorator.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/rate_limits/test_policies.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/rate_limits/test_policies.py
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/mocks/handlers/settings.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/endpoints/providers.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/stores/websocket.ts
  • web/src/utils/retry-after.ts
  • web/src/utils/constants.ts
  • web/src/api/endpoints/setup.ts
  • web/src/utils/app-version.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/api/types/providers.ts
  • web/src/mocks/handlers/index.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/api/endpoints/settings.ts
  • web/src/mocks/handlers/providers.ts
web/src/mocks/handlers/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/mocks/handlers/**/*.ts: Mirror every exported endpoint in web/src/api/endpoints/*.ts with a 1:1 default happy-path MSW handler in web/src/mocks/handlers/; boot test-setup with onUnhandledRequest: 'error' and override per-case via server.use(...), never vi.mock('@/api/endpoints/*')
Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types

Files:

  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/providers.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Reuse components from web/src/components/ui/. Never hardcode hex colors, font-family, pixel spacing, Motion transitions, or BCP 47 locale strings; use design tokens, @/lib/motion presets, helpers in @/utils/format. Enforced by scripts/check_web_design_system.py.

Files:

  • web/src/mocks/handlers/settings.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/endpoints/providers.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/stores/websocket.ts
  • web/src/utils/retry-after.ts
  • web/src/utils/constants.ts
  • web/src/api/endpoints/setup.ts
  • web/src/utils/app-version.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/api/types/providers.ts
  • web/src/mocks/handlers/index.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/api/endpoints/settings.ts
  • web/src/mocks/handlers/providers.ts
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.md: Use fenced code blocks with language tags: d2 for architecture/nested containers, mermaid for flowcharts/sequence/pipelines. Use markdown tables for tabular data; never use text fences with ASCII box-drawing.
Static historical counts and illustrative scale numbers may carry a per-line opt-out: <!-- lint-allow: doc-numeric-macros -- <reason> --> (reason mandatory). Enforced by scripts/check_doc_numeric_macros.py (pre-push).

Files:

  • docs/reference/conventions.md
web/src/stores/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/stores/**/*.ts: All store mutation actions (create / update / delete) must follow the stores/connections/crud-actions.ts pattern: wrap API calls in try/catch, success updates state + emits success toast, failure logs + emits error toast + returns sentinel (null for entity, false for delete); callers MUST NOT wrap store mutation calls in try/catch
List-read store actions must set error: string | null on the store instead of toasting; use opaque cursor-based pagination via PaginationMeta, keep nextCursor + hasMore in state (not offset arithmetic), and early-return when !hasMore || !nextCursor
Always capture previous synchronously in optimistic mutations and restore in the catch block
Any new Zustand store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global afterEach in test-setup.tsx
Store files over ~600 lines must be sliced into packages with one of two aggregation patterns: package-internal index.ts or sibling .ts aggregator

Files:

  • web/src/stores/websocket.ts
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every cost-bearing Pydantic model carries currency: CurrencyCode; mixing raises MixedCurrencyAggregationError (HTTP 409). Aggregations over cost-bearing fields call assert_currencies_match before reducing.
Direct os.environ.get(...) outside startup is forbidden. Ghost-wired settings (consuming service never instantiated at boot) are flagged by scripts/check_setting_to_startup_trace.py; per-setting opt-out via # lint-allow: bootstrap-wiring -- <reason>.
Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal. Bare module-level _FOO = 1024 constants and bare numeric defaults (def f(timeout=30)) are forbidden.
Allowlisted numeric literals: 0, 1, -1 (sentinel/off-by-one), HTTP status codes 100-599 in status_code= defaults, hex bit-masks (0xff, 0x80), powers-of-2 in buffering= / chunk_size= / buffer_size= defaults, anything inside settings/definitions/, persistence/migrations/, observability/events/. Per-line opt-out: # lint-allow: magic-numbers -- <reason> (mandatory non-empty justification). Enforced by scripts/check_no_magic_numbers.py.
Comments explain WHY only, never origin / review / issue context. Forbidden: reviewer citations (`pre-PR review #N`), in-code issue back-refs (`(#1682)`), naked `SEC-1` taxonomy in `src/`, migration framing (`ported from`), round narrative, self-evident restatements.
Keep in comments: hidden constraints, subtle invariants, upstream-bug workarounds (with stable bug-tracker URL), why a non-obvious choice was made. Enforced by `scripts/check_no_review_origin_in_code.py` and `scripts/check_no_migration_framing.py` (pre-push); per-line opt-outs `# lint-allow: review-origin -- <reason>` and `# lint-allow: migration-framing -- <reason>`.
No `from __future__ import annotations`: Python 3.14 has PEP 649.
PEP 758 except: `except A, B:` (no parens) when not bin...

Files:

  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
src/synthorg/**/{api,services,repositories}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService); services centralize audit logging; repositories must not log mutations themselves.

Files:

  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/memory.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes the socket with code 4011.

Files:

  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/memory.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
src/synthorg/providers/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/providers/**/*.py: All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry in driver subclasses or calling code.
RetryConfig / RateLimiterConfig set per-provider in ProviderConfig. Retryable: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable raise immediately.

Files:

  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/management/dtos.py
web/src/utils/constants.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Keep the WebSocket wire protocol constants (WS_PROTOCOL_VERSION, WS_MAX_MESSAGE_SIZE, WS_HEARTBEAT_INTERVAL_MS, WS_PONG_TIMEOUT_MS, LOG_SANITIZE_MAX_LENGTH) in web/src/utils/constants.ts in lockstep with src/synthorg/api/ws_models.py / src/synthorg/api/controllers/ws.py; bump protocol version on both sides together for breaking payload changes

Files:

  • web/src/utils/constants.ts
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/integration/api/controllers/test_providers.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/management/local_models.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • src/synthorg/providers/management/dtos.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • src/synthorg/api/controllers/providers.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/api/controllers/test_setup.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • tests/unit/api/rate_limits/test_policies.py
🪛 OpenGrep (1.20.0)
tests/integration/persistence/test_perf_indices_sqlite.py

[ERROR] 77-77: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)


[ERROR] 78-78: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)

🔇 Additional comments (54)
web/src/utils/constants.ts (1)

12-23: Jitter constants are cleanly centralized and well documented.

Good addition of WS_RECONNECT_JITTER_MIN/MAX with rationale; this keeps reconnect behavior tunable and testable from one place.

web/src/stores/websocket.ts (1)

21-23: Reconnect jitter is implemented correctly and defensively.

Using shared jitter bounds with a rounded, min-1ms clamped delay is a solid implementation for de-correlating reconnect bursts without risking zero-delay retries.

Also applies to: 232-248

web/src/__tests__/stores/websocket.test.ts (1)

425-428: Jitter test coverage is strong and deterministic.

Nice update to bound the reconnect timer test and explicitly assert the jitter formula across representative Math.random() inputs.

Also applies to: 434-475

src/synthorg/persistence/sqlite/revisions/20260508131842_perf_indices_cost_decision.sql (1)

1-6: Composite index migration looks correct.

The three new indexes match the targeted query shapes and align with the schema/index test intent.

src/synthorg/persistence/sqlite/schema.sql (1)

108-111: Schema index additions are aligned and well-scoped.

These composite indexes are consistent with the migration and support the intended filtered-order pagination paths.

Also applies to: 648-649

src/synthorg/persistence/postgres/schema.sql (1)

94-97: Postgres schema index updates look good.

The new composites are correctly defined and consistent with the cursor-query plan expectations.

Also applies to: 663-664

src/synthorg/persistence/postgres/revisions/20260508131900_perf_indices_cost_decision.sql (1)

1-6: Migration DDL is coherent with schema and tests.

Index names and column order align with the intended planner assertions.

tests/integration/persistence/test_perf_indices_postgres.py (1)

125-159: Decision-record fixture shape is now robust for planner validation.

Seeding many rows for one task_id materially strengthens the index-usage assertion for ORDER BY recorded_at, id.

tests/integration/persistence/test_perf_indices_sqlite.py (1)

125-151: Great fix on decision-record seed distribution.

Using many rows for a single task makes the (task_id, recorded_at, id) plan assertion much more reliable.
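
For readers who have not seen the test file, here is a minimal sketch of what an EXPLAIN-asserting check for the (task_id, recorded_at, id) index could look like on SQLite. The trimmed-down table, the index name `idx_dr_task_recorded_id`, and the `query_plan` helper are assumptions for illustration only, not the names used in this PR. The query is parameterized rather than built with an f-string.

```python
import sqlite3

# Hypothetical, trimmed-down schema; the real decision_records table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE decision_records ("
    "id TEXT PRIMARY KEY, task_id TEXT NOT NULL, recorded_at TEXT NOT NULL)"
)
# Hypothetical index name; the column order mirrors (task_id, recorded_at, id).
conn.execute(
    "CREATE INDEX idx_dr_task_recorded_id "
    "ON decision_records (task_id, recorded_at, id)"
)

# Seed many rows for ONE task_id so the planner has a reason to prefer the
# composite index for both the filter and the ORDER BY.
rows = [
    (f"d{i:04d}", "task-1", f"2026-05-08T00:{i // 60:02d}:{i % 60:02d}Z")
    for i in range(500)
]
conn.executemany("INSERT INTO decision_records VALUES (?, ?, ?)", rows)


def query_plan(connection: sqlite3.Connection, task_id: str) -> str:
    """Return the EXPLAIN QUERY PLAN detail column joined into one string."""
    cursor = connection.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id FROM decision_records "
        "WHERE task_id = ? ORDER BY recorded_at, id",
        (task_id,),
    )
    # Each plan row is (id, parent, notused, detail); detail is at index 3.
    return " | ".join(row[3] for row in cursor.fetchall())


plan = query_plan(conn, "task-1")
```

A test built this way would assert that the index name appears in `plan` and that no temporary B-tree sort step is reported.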

src/synthorg/providers/drivers/litellm_driver.py (1)

136-184: Clock seam integration is solid and preserves cache semantics.

clock injection and self._clock.monotonic() usage are consistent with deterministic testing while keeping production defaults unchanged.

src/synthorg/meta/telemetry/emitter.py (1)

103-240: Clock-backed flush timestamp bookkeeping looks correct.

The constructor and flush path now consistently read monotonic time from the injected clock seam, which improves determinism without changing runtime defaults.

tests/unit/providers/drivers/test_litellm_auth.py (1)

164-257: Great TTL-boundary coverage for credential cache behavior.

The new FakeClock tests meaningfully validate cache hit/miss boundaries and OAuth non-caching behavior with deterministic timing.

src/synthorg/memory/embedding/fine_tune_orchestrator.py (1)

80-96: Progress-throttle clock seam is implemented correctly.

Injecting and binding clock for monotonic checks makes the throttle path deterministic for tests while preserving production behavior.

Also applies to: 601-636

scripts/loop_bound_init_baseline.txt (1)

32-32: Baseline coordinate update looks correct.

The updated line mapping for FineTuneOrchestrator:_op_lock aligns with the constructor offset introduced in this PR.

tests/unit/memory/embedding/test_fine_tune_orchestrator.py (1)

204-259: Strong deterministic test for throttle timing behavior.

This test cleanly validates pre-boundary suppression and post-boundary emission using the injected clock seam.

web/src/api/client.ts (1)

14-18: Good centralization of retry-after constants/parsing.

This import switch removes duplicated retry policy logic and keeps retry behavior sourced from one module.

web/src/utils/app-version.ts (1)

122-135: Retry-after integration at logout call site looks correct.

Opting in with idempotent: true here is appropriate for replay-safe logout semantics.

tests/unit/api/rate_limits/test_policies.py (1)

10-13: Strong policy test expansion for inflight concurrency.

The added structure/default/lookup assertions materially improve guardrail coverage around INFLIGHT_POLICIES and per_op_concurrency_from_policy.

Also applies to: 195-277

src/synthorg/api/controllers/events.py (1)

22-25: Policy-based concurrency wiring for SSE stream is a solid fix.

Using per_op_concurrency_from_policy("events.stream", key="user") keeps limits centralized and tunable via the policy registry.

Also applies to: 530-537

tests/unit/api/rate_limits/test_controller_coverage.py (1)

143-147: Nice addition to endpoint guard coverage.

Pinning events.py::stream -> events.stream in _GUARDED_ENDPOINTS makes policy-key drift fail loudly.

tests/unit/api/controllers/test_providers.py (1)

25-31: Pagination and tampered-cursor coverage is aligned with the new API contract.

Nice update to assert the paginated empty envelope and reject invalid cursors with 400.

web/src/mocks/handlers/index.ts (1)

53-53: Barrel export update looks good.

Re-exporting emptyPaginatedEnvelope here is the right move for consistent mock usage.

web/src/api/types/providers.ts (1)

79-85: ProviderConfig.name contract update is clear and appropriate.

The optional nullable name field cleanly captures paginated-list vs single-resource response differences.

tests/integration/api/controllers/test_providers.py (1)

78-83: Integration assertion update is solid for paginated providers.

Mapping by embedded name is a good, pagination-safe way to assert the DB override entry.

docs/reference/conventions.md (1)

54-58: Conventions doc update matches the new cursor-pagination wire format.

web/src/mocks/handlers/helpers.ts (1)

158-176: emptyPaginatedEnvelope helper looks correct and well-scoped.

This is a clean utility for empty paginated wire responses when paginatedFor isn’t the right fit.

web/src/__tests__/stores/sinks.test.ts (1)

9-17: Sinks store tests now correctly model paginated success responses.

Good update to keep MSW fixtures in sync with the new PaginatedResponse<SinkInfo> contract.

Also applies to: 49-50, 67-70, 117-119

web/src/api/endpoints/settings.ts (1)

51-58: listSinks pagination migration is implemented correctly.

Using paginateAll with cursor propagation and unwrapPaginated is the right client-side adaptation.

web/src/api/endpoints/scaling.ts (2)

42-51: LGTM!

The cursor pagination logic correctly constructs URLSearchParams, conditionally appends the cursor, and calls paginateAll with the proper unwrapper. The pattern is consistent with other paginated endpoints in the codebase.


70-79: LGTM!

Identical cursor pagination pattern applied consistently to scaling signals.

tests/unit/api/controllers/test_setup_agent_ops.py (2)

236-239: LGTM!

The test correctly validates the new cursor-paginated response shape with data as the top-level list and pagination metadata.


314-323: LGTM!

Using any(...) to search the returned agents list makes the assertion order-independent and robust against future sorting changes.

tests/unit/api/controllers/test_setup_personality.py (2)

42-48: LGTM!

The order-independent assertion with any(...) correctly handles name-sorted agent ordering.


112-131: LGTM!

The test correctly validates the paginated response shape with data as the top-level list.

tests/unit/api/controllers/test_settings_sinks.py (2)

154-187: LGTM!

The pagination round-trip test correctly validates cursor-based pagination by walking pages with limit=1 and asserting the collected results match the full list. The precondition assertion with a clear message ensures the test fails loudly if the fixture regresses.


188-197: LGTM!

The tampered cursor test correctly validates that invalid cursor values return HTTP 400.

src/synthorg/providers/management/dtos.py (2)

418-425: LGTM!

The optional name field is correctly typed as NotBlankStr | None and the docstring clearly explains when it should be populated (paginated list responses) vs left None (single-provider GET-by-path responses).


600-640: LGTM!

The keyword-only name parameter prevents positional misuse, and the updated docstring clearly explains the pagination threading contract. The implementation correctly passes the name through to the ProviderResponse constructor.

tests/unit/api/controllers/test_setup.py (2)

1573-1607: Past review concern addressed.

The redundant assertions after assert collected == full have been removed. The equality check alone proves no duplicates, no gaps, and stable order.


724-790: LGTM!

The test updates for GET /setup/agents correctly validate the new paginated response structure:

  • Empty list assertions check body["data"] and body["pagination"] fields
  • Tampered cursor rejection test ensures 400 response
  • Pagination round-trip test validates cursor follow-through reaches all agents
web/src/api/endpoints/setup.ts (1)

40-49: LGTM!

The cursor pagination implementation for getAgents and listPersonalityPresets correctly follows the established pattern used in other paginated endpoints (providers.ts, scaling.ts, settings.ts). The paginateAll helper aggregates pages transparently, and the URL construction with URLSearchParams is clean.

Also applies to: 105-114

src/synthorg/api/controllers/setup.py (1)

441-455: LGTM! Past review concern properly addressed.

The sorting has been removed and the comment on lines 441-445 clearly explains why persisted-array order must be preserved: PUT/POST handlers resolve agent_index against the same array, so reordering here would cause clients to update the wrong agent. The pagination implementation is now correct.

web/src/api/endpoints/providers.ts (2)

46-70: LGTM!

The listProviders implementation properly aggregates paginated results and includes defense-in-depth against prototype pollution by using Object.create(null) and skipping __proto__, constructor, and prototype keys. The warning for missing name helps surface backend regressions.


225-242: LGTM!

The pullModel switch to fetchWithRetryAfter with { idempotent: true } is appropriate. The comment correctly explains that while POST is non-idempotent by default, the model pull operation itself is safe to retry on 429 since it resolves to the same result on the server. The signal propagation enables abort support.

src/synthorg/api/controllers/setup_personality.py (1)

141-183: LGTM!

The cursor pagination for list_personality_presets is correctly implemented. Unlike agent listing, presets are read-only from a constant registry, so sorting by name is safe here. The implementation follows the established pagination pattern with paginate_cursor and app_state.cursor_secret.

web/src/mocks/handlers/providers.ts (1)

151-154: LGTM!

The MSW handlers are correctly updated to return emptyPaginatedEnvelope for the now-paginated provider list and models list endpoints. This keeps the mock handlers in sync with the backend response shape changes.

Also applies to: 193-194

src/synthorg/api/controllers/scaling.py (1)

155-207: LGTM!

The cursor pagination for list_strategies and list_signals is correctly implemented:

  • Both use DEFAULT_LIMIT for the default page size
  • Empty PaginatedResponse with proper metadata is returned when scaling_service is unavailable
  • Items are sorted by name before pagination for stable cursor behavior
  • Signal deduplication logic is preserved

Also applies to: 256-322

src/synthorg/api/controllers/providers.py (3)

213-232: LGTM!

The list_providers endpoint correctly implements cursor pagination with DEFAULT_LIMIT. Providers are sorted by name for stable cursor behavior, and the response uses PaginatedResponse[ProviderResponse] with proper metadata.


264-328: LGTM!

The list_models endpoint correctly implements cursor pagination. Models are sorted by id for stable cursor behavior. The existing capability enrichment logic is preserved, with the pagination applied after the enrichment step.


556-559: LGTM!

The switch from per_op_concurrency to per_op_concurrency_from_policy for discover_models and pull_model aligns with the PR's policy-driven rate-limit/concurrency approach. This allows the inflight limits to be configured via the policy registry rather than hardcoded values.

Also applies to: 752-755

src/synthorg/api/controllers/settings.py (4)

151-178: LGTM!

The new SinkRotationResponse and SinkInfoResponse models are well-structured with proper ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid") configuration and appropriate use of NotBlankStr for identifier fields.


539-600: LGTM!

The list_sinks endpoint correctly implements cursor-based pagination:

  • Accepts cursor and limit parameters with proper typing.
  • Sorts sinks by identifier for stable keyset ordering.
  • Uses paginate_cursor helper with HMAC-signed cursor secret.
  • Returns properly typed PaginatedResponse[SinkInfoResponse].

634-635: LGTM!

The unparenthesized except MemoryError, RecursionError: syntax is correct PEP 758 for Python 3.14+ when not binding with as. Based on learnings from this repository.

Also applies to: 841-842


880-944: LGTM!

The helper functions are properly updated to work with the new typed SinkInfoResponse model instead of raw dicts. The type annotations are consistent throughout.

Comment thread src/synthorg/api/controllers/memory.py Outdated
Comment thread src/synthorg/api/controllers/settings.py Outdated
Comment thread src/synthorg/api/rate_limits/policies.py Outdated
Comment thread src/synthorg/providers/management/local_models.py Outdated
Comment thread src/synthorg/providers/management/local_models.py
Comment thread web/src/mocks/handlers/scaling.ts
Comment thread web/src/mocks/handlers/settings.ts Outdated
Comment thread web/src/utils/fetch-with-retry.ts Outdated
Comment thread web/src/utils/fetch-with-retry.ts Outdated
Comment thread web/src/utils/retry-after.ts Outdated
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/controllers/memory.py`:
- Around line 735-741: The preflight check currently treats a corpus with
exactly FINE_TUNE_MIN_DOCS_RECOMMENDED as passing; change the comparison in the
preflight logic so that a count at or below the threshold triggers the warning
by using a <= check on the variable count (replace the existing count <
FINE_TUNE_MIN_DOCS_RECOMMENDED condition), keeping the PreflightCheck return
(name="documents", status="warn", message=...) and
FINE_TUNE_MIN_DOCS_RECOMMENDED symbol unchanged.
- Around line 64-68: Change the hard-coded imports of
FINE_TUNE_DEFAULT_BATCH_SIZE, FINE_TUNE_MIN_DOCS_REQUIRED, and
FINE_TUNE_MIN_DOCS_RECOMMENDED to be resolved at runtime from
state.app_state.settings_service inside run_preflight(), then pass those
resolved values into _run_preflight_checks() and _recommend_batch_size() (use
the imported constants only as fallbacks if settings_service returns None).
Update all other call sites you touched (around the other ranges mentioned) to
accept the injected thresholds rather than importing defaults directly. Finally
add a regression test that overrides a memory.fine_tune.* setting via the
SettingsService mock and asserts the preflight output (batch size or min-doc
recommendations) changes accordingly.
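
The two memory.py findings above can be sketched together as follows. This is only an illustration under stated assumptions: the numeric fallback values, the setting key string, the `resolve_threshold` and `check_documents` helpers, and the plain mapping standing in for `SettingsService` are hypothetical; only the constant names, the `PreflightCheck` fields, and the `<=` comparison come from the review text.

```python
from collections.abc import Mapping
from dataclasses import dataclass

# Fallback values are assumptions for this sketch, not the project's defaults.
FINE_TUNE_MIN_DOCS_REQUIRED = 10
FINE_TUNE_MIN_DOCS_RECOMMENDED = 50


@dataclass(frozen=True)
class PreflightCheck:
    name: str
    status: str
    message: str


def resolve_threshold(settings: Mapping[str, int], key: str, fallback: int) -> int:
    """Prefer the runtime setting; use the code constant only when it is unset."""
    value = settings.get(key)
    return fallback if value is None else value


def check_documents(
    count: int, *, min_required: int, min_recommended: int
) -> PreflightCheck:
    """Classify corpus size using thresholds injected by the caller."""
    if count < min_required:
        return PreflightCheck(
            "documents", "fail", f"{count} documents; need at least {min_required}"
        )
    # Inclusive comparison: a corpus exactly at the recommended size still warns.
    if count <= min_recommended:
        return PreflightCheck(
            "documents", "warn", f"{count} documents; more than {min_recommended} recommended"
        )
    return PreflightCheck("documents", "pass", f"{count} documents")
```

Because the thresholds arrive as arguments, a regression test can override the recommended minimum through a mocked settings source and assert that the same document count moves from pass to warn.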

In `@src/synthorg/api/controllers/settings.py`:
- Around line 181-200: The _append_disabled_defaults function duplicates the
identifier logic and mutates the input list; extract the identifier computation
used in _sink_to_response (based on CONSOLE_SINK_ID, SinkType.CONSOLE and
fallback to file_path or f"unnamed-{sink.sink_type.value}") into a single helper
(e.g., get_sink_identifier) and have both _sink_to_response and
_append_disabled_defaults call it, and rewrite _append_disabled_defaults to
return a new list without mutating the caller-owned list by creating copies of
SinkConfig objects via model_copy(update=...) (or deepcopy) when altering
defaults so expansion is pure and consistent with _sink_to_response.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a5e60e90-9231-442e-b143-61e62b2bfcf8

📥 Commits

Reviewing files that changed from the base of the PR and between c8cf96d and 8d49256.

📒 Files selected for processing (10)
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/settings/definitions/memory.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/providers/management/test_local_models.py
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (10)
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Reuse components from web/src/components/ui/. Never hardcode hex colors, font-family, pixel spacing, Motion transitions, or BCP 47 locale strings; use design tokens, @/lib/motion presets, helpers in @/utils/format. Enforced by scripts/check_web_design_system.py.

Files:

  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit / integration / e2e / slow.
Mock-spec gate: every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt; regenerate via uv run python scripts/check_mock_spec.py --update. Without spec= mocks silently absorb every attribute access.
Shared mocks: use mock_dispatcher from tests/conftest.py (AsyncMock(spec=NotificationDispatcher)).
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Logger spying antipattern: never monkeypatch.setattr(module.logger, "info", spy); the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead.
Parametrize: prefer @pytest.mark.parametrize for similar cases.
Property-based: Hypothesis (Python), fast-check (React), testing.F (Go). CI runs 10 deterministic examples (derandomize=True). Hypothesis failures are real bugs: fix the bug and add an @example(...) decorator.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_memory_admin.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_memory_admin.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every cost-bearing Pydantic model carries currency: CurrencyCode; mixing raises MixedCurrencyAggregationError (HTTP 409). Aggregations over cost-bearing fields call assert_currencies_match before reducing.
Direct os.environ.get(...) outside startup is forbidden. Ghost-wired settings (consuming service never instantiated at boot) are flagged by scripts/check_setting_to_startup_trace.py; per-setting opt-out via # lint-allow: bootstrap-wiring -- <reason>.
Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal. Bare module-level _FOO = 1024 constants and bare numeric defaults (def f(timeout=30)) are forbidden.
Allowlisted numeric literals: 0, 1, -1 (sentinel/off-by-one), HTTP status codes 100-599 in status_code= defaults, hex bit-masks (0xff, 0x80), powers-of-2 in buffering= / chunk_size= / buffer_size= defaults, anything inside settings/definitions/, persistence/migrations/, observability/events/. Per-line opt-out: # lint-allow: magic-numbers -- <reason> (mandatory non-empty justification). Enforced by scripts/check_no_magic_numbers.py.
Comments explain WHY only, never origin / review / issue context. Forbidden: reviewer citations (`pre-PR review #N`), in-code issue back-refs (`(#1682)`), naked `SEC-1` taxonomy in `src/`, migration framing (`ported from`), round narrative, self-evident restatements.
Keep in comments: hidden constraints, subtle invariants, upstream-bug workarounds (with stable bug-tracker URL), why a non-obvious choice was made. Enforced by `scripts/check_no_review_origin_in_code.py` and `scripts/check_no_migration_framing.py` (pre-push); per-line opt-outs `# lint-allow: review-origin -- <reason>` and `# lint-allow: migration-framing -- <reason>`.
No `from __future__ import annotations`: Python 3.14 has PEP 649.
PEP 758 except: `except A, B:` (no parens) when not bin...

Files:

  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
src/synthorg/settings/definitions/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

For every mutable setting: DB > env (SYNTHORG_<NS>_<KEY>) > YAML > code default, resolved through SettingsService / ConfigResolver. Register new settings in src/synthorg/settings/definitions/<namespace>.py.

Files:

  • src/synthorg/settings/definitions/memory.py
src/synthorg/settings/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

First cold read of a setting emits one INFO settings.value.resolved; subsequent reads stay DEBUG. Sanctioned exceptions: init-time only (env-only, no registry entry) and read-only post-init (read_only_post_init=True; set() raises SettingReadOnlyError).

Files:

  • src/synthorg/settings/definitions/memory.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
src/synthorg/providers/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/providers/**/*.py: All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry in driver subclasses or calling code.
RetryConfig / RateLimiterConfig set per-provider in ProviderConfig. Retryable: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable raise immediately.

Files:

  • src/synthorg/providers/management/local_models.py
src/synthorg/**/{api,services,repositories}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService); services centralize audit logging; repositories must not log mutations themselves.

Files:

  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes the socket with code 4011.

Files:

  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/providers/management/test_local_models.py
  • src/synthorg/settings/definitions/memory.py
  • tests/unit/meta/telemetry/test_emitter.py
  • src/synthorg/providers/management/local_models.py
  • tests/unit/api/controllers/test_memory_admin.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
🔇 Additional comments (10)
web/src/utils/retry-after.ts (1)

1-66: LGTM!

The implementation correctly handles RFC 9110 Retry-After parsing for both delta-seconds and HTTP-date formats. The previous review concern about sanitizing attacker-controlled header values has been addressed—sanitizeForLog is now properly used at line 58. The sentinel-based approach (DO_NOT_RETRY) for excessive wait times is a clean way to signal back-pressure to callers.

web/src/utils/fetch-with-retry.ts (1)

1-168: LGTM!

The implementation properly addresses both concerns from previous reviews:

  1. Abort cancellation during sleep (lines 49-65): defaultSleep now accepts an AbortSignal, attaches an abort listener that clears the timer and resolves immediately, and short-circuits if already aborted.

  2. Request input metadata (lines 68-110): methodOf, headersOf, and hasIdempotencyKey now accept the input parameter and correctly extract method/headers from a Request object when init doesn't provide them.

The retry loop logic is sound—it checks abort status both before and after sleeping, respects the DO_NOT_RETRY sentinel, and properly enforces the retry budget.

web/src/__tests__/utils/fetch-with-retry.test.ts (1)

1-267: LGTM!

Comprehensive test coverage that addresses all previous review suggestions:

  • Request input tests (lines 206-243): Both the Idempotency-Key header extraction from Request.headers and the regression test for Request POST method detection are now covered.
  • Default sleep abort test (lines 245-266): Verifies the built-in defaultSleep path cancels immediately when AbortSignal fires mid-wait, using timing assertions to prove the helper didn't wait the full Retry-After duration.

The test suite thoroughly covers retry eligibility rules, budget exhaustion, malformed headers, and cancellation semantics.

src/synthorg/providers/management/local_models.py (3)

37-60: LGTM!

The regex patterns are well-designed:

  • POSIX paths include literal spaces in the character class to handle paths like /var/lib/ollama/model cache/...
  • Windows paths similarly include spaces for C:\Program Files\...
  • Host:port pattern covers IPv6 (bracketed), IPv4 (dotted quad), and DNS/single-label hostnames

The inline comments clearly explain the WHY for each pattern choice, which aligns with coding guidelines.


63-79: LGTM!

The sanitizer function is well-implemented:

  • Side-effect free as documented (no logging inside the helper)
  • Properly chains regex substitutions for all sensitive patterns
  • Handles edge cases: non-strings, empty results after truncation
  • Docstring correctly explains the design rationale
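
A rough sketch of the redaction approach praised in the two comments above. It is not the code in `local_models.py`: the pattern names, the `[REDACTED_*]` placeholders, the 200-character cap, the fallback string, and the `sanitize_error` name are all assumptions made for illustration.

```python
import re

# Hypothetical patterns. Spaces are allowed inside directory segments so
# that paths like "/var/lib/ollama/model cache/..." are caught whole.
_POSIX_PATH = re.compile(r"/(?:[\w.\- ]+/)+[\w.\-]+")
_WINDOWS_PATH = re.compile(r"[A-Za-z]:\\(?:[\w.\- ]+\\)*[\w.\-]+")
# Bracketed IPv6, dotted IPv4, or a DNS / single-label host, then a port.
_HOST_PORT = re.compile(r"(?:\[[0-9A-Fa-f:]+\]|[\w.\-]+):\d{1,5}\b")

FALLBACK = "unknown error"


def sanitize_error(raw: object, max_len: int = 200) -> str:
    """Redact paths and endpoints; side-effect free (no logging here)."""
    if not isinstance(raw, str):
        return FALLBACK
    text = _WINDOWS_PATH.sub("[REDACTED_PATH]", raw)
    text = _POSIX_PATH.sub("[REDACTED_PATH]", text)
    text = _HOST_PORT.sub("[REDACTED_HOST]", text)
    # Truncate after redaction so a cut never exposes half a path.
    text = text[:max_len].strip()
    return text or FALLBACK
```

Keeping the helper free of logging means the caller decides where the sanitized string goes, which matches the "side-effect free" point above.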

192-206: LGTM!

The integration addresses all past review feedback:

  • Presence check via "error" in data correctly handles falsey error values
  • Error is sanitized before use in both logging and the emitted event
  • error_type=type(error).__name__ provides safe type introspection without leaking the raw value
tests/unit/providers/management/test_local_models.py (1)

303-364: LGTM!

Comprehensive test coverage for the sanitizer including:

  • Non-string inputs (None, int, dict)
  • POSIX and Windows path redaction (with and without spaces)
  • Host:port redaction
  • Truncation behavior
  • Benign passthrough (no false positives)
  • Empty/whitespace fallback

The regression tests for space-containing paths directly address the concerns from past review comments.

tests/unit/meta/telemetry/test_emitter.py (2)

13-13: LGTM: FakeClock import supports new clock injection tests.

The import is correctly sourced from tests._shared.fake_clock and enables the deterministic time-driven tests in TestEmitterClockSeam.


384-437: LGTM: Clock injection tests correctly verify the new seam and address prior cleanup concerns.

The new TestEmitterClockSeam class properly validates the clock injection feature:

  • test_init_records_clock_monotonic confirms __init__ reads _last_flush_at from the injected clock's monotonic() value.
  • test_flush_updates_last_flush_at_via_clock verifies that flush() updates _last_flush_at after clock advancement.

The second test (lines 406–437) correctly uses await em.aclose() in the finally block and includes an explanatory comment (lines 421–425) documenting why aclose() is necessary to prevent leaking the _flush_task background task—fully addressing the prior review concern.

Both tests follow the recommended pattern for time-driven tests by injecting FakeClock via the clock= parameter.

src/synthorg/api/controllers/settings.py (1)

545-606: Cursor pagination is wired cleanly here.

Sorting before paginate_cursor() keeps the sink listing deterministic, and returning PaginatedResponse[SinkInfoResponse] removes the old untyped payload leak.

Comment thread src/synthorg/api/controllers/memory.py
Comment thread src/synthorg/api/controllers/memory.py Outdated
Comment thread src/synthorg/api/controllers/settings.py

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/api/controllers/memory.py (1)

145-149: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider moving VRAM tier thresholds to settings definitions.

_BATCH_SIZE_BY_VRAM_GB contains numeric thresholds (40.0, 16.0, 8.0 GB) and batch sizes (128, 64, 32) that are business-logic tuning knobs. Per coding guidelines, these should live in settings/definitions/memory.py rather than as bare numeric literals in the controller. However, since these are hardware-tier boundaries (unlikely to need runtime tuning) and stored in an immutable Final[tuple], this is a lower-priority refactor than the already-addressed FINE_TUNE_* constants.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/memory.py` around lines 145 - 149, Move the
VRAM-to-batch-size mapping out of the controller into the settings definitions:
create a constant (e.g., BATCH_SIZE_BY_VRAM_GB) in
settings/definitions/memory.py with the same immutable tuple value currently
assigned to _BATCH_SIZE_BY_VRAM_GB, then replace the local
_BATCH_SIZE_BY_VRAM_GB in src/synthorg/api/controllers/memory.py with an import
from settings.definitions.memory and update any references to use the imported
name; keep the tuple type/immutability and ensure imports and any type hints
(Final[tuple[tuple[float,int],...]]) are preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/controllers/settings.py`:
- Around line 181-197: _sink_identifier currently collapses all non-file sinks of
the same SinkType to the same token (unnamed-{sink_type}); update
_sink_identifier to append a stable sink-specific discriminator so fallback IDs
are unique per-sink (for example, use a stable attribute on SinkConfig such as
sink.id or sink.uuid or sink.name if available) instead of just
sink.sink_type.value — keep CONSOLE_SINK_ID for console, prefer sink.file_path
for FILE, and when falling back return something like
f"unnamed-{sink.sink_type.value}-{stable_discriminator}" so each sink serializes
to a unique, stable identifier.

In `@tests/unit/api/controllers/test_memory_admin.py`:
- Around line 497-551: Add a unit test that verifies the boundary where document
count == min_required to ensure _check_documents treats the required threshold
as inclusive; create a new test method (e.g.,
test_count_at_required_threshold_does_not_fail) that writes exactly min_required
files (10) into the temporary src and calls _check_documents(str(src),
min_required=10, min_recommended=50) and assert the result.status is "warn"
(i.e., not "fail" but still "warn" because it is below min_recommended).

---

Outside diff comments:
In `@src/synthorg/api/controllers/memory.py`:
- Around line 145-149: Move the VRAM-to-batch-size mapping out of the controller
into the settings definitions: create a constant (e.g., BATCH_SIZE_BY_VRAM_GB)
in settings/definitions/memory.py with the same immutable tuple value currently
assigned to _BATCH_SIZE_BY_VRAM_GB, then replace the local
_BATCH_SIZE_BY_VRAM_GB in src/synthorg/api/controllers/memory.py with an import
from settings.definitions.memory and update any references to use the imported
name; keep the tuple type/immutability and ensure imports and any type hints
(Final[tuple[tuple[float,int],...]]) are preserved.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 878f149a-a205-4688-8a7b-c55a61f6a795

📥 Commits

Reviewing files that changed from the base of the PR and between 8d49256 and fb8838b.

📒 Files selected for processing (3)
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
  • tests/unit/api/controllers/test_memory_admin.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: Deploy Preview
  • GitHub Check: Build Backend
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Socket Security: Pull Request Alerts
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (5)
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every cost-bearing Pydantic model carries currency: CurrencyCode; mixing raises MixedCurrencyAggregationError (HTTP 409). Aggregations over cost-bearing fields call assert_currencies_match before reducing.
Direct os.environ.get(...) outside startup is forbidden. Ghost-wired settings (consuming service never instantiated at boot) are flagged by scripts/check_setting_to_startup_trace.py; per-setting opt-out via # lint-allow: bootstrap-wiring -- <reason>.
Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal. Bare module-level _FOO = 1024 constants and bare numeric defaults (def f(timeout=30)) are forbidden.
Allowlisted numeric literals: 0, 1, -1 (sentinel/off-by-one), HTTP status codes 100-599 in status_code= defaults, hex bit-masks (0xff, 0x80), powers-of-2 in buffering= / chunk_size= / buffer_size= defaults, anything inside settings/definitions/, persistence/migrations/, observability/events/. Per-line opt-out: # lint-allow: magic-numbers -- <reason> (mandatory non-empty justification). Enforced by scripts/check_no_magic_numbers.py.
Comments explain WHY only, never origin / review / issue context. Forbidden: reviewer citations (`pre-PR review #N`), in-code issue back-refs (`(#1682)`), naked `SEC-1` taxonomy in `src/`, migration framing (`ported from`), round narrative, self-evident restatements.
Keep in comments: hidden constraints, subtle invariants, upstream-bug workarounds (with stable bug-tracker URL), why a non-obvious choice was made. Enforced by `scripts/check_no_review_origin_in_code.py` and `scripts/check_no_migration_framing.py` (pre-push); per-line opt-outs `# lint-allow: review-origin -- ` and `# lint-allow: migration-framing -- `.
No `from __future__ import annotations`: Python 3.14 has PEP 649.
PEP 758 except: `except A, B:` (no parens) when not bin...

Files:

  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
src/synthorg/**/{api,services,repositories}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService); services centralize audit logging; repositories must not log mutations themselves.

Files:

  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes the socket with code 4011.

Files:

  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit / integration / e2e / slow.
Mock-spec gate: every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt; regenerate via uv run python scripts/check_mock_spec.py --update. Without spec= mocks silently absorb every attribute access.
Shared mocks: use mock_dispatcher from tests/conftest.py (AsyncMock(spec=NotificationDispatcher)).
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Logger spying antipattern: never monkeypatch.setattr(module.logger, "info", spy); the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead.
Parametrize: prefer @pytest.mark.parametrize for similar cases.
Property-based: Hypothesis (Python), fast-check (React), testing.F (Go). CI runs 10 deterministic examples (derandomize=True). Hypothesis failures are real bugs: fix the bug and add an @example(...) decorator.
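
The mock-spec rule above is easiest to see in a small example. This is a sketch only: the `NotificationDispatcher` class here is a hypothetical stand-in with an assumed `dispatch` method, not the project's real class.

```python
from unittest.mock import AsyncMock, Mock


class NotificationDispatcher:
    """Hypothetical stand-in for the real dispatcher."""

    async def dispatch(self, message: str) -> None:
        raise NotImplementedError


def typo_is_absorbed(mock: object) -> bool:
    # "dispatchh" is a deliberate typo of "dispatch".
    return hasattr(mock, "dispatchh")


# Without spec=, any attribute access succeeds, so the typo goes unnoticed.
loose = Mock()

# With spec=, only attributes that exist on the class are allowed.
strict = AsyncMock(spec=NotificationDispatcher)
```

Here `typo_is_absorbed(loose)` is true while `typo_is_absorbed(strict)` is false, which is the failure mode the gate is meant to prevent.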

Files:

  • tests/unit/api/controllers/test_memory_admin.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/api/controllers/test_memory_admin.py
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/api/controllers/memory.py
  • tests/unit/api/controllers/test_memory_admin.py
  • src/synthorg/api/controllers/settings.py
🔇 Additional comments (8)
src/synthorg/api/controllers/memory.py (6)

152-200: LGTM!

The _FineTuneThresholds model correctly uses frozen=True, allow_inf_nan=False, and extra="forbid" per coding guidelines. The _resolve_fine_tune_thresholds helper properly resolves thresholds from SettingsService at request time with fallback to imported defaults, addressing the prior review feedback about runtime resolution. The PEP 758 except syntax at line 194 is valid for Python 3.14+.


242-245: LGTM!

The switch from per_op_concurrency(...) to per_op_concurrency_from_policy(...) aligns with the PR objective for policy-driven concurrency limiting. The pattern is consistent across all four endpoints that require inflight concurrency caps.


397-434: LGTM!

The preflight endpoint now correctly resolves thresholds at request time via _resolve_fine_tune_thresholds and passes them to the helper functions, ensuring DB/env/YAML overrides take effect. This addresses the prior review feedback about dead settings.


438-446: LGTM!

The pagination defaults are now consistently using DEFAULT_LIMIT from the DTO module, aligning with the PR's cursor pagination standardization across endpoints.


821-832: LGTM!

The boundary condition at line 827 now uses <= to make the warn band inclusive (documents "at or below" the recommended threshold trigger a warning), which aligns with the setting definition and addresses the prior review feedback.
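
The intended banding can be sketched as a pure function. This is an assumption-laden illustration: the real `_check_documents` takes a source path and returns a result object, and the `"pass"` label for the top band is a guess; only `"fail"` and `"warn"` appear in this review.

```python
from typing import Literal

Status = Literal["fail", "warn", "pass"]


def classify_document_count(
    count: int,
    min_required: int,
    min_recommended: int,
) -> Status:
    # Strictly below the required minimum is a hard failure, so a count
    # equal to min_required is already out of the "fail" band.
    if count < min_required:
        return "fail"
    # Inclusive upper bound: "at or below" the recommended threshold warns.
    if count <= min_recommended:
        return "warn"
    return "pass"
```

With `min_required=10` and `min_recommended=50`, counts of 10 and 50 both land in the warn band, which is the boundary the requested test pins down.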


910-948: LGTM!

The function now accepts the resolved default_batch_size parameter, enabling runtime override via settings while preserving the imported constant as fallback. The PEP 758 except syntax at line 934 is valid.

tests/unit/api/controllers/test_memory_admin.py (2)

171-208: LGTM!

The test assertions correctly use FINE_TUNE_DEFAULT_BATCH_SIZE from the settings definitions module, matching the updated controller implementation. The parametrized test at line 187 verifies the sub-8GB fallback case.


393-494: LGTM!

Excellent coverage of _resolve_fine_tune_thresholds: the four test methods cover the None-service fallback, successful overrides, unparseable values, and SettingNotFoundError handling. All mocks correctly use spec=SettingsService per the mock-spec gate guideline.

Comment thread src/synthorg/api/controllers/settings.py Outdated
Comment thread tests/unit/api/controllers/test_memory_admin.py
@Aureliolo Aureliolo force-pushed the perf/performance-data-integrity branch from fb8838b to 03b7f03 Compare May 8, 2026 20:45
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 8, 2026 20:47 — with GitHub Actions Inactive

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (1)
tests/integration/persistence/test_perf_indices_sqlite.py (1)

68-78: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Harden _explain_plan by replacing interpolated SQL with an allowlist.

Line 77 and Line 78 still compose SQL via f-strings from inputs. This keeps static-analysis findings active and is avoidable in this helper.

💡 Suggested fix
 async def _explain_plan(
     backend: SQLitePersistenceBackend,
     sql: str,
     *params: object,
     analyze: str | None = None,
 ) -> str:
+    allowed_analyze = {
+        "cost_records": "ANALYZE cost_records",
+        "decision_records": "ANALYZE decision_records",
+    }
+    allowed_explain = {
+        "SELECT * FROM cost_records WHERE agent_id = ? ORDER BY timestamp DESC LIMIT 50":
+            "EXPLAIN QUERY PLAN SELECT * FROM cost_records WHERE agent_id = ? ORDER BY timestamp DESC LIMIT 50",
+        "SELECT * FROM cost_records WHERE task_id = ? ORDER BY timestamp DESC LIMIT 50":
+            "EXPLAIN QUERY PLAN SELECT * FROM cost_records WHERE task_id = ? ORDER BY timestamp DESC LIMIT 50",
+        "SELECT * FROM decision_records WHERE task_id = ? ORDER BY recorded_at ASC, id ASC LIMIT 50":
+            "EXPLAIN QUERY PLAN SELECT * FROM decision_records WHERE task_id = ? ORDER BY recorded_at ASC, id ASC LIMIT 50",
+    }
     db = backend._db
     assert db is not None, "fixture must connect the backend before EXPLAIN"
     if analyze is not None:
-        await db.execute(f"ANALYZE {analyze}")
-    cursor = await db.execute(f"EXPLAIN QUERY PLAN {sql}", params)
+        analyze_stmt = allowed_analyze.get(analyze)
+        if analyze_stmt is None:
+            raise ValueError(f"Unexpected ANALYZE target: {analyze}")
+        await db.execute(analyze_stmt)
+    explain_stmt = allowed_explain.get(sql)
+    if explain_stmt is None:
+        raise ValueError("Unexpected EXPLAIN query shape")
+    cursor = await db.execute(explain_stmt, params)
     rows = await cursor.fetchall()
     return "\n".join(str(tuple(row)) for row in rows)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/persistence/test_perf_indices_sqlite.py` around lines 68 -
78, the helper _explain_plan currently composes SQL with f-strings (EXPLAIN
QUERY PLAN {sql} and ANALYZE {analyze}). Keep static analysis happy by
validating inputs against an allowlist instead of interpolating raw input: add
an allowlist of permitted SQL statement prefixes (e.g., "SELECT", "INSERT",
"UPDATE", "DELETE", "REPLACE") and ensure the sql argument starts with one of
those prefixes and contains no semicolons or dangerous tokens, and add an
allowlist for analyze identifiers (explicit known table/index names or a
whitelist of allowed keywords) before building the final statements; after
validation, construct the SQL string only from trusted/validated pieces in
_explain_plan and then call db.execute with those constructed strings (do not
pass raw, unvalidated user SQL or analyze values into f-strings).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/api/controllers/test_setup_agent_ops.py`:
- Around line 314-323: The test currently checks persistence by matching only
model_provider/model_id which can collide with other agents; instead capture the
updated agent's identity from the PUT response (e.g., updated_id =
put_resp.json()["data"]["id"]) and then assert against that id when scanning the
GET response: ensure any(a["id"] == updated_id and a["model_provider"] ==
"test-provider" and a["model_id"] == "test-small-001" for a in agents). This
ties the assertion to the actual updated resource (use the PUT response variable
name used in the test) rather than just matching by provider/model.

In `@tests/unit/api/controllers/test_setup.py`:
- Around line 846-851: The current assertion uses any(...) over agents which can
pass if a different agent has the updated values; instead locate the specific
agent returned by the mutation (use the mutation response's unique identity
field such as "id" or "name" from the update response) and then assert that that
agent's model_provider == "test-provider" and model_id == "test-small-001";
update the assertions around get_resp.json()["data"] to first find the agent
where agent["id"] == updated_id (or agent["name"] == updated_name) and then
check the updated fields to ensure the mutation changed the intended agent only.
- Around line 761-788: The current test_pagination_round_trip_with_limit_one
uses a length-only check which can miss duplicates/gaps; replace the final
assertion with a strict equality check between the walked pages and the full
unpaginated result: fetch unpaginated =
test_client.get("/api/v1/setup/agents").json()["data"] and assert collected ==
unpaginated to ensure the paginated round-trip returns the exact same ordered
list (use the existing collected variable and the unpaginated data for
comparison).

In `@web/src/mocks/handlers/providers.ts`:
- Around line 152-154: The handler currently returns a loosely typed envelope
using emptyPaginatedEnvelope<ProviderConfig>(), which won't catch envelope shape
drift against the client API; replace that call with a typed helper: use
paginatedFor<typeof listProviders>(...) in the GET /api/v1/providers handler so
the mock envelope matches the exact return type of listProviders, populate the
required fields (data: [], plus the endpoint's meta/pagination fields) to
satisfy the typed helper, and import paginatedFor if not already present.

---

Duplicate comments:
In `@tests/integration/persistence/test_perf_indices_sqlite.py`:
- Around line 68-78: The helper _explain_plan currently composes SQL with
f-strings (EXPLAIN QUERY PLAN {sql} and ANALYZE {analyze}). Keep static analysis
happy by validating inputs against an allowlist instead of interpolating raw
input: add an allowlist of permitted SQL statement prefixes (e.g., "SELECT",
"INSERT", "UPDATE", "DELETE", "REPLACE") and ensure the sql argument starts with
one of those prefixes and contains no semicolons or dangerous tokens, and add an
allowlist for analyze identifiers (explicit known table/index names or a
whitelist of allowed keywords) before building the final statements; after
validation, construct the SQL string only from trusted/validated pieces in
_explain_plan and then call db.execute with those constructed strings (do not
pass raw, unvalidated user SQL or analyze values into f-strings).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2af95549-f279-4e9e-b198-16b2b80655cd

📥 Commits

Reviewing files that changed from the base of the PR and between fb8838b and 03b7f03.

⛔ Files ignored due to path filters (2)
  • src/synthorg/persistence/postgres/revisions/atlas.sum is excluded by !**/*.sum
  • src/synthorg/persistence/sqlite/revisions/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (62)
  • docs/reference/conventions.md
  • scripts/loop_bound_init_baseline.txt
  • scripts/mock_spec_baseline.txt
  • scripts/no_magic_numbers_baseline.txt
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_models.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql
  • src/synthorg/persistence/postgres/schema.sql
  • src/synthorg/persistence/sqlite/revisions/20260508204132_perf_indices_cost_decision.sql
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/settings/definitions/memory.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/test_dto_forbid_extra.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/providers/management/test_local_models.py
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/client.ts
  • web/src/api/endpoints/providers.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/api/endpoints/setup.ts
  • web/src/api/types/providers.ts
  • web/src/api/types/setup.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/stores/websocket.ts
  • web/src/utils/app-version.ts
  • web/src/utils/constants.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
💤 Files with no reviewable changes (3)
  • web/src/api/types/setup.ts
  • tests/unit/api/test_dto_forbid_extra.py
  • src/synthorg/api/controllers/setup_models.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Deploy Preview
  • GitHub Check: Build Backend
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Dashboard Test
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (14)
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/utils/constants.ts
  • web/src/utils/retry-after.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/api/types/providers.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/stores/websocket.ts
  • web/src/api/endpoints/settings.ts
  • web/src/api/endpoints/setup.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/utils/app-version.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/api/endpoints/providers.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/providers.ts
web/src/mocks/handlers/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/mocks/handlers/**/*.ts: Mirror every exported endpoint in web/src/api/endpoints/*.ts with a 1:1 default happy-path MSW handler in web/src/mocks/handlers/; boot test-setup with onUnhandledRequest: 'error' and override per-case via server.use(...), never vi.mock('@/api/endpoints/*')
Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types

Files:

  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/providers.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Reuse components from web/src/components/ui/. Never hardcode hex colors, font-family, pixel spacing, Motion transitions, or BCP 47 locale strings; use design tokens, @/lib/motion presets, helpers in @/utils/format. Enforced by scripts/check_web_design_system.py.

Files:

  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/utils/constants.ts
  • web/src/utils/retry-after.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/api/types/providers.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/stores/websocket.ts
  • web/src/api/endpoints/settings.ts
  • web/src/api/endpoints/setup.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/utils/app-version.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/api/endpoints/providers.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/providers.ts
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every cost-bearing Pydantic model carries currency: CurrencyCode; mixing raises MixedCurrencyAggregationError (HTTP 409). Aggregations over cost-bearing fields call assert_currencies_match before reducing.
Direct os.environ.get(...) outside startup is forbidden. Ghost-wired settings (consuming service never instantiated at boot) are flagged by scripts/check_setting_to_startup_trace.py; per-setting opt-out via # lint-allow: bootstrap-wiring -- <reason>.
Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal. Bare module-level _FOO = 1024 constants and bare numeric defaults (def f(timeout=30)) are forbidden.
Allowlisted numeric literals: 0, 1, -1 (sentinel/off-by-one), HTTP status codes 100-599 in status_code= defaults, hex bit-masks (0xff, 0x80), powers-of-2 in buffering= / chunk_size= / buffer_size= defaults, anything inside settings/definitions/, persistence/migrations/, observability/events/. Per-line opt-out: # lint-allow: magic-numbers -- <reason> (mandatory non-empty justification). Enforced by scripts/check_no_magic_numbers.py.
Comments explain WHY only, never origin / review / issue context. Forbidden: reviewer citations (`pre-PR review #N`), in-code issue back-refs (`(#1682)`), naked `SEC-1` taxonomy in `src/`, migration framing (`ported from`), round narrative, self-evident restatements.
Keep in comments: hidden constraints, subtle invariants, upstream-bug workarounds (with stable bug-tracker URL), why a non-obvious choice was made. Enforced by `scripts/check_no_review_origin_in_code.py` and `scripts/check_no_migration_framing.py` (pre-push); per-line opt-outs `# lint-allow: review-origin -- <reason>` and `# lint-allow: migration-framing -- <reason>`.
No `from __future__ import annotations`: Python 3.14 has PEP 649.
PEP 758 except: `except A, B:` (no parens) when not bin...

Files:

  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
src/synthorg/settings/definitions/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

For every mutable setting: DB > env (SYNTHORG_<NS>_<KEY>) > YAML > code default, resolved through SettingsService / ConfigResolver. Register new settings in src/synthorg/settings/definitions/<namespace>.py.

Files:

  • src/synthorg/settings/definitions/memory.py
src/synthorg/settings/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

First cold read of a setting emits one INFO settings.value.resolved; subsequent reads stay DEBUG. Sanctioned exceptions: init-time only (env-only, no registry entry) and read-only post-init (read_only_post_init=True; set() raises SettingReadOnlyError).

Files:

  • src/synthorg/settings/definitions/memory.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
web/src/utils/constants.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Keep the WebSocket wire protocol constants (WS_PROTOCOL_VERSION, WS_MAX_MESSAGE_SIZE, WS_HEARTBEAT_INTERVAL_MS, WS_PONG_TIMEOUT_MS, LOG_SANITIZE_MAX_LENGTH) in web/src/utils/constants.ts in lockstep with src/synthorg/api/ws_models.py / src/synthorg/api/controllers/ws.py; bump protocol version on both sides together for breaking payload changes

Files:

  • web/src/utils/constants.ts
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit / integration / e2e / slow.
Mock-spec gate: every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt; regenerate via uv run python scripts/check_mock_spec.py --update. Without spec= mocks silently absorb every attribute access.
Shared mocks: use mock_dispatcher from tests/conftest.py (AsyncMock(spec=NotificationDispatcher)).
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Logger spying antipattern: never monkeypatch.setattr(module.logger, "info", spy); the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead.
Parametrize: prefer @pytest.mark.parametrize for similar cases.
Property-based: Hypothesis (Python), fast-check (React), testing.F (Go). CI runs 10 deterministic examples (derandomize=True). Hypothesis failures are real bugs: fix the bug and add an @example(...) decorator.

Files:

  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_setup.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_policies.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_setup.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_policies.py
src/synthorg/**/{api,services,repositories}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService); services centralize audit logging; repositories must not log mutations themselves.

Files:

  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes the socket with code 4011.

Files:

  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
web/src/stores/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/stores/**/*.ts: All store mutation actions (create / update / delete) must follow the stores/connections/crud-actions.ts pattern: wrap API calls in try/catch, success updates state + emits success toast, failure logs + emits error toast + returns sentinel (null for entity, false for delete); callers MUST NOT wrap store mutation calls in try/catch
List-read store actions must set error: string | null on the store instead of toasting; use opaque cursor-based pagination via PaginationMeta, keep nextCursor + hasMore in state (not offset arithmetic), and early-return when !hasMore || !nextCursor
Always capture previous synchronously in optimistic mutations and restore in the catch block
Any new Zustand store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global afterEach in test-setup.tsx
Store files over ~600 lines must be sliced into packages with one of two aggregation patterns: package-internal index.ts or sibling .ts aggregator

Files:

  • web/src/stores/websocket.ts
src/synthorg/providers/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/providers/**/*.py: All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry in driver subclasses or calling code.
RetryConfig / RateLimiterConfig set per-provider in ProviderConfig. Retryable: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable raise immediately.

Files:

  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/management/dtos.py
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.md: Use fenced code blocks with language tags: d2 for architecture/nested containers, mermaid for flowcharts/sequence/pipelines. Use markdown tables for tabular data; never use text fences with ASCII box-drawing.
Static historical counts and illustrative scale numbers may carry a per-line opt-out: <!-- lint-allow: doc-numeric-macros -- <reason> --> (reason mandatory). Enforced by scripts/check_doc_numeric_macros.py (pre-push).

Files:

  • docs/reference/conventions.md
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/settings/definitions/memory.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/setup_personality.py
  • tests/unit/meta/telemetry/test_emitter.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • src/synthorg/api/controllers/setup.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/unit/api/controllers/test_setup.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • src/synthorg/api/controllers/scaling.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/management/dtos.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • src/synthorg/meta/telemetry/emitter.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/rate_limits/test_policies.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
🪛 OpenGrep (1.20.0)
tests/integration/persistence/test_perf_indices_sqlite.py

[ERROR] 77-77: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)


[ERROR] 78-78: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.

(coderabbit.sql-injection.python-fstring-execute)
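
One way to address this class of finding is sketched below, assuming the interpolated pieces are a table name and a filter value. The table names, the `_COUNT_QUERIES` mapping, and `count_rows_for_task` are hypothetical and are not taken from the flagged test. Values are bound through `?` placeholders. Identifiers cannot be bound, so they select one of a fixed set of static query strings.

```python
import sqlite3

# Hypothetical static queries, one per permitted table. Identifiers cannot be
# bound as parameters, so each table gets its own literal SQL string.
_COUNT_QUERIES = {
    "cost_records": "SELECT COUNT(*) FROM cost_records WHERE task_id = ?",
    "decision_records": "SELECT COUNT(*) FROM decision_records WHERE task_id = ?",
}


def count_rows_for_task(conn: sqlite3.Connection, table: str, task_id: str) -> int:
    """Count rows for one task without building SQL from caller input."""
    try:
        query = _COUNT_QUERIES[table]
    except KeyError:
        raise ValueError(f"unexpected table name: {table!r}") from None
    # task_id travels as a bound parameter, never as part of the SQL text.
    (count,) = conn.execute(query, (task_id,)).fetchone()
    return count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cost_records (id INTEGER PRIMARY KEY, task_id TEXT)")
conn.executemany(
    "INSERT INTO cost_records (task_id) VALUES (?)",
    [("t-1",), ("t-1",), ("t-2",)],
)
```

In a test that only interpolates trusted constants the finding may be a false positive, but the static-query form avoids the pattern altogether.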

🔇 Additional comments (68)
web/src/utils/constants.ts (1)

12-23: Jitter bounds are well centralized and documented.

Good change extracting reconnect jitter into shared constants with explicit rationale; this keeps behavior discoverable and test-friendly.

web/src/stores/websocket.ts (1)

232-249: Reconnect jitter backoff calculation looks solid.

Applying a uniform multiplier from shared constants plus post-rounding Math.max(1, ...) is a clean, robust implementation for de-correlated reconnects.
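
The arithmetic described here can be sketched as follows. The store itself is TypeScript; this Python version only illustrates the calculation. The bounds `JITTER_MIN` and `JITTER_MAX` are assumed values, not the constants from web/src/utils/constants.ts.

```python
import random
from collections.abc import Callable

# Assumed bounds for illustration only; the real values live in
# web/src/utils/constants.ts.
JITTER_MIN = 0.5
JITTER_MAX = 1.5


def jittered_delay_ms(
    base_ms: float,
    rand: Callable[[], float] = random.random,
) -> int:
    """Scale base_ms by a uniform multiplier drawn from [JITTER_MIN, JITTER_MAX)."""
    multiplier = JITTER_MIN + rand() * (JITTER_MAX - JITTER_MIN)
    # Clamp after rounding so a very small base delay never collapses to 0 ms.
    return max(1, round(base_ms * multiplier))
```

Passing `rand` explicitly mirrors how the tests pin `Math.random()` to fixed values to make the delay deterministic.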

web/src/__tests__/stores/websocket.test.ts (2)

425-428: Worst-case reconnect advance now correctly accounts for jitter ceiling.

Nice stabilization of the reconnection test timing window.


434-475: Great deterministic coverage for jittered delay mapping.

The parameterized Math.random() cases make the jitter contract explicit and prevent regressions.

src/synthorg/providers/drivers/litellm_driver.py (1)

136-159: Clock seam wiring in LiteLLMDriver looks correct.

The constructor/default and monotonic TTL reads are consistent and preserve existing cache semantics while making timing deterministic for tests.

Also applies to: 184-194

src/synthorg/meta/telemetry/emitter.py (1)

103-110: HttpAnalyticsEmitter clock injection is clean and consistent.

_last_flush_at initialization and flush-time updates now use the same injectable monotonic source with no lock-order regressions.

Also applies to: 114-115, 137-137, 235-241

tests/unit/meta/telemetry/test_emitter.py (1)

384-437: Great deterministic coverage for emitter clock behavior.

These tests validate both initialization and flush timestamp updates through FakeClock, and the aclose() cleanup path is correctly used in the flush test.

tests/unit/providers/drivers/test_litellm_auth.py (1)

164-257: Credential-cache clock tests are strong and well-targeted.

The suite locks in within-TTL reuse, post-TTL refetch, exact-boundary behavior, and OAuth non-caching with deterministic timing.

src/synthorg/memory/embedding/fine_tune_orchestrator.py (1)

80-96: Progress-throttle clock seam integration is solid.

The injected clock is threaded correctly through callback throttling without changing the existing throttle boundary behavior.

Also applies to: 601-609, 633-639

scripts/loop_bound_init_baseline.txt (1)

32-32: Baseline update is correct.

The recorded line-number adjustment matches the constructor shift in FineTuneOrchestrator.

tests/unit/memory/embedding/test_fine_tune_orchestrator.py (1)

204-259: Good clock-seam regression test for throttled progress emits.

This test cleanly pins throttle behavior to fake monotonic time across below-threshold and boundary-crossing calls.

src/synthorg/persistence/sqlite/schema.sql (1)

108-111: Composite index additions align with cursor-pagination query shapes.

These index definitions are consistent with the new list/query access patterns and with the integration-plan assertions.

Also applies to: 648-649

src/synthorg/persistence/sqlite/revisions/20260508204132_perf_indices_cost_decision.sql (1)

1-6: Migration index set matches schema and test-plan expectations.

The added migration statements correctly mirror the intended composite indexes.

src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql (1)

1-6: Postgres migration is consistent with the new pagination index strategy.

Index definitions look correct and coherent with the corresponding schema/test changes.

src/synthorg/persistence/postgres/schema.sql (1)

94-97: Schema-level composite indexes are correctly defined for the targeted filters + ordering.

The added index shapes and names are consistent with migration and plan-based integration tests.

Also applies to: 663-664

tests/integration/persistence/test_perf_indices_postgres.py (2)

64-83: EXPLAIN helper is well-structured and safely composed for this test scope.

Using sql.Identifier for the ANALYZE target plus bound params keeps plan assertions robust and safe.


131-156: Good fixture shape for plan pinning on decision_records.

Seeding many rows for a single task makes the ORDER BY recorded_at, id assertion meaningfully exercise the composite index.

tests/integration/persistence/test_perf_indices_sqlite.py (1)

125-151: Decision-record fixture shape is solid for validating the composite ordering index.

The single-task multi-row seed pattern makes this EXPLAIN assertion substantially more reliable.
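
A plan-pinning check of this kind could look like the sketch below. The table definition is reduced to three columns and the index name `idx_decision_task_recorded` is hypothetical; only the index shape `(task_id, recorded_at, id)` comes from the PR description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE decision_records (
        id TEXT PRIMARY KEY,
        task_id TEXT NOT NULL,
        recorded_at TEXT NOT NULL
    );
    -- Hypothetical index name; the column order matches the composite index.
    CREATE INDEX idx_decision_task_recorded
        ON decision_records (task_id, recorded_at, id);
    """
)

# Seed many rows for a single task, with repeated timestamps so the id
# tiebreaker in ORDER BY actually matters.
conn.executemany(
    "INSERT INTO decision_records (id, task_id, recorded_at) VALUES (?, ?, ?)",
    [
        (f"d-{i:03d}", "task-1", f"2026-05-08T00:00:{i % 60:02d}Z")
        for i in range(200)
    ],
)

plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM decision_records WHERE task_id = ? ORDER BY recorded_at, id",
    ("task-1",),
).fetchall()

# Each plan row is (id, parent, notused, detail); the detail text names the index.
plan_details = [row[3] for row in plan_rows]
```

Asserting that the index name appears in `plan_details` and that no temporary B-tree is used for sorting pins both the filter and the ordering to the composite index.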

src/synthorg/api/rate_limits/policies.py (1)

104-106: Hardcoded operation limits are still embedded in policy registries.

Line 105 and Lines 224-231 still introduce raw limit values in business-logic policy maps instead of sourcing them from settings definitions. This is the same unresolved concern already raised in prior review.

As per coding guidelines, "Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal."

Also applies to: 224-231

src/synthorg/api/controllers/events.py (1)

530-537: LGTM!

The SSE endpoint now uses policy-driven rate limiting (per_op_rate_limit_from_policy) and concurrency capping (per_op_concurrency_from_policy) instead of hardcoded values. This aligns with the coding guidelines requiring numeric thresholds to live in settings definitions. The key="user_or_ip" for rate limiting and key="user" for concurrency are appropriate for this endpoint.

tests/unit/api/rate_limits/test_controller_coverage.py (1)

143-147: LGTM!

The guard coverage assertion correctly maps the stream endpoint to the events.stream policy key, ensuring the test will catch any accidental removal of the rate-limit guard.

src/synthorg/api/controllers/setup.py (1)

441-455: LGTM!

The pagination implementation correctly preserves the storage-array order instead of sorting, which maintains the critical invariant that agent_index in PUT/POST handlers aligns with the visible list order. The inline comment clearly documents this constraint for future maintainers.

src/synthorg/api/controllers/setup_personality.py (1)

161-183: LGTM!

The pagination implementation correctly sorts presets alphabetically by name before paginating. Unlike the agents endpoint, sorting is safe here because there are no positional-index-based mutation endpoints for personality presets.

web/src/utils/retry-after.ts (1)

1-66: LGTM!

The shared Retry-After parser correctly handles both RFC 9110 formats (delta-seconds and HTTP-date), properly sanitizes attacker-controlled header values before logging, and enforces a sensible budget cap. The constants are well-documented with clear semantics for the DO_NOT_RETRY sentinel.

web/src/utils/fetch-with-retry.ts (1)

125-168: LGTM!

The fetch wrapper correctly implements 429 retry handling with:

  • Proper idempotency detection from both Request input and RequestInit
  • AbortSignal propagation to the sleep function for immediate cancellation
  • Pre/post-sleep abort checks to avoid unnecessary retries
  • Bounded retry budget via shared constants

The design choice to return the 429 response (rather than throwing) when budget is exhausted gives callers flexibility to show appropriate back-pressure UI.

src/synthorg/api/controllers/scaling.py (2)

155-207: LGTM!

The list_strategies endpoint correctly implements cursor pagination with deterministic alphabetical ordering by strategy name. The empty-service fallback properly returns a PaginatedResponse with appropriate metadata (has_more=False, next_cursor=None).


256-322: LGTM!

The list_signals endpoint correctly implements cursor pagination. The signal deduplication logic is preserved, followed by alphabetical sorting before pagination, which ensures stable cursor positioning across pages.

web/src/__tests__/utils/fetch-with-retry.test.ts (1)

1-267: LGTM!

Comprehensive test suite covering all critical paths:

  • Retry mechanics with various Retry-After formats
  • Idempotency detection from both RequestInit and Request input
  • Budget enforcement (max retries and max wait time)
  • Abort signal handling including the built-in defaultSleep path
  • Method-based retry policies via parametrized tests

The regression tests for Request input (lines 206-243) and default sleep abort (lines 245-266) lock in the fixes from prior review feedback.

web/src/mocks/handlers/scaling.ts (1)

17-19: Keep paginated mocks tied to endpoint return types.

The handlers for strategies (line 18) and signals (line 28) use emptyPaginatedEnvelope<T>() instead of paginatedFor<typeof getScalingStrategies>(...) / paginatedFor<typeof getScalingSignals>(...). This was flagged in a prior review—envelope drift won't be caught when the endpoint function signatures change.

As per coding guidelines: "Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types".

Also applies to: 27-28

web/src/api/client.ts (1)

14-18: LGTM!

Centralizing the retry-after constants and parser into @/utils/retry-after eliminates duplication with fetch-with-retry.ts and keeps the retry logic consistent across the axios client and native fetch paths.

web/src/api/endpoints/scaling.ts (1)

42-51: LGTM!

The paginateAll wrapper correctly aggregates all pages for strategies and signals, matching the backend's cursor pagination change while keeping the API surface simple for callers that need the full list.

Also applies to: 70-79

web/src/utils/app-version.ts (1)

17-17: LGTM!

The switch to fetchWithRetryAfter with idempotent: true is appropriate—logout is replay-safe on the server, and the existing AbortController timeout ensures the boot path doesn't stall indefinitely.

Also applies to: 122-135

src/synthorg/api/controllers/settings.py (4)

151-179: LGTM!

The new SinkRotationResponse and SinkInfoResponse models follow the project conventions: frozen ConfigDict, extra="forbid", NotBlankStr for identifier fields, and typed fields throughout. This replaces the previous dict[str, Any] return shape with proper type safety.


181-212: LGTM!

The _sink_identifier() helper correctly derives stable, unique identifiers for each sink type: CONSOLE uses the fixed token, FILE uses file_path, and shipping sinks (SYSLOG/HTTP/OTLP) derive from their endpoint fields. This addresses the prior concern about colliding unnamed-<type> keys for multiple sinks of the same type.


939-964: LGTM!

The refactored _append_disabled_defaults() now returns a new list instead of mutating the input, and uses the centralized _sink_identifier() for membership testing. This follows the immutability covenant and eliminates the duplicate identifier logic that previously drifted from _sink_to_response().


569-629: LGTM!

The list_sinks endpoint correctly implements cursor-keyset pagination: accepts cursor and limit params, sorts by identifier for stable ordering, and uses paginate_cursor() to encode the response. The typed PaginatedResponse[SinkInfoResponse] return replaces the previous untyped envelope.

docs/reference/conventions.md (1)

54-58: LGTM!

The documentation accurately reflects the cursor-based pagination envelope ({limit, next_cursor, has_more}) and clarifies that there's no total count on the wire. This aligns with the implementation across the paginated controllers.

tests/unit/api/controllers/test_settings_sinks.py (1)

154-197: LGTM!

The pagination tests provide good coverage: test_pagination_round_trip validates that walking pages with limit=1 enumerates every sink exactly once, and test_tampered_cursor_rejected confirms the HMAC validation rejects invalid cursors. The assertion at lines 167-170 correctly fails loudly instead of skipping, addressing the prior review feedback.

web/src/api/endpoints/settings.ts (1)

1-1: LGTM!

The listSinks function now uses the same paginateAll + unwrapPaginated pattern as other paginated endpoints, correctly aggregating all pages into a flat array while keeping the return type unchanged for callers.

Also applies to: 50-59

web/src/mocks/handlers/settings.ts (1)

66-67: Keep this paginated mock tied to the endpoint return type.

The handler still uses emptyPaginatedEnvelope<SinkInfo>() rather than paginatedFor<typeof listSinks>(...). This decouples the mock from the actual endpoint contract, meaning envelope drift won't be caught here.

As per coding guidelines: "Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types".

web/src/__tests__/stores/sinks.test.ts (1)

9-17: LGTM!

The paginatedSinks helper is well-typed and correctly constructs the PaginatedResponse<SinkInfo> shape expected by the store. The test fixtures are properly updated to match the new cursor-paginated wire format.

tests/unit/api/rate_limits/test_policies.py (2)

208-234: LGTM!

The TestInflightRegistryStructure class mirrors the rate-limit registry tests appropriately, covering immutability, positive value constraints, canonical key format, and non-emptiness. Good structural parity with the existing test patterns.


236-278: LGTM!

The TestPerOpConcurrencyFromPolicy and TestInflightEndpointPolicies classes provide thorough coverage for the concurrency helper and documented defaults. The parametrized approach keeps test nodes minimal while covering all registered operations.

src/synthorg/api/controllers/memory.py (3)

151-166: LGTM!

The _FineTuneThresholds model correctly uses ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid") per project conventions and provides proper field constraints with Field(ge=1).


168-200: LGTM!

The _resolve_fine_tune_thresholds helper properly resolves settings from SettingsService with fallback to imported defaults. The exception handling correctly catches SettingNotFoundError, ValueError, TypeError (PEP 758 syntax) to gracefully degrade when settings are missing or unparseable.


397-434: LGTM!

The preflight handler now correctly resolves thresholds at request time via _resolve_fine_tune_thresholds and passes them to the preflight check functions. This ensures DB/env/YAML overrides take effect rather than using hardcoded defaults.

tests/integration/api/test_per_op_rate_limit_concurrent.py (1)

310-314: LGTM!

The defensive signature *_args: Any, **_kwargs: Any ensures the patch remains compatible with production helper changes without test brittleness. The inline comment clearly documents the rationale.

tests/unit/api/controllers/test_memory_admin.py (3)

176-187: LGTM!

Test expectations correctly updated to use the FINE_TUNE_DEFAULT_BATCH_SIZE constant, ensuring test assertions stay aligned with the settings definition rather than hardcoded values.


393-495: LGTM!

The TestResolveFineTuneThresholds class provides thorough coverage for the threshold resolution helper: fallback when service is missing, overrides from SettingsService, unparseable values, and missing settings. Mocks correctly use spec=SettingsService.


497-575: LGTM!

The TestCheckDocumentsBoundaries class covers all critical boundary conditions for the document count thresholds, including the count == min_required case that was previously suggested. This ensures the < vs <= logic is validated.

src/synthorg/api/controllers/providers.py (3)

213-232: LGTM!

The list_providers endpoint correctly implements cursor-based pagination: sorts providers by name for stable ordering, uses paginate_cursor with the cursor secret, and returns PaginatedResponse[ProviderResponse] with proper metadata.


264-328: LGTM!

The list_models endpoint follows the same pagination pattern: sorts models by id for deterministic ordering, handles capability enrichment gracefully with fallback, and returns PaginatedResponse[ProviderModelResponse].


556-559: LGTM!

Switching to per_op_concurrency_from_policy(...) aligns these endpoints with the policy-driven inflight guard pattern, centralizing concurrency limits in the policy registry for easier tuning.

Also applies to: 752-755

web/src/api/types/providers.ts (1)

79-84: LGTM!

The optional name field with clear docstring properly reflects the backend contract where paginated list endpoints populate the field while single-resource GETs leave it null (since the URL already identifies the provider).

web/src/mocks/handlers/setup.ts (1)

115-116: Keep these paginated mocks tied to the endpoint return types.

This uses emptyPaginatedEnvelope<SetupAgentSummary>() which only types items, not the full envelope shape. Using paginatedFor<typeof getAgents>() would keep the mock 1:1 with the client contract.

web/src/api/endpoints/providers.ts (3)

46-70: LGTM!

The cursor pagination implementation is correct. The Object.create(null) pattern and prototype-pollution key filtering (__proto__, constructor, prototype) provide solid defense-in-depth. The warning log for missing name appropriately surfaces contract violations.


77-87: LGTM!

The pagination logic for provider models correctly mirrors the listProviders pattern using paginateAll and unwrapPaginated.


225-242: LGTM!

Using fetchWithRetryAfter with { idempotent: true } is appropriate here—the initial POST that opens the SSE stream is safe to retry on 429 since the server-side pull operation is idempotent.

tests/unit/api/controllers/test_providers.py (1)

20-31: LGTM!

The test correctly validates the new paginated response shape (data == [], pagination.has_more, pagination.next_cursor) and the new tampered-cursor rejection test ensures the HMAC-signed cursor validation is enforced.

tests/integration/api/controllers/test_providers.py (1)

78-83: LGTM!

The test correctly adapts to the paginated list response by building a name-indexed dictionary from body["data"] and validating the provider properties.

web/src/mocks/handlers/providers.ts (1)

345-382: LGTM!

The new fixture builders follow the established pattern with Partial<...> overrides and provide sensible defaults for test scenarios.

tests/unit/providers/management/test_local_models.py (1)

303-365: LGTM!

Excellent adversarial test coverage for _sanitize_ollama_error. The tests thoroughly verify: non-string fallback, POSIX/Windows path redaction (including paths with spaces), host:port redaction, truncation bounds, benign message passthrough, and empty-string fallback.

tests/unit/api/controllers/test_setup_personality.py (2)

42-48: LGTM!

The switch to any() check is appropriate since the paginated list returns agents in name-sorted order rather than insertion order.


112-131: LGTM!

Test assertions correctly adapted to the new paginated response shape where body["data"] is the presets list directly.

src/synthorg/providers/management/local_models.py (3)

35-60: LGTM!

The regex patterns correctly address the past review concerns:

  • POSIX/Windows path patterns include space in character classes for paths like /var/lib/ollama/model cache/ and C:\Program Files\.
  • Host:port pattern covers bracketed IPv6, IPv4, and DNS/single-label hostnames.

63-79: LGTM!

The sanitizer is correctly side-effect free (no logging), handles non-string inputs with a fallback, applies all three redaction patterns in sequence, truncates to the configured max length, and never returns an empty string.
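For readers without the diff open, a sanitizer with these properties could be sketched as below. The regexes, the length bound, the fallback text, and the function name sanitize_error are all assumptions made for illustration; they are not the patterns in local_models.py.

```python
import re

MAX_ERROR_LEN = 200  # hypothetical bound
FALLBACK_MESSAGE = "unknown error"  # hypothetical fallback text

# Space is allowed inside the character classes so that paths such as
# "C:\Program Files\..." are redacted whole. The trade-off is that words
# following a path may be swallowed too (over-redaction, never under).
_WINDOWS_PATH = re.compile(r"[A-Za-z]:\\[\w .\\-]+")
_POSIX_PATH = re.compile(r"(?<![\w/:])/[\w ./-]+")
# Bracketed IPv6, IPv4, or DNS / single-label hostname, followed by :port.
_HOST_PORT = re.compile(
    r"(?:\[[0-9A-Fa-f:]+\]|[A-Za-z0-9][A-Za-z0-9.-]*):\d{1,5}\b"
)


def sanitize_error(raw: object) -> str:
    """Redact paths and host:port pairs; never log, never return ''."""
    if not isinstance(raw, str):
        return FALLBACK_MESSAGE
    text = _WINDOWS_PATH.sub("[REDACTED-PATH]", raw)
    text = _POSIX_PATH.sub("[REDACTED-PATH]", text)
    text = _HOST_PORT.sub("[REDACTED-HOST]", text)
    text = text[:MAX_ERROR_LEN].strip()
    return text or FALLBACK_MESSAGE
```

Keeping the function free of logging means it can be called from both the error path and tests without side effects, and the caller decides what, if anything, gets logged.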


192-206: LGTM!

The error handling correctly uses "error" in data presence check (addressing the past review concern about falsey values), calls the sanitizer, and logs with error_type=type(error).__name__ rather than the raw error value.
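The difference between a presence check and a truthiness check is easy to miss, so here is a small sketch. Both function names are hypothetical and the payloads are made up; the point is only that a falsey error value such as an empty string is still an error signal.

```python
from typing import Any


def has_error_by_truthiness(data: dict[str, Any]) -> bool:
    """Misses payloads whose error value is '', 0, None, or another falsey value."""
    return bool(data.get("error"))


def has_error_by_presence(data: dict[str, Any]) -> bool:
    """Treats the mere presence of the key as the error signal."""
    return "error" in data
```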

web/src/api/endpoints/setup.ts (1)

41-48: Pagination client integration looks solid

Good migration to paginateAll + unwrapPaginated for both setup agents and personality presets; this keeps behavior consistent with the new cursor-paginated backend contract.

Also applies to: 106-113

@Aureliolo Aureliolo force-pushed the perf/performance-data-integrity branch from 03b7f03 to acea685 Compare May 8, 2026 22:17
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 8, 2026 22:19 — with GitHub Actions Inactive

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/synthorg/providers/management/dtos.py (1)

600-619: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Make name explicit at every call site.

Line 603 currently makes the paginated-list identifier optional, so a future list caller can omit it and still get a valid-looking ProviderResponse with name=None. That would quietly break the dict-by-name reconstruction contract. Requiring callers to pass name= explicitly, including None for GET-by-path, makes that omission fail fast.

Suggested change
 def to_provider_response(
     config: ProviderConfig,
     *,
-    name: str | None = None,
+    name: str | None,
 ) -> ProviderResponse:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/providers/management/dtos.py` around lines 600 - 619, The
to_provider_response function currently has an optional parameter name: str |
None = None which allows callers to omit the identifier and can silently produce
a ProviderResponse with name=None; change the function signature in
to_provider_response to require name: str | None (remove the default) so every
call site must explicitly pass name=... (even when None for GET-by-path); update
any call sites that invoke to_provider_response(...) to pass name=... to satisfy
the new required parameter.
web/src/api/client.ts (1)

149-169: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Retry incorrectly skipped when Retry-After header is absent.

The condition if (waitMs > 0 && waitMs !== DO_NOT_RETRY) prevents retrying when parseRetryAfterMs returns 0. Per the function's contract, 0 means "retry immediately", but this code skips the retry entirely. web/src/utils/fetch-with-retry.ts handles the same return value correctly by calling sleep(waitMs) unconditionally, which sleeps for 0 ms when the header is absent. Gate the retry on DO_NOT_RETRY alone, and skip the sleep when waitMs is 0:

-        if (waitMs > 0 && waitMs !== DO_NOT_RETRY) {
+        if (waitMs !== DO_NOT_RETRY) {
           config._rateLimitRetries = retries + 1
           const nextHeaders = { ...(config.headers ?? {}) } as Record<string, string>
           nextHeaders[RETRY_COUNT_HEADER] = String(retries + 1)
           const retryConfig: AxiosRequestConfig = {
             ...config,
             headers: nextHeaders,
           }
-          await sleep(waitMs)
+          if (waitMs > 0) {
+            await sleep(waitMs)
+          }
           return apiClient.request(retryConfig)
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@web/src/api/client.ts` around lines 149 - 169, The retry logic skips retries
when parseRetryAfterMs returns 0 (meaning "retry immediately"); update the
isIdempotent retry block so you only treat DO_NOT_RETRY as the no-retry case
(i.e., check waitMs !== DO_NOT_RETRY) and preserve the existing max-retries
guard (retries < MAX_RATE_LIMIT_RETRIES); when waitMs !== DO_NOT_RETRY,
increment config._rateLimitRetries, set RETRY_COUNT_HEADER on the cloned
headers, build retryConfig, await sleep(waitMs) (which may be 0) and then call
apiClient.request(retryConfig); keep DO_NOT_RETRY handling as-is to propagate
the 429.
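The decision table behind this finding can be sketched in Python, independent of the TypeScript client. The sentinel value and retry cap below are hypothetical placeholders, since the actual values of DO_NOT_RETRY and MAX_RATE_LIMIT_RETRIES are not shown in this review.

```python
DO_NOT_RETRY = -1  # hypothetical sentinel value
MAX_RATE_LIMIT_RETRIES = 3  # hypothetical cap


def should_retry_before_fix(wait_ms: int, retries: int, idempotent: bool) -> bool:
    """Mirrors the flagged condition: a wait of 0 ms is wrongly treated as no-retry."""
    return (
        idempotent
        and retries < MAX_RATE_LIMIT_RETRIES
        and wait_ms > 0
        and wait_ms != DO_NOT_RETRY
    )


def should_retry_after_fix(wait_ms: int, retries: int, idempotent: bool) -> bool:
    """Only the sentinel suppresses the retry; 0 means retry immediately."""
    return (
        idempotent
        and retries < MAX_RATE_LIMIT_RETRIES
        and wait_ms != DO_NOT_RETRY
    )
```

The two functions differ only for wait_ms == 0, which is exactly the missing-header case the comment describes.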
src/synthorg/api/controllers/settings.py (1)

611-617: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't log the full sink config blobs on validation failure.

sink_overrides and custom_sinks are operator-supplied JSON and can contain filesystem paths, URLs, or auth material. Logging them verbatim at WARNING turns a bad config into a log leak. Keep error_type/error and log only coarse metadata such as presence/counts.

♻️ Minimal scrubbed logging shape
         except ValueError as exc:
             logger.warning(
                 SETTINGS_OBSERVABILITY_VALIDATION_FAILED,
                 error_type=type(exc).__name__,
                 error=safe_error_description(exc),
-                sink_overrides=overrides_json,
-                custom_sinks=custom_json,
+                has_sink_overrides=overrides_json != "{}",
+                has_custom_sinks=custom_json != "[]",
             )
             sinks = _defaults_only_sinks()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/settings.py` around lines 611 - 617, The
SETTINGS_OBSERVABILITY_VALIDATION_FAILED log currently emits operator-supplied
blobs sink_overrides and custom_sinks which may contain secrets; change the
logger.warning call in the exception handler so it no longer logs these raw JSON
blobs. Instead, compute and pass scrubbed metadata (e.g., has_sink_overrides:
bool, sink_overrides_count: int, has_custom_sinks: bool, custom_sinks_count:
int) alongside error_type (type(exc).__name__) and error
(safe_error_description(exc)), and remove sink_overrides/custom_sinks from the
structured payload; update the call site where logger.warning is invoked to use
these scrubbed fields.
♻️ Duplicate comments (2)
src/synthorg/api/rate_limits/policies.py (1)

104-105: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move hardcoded rate-limit and inflight defaults to settings definitions.

The numeric literals 60, 4, 2, etc. are business-logic thresholds that must live in src/synthorg/settings/definitions/<namespace>.py per the coding guidelines. Without this, operators cannot tune these values through the documented config precedence chain.

As per coding guidelines, "Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal."

Also applies to: 224-231

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/rate_limits/policies.py` around lines 104 - 105, The
hardcoded numeric literals used as rate-limit and inflight defaults (e.g., the
tuple (60, 60) for the "events.stream" entry and the other numeric literals
around lines 224-231) must be moved into the settings definitions module and
referenced from here; create appropriate constants in
src/synthorg/settings/definitions/<namespace>.py (naming like
EVENTS_STREAM_RATE_LIMIT and EVENTS_STREAM_INFLIGHT or a single
EVENTS_STREAM_POLICY tuple) and replace the bare numbers in the policies mapping
in src/synthorg/api/rate_limits/policies.py (the mapping entry keyed
"events.stream" and the other entries around 224-231) to reference those
settings constants so operators can tune them via the config precedence chain.
src/synthorg/api/controllers/settings.py (1)

181-212: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid using raw sink destinations as the public identifier.

identifier is now both the client-visible sink key and the pagination sort key. Returning http_url / otlp_endpoint verbatim can expose embedded credentials or query tokens, and two sinks aimed at the same destination still collide, which makes cursor paging unstable. Prefer a stable non-reversible fingerprint (or persisted sink id) instead of the raw destination string.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/settings.py` around lines 181 - 212,
_sink_identifier currently returns raw destination strings (file_path, http_url,
otlp_endpoint, syslog host:port) which can leak credentials and break
pagination; update _sink_identifier (and any callers like
_append_disabled_defaults) to return a stable, non-reversible fingerprint
instead of the raw destination: compute a deterministic hash (e.g. SHA‑256 hex)
over a canonicalized destination string for each non-CONSOLE SinkConfig and
return a type-prefixed compact fingerprint (e.g. "http:<hash>" / "otlp:<hash>" /
"syslog:<hash>" / "file:<hash>") so identical targets map the same key but no
sensitive data is exposed and pagination sort keys remain stable. Ensure CONSOLE
still returns CONSOLE_SINK_ID and that the hashing logic uses consistent
canonicalization across runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/api/controllers/test_setup.py`:
- Around line 1585-1605: The test's "full" value is currently fetched with
test_client.get("/api/v1/setup/personality-presets") which now returns only the
first page; change the setup so "full" is built by walking the pagination (use
the same cursor loop used later) or by deriving the expected ordered list from
the preset registry instead of relying on the default single-page
response—update the code that assigns `full` (and any use of `DEFAULT_LIMIT`
behavior) to repeatedly call test_client.get with `?limit=` and `cursor=` until
pagination["next_cursor"] is null so `full` truly represents the entire dataset
before performing the round-trip/assertion.

In `@tests/unit/meta/telemetry/test_emitter.py`:
- Line 427: The AsyncMock created by patch.object(em, "_send_batch",
new_callable=AsyncMock) is unspecced; change each patch to pass the original
method as the spec (e.g., patch.object(em, "_send_batch",
new_callable=AsyncMock, spec=em._send_batch) or spec_set=em._send_batch) so the
mock matches the real async method and remains awaitable; update every
occurrence for em._send_batch in this test file (the lines called out) to use
new_callable=AsyncMock with spec/spec_set instead of relying on autospec.

In `@tests/unit/providers/management/test_local_models.py`:
- Around line 346-351: The current test test_redacts_host_port only checks a DNS
hostname case; update it to a parametrized pytest test using
`@pytest.mark.parametrize` to cover the other supported host forms so the regex
regression is detected: add cases for "localhost:11434", an IPv4 like
"127.0.0.1:11434", and a bracketed IPv6 like "[::1]:11434"; for each case call
_sanitize_ollama_error and assert the original host is not present and
"[REDACTED-HOST]" is present, keeping the test name and referencing
test_redacts_host_port and _sanitize_ollama_error so the behavior is validated
across all formats.
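For the unspecced AsyncMock finding in test_emitter.py, the patching shape being requested can be sketched with the standard library alone. The Emitter class below is a hypothetical stand-in for the real emitter, and run_flush_with_specced_mock exists only to make the sketch self-contained.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class Emitter:  # hypothetical stand-in for the real emitter
    async def _send_batch(self, batch: list[str]) -> None:
        raise RuntimeError("network call; tests are expected to patch this")

    async def flush(self, batch: list[str]) -> None:
        await self._send_batch(batch)


def run_flush_with_specced_mock() -> AsyncMock:
    """Patch _send_batch with an AsyncMock constrained by the real method."""
    em = Emitter()
    with patch.object(
        em,
        "_send_batch",
        new_callable=AsyncMock,
        spec=em._send_batch,  # restricts the mock's attributes to the original's
    ) as send:
        asyncio.run(em.flush(["event"]))
    return send
```

With spec set, reading an attribute the real method lacks raises AttributeError, so a typo in a test fails loudly instead of silently creating a child mock.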


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a13df804-d797-4a2e-a68f-c4e7e9664ba3

📥 Commits

Reviewing files that changed from the base of the PR and between 03b7f03 and acea685.

⛔ Files ignored due to path filters (2)
  • src/synthorg/persistence/postgres/revisions/atlas.sum is excluded by !**/*.sum
  • src/synthorg/persistence/sqlite/revisions/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (65)
  • docs/reference/conventions.md
  • scripts/loop_bound_init_baseline.txt
  • scripts/mock_spec_baseline.txt
  • scripts/no_magic_numbers_baseline.txt
  • scripts/run_affected_tests.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_models.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql
  • src/synthorg/persistence/postgres/schema.sql
  • src/synthorg/persistence/sqlite/revisions/20260508204132_perf_indices_cost_decision.sql
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/settings/definitions/memory.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/test_dto_forbid_extra.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/scripts/test_run_affected_tests.py
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/client.ts
  • web/src/api/endpoints/providers.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/api/endpoints/setup.ts
  • web/src/api/types/providers.ts
  • web/src/api/types/setup.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/stores/websocket.ts
  • web/src/utils/app-version.ts
  • web/src/utils/constants.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
💤 Files with no reviewable changes (3)
  • tests/unit/api/test_dto_forbid_extra.py
  • web/src/api/types/setup.ts
  • src/synthorg/api/controllers/setup_models.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Deploy Preview
  • GitHub Check: Build Backend
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (14)
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/api/endpoints/scaling.ts
  • web/src/mocks/handlers/index.ts
  • web/src/api/types/providers.ts
  • web/src/stores/websocket.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/api/endpoints/setup.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/api/endpoints/providers.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/endpoints/settings.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/utils/retry-after.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/utils/app-version.ts
  • web/src/utils/constants.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/providers.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Reuse components from web/src/components/ui/. Never hardcode hex colors, font-family, pixel spacing, Motion transitions, or BCP 47 locale strings; use design tokens, @/lib/motion presets, helpers in @/utils/format. Enforced by scripts/check_web_design_system.py.

Files:

  • web/src/api/endpoints/scaling.ts
  • web/src/mocks/handlers/index.ts
  • web/src/api/types/providers.ts
  • web/src/stores/websocket.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/api/endpoints/setup.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/api/endpoints/providers.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/endpoints/settings.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/utils/retry-after.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/utils/app-version.ts
  • web/src/utils/constants.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/providers.ts
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every cost-bearing Pydantic model carries currency: CurrencyCode; mixing raises MixedCurrencyAggregationError (HTTP 409). Aggregations over cost-bearing fields call assert_currencies_match before reducing.
Direct os.environ.get(...) outside startup is forbidden. Ghost-wired settings (consuming service never instantiated at boot) are flagged by scripts/check_setting_to_startup_trace.py; per-setting opt-out via # lint-allow: bootstrap-wiring -- <reason>.
Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal. Bare module-level _FOO = 1024 constants and bare numeric defaults (def f(timeout=30)) are forbidden.
Allowlisted numeric literals: 0, 1, -1 (sentinel/off-by-one), HTTP status codes 100-599 in status_code= defaults, hex bit-masks (0xff, 0x80), powers-of-2 in buffering= / chunk_size= / buffer_size= defaults, anything inside settings/definitions/, persistence/migrations/, observability/events/. Per-line opt-out: # lint-allow: magic-numbers -- <reason> (mandatory non-empty justification). Enforced by scripts/check_no_magic_numbers.py.
Comments explain WHY only, never origin / review / issue context. Forbidden: reviewer citations (`pre-PR review #N`), in-code issue back-refs (`(#1682)`), naked `SEC-1` taxonomy in `src/`, migration framing (`ported from`), round narrative, self-evident restatements.
Keep in comments: hidden constraints, subtle invariants, upstream-bug workarounds (with stable bug-tracker URL), why a non-obvious choice was made. Enforced by `scripts/check_no_review_origin_in_code.py` and `scripts/check_no_migration_framing.py` (pre-push); per-line opt-outs `# lint-allow: review-origin -- ` and `# lint-allow: migration-framing -- `.
No `from __future__ import annotations`: Python 3.14 has PEP 649.
PEP 758 except: `except A, B:` (no parens) when not bin...

Files:

  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
src/synthorg/providers/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/providers/**/*.py: All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry in driver subclasses or calling code.
RetryConfig / RateLimiterConfig set per-provider in ProviderConfig. Retryable: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable raise immediately.

Files:

  • src/synthorg/providers/management/dtos.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit / integration / e2e / slow.
Mock-spec gate: every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt; regenerate via uv run python scripts/check_mock_spec.py --update. Without spec= mocks silently absorb every attribute access.
Shared mocks: use mock_dispatcher from tests/conftest.py (AsyncMock(spec=NotificationDispatcher)).
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Logger spying antipattern: never monkeypatch.setattr(module.logger, "info", spy); the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead.
Parametrize: prefer @pytest.mark.parametrize for similar cases.
Property-based: Hypothesis (Python), fast-check (React), testing.F (Go). CI runs 10 deterministic examples (derandomize=True). Hypothesis failures are real bugs: fix the bug and add an @example(...) decorator.
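
A minimal sketch of why the mock-spec gate matters, using only `unittest.mock` from the standard library. `Dispatcher` is a hypothetical stand-in for illustration, not the project's `NotificationDispatcher`.

```python
from unittest.mock import Mock


class Dispatcher:
    """Hypothetical collaborator; stands in for a real concrete class."""

    def dispatch(self, message: str) -> None:
        """Deliver a message (body is irrelevant for this sketch)."""


def misspelled_call_is_caught(mock: Mock) -> bool:
    """Return True if calling a method that does not exist raises."""
    try:
        mock.dispach("hello")  # deliberate typo: 'dispach'
    except AttributeError:
        return True
    return False


loose = Mock()                  # absorbs any attribute access, typos included
strict = Mock(spec=Dispatcher)  # exposes only attributes that Dispatcher has
```

With `spec=` the typo fails loudly in the test; without it the call returns another `Mock` and the test keeps passing.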

Files:

  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/providers/management/test_local_models.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/controllers/test_providers.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/scripts/test_run_affected_tests.py
  • tests/unit/api/controllers/test_setup.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/providers/management/test_local_models.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/controllers/test_providers.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/scripts/test_run_affected_tests.py
  • tests/unit/api/controllers/test_setup.py
web/src/mocks/handlers/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/mocks/handlers/**/*.ts: Mirror every exported endpoint in web/src/api/endpoints/*.ts with a 1:1 default happy-path MSW handler in web/src/mocks/handlers/; boot test-setup with onUnhandledRequest: 'error' and override per-case via server.use(...), never vi.mock('@/api/endpoints/*')
Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types

Files:

  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/providers.ts
web/src/stores/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/stores/**/*.ts: All store mutation actions (create / update / delete) must follow the stores/connections/crud-actions.ts pattern: wrap API calls in try/catch, success updates state + emits success toast, failure logs + emits error toast + returns sentinel (null for entity, false for delete); callers MUST NOT wrap store mutation calls in try/catch
List-read store actions must set error: string | null on the store instead of toasting; use opaque cursor-based pagination via PaginationMeta, keep nextCursor + hasMore in state (not offset arithmetic), and early-return when !hasMore || !nextCursor
Always capture previous synchronously in optimistic mutations and restore in the catch block
Any new Zustand store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global afterEach in test-setup.tsx
Store files over ~600 lines must be sliced into packages with one of two aggregation patterns: package-internal index.ts or sibling .ts aggregator

Files:

  • web/src/stores/websocket.ts
src/synthorg/**/{api,services,repositories}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService); services centralize audit logging; repositories must not log mutations themselves.

Files:

  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes the socket with code 4011.
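
The sliding-window tracking described above can be sketched roughly as follows. This is not the project's `_SlidingWindowRateLimiter`: the class name, the method name, and the choice to treat "at least `max_failures` failures inside the window" as saturation are assumptions made for illustration.

```python
from collections import deque


class SlidingWindowCounter:
    """Hypothetical failure counter over a trailing time window."""

    def __init__(self, window_seconds: float, max_failures: int) -> None:
        self._window = window_seconds
        self._max = max_failures
        self._events: deque[float] = deque()

    def record_failure(self, now: float) -> bool:
        """Record a failure at `now`; return True once the window is saturated."""
        self._events.append(now)
        cutoff = now - self._window
        # Drop failures that have aged out of the trailing window.
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        # Assumption: saturation means reaching the configured maximum.
        return len(self._events) >= self._max
```

A caller would close the socket (code 4011 in the guideline above) when `record_failure` returns `True`; passing `now` in explicitly keeps the sketch testable without a real clock.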

Files:

  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.md: Use fenced code blocks with language tags: d2 for architecture/nested containers, mermaid for flowcharts/sequence/pipelines. Use markdown tables for tabular data; never use text fences with ASCII box-drawing.
Static historical counts and illustrative scale numbers may carry a per-line opt-out: <!-- lint-allow: doc-numeric-macros -- <reason> --> (reason mandatory). Enforced by scripts/check_doc_numeric_macros.py (pre-push).

Files:

  • docs/reference/conventions.md
web/src/utils/constants.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Keep the WebSocket wire protocol constants (WS_PROTOCOL_VERSION, WS_MAX_MESSAGE_SIZE, WS_HEARTBEAT_INTERVAL_MS, WS_PONG_TIMEOUT_MS, LOG_SANITIZE_MAX_LENGTH) in web/src/utils/constants.ts in lockstep with src/synthorg/api/ws_models.py / src/synthorg/api/controllers/ws.py; bump protocol version on both sides together for breaking payload changes

Files:

  • web/src/utils/constants.ts
src/synthorg/settings/definitions/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

For every mutable setting: DB > env (SYNTHORG_<NS>_<KEY>) > YAML > code default, resolved through SettingsService / ConfigResolver. Register new settings in src/synthorg/settings/definitions/<namespace>.py.

Files:

  • src/synthorg/settings/definitions/memory.py
src/synthorg/settings/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

First cold read of a setting emits one INFO settings.value.resolved; subsequent reads stay DEBUG. Sanctioned exceptions: init-time only (env-only, no registry entry) and read-only post-init (read_only_post_init=True; set() raises SettingReadOnlyError).

Files:

  • src/synthorg/settings/definitions/memory.py
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/providers/management/dtos.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/meta/telemetry/test_emitter.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • src/synthorg/api/rate_limits/__init__.py
  • tests/unit/api/controllers/test_setup_personality.py
  • scripts/run_affected_tests.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/providers/management/test_local_models.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • src/synthorg/settings/definitions/memory.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/controllers/test_providers.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_personality.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/controllers/test_memory_admin.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/memory.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/providers.py
  • tests/unit/scripts/test_run_affected_tests.py
  • tests/unit/api/controllers/test_setup.py
  • src/synthorg/api/controllers/scaling.py
🔇 Additional comments (73)
src/synthorg/providers/drivers/litellm_driver.py (2)

136-159: Clock seam injection is clean and correctly defaulted.

clock is optional and safely defaults to SystemClock(), which preserves runtime behavior while enabling deterministic tests.


184-184: TTL now correctly uses the injected clock source.

Using self._clock.monotonic() here is the right seam for deterministic credential-cache timing.

tests/unit/providers/drivers/test_litellm_auth.py (1)

164-257: Deterministic credential-cache coverage is strong.

These FakeClock-driven tests lock in cache-hit/miss behavior, exact-TTL boundary semantics, and OAuth non-caching in a stable way.

web/src/utils/constants.ts (1)

12-23: Well-scoped jitter constants with clear intent.

Centralizing the reconnect jitter bounds here (with explicit docs) makes the backoff behavior easy to reason about and test across store + test code.

web/src/stores/websocket.ts (1)

232-249: Reconnect jitter implementation looks correct and robust.

The jittered exponential backoff is implemented cleanly, and the post-rounding minimum delay clamp is a good guard against pathological future constant changes.
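
As a rough illustration of the shape being reviewed, here is jittered exponential backoff with a post-rounding clamp, written in Python rather than the store's TypeScript. Every constant and parameter name below is a placeholder, not a value from `web/src/utils/constants.ts`.

```python
import random
from collections.abc import Callable


def reconnect_delay_ms(
    attempt: int,
    *,
    base_ms: int = 1_000,       # placeholder base delay
    max_ms: int = 30_000,       # placeholder backoff ceiling
    jitter_min: float = 0.5,    # placeholder lower jitter factor
    jitter_max: float = 1.5,    # placeholder upper jitter factor
    min_delay_ms: int = 100,    # placeholder floor applied after rounding
    rand: Callable[[], float] = random.random,
) -> int:
    """Return the delay before reconnect attempt number `attempt` (0-based)."""
    backoff = min(base_ms * 2**attempt, max_ms)
    # Scale the backoff by a factor drawn from [jitter_min, jitter_max).
    factor = jitter_min + (jitter_max - jitter_min) * rand()
    # Clamp after rounding so a tiny base or factor can never yield ~0 ms.
    return max(round(backoff * factor), min_delay_ms)
```

Injecting `rand` is what makes lower, mid, and upper jitter cases assertable without patching the global random source.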

web/src/__tests__/stores/websocket.test.ts (1)

425-428: Great deterministic coverage for reconnect jitter boundaries.

These updates make reconnect timing assertions resilient and explicitly verify lower/mid/upper jitter behavior against the backoff formula.

Also applies to: 434-476

src/synthorg/persistence/sqlite/schema.sql (2)

108-111: Composite cost-record indexes look correct for the new pagination shape.

These indexes align with agent/task filtered newest-first scans and are consistent with the PR’s query-plan goals.


648-649: Decision-records composite index shape is solid.

Including id as the trailing key provides deterministic tie-breaking for cursor-style ordering.

src/synthorg/persistence/postgres/schema.sql (2)

94-97: Cost-records composite indexes are correctly defined.

The (agent_id, timestamp DESC) and (task_id, timestamp DESC) shapes match the intended Postgres plan pinning.


663-664: decision_records pagination index looks good.

(task_id, recorded_at, id) is appropriate for stable task-scoped ordered reads.

src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql (1)

1-6: Migration contents are consistent with the schema changes.

Good parity with the Postgres source-of-truth index definitions.

src/synthorg/persistence/sqlite/revisions/20260508204132_perf_indices_cost_decision.sql (1)

1-6: SQLite migration is clean and aligned with the performance objective.

Index definitions match the updated SQLite schema and test intent.

tests/integration/persistence/test_perf_indices_postgres.py (3)

25-25: Integration marker usage is correct.

Good to see module-level pytest.mark.integration applied.


64-83: _explain_plan helper is well-structured for deterministic planner assertions.

Using ANALYZE, scoped planner toggles, and composed SQL keeps these checks focused and stable.


126-159: Decision-record fixture shape is strong for index-plan pinning.

Seeding many rows for a single task makes the ORDER BY recorded_at, id assertion meaningfully exercise the composite index.

tests/integration/persistence/test_perf_indices_sqlite.py (3)

28-28: Integration marker is correctly applied.

Module-level pytest.mark.integration matches the test classification rules.


74-113: SQL allowlist hardening in _explain_plan is a solid improvement.

Constraining ANALYZE/EXPLAIN shapes here makes the helper safer and clearer.


151-183: Decision-records plan test is well-designed.

The single-task high-cardinality seed gives a reliable signal for idx_dr_task_recorded_id usage.
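
For context, an EXPLAIN-asserting check of this kind can be sketched with the standard-library `sqlite3` module. The table below is reduced to the three indexed columns and the seed data is invented; only the index name and column order come from the review above. The `explain_plan` helper here is a simplified stand-in, not the test module's `_explain_plan`.

```python
import sqlite3


def explain_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute(f"EXPLAIN QUERY PLAN {sql}", params).fetchall()
    return "\n".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE decision_records (
        id TEXT PRIMARY KEY,
        task_id TEXT NOT NULL,
        recorded_at TEXT NOT NULL
    );
    CREATE INDEX idx_dr_task_recorded_id
        ON decision_records (task_id, recorded_at, id);
    """
)
# Invented seed: 4 tasks x 50 rows so the planner has statistics to work with.
conn.executemany(
    "INSERT INTO decision_records (id, task_id, recorded_at) VALUES (?, ?, ?)",
    [
        (f"d{i:03d}", f"task-{i % 4}", f"2026-01-01T00:{i // 60:02d}:{i % 60:02d}")
        for i in range(200)
    ],
)
conn.execute("ANALYZE")

plan = explain_plan(
    conn,
    "SELECT id FROM decision_records WHERE task_id = ? ORDER BY recorded_at, id",
    ("task-1",),
)
```

A test would then assert that the index name appears in `plan` and that no temporary sort step does, since the trailing `id` key lets the index satisfy the `ORDER BY` directly.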

scripts/run_affected_tests.py (3)

203-227: LGTM!

The new regexes for detecting xdist worker abort patterns are well-structured:

  • _NODE_DOWN_RE correctly captures the worker ID from the [gwN] node down: Not properly terminated format
  • _XDIST_INTERNAL_ERROR_RE uses re.MULTILINE appropriately to match INTERNALERROR> at the start of any line

The docstrings clearly explain the relationship between these signatures and the underlying Python 3.14 + Windows ProactorEventLoop teardown race.


592-595: LGTM!

The function follows the established pattern of _parse_worker_crashes and correctly extracts worker IDs from the node-down announcements.


665-675: LGTM!

The new branch correctly handles the case where workers terminated between tests (no paired crashed while running line). Key points:

  1. The check node_down_workers and has_internal_error and returncode != 0 ensures all three conditions are required for advisory classification
  2. Placing this after the if crashes: branch ensures that if xdist captured the test ID in a crash line, that takes precedence
  3. Formatting worker IDs as <worker gwN> satisfies the crash_advisory invariant (non-empty crashed_tests) while clearly indicating these are worker identifiers, not test names
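
The three points above can be sketched as follows. The patterns are reconstructed from the formats quoted in this review and may not match the script's actual regexes; `advisory_workers` is a hypothetical helper that leaves out the `FAILED`-line precedence handled elsewhere in the script.

```python
import re

# Reconstructed from the "[gwN] node down: Not properly terminated" format.
NODE_DOWN_RE = re.compile(r"\[(gw\d+)\] node down: Not properly terminated")
# MULTILINE so "^" matches INTERNALERROR> at the start of any line.
XDIST_INTERNAL_ERROR_RE = re.compile(r"^INTERNALERROR>", re.MULTILINE)


def advisory_workers(output: str, returncode: int, crashes: list[str]) -> list[str]:
    """Return the entries to report as crashed, or [] if not advisory."""
    if crashes:
        # A captured "crashed while running" test ID takes precedence.
        return crashes
    workers = NODE_DOWN_RE.findall(output)
    has_internal_error = XDIST_INTERNAL_ERROR_RE.search(output) is not None
    # All three signals are required before classifying as advisory.
    if workers and has_internal_error and returncode != 0:
        return [f"<worker {worker}>" for worker in workers]
    return []
```

Returning an empty list when any signal is missing keeps the gate failing closed on degraded output that does not match the documented pattern.
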
tests/unit/scripts/test_run_affected_tests.py (3)

524-553: LGTM!

The test thoroughly covers the node-down + internal-error advisory path. The docstring explains the exact scenario (Python 3.14 + Windows ProactorEventLoop teardown race) and the assertion correctly expects worker IDs in the crashed_tests field in lieu of test IDs.


556-569: LGTM!

Good boundary test: node down alone without the INTERNALERROR> signature should not be classified as advisory. This ensures the gate fails closed when the output is degraded but doesn't match the documented native-flakiness pattern.


572-589: LGTM!

Critical precedence test: real FAILED lines must never be demoted by node-down/internal-error noise. The test correctly verifies that the regression classification includes the actual failed test in failed_tests.

src/synthorg/memory/embedding/fine_tune_orchestrator.py (2)

80-95: LGTM!

The clock seam implementation follows the project's clock injection pattern correctly. The clock: Clock | None = None parameter with SystemClock() default enables deterministic testing while maintaining backward compatibility.


601-639: LGTM!

The clock binding pattern is correct: capturing self._clock to a local variable outside the inner closure ensures thread-safe access and stable test behavior. The comment clearly explains the thread-safety invariants and test constraints.

scripts/loop_bound_init_baseline.txt (1)

32-32: LGTM!

Baseline line number correctly updated to reflect the new position of _op_lock after the clock parameter addition.

tests/unit/memory/embedding/test_fine_tune_orchestrator.py (2)

20-29: LGTM!

Good practice importing _PROGRESS_THROTTLE_SEC from the module under test rather than duplicating the literal—this keeps the test in sync if the throttle value changes.


204-259: LGTM!

The test correctly validates throttle behavior at the boundary:

  • Initial calls within the window are suppressed
  • First call crossing _PROGRESS_THROTTLE_SEC triggers an emit
  • Subsequent calls within the new window are again suppressed

The asyncio.sleep(0) after each callback correctly yields control to execute the call_soon_threadsafe-scheduled updates.
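
The throttle boundary being tested can be illustrated with a simplified, synchronous sketch. It omits the thread hop and `call_soon_threadsafe` scheduling entirely; `make_throttled`, this `FakeClock`, and the inclusive `>=` crossing are assumptions, not the orchestrator's code.

```python
from collections.abc import Callable


class FakeClock:
    """Virtual clock for the sketch; time moves only via advance()."""

    def __init__(self) -> None:
        self._now = 0.0

    def monotonic(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


def make_throttled(
    emit: Callable[[int], None],
    clock: FakeClock,
    interval_sec: float,
) -> Callable[[int], None]:
    """Wrap `emit` so it fires at most once per `interval_sec`."""
    # The first window starts at construction, so early calls are suppressed.
    last_emit = clock.monotonic()

    def on_progress(step: int) -> None:
        nonlocal last_emit
        now = clock.monotonic()
        # Assumption: reaching the interval exactly counts as crossing it.
        if now - last_emit >= interval_sec:
            last_emit = now
            emit(step)

    return on_progress
```

Binding `clock` once outside the closure mirrors the capture-to-local pattern noted for `fine_tune_orchestrator.py` above.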

tests/unit/meta/telemetry/test_anonymizer.py (1)

437-446: Health-check suppression update looks correct.

Re-listing HealthCheck.differing_executors in per-test @settings is a sound fix for deterministic repeated execution.

Also applies to: 476-479

src/synthorg/meta/telemetry/emitter.py (1)

109-114: Clock seam wiring is consistent.

Using the injected clock for both initialization and flush timestamp updates is correct and keeps timing behavior deterministic in tests.

Also applies to: 137-137, 240-240

web/src/api/types/providers.ts (1)

79-84: LGTM: Clear documentation for the optional name field.

The documentation clearly explains the list-vs-single-GET contract for the name field. This aligns with the backend's ProviderResponse DTO and the paginated endpoint behavior.

src/synthorg/api/rate_limits/__init__.py (1)

35-37: LGTM: Clean public API extension.

The new exports (INFLIGHT_POLICIES, per_op_concurrency_from_policy) are correctly added to both the import block and __all__.

Also applies to: 47-60

src/synthorg/api/controllers/events.py (1)

22-25: LGTM: Policy-driven guards correctly applied.

The SSE stream endpoint now enforces both rate-limiting (via per_op_rate_limit_from_policy) and concurrency caps (via per_op_concurrency_from_policy), following the policy registry pattern consistently.

Also applies to: 530-537

tests/unit/api/rate_limits/test_controller_coverage.py (1)

143-147: LGTM: Test coverage extended correctly.

The stream endpoint is now asserted to carry the events.stream policy guard, matching the production decorator.

web/src/utils/retry-after.ts (1)

1-66: LGTM: Well-structured Retry-After parser with proper sanitization.

The shared parser centralizes retry logic, properly sanitizes attacker-controlled header values before logging, and clearly documents the retry budget and sentinel values.

web/src/utils/fetch-with-retry.ts (3)

49-66: LGTM: Proper AbortSignal cancellation in sleep.

The defaultSleep function correctly clears the timer and resolves immediately when the signal fires, avoiding unnecessary wait time on abort.


68-110: LGTM: Comprehensive idempotency detection.

The helper functions properly extract method and headers from both RequestInit and Request objects, preventing misclassification of pre-built Request instances.


125-168: LGTM: Robust retry loop with proper abort handling.

The retry logic correctly:

  • Honors DO_NOT_RETRY by returning immediately
  • Handles waitMs === 0 by calling sleep with zero delay (no-op)
  • Checks abort signal before and after sleep
  • Respects the retry budget
web/src/utils/app-version.ts (1)

110-141: LGTM!

The switch to fetchWithRetryAfter with { idempotent: true } is appropriate here since logout is replay-safe on the server. The existing AbortController timeout bounds the retry window, and error handling remains intact.

web/src/__tests__/utils/fetch-with-retry.test.ts (1)

1-267: LGTM!

Comprehensive test coverage for fetchWithRetryAfter including idempotency rules, retry budget exhaustion, abort signal handling, and the previously requested regression tests for Request input (lines 206-243) and default sleep abort path (lines 245-266).

tests/unit/api/rate_limits/test_policies.py (1)

193-278: LGTM!

The new inflight policy tests (TestInflightRegistryStructure, TestPerOpConcurrencyFromPolicy, TestInflightEndpointPolicies) mirror the existing rate-limit registry test structure and provide comprehensive coverage for the new INFLIGHT_POLICIES registry and per_op_concurrency_from_policy helper.

src/synthorg/api/controllers/memory.py (3)

151-200: LGTM!

The _FineTuneThresholds model follows Pydantic best practices (frozen=True, allow_inf_nan=False, extra="forbid"), and _resolve_fine_tune_thresholds correctly resolves settings at request time with appropriate fallbacks for missing or unparseable values.


242-246: LGTM!

The switch from hardcoded per_op_concurrency(...) to per_op_concurrency_from_policy(...) aligns these endpoints with the centralized inflight policy registry, enabling consistent operator-level tuning.

Also applies to: 296-300, 468-472, 525-529


769-798: LGTM!

The preflight helpers now accept threshold parameters resolved at the API boundary, correctly implementing the settings-driven approach requested in prior reviews. The inclusive <= comparison at line 827 properly implements the "warn at or below" semantic.

Also applies to: 804-837

tests/integration/api/test_per_op_rate_limit_concurrent.py (1)

310-314: LGTM!

Making _held_batch_size accept *_args, **_kwargs keeps the patch signature-tolerant as the production _recommend_batch_size helper evolves. The comment clearly explains the rationale.

tests/unit/api/controllers/test_memory_admin.py (2)

391-495: LGTM!

The TestResolveFineTuneThresholds tests comprehensively cover the threshold resolution logic: fallback to defaults when service is missing, settings overrides taking effect, and fallback on unparseable or missing values. Mock usage with spec=SettingsService is compliant.


497-575: LGTM!

The TestCheckDocumentsBoundaries tests lock in the boundary behavior for the preflight document count checks, including the count == min_required case (lines 553-575) that was requested in prior review.

tests/unit/api/controllers/test_providers.py (1)

20-31: LGTM!

The test updates correctly reflect the cursor-based pagination changes: empty response is now data: [] with pagination metadata, and the new tampered cursor test ensures proper validation with HTTP 400.

web/src/mocks/handlers/index.ts (1)

47-57: LGTM!

Adding paginatedEnvelopeFor to the exports aligns with the typed envelope helper pattern used in MSW handlers throughout the PR.

tests/integration/api/controllers/test_providers.py (1)

78-83: Pagination-aware provider assertions look correct.

The updated assertions correctly validate the paginated data shape and the overridden provider fields.

docs/reference/conventions.md (1)

54-58: Cursor pagination convention update is consistent.

This wording aligns with the new wire contract and removes offset/total assumptions.

web/src/api/endpoints/settings.ts (1)

51-58: listSinks() pagination migration is solid.

Using paginateAll with cursor forwarding and paginated unwrapping matches the backend contract.

web/src/api/endpoints/scaling.ts (1)

43-50: Strategies and signals pagination handling is correctly aligned.

Both list methods now consistently consume cursor-paginated envelopes and return fully aggregated arrays.

Also applies to: 71-78

web/src/mocks/handlers/settings.ts (1)

68-68: Sinks MSW handler now matches paginated wire shape.

Good alignment with the cursor-paginated API contract for this route.

web/src/__tests__/stores/sinks.test.ts (1)

9-17: Paginated test fixture updates are correct.

The shared paginatedSinks() helper keeps the success-path store tests consistent with the new API envelope.

Also applies to: 49-50, 67-70, 117-119

tests/unit/api/controllers/test_settings_sinks.py (1)

154-198: Great coverage addition for cursor behavior.

The round-trip and tampered-cursor tests meaningfully strengthen pagination contract validation.

web/src/mocks/handlers/scaling.ts (1)

18-19: Scaling list handlers now correctly mock paginated envelopes.

This keeps MSW behavior aligned with the updated cursor-based API responses.

Also applies to: 28-29

tests/unit/api/controllers/test_setup_personality.py (2)

42-48: LGTM!

The persistence verification correctly adapts to the paginated response shape by reading from body["data"] and using any(...) for order-independent matching. This aligns with the GET /setup/agents now returning PaginatedResponse[SetupAgentSummary].


112-131: LGTM!

The test correctly validates the new paginated envelope for personality presets: checking len(body["data"]) >= 1 and iterating directly over body["data"] items matches the updated wire shape.

web/src/api/endpoints/providers.ts (3)

46-70: LGTM!

The pagination implementation is solid:

  • Uses Object.create(null) to prevent prototype pollution via keyed result.
  • Filters out dangerous keys (__proto__, constructor, prototype).
  • Logs a warning for entries missing name (the wire contract guarantees one).

77-87: LGTM!

getProviderModels correctly implements cursor-based pagination with paginateAll and properly encodes the provider name in the URL.


225-242: LGTM!

Using fetchWithRetryAfter with { idempotent: true } for the pull-model POST is appropriate — the operation is semantically idempotent (re-pulling resolves the same model), and the flag enables safe retry on 429 before opening the SSE stream.

web/src/mocks/handlers/helpers.ts (1)

157-216: LGTM!

The paginatedEnvelopeFor helper is well-designed:

  • The PaginatedItem<R> type correctly extracts item types from both array returns (T[]) and keyed-map returns (Record<string, T>), with the string extends keyof R guard preventing misuse on shaped objects.
  • The envelope structure matches the cursor-pagination wire format (limit, next_cursor, has_more).
  • This complements paginatedFor by handling endpoints that flatten pagination internally via paginateAll.
tests/unit/api/controllers/test_setup_agent_ops.py (3)

236-239: LGTM!

The empty-agents test correctly validates the new paginated envelope shape: checking body["data"] == [] plus pagination metadata (has_more, next_cursor).


312-326: LGTM!

The persistence check now correctly anchors on the updated agent's name from the PUT response, preventing false positives from other agents that might share the same provider/model pair. This addresses the concern from the prior review.


384-387: LGTM!

Both name update and randomize-name tests correctly use any(...) over body["data"] for order-independent persistence verification.

Also applies to: 451-454

src/synthorg/api/controllers/setup.py (1)

414-455: LGTM!

The list_agents endpoint correctly implements cursor-based pagination:

  • Preserves storage order (as documented in the comment) to keep PUT/POST handlers that use positional agent_index in sync with the visible list.
  • Uses paginate_cursor with the HMAC secret for cursor signing.
  • Returns the properly typed PaginatedResponse[SetupAgentSummary].

This addresses the concern from the prior review about sorting causing index mismatches.

web/src/mocks/handlers/setup.ts (1)

116-118: LGTM!

Both handlers now use paginatedEnvelopeFor<typeof ...>() which ties the mock envelope to the endpoint's actual return type. If the endpoint contract changes, TypeScript will flag the handlers. This addresses the prior review suggestion.

Also applies to: 154-156

web/src/mocks/handlers/providers.ts (2)

152-155: LGTM!

Both provider list handlers now use paginatedEnvelopeFor<typeof ...>(), tying the mock envelope to the endpoint's return type. This keeps the mocks in lockstep with the client contract and addresses the prior review suggestion.

Also applies to: 194-196


346-383: LGTM!

The new fixture builders (buildProviderAuditEvent, buildRateLimitsConfig, buildPresetOverride) provide consistent default objects with optional Partial overrides, following the existing pattern established by buildProvider and buildCloudPreset.

web/src/api/endpoints/setup.ts (2)

40-49: LGTM!

getAgents() correctly implements cursor-based pagination with paginateAll, returning a readonly array to signal immutability. The implementation matches the established pattern in listProviders().


105-114: LGTM!

listPersonalityPresets() correctly follows the same pagination pattern as getAgents(), using paginateAll to aggregate all pages transparently.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/api/controllers/providers.py (1)

291-328: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Page before calling batch_get_capabilities().

list_models() still fetches capabilities for the entire model catalog and only paginates afterward, so GET /providers/{name}/models?limit=1 pays the provider/API cost for every model before slicing. That largely defeats the new pagination path on the heaviest part of this handler. Page the sorted provider.models first, then enrich only the returned page.

♻️ Minimal reshaping
-        caps_by_id: Mapping[str, ModelCapabilities | None] = {}
-        if driver is not None:
-            try:
-                caps_by_id = await driver.batch_get_capabilities(
-                    tuple(m.id for m in provider.models),
-                )
-            except* (RetryExhaustedError, RateLimitError) as exc_group:
+        ordered_models = tuple(sorted(provider.models, key=lambda m: m.id))
+        page_models, meta = paginate_cursor(
+            ordered_models,
+            limit=limit,
+            cursor=cursor,
+            secret=app_state.cursor_secret,
+        )
+
+        caps_by_id: Mapping[str, ModelCapabilities | None] = {}
+        if driver is not None and page_models:
+            try:
+                caps_by_id = await driver.batch_get_capabilities(
+                    tuple(model.id for model in page_models),
+                )
+            except* (RetryExhaustedError, RateLimitError) as exc_group:
                 exc = exc_group.exceptions[0]
                 logger.warning(
                     API_PROVIDER_USAGE_ENRICHMENT_FAILED,
                     provider=name,
                     error_type=type(exc).__name__,
                     error=safe_error_description(exc),
                 )
-        ordered = tuple(
+        page = tuple(
             to_provider_model_response(
                 model_config,
                 caps_by_id.get(model_config.id),
             )
-            for model_config in sorted(provider.models, key=lambda m: m.id)
+            for model_config in page_models
         )
-        page, meta = paginate_cursor(
-            ordered,
-            limit=limit,
-            cursor=cursor,
-            secret=app_state.cursor_secret,
-        )
         return PaginatedResponse[ProviderModelResponse](data=page, pagination=meta)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/providers.py` around lines 291 - 328, The
handler currently calls driver.batch_get_capabilities over the entire
provider.models list before pagination; change the flow to sort provider.models
(use sorted(provider.models, key=lambda m: m.id)), then call paginate_cursor on
that sorted list to get the page and meta, then call
driver.batch_get_capabilities only for the models on the returned page (pass
tuple(m.id for m in page_models)); finally map page_models with
to_provider_model_response using the caps_by_id for those ids and return
PaginatedResponse with data=page and pagination=meta. Ensure you keep the same
exception handling around driver.batch_get_capabilities and reference the
existing symbols driver.batch_get_capabilities, provider.models,
paginate_cursor, to_provider_model_response, and PaginatedResponse.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/controllers/providers.py`:
- Line 258: The non-paginated handlers are incorrectly populating
ProviderResponse.name and expanding the wire format; update each single-resource
or mutation response to keep the list-only `name` field out by calling
to_provider_response(..., name=None) instead of passing the local `name`
variable (e.g., change calls to to_provider_response(provider, name=name) used
with ApiResponse to to_provider_response(provider, name=None)); adjust the calls
at the indicated sites (the call within ApiResponse return and the other similar
calls referenced) so non-paginated responses do not set ProviderResponse.name.

In `@tests/unit/api/controllers/test_setup.py`:
- Around line 761-788: The test test_pagination_round_trip_with_limit_one
incorrectly uses full = test_client.get("/api/v1/setup/agents").json()["data"]
which is paginated and may only return the first page; instead build the
authoritative "full" result by walking the pagination cursor like you do for the
limited walk: call GET "/api/v1/setup/agents?limit=..." (or no limit if an
unpaginated endpoint exists) and loop following pagination["next_cursor"],
collecting each page's data into full_collected, then assert collected ==
full_collected; update the variable names (e.g., replace full with
full_collected) and reuse the cursor-following pattern used with
first/collected/cursor to ensure a true round-trip comparison.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e168a677-4178-45eb-b09b-8f382d99afb8

📥 Commits

Reviewing files that changed from the base of the PR and between acea685 and 595d376.

📒 Files selected for processing (11)
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/settings/definitions/api.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/providers/management/test_local_models.py
  • web/src/api/client.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Build Backend
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Dashboard Test
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (10)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit / integration / e2e / slow.
Mock-spec gate: every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt; regenerate via uv run python scripts/check_mock_spec.py --update. Without spec= mocks silently absorb every attribute access.
Shared mocks: use mock_dispatcher from tests/conftest.py (AsyncMock(spec=NotificationDispatcher)).
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Logger spying antipattern: never monkeypatch.setattr(module.logger, "info", spy); the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead.
Parametrize: prefer @pytest.mark.parametrize for similar cases.
Property-based: Hypothesis (Python), fast-check (React), testing.F (Go). CI runs 10 deterministic examples (derandomize=True). Hypothesis failures are real bugs: fix the bug and add an @example(...) decorator.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/meta/telemetry/test_emitter.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/meta/telemetry/test_emitter.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every cost-bearing Pydantic model carries currency: CurrencyCode; mixing raises MixedCurrencyAggregationError (HTTP 409). Aggregations over cost-bearing fields call assert_currencies_match before reducing.
Direct os.environ.get(...) outside startup is forbidden. Ghost-wired settings (consuming service never instantiated at boot) are flagged by scripts/check_setting_to_startup_trace.py; per-setting opt-out via # lint-allow: bootstrap-wiring -- <reason>.
Every numeric threshold / weight / limit / timeout / scoring policy in business logic lives in src/synthorg/settings/definitions/<namespace>.py, not as a bare numeric literal. Bare module-level _FOO = 1024 constants and bare numeric defaults (def f(timeout=30)) are forbidden.
Allowlisted numeric literals: 0, 1, -1 (sentinel/off-by-one), HTTP status codes 100-599 in status_code= defaults, hex bit-masks (0xff, 0x80), powers-of-2 in buffering= / chunk_size= / buffer_size= defaults, anything inside settings/definitions/, persistence/migrations/, observability/events/. Per-line opt-out: # lint-allow: magic-numbers -- <reason> (mandatory non-empty justification). Enforced by scripts/check_no_magic_numbers.py.
Comments explain WHY only, never origin / review / issue context. Forbidden: reviewer citations (`pre-PR review #N`), in-code issue back-refs (`(#1682)`), naked `SEC-1` taxonomy in `src/`, migration framing (`ported from`), round narrative, self-evident restatements.
Keep in comments: hidden constraints, subtle invariants, upstream-bug workarounds (with stable bug-tracker URL), why a non-obvious choice was made. Enforced by `scripts/check_no_review_origin_in_code.py` and `scripts/check_no_migration_framing.py` (pre-push); per-line opt-outs `# lint-allow: review-origin -- ` and `# lint-allow: migration-framing -- `.
No `from __future__ import annotations`: Python 3.14 has PEP 649.
PEP 758 except: `except A, B:` (no parens) when not bin...

Files:

  • src/synthorg/settings/definitions/api.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/providers.py
src/synthorg/settings/definitions/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

For every mutable setting: DB > env (SYNTHORG_<NS>_<KEY>) > YAML > code default, resolved through SettingsService / ConfigResolver. Register new settings in src/synthorg/settings/definitions/<namespace>.py.

Files:

  • src/synthorg/settings/definitions/api.py
src/synthorg/settings/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

First cold read of a setting emits one INFO settings.value.resolved; subsequent reads stay DEBUG. Sanctioned exceptions: init-time only (env-only, no registry entry) and read-only post-init (read_only_post_init=True; set() raises SettingReadOnlyError).

Files:

  • src/synthorg/settings/definitions/api.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/settings/definitions/api.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/providers.py
src/synthorg/providers/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/providers/**/*.py: All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry in driver subclasses or calling code.
RetryConfig / RateLimiterConfig set per-provider in ProviderConfig. Retryable: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable raise immediately.

Files:

  • src/synthorg/providers/management/dtos.py
src/synthorg/**/{api,services,repositories}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Controllers and API endpoints access persistence through domain-scoped service layers (e.g. ArtifactService, WorkflowService, MemoryService); services centralize audit logging; repositories must not log mutations themselves.

Files:

  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/providers.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes the socket with code 4011.

Files:

  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/providers.py
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/api/client.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Reuse components from web/src/components/ui/. Never hardcode hex colors, font-family, pixel spacing, Motion transitions, or BCP 47 locale strings; use design tokens, @/lib/motion presets, helpers in @/utils/format. Enforced by scripts/check_web_design_system.py.

Files:

  • web/src/api/client.ts
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • src/synthorg/settings/definitions/api.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/controllers/settings.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/meta/telemetry/test_emitter.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/providers.py
🔇 Additional comments (10)
tests/unit/providers/management/test_local_models.py (3)

303-345: Strong adversarial sanitizer coverage for path redaction

These cases are well targeted (POSIX/Windows + space-containing paths) and protect against the previously leaky regex behavior.


346-372: Good parametrized host:port regression coverage

Great use of @pytest.mark.parametrize to lock down localhost, IPv4, IPv6, and DNS-hostname redaction paths in one test.


373-384: Fallback + truncation assertions look correct

The oversized-input cap and empty/non-string fallback checks clearly validate the sanitizer’s safety envelope.
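
For readers unfamiliar with this kind of sanitizer, the following is a minimal sketch of the host:port redaction, fallback, and truncation behaviours these tests exercise. It is not the project's `_sanitize_ollama_error`: the regex, the 200-character cap, and the placeholder text are assumptions, and path redaction is left out entirely. Only the generic "Pull failed" fallback wording comes from the commit notes.

```python
import re

_GENERIC_MESSAGE = "Pull failed"  # fallback wording taken from the commit notes
_MAX_LEN = 200  # hypothetical cap; the real limit is not stated in this thread

# Hypothetical pattern covering the four host forms the tests mention.
_HOST_PORT = re.compile(
    r"""
    (?:
        \[[0-9A-Fa-f:]+\]                            # bracketed IPv6 such as [::1]
      | (?:\d{1,3}\.){3}\d{1,3}                      # dotted IPv4
      | [A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?    # localhost or a DNS name
    )
    :\d{1,5}\b                                       # the port
    """,
    re.VERBOSE,
)


def sanitize_error(raw: object) -> str:
    # Empty or non-string payloads fall back to the generic message.
    if not isinstance(raw, str) or not raw.strip():
        return _GENERIC_MESSAGE
    # Redact before truncating so a host cannot survive as a cut-off fragment.
    redacted = _HOST_PORT.sub("<redacted-host>", raw)
    return redacted[:_MAX_LEN]
```

A pattern this loose will also redact harmless `word:digits` tokens such as `12:30`; for an error message shown to end users that trade-off is usually acceptable, but it is a design choice rather than a requirement.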

web/src/api/client.ts (2)

14-18: Good consolidation of 429 retry policy helpers.

Moving these into @/utils/retry-after avoids policy drift between axios and fetch retry paths.


157-162: Retry-After: 0 now follows the intended retry contract.

Treating waitMs === 0 as “retry now” (without delay) fixes the previous short-circuit while preserving bounded retry behavior.

Also applies to: 170-172

tests/unit/meta/telemetry/test_emitter.py (2)

42-47: Good mock-spec hardening across async patch sites.

These updates make the async patch points stricter and safer by preventing permissive mock behavior in test flows.
As per coding guidelines, “every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass.”

Also applies to: 60-65, 90-95, 111-116, 130-135, 154-159, 172-177, 187-192, 213-218, 472-477


429-487: Clock-seam coverage is clean and deterministic.

The new FakeClock tests validate _last_flush_at behavior through the injected seam and include explicit async cleanup, which matches the intended time-control strategy.
As per coding guidelines, “Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter.”

tests/unit/api/controllers/test_settings_sinks.py (3)

9-9: Good alignment with endpoint identifier semantics.

Using _sink_identifier here keeps the default-sink expectation in sync with the controller output format.


67-70: Canonical default-id derivation looks solid.

This improves test robustness against identifier-format drift by removing ad-hoc expectation logic.


151-195: Strong pagination and cursor-security coverage.

The round-trip cursor walk plus explicit tampered-cursor rejection materially improves confidence in the new paginated contract.

Aureliolo added 14 commits May 9, 2026 01:47
Multiplying the deterministic exponential delay by a uniform 0.8..1.2
multiplier de-correlates reconnect attempts across clients so a
gateway restart no longer triggers a synchronised reconnect storm.

Constants WS_RECONNECT_JITTER_MIN/MAX live in web/src/utils/constants
alongside the existing reconnect base/max-delay knobs so the bounds
are greppable and testable from one place.

Test stubs Math.random to confirm the scheduled setTimeout falls in
[base*0.8, base*1.2]; existing reconnect-firing test was updated to
advance past the new jitter ceiling.
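
The jitter calculation described in this commit message can be sketched as follows. This is Python for illustration only (the actual change is in the TypeScript web client), and the base delay, maximum delay, and the choice to apply jitter after the exponential cap are assumptions; only the 0.8..1.2 multiplier range and the 1 ms floor mentioned in the later triage notes come from the thread.

```python
import random
from collections.abc import Callable

JITTER_MIN = 0.8  # lower multiplier bound, from the commit message
JITTER_MAX = 1.2  # upper multiplier bound, from the commit message
MIN_DELAY_MS = 1.0  # floor mentioned in the follow-up triage notes


def reconnect_delay_ms(
    attempt: int,
    *,
    base_ms: float = 1000.0,  # hypothetical base delay
    max_ms: float = 30000.0,  # hypothetical cap
    rand: Callable[[], float] = random.random,
) -> float:
    # Deterministic exponential backoff, capped before jitter is applied.
    deterministic = min(base_ms * 2 ** attempt, max_ms)
    # Map a uniform value in [0, 1) onto the multiplier range.
    multiplier = JITTER_MIN + rand() * (JITTER_MAX - JITTER_MIN)
    return max(MIN_DELAY_MS, deterministic * multiplier)
```

Injecting `rand` plays the same role as stubbing `Math.random` in the test described above: fixed inputs make the lower bound, midpoint, and upper bound directly checkable.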
Sibling file to test_setup.py with parallel coverage of the agent
update endpoints. The paginated response shape replaces 'data.agents'
with 'data' (top-level array) and drops 'agent_count'. Searches use
'any(...)' rather than indexing because pagination orders agents by
name rather than insertion position.
Triage of 18 review-agents surfaced 20 valid items (1 CRITICAL, 9 MAJOR,
8 MEDIUM, 2 MINOR). Implements all of them.

CRITICAL: integration test for /providers asserted the legacy
dict-keyed body shape; walks the paginated data array now.

MAJOR fixes:
- docs/reference/conventions.md: corrected PaginationMeta envelope
  (drops offset / total fields)
- removed dead SetupAgentsListResponse / PersonalityPresetsListResponse
  wrappers (no callers after pagination)
- typed SinkInfoResponse / SinkRotationResponse replace
  PaginatedResponse[dict[str, Any]] for /settings/observability/sinks
- _sanitize_ollama_error logs original error type at WARN before
  falling back to the generic Pull failed message
- parseRetryAfterMs emits a structured warn when Retry-After header
  is unparseable (visibility on server misconfig)
- listProviders logs a warn when a paginated entry is missing its
  name field (regression catch)
- emptyPaginatedEnvelope extracted to web/src/mocks/handlers/helpers.ts
- WS reconnect jitter delay clamped to a 1ms floor

MEDIUM:
- jitter test parametrised across lower / midpoint / upper-bound
  Math.random values
- fetchWithRetryAfter honours AbortSignal during retry sleep; tests
  cover pre-sleep + during-sleep abort plus DELETE / PUT / PATCH
  idempotency cases
- documented the deliberate ORDER BY simplification in the EXPLAIN
  index conformance tests
- pagination round-trip asserts no-duplicate / no-gap invariants
- pinned events.stream policy at (60, 60)
- documented ProviderResponse.name invariant in DTO + helper
- added exact-T=TTL boundary case for credential cache

MINOR:
- documented _make_progress_cb closure thread-safety contract

Plus simplifier polish (flattened isRetriable, parametrised method
matrix, dropped redundant Object.create cast).
…fixes after dedup)

- local_models: harden _OLLAMA_HOST_PORT regex to redact localhost,
  IPv4, and bracketed IPv6 host:port forms (gemini + coderabbit).
- local_models: change 'if error:' to 'if error in data' presence
  check so falsey error payloads still flow through sanitization
  and the terminal-event path.
- scaling: add explicit type parameters on PaginatedResponse returns
  in list_strategies / list_signals empty fallbacks and on every
  list_decisions return (gemini).
- setup: stop reordering /setup/agents by name; PUT/POST handlers
  resolve agent_index against the persisted array, so storage order
  must equal listing order (coderabbit).
- rate_limits: add INFLIGHT_POLICIES registry and
  per_op_concurrency_from_policy factory; migrate every site
  (events.stream, memory.fine_tune {start,resume},
  memory.checkpoint_{deploy,rollback}, providers.{discover_models,pull_model})
  off bare max_inflight literals (coderabbit).
- tests: stronger seed for decision_records perf-index tests so
  ORDER BY (recorded_at, id) is genuinely discriminative on
  Postgres + SQLite (coderabbit).
- tests: replace pytest.skip with assert in settings sinks
  round-trip; remove redundant duplicate and gap assertions in
  setup presets round-trip (coderabbit).

Skipped 2 findings as factually wrong:
- web setup.ts mocks: paginatedFor parameterised on the endpoint
  type does not type-check because getAgents returns readonly T
  array (post paginateAll), not PaginatedResult; emptyPaginatedEnvelope
  is the correct helper here per its docstring.
- OpenGrep f-string SQL warning at test_perf_indices_sqlite.py 77-78:
  values are class-level test constants (analyze='cost_records',
  etc.), no user-controlled vector.
…valid + 3 skipped)

- memory: migrate _DEFAULT_BATCH_SIZE / _MIN_DOCS_REQUIRED /
  _MIN_DOCS_RECOMMENDED out of the controller into
  settings/definitions/memory.py as registered settings, replacing
  round 1's lint-allow opt-out per the no-magic-numbers convention.
- settings: list_all_settings now defaults to DEFAULT_LIMIT instead
  of a bare 50 literal (matches the rest of the controller).
- settings: _sink_to_response falls back to 'unnamed-<sink_type>'
  for non-CONSOLE non-FILE sinks so NotBlankStr does not raise on
  custom SYSLOG / HTTP shipping sinks with file_path=None.
- local_models: widen _OLLAMA_PATH_POSIX and _OLLAMA_PATH_WIN to
  consume path segments containing spaces; paths like
  '/var/lib/ollama/model cache/...' or 'C:\Program Files\...' now
  redact in full instead of leaking the trailing tail. Adds
  regression tests for both forms.
- local_models: _sanitize_ollama_error is now side-effect free; the
  PROVIDER_MODEL_PULL_FAILED log now lives only in _parse_pull_line
  with error_type=type(error).__name__ for context, eliminating the
  duplicate log on non-string upstream errors.
- test_emitter: replace _client.aclose() with em.aclose() in the
  flush-clock test cleanup so the periodic _flush_task is cancelled
  via the kill switch instead of leaking across tests.
- web fetch-with-retry: methodOf and hasIdempotencyKey now consider
  the Request input alongside init, so a Request POST without
  idempotent opt-in correctly skips retry; defaultSleep accepts an
  optional AbortSignal and resolves immediately when aborted, so
  cancellation latency tracks the abort instead of the Retry-After
  budget. Three new regression tests cover Request-input idempotency
  detection and AbortSignal cancellation through the default sleep.
- web retry-after: log.warn on a malformed Retry-After header now
  wraps the attacker-controlled trimmed value with sanitizeForLog
  before structured logging.

Skipped 3 findings:
- coderabbit re-suggestion to migrate the events.stream rate /
  inflight registry literals into settings/definitions: the
  no-magic-numbers gate does not flag dict-literal values (verified
  empirically on round 1 push) and the registry is the established
  operator-tuning surface; one-entry migration would create
  asymmetry, full migration is a fundamental redesign of ~80
  registry rows out of scope for this round.
- coderabbit re-suggestion of the SQL-allowlist hardening on the
  sqlite perf-index test helper: values come from class-level test
  constants (analyze='cost_records', etc.) with no user-controlled
  input vector; the allowlist would only add maintenance overhead
  per query shape without changing the security posture.
- coderabbit re-suggestion to wire scaling / settings / setup MSW
  handlers through paginatedFor parameterised on the endpoint type:
  paginatedFor's constraint requires Promise of PaginatedResult,
  but every affected endpoint (getAgents, listPersonalityPresets,
  getScalingStrategies, getScalingSignals, listSinks) returns the
  flattened readonly T[] post paginateAll. The suggested form does
  not type-check; emptyPaginatedEnvelope[T]() is the helper that
  exists for exactly this case (see its docstring).
- memory: thread fine-tune preflight thresholds through
  SettingsService at request time. _resolve_fine_tune_thresholds()
  reads memory.fine_tune_default_batch_size /
  memory.fine_tune_min_docs_required /
  memory.fine_tune_min_docs_recommended and falls back to the
  imported FINE_TUNE_* constants when the registry / parser is
  unavailable. _check_documents, _run_preflight_checks, and
  _recommend_batch_size accept the resolved thresholds as keyword
  args (constants stay as the offline / unit-test fallback). Adds
  4 regression tests covering the override + fallback paths.
- memory: tighten the recommended-docs preflight boundary from
  count < FINE_TUNE_MIN_DOCS_RECOMMENDED to count <=. The setting
  description says "warn band for corpora at or below this size";
  the < check let a corpus at exactly the threshold pass instead
  of warning. Adds boundary tests at the warn / pass / fail
  thresholds.
- settings: extract _sink_identifier(sink) helper used by both
  _sink_to_response and _append_disabled_defaults. Make
  _append_disabled_defaults return a new list instead of mutating
  the caller-owned list, and route the active-set membership test
  through the same helper so a future non-FILE default cannot
  drift between sites.
- ci: integration test test_inflight_cap_fires_5002 patches
  _recommend_batch_size with a sync helper. Round 3 added a
  default_batch_size kwarg to the production helper, so the patch
  must accept and ignore kwargs to keep the holder request from
  raising inside its asyncio.TaskGroup (the failure surfaced as a
  500 in the integration suite). _held_batch_size now takes
  star-args / kwargs so the patch stays signature-tolerant across
  controller changes.
- settings: _sink_identifier now derives a stable per-sink
  discriminator for non-FILE custom shipping sinks (SYSLOG
  host:port, HTTP url, OTLP endpoint) so two HTTP / SYSLOG / OTLP
  sinks of the same type don't collide on a single 'unnamed-TYPE'
  key. The 'unnamed-TYPE' fallback only fires defensively against a
  future sink type that hasn't been wired here yet; well-formed
  config is blocked from reaching it by
  SinkConfig._validate_sink_type_fields.
- tests: add count == FINE_TUNE_MIN_DOCS_REQUIRED boundary case to
  test_memory_admin so the strict less-than on the hard floor stays
  pinned (count exactly at the required threshold must warn, not
  fail).

Skipped 1 finding:
- coderabbit outside-diff suggesting _BATCH_SIZE_BY_VRAM_GB be
  relocated to settings/definitions/memory.py: relocating shifts
  test_memory_admin.py import line numbers, which forces drift in
  the mock-spec baseline; the baseline cannot be regenerated per
  the project's never-update-baseline rule. CodeRabbit themselves
  rated this finding as Trivial / Low value, noting 'since these
  are hardware-tier boundaries (unlikely to need runtime tuning)
  and stored in an immutable Final tuple, this is a lower-priority
  refactor than the already-addressed FINE_TUNE_ constants'.
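
The document-count boundaries discussed in these commit notes (strict less-than on the hard floor, at-or-below on the recommended threshold) can be summarised in a small sketch. `Verdict` and `check_document_count` are hypothetical names, and the threshold values used in any call are illustrative, not the project's settings.

```python
from enum import Enum


class Verdict(Enum):
    FAIL = "fail"
    WARN = "warn"
    PASS = "pass"


def check_document_count(
    count: int, *, min_required: int, min_recommended: int
) -> Verdict:
    # Hard floor is strict: a corpus exactly at the floor is not a failure.
    if count < min_required:
        return Verdict.FAIL
    # Warn band is inclusive: "at or below" the recommended size warns.
    if count <= min_recommended:
        return Verdict.WARN
    return Verdict.PASS
```

For example, with `min_required=10` and `min_recommended=50`, counts of 9, 10, 50, and 51 map to fail, warn, warn, and pass respectively.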
tests/integration/persistence/test_perf_indices_sqlite.py: replace f-string SQL composition in _explain_plan with hardcoded ANALYZE/EXPLAIN allowlists keyed on the test inputs (aiosqlite has no psycopg-equivalent safe-composition primitive).

tests/unit/api/controllers/test_setup.py + test_setup_agent_ops.py: anchor three persistence asserts on the updated agent name from the PUT response so the any(...) predicate cannot pass via a different agent that already carries the target provider/model or preset.

tests/unit/api/controllers/test_setup.py: tighten pagination round-trip to strict equality vs the unpaginated GET so duplicates, gaps, or reordering across pages fail loudly.

web/src/mocks/handlers/helpers.ts + 4 handler files: add paginatedEnvelopeFor<Fn> helper that ties the mock envelope item type to the endpoint function flattened return type (handles arrays and Record string-keyed maps via PaginatedItem<R>); migrate all 7 emptyPaginatedEnvelope<T>() callsites in providers/setup/scaling/settings handlers; remove emptyPaginatedEnvelope.
Two property tests in test_anonymizer.py declare a local @settings(suppress_health_check=[HealthCheck.function_scoped_fixture]) decorator. Hypothesis treats the per-test list as a replacement for the profile-level list, not an extension, so the ci profile's suppression of HealthCheck.differing_executors is dropped on these two tests.

Under the isolation gate's pytest-repeat --count 2 replay, the second iteration runs the test method through a fresh fixture executor; Hypothesis emits HealthCheck.differing_executors and fails [2-2]. Re-list differing_executors alongside function_scoped_fixture so the ci profile's stance is preserved at the decorator.

Verified: pytest tests/unit/meta/telemetry/test_anonymizer.py -n 8 --count 2 -- 52 passed.
The isolation gate's _classify_isolation_outcome only recognised the canonical xdist worker-crash signature: 'worker gwN crashed while running ...'. When workers terminate between tests rather than during one (the Python 3.14 + Windows ProactorEventLoop IOCP teardown race), xdist instead prints '[gwN] node down: Not properly terminated' and the scheduler subsequently raises INTERNALERROR (KeyError on the dead WorkerController). With neither signature parsable, the classifier fell through to fail-closed and blocked the push on documented native-level flakiness.

Add _NODE_DOWN_RE + _XDIST_INTERNAL_ERROR_RE detectors and a dedicated branch in the classifier that treats node-down + scheduler INTERNALERROR + non-zero returncode as crash advisory (the existing decision-tree intent for native crashes scattered across unrelated tests). Worker ids substitute for test ids on the advisory banner since the node-down signature lacks per-test attribution.

Three new unit tests cover the new branch, the boundary case where node-down without INTERNALERROR still fails closed, and the priority case where a real FAILED line outranks node-down noise.
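
The branch described above might look roughly like the following. The detector names come from the commit message, but the regex patterns, the classify function, and its string return values are assumptions made for illustration, not the gate's real implementation.

```python
import re

# Assumed patterns, modelled on the xdist output quoted in the text.
_NODE_DOWN_RE = re.compile(r"\[(gw\d+)\] node down: Not properly terminated")
_XDIST_INTERNAL_ERROR_RE = re.compile(r"^INTERNALERROR>", re.MULTILINE)
_FAILED_RE = re.compile(r"^FAILED ", re.MULTILINE)


def classify(output, returncode):
    """Return 'pass', 'fail', or 'crash-advisory:<worker ids>'."""
    if returncode == 0:
        return "pass"
    # A real test failure always outranks node-down noise.
    if _FAILED_RE.search(output):
        return "fail"
    workers = sorted(set(_NODE_DOWN_RE.findall(output)))
    # Node-down alone is not enough: require the scheduler INTERNALERROR too.
    if workers and _XDIST_INTERNAL_ERROR_RE.search(output):
        return "crash-advisory:" + ",".join(workers)
    # Anything unrecognised stays fail-closed.
    return "fail"
```

The three cases map onto the three unit tests mentioned above: the advisory branch, node-down without INTERNALERROR, and a FAILED line taking priority.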
…cates)

web/src/api/client.ts: the axios 429 retry condition skipped the retry when parseRetryAfterMs returned 0 (RFC 9110 'retry immediately'). Now only DO_NOT_RETRY bypasses the retry, and sleeping is gated on waitMs > 0 to match the fetch-with-retry contract.
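
The distinction between "wait zero milliseconds" and "do not retry" can be sketched in Python as follows. This is an illustration of the contract only: the function names, the None sentinel standing in for DO_NOT_RETRY, the 60-second cap, and the decision to ignore the HTTP-date form are all assumptions, not the web client's code.

```python
from typing import Optional, Tuple

DO_NOT_RETRY = None  # hypothetical sentinel: distinct from a 0 ms wait


def parse_retry_after_ms(value: Optional[str], cap_ms: int = 60_000) -> Optional[int]:
    """Parse the delay-seconds form of Retry-After into milliseconds."""
    if value is None:
        return DO_NOT_RETRY
    text = value.strip()
    # The HTTP-date form is deliberately not handled in this sketch.
    if not (text.isascii() and text.isdigit()):
        return DO_NOT_RETRY
    wait_ms = int(text) * 1000
    return wait_ms if wait_ms <= cap_ms else DO_NOT_RETRY


def plan_retry(value: Optional[str]) -> Tuple[bool, int]:
    """Return (should_retry, wait_ms); a 0 ms wait still retries."""
    wait_ms = parse_retry_after_ms(value)
    if wait_ms is DO_NOT_RETRY:
        return (False, 0)
    # The caller sleeps only when wait_ms > 0, then retries either way.
    return (True, wait_ms)
```

Testing the parsed value for truthiness instead of comparing against the sentinel is the bug class the fix addresses, since 0 is falsy.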

src/synthorg/api/controllers/settings.py: SETTINGS_OBSERVABILITY_VALIDATION_FAILED log no longer emits raw sink_overrides / custom_sinks JSON blobs (operator-supplied, may contain credentials/URLs); replaced with presence/length metadata. _sink_identifier now returns a type-prefixed SHA-256 fingerprint (file:<hex16> / syslog:<hex16> / http:<hex16> / otlp:<hex16>) instead of the raw destination string so the public envelope can no longer leak embedded auth tokens, and identical destinations still collapse to the same identifier for cursor pagination.
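
A minimal sketch of a type-prefixed fingerprint of this kind, assuming the destination is hashed as UTF-8 and truncated to 16 hex characters (inferred from the <hex16> notation). The function name sink_identifier and its two-argument signature are hypothetical.

```python
import hashlib

# Assumed value, inferred from the "<hex16>" notation in the text.
SINK_IDENTIFIER_FINGERPRINT_LENGTH = 16


def sink_identifier(sink_type: str, destination: str) -> str:
    """Return '<type>:<truncated sha256>' without exposing the destination."""
    digest = hashlib.sha256(destination.encode("utf-8")).hexdigest()
    return f"{sink_type.lower()}:{digest[:SINK_IDENTIFIER_FINGERPRINT_LENGTH]}"
```

Because the hash is deterministic, two sinks with the same destination collapse to one identifier, while a URL with an embedded token never appears in the output.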

src/synthorg/providers/management/dtos.py: to_provider_response() name parameter is now required (no default) so future list endpoints cannot silently ship name=None entries; updated all 7 single-provider callsites in providers.py to pass name= explicitly (path-keyed endpoints pass name=name, create endpoints pass name=data.name) and 3 unit-test sites to pass name=None.

src/synthorg/api/rate_limits/policies.py + settings/definitions/api.py: hardcoded events.stream rate-limit + the entire _INFLIGHT_POLICIES map moved to Final[int] constants in settings/definitions/api.py (gate-allowlisted directory) and imported by the policy registry. The dict shape stays import-time-evaluable for decorator binding; operator runtime tuning continues through per_op_rate_limit_overrides / per_op_concurrency_overrides.

tests/unit/api/controllers/test_setup.py: pagination round-trip for personality-presets now walks pagination at limit=200 instead of trusting the default-page response as the full dataset.

tests/unit/meta/telemetry/test_emitter.py: all 10 patch.object(_send_batch, new_callable=AsyncMock) sites now also pass spec=HttpAnalyticsEmitter._send_batch so the mock matches the real async signature and survives the mock-spec gate.

tests/unit/providers/management/test_local_models.py: parametrized test_redacts_host_port over localhost / IPv4 / IPv6 / DNS-hostname forms; locks the regex regression for every supported shape.
The new _SINK_FINGERPRINT_LEN module-level constant introduced in the previous round-6 commit landed at src/synthorg/api/controllers/settings.py, where the magic-numbers gate is active. Move it to src/synthorg/settings/definitions/api.py as SINK_IDENTIFIER_FINGERPRINT_LENGTH (the gate-allowlisted directory and the canonical home for numeric tuning knobs).

The pre-push gate caught this on the first push attempt; relocating instead of bypassing keeps the convention enforced.
src/synthorg/api/controllers/providers.py: revert the round-6 over-broadening of ProviderResponse.name to non-paginated handlers. The intended wire-format scope is list-only (so the dict-by-name reconstruction on the frontend can attribute each item to its key); single-resource and mutation responses identify the provider via the URL path and must not also carry name in the body. The 6 single-resource handlers (read, create, create-from-preset, update, add-model, rotate-credentials) now pass name=None; the paginated list handler keeps name=name.

tests/unit/api/controllers/test_setup.py: walk pagination at limit=200 to build the round-trip oracle for the agents endpoint, mirroring the personality-presets test fixed in round 6. The previous single-GET version compared two truncated views once the agent count exceeded DEFAULT_LIMIT, which silently passed even on duplicates / gaps in the cursor walk.
src/synthorg/api/controllers/providers.py: list_models now paginates the model list FIRST, then runs batch_get_capabilities only against the page slice. The previous flow probed capabilities for every model the provider declared before slicing, which defeated the cursor-pagination perf goal -- a small-page client still paid the full upstream cost on every request. Sort + paginate over the typed ModelConfig sequence and build ProviderModelResponse per-page; retry-exhaustion / rate-limit handling is preserved verbatim and now also gates on page_models being non-empty.
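
The ordering change can be illustrated with a small sketch. It assumes a plain integer offset in place of the HMAC-signed cursor and a caller-supplied probe callable in place of batch_get_capabilities; list_models_page and its parameters are hypothetical names.

```python
def list_models_page(models, probe, offset=0, limit=2):
    """Sort, slice one page, then probe capabilities for that page only."""
    ordered = sorted(models)
    page = ordered[offset:offset + limit]
    # The expensive upstream call sees only the page slice, and is skipped
    # entirely when the page is empty.
    capabilities = probe(page) if page else {}
    end = offset + limit
    next_offset = end if end < len(ordered) else None
    items = [(name, capabilities.get(name)) for name in page]
    return items, next_offset
```

With five declared models and limit=2, each request probes at most two models instead of all five.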

tests/unit/api/controllers/test_setup.py: assert first['pagination']['next_cursor'] is not None on the agents round-trip after the first GET. solo_founder happens to seed two agents today so the cursor IS crossed, but nothing pinned the cross-page invariant; a future template shrink to one agent would leave the cursor-walk loop body unexercised and turn the test into a vacuous pass.
@Aureliolo Aureliolo force-pushed the perf/performance-data-integrity branch from 523071e to 8af2981 Compare May 8, 2026 23:52
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 8, 2026 23:53 — with GitHub Actions Inactive
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/synthorg/api/controllers/providers.py (1)

1429-1451: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the documented default page size.

limit now defaults to DEFAULT_LIMIT, but the docstring still says default 50. That will leak the wrong contract into generated API docs and client expectations.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/providers.py` around lines 1429 - 1451, Update
the function docstring to reflect the actual default page size by replacing the
hard-coded "default 50" with a reference to DEFAULT_LIMIT (or its numeric value)
for the `limit` parameter description; locate the docstring that documents
`limit` in the provider audit listing function (the one returning
PaginatedResponse[ProviderAuditEvent]) and change "Page size (default 50, max
``MAX_LIMIT``)" to something like "Page size (default DEFAULT_LIMIT, max
``MAX_LIMIT``)" or the literal value of DEFAULT_LIMIT so generated docs match
the code.
src/synthorg/api/controllers/scaling.py (1)

214-214: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider using DEFAULT_LIMIT for consistency.

list_decisions uses a hardcoded limit: CursorLimit = 50 while the new list_strategies and list_signals use DEFAULT_LIMIT. For consistency across the controller, consider using DEFAULT_LIMIT here as well.

♻️ Suggested change
     @get("/decisions", guards=[require_read_access])
     async def list_decisions(
         self,
         state: State,
         cursor: CursorParam = None,
-        limit: CursorLimit = 50,
+        limit: CursorLimit = DEFAULT_LIMIT,
     ) -> PaginatedResponse[ScalingDecisionResponse]:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/scaling.py` at line 214, The list_decisions
function currently sets a hardcoded default limit (limit: CursorLimit = 50);
change it to use the shared DEFAULT_LIMIT constant for consistency with
list_strategies and list_signals by replacing the default value for the limit
parameter in list_decisions with DEFAULT_LIMIT (ensure DEFAULT_LIMIT is imported
or in scope where list_decisions is defined).
♻️ Duplicate comments (1)
tests/unit/api/controllers/test_setup_personality.py (1)

42-48: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Anchor the persistence check to the updated agent.

This any(...) can still pass if another agent already has "visionary_leader". Capture a stable field from the PUT response and match on that as well when scanning GET /api/v1/setup/agents.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/api/controllers/test_setup_personality.py` around lines 42 - 48,
The persistence check is too broad because any agent might already have
"visionary_leader"; capture a stable identifier from the PUT response (e.g.,
put_resp.json()["data"]["id"] or put_resp.json()["data"]["agent_name"]) and then
assert that the GET /api/v1/setup/agents response contains an agent matching
both that identifier and the expected "personality_preset". Locate the PUT
response variable (put_resp) and the GET variables (get_resp / agents) and
change the assertion to filter agents by the returned id/name and verify that
the matching agent's "personality_preset" == "visionary_leader".
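
An anchored predicate of the kind requested could be sketched like this. The response field names ("name", "personality_preset") and the helper preset_persisted are assumptions for illustration; the real payload shape may differ.

```python
def preset_persisted(agents, agent_name, preset):
    """True only if exactly one agent has this name and carries the preset."""
    matches = [a for a in agents if a.get("name") == agent_name]
    return len(matches) == 1 and matches[0].get("personality_preset") == preset
```

Unlike a bare any(...) over presets, this cannot pass because some other agent already carries "visionary_leader".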
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/controllers/memory.py`:
- Around line 190-200: The loop resolving fine-tune thresholds must treat
non-positive integers as invalid and use the provided fallback: inside the for
key, fallback in fallbacks.items() loop (and when calling
settings_service.get("memory", key)), parse entry.value to int, and if parsing
fails or the parsed value is <= 0, set resolved[key] = fallback (also handle
SettingNotFoundError, ValueError, TypeError); ensure the resolved values are all
positive before constructing _FineTuneThresholds(default_batch_size=...,
min_docs_required=..., min_docs_recommended=...) so invalid overrides like "0"
or "-1" fall back to the defaults.

In `@src/synthorg/api/controllers/providers.py`:
- Around line 221-232: The current code builds a ProviderResponse for every
configured provider (via to_provider_response) before calling paginate_cursor,
causing O(n) DTO construction; instead, first collect and sort the provider
names from app_state.config_resolver.get_provider_configs(), call
paginate_cursor on that list of names (using the same limit/cursor/secret), then
only map to_provider_response over the returned page of names to construct
ProviderResponse objects and return
PaginatedResponse[ProviderResponse](data=page_dtos, pagination=meta); update the
variables around providers, ordered, page and meta and keep use of
paginate_cursor, to_provider_response, ProviderResponse and PaginatedResponse
consistent.

In
`@src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql`:
- Around line 1-6: The migration creates regular indexes on cost_records and
decision_records; a plain CREATE INDEX takes a SHARE lock that blocks writes
(but not reads) on the table for the duration of the build. Change the three
CREATE INDEX statements for idx_cost_records_agent_timestamp,
idx_cost_records_task_timestamp and idx_dr_task_recorded_id to use CREATE INDEX
CONCURRENTLY when your deployment has concurrent writes (update the migration in
20260508204145_perf_indices_cost_decision.sql), but first verify
Atlas/non-transactional migration constraints: if Atlas wraps migrations in a
transaction (preventing CONCURRENTLY), mark this migration as non-transactional
or run it manually/outside the transaction wrapper so CONCURRENTLY can be used
safely; ensure you test on a staging instance with writes enabled before
applying to production.

In `@src/synthorg/providers/management/local_models.py`:
- Around line 41-43: The regex _OLLAMA_PATH_POSIX currently only matches
multi-segment absolute POSIX paths, leaving single-segment absolute paths like
/models or /tmp unredacted; update _OLLAMA_PATH_POSIX to also match
single-segment absolute paths (e.g., change the pattern to allow a single
segment after the leading slash) and add a regression test case (for example the
string "open /models: permission denied") next to the existing POSIX-path tests
to ensure single-segment paths are sanitized.

In `@tests/unit/providers/management/test_local_models.py`:
- Around line 303-384: Add a parametrized regression test in
TestSanitizeOllamaError that feeds falsey "error" payloads through the sanitizer
(use pytest.mark.parametrize with inputs {"error": ""} and {"error": False}) and
assert _sanitize_ollama_error(...) returns the fallback "Pull failed"; this
ensures the stream-level case that checks presence of the "error" key (not
truthiness) remains correct—place the new test next to
test_non_string_falls_back and reference _sanitize_ollama_error in the
assertion.

In `@web/src/__tests__/stores/websocket.test.ts`:
- Around line 444-473: The test creates a spy on window.setTimeout
(setTimeoutSpy) but never restores it, causing mock leakage across tests; update
the test around useWebSocketStore.getState().connect() / MockWebSocket.latest()
to restore the spy in the finally block (or call vi.restoreAllMocks()) —
specifically call setTimeoutSpy.mockRestore() (or vi.restoreAllMocks()) before
mathRandom.mockRestore() so the setTimeout spy is removed after the test
completes.

In `@web/src/stores/websocket.ts`:
- Around line 232-249: The jitter multiplication can push the final reconnect
delay above WS_RECONNECT_MAX_DELAY; modify the delay calculation after computing
jitterMultiplier so the rounded value is also capped to WS_RECONNECT_MAX_DELAY.
Specifically, replace the current delay assignment (const delay = Math.max(1,
Math.round(baseDelay * jitterMultiplier))) with a form that applies
Math.min(..., WS_RECONNECT_MAX_DELAY) around the rounded jittered value and
still enforces the 1ms floor (e.g., Math.max(1, Math.min(WS_RECONNECT_MAX_DELAY,
Math.round(baseDelay * jitterMultiplier)))), leaving the existing baseDelay,
jitterMultiplier and reconnectAttempts++ logic intact.
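
The clamp ordering suggested for the reconnect delay, rendered as a Python sketch rather than the store's TypeScript. The base delay, the cap, the 25% jitter width, and the exponential growth rule are assumed values; only the max(1, min(cap, round(...))) shape follows the comment.

```python
import random


def reconnect_delay_ms(attempt, base_ms=1000, max_ms=30_000,
                       jitter=0.25, rand=random.random):
    """Exponential backoff with symmetric jitter, capped after jittering."""
    base_delay = min(max_ms, base_ms * 2 ** attempt)
    # rand() in [0, 1) maps to a multiplier in [1 - jitter, 1 + jitter).
    multiplier = 1 + (rand() * 2 - 1) * jitter
    # Cap after applying jitter so the result never exceeds max_ms,
    # and keep the 1 ms floor.
    return max(1, min(max_ms, round(base_delay * multiplier)))
```

Capping only the base delay leaves the upper jitter band free to overshoot the maximum, which is the case the second clamp closes.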

---

Outside diff comments:
In `@src/synthorg/api/controllers/providers.py`:
- Around line 1429-1451: Update the function docstring to reflect the actual
default page size by replacing the hard-coded "default 50" with a reference to
DEFAULT_LIMIT (or its numeric value) for the `limit` parameter description;
locate the docstring that documents `limit` in the provider audit listing
function (the one returning PaginatedResponse[ProviderAuditEvent]) and change
"Page size (default 50, max ``MAX_LIMIT``)" to something like "Page size
(default DEFAULT_LIMIT, max ``MAX_LIMIT``)" or the literal value of
DEFAULT_LIMIT so generated docs match the code.

In `@src/synthorg/api/controllers/scaling.py`:
- Line 214: The list_decisions function currently sets a hardcoded default limit
(limit: CursorLimit = 50); change it to use the shared DEFAULT_LIMIT constant
for consistency with list_strategies and list_signals by replacing the default
value for the limit parameter in list_decisions with DEFAULT_LIMIT (ensure
DEFAULT_LIMIT is imported or in scope where list_decisions is defined).

---

Duplicate comments:
In `@tests/unit/api/controllers/test_setup_personality.py`:
- Around line 42-48: The persistence check is too broad because any agent might
already have "visionary_leader"; capture a stable identifier from the PUT
response (e.g., put_resp.json()["data"]["id"] or
put_resp.json()["data"]["agent_name"]) and then assert that the GET
/api/v1/setup/agents response contains an agent matching both that identifier
and the expected "personality_preset". Locate the PUT response variable
(put_resp) and the GET variables (get_resp / agents) and change the assertion to
filter agents by the returned id/name and verify that the matching agent's
"personality_preset" == "visionary_leader".
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f07d0ca9-977e-4c18-9850-b650bda2ad37

📥 Commits

Reviewing files that changed from the base of the PR and between 523071e and 8af2981.

⛔ Files ignored due to path filters (2)
  • src/synthorg/persistence/postgres/revisions/atlas.sum is excluded by !**/*.sum
  • src/synthorg/persistence/sqlite/revisions/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (66)
  • docs/reference/conventions.md
  • scripts/loop_bound_init_baseline.txt
  • scripts/mock_spec_baseline.txt
  • scripts/no_magic_numbers_baseline.txt
  • scripts/run_affected_tests.py
  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/settings.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/setup_models.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql
  • src/synthorg/persistence/postgres/schema.sql
  • src/synthorg/persistence/sqlite/revisions/20260508204132_perf_indices_cost_decision.sql
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/settings/definitions/api.py
  • src/synthorg/settings/definitions/memory.py
  • tests/integration/api/controllers/test_providers.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/api/controllers/test_setup.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/test_dto_forbid_extra.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/scripts/test_run_affected_tests.py
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/api/client.ts
  • web/src/api/endpoints/providers.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/api/endpoints/setup.ts
  • web/src/api/types/providers.ts
  • web/src/api/types/setup.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/stores/websocket.ts
  • web/src/utils/app-version.ts
  • web/src/utils/constants.ts
  • web/src/utils/fetch-with-retry.ts
  • web/src/utils/retry-after.ts
💤 Files with no reviewable changes (3)
  • web/src/api/types/setup.ts
  • src/synthorg/api/controllers/setup_models.py
  • tests/unit/api/test_dto_forbid_extra.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Deploy Preview
  • GitHub Check: Build Backend
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (12)
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/api/types/providers.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/utils/app-version.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/mocks/handlers/index.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/api/endpoints/setup.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/utils/constants.ts
  • web/src/api/endpoints/providers.ts
  • web/src/stores/websocket.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/utils/retry-after.ts
  • web/src/utils/fetch-with-retry.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Prefer interface for defining object shapes in TypeScript

Files:

  • web/src/api/types/providers.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/utils/app-version.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/mocks/handlers/index.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/api/endpoints/setup.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/utils/constants.ts
  • web/src/api/endpoints/providers.ts
  • web/src/stores/websocket.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/utils/retry-after.ts
  • web/src/utils/fetch-with-retry.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Web dashboard components must reuse designs from web/src/components/ui/. Never hardcode hex colors, font-family, pixel spacing, Motion transitions, or BCP 47 locale strings; use design tokens, @/lib/motion presets, and helpers in @/utils/format. Enforced by scripts/check_web_design_system.py.

Files:

  • web/src/api/types/providers.ts
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/mocks/handlers/settings.ts
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/utils/app-version.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • web/src/mocks/handlers/index.ts
  • web/src/api/client.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/api/endpoints/setup.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/providers.ts
  • web/src/utils/constants.ts
  • web/src/api/endpoints/providers.ts
  • web/src/stores/websocket.ts
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/utils/retry-after.ts
  • web/src/utils/fetch-with-retry.ts
**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

No default may privilege a region, currency, or locale. Resolution: user/company → browser/system → neutral fallback. Use International/British English UI default (e.g. colour, behaviour, organise, centred, analyse).

Files:

  • web/src/api/types/providers.ts
  • src/synthorg/api/controllers/events.py
  • web/src/__tests__/stores/sinks.test.ts
  • web/src/mocks/handlers/settings.ts
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • web/src/api/endpoints/scaling.ts
  • web/src/api/endpoints/settings.ts
  • web/src/utils/app-version.ts
  • web/src/__tests__/utils/fetch-with-retry.test.ts
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • web/src/mocks/handlers/index.ts
  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/providers/management/dtos.py
  • web/src/api/client.ts
  • src/synthorg/api/rate_limits/__init__.py
  • web/src/mocks/handlers/helpers.ts
  • src/synthorg/settings/definitions/api.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/integration/api/controllers/test_providers.py
  • src/synthorg/providers/management/local_models.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • web/src/api/endpoints/setup.ts
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/providers.ts
  • tests/unit/api/controllers/test_setup_personality.py
  • src/synthorg/api/controllers/setup_personality.py
  • web/src/utils/constants.ts
  • web/src/api/endpoints/providers.ts
  • tests/unit/api/controllers/test_providers.py
  • web/src/stores/websocket.ts
  • scripts/run_affected_tests.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/scripts/test_run_affected_tests.py
  • web/src/__tests__/stores/websocket.test.ts
  • tests/integration/persistence/test_perf_indices_postgres.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • web/src/utils/retry-after.ts
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • web/src/utils/fetch-with-retry.ts
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/scaling.py
  • tests/unit/api/rate_limits/test_policies.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/settings.py
  • tests/unit/api/controllers/test_setup.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: All cost-bearing Pydantic models must carry currency: CurrencyCode. Mixing currencies in aggregations raises MixedCurrencyAggregationError (HTTP 409, error code 4007). Aggregations call assert_currencies_match() from synthorg.budget.currency before reducing. Per-line opt-out: # lint-allow: currency-aggregation -- <reason>.
Never use unguarded sum(), math.fsum(), statistics.mean(), statistics.fmean() (including bare-name imports) over .cost, .amount, .total_cost, .usd, or .eur fields without asserting currency invariants. Enforced by scripts/check_currency_aggregation_invariant.py. Per-line opt-out: # lint-allow: currency-aggregation -- <reason> (mandatory non-empty reason).
src/synthorg/persistence/ is the only place that may import aiosqlite, sqlite3, psycopg, psycopg_pool, or emit raw SQL DDL/DML. Every durable feature defines a Protocol in persistence/<domain>_protocol.py with concrete impls under persistence/{sqlite,postgres}/ exposed on PersistenceBackend. Controllers and API endpoints access persistence through domain-scoped service layers; services centralize audit logging; repositories must not log mutations. Per-line opt-out: # lint-allow: persistence-boundary -- <reason>. Enforced by scripts/check_persistence_boundary.py.
Provide type hints on all public functions. mypy strict enforcement required.
Use Google-style docstrings on public classes and functions. Enforced by ruff D rules.
Never mutate objects; create new objects via model_copy(update=...) or copy.deepcopy(). Frozen Pydantic for config/identity; MappingProxyType for non-Pydantic registries; deepcopy at system boundaries.
Separate frozen config models from mutable-via-copy runtime models; never mix in one model.
Use Pydantic v2 with ConfigDict(frozen=True, allow_inf_nan=False) everywhere. Apply extra="forbid" on every model that doesn't round-trip through model_dump() (every API-boundary DTO with Request/Response/Sna...

Files:

  • src/synthorg/api/controllers/events.py
  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/settings/definitions/api.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/settings.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/controllers/events.py
  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/settings/definitions/api.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/settings.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS protection): a silent peer is closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures are tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds = 60s, api.ws_revalidation_max_failures = 5); saturation closes the socket with code 4011.

Files:

  • src/synthorg/api/controllers/events.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/api/controllers/setup_personality.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/settings.py
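
For readers unfamiliar with the revalidation guard described in the guideline above, here is a minimal sketch of a sliding-window failure counter. The class name, constructor parameters, and the choice to treat "saturated" as reaching max_failures inside the window are assumptions for illustration; the project's _SlidingWindowRateLimiter may differ in all of these.

```python
import time
from collections import deque
from collections.abc import Callable


class SlidingWindowFailureLimiter:
    """Hypothetical stand-in for the project's _SlidingWindowRateLimiter."""

    def __init__(
        self,
        window_seconds: float = 60.0,
        max_failures: int = 5,
        now: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window_seconds
        self._max = max_failures
        self._now = now  # injectable time source so tests stay deterministic
        self._failures: deque[float] = deque()

    def record_failure(self) -> bool:
        """Record one failure; return True when the window is saturated."""
        t = self._now()
        self._failures.append(t)
        # Evict failures that have aged out of the window.
        while self._failures and t - self._failures[0] >= self._window:
            self._failures.popleft()
        # Assumption: saturation means reaching max_failures within the window.
        return len(self._failures) >= self._max
```

A caller would close the socket (code 4011 in the guideline) when record_failure() returns True.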
**/{src,tests,docs,web}/**/*.{py,md,mdx,yaml,yml,json}

📄 CodeRabbit inference engine (CLAUDE.md)

No em-dashes in code, config, or documentation. Use -- (two hyphens). Enforced by pre-commit hook.

Files:

  • src/synthorg/api/controllers/events.py
  • docs/reference/conventions.md
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/settings/definitions/api.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/integration/api/controllers/test_providers.py
  • src/synthorg/providers/management/local_models.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/unit/api/controllers/test_setup_personality.py
  • src/synthorg/api/controllers/setup_personality.py
  • tests/unit/api/controllers/test_providers.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/scripts/test_run_affected_tests.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/scaling.py
  • tests/unit/api/rate_limits/test_policies.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/settings.py
  • tests/unit/api/controllers/test_setup.py
**/*.{md,mdx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use fenced code blocks with d2 language tag for architecture/nested containers diagrams and mermaid for flowcharts/sequence/pipelines in documentation. Never use text fences with ASCII box-drawing. Use markdown tables for tabular data.

Files:

  • docs/reference/conventions.md
web/src/mocks/handlers/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/mocks/handlers/**/*.ts: Mirror every exported endpoint in web/src/api/endpoints/*.ts with a 1:1 default happy-path MSW handler in web/src/mocks/handlers/; boot test-setup with onUnhandledRequest: 'error' and override per-case via server.use(...), never vi.mock('@/api/endpoints/*')
Use typed envelope helpers (successFor, paginatedFor, voidSuccess) in MSW handlers to keep handlers in lockstep with endpoint return types

Files:

  • web/src/mocks/handlers/settings.ts
  • web/src/mocks/handlers/index.ts
  • web/src/mocks/handlers/helpers.ts
  • web/src/mocks/handlers/scaling.ts
  • web/src/mocks/handlers/setup.ts
  • web/src/mocks/handlers/providers.ts
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit / integration / e2e / slow. Every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt. Without spec= mocks silently absorb every attribute access. Enforced by scripts/check_mock_spec.py. Use mock_dispatcher from tests/conftest.py for shared mocks.
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Never use the monkeypatch.setattr(module.logger, ...) antipattern; the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally with del proxy.<level> instead (see _logger_info_spy in tests/unit/settings/test_service.py).
Prefer @pytest.mark.parametrize for similar test cases.
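
The spec= requirement above is easiest to see with a small contrast. The sketch below uses only the standard library; the Dispatcher class is a hypothetical collaborator invented for the illustration, not a class from this repository.

```python
from unittest.mock import Mock


class Dispatcher:
    """Hypothetical collaborator, used only for this illustration."""

    def dispatch(self, event: str) -> None:
        """Pretend to deliver an event."""


# Without spec=, a misspelled method is silently auto-created and "works".
loose = Mock()
loose.dispatchh("typo")

# With spec=, only attributes that exist on Dispatcher are allowed.
strict = Mock(spec=Dispatcher)
strict.dispatch("ok")

typo_error = None
try:
    strict.dispatchh("typo")
except AttributeError as exc:
    typo_error = exc  # the typo is caught at the call site
```

The unspecced mock records the misspelled call as if it were valid, which is the silent-absorption failure mode the guideline warns about.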

Files:

  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/scripts/test_run_affected_tests.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/controllers/test_setup.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/integration/api/controllers/test_providers.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/api/controllers/test_providers.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/scripts/test_run_affected_tests.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • tests/unit/api/rate_limits/test_policies.py
  • tests/unit/api/controllers/test_setup.py
web/src/utils/constants.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Keep the WebSocket wire protocol constants (WS_PROTOCOL_VERSION, WS_MAX_MESSAGE_SIZE, WS_HEARTBEAT_INTERVAL_MS, WS_PONG_TIMEOUT_MS, LOG_SANITIZE_MAX_LENGTH) in web/src/utils/constants.ts in lockstep with src/synthorg/api/ws_models.py / src/synthorg/api/controllers/ws.py; bump protocol version on both sides together for breaking payload changes

Files:

  • web/src/utils/constants.ts
web/src/stores/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/stores/**/*.ts: All store mutation actions (create / update / delete) must follow the stores/connections/crud-actions.ts pattern: wrap API calls in try/catch, success updates state + emits success toast, failure logs + emits error toast + returns sentinel (null for entity, false for delete); callers MUST NOT wrap store mutation calls in try/catch
List-read store actions must set error: string | null on the store instead of toasting; use opaque cursor-based pagination via PaginationMeta, keep nextCursor + hasMore in state (not offset arithmetic), and early-return when !hasMore || !nextCursor
Always capture previous synchronously in optimistic mutations and restore in the catch block
Any new Zustand store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global afterEach in test-setup.tsx
Store files over ~600 lines must be sliced into packages with one of two aggregation patterns: package-internal index.ts or sibling .ts aggregator

Files:

  • web/src/stores/websocket.ts
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/api/controllers/events.py
  • tests/unit/api/rate_limits/test_controller_coverage.py
  • tests/integration/api/test_per_op_rate_limit_concurrent.py
  • src/synthorg/settings/definitions/memory.py
  • src/synthorg/providers/management/dtos.py
  • src/synthorg/api/rate_limits/__init__.py
  • src/synthorg/settings/definitions/api.py
  • tests/unit/memory/embedding/test_fine_tune_orchestrator.py
  • tests/integration/api/controllers/test_providers.py
  • src/synthorg/providers/management/local_models.py
  • tests/unit/api/controllers/test_setup_agent_ops.py
  • tests/unit/api/controllers/test_settings_sinks.py
  • src/synthorg/memory/embedding/fine_tune_orchestrator.py
  • tests/unit/meta/telemetry/test_emitter.py
  • tests/unit/meta/telemetry/test_anonymizer.py
  • tests/unit/api/controllers/test_setup_personality.py
  • src/synthorg/api/controllers/setup_personality.py
  • tests/unit/api/controllers/test_providers.py
  • scripts/run_affected_tests.py
  • src/synthorg/providers/drivers/litellm_driver.py
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/scripts/test_run_affected_tests.py
  • tests/integration/persistence/test_perf_indices_postgres.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/unit/providers/drivers/test_litellm_auth.py
  • tests/unit/api/controllers/test_memory_admin.py
  • tests/integration/persistence/test_perf_indices_sqlite.py
  • src/synthorg/api/controllers/setup.py
  • src/synthorg/api/controllers/scaling.py
  • tests/unit/api/rate_limits/test_policies.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/meta/telemetry/emitter.py
  • src/synthorg/api/controllers/settings.py
  • tests/unit/api/controllers/test_setup.py
🔇 Additional comments (59)
scripts/run_affected_tests.py (3)

203-227: LGTM!

The new regexes and their documentation are well-structured:

  • _NODE_DOWN_RE correctly captures the worker ID from the xdist dead-worker signature
  • _XDIST_INTERNAL_ERROR_RE with re.MULTILINE properly anchors ^ to any line start, catching scheduler tracebacks that appear mid-output
  • The comments clearly document the relationship between these patterns and the native-level crash chain

592-595: LGTM!

Clean helper that mirrors _parse_worker_crashes style. Returning only worker IDs (rather than (worker, test) pairs) correctly reflects that the node down signature lacks per-test attribution.


665-675: LGTM!

The new branch correctly handles the xdist scheduler crash-cascade pattern:

  • Requires all three signals (node down + INTERNALERROR + non-zero return) to avoid false positives
  • Uses synthetic <worker gwN> identifiers since no test ID is recoverable from this signature
  • Placement after the crashes check ensures the canonical "crashed while running" output still takes precedence when both patterns appear
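
The decision order described in the three comments above can be sketched roughly as follows. The regexes, the classify function, and the "passed"/"failed" labels are assumptions made for illustration; the real _NODE_DOWN_RE and _XDIST_INTERNAL_ERROR_RE in scripts/run_affected_tests.py are not reproduced here, and the per-test "crashed while running" branch is omitted.

```python
import re

# Hypothetical patterns; the script's real regexes may differ.
NODE_DOWN_RE = re.compile(r"\[(gw\d+)\] node down")
INTERNAL_ERROR_RE = re.compile(r"^INTERNALERROR>", re.MULTILINE)
FAILED_RE = re.compile(r"^FAILED ", re.MULTILINE)


def classify(output: str, returncode: int) -> tuple[str, list[str]]:
    """Return (classification, crashed_tests) for one pytest run."""
    # Genuine failures are checked first so crash noise cannot mask them.
    if FAILED_RE.search(output):
        return "failed", []
    # De-duplicate worker IDs while preserving first-seen order.
    workers = list(dict.fromkeys(NODE_DOWN_RE.findall(output)))
    # All three signals are required: node down, INTERNALERROR, non-zero exit.
    if workers and INTERNAL_ERROR_RE.search(output) and returncode != 0:
        return "crash_advisory", [f"<worker {w}>" for w in workers]
    # Fail closed: anything else with a non-zero exit stays a failure.
    return ("passed" if returncode == 0 else "failed"), []
```

Requiring every signal before demoting to an advisory keeps novel failure patterns from passing silently.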
tests/unit/scripts/test_run_affected_tests.py (3)

524-553: LGTM!

Good coverage of the primary scenario: dual-worker node down plus scheduler INTERNALERROR> traceback correctly classified as crash_advisory with exit code forced to 0. The assertion on crashed_tests confirms the synthetic <worker gwN> placeholder format.


556-569: LGTM!

This test verifies the fail-closed behavior: node down alone (without the INTERNALERROR> signature) is not automatically demoted to advisory. The conservative stance prevents silent passes on novel failure patterns.


572-590: LGTM!

Critical precedence test: a genuine FAILED line must never be masked by crash-advisory noise. This confirms the classifier's decision-tree order is correct (real failures checked before the node-down branch).

src/synthorg/meta/telemetry/emitter.py (1)

103-115: Clock seam integration is solid.

The emitter now uses a single injected Clock path for flush timestamps, with a safe SystemClock() default.

Also applies to: 137-137, 240-240

tests/unit/meta/telemetry/test_emitter.py (2)

42-47: Mock hardening is correctly applied.

Using specced AsyncMock for _send_batch across these tests tightens interface checks without changing test intent.

Also applies to: 60-65, 90-95, 111-116, 130-135, 154-159, 172-177, 187-192, 213-218, 472-477


429-449: Clock seam tests are strong and deterministic.

The new FakeClock coverage validates both initial _last_flush_at capture and flush() timestamp updates, with proper emitter shutdown in the task-creating path.

Also applies to: 451-487

tests/unit/meta/telemetry/test_anonymizer.py (1)

437-446: Hypothesis health-check suppression update is appropriate.

Re-listing differing_executors in per-test @settings is a good fix for repeat-run executor isolation behavior.

Also applies to: 476-479

src/synthorg/memory/embedding/fine_tune_orchestrator.py (1)

80-96: Clock seam adoption in progress throttling is clean.

The orchestrator now consistently uses the injected clock for throttle timing while preserving default runtime behavior.

Also applies to: 601-609, 633-636

scripts/loop_bound_init_baseline.txt (1)

32-32: Baseline refresh looks correct.

The line-number update is consistent with the constructor shifts in FineTuneOrchestrator.

tests/unit/memory/embedding/test_fine_tune_orchestrator.py (1)

201-258: Throttle seam test coverage is strong.

This validates the injected-clock throttling boundaries with clear hit/miss assertions around _PROGRESS_THROTTLE_SEC.

src/synthorg/providers/drivers/litellm_driver.py (1)

136-143: Credential-cache clock seam is correctly implemented.

Using self._clock.monotonic() for TTL checks makes cache behaviour deterministic in tests while preserving production defaults.

Also applies to: 158-159, 184-194
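
As a rough illustration of why a clock seam makes TTL logic testable, here is a minimal sketch. CredentialCache, the advance() helper, and the rule that an entry expires at exactly the TTL boundary are assumptions; the project's Clock, SystemClock, FakeClock, and LiteLLMDriver internals may look different.

```python
import time
from collections.abc import Callable
from typing import Optional, Protocol


class Clock(Protocol):
    def monotonic(self) -> float: ...


class SystemClock:
    """Production default: delegates to the real monotonic clock."""

    def monotonic(self) -> float:
        return time.monotonic()


class FakeClock:
    """Simplified test double; advance() is a hypothetical helper."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def monotonic(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


class CredentialCache:
    """Hypothetical TTL cache that reads time only through the injected clock."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float,
        clock: Optional[Clock] = None,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock: Clock = clock if clock is not None else SystemClock()
        self._value: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        now = self._clock.monotonic()
        # Assumption: the entry counts as expired at exactly ttl_seconds.
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

A test can then advance virtual time across the TTL boundary without sleeping or patching time.monotonic().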

tests/unit/providers/drivers/test_litellm_auth.py (1)

164-257: Excellent deterministic coverage for credential TTL behaviour.

The new tests lock in within-TTL reuse, post-TTL refetch, exact-boundary semantics, and OAuth always-refetch behavior.

web/src/utils/constants.ts (1)

12-23: Good constant centralisation for reconnect jitter.

Defining the jitter bounds in @/utils/constants keeps reconnect behaviour tunable and testable from one place.

web/src/__tests__/stores/websocket.test.ts (1)

425-428: Nice deterministic reconnect timing assertion.

Advancing past the jitter ceiling makes the reconnect test stable regardless of Math.random() output.
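
The "advance past the jitter ceiling" idea can be sketched in Python as below (the store itself is TypeScript). All constants and function names are hypothetical, since the real values in web/src/utils/constants.ts are not shown here. The point is that a test which waits for the ceiling passes for any random draw.

```python
import random
from collections.abc import Callable

# Hypothetical tuning values, for illustration only.
BASE_DELAY_MS = 1_000
MAX_DELAY_MS = 30_000
JITTER_MIN = 0.5
JITTER_MAX = 1.5


def backoff_ms(attempt: int) -> int:
    """Exponential backoff, capped at MAX_DELAY_MS."""
    return min(BASE_DELAY_MS * 2**attempt, MAX_DELAY_MS)


def reconnect_delay_ms(attempt: int, rand: Callable[[], float] = random.random) -> float:
    """Backoff scaled by a jitter factor drawn from [JITTER_MIN, JITTER_MAX)."""
    factor = JITTER_MIN + (JITTER_MAX - JITTER_MIN) * rand()
    return backoff_ms(attempt) * factor


def jitter_ceiling_ms(attempt: int) -> float:
    """Upper bound on the delay; tests can advance timers past this value."""
    return backoff_ms(attempt) * JITTER_MAX
```

Injecting rand keeps the function deterministic under test while production code keeps the random default.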

src/synthorg/api/controllers/setup.py (1)

421-455: LGTM!

The pagination implementation correctly preserves persisted-array order (addressing the prior review concern about positional index stability), properly accepts cursor/limit parameters, and returns a well-typed PaginatedResponse[SetupAgentSummary]. The inline comment at lines 441-445 clearly documents the rationale for not re-sorting.

web/src/mocks/handlers/helpers.ts (1)

157-216: LGTM!

The paginatedEnvelopeFor helper correctly infers item types from endpoint return shapes (arrays, readonly arrays, and string-keyed records) and produces typed PaginatedResponse envelopes. The string extends keyof R guard properly discriminates index-signature records from shaped objects, so misuse against single-resource endpoints fails to type-check.

web/src/utils/retry-after.ts (1)

1-66: LGTM!

The shared Retry-After parser correctly handles both delta-seconds and HTTP-date formats per RFC 9110, sanitizes attacker-controlled header values before logging, clamps negative deltas to zero, and returns DO_NOT_RETRY when the server-requested wait exceeds the bounded budget.
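
The parsing rules summarized above can be sketched in Python as follows (the actual module is TypeScript and is not reproduced here). The function name, the 30-second default budget, and returning DO_NOT_RETRY for unparseable values are assumptions; the sanitized logging step is omitted.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DO_NOT_RETRY = -1
MAX_WAIT_MS = 30_000  # hypothetical retry budget


def parse_retry_after_ms(header: str, now: datetime, max_wait_ms: int = MAX_WAIT_MS) -> int:
    """Return the wait in milliseconds, or DO_NOT_RETRY."""
    value = header.strip()
    if value.isascii() and value.isdigit():
        # delta-seconds form, e.g. "120"
        wait_ms = int(value) * 1000
    else:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return DO_NOT_RETRY  # assumption: unparseable means do not retry
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        # A date in the past clamps to zero, meaning "retry immediately".
        wait_ms = max(0, int((when - now).total_seconds() * 1000))
    # Waits beyond the budget are surfaced to the caller instead of slept.
    return wait_ms if wait_ms <= max_wait_ms else DO_NOT_RETRY
```

Passing now explicitly, rather than reading the wall clock inside the function, keeps the HTTP-date branch deterministic in tests.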

web/src/api/client.ts (1)

157-174: LGTM!

The retry decision now correctly handles waitMs === 0 as a valid "retry immediately" signal per RFC 9110, while DO_NOT_RETRY (-1) surfaces the 429 to the caller. The sleep is appropriately skipped when waitMs <= 0, and the behavior aligns with fetchWithRetryAfter.

src/synthorg/api/controllers/setup_personality.py (1)

141-183: LGTM!

The pagination implementation correctly sorts presets alphabetically (appropriate since presets are static with no positional write operations), uses proper type annotations, and returns a well-typed PaginatedResponse[PersonalityPresetInfoResponse].

web/src/api/endpoints/setup.ts (1)

40-48: LGTM!

Both getAgents and listPersonalityPresets correctly implement cursor pagination using paginateAll with proper URLSearchParams cursor construction and unwrapPaginated extraction. The return types appropriately use readonly arrays.

Also applies to: 105-114

web/src/mocks/handlers/setup.ts (1)

116-117: LGTM!

The handlers now use paginatedEnvelopeFor<typeof getAgents>() and paginatedEnvelopeFor<typeof listPersonalityPresets>(), keeping the mock responses type-safe and in lockstep with the endpoint contracts. This addresses the prior review comment about using typed envelope helpers.

Also applies to: 154-155

web/src/utils/fetch-with-retry.ts (1)

125-168: LGTM!

The fetchWithRetryAfter implementation correctly handles:

  • Idempotency detection from both Request objects and RequestInit (addressing prior review)
  • AbortSignal cancellation during retry sleep (addressing prior review)
  • Retry budget enforcement via MAX_RATE_LIMIT_RETRIES
  • Immediate retry when waitMs === 0 and early exit on DO_NOT_RETRY
web/src/utils/app-version.ts (1)

17-17: LGTM!

The switch to fetchWithRetryAfter with idempotent: true is appropriate for the replay-safe logout call. The existing AbortController timeout remains as a safeguard, and the comment correctly explains the idempotency rationale.

Also applies to: 122-135

src/synthorg/settings/definitions/api.py (1)

23-69: LGTM!

The new Final[int] constants are correctly placed in settings/definitions/api.py (the allowlisted home for numeric tuning knobs), well-documented, and follow naming conventions. This addresses the previous review feedback about moving magic numbers out of policies.py.

src/synthorg/api/rate_limits/policies.py (1)

1-25: LGTM!

The refactoring properly addresses past review feedback:

  • Rate-limit and inflight defaults now reference constants from settings/definitions/api.py instead of inline literals.
  • The new per_op_concurrency_from_policy helper mirrors the existing per_op_rate_limit_from_policy pattern.
  • Both registries are immutable via MappingProxyType with clear docstrings.

Also applies to: 33-44, 114-118, 228-249, 294-324

src/synthorg/api/rate_limits/__init__.py (1)

35-39: LGTM!

The new exports (INFLIGHT_POLICIES, per_op_concurrency_from_policy) are correctly added to the module's public API and maintain alphabetical ordering in __all__.

Also applies to: 47-47, 60-60

src/synthorg/api/controllers/events.py (1)

22-25: LGTM!

The stream endpoint now uses policy-driven guards for both rate limiting and concurrency, addressing the previous review feedback about hardcoded max_inflight=4. The key parameters are appropriately chosen (user_or_ip for rate limit to handle unauthenticated fallback, user for concurrency).

Also applies to: 530-537

tests/unit/api/rate_limits/test_controller_coverage.py (1)

143-147: LGTM!

The new entry correctly extends the coverage guard to verify that EventStreamController.stream carries the events.stream policy guard, consistent with the production controller changes.

web/src/__tests__/utils/fetch-with-retry.test.ts (1)

1-267: LGTM!

Comprehensive test coverage for fetchWithRetryAfter including:

  • Retry behavior for idempotent vs non-idempotent requests
  • Budget and max-retry enforcement
  • Abort signal handling (pre-abort, mid-sleep abort, default sleep abort)
  • Request object input handling

The regression tests for Request input (lines 206-243) and default sleep abort (lines 245-266) address the previous review feedback.

tests/unit/api/controllers/test_setup.py (4)

724-734: LGTM!

The pagination shape assertions (body["data"], body["pagination"]) and the tampered cursor rejection test correctly validate the new PaginatedResponse wire format.


761-814: LGTM!

The rewritten round-trip test addresses previous review feedback:

  • Builds full oracle by walking pagination (lines 801-813) instead of relying on the default page.
  • Asserts cursor is not None (lines 783-787) to guarantee the loop is actually exercised.
  • Uses strict equality (collected == full) to catch duplicates, gaps, or reordering.

866-880: LGTM!

The persistence assertions now anchor on updated_name from the mutation response, preventing false positives when another agent already has the same values. This addresses previous review feedback.

Also applies to: 1508-1519


1606-1654: LGTM!

The personality presets round-trip test correctly:

  • Builds the full oracle by walking pagination at high limit (lines 1617-1625).
  • Uses strict equality assertion (line 1645).
  • Adds tampered cursor rejection coverage (lines 1647-1654).
web/src/api/types/providers.ts (1)

79-84: LGTM!

The optional name field correctly models the wire contract where paginated list endpoints populate the provider name in each entry, while single-provider GET responses omit it (since the name is already in the URL path). The JSDoc clearly documents this distinction.

tests/unit/api/controllers/test_providers.py (2)

20-31: LGTM!

The test correctly validates the new cursor-pagination envelope structure:

  • data is now an array (not a dict)
  • pagination.has_more and pagination.next_cursor fields are present

The new test_list_providers_tampered_cursor test appropriately verifies that invalid cursors are rejected with HTTP 400, which is essential for the HMAC-signed cursor security model.


59-59: LGTM!

The name=None parameter now correctly aligns with the updated to_provider_response() signature where name is a required keyword-only argument. This ensures single-provider responses (which don't need the name in the payload) are handled correctly.

Also applies to: 77-77, 95-95

web/src/api/endpoints/providers.ts (3)

46-70: LGTM!

The cursor pagination implementation is well-designed:

  • Object.create(null) creates a null-prototype object, preventing prototype pollution
  • Explicit filtering of __proto__, constructor, prototype keys adds defense-in-depth
  • Logging a warning for missing name (without including the payload content) correctly surfaces wire contract violations without leaking potentially sensitive data

77-87: LGTM!

The getProviderModels pagination follows the same clean pattern as listProviders, correctly building the cursor query string and unwrapping paginated responses.


225-242: LGTM!

Using fetchWithRetryAfter with { idempotent: true } for the model pull SSE stream is appropriate. The comment correctly explains that while POST is typically non-idempotent, model pulls resolve to the same state on re-pull, making retry-on-429 safe for the stream initiation request.

web/src/mocks/handlers/providers.ts (2)

152-155: LGTM!

The MSW handlers now correctly return cursor-pagination envelopes via paginatedEnvelopeFor<typeof listProviders>() and paginatedEnvelopeFor<typeof getProviderModels>(). This keeps the mocks in sync with the updated endpoint return types and addresses the previous review feedback about using typed helpers.

Also applies to: 194-196


346-383: LGTM!

The new fixture builders (buildProviderAuditEvent, buildRateLimitsConfig, buildPresetOverride) follow the established pattern with sensible defaults and Partial<T> overrides for test customization.

src/synthorg/api/controllers/settings.py (4)

183-234: LGTM!

The sink identifier implementation elegantly addresses the previous review concerns:

  • SHA-256 fingerprints ensure uniqueness per sink instance (solving the duplicate ID issue for multiple HTTP/SYSLOG sinks)
  • Type-prefixed identifiers (file:, syslog:, http:, otlp:) provide clear categorization
  • Hashing prevents credential/path leakage from the public API envelope
  • The SINK_IDENTIFIER_FINGERPRINT_LENGTH constant centralizes the wire-format contract
  • The unnamed-{type} fallback is appropriately defensive for future sink types
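
A minimal sketch of the identifier scheme described in the list above, assuming a fingerprint length of 12 hex characters and a single string "target" per sink (path, URL, or host). The real _sink_identifier signature and the actual value of SINK_IDENTIFIER_FINGERPRINT_LENGTH are not shown in this review, so both are placeholders.

```python
import hashlib

SINK_IDENTIFIER_FINGERPRINT_LENGTH = 12  # assumed value, for illustration only
_KNOWN_PREFIXES = ("file", "syslog", "http", "otlp")


def sink_identifier(sink_type: str, target: str) -> str:
    """Build a type-prefixed, hashed identifier for one sink."""
    if sink_type not in _KNOWN_PREFIXES:
        # Defensive fallback for sink types added in the future.
        return f"unnamed-{sink_type}"
    # Hash the target so paths and credentials never reach the API envelope.
    digest = hashlib.sha256(target.encode("utf-8")).hexdigest()
    return f"{sink_type}:{digest[:SINK_IDENTIFIER_FINGERPRINT_LENGTH]}"
```

Two HTTP sinks with different URLs get different identifiers, and neither identifier reveals the URL.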

153-181: LGTM!

The new typed DTOs (SinkRotationResponse, SinkInfoResponse) follow all Pydantic conventions:

  • ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid")
  • NotBlankStr for identifier/name fields
  • Proper Field constraints (ge=0 for counts)
  • Typed routing_prefixes: tuple[str, ...] instead of untyped collection

968-993: LGTM!

_append_disabled_defaults is now pure -- it returns a new list [*sinks, *disabled_defaults] instead of mutating the input, addressing the previous review feedback about immutability. The implementation uses list comprehension and unpacking for clarity.


632-646: Redaction of operator-supplied blobs is well-handled.

The warning log correctly avoids embedding overrides_json / custom_json content (which could contain credentials, paths, or auth material). Instead, it logs only coarse metadata (has_sink_overrides, sink_overrides_length) for observability without secret leakage.

docs/reference/conventions.md (1)

54-58: LGTM!

The documentation update accurately reflects the cursor-based pagination changes:

  • Wire format now shows {limit, next_cursor, has_more} (no offset/total)
  • Clearly explains that cursors are HMAC-signed and opaque to clients
  • Explicitly notes there's no total count on the wire

This aligns with the implementation across all paginated endpoints in this PR.
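
To make the "HMAC-signed and opaque" property concrete, here is a minimal standard-library sketch. The cursor payload (a plain offset), the helper names, and InvalidCursorError are assumptions; the project's paginate_cursor may encode different state. Only the envelope keys (data, pagination, limit, next_cursor, has_more) come from the review text.

```python
import base64
import hashlib
import hmac
import json


class InvalidCursorError(ValueError):
    """Raised for malformed or tampered cursors (would map to HTTP 400)."""


def encode_cursor(offset: int, secret: bytes) -> str:
    payload = json.dumps({"offset": offset}).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(signature + payload).decode("ascii")


def decode_cursor(cursor: str, secret: bytes) -> int:
    try:
        raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    except ValueError as exc:  # covers binascii.Error and UnicodeEncodeError
        raise InvalidCursorError("malformed cursor") from exc
    signature, payload = raw[:32], raw[32:]  # SHA-256 digests are 32 bytes
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature, expected):
        raise InvalidCursorError("cursor signature mismatch")
    return int(json.loads(payload)["offset"])


def paginate(items: list, cursor, limit: int, secret: bytes) -> dict:
    """Return one page in the {data, pagination} envelope."""
    start = 0 if cursor is None else decode_cursor(cursor, secret)
    page = list(items[start:start + limit])
    end = start + len(page)
    has_more = end < len(items)
    return {
        "data": page,
        "pagination": {
            "limit": limit,
            "next_cursor": encode_cursor(end, secret) if has_more else None,
            "has_more": has_more,
        },
    }
```

Because the signature covers the payload, a client cannot edit or forge a cursor without the server secret, which is the behaviour the tampered-cursor tests in this PR check.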

src/synthorg/api/controllers/scaling.py (2)

155-207: LGTM!

The list_strategies endpoint is correctly updated for cursor pagination:

  • Accepts cursor: CursorParam and limit: CursorLimit
  • Sorts strategies by name for stable pagination order
  • Uses paginate_cursor with the app's cursor secret
  • Returns a properly typed empty PaginatedResponse when the scaling service is unavailable

256-322: LGTM!

The list_signals endpoint follows the same clean pagination pattern. The signal deduplication logic (via seen set) is preserved, and the final sorted tuple provides stable cursor ordering.

web/src/api/endpoints/scaling.ts (1)

42-51: LGTM!

Both getScalingStrategies() and getScalingSignals() correctly implement cursor pagination using the same pattern as listProviders():

  • Build URLSearchParams for cursor when present
  • Construct URL with optional query string
  • Use paginateAll to iterate through all pages
  • unwrapPaginated extracts data from each paginated response

The implementation is consistent across the codebase.

Also applies to: 70-79

web/src/mocks/handlers/scaling.ts (1)

17-19: Paginated MSW response shape is correctly aligned for strategies/signals.

These handlers now return paginated envelopes matching the cursor-based wire contract, which keeps store/tests in sync with the endpoint transition.

Also applies to: 27-29

web/src/mocks/handlers/index.ts (1)

47-57: Helper export update looks good.

Re-exporting paginatedEnvelopeFor from the handlers entrypoint keeps mock usage consistent across tests/stories.

web/src/api/endpoints/settings.ts (1)

50-58: listSinks() pagination migration is solid.

Good use of paginateAll + unwrapPaginated to adapt to cursor pages while keeping the existing endpoint return type unchanged.

web/src/mocks/handlers/settings.ts (1)

67-69: Sinks mock is correctly switched to paginated envelope output.

This keeps the MSW response shape aligned with the updated /settings/observability/sinks endpoint behavior.

web/src/__tests__/stores/sinks.test.ts (1)

9-17: Updated test fixtures correctly reflect paginated sink responses.

Using a dedicated PaginatedResponse<SinkInfo> helper here keeps success-path tests aligned with the new API contract.

Also applies to: 48-50, 67-70, 117-119

tests/unit/api/controllers/test_settings_sinks.py (2)

67-70: Canonical identifier expectation is a good stabilization.

Using _sink_identifier keeps default-id assertions in lockstep with endpoint identifier generation.


151-195: Excellent pagination hardening tests.

The round-trip cursor walk plus tampered-cursor 400 assertion directly validates the new pagination integrity guarantees.

Comment thread src/synthorg/api/controllers/memory.py
Comment thread src/synthorg/api/controllers/providers.py
Comment on lines +1 to +6
-- Create index "idx_cost_records_agent_timestamp" to table: "cost_records"
CREATE INDEX "idx_cost_records_agent_timestamp" ON "cost_records" ("agent_id", "timestamp" DESC);
-- Create index "idx_cost_records_task_timestamp" to table: "cost_records"
CREATE INDEX "idx_cost_records_task_timestamp" ON "cost_records" ("task_id", "timestamp" DESC);
-- Create index "idx_dr_task_recorded_id" to table: "decision_records"
CREATE INDEX "idx_dr_task_recorded_id" ON "decision_records" ("task_id", "recorded_at", "id");
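
For context, an EXPLAIN-style assertion against the SQLite flavour of the first index could look roughly like the sketch below. The table definition is a simplified stand-in (the real cost_records schema has more columns), and the query is an assumed access pattern rather than one copied from the repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cost_records (
        id INTEGER PRIMARY KEY,
        agent_id TEXT NOT NULL,
        task_id TEXT NOT NULL,
        timestamp TEXT NOT NULL
    );
    CREATE INDEX idx_cost_records_agent_timestamp
        ON cost_records (agent_id, timestamp DESC);
    """
)

# Ask SQLite how it would run a "latest records for one agent" query.
rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM cost_records "
    "WHERE agent_id = ? ORDER BY timestamp DESC LIMIT 50",
    ("agent-1",),
).fetchall()

# The fourth column of each plan row holds the human-readable detail text.
plan = " | ".join(row[3] for row in rows)
uses_index = "idx_cost_records_agent_timestamp" in plan
needs_sort = "TEMP B-TREE" in plan  # would indicate a separate sort step
conn.close()
```

A test would assert that uses_index is true and needs_sort is false, showing the composite index serves both the filter and the ordering.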

🧹 Nitpick | 🔵 Trivial | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

# Find the schema definitions for the tables mentioned
find . -type f \( -name "*.sql" -o -name "*.yml" -o -name "*.yaml" \) | head -20

Repository: Aureliolo/synthorg

Length of output: 920


🏁 Script executed:

# Search for cost_records and decision_records table definitions
rg "cost_records|decision_records" -t sql --max-count 20

Repository: Aureliolo/synthorg

Length of output: 9939


🏁 Script executed:

# Look for Atlas configuration
find . -type f -name "atlas.hcl" -o -name "atlas.yml" -o -name ".atlas*"

Repository: Aureliolo/synthorg

Length of output: 74


🏁 Script executed:

# Check if other migrations use CREATE INDEX CONCURRENTLY
rg "CREATE INDEX CONCURRENTLY" -t sql

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

# Look at the directory structure of migrations
ls -la src/synthorg/persistence/postgres/revisions/ | head -20

Repository: Aureliolo/synthorg

Length of output: 1436


🏁 Script executed:

# Read atlas.hcl to understand transaction handling
cat atlas.hcl

Repository: Aureliolo/synthorg

Length of output: 1029


🏁 Script executed:

# Read the schema.sql to understand the tables
cat src/synthorg/persistence/postgres/schema.sql | head -150

Repository: Aureliolo/synthorg

Length of output: 6476


🏁 Script executed:

# Check a recent migration to see Atlas pattern
cat src/synthorg/persistence/postgres/revisions/20260427161641_perf_indices_and_a2a_peer.sql

Repository: Aureliolo/synthorg

Length of output: 866


🏁 Script executed:

# Read more of schema.sql to see decision_records definition
cat src/synthorg/persistence/postgres/schema.sql | grep -A 30 "Create \"decision_records\" table"

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

# Look for any comments about write patterns or deployment in the codebase
rg "append-heavy|write-heavy|TimescaleDB|CONCURRENTLY|transaction" -t sql src/synthorg/persistence/postgres/

Repository: Aureliolo/synthorg

Length of output: 624


🏁 Script executed:

# Check if atlas.hcl has any transaction settings in the migration config
cat atlas.hcl | grep -A 10 "migration"

Repository: Aureliolo/synthorg

Length of output: 681


🏁 Script executed:

# Check if there's any documentation on deployment or migration execution
find . -type f \( -name "README*" -o -name "DEPLOY*" -o -name "docs*" \) | head -20

Repository: Aureliolo/synthorg

Length of output: 196


🏁 Script executed:

# Look for any CI/deployment configuration that might hint at transaction handling
find . -path "./.github/workflows/*.yml" -o -path "./.github/workflows/*.yaml" | xargs grep -l "atlas\|migrate" 2>/dev/null | head -5

Repository: Aureliolo/synthorg

Length of output: 89


🏁 Script executed:

# Check Atlas documentation references or comments about transactions
rg "atlas migrate\|exec\|transaction" --max-count 10

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

# Verify the exact Atlas version or settings
rg "atlas" -t toml -t json -t yaml | head -20

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

# Read the CI workflow to understand how migrations are executed
cat .github/workflows/ci.yml | grep -A 20 -B 5 "atlas\|migrate"

Repository: Aureliolo/synthorg

Length of output: 5021


🏁 Script executed:

# Check if there's any pyproject.toml or setup.cfg with Atlas config
cat pyproject.toml 2>/dev/null | grep -A 10 "atlas\|migrate" || echo "Not found in pyproject.toml"

Repository: Aureliolo/synthorg

Length of output: 90


🏁 Script executed:

# Look at the baseline migration to understand the pattern - read just the indexes section
sed -n '/CREATE INDEX/,/^$/p' src/synthorg/persistence/postgres/revisions/00000000000000_baseline.sql | head -50

Repository: Aureliolo/synthorg

Length of output: 2597


🏁 Script executed:

# Check when these tables are populated in the codebase
rg "insert into cost_records|INSERT INTO cost_records" -i -t sql -t py | head -10

Repository: Aureliolo/synthorg

Length of output: 234


🏁 Script executed:

# Look for any deployment documentation
find . -type f -name "*.md" | xargs grep -l "deploy\|migration\|Atlas" | head -5

Repository: Aureliolo/synthorg

Length of output: 262


🏁 Script executed:

# Check if other tables with CONCURRENTLY are in recent migrations
rg "CREATE.*INDEX" src/synthorg/persistence/postgres/revisions/ -A 1 | tail -50

Repository: Aureliolo/synthorg

Length of output: 6181


🏁 Script executed:

# Read the deployment documentation
cat ./.claude/agents/persistence-reviewer.md

Repository: Aureliolo/synthorg

Length of output: 10280


🏁 Script executed:

# Check if there are any production deployment instructions
find . -type f -name "*.md" | xargs grep -l "production\|deploy\|kubernetes\|aws\|gcp" | head -5

Repository: Aureliolo/synthorg

Length of output: 258


🏁 Script executed:

# Look at the actual migration file directly to confirm its full content
cat src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql

Repository: Aureliolo/synthorg

Length of output: 573


Consider CREATE INDEX CONCURRENTLY for high-write deployments, but verify your deployment context.

Plain CREATE INDEX takes a SHARE lock on cost_records and decision_records for the duration of the build, which blocks writes while still allowing reads. This matters only on actively written tables; if your deployment uses TimescaleDB or runs with concurrent writes during schema updates, use CREATE INDEX CONCURRENTLY instead.

Note: This codebase has no established pattern for concurrent index builds, and atlas.hcl does not configure non-transactional migrations. Before adopting CONCURRENTLY, confirm your deployment requires it and handle the Atlas transaction-wrapper limitation (migrations may need manual adjustment or explicit non-transactional execution).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/synthorg/persistence/postgres/revisions/20260508204145_perf_indices_cost_decision.sql`
around lines 1 - 6, The migration creates regular indexes on cost_records and
decision_records which acquire write-blocking SHARE locks; change the three CREATE
INDEX statements for idx_cost_records_agent_timestamp,
idx_cost_records_task_timestamp and idx_dr_task_recorded_id to use CREATE INDEX
CONCURRENTLY when your deployment has concurrent writes (update the migration in
20260508204145_perf_indices_cost_decision.sql), but first verify
Atlas/non-transactional migration constraints: if Atlas wraps migrations in a
transaction (preventing CONCURRENTLY), mark this migration as non-transactional
or run it manually/outside the transaction wrapper so CONCURRENTLY can be used
safely; ensure you test on a staging instance with writes enabled before
applying to production.

Comment thread src/synthorg/providers/management/local_models.py
Comment on lines +303 to +384
class TestSanitizeOllamaError:
    """Adversarial-input fixtures for the SSE-forwarded error sanitizer."""

    def test_non_string_falls_back(self) -> None:
        assert _sanitize_ollama_error(None) == "Pull failed"
        assert _sanitize_ollama_error(12345) == "Pull failed"
        assert _sanitize_ollama_error({"error": "x"}) == "Pull failed"

    def test_redacts_posix_paths(self) -> None:
        sanitized = _sanitize_ollama_error(
            "open /var/lib/ollama/models/secret.bin: permission denied",
        )
        assert "/var/lib/ollama" not in sanitized
        assert "[REDACTED-PATH]" in sanitized

    def test_redacts_posix_paths_with_spaces(self) -> None:
        # Path segments that contain a literal space must still be
        # consumed in full -- the older regex stopped at the first
        # whitespace and leaked the trailing tail.
        sanitized = _sanitize_ollama_error(
            "open /var/lib/ollama/model cache/secret.bin: permission denied",
        )
        assert "model cache" not in sanitized
        assert "secret.bin" not in sanitized
        assert "[REDACTED-PATH]" in sanitized

    def test_redacts_windows_paths(self) -> None:
        sanitized = _sanitize_ollama_error(
            r"C:\Users\admin\AppData\token.json missing",
        )
        assert "Users" not in sanitized
        assert "[REDACTED-PATH]" in sanitized

    def test_redacts_windows_paths_with_spaces(self) -> None:
        # ``Program Files`` is the canonical Windows-path-with-space
        # case; the redaction must consume both segments.
        sanitized = _sanitize_ollama_error(
            r"C:\Program Files\Ollama\token.json missing",
        )
        assert "Program Files" not in sanitized
        assert "token.json" not in sanitized
        assert "[REDACTED-PATH]" in sanitized

    @pytest.mark.parametrize(
        ("raw", "leak"),
        [
            (
                "dial tcp localhost:11434: connection refused",
                "localhost",
            ),
            (
                "dial tcp 127.0.0.1:11434: connection refused",
                "127.0.0.1",
            ),
            (
                "dial tcp [::1]:11434: connection refused",
                "[::1]",
            ),
            (
                "dial tcp ollama-internal.local:11434: connection refused",
                "ollama-internal.local",
            ),
        ],
    )
    def test_redacts_host_port(self, raw: str, leak: str) -> None:
        sanitized = _sanitize_ollama_error(raw)
        assert leak not in sanitized
        assert "11434" not in sanitized
        assert "[REDACTED-HOST]" in sanitized

    def test_truncates_oversized_input(self) -> None:
        long = "x" * 1000
        sanitized = _sanitize_ollama_error(long)
        assert len(sanitized) <= _OLLAMA_ERROR_MAX_LEN

    def test_benign_message_passes_through(self) -> None:
        sanitized = _sanitize_ollama_error("model not found")
        assert sanitized == "model not found"

    def test_empty_string_falls_back(self) -> None:
        assert _sanitize_ollama_error("") == "Pull failed"
        assert _sanitize_ollama_error(" ") == "Pull failed"

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Add a stream-level regression for falsey error payloads.

These tests prove the helper fallback, but they still will not catch a regression from if "error" in data back to a truthy check. A small parametrised pull-stream case for {"error": ""} and {"error": false} would lock down the exact bug this PR fixed. As per coding guidelines, "Prefer @pytest.mark.parametrize for similar test cases."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/providers/management/test_local_models.py` around lines 303 - 384,
Add a parametrized regression test in TestSanitizeOllamaError that feeds falsey
"error" payloads through the sanitizer (use pytest.mark.parametrize with inputs
{"error": ""} and {"error": False}) and assert _sanitize_ollama_error(...)
returns the fallback "Pull failed"; this ensures the stream-level case that
checks presence of the "error" key (not truthiness) remains correct—place the
new test next to test_non_string_falls_back and reference _sanitize_ollama_error
in the assertion.

Comment thread web/src/__tests__/stores/websocket.test.ts
Comment thread web/src/stores/websocket.ts
memory.py: _resolve_fine_tune_thresholds rejects non-positive override values ('0', '-1') and falls back to FINE_TUNE_* defaults instead of letting them reach _FineTuneThresholds (Field ge=1) and surface as a 500 from /fine-tune/preflight.

providers.py:list_providers: paginate the sorted name list FIRST then build ProviderResponse only for the page slice. Previously a small-page request still paid O(n) DTO construction across every configured provider, defeating the cursor-pagination perf goal. Same shape as list_models from round 8.

providers.py:1450 docstring: 'default 50' replaced with 'default DEFAULT_LIMIT'.

scaling.py:list_decisions: limit default switched from hardcoded 50 to DEFAULT_LIMIT for consistency with list_strategies / list_signals.

local_models.py: _OLLAMA_PATH_POSIX trailing repetition group changed from + to * so single-segment absolute paths (/models, /tmp, /token.json) are also redacted; the previous form left them visible to remote clients via SSE error forwarding.

test_local_models.py: parametrized test_falsey_payloads_fall_back covering '', '   ', False, 0, None to lock the stream-level 'error' presence-vs-truthiness contract; test_redacts_single_segment_posix_paths added as the regex regression.

websocket.test.ts: hoisted setTimeoutSpy above the try block and restored it in finally; previously the spy leaked across tests because vitest config has no restoreMocks and the global afterEach has no vi.restoreAllMocks().

websocket.ts:scheduleReconnect: jittered delay clamped to [1ms, WS_RECONNECT_MAX_DELAY]; the upper-bound jitter multiplier (1.2x) could push the rounded delay past the configured max once baseDelay had already saturated at the cap.

test_setup_personality.py: anchor the persistence assertion on the updated agent's name from the PUT response (mirrors round-5 fixes); the previous any() form could pass via a different agent already carrying the visionary_leader preset.
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/api/controllers/memory.py (1)

916-944: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return the configured fallback batch size when torch is absent.

This helper now accepts default_batch_size, but the ImportError branch still returns None. On CPU-only installs without torch, /fine-tune/preflight loses the recommendation and any memory.fine_tune_default_batch_size override is ignored.

Suggested fix
     except ImportError:
         # torch is optional -- absence is expected on CPU-only installs.
-        return None
+        return default_batch_size
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/memory.py` around lines 916 - 944, In
_recommend_batch_size: when ImportError is raised (torch missing) the function
mistakenly returns None losing the configured default_batch_size; change the
ImportError branch to return the provided default_batch_size parameter so
CPU-only installs honor memory.fine_tune_default_batch_size (update the
ImportError handler in _recommend_batch_size to return default_batch_size
instead of None).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/controllers/memory.py`:
- Around line 152-206: The current _resolve_fine_tune_thresholds() only enforces
per-value positivity and may accept an inverted pair where min_docs_recommended
< min_docs_required; update the function to detect this after computing resolved
values (keys "fine_tune_min_docs_required" and "fine_tune_min_docs_recommended")
and handle it by falling back the recommended value to a safe value (e.g., the
fallback constant FINE_TUNE_MIN_DOCS_RECOMMENDED or at minimum equal to the
required value) before constructing the _FineTuneThresholds instance so the
invariant (recommended >= required) is always preserved.

In `@src/synthorg/api/controllers/scaling.py`:
- Around line 193-200: Extract the magic number 999 into a named constant (e.g.,
PRIORITY_FALLBACK = 999 or load from settings) and replace the inline literal in
the tuple comprehension that builds ScalingStrategyResponse (where
configured_order.get(str(s.name), 999) is used) with that constant; do the same
replacement for the other occurrence referenced around line 434 so both
fallbacks use the same module-level constant or settings value and improve
readability and maintainability.

In `@tests/unit/api/controllers/test_setup_personality.py`:
- Around line 46-52: The test in
tests/unit/api/controllers/test_setup_personality.py currently relies on the
default pagination when calling test_client.get("/api/v1/setup/agents"); modify
the GET to include an explicit limit (e.g., "?limit=100") so the updated agent
is returned regardless of default page, assert the response succeeded (check
get_resp.status_code == 200 or use get_resp.raise_for_status()) before accessing
get_resp.json()["data"], and then perform the existing any(...) assertion using
updated_name and "visionary_leader".
- Around line 116-122: The test currently asserts the endpoint returns status
200 and non-empty data but does not validate the pagination envelope; update the
test that calls test_client.get("/api/v1/setup/personality-presets") (the body
variable) to also assert that body["pagination"] exists and is an object/dict
(e.g., isinstance(body["pagination"], dict) or equivalent) and optionally that
expected pagination keys (like "page", "per_page", "total") are present to
ensure correct pagination shape.
- Around line 128-130: The test calls
test_client.get("/api/v1/setup/personality-presets") and iterates body["data"]
without asserting the response was successful; add an assertion immediately
after obtaining resp (e.g., assert resp.status_code == 200 or resp.ok) to fail
fast and show the actual payload when the endpoint returns an error, then
proceed to parse body = resp.json() and iterate presets.

---

Outside diff comments:
In `@src/synthorg/api/controllers/memory.py`:
- Around line 916-944: In _recommend_batch_size: when ImportError is raised
(torch missing) the function mistakenly returns None losing the configured
default_batch_size; change the ImportError branch to return the provided
default_batch_size parameter so CPU-only installs honor
memory.fine_tune_default_batch_size (update the ImportError handler in
_recommend_batch_size to return default_batch_size instead of None).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 19c8df11-7e84-490c-bc47-d4a98ac5ad64

📥 Commits

Reviewing files that changed from the base of the PR and between 8af2981 and 05827f8.

📒 Files selected for processing (8)
  • src/synthorg/api/controllers/memory.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/providers/management/local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • tests/unit/providers/management/test_local_models.py
  • web/src/__tests__/stores/websocket.test.ts
  • web/src/stores/websocket.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Build Backend
  • GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
  • GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: CodSpeed Web benchmarks
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (9)
web/src/**/*.ts?(x)

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.ts?(x): Always use createLogger from @/lib/logger instead of bare console.warn/console.error/console.debug in application code; use variable name log (e.g., const log = createLogger('module-name'))
Wrap attacker-controlled fields inside structured objects with sanitizeForLog() before embedding in log calls
Use design tokens, @/lib/motion presets, helpers in @/utils/format, and DEFAULT_CURRENCY from @/utils/currencies instead of hardcoding styling and formatting values
Detect fetch() in effects without AbortController cleanup using @eslint-react/web-api-no-leaked-fetch ESLint rule
A PostToolUse hook (scripts/check_web_design_system.py) runs on every web/src/ edit and flags hardcoded hex / rgba / fonts / Motion durations / locale literals / bare .toLocale*String() calls / missing Storybook stories / duplicate component patterns / complex .map() blocks; fix every violation before proceeding

Files:

  • web/src/stores/websocket.ts
  • web/src/__tests__/stores/websocket.test.ts
web/src/stores/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/stores/**/*.ts: All store mutation actions (create / update / delete) must follow the stores/connections/crud-actions.ts pattern: wrap API calls in try/catch, success updates state + emits success toast, failure logs + emits error toast + returns sentinel (null for entity, false for delete); callers MUST NOT wrap store mutation calls in try/catch
List-read store actions must set error: string | null on the store instead of toasting; use opaque cursor-based pagination via PaginationMeta, keep nextCursor + hasMore in state (not offset arithmetic), and early-return when !hasMore || !nextCursor
Always capture previous synchronously in optimistic mutations and restore in the catch block
Any new Zustand store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global afterEach in test-setup.tsx
Store files over ~600 lines must be sliced into packages with one of two aggregation patterns: package-internal index.ts or sibling .ts aggregator

Files:

  • web/src/stores/websocket.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Prefer interface for defining object shapes in TypeScript

Files:

  • web/src/stores/websocket.ts
  • web/src/__tests__/stores/websocket.test.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Web dashboard components must reuse designs from web/src/components/ui/. Never hardcode hex colors, font-family, pixel spacing, Motion transitions, or BCP 47 locale strings; use design tokens, @/lib/motion presets, and helpers in @/utils/format. Enforced by scripts/check_web_design_system.py.

Files:

  • web/src/stores/websocket.ts
  • web/src/__tests__/stores/websocket.test.ts
**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

No default may privilege a region, currency, or locale. Resolution: user/company → browser/system → neutral fallback. Use International/British English UI default (e.g. colour, behaviour, organise, centred, analyse).

Files:

  • web/src/stores/websocket.ts
  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • src/synthorg/providers/management/local_models.py
  • web/src/__tests__/stores/websocket.test.ts
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit / integration / e2e / slow. Every Mock() / AsyncMock() / MagicMock() in tests/ MUST declare spec=ConcreteClass. Pre-existing sites frozen in scripts/mock_spec_baseline.txt. Without spec= mocks silently absorb every attribute access. Enforced by scripts/check_mock_spec.py. Use mock_dispatcher from tests/conftest.py for shared mocks.
Time-driven tests: import FakeClock from tests._shared.fake_clock; inject via clock= parameter. FakeClock.sleep advances virtual time and yields once via asyncio.sleep(0). Patch time.monotonic() / asyncio.sleep() globals only for legacy paths without a Clock seam.
Never use monkeypatch.setattr(module.logger, ...) antipattern; the BoundLoggerLazyProxy caches the stale bound method via __dict__. Use try/finally del proxy.<level> instead (see _logger_info_spy in tests/unit/settings/test_service.py).
Prefer @pytest.mark.parametrize for similar test cases.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
**/{src,tests,docs,web}/**/*.{py,md,mdx,yaml,yml,json}

📄 CodeRabbit inference engine (CLAUDE.md)

No em-dashes in code, config, or documentation. Use -- (two hyphens). Enforced by pre-commit hook.

Files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: All cost-bearing Pydantic models must carry currency: CurrencyCode. Mixing currencies in aggregations raises MixedCurrencyAggregationError (HTTP 409, error code 4007). Aggregations call assert_currencies_match() from synthorg.budget.currency before reducing. Per-line opt-out: # lint-allow: currency-aggregation -- <reason>.
Never use unguarded sum(), math.fsum(), statistics.mean(), statistics.fmean() (including bare-name imports) over .cost, .amount, .total_cost, .usd, or .eur fields without asserting currency invariants. Enforced by scripts/check_currency_aggregation_invariant.py. Per-line opt-out: # lint-allow: currency-aggregation -- <reason> (mandatory non-empty reason).
src/synthorg/persistence/ is the only place that may import aiosqlite, sqlite3, psycopg, psycopg_pool, or emit raw SQL DDL/DML. Every durable feature defines a Protocol in persistence/<domain>_protocol.py with concrete impls under persistence/{sqlite,postgres}/ exposed on PersistenceBackend. Controllers and API endpoints access persistence through domain-scoped service layers; services centralize audit logging; repositories must not log mutations. Per-line opt-out: # lint-allow: persistence-boundary -- <reason>. Enforced by scripts/check_persistence_boundary.py.
Provide type hints on all public functions. mypy strict enforcement required.
Use Google-style docstrings on public classes and functions. Enforced by ruff D rules.
Never mutate objects; create new objects via model_copy(update=...) or copy.deepcopy(). Frozen Pydantic for config/identity; MappingProxyType for non-Pydantic registries; deepcopy at system boundaries.
Separate frozen config models from mutable-via-copy runtime models; never mix in one model.
Use Pydantic v2 with ConfigDict(frozen=True, allow_inf_nan=False) everywhere. Apply extra="forbid" on every model that doesn't round-trip through model_dump() (every API-boundary DTO with Request/Response/Sna...

Files:

  • src/synthorg/providers/management/local_models.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/providers/management/local_models.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

WebSocket per-frame timeout (DoS): silent peer closed with code 1008 after api.ws_frame_timeout_seconds (default 30s). Revalidation failures tracked via _SlidingWindowRateLimiter (api.ws_revalidation_window_seconds 60s, api.ws_revalidation_max_failures 5); saturation closes socket with code 4011.

Files:

  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/providers/management/test_local_models.py
  • tests/unit/api/controllers/test_setup_personality.py
  • src/synthorg/providers/management/local_models.py
  • src/synthorg/api/controllers/providers.py
  • src/synthorg/api/controllers/scaling.py
  • src/synthorg/api/controllers/memory.py
🔇 Additional comments (16)
web/src/stores/websocket.ts (2)

21-22: Jitter constants import is correctly wired.

The new constants are scoped and used appropriately for reconnect-delay randomisation.


232-258: Reconnect jitter math and bounds look correct.

Good fix: delay is jittered and then clamped to both floor and ceiling, preventing post-jitter overflow beyond WS_RECONNECT_MAX_DELAY.
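The clamp can be sketched in Python as below (the store itself is TypeScript). The 1.2x upper multiplier comes from the commit message; the 0.8x lower multiplier, the 30-second cap, and the names `JITTER_MIN`, `JITTER_MAX`, and `reconnect_delay_ms` are assumptions made for illustration.

```python
WS_RECONNECT_MAX_DELAY_MS = 30_000  # assumed cap; the real value is not shown
JITTER_MIN = 0.8  # assumed lower multiplier
JITTER_MAX = 1.2  # upper multiplier named in the commit message


def reconnect_delay_ms(base_delay_ms: float, rand: float) -> int:
    """Return a jittered delay; `rand` is a uniform sample in [0, 1)."""
    factor = JITTER_MIN + rand * (JITTER_MAX - JITTER_MIN)
    jittered = round(base_delay_ms * factor)
    # Clamp after jitter: once base_delay_ms has saturated at the cap,
    # a factor above 1.0 would otherwise overshoot the configured maximum.
    return max(1, min(WS_RECONNECT_MAX_DELAY_MS, jittered))
```

Passing the random sample in as a parameter keeps the function deterministic under test, which is the same reason the test suite advances past the jitter ceiling rather than asserting on an exact delay.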

web/src/__tests__/stores/websocket.test.ts (2)

425-428: Deterministic reconnect timing assertion is solid.

Advancing past the jitter ceiling is a robust way to avoid flakiness while still validating reconnect scheduling.


434-482: Jitter-delay test coverage and spy teardown look good.

Nice addition: parameterized jitter cases plus explicit mockRestore() in finally closes the previous cross-test leakage risk.

src/synthorg/providers/management/local_models.py (3)

8-9: LGTM - Regex patterns and constants are well-designed.

The patterns correctly address all previously identified gaps:

  • POSIX paths now use * for zero-or-more trailing segments, catching single-segment paths like /tmp
  • Space character included in character classes for paths like /model cache/ and C:\Program Files\
  • Host:port covers IPv4, bracketed IPv6, and DNS/single-label hostnames

Also applies to: 35-64
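To illustrate why the quantifier matters, here is a sketch with two simplified patterns. They are hypothetical and omit the space handling of the real `_OLLAMA_PATH_POSIX`; only the trailing `+` versus `*` differs between them.

```python
import re

# Hypothetical simplified patterns, not the ones in local_models.py.
PATH_PLUS = re.compile(r"/[\w.\-]+(?:/[\w.\-]+)+")  # needs two or more segments
PATH_STAR = re.compile(r"/[\w.\-]+(?:/[\w.\-]+)*")  # one segment is enough


def redact(pattern: re.Pattern[str], message: str) -> str:
    """Replace every match of `pattern` with the redaction marker."""
    return pattern.sub("[REDACTED-PATH]", message)


single = "open /tmp: permission denied"
nested = "open /var/lib/ollama/secret.bin: permission denied"

leaked = redact(PATH_PLUS, single)      # "/tmp" survives the + form
redacted = redact(PATH_STAR, single)    # "/tmp" is replaced by the * form
nested_out = redact(PATH_STAR, nested)  # multi-segment paths still match
```

The looser `*` form also matches any lone `/segment` inside ordinary text, so it trades some over-redaction for not leaking short paths.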


67-83: LGTM - Side-effect free sanitizer with proper fallback handling.

The implementation correctly:

  • Returns early for non-string inputs
  • Chains regex substitutions without modifying the original
  • Uses truncation + strip to handle oversized/whitespace-only results
  • Falls back to "Pull failed" for empty results
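The four steps listed above can be sketched as a small pipeline. This is an assumption-laden illustration: `_MAX_LEN`, `_HOST_PORT`, and `sanitize_error` are hypothetical, the length limit of 200 is a guess, and only host:port redaction is shown.

```python
import re

_MAX_LEN = 200  # assumed; the source only names _OLLAMA_ERROR_MAX_LEN
_FALLBACK = "Pull failed"
# Hypothetical host:port pattern: bracketed IPv6, or an IPv4/DNS-like token.
_HOST_PORT = re.compile(r"(?:\[[0-9a-fA-F:]+\]|[\w.\-]+):\d{1,5}\b")


def sanitize_error(raw: object) -> str:
    # 1. Non-string payloads never reach the client verbatim.
    if not isinstance(raw, str):
        return _FALLBACK
    # 2. Substitutions build new strings; the input is left untouched.
    redacted = _HOST_PORT.sub("[REDACTED-HOST]", raw)
    # 3. Truncate first, then strip, so whitespace-only input ends up empty.
    cleaned = redacted[:_MAX_LEN].strip()
    # 4. Empty results fall back to the generic message.
    return cleaned or _FALLBACK
```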

196-210: LGTM - Correctly handles falsey error payloads.

The change from if error: to if "error" in data: ensures that falsey-but-present error values (like "", 0, False) are still treated as terminal events and sanitized. The logging correctly captures error_type from the raw payload type while forwarding only the sanitized message.

tests/unit/providers/management/test_local_models.py (4)

11-16: LGTM - Test class properly imports and documents its purpose.

The test class clearly describes its intent as "adversarial-input fixtures for the SSE-forwarded error sanitizer" and imports the necessary private symbols for direct unit testing.

Also applies to: 303-304


306-320: LGTM - Comprehensive fallback coverage with clear contract documentation.

The parametrized test_falsey_payloads_fall_back correctly locks down the behavior that the stream-level caller's "error" in data check (not truthiness) requires the sanitizer to handle all falsey inputs gracefully. The inline comment explains the regression this prevents.


322-402: LGTM - Comprehensive redaction coverage with parametrized edge cases.

The tests thoroughly cover:

  • Single-segment POSIX paths (/models, /tmp, /token.json)
  • Paths containing spaces (POSIX and Windows Program Files)
  • All host:port forms (localhost, IPv4, bracketed IPv6, DNS hostname)

Each test properly asserts both the absence of leaked data and the presence of redaction markers.


404-415: LGTM - Boundary condition tests complete the coverage.

Truncation, pass-through, and empty-string handling are all verified, ensuring the sanitizer behaves correctly at its boundaries.

src/synthorg/api/controllers/scaling.py (3)

11-16: LGTM!

The pagination-related imports are correctly added and necessary for the endpoint changes.


209-254: LGTM!

The list_decisions endpoint correctly implements typed PaginatedResponse[ScalingDecisionResponse] with consistent handling of the missing-service fallback path. The sorting by created_at descending followed by cursor pagination is appropriate for decision history.


256-322: LGTM!

The list_signals endpoint correctly implements cursor pagination with:

  • Proper deduplication by signal name before sorting
  • Deterministic ordering by name for stable pagination
  • Consistent fallback to decision history signals on cold start
  • Correct PaginatedResponse typing and cursor secret usage

src/synthorg/api/controllers/providers.py (2)

216-238: Good pagination boundary placement.

Paginating before DTO construction and capability enrichment keeps these endpoints aligned with the perf objective and avoids O(n) work for small pages.

Also applies to: 303-349
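A rough sketch of why the boundary placement matters: slicing the cheap keys first means the expensive DTO construction runs once per returned item rather than once per stored item. The `ProviderDTO` type, `build_dto`, `list_page`, and the `build_log` list are hypothetical names for this illustration, not the controller's actual code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ProviderDTO:
    name: str
    capabilities: tuple[str, ...]


# Records every DTO build so the amount of work done is observable.
build_log: list[str] = []


def build_dto(name: str, config: dict[str, Any]) -> ProviderDTO:
    """Stand-in for DTO construction plus capability enrichment."""
    build_log.append(name)
    return ProviderDTO(name=name, capabilities=tuple(sorted(config.get("capabilities", ()))))


def list_page(configs: dict[str, dict[str, Any]], offset: int, limit: int) -> list[ProviderDTO]:
    # Slice the sorted names first, then build DTOs only for the selected page.
    page_names = sorted(configs)[offset : offset + limit]
    return [build_dto(name, configs[name]) for name in page_names]
```

With 100 configured providers and a limit of 5, this sketch builds 5 DTOs; building all DTOs first and slicing afterwards would build 100.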


264-270: Nice containment of the ProviderResponse.name schema change.

Using name=None on single-resource and mutation responses keeps the wire-format expansion scoped to the paginated list endpoints.

Also applies to: 432-433, 483-484, 533-534, 1054-1054, 1177-1177

tests/unit/core/conftest.py: bind skills = SkillSetFactory on AgentIdentityFactory. polyfactory was generating the SkillSet field directly from the model definition rather than going through SkillSetFactory's empty-default guard, so an unlucky random draw could produce overlapping primary/secondary skill ids and trip the SkillSet overlap validator. Verified deterministic over 50 repeats.

src/synthorg/api/controllers/memory.py: enforce the recommended >= required cross-field invariant in _resolve_fine_tune_thresholds. Per-value positivity alone allowed an inverted pair past the gate, after which _check_documents could never emit the warn band. An inverted recommended falls back to max(FINE_TUNE_MIN_DOCS_RECOMMENDED, resolved_required) so the invariant always holds before _FineTuneThresholds construction.

src/synthorg/settings/definitions/api.py + scaling.py: extract the magic 999 sort-to-end sentinel for ScalingStrategyResponse.priority into SCALING_STRATEGY_PRIORITY_FALLBACK in settings/definitions/api.py (gate-allowlisted directory) and import at both call sites (list_strategies, get_strategy).
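As a small illustration of the sort-to-end sentinel, assuming strategies without a configured priority should be listed last. The constant name and the value 999 come from the commit message; `strategy_priority`, `order_strategies`, and the shape of the priority mapping are hypothetical.

```python
# Named constant replacing the bare 999 at both call sites.
SCALING_STRATEGY_PRIORITY_FALLBACK = 999


def strategy_priority(name: str, configured: dict[str, int]) -> int:
    """Look up a strategy's priority, sorting unknown strategies to the end."""
    return configured.get(name, SCALING_STRATEGY_PRIORITY_FALLBACK)


def order_strategies(names: list[str], configured: dict[str, int]) -> list[str]:
    # The name is a tie-breaker so the order is deterministic for pagination.
    return sorted(names, key=lambda n: (strategy_priority(n, configured), n))
```

Sharing one constant between the list and single-item lookups keeps both endpoints reporting the same priority for an unconfigured strategy.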

tests/unit/api/controllers/test_setup_personality.py: persistence GET on the personality-update test now requests ?limit=100 and asserts status_code == 200 before iterating, so the assertion cannot become flaky if the updated agent ever falls past the default page boundary. Added pagination envelope checks (isinstance dict, key presence) to the test_list_presets_returns_non_empty contract test, and a status_code assert before iterating presets in test_list_presets_field_shape.
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 00:51 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit d1faf86 into main May 9, 2026
78 checks passed
@Aureliolo Aureliolo deleted the perf/performance-data-integrity branch May 9, 2026 07:23
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 07:24 — with GitHub Actions Inactive
Aureliolo pushed a commit that referenced this pull request May 10, 2026
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub
Models). Commit-based changelog below._

### What you'll notice
- Improved error logging and Prometheus instrumentation provide better
system monitoring.
- Eliminated race conditions in CI tagging for more reliable development
releases.
- Fixed critical configuration access and kill-switch bugs to enhance
system stability.
- Enhanced client experience with retry-after headers and better
websocket reconnect behavior.

### What's new
- Introduced composite indexes and cursor pagination for faster data
queries.
- Added server-sent events rate limiting and Ollama input sanitization
for improved security.

### Under the hood
- Centralized workflow error mappings to standardize error handling.
- Refactored API lifecycle fallback to use a configuration snapshot for
consistency.
- Tightened startup settings baseline and reduced controller error
baseline to zero.
- Replaced flaky contributor-assistant GitHub action with a custom
stable step.
- Consolidated Renovate dependency groups to avoid update conflicts.
- Upgraded in-toto-golang dependency to fix security vulnerabilities and
dropped unnecessary CVE waivers.
- Extensive lock file maintenance and multiple infrastructure and Python
dependency updates.

<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*
---


## [0.8.2](v0.8.1...v0.8.2) (2026-05-10)


### Features

* close audit gaps in error logging and Prometheus instrumentation
([#1821](#1821))
([ef00fdc](ef00fdc))


### Bug Fixes

* **ci:** eliminate dev-release tag-vs-downstream race + CI hygiene
audit ([#1827](#1827))
([b7b9a59](b7b9a59))
* **config:** close 6 settings reachability + kill-switch gaps
([#1798](#1798))
([410cb3b](410cb3b))
* correctness / safety fixes from 2026-05-05 audit (Wave 28)
([#1823](#1823))
([d01e624](d01e624))


### Performance

* composite indexes + cursor pagination + clock seam + SSE rate-limit +
Ollama sanitization + retry-after web client + WS reconnect jitter
([#1822](#1822))
([d1faf86](d1faf86))


### Refactoring

* **api:** move activities lifecycle-cap fallback to ApiBridgeConfig
snapshot ([#1840](#1840))
([7a56e9c](7a56e9c))
* centralise workflow error mapping and shared error codes
([#1778](#1778) sub-tasks A
+ E) ([#1843](#1843))
([11132cd](11132cd))
* drive controller-error baseline to zero
([#1778](#1778) sub-task A
tail) ([#1846](#1846))
([e96ae20](e96ae20))
* slim CLAUDE.md, port pr-review-toolkit agents, sync .opencode parity
([#1833](#1833))
([e6372b8](e6372b8))
* tighten settings → startup-trace baseline (8 → 0)
([#1847](#1847))
([3376ee2](3376ee2))


### Documentation

* fix CLAUDE.md inaccuracies and drop drift-prone counts
([#1844](#1844))
([371925f](371925f))


### Tests

* replace test placeholders with real subsystem wiring
([#1845](#1845))
([ddbb666](ddbb666))


### CI/CD

* **cla:** replace flaky contributor-assistant action with custom
read-path step
([#1819](#1819))
([11aeafe](11aeafe))
* tidy dev-release notes + stagger renovate lockfile day
([#1824](#1824))
([ec746a9](ec746a9))


### Maintenance

* cleanup roundup, sub-tasks a/c/d/g/h/j/l/m of
[#1781](#1781)
([#1838](#1838))
([099b871](099b871))
* close remaining 5 sub-tasks of
[#1781](#1781) (b/e/f/i/k)
([#1852](#1852))
([59cf0b2](59cf0b2))
* collapse Renovate dep groups into Python / Web / Infrastructure to
remove cross-PR overlap
([#1813](#1813))
([4cbd857](4cbd857))
* **deps,security:** bump in-toto-golang v0.11.0 + drop two patched CVE
waivers ([#1851](#1851))
([0b8b5bb](0b8b5bb))
* disable Renovate vulnerabilityAlerts so security flows into normal
updates ([#1834](#1834))
([6b7d15f](6b7d15f))
* Lock file maintenance
([#1820](#1820))
([ccbad73](ccbad73))
* Lock file maintenance
([#1842](#1842))
([13b68a5](13b68a5))
* Lock file maintenance
([#1853](#1853))
([db6650b](db6650b))
* Update dhi.io/nats:2.14-debian13 Docker digest to eb768bf
([#1841](#1841))
([37f84fc](37f84fc))
* Update Infrastructure dependencies
([#1815](#1815))
([75b12fe](75b12fe))
* Update Infrastructure dependencies
([#1831](#1831))
([3f3c50b](3f3c50b))
* Update Python dependencies
([#1817](#1817))
([e11332f](e11332f))
* Update Python dependencies
([#1832](#1832))
([4515c8e](4515c8e))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
