Skip to content

chore(wp7): hygiene, stubs, test/CI/tooling, doc gaps, boundary patterns doc#1926

Merged
Aureliolo merged 24 commits into
mainfrom
chore/wp7-hygiene-stubs-tooling-docs
May 16, 2026
Merged

chore(wp7): hygiene, stubs, test/CI/tooling, doc gaps, boundary patterns doc#1926
Aureliolo merged 24 commits into
mainfrom
chore/wp7-hygiene-stubs-tooling-docs

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Closes #1922.

WP-7 mega-PR landing the long-tail hygiene work from the 2026-05-15 codebase audit. ~92 files, ~5,000 LOC; one-PR-per-issue per the issue's explicit instruction.

What changed

Logging gaps

  • WARNING/ERROR log before every raise site in client/pool.py (5), client/store.py (2, including SimulationStore.get), workers/claim.py (3).
  • INFO state-transition log on ClientRequest.with_status and the persisted-write side of the new transition path; Sprint.with_transition logs at DEBUG since the scheduler keeps sprints in memory.
  • Observability completeness: WS lifetime / revalidation / active-conns gauges + histograms; Postgres pool size / active / acquire-duration / exhausted metrics; OTel span_id + trace_id attached to budget hard-stop events.

Comment hygiene

  • Stripped 13 reviewer-citation comments and 13 migration-framing docstrings; rewrote each as self-contained WHYs.
  • Extended scripts/check_no_migration_framing.py to catch the variants the 2026-05-15 run missed (previously <verb>, were/was previously inlined, used to be, originally <verb>); module docstring lists the new patterns.

Stubs / deferred features

  • New synthorg.experiments domain (A/B variant registry + deterministic SHA-256 hash-walk assignment) — service, protocol, models, in-memory repo, REST controller. Replaces 2 of the 7 placeholder stubs from WP-7: Hygiene, Stubs, Test/CI/Tooling, Doc Gaps, Boundary Patterns Doc #1922.
  • Real synthorg.workers HTTP executor (httpx.AsyncClient-based, deterministic status mapping, no manual retry — JetStream owns retries via TaskClaimStatus.RETRY) plus the supporting execution_service seam.
  • CeremonyEvalContext now accepts an optional budget_snapshot callable.
  • New in-memory test-double cleanup helpers (clear() + size()) on the MCP installations repo.

Doc completeness

  • 8 missing operational guides: docs/guides/{a2a-federation,webhook-management,custom-mcp-server-dev,ontology-extension,custom-rules-and-meta-loop,ceremony-scheduling-tuning,cost-attribution,dynamic-scoring}.md.
  • New docs/guides/approval-workflow.md, workers-and-background-tasks.md, rest-api-examples.md; new docs/reference/yaml-schema.md.
  • REWORK Implement hierarchical delegation (task flows down, results flow up) #12: docs/reference/typed-boundaries.md documenting the 6 boundary patterns (A2A JSON-RPC, MCP tool args, provider tool calls, webhook payloads, audit chain, tool-execution dual paths) plus JWT/Settings/WebSocket extras.

Prompt engineering

  • Explicit temperature + top_p pinned on LlmSecurityEvaluator, ProceduralMemoryProposer, LlmSemanticAnalyzer. New ModelPinMetadata type so prompt-class metadata can record a pinned model version. CompletionConfig.top_p now defaults centrally to 1.0 instead of per-site None.

Test discipline

  • web/src/__tests__/stores/artifacts.test.ts: replaced setTimeout(resolve, 0) race with vi.waitFor deterministic polling.
  • New post-start TCP accept probe in the Postgres conformance fixture to mitigate xdist + testcontainer races.
  • tests/conformance/persistence/test_backup_round_trip.py: walks the new backend.kind + backend.config public accessors instead of isinstance + _config private attribute.
  • Approval-gate / claim-lifecycle / MCP-install-lifecycle / rate-limits-inflight tests now use typed stubs (MagicMock(spec=...)).

Tooling polish

  • synthorg new: RunE prints help when invoked with no subcommand (was silent).
  • synthorg update: install-directory writability probe runs BEFORE the binary download.
  • synthorg start: parse-failure vs missing-file errors now distinguishable.
  • .pre-commit-config.yaml: moved domain-error-hierarchy, no-review-origin-in-code, no-migration-framing from pre-commit to pre-push (combined ~30s saved per commit; gates still run before push).

CLAUDE.md / docs/getting_started.md

  • Both clarify that scripts/install_cli_tools.sh installs golangci-lint only; d2 is installed separately (curl installer on Unix, winget install Terrastruct.d2 on Windows).

Pre-PR review

20 review agents launched in parallel against the full diff (code-reviewer, python-reviewer, docs-consistency, security-reviewer, persistence-reviewer, async-concurrency-reviewer, api-contract-drift, type-design-analyzer, go-reviewer, go-conventions-enforcer, issue-resolution-verifier, tool-parity-checker, frontend-reviewer + general-purpose runs of comment-quality-rot, logging-audit, resilience-audit, conventions-enforcer, infra-reviewer, test-quality-reviewer, and 5 mini-pass audits). Triage table at _audit/pre-pr-review/triage.md.

14 valid findings addressed (2 critical, 10 major, 2 minor) — see the squash commit body for the line-by-line list. Verified false positives explicitly documented (mostly PEP 758 except A, B: flagged by agents that didn't know Python 3.14 syntax — CLAUDE.md is explicit, test suite passes).

Test plan

  • uv run ruff check src/ tests/ --fix, uv run ruff format src/ tests/, uv run mypy src/ tests/ — clean.
  • uv run python -m pytest tests/ -n 830,530 passed, 50 skipped, all green.
  • npm --prefix web run lint && npm --prefix web run type-check && npm --prefix web run test — 256 files / 3,155 tests pass.
  • go -C cli vet ./... && go -C cli test ./... && go -C cli build ./... — clean.

Acceptance criteria (#1922)

  • 10 raise sites log first (9 in the 3 mandated files + the SimulationStore.get gap closed in the review round).
  • All 8 operational guides exist under docs/guides/.
  • domain-error-hierarchy no longer in pre-commit.
  • Artifacts test deterministic; remaining 4 timing-dependent targets from WP-7: Hygiene, Stubs, Test/CI/Tooling, Doc Gaps, Boundary Patterns Doc #1922 are outside this branch's diff (separate test files) and remain as a follow-up.

Aureliolo added 20 commits May 15, 2026 22:34
Section 1a: WARNING logs before 9 raise sites in client/pool.py,
client/store.py, workers/claim.py. New events: CLIENT_NOT_FOUND,
CLIENT_REQUEST_NOT_FOUND, WORKERS_QUEUE_START_REJECTED,
WORKERS_QUEUE_NOT_RUNNING.

Section 1b: INFO CLIENT_REQUEST_STATUS_TRANSITIONED at controller
caller sites after request_store.save() returns; SPRINT_STATUS_TRANSITIONED
in ceremony_scheduler after the active-sprint in-memory update.

Tests: test_pool_logging.py + test_store_logging.py +
test_claim_lifecycle.py extensions.

Refs #1922
Pre-reviewed by 20 agents; 14 valid findings addressed (2 critical, 10 major,
2 minor). Verification: ruff/format/mypy/pytest (30,530 passed) + web
lint/type-check/test + go vet/test/build, all green.

Critical:
* ws.py: guard ws_active_connection_count read-modify-write with a
  threading.Lock so the Prometheus scrape thread cannot read mid-increment.
* prometheus_recording.py: split WS lifetime + Postgres pool methods into a
  new StreamRecordingMixin in prometheus_recording_streams.py so the main
  module stays under the 800-line ceiling.

Major:
* rest-api-examples.md: rewrite the auth + every endpoint example for
  cookie-based auth (login returns CookieSessionResponse; token is in
  HttpOnly Set-Cookie, not response body).
* approval-workflow.md: replace stale event names with the real ones
  (api.approval.created / security.approval.{approved,rejected} / etc).
* getting_started.md + CLAUDE.md: clarify that install_cli_tools.sh
  installs golangci-lint only; document d2 install separately.
* client/store.py: log SIMULATION_RUN_NOT_FOUND before SimulationStore.get
  raises (completes the WP-7 ten-raise-sites-log-first mandate).
* experiments controller: switch list_assignments to HMAC-signed
  paginate_cursor flow (encode_cursor / decode_cursor) like tasks.py;
  regenerate web/src/api/types/openapi.gen.ts to match.
* experiments service: log EXPERIMENT_VARIANT_INVALID_WEIGHT and
  EXPERIMENT_NOT_FOUND before the two raise sites.
* ceremony_scheduler.py: downgrade SPRINT_STATUS_TRANSITIONED to DEBUG
  with a comment explaining sprint state is in-memory only.
* persistence/protocol.py: tighten backend.kind return type to
  Literal[sqlite|postgres]; update both backend impls + the fake.
* cli/cmd/update.go: drop unnecessary else after return in CLI-update
  reexec branch.

Minor:
* check_no_migration_framing.py: extend module docstring with the five
  new regex patterns added by this PR.
* workers-and-background-tasks.md: add a Worker authentication subsection
  documenting SYNTHORG_WORKER_AUTH_TOKEN, rotation and HTTPS requirements.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 15, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 15, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b0b12699-ee88-41a4-a3e3-9a11741d4383

📥 Commits

Reviewing files that changed from the base of the PR and between c8f2f03 and c2fa188.

📒 Files selected for processing (2)
  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/executor.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: CLI Test (windows-latest)
  • GitHub Check: CLI Bench Regression
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
{src/**/*.py,web/src/**/*.{ts,tsx}}

📄 CodeRabbit inference engine (CLAUDE.md)

No region/currency/locale privileged; use metric units; use British English per docs/reference/regional-defaults.md

Files:

  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/executor.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

No hardcoded numeric values; numerics live in settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants per check_no_magic_numbers.py

Comments explain WHY only; no reviewer citations/issue back-refs/migration framing enforced by check_no_review_origin_in_code.py + check_no_migration_framing.py

No from __future__ import annotations (Python 3.14 has PEP 649); use PEP 758 except syntax (no parens on except A, B: unless binding)

Type hints on public functions; mypy strict mode; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Error classes: <Domain><Condition>Error from DomainError, never inherit Exception/RuntimeError directly per check_domain_error_hierarchy.py

Files:

  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/executor.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/executor.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Pydantic v2 models: frozen + extra="forbid" on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); @computed_field for derived; NotBlankStr for identifiers

Args models at every system boundary; parse_typed() for every external dict ingestion per check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Async code: use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Clock seam: clock: Clock | None = None parameter; tests inject FakeClock; services own _lifecycle_lock; timed-out stops mark unrestartable

Untrusted content (SEC-1): use wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML per sec-prompt-safety.md

Import logger via from synthorg.observability import get_logger; variable always named logger; never import logging or print() in app code

Event names from observability.events.<domain> constants; use structured kwargs (logger.info(EVENT, key=value))

Error paths log WARNING/ERROR with context before raising; state transitions log INFO via *_STATUS_TRANSITIONED after persistence write; secret-log redaction (SEC-1): never error=str(exc), use error_type=type(exc).__name__ + error=safe_error_description(exc), never exc_info=True; OTel: forbid span.record_exception(exc), use span.set_attribute(...) with record_exception=False, set_status_on_exception=False per check_logger_exception_str_exc.py

Files:

  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/executor.py
{src/**/*.py,tests/**/*.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic: NEVER use real vendor names in project code/tests; use example-provider, test-provider, example-{large,medium,small}-001 per convention; allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Files:

  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/executor.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Read design spec from `docs/design/` before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.{txt,json}` or `scripts/_*_baseline.py`; per-invocation bypass: `ALLOW_BASELINE_GROWTH=1 git commit ...`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR use `/aurelio-review-pr` for external feedback; fix everything valid, no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Git commits: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Pre-commit/pre-push hooks: `.pre-commit-config.yaml`; hookify rules: `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: Squash merge with PR body as squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T23:04:05.176Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/executor.py
🔇 Additional comments (7)
src/synthorg/observability/events/workers.py (1)

39-40: LGTM!

Also applies to: 54-56, 61-71

src/synthorg/workers/executor.py (6)

31-65: LGTM!


96-137: LGTM!


139-186: LGTM!


188-256: LGTM!


274-290: LGTM!


266-272: 💤 Low value

httpx.DecodingError is the correct exception class in httpx 0.28.1.

The response.json() method raises json.JSONDecodeError (subclass of ValueError) for JSON parsing failures, and httpx.DecodingError for character encoding issues during body decoding. The code correctly catches both exceptions using the PEP 758 syntax.


Walkthrough

Implements an A/B experiments subsystem (models, protocol, in-memory repo, service), controller routes, DTOs and web typings, and unit tests. Adds a worker execution seam and LifecycleAdvancingExecutionService plus TaskExecutionExecutor (HTTP callback) and POST /tasks/{task_id}/execute; wires defaults into AppState and CLI flags for executor selection. Expands observability (event constants, Prometheus mixin/metrics for WebSocket lifetimes and Postgres pool), standardizes LLM sampling/top_p propagation and ModelPinMetadata, exposes persistence backend kind/config, hardens CLI self-update and config-load errors, moves costly pre-commit hooks to pre-push, expands migration-framing checks, and adds multiple docs and tests.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 15, 2026 20:42 — with GitHub Actions Inactive
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 15, 2026

Merging this PR will degrade performance by 33.9%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

❌ 1 regressed benchmark
✅ 53 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
resolveLocale x1000 (full-fallback to APP_LOCALE_FALLBACK) 135.2 µs 204.5 µs -33.9%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing chore/wp7-hygiene-stubs-tooling-docs (c2fa188) with main (afb5cc5)

Open in CodSpeed

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 32

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/guides/monitoring.md (1)

317-318: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the Audit & Security row panel list to include the new audit-log fill-ratio gauge.

The new section adds this panel, but the summary table row still omits it, so the guide is internally inconsistent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/monitoring.md` around lines 317 - 318, Update the summary table
row that starts with the literal "`Audit & Security`" to include the new
"audit-log fill-ratio gauge" in its panel list so the row text matches the newly
added panel; edit the table cell that currently reads "Audit chain append rate,
depth, last-append age, security verdicts, agent identity version changes, API
error categories" and append ", audit-log fill-ratio gauge" (or insert it in the
preferred position) so the two references stay consistent.
tests/unit/experiments/test_service.py (1)

153-160: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Test claims recency ordering but never validates order.

test_list_assignments_paginates_in_recency_order currently only checks total and page length, so an ordering regression would still pass.

Suggested assertion addition
     assert total == 5
     assert len(page) == 3
+    assert [str(item.subject_id) for item in page] == ["u-4", "u-3", "u-2"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/experiments/test_service.py` around lines 153 - 160, Test
currently only verifies total and page length; add an assertion that the
returned assignments are in recency order. After calling svc.list_assignments
and receiving page, assert that the timestamp field on each assignment in page
(e.g., created_at or updated_at on the assignment objects returned by
svc.list_assignments) is non-increasing across the page (each item’s timestamp
>= next item’s timestamp) to validate recency ordering in
test_list_assignments_paginates_in_recency_order.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.pre-commit-config.yaml:
- Around line 626-630: Update the top-level rationale comment that mentions
pre-commit.ci to reflect the new hook staging: change any lines claiming that
"no-review-origin-in-code" and "no-migration-framing" are enforced by
pre-commit.ci to indicate they now run at stages: [pre-push] (or otherwise
reference the pre-push stage), so the header aligns with the actual
configuration (see the existing stages: [pre-push] entry and the named hooks
no-review-origin-in-code and no-migration-framing).

In `@cli/cmd/new.go`:
- Around line 37-45: The RunE for the top-level "synthorg new" command currently
just calls cmd.Help() and returns nil; change it so after calling cmd.Help() it
returns a usage error to signal a CLI usage failure (exit code 2). Concretely,
in the RunE implementation that currently references cobra.NoArgs and calls
cmd.Help(), call cmd.Help() and then return a usage-style error (for example
return cobra.ErrInvalidArgs or a new errors.New("requires a scaffold
kind/domain")) so the command both prints help and exits with a non-zero usage
error.

In `@cli/cmd/start.go`:
- Around line 92-109: The switch that checks err.Error() prefixes should be
replaced with typed/sentinel error checks against the error values/types
exported by the config package (use errors.Is and/or errors.As with symbols from
cli/internal/config, e.g. the package's sentinel errors like ErrParsing,
ErrReading or a ParseError type) so that parsing vs reading vs other config
failures are detected robustly; update the error handling in the function that
currently does the string-prefix switch to call errors.Is(err,
config.ErrReading)/errors.Is(err, config.ErrParsing) or use errors.As to match a
config.ParseError and then return the corresponding wrapped fmt.Errorf messages
(preserving the current user-facing messages and wrapping the original err) and
fall back to fmt.Errorf("config: %w", err) for anything else.

In `@docs/architecture/decisions.md`:
- Around line 132-137: The dated checklist header "Verification checkpoint
(2026-06-10)" and any hardcoded numerics in the new items must be replaced with
runtime-stats markers per the docs numeric-sourcing policy: pull the date/number
from data/runtime_stats.yaml and insert a <!--RS:NAME--> marker instead of the
literal "2026-06-10" (or move the volatile checkpoint into an internal/temporary
doc if you prefer). Update the checklist text in docs/architecture/decisions.md
(the "Verification checkpoint" heading and the numbered items 1–4) to reference
the appropriate <!--RS:...--> keys and ensure the corresponding entries exist in
data/runtime_stats.yaml; if you move the checkpoint out of public docs, note
that change here and remove hardcoded numerics.

In `@docs/getting_started.md`:
- Around line 35-53: Replace hardcoded version numerics in
docs/getting_started.md with runtime-stats markers: change the d2 version string
"v0.7.1" to a marker like <!--RS:D2_VERSION--> and replace any golangci-lint
pinned version mention with a marker like <!--RS:GOLANGCI_LINT_VERSION--> that
sources values from data/runtime_stats.yaml; ensure any other explicit numeric
literals in this section are similarly converted to <!--RS:...--> markers and
update the install guidance to reference scripts/install_cli_tools.sh and the CI
pin in .github/workflows/cli.yml so readers know where the marker values
originate.

In `@docs/guides/a2a-federation.md`:
- Around line 20-27: The table and other doc spots contain hard-coded numeric
defaults (e.g., the `a2a.timeout_seconds` default value of 30 and other literal
ports/status codes); replace each numeric literal with the appropriate
runtime-stat marker `<!--RS:NAME-->` (e.g., swap the "30" default for
`a2a.timeout_seconds` with a matching `<!--RS:...-->` marker name taken from
data/runtime_stats.yaml), update every occurrence noted (the table row for
`a2a.timeout_seconds`, the default/port/status code occurrences referenced in
the comment) to use the marker, and if a suitable marker does not exist add a
descriptive entry to data/runtime_stats.yaml and reference it here using
`<!--RS:NAME-->`; ensure marker names are descriptive and consistent with
existing markers so the public docs render the numeric values at runtime.

In `@docs/guides/approval-workflow.md`:
- Around line 19-26: Replace hardcoded numeric defaults in the table with
runtime-sourced marker comments: remove inline literals like `86400` for
`approval.timeout_seconds` and replace with a marker comment such as
`<!--RS:APPROVAL_TIMEOUT_SECONDS-->` (sourced from data/runtime_stats.yaml); do
the same for any other numeric literals cited in this section (also update
occurrences referenced at 85-89) so the docs use `<!--RS:...-->` markers instead
of literal numbers, keeping the config keys (`approval.timeout_seconds`,
`approval.enabled`, `approval.reviewer_groups`, `approval.notification_target`,
`approval.audit_chain_signing`) intact and only changing the displayed
numeric/default values to marker placeholders.

In `@docs/guides/ceremony-scheduling-tuning.md`:
- Around line 20-27: The table contains literal numeric defaults and flags
(e.g., the numeric/default values for ceremonies[].trigger.threshold,
ceremony_policy.strategy default, and ceremony_policy.auto_transition default)
that must be replaced with runtime-stat markers; update each numeric or boolean
literal in the doc (including other occurrences called out around the file) to
use the <!--RS:NAME--> marker format tied to data/runtime_stats.yaml (create or
reference appropriate NAME keys like RS_CEREMONY_THRESHOLD, RS_STRATEGY_DEFAULT,
RS_AUTO_TRANSITION, etc.), ensure markers replace the literal values in the
table entries and in the other mentioned sections, and keep the marker names
descriptive and added to runtime_stats.yaml if missing.

In `@docs/guides/cost-attribution.md`:
- Around line 12-18: Replace the hardcoded numeric estimates in
docs/guides/cost-attribution.md (the table rows for Provider, Model, Agent,
Project and the other occurrences noted at lines 43-57 and 84-99) with
runtime-stats markers that reference data/runtime_stats.yaml using the
<!--RS:NAME--> syntax; for each numeric value (e.g., ~10, ~50, 80, 95) create or
use an appropriate key in data/runtime_stats.yaml and replace the literal in the
markdown with the corresponding <!--RS:KEY--> marker so the docs-numerics gate
is satisfied and all public docs pull numbers from runtime_stats.

In `@docs/guides/custom-mcp-server-dev.md`:
- Line 8: Replace the hardcoded "200+ tools across 15 domain modules" text in
docs/guides/custom-mcp-server-dev.md with runtime-stats markers that pull values
from data/runtime_stats.yaml using the <!--RS:...--> syntax (e.g.,
<!--RS:TOOLS_COUNT--> and <!--RS:DOMAIN_MODULES_COUNT-->); update the sentence
that mentions "200+ tools" and "15 domain modules" to reference those markers so
the public doc numbers are sourced at runtime from runtime_stats.yaml.

In `@docs/guides/custom-rules-and-meta-loop.md`:
- Around line 8-115: The doc contains hardcoded numeric literals (e.g., the
"five" retry example in the intro, minutes=60 in HighFailRateRule, threshold 0.3
in the rule logic, 0.15 in the sample config, and the test outcomes counts 4/6)
that must be replaced with runtime-stats markers; update the guide to use
<!--RS:...--> markers (or remove precise numbers) sourced from
data/runtime_stats.yaml for each numeric claim referenced near HighFailRateRule,
the config block under meta_loop.rules, the example test vectors in
test_high_fail_rate, and the top-level policy example (retry limit), so readers
and the docs gate get values from runtime_stats instead of hardcoded literals.

In `@docs/guides/monitoring.md`:
- Line 104: The docs row for the metric `synthorg_security_audit_log_fill_ratio`
(and the nearby section around lines referenced 227-281) uses hardcoded numeric
values (e.g., 0.9, 0.75, 10m, 0..1); replace each hardcoded numeric with a
runtime-stats marker (<!--RS:NAME-->) that references a key in
data/runtime_stats.yaml (create/update entries in data/runtime_stats.yaml for
each numeric like alert_threshold_audit_log_fill_ratio, warning_threshold_...,
retention_window_minutes, value_range_audit_log, etc.), update the markdown
table/JSON/YAML to use those markers, and ensure marker names are unique and
documented so the build pulls values automatically.

In `@docs/guides/ontology-extension.md`:
- Around line 10-109: The guide contains hardcoded numeric literals that must be
sourced via runtime-stats markers; update the document sections (e.g.,
"Concepts", "Registering a new term", "Worked example", "Adding a profile", and
anywhere the example snippets reference numeric counts) to replace any numeric
literals with <!--RS:NAME--> markers tied to data/runtime_stats.yaml, or else
rephrase to avoid explicit numbers; ensure markers follow the exact format
<!--RS:KEY--> and map to an entry in runtime_stats.yaml so rendered prompts (and
references like FINANCE_PROFILE, InjectionProfile, and PROFILE =
FINANCE_PROFILE) include referenced runtime stats instead of hardcoded numerics.

In `@docs/guides/rest-api-examples.md`:
- Around line 8-10: Replace hardcoded numeric literals in the shown prose (e.g.,
the default backend port "3001" in "$BASE defaults to http://localhost:3001",
the "10" in "10 most common operations", and any other
counts/limits/percentages/pagination defaults) with runtime-stats markers like
<!--RS:NAME--> that map to data/runtime_stats.yaml; update occurrences around
the API mount string "/api/v1", the "$BASE" example, and any mentions of
response types (e.g., "ApiResponse<T>", "PaginatedResponse<T>") so examples
interpolate the markers instead of hard numbers, and ensure the README/public
docs follow the numeric-sourcing rule across the other reported sections.

In `@docs/guides/workers-and-background-tasks.md`:
- Around line 81-85: The fenced code block showing worker logs is missing a
language tag which triggers MD040; update the fenced block in
workers-and-background-tasks.md (the block containing
"workers.worker.claim_received task_id=task-A") by adding a language label such
as text after the opening triple backticks (e.g., ```text) so the snippet is
lint-clean; ensure the closing fence remains ``` and no other content is
changed.

In `@docs/reference/typed-boundaries.md`:
- Around line 57-60: The table mixes true parse_typed-enforced boundaries with
informational/non-parse_typed entries (e.g., extract_tool_calls, parse_payload,
invoke) which conflicts with the page text and lint guard claim of six
parse_typed boundaries; update the doc by either splitting the table into two
sections ("parse_typed-enforced" vs "informational/lenient"), or mark those rows
with a clear column/annotation (e.g., "enforcement: lenient" or "not
parse_typed") and adjust the surrounding text and lint-guard description to
reflect the actual count and enforcement scope so readers see that
extract_tool_calls, parse_payload, and invoke are not parse_typed-enforced.

In `@docs/reference/yaml-schema.md`:
- Around line 43-47: The docs reference shows hardcoded numeric literals (e.g.,
the `version` field value `1`) that must be replaced with runtime-stats markers;
update the `version` table entry and all other numeric literals referenced
(around lines 87-103, 159-162, 205-210) to use the <!--RS:NAME--> marker format
that pulls values from data/runtime_stats.yaml (create a meaningful NAME for
each value, e.g., RS_SCHEMA_VERSION for the `version` integer) and ensure the
marker replaces the literal in the markdown and the corresponding key exists in
data/runtime_stats.yaml.

In `@scripts/check_no_migration_framing.py`:
- Around line 104-109: The compiled regex for the "were previously ..." gate
currently lists only inlined|duplicated|scattered|extracted and therefore omits
the documented forbidden verbs; update the re.compile(...) pattern (the second
argument shown in the diff) to include owned, emitted, and wrapped in the
alternation (e.g. add |owned|emitted|wrapped) so phrases like "were previously
owned/emitted/wrapped" are matched; keep the existing word boundaries and
IGNORECASE flag intact.

In `@src/synthorg/api/controllers/tasks.py`:
- Around line 325-331: The POST endpoint decorator for the task execute route
currently relies on the default 201 response; update the `@post` decorator on the
"/{task_id:str}/execute" endpoint (the one decorated with guards
require_write_access and per_op_rate_limit_from_policy("tasks.execute",
key="user")) to explicitly set status_code=200 so the handler for executing an
existing task returns HTTP 200 OK instead of 201 Created.

In `@src/synthorg/api/controllers/ws.py`:
- Around line 707-711: The gauge write is happening outside the lock, so
concurrent opens/closes can overwrite it with a stale value; move the call to
app_state.prometheus_collector.set_ws_active_connections(count=current) into the
same critical section protected by _WS_CONNECTION_COUNT_LOCK immediately after
updating state_dict["ws_active_connection_count"] (do this for both the
increment path where current is computed and the decrement path that updates the
counter) so the computed current and the gauge write occur atomically.

In `@src/synthorg/api/dto.py`:
- Around line 435-479: The module defines public DTO classes
RegisterExperimentVariantRequest, AssignExperimentRequest, and
ExecuteTaskRequest but does not include them in the module export list; add
these class names to the module's __all__ so wildcard imports and the public
surface remain consistent (also check and add any other DTOs mentioned around
the other region referenced, e.g., the classes between lines 752-781). Update
the __all__ list to include "RegisterExperimentVariantRequest",
"AssignExperimentRequest", and "ExecuteTaskRequest".
- Line 441: The inline upper bound le=1000 on the weight Field should be
replaced with a module-level annotated constant following the existing pattern
(e.g. define _MAX_SELECTION_WEIGHT: int = 1000 near other _MAX_* constants) and
then reference that constant in the weight Field (use le=_MAX_SELECTION_WEIGHT);
update the constant name to be descriptive and typed with int so it follows the
project's allowlist of module-level numeric constants and keeps Field(...) free
of hardcoded literals.

In `@src/synthorg/api/state.py`:
- Around line 810-846: The lazy-initialisation of _worker_execution_service and
_experiment_service can race; wrap their first-access construction in a lock
using a double-checked pattern so only one instance is created. For both the
worker_execution_service property (creating LifecycleAdvancingExecutionService
and assigning _worker_execution_service) and the experiment_service property
(creating ExperimentService and assigning _experiment_service) acquire a shared
instance-level lock (e.g. self._init_lock or reuse an existing lock field),
re-check the backing attribute inside the lock, then construct and assign; add a
threading.Lock import/initialiser if no lock exists and keep the
set_worker_execution_service/_set_once behaviour unchanged.

In `@src/synthorg/experiments/models.py`:
- Line 29: Replace the hardcoded literal le=1000 in the weight Field by
extracting 1000 into a module-level annotated constant (e.g., MAX_WEIGHT: int =
1000) placed under your settings/definitions area, then import that constant
into src/synthorg/experiments/models.py and use it as le=MAX_WEIGHT in the
weight: int = Field(...) declaration so the numeric is no longer hardcoded and
remains type-annotated and centrally configurable.

In `@src/synthorg/experiments/service.py`:
- Around line 147-177: The assign() flow has a race between
_repo.get_assignment(...) and _repo.record_assignment(...); make recording
idempotent by handling concurrent-insert conflicts: after creating assignment
via _choose_variant(...)/ExperimentAssignment, call
_repo.record_assignment(assignment) and if it raises a unique-conflict/duplicate
error, catch it, call _repo.get_assignment(experiment, subject_id) again and
return that recorded assignment instead of failing or duplicating;
alternatively, if your repository supports an atomic upsert/insert-if-not-exists
API, use that in place of record_assignment to ensure only the first writer
wins. Use symbols assign(), _repo.get_assignment, _repo.record_assignment,
_choose_variant, and ExperimentAssignment when locating where to apply this
change.

In `@src/synthorg/observability/prometheus_recording_streams.py`:
- Around line 53-56: The code currently checks for finiteness but still allows
negative durations to be observed; add a non-negative guard before writing to
the histogram by raising or rejecting negatives. For the block that updates
self._ws_connection_lifetime (the method that calls
require_finite("record_ws_connection_lifetime: duration_sec", duration_sec) and
then self._ws_connection_lifetime.labels(...).observe(duration_sec)), add a
check like if duration_sec < 0: raise ValueError(...) (or call a
require_non_negative helper) and do the same change in the analogous method that
updates self._ws_connection_latency (the block around lines 92-95) so no
negative duration is ever passed to the histogram observe() call.

In `@src/synthorg/workers/__main__.py`:
- Around line 172-175: Resolve the executor before starting the task queue to
avoid leaking resources: call _resolve_executor(args) and capture executor and
http_client before invoking task_queue.start(), so that any SystemExit raised
inside _resolve_executor does not occur after the queue has been started; then
start the queue and enter the try/finally that calls run_worker_pool(...) and
ensures task_queue.stop() in the finally block. Ensure both start/stop ordering
is corrected in the _main_ startup sequence where run_worker_pool is invoked.
- Around line 200-205: Before raising SystemExit for missing args.api_base_url
and args.auth_token in src/synthorg/workers/__main__.py, emit an ERROR (or
WARNING) log with the same message and context; replace the two bare raises in
the api_base_url/auth_token checks with a logger.error(...) call (include the
msg string, the problematic flag name like "--executor http" and the related env
var names SYNTHORG_API_BASE_URL / SYNTHORG_WORKER_AUTH_TOKEN) immediately before
calling raise SystemExit(msg), so the checks around args.api_base_url and
args.auth_token produce structured logs prior to exiting.

In `@src/synthorg/workers/execution_service.py`:
- Around line 114-117: Before raising NotFoundError in the missing-task path,
add a structured log call that records the task_id and any relevant context;
specifically, in the block around self._task_engine.get_task(task_id) where task
is None, emit a processLogger.warning or self.logger.warning (whichever logger
is used in this module) with a message like "Task not found" and include task_id
and any tracing/context before raising NotFoundError(msg). Ensure the log level
is WARNING/ERROR and the logged fields match other error-path logs for
consistency.

In `@src/synthorg/workers/executor.py`:
- Around line 213-217: The code incorrectly maps terminal_status == "failed" to
TaskClaimStatus.FAILED which violates the executor contract that any 2xx +
terminal status ("completed"/"cancelled"/"failed") should return
TaskClaimStatus.SUCCESS; update the conditional logic around terminal_status to
return TaskClaimStatus.SUCCESS for "failed" (and generally for all terminal
statuses) instead of TaskClaimStatus.FAILED—e.g., replace the ternary that
yields TaskClaimStatus.FAILED when terminal_status == "failed" with logic that
yields TaskClaimStatus.SUCCESS for terminal_status in
{"completed","cancelled","failed"} and only return non-success for non-terminal
error cases; ensure the change touches the branch currently returning
TaskClaimStatus.FAILED so that TaskClaimStatus.SUCCESS is returned for
terminal_status == "failed".
- Line 118: The code interpolates claim.task_id directly into the path when
building url in executor.py (url =
f"{self._base_url}/api/v1/tasks/{claim.task_id}/execute"); change this to
URL-encode the task id first (e.g., import urllib.parse and use
urllib.parse.quote(str(claim.task_id), safe='') or similar) and then interpolate
the encoded value into self._base_url so reserved characters (including slashes)
cannot produce malformed paths.

In `@tests/unit/engine/test_approval_gate.py`:
- Around line 20-23: Replace the bare MagicMock usage for the typed boundary
with the approved helper: change the parked fixture creation from
MagicMock(spec=ParkedContext) to mock_of[ParkedContext](...) so the mock
respects the typed-boundary rule; ensure you import mock_of from tests._shared
if missing and keep the parked.id = "parked-1" override (or pass id="parked-1"
into mock_of) so behavior remains identical.

---

Outside diff comments:
In `@docs/guides/monitoring.md`:
- Around line 317-318: Update the summary table row that starts with the literal
"`Audit & Security`" to include the new "audit-log fill-ratio gauge" in its
panel list so the row text matches the newly added panel; edit the table cell
that currently reads "Audit chain append rate, depth, last-append age, security
verdicts, agent identity version changes, API error categories" and append ",
audit-log fill-ratio gauge" (or insert it in the preferred position) so the two
references stay consistent.

In `@tests/unit/experiments/test_service.py`:
- Around line 153-160: Test currently only verifies total and page length; add
an assertion that the returned assignments are in recency order. After calling
svc.list_assignments and receiving page, assert that the timestamp field on each
assignment in page (e.g., created_at or updated_at on the assignment objects
returned by svc.list_assignments) is non-increasing across the page (each item’s
timestamp >= next item’s timestamp) to validate recency ordering in
test_list_assignments_paginates_in_recency_order.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b11bb13d-117b-41b0-9df3-76b307167940

📥 Commits

Reviewing files that changed from the base of the PR and between afb5cc5 and fc4c090.

📒 Files selected for processing (96)
  • .pre-commit-config.yaml
  • CLAUDE.md
  • cli/cmd/new.go
  • cli/cmd/start.go
  • cli/cmd/update.go
  • cli/internal/selfupdate/updater.go
  • docs/architecture/decisions.md
  • docs/getting_started.md
  • docs/guides/a2a-federation.md
  • docs/guides/approval-workflow.md
  • docs/guides/ceremony-scheduling-tuning.md
  • docs/guides/cost-attribution.md
  • docs/guides/custom-mcp-server-dev.md
  • docs/guides/custom-rules-and-meta-loop.md
  • docs/guides/dynamic-scoring.md
  • docs/guides/monitoring.md
  • docs/guides/ontology-extension.md
  • docs/guides/rest-api-examples.md
  • docs/guides/webhook-management.md
  • docs/guides/workers-and-background-tasks.md
  • docs/reference/typed-boundaries.md
  • docs/reference/yaml-schema.md
  • scripts/check_no_migration_framing.py
  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/api/controllers/_department_health.py
  • src/synthorg/api/controllers/experiments.py
  • src/synthorg/api/controllers/requests.py
  • src/synthorg/api/controllers/simulations.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/api/dto.py
  • src/synthorg/api/etag.py
  • src/synthorg/api/lifecycle_builder.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/state.py
  • src/synthorg/budget/_aggregation.py
  • src/synthorg/budget/enforcer.py
  • src/synthorg/client/pool.py
  • src/synthorg/client/store.py
  • src/synthorg/core/concurrency/cas_retry.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/engine/quality/decomposers/llm.py
  • src/synthorg/engine/workflow/ceremony_scheduler.py
  • src/synthorg/engine/workspace/config.py
  • src/synthorg/engine/workspace/semantic_llm.py
  • src/synthorg/experiments/__init__.py
  • src/synthorg/experiments/in_memory_repository.py
  • src/synthorg/experiments/models.py
  • src/synthorg/experiments/protocol.py
  • src/synthorg/experiments/service.py
  • src/synthorg/integrations/mcp_catalog/in_memory_installations.py
  • src/synthorg/llm/__init__.py
  • src/synthorg/llm/metadata.py
  • src/synthorg/memory/procedural/models.py
  • src/synthorg/memory/procedural/proposer.py
  • src/synthorg/meta/evolution/outcome_models.py
  • src/synthorg/meta/evolution/outcome_store_protocol.py
  • src/synthorg/meta/mcp/handlers/common_args.py
  • src/synthorg/observability/events/client.py
  • src/synthorg/observability/events/experiments.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/observability/events/workflow.py
  • src/synthorg/observability/prometheus_collector.py
  • src/synthorg/observability/prometheus_labels.py
  • src/synthorg/observability/prometheus_push_metrics.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/persistence/_shared/pagination.py
  • src/synthorg/persistence/postgres/backend.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/providers/models.py
  • src/synthorg/security/config.py
  • src/synthorg/security/llm_evaluator.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/claim.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/executor.py
  • tests/conformance/persistence/conftest.py
  • tests/conformance/persistence/test_backup_round_trip.py
  • tests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.py
  • tests/unit/client/test_pool_logging.py
  • tests/unit/client/test_store_logging.py
  • tests/unit/engine/test_approval_gate.py
  • tests/unit/experiments/__init__.py
  • tests/unit/experiments/test_service.py
  • tests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.py
  • tests/unit/observability/test_events.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/providers/conftest.py
  • tests/unit/providers/test_models.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • tests/unit/workers/test_claim_lifecycle.py
  • tests/unit/workers/test_executor.py
  • web/src/__tests__/stores/artifacts.test.ts
  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Backend
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: CLI Test (windows-latest)
  • GitHub Check: CLI Bench Regression
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (21)
cli/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Use the appropriate hint tier based on intent: HintError for error recovery (always visible unless quiet), HintNextStep for natural next action or destructive-action feedback, HintTip for config automation suggestions (deduplicates within session), and HintGuidance for flag/feature discovery (invisible in default auto mode)

CLI is a Go binary; use go -C cli never cd cli. For shell rules see ~/.claude/rules/common/bash.md (canonical for cd / git -C / Bash file-write rules). See cli/CLAUDE.md for CLI-specific conventions

Files:

  • cli/cmd/new.go
  • cli/cmd/start.go
  • cli/internal/selfupdate/updater.go
  • cli/cmd/update.go
cli/cmd/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Exit codes: 0 = success, 1 = runtime error, 2 = usage error, 3 = unhealthy (backend/containers), 4 = unreachable (Docker), 10 = updates available (--check)

Files:

  • cli/cmd/new.go
  • cli/cmd/start.go
  • cli/cmd/update.go
cli/cmd/new.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

new <kind> <domain> command accepts --dry-run and --overwrite flags and scaffolds a conventions-clean Python file set under src/synthorg/ for a new feature; <kind> is one of service, persistence, tool, controller

Files:

  • cli/cmd/new.go
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Read docs/design/ page before implementing; deviations need approval

Present every plan for accept/deny before coding

No region/currency/locale privileged; use metric units; British English spelling

Every convention PR must ship its enforcement gate

Configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env. No os.environ.get outside startup; pre-init Cat-2 reads use settings.bootstrap_resolver.resolve_init_value

No hardcoded numeric values; numerics live in settings/definitions/. Allowlist: 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal)

Never edit tests/baselines/unit_timing.json or any scripts/*_baseline.{txt,json} / scripts/_*_baseline.py; timeout/slow failures are source-code regressions. Per-invocation bypass for gate baselines: ALLOW_BASELINE_GROWTH=1 git commit ... (requires explicit user approval)

No from __future__ import annotations (Python 3.14 has PEP 649). Use PEP 758 exception syntax: except A, B: without parens unless binding

Type hints required on public functions; mypy strict mode. Use Google-style docstrings. Line length 88; functions <50 lines; files <800 lines

Error classes must follow <Domain><Condition>Error naming from DomainError base; never inherit directly from Exception/RuntimeError/etc

Pydantic v2 frozen models with extra="forbid" on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use @computed_field for derived values; use NotBlankStr for identifiers

Use args models at every system boundary; call parse_typed() for every external dict ingestion

For immutability use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Use asyncio.TaskGroup for fan-out/fan-in; helpers must catch Exception and re-raise `Memor...

Files:

  • tests/unit/client/test_store_logging.py
  • tests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.py
  • src/synthorg/api/controllers/requests.py
  • tests/unit/observability/test_events.py
  • src/synthorg/meta/evolution/outcome_models.py
  • src/synthorg/meta/evolution/outcome_store_protocol.py
  • src/synthorg/observability/events/experiments.py
  • src/synthorg/llm/__init__.py
  • src/synthorg/core/concurrency/cas_retry.py
  • src/synthorg/engine/workspace/config.py
  • src/synthorg/meta/mcp/handlers/common_args.py
  • src/synthorg/memory/procedural/models.py
  • src/synthorg/engine/quality/decomposers/llm.py
  • src/synthorg/security/llm_evaluator.py
  • tests/unit/providers/test_models.py
  • src/synthorg/api/lifecycle_builder.py
  • src/synthorg/api/controllers/simulations.py
  • src/synthorg/observability/events/workflow.py
  • tests/conformance/persistence/conftest.py
  • tests/unit/providers/conftest.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/observability/events/client.py
  • src/synthorg/experiments/models.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/integrations/mcp_catalog/in_memory_installations.py
  • src/synthorg/security/config.py
  • src/synthorg/experiments/protocol.py
  • src/synthorg/client/pool.py
  • src/synthorg/memory/procedural/proposer.py
  • src/synthorg/engine/workspace/semantic_llm.py
  • src/synthorg/experiments/__init__.py
  • src/synthorg/api/etag.py
  • src/synthorg/llm/metadata.py
  • src/synthorg/persistence/postgres/backend.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/budget/enforcer.py
  • src/synthorg/api/controllers/_department_health.py
  • src/synthorg/budget/_aggregation.py
  • tests/unit/workers/test_executor.py
  • src/synthorg/api/controllers/__init__.py
  • tests/conformance/persistence/test_backup_round_trip.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/persistence/protocol.py
  • tests/unit/experiments/test_service.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/unit/client/test_pool_logging.py
  • scripts/check_no_migration_framing.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • src/synthorg/client/store.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/workers/claim.py
  • tests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.py
  • tests/unit/persistence/test_protocol.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/engine/workflow/ceremony_scheduler.py
  • src/synthorg/persistence/_shared/pagination.py
  • src/synthorg/experiments/in_memory_repository.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/observability/prometheus_labels.py
  • tests/unit/engine/test_approval_gate.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/observability/prometheus_collector.py
  • src/synthorg/api/controllers/experiments.py
  • src/synthorg/providers/models.py
  • src/synthorg/api/state.py
  • src/synthorg/api/dto.py
  • src/synthorg/experiments/service.py
  • src/synthorg/observability/prometheus_push_metrics.py
  • src/synthorg/workers/executor.py
  • tests/unit/workers/test_claim_lifecycle.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use markers: @pytest.mark.{unit,integration,e2e,slow}. Async tests use auto. Global timeout 30s. Coverage minimum 80%

Use FakeClock from tests._shared for Clock seam; use mock_of[T](**overrides) from tests._shared for typed-boundary substitutions; use SimpleNamespace for attribute-bags. Bare MagicMock at typed boundaries is forbidden (enforced by scripts/check_mock_spec.py, zero-tolerance)

Use Hypothesis with 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Never skip or xfail flaky tests; fix fundamentally. Use asyncio.Event().wait() instead of sleep(large_timeout)

Files:

  • tests/unit/client/test_store_logging.py
  • tests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.py
  • tests/unit/observability/test_events.py
  • tests/unit/providers/test_models.py
  • tests/conformance/persistence/conftest.py
  • tests/unit/providers/conftest.py
  • tests/unit/workers/test_executor.py
  • tests/conformance/persistence/test_backup_round_trip.py
  • tests/unit/experiments/test_service.py
  • tests/unit/client/test_pool_logging.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • tests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/engine/test_approval_gate.py
  • tests/unit/workers/test_claim_lifecycle.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/client/test_store_logging.py
  • tests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.py
  • tests/unit/observability/test_events.py
  • tests/unit/providers/test_models.py
  • tests/conformance/persistence/conftest.py
  • tests/unit/providers/conftest.py
  • tests/unit/workers/test_executor.py
  • tests/conformance/persistence/test_backup_round_trip.py
  • tests/unit/experiments/test_service.py
  • tests/unit/client/test_pool_logging.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • tests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/engine/test_approval_gate.py
  • tests/unit/workers/test_claim_lifecycle.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use uv sync for dependency management; layout follows src/synthorg/ (src layout), tests/, web/ (React 19), cli/ (Go binary). Python 3.14+. BUSL-1.1 → Apache 2.0 after Change Date

Files:

  • src/synthorg/api/controllers/requests.py
  • src/synthorg/meta/evolution/outcome_models.py
  • src/synthorg/meta/evolution/outcome_store_protocol.py
  • src/synthorg/observability/events/experiments.py
  • src/synthorg/llm/__init__.py
  • src/synthorg/core/concurrency/cas_retry.py
  • src/synthorg/engine/workspace/config.py
  • src/synthorg/meta/mcp/handlers/common_args.py
  • src/synthorg/memory/procedural/models.py
  • src/synthorg/engine/quality/decomposers/llm.py
  • src/synthorg/security/llm_evaluator.py
  • src/synthorg/api/lifecycle_builder.py
  • src/synthorg/api/controllers/simulations.py
  • src/synthorg/observability/events/workflow.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/observability/events/client.py
  • src/synthorg/experiments/models.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/integrations/mcp_catalog/in_memory_installations.py
  • src/synthorg/security/config.py
  • src/synthorg/experiments/protocol.py
  • src/synthorg/client/pool.py
  • src/synthorg/memory/procedural/proposer.py
  • src/synthorg/engine/workspace/semantic_llm.py
  • src/synthorg/experiments/__init__.py
  • src/synthorg/api/etag.py
  • src/synthorg/llm/metadata.py
  • src/synthorg/persistence/postgres/backend.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/budget/enforcer.py
  • src/synthorg/api/controllers/_department_health.py
  • src/synthorg/budget/_aggregation.py
  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/client/store.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/workers/claim.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/engine/workflow/ceremony_scheduler.py
  • src/synthorg/persistence/_shared/pagination.py
  • src/synthorg/experiments/in_memory_repository.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/observability/prometheus_labels.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/observability/prometheus_collector.py
  • src/synthorg/api/controllers/experiments.py
  • src/synthorg/providers/models.py
  • src/synthorg/api/state.py
  • src/synthorg/api/dto.py
  • src/synthorg/experiments/service.py
  • src/synthorg/observability/prometheus_push_metrics.py
  • src/synthorg/workers/executor.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/controllers/requests.py
  • src/synthorg/meta/evolution/outcome_models.py
  • src/synthorg/meta/evolution/outcome_store_protocol.py
  • src/synthorg/observability/events/experiments.py
  • src/synthorg/llm/__init__.py
  • src/synthorg/core/concurrency/cas_retry.py
  • src/synthorg/engine/workspace/config.py
  • src/synthorg/meta/mcp/handlers/common_args.py
  • src/synthorg/memory/procedural/models.py
  • src/synthorg/engine/quality/decomposers/llm.py
  • src/synthorg/security/llm_evaluator.py
  • src/synthorg/api/lifecycle_builder.py
  • src/synthorg/api/controllers/simulations.py
  • src/synthorg/observability/events/workflow.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/observability/events/client.py
  • src/synthorg/experiments/models.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/integrations/mcp_catalog/in_memory_installations.py
  • src/synthorg/security/config.py
  • src/synthorg/experiments/protocol.py
  • src/synthorg/client/pool.py
  • src/synthorg/memory/procedural/proposer.py
  • src/synthorg/engine/workspace/semantic_llm.py
  • src/synthorg/experiments/__init__.py
  • src/synthorg/api/etag.py
  • src/synthorg/llm/metadata.py
  • src/synthorg/persistence/postgres/backend.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/budget/enforcer.py
  • src/synthorg/api/controllers/_department_health.py
  • src/synthorg/budget/_aggregation.py
  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/client/store.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/workers/claim.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/engine/workflow/ceremony_scheduler.py
  • src/synthorg/persistence/_shared/pagination.py
  • src/synthorg/experiments/in_memory_repository.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/observability/prometheus_labels.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/observability/prometheus_collector.py
  • src/synthorg/api/controllers/experiments.py
  • src/synthorg/providers/models.py
  • src/synthorg/api/state.py
  • src/synthorg/api/dto.py
  • src/synthorg/experiments/service.py
  • src/synthorg/observability/prometheus_push_metrics.py
  • src/synthorg/workers/executor.py
**/*.{md,txt}

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers

Files:

  • docs/guides/custom-mcp-server-dev.md
  • docs/guides/custom-rules-and-meta-loop.md
  • docs/guides/ontology-extension.md
  • docs/architecture/decisions.md
  • docs/getting_started.md
  • docs/guides/dynamic-scoring.md
  • CLAUDE.md
  • docs/guides/a2a-federation.md
  • docs/guides/ceremony-scheduling-tuning.md
  • docs/guides/webhook-management.md
  • docs/guides/cost-attribution.md
  • docs/guides/approval-workflow.md
  • docs/guides/monitoring.md
  • docs/guides/rest-api-examples.md
  • docs/reference/yaml-schema.md
  • docs/reference/typed-boundaries.md
  • docs/guides/workers-and-background-tasks.md
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Use d2 for architecture / nested containers, mermaid for flowcharts / sequence / pipelines. Use Markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI

Files:

  • docs/guides/custom-mcp-server-dev.md
  • docs/guides/custom-rules-and-meta-loop.md
  • docs/guides/ontology-extension.md
  • docs/architecture/decisions.md
  • docs/getting_started.md
  • docs/guides/dynamic-scoring.md
  • CLAUDE.md
  • docs/guides/a2a-federation.md
  • docs/guides/ceremony-scheduling-tuning.md
  • docs/guides/webhook-management.md
  • docs/guides/cost-attribution.md
  • docs/guides/approval-workflow.md
  • docs/guides/monitoring.md
  • docs/guides/rest-api-examples.md
  • docs/reference/yaml-schema.md
  • docs/reference/typed-boundaries.md
  • docs/guides/workers-and-background-tasks.md
cli/cmd/start.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

start command accepts flags --no-wait, --timeout, --no-pull, --dry-run, --no-detach, --no-verify

Files:

  • cli/cmd/start.go
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx}: Always use createLogger from @/lib/logger; never use bare console.warn/console.error/console.debug in application code
Always use variable name log for logger instances (e.g., const log = createLogger('module-name'))
Pass dynamic/untrusted values as separate args to logger (not interpolated into the message string) so they go through sanitizeArg
Wrap attacker-controlled fields inside structured objects in sanitizeForLog() before embedding in logs
Callers MUST NOT wrap store mutation calls in try/catch; the store owns the error UX
Display counts must come from data.length; the wire envelope no longer carries total
Import ErrorCode and ErrorCategory from @/api/types/errors (re-exported from web/src/api/types/error-codes.gen.ts); discriminate on ErrorCode.<NAME>, never on raw integer literals
Generated DTO types re-export components['schemas']['X'] with no hand-written tightening because the generator handles response optionality
Import DTOs via the barrel (import type { AgentConfig } from '@/api/types') or directly from the generated module (import type { AgentConfig } from '@/api/types/dtos.gen')
ALWAYS reuse existing components from web/src/components/ui/ before creating new ones
A PostToolUse hook (scripts/check_web_design_system.py) flags hardcoded hex/rgba/fonts/Motion durations/locale literals/bare .toLocale*String() calls/missing Storybook stories/duplicate component patterns/complex .map() blocks; fix every violation before proceeding
Forbid passing async functions where the callsite ignores the returned promise via @typescript-eslint/no-misused-promises (with checksVoidReturn: { attributes: false }); React 19 async event handlers stay allowed

Files:

  • web/src/__tests__/stores/artifacts.test.ts
  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
web/src/{test-setup,__tests__}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Boot test-setup.tsx with onUnhandledRequest: 'error'; override per-case via server.use(...), never vi.mock('@/api/endpoints/*')

Files:

  • web/src/__tests__/stores/artifacts.test.ts
web/src/**/__tests__/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Every unit test runs under web/test-infra/active-handle-tracker.ts with zero tolerance for leaking event-loop-holding resources; no ceiling, no buffer

Files:

  • web/src/__tests__/stores/artifacts.test.ts
web/src/__tests__/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Forbid unawaited promises via @typescript-eslint/no-floating-promises so async work cannot survive the test that scheduled it and trip the active-handle gate

Files:

  • web/src/__tests__/stores/artifacts.test.ts
web/**/*.{tsx,ts}

📄 CodeRabbit inference engine (CLAUDE.md)

Web is React 19. For Web Dashboard Design System see web/CLAUDE.md. Reuse web/src/components/ui/; design tokens only

Files:

  • web/src/__tests__/stores/artifacts.test.ts
  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
tests/conformance/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Dual-backend conformance tests must use backend fixture (SQLite + Postgres). Enforced by check_dual_backend_test_parity.py

Files:

  • tests/conformance/persistence/conftest.py
  • tests/conformance/persistence/test_backup_round_trip.py
src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL

Repository CRUD methods: save(entity), get(id), delete(id) -> bool, list_items(...), query(...) returning tuples

For datetime in persistence use parse_iso_utc / format_iso_utc from persistence._shared (reject naive); use normalize_utc for already-typed values

Files:

  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/persistence/postgres/backend.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/_shared/pagination.py
cli/cmd/update.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

update command accepts flags --dry-run, --no-restart, --timeout, --cli-only, --images-only, --check

Files:

  • cli/cmd/update.go
web/src/{stores,api}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Use cursor-based paging via PaginationMeta for list endpoints; keep nextCursor + hasMore in state (not offset arithmetic) and early-return when !hasMore || !nextCursor

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
web/src/api/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

getLiveness() is always 200 while the process is alive; getReadiness() is 200 healthy / 503 unavailable (binary 'ok' | 'unavailable' outcome, no tri-state). Any new caller must handle the 503 path explicitly.

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

NEVER hand-edit a *.gen.ts file; regenerate with uv run python scripts/generate_dto_types_ts.py

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T20:41:11.727Z
Learning: After every issue: create branch, commit, push (no auto-PR); use `/pre-pr-review` command (gh pr create is hookify-blocked). After PR creation: use `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T20:41:11.727Z
Learning: After every squash merge run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T20:41:11.727Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard and REST API, not CLI
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T20:41:11.727Z
Learning: Commits follow `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); enforced by commitizen. Branches follow `<type>/<slug>` pattern. Squash merge only; PR body becomes squash commit message. Trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T20:41:11.727Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/client/test_store_logging.py
  • tests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.py
  • src/synthorg/api/controllers/requests.py
  • tests/unit/observability/test_events.py
  • src/synthorg/meta/evolution/outcome_models.py
  • src/synthorg/meta/evolution/outcome_store_protocol.py
  • src/synthorg/observability/events/experiments.py
  • src/synthorg/llm/__init__.py
  • src/synthorg/core/concurrency/cas_retry.py
  • src/synthorg/engine/workspace/config.py
  • src/synthorg/meta/mcp/handlers/common_args.py
  • src/synthorg/memory/procedural/models.py
  • src/synthorg/engine/quality/decomposers/llm.py
  • src/synthorg/security/llm_evaluator.py
  • tests/unit/providers/test_models.py
  • src/synthorg/api/lifecycle_builder.py
  • src/synthorg/api/controllers/simulations.py
  • src/synthorg/observability/events/workflow.py
  • tests/conformance/persistence/conftest.py
  • tests/unit/providers/conftest.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/observability/events/client.py
  • src/synthorg/experiments/models.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/integrations/mcp_catalog/in_memory_installations.py
  • src/synthorg/security/config.py
  • src/synthorg/experiments/protocol.py
  • src/synthorg/client/pool.py
  • src/synthorg/memory/procedural/proposer.py
  • src/synthorg/engine/workspace/semantic_llm.py
  • src/synthorg/experiments/__init__.py
  • src/synthorg/api/etag.py
  • src/synthorg/llm/metadata.py
  • src/synthorg/persistence/postgres/backend.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/budget/enforcer.py
  • src/synthorg/api/controllers/_department_health.py
  • src/synthorg/budget/_aggregation.py
  • tests/unit/workers/test_executor.py
  • src/synthorg/api/controllers/__init__.py
  • tests/conformance/persistence/test_backup_round_trip.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/persistence/protocol.py
  • tests/unit/experiments/test_service.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/unit/client/test_pool_logging.py
  • scripts/check_no_migration_framing.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • src/synthorg/client/store.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/workers/claim.py
  • tests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.py
  • tests/unit/persistence/test_protocol.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/engine/workflow/ceremony_scheduler.py
  • src/synthorg/persistence/_shared/pagination.py
  • src/synthorg/experiments/in_memory_repository.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/observability/prometheus_labels.py
  • tests/unit/engine/test_approval_gate.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/observability/prometheus_collector.py
  • src/synthorg/api/controllers/experiments.py
  • src/synthorg/providers/models.py
  • src/synthorg/api/state.py
  • src/synthorg/api/dto.py
  • src/synthorg/experiments/service.py
  • src/synthorg/observability/prometheus_push_metrics.py
  • src/synthorg/workers/executor.py
  • tests/unit/workers/test_claim_lifecycle.py
🪛 LanguageTool
docs/getting_started.md

[uncategorized] ~45-~45: The official name of this software platform is spelled with a capital “H”.
Context: ...golangci-lint version that matches CI (.github/workflows/cli.yml`). Re-run only after ...

(GITHUB)

docs/guides/ceremony-scheduling-tuning.md

[style] ~101-~101: ‘by mistake’ might be wordy. Consider a shorter alternative.
Context: ...| Two strategies enabled simultaneously by mistake | Confirm ceremony_policy.strategy is...

(EN_WORDINESS_PREMIUM_BY_MISTAKE)

docs/guides/webhook-management.md

[uncategorized] ~32-~32: The official name of this software platform is spelled with a capital “H”.
Context: ...ter and POST Register a receiver for a github connection: ```python from synthorg.i...

(GITHUB)


[style] ~68-~68: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... 401 on a signature mismatch. - 409 on a replay older than the replay window. ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/guides/cost-attribution.md

[style] ~27-~27: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: .../agent). - GET /api/v1/costs/records` returns the raw record stream (paginated). ```...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/guides/workers-and-background-tasks.md

[grammar] ~63-~63: Ensure spelling is correct
Context: ...2. Polls the consumer for claims with a per-fetch timeout. 3. Runs the configured e...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.22.1)
docs/guides/workers-and-background-tasks.md

[warning] 81-81: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Comment thread .pre-commit-config.yaml
Comment thread cli/cmd/new.go
Comment thread cli/cmd/start.go Outdated
Comment on lines +132 to +137
**Verification checkpoint (2026-06-10):** on this date run the checklist below and update this section with the outcome (mark each item Done / Not done / Outcome).

1. Inspect `nats-io/nats.py` open PRs and recent releases on GitHub for the `inspect.iscoroutinefunction` fix.
2. If a fixed release is available: bump the `nats-py` pin in `pyproject.toml`, drop the matching `filterwarnings` entry, run `uv run python -m pytest tests/ -m integration -k nats` to confirm warnings are gone, and replace this checkpoint section with the resolution outcome.
3. If no fixed release exists: implement the local monkey-patch in `src/synthorg/communication/bus/_nats_compat.py` (one-line `nats.aio.client.iscoroutinefunction = inspect.iscoroutinefunction`), import it at bus initialisation, and extend this section with the patch landing date.
4. Re-evaluate `nats-core` JetStream support: a maintained alternative removes the entire mitigation requirement.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Checkpoint numerics should follow docs numeric-sourcing policy.

The newly added dated checklist contains hardcoded numerics/date values without <!--RS:...--> markers. Please source these numerics through runtime-stats markers (or move volatile checkpoint timing out of public docs).

As per coding guidelines, “Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/architecture/decisions.md` around lines 132 - 137, The dated checklist
header "Verification checkpoint (2026-06-10)" and any hardcoded numerics in the
new items must be replaced with runtime-stats markers per the docs
numeric-sourcing policy: pull the date/number from data/runtime_stats.yaml and
insert a <!--RS:NAME--> marker instead of the literal "2026-06-10" (or move the
volatile checkpoint into an internal/temporary doc if you prefer). Update the
checklist text in docs/architecture/decisions.md (the "Verification checkpoint"
heading and the numbered items 1–4) to reference the appropriate <!--RS:...-->
keys and ensure the corresponding entries exist in data/runtime_stats.yaml; if
you move the checkpoint out of public docs, note that change here and remove
hardcoded numerics.

Comment thread docs/getting_started.md
Comment on lines +35 to +53
## Install external CLI tools (one-time per machine)

Some gates and the docs build rely on external binaries that are not Python packages: `golangci-lint` (Go linter, used by the CLI) and `d2` (architecture diagram renderer).

Install `golangci-lint` once per machine:

```bash
bash scripts/install_cli_tools.sh
```

The script downloads the pinned `golangci-lint` version that matches CI (`.github/workflows/cli.yml`). Re-run only after bumping the pinned version; subsequent `uv sync` invocations do NOT re-run the script. CI uses its own action-based install step, so this is strictly a local-developer convenience.

Install `d2` separately (the docs job pins `v0.7.1`). The fastest path is the upstream installer:

```bash
curl -fsSL https://d2lang.com/install.sh | sh -s -- --version v0.7.1
```

On Windows, install via `winget install Terrastruct.d2` or download the release archive from `https://github.com/terrastruct/d2/releases`. Either way, ensure the resulting `d2` binary is on `PATH`; the docs build invokes it directly.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

New numeric doc content needs <!--RS:...--> markers.

This section introduces hardcoded numeric values (including version numerics) without runtime-stats markers. Please source numeric literals via <!--RS:...--> or rewrite to avoid explicit numerics.

As per coding guidelines, “Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers”.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~45-~45: The official name of this software platform is spelled with a capital “H”.
Context: ...golangci-lint version that matches CI (.github/workflows/cli.yml`). Re-run only after ...

(GITHUB)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/getting_started.md` around lines 35 - 53, Replace hardcoded version
numerics in docs/getting_started.md with runtime-stats markers: change the d2
version string "v0.7.1" to a marker like <!--RS:D2_VERSION--> and replace any
golangci-lint pinned version mention with a marker like
<!--RS:GOLANGCI_LINT_VERSION--> that sources values from
data/runtime_stats.yaml; ensure any other explicit numeric literals in this
section are similarly converted to <!--RS:...--> markers and update the install
guidance to reference scripts/install_cli_tools.sh and the CI pin in
.github/workflows/cli.yml so readers know where the marker values originate.

Comment thread src/synthorg/workers/__main__.py
Comment thread src/synthorg/workers/execution_service.py
Comment thread src/synthorg/workers/executor.py Outdated
Comment thread src/synthorg/workers/executor.py Outdated
Comment thread tests/unit/engine/test_approval_gate.py Outdated
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an A/B experiment registry for deterministic variant assignment, implements an HTTP-callback task executor for distributed workers, and enhances observability with new Prometheus metrics for WebSocket connections and Postgres connection pools. It also includes extensive documentation updates, refines CLI error handling and help text, and tightens linting rules against forensic prose. Feedback from the review suggests simplifying WebSocket connection tracking by using atomic Prometheus gauge operations instead of manual locks, removing redundant local imports, and cleaning up a redundant modulo operation in the experiment hashing logic.

Comment thread src/synthorg/api/controllers/ws.py Outdated
Comment on lines +684 to +690
# Process-wide guard for the active-WS read-modify-write below. Two
# WS handler coroutines never interleave inside a single sync block,
# but Prometheus' metrics scrape thread (run by the ASGI server's
# background scraper) can read ``app.state`` concurrently with a
# handler that is mid-increment; the lock keeps the increment + store
# pair indivisible from any concurrent reader's perspective.
_WS_CONNECTION_COUNT_LOCK = threading.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The process-wide _WS_CONNECTION_COUNT_LOCK and the manual tracking of ws_active_connection_count in app.state are redundant. Since ws_handler is an async coroutine, multiple instances running on the same event loop will not interleave during the synchronous execution of these recording helpers. Additionally, the ws_active_connections metric is a Prometheus Gauge, which is thread-safe and supports atomic inc() and dec() operations. Using the gauge directly would simplify the implementation and eliminate the need for manual synchronization.

Comment thread src/synthorg/api/controllers/ws.py Outdated
# lifetime into the ``synthorg_ws_connection_lifetime_seconds``
# histogram. ``time.monotonic`` so a wall-clock NTP step does not
# push the bucket bound.
import time # noqa: PLC0415
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The local import of the time module is redundant as it is already imported at the top of the file (line 23). This also applies to the local import on line 912.

Comment thread src/synthorg/experiments/service.py Outdated
material = f"{experiment}\x1f{subject_id}".encode()
digest = hashlib.sha256(material).digest()
raw = int.from_bytes(digest[:4], byteorder="big", signed=False)
return (raw % _HASH_BUCKET_MAX) % modulus
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The modulo operation raw % _HASH_BUCKET_MAX is redundant. raw is an unsigned 32-bit integer initialized from 4 bytes, so its value is already within the range [0, 2**32). Since _HASH_BUCKET_MAX is defined as 2**32, the expression raw % _HASH_BUCKET_MAX always evaluates to raw.

Suggested change
return (raw % _HASH_BUCKET_MAX) % modulus
return raw % modulus

@codecov
Copy link
Copy Markdown

codecov Bot commented May 15, 2026

Codecov Report

❌ Patch coverage is 71.50350% with 163 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.15%. Comparing base (afb5cc5) to head (c2fa188).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/synthorg/workers/__main__.py 14.28% 24 Missing ⚠️
src/synthorg/workers/execution_service.py 40.00% 24 Missing ⚠️
src/synthorg/observability/prometheus_collector.py 42.42% 17 Missing and 2 partials ⚠️
...horg/observability/prometheus_recording_streams.py 52.94% 16 Missing ⚠️
src/synthorg/api/controllers/experiments.py 65.78% 13 Missing ⚠️
src/synthorg/workers/executor.py 83.95% 9 Missing and 4 partials ⚠️
src/synthorg/api/state.py 50.00% 12 Missing ⚠️
src/synthorg/experiments/service.py 81.35% 9 Missing and 2 partials ⚠️
src/synthorg/experiments/in_memory_repository.py 79.54% 8 Missing and 1 partial ⚠️
src/synthorg/engine/workflow/ceremony_scheduler.py 50.00% 7 Missing and 1 partial ⚠️
... and 4 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1926      +/-   ##
==========================================
- Coverage   85.20%   85.15%   -0.06%     
==========================================
  Files        1834     1846      +12     
  Lines      107196   107752     +556     
  Branches     9245     9287      +42     
==========================================
+ Hits        91341    91751     +410     
- Misses      13618    13755     +137     
- Partials     2237     2246       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…I drain race

Round 2 batch on PR #1926 (refs #1922). Bundles the CI test failure

fix, 22 valid CodeRabbit inline findings, 3 Gemini inline findings, and

2 CodeRabbit outside-diff comments. RS-marker findings on docs not in

the doc-numeric-macros gate scope (only 5 files are scanned) are

factually wrong against the actual gate definition and skipped with

reasons in the round-history entry.

CI fix:

- engine/task_engine._drain_all: split effective_timeout evenly between

  processing- and observer-drain stages so a slow processing

  cancellation no longer starves observer drain into a zero-budget

  call (test_drain_timeout_resolves_pending_futures was failing under

  CI load even at 500ms because remaining ~= 0 after _drain_processing)

ws.py (CodeRabbit + Gemini collapsed):

- Drop _WS_CONNECTION_COUNT_LOCK + state_dict counter, use atomic

  Gauge.inc()/.dec() via new collector methods. prometheus_client

  Gauge is thread-safe so the scrape thread cannot tear a count.

- Drop the two redundant local 'import time' bindings (already at top)

Other source/test fixes:

- cli/cmd/new.go: bare 'synthorg new' returns ExitUsage after Help()

- cli/internal/config/state.go: typed sentinels ErrReading/ErrParsing

- cli/cmd/start.go: errors.Is(ErrReading/ErrParsing) replaces strings.HasPrefix

- scripts/check_no_migration_framing.py: regex adds owned|emitted|wrapped

- api/controllers/tasks.py: explicit status_code=HTTP_200_OK on /execute

- api/dto.py: extract _MAX_SELECTION_WEIGHT, add 3 DTOs to __all__

- api/state.py: double-checked locking on lazy worker_execution/experiment_service

- experiments/models.py: extract _MAX_VARIANT_WEIGHT

- experiments/service.py: drop no-op intermediate modulo, re-fetch after

  record_assignment so concurrent writers converge on the canonical

  first-write timestamp

- observability/prometheus_recording_streams: require_non_negative on

  ws_connection_lifetime + pg_pool_acquire duration; add inc_/dec_ atoms

- observability/events/workers: WORKERS_MAIN_INVALID_EXECUTOR_CONFIG event

- workers/__main__: resolve executor BEFORE task_queue.start(); log before

  SystemExit on both missing-config paths

- workers/execution_service: logger.warning before NotFoundError raise

- workers/executor: urllib.parse.quote(task_id); terminal 'failed' status

  now maps to SUCCESS to match the module docstring contract

- web/src/api/types/*.gen.ts: regenerated for the execute-endpoint

  status_code=200 change

Test updates:

- tests/unit/engine/test_approval_gate: ParkedContext is Pydantic so

  create_autospec(spec_set=True) cannot see its fields; build a real

  instance at the typed boundary

- tests/unit/engine/test_task_engine_integration: refresh budget-choice

  comment to describe the new even split

- tests/unit/experiments/test_service: advance FakeClock between assigns

  + assert strictly decreasing timestamps

- tests/unit/scripts/test_check_no_migration_framing: parametrize the

  three new verbs (owned/emitted/wrapped)

- tests/unit/workers/test_executor: rename + reassert terminal-failed

  returns SUCCESS to match the fixed contract

Docs:

- docs/guides/monitoring.md: include audit-log fill-ratio gauge in the

  Audit & Security row

- docs/guides/workers-and-background-tasks.md: 'text' fence language

- docs/reference/typed-boundaries.md: split intro into parse_typed-

  enforced vs informational/lenient groups

- .pre-commit-config.yaml: refresh header re no-review-origin-in-code +

  no-migration-framing pre-push staging + CI counterpart
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 15, 2026 21:43 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/workers/__main__.py (1)

174-189: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Close owned HTTP client when queue startup fails.

If Line 178 (task_queue.start()) throws, control never enters the current try/finally, so the created httpx.AsyncClient is left unclosed.

Proposed fix
     task_queue = JetStreamTaskQueue(
         queue_config=queue_config,
         nats_config=nats_config,
     )
-    await task_queue.start()
+    queue_started = False
     try:
+        await task_queue.start()
+        queue_started = True
         await run_worker_pool(
             queue_config=queue_config,
             task_queue=task_queue,
             executor=executor,
             worker_count=args.workers,
         )
     finally:
         if http_client is not None:
             await http_client.aclose()
-        await task_queue.stop()
+        if queue_started:
+            await task_queue.stop()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/workers/__main__.py` around lines 174 - 189, If
task_queue.start() raises an exception the existing finally block is never
reached and the created http_client is leaked; wrap the startup call so that any
exception during task_queue.start() will first close the owned http_client
(await http_client.aclose() when http_client is not None) and then re-raise the
error. Concretely, after constructing task_queue and http_client, call await
task_queue.start() inside a try/except that on exception awaits
http_client.aclose() if present and then raises, then proceed into the existing
try/finally that calls run_worker_pool and stops the task_queue; reference
symbols: http_client, task_queue.start(), JetStreamTaskQueue, run_worker_pool.
♻️ Duplicate comments (2)
src/synthorg/experiments/service.py (1)

150-191: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

assign() still has a first-write race that can fail instead of replaying canonical assignment.

At Line 179, a concurrent insert conflict from record_assignment() will raise before the re-fetch path executes, so the method is not reliably idempotent under contention. Catch duplicate/unique-conflict from the repository and then re-read get_assignment(...) to return the canonical winner.

Suggested direction
         await self._repo.record_assignment(assignment)
+        # If another worker won the insert race, fetch and replay canonical row.
+        # Replace `AssignmentConflictError` with the repository's concrete
+        # duplicate/unique-conflict exception type.
+        # try:
+        #     await self._repo.record_assignment(assignment)
+        # except AssignmentConflictError:
+        #     canonical = await self._repo.get_assignment(
+        #         experiment=experiment,
+        #         subject_id=subject_id,
+        #     )
+        #     if canonical is not None:
+        #         return canonical
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/experiments/service.py` around lines 150 - 191, The assign()
flow has a race where record_assignment(...) can raise a duplicate/unique
constraint error before the re-fetch, so wrap the call to
self._repo.record_assignment(assignment) in an exception handler that catches
the repository’s duplicate/unique-conflict error, and in that handler call
self._repo.get_assignment(experiment=experiment, subject_id=subject_id) and
return the canonical assignment if found (otherwise re-raise or convert to a
clear error). Ensure you reference the same ExperimentAssignment, assignment,
canonical, and the self._repo.get_assignment/record_assignment methods so the
fix only handles the specific conflict and preserves the re-fetch semantics.
docs/guides/monitoring.md (1)

104-104: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace hardcoded numerics with runtime-stats markers in new docs content.

The newly added metric description, PromQL commentary, alert rule, and Grafana JSON still hardcode values (0.9, 0.75, 1, 10m, etc.). This will drift and violate docs numeric sourcing policy.

As per coding guidelines "{README.md,docs/**/*.md}: Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers".

Also applies to: 227-281

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/monitoring.md` at line 104, The new metric
`synthorg_security_audit_log_fill_ratio` and its associated PromQL examples,
alert rule, and Grafana JSON still use hardcoded numbers (e.g. 0.9, 0.75, 1,
10m); replace each hardcoded numeric with a runtime-stats marker <!--RS:NAME-->
that references the corresponding key in data/runtime_stats.yaml (create
sensible keys like audit_log_alert_threshold, audit_log_warn_threshold,
audit_log_max_entries, audit_log_window if they don't exist), update the
markdown description, PromQL commentary, alert rule and panel JSON to use those
markers, and ensure the marker names match entries in runtime_stats.yaml so the
docs build pulls values at render time.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/state.py`:
- Around line 204-205: AppState's __init__ assigns self._lazy_service_lock but
'_lazy_service_lock' is not listed in the class __slots__, causing
AttributeError; add the string '_lazy_service_lock' to the AppState.__slots__
tuple alongside '_experiment_service' and '_fine_tune_orchestrator' (and any
other slot entries) so the attribute can be set, leaving the existing __init__
assignment for _lazy_service_lock unchanged.

In `@src/synthorg/engine/task_engine.py`:
- Around line 393-404: The _drain_all docstring contains migration-framing and
historical narrative; rewrite the docstring for function _drain_all to remove
past-tense/migration language and instead state the current rationale concisely
(why the budget is split and what guarantee it provides). Locate the docstring
that describes splitting the budget between processing and observer drains and
replace the historical explanation with a present-tense rationale such as: the
budget is split evenly between processing and observer drains to ensure each
stage receives at least effective_timeout/2, preventing observer starve and
avoiding outer-guard deadline races; keep references to effective_timeout,
_drain_processing, and observer queue behavior for context. Ensure no reviewer
citations, issue references, or migration history remain.

In `@src/synthorg/experiments/service.py`:
- Around line 80-95: register_variant() currently only checks weight < 1 and
lets Pydantic's ExperimentVariant(...) raise a ValidationError for weight >
_MAX_VARIANT_WEIGHT, bypassing the EXPERIMENT_VARIANT_INVALID_WEIGHT log and
domain error contract; add an explicit upper-bound check (e.g., if weight >
_MAX_VARIANT_WEIGHT) before constructing ExperimentVariant to log the same
EXPERIMENT_VARIANT_INVALID_WEIGHT with experiment/variant/weight and raise the
same service ValidationError(msg) ("weight must be >= 1" should be adjusted to a
suitable message like "weight must be between 1 and {_MAX_VARIANT_WEIGHT}") so
all invalid-weight paths use the same logging and exception behavior.

In `@src/synthorg/workers/__main__.py`:
- Around line 53-55: The docstring for _placeholder_executor incorrectly states
it is used "when no backend URL is configured"; update that comment to reflect
actual behaviour: the placeholder executor is only used when the operator
explicitly passes "--executor placeholder" and not when the backend URL is
missing because _resolve_executor() exits in that case. Locate the docstring
near the _placeholder_executor definition and/or its usage in _resolve_executor,
remove or replace the phrase "or when no backend URL is configured" with wording
that says it is only a smoke-test/fallback when the operator explicitly requests
the placeholder executor (e.g., via "--executor placeholder").

In `@tests/unit/experiments/test_service.py`:
- Around line 20-27: The test reaches into ExperimentService's private attribute
svc._clock making it brittle; modify the helper _service() so it returns the
FakeClock along with the ExperimentService and repository (e.g., return svc,
repo, clock) and update tests to use that returned FakeClock instead of
accessing svc._clock; locate _service() and the test assertions that reference
svc._clock and replace those accesses with the directly returned FakeClock
instance (FakeClock and ExperimentService are the key symbols to update).

---

Outside diff comments:
In `@src/synthorg/workers/__main__.py`:
- Around line 174-189: If task_queue.start() raises an exception the existing
finally block is never reached and the created http_client is leaked; wrap the
startup call so that any exception during task_queue.start() will first close
the owned http_client (await http_client.aclose() when http_client is not None)
and then re-raise the error. Concretely, after constructing task_queue and
http_client, call await task_queue.start() inside a try/except that on exception
awaits http_client.aclose() if present and then raises, then proceed into the
existing try/finally that calls run_worker_pool and stops the task_queue;
reference symbols: http_client, task_queue.start(), JetStreamTaskQueue,
run_worker_pool.

---

Duplicate comments:
In `@docs/guides/monitoring.md`:
- Line 104: The new metric `synthorg_security_audit_log_fill_ratio` and its
associated PromQL examples, alert rule, and Grafana JSON still use hardcoded
numbers (e.g. 0.9, 0.75, 1, 10m); replace each hardcoded numeric with a
runtime-stats marker <!--RS:NAME--> that references the corresponding key in
data/runtime_stats.yaml (create sensible keys like audit_log_alert_threshold,
audit_log_warn_threshold, audit_log_max_entries, audit_log_window if they don't
exist), update the markdown description, PromQL commentary, alert rule and panel
JSON to use those markers, and ensure the marker names match entries in
runtime_stats.yaml so the docs build pulls values at render time.

In `@src/synthorg/experiments/service.py`:
- Around line 150-191: The assign() flow has a race where record_assignment(...)
can raise a duplicate/unique constraint error before the re-fetch, so wrap the
call to self._repo.record_assignment(assignment) in an exception handler that
catches the repository’s duplicate/unique-conflict error, and in that handler
call self._repo.get_assignment(experiment=experiment, subject_id=subject_id) and
return the canonical assignment if found (otherwise re-raise or convert to a
clear error). Ensure you reference the same ExperimentAssignment, assignment,
canonical, and the self._repo.get_assignment/record_assignment methods so the
fix only handles the specific conflict and preserves the re-fetch semantics.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f8a810c7-270e-4e6c-aa58-3c755d4b1508

📥 Commits

Reviewing files that changed from the base of the PR and between fc4c090 and 69540af.

📒 Files selected for processing (26)
  • .pre-commit-config.yaml
  • cli/cmd/new.go
  • cli/cmd/start.go
  • cli/internal/config/state.go
  • docs/guides/monitoring.md
  • docs/guides/workers-and-background-tasks.md
  • docs/reference/typed-boundaries.md
  • scripts/check_no_migration_framing.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/api/dto.py
  • src/synthorg/api/state.py
  • src/synthorg/engine/task_engine.py
  • src/synthorg/experiments/models.py
  • src/synthorg/experiments/service.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/executor.py
  • tests/unit/engine/test_approval_gate.py
  • tests/unit/engine/test_task_engine_integration.py
  • tests/unit/experiments/test_service.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • tests/unit/workers/test_executor.py
  • web/src/api/types/openapi.gen.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: CLI Test (windows-latest)
  • GitHub Check: CLI Bench Regression
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (18)
cli/internal/config/state.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Define Default{Postgres,NATS}ImageTag and matching Default{Postgres,NATS}ImageDigest constants in cli/internal/config/state.go with a // renovate: annotation on the tag constant for Renovate customManager to keep both tag and digest current

Files:

  • cli/internal/config/state.go
cli/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Use the appropriate hint tier based on intent: HintError for error recovery (always visible unless quiet), HintNextStep for natural next action or destructive-action feedback, HintTip for config automation suggestions (deduplicates within session), and HintGuidance for flag/feature discovery (invisible in default auto mode)

For CLI (Go binary) development, refer to cli/CLAUDE.md and use go -C cli (never cd cli)

Files:

  • cli/internal/config/state.go
  • cli/cmd/new.go
  • cli/cmd/start.go
cli/internal/{config,verify}/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Overriding any of registry_host, image_repo_prefix, dhi_registry, postgres_image_tag, or nats_image_tag disables image signature + SLSA verification for that invocation only and writes a stderr warning on every invocation (not suppressed by --quiet or --json)

Files:

  • cli/internal/config/state.go
cli/internal/config/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

When both SYNTHORG_DATABASE_URL and SYNTHORG_DB_PATH are present, SYNTHORG_DATABASE_URL wins (Postgres is initialised; SQLite path is ignored); malformed URL raises loudly at startup rather than silently falling back

Files:

  • cli/internal/config/state.go
!(providers/presets.py|web/public/provider-logos/**|.claude/**)

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic naming: NEVER use real vendor names in project code/tests; use example-provider, test-provider, example-{large,medium,small}-001; allowed only in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Files:

  • cli/internal/config/state.go
  • scripts/check_no_migration_framing.py
  • tests/unit/engine/test_task_engine_integration.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • src/synthorg/engine/task_engine.py
  • cli/cmd/new.go
  • cli/cmd/start.go
  • src/synthorg/api/dto.py
  • src/synthorg/api/controllers/tasks.py
  • docs/reference/typed-boundaries.md
  • .pre-commit-config.yaml
  • tests/unit/engine/test_approval_gate.py
  • docs/guides/workers-and-background-tasks.md
  • src/synthorg/experiments/models.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/experiments/service.py
  • src/synthorg/api/state.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/workers/execution_service.py
  • web/src/api/types/openapi.gen.ts
  • tests/unit/experiments/test_service.py
  • docs/guides/monitoring.md
  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/__main__.py
  • tests/unit/workers/test_executor.py
  • src/synthorg/workers/executor.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min

Windows tests: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: follow ladder in conventions.md section 12.1; use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries blocked by check_mock_spec.py; import from tests._shared

Files:

  • tests/unit/engine/test_task_engine_integration.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • tests/unit/engine/test_approval_gate.py
  • tests/unit/experiments/test_service.py
  • tests/unit/workers/test_executor.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/engine/test_task_engine_integration.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • tests/unit/engine/test_approval_gate.py
  • tests/unit/experiments/test_service.py
  • tests/unit/workers/test_executor.py
src/synthorg/!(persistence)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL

Files:

  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/dto.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/experiments/models.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/experiments/service.py
  • src/synthorg/api/state.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/executor.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env; use settings.bootstrap_resolver.resolve_init_value for pre-init Cat-2 reads; YAML is ingestion format only, not a precedence tier; no os.environ.get outside startup

Numerics live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of the form NAME: int|float|Final|Final[int]|Final[float] = literal

Comments must explain WHY only; no reviewer citations, issue back-references, or migration framing

Do not use from __future__ import annotations (Python 3.14 has PEP 649); use PEP 758 except: except A, B: requires parentheses when binding

Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Errors must follow <Domain><Condition>Error naming from DomainError base; never inherit directly from Exception/RuntimeError/etc (enforced by check_domain_error_hierarchy.py)

Pydantic v2 API DTOs must be frozen with extra='forbid' (for Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use @computed_field for derived values; use NotBlankStr for identifiers

Use args models at every system boundary; apply parse_typed() for every external dict ingestion (enforced by check_boundary_typed.py)

Use immutability with model_copy(update=...) or copy.deepcopy(); apply deepcopy at system boundaries

Use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Use Clock seam pattern: clock: Clock | None = None; tests inject FakeClock; services own _lifecycle_lock; timed-out stops mark unrestartable

For untrusted content (SEC-1): use wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML

Use `from synthorg.observability import ge...

Files:

  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/dto.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/experiments/models.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/experiments/service.py
  • src/synthorg/api/state.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/executor.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/dto.py
  • src/synthorg/api/controllers/tasks.py
  • src/synthorg/experiments/models.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/experiments/service.py
  • src/synthorg/api/state.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/executor.py
cli/cmd/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Exit codes: 0 = success, 1 = runtime error, 2 = usage error, 3 = unhealthy (backend/containers), 4 = unreachable (Docker), 10 = updates available (--check)

Files:

  • cli/cmd/new.go
  • cli/cmd/start.go
cli/cmd/new.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

new <kind> <domain> command accepts --dry-run and --overwrite flags and scaffolds a conventions-clean Python file set under src/synthorg/ for a new feature; <kind> is one of service, persistence, tool, controller

Files:

  • cli/cmd/new.go
cli/cmd/start.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

start command accepts flags --no-wait, --timeout, --no-pull, --dry-run, --no-detach, --no-verify

Files:

  • cli/cmd/start.go
{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers

Files:

  • docs/reference/typed-boundaries.md
  • docs/guides/workers-and-background-tasks.md
  • docs/guides/monitoring.md
docs/**/*.{md,d2}

📄 CodeRabbit inference engine (CLAUDE.md)

Use D2 for architecture and nested container diagrams; use Mermaid for flowcharts, sequence diagrams, and pipelines; use Markdown tables for tabular data; pin D2 CLI to v0.7.1 in CI

Files:

  • docs/reference/typed-boundaries.md
  • docs/guides/workers-and-background-tasks.md
  • docs/guides/monitoring.md
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx}: Always use createLogger from @/lib/logger; never use bare console.warn/console.error/console.debug in application code
Always use variable name log for logger instances (e.g., const log = createLogger('module-name'))
Pass dynamic/untrusted values as separate args to logger (not interpolated into the message string) so they go through sanitizeArg
Wrap attacker-controlled fields inside structured objects in sanitizeForLog() before embedding in logs
Callers MUST NOT wrap store mutation calls in try/catch; the store owns the error UX
Display counts must come from data.length; the wire envelope no longer carries total
Import ErrorCode and ErrorCategory from @/api/types/errors (re-exported from web/src/api/types/error-codes.gen.ts); discriminate on ErrorCode.<NAME>, never on raw integer literals
Generated DTO types re-export components['schemas']['X'] with no hand-written tightening because the generator handles response optionality
Import DTOs via the barrel (import type { AgentConfig } from '@/api/types') or directly from the generated module (import type { AgentConfig } from '@/api/types/dtos.gen')
ALWAYS reuse existing components from web/src/components/ui/ before creating new ones
A PostToolUse hook (scripts/check_web_design_system.py) flags hardcoded hex/rgba/fonts/Motion durations/locale literals/bare .toLocale*String() calls/missing Storybook stories/duplicate component patterns/complex .map() blocks; fix every violation before proceeding
Forbid passing async functions where the callsite ignores the returned promise via @typescript-eslint/no-misused-promises (with checksVoidReturn: { attributes: false }); React 19 async event handlers stay allowed

Files:

  • web/src/api/types/openapi.gen.ts
web/src/{stores,api}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Use cursor-based paging via PaginationMeta for list endpoints; keep nextCursor + hasMore in state (not offset arithmetic) and early-return when !hasMore || !nextCursor

Files:

  • web/src/api/types/openapi.gen.ts
web/src/api/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

getLiveness() is always 200 while the process is alive; getReadiness() is 200 healthy / 503 unavailable (binary 'ok' | 'unavailable' outcome, no tri-state). Any new caller must handle the 503 path explicitly.

Files:

  • web/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

NEVER hand-edit a *.gen.ts file; regenerate with uv run python scripts/generate_dto_types_ts.py

Files:

  • web/src/api/types/openapi.gen.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Read design spec from `docs/design/` page before implementing; deviations need approval (see DESIGN_SPEC.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: No region/currency/locale privileged; use metric units; use British English throughout
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Every convention PR must ship its enforcement gate (see docs/reference/convention-gates.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json` or baseline files (`scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`); use `ALLOW_BASELINE_GROWTH=1 git commit` for gate baseline bypass
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: After implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` command; do not use `gh pr create` (hookify-blocked); after PR: use `/aurelio-review-pr` for external feedback; fix all valid issues; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Test execution: xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (`loadfile` prevents 3.14+ Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Commit format: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Branches: `<type>/<slug>` format from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Pre-commit/pre-push hooks configured in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: Squash merge with PR body as squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: After every squash merge, run `/post-merge-cleanup` command
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T21:42:31.315Z
Learning: For Web dashboard development, refer to `web/CLAUDE.md` for detailed guidelines
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • scripts/check_no_migration_framing.py
  • tests/unit/engine/test_task_engine_integration.py
  • tests/unit/scripts/test_check_no_migration_framing.py
  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/dto.py
  • src/synthorg/api/controllers/tasks.py
  • tests/unit/engine/test_approval_gate.py
  • src/synthorg/experiments/models.py
  • src/synthorg/observability/prometheus_recording_streams.py
  • src/synthorg/experiments/service.py
  • src/synthorg/api/state.py
  • src/synthorg/api/controllers/ws.py
  • src/synthorg/workers/execution_service.py
  • tests/unit/experiments/test_service.py
  • src/synthorg/observability/events/workers.py
  • src/synthorg/workers/__main__.py
  • tests/unit/workers/test_executor.py
  • src/synthorg/workers/executor.py
🪛 LanguageTool
docs/guides/workers-and-background-tasks.md

[grammar] ~63-~63: Ensure spelling is correct
Context: ...2. Polls the consumer for claims with a per-fetch timeout. 3. Runs the configured e...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🔇 Additional comments (21)
cli/internal/config/state.go (1)

21-32: LGTM!

Also applies to: 283-284, 287-289

scripts/check_no_migration_framing.py (1)

14-31: LGTM!

Also applies to: 89-133

tests/unit/engine/test_task_engine_integration.py (1)

320-332: LGTM!

tests/unit/scripts/test_check_no_migration_framing.py (1)

456-499: LGTM!

Also applies to: 507-542

cli/cmd/new.go (1)

38-46: LGTM!

cli/cmd/start.go (1)

86-110: LGTM!

src/synthorg/api/dto.py (1)

38-38: LGTM!

Also applies to: 436-484, 760-760, 773-773, 780-780

src/synthorg/api/controllers/tasks.py (1)

8-8: LGTM!

Also applies to: 14-14, 325-371

docs/reference/typed-boundaries.md (1)

3-12: LGTM!

Also applies to: 64-67, 161-274

.pre-commit-config.yaml (1)

31-40: LGTM!

Also applies to: 459-465, 630-634, 648-650

tests/unit/engine/test_approval_gate.py (1)

3-3: LGTM!

Also applies to: 8-8, 12-12, 22-45

docs/guides/workers-and-background-tasks.md (1)

1-135: LGTM!

src/synthorg/experiments/models.py (1)

1-57: LGTM!

src/synthorg/observability/prometheus_recording_streams.py (1)

1-126: LGTM!

src/synthorg/api/state.py (1)

67-68: LGTM!

Also applies to: 118-121, 806-864

src/synthorg/api/controllers/ws.py (1)

683-714: LGTM!

Also applies to: 807-813, 892-895

src/synthorg/workers/execution_service.py (1)

1-176: LGTM!

web/src/api/types/openapi.gen.ts (1)

1608-1659: LGTM!

Also applies to: 3767-3783, 4888-4903, 5304-5311, 5646-5650, 7535-7543, 7579-7611, 8802-8816, 9923-9934, 16014-16146, 20645-20679

src/synthorg/observability/events/workers.py (1)

39-70: LGTM!

tests/unit/workers/test_executor.py (1)

1-197: LGTM!

src/synthorg/workers/executor.py (1)

1-269: LGTM!

Comment thread src/synthorg/api/state.py
Comment thread src/synthorg/engine/task_engine.py Outdated
Comment thread src/synthorg/experiments/service.py Outdated
Comment thread src/synthorg/workers/__main__.py Outdated
Comment thread tests/unit/experiments/test_service.py Outdated
Round 3 batch on PR #1926. CI now green (drain-budget split fix landed).

CodeRabbit submitted a new CHANGES_REQUESTED review with 5 actionable

inline + 1 outside-diff finding + 2 duplicate threads on round-2 work.

All 5+1 valid items + 1 duplicate are applied; the monitoring.md

RS-marker duplicate is skipped for the same reason as round 2 (file is

outside the doc-numeric-macros gate scope).

Fixes:

- api/state.py: add '_lazy_service_lock' to __slots__ so the slot is

  bound declaratively even though the mixin chain provides __dict__

- engine/task_engine: rewrite _drain_all docstring to present-tense

  rationale only (no 'under the previous design' migration framing)

- experiments/service: explicit upper-bound check on register_variant

  (weight > _MAX_VARIANT_WEIGHT) so the EXPERIMENT_VARIANT_INVALID_WEIGHT

  event fires for both bounds; wrap record_assignment in ConflictError

  handler that re-fetches the canonical row when a durable backend

  rejects the insert

- experiments/protocol: document record_assignment's ConflictError

  contract for durable backends

- workers/__main__: gate task_queue.stop() on a 'queue_started' flag so

  a task_queue.start() failure does not leak the owned http_client AND

  does not call stop() on a queue that never bound a consumer;

  refresh _placeholder_executor docstring (no longer reachable on

  missing backend URL since _resolve_executor exits)

- tests/unit/experiments/test_service: _service() now returns the

  FakeClock too so tests stop reaching into svc._clock
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 15, 2026 22:16 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/workers/__main__.py`:
- Around line 233-239: The AsyncClient is created without an explicit timeout;
to match project patterns and make timeouts consistent with
TaskExecutionExecutor's per-request timeout, instantiate httpx.AsyncClient with
an explicit timeout (e.g., 60s) when creating http_client in the factory that
returns TaskExecutionExecutor and http_client; update the http_client creation
where httpx.AsyncClient() is called so the client-level timeout aligns with the
executor's expected request timeout while still allowing TaskExecutionExecutor's
per-request override.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6e05e2a0-ce8c-4139-8c3d-55326ff9889b

📥 Commits

Reviewing files that changed from the base of the PR and between 69540af and d21d9e7.

📒 Files selected for processing (6)
  • src/synthorg/api/state.py
  • src/synthorg/engine/task_engine.py
  • src/synthorg/experiments/protocol.py
  • src/synthorg/experiments/service.py
  • src/synthorg/workers/__main__.py
  • tests/unit/experiments/test_service.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: CodSpeed Web Pass
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: CLI Test (windows-latest)
  • GitHub Check: CLI Bench Regression
  • GitHub Check: Dashboard Test
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
src/synthorg/!(persistence)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL; enforce Persistence Boundary

Files:

  • src/synthorg/experiments/protocol.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/state.py
  • src/synthorg/experiments/service.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Configuration Precedence (MANDATORY): DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env at boot; no os.environ.get outside startup

No Hardcoded Values (MANDATORY): numerics live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal)

No comments citing reviewers, issue back-refs, or migration framing; only explain WHY; enforced by check_no_review_origin_in_code.py + check_no_migration_framing.py

No from __future__ import annotations (3.14 has PEP 649); PEP 758 except: except A, B: no parens unless binding

Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Define errors as <Domain><Condition>Error from DomainError; never inherit Exception/RuntimeError directly; enforced by check_domain_error_hierarchy.py

Use Pydantic v2 frozen + extra="forbid" on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); @computed_field for derived; NotBlankStr for identifiers

Use args models at every system boundary; parse_typed() for every external dict ingestion; enforced by check_boundary_typed.py

Use immutability: model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Clock seam: use clock: Clock | None = None; tests inject FakeClock; lifecycle: services own _lifecycle_lock; timed-out stops mark unrestartable

Wrap untrusted content (SEC-1) with wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML

Import logger via from synthorg.observability import get_logger; variable always logger; never import logging or print() in app code

Use event ...

Files:

  • src/synthorg/experiments/protocol.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/state.py
  • src/synthorg/experiments/service.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/experiments/protocol.py
  • src/synthorg/workers/__main__.py
  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/state.py
  • src/synthorg/experiments/service.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries is blocked by scripts/check_mock_spec.py

FakeClock and mock_of import from tests._shared; inject via clock= and the helper's spec subscript

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/experiments/test_service.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/experiments/test_service.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Use no region/locale privileged; use metric units; use British English for regional defaults
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Every convention PR must ship its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review` (gh pr create is hookify-blocked); after PR: use `/aurelio-review-pr` for external feedback; fix EVERYTHING valid
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: xdist: `-n 8 --dist=loadfile` auto-applied via pyproject addopts; prevents 3.14+ Windows ProactorEventLoop leak
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Use `.pre-commit-config.yaml` for pre-commit/pre-push hooks; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Web: see `web/CLAUDE.md` for detailed guidelines; CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Shell: see `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write rules)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:16:00.473Z
Learning: Python 3.14+ minimum version; BUSL-1.1 → Apache 2.0 after Change Date; src layout with `src/synthorg/`, `tests/`, `web/` (React 19), `cli/` (Go binary)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/experiments/protocol.py
  • src/synthorg/workers/__main__.py
  • tests/unit/experiments/test_service.py
  • src/synthorg/engine/task_engine.py
  • src/synthorg/api/state.py
  • src/synthorg/experiments/service.py
🔇 Additional comments (16)
src/synthorg/api/state.py (1)

67-68: LGTM!

Also applies to: 118-121, 204-204, 210-210, 283-283, 406-420, 807-865

src/synthorg/engine/task_engine.py (2)

387-413: LGTM!


729-739: LGTM!

Also applies to: 789-798, 814-823

src/synthorg/experiments/protocol.py (1)

1-89: LGTM!

src/synthorg/experiments/service.py (4)

1-50: LGTM!


66-104: LGTM!


132-215: LGTM!


231-270: LGTM!

src/synthorg/workers/__main__.py (3)

1-22: LGTM!

Also applies to: 50-63


98-124: LGTM!


170-197: LGTM!

tests/unit/experiments/test_service.py (5)

1-24: LGTM!


27-64: LGTM!


67-110: LGTM!


113-136: LGTM!


139-174: LGTM!

Comment thread src/synthorg/workers/__main__.py Outdated
…line

Round 4 babysit on PR #1926. CodeRabbit posted a single Nitpick on the

round-3 head noting that workers/__main__'s httpx.AsyncClient() was

constructed without an explicit timeout= kwarg, leaving the 5 s

httpx default in place even though every other httpx.AsyncClient(

site in the codebase (a2a/client, app, oauth flows, slack/github

health checks, generic_http, notifications/slack+ntfy, telemetry,

discovery, health_prober, probing, tsa_client) passes a project-

specific timeout= explicitly. The per-request timeout=self.

_timeout_seconds on the executor's http.post does override the

client baseline, so this is consistency hygiene rather than a runtime

bug, but the babysit mandate fixes everything valid.

Fixes:

- workers/executor: rename _DEFAULT_TIMEOUT_SECONDS -> public

  DEFAULT_HTTP_TIMEOUT_SECONDS so the worker entry point can mirror

  the same value without reaching into a private module symbol

- workers/__main__: construct httpx.AsyncClient(timeout=

  DEFAULT_HTTP_TIMEOUT_SECONDS) to match the executor's per-request

  default and the rest of the codebase's pattern
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 15, 2026 22:42 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/workers/executor.py`:
- Around line 104-112: Add logging calls before each constructor validation
raise: for the checks on api_base_url, auth_token, and timeout_seconds in the
Executor constructor (or __init__), call the module/class logger (consistent
with other code paths like _resolve_executor) to emit a WARNING or ERROR with
context including the parameter name and value, then raise the existing
ValueError; ensure you reference the same logger instance used elsewhere (e.g.,
logger or self.logger) and format messages like "Invalid constructor argument:
api_base_url=%r" for consistency.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 67305dec-0837-4975-9fa7-7423e1ada567

📥 Commits

Reviewing files that changed from the base of the PR and between d21d9e7 and c8f2f03.

📒 Files selected for processing (2)
  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/executor.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: CLI Test (windows-latest)
  • GitHub Check: CLI Bench Regression
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Read docs/design/ page before implementing; deviations need approval per DESIGN_SPEC.md

No region/currency/locale privileged; use metric units; use British English per docs/reference/regional-defaults.md

Every convention PR ships its enforcement gate per docs/reference/convention-gates.md

Configuration Precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env at the boot site. YAML is a company-template ingestion format, not a precedence tier. No os.environ.get outside startup; pre-init Cat-2 reads use settings.bootstrap_resolver.resolve_init_value per docs/reference/configuration-precedence.md

No hardcoded numeric values; numerics live in settings/definitions/. Allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of the form NAME: int|float|Final|Final[int]|Final[float] = literal. Enforced by scripts/check_no_magic_numbers.py

No from __future__ import annotations (3.14 has PEP 649). Use PEP 758 except A, B: syntax without parens unless binding

Type hints required on public functions; mypy strict. Use Google-style docstrings. Line length 88; functions <50 lines; files <800 lines

Error classes must follow naming pattern <Domain><Condition>Error and inherit from DomainError, never directly from Exception/RuntimeError/etc. Enforced by check_domain_error_hierarchy.py

Pydantic v2 models must use frozen=True and extra="forbid" on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use @computed_field for derived fields; use NotBlankStr for identifiers

Args models required at every system boundary; use parse_typed() for every external dict ingestion. Enforced by check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); use deepcopy at system boundaries

Async code must use asyncio.TaskGroup for ...

Files:

  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/executor.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/executor.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:41:54.213Z
Learning: Numerics in README and public docs must be sourced from `data/runtime_stats.yaml` via `<!--RS:NAME-->` markers per `data/README.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:41:54.213Z
Learning: Timeout/slow test failures indicate source-code regression; never manually edit `tests/baselines/unit_timing.json`, `scripts/*_baseline.{txt,json}`, or `scripts/_*_baseline.py`. Both families are PreToolUse-blocked. Per-invocation bypass for gate baselines requires explicit user approval: `ALLOW_BASELINE_GROWTH=1 git commit ...`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:41:54.213Z
Learning: API startup lifecycle: two phases: construction (`create_app` body) wires synchronous services; on_startup (`_build_lifecycle.on_startup`) wires services needing connected persistence. Construction-phase ordering: `agent_registry` before `auto_wire_meetings`; `tunnel_provider` wired unconditionally. On-startup ordering: `SettingsService` auto-wire before `WorkflowExecutionObserver` registration; `OntologyService` wires after `persistence.connect()` via `_wire_ontology_service`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:41:54.213Z
Learning: Use `d2` for architecture / nested containers diagrams; use `mermaid` for flowcharts / sequence / pipelines diagrams; use Markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:41:54.213Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced. Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`). Branches: `<type>/<slug>` from main. Squash merge with PR body as squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-15T22:41:54.213Z
Learning: After every squash merge, run `/post-merge-cleanup`. CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/workers/__main__.py
  • src/synthorg/workers/executor.py
🔇 Additional comments (10)
src/synthorg/workers/__main__.py (5)

1-22: LGTM!

Also applies to: 30-31, 35-48


53-66: LGTM!


101-127: LGTM!


173-200: LGTM!


204-246: LGTM!

src/synthorg/workers/executor.py (5)

1-48: LGTM!

Also applies to: 50-63, 66-102, 113-117


119-166: LGTM!


168-236: LGTM!


238-252: LGTM!


254-271: LGTM!

Comment thread src/synthorg/workers/executor.py
…rrors

Round 5 babysit on PR #1926. CodeRabbit posted a single Nitpick on the

round-4 head asking for the project's log-before-raise pattern on the

three constructor validation paths (api_base_url, auth_token,

timeout_seconds). CLAUDE.md mandates the pattern explicitly in the

Logging section.

Fixes:

- observability/events/workers: add WORKERS_EXECUTOR_INVALID_INIT_ARG

  event constant

- workers/executor: logger.warning(WORKERS_EXECUTOR_INVALID_INIT_ARG)

  before each ValueError raise in TaskExecutionExecutor.__init__; the

  auth_token path deliberately omits the value field so a leaked token

  cannot flow into structured logs
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 15, 2026 23:04 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit c29eb32 into main May 16, 2026
93 of 94 checks passed
@Aureliolo Aureliolo deleted the chore/wp7-hygiene-stubs-tooling-docs branch May 16, 2026 00:40
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 16, 2026 00:40 — with GitHub Actions Inactive
Aureliolo pushed a commit that referenced this pull request May 17, 2026
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub
Models). Commit-based changelog below._

### What you'll notice
- Frontend WP-6 update with UX polish improves user interface and
workflow.
- Dashboard and training endpoint improvements enhance observability and
dispatch behavior.
- Web storybook now supports change detection for more responsive UI
interactions.
- Git hooks now isolated per worktree for cleaner repository management.
- Providers automatically detect native streaming support in Litellm
models.

### What's new
- Added a new pipeline to convert Pydantic DTOs to TypeScript for better
front-end compatibility.

### Under the hood
- Refactored settings to three precedence categories, removing YAML tier
for simpler configuration.
- Completed RootConfig mirror coverage for enhanced configuration
consistency.
- Adopted API conventions with better query performance and forbidden
extra fields for stricter validation.
- Improved persistence, layer discipline, and restart safety in core
work packages.
- CI updated with split test jobs and tightened coverage gates for
better test quality.
- Switched to direct Trivy binary for security scans, removing previous
Trivy action dependency.
- Enhanced memory management with per-call processing options and better
observability during speech-to-text encoding.
- Various dependency updates for Python, infrastructure, and lock files
maintain security and stability.
- Removed TypeScript DTO type-tightening overlays to simplify type
management.
- Codebase audit tightened skill sets to prevent false positivity in
class detection by 2026.

<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*
---


##
[0.8.5](v0.8.4...v0.8.5)
(2026-05-17)


### Features

* **codegen:** pydantic-to-typescript DTO pipeline + parity gate (closes
[#1889](#1889))
([#1909](#1909))
([0265ef5](0265ef5))
* **storybook:** enable changeDetection + trim web/CLAUDE.md
([#1939](#1939))
([3b1f4c0](3b1f4c0))
* **web,setup:** WP-6 frontend + UX polish
([#1941](#1941))
([d9ca76d](d9ca76d))


### Bug Fixes

* correct invalid git for-each-ref syntax in post-merge-cleanup skill
([#1946](#1946))
([69a1649](69a1649))
* dashboard polish, training endpoint dispatch, and observability
cleanup ([#1911](#1911))
([b61e9e8](b61e9e8))
* per-worktree git-hook isolation + hookify gate migration + MSW drift
fix ([#1949](#1949))
([e3f8495](e3f8495))
* **providers:** read supports_native_streaming from litellm model info
([#1942](#1942))
([60364ca](60364ca))
* security and audit coverage (closes
[#1883](#1883))
([#1904](#1904))
([d8ebf55](d8ebf55))


### Performance

* **ci:** mypy --num-workers=4 + enable ruff TID255
([#1944](#1944))
([484c1d3](484c1d3))


### Refactoring

* **ci:** drop aquasecurity/trivy-action, use direct trivy binary
([#1940](#1940))
([df1f946](df1f946))
* **memory:** per-call processing_kwargs + observability for ST encode
([#1943](#1943))
([3aa9d20](3aa9d20))
* Phase 7 follow-up — complete RootConfig mirror coverage (closes
[#1907](#1907))
([#1914](#1914))
([605500b](605500b))
* **settings:** collapse precedence to three categories; drop YAML tier
(closes [#1890](#1890))
([#1910](#1910))
([efd54c9](efd54c9))
* WP-3 API conventions + query performance + project-wide extra=forbid
([#1953](#1953))
([504d579](504d579)),
closes [#1918](#1918)
* WP-4 settings + cross-cutting (clock seam, contextvars, dispatch,
plugin surfaces)
([#1954](#1954))
([7207d92](7207d92))
* **wp1:** persistence + layer discipline + restart safety
([#1945](#1945))
([57586fb](57586fb))


### Documentation

* **wp5:** public-facing truth refresh
([#1924](#1924))
([afb5cc5](afb5cc5))


### CI/CD

* split test job by marker with airtight aggregate coverage gate
([#1948](#1948))
([0b818d5](0b818d5)),
closes [#1938](#1938)
[#1937](#1937)


### Maintenance

* **codebase-audit:** tighten skill to prevent 2026-05-15 FP classes
([#1923](#1923))
([9317ed1](9317ed1))
* Lock file maintenance
([#1913](#1913))
([c08a355](c08a355))
* Lock file maintenance
([#1950](#1950))
([8940ab1](8940ab1))
* remove TS DTO type-tightening overlays
([#1915](#1915))
([d296214](d296214)),
closes [#1906](#1906)
* Update Infrastructure dependencies
([#1928](#1928))
([d19fae5](d19fae5))
* Update Python dependencies
([#1929](#1929))
([75cc2c8](75cc2c8))
* **wp7:** hygiene, stubs, test/CI/tooling, doc gaps, boundary patterns
doc ([#1926](#1926))
([c29eb32](c29eb32))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WP-7: Hygiene, Stubs, Test/CI/Tooling, Doc Gaps, Boundary Patterns Doc

1 participant