chore(wp7): hygiene, stubs, test/CI/tooling, doc gaps, boundary patterns doc#1926
Conversation
Section 1a: WARNING logs before 9 raise sites in client/pool.py, client/store.py, workers/claim.py. New events: CLIENT_NOT_FOUND, CLIENT_REQUEST_NOT_FOUND, WORKERS_QUEUE_START_REJECTED, WORKERS_QUEUE_NOT_RUNNING. Section 1b: INFO CLIENT_REQUEST_STATUS_TRANSITIONED at controller caller sites after request_store.save() returns; SPRINT_STATUS_TRANSITIONED in ceremony_scheduler after the active-sprint in-memory update. Tests: test_pool_logging.py + test_store_logging.py + test_claim_lifecycle.py extensions. Refs #1922
…nd gate regex (refs #1922)
…fication checkpoint (refs #1922)
…EST guides + YAML schema reference (refs #1922)
…valContext (refs #1922)
…gres pool metrics (refs #1922)
…ecycle teardown tests (refs #1922)
…icAnalyzer (refs #1922)
…strumentation + Postgres pool stats (refs #1922)
…tead of per-site (refs #1922)
…stgres container (refs #1922)
Pre-reviewed by 20 agents; 14 valid findings addressed (2 critical, 10 major,
2 minor). Verification: ruff/format/mypy/pytest (30,530 passed) + web
lint/type-check/test + go vet/test/build, all green.
Critical:
* ws.py: guard ws_active_connection_count read-modify-write with a
threading.Lock so the Prometheus scrape thread cannot read mid-increment.
* prometheus_recording.py: split WS lifetime + Postgres pool methods into a
new StreamRecordingMixin in prometheus_recording_streams.py so the main
module stays under the 800-line ceiling.
Major:
* rest-api-examples.md: rewrite the auth + every endpoint example for
cookie-based auth (login returns CookieSessionResponse; token is in
HttpOnly Set-Cookie, not response body).
* approval-workflow.md: replace stale event names with the real ones
(api.approval.created / security.approval.{approved,rejected} / etc).
* getting_started.md + CLAUDE.md: clarify that install_cli_tools.sh
installs golangci-lint only; document d2 install separately.
* client/store.py: log SIMULATION_RUN_NOT_FOUND before SimulationStore.get
raises (completes the WP-7 ten-raise-sites-log-first mandate).
* experiments controller: switch list_assignments to HMAC-signed
paginate_cursor flow (encode_cursor / decode_cursor) like tasks.py;
regenerate web/src/api/types/openapi.gen.ts to match.
* experiments service: log EXPERIMENT_VARIANT_INVALID_WEIGHT and
EXPERIMENT_NOT_FOUND before the two raise sites.
* ceremony_scheduler.py: downgrade SPRINT_STATUS_TRANSITIONED to DEBUG
with a comment explaining sprint state is in-memory only.
* persistence/protocol.py: tighten backend.kind return type to
Literal[sqlite|postgres]; update both backend impls + the fake.
* cli/cmd/update.go: drop unnecessary else after return in CLI-update
reexec branch.
Minor:
* check_no_migration_framing.py: extend module docstring with the five
new regex patterns added by this PR.
* workers-and-background-tasks.md: add a Worker authentication subsection
documenting SYNTHORG_WORKER_AUTH_TOKEN, rotation and HTTPS requirements.
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
🧰 Additional context used📓 Path-based instructions (4){src/**/*.py,web/src/**/*.{ts,tsx}}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
⚙️ CodeRabbit configuration file
Files:
src/synthorg/**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
{src/**/*.py,tests/**/*.py}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
🧠 Learnings (2)📓 Common learnings📚 Learning: 2026-05-05T09:04:46.195ZApplied to files:
🔇 Additional comments (7)
WalkthroughImplements an A/B experiments subsystem (models, protocol, in-memory repo, service), controller routes, DTOs and web typings, and unit tests. Adds a worker execution seam and LifecycleAdvancingExecutionService plus TaskExecutionExecutor (HTTP callback) and POST /tasks/{task_id}/execute; wires defaults into AppState and CLI flags for executor selection. Expands observability (event constants, Prometheus mixin/metrics for WebSocket lifetimes and Postgres pool), standardizes LLM sampling/top_p propagation and ModelPinMetadata, exposes persistence backend kind/config, hardens CLI self-update and config-load errors, moves costly pre-commit hooks to pre-push, expands migration-framing checks, and adds multiple docs and tests. |
Merging this PR will degrade performance by 33.9%
|
| Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|
| ❌ | resolveLocale x1000 (full-fallback to APP_LOCALE_FALLBACK) |
135.2 µs | 204.5 µs | -33.9% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing chore/wp7-hygiene-stubs-tooling-docs (c2fa188) with main (afb5cc5)
There was a problem hiding this comment.
Actionable comments posted: 32
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs/guides/monitoring.md (1)
317-318:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the
Audit & Securityrow panel list to include the new audit-log fill-ratio gauge.The new section adds this panel, but the summary table row still omits it, so the guide is internally inconsistent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/monitoring.md` around lines 317 - 318, Update the summary table row that starts with the literal "`Audit & Security`" to include the new "audit-log fill-ratio gauge" in its panel list so the row text matches the newly added panel; edit the table cell that currently reads "Audit chain append rate, depth, last-append age, security verdicts, agent identity version changes, API error categories" and append ", audit-log fill-ratio gauge" (or insert it in the preferred position) so the two references stay consistent.tests/unit/experiments/test_service.py (1)
153-160:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winTest claims recency ordering but never validates order.
test_list_assignments_paginates_in_recency_ordercurrently only checkstotaland page length, so an ordering regression would still pass.Suggested assertion addition
assert total == 5 assert len(page) == 3 + assert [str(item.subject_id) for item in page] == ["u-4", "u-3", "u-2"]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/experiments/test_service.py` around lines 153 - 160, Test currently only verifies total and page length; add an assertion that the returned assignments are in recency order. After calling svc.list_assignments and receiving page, assert that the timestamp field on each assignment in page (e.g., created_at or updated_at on the assignment objects returned by svc.list_assignments) is non-increasing across the page (each item’s timestamp >= next item’s timestamp) to validate recency ordering in test_list_assignments_paginates_in_recency_order.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.pre-commit-config.yaml:
- Around line 626-630: Update the top-level rationale comment that mentions
pre-commit.ci to reflect the new hook staging: change any lines claiming that
"no-review-origin-in-code" and "no-migration-framing" are enforced by
pre-commit.ci to indicate they now run at stages: [pre-push] (or otherwise
reference the pre-push stage), so the header aligns with the actual
configuration (see the existing stages: [pre-push] entry and the named hooks
no-review-origin-in-code and no-migration-framing).
In `@cli/cmd/new.go`:
- Around line 37-45: The RunE for the top-level "synthorg new" command currently
just calls cmd.Help() and returns nil; change it so after calling cmd.Help() it
returns a usage error to signal a CLI usage failure (exit code 2). Concretely,
in the RunE implementation that currently references cobra.NoArgs and calls
cmd.Help(), call cmd.Help() and then return a usage-style error (for example
return cobra.ErrInvalidArgs or a new errors.New("requires a scaffold
kind/domain")) so the command both prints help and exits with a non-zero usage
error.
In `@cli/cmd/start.go`:
- Around line 92-109: The switch that checks err.Error() prefixes should be
replaced with typed/sentinel error checks against the error values/types
exported by the config package (use errors.Is and/or errors.As with symbols from
cli/internal/config, e.g. the package's sentinel errors like ErrParsing,
ErrReading or a ParseError type) so that parsing vs reading vs other config
failures are detected robustly; update the error handling in the function that
currently does the string-prefix switch to call errors.Is(err,
config.ErrReading)/errors.Is(err, config.ErrParsing) or use errors.As to match a
config.ParseError and then return the corresponding wrapped fmt.Errorf messages
(preserving the current user-facing messages and wrapping the original err) and
fall back to fmt.Errorf("config: %w", err) for anything else.
In `@docs/architecture/decisions.md`:
- Around line 132-137: The dated checklist header "Verification checkpoint
(2026-06-10)" and any hardcoded numerics in the new items must be replaced with
runtime-stats markers per the docs numeric-sourcing policy: pull the date/number
from data/runtime_stats.yaml and insert a <!--RS:NAME--> marker instead of the
literal "2026-06-10" (or move the volatile checkpoint into an internal/temporary
doc if you prefer). Update the checklist text in docs/architecture/decisions.md
(the "Verification checkpoint" heading and the numbered items 1–4) to reference
the appropriate <!--RS:...--> keys and ensure the corresponding entries exist in
data/runtime_stats.yaml; if you move the checkpoint out of public docs, note
that change here and remove hardcoded numerics.
In `@docs/getting_started.md`:
- Around line 35-53: Replace hardcoded version numerics in
docs/getting_started.md with runtime-stats markers: change the d2 version string
"v0.7.1" to a marker like <!--RS:D2_VERSION--> and replace any golangci-lint
pinned version mention with a marker like <!--RS:GOLANGCI_LINT_VERSION--> that
sources values from data/runtime_stats.yaml; ensure any other explicit numeric
literals in this section are similarly converted to <!--RS:...--> markers and
update the install guidance to reference scripts/install_cli_tools.sh and the CI
pin in .github/workflows/cli.yml so readers know where the marker values
originate.
In `@docs/guides/a2a-federation.md`:
- Around line 20-27: The table and other doc spots contain hard-coded numeric
defaults (e.g., the `a2a.timeout_seconds` default value of 30 and other literal
ports/status codes); replace each numeric literal with the appropriate
runtime-stat marker `<!--RS:NAME-->` (e.g., swap the "30" default for
`a2a.timeout_seconds` with a matching `<!--RS:...-->` marker name taken from
data/runtime_stats.yaml), update every occurrence noted (the table row for
`a2a.timeout_seconds`, the default/port/status code occurrences referenced in
the comment) to use the marker, and if a suitable marker does not exist add a
descriptive entry to data/runtime_stats.yaml and reference it here using
`<!--RS:NAME-->`; ensure marker names are descriptive and consistent with
existing markers so the public docs render the numeric values at runtime.
In `@docs/guides/approval-workflow.md`:
- Around line 19-26: Replace hardcoded numeric defaults in the table with
runtime-sourced marker comments: remove inline literals like `86400` for
`approval.timeout_seconds` and replace with a marker comment such as
`<!--RS:APPROVAL_TIMEOUT_SECONDS-->` (sourced from data/runtime_stats.yaml); do
the same for any other numeric literals cited in this section (also update
occurrences referenced at 85-89) so the docs use `<!--RS:...-->` markers instead
of literal numbers, keeping the config keys (`approval.timeout_seconds`,
`approval.enabled`, `approval.reviewer_groups`, `approval.notification_target`,
`approval.audit_chain_signing`) intact and only changing the displayed
numeric/default values to marker placeholders.
In `@docs/guides/ceremony-scheduling-tuning.md`:
- Around line 20-27: The table contains literal numeric defaults and flags
(e.g., the numeric/default values for ceremonies[].trigger.threshold,
ceremony_policy.strategy default, and ceremony_policy.auto_transition default)
that must be replaced with runtime-stat markers; update each numeric or boolean
literal in the doc (including other occurrences called out around the file) to
use the <!--RS:NAME--> marker format tied to data/runtime_stats.yaml (create or
reference appropriate NAME keys like RS_CEREMONY_THRESHOLD, RS_STRATEGY_DEFAULT,
RS_AUTO_TRANSITION, etc.), ensure markers replace the literal values in the
table entries and in the other mentioned sections, and keep the marker names
descriptive and added to runtime_stats.yaml if missing.
In `@docs/guides/cost-attribution.md`:
- Around line 12-18: Replace the hardcoded numeric estimates in
docs/guides/cost-attribution.md (the table rows for Provider, Model, Agent,
Project and the other occurrences noted at lines 43-57 and 84-99) with
runtime-stats markers that reference data/runtime_stats.yaml using the
<!--RS:NAME--> syntax; for each numeric value (e.g., ~10, ~50, 80, 95) create or
use an appropriate key in data/runtime_stats.yaml and replace the literal in the
markdown with the corresponding <!--RS:KEY--> marker so the docs-numerics gate
is satisfied and all public docs pull numbers from runtime_stats.
In `@docs/guides/custom-mcp-server-dev.md`:
- Line 8: Replace the hardcoded "200+ tools across 15 domain modules" text in
docs/guides/custom-mcp-server-dev.md with runtime-stats markers that pull values
from data/runtime_stats.yaml using the <!--RS:...--> syntax (e.g.,
<!--RS:TOOLS_COUNT--> and <!--RS:DOMAIN_MODULES_COUNT-->); update the sentence
that mentions "200+ tools" and "15 domain modules" to reference those markers so
the public doc numbers are sourced at runtime from runtime_stats.yaml.
In `@docs/guides/custom-rules-and-meta-loop.md`:
- Around line 8-115: The doc contains hardcoded numeric literals (e.g., the
"five" retry example in the intro, minutes=60 in HighFailRateRule, threshold 0.3
in the rule logic, 0.15 in the sample config, and the test outcomes counts 4/6)
that must be replaced with runtime-stats markers; update the guide to use
<!--RS:...--> markers (or remove precise numbers) sourced from
data/runtime_stats.yaml for each numeric claim referenced near HighFailRateRule,
the config block under meta_loop.rules, the example test vectors in
test_high_fail_rate, and the top-level policy example (retry limit), so readers
and the docs gate get values from runtime_stats instead of hardcoded literals.
In `@docs/guides/monitoring.md`:
- Line 104: The docs row for the metric `synthorg_security_audit_log_fill_ratio`
(and the nearby section around lines referenced 227-281) uses hardcoded numeric
values (e.g., 0.9, 0.75, 10m, 0..1); replace each hardcoded numeric with a
runtime-stats marker (<!--RS:NAME-->) that references a key in
data/runtime_stats.yaml (create/update entries in data/runtime_stats.yaml for
each numeric like alert_threshold_audit_log_fill_ratio, warning_threshold_...,
retention_window_minutes, value_range_audit_log, etc.), update the markdown
table/JSON/YAML to use those markers, and ensure marker names are unique and
documented so the build pulls values automatically.
In `@docs/guides/ontology-extension.md`:
- Around line 10-109: The guide contains hardcoded numeric literals that must be
sourced via runtime-stats markers; update the document sections (e.g.,
"Concepts", "Registering a new term", "Worked example", "Adding a profile", and
anywhere the example snippets reference numeric counts) to replace any numeric
literals with <!--RS:NAME--> markers tied to data/runtime_stats.yaml, or else
rephrase to avoid explicit numbers; ensure markers follow the exact format
<!--RS:KEY--> and map to an entry in runtime_stats.yaml so rendered prompts (and
references like FINANCE_PROFILE, InjectionProfile, and PROFILE =
FINANCE_PROFILE) include referenced runtime stats instead of hardcoded numerics.
In `@docs/guides/rest-api-examples.md`:
- Around line 8-10: Replace hardcoded numeric literals in the shown prose (e.g.,
the default backend port "3001" in "$BASE defaults to http://localhost:3001",
the "10" in "10 most common operations", and any other
counts/limits/percentages/pagination defaults) with runtime-stats markers like
<!--RS:NAME--> that map to data/runtime_stats.yaml; update occurrences around
the API mount string "/api/v1", the "$BASE" example, and any mentions of
response types (e.g., "ApiResponse<T>", "PaginatedResponse<T>") so examples
interpolate the markers instead of hard numbers, and ensure the README/public
docs follow the numeric-sourcing rule across the other reported sections.
In `@docs/guides/workers-and-background-tasks.md`:
- Around line 81-85: The fenced code block showing worker logs is missing a
language tag which triggers MD040; update the fenced block in
workers-and-background-tasks.md (the block containing
"workers.worker.claim_received task_id=task-A") by adding a language label such
as text after the opening triple backticks (e.g., ```text) so the snippet is
lint-clean; ensure the closing fence remains ``` and no other content is
changed.
In `@docs/reference/typed-boundaries.md`:
- Around line 57-60: The table mixes true parse_typed-enforced boundaries with
informational/non-parse_typed entries (e.g., extract_tool_calls, parse_payload,
invoke) which conflicts with the page text and lint guard claim of six
parse_typed boundaries; update the doc by either splitting the table into two
sections ("parse_typed-enforced" vs "informational/lenient"), or mark those rows
with a clear column/annotation (e.g., "enforcement: lenient" or "not
parse_typed") and adjust the surrounding text and lint-guard description to
reflect the actual count and enforcement scope so readers see that
extract_tool_calls, parse_payload, and invoke are not parse_typed-enforced.
In `@docs/reference/yaml-schema.md`:
- Around line 43-47: The docs reference shows hardcoded numeric literals (e.g.,
the `version` field value `1`) that must be replaced with runtime-stats markers;
update the `version` table entry and all other numeric literals referenced
(around lines 87-103, 159-162, 205-210) to use the <!--RS:NAME--> marker format
that pulls values from data/runtime_stats.yaml (create a meaningful NAME for
each value, e.g., RS_SCHEMA_VERSION for the `version` integer) and ensure the
marker replaces the literal in the markdown and the corresponding key exists in
data/runtime_stats.yaml.
In `@scripts/check_no_migration_framing.py`:
- Around line 104-109: The compiled regex for the "were previously ..." gate
currently lists only inlined|duplicated|scattered|extracted and therefore omits
the documented forbidden verbs; update the re.compile(...) pattern (the second
argument shown in the diff) to include owned, emitted, and wrapped in the
alternation (e.g. add |owned|emitted|wrapped) so phrases like "were previously
owned/emitted/wrapped" are matched; keep the existing word boundaries and
IGNORECASE flag intact.
In `@src/synthorg/api/controllers/tasks.py`:
- Around line 325-331: The POST endpoint decorator for the task execute route
currently relies on the default 201 response; update the `@post` decorator on the
"/{task_id:str}/execute" endpoint (the one decorated with guards
require_write_access and per_op_rate_limit_from_policy("tasks.execute",
key="user")) to explicitly set status_code=200 so the handler for executing an
existing task returns HTTP 200 OK instead of 201 Created.
In `@src/synthorg/api/controllers/ws.py`:
- Around line 707-711: The gauge write is happening outside the lock, so
concurrent opens/closes can overwrite it with a stale value; move the call to
app_state.prometheus_collector.set_ws_active_connections(count=current) into the
same critical section protected by _WS_CONNECTION_COUNT_LOCK immediately after
updating state_dict["ws_active_connection_count"] (do this for both the
increment path where current is computed and the decrement path that updates the
counter) so the computed current and the gauge write occur atomically.
In `@src/synthorg/api/dto.py`:
- Around line 435-479: The module defines public DTO classes
RegisterExperimentVariantRequest, AssignExperimentRequest, and
ExecuteTaskRequest but does not include them in the module export list; add
these class names to the module's __all__ so wildcard imports and the public
surface remain consistent (also check and add any other DTOs mentioned around
the other region referenced, e.g., the classes between lines 752-781). Update
the __all__ list to include "RegisterExperimentVariantRequest",
"AssignExperimentRequest", and "ExecuteTaskRequest".
- Line 441: The inline upper bound le=1000 on the weight Field should be
replaced with a module-level annotated constant following the existing pattern
(e.g. define _MAX_SELECTION_WEIGHT: int = 1000 near other _MAX_* constants) and
then reference that constant in the weight Field (use le=_MAX_SELECTION_WEIGHT);
update the constant name to be descriptive and typed with int so it follows the
project's allowlist of module-level numeric constants and keeps Field(...) free
of hardcoded literals.
In `@src/synthorg/api/state.py`:
- Around line 810-846: The lazy-initialisation of _worker_execution_service and
_experiment_service can race; wrap their first-access construction in a lock
using a double-checked pattern so only one instance is created. For both the
worker_execution_service property (creating LifecycleAdvancingExecutionService
and assigning _worker_execution_service) and the experiment_service property
(creating ExperimentService and assigning _experiment_service) acquire a shared
instance-level lock (e.g. self._init_lock or reuse an existing lock field),
re-check the backing attribute inside the lock, then construct and assign; add a
threading.Lock import/initialiser if no lock exists and keep the
set_worker_execution_service/_set_once behaviour unchanged.
In `@src/synthorg/experiments/models.py`:
- Line 29: Replace the hardcoded literal le=1000 in the weight Field by
extracting 1000 into a module-level annotated constant (e.g., MAX_WEIGHT: int =
1000) placed under your settings/definitions area, then import that constant
into src/synthorg/experiments/models.py and use it as le=MAX_WEIGHT in the
weight: int = Field(...) declaration so the numeric is no longer hardcoded and
remains type-annotated and centrally configurable.
In `@src/synthorg/experiments/service.py`:
- Around line 147-177: The assign() flow has a race between
_repo.get_assignment(...) and _repo.record_assignment(...); make recording
idempotent by handling concurrent-insert conflicts: after creating assignment
via _choose_variant(...)/ExperimentAssignment, call
_repo.record_assignment(assignment) and if it raises a unique-conflict/duplicate
error, catch it, call _repo.get_assignment(experiment, subject_id) again and
return that recorded assignment instead of failing or duplicating;
alternatively, if your repository supports an atomic upsert/insert-if-not-exists
API, use that in place of record_assignment to ensure only the first writer
wins. Use symbols assign(), _repo.get_assignment, _repo.record_assignment,
_choose_variant, and ExperimentAssignment when locating where to apply this
change.
In `@src/synthorg/observability/prometheus_recording_streams.py`:
- Around line 53-56: The code currently checks for finiteness but still allows
negative durations to be observed; add a non-negative guard before writing to
the histogram by raising or rejecting negatives. For the block that updates
self._ws_connection_lifetime (the method that calls
require_finite("record_ws_connection_lifetime: duration_sec", duration_sec) and
then self._ws_connection_lifetime.labels(...).observe(duration_sec)), add a
check like if duration_sec < 0: raise ValueError(...) (or call a
require_non_negative helper) and do the same change in the analogous method that
updates self._ws_connection_latency (the block around lines 92-95) so no
negative duration is ever passed to the histogram observe() call.
In `@src/synthorg/workers/__main__.py`:
- Around line 172-175: Resolve the executor before starting the task queue to
avoid leaking resources: call _resolve_executor(args) and capture executor and
http_client before invoking task_queue.start(), so that any SystemExit raised
inside _resolve_executor does not occur after the queue has been started; then
start the queue and enter the try/finally that calls run_worker_pool(...) and
ensures task_queue.stop() in the finally block. Ensure both start/stop ordering
is corrected in the _main_ startup sequence where run_worker_pool is invoked.
- Around line 200-205: Before raising SystemExit for missing args.api_base_url
and args.auth_token in src/synthorg/workers/__main__.py, emit an ERROR (or
WARNING) log with the same message and context; replace the two bare raises in
the api_base_url/auth_token checks with a logger.error(...) call (include the
msg string, the problematic flag name like "--executor http" and the related env
var names SYNTHORG_API_BASE_URL / SYNTHORG_WORKER_AUTH_TOKEN) immediately before
calling raise SystemExit(msg), so the checks around args.api_base_url and
args.auth_token produce structured logs prior to exiting.
In `@src/synthorg/workers/execution_service.py`:
- Around line 114-117: Before raising NotFoundError in the missing-task path,
add a structured log call that records the task_id and any relevant context;
specifically, in the block around self._task_engine.get_task(task_id) where task
is None, emit a processLogger.warning or self.logger.warning (whichever logger
is used in this module) with a message like "Task not found" and include task_id
and any tracing/context before raising NotFoundError(msg). Ensure the log level
is WARNING/ERROR and the logged fields match other error-path logs for
consistency.
In `@src/synthorg/workers/executor.py`:
- Around line 213-217: The code incorrectly maps terminal_status == "failed" to
TaskClaimStatus.FAILED which violates the executor contract that any 2xx +
terminal status ("completed"/"cancelled"/"failed") should return
TaskClaimStatus.SUCCESS; update the conditional logic around terminal_status to
return TaskClaimStatus.SUCCESS for "failed" (and generally for all terminal
statuses) instead of TaskClaimStatus.FAILED—e.g., replace the ternary that
yields TaskClaimStatus.FAILED when terminal_status == "failed" with logic that
yields TaskClaimStatus.SUCCESS for terminal_status in
{"completed","cancelled","failed"} and only return non-success for non-terminal
error cases; ensure the change touches the branch currently returning
TaskClaimStatus.FAILED so that TaskClaimStatus.SUCCESS is returned for
terminal_status == "failed".
- Line 118: The code interpolates claim.task_id directly into the path when
building url in executor.py (url =
f"{self._base_url}/api/v1/tasks/{claim.task_id}/execute"); change this to
URL-encode the task id first (e.g., import urllib.parse and use
urllib.parse.quote(str(claim.task_id), safe='') or similar) and then interpolate
the encoded value into self._base_url so reserved characters (including slashes)
cannot produce malformed paths.
In `@tests/unit/engine/test_approval_gate.py`:
- Around line 20-23: Replace the bare MagicMock usage for the typed boundary
with the approved helper: change the parked fixture creation from
MagicMock(spec=ParkedContext) to mock_of[ParkedContext](...) so the mock
respects the typed-boundary rule; ensure you import mock_of from tests._shared
if missing and keep the parked.id = "parked-1" override (or pass id="parked-1"
into mock_of) so behavior remains identical.
---
Outside diff comments:
In `@docs/guides/monitoring.md`:
- Around line 317-318: Update the summary table row that starts with the literal
"`Audit & Security`" to include the new "audit-log fill-ratio gauge" in its
panel list so the row text matches the newly added panel; edit the table cell
that currently reads "Audit chain append rate, depth, last-append age, security
verdicts, agent identity version changes, API error categories" and append ",
audit-log fill-ratio gauge" (or insert it in the preferred position) so the two
references stay consistent.
In `@tests/unit/experiments/test_service.py`:
- Around line 153-160: Test currently only verifies total and page length; add
an assertion that the returned assignments are in recency order. After calling
svc.list_assignments and receiving page, assert that the timestamp field on each
assignment in page (e.g., created_at or updated_at on the assignment objects
returned by svc.list_assignments) is non-increasing across the page (each item’s
timestamp >= next item’s timestamp) to validate recency ordering in
test_list_assignments_paginates_in_recency_order.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b11bb13d-117b-41b0-9df3-76b307167940
📒 Files selected for processing (96)
.pre-commit-config.yamlCLAUDE.mdcli/cmd/new.gocli/cmd/start.gocli/cmd/update.gocli/internal/selfupdate/updater.godocs/architecture/decisions.mddocs/getting_started.mddocs/guides/a2a-federation.mddocs/guides/approval-workflow.mddocs/guides/ceremony-scheduling-tuning.mddocs/guides/cost-attribution.mddocs/guides/custom-mcp-server-dev.mddocs/guides/custom-rules-and-meta-loop.mddocs/guides/dynamic-scoring.mddocs/guides/monitoring.mddocs/guides/ontology-extension.mddocs/guides/rest-api-examples.mddocs/guides/webhook-management.mddocs/guides/workers-and-background-tasks.mddocs/reference/typed-boundaries.mddocs/reference/yaml-schema.mdscripts/check_no_migration_framing.pysrc/synthorg/api/controllers/__init__.pysrc/synthorg/api/controllers/_department_health.pysrc/synthorg/api/controllers/experiments.pysrc/synthorg/api/controllers/requests.pysrc/synthorg/api/controllers/simulations.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/api/dto.pysrc/synthorg/api/etag.pysrc/synthorg/api/lifecycle_builder.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/api/state.pysrc/synthorg/budget/_aggregation.pysrc/synthorg/budget/enforcer.pysrc/synthorg/client/pool.pysrc/synthorg/client/store.pysrc/synthorg/core/concurrency/cas_retry.pysrc/synthorg/core/state_machine.pysrc/synthorg/engine/quality/decomposers/llm.pysrc/synthorg/engine/workflow/ceremony_scheduler.pysrc/synthorg/engine/workspace/config.pysrc/synthorg/engine/workspace/semantic_llm.pysrc/synthorg/experiments/__init__.pysrc/synthorg/experiments/in_memory_repository.pysrc/synthorg/experiments/models.pysrc/synthorg/experiments/protocol.pysrc/synthorg/experiments/service.pysrc/synthorg/integrations/mcp_catalog/in_memory_installations.pysrc/synthorg/llm/__init__.pysrc/synthorg/llm/metadata.pysrc/synthorg/memory/procedural/models.pysrc/synthorg/memory/procedural/proposer.pysrc/synthorg/meta/evolution/outcome_models.pysrc/synthorg/meta/evolution/outcome_store_protocol.pysrc/synthorg/meta/mcp/handlers/common_args.pysrc/synthorg/observability/events/client.pysrc/synthorg/observability/events/experiments.pysrc/synthorg/observability/events/workers.pysrc/synthorg/observability/events/workflow.pysrc/synthorg/observability/prometheus_collector.pysrc/synthorg/observability/prometheus_labels.pysrc/synthorg/observability/prometheus_push_metrics.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/persistence/_shared/pagination.pysrc/synthorg/persistence/postgres/backend.pysrc/synthorg/persistence/protocol.pysrc/synthorg/persistence/sqlite/backend.pysrc/synthorg/providers/models.pysrc/synthorg/security/config.pysrc/synthorg/security/llm_evaluator.pysrc/synthorg/workers/__main__.pysrc/synthorg/workers/claim.pysrc/synthorg/workers/execution_service.pysrc/synthorg/workers/executor.pytests/conformance/persistence/conftest.pytests/conformance/persistence/test_backup_round_trip.pytests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.pytests/unit/client/test_pool_logging.pytests/unit/client/test_store_logging.pytests/unit/engine/test_approval_gate.pytests/unit/experiments/__init__.pytests/unit/experiments/test_service.pytests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.pytests/unit/observability/test_events.pytests/unit/persistence/test_protocol.pytests/unit/providers/conftest.pytests/unit/providers/test_models.pytests/unit/scripts/test_check_no_migration_framing.pytests/unit/workers/test_claim_lifecycle.pytests/unit/workers/test_executor.pyweb/src/__tests__/stores/artifacts.test.tsweb/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build Backend
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: Lighthouse Site
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (21)
cli/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Use the appropriate hint tier based on intent:
HintErrorfor error recovery (always visible unless quiet),HintNextStepfor natural next action or destructive-action feedback,HintTipfor config automation suggestions (deduplicates within session), andHintGuidancefor flag/feature discovery (invisible in default auto mode)CLI is a Go binary; use
go -C clinevercd cli. For shell rules see~/.claude/rules/common/bash.md(canonical forcd/git -C/ Bash file-write rules). Seecli/CLAUDE.mdfor CLI-specific conventions
Files:
cli/cmd/new.gocli/cmd/start.gocli/internal/selfupdate/updater.gocli/cmd/update.go
cli/cmd/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Exit codes: 0 = success, 1 = runtime error, 2 = usage error, 3 = unhealthy (backend/containers), 4 = unreachable (Docker), 10 = updates available (
--check)
Files:
cli/cmd/new.gocli/cmd/start.gocli/cmd/update.go
cli/cmd/new.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
new <kind> <domain>command accepts--dry-runand--overwriteflags and scaffolds a conventions-clean Python file set undersrc/synthorg/for a new feature;<kind>is one ofservice,persistence,tool,controller
Files:
cli/cmd/new.go
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Read
docs/design/page before implementing; deviations need approvalPresent every plan for accept/deny before coding
No region/currency/locale privileged; use metric units; British English spelling
Every convention PR must ship its enforcement gate
Configuration precedence: DB > env > code default via
SettingsService/ConfigResolver(Cat-1) or env > code default (Cat-2,read_only_post_init); Cat-3 bootstrap secrets are pure env. Noos.environ.getoutside startup; pre-init Cat-2 reads usesettings.bootstrap_resolver.resolve_init_valueNo hardcoded numeric values; numerics live in
settings/definitions/. Allowlist: 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal)Never edit
tests/baselines/unit_timing.jsonor anyscripts/*_baseline.{txt,json}/scripts/_*_baseline.py; timeout/slow failures are source-code regressions. Per-invocation bypass for gate baselines:ALLOW_BASELINE_GROWTH=1 git commit ...(requires explicit user approval)No
from __future__ import annotations(Python 3.14 has PEP 649). Use PEP 758 exception syntax:except A, B:without parens unless bindingType hints required on public functions; mypy strict mode. Use Google-style docstrings. Line length 88; functions <50 lines; files <800 lines
Error classes must follow
<Domain><Condition>Errornaming fromDomainErrorbase; never inherit directly fromException/RuntimeError/etcPydantic v2 frozen models with
extra="forbid"on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use@computed_fieldfor derived values; useNotBlankStrfor identifiersUse args models at every system boundary; call
parse_typed()for every external dict ingestionFor immutability use
model_copy(update=...)orcopy.deepcopy(); deepcopy at system boundariesUse
asyncio.TaskGroupfor fan-out/fan-in; helpers must catchExceptionand re-raise `Memor...
Files:
tests/unit/client/test_store_logging.pytests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.pysrc/synthorg/api/controllers/requests.pytests/unit/observability/test_events.pysrc/synthorg/meta/evolution/outcome_models.pysrc/synthorg/meta/evolution/outcome_store_protocol.pysrc/synthorg/observability/events/experiments.pysrc/synthorg/llm/__init__.pysrc/synthorg/core/concurrency/cas_retry.pysrc/synthorg/engine/workspace/config.pysrc/synthorg/meta/mcp/handlers/common_args.pysrc/synthorg/memory/procedural/models.pysrc/synthorg/engine/quality/decomposers/llm.pysrc/synthorg/security/llm_evaluator.pytests/unit/providers/test_models.pysrc/synthorg/api/lifecycle_builder.pysrc/synthorg/api/controllers/simulations.pysrc/synthorg/observability/events/workflow.pytests/conformance/persistence/conftest.pytests/unit/providers/conftest.pysrc/synthorg/workers/__main__.pysrc/synthorg/observability/events/client.pysrc/synthorg/experiments/models.pysrc/synthorg/persistence/sqlite/backend.pysrc/synthorg/integrations/mcp_catalog/in_memory_installations.pysrc/synthorg/security/config.pysrc/synthorg/experiments/protocol.pysrc/synthorg/client/pool.pysrc/synthorg/memory/procedural/proposer.pysrc/synthorg/engine/workspace/semantic_llm.pysrc/synthorg/experiments/__init__.pysrc/synthorg/api/etag.pysrc/synthorg/llm/metadata.pysrc/synthorg/persistence/postgres/backend.pysrc/synthorg/core/state_machine.pysrc/synthorg/budget/enforcer.pysrc/synthorg/api/controllers/_department_health.pysrc/synthorg/budget/_aggregation.pytests/unit/workers/test_executor.pysrc/synthorg/api/controllers/__init__.pytests/conformance/persistence/test_backup_round_trip.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/persistence/protocol.pytests/unit/experiments/test_service.pysrc/synthorg/api/rate_limits/policies.pytests/unit/client/test_pool_logging.pyscripts/check_no_migration_framing.pytests/unit/scripts/test_check_no_migration_framing.pysrc/synthorg/client/store.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/workers/claim.pytests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.pytests/unit/persistence/test_protocol.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/engine/workflow/ceremony_scheduler.pysrc/synthorg/persistence/_shared/pagination.pysrc/synthorg/experiments/in_memory_repository.pysrc/synthorg/workers/execution_service.pysrc/synthorg/observability/prometheus_labels.pytests/unit/engine/test_approval_gate.pysrc/synthorg/observability/events/workers.pysrc/synthorg/observability/prometheus_collector.pysrc/synthorg/api/controllers/experiments.pysrc/synthorg/providers/models.pysrc/synthorg/api/state.pysrc/synthorg/api/dto.pysrc/synthorg/experiments/service.pysrc/synthorg/observability/prometheus_push_metrics.pysrc/synthorg/workers/executor.pytests/unit/workers/test_claim_lifecycle.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use markers:
@pytest.mark.{unit,integration,e2e,slow}. Async tests useauto. Global timeout 30s. Coverage minimum 80%Use
FakeClockfromtests._sharedfor Clock seam; usemock_of[T](**overrides)fromtests._sharedfor typed-boundary substitutions; useSimpleNamespacefor attribute-bags. BareMagicMockat typed boundaries is forbidden (enforced byscripts/check_mock_spec.py, zero-tolerance)Use Hypothesis with 10 deterministic CI examples; failures are real bugs (fix + add
@example(...))Never skip or xfail flaky tests; fix fundamentally. Use
asyncio.Event().wait()instead ofsleep(large_timeout)
Files:
tests/unit/client/test_store_logging.pytests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.pytests/unit/observability/test_events.pytests/unit/providers/test_models.pytests/conformance/persistence/conftest.pytests/unit/providers/conftest.pytests/unit/workers/test_executor.pytests/conformance/persistence/test_backup_round_trip.pytests/unit/experiments/test_service.pytests/unit/client/test_pool_logging.pytests/unit/scripts/test_check_no_migration_framing.pytests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.pytests/unit/persistence/test_protocol.pytests/unit/engine/test_approval_gate.pytests/unit/workers/test_claim_lifecycle.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/client/test_store_logging.pytests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.pytests/unit/observability/test_events.pytests/unit/providers/test_models.pytests/conformance/persistence/conftest.pytests/unit/providers/conftest.pytests/unit/workers/test_executor.pytests/conformance/persistence/test_backup_round_trip.pytests/unit/experiments/test_service.pytests/unit/client/test_pool_logging.pytests/unit/scripts/test_check_no_migration_framing.pytests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.pytests/unit/persistence/test_protocol.pytests/unit/engine/test_approval_gate.pytests/unit/workers/test_claim_lifecycle.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use
uv syncfor dependency management; layout followssrc/synthorg/(src layout),tests/,web/(React 19),cli/(Go binary). Python 3.14+. BUSL-1.1 → Apache 2.0 after Change Date
Files:
src/synthorg/api/controllers/requests.pysrc/synthorg/meta/evolution/outcome_models.pysrc/synthorg/meta/evolution/outcome_store_protocol.pysrc/synthorg/observability/events/experiments.pysrc/synthorg/llm/__init__.pysrc/synthorg/core/concurrency/cas_retry.pysrc/synthorg/engine/workspace/config.pysrc/synthorg/meta/mcp/handlers/common_args.pysrc/synthorg/memory/procedural/models.pysrc/synthorg/engine/quality/decomposers/llm.pysrc/synthorg/security/llm_evaluator.pysrc/synthorg/api/lifecycle_builder.pysrc/synthorg/api/controllers/simulations.pysrc/synthorg/observability/events/workflow.pysrc/synthorg/workers/__main__.pysrc/synthorg/observability/events/client.pysrc/synthorg/experiments/models.pysrc/synthorg/persistence/sqlite/backend.pysrc/synthorg/integrations/mcp_catalog/in_memory_installations.pysrc/synthorg/security/config.pysrc/synthorg/experiments/protocol.pysrc/synthorg/client/pool.pysrc/synthorg/memory/procedural/proposer.pysrc/synthorg/engine/workspace/semantic_llm.pysrc/synthorg/experiments/__init__.pysrc/synthorg/api/etag.pysrc/synthorg/llm/metadata.pysrc/synthorg/persistence/postgres/backend.pysrc/synthorg/core/state_machine.pysrc/synthorg/budget/enforcer.pysrc/synthorg/api/controllers/_department_health.pysrc/synthorg/budget/_aggregation.pysrc/synthorg/api/controllers/__init__.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/persistence/protocol.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/client/store.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/workers/claim.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/engine/workflow/ceremony_scheduler.pysrc/synthorg/persistence/_shared/pagination.pysrc/synthorg/experiments/in_memory_repository.pysrc/synthorg/workers/execution_service.pysrc/synthorg/observability/prometheus_labels.pysrc/synthorg/observability/events/workers.pysrc/synthorg/observability/prometheus_collector.pysrc/synthorg/api/controllers/experiments.pysrc/synthorg/providers/models.pysrc/synthorg/api/state.pysrc/synthorg/api/dto.pysrc/synthorg/experiments/service.pysrc/synthorg/observability/prometheus_push_metrics.pysrc/synthorg/workers/executor.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/api/controllers/requests.pysrc/synthorg/meta/evolution/outcome_models.pysrc/synthorg/meta/evolution/outcome_store_protocol.pysrc/synthorg/observability/events/experiments.pysrc/synthorg/llm/__init__.pysrc/synthorg/core/concurrency/cas_retry.pysrc/synthorg/engine/workspace/config.pysrc/synthorg/meta/mcp/handlers/common_args.pysrc/synthorg/memory/procedural/models.pysrc/synthorg/engine/quality/decomposers/llm.pysrc/synthorg/security/llm_evaluator.pysrc/synthorg/api/lifecycle_builder.pysrc/synthorg/api/controllers/simulations.pysrc/synthorg/observability/events/workflow.pysrc/synthorg/workers/__main__.pysrc/synthorg/observability/events/client.pysrc/synthorg/experiments/models.pysrc/synthorg/persistence/sqlite/backend.pysrc/synthorg/integrations/mcp_catalog/in_memory_installations.pysrc/synthorg/security/config.pysrc/synthorg/experiments/protocol.pysrc/synthorg/client/pool.pysrc/synthorg/memory/procedural/proposer.pysrc/synthorg/engine/workspace/semantic_llm.pysrc/synthorg/experiments/__init__.pysrc/synthorg/api/etag.pysrc/synthorg/llm/metadata.pysrc/synthorg/persistence/postgres/backend.pysrc/synthorg/core/state_machine.pysrc/synthorg/budget/enforcer.pysrc/synthorg/api/controllers/_department_health.pysrc/synthorg/budget/_aggregation.pysrc/synthorg/api/controllers/__init__.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/persistence/protocol.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/client/store.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/workers/claim.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/engine/workflow/ceremony_scheduler.pysrc/synthorg/persistence/_shared/pagination.pysrc/synthorg/experiments/in_memory_repository.pysrc/synthorg/workers/execution_service.pysrc/synthorg/observability/prometheus_labels.pysrc/synthorg/observability/events/workers.pysrc/synthorg/observability/prometheus_collector.pysrc/synthorg/api/controllers/experiments.pysrc/synthorg/providers/models.pysrc/synthorg/api/state.pysrc/synthorg/api/dto.pysrc/synthorg/experiments/service.pysrc/synthorg/observability/prometheus_push_metrics.pysrc/synthorg/workers/executor.py
**/*.{md,txt}
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers
Files:
docs/guides/custom-mcp-server-dev.mddocs/guides/custom-rules-and-meta-loop.mddocs/guides/ontology-extension.mddocs/architecture/decisions.mddocs/getting_started.mddocs/guides/dynamic-scoring.mdCLAUDE.mddocs/guides/a2a-federation.mddocs/guides/ceremony-scheduling-tuning.mddocs/guides/webhook-management.mddocs/guides/cost-attribution.mddocs/guides/approval-workflow.mddocs/guides/monitoring.mddocs/guides/rest-api-examples.mddocs/reference/yaml-schema.mddocs/reference/typed-boundaries.mddocs/guides/workers-and-background-tasks.md
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Use
d2for architecture / nested containers,mermaidfor flowcharts / sequence / pipelines. Use Markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI
Files:
docs/guides/custom-mcp-server-dev.mddocs/guides/custom-rules-and-meta-loop.mddocs/guides/ontology-extension.mddocs/architecture/decisions.mddocs/getting_started.mddocs/guides/dynamic-scoring.mdCLAUDE.mddocs/guides/a2a-federation.mddocs/guides/ceremony-scheduling-tuning.mddocs/guides/webhook-management.mddocs/guides/cost-attribution.mddocs/guides/approval-workflow.mddocs/guides/monitoring.mddocs/guides/rest-api-examples.mddocs/reference/yaml-schema.mddocs/reference/typed-boundaries.mddocs/guides/workers-and-background-tasks.md
cli/cmd/start.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
startcommand accepts flags--no-wait,--timeout,--no-pull,--dry-run,--no-detach,--no-verify
Files:
cli/cmd/start.go
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Always usecreateLoggerfrom@/lib/logger; never use bareconsole.warn/console.error/console.debugin application code
Always use variable namelogfor logger instances (e.g.,const log = createLogger('module-name'))
Pass dynamic/untrusted values as separate args to logger (not interpolated into the message string) so they go throughsanitizeArg
Wrap attacker-controlled fields inside structured objects insanitizeForLog()before embedding in logs
Callers MUST NOT wrap store mutation calls intry/catch; the store owns the error UX
Display counts must come fromdata.length; the wire envelope no longer carriestotal
ImportErrorCodeandErrorCategoryfrom@/api/types/errors(re-exported fromweb/src/api/types/error-codes.gen.ts); discriminate onErrorCode.<NAME>, never on raw integer literals
Generated DTO types re-exportcomponents['schemas']['X']with no hand-written tightening because the generator handles response optionality
Import DTOs via the barrel (import type { AgentConfig } from '@/api/types') or directly from the generated module (import type { AgentConfig } from '@/api/types/dtos.gen')
ALWAYS reuse existing components fromweb/src/components/ui/before creating new ones
A PostToolUse hook (scripts/check_web_design_system.py) flags hardcoded hex/rgba/fonts/Motion durations/locale literals/bare.toLocale*String()calls/missing Storybook stories/duplicate component patterns/complex.map()blocks; fix every violation before proceeding
Forbid passing async functions where the callsite ignores the returned promise via@typescript-eslint/no-misused-promises(withchecksVoidReturn: { attributes: false }); React 19asyncevent handlers stay allowed
Files:
web/src/__tests__/stores/artifacts.test.tsweb/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
web/src/{test-setup,__tests__}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Boot
test-setup.tsxwithonUnhandledRequest: 'error'; override per-case viaserver.use(...), nevervi.mock('@/api/endpoints/*')
Files:
web/src/__tests__/stores/artifacts.test.ts
web/src/**/__tests__/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Every unit test runs under
web/test-infra/active-handle-tracker.tswith zero tolerance for leaking event-loop-holding resources; no ceiling, no buffer
Files:
web/src/__tests__/stores/artifacts.test.ts
web/src/__tests__/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Forbid unawaited promises via
@typescript-eslint/no-floating-promisesso async work cannot survive the test that scheduled it and trip the active-handle gate
Files:
web/src/__tests__/stores/artifacts.test.ts
web/**/*.{tsx,ts}
📄 CodeRabbit inference engine (CLAUDE.md)
Web is React 19. For Web Dashboard Design System see
web/CLAUDE.md. Reuseweb/src/components/ui/; design tokens only
Files:
web/src/__tests__/stores/artifacts.test.tsweb/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
tests/conformance/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Dual-backend conformance tests must use
backendfixture (SQLite + Postgres). Enforced bycheck_dual_backend_test_parity.py
Files:
tests/conformance/persistence/conftest.pytests/conformance/persistence/test_backup_round_trip.py
src/synthorg/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Only
src/synthorg/persistence/may import sqlite/psycopg or emit raw SQLRepository CRUD methods:
save(entity),get(id),delete(id) -> bool,list_items(...),query(...)returning tuplesFor datetime in persistence use
parse_iso_utc/format_iso_utcfrompersistence._shared(reject naive); usenormalize_utcfor already-typed values
Files:
src/synthorg/persistence/sqlite/backend.pysrc/synthorg/persistence/postgres/backend.pysrc/synthorg/persistence/protocol.pysrc/synthorg/persistence/_shared/pagination.py
cli/cmd/update.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
updatecommand accepts flags--dry-run,--no-restart,--timeout,--cli-only,--images-only,--check
Files:
cli/cmd/update.go
web/src/{stores,api}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use cursor-based paging via
PaginationMetafor list endpoints; keepnextCursor+hasMorein state (not offset arithmetic) and early-return when!hasMore || !nextCursor
Files:
web/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
web/src/api/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
getLiveness()is always 200 while the process is alive;getReadiness()is 200 healthy / 503 unavailable (binary'ok' | 'unavailable'outcome, no tri-state). Any new caller must handle the 503 path explicitly.
Files:
web/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
NEVER hand-edit a
*.gen.tsfile; regenerate withuv run python scripts/generate_dto_types_ts.py
Files:
web/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T20:41:11.727Z
Learning: After every issue: create branch, commit, push (no auto-PR); use `/pre-pr-review` command (gh pr create is hookify-blocked). After PR creation: use `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T20:41:11.727Z
Learning: After every squash merge run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T20:41:11.727Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard and REST API, not CLI
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T20:41:11.727Z
Learning: Commits follow `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); enforced by commitizen. Branches follow `<type>/<slug>` pattern. Squash merge only; PR body becomes squash commit message. Trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T20:41:11.727Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/client/test_store_logging.pytests/unit/api/rate_limits/test_in_memory_inflight_lifecycle.pysrc/synthorg/api/controllers/requests.pytests/unit/observability/test_events.pysrc/synthorg/meta/evolution/outcome_models.pysrc/synthorg/meta/evolution/outcome_store_protocol.pysrc/synthorg/observability/events/experiments.pysrc/synthorg/llm/__init__.pysrc/synthorg/core/concurrency/cas_retry.pysrc/synthorg/engine/workspace/config.pysrc/synthorg/meta/mcp/handlers/common_args.pysrc/synthorg/memory/procedural/models.pysrc/synthorg/engine/quality/decomposers/llm.pysrc/synthorg/security/llm_evaluator.pytests/unit/providers/test_models.pysrc/synthorg/api/lifecycle_builder.pysrc/synthorg/api/controllers/simulations.pysrc/synthorg/observability/events/workflow.pytests/conformance/persistence/conftest.pytests/unit/providers/conftest.pysrc/synthorg/workers/__main__.pysrc/synthorg/observability/events/client.pysrc/synthorg/experiments/models.pysrc/synthorg/persistence/sqlite/backend.pysrc/synthorg/integrations/mcp_catalog/in_memory_installations.pysrc/synthorg/security/config.pysrc/synthorg/experiments/protocol.pysrc/synthorg/client/pool.pysrc/synthorg/memory/procedural/proposer.pysrc/synthorg/engine/workspace/semantic_llm.pysrc/synthorg/experiments/__init__.pysrc/synthorg/api/etag.pysrc/synthorg/llm/metadata.pysrc/synthorg/persistence/postgres/backend.pysrc/synthorg/core/state_machine.pysrc/synthorg/budget/enforcer.pysrc/synthorg/api/controllers/_department_health.pysrc/synthorg/budget/_aggregation.pytests/unit/workers/test_executor.pysrc/synthorg/api/controllers/__init__.pytests/conformance/persistence/test_backup_round_trip.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/persistence/protocol.pytests/unit/experiments/test_service.pysrc/synthorg/api/rate_limits/policies.pytests/unit/client/test_pool_logging.pyscripts/check_no_migration_framing.pytests/unit/scripts/test_check_no_migration_framing.pysrc/synthorg/client/store.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/workers/claim.pytests/unit/integrations/mcp_catalog/test_in_memory_installations_lifecycle.pytests/unit/persistence/test_protocol.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/engine/workflow/ceremony_scheduler.pysrc/synthorg/persistence/_shared/pagination.pysrc/synthorg/experiments/in_memory_repository.pysrc/synthorg/workers/execution_service.pysrc/synthorg/observability/prometheus_labels.pytests/unit/engine/test_approval_gate.pysrc/synthorg/observability/events/workers.pysrc/synthorg/observability/prometheus_collector.pysrc/synthorg/api/controllers/experiments.pysrc/synthorg/providers/models.pysrc/synthorg/api/state.pysrc/synthorg/api/dto.pysrc/synthorg/experiments/service.pysrc/synthorg/observability/prometheus_push_metrics.pysrc/synthorg/workers/executor.pytests/unit/workers/test_claim_lifecycle.py
🪛 LanguageTool
docs/getting_started.md
[uncategorized] ~45-~45: The official name of this software platform is spelled with a capital “H”.
Context: ...golangci-lint version that matches CI (.github/workflows/cli.yml`). Re-run only after ...
(GITHUB)
docs/guides/ceremony-scheduling-tuning.md
[style] ~101-~101: ‘by mistake’ might be wordy. Consider a shorter alternative.
Context: ...| Two strategies enabled simultaneously by mistake | Confirm ceremony_policy.strategy is...
(EN_WORDINESS_PREMIUM_BY_MISTAKE)
docs/guides/webhook-management.md
[uncategorized] ~32-~32: The official name of this software platform is spelled with a capital “H”.
Context: ...ter and POST Register a receiver for a github connection: ```python from synthorg.i...
(GITHUB)
[style] ~68-~68: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... 401 on a signature mismatch. - 409 on a replay older than the replay window. ...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
docs/guides/cost-attribution.md
[style] ~27-~27: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: .../agent). - GET /api/v1/costs/records` returns the raw record stream (paginated). ```...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
docs/guides/workers-and-background-tasks.md
[grammar] ~63-~63: Ensure spelling is correct
Context: ...2. Polls the consumer for claims with a per-fetch timeout. 3. Runs the configured e...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.22.1)
docs/guides/workers-and-background-tasks.md
[warning] 81-81: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
| **Verification checkpoint (2026-06-10):** on this date run the checklist below and update this section with the outcome (mark each item Done / Not done / Outcome). | ||
|
|
||
| 1. Inspect `nats-io/nats.py` open PRs and recent releases on GitHub for the `inspect.iscoroutinefunction` fix. | ||
| 2. If a fixed release is available: bump the `nats-py` pin in `pyproject.toml`, drop the matching `filterwarnings` entry, run `uv run python -m pytest tests/ -m integration -k nats` to confirm warnings are gone, and replace this checkpoint section with the resolution outcome. | ||
| 3. If no fixed release exists: implement the local monkey-patch in `src/synthorg/communication/bus/_nats_compat.py` (one-line `nats.aio.client.iscoroutinefunction = inspect.iscoroutinefunction`), import it at bus initialisation, and extend this section with the patch landing date. | ||
| 4. Re-evaluate `nats-core` JetStream support: a maintained alternative removes the entire mitigation requirement. |
There was a problem hiding this comment.
Checkpoint numerics should follow docs numeric-sourcing policy.
The newly added dated checklist contains hardcoded numerics/date values without <!--RS:...--> markers. Please source these numerics through runtime-stats markers (or move volatile checkpoint timing out of public docs).
As per coding guidelines, “Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/architecture/decisions.md` around lines 132 - 137, The dated checklist
header "Verification checkpoint (2026-06-10)" and any hardcoded numerics in the
new items must be replaced with runtime-stats markers per the docs
numeric-sourcing policy: pull the date/number from data/runtime_stats.yaml and
insert a <!--RS:NAME--> marker instead of the literal "2026-06-10" (or move the
volatile checkpoint into an internal/temporary doc if you prefer). Update the
checklist text in docs/architecture/decisions.md (the "Verification checkpoint"
heading and the numbered items 1–4) to reference the appropriate <!--RS:...-->
keys and ensure the corresponding entries exist in data/runtime_stats.yaml; if
you move the checkpoint out of public docs, note that change here and remove
hardcoded numerics.
| ## Install external CLI tools (one-time per machine) | ||
|
|
||
| Some gates and the docs build rely on external binaries that are not Python packages: `golangci-lint` (Go linter, used by the CLI) and `d2` (architecture diagram renderer). | ||
|
|
||
| Install `golangci-lint` once per machine: | ||
|
|
||
| ```bash | ||
| bash scripts/install_cli_tools.sh | ||
| ``` | ||
|
|
||
| The script downloads the pinned `golangci-lint` version that matches CI (`.github/workflows/cli.yml`). Re-run only after bumping the pinned version; subsequent `uv sync` invocations do NOT re-run the script. CI uses its own action-based install step, so this is strictly a local-developer convenience. | ||
|
|
||
| Install `d2` separately (the docs job pins `v0.7.1`). The fastest path is the upstream installer: | ||
|
|
||
| ```bash | ||
| curl -fsSL https://d2lang.com/install.sh | sh -s -- --version v0.7.1 | ||
| ``` | ||
|
|
||
| On Windows, install via `winget install Terrastruct.d2` or download the release archive from `https://github.com/terrastruct/d2/releases`. Either way, ensure the resulting `d2` binary is on `PATH`; the docs build invokes it directly. |
There was a problem hiding this comment.
New numeric doc content needs <!--RS:...--> markers.
This section introduces hardcoded numeric values (including version numerics) without runtime-stats markers. Please source numeric literals via <!--RS:...--> or rewrite to avoid explicit numerics.
As per coding guidelines, “Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers”.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~45-~45: The official name of this software platform is spelled with a capital “H”.
Context: ...golangci-lint version that matches CI (.github/workflows/cli.yml`). Re-run only after ...
(GITHUB)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/getting_started.md` around lines 35 - 53, Replace hardcoded version
numerics in docs/getting_started.md with runtime-stats markers: change the d2
version string "v0.7.1" to a marker like <!--RS:D2_VERSION--> and replace any
golangci-lint pinned version mention with a marker like
<!--RS:GOLANGCI_LINT_VERSION--> that sources values from
data/runtime_stats.yaml; ensure any other explicit numeric literals in this
section are similarly converted to <!--RS:...--> markers and update the install
guidance to reference scripts/install_cli_tools.sh and the CI pin in
.github/workflows/cli.yml so readers know where the marker values originate.
There was a problem hiding this comment.
Code Review
This pull request introduces an A/B experiment registry for deterministic variant assignment, implements an HTTP-callback task executor for distributed workers, and enhances observability with new Prometheus metrics for WebSocket connections and Postgres connection pools. It also includes extensive documentation updates, refines CLI error handling and help text, and tightens linting rules against forensic prose. Feedback from the review suggests simplifying WebSocket connection tracking by using atomic Prometheus gauge operations instead of manual locks, removing redundant local imports, and cleaning up a redundant modulo operation in the experiment hashing logic.
| # Process-wide guard for the active-WS read-modify-write below. Two | ||
| # WS handler coroutines never interleave inside a single sync block, | ||
| # but Prometheus' metrics scrape thread (run by the ASGI server's | ||
| # background scraper) can read ``app.state`` concurrently with a | ||
| # handler that is mid-increment; the lock keeps the increment + store | ||
| # pair indivisible from any concurrent reader's perspective. | ||
| _WS_CONNECTION_COUNT_LOCK = threading.Lock() |
There was a problem hiding this comment.
The process-wide _WS_CONNECTION_COUNT_LOCK and the manual tracking of ws_active_connection_count in app.state are redundant. Since ws_handler is an async coroutine, multiple instances running on the same event loop will not interleave during the synchronous execution of these recording helpers. Additionally, the ws_active_connections metric is a Prometheus Gauge, which is thread-safe and supports atomic inc() and dec() operations. Using the gauge directly would simplify the implementation and eliminate the need for manual synchronization.
| # lifetime into the ``synthorg_ws_connection_lifetime_seconds`` | ||
| # histogram. ``time.monotonic`` so a wall-clock NTP step does not | ||
| # push the bucket bound. | ||
| import time # noqa: PLC0415 |
| material = f"{experiment}\x1f{subject_id}".encode() | ||
| digest = hashlib.sha256(material).digest() | ||
| raw = int.from_bytes(digest[:4], byteorder="big", signed=False) | ||
| return (raw % _HASH_BUCKET_MAX) % modulus |
There was a problem hiding this comment.
The modulo operation raw % _HASH_BUCKET_MAX is redundant. raw is an unsigned 32-bit integer initialized from 4 bytes, so its value is already within the range [0, 2**32). Since _HASH_BUCKET_MAX is defined as 2**32, the expression raw % _HASH_BUCKET_MAX always evaluates to raw.
| return (raw % _HASH_BUCKET_MAX) % modulus | |
| return raw % modulus |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1926 +/- ##
==========================================
- Coverage 85.20% 85.15% -0.06%
==========================================
Files 1834 1846 +12
Lines 107196 107752 +556
Branches 9245 9287 +42
==========================================
+ Hits 91341 91751 +410
- Misses 13618 13755 +137
- Partials 2237 2246 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…I drain race Round 2 batch on PR #1926 (refs #1922). Bundles the CI test failure fix, 22 valid CodeRabbit inline findings, 3 Gemini inline findings, and 2 CodeRabbit outside-diff comments. RS-marker findings on docs not in the doc-numeric-macros gate scope (only 5 files are scanned) are factually wrong against the actual gate definition and skipped with reasons in the round-history entry. CI fix: - engine/task_engine._drain_all: split effective_timeout evenly between processing- and observer-drain stages so a slow processing cancellation no longer starves observer drain into a zero-budget call (test_drain_timeout_resolves_pending_futures was failing under CI load even at 500ms because remaining ~= 0 after _drain_processing) ws.py (CodeRabbit + Gemini collapsed): - Drop _WS_CONNECTION_COUNT_LOCK + state_dict counter, use atomic Gauge.inc()/.dec() via new collector methods. prometheus_client Gauge is thread-safe so the scrape thread cannot tear a count. - Drop the two redundant local 'import time' bindings (already at top) Other source/test fixes: - cli/cmd/new.go: bare 'synthorg new' returns ExitUsage after Help() - cli/internal/config/state.go: typed sentinels ErrReading/ErrParsing - cli/cmd/start.go: errors.Is(ErrReading/ErrParsing) replaces strings.HasPrefix - scripts/check_no_migration_framing.py: regex adds owned|emitted|wrapped - api/controllers/tasks.py: explicit status_code=HTTP_200_OK on /execute - api/dto.py: extract _MAX_SELECTION_WEIGHT, add 3 DTOs to __all__ - api/state.py: double-checked locking on lazy worker_execution/experiment_service - experiments/models.py: extract _MAX_VARIANT_WEIGHT - experiments/service.py: drop no-op intermediate modulo, re-fetch after record_assignment so concurrent writers converge on the canonical first-write timestamp - observability/prometheus_recording_streams: require_non_negative on ws_connection_lifetime + pg_pool_acquire duration; add inc_/dec_ atoms - observability/events/workers: WORKERS_MAIN_INVALID_EXECUTOR_CONFIG event - workers/__main__: resolve executor BEFORE task_queue.start(); log before SystemExit on both missing-config paths - workers/execution_service: logger.warning before NotFoundError raise - workers/executor: urllib.parse.quote(task_id); terminal 'failed' status now maps to SUCCESS to match the module docstring contract - web/src/api/types/*.gen.ts: regenerated for the execute-endpoint status_code=200 change Test updates: - tests/unit/engine/test_approval_gate: ParkedContext is Pydantic so create_autospec(spec_set=True) cannot see its fields; build a real instance at the typed boundary - tests/unit/engine/test_task_engine_integration: refresh budget-choice comment to describe the new even split - tests/unit/experiments/test_service: advance FakeClock between assigns + assert strictly decreasing timestamps - tests/unit/scripts/test_check_no_migration_framing: parametrize the three new verbs (owned/emitted/wrapped) - tests/unit/workers/test_executor: rename + reassert terminal-failed returns SUCCESS to match the fixed contract Docs: - docs/guides/monitoring.md: include audit-log fill-ratio gauge in the Audit & Security row - docs/guides/workers-and-background-tasks.md: 'text' fence language - docs/reference/typed-boundaries.md: split intro into parse_typed- enforced vs informational/lenient groups - .pre-commit-config.yaml: refresh header re no-review-origin-in-code + no-migration-framing pre-push staging + CI counterpart
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/synthorg/workers/__main__.py (1)
174-189:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClose owned HTTP client when queue startup fails.
If Line 178 (
task_queue.start()) throws, control never enters the currenttry/finally, so the createdhttpx.AsyncClientis left unclosed.Proposed fix
task_queue = JetStreamTaskQueue( queue_config=queue_config, nats_config=nats_config, ) - await task_queue.start() + queue_started = False try: + await task_queue.start() + queue_started = True await run_worker_pool( queue_config=queue_config, task_queue=task_queue, executor=executor, worker_count=args.workers, ) finally: if http_client is not None: await http_client.aclose() - await task_queue.stop() + if queue_started: + await task_queue.stop()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/workers/__main__.py` around lines 174 - 189, If task_queue.start() raises an exception the existing finally block is never reached and the created http_client is leaked; wrap the startup call so that any exception during task_queue.start() will first close the owned http_client (await http_client.aclose() when http_client is not None) and then re-raise the error. Concretely, after constructing task_queue and http_client, call await task_queue.start() inside a try/except that on exception awaits http_client.aclose() if present and then raises, then proceed into the existing try/finally that calls run_worker_pool and stops the task_queue; reference symbols: http_client, task_queue.start(), JetStreamTaskQueue, run_worker_pool.
♻️ Duplicate comments (2)
src/synthorg/experiments/service.py (1)
150-191:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
assign()still has a first-write race that can fail instead of replaying canonical assignment.At Line 179, a concurrent insert conflict from
record_assignment()will raise before the re-fetch path executes, so the method is not reliably idempotent under contention. Catch duplicate/unique-conflict from the repository and then re-readget_assignment(...)to return the canonical winner.Suggested direction
await self._repo.record_assignment(assignment) + # If another worker won the insert race, fetch and replay canonical row. + # Replace `AssignmentConflictError` with the repository's concrete + # duplicate/unique-conflict exception type. + # try: + # await self._repo.record_assignment(assignment) + # except AssignmentConflictError: + # canonical = await self._repo.get_assignment( + # experiment=experiment, + # subject_id=subject_id, + # ) + # if canonical is not None: + # return canonical🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/experiments/service.py` around lines 150 - 191, The assign() flow has a race where record_assignment(...) can raise a duplicate/unique constraint error before the re-fetch, so wrap the call to self._repo.record_assignment(assignment) in an exception handler that catches the repository’s duplicate/unique-conflict error, and in that handler call self._repo.get_assignment(experiment=experiment, subject_id=subject_id) and return the canonical assignment if found (otherwise re-raise or convert to a clear error). Ensure you reference the same ExperimentAssignment, assignment, canonical, and the self._repo.get_assignment/record_assignment methods so the fix only handles the specific conflict and preserves the re-fetch semantics.docs/guides/monitoring.md (1)
104-104:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReplace hardcoded numerics with runtime-stats markers in new docs content.
The newly added metric description, PromQL commentary, alert rule, and Grafana JSON still hardcode values (
0.9,0.75,1,10m, etc.). This will drift and violate docs numeric sourcing policy.As per coding guidelines "
{README.md,docs/**/*.md}: Numerics in README and public docs must be sourced fromdata/runtime_stats.yamlvia<!--RS:NAME-->markers".Also applies to: 227-281
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/monitoring.md` at line 104, The new metric `synthorg_security_audit_log_fill_ratio` and its associated PromQL examples, alert rule, and Grafana JSON still use hardcoded numbers (e.g. 0.9, 0.75, 1, 10m); replace each hardcoded numeric with a runtime-stats marker <!--RS:NAME--> that references the corresponding key in data/runtime_stats.yaml (create sensible keys like audit_log_alert_threshold, audit_log_warn_threshold, audit_log_max_entries, audit_log_window if they don't exist), update the markdown description, PromQL commentary, alert rule and panel JSON to use those markers, and ensure the marker names match entries in runtime_stats.yaml so the docs build pulls values at render time.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/api/state.py`:
- Around line 204-205: AppState's __init__ assigns self._lazy_service_lock but
'_lazy_service_lock' is not listed in the class __slots__, causing
AttributeError; add the string '_lazy_service_lock' to the AppState.__slots__
tuple alongside '_experiment_service' and '_fine_tune_orchestrator' (and any
other slot entries) so the attribute can be set, leaving the existing __init__
assignment for _lazy_service_lock unchanged.
In `@src/synthorg/engine/task_engine.py`:
- Around line 393-404: The _drain_all docstring contains migration-framing and
historical narrative; rewrite the docstring for function _drain_all to remove
past-tense/migration language and instead state the current rationale concisely
(why the budget is split and what guarantee it provides). Locate the docstring
that describes splitting the budget between processing and observer drains and
replace the historical explanation with a present-tense rationale such as: the
budget is split evenly between processing and observer drains to ensure each
stage receives at least effective_timeout/2, preventing observer starve and
avoiding outer-guard deadline races; keep references to effective_timeout,
_drain_processing, and observer queue behavior for context. Ensure no reviewer
citations, issue references, or migration history remain.
In `@src/synthorg/experiments/service.py`:
- Around line 80-95: register_variant() currently only checks weight < 1 and
lets Pydantic's ExperimentVariant(...) raise a ValidationError for weight >
_MAX_VARIANT_WEIGHT, bypassing the EXPERIMENT_VARIANT_INVALID_WEIGHT log and
domain error contract; add an explicit upper-bound check (e.g., if weight >
_MAX_VARIANT_WEIGHT) before constructing ExperimentVariant to log the same
EXPERIMENT_VARIANT_INVALID_WEIGHT with experiment/variant/weight and raise the
same service ValidationError(msg) ("weight must be >= 1" should be adjusted to a
suitable message like "weight must be between 1 and {_MAX_VARIANT_WEIGHT}") so
all invalid-weight paths use the same logging and exception behavior.
In `@src/synthorg/workers/__main__.py`:
- Around line 53-55: The docstring for _placeholder_executor incorrectly states
it is used "when no backend URL is configured"; update that comment to reflect
actual behaviour: the placeholder executor is only used when the operator
explicitly passes "--executor placeholder" and not when the backend URL is
missing because _resolve_executor() exits in that case. Locate the docstring
near the _placeholder_executor definition and/or its usage in _resolve_executor,
remove or replace the phrase "or when no backend URL is configured" with wording
that says it is only a smoke-test/fallback when the operator explicitly requests
the placeholder executor (e.g., via "--executor placeholder").
In `@tests/unit/experiments/test_service.py`:
- Around line 20-27: The test reaches into ExperimentService's private attribute
svc._clock making it brittle; modify the helper _service() so it returns the
FakeClock along with the ExperimentService and repository (e.g., return svc,
repo, clock) and update tests to use that returned FakeClock instead of
accessing svc._clock; locate _service() and the test assertions that reference
svc._clock and replace those accesses with the directly returned FakeClock
instance (FakeClock and ExperimentService are the key symbols to update).
---
Outside diff comments:
In `@src/synthorg/workers/__main__.py`:
- Around line 174-189: If task_queue.start() raises an exception the existing
finally block is never reached and the created http_client is leaked; wrap the
startup call so that any exception during task_queue.start() will first close
the owned http_client (await http_client.aclose() when http_client is not None)
and then re-raise the error. Concretely, after constructing task_queue and
http_client, call await task_queue.start() inside a try/except that on exception
awaits http_client.aclose() if present and then raises, then proceed into the
existing try/finally that calls run_worker_pool and stops the task_queue;
reference symbols: http_client, task_queue.start(), JetStreamTaskQueue,
run_worker_pool.
---
Duplicate comments:
In `@docs/guides/monitoring.md`:
- Line 104: The new metric `synthorg_security_audit_log_fill_ratio` and its
associated PromQL examples, alert rule, and Grafana JSON still use hardcoded
numbers (e.g. 0.9, 0.75, 1, 10m); replace each hardcoded numeric with a
runtime-stats marker <!--RS:NAME--> that references the corresponding key in
data/runtime_stats.yaml (create sensible keys like audit_log_alert_threshold,
audit_log_warn_threshold, audit_log_max_entries, audit_log_window if they don't
exist), update the markdown description, PromQL commentary, alert rule and panel
JSON to use those markers, and ensure the marker names match entries in
runtime_stats.yaml so the docs build pulls values at render time.
In `@src/synthorg/experiments/service.py`:
- Around line 150-191: The assign() flow has a race where record_assignment(...)
can raise a duplicate/unique constraint error before the re-fetch, so wrap the
call to self._repo.record_assignment(assignment) in an exception handler that
catches the repository’s duplicate/unique-conflict error, and in that handler
call self._repo.get_assignment(experiment=experiment, subject_id=subject_id) and
return the canonical assignment if found (otherwise re-raise or convert to a
clear error). Ensure you reference the same ExperimentAssignment, assignment,
canonical, and the self._repo.get_assignment/record_assignment methods so the
fix only handles the specific conflict and preserves the re-fetch semantics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f8a810c7-270e-4e6c-aa58-3c755d4b1508
📒 Files selected for processing (26)
.pre-commit-config.yamlcli/cmd/new.gocli/cmd/start.gocli/internal/config/state.godocs/guides/monitoring.mddocs/guides/workers-and-background-tasks.mddocs/reference/typed-boundaries.mdscripts/check_no_migration_framing.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/api/dto.pysrc/synthorg/api/state.pysrc/synthorg/engine/task_engine.pysrc/synthorg/experiments/models.pysrc/synthorg/experiments/service.pysrc/synthorg/observability/events/workers.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/workers/__main__.pysrc/synthorg/workers/execution_service.pysrc/synthorg/workers/executor.pytests/unit/engine/test_approval_gate.pytests/unit/engine/test_task_engine_integration.pytests/unit/experiments/test_service.pytests/unit/scripts/test_check_no_migration_framing.pytests/unit/workers/test_executor.pyweb/src/api/types/openapi.gen.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build Backend
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: Test (Python 3.14)
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (18)
cli/internal/config/state.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Define
Default{Postgres,NATS}ImageTagand matchingDefault{Postgres,NATS}ImageDigestconstants incli/internal/config/state.gowith a// renovate:annotation on the tag constant for Renovate customManager to keep both tag and digest current
Files:
cli/internal/config/state.go
cli/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Use the appropriate hint tier based on intent:
HintErrorfor error recovery (always visible unless quiet),HintNextStepfor natural next action or destructive-action feedback,HintTipfor config automation suggestions (deduplicates within session), andHintGuidancefor flag/feature discovery (invisible in default auto mode)For CLI (Go binary) development, refer to
cli/CLAUDE.mdand usego -C cli(nevercd cli)
Files:
cli/internal/config/state.gocli/cmd/new.gocli/cmd/start.go
cli/internal/{config,verify}/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Overriding any of
registry_host,image_repo_prefix,dhi_registry,postgres_image_tag, ornats_image_tagdisables image signature + SLSA verification for that invocation only and writes a stderr warning on every invocation (not suppressed by--quietor--json)
Files:
cli/internal/config/state.go
cli/internal/config/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
When both
SYNTHORG_DATABASE_URLandSYNTHORG_DB_PATHare present,SYNTHORG_DATABASE_URLwins (Postgres is initialised; SQLite path is ignored); malformed URL raises loudly at startup rather than silently falling back
Files:
cli/internal/config/state.go
!(providers/presets.py|web/public/provider-logos/**|.claude/**)
📄 CodeRabbit inference engine (CLAUDE.md)
Vendor-agnostic naming: NEVER use real vendor names in project code/tests; use
example-provider,test-provider,example-{large,medium,small}-001; allowed only in.claude/, third-party imports,providers/presets.py,web/public/provider-logos/
Files:
cli/internal/config/state.goscripts/check_no_migration_framing.pytests/unit/engine/test_task_engine_integration.pytests/unit/scripts/test_check_no_migration_framing.pysrc/synthorg/engine/task_engine.pycli/cmd/new.gocli/cmd/start.gosrc/synthorg/api/dto.pysrc/synthorg/api/controllers/tasks.pydocs/reference/typed-boundaries.md.pre-commit-config.yamltests/unit/engine/test_approval_gate.pydocs/guides/workers-and-background-tasks.mdsrc/synthorg/experiments/models.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/experiments/service.pysrc/synthorg/api/state.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/workers/execution_service.pyweb/src/api/types/openapi.gen.tstests/unit/experiments/test_service.pydocs/guides/monitoring.mdsrc/synthorg/observability/events/workers.pysrc/synthorg/workers/__main__.pytests/unit/workers/test_executor.pysrc/synthorg/workers/executor.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; asyncauto; timeout 30s global; coverage 80% minWindows tests: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: follow ladder in conventions.md section 12.1; use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags; bareMagicMockat typed boundaries blocked bycheck_mock_spec.py; import fromtests._shared
Files:
tests/unit/engine/test_task_engine_integration.pytests/unit/scripts/test_check_no_migration_framing.pytests/unit/engine/test_approval_gate.pytests/unit/experiments/test_service.pytests/unit/workers/test_executor.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/engine/test_task_engine_integration.pytests/unit/scripts/test_check_no_migration_framing.pytests/unit/engine/test_approval_gate.pytests/unit/experiments/test_service.pytests/unit/workers/test_executor.py
src/synthorg/!(persistence)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Only
src/synthorg/persistence/may import sqlite/psycopg or emit raw SQL
Files:
src/synthorg/engine/task_engine.pysrc/synthorg/api/dto.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/experiments/models.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/experiments/service.pysrc/synthorg/api/state.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/workers/execution_service.pysrc/synthorg/observability/events/workers.pysrc/synthorg/workers/__main__.pysrc/synthorg/workers/executor.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Configuration precedence: DB > env > code default via
SettingsService/ConfigResolver(Cat-1) or env > code default (Cat-2,read_only_post_init); Cat-3 bootstrap secrets are pure env; usesettings.bootstrap_resolver.resolve_init_valuefor pre-init Cat-2 reads; YAML is ingestion format only, not a precedence tier; noos.environ.getoutside startupNumerics live in
settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of the formNAME: int|float|Final|Final[int]|Final[float] = literalComments must explain WHY only; no reviewer citations, issue back-references, or migration framing
Do not use
from __future__ import annotations(Python 3.14 has PEP 649); use PEP 758 except:except A, B:requires parentheses when bindingType hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Errors must follow
<Domain><Condition>Errornaming fromDomainErrorbase; never inherit directly fromException/RuntimeError/etc (enforced bycheck_domain_error_hierarchy.py)Pydantic v2 API DTOs must be frozen with
extra='forbid'(for Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use@computed_fieldfor derived values; useNotBlankStrfor identifiersUse args models at every system boundary; apply
parse_typed()for every external dict ingestion (enforced bycheck_boundary_typed.py)Use immutability with
model_copy(update=...)orcopy.deepcopy(); apply deepcopy at system boundariesUse
asyncio.TaskGroupfor fan-out/fan-in; helpers catchException(re-raiseMemoryError/RecursionError)Use Clock seam pattern:
clock: Clock | None = None; tests injectFakeClock; services own_lifecycle_lock; timed-out stops mark unrestartableFor untrusted content (SEC-1): use
wrap_untrusted()fromengine.prompt_safety; useHTMLParseGuardfor HTMLUse `from synthorg.observability import ge...
Files:
src/synthorg/engine/task_engine.pysrc/synthorg/api/dto.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/experiments/models.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/experiments/service.pysrc/synthorg/api/state.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/workers/execution_service.pysrc/synthorg/observability/events/workers.pysrc/synthorg/workers/__main__.pysrc/synthorg/workers/executor.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/engine/task_engine.pysrc/synthorg/api/dto.pysrc/synthorg/api/controllers/tasks.pysrc/synthorg/experiments/models.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/experiments/service.pysrc/synthorg/api/state.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/workers/execution_service.pysrc/synthorg/observability/events/workers.pysrc/synthorg/workers/__main__.pysrc/synthorg/workers/executor.py
cli/cmd/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Exit codes: 0 = success, 1 = runtime error, 2 = usage error, 3 = unhealthy (backend/containers), 4 = unreachable (Docker), 10 = updates available (
--check)
Files:
cli/cmd/new.gocli/cmd/start.go
cli/cmd/new.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
new <kind> <domain>command accepts--dry-runand--overwriteflags and scaffolds a conventions-clean Python file set undersrc/synthorg/for a new feature;<kind>is one ofservice,persistence,tool,controller
Files:
cli/cmd/new.go
cli/cmd/start.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
startcommand accepts flags--no-wait,--timeout,--no-pull,--dry-run,--no-detach,--no-verify
Files:
cli/cmd/start.go
{README.md,docs/**/*.md}
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers
Files:
docs/reference/typed-boundaries.mddocs/guides/workers-and-background-tasks.mddocs/guides/monitoring.md
docs/**/*.{md,d2}
📄 CodeRabbit inference engine (CLAUDE.md)
Use D2 for architecture and nested container diagrams; use Mermaid for flowcharts, sequence diagrams, and pipelines; use Markdown tables for tabular data; pin D2 CLI to v0.7.1 in CI
Files:
docs/reference/typed-boundaries.mddocs/guides/workers-and-background-tasks.mddocs/guides/monitoring.md
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Always usecreateLoggerfrom@/lib/logger; never use bareconsole.warn/console.error/console.debugin application code
Always use variable namelogfor logger instances (e.g.,const log = createLogger('module-name'))
Pass dynamic/untrusted values as separate args to logger (not interpolated into the message string) so they go throughsanitizeArg
Wrap attacker-controlled fields inside structured objects insanitizeForLog()before embedding in logs
Callers MUST NOT wrap store mutation calls intry/catch; the store owns the error UX
Display counts must come fromdata.length; the wire envelope no longer carriestotal
ImportErrorCodeandErrorCategoryfrom@/api/types/errors(re-exported fromweb/src/api/types/error-codes.gen.ts); discriminate onErrorCode.<NAME>, never on raw integer literals
Generated DTO types re-exportcomponents['schemas']['X']with no hand-written tightening because the generator handles response optionality
Import DTOs via the barrel (import type { AgentConfig } from '@/api/types') or directly from the generated module (import type { AgentConfig } from '@/api/types/dtos.gen')
ALWAYS reuse existing components fromweb/src/components/ui/before creating new ones
A PostToolUse hook (scripts/check_web_design_system.py) flags hardcoded hex/rgba/fonts/Motion durations/locale literals/bare.toLocale*String()calls/missing Storybook stories/duplicate component patterns/complex.map()blocks; fix every violation before proceeding
Forbid passing async functions where the callsite ignores the returned promise via@typescript-eslint/no-misused-promises(withchecksVoidReturn: { attributes: false }); React 19asyncevent handlers stay allowed
Files:
web/src/api/types/openapi.gen.ts
web/src/{stores,api}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use cursor-based paging via
PaginationMetafor list endpoints; keepnextCursor+hasMorein state (not offset arithmetic) and early-return when!hasMore || !nextCursor
Files:
web/src/api/types/openapi.gen.ts
web/src/api/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
getLiveness()is always 200 while the process is alive;getReadiness()is 200 healthy / 503 unavailable (binary'ok' | 'unavailable'outcome, no tri-state). Any new caller must handle the 503 path explicitly.
Files:
web/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
NEVER hand-edit a
*.gen.tsfile; regenerate withuv run python scripts/generate_dto_types_ts.py
Files:
web/src/api/types/openapi.gen.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Read design spec from `docs/design/` page before implementing; deviations need approval (see DESIGN_SPEC.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: No region/currency/locale privileged; use metric units; use British English throughout
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Every convention PR must ship its enforcement gate (see docs/reference/convention-gates.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json` or baseline files (`scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`); use `ALLOW_BASELINE_GROWTH=1 git commit` for gate baseline bypass
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: After implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` command; do not use `gh pr create` (hookify-blocked); after PR: use `/aurelio-review-pr` for external feedback; fix all valid issues; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Test execution: xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (`loadfile` prevents 3.14+ Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Commit format: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Branches: `<type>/<slug>` format from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Pre-commit/pre-push hooks configured in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: Squash merge with PR body as squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: After every squash merge, run `/post-merge-cleanup` command
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T21:42:31.315Z
Learning: For Web dashboard development, refer to `web/CLAUDE.md` for detailed guidelines
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_no_migration_framing.pytests/unit/engine/test_task_engine_integration.pytests/unit/scripts/test_check_no_migration_framing.pysrc/synthorg/engine/task_engine.pysrc/synthorg/api/dto.pysrc/synthorg/api/controllers/tasks.pytests/unit/engine/test_approval_gate.pysrc/synthorg/experiments/models.pysrc/synthorg/observability/prometheus_recording_streams.pysrc/synthorg/experiments/service.pysrc/synthorg/api/state.pysrc/synthorg/api/controllers/ws.pysrc/synthorg/workers/execution_service.pytests/unit/experiments/test_service.pysrc/synthorg/observability/events/workers.pysrc/synthorg/workers/__main__.pytests/unit/workers/test_executor.pysrc/synthorg/workers/executor.py
🪛 LanguageTool
docs/guides/workers-and-background-tasks.md
[grammar] ~63-~63: Ensure spelling is correct
Context: ...2. Polls the consumer for claims with a per-fetch timeout. 3. Runs the configured e...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🔇 Additional comments (21)
cli/internal/config/state.go (1)
21-32: LGTM!Also applies to: 283-284, 287-289
scripts/check_no_migration_framing.py (1)
14-31: LGTM!Also applies to: 89-133
tests/unit/engine/test_task_engine_integration.py (1)
320-332: LGTM!tests/unit/scripts/test_check_no_migration_framing.py (1)
456-499: LGTM!Also applies to: 507-542
cli/cmd/new.go (1)
38-46: LGTM!cli/cmd/start.go (1)
86-110: LGTM!src/synthorg/api/dto.py (1)
38-38: LGTM!Also applies to: 436-484, 760-760, 773-773, 780-780
src/synthorg/api/controllers/tasks.py (1)
8-8: LGTM!Also applies to: 14-14, 325-371
docs/reference/typed-boundaries.md (1)
3-12: LGTM!Also applies to: 64-67, 161-274
.pre-commit-config.yaml (1)
31-40: LGTM!Also applies to: 459-465, 630-634, 648-650
tests/unit/engine/test_approval_gate.py (1)
3-3: LGTM!Also applies to: 8-8, 12-12, 22-45
docs/guides/workers-and-background-tasks.md (1)
1-135: LGTM!src/synthorg/experiments/models.py (1)
1-57: LGTM!src/synthorg/observability/prometheus_recording_streams.py (1)
1-126: LGTM!src/synthorg/api/state.py (1)
67-68: LGTM!Also applies to: 118-121, 806-864
src/synthorg/api/controllers/ws.py (1)
683-714: LGTM!Also applies to: 807-813, 892-895
src/synthorg/workers/execution_service.py (1)
1-176: LGTM!web/src/api/types/openapi.gen.ts (1)
1608-1659: LGTM!Also applies to: 3767-3783, 4888-4903, 5304-5311, 5646-5650, 7535-7543, 7579-7611, 8802-8816, 9923-9934, 16014-16146, 20645-20679
src/synthorg/observability/events/workers.py (1)
39-70: LGTM!tests/unit/workers/test_executor.py (1)
1-197: LGTM!src/synthorg/workers/executor.py (1)
1-269: LGTM!
Round 3 batch on PR #1926. CI now green (drain-budget split fix landed). CodeRabbit submitted a new CHANGES_REQUESTED review with 5 actionable inline + 1 outside-diff finding + 2 duplicate threads on round-2 work. All 5+1 valid items + 1 duplicate are applied; the monitoring.md RS-marker duplicate is skipped for the same reason as round 2 (file is outside the doc-numeric-macros gate scope). Fixes: - api/state.py: add '_lazy_service_lock' to __slots__ so the slot is bound declaratively even though the mixin chain provides __dict__ - engine/task_engine: rewrite _drain_all docstring to present-tense rationale only (no 'under the previous design' migration framing) - experiments/service: explicit upper-bound check on register_variant (weight > _MAX_VARIANT_WEIGHT) so the EXPERIMENT_VARIANT_INVALID_WEIGHT event fires for both bounds; wrap record_assignment in ConflictError handler that re-fetches the canonical row when a durable backend rejects the insert - experiments/protocol: document record_assignment's ConflictError contract for durable backends - workers/__main__: gate task_queue.stop() on a 'queue_started' flag so a task_queue.start() failure does not leak the owned http_client AND does not call stop() on a queue that never bound a consumer; refresh _placeholder_executor docstring (no longer reachable on missing backend URL since _resolve_executor exits) - tests/unit/experiments/test_service: _service() now returns the FakeClock too so tests stop reaching into svc._clock
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/workers/__main__.py`:
- Around line 233-239: The AsyncClient is created without an explicit timeout;
to match project patterns and make timeouts consistent with
TaskExecutionExecutor's per-request timeout, instantiate httpx.AsyncClient with
an explicit timeout (e.g., 60s) when creating http_client in the factory that
returns TaskExecutionExecutor and http_client; update the http_client creation
where httpx.AsyncClient() is called so the client-level timeout aligns with the
executor's expected request timeout while still allowing TaskExecutionExecutor's
per-request override.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6e05e2a0-ce8c-4139-8c3d-55326ff9889b
📒 Files selected for processing (6)
src/synthorg/api/state.pysrc/synthorg/engine/task_engine.pysrc/synthorg/experiments/protocol.pysrc/synthorg/experiments/service.pysrc/synthorg/workers/__main__.pytests/unit/experiments/test_service.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: CodSpeed Web Pass
- GitHub Check: Build Backend
- GitHub Check: Build Web Assets (melange)
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: Dashboard Test
- GitHub Check: Lighthouse Site
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
src/synthorg/!(persistence)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Only
src/synthorg/persistence/may import sqlite/psycopg or emit raw SQL; enforce Persistence Boundary
Files:
src/synthorg/experiments/protocol.pysrc/synthorg/workers/__main__.pysrc/synthorg/engine/task_engine.pysrc/synthorg/api/state.pysrc/synthorg/experiments/service.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Configuration Precedence (MANDATORY): DB > env > code default via
SettingsService/ConfigResolver(Cat-1) or env > code default (Cat-2,read_only_post_init); Cat-3 bootstrap secrets are pure env at boot; noos.environ.getoutside startupNo Hardcoded Values (MANDATORY): numerics live in
settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal)No comments citing reviewers, issue back-refs, or migration framing; only explain WHY; enforced by
check_no_review_origin_in_code.py+check_no_migration_framing.pyNo
from __future__ import annotations(3.14 has PEP 649); PEP 758 except:except A, B:no parens unless bindingType hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Define errors as
<Domain><Condition>ErrorfromDomainError; never inheritException/RuntimeErrordirectly; enforced bycheck_domain_error_hierarchy.pyUse Pydantic v2 frozen +
extra="forbid"on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes);@computed_fieldfor derived;NotBlankStrfor identifiersUse args models at every system boundary;
parse_typed()for every external dict ingestion; enforced bycheck_boundary_typed.pyUse immutability:
model_copy(update=...)orcopy.deepcopy(); deepcopy at system boundariesUse
asyncio.TaskGroupfor fan-out/fan-in; helpers catchException(re-raiseMemoryError/RecursionError)Clock seam: use
clock: Clock | None = None; tests injectFakeClock; lifecycle: services own_lifecycle_lock; timed-out stops mark unrestartableWrap untrusted content (SEC-1) with
wrap_untrusted()fromengine.prompt_safety; useHTMLParseGuardfor HTMLImport logger via
from synthorg.observability import get_logger; variable alwayslogger; neverimport loggingorprint()in app codeUse event ...
Files:
src/synthorg/experiments/protocol.pysrc/synthorg/workers/__main__.pysrc/synthorg/engine/task_engine.pysrc/synthorg/api/state.pysrc/synthorg/experiments/service.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/experiments/protocol.pysrc/synthorg/workers/__main__.pysrc/synthorg/engine/task_engine.pysrc/synthorg/api/state.pysrc/synthorg/experiments/service.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; asyncauto; timeout 30s global; coverage 80% minWindows: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags; bareMagicMockat typed boundaries is blocked byscripts/check_mock_spec.pyFakeClock and
mock_ofimport fromtests._shared; inject viaclock=and the helper's spec subscriptHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...))Flaky tests: NEVER skip/xfail; fix fundamentally; use
asyncio.Event().wait()notsleep(large)
Files:
tests/unit/experiments/test_service.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/experiments/test_service.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Use no region/locale privileged; use metric units; use British English for regional defaults
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Every convention PR must ship its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review` (gh pr create is hookify-blocked); after PR: use `/aurelio-review-pr` for external feedback; fix EVERYTHING valid
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: xdist: `-n 8 --dist=loadfile` auto-applied via pyproject addopts; prevents 3.14+ Windows ProactorEventLoop leak
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Use `.pre-commit-config.yaml` for pre-commit/pre-push hooks; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Web: see `web/CLAUDE.md` for detailed guidelines; CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Shell: see `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write rules)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:16:00.473Z
Learning: Python 3.14+ minimum version; BUSL-1.1 → Apache 2.0 after Change Date; src layout with `src/synthorg/`, `tests/`, `web/` (React 19), `cli/` (Go binary)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
src/synthorg/experiments/protocol.pysrc/synthorg/workers/__main__.pytests/unit/experiments/test_service.pysrc/synthorg/engine/task_engine.pysrc/synthorg/api/state.pysrc/synthorg/experiments/service.py
🔇 Additional comments (16)
src/synthorg/api/state.py (1)
67-68: LGTM!Also applies to: 118-121, 204-204, 210-210, 283-283, 406-420, 807-865
src/synthorg/engine/task_engine.py (2)
387-413: LGTM!
729-739: LGTM!Also applies to: 789-798, 814-823
src/synthorg/experiments/protocol.py (1)
1-89: LGTM!src/synthorg/experiments/service.py (4)
1-50: LGTM!
66-104: LGTM!
132-215: LGTM!
231-270: LGTM!src/synthorg/workers/__main__.py (3)
1-22: LGTM!Also applies to: 50-63
98-124: LGTM!
170-197: LGTM!tests/unit/experiments/test_service.py (5)
1-24: LGTM!
27-64: LGTM!
67-110: LGTM!
113-136: LGTM!
139-174: LGTM!
…line Round 4 babysit on PR #1926. CodeRabbit posted a single Nitpick on the round-3 head noting that workers/__main__'s httpx.AsyncClient() was constructed without an explicit timeout= kwarg, leaving the 5 s httpx default in place even though every other httpx.AsyncClient( site in the codebase (a2a/client, app, oauth flows, slack/github health checks, generic_http, notifications/slack+ntfy, telemetry, discovery, health_prober, probing, tsa_client) passes a project- specific timeout= explicitly. The per-request timeout=self. _timeout_seconds on the executor's http.post does override the client baseline, so this is consistency hygiene rather than a runtime bug, but the babysit mandate fixes everything valid. Fixes: - workers/executor: rename _DEFAULT_TIMEOUT_SECONDS -> public DEFAULT_HTTP_TIMEOUT_SECONDS so the worker entry point can mirror the same value without reaching into a private module symbol - workers/__main__: construct httpx.AsyncClient(timeout= DEFAULT_HTTP_TIMEOUT_SECONDS) to match the executor's per-request default and the rest of the codebase's pattern
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/workers/executor.py`:
- Around line 104-112: Add logging calls before each constructor validation
raise: for the checks on api_base_url, auth_token, and timeout_seconds in the
Executor constructor (or __init__), call the module/class logger (consistent
with other code paths like _resolve_executor) to emit a WARNING or ERROR with
context including the parameter name and value, then raise the existing
ValueError; ensure you reference the same logger instance used elsewhere (e.g.,
logger or self.logger) and format messages like "Invalid constructor argument:
api_base_url=%r" for consistency.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 67305dec-0837-4975-9fa7-7423e1ada567
📒 Files selected for processing (2)
src/synthorg/workers/__main__.pysrc/synthorg/workers/executor.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build Backend
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Lighthouse Site
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Read
docs/design/page before implementing; deviations need approval per DESIGN_SPEC.mdNo region/currency/locale privileged; use metric units; use British English per
docs/reference/regional-defaults.mdEvery convention PR ships its enforcement gate per
docs/reference/convention-gates.mdConfiguration Precedence: DB > env > code default via
SettingsService/ConfigResolver(Cat-1) or env > code default (Cat-2,read_only_post_init); Cat-3 bootstrap secrets are pure env at the boot site. YAML is a company-template ingestion format, not a precedence tier. Noos.environ.getoutside startup; pre-init Cat-2 reads usesettings.bootstrap_resolver.resolve_init_valueperdocs/reference/configuration-precedence.mdNo hardcoded numeric values; numerics live in
settings/definitions/. Allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of the formNAME: int|float|Final|Final[int]|Final[float] = literal. Enforced byscripts/check_no_magic_numbers.pyNo
from __future__ import annotations(3.14 has PEP 649). Use PEP 758except A, B:syntax without parens unless bindingType hints required on public functions; mypy strict. Use Google-style docstrings. Line length 88; functions <50 lines; files <800 lines
Error classes must follow naming pattern
<Domain><Condition>Errorand inherit fromDomainError, never directly fromException/RuntimeError/etc. Enforced bycheck_domain_error_hierarchy.pyPydantic v2 models must use
frozen=Trueandextra="forbid"on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use@computed_fieldfor derived fields; useNotBlankStrfor identifiersArgs models required at every system boundary; use
parse_typed()for every external dict ingestion. Enforced bycheck_boundary_typed.pyImmutability: use
model_copy(update=...)orcopy.deepcopy(); use deepcopy at system boundariesAsync code must use
asyncio.TaskGroupfor ...
Files:
src/synthorg/workers/__main__.pysrc/synthorg/workers/executor.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/workers/__main__.pysrc/synthorg/workers/executor.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:41:54.213Z
Learning: Numerics in README and public docs must be sourced from `data/runtime_stats.yaml` via `<!--RS:NAME-->` markers per `data/README.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:41:54.213Z
Learning: Timeout/slow test failures indicate source-code regression; never manually edit `tests/baselines/unit_timing.json`, `scripts/*_baseline.{txt,json}`, or `scripts/_*_baseline.py`. Both families are PreToolUse-blocked. Per-invocation bypass for gate baselines requires explicit user approval: `ALLOW_BASELINE_GROWTH=1 git commit ...`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:41:54.213Z
Learning: API startup lifecycle: two phases: construction (`create_app` body) wires synchronous services; on_startup (`_build_lifecycle.on_startup`) wires services needing connected persistence. Construction-phase ordering: `agent_registry` before `auto_wire_meetings`; `tunnel_provider` wired unconditionally. On-startup ordering: `SettingsService` auto-wire before `WorkflowExecutionObserver` registration; `OntologyService` wires after `persistence.connect()` via `_wire_ontology_service`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:41:54.213Z
Learning: Use `d2` for architecture / nested containers diagrams; use `mermaid` for flowcharts / sequence / pipelines diagrams; use Markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:41:54.213Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced. Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`). Branches: `<type>/<slug>` from main. Squash merge with PR body as squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-15T22:41:54.213Z
Learning: After every squash merge, run `/post-merge-cleanup`. CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
src/synthorg/workers/__main__.pysrc/synthorg/workers/executor.py
🔇 Additional comments (10)
src/synthorg/workers/__main__.py (5)
1-22: LGTM!Also applies to: 30-31, 35-48
53-66: LGTM!
101-127: LGTM!
173-200: LGTM!
204-246: LGTM!src/synthorg/workers/executor.py (5)
1-48: LGTM!Also applies to: 50-63, 66-102, 113-117
119-166: LGTM!
168-236: LGTM!
238-252: LGTM!
254-271: LGTM!
…rrors Round 5 babysit on PR #1926. CodeRabbit posted a single Nitpick on the round-4 head asking for the project's log-before-raise pattern on the three constructor validation paths (api_base_url, auth_token, timeout_seconds). CLAUDE.md mandates the pattern explicitly in the Logging section. Fixes: - observability/events/workers: add WORKERS_EXECUTOR_INVALID_INIT_ARG event constant - workers/executor: logger.warning(WORKERS_EXECUTOR_INVALID_INIT_ARG) before each ValueError raise in TaskExecutionExecutor.__init__; the auth_token path deliberately omits the value field so a leaked token cannot flow into structured logs
<!-- HIGHLIGHTS_START --> ## Highlights > _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._ ### What you'll notice - Frontend WP-6 update with UX polish improves user interface and workflow. - Dashboard and training endpoint improvements enhance observability and dispatch behavior. - Web storybook now supports change detection for more responsive UI interactions. - Git hooks now isolated per worktree for cleaner repository management. - Providers automatically detect native streaming support in Litellm models. ### What's new - Added a new pipeline to convert Pydantic DTOs to TypeScript for better front-end compatibility. ### Under the hood - Refactored settings to three precedence categories, removing YAML tier for simpler configuration. - Completed RootConfig mirror coverage for enhanced configuration consistency. - Adopted API conventions with better query performance and forbidden extra fields for stricter validation. - Improved persistence, layer discipline, and restart safety in core work packages. - CI updated with split test jobs and tightened coverage gates for better test quality. - Switched to direct Trivy binary for security scans, removing previous Trivy action dependency. - Enhanced memory management with per-call processing options and better observability during speech-to-text encoding. - Various dependency updates for Python, infrastructure, and lock files maintain security and stability. - Removed TypeScript DTO type-tightening overlays to simplify type management. - Codebase audit tightened skill sets to prevent false positivity in class detection by 2026. <!-- HIGHLIGHTS_END --> :robot: I have created a release *beep* *boop* --- ## [0.8.5](v0.8.4...v0.8.5) (2026-05-17) ### Features * **codegen:** pydantic-to-typescript DTO pipeline + parity gate (closes [#1889](#1889)) ([#1909](#1909)) ([0265ef5](0265ef5)) * **storybook:** enable changeDetection + trim web/CLAUDE.md ([#1939](#1939)) ([3b1f4c0](3b1f4c0)) * **web,setup:** WP-6 frontend + UX polish ([#1941](#1941)) ([d9ca76d](d9ca76d)) ### Bug Fixes * correct invalid git for-each-ref syntax in post-merge-cleanup skill ([#1946](#1946)) ([69a1649](69a1649)) * dashboard polish, training endpoint dispatch, and observability cleanup ([#1911](#1911)) ([b61e9e8](b61e9e8)) * per-worktree git-hook isolation + hookify gate migration + MSW drift fix ([#1949](#1949)) ([e3f8495](e3f8495)) * **providers:** read supports_native_streaming from litellm model info ([#1942](#1942)) ([60364ca](60364ca)) * security and audit coverage (closes [#1883](#1883)) ([#1904](#1904)) ([d8ebf55](d8ebf55)) ### Performance * **ci:** mypy --num-workers=4 + enable ruff TID255 ([#1944](#1944)) ([484c1d3](484c1d3)) ### Refactoring * **ci:** drop aquasecurity/trivy-action, use direct trivy binary ([#1940](#1940)) ([df1f946](df1f946)) * **memory:** per-call processing_kwargs + observability for ST encode ([#1943](#1943)) ([3aa9d20](3aa9d20)) * Phase 7 follow-up — complete RootConfig mirror coverage (closes [#1907](#1907)) ([#1914](#1914)) ([605500b](605500b)) * **settings:** collapse precedence to three categories; drop YAML tier (closes [#1890](#1890)) ([#1910](#1910)) ([efd54c9](efd54c9)) * WP-3 API conventions + query performance + project-wide extra=forbid ([#1953](#1953)) ([504d579](504d579)), closes [#1918](#1918) * WP-4 settings + cross-cutting (clock seam, contextvars, dispatch, plugin surfaces) ([#1954](#1954)) ([7207d92](7207d92)) * **wp1:** persistence + layer discipline + restart safety ([#1945](#1945)) ([57586fb](57586fb)) ### Documentation * **wp5:** public-facing truth refresh ([#1924](#1924)) ([afb5cc5](afb5cc5)) ### CI/CD * split test job by marker with airtight aggregate coverage gate ([#1948](#1948)) ([0b818d5](0b818d5)), closes [#1938](#1938) [#1937](#1937) ### Maintenance * **codebase-audit:** tighten skill to prevent 2026-05-15 FP classes ([#1923](#1923)) ([9317ed1](9317ed1)) * Lock file maintenance ([#1913](#1913)) ([c08a355](c08a355)) * Lock file maintenance ([#1950](#1950)) ([8940ab1](8940ab1)) * remove TS DTO type-tightening overlays ([#1915](#1915)) ([d296214](d296214)), closes [#1906](#1906) * Update Infrastructure dependencies ([#1928](#1928)) ([d19fae5](d19fae5)) * Update Python dependencies ([#1929](#1929)) ([75cc2c8](75cc2c8)) * **wp7:** hygiene, stubs, test/CI/tooling, doc gaps, boundary patterns doc ([#1926](#1926)) ([c29eb32](c29eb32)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Closes #1922.
WP-7 mega-PR landing the long-tail hygiene work from the 2026-05-15 codebase audit. ~92 files, ~5,000 LOC; one-PR-per-issue per the issue's explicit instruction.
What changed
Logging gaps
client/pool.py(5),client/store.py(2, includingSimulationStore.get),workers/claim.py(3).ClientRequest.with_statusand the persisted-write side of the new transition path;Sprint.with_transitionlogs at DEBUG since the scheduler keeps sprints in memory.Comment hygiene
scripts/check_no_migration_framing.pyto catch the variants the 2026-05-15 run missed (previously <verb>,were/was previously inlined,used to be,originally <verb>); module docstring lists the new patterns.Stubs / deferred features
synthorg.experimentsdomain (A/B variant registry + deterministic SHA-256 hash-walk assignment) — service, protocol, models, in-memory repo, REST controller. Replaces 2 of the 7 placeholder stubs from WP-7: Hygiene, Stubs, Test/CI/Tooling, Doc Gaps, Boundary Patterns Doc #1922.synthorg.workersHTTP executor (httpx.AsyncClient-based, deterministic status mapping, no manual retry — JetStream owns retries viaTaskClaimStatus.RETRY) plus the supportingexecution_serviceseam.CeremonyEvalContextnow accepts an optionalbudget_snapshotcallable.clear()+size()) on the MCP installations repo.Doc completeness
docs/guides/{a2a-federation,webhook-management,custom-mcp-server-dev,ontology-extension,custom-rules-and-meta-loop,ceremony-scheduling-tuning,cost-attribution,dynamic-scoring}.md.docs/guides/approval-workflow.md,workers-and-background-tasks.md,rest-api-examples.md; newdocs/reference/yaml-schema.md.docs/reference/typed-boundaries.mddocumenting the 6 boundary patterns (A2A JSON-RPC, MCP tool args, provider tool calls, webhook payloads, audit chain, tool-execution dual paths) plus JWT/Settings/WebSocket extras.Prompt engineering
temperature+top_ppinned onLlmSecurityEvaluator,ProceduralMemoryProposer,LlmSemanticAnalyzer. NewModelPinMetadatatype so prompt-class metadata can record a pinned model version.CompletionConfig.top_pnow defaults centrally to1.0instead of per-siteNone.Test discipline
web/src/__tests__/stores/artifacts.test.ts: replacedsetTimeout(resolve, 0)race withvi.waitFordeterministic polling.tests/conformance/persistence/test_backup_round_trip.py: walks the newbackend.kind+backend.configpublic accessors instead ofisinstance+_configprivate attribute.MagicMock(spec=...)).Tooling polish
synthorg new:RunEprints help when invoked with no subcommand (was silent).synthorg update: install-directory writability probe runs BEFORE the binary download.synthorg start: parse-failure vs missing-file errors now distinguishable..pre-commit-config.yaml: moveddomain-error-hierarchy,no-review-origin-in-code,no-migration-framingfrom pre-commit to pre-push (combined ~30s saved per commit; gates still run before push).CLAUDE.md / docs/getting_started.md
scripts/install_cli_tools.shinstalls golangci-lint only; d2 is installed separately (curl installer on Unix,winget install Terrastruct.d2on Windows).Pre-PR review
20 review agents launched in parallel against the full diff (code-reviewer, python-reviewer, docs-consistency, security-reviewer, persistence-reviewer, async-concurrency-reviewer, api-contract-drift, type-design-analyzer, go-reviewer, go-conventions-enforcer, issue-resolution-verifier, tool-parity-checker, frontend-reviewer + general-purpose runs of comment-quality-rot, logging-audit, resilience-audit, conventions-enforcer, infra-reviewer, test-quality-reviewer, and 5 mini-pass audits). Triage table at
_audit/pre-pr-review/triage.md.14 valid findings addressed (2 critical, 10 major, 2 minor) — see the squash commit body for the line-by-line list. Verified false positives explicitly documented (mostly PEP 758
except A, B:flagged by agents that didn't know Python 3.14 syntax — CLAUDE.md is explicit, test suite passes).Test plan
uv run ruff check src/ tests/ --fix,uv run ruff format src/ tests/,uv run mypy src/ tests/— clean.uv run python -m pytest tests/ -n 8— 30,530 passed, 50 skipped, all green.npm --prefix web run lint && npm --prefix web run type-check && npm --prefix web run test— 256 files / 3,155 tests pass.go -C cli vet ./... && go -C cli test ./... && go -C cli build ./...— clean.Acceptance criteria (#1922)
SimulationStore.getgap closed in the review round).docs/guides/.domain-error-hierarchyno longer in pre-commit.