feat: surface safety-spine state in runtime-services boot log (closes #2096) #2097

Merged
Aureliolo merged 4 commits into main from fix/agent-runtime-online on May 24, 2026

Conversation

@Aureliolo
Owner

Summary

Audit follow-up to merged #1956 (PR #2003 "Agent runtime online + minimal safety spine"). The audit found 13 of 14 acceptance / scope items PASS; the only WEAK item (composed end-to-end test) was already covered by tests/e2e/test_runtime_online_seam.py. The single genuine gap was operator observability of the SecOps spine state at boot. This PR closes that gap.

Closes #2096.

What changed

The runtime-services boot log now carries the safety-spine state, so operators reading synthorg.log can see at a glance whether the SecOps interceptor will be active / shadow / disabled once a provider runs. Two structured kwargs (security_enabled, security_enforcement_mode) were added to all three logger.info(API_APP_STARTUP, service="runtime_services", ...) call sites in src/synthorg/workers/runtime_builder.py (the two no-provider branches in _select_active_provider and the provider-present branch in build_runtime_services).
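As a rough illustration of how an operator-side script might read those two fields, here is a minimal sketch. The `spine_state` helper, the `"shadow"` mode string, and the rendered event name are assumptions made for the example; only the field names `security_enabled` and `security_enforcement_mode` come from this PR.

```python
from typing import Any, Mapping


def spine_state(entry: Mapping[str, Any]) -> str:
    """Classify a runtime-services boot entry as active, shadow, or disabled."""
    # Entries written before this change lack the fields, so report that
    # explicitly instead of guessing.
    if "security_enabled" not in entry:
        return "unknown"
    if not entry["security_enabled"]:
        return "disabled"
    # Assumption: the enforcement mode is a string and "shadow" is one value.
    mode = str(entry.get("security_enforcement_mode", "")).lower()
    return "shadow" if mode == "shadow" else "active"


boot_entry = {
    "event": "api.app.startup",  # hypothetical rendering of API_APP_STARTUP
    "service": "runtime_services",
    "security_enabled": True,
    "security_enforcement_mode": "shadow",
}
print(spine_state(boot_entry))
```

The helper treats a missing field as "unknown" rather than "disabled", which keeps logs from older builds from being misread as an intentionally disabled spine.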

Four pre-PR-review findings folded in by user request:

  • # module-kind: orchestrator header added to src/synthorg/workers/runtime_builder.py (was defaulting to code tier with cap 500 but is an orchestrator at 840 gate-LOC; cap is now correctly orchestrator 600 with the baseline carrying the actual size).
  • # module-kind: service header added to src/synthorg/workers/execution_service.py (sibling classification gap; same shape).
  • A try / except now wraps the _build_runtime_coordinator TaskGroup, so a failure while resolving the decomposition, routing-scorer, or workspace config logs operator context (via log_exception_redacted) before propagating, instead of escaping silently.
  • New summary logger.info(API_APP_STARTUP, mode="agent_engine_built", ...) at the end of build_runtime_services reporting which optional subsystems (coordinator, work pipeline, red team, vision gate) wired, so operators can see the final boot shape in one line.

Test plan

  • tests/unit/workers/test_runtime_builder.py adds a TestBootLogSafetySpineState class with three tests asserting the new security_enabled + security_enforcement_mode kwargs appear on each of the three boot branches (no-provider, empty-registry, provider-present), using structlog.testing.capture_logs.
  • Pre-push gates ran clean locally: ruff, ruff-format, mypy (affected), pytest-unit (affected), all convention gates including module-size budget and no-ghost-wiring.
  • E2E acceptance is already proven by tests/e2e/test_runtime_online_seam.py::test_runtime_executes_task_through_seam_with_safety_spine (unchanged).

Review coverage

Pre-reviewed by 4 agents per "reduced agent count" instruction: code-reviewer (APPROVED), python-reviewer (APPROVED), logging-audit (COMPLIANT), conventions-enforcer (1 MAJOR + 1 MEDIUM). Mandatory docs-consistency / comment-quality-rot / mini-pass agents skipped per explicit user instruction (no docs, no comment narrative, no ghost wiring introduced).

All four valid findings (MAJOR + MEDIUM + 2 SUGGESTION) applied in this PR per user direction.

Files

  • src/synthorg/workers/runtime_builder.py (+23 gate-LOC: 3 log sites + 2 local vars + try / except wrapper + summary log + module-kind header)
  • src/synthorg/workers/execution_service.py (module-kind header only; 0 gate-LOC)
  • tests/unit/workers/test_runtime_builder.py (new TestBootLogSafetySpineState class; +73 lines)
  • scripts/_module_size_baseline.json (mechanical baseline bump 809 to 840 for runtime_builder.py; user-approved)
  • .codespellrc (added .test_durations.unit,.test_durations.integration to skip list; was blocking pre-commit on a deliberate parse_bool test parametrize ID)

Manifest

No new _ghost_wiring_manifest.txt entries. AgentEngine is already ENFORCED from #1956.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an observability gap identified during a security audit, ensuring that the safety-spine state is clearly visible in the runtime boot logs. It also enhances system reliability by improving error reporting during the runtime coordinator initialization and provides a consolidated summary of the active subsystems upon successful startup.

Highlights

  • Observability: Added 'security_enabled' and 'security_enforcement_mode' to runtime-services boot logs to improve operator visibility into the safety-spine state.
  • Error Handling: Wrapped the runtime coordinator TaskGroup in a try/except block to ensure failures are logged with context before propagation.
  • Boot Reporting: Added a summary log at the end of runtime service construction to report the final configuration of optional subsystems.
  • Module Classification: Added '# module-kind' headers to runtime_builder.py and execution_service.py for correct architectural classification.

@github-actions
Contributor

github-actions Bot commented May 24, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Contributor

coderabbitai Bot commented May 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 4c89f27d-ece2-46fb-ae0b-7eaed2cbc035

📥 Commits

Reviewing files that changed from the base of the PR and between 6dab029 and aa7037c.

📒 Files selected for processing (4)
  • scripts/_module_size_baseline.json
  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
  • tests/unit/workers/test_runtime_builder.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: Build Backend
  • GitHub Check: Test Integration (shard 4)
  • GitHub Check: Test Integration (shard 3)
  • GitHub Check: Test Integration (shard 1)
  • GitHub Check: Test Integration (shard 2)
  • GitHub Check: Test E2E
  • GitHub Check: Test Unit (shard 3)
  • GitHub Check: Test Unit (shard 1)
  • GitHub Check: Lint
  • GitHub Check: Test Unit (shard 2)
  • GitHub Check: Test Conformance (SQLite)
  • GitHub Check: Test Unit (shard 4)
  • GitHub Check: Runtime Stats Freshness Gate
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: Build Preview
  • GitHub Check: pyright (advisory)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (7)
src/synthorg/!(persistence)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env; no os.environ.get outside startup
Non-provider transient I/O (e.g. git push/fetch) uses core.resilience.GeneralRetryHandler with a retryable predicate, never hand-rolled loop
Conversational human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...) (SEC-1)

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Numerics must live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of the form NAME: int|float|Final|Final[int]|Final[float] = literal
Module-size budget: controller 400 LOC, service/orchestrator 600 LOC, repository 500 LOC, adapter/integration 700 LOC, feature 100 LOC, code 500 LOC (default), tests 800 LOC, declarative exempt, generated glob-exempt per # module-kind: header
No from __future__ import annotations in Python 3.14+ (has PEP 649); use PEP 758 except: except A, B: no parens unless binding
Type hints required on public functions; mypy strict type checking; Google-style docstrings; line length 88; functions <50 lines
Error classes must follow <Domain><Condition>Error naming and inherit from DomainError, never from Exception/RuntimeError directly
Pydantic v2: every frozen model must have extra="forbid"; @computed_field auto-exempt; use # lint-allow: frozen-extra-forbid -- <reason> for extra="allow"/"ignore" boundaries
Use @computed_field for derived fields in Pydantic models
Use NotBlankStr for identifier fields in Pydantic models
Args models must be used at every system boundary; use parse_typed() for every external dict ingestion
Enforce immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries
Use asyncio.TaskGroup for fan-out/fan-in async patterns; helpers must catch Exception and re-raise MemoryError/RecursionError
Clock seam: clock: Clock | None = None parameter in services; tests inject FakeClock
Services own _lifecycle_lock; timed-out stops mark services unrestartable
Untrusted content: use wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML
Use from synthorg.observability import get_logger; variable must be named logger; never import logging or print() in app code
Event names must come from observability.events.<domain> ...

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Comments must explain WHY only; no reviewer citations / issue back-refs / migration framing

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/synthorg/{api/lifecycle_builder,workers/runtime_builder}.py

📄 CodeRabbit inference engine (CLAUDE.md)

EnvironmentService wires in _install_runtime_services behind has_persistence; provisions ambiently via ActiveSandboxEnvironment contextvar before engine run

Files:

  • src/synthorg/workers/runtime_builder.py
src/synthorg/workers/runtime_builder.py

📄 CodeRabbit inference engine (CLAUDE.md)

Runtime services: synthorg.workers.runtime_builder.build_runtime_services returns a RuntimeServices pair selected by ONE provider-present switch; use AgentEngineExecutionService + coordinator or NoProviderExecutionService + None as backstop

Files:

  • src/synthorg/workers/runtime_builder.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min
Windows: unit tests use WindowsSelectorEventLoopPolicy; subprocess tests override back
Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries is blocked
FakeClock and mock_of import from tests._shared; inject via clock= and helper's spec subscript
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))
Never skip/xfail flaky tests; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/workers/test_runtime_builder.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/workers/test_runtime_builder.py
🧠 Learnings (4)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
  • tests/unit/workers/test_runtime_builder.py
📚 Learning: 2026-05-21T22:55:20.496Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/models.py:114-114
Timestamp: 2026-05-21T22:55:20.496Z
Learning: In this repo’s “magic number” review standard, the existing gate in `scripts/check_no_magic_numbers.py` intentionally does NOT flag numeric literals used as raw call-site arguments. So, do not flag numeric literals passed as keyword arguments to Pydantic `Field()` (e.g., `Field(ge=0, le=100)` / `Field(ge=1, le=50)`)—this is an established idiom. Only treat numeric literals as “magic numbers” when they occur in the locations the gate checks (module-level assignments and function/method parameter defaults).

Applied to files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
  • tests/unit/workers/test_runtime_builder.py
📚 Learning: 2026-05-21T22:55:09.289Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/config.py:29-30
Timestamp: 2026-05-21T22:55:09.289Z
Learning: For this repo’s Pydantic configuration idiom, do not treat numeric literals passed directly as arguments to `pydantic.Field(...)` as “magic numbers” during review. This includes call-site usages like `Field(default=0.2, ge=0.0, le=1.0)` (e.g., in config models such as `ToolAuthoringConfig`, `ToolValidationConfig`, `ToolsmithConfig`). Do not request extracting those `Field(...)` numeric arguments into named constants, since the repo’s `scripts/check_no_magic_numbers.py` intentionally excludes call-site `Field(...)` numerics and relies on `Field(...)` as the canonical way to express these constraints/defaults.

Applied to files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
📚 Learning: 2026-05-23T12:24:00.128Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2080
File: tests/_shared/test_postgres_proxy.py:19-48
Timestamp: 2026-05-23T12:24:00.128Z
Learning: When creating test doubles for Python typing.Protocols in tests, prefer a hand-written Protocol fake (a concrete class that explicitly implements the Protocol) over `mock_of[T]` if the Protocol only defines annotation-only attributes (e.g., `username: str`, `password: str`, `dbname: str`) with no class-level values/assignments. This is because `mock_of[T]` relies on `create_autospec(..., spec_set=True)`, which enumerates members via `dir(spec)`; annotation-only attributes are not included, so `mock_of`’s kwarg-based attribute setting can raise `AttributeError: attribute not present on spec type`. In that annotation-only case, don’t recommend `mock_of[T]`—use an explicit fake class instead.

Applied to files:

  • tests/unit/workers/test_runtime_builder.py
🔇 Additional comments (4)
src/synthorg/workers/runtime_builder.py (1)

1-1: LGTM!

Also applies to: 185-207, 576-597, 774-783, 821-843

src/synthorg/workers/execution_service.py (1)

1-1: LGTM!

tests/unit/workers/test_runtime_builder.py (1)

3-5: LGTM!

Also applies to: 9-10, 24-25, 65-115, 307-387, 389-457

scripts/_module_size_baseline.json (1)

159-159: LGTM!


Walkthrough

This PR enriches the runtime-builder boot logs with security configuration state to give operators visibility into SecOps enforcement modes. The three runtime_services startup log sites now emit security_enabled and security_enforcement_mode across no-provider, empty-registry, and provider-present modes. The coordinator resolution is wrapped in a try/except that logs failures of the concurrent model/config/strategy resolution. The vision-gate computation is hoisted into a variable, and wiring-status flags are logged before return. Tests import structured log capture, extend the provider app state helper with selective error injection, and add TestBootLogSafetySpineState to assert the new fields across all boot paths. Baselines are updated to reflect the expanded implementation.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding safety-spine state visibility to runtime-services boot logs. |
| Description check | ✅ Passed | The description comprehensively explains the changes, objectives, test plan, and files modified, all related to surfacing safety-spine state in boot logs. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #2096 by adding security_enabled and security_enforcement_mode kwargs to all three runtime-services boot log call sites and includes comprehensive unit tests for all three branches. |
| Out of Scope Changes check | ✅ Passed | All changes are in-scope: safety-spine boot logging, module-kind headers, exception handling, baseline updates, and log capture tests directly support the #2096 objective. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 72.73% which is sufficient. The required threshold is 40.00%. |


@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 24, 2026 13:07 — with GitHub Actions Inactive
@codspeed-hq

codspeed-hq Bot commented May 24, 2026

Merging this PR will not alter performance

✅ 33 untouched benchmarks
⏩ 21 skipped benchmarks [1]


Comparing fix/agent-runtime-online (aa7037c) with main (aedbba5)

Footnotes

  1. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enhances the observability and robustness of the agent runtime startup process. Key changes include adding security configuration details (enabled status and enforcement mode) to startup logs and introducing structured logging to track the wiring of runtime components like the coordinator and work pipeline. Error handling in the runtime coordinator builder was improved using Python 3.14's unparenthesized multi-exception except syntax (PEP 758) and redacted exception logging. New unit tests verify that security state is correctly captured in boot logs across different provider scenarios. I have no feedback to provide.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/workers/runtime_builder.py`:
- Around line 826-834: The startup log for the "agent_engine_built" event
(logger.info call with API_APP_STARTUP and mode="agent_engine_built") is missing
the safety-spine fields; update that logger.info invocation to include the same
security fields used by other runtime_services startup events (e.g.,
security_enabled and security_enforcement_mode) so the event schema remains
consistent with other API_APP_STARTUP messages.
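One way the requested change might look is sketched below: the summary event carries the same two safety-spine fields as the other startup events. The `RecordingLogger`, the `log_agent_engine_built` helper, the `*_wired` flag names, and the string event name are hypothetical; the real call site uses the project logger and the API_APP_STARTUP constant.

```python
from typing import Any


class RecordingLogger:
    """Hypothetical stand-in for the project logger; it only records calls."""

    def __init__(self) -> None:
        self.entries: list[dict[str, Any]] = []

    def info(self, event: str, **fields: Any) -> None:
        self.entries.append({"event": event, **fields})


def log_agent_engine_built(
    logger: RecordingLogger,
    *,
    wired: dict[str, bool],
    security_enabled: bool,
    security_enforcement_mode: str,
) -> None:
    """Emit the boot summary with wiring flags plus the safety-spine fields."""
    logger.info(
        "API_APP_STARTUP",  # placeholder for the real event constant
        mode="agent_engine_built",
        security_enabled=security_enabled,
        security_enforcement_mode=security_enforcement_mode,
        **wired,  # hypothetical flag names, one per optional subsystem
    )


logger = RecordingLogger()
log_agent_engine_built(
    logger,
    wired={
        "coordinator_wired": True,
        "work_pipeline_wired": True,
        "red_team_wired": False,
        "vision_gate_wired": False,
    },
    security_enabled=True,
    security_enforcement_mode="active",
)
print(logger.entries[0])
```

Keeping the security fields as explicit keyword arguments, rather than folding them into the `wired` mapping, means a caller cannot omit them by accident.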

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: e92dee5e-fc9b-4676-81f1-4edba762eaeb

📥 Commits

Reviewing files that changed from the base of the PR and between ddf2d86 and 6dab029.

📒 Files selected for processing (5)
  • .codespellrc
  • scripts/_module_size_baseline.json
  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
  • tests/unit/workers/test_runtime_builder.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Build Backend
  • GitHub Check: Test Unit (shard 4)
  • GitHub Check: Test Integration (shard 3)
  • GitHub Check: Test Unit (shard 2)
  • GitHub Check: Test Integration (shard 2)
  • GitHub Check: Test Integration (shard 4)
  • GitHub Check: Test Unit (shard 1)
  • GitHub Check: Test Unit (shard 3)
  • GitHub Check: Test Integration (shard 1)
  • GitHub Check: Runtime Stats Freshness Gate
  • GitHub Check: Test Conformance (SQLite)
  • GitHub Check: Test E2E
  • GitHub Check: CodSpeed Python benchmarks
  • GitHub Check: pyright (advisory)
  • GitHub Check: Build Preview
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (7)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min
Windows: unit tests use WindowsSelectorEventLoopPolicy; subprocess tests override back
Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries is blocked
FakeClock and mock_of import from tests._shared; inject via clock= and helper's spec subscript
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))
Never skip/xfail flaky tests; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/workers/test_runtime_builder.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/workers/test_runtime_builder.py
src/synthorg/!(persistence)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env; no os.environ.get outside startup
Non-provider transient I/O (e.g. git push/fetch) uses core.resilience.GeneralRetryHandler with a retryable predicate, never hand-rolled loop
Conversational human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...) (SEC-1)

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Numerics must live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of the form NAME: int|float|Final|Final[int]|Final[float] = literal
Module-size budget: controller 400 LOC, service/orchestrator 600 LOC, repository 500 LOC, adapter/integration 700 LOC, feature 100 LOC, code 500 LOC (default), tests 800 LOC, declarative exempt, generated glob-exempt per # module-kind: header
No from __future__ import annotations in Python 3.14+ (has PEP 649); use PEP 758 except: except A, B: no parens unless binding
Type hints required on public functions; mypy strict type checking; Google-style docstrings; line length 88; functions <50 lines
Error classes must follow <Domain><Condition>Error naming and inherit from DomainError, never from Exception/RuntimeError directly
Pydantic v2: every frozen model must have extra="forbid"; @computed_field auto-exempt; use # lint-allow: frozen-extra-forbid -- <reason> for extra="allow"/"ignore" boundaries
Use @computed_field for derived fields in Pydantic models
Use NotBlankStr for identifier fields in Pydantic models
Args models must be used at every system boundary; use parse_typed() for every external dict ingestion
Enforce immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries
Use asyncio.TaskGroup for fan-out/fan-in async patterns; helpers must catch Exception and re-raise MemoryError/RecursionError
Clock seam: clock: Clock | None = None parameter in services; tests inject FakeClock
Services own _lifecycle_lock; timed-out stops mark services unrestartable
Untrusted content: use wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML
Use from synthorg.observability import get_logger; variable must be named logger; never import logging or print() in app code
Event names must come from observability.events.<domain> ...

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Comments must explain WHY only; no reviewer citations / issue back-refs / migration framing

Files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
src/synthorg/{api/lifecycle_builder,workers/runtime_builder}.py

📄 CodeRabbit inference engine (CLAUDE.md)

EnvironmentService wires in _install_runtime_services behind has_persistence; provisions ambiently via ActiveSandboxEnvironment contextvar before engine run

Files:

  • src/synthorg/workers/runtime_builder.py
src/synthorg/workers/runtime_builder.py

📄 CodeRabbit inference engine (CLAUDE.md)

Runtime services: synthorg.workers.runtime_builder.build_runtime_services returns a RuntimeServices pair selected by ONE provider-present switch; use AgentEngineExecutionService + coordinator or NoProviderExecutionService + None as backstop

Files:

  • src/synthorg/workers/runtime_builder.py
🧠 Learnings (4)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/workers/test_runtime_builder.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
📚 Learning: 2026-05-21T22:55:20.496Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/models.py:114-114
Timestamp: 2026-05-21T22:55:20.496Z
Learning: In this repo’s “magic number” review standard, the existing gate in `scripts/check_no_magic_numbers.py` intentionally does NOT flag numeric literals used as raw call-site arguments. So, do not flag numeric literals passed as keyword arguments to Pydantic `Field()` (e.g., `Field(ge=0, le=100)` / `Field(ge=1, le=50)`)—this is an established idiom. Only treat numeric literals as “magic numbers” when they occur in the locations the gate checks (module-level assignments and function/method parameter defaults).

Applied to files:

  • tests/unit/workers/test_runtime_builder.py
  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
📚 Learning: 2026-05-23T12:24:00.128Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2080
File: tests/_shared/test_postgres_proxy.py:19-48
Timestamp: 2026-05-23T12:24:00.128Z
Learning: When creating test doubles for Python typing.Protocols in tests, prefer a hand-written Protocol fake (a concrete class that explicitly implements the Protocol) over `mock_of[T]` if the Protocol only defines annotation-only attributes (e.g., `username: str`, `password: str`, `dbname: str`) with no class-level values/assignments. This is because `mock_of[T]` relies on `create_autospec(..., spec_set=True)`, which enumerates members via `dir(spec)`; annotation-only attributes are not included, so `mock_of`’s kwarg-based attribute setting can raise `AttributeError: attribute not present on spec type`. In that annotation-only case, don’t recommend `mock_of[T]`—use an explicit fake class instead.

Applied to files:

  • tests/unit/workers/test_runtime_builder.py
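
The failure mode described in this learning can be reproduced with the standard library alone. The sketch below uses `unittest.mock.create_autospec` directly instead of the repository's `mock_of[T]` helper, and the `Credentials` Protocol and `FakeCredentials` class are hypothetical names chosen for illustration.

```python
from typing import Protocol
from unittest.mock import create_autospec


class Credentials(Protocol):
    # Annotation-only attributes: nothing is assigned at class level,
    # so these names do not appear in dir(Credentials).
    username: str
    password: str


class FakeCredentials:
    """Hand-written fake that explicitly provides the Protocol attributes."""

    def __init__(self, username: str, password: str) -> None:
        self.username = username
        self.password = password


def autospec_rejects(attribute: str) -> bool:
    """Return True if a spec_set autospec refuses to set `attribute`."""
    double = create_autospec(Credentials, spec_set=True, instance=True)
    try:
        setattr(double, attribute, "value")
    except AttributeError:
        return True
    return False


rejected = autospec_rejects("username")
fake = FakeCredentials(username="user", password="secret")
```

Because the spec is built from `dir(spec)`, the autospec double has no `username` slot to set, while the explicit fake satisfies the Protocol structurally without any mocking machinery.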
📚 Learning: 2026-05-21T22:55:09.289Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/config.py:29-30
Timestamp: 2026-05-21T22:55:09.289Z
Learning: For this repo’s Pydantic configuration idiom, do not treat numeric literals passed directly as arguments to `pydantic.Field(...)` as “magic numbers” during review. This includes call-site usages like `Field(default=0.2, ge=0.0, le=1.0)` (e.g., in config models such as `ToolAuthoringConfig`, `ToolValidationConfig`, `ToolsmithConfig`). Do not request extracting those `Field(...)` numeric arguments into named constants, since the repo’s `scripts/check_no_magic_numbers.py` intentionally excludes call-site `Field(...)` numerics and relies on `Field(...)` as the canonical way to express these constraints/defaults.

Applied to files:

  • src/synthorg/workers/execution_service.py
  • src/synthorg/workers/runtime_builder.py
🔇 Additional comments (5)
src/synthorg/workers/execution_service.py (1)

1-1: LGTM!

src/synthorg/workers/runtime_builder.py (1)

1-1: LGTM!

Also applies to: 185-207, 576-597, 774-783, 821-825, 840-840

tests/unit/workers/test_runtime_builder.py (1)

3-5: LGTM!

Also applies to: 9-10, 24-24, 289-355

.codespellrc (1)

7-7: LGTM!

scripts/_module_size_baseline.json (1)

159-159: LGTM!

Comment thread src/synthorg/workers/runtime_builder.py
@codecov

codecov Bot commented May 24, 2026

Codecov Report

❌ Patch coverage is 92.30769% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 87.13%. Comparing base (a78810a) to head (aa7037c).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/synthorg/workers/runtime_builder.py | 92.30% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2097   +/-   ##
=======================================
  Coverage   87.12%   87.13%           
=======================================
  Files        2251     2251           
  Lines      130302   130311    +9     
=======================================
+ Hits       113531   113541   +10     
+ Misses      16756    16755    -1     
  Partials       15       15           


Aureliolo added 4 commits May 24, 2026 15:55
…r coordinator resolve and subsystem-built summary log
- Add security_enabled / security_enforcement_mode to the
  agent_engine_built startup log so the post-construction
  runtime_services event matches the schema of the agent_engine
  decision event right above it (CodeRabbit inline, runtime_builder.py:826-834).
- Cover the _build_runtime_coordinator try/except resolve-failure
  path so codecov/patch is back at 100% on the touched lines
  (new TestRuntimeCoordinatorResolveFailure + extended
  TestBootLogSafetySpineState assertion for the agent_engine_built event).
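
The schema-matching idea in the commit message can be sketched as follows: build the safety-spine fields once and attach them to every `runtime_services` startup event. This uses the standard-library `logging` module and plain dictionaries as a stand-in for the repository's structured logger. The `API_APP_STARTUP` value, the `decision` key, the helper names, and the `"shadow"` mode string are assumptions for illustration.

```python
import logging

logger = logging.getLogger("runtime_builder_sketch")

# Assumed value; the real constant is defined in the repository.
API_APP_STARTUP = "api.app.startup"


def spine_fields(security_enabled: bool, enforcement_mode: str) -> dict[str, object]:
    """Fields that every runtime_services startup event should carry."""
    return {
        "service": "runtime_services",
        "security_enabled": security_enabled,
        "security_enforcement_mode": enforcement_mode,
    }


def startup_event(decision: str, **spine: object) -> dict[str, object]:
    """Assemble one startup event and emit it through stdlib logging."""
    event = {"event": API_APP_STARTUP, "decision": decision, **spine}
    logger.info("%s %s", API_APP_STARTUP, event)
    return event


spine = spine_fields(security_enabled=True, enforcement_mode="shadow")
decision_event = startup_event("agent_engine", **spine)
built_event = startup_event("agent_engine_built", **spine)
```

Deriving both events from the same helper keeps their keys identical, so an operator filtering on `security_enforcement_mode` sees the decision event and the post-construction event alike.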
@Aureliolo Aureliolo force-pushed the fix/agent-runtime-online branch from 6dab029 to aa7037c May 24, 2026 14:40
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 24, 2026 14:42 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit f187b31 into main May 24, 2026
90 checks passed
@Aureliolo Aureliolo deleted the fix/agent-runtime-online branch May 24, 2026 14:51
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 24, 2026 14:51 — with GitHub Actions Inactive
Aureliolo pushed a commit that referenced this pull request May 24, 2026
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub
Models). Commit-based changelog below._

### What you'll notice
- New brownfield codebase intake mode supports merger and acquisition
scenarios.
- Added deep CEO interview feature to improve project charter creation.
- Introduced mission control and flight recorder operator cockpit for
better operational oversight.
- Research mode added for enhanced exploratory work.
- Runtime services now log safety-spine state at boot for clearer
diagnostics.

### What's new
- Research mode feature enables deeper data exploration.
- CEO interview integration helps shape project charters.
- Mission control and flight recorder cockpit introduced for operational
tracking.

### Under the hood
- Improved codebase modularity with module-size gates and lint
tightening.
- Added __init__.py to 21 test directories for better test discovery.
- Promoted six transitive dependencies to direct dependencies for
clarity.
- Split codespell ignore list into vocabulary and source renames.
- Decomposed oversized web utilities, hooks, and libraries for
maintainability.
- Enhanced CI with Lychee link checker integration and retry logic for
cosign signing.
- Sharded unit and integration tests and added Postgres service
container in CI.
- Updated infrastructure and web dependencies; maintained lock files.

<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*
---


## [0.8.8](v0.8.7...v0.8.8) (2026-05-24)


### Features

* brownfield codebase intake (merger/acquisition entry mode)
([#2042](#2042))
([e287621](e287621)),
closes [#1975](#1975)
* deep CEO interview to project charter
([#2045](#2045))
([904f2fb](904f2fb))
* mission control + flight recorder operator cockpit
([#2044](#2044))
([1c2660b](1c2660b))
* research mode
([#2041](#2041))
([f81a5ac](f81a5ac)),
closes [#1989](#1989)
* surface safety-spine state in runtime-services boot log (closes
[#2096](#2096))
([#2097](#2097))
([f187b31](f187b31))


### Refactoring

* add __init__.py to 21 leaf test directories (INP001)
([#2081](#2081))
([2592118](2592118)),
closes [#2064](#2064)
* codebase modularity (1/4) - module-size gates + lint tightening +
tools ([#2078](#2078))
([556fbd9](556fbd9)),
closes [#2047](#2047)
[#2040](#2040)
* promote 6 transitive deps to direct deps
([#2083](#2083))
([adedc6a](adedc6a))
* split codespell ignore-words-list into vocab + source renames
([#2085](#2085))
([917d98a](917d98a)),
closes [#2074](#2074)
* **web:** PR A foundation, decompose oversized utils/hooks/lib
([#2092](#2092))
([#2098](#2098))
([aedbba5](aedbba5))


### CI/CD

* exclude slsa.dev from lychee (transient timeout on canonical badge)
([#2090](#2090))
([346c51d](346c51d))
* fix paths-filter shallow-clone race and scorecard allowlist
([#2089](#2089))
([7cd7ce8](7cd7ce8))
* refresh .test_durations.{unit,integration}
([#2087](#2087))
([ddf2d86](ddf2d86))
* retry cosign sign on transient GHCR/Rekor failures
([#2100](#2100))
([da9422a](da9422a))
* shard test-unit + test-integration, sysmon coverage, Postgres service
container ([#2080](#2080))
([0768787](0768787))
* wire Lychee link-checker (workflow + installer + pre-push hook)
([#2084](#2084))
([1c0694a](1c0694a))


### Maintenance

* Lock file maintenance
([#2086](#2086))
([a78810a](a78810a))
* Update Infrastructure dependencies
([#2055](#2055))
([041ad8b](041ad8b))
* Update Web dependencies
([#2054](#2054))
([4d57b9a](4d57b9a))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Boot log lacks safety-spine state (operators cannot confirm SecOps mode without grepping config)
