
fix: bring the multi-agent coordinator online at boot #2007

Merged
Aureliolo merged 6 commits into main from fix/coordinator-online
May 18, 2026

Conversation

@Aureliolo
Owner

Summary

Wires build_coordinator at boot behind the provider-present switch so /coordinate runs decompose, route, parallel execution, then rollup end to end when a provider is configured. An empty company (no provider) still honestly 503s. The build_coordinator ghost-wiring manifest line is flipped PENDING -> ENFORCED in this same PR (CORE acceptance criterion), and the no-ghost-wiring gate passes.

  • One shared boot AgentEngine powers both the worker execution service and the coordinator (RuntimeServices).
  • post_setup_reinit dual hot-swaps worker service + coordinator so a provider added after an empty-company start wakes the whole runtime with no restart.
  • New BFS StateMachine.path_to / transition_path walks the parent task through its valid lifecycle to the rollup-derived status.

Pre-PR review hardening (this branch)

23 review agents ran; 27 valid findings addressed (one conflicting finding flagged invalid: silent-failure-hunter misread PEP 758 except A, B: as a syntax error; it is project-mandated and the suite passes):

  • Structure: extracted engine/coordination/parent_rollup.py (rollup compute + lifecycle walk + phase wrappers); service.py is now under 800 lines and the former 167-line _phase_update_parent is gone.
  • Concurrency: atomic AppState.set_coordinator_if_absent seam removes the boot check-then-act TOCTOU while preserving injection-over-autowire; scoped diagnostic re-read on a mid-walk hop rejection (the submit seam already validates each transition, so this is a logged partial advance, not corruption).
  • Error handling: _resolve_routing_scorer_config fail-open split into resolve vs projection stages with distinct log context; runtime-build failures wrapped in a new RuntimeServicesBuildError domain error; _rebuild_runtime_services failure now logs at ERROR (gates setup_complete).
  • Perf: the three independent coordinator-build awaits run concurrently under a TaskGroup.
  • Docs: docs/design/coordination.md and README.md updated to current state.
  • +24 tests: new test_parent_rollup.py (no-path, empty-path no-op, partial-hop failure, diagnostic re-read fallback, phase wrapper), cyclic-graph defensive test, scorer fail-open, set_coordinator_if_absent injected-wins (unit + integration), dual-swap failure atomicity, subtask ASSIGNED promotion, per-hop target-status assertions.

Test plan

  • Full suite green pre-change (31149 passed, 50 platform-skipped).
  • All 236 touched-area unit/integration/e2e tests pass post-change.
  • ruff, ruff format, mypy --strict clean on all changed files.
  • Every pre-commit and pre-push gate passed on push (incl. no-ghost-wiring, domain-error-hierarchy, frozen-model, settings->startup trace, no reviewer/migration framing, no em-dash).

Review coverage

Pre-reviewed by 23 agents (docs-consistency, comment-quality-rot, code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, conventions-enforcer, security-reviewer, api-contract-drift, test-quality-reviewer, async-concurrency-reviewer, issue-resolution-verifier, tool-parity-checker, + 6 audit mini-passes). issue-resolution-verifier: all 6 acceptance criteria RESOLVED.

Closes #1958

Aureliolo added 5 commits May 18, 2026 18:11
Extract parent-rollup module (service.py <800, no >50-line method), atomic set_coordinator_if_absent seam, split fail-open scorer config, TaskGroup-parallel coordinator build, RuntimeServicesBuildError domain wrap, scoped concurrency-diagnostic re-read, +24 tests, docs sync. Pre-reviewed by 23 agents, 27 findings addressed.
@github-actions
Contributor

github-actions Bot commented May 18, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Contributor

coderabbitai Bot commented May 18, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: f994849f-3930-4ec3-a904-40985bcca4c5

📥 Commits

Reviewing files that changed from the base of the PR and between 67092a3 and f2b6895.

📒 Files selected for processing (2)
  • tests/integration/api/test_runtime_install_ordering.py
  • tests/unit/api/test_state.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build Backend
  • GitHub Check: Lighthouse Site
  • GitHub Check: Test E2E
  • GitHub Check: Test Integration
  • GitHub Check: Dashboard Test
  • GitHub Check: Test Unit
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}. Async auto. Timeout 30s global. Coverage 80% min

Test doubles: FakeClock for Clock seam; mock_of[T](**overrides) for typed-boundary substitutions; SimpleNamespace for attribute-bags. Bare MagicMock at typed boundary blocked by scripts/check_mock_spec.py (zero-tolerance, no baseline). FakeClock and mock_of from tests._shared; inject via clock= and helper's spec subscript

Flaky tests: NEVER skip/xfail; fix fundamentally. Use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/api/test_state.py
  • tests/integration/api/test_runtime_install_ordering.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/api/test_state.py
  • tests/integration/api/test_runtime_install_ordering.py
tests/**/*test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Files:

  • tests/unit/api/test_state.py
  • tests/integration/api/test_runtime_install_ordering.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Files:

  • tests/unit/api/test_state.py
  • tests/integration/api/test_runtime_install_ordering.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: No region/locale privilege; use metric units; British English per regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Every convention PR ships its enforcement gate per convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Timeout/slow test failures = source-code regression; never edit `tests/baselines/unit_timing.json` or any `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`. Both families PreToolUse-blocked. Per-invocation bypass: `ALLOW_BASELINE_GROWTH=1 git commit ...`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (`loadfile` prevents 3.14+Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Pre-commit/pre-push hooks via `.pre-commit-config.yaml`. Tool-call gates: `.claude/settings.json` PreToolUse (`scripts/check_*.sh`/`.py`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: Squash merge. PR body becomes squash commit; trailers (`Release-As`, `Closes #N`) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: GitHub queries: `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:59:20.107Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/api/test_state.py
  • tests/integration/api/test_runtime_install_ordering.py
🔇 Additional comments (2)
tests/unit/api/test_state.py (1)

17-17: LGTM!

Also applies to: 114-114, 123-123, 128-128, 134-134, 150-150, 159-159, 163-210, 294-294, 300-300, 316-316, 322-322

tests/integration/api/test_runtime_install_ordering.py (1)

1-12: LGTM!

Also applies to: 17-17, 23-23, 43-50, 52-71, 73-95


Walkthrough

This pull request bundles runtime services at boot: build_runtime_services now returns a RuntimeServices pair (worker_execution_service and optional MultiAgentCoordinator) sharing a boot AgentEngine. App startup installs runtime services via a new _install_runtime_services hook and preserves explicitly injected coordinators. AppState gains coordinator wiring APIs (set/set_if_absent/swap) and a performance-tracker availability flag. The coordinator delegates rollup and parent-update behavior to a new parent_rollup module; state-machine pathfinding and transition_path helpers compute shortest lifecycle hops. post_setup_reinit hot-swaps both services on provider changes. Tests and docs updated; build_coordinator manifest flipped to ENFORCED.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'fix: bring the multi-agent coordinator online at boot' accurately and concisely describes the main objective of the changeset: wiring the coordinator at boot. |
| Description check | ✅ Passed | The description provides comprehensive detail about wiring build_coordinator at boot, dual hot-swap behavior, BFS path-finding, extracted modules, error handling, and test coverage, all directly related to the changeset. |
| Linked Issues check | ✅ Passed | All functional acceptance criteria from #1958 are met: coordinator is wired at boot behind provider-present switch, end-to-end decompose→route→parallel→rollup executes on real tasks, and build_coordinator manifest line is flipped PENDING→ENFORCED with no-ghost-wiring gate passing. |
| Out of Scope Changes check | ✅ Passed | All changes directly support coordinator boot-wiring: new RuntimeServices abstraction, lifecycle path-finding, parent rollup logic extraction, AppState coordinator APIs, coordinator rebuild during reinit, new tests. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 41.38% which is sufficient. The required threshold is 40.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 18, 2026 16:19 — with GitHub Actions Inactive
@codspeed-hq

codspeed-hq Bot commented May 18, 2026

Merging this PR will not alter performance

✅ 33 untouched benchmarks
⏩ 21 skipped benchmarks [1]


Comparing fix/coordinator-online (f2b6895) with main (6a9c0aa)


Footnotes

  1. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/api/test_runtime_install_ordering.py`:
- Around line 14-15: Replace the use of MagicMock for the injected-coordinator
test with the project test-double helper: import and use
mock_of[MultiAgentCoordinator](**overrides) instead of MagicMock(spec=...)
wherever the injected-coordinator mock is created (the top-level import and the
mock instances referenced in the injected-coordinator test and the blocks around
lines 74–95); ensure you remove the MagicMock import if no longer used and pass
any needed method/property overrides to mock_of so the typed boundary conforms
to the mock_of[MultiAgentCoordinator] convention.

In `@tests/unit/api/test_state.py`:
- Around line 174-235: Several tests (test_set_coordinator_attaches_when_none,
test_set_coordinator_is_once_only,
test_set_coordinator_if_absent_installs_when_none,
test_set_coordinator_if_absent_keeps_injected,
test_swap_coordinator_attaches_when_none,
test_swap_coordinator_replaces_existing,
test_swap_coordinator_noop_when_identical) use
MagicMock(spec=MultiAgentCoordinator) at a typed boundary which violates the
mock-spec gate; replace each MagicMock(...) with the repository-approved typed
test double factory mock_of[MultiAgentCoordinator](...) (or
mock_of(MultiAgentCoordinator, **overrides) per local helper) so that the spec
is preserved via the mock_of helper and CI’s scripts/check_mock_spec.py will not
fail.
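
For readers unfamiliar with the `mock_of[T](**overrides)` shape referenced in both findings, the sketch below shows one way such a helper could be built on `unittest.mock`. This `mock_of` is a hypothetical re-creation for illustration only; the real helper lives in `tests._shared` and may behave differently. The `Coordinator` class is a placeholder for a typed boundary such as `MultiAgentCoordinator`.

```python
from unittest.mock import MagicMock


class _MockOf:
    """Hypothetical stand-in for a spec-subscripted test-double factory."""

    def __getitem__(self, spec: type):
        def build(**overrides: object) -> MagicMock:
            # spec= makes reads of attributes outside the spec raise
            # AttributeError instead of silently returning a new mock.
            double = MagicMock(spec=spec)
            for name, value in overrides.items():
                # Reject overrides that the real type does not define.
                if not hasattr(spec, name):
                    msg = f"{spec.__name__} has no attribute {name!r}"
                    raise AttributeError(msg)
                setattr(double, name, value)
            return double

        return build


mock_of = _MockOf()


class Coordinator:
    """Placeholder for a typed boundary."""

    def is_ready(self) -> bool:
        return True


double = mock_of[Coordinator](is_ready=lambda: False)
```

The point of routing every double through one factory is that the spec can never be forgotten, which is what a gate like `scripts/check_mock_spec.py` is described as enforcing.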

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: dbd61760-648c-4ba6-8957-2a5a4bbc9875

📥 Commits

Reviewing files that changed from the base of the PR and between 6a9c0aa and 67092a3.

📒 Files selected for processing (30)
  • CLAUDE.md
  • README.md
  • docs/design/coordination.md
  • scripts/_ghost_wiring_manifest.txt
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/setup/agent_helpers.py
  • src/synthorg/api/state.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/core/task_transitions.py
  • src/synthorg/engine/coordination/group_builder.py
  • src/synthorg/engine/coordination/parent_rollup.py
  • src/synthorg/engine/coordination/section_config.py
  • src/synthorg/engine/coordination/service.py
  • src/synthorg/engine/errors.py
  • src/synthorg/settings/definitions/coordination.py
  • src/synthorg/workers/runtime_builder.py
  • tests/e2e/test_coordinator_online_seam.py
  • tests/e2e/test_runtime_online_seam.py
  • tests/integration/api/conftest.py
  • tests/integration/api/test_post_setup_reinit_wake.py
  • tests/integration/api/test_runtime_install_ordering.py
  • tests/unit/api/test_state.py
  • tests/unit/core/test_state_machine.py
  • tests/unit/core/test_task_transitions.py
  • tests/unit/engine/test_coordination_group_builder.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/engine/test_coordination_service.py
  • tests/unit/engine/test_parent_rollup.py
  • tests/unit/settings/test_coordination_decomposition_model.py
  • tests/unit/workers/test_runtime_builder.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test E2E
  • GitHub Check: Test Unit
  • GitHub Check: Test Integration
  • GitHub Check: Lighthouse Site
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Design Spec (MANDATORY): read docs/design/ page before implementing; deviations need approval. See DESIGN_SPEC.md.

Regional Defaults (MANDATORY): no region/currency/locale privileged; metric units; British English. See docs/reference/regional-defaults.md.

Configuration Precedence (MANDATORY): DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env at boot site. YAML is a company-template ingestion format, not a precedence tier. No os.environ.get outside startup; pre-init Cat-2 reads use settings.bootstrap_resolver.resolve_init_value. See docs/reference/configuration-precedence.md.

No Hardcoded Values (MANDATORY): numerics live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of the form NAME: int|float|Final|Final[int]|Final[float] = literal. Enforced by scripts/check_no_magic_numbers.py.

Comments WHY only; no reviewer citations / issue back-refs / migration framing. Enforced by check_no_review_origin_in_code.py + check_no_migration_framing.py.

No from __future__ import annotations (3.14 has PEP 649). PEP 758 except: except A, B: no parens unless binding.

Type hints on public functions; mypy strict. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines.

Errors: <Domain><Condition>Error from DomainError; never inherit Exception/RuntimeError/etc directly. Enforced by check_domain_error_hierarchy.py.

Pydantic v2 frozen + extra="forbid" on every frozen model project-wide (gate check_frozen_model_extra_forbid.py; @computed_field auto-exempt, per-line # lint-allow: frozen-extra-forbid -- <reason> for extra="allow"/"ignore" boundaries); @computed_field for derived; NotBlankStr for identifiers.

Args models at every system boundary; parse_typed() for every external dict ingestion. Enforced by check_boundary_typed.py...

Files:

  • src/synthorg/core/task_transitions.py
  • tests/integration/api/test_post_setup_reinit_wake.py
  • tests/integration/api/conftest.py
  • src/synthorg/engine/errors.py
  • src/synthorg/settings/definitions/coordination.py
  • tests/unit/engine/test_parent_rollup.py
  • tests/unit/settings/test_coordination_decomposition_model.py
  • src/synthorg/api/state.py
  • src/synthorg/engine/coordination/section_config.py
  • tests/e2e/test_runtime_online_seam.py
  • src/synthorg/api/controllers/setup/agent_helpers.py
  • tests/unit/engine/test_coordination_service.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/api/test_state.py
  • tests/unit/engine/test_coordination_group_builder.py
  • src/synthorg/engine/coordination/group_builder.py
  • tests/unit/core/test_state_machine.py
  • tests/unit/core/test_task_transitions.py
  • src/synthorg/workers/runtime_builder.py
  • tests/e2e/test_coordinator_online_seam.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/engine/coordination/parent_rollup.py
  • src/synthorg/api/app.py
  • tests/unit/workers/test_runtime_builder.py
  • tests/integration/api/test_runtime_install_ordering.py
  • src/synthorg/engine/coordination/service.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/core/task_transitions.py
  • src/synthorg/engine/errors.py
  • src/synthorg/settings/definitions/coordination.py
  • src/synthorg/api/state.py
  • src/synthorg/engine/coordination/section_config.py
  • src/synthorg/api/controllers/setup/agent_helpers.py
  • src/synthorg/engine/coordination/group_builder.py
  • src/synthorg/workers/runtime_builder.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/engine/coordination/parent_rollup.py
  • src/synthorg/api/app.py
  • src/synthorg/engine/coordination/service.py
{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Doc Numeric Claims (MANDATORY): numerics in README + public docs sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers. See data/README.md.

Files:

  • README.md
  • docs/design/coordination.md
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Diagrams: use d2 for architecture / nested containers, mermaid for flowcharts / sequence / pipelines. Markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI.

Files:

  • README.md
  • docs/design/coordination.md
  • CLAUDE.md
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}. Async auto. Timeout 30s global. Coverage 80% min.

xdist: -n 8 --dist=loadfile auto-applied via pyproject addopts (loadfile prevents 3.14+Windows ProactorEventLoop leak).

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back.

Test doubles: use FakeClock for the Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags. Bare MagicMock at typed boundaries is blocked by scripts/check_mock_spec.py (zero-tolerance, no baseline). Import from tests._shared; inject via clock= and helper's spec subscript.

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)). Flaky: NEVER skip/xfail; fix fundamentally. Use asyncio.Event().wait() not sleep(large).

Files:

  • tests/integration/api/test_post_setup_reinit_wake.py
  • tests/integration/api/conftest.py
  • tests/unit/engine/test_parent_rollup.py
  • tests/unit/settings/test_coordination_decomposition_model.py
  • tests/e2e/test_runtime_online_seam.py
  • tests/unit/engine/test_coordination_service.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/api/test_state.py
  • tests/unit/engine/test_coordination_group_builder.py
  • tests/unit/core/test_state_machine.py
  • tests/unit/core/test_task_transitions.py
  • tests/e2e/test_coordinator_online_seam.py
  • tests/unit/workers/test_runtime_builder.py
  • tests/integration/api/test_runtime_install_ordering.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/integration/api/test_post_setup_reinit_wake.py
  • tests/integration/api/conftest.py
  • tests/unit/engine/test_parent_rollup.py
  • tests/unit/settings/test_coordination_decomposition_model.py
  • tests/e2e/test_runtime_online_seam.py
  • tests/unit/engine/test_coordination_service.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/api/test_state.py
  • tests/unit/engine/test_coordination_group_builder.py
  • tests/unit/core/test_state_machine.py
  • tests/unit/core/test_task_transitions.py
  • tests/e2e/test_coordinator_online_seam.py
  • tests/unit/workers/test_runtime_builder.py
  • tests/integration/api/test_runtime_install_ordering.py
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: Planning (MANDATORY): present every plan for accept/deny before coding.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: Convention Rollout (MANDATORY): every convention PR ships its enforcement gate. See docs/reference/convention-gates.md.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: Test Regression (MANDATORY): timeout/slow failures = source-code regression; never edit `tests/baselines/unit_timing.json` or any `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`. Both families are PreToolUse-blocked. Per-invocation bypass for gate baselines: `ALLOW_BASELINE_GROWTH=1 git commit ...` (requires explicit user approval).
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: Post-Implementation + Pre-PR Review (MANDATORY): after issue: branch + commit + push (no auto-PR); use `/pre-pr-review` (`gh pr create` is blocked by `scripts/check_no_pr_create.sh`). After PR: `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: API startup lifecycle: Two phases: construction (`create_app` body) wires synchronous services; on_startup (`_build_lifecycle.on_startup`) wires services needing connected persistence. Construction ordering: `agent_registry` before `auto_wire_meetings`; `tunnel_provider` unconditional. On-startup ordering: `SettingsService` auto-wire precedes `WorkflowExecutionObserver` registration; `OntologyService` wires after `persistence.connect()`. Runtime services: `synthorg.workers.runtime_builder.build_runtime_services` selects ONE provider-present switch, returns `RuntimeServices` pair (execution service + coordinator). Empty-company rejects task creation (AgentRuntimeNotConfiguredError, 4014) and 503s `/coordinate`. Setup completion: `post_setup_reinit()` propagates failures; `settings_svc.set("api", "setup_complete", "true")` only if reinit clean. Check/validate/reinit/persist serialised under `COMPLETE_LOCK` to prevent race conditions.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: MCP: Define `ToolHandler` + `args_model`; call `require_admin_guardrails()` on admin tools; route through service layers. See mcp-handler-contract.md.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced. Signed commits required on protected refs (GPG/SSH or GitHub App). Branches: `<type>/<slug>` from main. Pre-commit/pre-push hooks: `.pre-commit-config.yaml`. Tool-call gates: `.claude/settings.json` PreToolUse. Squash merge. PR body becomes squash commit; trailers (`Release-As`, `Closes #N`) in PR body.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-18T16:18:52.299Z
Learning: Workflow: After every squash merge → `/post-merge-cleanup`. CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API.
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/core/task_transitions.py
  • tests/integration/api/test_post_setup_reinit_wake.py
  • tests/integration/api/conftest.py
  • src/synthorg/engine/errors.py
  • src/synthorg/settings/definitions/coordination.py
  • tests/unit/engine/test_parent_rollup.py
  • tests/unit/settings/test_coordination_decomposition_model.py
  • src/synthorg/api/state.py
  • src/synthorg/engine/coordination/section_config.py
  • tests/e2e/test_runtime_online_seam.py
  • src/synthorg/api/controllers/setup/agent_helpers.py
  • tests/unit/engine/test_coordination_service.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/api/test_state.py
  • tests/unit/engine/test_coordination_group_builder.py
  • src/synthorg/engine/coordination/group_builder.py
  • tests/unit/core/test_state_machine.py
  • tests/unit/core/test_task_transitions.py
  • src/synthorg/workers/runtime_builder.py
  • tests/e2e/test_coordinator_online_seam.py
  • src/synthorg/core/state_machine.py
  • src/synthorg/engine/coordination/parent_rollup.py
  • src/synthorg/api/app.py
  • tests/unit/workers/test_runtime_builder.py
  • tests/integration/api/test_runtime_install_ordering.py
  • src/synthorg/engine/coordination/service.py
📚 Learning: 2026-05-16T18:36:19.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/guides/contributing.md:95-95
Timestamp: 2026-05-16T18:36:19.195Z
Learning: In the SynthOrg repo, the “Doc Numeric Claims (MANDATORY)” RS-marker rule should be applied only to these docs: README.md; docs/index.md; docs/roadmap/index.md; docs/architecture/decisions.md; docs/reference/convention-gates.md. This rule is enforced by scripts/check_doc_numeric_macros.py (with runtime substitution by scripts/inject_runtime_stats.py), so reviewers should not flag similar numeric-claim issues in other paths (e.g., anything under docs/guides/). When checking those scoped files, the rule skips fenced code blocks and only flags digits that are adjacent to stat nouns (tests/providers/agents/stars/releases). Numeric CLI flags like “--num-workers=4” inside fenced bash code blocks are not subject to this rule.

Applied to files:

  • README.md
📚 Learning: 2026-05-16T18:36:31.446Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/reference/conventions.md:787-789
Timestamp: 2026-05-16T18:36:31.446Z
Learning: In Aureliolo/synthorg, follow the `Doc Numeric Claims (MANDATORY)` rule enforced by `scripts/check_doc_numeric_macros.py` only for these markdown files: `README.md`, `docs/index.md`, `docs/roadmap/index.md`, `docs/architecture/decisions.md`, and `docs/reference/convention-gates.md`. The gate flags digits that appear adjacent to the stat nouns `tests`, `providers`, `agents`, `stars`, and `releases`—those numeric claims must use the required `<!--RS:...-->` macro format. Do not apply this rule to prose that mentions Python version numbers (e.g., “Python 3.14” / “Python 3.15”); those should not be flagged as requiring `<!--RS:...-->`.

Applied to files:

  • README.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: In the synthorg repo, the “Doc Numeric Claims (MANDATORY)” RS-marker rule is enforced only for this exact set of Markdown files: README.md, docs/index.md, docs/roadmap/index.md, docs/architecture/decisions.md, and docs/reference/convention-gates.md. During code reviews, do not raise RS-marker/numeric-claims findings for numeric values in any other files (e.g., docs/getting_started.md, docs/guides/*, docs/reference/conventions.md), since they are not checked or injected by scripts/check_doc_numeric_macros.py or scripts/inject_runtime_stats.py.

Applied to files:

  • README.md
📚 Learning: 2026-05-16T18:36:31.446Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/reference/conventions.md:787-789
Timestamp: 2026-05-16T18:36:31.446Z
Learning: In Aureliolo/synthorg, do not require adding `<!--RS:...-->` “Doc Numeric Claims (MANDATORY)” numeric macros for Python version numbers mentioned in documentation prose (e.g., “Python 3.14”, “Python 3.15”). The `scripts/check_doc_numeric_macros.py` gate only applies to `README.md`, `docs/index.md`, `docs/roadmap/index.md`, `docs/architecture/decisions.md`, and `docs/reference/convention-gates.md`, and it only flags digits adjacent to specific stat nouns (tests/providers/agents/stars/releases), not language version mentions like “Python 3.14”.

Applied to files:

  • README.md
  • docs/design/coordination.md
  • CLAUDE.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing Markdown in the synthorg repo, account for the CI gate `check_doc_numeric_macros.py`: it skips fenced code blocks entirely, and it only flags digits that are adjacent to these stat nouns: `tests`, `providers`, `agents`, `stars`, `releases`. Therefore, numeric examples such as CLI flag values (e.g., `--num-workers=4` in fenced bash blocks) and prose version numbers (e.g., `3.14`/`3.15`) are not expected to trigger this check; prioritize changes only when digits appear next to one of the listed nouns (e.g., “5 tests”, “10 stars”, etc.).

Applied to files:

  • README.md
  • docs/design/coordination.md
  • CLAUDE.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing markdown files for the "Doc Numeric Claims (MANDATORY)" RS-marker rule, only require/flag missing RS markers in the files that are actually in-scope for the rule. The scope is enforced via an identical _SCOPED_FILES allowlist in scripts/check_doc_numeric_macros.py and scripts/inject_runtime_stats.py, and currently includes: README.md; docs/index.md; docs/roadmap/index.md; docs/architecture/decisions.md; docs/reference/convention-gates.md. For any other markdown files (e.g., docs/getting_started.md, docs/guides/*), missing RS markers for numeric claims are no-ops and should NOT be flagged.

Applied to files:

  • README.md
  • docs/design/coordination.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing Markdown in the synthorg repo against the `check_doc_numeric_macros.py` gate, account for its documented behavior: it skips fenced code blocks entirely, and it only flags digits that are adjacent to specific stat nouns (`tests`, `providers`, `agents`, `stars`, `releases`). As a result, CLI-style numbers (e.g., `--num-workers=4`) inside fenced bash code blocks should never be treated as violations of this gate; only non-fenced text needs checking, and only around those specific nouns.

Applied to files:

  • docs/design/coordination.md
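
The gate behaviour described in these learnings can be sketched roughly as follows. This is a minimal illustration, not the contents of `scripts/check_doc_numeric_macros.py`: the function name `find_numeric_claims`, the regular expression, and the reading of "adjacent" as "a number immediately followed by a stat noun" are all assumptions, and the sketch leaves out the `<!--RS:...-->` macro handling and the scoped-file allowlist.

```python
import re

# Stat nouns named in the review learnings.
STAT_NOUNS = ("tests", "providers", "agents", "stars", "releases")

# Assumed interpretation of "adjacent": a number directly followed by a stat noun.
_CLAIM = re.compile(r"\b\d[\d,]*\s+(?:" + "|".join(STAT_NOUNS) + r")\b")


def find_numeric_claims(markdown: str) -> list[tuple[int, str]]:
    """Return (line number, matched text) for numeric claims outside code fences."""
    findings: list[tuple[int, str]] = []
    in_fence = False
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        # A fence delimiter toggles the "inside fenced code" state.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # fenced code blocks are skipped entirely
        for match in _CLAIM.finditer(line):
            findings.append((lineno, match.group(0)))
    return findings


sample = (
    "We ship 5 tests.\n"
    "```bash\n"
    "run --num-workers=4 agents\n"
    "```\n"
    "Python 3.14 is required.\n"
)
print(find_numeric_claims(sample))
```

In the sample, only the prose claim on the first line is reported; the CLI flag inside the fenced block and the Python version number are ignored, which matches the behaviour the learnings describe.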
📚 Learning: 2026-05-17T11:45:11.839Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1952
File: src/synthorg/settings/definitions/api.py:594-638
Timestamp: 2026-05-17T11:45:11.839Z
Learning: In SynthOrg (Aureliolo/synthorg) pre-alpha, apply the strict no-backward-compat policy: any setting-key rename must be fully completed in the same change/PR with all repo callers updated, and you should not keep legacy aliases or compatibility fallbacks. When reviewing, do not flag a setting-key rename as a breaking upgrade hazard if the rename is repo-wide and fully implemented within the same PR.

Applied to files:

  • src/synthorg/settings/definitions/coordination.py
📚 Learning: 2026-05-17T11:45:11.839Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1952
File: src/synthorg/settings/definitions/api.py:594-638
Timestamp: 2026-05-17T11:45:11.839Z
Learning: In this repository, SynthOrg is pre-alpha and uses a strict no-backward-compat policy for setting-key renames. When reviewing code under src/synthorg/settings, do NOT flag a setting-key rename as an “upgrade-safety” issue if the rename is complete/atomic in the same PR: all callers/usages of the old key are updated simultaneously, and the PR does not keep any legacy aliases, compatibility fallbacks, or migration/rollback paths for the old key.

Applied to files:

  • src/synthorg/settings/definitions/coordination.py
🔇 Additional comments (43)
CLAUDE.md (1)

82-83: LGTM!

README.md (1)

48-48: LGTM!

docs/design/coordination.md (1)

10-10: LGTM!

scripts/_ghost_wiring_manifest.txt (1)

27-27: LGTM!

src/synthorg/core/state_machine.py (1)

24-24: LGTM!

Also applies to: 203-263

src/synthorg/core/task_transitions.py (1)

103-119: LGTM!

src/synthorg/engine/errors.py (1)

275-284: LGTM!

src/synthorg/settings/definitions/coordination.py (1)

56-70: LGTM!

src/synthorg/api/state.py (1)

1151-1162: LGTM!

Also applies to: 1164-1174, 1176-1209, 1211-1235, 1245-1248

src/synthorg/engine/coordination/section_config.py (1)

67-71: LGTM!

Also applies to: 99-109

src/synthorg/workers/runtime_builder.py (1)

64-81: LGTM!

Also applies to: 178-205, 207-249, 251-300, 302-368

src/synthorg/api/app.py (1)

995-1061: LGTM!

src/synthorg/api/controllers/setup/agent_helpers.py (1)

142-144: LGTM!

Also applies to: 147-196

tests/unit/core/test_state_machine.py (1)

127-198: LGTM!

tests/unit/core/test_task_transitions.py (1)

7-11: LGTM!

Also applies to: 247-286

tests/unit/engine/test_coordination_group_builder.py (1)

5-5: LGTM!

Also applies to: 62-89

tests/unit/engine/test_coordination_section_config.py (1)

37-61: LGTM!

tests/unit/engine/test_coordination_service.py (1)

17-17: LGTM!

Also applies to: 349-352, 381-388, 621-621, 659-659

tests/unit/engine/test_parent_rollup.py (1)

1-275: LGTM!

tests/unit/settings/test_coordination_decomposition_model.py (1)

1-72: LGTM!

tests/unit/workers/test_runtime_builder.py (1)

1-2: LGTM!

Also applies to: 4-4, 12-12, 16-16, 22-25, 31-72, 75-225

tests/integration/api/conftest.py (1)

62-63: LGTM!

Also applies to: 68-69, 95-95

tests/integration/api/test_post_setup_reinit_wake.py (1)

1-112: LGTM!

src/synthorg/engine/coordination/parent_rollup.py (8)

1-48: LGTM!


50-79: LGTM!


81-114: LGTM!


117-195: LGTM!


198-227: LGTM!


230-291: LGTM!


294-332: LGTM!


335-428: LGTM!

src/synthorg/engine/coordination/service.py (4)

1-61: LGTM!


324-331: LGTM!


355-361: LGTM!


453-693: LGTM!

src/synthorg/engine/coordination/group_builder.py (2)

9-9: LGTM!


135-161: LGTM!

tests/e2e/test_runtime_online_seam.py (2)

1-66: LGTM!


123-146: LGTM!

tests/e2e/test_coordinator_online_seam.py (4)

1-64: LGTM!


66-131: LGTM!


134-166: LGTM!


169-289: LGTM!

Comment thread tests/integration/api/test_runtime_install_ordering.py Outdated
Comment thread tests/unit/api/test_state.py
@codecov

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 92.38579% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.02%. Comparing base (6a9c0aa) to head (f2b6895).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/synthorg/workers/runtime_builder.py | 87.50% | 6 Missing ⚠️ |
| src/synthorg/engine/coordination/parent_rollup.py | 94.56% | 4 Missing and 1 partial ⚠️ |
| src/synthorg/api/app.py | 83.33% | 2 Missing ⚠️ |
| ...rc/synthorg/api/controllers/setup/agent_helpers.py | 83.33% | 1 Missing and 1 partial ⚠️ |
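
As a quick sanity check on the headline number, the per-file counts in the table add up to the 15 uncovered lines quoted in the report. The sketch below assumes partially covered lines count as uncovered for patch coverage; the total number of changed lines is not stated in the report, so `patch_lines` is inferred from the percentage rather than quoted.

```python
# Per-file figures from the Codecov table.
missing = 6 + 4 + 2 + 1   # fully uncovered lines
partial = 1 + 1           # partially covered lines (assumed to count as uncovered)
uncovered = missing + partial

reported_pct = 92.38579   # patch coverage quoted in the report

# Inferred, not reported: the changed-line count that yields that percentage.
patch_lines = round(uncovered / (1 - reported_pct / 100))

patch_coverage = 100 * (patch_lines - uncovered) / patch_lines
print(uncovered, patch_lines, round(patch_coverage, 5))
```

Under that assumption the figures are consistent with a patch of 197 measured lines, 182 of them covered.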
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2007      +/-   ##
==========================================
+ Coverage   84.99%   85.02%   +0.02%     
==========================================
  Files        1883     1884       +1     
  Lines      111241   111359     +118     
  Branches     9490     9497       +7     
==========================================
+ Hits        94550    94679     +129     
+ Misses      14370    14364       -6     
+ Partials     2321     2316       -5     


@gemini-code-assist

Warning

Gemini encountered an error creating the summary. You can try again by commenting /gemini summary.

Replace MagicMock(spec=...) typed-boundary doubles with mock_of[T]() in test_state.py and test_runtime_install_ordering.py per the mock_of convention.
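
For readers unfamiliar with the pattern, a `mock_of[T]()` helper could look roughly like the sketch below. This is a hypothetical illustration only: the project's real `mock_of` is not shown in this conversation and may differ, and the `Coordinator` class here is a stand-in rather than the SynthOrg type.

```python
from unittest.mock import MagicMock


class mock_of:
    """Hypothetical helper: ``mock_of[T]()`` builds a mock limited to T's attributes."""

    def __class_getitem__(cls, spec: type):
        def build(**attrs):
            # spec= makes access to attributes that T does not define raise AttributeError.
            return MagicMock(spec=spec, **attrs)

        return build


class Coordinator:
    """Stand-in type for illustration only."""

    def coordinate(self, task_id: str) -> str:
        raise NotImplementedError


double = mock_of[Coordinator]()
double.coordinate.return_value = "done"
print(double.coordinate("task-1"))
```

Compared with a bare `MagicMock(spec=Coordinator)`, the subscript form puts the doubled type first, which makes typed-boundary doubles easier to find when searching the test suite.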
@Aureliolo Aureliolo merged commit 180b38a into main May 18, 2026
82 checks passed
@Aureliolo Aureliolo deleted the fix/coordinator-online branch May 18, 2026 17:16
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 18, 2026 17:16 — with GitHub Actions Inactive
Aureliolo pushed a commit that referenced this pull request May 19, 2026
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub
Models). Commit-based changelog below._

### What you'll notice
- Multi-agent coordination is now active immediately on startup for
smoother operation.
- Governance rules are fully enforced during use, ensuring compliance at
all times.
- Coordination metrics update live, giving real-time insights into
system activity.
- Review agents are now reliably processed, preventing silent drops in
tasks.
- Sandbox containers can be reused for agents and tasks, speeding up
execution and reducing overhead.

### What's new
- Agents support online runtime with a minimal safety framework to
improve stability.
- Recorded LLM interactions can be deterministically replayed at the
provider interface.
- Distributed path validation has been enhanced for more robust data
routing.
- A client-simulation runtime was added for end-to-end testing of the
IntakeEngine.
- A new work pipeline spine architecture has been introduced to
streamline task processing.

### Under the hood
- Infrastructure, Python, and web dependencies have all been updated to
latest versions.
- Updated apko lockfiles in the CI/CD pipeline improve build
consistency.

<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*
---


## [0.8.6](v0.8.5...v0.8.6) (2026-05-19)


### Features

* agent runtime online + minimal safety spine (runtime root)
([#2003](#2003))
([e5eef1a](e5eef1a)),
closes [#1956](#1956)
* deterministic recorded-LLM cassette replay at the provider chokepoint
([#2010](#2010))
([cabf55d](cabf55d))
* distributed path validation + hardening
([#2011](#2011))
([a382e4a](a382e4a)),
closes [#1966](#1966)
* wire IntakeEngine via boot client-simulation runtime (e2e test
harness) ([#2006](#2006))
([6a9c0aa](6a9c0aa)),
closes [#1961](#1961)
* work pipeline spine
([#1960](#1960))
([#2013](#2013))
([29b64e3](29b64e3))


### Bug Fixes

* bring the multi-agent coordinator online at boot
([#2007](#2007))
([180b38a](180b38a)),
closes [#1958](#1958)
* full governance enforcement online
([#1957](#1957))
([#2005](#2005))
([4140fc5](4140fc5))
* harden anti-ghost-wiring gate and fix silently-dropped review agents
([#2000](#2000))
([89b57ce](89b57ce))
* make coordination metrics live
([#1959](#1959))
([#2012](#2012))
([c4775e2](c4775e2))
* sandbox lifecycle dispatch (per-agent / per-task container reuse)
([#2008](#2008))
([03d2587](03d2587)),
closes [#1965](#1965)


### Documentation

* add GitButler concept-only concurrency research
([#1978](#1978))
([#2009](#2009))
([9e4f5c1](9e4f5c1))
* honest-hybrid refresh of README, site, and design specs
([#2001](#2001))
([f485bea](f485bea))


### CI/CD

* update apko lockfiles
([#2004](#2004))
([e2b9eee](e2b9eee))


### Maintenance

* Update Infrastructure dependencies
([#2014](#2014))
([0b16bdf](0b16bdf))
* Update Python dependencies
([#2015](#2015))
([a7224bb](a7224bb))
* Update Web dependencies
([#2016](#2016))
([7a7fe76](7a7fe76))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>


Development

Successfully merging this pull request may close these issues.

Multi-agent coordinator online

1 participant