
docs: audit cleanup D -- public-facing & docs sync (#1709) #1715

Merged
Aureliolo merged 2 commits into main from docs/audit-cleanup-d-docs-sync
May 2, 2026

@Aureliolo
Owner

Bundle D from the 2026-05-01 codebase audit. Public-facing pages and design docs drifted from the actual codebase; this PR resyncs them.

Summary

  • Test-count claim (README.md, docs/roadmap/index.md): "25,000+ unit tests" -> "27,000+ tests". Live count is 27,566 collected (uv run python -m pytest tests/ --collect-only -q | tail -1).
  • Region count (docs/architecture/tech-stack.md): "57 Latin-script locales across 12 world regions" -> "57 Latin-script locales across 11 world regions" (data/locales.yaml has 11 region groupings).
  • Comparison page (docs/reference/comparison.md, .gitignore): drop the gitignore entry and commit the regenerated artefact. The script and YAML are correct; the local file was generated 2026-04-02 against a YAML state where SynthOrg features were classified as planned across the board, then the YAML was updated 2026-04-26 in #1595 ("audit: fix public-facing documentation drift (test count, event modules, comparison page)") without rerunning the generator. CI regenerates on every build, but tracking the artefact surfaces drift in PR diffs.
  • D16 sandbox decision (docs/architecture/decisions.md): rewrite to describe the actual layered SandboxBackend strategy: subprocess for low-risk (file_system, git); Docker for high-risk (code_execution, terminal, database, web). Code (src/synthorg/tools/sandbox/factory.py) and design (docs/design/tools.md) already align; only the decision row drifted.
  • Sandbox fallback prose (docs/design/tools.md): drop "no unsafe subprocess fallback for code execution"; low-risk categories continue via subprocess even when Docker is unavailable.
  • REJECTED status (docs/design/communication.md): drop *(proposed)* on the REJECTED A2A task-state row. TaskStatus.REJECTED is implemented at src/synthorg/core/enums.py:256 with the CREATED -> REJECTED transition documented in the enum docstring.
  • Deleted-module references (docs/design/self-improvement.md, docs/reference/pluggable-subsystems.md, docs/reference/claude-reference.md): drop references to src/synthorg/meta/rollout/clock.py (Clock + SystemClock now live at src/synthorg/core/clock.py) and to src/synthorg/api/auth/{lockout_store,refresh_store,session_store}.py (storage now lives under persistence/{sqlite,postgres}/).
  • DESIGN_SPEC row (docs/DESIGN_SPEC.md): add the Internationalization design page row; internationalization.md exists but was unlisted.
  • CLAUDE.md vendor-name carve-out: drop the dangling docs/design/operations.md reference; src/synthorg/providers/presets.py already plays that role and is item (3) of the same list.
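
The D16 and sandbox-fallback items above describe a layered backend choice. The routing can be sketched roughly as follows. This is a minimal illustration, not the code in src/synthorg/tools/sandbox/factory.py: `Backend`, `select_backend`, `SandboxUnavailableError` and the two category sets are hypothetical names, and failing high-risk categories when Docker is missing is an assumption drawn from the "no fallback for code execution" wording.

```python
from enum import Enum


class Backend(Enum):
    SUBPROCESS = "subprocess"
    DOCKER = "docker"


# Category names come from the PR description; grouping them into two
# frozensets is an illustrative assumption.
LOW_RISK = frozenset({"file_system", "git"})
HIGH_RISK = frozenset({"code_execution", "terminal", "database", "web"})


class SandboxUnavailableError(RuntimeError):
    """Hypothetical error for a high-risk call when Docker is missing."""


def select_backend(category: str, *, docker_available: bool) -> Backend:
    """Pick a sandbox backend for a tool category (illustrative only)."""
    if category in LOW_RISK:
        # Low-risk categories keep working without Docker.
        return Backend.SUBPROCESS
    if category in HIGH_RISK:
        if not docker_available:
            # Assumption: high-risk work fails instead of falling back.
            raise SandboxUnavailableError(category)
        return Backend.DOCKER
    raise ValueError(f"unknown tool category: {category!r}")
```

Under these assumptions a git call resolves to the subprocess backend whether or not Docker is running, which is the property the local-first quickstart relies on.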

Migration-framing rot (agent 155)

Strip 11 issue-number / migration narratives back to present-tense descriptions:

  • src/synthorg/core/evidence.py:51
  • src/synthorg/engine/middleware/coordination_protocol.py:80,84 (+ adjacent lines 48-49 drive-by)
  • src/synthorg/engine/trajectory/efficiency_ratios.py:6
  • src/synthorg/hr/evaluation/extractors/efficiency.py:38 (+ adjacent line 5 drive-by)
  • src/synthorg/hr/evaluation/metric_extractor_protocol.py:165-166
  • src/synthorg/providers/management/preset_override_service.py:17-18
  • tests/unit/api/test_exception_handlers.py:824
  • tests/unit/budget/test_cost_record.py:317
  • tests/unit/engine/artifacts/test_service.py:4-5
  • tests/unit/observability/test_events.py:315-321,331-333
  • tests/unit/settings/test_definitions_config_bridge.py:5

tests/unit/architecture/test_layering.py keeps its #1610 reference; the test exists to enforce the deletion of synthorg.api.errors, so the historical anchor is load-bearing.
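
A check for this kind of migration-framing rot could look something like the sketch below. It assumes issue references always take the form `#` followed by digits, and `find_issue_refs` and `ALLOWED` are hypothetical names, not an existing script in the repository. The pattern would also flag other `#123`-style tokens (for example numeric hex colours), so hits would still need a human look.

```python
import re

# Matches "#" followed by digits, e.g. "#1257".
ISSUE_REF = re.compile(r"#\d+\b")

# Files allowed to keep a historical anchor (taken from the PR text).
ALLOWED = frozenset({"tests/unit/architecture/test_layering.py"})


def find_issue_refs(path: str, source: str) -> list[tuple[int, str]]:
    """Return (line_number, reference) pairs found in one file's text."""
    if path in ALLOWED:
        return []
    hits: list[tuple[int, str]] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in ISSUE_REF.finditer(line):
            hits.append((lineno, match.group()))
    return hits
```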

Decisions resolved during planning

  • D16: update the decision text to match the layered code (recommended option). Rejected: restricting code to docker-only would force a running container for trivial file reads and break the local-first quickstart.
  • operations.md: remove the dangling CLAUDE.md reference. src/synthorg/providers/presets.py already covers vendor-name allowance.
  • Comparison page: regen + un-gitignore. No script bug; the staleness was a one-shot gap from a YAML change without rerunning the generator. Tracking the file surfaces future drift in PR diffs.
  • README test count: drop the unit qualifier; use 27,000+ tests (the live total of 27,566 covers unit + integration + e2e + property tests).
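
The README figure follows from the collected count by rounding down to the nearest thousand. A small sketch of that arithmetic, assuming the summary line has the form quoted in this PR ("27566 tests collected"); `parse_collected` and `rounded_claim` are hypothetical helper names.

```python
import re


def parse_collected(summary_line: str) -> int:
    """Extract the count from a line such as '27566 tests collected'."""
    match = re.search(r"(\d+) tests? collected", summary_line)
    if match is None:
        raise ValueError(f"unrecognised summary: {summary_line!r}")
    return int(match.group(1))


def rounded_claim(count: int, step: int = 1000) -> str:
    """Round down to a multiple of `step` and format the public claim."""
    floored = (count // step) * step
    return f"{floored:,}+ tests"


print(rounded_claim(parse_collected("27566 tests collected")))
# prints "27,000+ tests"
```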

Test plan

  • Pre-push hooks (mypy on affected modules, pytest on affected modules, ruff format, ruff check, persistence-boundary, forbidden-literal, no-em-dashes, gitleaks) all passed.
  • Rerunning uv run python scripts/generate_comparison.py produces a no-op diff against the committed file, apart from the Last updated: line, as long as the YAML hasn't been touched since.
  • uv run python -m pytest tests/ --collect-only -q | tail -1 reports 27566 tests collected.
  • Diagram-syntax-validator agent ran in /pre-pr-review quick mode against the changed .md files (D2 in communication.md, Mermaid in README.md, communication.md, self-improvement.md); no syntax issues, no diagram bodies modified.
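
The no-op expectation for the regenerated comparison page can be expressed as a comparison that ignores the volatile timestamp line. The sketch below assumes the timestamp sits on its own line beginning with "Last updated:"; the real layout of docs/reference/comparison.md may differ, and both function names are hypothetical.

```python
def strip_volatile(text: str, prefix: str = "Last updated:") -> list[str]:
    """Drop lines that start with the volatile prefix."""
    return [
        line
        for line in text.splitlines()
        if not line.lstrip().startswith(prefix)
    ]


def has_drift(committed: str, regenerated: str) -> bool:
    """True when the two texts differ anywhere except the volatile line."""
    return strip_volatile(committed) != strip_volatile(regenerated)
```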

Review coverage

Pre-reviewed via /pre-pr-review quick (docs-only PR; only diagram-syntax-validator was required). Pre-push hooks gated mypy + pytest on touched Python files.

Closes #1709

Bundle D from the 2026-05-01 codebase audit. Public-facing pages and
design docs drifted from the actual codebase:

- README.md and docs/roadmap/index.md: replace "25,000+ unit tests" with
  "27,000+ tests". Live count is 27,566 collected
  (uv run python -m pytest tests/ --collect-only -q | tail -1 reports
  "27566 tests collected").
- docs/architecture/tech-stack.md: "57 Latin-script locales across 12
  world regions" becomes 11 world regions (data/locales.yaml has 11
  region groupings).
- docs/reference/comparison.md: drop the .gitignore line and commit the
  generator output. The script and YAML are correct; the local file was
  generated 2026-04-02 against a YAML state where SynthOrg features
  were classified as planned across the board, then YAML was updated
  2026-04-26 in #1595 without rerunning the generator. CI regenerates
  on every build, but tracking the artefact surfaces drift in PR diffs.
- docs/architecture/decisions.md: rewrite D16 to describe the actual
  layered SandboxBackend strategy (subprocess for low-risk file_system
  and git, Docker for high-risk code_execution / terminal / database /
  web). Code (src/synthorg/tools/sandbox/factory.py) and design
  (docs/design/tools.md) already align; only the decision row drifted.
- docs/design/tools.md: drop the "no unsafe subprocess fallback for
  code execution" phrasing; low-risk categories continue via subprocess
  even when Docker is unavailable.
- docs/design/communication.md: drop *(proposed)* on the REJECTED A2A
  task-state row. TaskStatus.REJECTED is implemented at
  src/synthorg/core/enums.py:256 with the CREATED..REJECTED transition
  documented in the enum docstring.
- docs/design/self-improvement.md, docs/reference/pluggable-subsystems.md
  and docs/reference/claude-reference.md: drop references to deleted
  modules. Clock and SystemClock now live at src/synthorg/core/clock.py
  (not src/synthorg/meta/rollout/clock.py). Auth lockout / refresh /
  session storage now lives under persistence/{sqlite,postgres}/ (not
  src/synthorg/api/auth/{lockout_store,refresh_store,session_store}.py).
- docs/DESIGN_SPEC.md: add the Internationalization design page row;
  internationalization.md exists but was unlisted.
- CLAUDE.md: drop the dangling docs/design/operations.md reference from
  the vendor-name carve-out list. src/synthorg/providers/presets.py is
  already item (3) of the same list and plays that role; renumber.
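
For readers unfamiliar with the REJECTED item above, a transition check of this kind can be sketched as a lookup table. Only the two members and the CREATED -> REJECTED edge come from the PR text; the string values, the table, the treatment of REJECTED as terminal, and `can_transition` are illustrative assumptions, and the enum in src/synthorg/core/enums.py is not reproduced here.

```python
from enum import Enum


class TaskStatus(Enum):
    # Only the two members named in the PR description are modelled.
    CREATED = "created"
    REJECTED = "rejected"


# Assumption: allowed transitions live in a lookup table and REJECTED
# has no outgoing edges.
ALLOWED_TRANSITIONS: dict[TaskStatus, frozenset[TaskStatus]] = {
    TaskStatus.CREATED: frozenset({TaskStatus.REJECTED}),
    TaskStatus.REJECTED: frozenset(),
}


def can_transition(current: TaskStatus, target: TaskStatus) -> bool:
    """Return True when the table permits moving from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, frozenset())
```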

Migration-framing rot (agent 155): strip 11 issue-number / migration
narratives back to present-tense descriptions across src/ and tests/,
plus 2 adjacent drive-bys in files already touched
(coordination_protocol.py:48-49, efficiency.py:5).
tests/unit/architecture/test_layering.py keeps its #1610 reference; the
test exists to enforce the deletion of synthorg.api.errors, so the
historical anchor is load-bearing.

Closes #1709
@github-actions
Contributor

github-actions Bot commented May 2, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Contributor

coderabbitai Bot commented May 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9200380c-1c1e-4d65-8784-b9fad4f1a18e

📥 Commits

Reviewing files that changed from the base of the PR and between 3e00022 and 695ab2b.

📒 Files selected for processing (1)
  • src/synthorg/engine/middleware/coordination_protocol.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Python 3.14+ with PEP 649 native lazy annotations; never use from __future__ import annotations.

Use PEP 758 except syntax: except A, B: (no parens) when not binding to a name; as exc requires parens (except (A, B) as exc:).

All public functions and classes must have type hints; mypy strict mode enforced.

Docstrings must use Google style and are required on all public classes and functions; ruff D rules enforced.

Create new objects for immutability; never mutate existing ones. Use frozen Pydantic models for config/identity; for non-Pydantic registries use copy.deepcopy() at construction + MappingProxyType wrapping.
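
A minimal sketch of the deepcopy-plus-MappingProxyType rule for a non-Pydantic registry, using only the standard library. `HandlerRegistry` and its methods are hypothetical and exist only to illustrate the pattern.

```python
import copy
from types import MappingProxyType
from typing import Any, Mapping


class HandlerRegistry:
    """Hypothetical registry; the name and shape are illustrative."""

    def __init__(self, entries: Mapping[str, Any]) -> None:
        # Deep-copy so later changes to the caller's dict do not leak
        # in, then expose the copy through a read-only view.
        self._entries = MappingProxyType(copy.deepcopy(dict(entries)))

    def get(self, name: str) -> Any:
        return self._entries[name]

    def with_entry(self, name: str, value: Any) -> "HandlerRegistry":
        """Return a new registry instead of mutating this one."""
        return HandlerRegistry({**self._entries, name: value})
```

The proxy only blocks item assignment on the mapping itself; values handed out by `get` are still ordinary objects, so the sketch is safest with immutable values.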

Separate config (frozen models) from runtime state (mutable-via-copy models); never mix static config and mutable runtime fields in one model.

Use Pydantic v2 with ConfigDict(frozen=True, allow_inf_nan=False) on all models; use extra='forbid' on request DTOs; use @computed_field for derived values; use NotBlankStr from core.types for identifier/name fields.

Prefer asyncio.TaskGroup for fan-out/fan-in async concurrency. Wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError/RecursionError) so one failure doesn't unwind the group.

Classes that read time or sleep must take clock: Clock | None = None defaulting to SystemClock(); tests inject FakeClock. Use the replacement table in docs/reference/conventions.md for legacy-callable carve-outs.

Maximum line length is 88 characters (ruff enforced); functions < 50 lines; files < 800 lines.

Every Pydantic model is ConfigDict(frozen=True, ...) by default unless documented otherwise; mutations go through model_copy(update=...), never direct attribute assignment.

Validate at system boundaries (user input, external APIs, config files).

Files:

  • src/synthorg/engine/middleware/coordination_protocol.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every BaseTool subclass, MCP tool registration, A2A RPC method, and WebSocket event must declare a typed Pydantic args model and validate before dispatch.

When migrating an entry-point from raw dict[str, Any] to typed Pydantic validation, call parse_typed() from synthorg.api.boundary with a hardcoded literal boundary label (never user-controlled).

Async start()/stop() services own a dedicated self._lifecycle_lock; timed-out stops mark the service unrestartable per docs/reference/lifecycle-sync.md.

Wrap attacker-controllable strings at LLM call sites via wrap_untrusted() from synthorg.engine.prompt_safety; append untrusted_content_directive(tags) to the enclosing system prompt (SEC-1).

Never call lxml.html.fromstring on attacker input; use HTMLParseGuard from synthorg.tools.html_parse_guard (SEC-1).

Cross-cutting subsystems follow protocol + strategy + factory + config discriminator with safe defaults; services (which wrap repositories) are a distinct pattern per docs/reference/pluggable-subsystems.md.

Handle errors explicitly, never swallow. Domain error families register a base-class entry in EXCEPTION_HANDLERS so subtypes get correct status codes per docs/reference/errors.md.

Domain error classes use Error naming, inherit from DomainError (or a domain-scoped intermediate inheriting DomainError), not bare Exception/RuntimeError.

Every business-logic module must import from synthorg.observability import get_logger then logger = get_logger(__name__); variable name is always 'logger'. Carve-outs are documented in module docstring.

Never use import logging / logging.getLogger() / print() in application code; exception: observability/{setup,sinks,syslog_handler,http_handler,otlp_handler}.py for handler-construction/bootstrap code.

Event names must always import constants from synthorg.observability.events.<domain>; never use string literals. See docs/reference/conventions.md for domain inventory and telemetry namespace split.

Always use str...

Files:

  • src/synthorg/engine/middleware/coordination_protocol.py
**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Store UTC datetimes; render via Intl without passing timeZone (browser tz wins).

Date/number format: always via Intl; no hand-rolled templates.

Files:

  • src/synthorg/engine/middleware/coordination_protocol.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/engine/middleware/coordination_protocol.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Always read the relevant docs/design/ page before implementing any feature or planning any issue; DESIGN_SPEC.md is a pointer file linking the design pages. If implementation deviates from spec, alert the user and explain why; never silently diverge.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts; be critical and surface improvements as suggestions.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Commits use type: description format. Types: feat, fix, refactor, docs, test, chore, perf, ci. Enforced by commitizen (commit-msg hook).
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Signed commits required on main and all other refs via branch protection (exception: GitHub App-signed commits from synthorg-repo-bot via Git Data API satisfy required_signatures).
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Branches use <type>/<slug> naming from main.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Coverage minimum 80% enforced in CI; benchmarks excluded via --ignore=tests/benchmarks/.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Always include -n 8 when running pytest locally; never run tests sequentially. CI uses -n auto.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: When tests fail due to timeout, slowness, or xdist contention: NEVER delete/skip/xfail tests to fix slowness; NEVER use --no-verify; run durations analysis, compare against tests/baselines/unit_timing.json (known-good baseline), identify source code regression (suite time exceeds baseline * 1.3), fix the source code, not tests. pytest_sessionfinish hook warns loudly if regression detected; trust the warning.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Never use cd in Bash commands: working directory already set to project root; use absolute paths or run commands directly. Exception: bash -c 'cd <dir> && <cmd>' (child process, no cwd side effects). Use for tools without -C flag, e.g. bash -c 'cd web && npm install'.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Never use Bash to write or modify files: use Write or Edit tools. Forbidden: cat >, cat << EOF, echo >, echo >>, sed -i, python -c open(...).write(...), tee (read-only/inspection uses like piping to stdout are fine).
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Use go -C cli (never cd cli).
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: After finishing an issue implementation: always create a feature branch (<type>/<slug>), commit, and push; do NOT create a PR automatically. Do NOT leave work uncommitted on main; branch, commit, push immediately after finishing.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: NEVER create a PR directly: gh pr create blocked by hookify. ALWAYS use /pre-pr-review to create PRs; it runs automated checks + review agents + fixes before creating PR. For trivial/docs-only changes: /pre-pr-review quick skips agents but still runs checks. After PR exists, use /aurelio-review-pr for external reviewer feedback.
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: /commit-push-pr command effectively blocked (calls gh pr create internally).
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T10:00:02.141Z
Learning: Fix everything valid, never skip: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, findings adjacent to PR changes), fix them all. No deferring, no 'out of scope' skipping.
🔇 Additional comments (2)
src/synthorg/engine/middleware/coordination_protocol.py (2)

80-84: Documentation improvement looks good.

The Field descriptions now use concrete class names (TaskLedgerMiddleware, ProgressLedgerMiddleware) instead of generic or issue-numbered references, which improves clarity and follows the coding guideline against issue/PR back-references.

The wording variation ("from" in Field descriptions vs "populated by" in the class docstring) is acceptable and avoids repetition while conveying the same meaning.

This comment references the same middleware classes as the previous verification comment (lines 48-49), so the verification script above will confirm whether these Field descriptions are accurate.


48-49: Middleware class names and field population are correct.

Both TaskLedgerMiddleware and ProgressLedgerMiddleware exist in src/synthorg/engine/middleware/coordination_constraints.py and populate the documented fields. TaskLedgerMiddleware.before_dispatch() returns ctx.model_copy(update={"task_ledger": ledger}) and ProgressLedgerMiddleware.after_rollup() returns ctx.model_copy(update={"progress_ledger": ledger}), confirming the documentation is accurate.


Walkthrough

This PR applies multiple documentation updates from a 2026-05-01 audit: increments reported test counts (25,000+ → 27,000+), adds an Internationalization design row, revises D16 to specify a layered SandboxBackend (subprocess for low-risk, Docker for high-risk), finalizes the REJECTED A2A task-state mapping, removes references to meta/rollout/clock.py and replaces with core/clock/SystemClock defaults, corrects the name-generation region count (12 → 11), clarifies Docker-unavailable failure behavior for tool categories, replaces several issue-numbered references in docstrings with descriptive text, and adds a generated comparison page docs/reference/comparison.md. No exported/public code entities were changed.

Suggested labels

autorelease: tagged

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title "docs: audit cleanup D -- public-facing & docs sync (#1709)" clearly summarizes the main change: resynchronizing public-facing documentation and design docs with the actual codebase following an audit. |
| Description check | ✅ Passed | The PR description provides detailed context on the audit cleanup work, explaining test-count updates, module reference corrections, decision doc rewrites, and migration-framing cleanup across multiple files. |
| Linked Issues check | ✅ Passed | The PR comprehensively addresses all coding-related requirements from issue #1709: test counts updated, comparison page regenerated and ungitignored, region count corrected, deleted-module references removed, D16 decision rewritten to match code, REJECTED status finalized, design pages corrected, and migration-framing narratives stripped. |
| Out of Scope Changes check | ✅ Passed | All changes are documentation-focused and directly align with issue #1709 scope: docs/, README.md, .gitignore, and docstring updates addressing audit findings. No extraneous code modifications or unrelated changes are present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 66.67% which is sufficient. The required threshold is 40.00%. |


Aureliolo temporarily deployed to cloudflare-preview on May 2, 2026 at 09:36 with GitHub Actions (Inactive)
@codspeed-hq

codspeed-hq Bot commented May 2, 2026

Merging this PR will not alter performance

✅ 33 untouched benchmarks
⏩ 21 skipped benchmarks [1]


Comparing docs/audit-cleanup-d-docs-sync (695ab2b) with main (40ee65b)


Footnotes

  1. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request performs a broad cleanup of the codebase by removing internal issue references from docstrings and comments across various modules. Key documentation updates include the introduction of a comprehensive framework comparison page, an internationalization design specification, and a revised architectural decision (D16) that adopts a layered sandboxing approach—using subprocesses for low-risk tasks and Docker for high-risk execution. Additionally, project metadata and test counts have been updated to reflect current development status. Feedback is provided to improve the specificity of docstrings in the coordination protocol by explicitly naming the ledger middleware as the source of certain data structures.

Comment on lines +48 to +49
task_ledger: TaskLedger populated by coordination middleware.
progress_ledger: ProgressLedger populated by coordination middleware.
Contributor


medium

The generic term "coordination middleware" is used to replace a specific issue reference (#1257). If these ledgers are populated by a specific middleware implementation (e.g., a LedgerMiddleware), it would be more helpful to name that specific component or describe the phase in which they are populated, as "coordination middleware" refers to the entire category defined in this protocol file.

Suggested change
- task_ledger: TaskLedger populated by coordination middleware.
- progress_ledger: ProgressLedger populated by coordination middleware.
+ task_ledger: TaskLedger populated by the ledger middleware.
+ progress_ledger: ProgressLedger populated by the ledger middleware.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/reference/claude-reference.md (1)

38-69: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Replace the text-fenced package tree with a supported docs format.

This updated section is still maintained inside a ```text block; migrate this structure to Mermaid (simple hierarchy) or a Markdown table/list format to comply with docs standards.

As per coding guidelines, docs/**/*.md should use Mermaid for simple hierarchies and should not use ```text blocks for structural diagrams.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/claude-reference.md` around lines 38 - 69, Replace the
existing ```text fenced package tree (the block that begins with
"src/synthorg/") with a Mermaid hierarchy block (or a Markdown nested list if
Mermaid is unsuitable); remove the ```text block entirely and author a compact
mermaid diagram that captures the top-level directories (src/synthorg/,
web/src/, cli/, site/, data/) and the key subpackages (e.g., api/, core/,
engine/, persistence/, memory/, providers/, tools/) so the structure renders as
a simple hierarchy per docs standards and avoids using ```text for structural
diagrams.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/DESIGN_SPEC.md`:
- Line 46: The DESIGN_SPEC row claiming "British English UI default" conflicts
with the runtime constant APP_LOCALE_FALLBACK in web/src/utils/locale.ts
(currently set to 'en'); decide whether the source-of-truth should be the spec
or the code and make a single consistent change: either update the docs row text
to "Neutral English (en) UI default" to match APP_LOCALE_FALLBACK, or change
APP_LOCALE_FALLBACK to 'en-GB' and adjust any Intl/formatting tests to use
en-GB; ensure you update the unique identifiers in the same commit (the spec
table entry and the APP_LOCALE_FALLBACK constant) so they stay aligned.

---

Outside diff comments:
In `@docs/reference/claude-reference.md`:
- Around line 38-69: Replace the existing ```text fenced package tree (the block
that begins with "src/synthorg/") with a Mermaid hierarchy block (or a Markdown
nested list if Mermaid is unsuitable); remove the ```text block entirely and
author a compact mermaid diagram that captures the top-level directories
(src/synthorg/, web/src/, cli/, site/, data/) and the key subpackages (e.g.,
api/, core/, engine/, persistence/, memory/, providers/, tools/) so the
structure renders as a simple hierarchy per docs standards and avoids using
```text for structural diagrams.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 947e8dd5-a591-420b-af50-1e2b5616358d

📥 Commits

Reviewing files that changed from the base of the PR and between 40ee65b and 3e00022.

📒 Files selected for processing (24)
  • .gitignore
  • CLAUDE.md
  • README.md
  • docs/DESIGN_SPEC.md
  • docs/architecture/decisions.md
  • docs/architecture/tech-stack.md
  • docs/design/communication.md
  • docs/design/self-improvement.md
  • docs/design/tools.md
  • docs/reference/claude-reference.md
  • docs/reference/comparison.md
  • docs/reference/pluggable-subsystems.md
  • docs/roadmap/index.md
  • src/synthorg/core/evidence.py
  • src/synthorg/engine/middleware/coordination_protocol.py
  • src/synthorg/engine/trajectory/efficiency_ratios.py
  • src/synthorg/hr/evaluation/extractors/efficiency.py
  • src/synthorg/hr/evaluation/metric_extractor_protocol.py
  • src/synthorg/providers/management/preset_override_service.py
  • tests/unit/api/test_exception_handlers.py
  • tests/unit/budget/test_cost_record.py
  • tests/unit/engine/artifacts/test_service.py
  • tests/unit/observability/test_events.py
  • tests/unit/settings/test_definitions_config_bridge.py
💤 Files with no reviewable changes (2)
  • .gitignore
  • docs/design/self-improvement.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Backend
  • GitHub Check: Lighthouse Site
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When a spec topic is referenced (e.g. 'the Agents page' or 'the Engine page's Crash Recovery section'), read the relevant docs/design/ page before implementation; deviations require explicit user approval and spec updates

Files:

  • docs/DESIGN_SPEC.md
  • docs/architecture/tech-stack.md
  • README.md
  • docs/roadmap/index.md
  • CLAUDE.md
  • docs/reference/claude-reference.md
  • docs/design/tools.md
  • docs/design/communication.md
  • docs/reference/comparison.md
  • docs/reference/pluggable-subsystems.md
  • docs/architecture/decisions.md
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Use D2 (```d2) for architecture diagrams, nested container layouts, complex entity relationships; use Mermaid (```mermaid) for flowcharts, sequence diagrams, simple hierarchies, pipelines; use Markdown tables for grid/matrix data; never use ```text blocks with ASCII/Unicode box-drawing

D2 diagrams use theme 200 (Dark Mauve), dark-only render, configured globally in mkdocs.yml; rendered at build time via mkdocs-d2-plugin (dagre layout); requires D2 CLI v0.7.1 on PATH locally and in CI

Files:

  • docs/DESIGN_SPEC.md
  • docs/architecture/tech-stack.md
  • docs/roadmap/index.md
  • docs/reference/claude-reference.md
  • docs/design/tools.md
  • docs/design/communication.md
  • docs/reference/comparison.md
  • docs/reference/pluggable-subsystems.md
  • docs/architecture/decisions.md
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every business-logic module must import logger via from synthorg.observability import get_logger and assign logger = get_logger(__name__); never use bare import logging or logging.getLogger() or print()

All Pydantic models must use ConfigDict(frozen=True, allow_inf_nan=False) unless documented otherwise; mutations go through model_copy(update=...), never direct attribute assignment

No from __future__ import annotations — Python 3.14 has native PEP 649 lazy annotations

Use PEP 758 except syntax: except A, B: without parens when not binding; except (A, B) as exc: with parens when binding (ruff enforces on 3.14)

All public functions and classes require type hints; mypy strict mode is enforced

Docstrings must use Google style and are required on public classes and functions (ruff D rules enforced)

Event names must always import constants from synthorg.observability.events.<domain>; never use string literals

Never use logger.exception(EVENT, error=str(exc)), logger.warning(EVENT, error=str(exc)), or logger.error(EVENT, error=str(exc)); use error_type=type(exc).__name__ and error=safe_error_description(exc) from synthorg.observability.safe_error_description

Wrap attacker-controllable strings at LLM call sites via wrap_untrusted() from synthorg.engine.prompt_safety and append untrusted_content_directive(tags) to the enclosing system prompt (SEC-1)

Never call lxml.html.fromstring on attacker input; use HTMLParseGuard from synthorg.tools.html_parse_guard (SEC-1)

Classes that read time or sleep must accept clock: Clock | None = None defaulting to SystemClock() from synthorg.core.clock; tests inject FakeClock

Async start() / stop() services must own a dedicated self._lifecycle_lock; timed-out stops mark the service unrestartable

Prefer asyncio.TaskGroup for fan-out/fan-in concurrency; wrap independent task bodies in async def helpers that catch Exception (re-raise only MemoryError/RecursionError)...

Files:

  • tests/unit/engine/artifacts/test_service.py
  • src/synthorg/hr/evaluation/extractors/efficiency.py
  • tests/unit/budget/test_cost_record.py
  • src/synthorg/hr/evaluation/metric_extractor_protocol.py
  • src/synthorg/providers/management/preset_override_service.py
  • src/synthorg/engine/trajectory/efficiency_ratios.py
  • tests/unit/settings/test_definitions_config_bridge.py
  • src/synthorg/core/evidence.py
  • tests/unit/api/test_exception_handlers.py
  • tests/unit/observability/test_events.py
  • src/synthorg/engine/middleware/coordination_protocol.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every Mock() / AsyncMock() / MagicMock() must declare spec=ConcreteClass (Protocol or class); bare calls are blocked by scripts/check_mock_spec.py pre-commit gate; baseline updates require uv run python scripts/check_mock_spec.py --update

Time-driven tests: import FakeClock from tests._shared.fake_clock (NOT rollout-subsystem paths) and inject into the class under test; FakeClock.sleep advances virtual time + yields via asyncio.sleep(0); use await clock.advance_async(seconds) for cooperative task driving

FakeClock-first: patch time.monotonic() / asyncio.sleep() globals only for legacy code paths without a Clock seam; when the class under test accepts clock=, always inject FakeClock

Use @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests; run with -n 8 parallelism (pytest-xdist); global 30-second timeout per test

NEVER delete, skip, or xfail tests to fix slowness; identify slow tests with --durations=50 --durations-min=0.5, compare against tests/baselines/unit_timing.json, and fix source-code regressions (suite time >baseline * 1.3 is a source code bug, not a test bug)

Minimum 80% code coverage enforced in CI; benchmarks excluded via --ignore=tests/benchmarks/

Property-based tests use Hypothesis with derandomize=True (10 deterministic examples in CI); when Hypothesis finds a failure, it is a real bug — fix the underlying code and add an @example(...) decorator for permanent coverage

Tests must use test-provider, test-small-001, etc.; never use real vendor names (Anthropic, OpenAI, Claude, GPT)

Files:

  • tests/unit/engine/artifacts/test_service.py
  • tests/unit/budget/test_cost_record.py
  • tests/unit/settings/test_definitions_config_bridge.py
  • tests/unit/api/test_exception_handlers.py
  • tests/unit/observability/test_events.py
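
To illustrate why the mock guideline above insists on `spec=`, here is a small standard-library sketch. `CostTracker` is a hypothetical collaborator defined only for this example; it is not a class from the repository, and the example does not reproduce what `scripts/check_mock_spec.py` checks.

```python
from unittest.mock import Mock


class CostTracker:
    """Hypothetical collaborator, defined here only for illustration."""

    def record(self, amount: float) -> None:
        raise NotImplementedError


bare = Mock()
specced = Mock(spec=CostTracker)

# A bare mock accepts any attribute, so a misspelt method goes unnoticed.
bare.recrod(1.0)

# A spec'd mock rejects attributes the real class does not define.
try:
    specced.recrod(1.0)
    typo_caught = False
except AttributeError:
    typo_caught = True

# Correctly spelt calls still work and can be asserted on.
specced.record(1.0)
specced.record.assert_called_once_with(1.0)
```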

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/engine/artifacts/test_service.py
  • tests/unit/budget/test_cost_record.py
  • tests/unit/settings/test_definitions_config_bridge.py
  • tests/unit/api/test_exception_handlers.py
  • tests/unit/observability/test_events.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Comments must explain WHY only (hidden constraints, subtle invariants, workarounds for upstream bugs with stable URLs); forbidden: reviewer citations, issue back-references, internal taxonomy shorthand, migration framing, round/iteration narrative, self-evident restatements

Line length: 88 characters (ruff enforced); functions: <50 lines; files: <800 lines

Never import aiosqlite, sqlite3, psycopg, psycopg_pool, or emit raw SQL DDL/DML keywords in string literals outside src/synthorg/persistence/

Repository CRUD vocabulary: save(entity) -> None (insert-or-update, idempotent), get(id) -> Entity | None (None on miss), delete(id) -> bool (True on removal, False if absent), list_items(...) -> tuple[Entity, ...] (paginated/filtered), query(...) -> tuple[Entity, ...] (queries always return tuples, never lists)

Domain error class naming: use <Domain><Condition>Error, inherit from DomainError or a domain-scoped intermediate; bare Exception/RuntimeError at domain boundaries is forbidden

Handle errors explicitly, never swallow; domain error families register a base-class entry in EXCEPTION_HANDLERS (src/synthorg/api/exception_handlers.py) so subtypes get correct status codes

Validate at system boundaries (user input, external APIs, config files)

For every mutable setting, use DB > env (SYNTHORG_<NAMESPACE>_<KEY>) > YAML > code default, resolved through SettingsService / ConfigResolver; two exceptions: init-time only (env-only, no registry) and read-only post-init (marked with read_only_post_init=True, raises SettingReadOnlyError on set)

State transitions on status enums log at INFO after persistence write succeeds, using domain-scoped *_STATUS_TRANSITIONED constant carrying from_status / to_status / domain id

Every settings read emits one INFO settings.value.resolved event on first cold read per process, carrying source + yaml_path; subsequent reads stay at DEBUG

NEVER use real vendor names (Anthropic, OpenAI, Claud...

Files:

  • src/synthorg/hr/evaluation/extractors/efficiency.py
  • src/synthorg/hr/evaluation/metric_extractor_protocol.py
  • src/synthorg/providers/management/preset_override_service.py
  • src/synthorg/engine/trajectory/efficiency_ratios.py
  • src/synthorg/core/evidence.py
  • src/synthorg/engine/middleware/coordination_protocol.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/hr/evaluation/extractors/efficiency.py
  • src/synthorg/hr/evaluation/metric_extractor_protocol.py
  • src/synthorg/providers/management/preset_override_service.py
  • src/synthorg/engine/trajectory/efficiency_ratios.py
  • src/synthorg/core/evidence.py
  • src/synthorg/engine/middleware/coordination_protocol.py
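
The settings precedence rule in the `src/synthorg/**/*.py` guidelines above (DB > env > YAML > code default) can be sketched as a plain lookup chain. This is an assumption-laden illustration, not the behaviour of `SettingsService` or `ConfigResolver`: `resolve_setting`, the dotted `namespace.key` lookup, and the upper-casing of the environment variable name are all hypothetical choices made for the example.

```python
from collections.abc import Mapping


def resolve_setting(
    namespace: str,
    key: str,
    *,
    db: Mapping[str, str],
    env: Mapping[str, str],
    yaml_values: Mapping[str, str],
    default: str,
) -> tuple[str, str]:
    """Return (value, source) using DB > env > YAML > code default."""
    dotted = f"{namespace}.{key}"
    # Assumed naming scheme: SYNTHORG_<NAMESPACE>_<KEY>, upper-cased.
    env_name = f"SYNTHORG_{namespace.upper()}_{key.upper()}"
    if dotted in db:
        return db[dotted], "db"
    if env_name in env:
        return env[env_name], "env"
    if dotted in yaml_values:
        return yaml_values[dotted], "yaml"
    return default, "default"


env = {"SYNTHORG_BUDGET_LIMIT": "200"}
yaml_values = {"budget.limit": "100"}

from_env = resolve_setting(
    "budget", "limit", db={}, env=env, yaml_values=yaml_values, default="50"
)
from_db = resolve_setting(
    "budget", "limit", db={"budget.limit": "300"}, env=env,
    yaml_values=yaml_values, default="50",
)
from_default = resolve_setting(
    "budget", "window", db={}, env=env, yaml_values=yaml_values, default="30"
)
```

Returning the source alongside the value mirrors the guideline that a resolved setting is logged together with where it came from.
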
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-02T09:36:49.620Z
Learnings (all entries share the source, repo, and timestamp above):

  • ALWAYS read the relevant `docs/design/` page before implementing any feature or planning any issue; the design spec is the starting point for architecture, data models, and behavior
  • Every implementation plan must be presented to the user for accept/deny before coding starts; at every phase, actively look for design improvements and surface suggestions (not silent changes) for user approval
  • If implementation deviates from the spec, alert the user and explain why; the user decides whether to proceed or update the spec; do NOT silently diverge
  • When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality
  • Prioritize issues by dependency order, not priority labels; unblocked dependencies come first
  • After finishing an issue implementation, always create a feature branch (`<type>/<slug>`), commit, and push; do NOT create a PR automatically; do NOT leave work uncommitted on main
  • NEVER create a PR directly via `gh pr create` (blocked by hookify); ALWAYS use `/pre-pr-review` to create PRs; for trivial/docs changes use `/pre-pr-review quick` to skip agents; after PR exists, use `/aurelio-review-pr` for external feedback
  • When review agents find valid issues (including pre-existing issues, suggestions, adjacent findings), fix them all; never skip or defer; no 'out of scope' exemptions
  • Commits use `<type>: <description>` format; types: feat, fix, refactor, docs, test, chore, perf, ci; enforced by commitizen (commit-msg hook); signed commits required on main; all commits must be GPG/SSH signed (exception: GitHub App-signed commits from `synthorg-repo-bot`)
  • Branches use `<type>/<slug>` format from main
  • Merge strategy: squash merge; PR body becomes the squash commit message on main; preserve existing `Closes #NNN` trailers; never remove unless explicitly asked
  • NEVER use `cd` in Bash commands; working directory is already set to project root; use absolute paths or run commands directly; exception: `bash -c "cd <dir> && <cmd>"` runs in child process safely
  • NEVER use Bash to write/modify files; use Write or Edit tools; forbidden: `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c "open(...).write(...)"`, `tee` (read-only inspection like piping to stdout is fine)
🔇 Additional comments (20)
docs/design/communication.md (1)

199-199: Docs-state mapping update is correct and consistent.

The REJECTED <-> rejected bidirectional mapping now matches the implemented status model and resolves the prior “proposed” inconsistency.

src/synthorg/providers/management/preset_override_service.py (1)

18-18: Docstring wording update is aligned with current design narrative.

This change correctly removes audit/migration framing and keeps the note forward-looking without issue-tracker coupling.

Based on learnings: "If implementation deviates from the spec, alert the user and explain why; the user decides whether to proceed or update the spec; do NOT silently diverge"

src/synthorg/core/evidence.py (1)

51-52: Docstring update is consistent with the current schema and audit-facing wording.

Looks good—this is a clean present-tense docs sync with no behavior risk.

src/synthorg/engine/middleware/coordination_protocol.py (1)

48-49: Docs wording update is correct and consistent.

These description updates improve clarity and remove issue-specific phrasing without affecting behavior.

Also applies to: 80-85

CLAUDE.md (1)

213-213: Vendor-name exception cleanup is consistent and scoped correctly.

This update cleanly removes the stale operations-page exception while preserving the intended runtime/data carve-outs (presets.py and logo assets).

README.md (1)

22-22: Test-metric update looks correct and consistent.

This wording aligns with the audit sync objective and keeps the claim safely conservative.

docs/roadmap/index.md (1)

5-5: Roadmap status metric is correctly synchronized.

The updated count is consistent with the audit cleanup goal and the corresponding README change.

src/synthorg/engine/trajectory/efficiency_ratios.py (1)

5-5: Docstring wording is accurate and consistent with the model fields.

src/synthorg/hr/evaluation/metric_extractor_protocol.py (1)

164-166: Async rationale clarification is clear and semantically correct for the protocol contract.

src/synthorg/hr/evaluation/extractors/efficiency.py (1)

5-5: Kill-switch docstring updates are consistent with the extractor’s current behavior and terminology.

Also applies to: 37-37

tests/unit/api/test_exception_handlers.py (1)

824-824: Docstring clarification looks good.

This wording is clearer and matches the warning-path-focused assertions in the class.

tests/unit/engine/artifacts/test_service.py (1)

3-5: Module docstring sync is accurate.

The updated wording correctly reflects the event namespace and mutation-emission behavior validated by this file.

tests/unit/budget/test_cost_record.py (1)

317-317: Docstring update is clear and appropriate.

The revised phrasing better matches the class’ validation-focused test coverage.

tests/unit/observability/test_events.py (1)

315-331: Comment refresh is consistent and helpful.

These clarifications improve maintainability without altering test behavior or coverage intent.

docs/reference/comparison.md (1)

8-268: Generated comparison page looks consistent and release-ready.

The generated marker, section structure, legend clarity, and table-based presentation are all in good shape for docs sync.

tests/unit/settings/test_definitions_config_bridge.py (1)

3-6: LGTM – clearer present-tense description.

The updated docstring accurately describes the _EXPECTED tuple structure and removes migration framing, aligning with the PR's audit cleanup objective.

docs/architecture/tech-stack.md (1)

90-90: Name-generation stack note is consistent with implementation.

Line 90 accurately reflects the current locale cardinality/region count and the seeded-vs-cached Faker behavior in src/synthorg/templates/locales.py and src/synthorg/templates/presets.py.

docs/reference/pluggable-subsystems.md (1)

54-54: Documentation reference is accurate.

The SystemClock from synthorg.core.clock is correctly identified as the default clock implementation. Verification confirms the class exists, is properly importable from the module, and is consistently used throughout the codebase as the default for rollout strategy initialization.

docs/architecture/decisions.md (1)

64-64: LGTM! D16 accurately describes the layered sandboxing strategy.

The updated decision correctly documents the SandboxBackend protocol with subprocess for low-risk categories (file_system, git) and Docker for high-risk (code_execution, terminal, database, web). All isolation controls listed (env filtering, restricted PATH, workspace-scoped cwd, timeout + process-group kill, library-injection-var blocking) are verified in the code snippets from SubprocessSandboxConfig, _build_filtered_env, and _filter_path. The rationale clearly explains why Docker-only was rejected (breaks local-first quickstart for trivial file operations) and why the layered approach balances security with usability.

docs/design/tools.md (1)

123-125: LGTM! Docker failure behavior accurately documented.

The updated text correctly describes the layered sandboxing behavior: when Docker is unavailable, only tool categories whose configured backend is Docker will fail with an error, while categories configured for subprocess (file_system, git) continue to work. This aligns perfectly with decision D16 in docs/architecture/decisions.md and the SandboxingConfig design shown in context snippets. The wording is precise—it correctly emphasizes "whose configured backend is Docker" rather than implying an intrinsic category property.
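
The layered behaviour described in the two comments above can be sketched as a per-category lookup. This is only an illustration of the documented D16 split: `Backend`, `CATEGORY_BACKEND`, `SandboxUnavailableError`, and `select_backend` are hypothetical names, not the API of `src/synthorg/tools/sandbox/factory.py`, and the mapping is hard-coded here although the comment notes the backend is configurable per category.

```python
from enum import Enum


class Backend(Enum):
    SUBPROCESS = "subprocess"
    DOCKER = "docker"


# Category-to-backend split as described in decision D16.
CATEGORY_BACKEND: dict[str, Backend] = {
    "file_system": Backend.SUBPROCESS,
    "git": Backend.SUBPROCESS,
    "code_execution": Backend.DOCKER,
    "terminal": Backend.DOCKER,
    "database": Backend.DOCKER,
    "web": Backend.DOCKER,
}


class SandboxUnavailableError(RuntimeError):
    """Hypothetical error name used only in this sketch."""


def select_backend(category: str, *, docker_available: bool) -> Backend:
    """Pick the backend for a category; fail only if it needs Docker."""
    backend = CATEGORY_BACKEND[category]
    if backend is Backend.DOCKER and not docker_available:
        raise SandboxUnavailableError(f"{category} requires Docker")
    return backend


# With Docker down, low-risk categories keep working via subprocess.
git_backend = select_backend("git", docker_available=False)

try:
    select_backend("code_execution", docker_available=False)
    high_risk_failed = False
except SandboxUnavailableError:
    high_risk_failed = True
```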

Comment thread docs/DESIGN_SPEC.md
| [Client Simulation](design/client-simulation.md) | Client Types, Intake, Review Pipeline, Simulation | Synthetic client framework for workload generation and evaluation |
| [Strategy & Trendslop Mitigation](design/strategy.md) | Lenses, Principles, Confidence, Impact | Anti-trendslop mitigation for strategic agents |
| [Self-Improvement](design/self-improvement.md) | Meta-Loop, Signals, Rules, Proposals, Rollout | Self-improving company: signal aggregation, rule engine, improvement proposals, staged rollout |
| [Internationalization](design/internationalization.md) | Locale Resolution, UI Text, Translation Scope | British English UI default, locale-aware Intl-based display formatting, no planned translation framework |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

UI-locale default text conflicts with current implementation.

Line 46 says “British English UI default”, but web/src/utils/locale.ts:20-30 sets APP_LOCALE_FALLBACK = 'en' (neutral English). Please align the row with current behavior (or update runtime code to en-GB if that is the intended spec).

Proposed docs-only fix
-| [Internationalization](design/internationalization.md) | Locale Resolution, UI Text, Translation Scope | British English UI default, locale-aware Intl-based display formatting, no planned translation framework |
+| [Internationalization](design/internationalization.md) | Locale Resolution, UI Text, Translation Scope | Neutral English (`en`) fallback, locale-aware Intl-based display formatting, no planned translation framework |

Based on learnings: If implementation deviates from the spec, alert the user and explain why; the user decides whether to proceed or update the spec; do NOT silently diverge.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/DESIGN_SPEC.md` at line 46, The DESIGN_SPEC row claiming "British
English UI default" conflicts with the runtime constant APP_LOCALE_FALLBACK in
web/src/utils/locale.ts (currently set to 'en'); decide whether the
source-of-truth should be the spec or the code and make a single consistent
change: either update the docs row text to "Neutral English (en) UI default" to
match APP_LOCALE_FALLBACK, or change APP_LOCALE_FALLBACK to 'en-GB' and adjust
any Intl/formatting tests to use en-GB; ensure you update the unique identifiers
in the same commit (the spec table entry and the APP_LOCALE_FALLBACK constant)
so they stay aligned.

@codecov

codecov Bot commented May 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.60%. Comparing base (40ee65b) to head (695ab2b).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1715      +/-   ##
==========================================
- Coverage   84.61%   84.60%   -0.01%     
==========================================
  Files        1781     1781              
  Lines      101888   101888              
  Branches     8968     8968              
==========================================
- Hits        86211    86206       -5     
- Misses      13483    13486       +3     
- Partials     2194     2196       +2     
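
The -0.01% delta follows from the hit counts in the table. The sketch below assumes coverage is computed as hits divided by lines and truncated (not rounded) to two decimals; that assumption reproduces both reported percentages, but it is an inference from the numbers here rather than a statement of how Codecov is configured for this repository.

```python
def coverage_pct(hits: int, lines: int) -> str:
    """Format hits/lines as a percentage truncated to two decimals."""
    # Integer arithmetic avoids floating-point rounding surprises.
    basis_points = hits * 10_000 // lines
    return f"{basis_points // 100}.{basis_points % 100:02d}%"


LINES = 101_888

base = coverage_pct(86_211, LINES)
head = coverage_pct(86_206, LINES)

# Hits, misses, and partials account for every line in both columns.
base_total = 86_211 + 13_483 + 2_194
head_total = 86_206 + 13_486 + 2_196
```

Five fewer hits out of 101,888 lines is enough to cross the two-decimal truncation boundary, which is why a docs-only PR shows a nominal coverage drop.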


Coordination protocol docstring uses specific middleware class names:

- src/synthorg/engine/middleware/coordination_protocol.py:48-49,80-84:
  "TaskLedger populated by coordination middleware" becomes
  "TaskLedger populated by TaskLedgerMiddleware"; same shape for
  ProgressLedger -> ProgressLedgerMiddleware. The specific classes are
  registered at engine/middleware/_defaults.py:80,82 and live in
  engine/middleware/coordination_constraints.py:57,133. Naming the
  concrete middleware is more informative than the generic category.

Skipped findings (logged for audit):

- coderabbitai/inline at docs/DESIGN_SPEC.md:46 ("British English UI
  default" vs APP_LOCALE_FALLBACK = 'en'). The DESIGN_SPEC row already
  separates UI text spelling (British English; colour, behaviour, ...)
  from runtime Intl formatting (locale-resolved with 'en' fallback).
  docs/design/internationalization.md:7 explicitly distinguishes the
  two; the runtime constant is for number/date formatting, not UI text.
- coderabbitai/outside-diff at docs/reference/claude-reference.md:38-69
  (replace text-fenced package tree with Mermaid). The CLAUDE.md rule
  is "Never use text blocks with ASCII/Unicode box-drawing characters
  for diagrams"; both conditions are conjunctive. The block contains
  zero box-drawing characters; it is a structured description list
  with multi-line per-directory descriptions, which Mermaid would not
  render legibly.
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 2, 2026 09:59 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit ade03b7 into main May 2, 2026
77 checks passed
@Aureliolo Aureliolo deleted the docs/audit-cleanup-d-docs-sync branch May 2, 2026 10:11
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 2, 2026 10:11 — with GitHub Actions Inactive
Aureliolo pushed a commit that referenced this pull request May 3, 2026
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub
Models). Commit-based changelog below._

### What you'll notice
- Frontend and UX polishing improves user interface responsiveness and
visual consistency.
- API hygiene and validation enhancements provide smoother and more
reliable interactions.

### What's new
- Introduced typed-boundary helpers enabling better type safety and
parse_typed workflows.
- Added codebase-audit skill prompt tuning for improved project
auditing.

### Under the hood
- Eliminated flaky tests caused by module-level state for more stable
test outcomes.
- Unified image tag management under CLI and Renovate for consistent
dependency updates.
- Added cross-PR file-overlap analysis to the review dependency pull
request skill.
- Updated multiple dependencies including Python, Web, CLI, and
container libraries.
- Improved CI tooling and lock file maintenance for better build
reliability.

<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*
---


## [0.7.8](v0.7.7...v0.7.8) (2026-05-03)


### Features

* **api:** typed-boundary helper + codebase-audit skill prompt tuning
([#1712](#1712))
([40ee65b](40ee65b))
* **boundary:** RFC
[#1711](#1711) Phases 2 + 3
— typed boundaries via parse_typed
([#1720](#1720))
([7b9f409](7b9f409))


### Bug Fixes

* **api:** audit cleanup B -- API hygiene & validation
([#1719](#1719))
([3d790d9](3d790d9))
* audit cleanup C - persistence, concurrency & data integrity
([#1708](#1708))
([#1717](#1717))
([bcce097](bcce097))
* **test:** exterminate xdist-flaky tests with module-level state
([#1713](#1713))
([#1721](#1721))
([8d258dd](8d258dd))
* **web:** audit cleanup E -- frontend & UX polish
([#1710](#1710))
([#1718](#1718))
([3a3591a](3a3591a))


### Refactoring

* **cli:** single source of truth for DHI image tags + Renovate manager
([#1723](#1723))
([57980a2](57980a2))


### Documentation

* audit cleanup D -- public-facing & docs sync
([#1709](#1709))
([#1715](#1715))
([ade03b7](ade03b7))


### Tests

* **engine:** make TestDrainTimeout deterministic + preserve subclass
type in `@ontology_entity`
([#1729](#1729))
([b00fb05](b00fb05))


### CI/CD

* Update CI tool dependencies
([#1703](#1703))
([355a9ff](355a9ff))


### Maintenance

* add cross-PR file-overlap analysis to review-dep-pr skill
([#1722](#1722))
([3861d8a](3861d8a))
* **ci:** unify apko-version under workflow env so Renovate manages it
everywhere ([#1724](#1724))
([9c0a7fd](9c0a7fd))
* consolidate DHI image-pin custom regex managers
([#1726](#1726))
([b8b0cba](b8b0cba))
* **deps:** update dependency chainguard-dev/melange to v0.50.4
([#1701](#1701))
([8cbf83a](8cbf83a))
* Lock file maintenance
([#1705](#1705))
([414cfea](414cfea))
* Lock file maintenance
([#1727](#1727))
([5cb1212](5cb1212))
* Update CLI dependencies
([#1702](#1702))
([9fb57b9](9fb57b9))
* Update Container dependencies
([#1698](#1698))
([6d24fd6](6d24fd6))
* Update dependency @eslint-react/eslint-plugin to v5
([#1704](#1704))
([1cb1294](1cb1294))
* Update Python dependencies
([#1699](#1699))
([8e7af3a](8e7af3a))
* Update Python dependencies to v4.15.0
([#1725](#1725))
([69164c8](69164c8))
* Update Web dependencies
([#1700](#1700))
([715300d](715300d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Audit cleanup D: public-facing & docs sync

1 participant