refactor: tighten check_mock_spec gate, add mock_of[T], drain baseline by Aureliolo · Pull Request #1862 · Aureliolo/synthorg

Aureliolo · 2026-05-10T22:07:31Z

Closes #1855.

What changes

Gate quality (sub-task 1)

Rewrite scripts/check_mock_spec.py with AST-based detection of bare Mock / AsyncMock / MagicMock at typed boundaries only. Catches:
- positional / kwarg argument to a non-Mock-class callable (Pattern A1/A2),
- m: T = Mock() annotated local where T is not Mock-typed (Pattern A3),
- return Mock() / yield Mock() / yield from Mock() from a typed-return function (Pattern A4).
Skips: .return_value chains, attribute-bag scratch objects, dict / list / set / tuple-literal values, name-binding indirect substitutions (which would require cross-module callee resolution).
Delete scripts/mock_spec_baseline.txt (3046 entries). Gate runs zero-tolerance against the live tree.

Helper + ladder (sub-task 2)

Add tests/_shared/mock_of.py with mock_of[T](**overrides) factory backed by create_autospec(T, instance=True, spec_set=True). Re-export from tests/_shared/__init__.py.
Add tests/_shared/test_mock_of.py with 8 unit tests (spec enforcement, override application, unknown-override rejection, Protocol support, async methods, subscript requirement, isinstance compatibility, no state leak between calls).
Document the test-double ladder in docs/reference/conventions.md section 12.1 (FakeClock → create_autospec / mock_of → SimpleNamespace → bare MagicMock forbidden at typed boundary). Update CLAUDE.md Testing bullet with the pointer.

Drain (sub-task 3)

Add scripts/check_mock_spec_ratchet.py PreToolUse hook plus registrations in .claude/settings.json and .opencode/plugins/synthorg-hooks.ts. Blocks (a) edits to tests/*.py that raise the gate's CATCH count for the touched file, (b) edits to scripts/check_mock_spec.py that remove _Verdict.CATCH branches. Drives organic drive-by tightening: every edit reduces or holds.
Convert every surviving Pattern A site across tests/ to spec=ConcreteType / mock_of[T] / FakeClock / SimpleNamespace.
Rewrite tests/unit/scripts/test_check_mock_spec.py with rule-table coverage.

Pre-PR review

14 review agents ran (12 main roster + 2 mini-pass). 9 valid findings consolidated, 9 implemented in the follow-up fix: address pre-PR review feedback commit. 4 false positives rejected (chiefly PEP 758 except OSError, UnicodeDecodeError: flagged by 3 agents unfamiliar with Python 3.14 syntax; empirically verified to parse and catch correctly).

Implemented:

Add design specification and project setup #1 docs-consistency (CRITICAL): add check_mock_spec_ratchet.py to PreToolUse hooks section in docs/reference/convention-gates.md.
Add design specification, license, and project setup #2 docs-consistency (CRITICAL): backfill check_error_codes_ts_in_sync.py and check_no_controller_response_for_domain_errors.py in the gate inventory; update  macro, data/runtime_stats.yaml source, and the CLAUDE.md reference list count.
Design unified provider interface (base classes, protocols) #3 pr-test-analyzer (CRITICAL): add tests/unit/scripts/test_check_mock_spec_ratchet.py with 18 unit tests covering envelope parsing (Bash / unknown tool / no file_path / non-.py / out-of-scope / tests/_shared/ exemption / no-op edit / old_string-not-in-before), ratchet enforcement (hold / raise / reduce / Write tool no-op + raise), gate-edit protection (preserve / remove / add branches), and fail-open path with stderr surfacing.
Evaluate LiteLLM: integration prototype, limitations, alternatives #4 pr-test-analyzer (MAJOR): add test_yield_from_typed_return_caught locking the ast.YieldFrom branch.
Implement provider adapter layer (direct API or via LiteLLM) #5 pr-test-analyzer (MAJOR): add test_set_element_skipped locking the ast.Set collection-skip branch.
Implement model routing engine (role-based, cost-based, fallback chains) #6 silent-failure-hunter (IMPORTANT): stderr-log the two fail-open except Exception: swallows in _scan_text and _check_test_file so transitional failures surface instead of silently allowing edits.
Implement per-call cost tracking and usage logging #7 tool-parity-checker (MINOR): list check_mock_spec_ratchet.py in the OpenCode plugin header comment.
Evaluate and implement message bus (asyncio queues vs external broker) #8 + Implement retry logic, rate limiting, and provider error handling #9 comment-analyzer (SUGGESTION): docstring on _load_gate() plus five AST predicate helpers (_is_empty_splat, _is_literal_none, _has_spec_positional, _has_spec_keyword, _has_dynamic_splat).

Also fixed an unrelated mypy strict failure: three test files imported VersionRepository / FineTuneCheckpointRepository / FineTuneRunRepository from synthorg.persistence.protocol, but those names are TYPE_CHECKING-only re-exports there; routed through the source protocol modules (version_protocol, fine_tune_protocol).

Issue-resolution-verifier confirmed all 5 acceptance criteria RESOLVED at 92-100 % confidence. The stretch goal said "file kept as a structural marker" but the branch deletes scripts/mock_spec_baseline.txt entirely; the CLAUDE.md text now explicitly documents "zero-tolerance, no baseline" so the intent is unambiguous.

Test plan

uv run ruff check src/ tests/ scripts/ — clean.
uv run mypy src/ tests/ — Success: no issues found in 3714 source files.
uv run python -m pytest tests/ -m unit — 28,255 passed, 18 skipped (Windows / logfire-extra environmental skips).
New coverage: 50 gate + ratchet unit tests pass (32 existing + 2 added + 16 new ratchet tests).
uv run python scripts/check_convention_gate_inventory.py — exit 0 (meta-gate parity preserved).

Local cleanup note

Contributors pulling this branch should delete any stale scripts/mock_spec_baseline.txt from their worktree; the gate now runs zero-tolerance and the file is no longer produced.

#1855) - Rewrite scripts/check_mock_spec.py with AST-based detection of bare Mock at typed boundaries only. Catches direct cases (positional / kwarg to a non-Mock-class callable, m: T = Mock() annotated local, return Mock() from a typed fn). Indirect substitution via name binding sits on the test-double ladder rung but is not gated; precise detection there needs cross-module callee resolution. - Delete scripts/mock_spec_baseline.txt (3046 entries). Gate runs zero-tolerance against the live tree. - Add tests/_shared/mock_of.py with mock_of[T](**overrides) factory backed by create_autospec(T, instance=True, spec_set=True). Re-export from tests/_shared/__init__.py. - Document the test-double ladder in docs/reference/conventions.md section 12.1 (FakeClock, mock_of / create_autospec, SimpleNamespace, bare MagicMock forbidden at typed boundary). Update CLAUDE.md Testing bullet with the pointer. - Add scripts/check_mock_spec_ratchet.py PreToolUse hook plus registrations in .claude/settings.json and .opencode/plugins/ synthorg-hooks.ts. Blocks (a) edits to tests/*.py that raise the gate's CATCH count for the touched file, (b) edits to scripts/check_mock_spec.py that remove _Verdict.CATCH branches. Drives organic drive-by tightening: every edit reduces or holds. - Convert every surviving Pattern A site across tests/ to use spec=ConcreteType / mock_of[T] / FakeClock / SimpleNamespace. - Rewrite tests/unit/scripts/test_check_mock_spec.py with new rule-table coverage. - Add tests/_shared/test_mock_of.py with 8 unit tests for the helper. Closes #1855

- Test coverage: add YieldFrom and Set test cases to test_check_mock_spec.py; both AST branches were live but unverified. - Test coverage: add 18-test test_check_mock_spec_ratchet.py covering the new PreToolUse hook (envelope parsing, scope filters, ratchet enforcement, gate-edit weakening protection, fail-open path). - Hook: add stderr logging to the two fail-open broad-except branches so transitional failures surface instead of silently allowing. - Hook: parenthesise except (OSError, UnicodeDecodeError) for reader clarity. - Hook: add docstring on _load_gate. - Gate: add docstrings on five AST predicate helpers. - Docs: add check_mock_spec_ratchet.py to PreToolUse hooks section in convention-gates.md; backfill check_error_codes_ts_in_sync.py and check_no_controller_response_for_domain_errors.py in the inventory list; update runtime_stats macro and CLAUDE.md gate count. - OpenCode plugin: add hook to header comment list. - Imports: route VersionRepository / FineTune* repository imports in three test files through their source protocol modules; the protocol.py re-export was TYPE_CHECKING-only and mypy strict rejected it.

github-actions · 2026-05-10T22:07:41Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-05-10T22:09:34Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8f80d057-f1a0-4465-9e5b-5475925f751a

📥 Commits

Reviewing files that changed from the base of the PR and between 7bd83e2 and ea84b10.

📒 Files selected for processing (2)

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Lighthouse Site
GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Read docs/design/ page before implementing; deviations need approval per DESIGN_SPEC.md

Present every plan for accept/deny before coding

Use no region/currency/locale privilege; use metric units and British English spelling

Every convention PR must ship its enforcement gate via scripts/ checks

Configuration precedence: DB > env > YAML > code default via SettingsService/ConfigResolver; never use os.environ.get outside startup

No hardcoded numeric values except 0/1/-1, HTTP codes, hex masks, powers-of-2; numerics live in settings/definitions/

No from __future__ import annotations (Python 3.14+ has PEP 649); use PEP 758 except syntax except for except A, B: without parens unless binding

Type hints required on public functions; mypy strict mode; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Error classes must follow <Domain><Condition>Error naming and inherit from DomainError, never directly from Exception/RuntimeError; enforced by check_domain_error_hierarchy.py

API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes) must use Pydantic v2 frozen config with extra="forbid", @computed_field for derived fields, NotBlankStr for identifiers

Use args models at every system boundary; call parse_typed() for every external dict ingestion; enforced by check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Async code must use asyncio.TaskGroup for fan-out/fan-in; async helpers must catch Exception and re-raise MemoryError/RecursionError

Clock seam: add clock: Clock | None = None parameter; tests inject FakeClock via tests._shared

Lifecycle: services must own _lifecycle_lock; timed-out stops must mark services unrestartable

Untrusted content (SEC-1): use wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML parsing

Repository CRUD methods: save(entity), get(id), `del...

Files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Timeout/slow test failures indicate source-code regression; never edit tests/baselines/unit_timing.json or scripts/*_baseline.{txt,json} files directly; use ALLOW_BASELINE_GROWTH=1 for explicit bypass

Use pytest markers: @pytest.mark.{unit,integration,e2e,slow}; async tests use auto mode; global timeout 30s; minimum coverage 80%

xdist applies -n 8 --dist=loadfile via pyproject addopts (loadfile prevents 3.14+Windows ProactorEventLoop leak)

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Use FakeClock from tests._shared for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries is blocked by check_mock_spec.py

Hypothesis testing: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)); use HYPOTHESIS_PROFILE=dev for unit properties, HYPOTHESIS_PROFILE=fuzz for fuzz with no timeout

Never skip/xfail flaky tests; fix fundamentally; use asyncio.Event().wait() instead of sleep(large)

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T08:18:43.130Z
Learning: Post-implementation + pre-PR review: after issue, branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR, use `/aurelio-review-pr` for external feedback; fix all valid issues, no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T08:18:43.130Z
Learning: After every squash merge, run `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T08:18:43.130Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T08:18:43.130Z
Learning: Branches: use format `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T08:18:43.130Z
Learning: Use squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

🔇 Additional comments (13)

scripts/check_mock_spec_ratchet.py (6)

1-48: LGTM!

The module docstring accurately describes the two-protection ratchet mechanism, and the imports/constants are well-organized. The sentinel constants and path definitions are clear.

51-82: LGTM!

The AST-based counting correctly identifies return _Verdict.CATCH statements while ignoring docstrings/comments. The None sentinel for parse failures enables proper fail-open behavior.

85-138: LGTM!

The dynamic gate loading, sentinel pattern, and temp-file scanning are well-implemented. The error logging to stderr before returning _SCAN_FAILED provides good observability for debugging transient failures.

141-178: LGTM!

The isinstance guards on all envelope fields prevent TypeError on malformed payloads and correctly short-circuit to fail-open. The replace_all boolean validation is particularly important to prevent "false" (truthy string) from being misinterpreted.

181-237: LGTM!

Both check functions correctly implement the ratchet semantics: _check_test_file blocks increases, _check_gate_file blocks decreases. The _SCAN_FAILED sentinel handling ensures transient failures cannot wedge editing. Error messages are clear and actionable.

240-319: LGTM!

The guard cascade is comprehensive and correctly implements the fail-open contract. The path allowlist logic properly constrains enforcement to the gate file and test files outside tests/_shared/. The PEP 758 unparenthesized except clauses (lines 288, 306) are valid for Python 3.14+. Based on learnings: "This repository targets Python 3.14+ and follows PEP 758."

tests/unit/scripts/test_check_mock_spec_ratchet.py (7)

1-50: LGTM!

The Protocol-based typing for the ratchet module and the dynamic loading approach are clean. Using cast to type the dynamically loaded module is appropriate.

53-76: LGTM!

The fixtures and helpers are well-designed. The fake_tests_root fixture properly isolates tests from the real tests/ tree, and monkeypatch handles cleanup automatically.

79-205: LGTM!

Comprehensive coverage of all fail-open paths: empty stdin, invalid JSON, unsupported tools, missing/non-string file_path, non-Python files, files outside allowlist, tests/_shared/ exemption, no-op edits, and missing old_string. Each test is focused and isolated.

207-308: LGTM!

The test file edit scenarios thoroughly cover the ratchet logic: holding count allows, raising count blocks with proper stderr messaging, and reducing count allows. Both Edit and Write tools are tested.

311-427: LGTM!

The gate file scenarios comprehensively test branch preservation, removal blocking, and addition allowing. The test_gate_edit_docstring_wording_change_allows test is particularly valuable—it locks the AST-based counting behavior and ensures docstring/comment changes don't trigger false positives.

430-556: LGTM!

The _SCAN_FAILED sentinel test correctly exercises the asymmetric failure scenario (BEFORE fails, AFTER succeeds) that would have caused a false block with literal-zero handling. The call_count == 2 assertion verifies both scans were attempted. The malformed envelope tests comprehensively cover all isinstance guards in _compute_after and main.

559-642: LGTM!

The remaining tests complete the fail-open coverage: non-dict JSON payloads, unparseable gate AFTER states, and gate load failures. The stderr assertions in test_gate_load_failure_falls_open verify that failures are properly surfaced for debugging.

Walkthrough

Adds an AST two-pass typed-boundary mock-spec gate and a PreToolUse ratchet script; wires the ratchet into .claude settings and the OpenCode plugin; adds tests/_shared/mock_of with unit tests and exports; converts many tests to use spec-constrained mocks or mock_of/create_autospec; and updates docs and runtime_stats to reflect the new gate inventory.

gemini-code-assist

Code Review

This pull request refactors the mock-spec enforcement mechanism to target bare mocks specifically at typed boundaries, such as constructor arguments and annotated locals, while introducing a "test-double ladder" convention. It includes a new mock_of[T] helper for typed substitutions and a PreToolUse ratchet hook to prevent the introduction of new violations. The review identifies a critical syntax error in the ratchet script's exception handling and suggests enhancing the gate's AST parsing to correctly recognize the mock_of[T] subscript syntax as a valid factory.

gemini-code-assist · 2026-05-10T22:10:25Z

+
+    try:
+        before = path.read_text(encoding="utf-8") if path.exists() else ""
+    except OSError, UnicodeDecodeError:


The comma-separated exception list syntax except OSError, UnicodeDecodeError: is a SyntaxError in all current stable Python versions (3.x < 3.14). While PEP 758 proposes this syntax for Python 3.14, it is not yet available in released versions. In Python 3, multiple exceptions must be enclosed in parentheses.

Additionally, in Python 3.x (before 3.14), except OSError, UnicodeDecodeError: would be interpreted as except OSError as UnicodeDecodeError:, which means UnicodeDecodeError would not be caught, and the exception object would be bound to that name, likely causing the script to crash if a UnicodeDecodeError is actually raised. Since this hook is intended to fail-open, a crash here would instead fail-closed and block all edits.

Suggested change

except OSError, UnicodeDecodeError:

except (OSError, UnicodeDecodeError):

gemini-code-assist · 2026-05-10T22:10:25Z

+def _terminal_callee_name(value: ast.expr) -> str | None:
+    """Terminal identifier of a callee expression, or None."""
+    if isinstance(value, ast.Name):
+        return value.id
+    if isinstance(value, ast.Attribute):
+        return value.attr
+    return None


The _terminal_callee_name helper does not handle ast.Subscript nodes. This causes an inconsistency where create_autospec(T, return_value=Mock()) is correctly skipped (because create_autospec is in _MOCK_FACTORY_NAMES), but the new recommended helper mock_of[T](return_value=Mock()) is caught as a violation. This is because mock_of[T] is a Subscript node, and _terminal_callee_name returns None for it, causing _verdict_for_call_arg to fall through to CATCH.

Suggested change

def _terminal_callee_name(value: ast.expr) -> str | None:

"""Terminal identifier of a callee expression, or None."""

if isinstance(value, ast.Name):

return value.id

if isinstance(value, ast.Attribute):

return value.attr

return None

def _terminal_callee_name(value: ast.expr) -> str | None:

"""Terminal identifier of a callee expression, or None."""

if isinstance(value, ast.Name):

return value.id

if isinstance(value, ast.Attribute):

return value.attr

if isinstance(value, ast.Subscript):

return _terminal_callee_name(value.value)

return None

gemini-code-assist · 2026-05-10T22:10:26Z

+)
+"""Class names whose bare-call form is the candidate set for Pass 1."""
+
+_MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec"})


The mock_of helper should be added to _MOCK_FACTORY_NAMES to ensure that bare mocks used for internal configuration (like return_value=Mock()) are treated with the same exemption as create_autospec. This works in conjunction with updating _terminal_callee_name to handle subscripts.

Suggested change

_MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec"})

_MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec", "mock_of"})

codecov · 2026-05-10T22:17:56Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.89%. Comparing base (362e5c8) to head (ea84b10).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1862      +/-   ##
==========================================
+ Coverage   84.87%   84.89%   +0.01%     
==========================================
  Files        1806     1806              
  Lines      105211   105211              
  Branches     9214     9214              
==========================================
+ Hits        89301    89321      +20     
+ Misses      13679    13658      -21     
- Partials     2231     2232       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 38

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)

tests/unit/security/test_llm_evaluator.py (1)

122-126: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Prefer mock_of[ProviderRegistry] over raw MagicMock(spec=...) at this boundary.

Line 122 creates a typed-boundary test double with MagicMock(spec=ProviderRegistry). Please use mock_of to keep the enforcement/migration path uniform.

Proposed patch

+from tests._shared import mock_of
@@
-    registry = MagicMock(spec=ProviderRegistry)
-    registry.get = MagicMock(side_effect=lambda name: driver_map[name])
-    registry.list_providers = MagicMock(
-        return_value=tuple(sorted(driver_map.keys())),
-    )
+    registry = mock_of[ProviderRegistry]()
+    registry.get.side_effect = lambda name: driver_map[name]
+    registry.list_providers.return_value = tuple(sorted(driver_map.keys()))

As per coding guidelines tests/**/*.py: “mock_of[T](**overrides) for typed-boundary substitutions” and “Import … mock_of from tests._shared”.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/security/test_llm_evaluator.py` around lines 122 - 126, Replace
the typed MagicMock with the project's typed test-double helper: import mock_of
from tests._shared and create registry = mock_of[ProviderRegistry]() instead of
MagicMock(spec=ProviderRegistry); keep the existing overridden methods
(registry.get and registry.list_providers) but set them as attributes on the
mock_of instance (e.g., registry.get = MagicMock(side_effect=...),
registry.list_providers = MagicMock(return_value=...)) so the test still uses
driver_map while conforming to the mock_of[ProviderRegistry] pattern.

tests/unit/meta/chief_of_staff/test_chat.py (1)

161-166: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use mock_of[CompletionProvider] for this provider double.

Line 161 still uses direct AsyncMock(spec=CompletionProvider) at a typed boundary.

Suggested change

 from unittest.mock import AsyncMock
+from tests._shared import mock_of
@@
-        provider = AsyncMock(spec=CompletionProvider)
+        provider = mock_of[CompletionProvider]()
         provider.complete.side_effect = RuntimeError("LLM unavailable")

As per coding guidelines, tests/**/*.py: “Test doubles hierarchy: use … mock_of[T](**overrides) for typed-boundary substitutions” and “Import FakeClock and mock_of from tests._shared”.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/meta/chief_of_staff/test_chat.py` around lines 161 - 166, Replace
the direct AsyncMock(spec=CompletionProvider) usage with the project test-double
helper: create the provider using mock_of[CompletionProvider](...) instead of
AsyncMock to satisfy the typed-boundary rule; update the test to construct
provider =
mock_of[CompletionProvider](complete=AsyncMock(side_effect=RuntimeError("LLM
unavailable"))) and ensure mock_of and FakeClock are imported from
tests._shared; leave the ChiefOfStaffChat(provider=provider,
config=ChiefOfStaffConfig()) usage unchanged.

tests/unit/engine/middleware/test_behavior_tagger.py (1)

146-150: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Prefer mock_of[AgentMiddlewareContext] for the typed context mock.

Line 146 should use the shared typed-double helper instead of direct MagicMock(spec=...).

Suggested change

 from unittest.mock import MagicMock
+from tests._shared import mock_of
@@
-        ctx = MagicMock(spec=AgentMiddlewareContext)
+        ctx = mock_of[AgentMiddlewareContext]()
         ctx.agent_id = "test-agent"
         ctx.task_id = "test-task"
         ctx.metadata = None  # Will cause AttributeError in _infer_tags

As per coding guidelines, tests/**/*.py: “Test doubles hierarchy: use … mock_of[T](**overrides) for typed-boundary substitutions” and “Import FakeClock and mock_of from tests._shared”.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/engine/middleware/test_behavior_tagger.py` around lines 146 - 150,
Replace the untyped MagicMock usage with the shared typed test-double helper:
import mock_of from tests._shared and construct ctx =
mock_of[AgentMiddlewareContext](agent_id="test-agent", task_id="test-task",
metadata=None) so the typed mock is used when calling mw.after_model(ctx); keep
the metadata=None to exercise the _infer_tags path and ensure
AgentMiddlewareContext, mw.after_model, and _infer_tags are referenced by name
when locating the change.

tests/unit/meta/signals/test_scaling.py (1)

212-217: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use mock_of[ScalingService] for these service doubles.

These are typed-boundary substitutions and should use the repo-standard helper instead of raw MagicMock(spec=...): it provides autospecced methods plus spec_set=True, enforcing the test doubles hierarchy across the codebase.

Suggested change

+from tests._shared import mock_of
+
-        service = MagicMock(spec=ScalingService)
-        service.get_recent_decisions = MagicMock(
-            side_effect=RuntimeError("boom"),
-        )
-        service.get_recent_actions = MagicMock(return_value=())
+        service = mock_of[ScalingService](
+            get_recent_decisions=MagicMock(side_effect=RuntimeError("boom")),
+            get_recent_actions=MagicMock(return_value=()),
+        )

Also applies to: 245-247

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/meta/signals/test_scaling.py` around lines 212 - 217, Tests create
service doubles using MagicMock(spec=ScalingService) but repo standard requires
the mock_of helper to produce autospecced, spec_set=True doubles; replace
MagicMock(spec=ScalingService) with mock_of[ScalingService] for the service used
in tests around ScalingSignalAggregator (and the other occurrence at lines
~245-247), and ensure the stubbed methods (get_recent_decisions,
get_recent_actions) are attached to that mock_of instance rather than a raw
MagicMock.

tests/unit/meta/test_signals.py (1)

114-133: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Replace typed-boundary mocks with mock_of[T].

PerformanceTracker and ScalingService are typed-boundary seams. Using mock_of[T] provides autospec'd methods and spec_set=True, enforcing the signature contract that the bare MagicMock(spec=...) pattern skips.

Suggested change

+from tests._shared import mock_of
+
-        tracker = MagicMock(spec=PerformanceTracker)
+        tracker = mock_of[PerformanceTracker]()
...
-        tracker = MagicMock(spec=PerformanceTracker)
-        tracker.get_snapshot = AsyncMock(side_effect=RuntimeError("tracker broken"))
+        tracker = mock_of[PerformanceTracker](
+            get_snapshot=AsyncMock(side_effect=RuntimeError("tracker broken"))
+        )
...
-        service = MagicMock(spec=ScalingService)
-        service.get_recent_decisions = MagicMock(return_value=())
-        service.get_recent_actions = MagicMock(return_value=())
+        service = mock_of[ScalingService](
+            get_recent_decisions=MagicMock(return_value=()),
+            get_recent_actions=MagicMock(return_value=()),
+        )
...
-        scaling_service = MagicMock(spec=ScalingService)
-        scaling_service.get_recent_decisions = MagicMock(return_value=())
-        scaling_service.get_recent_actions = MagicMock(return_value=())
+        scaling_service = mock_of[ScalingService](
+            get_recent_decisions=MagicMock(return_value=()),
+            get_recent_actions=MagicMock(return_value=()),
+        )

Also applies to: 171-172, 210-213, 254-256

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/meta/test_signals.py` around lines 114 - 133, Replace
typed-boundary MagicMock usage with the project's typed mock helpers: change
tracker = MagicMock(spec=PerformanceTracker) to tracker =
mock_of[PerformanceTracker] (and similarly any MagicMock(spec=ScalingService) to
mock_of[ScalingService]) so mocks are autospecced and spec_set=True; preserve
the AsyncMock side effect on tracker.get_snapshot but create tracker as
mock_of[PerformanceTracker] and set tracker.get_snapshot =
AsyncMock(side_effect=[s1, s2]) (or use async_mock_of if your test helpers
provide it) to keep the async behavior while enforcing the real interface for
methods like get_snapshot.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/reference/conventions.md`:
- Around line 331-336: The docs show using create_autospec(T, instance=True)
which is less strict than our helper mock_of[T]; update the text to recommend
create_autospec(T, instance=True, spec_set=True) so the example matches
mock_of[T] behavior and keeps rung-2 strictness consistent with mock_of[T]
(refer to mock_of[T], tests._shared, and the instance=True/spec_set=True
options).

In `@scripts/check_mock_spec_ratchet.py`:
- Around line 177-180: Remove the unnecessary try/except around
Path.is_relative_to: replace the try block with a direct assignment to in_tests
using path.is_relative_to(_TESTS_ROOT) since on supported Python versions
is_relative_to returns False for unrelated paths and never raises ValueError;
update the code in scripts/check_mock_spec_ratchet.py where path.is_relative_to,
_TESTS_ROOT, and in_tests are used to eliminate the dead except branch.

In `@scripts/check_mock_spec.py`:
- Around line 390-396: The function _iter_test_files currently compares an
unresolved shared_dir (constructed as _TESTS_ROOT / "_shared") against
path.parents from rglob, which can mismatch when symlinks or different mounts
are involved; fix this by resolving shared_dir (and/or resolving path when
checking membership) before the comparison so both sides use canonical paths —
update the shared_dir assignment in _iter_test_files to use .resolve(), or
compare using path.resolve() (or Path.is_relative_to with resolved bases) to
reliably exclude the _shared directory.

In `@tests/integration/hr/training/test_training_persistence.py`:
- Around line 80-81: Replace typed AsyncMock test doubles with the shared helper
mock_of from tests._shared: import mock_of and change mock_selector =
AsyncMock(spec=SourceSelector) to mock_selector = mock_of[SourceSelector]()
(then set mock_selector.select.return_value = (NotBlankStr("senior-1"),) as
before). Apply the same conversion for the other occurrences (the mocks around
lines 86-87 and 98), ensuring you use mock_of[SourceSelector] (or the
appropriate collaborator type) so typed collaborator doubles follow the
project's test-double policy.

In `@tests/integration/hr/training/test_training_pipeline.py`:
- Around line 126-127: The test uses AsyncMock(spec=AgentRegistryService) as a
typed boundary double; replace these with the repo-standard
mock_of[AgentRegistryService](...) from tests._shared (and import mock_of at the
top) in the _build_service helper and the other occurrences noted (the registry,
and the other mocks at the 130-137 and 144-149 blocks); keep the same
return_value/attribute overrides (e.g., registry.list_active.return_value =
(senior,)) but construct the mock via mock_of[...] so the helper follows the
canonical typed mock pattern.

In `@tests/integration/integrations/test_rate_limiter_shared_state.py`:
- Line 115: Replace the MagicMock-based bus test double with the repository
helper: import mock_of from tests._shared and change the instantiation from
MagicMock(spec=MessageBus) to mock_of[MessageBus]() (or
mock_of[MessageBus](**overrides) if you need to pass attributes); target the
test variable named "bus" so the mock uses spec_set/boundary semantics
consistent with the repo guidelines.

In `@tests/unit/api/auth/test_postgres_session_store.py`:
- Line 121: Replace the untyped MagicMock usage with the shared typed
test-double helper: change the sqlite_db test double (currently created via
MagicMock(spec=Connection)) to use mock_of[Connection](...) from tests._shared,
and add the import for mock_of at the top of the test module; ensure any
overrides passed to MagicMock are forwarded to mock_of to preserve behavior so
sqlite_db remains a Connection-typed mock.

In `@tests/unit/api/test_auto_wire_meetings.py`:
- Around line 16-17: The helper _fake_registries() returns bare MagicMocks and
must be converted to return typed test doubles using mock_of[...] for the
registry interfaces; update _fake_registries() so it constructs typed mocks for
AgentRegistryService and ProviderRegistry (e.g.
mock_of[AgentRegistryService](...) and mock_of[ProviderRegistry](...)) and apply
any existing overrides previously passed to the bare mocks, ensuring all tests
that call _fake_registries() (including the parametrised cases around lines
~173-179) receive properly typed mocks instead of bare MagicMock instances.

In `@tests/unit/api/test_health.py`:
- Around line 230-233: Replace uses of MagicMock(spec=AppState) with the shared
test helper mock_of[AppState] in the telemetry-status tests (e.g., where
app_state is created for _resolve_telemetry_status and related assertions using
TelemetryStatus); import mock_of (and FakeClock if used elsewhere) from
tests._shared at the top of the test module and update the other occurrences
noted (the blocks around the other assertions) to use
mock_of[AppState](**overrides) instead of MagicMock so the typed-boundary
substitution follows the project testing conventions.

In `@tests/unit/api/test_lifecycle_cleanup_kill_switch.py`:
- Around line 44-55: The tests use MagicMock/AsyncMock with
spec=_no_arg_sync/_no_arg_async for cleanup_expired (and similar mocks like
session_store.cleanup_expired, lockout_store.cleanup_expired, and
config_resolver.get_bool/get_float), which doesn't enforce call signatures;
replace those with unittest.mock.create_autospec targets (using spec_set=True)
so calls are validated against the real callable signatures—e.g.
create_autospec(_no_arg_sync, spec_set=True, return_value=None) for sync mocks
and create_autospec(_no_arg_async, spec_set=True, return_value=None) for async
ones, and similarly create_autospec(_config_get_async, spec_set=True,
return_value=...) for config_resolver methods at the flagged locations.

In `@tests/unit/engine/checkpoint/test_strategy.py`:
- Around line 504-508: Replace AsyncMock(spec=CheckpointRepository) and
AsyncMock(spec=HeartbeatRepository) test doubles with the repository-typed
helper mock_of from tests._shared: import mock_of and create cp_repo =
mock_of[CheckpointRepository](get_latest=..., delete_by_execution=...) and
hb_repo = mock_of[HeartbeatRepository](delete=...) so the doubles carry the
correct typed-boundary semantics; ensure you pass the same AsyncMock return
values (e.g., get_latest returning None, delete_by_execution returning 0) as
overrides to mock_of.

In `@tests/unit/engine/test_agent_engine_lifecycle.py`:
- Around line 116-119: The test uses MagicMock(spec=ExecutionLoop) for the
execution_loop doubles (e.g., variable mock_loop with methods
execute/get_loop_type) but per guidelines you should replace these with
mock_of[ExecutionLoop] and import mock_of (and FakeClock if needed) from
tests._shared; update each construction (all occurrences like the mock_loop at
lines referenced and similar variables at other spots) to use
mock_of[ExecutionLoop](**overrides) and keep the same AsyncMock/return_value
overrides for execute and MagicMock overrides for get_loop_type so the
typed-boundary test doubles follow the project's test-double hierarchy.

In `@tests/unit/engine/test_agent_engine.py`:
- Around line 484-487: Replace the typed test doubles that use
MagicMock/AsyncMock(spec=...) with the project's mock_of[...] helper: find the
mock_loop variable (currently using MagicMock/AsyncMock with spec=ExecutionLoop
and AsyncMock(return_value=mock_result)) and change it to use
mock_of[ExecutionLoop](execute=AsyncMock(return_value=mock_result),
get_loop_type=MagicMock(return_value="react")) or equivalent overrides via
mock_of; ensure tests import mock_of (and FakeClock where needed) from
tests._shared and remove direct MagicMock/AsyncMock(spec=...) usages at the
typed boundaries mentioned (e.g., mock_loop and the other listed locations).

In `@tests/unit/engine/test_personality_trim_notifier.py`:
- Around line 254-257: Replace the direct MagicMock(spec=ConfigResolver) typed
double with the shared helper mock_of[ConfigResolver](**overrides) and import
mock_of (and FakeClock if used elsewhere) from tests._shared; specifically,
update the resolver declarations in test_personality_trim_notifier (the resolver
variable using get_bool/get_int) to use
mock_of[ConfigResolver](get_bool=AsyncMock(side_effect=RuntimeError("db down")),
get_int=AsyncMock(return_value=10)) and make the same replacement for the other
resolver instance around lines 351-354 so all typed boundary doubles follow the
mock_of pattern.

In `@tests/unit/engine/test_workspace_git_worktree.py`:
- Around line 124-126: Create a small local helper function in the test module
(e.g., _make_run_git_mock) that returns an AsyncMock configured with
spec=PlannerWorktreeStrategy._run_git and the common return_value tuple (0, "",
""); replace all repeated AsyncMock(spec=PlannerWorktreeStrategy._run_git,
return_value=(0, "", "")) occurrences in
tests/unit/engine/test_workspace_git_worktree.py with calls to this helper to
centralize setup and reduce duplication while keeping existing behavior.

In `@tests/unit/engine/workflow/test_service_validate.py`:
- Around line 25-30: The test helper _service uses AsyncMock(spec=...) for
constructor-injected repos which violates the project's typed boundary rule;
replace those with the mock_of[...] helper to create autospec'd mocks with
spec_set=True. In _service, construct definition_repo and version_repo using
mock_of[WorkflowDefinitionRepository] and mock_of[VersionRepository] (or
mock_of[typing.AsyncMock[...] ] as appropriate) and pass them into
WorkflowService so the mocks are properly typed and locked to the interface.

In `@tests/unit/hr/evaluation/test_eval_loop_coordinator.py`:
- Around line 31-35: Replace MagicMock(spec=...) typed-boundary doubles with the
shared test-double helper: import mock_of from tests._shared and use
mock_of[PerformanceTracker](**overrides) for tracker and
mock_of[EvaluationService](**overrides) for evaluation (and the other similar
mocks later in the file that use MagicMock(spec=...)). Locate the instantiations
currently using MagicMock(spec=PerformanceTracker) and
MagicMock(spec=EvaluationService) and swap them to mock_of[...] while preserving
configured return values (e.g., get_task_metrics.return_value and evaluate
return behavior) by passing overrides or setting attributes on the returned
mock_of instance.

In `@tests/unit/hr/test_activity_list_recent.py`:
- Line 183: Replace the direct AsyncMock(spec=CostTracker) test double with the
shared helper mock_of[CostTracker] so the typed-boundary mock follows the
project's test-doubles hierarchy; update the test to import mock_of (and
FakeClock if needed) from tests._shared and create cost_tracker =
mock_of[CostTracker](...) instead of AsyncMock(spec=CostTracker) so the
substitution is consistent with the ladder/enforcement gate.

In `@tests/unit/hr/training/test_onboarding_integration.py`:
- Around line 52-59: Replace direct AsyncMock(spec=...) usages with the project
test helper mock_of to standardize typed-boundary doubles: change registry =
AsyncMock(spec=AgentRegistryService) to registry =
mock_of[AgentRegistryService]() (and keep the existing registry.get.return_value
= _make_identity()), training_service = AsyncMock(spec=TrainingService) to
training_service = mock_of[TrainingService]() (retain
training_service.execute.return_value = _make_empty_result()), and onboarding =
AsyncMock(spec=OnboardingService) to onboarding = mock_of[OnboardingService]();
apply the same replacement pattern at the other occurrences referenced (lines
around the blocks using registry, training_service, onboarding) so all typed
test doubles use mock_of[T](**overrides).

In `@tests/unit/hr/training/test_service.py`:
- Around line 78-79: Replace direct AsyncMock(spec=SourceSelector) usage in the
test helper with the project's typed helper mock_of[T]; specifically, change the
creation of selector (currently AsyncMock(spec=SourceSelector) returning
("senior-1",)) to use mock_of[SourceSelector](...) and set its select return
value accordingly, and do the same for the other occurrences noted (the mocks
created at the other two spots). Also import mock_of from tests._shared at the
top of the file and ensure the _make_service helper uses these mock_of-based
substitutes instead of AsyncMock(spec=...).

In `@tests/unit/memory/consolidation/test_dual_mode_strategy.py`:
- Around line 84-86: The helper _make_strategy is creating typed test doubles
using AsyncMock(spec=MemoryBackend) — replace those with the project's typed
mock helper: use
mock_of[MemoryBackend](store=AsyncMock(return_value="summary-1"),
delete=AsyncMock(return_value=True)) (and the analogous replacements for the
other doubles at lines 90-96) and add the import for mock_of from tests._shared;
update references where backend is constructed so the mock_of[...] call supplies
the same AsyncMock overrides (store, delete, etc.) to preserve behavior.

In `@tests/unit/memory/consolidation/test_service.py`:
- Line 24: The test helper _make_backend_mock currently returns an untyped
AsyncMock causing spec-less mocks at the MemoryBackend boundary; update
_make_backend_mock to return a typed test double using
mock_of[MemoryBackend](...) (or equivalent mock_of[MemoryBackend](**overrides))
so that all consumers like MemoryConsolidationService receive a properly-typed
backend mock; ensure any awaited methods used in tests are created as AsyncMock
on the typed mock and adjust call sites if necessary to use the typed mock's
attributes.

In `@tests/unit/memory/consolidation/test_strategy.py`:
- Line 41: Replace the ad-hoc AsyncMock(spec=MemoryBackend) test doubles with
the shared helper mock_of[MemoryBackend] and import mock_of from tests._shared;
specifically, change occurrences where backend = AsyncMock(spec=MemoryBackend)
(and the other listed occurrences) to use backend = mock_of[MemoryBackend]() so
typed-boundary mocks are consistent, and add the import "from tests._shared
import mock_of" at the top of the test module if not already present.

In `@tests/unit/memory/org/test_protocol.py`:
- Line 18: Replace the raw AsyncMock(spec=OrgFactRepository) test double with
the shared helper mock_of[OrgFactRepository] from tests._shared: import mock_of
from tests._shared and change the create call for the protocol-bound store (the
variable currently assigned via AsyncMock(spec=OrgFactRepository)) to use
mock_of[OrgFactRepository](**overrides if needed) so the test follows the repo
test-double helper pattern; keep the same variable name (store) and any override
kwargs currently passed to AsyncMock.

In `@tests/unit/memory/procedural/propagation/test_department_scoped.py`:
- Around line 43-53: Replace the direct AsyncMock/MagicMock(spec=...) test
doubles with the shared typed helper mock_of[...] from tests._shared: e.g., use
mock_of[AgentRegistryService](get=AsyncMock(return_value=source_agent),
list_by_department=AsyncMock(return_value=(target_agent1, target_agent2)))
instead of AsyncMock(spec=AgentRegistryService), use
mock_of[MemoryBackend](store=AsyncMock(return_value="mem-copy-1")) instead of
AsyncMock(spec=MemoryBackend), and use
mock_of[MemoryEntry](category=MemoryCategory.PROCEDURAL) instead of
MagicMock(spec=MemoryEntry); apply the same pattern to the other occurrences
noted (lines ~81-91, 116-124, 156-164, 189-197, 235-246, 275-285) and import
mock_of from tests._shared.

In `@tests/unit/memory/procedural/pruning/test_hybrid_strategy.py`:
- Around line 154-160: Replace the typed AsyncMock instances with the shared
test-double factory: instead of AsyncMock(spec=PruningStrategy) for mock_ttl and
mock_pareto, create them via mock_of[PruningStrategy](...) so the doubles follow
the project pattern; ensure you still set .name ("mock-ttl"/"mock-pareto") and
assign the async prune return value (e.g., prune returning ("mem-expired",) and
("mem-low-relevance",)) on the mock_of-created objects so existing assertions
using mock_ttl.prune and mock_pareto.prune continue to work.

In `@tests/unit/memory/retrieval/hierarchical/test_supervisor.py`:
- Around line 110-113: Replace the direct AsyncMock spec construction with the
project's typed mock helper: instead of AsyncMock(spec=CompletionProvider) for
the provider used in tests (the variable named provider and its .complete
AsyncMock side_effect), call mock_of[CompletionProvider](...) to create the
typed double and pass the same overrides (e.g., setting complete to an AsyncMock
side_effect=RuntimeError("LLM unavailable")); do the same replacement for the
other occurrence around lines 243-246 so all CompletionProvider test doubles use
mock_of[CompletionProvider].

In `@tests/unit/memory/test_architect_tools.py`:
- Line 197: Replace the direct AsyncMock usage for the typed-boundary repository
with the project test helper: where fact_store is created with
AsyncMock(spec=OrgFactRepository) (and the other occurrences at the noted
lines), use mock_of[OrgFactRepository](...) imported from tests._shared instead;
update imports to bring in mock_of (and FakeClock if needed elsewhere) and
instantiate fact_store via mock_of to follow the test doubles convention for
typed-boundary substitutions.

In `@tests/unit/meta/chief_of_staff/test_monitor.py`:
- Line 67: Replace the direct AsyncMock(spec=SnapshotBuilder) test double with
the typed-boundary helper mock_of[SnapshotBuilder]() (import mock_of from
tests._shared) so the double follows the shared test-doubles ladder; update the
builder assignment (and the other noted occurrences for snapshot_builder at the
same pattern) to use mock_of[SnapshotBuilder](**overrides) where needed and
ensure any async behavior is preserved by configuring the mock_of instance
rather than using AsyncMock(spec=...).

In `@tests/unit/meta/mcp/test_handlers_integrations.py`:
- Around line 64-65: Replace the direct MagicMock(spec=ArtifactStorageBackend)
with the shared typed-double helper: create the storage test double via
mock_of[ArtifactStorageBackend] and pass the AsyncMock override for the delete
method (e.g., provide delete=AsyncMock(return_value=True)) so the variable
storage retains the correct typed-spec behavior while still stubbing delete;
update the reference to storage used in tests accordingly.

In `@tests/unit/ontology/drift/test_service.py`:
- Around line 70-71: The test uses raw
AsyncMock(spec=OntologyDriftReportRepository) and then assigns
store.store_report, which breaks the project's typed-boundary mocking
convention; replace these with the project's
mock_of[OntologyDriftReportRepository](...) pattern (import mock_of where
needed) and pass overrides for async methods like store_report via the helper
instead of assigning attributes after creation so spec_set-like behavior is
preserved; update both occurrences (the store variable at the shown lines and
the other site around 141-143) to use mock_of with appropriate AsyncMock
overrides.

In `@tests/unit/ontology/drift/test_strategies.py`:
- Around line 248-251: Replace the AsyncMock(spec=DriftDetectionStrategy) test
doubles with the typed helper: use mock_of[DriftDetectionStrategy](...) and pass
detect overrides as AsyncMock instances (e.g. core =
mock_of[DriftDetectionStrategy](detect=AsyncMock(return_value=clean_report)) and
user = mock_of[DriftDetectionStrategy](detect=AsyncMock())). Do the same for the
other occurrences mentioned (the other core/user doubles) so the doubles use
mock_of with the typed spec_set behavior.

In `@tests/unit/scripts/test_check_mock_spec.py`:
- Around line 235-246: Add a new regression test that asserts attribute-chain
assignments like "x = Mock(); x.return_value = Mock()" are skipped (in addition
to the existing constructor kwarg case tested by
test_inner_mock_in_mock_class_call_skipped); create a function named e.g.
test_return_value_chain_skipped that builds source with a Mock() assigned to a
name and then a direct ".return_value = Mock()" assignment, then call
_MODULE._scan_file(write_test_file(src)) and assert it returns [] — also add the
same pattern test for the other location referenced (lines ~258-266) so both
attribute-assignment and kwarg forms are covered.

In `@tests/unit/security/test_service_safety_integration.py`:
- Around line 62-63: Replace the raw AsyncMock(spec=ApprovalStoreProtocol) usage
with the shared test helper mock_of from tests._shared: create approval_store
via mock_of[ApprovalStoreProtocol](add=AsyncMock()) instead of approval_store =
AsyncMock(spec=ApprovalStoreProtocol) and then assignment approval_store.add =
AsyncMock(); ensure you import mock_of from tests._shared and keep the same
variable name approval_store so typed-boundary expectations for
ApprovalStoreProtocol remain satisfied.

In `@tests/unit/security/test_shadow_mode.py`:
- Around line 82-85: Replace direct MagicMock(spec=...) typed substitutions with
the test helper mock_of from tests._shared: import mock_of and replace
occurrences like MagicMock(spec=RuleEngine), MagicMock(spec=AuditLog),
MagicMock(spec=OutputScanner) with mock_of[RuleEngine](**overrides),
mock_of[AuditLog](...), mock_of[OutputScanner](...) respectively so the mocks
follow the new test-double ladder and typing contract; apply the same change for
the other occurrences noted (around lines 100-104, 118-122, 138-142, 160-164,
179-190, 205-207) and preserve any explicit return_value or attribute overrides
by passing them as keyword overrides to mock_of.

In `@tests/unit/security/timeout/test_timeout_checker.py`:
- Around line 148-149: Replace the direct AsyncMock(spec=TimeoutPolicy) creation
with the shared typed helper: import mock_of from tests._shared and instantiate
mock_policy using mock_of[TimeoutPolicy](...) instead of AsyncMock(spec=...).
Preserve the existing behavior (set mock_policy.determine_action.side_effect =
RuntimeError("boom")) and ensure the variable name mock_policy and its usage in
the test remain unchanged so the typed boundary is used throughout.

In `@tests/unit/settings/test_service.py`:
- Around line 407-409: Replace the typed MagicMock instances with the project's
typed helper: import mock_of from tests._shared and change creations like
MagicMock(spec=MessageBus) to mock_of[MessageBus](...) (preserving any attribute
overrides such as is_running=True and publish=AsyncMock()); update each
occurrence (around the instances mentioned and at lines corresponding to the
other occurrences) so all test doubles for MessageBus use mock_of[MessageBus]
for consistent typed-boundary substitution.

In `@tests/unit/tools/test_process_cleanup.py`:
- Around line 18-19: Replace the MagicMock-based attribute-bag use with
SimpleNamespace: instead of proc = MagicMock(spec=Process) and setting
proc._transport = None, create proc = SimpleNamespace(_transport=None) (and for
the other test blocks that set attributes on proc) so tests model a mutable
attribute container rather than a behaviour mock; update all occurrences that
construct proc via MagicMock(spec=Process) and directly manipulate _transport or
similar attributes to use SimpleNamespace with the same attribute names.

---

Outside diff comments:
In `@tests/unit/engine/middleware/test_behavior_tagger.py`:
- Around line 146-150: Replace the untyped MagicMock usage with the shared typed
test-double helper: import mock_of from tests._shared and construct ctx =
mock_of[AgentMiddlewareContext](agent_id="test-agent", task_id="test-task",
metadata=None) so the typed mock is used when calling mw.after_model(ctx); keep
the metadata=None to exercise the _infer_tags path and ensure
AgentMiddlewareContext, mw.after_model, and _infer_tags are referenced by name
when locating the change.

In `@tests/unit/meta/chief_of_staff/test_chat.py`:
- Around line 161-166: Replace the direct AsyncMock(spec=CompletionProvider)
usage with the project test-double helper: create the provider using
mock_of[CompletionProvider](...) instead of AsyncMock to satisfy the
typed-boundary rule; update the test to construct provider =
mock_of[CompletionProvider](complete=AsyncMock(side_effect=RuntimeError("LLM
unavailable"))) and ensure mock_of and FakeClock are imported from
tests._shared; leave the ChiefOfStaffChat(provider=provider,
config=ChiefOfStaffConfig()) usage unchanged.

In `@tests/unit/meta/signals/test_scaling.py`:
- Around line 212-217: Tests create service doubles using
MagicMock(spec=ScalingService) but repo standard requires the mock_of helper to
produce autospecced, spec_set=True doubles; replace
MagicMock(spec=ScalingService) with mock_of[ScalingService] for the service used
in tests around ScalingSignalAggregator (and the other occurrence at lines
~245-247), and ensure the stubbed methods (get_recent_decisions,
get_recent_actions) are attached to that mock_of instance rather than a raw
MagicMock.

In `@tests/unit/meta/test_signals.py`:
- Around line 114-133: Replace typed-boundary MagicMock usage with the project's
typed mock helpers: change tracker = MagicMock(spec=PerformanceTracker) to
tracker = mock_of[PerformanceTracker] (and similarly any
MagicMock(spec=ScalingService) to mock_of[ScalingService]) so mocks are
autospecced and spec_set=True; preserve the AsyncMock side effect on
tracker.get_snapshot but create tracker as mock_of[PerformanceTracker] and set
tracker.get_snapshot = AsyncMock(side_effect=[s1, s2]) (or use async_mock_of if
your test helpers provide it) to keep the async behavior while enforcing the
real interface for methods like get_snapshot.

In `@tests/unit/security/test_llm_evaluator.py`:
- Around line 122-126: Replace the typed MagicMock with the project's typed
test-double helper: import mock_of from tests._shared and create registry =
mock_of[ProviderRegistry]() instead of MagicMock(spec=ProviderRegistry); keep
the existing overridden methods (registry.get and registry.list_providers) but
set them as attributes on the mock_of instance (e.g., registry.get =
MagicMock(side_effect=...), registry.list_providers =
MagicMock(return_value=...)) so the test still uses driver_map while conforming
to the mock_of[ProviderRegistry] pattern.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6ea84693-5e9b-41a9-955f-3062d3d378a9

📥 Commits

Reviewing files that changed from the base of the PR and between 362e5c8 and f81fdde.

📒 Files selected for processing (123)

.claude/settings.json
.opencode/plugins/synthorg-hooks.ts
CLAUDE.md
data/runtime_stats.yaml
docs/reference/convention-gates.md
docs/reference/conventions.md
scripts/check_mock_spec.py
scripts/check_mock_spec_ratchet.py
scripts/mock_spec_baseline.txt
tests/_shared/__init__.py
tests/_shared/mock_of.py
tests/_shared/test_mock_of.py
tests/evals/prompt/test_rerank_prompt.py
tests/integration/hr/training/test_training_persistence.py
tests/integration/hr/training/test_training_pipeline.py
tests/integration/integrations/test_rate_limiter_shared_state.py
tests/integration/mcp/test_tool_surface.py
tests/unit/a2a/test_gateway.py
tests/unit/api/auth/test_postgres_session_store.py
tests/unit/api/controllers/test_memory_admin.py
tests/unit/api/controllers/test_provider_health.py
tests/unit/api/controllers/test_setup.py
tests/unit/api/controllers/test_ws.py
tests/unit/api/test_auto_wire_meetings.py
tests/unit/api/test_health.py
tests/unit/api/test_lifecycle_cleanup_kill_switch.py
tests/unit/api/test_personality_trim_notifier_factory.py
tests/unit/api/test_state.py
tests/unit/budget/test_coordination_collector.py
tests/unit/budget/test_coordination_collector_properties.py
tests/unit/communication/async_tasks/test_service.py
tests/unit/communication/meeting/test_scheduler_races.py
tests/unit/communication/messages/test_service.py
tests/unit/communication/test_nats_publish.py
tests/unit/engine/checkpoint/test_strategy.py
tests/unit/engine/classification/test_loaders.py
tests/unit/engine/classification/test_semantic_detectors.py
tests/unit/engine/classification/test_sinks.py
tests/unit/engine/evolution/proposers/test_separate_analyzer.py
tests/unit/engine/evolution/test_service.py
tests/unit/engine/middleware/test_behavior_tagger.py
tests/unit/engine/test_agent_engine.py
tests/unit/engine/test_agent_engine_auto_loop.py
tests/unit/engine/test_agent_engine_budget.py
tests/unit/engine/test_agent_engine_degradation.py
tests/unit/engine/test_agent_engine_errors.py
tests/unit/engine/test_agent_engine_lifecycle.py
tests/unit/engine/test_agent_engine_procedural.py
tests/unit/engine/test_approval_gate_events.py
tests/unit/engine/test_coordination_factory.py
tests/unit/engine/test_coordination_service.py
tests/unit/engine/test_personality_trim_notifier.py
tests/unit/engine/test_prompt_safety_sites.py
tests/unit/engine/test_security_factory.py
tests/unit/engine/test_semantic_llm.py
tests/unit/engine/test_workspace_git_worktree.py
tests/unit/engine/test_workspace_merge.py
tests/unit/engine/workflow/test_execution_observer.py
tests/unit/engine/workflow/test_service_validate.py
tests/unit/hr/evaluation/test_eval_loop_coordinator.py
tests/unit/hr/performance/test_composite_quality_strategy.py
tests/unit/hr/performance/test_llm_judge_quality_strategy.py
tests/unit/hr/performance/test_tracker_enhancements.py
tests/unit/hr/test_activity_list_recent.py
tests/unit/hr/training/test_guards.py
tests/unit/hr/training/test_llm_curated.py
tests/unit/hr/training/test_onboarding_integration.py
tests/unit/hr/training/test_service.py
tests/unit/hr/training/test_source_selectors.py
tests/unit/memory/consolidation/test_abstractive.py
tests/unit/memory/consolidation/test_dual_mode_strategy.py
tests/unit/memory/consolidation/test_retention.py
tests/unit/memory/consolidation/test_service.py
tests/unit/memory/consolidation/test_service_kill_switch.py
tests/unit/memory/consolidation/test_strategy.py
tests/unit/memory/consolidation/test_two_tier_strategy.py
tests/unit/memory/consolidation/test_wiki_export.py
tests/unit/memory/org/test_protocol.py
tests/unit/memory/procedural/capture/test_factory.py
tests/unit/memory/procedural/propagation/test_department_scoped.py
tests/unit/memory/procedural/propagation/test_no_propagation.py
tests/unit/memory/procedural/propagation/test_role_scoped.py
tests/unit/memory/procedural/pruning/test_hybrid_strategy.py
tests/unit/memory/procedural/test_evolver.py
tests/unit/memory/procedural/test_pipeline.py
tests/unit/memory/procedural/test_proposer.py
tests/unit/memory/retrieval/hierarchical/test_supervisor.py
tests/unit/memory/retrieval/reranking/test_llm_reranker.py
tests/unit/memory/test_architect_tools.py
tests/unit/memory/test_tool_retriever.py
tests/unit/meta/chief_of_staff/test_chat.py
tests/unit/meta/chief_of_staff/test_monitor.py
tests/unit/meta/mcp/test_all_handlers_wired.py
tests/unit/meta/mcp/test_handlers_integrations.py
tests/unit/meta/signals/test_scaling.py
tests/unit/meta/test_code_modification_strategy.py
tests/unit/meta/test_signals.py
tests/unit/ontology/drift/test_service.py
tests/unit/ontology/drift/test_strategies.py
tests/unit/ontology/test_entity_guard.py
tests/unit/persistence/postgres/test_backend.py
tests/unit/persistence/sqlite/test_preset_override_repo.py
tests/unit/providers/resilience/test_retry.py
tests/unit/providers/test_capability_mutations.py
tests/unit/scripts/test_check_mock_spec.py
tests/unit/scripts/test_check_mock_spec_ratchet.py
tests/unit/scripts/test_generate_comparison.py
tests/unit/security/test_llm_evaluator.py
tests/unit/security/test_llm_fallback_integration.py
tests/unit/security/test_safety_classifier.py
tests/unit/security/test_service_safety_integration.py
tests/unit/security/test_shadow_mode.py
tests/unit/security/test_uncertainty_checker.py
tests/unit/security/timeout/test_timeout_checker.py
tests/unit/settings/test_bridge_config_wiring.py
tests/unit/settings/test_observability_subscriber.py
tests/unit/settings/test_per_op_rate_limit_subscriber.py
tests/unit/settings/test_resolver.py
tests/unit/settings/test_security_subscriber.py
tests/unit/settings/test_service.py
tests/unit/tools/git/test_git_url_validator.py
tests/unit/tools/git/test_git_url_validator_toctou.py
tests/unit/tools/test_process_cleanup.py

CodeRabbit ASSERTIVE review on f81fdde flagged 38 inline + 5 outside-diff typed-boundary substitutions plus several script / docs nits. Gemini called out a Subscript-handling miss in the gate and a path-injection shape in the ratchet. CodeQL surfaced three py/path-injection alerts. Script + docs - scripts/check_mock_spec.py: `_terminal_callee_name` now recurses through `ast.Subscript`, and `mock_of` joins `_MOCK_FACTORY_NAMES`, so `mock_of[T](return_value=Mock())` is treated as a factory call rather than a typed-boundary violation. `_iter_test_files` and `cmd_scan_paths` resolve `_TESTS_ROOT` and `shared_dir` before comparing parents so symlinks / bind-mounts cannot bypass the `_shared` exclusion. - scripts/check_mock_spec_ratchet.py: dropped the dead `is_relative_to` try/except (Python 3.14 never raises here), restructured the path validation so the allow-list check is inline-visible to CodeQL's data-flow analysis. Kept the PEP-758 `except OSError, UnicodeDecodeError:` form per CLAUDE.md. - docs/reference/conventions.md: bumped the rung-2 example from `create_autospec(T, instance=True)` to `create_autospec(T, instance=True, spec_set=True)` so the doc matches `mock_of[T]` strictness. Test conversions - Replaced `(Magic|Async)Mock(spec=T)` at typed-boundary sites with `mock_of[T](**overrides)` from tests/_shared across 35 test files (unit + integration). Where the override list was set via attribute-bag writes after construction, collapsed into kwargs at construction so the spec_set=True invariant is enforced. - tests/_shared/mock_of.py: `__call__` now returns `Any` and `_MockOfMeta.__getitem__` accepts `Any` instead of `type[T]`. The runtime invariants (autospec + spec_set=True + override-key validation) are unchanged; only the static-type ergonomics shift, so callers configure `.method.return_value` and call `.assert_*()` without per-call cast wrappers and Protocol specs no longer trip mypy strict's `type-abstract` check. - tests/unit/scripts/test_check_mock_spec.py: locked the Subscript-factory recognition and the attribute-chain `x.return_value = Mock()` SKIP path in two new regression tests. - tests/unit/tools/test_process_cleanup.py: pure attribute-bag doubles switched from `MagicMock(spec=Process)` to `SimpleNamespace` (with `cast` at the call site) per the test-doubles ladder; the one test that needs type-level `PropertyMock` descriptor wiring keeps the original mock. - tests/unit/api/test_lifecycle_cleanup_kill_switch.py: stub callables switched to `create_autospec(..., spec_set=True)`. Fixed `_config_get_async` signature to match the real `(namespace, key)` two-positional contract; `create_autospec` caught the earlier one-arg mismatch. Skipped findings (factually wrong against current code) - Gemini #1 (`except (OSError, UnicodeDecodeError):` parens): the CLAUDE.md project rule explicitly says PEP 758 `except A, B:` no parens unless binding, and the project is Python 3.14+. The current syntax is correct under the project interpreter. - tests/unit/api/test_health.py (AppState mock_of conversion): `AppState` uses `__slots__` plus property accessors that `create_autospec(spec_set=True)` does not expose, so the looser `MagicMock(spec=AppState)` is the right rung here. - tests/unit/engine/middleware/test_behavior_tagger.py (AgentMiddlewareContext mock_of conversion): Pydantic `Field()` annotations are invisible to `create_autospec`; spec_set=True rejects `agent_id`/`task_id`/`metadata` overrides at runtime. Reverted with an explanatory comment. - tests/unit/engine/test_workspace_git_worktree.py (`_make_run_git_mock` helper extraction): only two call sites with identical setup and no shared `return_value` to factor out; CodeRabbit's described `return_value=(0, '', '')` form is absent from current code. Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/ -m unit -n 8` -- 28,257 passed, 18 skipped in 81s. No regressions. CodeQL alerts 287 / 288 / 289 still open on `refs/pull/1862/head` because CodeQL has not re-run against the new HEAD yet; the next babysit round will re-evaluate.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

tests/unit/settings/test_service.py (1)
409-409: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Replace remaining MagicMock(spec=MessageBus) boundary doubles with mock_of[MessageBus].

Line 409, Line 427, Line 501, Line 558, Line 692, and Line 708 still use the old pattern, so this file is only partially migrated to the enforced helper.
Suggested patch pattern
-        bus = MagicMock(spec=MessageBus)
-        bus.is_running = True
-        bus.publish = AsyncMock()
+        bus = mock_of[MessageBus](
+            is_running=True,
+            publish=AsyncMock(),
+        )
Apply the same replacement to each remaining occurrence above.
As per coding guidelines tests/**/*.py: “Test doubles: use … mock_of[T](**overrides) for typed-boundary substitutions…”.

Also applies to: 427-427, 501-501, 558-558, 692-692, 708-708
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/settings/test_service.py` at line 409, Replace the remaining uses
of MagicMock(spec=MessageBus) with the typed test double helper
mock_of[MessageBus](**overrides) (i.e., change occurrences that create a "bus"
MagicMock to mock_of[MessageBus]) and add the necessary import for mock_of;
search for all remaining MagicMock(spec=MessageBus) usages (they reference
MessageBus and variable names like bus) and update them to use
mock_of[MessageBus] so the file consistently uses the typed-boundary
substitution helper.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 135-149: The current `_check_gate_file` uses naive substring
counts (`before.count(_CATCH_MARKER)`) which miscounts docstrings/comments;
instead parse `before` and `after` with `ast.parse` and count actual ast.Return
nodes whose value represents `_Verdict.CATCH` (e.g., a Return node where value
is an ast.Attribute with value ast.Name id '_Verdict' and attr 'CATCH', and
optionally handle cases like a single Name with id '_Verdict.CATCH' if present);
replace the `before_count`/`after_count` assignments with these AST counts and,
to be robust, fall back to the original substring count only if `ast.parse`
raises a SyntaxError. Ensure you update the logic using the same
`_check_gate_file` function and keep the same return codes and error message
behavior.

In `@scripts/check_mock_spec.py`:
- Around line 311-316: _verdict_for_call_arg currently marks any call whose
callee isn't a known mock factory as _Verdict.CATCH, causing false positives for
helper(Mock()) and accepts_mock(Mock()); change it to first resolve the callee's
parameter annotation (use the same resolver you use elsewhere to inspect the
function signature of call.func via _terminal_callee_name or the project's type
resolver) and only return _Verdict.CATCH when that parameter is definitively
non-Mock-typed; if the annotation cannot be resolved, return _Verdict.SKIP; keep
the existing checks against _MOCK_NAMES and _MOCK_FACTORY_NAMES and return
_Verdict.SKIP for those cases.

---

Duplicate comments:
In `@tests/unit/settings/test_service.py`:
- Line 409: Replace the remaining uses of MagicMock(spec=MessageBus) with the
typed test double helper mock_of[MessageBus](**overrides) (i.e., change
occurrences that create a "bus" MagicMock to mock_of[MessageBus]) and add the
necessary import for mock_of; search for all remaining
MagicMock(spec=MessageBus) usages (they reference MessageBus and variable names
like bus) and update them to use mock_of[MessageBus] so the file consistently
uses the typed-boundary substitution helper.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c735376e-c6f3-440e-b979-f51aaa8a5dbb

📥 Commits

Reviewing files that changed from the base of the PR and between f81fdde and 6942e98.

📒 Files selected for processing (43)

docs/reference/conventions.md
scripts/check_mock_spec.py
scripts/check_mock_spec_ratchet.py
tests/_shared/mock_of.py
tests/_shared/test_mock_of.py
tests/integration/hr/training/test_training_persistence.py
tests/integration/hr/training/test_training_pipeline.py
tests/integration/integrations/test_rate_limiter_shared_state.py
tests/unit/api/auth/test_postgres_session_store.py
tests/unit/api/test_auto_wire_meetings.py
tests/unit/api/test_lifecycle_cleanup_kill_switch.py
tests/unit/engine/checkpoint/test_strategy.py
tests/unit/engine/middleware/test_behavior_tagger.py
tests/unit/engine/test_agent_engine.py
tests/unit/engine/test_agent_engine_lifecycle.py
tests/unit/engine/test_personality_trim_notifier.py
tests/unit/engine/workflow/test_service_validate.py
tests/unit/hr/evaluation/test_eval_loop_coordinator.py
tests/unit/hr/test_activity_list_recent.py
tests/unit/hr/training/test_onboarding_integration.py
tests/unit/hr/training/test_service.py
tests/unit/memory/consolidation/test_dual_mode_strategy.py
tests/unit/memory/consolidation/test_service.py
tests/unit/memory/consolidation/test_strategy.py
tests/unit/memory/org/test_protocol.py
tests/unit/memory/procedural/propagation/test_department_scoped.py
tests/unit/memory/procedural/pruning/test_hybrid_strategy.py
tests/unit/memory/retrieval/hierarchical/test_supervisor.py
tests/unit/memory/test_architect_tools.py
tests/unit/meta/chief_of_staff/test_chat.py
tests/unit/meta/chief_of_staff/test_monitor.py
tests/unit/meta/mcp/test_handlers_integrations.py
tests/unit/meta/signals/test_scaling.py
tests/unit/meta/test_signals.py
tests/unit/ontology/drift/test_service.py
tests/unit/ontology/drift/test_strategies.py
tests/unit/scripts/test_check_mock_spec.py
tests/unit/security/test_llm_evaluator.py
tests/unit/security/test_service_safety_integration.py
tests/unit/security/test_shadow_mode.py
tests/unit/security/timeout/test_timeout_checker.py
tests/unit/settings/test_service.py
tests/unit/tools/test_process_cleanup.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Build Web Assets (melange)
GitHub Check: Test (Python 3.14)
GitHub Check: Lighthouse Site
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Read design spec from docs/design/ page before implementing; deviations need approval per DESIGN_SPEC.md

Present every plan for accept/deny before coding

No region/currency/locale privileged; use metric units; use British English per docs/reference/regional-defaults.md

Every convention PR ships its enforcement gate per docs/reference/convention-gates.md

Configuration precedence: DB > env > YAML > code default via SettingsService/ConfigResolver; no os.environ.get outside startup per docs/reference/configuration-precedence.md

No hardcoded numeric values; numerics live in settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2. Enforced by scripts/check_no_magic_numbers.py

Comments must document WHY only; no reviewer citations / issue back-refs / migration framing. Enforced by check_no_review_origin_in_code.py + check_no_migration_framing.py

No from __future__ import annotations (Python 3.14+ has PEP 649). Use PEP 758 except syntax (except A, B: no parens unless binding)

Type hints on public functions; mypy strict mode. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines

Error classes must follow <Domain><Condition>Error naming from DomainError base; never inherit Exception/RuntimeError directly. Enforced by check_domain_error_hierarchy.py

Pydantic v2: frozen + extra="forbid" on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use @computed_field for derived; use NotBlankStr for identifiers

Args models required at every system boundary; use parse_typed() for every external dict ingestion. Enforced by check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Async fan-out/fan-in must use asyncio.TaskGroup; helpers must catch Exception (re-raise MemoryError/RecursionError)

Clock seam: include clock: Clock | None = None parameter; tests inject FakeClock. Li...

Files:

tests/unit/engine/workflow/test_service_validate.py
tests/unit/ontology/drift/test_strategies.py
tests/unit/memory/test_architect_tools.py
tests/unit/api/auth/test_postgres_session_store.py
tests/unit/meta/mcp/test_handlers_integrations.py
tests/unit/memory/consolidation/test_dual_mode_strategy.py
tests/unit/meta/chief_of_staff/test_chat.py
tests/unit/security/test_llm_evaluator.py
tests/unit/hr/evaluation/test_eval_loop_coordinator.py
tests/unit/engine/checkpoint/test_strategy.py
tests/unit/security/timeout/test_timeout_checker.py
tests/unit/security/test_service_safety_integration.py
tests/unit/engine/test_personality_trim_notifier.py
tests/unit/memory/procedural/propagation/test_department_scoped.py
tests/unit/memory/retrieval/hierarchical/test_supervisor.py
tests/unit/hr/test_activity_list_recent.py
tests/unit/memory/procedural/pruning/test_hybrid_strategy.py
tests/unit/engine/middleware/test_behavior_tagger.py
tests/integration/hr/training/test_training_pipeline.py
tests/unit/security/test_shadow_mode.py
tests/unit/api/test_lifecycle_cleanup_kill_switch.py
tests/integration/hr/training/test_training_persistence.py
tests/unit/ontology/drift/test_service.py
tests/unit/engine/test_agent_engine.py
tests/integration/integrations/test_rate_limiter_shared_state.py
tests/unit/api/test_auto_wire_meetings.py
tests/unit/meta/test_signals.py
tests/unit/settings/test_service.py
tests/unit/engine/test_agent_engine_lifecycle.py
tests/unit/tools/test_process_cleanup.py
tests/unit/hr/training/test_onboarding_integration.py
tests/unit/hr/training/test_service.py
tests/unit/meta/signals/test_scaling.py
tests/_shared/mock_of.py
tests/_shared/test_mock_of.py
tests/unit/scripts/test_check_mock_spec.py
tests/unit/meta/chief_of_staff/test_monitor.py
tests/unit/memory/consolidation/test_strategy.py
tests/unit/memory/org/test_protocol.py
scripts/check_mock_spec.py
tests/unit/memory/consolidation/test_service.py
scripts/check_mock_spec_ratchet.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Timeout/slow test failures indicate source-code regression; never edit tests/baselines/unit_timing.json or any scripts/*_baseline.{txt,json} / scripts/_*_baseline.py. Both families are PreToolUse-blocked. Per-invocation bypass: ALLOW_BASELINE_GROWTH=1 git commit ... (requires explicit user approval)

Test markers: use @pytest.mark.{unit,integration,e2e,slow}. Async tests use auto. Global timeout 30s. Coverage minimum 80%

xdist configuration: use -n 8 --dist=loadfile (auto-applied via pyproject addopts). loadfile prevents Python 3.14+ Windows ProactorEventLoop leak

Windows test compatibility: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: use FakeClock from tests._shared for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags. Bare MagicMock at typed boundaries is blocked by scripts/check_mock_spec.py (zero-tolerance, no baseline)

Vendor-agnostic test names: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Real names allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Hypothesis property tests: use 10 deterministic CI examples; test failures are real bugs (fix + add @example(...)). Never skip/xfail flaky tests; fix fundamentally

Files:

tests/unit/engine/workflow/test_service_validate.py
tests/unit/ontology/drift/test_strategies.py
tests/unit/memory/test_architect_tools.py
tests/unit/api/auth/test_postgres_session_store.py
tests/unit/meta/mcp/test_handlers_integrations.py
tests/unit/memory/consolidation/test_dual_mode_strategy.py
tests/unit/meta/chief_of_staff/test_chat.py
tests/unit/security/test_llm_evaluator.py
tests/unit/hr/evaluation/test_eval_loop_coordinator.py
tests/unit/engine/checkpoint/test_strategy.py
tests/unit/security/timeout/test_timeout_checker.py
tests/unit/security/test_service_safety_integration.py
tests/unit/engine/test_personality_trim_notifier.py
tests/unit/memory/procedural/propagation/test_department_scoped.py
tests/unit/memory/retrieval/hierarchical/test_supervisor.py
tests/unit/hr/test_activity_list_recent.py
tests/unit/memory/procedural/pruning/test_hybrid_strategy.py
tests/unit/engine/middleware/test_behavior_tagger.py
tests/integration/hr/training/test_training_pipeline.py
tests/unit/security/test_shadow_mode.py
tests/unit/api/test_lifecycle_cleanup_kill_switch.py
tests/integration/hr/training/test_training_persistence.py
tests/unit/ontology/drift/test_service.py
tests/unit/engine/test_agent_engine.py
tests/integration/integrations/test_rate_limiter_shared_state.py
tests/unit/api/test_auto_wire_meetings.py
tests/unit/meta/test_signals.py
tests/unit/settings/test_service.py
tests/unit/engine/test_agent_engine_lifecycle.py
tests/unit/tools/test_process_cleanup.py
tests/unit/hr/training/test_onboarding_integration.py
tests/unit/hr/training/test_service.py
tests/unit/meta/signals/test_scaling.py
tests/_shared/mock_of.py
tests/_shared/test_mock_of.py
tests/unit/scripts/test_check_mock_spec.py
tests/unit/meta/chief_of_staff/test_monitor.py
tests/unit/memory/consolidation/test_strategy.py
tests/unit/memory/org/test_protocol.py
tests/unit/memory/consolidation/test_service.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/engine/workflow/test_service_validate.py
tests/unit/ontology/drift/test_strategies.py
tests/unit/memory/test_architect_tools.py
tests/unit/api/auth/test_postgres_session_store.py
tests/unit/meta/mcp/test_handlers_integrations.py
tests/unit/memory/consolidation/test_dual_mode_strategy.py
tests/unit/meta/chief_of_staff/test_chat.py
tests/unit/security/test_llm_evaluator.py
tests/unit/hr/evaluation/test_eval_loop_coordinator.py
tests/unit/engine/checkpoint/test_strategy.py
tests/unit/security/timeout/test_timeout_checker.py
tests/unit/security/test_service_safety_integration.py
tests/unit/engine/test_personality_trim_notifier.py
tests/unit/memory/procedural/propagation/test_department_scoped.py
tests/unit/memory/retrieval/hierarchical/test_supervisor.py
tests/unit/hr/test_activity_list_recent.py
tests/unit/memory/procedural/pruning/test_hybrid_strategy.py
tests/unit/engine/middleware/test_behavior_tagger.py
tests/integration/hr/training/test_training_pipeline.py
tests/unit/security/test_shadow_mode.py
tests/unit/api/test_lifecycle_cleanup_kill_switch.py
tests/integration/hr/training/test_training_persistence.py
tests/unit/ontology/drift/test_service.py
tests/unit/engine/test_agent_engine.py
tests/integration/integrations/test_rate_limiter_shared_state.py
tests/unit/api/test_auto_wire_meetings.py
tests/unit/meta/test_signals.py
tests/unit/settings/test_service.py
tests/unit/engine/test_agent_engine_lifecycle.py
tests/unit/tools/test_process_cleanup.py
tests/unit/hr/training/test_onboarding_integration.py
tests/unit/hr/training/test_service.py
tests/unit/meta/signals/test_scaling.py
tests/_shared/mock_of.py
tests/_shared/test_mock_of.py
tests/unit/scripts/test_check_mock_spec.py
tests/unit/meta/chief_of_staff/test_monitor.py
tests/unit/memory/consolidation/test_strategy.py
tests/unit/memory/org/test_protocol.py
tests/unit/memory/consolidation/test_service.py

{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs must be sourced from data/runtime_stats.yaml via  markers per data/README.md

Files:

docs/reference/conventions.md

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T04:59:27.265Z
Learning: After issue: create branch + commit + push without auto-PR; use `/pre-pr-review` command (gh pr create is hookify-blocked). After PR: use `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T04:59:27.265Z
Learning: Git commits: use `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci). Commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T04:59:27.265Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T04:59:27.265Z
Learning: Branch naming: use `<type>/<slug>` format from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T04:59:27.265Z
Learning: Squash merge PRs; PR body becomes squash commit. Trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T04:59:27.265Z
Learning: After every squash merge → run `/post-merge-cleanup` command

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

tests/unit/engine/workflow/test_service_validate.py
tests/unit/ontology/drift/test_strategies.py
tests/unit/memory/test_architect_tools.py
tests/unit/api/auth/test_postgres_session_store.py
tests/unit/meta/mcp/test_handlers_integrations.py
tests/unit/memory/consolidation/test_dual_mode_strategy.py
tests/unit/meta/chief_of_staff/test_chat.py
tests/unit/security/test_llm_evaluator.py
tests/unit/hr/evaluation/test_eval_loop_coordinator.py
tests/unit/engine/checkpoint/test_strategy.py
tests/unit/security/timeout/test_timeout_checker.py
tests/unit/security/test_service_safety_integration.py
tests/unit/engine/test_personality_trim_notifier.py
tests/unit/memory/procedural/propagation/test_department_scoped.py
tests/unit/memory/retrieval/hierarchical/test_supervisor.py
tests/unit/hr/test_activity_list_recent.py
tests/unit/memory/procedural/pruning/test_hybrid_strategy.py
tests/unit/engine/middleware/test_behavior_tagger.py
tests/integration/hr/training/test_training_pipeline.py
tests/unit/security/test_shadow_mode.py
tests/unit/api/test_lifecycle_cleanup_kill_switch.py
tests/integration/hr/training/test_training_persistence.py
tests/unit/ontology/drift/test_service.py
tests/unit/engine/test_agent_engine.py
tests/integration/integrations/test_rate_limiter_shared_state.py
tests/unit/api/test_auto_wire_meetings.py
tests/unit/meta/test_signals.py
tests/unit/settings/test_service.py
tests/unit/engine/test_agent_engine_lifecycle.py
tests/unit/tools/test_process_cleanup.py
tests/unit/hr/training/test_onboarding_integration.py
tests/unit/hr/training/test_service.py
tests/unit/meta/signals/test_scaling.py
tests/_shared/mock_of.py
tests/_shared/test_mock_of.py
tests/unit/scripts/test_check_mock_spec.py
tests/unit/meta/chief_of_staff/test_monitor.py
tests/unit/memory/consolidation/test_strategy.py
tests/unit/memory/org/test_protocol.py
scripts/check_mock_spec.py
tests/unit/memory/consolidation/test_service.py
scripts/check_mock_spec_ratchet.py

🔇 Additional comments (47)

tests/unit/engine/middleware/test_behavior_tagger.py (1)

146-150: Good typed-boundary mock adjustment.

Using MagicMock(spec=AgentMiddlewareContext) here keeps the exception-path test realistic while avoiding an unrestricted boundary mock.

docs/reference/conventions.md (1)

319-363: Ladder documentation looks clear and consistent.

The new rung definitions and zero-tolerance gate guidance are aligned with the introduced mock_of[T] behavior and enforcement flow.

tests/unit/tools/test_process_cleanup.py (1)

20-70: Nice cleanup to attribute-bag test doubles.

The SimpleNamespace migration is appropriate for these _transport state tests, and the remaining typed MagicMock(spec=Process) case is well-justified.

tests/_shared/mock_of.py (1)

32-113: mock_of[T] helper implementation is solid.

Autospec + spec_set=True plus explicit override-key failure messaging gives the strict behavior this gate/migration needs.

tests/unit/security/test_service_safety_integration.py (1)

63-65: Typed-boundary store double is correctly tightened.

Switching to mock_of[ApprovalStoreProtocol](...) here is a good fit for the new enforcement model.

tests/unit/security/test_llm_evaluator.py (1)

123-127: Registry mock migration looks good.

mock_of[ProviderRegistry] with explicit method overrides preserves intent while enforcing boundary shape.

tests/unit/api/auth/test_postgres_session_store.py (1)

122-122: Good consistency update for the SQLite boundary double.

Using mock_of[Connection]() here matches the repo’s typed test-double convention.

tests/unit/engine/test_personality_trim_notifier.py (1)

255-258: Config resolver boundary doubles are correctly migrated.

Using mock_of[ConfigResolver](...) in both branches keeps these tests aligned with the new typed-boundary enforcement.

Also applies to: 353-356

tests/unit/memory/org/test_protocol.py (1)

8-9: Typed-boundary mock migration looks good.

Using mock_of[OrgFactRepository]() here is consistent with the new gate and helper pattern.

Also applies to: 17-17

tests/unit/memory/consolidation/test_strategy.py (1)

43-43: Consistent mock_of[MemoryBackend] usage is solid.

The typed-boundary substitutions are now uniform across the strategy tests.

Also applies to: 51-51, 62-62, 77-77, 99-99, 118-118, 154-154, 173-173, 193-193

tests/unit/memory/retrieval/hierarchical/test_supervisor.py (1)

23-24: Provider mock migration is correct.

The updated tests now use typed mock_of[CompletionProvider] as intended.

Also applies to: 111-113, 244-246

tests/integration/integrations/test_rate_limiter_shared_state.py (1)

109-109: Factory test now follows the typed mock helper pattern.

The MessageBus boundary mock migration here is clean.

Also applies to: 115-115

tests/unit/meta/chief_of_staff/test_chat.py (1)

35-36: CompletionProvider test double update is well done.

This keeps the error-path assertion intact while using the typed helper.

Also applies to: 162-164

tests/unit/settings/test_service.py (1)

94-99: SettingsRepository fixture migration is correct.

Using mock_of[SettingsRepository](...) here is aligned with the typed-boundary gate direction.

tests/unit/meta/signals/test_scaling.py (1)

213-216: ScalingService boundary mocks are correctly moved to mock_of.

The updates preserve behaviour and align with the typed-double policy.

Also applies to: 245-248

tests/unit/meta/mcp/test_handlers_integrations.py (1)

65-67: ArtifactStorageBackend fixture migration is clean.

Using mock_of[ArtifactStorageBackend] with an async delete override is the right fit.

tests/unit/ontology/drift/test_strategies.py (2)

301-302: Use mock_of[DriftDetectionStrategy] in test_strategy_name as well.

Line 301 and Line 302 still use AsyncMock(spec=DriftDetectionStrategy) for constructor-injected typed collaborators; please align these with mock_of[...] like the other updated tests.

As per coding guidelines tests/**/*.py: “Test doubles: use mock_of[T](**overrides) for typed-boundary substitutions.”

249-285: CORE/USER routing tests now correctly use typed helper doubles.

Good migration to mock_of[DriftDetectionStrategy] with explicit async detect overrides.

tests/unit/memory/procedural/pruning/test_hybrid_strategy.py (1)

155-161: Typed-boundary mock migration is correct.

This keeps the same async prune behaviour while moving injected collaborators to mock_of[PruningStrategy](), which is exactly what the new gate expects.

tests/unit/hr/test_activity_list_recent.py (1)

184-189: CostTracker mock now follows the typed-boundary helper pattern.

Good update on Line 184–Line 189: behaviour stays the same and the substitution is now spec-constrained through mock_of.

tests/unit/security/test_shadow_mode.py (1)

82-207: Consistent typed mock migration across shadow-mode scenarios.

All changed collaborator injections now use mock_of[...] consistently, and the test intent for ACTIVE/SHADOW/DISABLED flows is preserved.

tests/unit/memory/test_architect_tools.py (1)

198-200: Typed collaborator doubles are now aligned with the shared helper contract.

Nice cleanup: repository/exporter boundaries now use mock_of[...] while keeping the async method stubs unchanged.

Also applies to: 217-219, 251-253, 272-274

tests/unit/engine/workflow/test_service_validate.py (1)

25-31: _service() now builds typed repository doubles correctly.

The boundary mocks for definition/version repositories are now created through mock_of[...], matching the enforced test-double path.

tests/unit/security/timeout/test_timeout_checker.py (1)

149-154: Policy-failure test now uses a typed helper mock at the boundary.

This keeps the same failure simulation while moving TimeoutPolicy mocking to mock_of[...].

tests/unit/engine/test_agent_engine_lifecycle.py (1)

117-120: Large typed-boundary mock migration is consistent and solid.

The ExecutionLoop/CostTracker substitutions are now uniformly helper-based, with existing lifecycle/timeout test intent preserved.

Also applies to: 152-155, 188-191, 224-227, 258-261, 290-293, 323-326, 436-439, 465-468, 470-473, 520-523, 562-565

tests/unit/meta/test_signals.py (1)

8-9: Typed-boundary mock migration is clean and consistent.

These changes correctly move typed collaborators to mock_of[...] with explicit overrides where needed, while preserving existing test intent.

Also applies to: 32-32, 115-115, 172-174, 212-215, 257-260

tests/unit/memory/consolidation/test_dual_mode_strategy.py (1)

4-5: _make_strategy default doubles are now correctly typed.

Good migration to mock_of[...] in the helper; this keeps typed seams strict without changing test behaviour.

Also applies to: 10-11, 15-16, 19-20, 86-110

tests/unit/engine/test_agent_engine.py (1)

35-35: Execution-loop seam updates look solid.

The mock_of[ExecutionLoop] adoption is applied consistently in the updated tests and keeps the orchestration-path assertions clear.

Also applies to: 40-43, 485-488, 697-700, 817-820, 1164-1167, 1222-1225, 1279-1282

tests/unit/hr/training/test_onboarding_integration.py (1)

4-5: Good typed-mock cleanup in onboarding bridge tests.

The updated fixture setup and assertions keep intent unchanged while aligning the primary typed seams to mock_of[...].

Also applies to: 11-13, 17-19, 54-60, 70-72, 78-85, 94-96, 99-106, 118-119

tests/unit/api/test_auto_wire_meetings.py (1)

16-18: Registry test-double typing improvements are correctly applied.

Updating _fake_registries() to typed mocks is a strong improvement and the parametrised partial-missing cases remain explicit and readable.

Also applies to: 25-27, 174-175, 179-180

tests/integration/hr/training/test_training_persistence.py (1)

23-26: Integration test collaborator typing is tightened correctly.

Nice use of mock_of[...] here—this keeps the persistence flow test realistic while enforcing typed collaborator boundaries.

Also applies to: 32-32, 81-82, 87-88, 99-99

tests/unit/hr/evaluation/test_eval_loop_coordinator.py (1)

10-15: _make_coordinator typed dependency setup looks correct.

The helper now cleanly uses typed mock_of[...] doubles with sensible defaults, improving contract fidelity in the suite.

Also applies to: 20-23, 32-46

tests/unit/memory/procedural/propagation/test_department_scoped.py (1)

7-16: Department-scoped propagation test seams are consistently hardened.

The typed mock_of[...] migration for registry/backend dependencies is well applied across scenarios.

Also applies to: 44-53, 84-93, 121-128, 163-170, 198-205, 246-256, 288-297

tests/unit/scripts/test_check_mock_spec.py (5)

277-285: Past review finding addressed.

This test locks the .return_value = Mock() attribute-chain skip, complementing test_inner_mock_in_mock_class_call_skipped which covers the constructor-kwarg form. Both ways of wiring child mocks are now regression-protected.

1-63: LGTM!

Clean module setup with:

Clear docstring explaining the gate's purpose

Proper Protocol definition for the module interface

Dynamic module loading to test the script directly

Reusable write_test_file fixture

65-233: LGTM!

Excellent coverage of Pattern A (typed-boundary catches):

Direct mock substitutions at call sites (positional/keyword)

Walrus expressions in call arguments

Annotated assignments with concrete types

Fixture return/yield with type annotations

Async function returns

yield from in typed generators

PropertyMock recognition

The documented skips for indirect name-binding substitutions are well-justified—cross-module type resolution is out of scope for a pure-AST gate.

235-387: LGTM!

Thorough coverage of Pattern B/C/D (non-typed-boundary skips):

Inner mocks in mock/autospec/mock_of calls

.return_value chain assignments

Attribute-bag scratch usage

Collection literals (dict/list/tuple/set)

Mock-typed annotations

Unannotated returns

Class/module scope assignments

These tests lock the "lower rungs" of the test-double ladder that the gate intentionally does not enforce.

389-453: LGTM!

Legacy carry-over tests ensure backward compatibility:

Specced mocks ignored

First positional arg treated as spec

unittest.mock attribute-call form detection

Non-mock calls not flagged

Parse error handling

tests/_shared/ directory exclusion

tests/unit/api/test_lifecycle_cleanup_kill_switch.py (5)

23-45: LGTM!

The spec target functions are well-designed for use with create_autospec():

_no_arg_sync() and _no_arg_async() correctly model zero-arg cleanup methods

_config_get_async(_namespace, _key) correctly mirrors the two-positional-arg signature of ConfigResolver.get_* methods, ensuring signature enforcement catches regressions

The docstrings clearly document the purpose without excessive explanation.

47-97: LGTM!

The _build_app_state factory correctly applies the test-double ladder:

create_autospec(..., spec_set=True) enforces call signatures at typed boundaries

SimpleNamespace serves as a lightweight attribute-bag for store containers

Consistent pattern across all three stores and the config resolver

100-128: LGTM!

The _run_loop_ticks helper provides deterministic loop control by:

Monkeypatching asyncio.sleep to count iterations

Cancelling after exactly N ticks via CancelledError

Yielding control via await real_sleep(0) to allow task progress

This avoids wall-clock races and makes tests deterministic.

131-166: LGTM!

Both test cases correctly verify the kill-switch behaviour:

enabled=True: All cleanup paths run on every tick

enabled=False: No cleanup paths run, but resolver is still consulted (live re-enable)

The assertions use appropriate call_count for sync and await_count for async methods.

206-306: LGTM!

The exception isolation tests comprehensively verify:

Ticket failure doesn't block session/lockout cleanup

Session failure doesn't block lockout cleanup

MemoryError correctly propagates (OOM must not be swallowed)

The create_autospec with side_effect is properly used to simulate failures.

tests/_shared/test_mock_of.py (4)

1-11: LGTM!

Clean module setup with appropriate imports and module-level pytest.mark.unit marker.

13-27: LGTM!

The local spec classes provide good coverage:

_ConcreteService tests both sync and async method autospeccing

_Renderer as a Protocol verifies Protocol support (important for structural typing)

29-69: LGTM!

Comprehensive test coverage for mock_of[T]:

Autospec behavior (call tracking, unknown attribute rejection)

Override application and rejection of unknown keys

Protocol support

Async method awaiting

Subscript syntax enforcement

72-79: LGTM!

The state isolation test correctly verifies that separate mock_of[T]() calls return independent instances with no shared state—critical for test reliability.

coderabbitai · 2026-05-11T05:08:51Z

+def _verdict_for_call_arg(call: ast.Call) -> _Verdict:
+    """Return CATCH if *call*'s callee is non-Mock, else SKIP."""
+    terminal = _terminal_callee_name(call.func)
+    if terminal in _MOCK_NAMES or terminal in _MOCK_FACTORY_NAMES:
+        return _Verdict.SKIP
+    return _Verdict.CATCH


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't treat every call argument as a typed boundary.

Lines 311-316 currently return CATCH for any non-mock callee, so helper(Mock()) and even accepts_mock(Mock()) are flagged when the parameter is untyped or explicitly Mock-typed. That widens the gate beyond the PR contract of catching only non-Mock-typed boundaries and will create false positives. Please resolve the callee parameter annotation before catching, or default unresolved call sites to SKIP instead of unconditional CATCH.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/check_mock_spec.py` around lines 311 - 316, _verdict_for_call_arg currently marks any call whose callee isn't a known mock factory as _Verdict.CATCH, causing false positives for helper(Mock()) and accepts_mock(Mock()); change it to first resolve the callee's parameter annotation (use the same resolver you use elsewhere to inspect the function signature of call.func via _terminal_callee_name or the project's type resolver) and only return _Verdict.CATCH when that parameter is definitively non-Mock-typed; if the annotation cannot be resolved, return _Verdict.SKIP; keep the existing checks against _MOCK_NAMES and _MOCK_FACTORY_NAMES and return _Verdict.SKIP for those cases.

… 3 codeql, 9 ci) CI broke on the round-1 push because TrainingService.__init__ deep-copies its extractors dict, and the mock_of[Protocol]() autospec instances the round-1 conversion threaded through those extractors carry references to _abc._abc_data (Protocol's abstract-method registry) which is unpicklable. Local unit-only verification missed it; the CI integration suite caught it with TypeError: cannot pickle 'module' object / cannot pickle '_abc._abc_data' object across nine tests. CodeRabbit's round-2 review flagged two additional gate-internals concerns and six remaining MessageBus migration sites; CodeQL re-evaluated and re-fired three path-injection alerts (288, 289, 290) on the restructured ratchet because the is_relative_to boolean form was not consistently recognised as a sanitiser. CI failure fix - tests/integration/hr/training/test_training_pipeline.py: reverted the four mock_of[Protocol] collaborators inside _build_service to AsyncMock(spec=Protocol). The looser form survives copy.deepcopy because the spec class is stored as a plain attribute rather than via the autospec proxy layer that pulls in _abc._abc_data. Reverting preserves CodeRabbit's substantive ask (typed-boundary discipline) -- the original code already had spec=Protocol -- and unblocks the integration suite. An explanatory comment on _build_service documents the deepcopy constraint so the next reviewer sees the reason. CodeRabbit round 2 - scripts/check_mock_spec_ratchet.py: replaced the substring _CATCH_MARKER count in _check_gate_file with _count_catch_returns, an AST walk that counts only ast.Return nodes whose value is _Verdict.CATCH. Docstring / comment / error-message edits that touch the literal string no longer trip the ratchet; a real branch removal is still caught. Falls back to substring count if the source fails to parse so a transient syntax error during interactive editing cannot wedge the gate. - tests/unit/scripts/test_check_mock_spec_ratchet.py: updated the fake_gate fixture to use real def f(): return _Verdict.CATCH branches (the substring-only stubs the previous tests relied on are no longer counted), and added test_gate_edit_docstring_wording_change_allows to lock the substring-vs-AST distinction so a future regression cannot re-introduce the false-positive class. - tests/unit/settings/test_service.py: migrated the remaining six MagicMock(spec=MessageBus) boundary doubles (lines 409, 427, 501, 558, 692, 708 in the prior file) to mock_of[MessageBus](is_running=..., publish=...). Round 1 only caught the first instance because the migration script's pattern required attribute writes whose RHS was a (Magic|Async)?Mock factory call; the MessageBus sites set is_running=True as a literal RHS and slipped past. Skipped CodeRabbit finding - scripts/check_mock_spec.py:311-316 (`_verdict_for_call_arg` too aggressive): CodeRabbit asked the gate to only CATCH when the callee's parameter annotation is definitively non-Mock-typed, and default to SKIP when the annotation cannot be resolved. The gate's own docstring documents that resolving callee parameter annotations is explicitly out of scope for a pure-stdlib AST gate (`docs/reference/conventions.md` section 12.1 covers the indirect-substitution discipline). The PR contract is conversion to mock_of[T] plus subscript-factory recognition, not a fundamental gate-semantics rework -- that would be a separate refactor with its own design discussion. CodeQL path-injection alerts (288, 289, 290) - scripts/check_mock_spec_ratchet.py: switched the allowlist check from Path.is_relative_to to Path.relative_to inside an explicit try/except. relative_to raises ValueError for out-of-tree paths, which is the form CodeQL's py/path-injection query recognises as a sanitiser; the boolean is_relative_to form combined with an adjacent conjunct (`shared_dir not in path.parents`) was not consistently picked up on the previous round. Resolution and the .py suffix check now run inside the same try/except that catches OSError/ ValueError so the read sink at line 220 only fires on a fully validated path. Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/ -m "unit or integration" -n 8` -- 29,735 passed, 49 skipped in 295s. The nine round-1 integration failures (test_training_pipeline.py) now pass.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 1-28: Update the module docstring to accurately reflect the
current gate-check behavior: replace the statement that the hook "counts the
literal substring `_Verdict.CATCH`" with a note that it uses an AST-based count
via the helper function `_count_catch_returns` (which walks the parsed AST and
counts `return _Verdict.CATCH` nodes), and briefly mention that parse/import
errors cause the hook to fail open as implemented; reference
`_count_catch_returns` and the ratchet behavior in the same paragraph so
maintainers know the implementation approach.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ca78b2cb-7c27-4f6c-bb37-9c970efd45e6

📥 Commits

Reviewing files that changed from the base of the PR and between 6942e98 and 519831b.

📒 Files selected for processing (4)

scripts/check_mock_spec_ratchet.py
tests/integration/hr/training/test_training_pipeline.py
tests/unit/scripts/test_check_mock_spec_ratchet.py
tests/unit/settings/test_service.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Build Web Assets (melange)
GitHub Check: Test (Python 3.14)
GitHub Check: Lighthouse Site
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Read docs/design/ page before implementing; deviations need approval per DESIGN_SPEC.md

Present every plan for accept/deny before coding

No region/currency/locale privileged; use metric units; use British English

Every convention PR ships its enforcement gate

Configuration precedence: DB > env > YAML > code default via SettingsService/ConfigResolver; no os.environ.get outside startup

No hardcoded numerics; allow only 0/1/-1, HTTP codes, hex masks, powers-of-2; numerics live in settings/definitions/

After issue: branch + commit + push (no auto-PR); use /pre-pr-review; after PR: use /aurelio-review-pr for external feedback; fix EVERYTHING valid; no deferring

Use uv sync for dependencies; uv run ruff check/format for linting; uv run mypy for strict type-checking; uv run pytest with markers for testing

No from __future__ import annotations (Python 3.14 has PEP 649); use PEP 758 except except A, B: requires parens if binding

Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Errors must be <Domain><Condition>Error inheriting from DomainError, never Exception/RuntimeError directly; enforced by check_domain_error_hierarchy.py

Use Pydantic v2 frozen + extra="forbid" on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); @computed_field for derived; NotBlankStr for identifiers

Args models at every system boundary; use parse_typed() for every external dict ingestion; enforced by check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception and re-raise MemoryError/RecursionError

Clock seam: use clock: Clock | None = None; tests inject FakeClock; lifecycle: services own _lifecycle_lock; timed-out stops mark unrestartable

Untrusted content (SEC-1): use wrap_untrusted() from `eng...

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py
tests/unit/settings/test_service.py
scripts/check_mock_spec_ratchet.py
tests/integration/hr/training/test_training_pipeline.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Timeout/slow failures indicate source-code regression; never edit tests/baselines/unit_timing.json or any scripts/*_baseline.{txt,json} / scripts/_*_baseline.py; bypass via ALLOW_BASELINE_GROWTH=1 requiring explicit user approval

Use markers: @pytest.mark.{unit,integration,e2e,slow}; async tests use auto; global timeout 30s; coverage minimum 80%

xdist: use -n 8 --dist=loadfile auto-applied via pyproject addopts (loadfile prevents 3.14+Windows ProactorEventLoop leak)

Windows unit tests: use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; never bare MagicMock at typed boundaries; enforced by check_mock_spec.py

Import FakeClock and mock_of from tests._shared; inject via clock= and helper's spec subscript

Hypothesis: use 10 deterministic CI examples; failures are real bugs (fix + add @example(...)); never skip/xfail flaky tests; fix fundamentally using asyncio.Event().wait() instead of sleep(large)

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py
tests/unit/settings/test_service.py
tests/integration/hr/training/test_training_pipeline.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py
tests/unit/settings/test_service.py
tests/integration/hr/training/test_training_pipeline.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T05:43:55.831Z
Learning: Numerics in README and public docs must be sourced from `data/runtime_stats.yaml` via `<!--RS:NAME-->` markers

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T05:43:55.831Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced; signed commits required on protected refs (GPG/SSH or GitHub App)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T05:43:55.831Z
Learning: Branches: `<type>/<slug>` from main; pre-commit/pre-push hooks in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T05:43:55.831Z
Learning: Squash merge with PR body becoming commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T05:43:55.831Z
Learning: After every squash merge → `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T05:43:55.831Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

tests/unit/scripts/test_check_mock_spec_ratchet.py
tests/unit/settings/test_service.py
scripts/check_mock_spec_ratchet.py
tests/integration/hr/training/test_training_pipeline.py

🔇 Additional comments (22)

tests/unit/settings/test_service.py (3)

6-29: Typed-boundary mock imports are correctly aligned.

Good migration away from direct MagicMock usage to mock_of + MessageBus typing.

94-104: Repository fixture uses typed mock construction correctly.

mock_of[SettingsRepository] plus explicit async method stubs makes this fixture deterministic and aligned with the new gate semantics.

409-412: MessageBus test doubles are now consistently typed and explicit.

Nice cleanup: all affected notification-path tests now use mock_of[MessageBus] with clear async publish behaviour.

Also applies to: 428-431, 503-506, 561-564, 696-699, 713-716

tests/integration/hr/training/test_training_pipeline.py (3)

20-21: Good dependency tightening for typed collaborators.

These imports make the helper’s typed mock specs explicit and readable.

Also applies to: 48-49

123-136: Deepcopy exception is clearly justified.

The docstring captures the concrete mock_of/create_autospec deepcopy failure mode and why AsyncMock(spec=...) is the safe fallback in this helper.

139-140: Default doubles are now properly spec-constrained.

Switching helper defaults from loose mocks to AsyncMock(spec=...) materially improves boundary safety while preserving existing test setup ergonomics.

Also applies to: 143-144, 149-150, 157-158

scripts/check_mock_spec_ratchet.py (7)

46-74: LGTM!

The AST-based counting with substring fallback is correctly implemented. The walker properly identifies return _Verdict.CATCH statements while ignoring docstrings and comments.

77-88: LGTM!

Dynamic module loading is implemented correctly with appropriate error handling.

91-112: LGTM!

Fail-open behavior with proper temp file cleanup and stderr logging. The private API access is appropriately annotated.

115-137: LGTM!

The function correctly handles both Edit and Write tool envelopes with appropriate no-op detection.

140-183: LGTM!

Both check functions correctly implement their respective ratchet directions with clear error messages.

225-232: Confirm shared_dir in path.parents works as intended.

The check shared_dir in path.parents correctly identifies files under tests/_shared/ because Path.parents returns a sequence of parent directories as Path objects. For /repo/tests/_shared/foo.py, the parents include the resolved shared_dir path.

However, note that if path equals shared_dir exactly (editing tests/_shared/ itself, which is a directory), this check would not match. Since the earlier path.suffix != ".py" guard filters directories, this is safe.

234-237: CodeQL path-injection alert is mitigated by prior sanitization.

The path.read_text() at line 235 operates on a path that has already passed through the relative_to(tests_root) sanitizer (lines 227-230) or is confirmed equal to gate_path. This ensures only files within the trusted tests/ tree or the gate script itself are read.

The CodeQL alerts in past reviews appear to be false positives given this control flow. The inline comment at lines 209-216 documents this design choice.

tests/unit/scripts/test_check_mock_spec_ratchet.py (9)

1-22: LGTM!

Clear module docstring explaining the hook's dual protection scheme, appropriate imports, and correct pytest.mark.unit marker.

24-49: LGTM!

The _RatchetModule Protocol cleanly documents the tested interface, and dynamic loading at module scope ensures early failure if the script is broken.

52-76: LGTM!

Clean fixture design with proper monkeypatch usage for automatic teardown.

78-116: LGTM!

Comprehensive coverage of all fail-open paths in the hook's guard cascade.

118-203: LGTM!

Good coverage of path-filtering logic including the tests/_shared/ exemption.

206-275: LGTM!

The test file edit scenarios correctly verify the ratchet behavior for holding, raising, and reducing violation counts.

278-307: LGTM!

Write tool tests correctly mirror the Edit tool behavior verification.

310-426: LGTM!

Excellent coverage of gate-editing scenarios, especially test_gate_edit_docstring_wording_change_allows which locks the AST-based counting behavior against regression to substring matching.

429-457: LGTM!

Properly verifies fail-open behavior and stderr diagnostics when the gate fails to load.

CI on round-2's head is green except for the standalone CodeQL status check, which stays FAILURE while py/path-injection alerts 288, 289, and 291 sit open against scripts/check_mock_spec_ratchet.py. CodeRabbit's round-3 review surfaced one quick docstring nit -- the module header still described the substring-counting gate strategy that round 2 replaced with the AST walk. CodeRabbit nit - scripts/check_mock_spec_ratchet.py: refreshed the module docstring to describe _count_catch_returns's AST walk and the substring fallback used when the parse fails open, matching what round 2 actually shipped. CodeQL dismissals (alerts 288, 289, 291 -- py/path-injection) The three alerts all point at the gate-input path-validation block. Rounds 1 and 2 each restructured the validation -- round 1 dropped the dead is_relative_to try/except and inlined the allowlist check, round 2 switched to Path.relative_to inside an explicit try/except so the sanitiser raises ValueError on out-of-tree paths. CodeQL re-fired on each push. The rule's data-flow analysis does not propagate the relative_to-raise sanitiser through the is_gate early-return branch to the downstream read_text sink, and is not following the shared_dir not in path.parents conjunct as a meaningful gate either. The runtime invariants are tight: path is always either the resolved gate file or a path under _TESTS_ROOT and outside tests/_shared/ by the time the read fires. The hook also runs under the harness's working-directory contract -- the file_path from the PreToolUse envelope is the path Claude is about to edit, already scoped to the repo. Dismissing as false positive with a one-line justification pointing at the in-code comment that documents the allowlist; the comment carries codeql[py/path-injection] markers at lines 218 and 235 so a future reviewer encounters the rationale at the cited site rather than only on the GitHub UI. Verification `uv run ruff check scripts/` clean. `uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py` -- all 19 ratchet tests pass.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 105-113: The except block that currently returns 0 on scan failure
should not treat an error as a clean scan; instead log the error and re-raise
(or return a distinct error sentinel that the caller checks) so callers can
distinguish "scan failed" from "0 violations". Update the two try/except sites
that call gate._scan_file (the block that prints "check_mock_spec_ratchet: gate
scan failed, allowing edit..." and the similar block at lines ~155-157) to stop
returning 0 on exception and either raise the caught exc again or return a
documented non-zero sentinel (and update the caller logic to handle that
sentinel) so scan failures are not conflated with zero-violation results.
- Around line 195-201: After json.loads(raw) assign to payload, ensure payload
is a mapping before using .get: check isinstance(payload, dict) and if not,
return 0 (or treat as empty dict), then safely read tool_name =
payload.get("tool_name", "") and tool_input = payload.get("tool_input") or {};
also coerce tool_input to a dict (e.g., if not isinstance(tool_input, dict) set
tool_input = {}) so accesses later in this function (using payload, tool_name,
tool_input) do not raise when JSON returns a non-object type.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7a1d88a8-23d7-42ab-a8b1-4ba0295864e2

📥 Commits

Reviewing files that changed from the base of the PR and between 519831b and 055251e.

📒 Files selected for processing (1)

scripts/check_mock_spec_ratchet.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Lighthouse Site
GitHub Check: Build Web Assets (melange)
GitHub Check: Test (Python 3.14)
GitHub Check: Analyze (python)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: No region/currency/locale should be privileged; use metric units; British English per docs/reference/regional-defaults.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Every convention PR ships its enforcement gate per docs/reference/convention-gates.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Timeout/slow test failures = source-code regression; never edit `tests/baselines/*` or `scripts/*_baseline.*` files or `scripts/_*_baseline.py`; no auto-PR; use `/pre-pr-review` after pushing

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Per-invocation bypass for gate baselines requires explicit user approval: `ALLOW_BASELINE_GROWTH=1 git commit ...`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: `/aurelio-review-pr` for external feedback; fix EVERYTHING valid; no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Run `uv sync` for all dependencies and `uv sync --group docs` for docs toolchain

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Use `uv run ruff check src/ tests/ --fix` for linting and auto-fixing

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Use `uv run ruff format src/ tests/` for code formatting

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Use `uv run mypy src/ tests/` for strict type checking

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: xdist `-n 8 --dist=loadfile` auto-applied via pyproject addopts (loadfile prevents 3.14+Windows ProactorEventLoop leak)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via synthorg-repo-bot)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Pre-commit/pre-push hooks via .pre-commit-config.yaml; hookify rules in .claude/hookify.*.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Squash merge; PR body becomes squash commit; trailers (Release-As, Closes `#N`) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: After every squash merge → `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: Web: see web/CLAUDE.md; CLI: see cli/CLAUDE.md (use `go -C cli`, never `cd cli`); Shell: see ~/.claude/rules/common/bash.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:03:47.682Z
Learning: GitHub queries: `gh issue list` via Bash, NOT MCP `list_issues`

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_mock_spec_ratchet.py

CodeRabbit's round-4 review (commit-id 055251e) surfaced two robustness gaps in the ratchet hook: scan failures silently collapsed into a zero-violation count, and the top-level JSON payload was assumed to be a mapping. CodeQL re-fired alert 292 on the same path-injection block after round 3's docstring expansion shifted line numbers and the dismissal fingerprint did not match. CodeRabbit findings - scripts/check_mock_spec_ratchet.py: `_scan_text` now returns the `_SCAN_FAILED` sentinel (-1) when the gate raises rather than a literal 0, and `_check_test_file` short-circuits to exit 0 (fail open, with the existing stderr breadcrumb) when either the BEFORE or AFTER scan failed. The old shape masked a transient scan failure as "zero violations" -- if BEFORE failed silently and AFTER succeeded with a positive count, the ratchet would have wrongly blocked the edit. - scripts/check_mock_spec_ratchet.py: `main()` now guards `isinstance(payload, dict)` before calling `.get` on the decoded envelope, and coerces `tool_input` to an empty dict when the field is missing or not a mapping. A malformed envelope (or a future schema change that produces an array / scalar at the top level) no longer crashes the hook with AttributeError; it short- circuits open so the surrounding tool call is unaffected. - tests/unit/scripts/test_check_mock_spec_ratchet.py: two new regression tests -- `test_scan_failure_does_not_block_when_after_count_zero` patches a fake gate that raises on every scan and asserts exit 0 plus the stderr breadcrumb (locks the sentinel branch); `test_non_dict_payload_returns_zero` feeds a JSON array as the envelope and asserts exit 0 (locks the isinstance guard). CodeQL dismissal (alert 292, py/path-injection) - Same shape as 288 / 289 / 291: allowlist enforced via `Path.relative_to` (raises) + exact `gate_path` match, but CodeQL data-flow does not propagate the sanitiser through the `is_gate` early-return to the downstream `Path.resolve` sink. Round-3's docstring expansion shifted the cited line numbers in the file, so the per-alert fingerprint did not carry the previous dismissals forward. Dismissed via API as `dismissed_reason="false positive"` with a one-line pointer at the in-code comment that documents the allowlist. Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/ -m "unit or integration" -n 8` -- 29,737 passed, 49 skipped in 290s. The 21 ratchet-hook tests cover both new regression sites.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 145-154: Validate Edit envelope fields before using them: ensure
file_path is a str before calling Path(file_path), ensure
tool_input.get("old_string") and tool_input.get("new_string") are strings before
using "in" or str.replace, and only treat replace_all as True when
isinstance(tool_input.get("replace_all"), bool) and it's True; if any of these
type checks fail, short-circuit by returning None (keeping the fail-open
behavior). Apply the same explicit isinstance checks and early-return logic to
the other Edit-handling block that uses the same variables (old_string,
new_string, replace_all, before) so malformed JSON doesn't raise.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 90eae35d-75fd-44b4-91e7-1ea9ae1bb66d

📥 Commits

Reviewing files that changed from the base of the PR and between 055251e and f3d5d87.

📒 Files selected for processing (2)

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: Lighthouse Site
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use metric units; British English; no region/currency/locale privileged in code defaults

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py
scripts/check_mock_spec_ratchet.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}. Async tests use auto. Timeout 30s global. Coverage 80% minimum

Windows unit tests: use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags. Bare MagicMock at typed boundaries is blocked by check_mock_spec.py

FakeClock and mock_of import from tests._shared; inject via clock= and helper subscript

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)). Never skip/xfail flaky tests; fix fundamentally. Use asyncio.Event().wait() not sleep(large)

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Read design spec from `docs/design/` before implementing; deviations require approval per DESIGN_SPEC.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Every convention PR ships its enforcement gate

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Timeout/slow test failures = source-code regression; never edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`. Use `ALLOW_BASELINE_GROWTH=1` for explicit bypass

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: After issue completion: branch + commit + push; use `/pre-pr-review` (not auto-PR). After PR: `/aurelio-review-pr` for external feedback. Fix all valid issues; no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: xdist: `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` to prevent 3.14+ Windows ProactorEventLoop leak

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Pre-commit/pre-push hooks via `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: After every squash merge → run `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Web: see `web/CLAUDE.md`. CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`). Shell: see `~/.claude/rules/common/bash.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:34:40.283Z
Learning: Quick commands: `uv sync`, `uv run ruff check/format`, `uv run mypy`, `uv run pytest` with markers and coverage 80% min

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

tests/unit/scripts/test_check_mock_spec_ratchet.py
scripts/check_mock_spec_ratchet.py

CodeRabbit's round-5 review (commit-id f3d5d87) flagged one remaining robustness gap: _compute_after assumed tool_input["old_string"], tool_input["new_string"], and tool_input["file_path"] were strings without checking. A well-formed JSON envelope can still carry non-string values for those fields (list / int / null), in which case the downstream in / replace / Path(...) calls raise TypeError and the hook crashes instead of failing open. CodeRabbit finding - scripts/check_mock_spec_ratchet.py: _compute_after now isinstance-checks old_string and new_string before the in / replace calls and short-circuits to None on type mismatch. replace_all is now compared by identity to True so a non-bool truthy value (the previous bool(...) coercion accepted any truthy object) does not silently flip the flag. main() adds isinstance(file_path, str) before the Path(file_path) call so a malformed envelope cannot reach the path-validation block with a non-string payload. - tests/unit/scripts/test_check_mock_spec_ratchet.py: two new regression tests -- test_edit_with_non_string_fields_returns_zero feeds a list / int for old/new and asserts exit 0; test_edit_with_non_string_file_path_returns_zero feeds an int file_path and asserts exit 0 (locks both isinstance guards). Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py` -- all 23 ratchet tests pass (two new regression cases added).

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 58-65: The current except block in the AST parse
(ast.parse(source)) falls back to source.count(f"{_VERDICT_NAME}.{_CATCH_ATTR}")
which can still block edits; change the except SyntaxError path to return a
clear parse-failure sentinel (e.g., None or a dedicated constant) instead of a
count, and update _check_gate_file() to treat that sentinel as "unparsable" and
allow the edit when either side (old or new) is unparsable; refer to the
ast.parse call, the returned value from this function, the
_VERDICT_NAME/_CATCH_ATTR usage, and the _check_gate_file() function when making
the changes.

In `@tests/unit/scripts/test_check_mock_spec_ratchet.py`:
- Around line 430-452: The stub _broken_scan used in
test_scan_failure_does_not_block_when_after_count_zero currently raises on every
call; change it to raise only on the first invocation and return a positive scan
result on the second so the BEFORE scan simulates failure while the AFTER scan
succeeds. Implement this by making _broken_scan track call count (e.g. closure
variable or iterator) and on first call raise RuntimeError("simulated scan
failure"), and on the second return a list/tuple representing a positive count;
keep the existing fake_gate_module = SimpleNamespace(_scan_file=_broken_scan)
and monkeypatch.setattr(_MODULE, "_load_gate", ...) unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d01bbd19-e0b4-4536-8441-481d533d2910

📥 Commits

Reviewing files that changed from the base of the PR and between f3d5d87 and 1b8f9e3.

📒 Files selected for processing (2)

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: Lighthouse Site
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Configuration precedence: DB > env > YAML > code default via SettingsService/ConfigResolver; no os.environ.get outside startup

No from __future__ import annotations (3.14 has PEP 649); PEP 758 except: except A, B: requires parens when binding

Files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: use @pytest.mark.{unit,integration,e2e,slow}; async tests use auto; 30s global timeout; 80% coverage min; xdist -n 8 --dist=loadfile auto-applied

Windows unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries is blocked by check_mock_spec.py

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)); dual-backend conformance in tests/conformance/persistence/ consumes backend fixture (SQLite + Postgres); enforced by check_dual_backend_test_parity.py

Never skip/xfail flaky tests; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: No region/currency/locale privileged; use metric units; use British English

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Every convention PR must ship its enforcement gate

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: After implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` before `gh pr create`; after PR use `/aurelio-review-pr`; fix all valid feedback with no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Git commits: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced; signed commits required on protected refs

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Branch naming: `<type>/<slug>` from main; use `.pre-commit-config.yaml` hooks and `.claude/hookify.*.md` rules

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Squash merge only; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: After every squash merge → run `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API, not CLI

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Use `go -C cli` for CLI commands, never `cd cli`; see `cli/CLAUDE.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T06:59:14.610Z
Learning: Reference documentation files: load on demand from docs/reference/ for design-spec, conventions, convention-gates, regional-defaults, persistence-boundary, configuration-precedence, errors, sec-prompt-safety, lifecycle-sync, mcp-handler-contract, typed-boundaries, retry-patterns, scaffolding, audit-category-gate-coverage, dead-api-endpoints, pluggable-subsystems, protocols-audit, telemetry

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

CodeRabbit's round-6 review (commit-id 1b8f9e3) flagged two robustness gaps left over from round 2's AST-counter refactor: the SyntaxError fallback could still flip the inequality on a transient mid-edit break, and the round-4 regression test was weaker than the failure mode it was meant to cover. CodeRabbit findings - scripts/check_mock_spec_ratchet.py: _count_catch_returns now returns int | None -- None on SyntaxError -- and _check_gate_file short-circuits to exit 0 when either side fails to parse. The substring fallback (which still miscounted docstrings / comments when both sides were unparseable) is gone. The narrowing is split into two is-None arms so mypy strict accepts the comparison on the joined-int path. - tests/unit/scripts/test_check_mock_spec_ratchet.py: the scan stub in test_scan_failure_does_not_block_when_after_count_zero now raises only on the first call and returns a positive count on the second, exercising the asymmetric BEFORE-fails / AFTER-succeeds path the regression test is meant to lock (previous stub raised on every call, so it did not actually distinguish the silent-zero failure mode from the new sentinel-based fail-open). - tests/unit/scripts/test_check_mock_spec_ratchet.py: new test_unparseable_gate_after_falls_open regression test -- drives the gate-file path with a syntactically broken AFTER state and asserts exit 0; locks the None sentinel. Verification uv run ruff check src/ tests/ scripts/ -- clean. uv run ruff format src/ tests/ scripts/ -- clean. uv run mypy src/ tests/ -- clean (3714 files). uv run python -m pytest tests/ -m unit or integration -n 8 -- 29,740 passed, 49 skipped in 290s. 24 ratchet-hook tests cover the unparseable-gate and asymmetric-scan paths.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

scripts/check_mock_spec_ratchet.py (1)

20-22: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Module docstring still references a removed substring fallback.

Lines 20-22 say fail-open happens via a substring fallback, but _count_catch_returns() now returns None on parse failure. Please update this wording to match current behaviour.

Suggested edit

-   AFTER < BEFORE blocks; parse / import errors fall open via the
-   helper's substring fallback so a transiently broken gate cannot
-   wedge editing.
+   AFTER < BEFORE blocks; parse / import errors fail open via
+   explicit unparseable-sentinel handling so a transiently broken
+   gate cannot wedge editing.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/check_mock_spec_ratchet.py` around lines 20 - 22, Update the module
docstring text that currently claims a "substring fallback" causes parse/import
errors to fail open; instead reflect the current behavior of
_count_catch_returns(): when parsing fails it returns None, and callers treat
that None as a fail-open signal. Replace the substring-fallback wording with a
concise description that _count_catch_returns() returns None on parse failure
and that this None is used to allow editing to proceed.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 58-63: Edit the inline comment that begins "Returns ``None`` when
the source fails to parse..." to remove any reviewer/alert provenance or
back-references and leave only the rationale (why the function returns None and
the behavior being avoided); do the same cleanup for the other comment block
around the code that covers the same topic (the block you referenced at the
other location). Ensure comments explain only the rationale/why and contain no
reviewer names, tool alerts, issue IDs, or migration framing.

In `@tests/unit/scripts/test_check_mock_spec_ratchet.py`:
- Around line 436-438: The docstring currently contains
reviewer-history/back-reference phrasing (mentions CodeRabbit and migration
framing) instead of a behavioural description; update the test docstring(s) that
mention the ``_SCAN_FAILED`` sentinel (and the similar occurrences around the
second block at lines referencing 551-553) to remove reviewer/tool
back-references and instead state the intended behaviour being tested (e.g.,
that the test locks the ``_SCAN_FAILED`` sentinel to prevent returning a literal
0 on scan failure), keeping the focus on "what" the test asserts rather than
"who/when" reviewed it.

---

Duplicate comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 20-22: Update the module docstring text that currently claims a
"substring fallback" causes parse/import errors to fail open; instead reflect
the current behavior of _count_catch_returns(): when parsing fails it returns
None, and callers treat that None as a fail-open signal. Replace the
substring-fallback wording with a concise description that
_count_catch_returns() returns None on parse failure and that this None is used
to allow editing to proceed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8119b577-feff-4099-9543-9c91849f8487

📥 Commits

Reviewing files that changed from the base of the PR and between 1b8f9e3 and d9ebb44.

📒 Files selected for processing (2)

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Lighthouse Site
GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{py,ts,tsx,go}

📄 CodeRabbit inference engine (CLAUDE.md)

No region/currency/locale privileged; use metric units; British English

Files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

**/*.{py,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Comments explain WHY only; no reviewer citations / issue back-refs / migration framing

Vendor-agnostic: NEVER use real vendor names in project code/tests; use example-provider, test-provider, example-{large,medium,small}-001

Files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundary is blocked

FakeClock and mock_of import from tests._shared; inject via clock= and helper's spec subscript

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)); never skip/xfail

Flaky tests: never skip/xfail; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Read `docs/design/` page before implementing; deviations need approval

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Every convention PR ships its enforcement gate

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: `/aurelio-review-pr` for external feedback; fix everything valid

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Pre-commit/pre-push hooks: `.pre-commit-config.yaml`; Hookify rules: `.claude/hookify.*.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: After every squash merge → `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Web: see `web/CLAUDE.md`; CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:30:19.560Z
Learning: Shell: see `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write rules)

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

CodeRabbit's round-7 review (commit-id d9ebb44) flagged three docstring / comment hygiene issues left over from rounds 4 and 6: the module docstring still described the old substring fallback, and two helper / test docstrings carried reviewer-history phrasing (mentions of "CodeRabbit asked / flagged") which violates the project rule against reviewer citations in code (CLAUDE.md: no reviewer citations / issue back-refs / migration framing). CodeRabbit findings - scripts/check_mock_spec_ratchet.py: module-level docstring lines 20-22 now describe the actual unparseable-sentinel handling (returns None on parse failure, callers treat that as fail-open) instead of the removed substring fallback. - scripts/check_mock_spec_ratchet.py: _count_catch_returns docstring drops the "CodeRabbit asked the ratchet to avoid" sentence and states the behavioural rationale directly (a substring fallback would flip the inequality when a mid-edit save changes docstring text on only one side). - scripts/check_mock_spec_ratchet.py: _scan_text docstring loses the "the previous behaviour conflated" migration framing and states why the sentinel exists (asymmetric BEFORE-fails / AFTER-succeeds case computes after > before and wrongly blocks if zero is used as a silent failure signal). - tests/unit/scripts/test_check_mock_spec_ratchet.py: two regression-test docstrings reworded to describe the locked behaviour ("Locks the X sentinel: when ...") instead of the reviewer history that produced them. No functional code change in this round -- only the docstring / comment wording. The ratchet behaviour is unchanged. Verification uv run ruff check src/ tests/ scripts/ -- clean. uv run mypy src/ tests/ -- clean (3714 files). uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py -- 24 ratchet tests pass.

coderabbitai

♻️ Duplicate comments (1)

scripts/check_mock_spec_ratchet.py (1)
160-166: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject malformed replace_all values instead of treating them as single-replace edits.

replace_all_raw is True avoids truthy junk, but it still computes an AFTER state for non-boolean values like "false" or 1. That breaks the hook’s fail-open contract for malformed envelopes and can block or allow based on a synthetic single-replace edit instead of short-circuiting open.
Suggested fix
         replace_all_raw = tool_input.get("replace_all", False)
-        replace_all = replace_all_raw is True
+        if not isinstance(replace_all_raw, bool):
+            return None
+        replace_all = replace_all_raw
         if old not in before:
             return None
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/check_mock_spec_ratchet.py` around lines 160 - 166, The code
currently computes replace_all with "replace_all_raw is True" which still
proceeds for non-boolean inputs (like "false" or 1) and yields a synthetic
single-replace edit; change the validation so that replace_all_raw must be a
boolean and reject malformed values: if replace_all_raw is not an instance of
bool, return None (short-circuit) to preserve the fail-open contract; otherwise
set replace_all = replace_all_raw and proceed with the existing logic that uses
replace_all, old, before, new.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 160-166: The code currently computes replace_all with
"replace_all_raw is True" which still proceeds for non-boolean inputs (like
"false" or 1) and yields a synthetic single-replace edit; change the validation
so that replace_all_raw must be a boolean and reject malformed values: if
replace_all_raw is not an instance of bool, return None (short-circuit) to
preserve the fail-open contract; otherwise set replace_all = replace_all_raw and
proceed with the existing logic that uses replace_all, old, before, new.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c1c65b61-ed23-4a80-8ed1-9ce7586939bd

📥 Commits

Reviewing files that changed from the base of the PR and between d9ebb44 and 7bd83e2.

📒 Files selected for processing (2)

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Lighthouse Site
GitHub Check: Test (Python 3.14)
GitHub Check: Build Web Assets (melange)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (1)

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min; xdist -n 8 --dist=loadfile auto-applied via pyproject addopts

Windows testing: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries forbidden (blocked by check_mock_spec.py with zero tolerance, no baseline)

FakeClock and mock_of import from tests._shared; inject via clock= and helper's spec subscript

Hypothesis: 10 deterministic CI examples; test failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() not sleep(large) for waiting

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/scripts/test_check_mock_spec_ratchet.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Read design specification from `docs/design/` before implementing; deviations require approval per DESIGN_SPEC.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: No region/currency/locale privileging; use metric units and British English per docs/reference/regional-defaults.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Every convention PR ships its enforcement gate per docs/reference/convention-gates.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: After issue implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` before creating PR via gh pr create (hookify-blocked); after PR merge, use `/aurelio-review-pr` for external feedback; fix all valid feedback, no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Use `uv sync` for dependency management; `uv run ruff check`, `uv run ruff format`, `uv run mypy`, `uv run pytest` with appropriate markers (unit/integration/e2e)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Pre-commit/pre-push hooks defined in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Squash merge only; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: After every squash merge → run `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API, never in CLI binary

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Web guidelines and design system details in `web/CLAUDE.md`; CLI guidelines in `cli/CLAUDE.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-11T07:54:33.696Z
Learning: Shell, bash, and file-write rules in `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write); use `go -C cli`, never `cd cli`

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_mock_spec_ratchet.py
tests/unit/scripts/test_check_mock_spec_ratchet.py

CodeRabbit's round-8 review (commit-id 7bd83e2) flagged that the round-5 replace_all guard accepts non-boolean inputs and falls through to a synthetic single-replace edit instead of short- circuiting open. ``"false"`` (string truthy), ``1`` (int), or any other non-bool value would pass the ``replace_all_raw is True`` check (evaluates to False) and the function would still compute an AFTER state with a single replacement, giving the ratchet a misleading value to compare. CodeRabbit finding - scripts/check_mock_spec_ratchet.py: ``_compute_after`` now ``isinstance(replace_all_raw, bool)``-checks the value before using it, returning ``None`` for any non-bool input. Boolean ``True``/``False`` flow through unchanged. The strict-fail-open branch matches the rest of the function (old/new/file_path isinstance guards from rounds 4-5). - tests/unit/scripts/test_check_mock_spec_ratchet.py: new ``test_edit_with_non_bool_replace_all_returns_zero`` regression test feeds ``replace_all="false"`` and asserts exit 0 (locks the new isinstance branch). Verification uv run ruff check src/ tests/ scripts/ -- clean. uv run mypy src/ tests/ -- clean (3714 files). uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py -- 25 ratchet tests pass.

## Highlights > _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._ ### What you'll notice - Password and secret fields now include an eye-toggle for easier visibility control. - Containers running without probes are shown as healthy in the doctor command. - Unloaded and missing PR-review agents are restored and available again. ### What's new - Gate baseline protection is enhanced to block em-dashes during writing. ### Under the hood - Replaced Atlas with yoyo-migrations for persistence management. - Refactored codebase extensively, including context-bound user authentication and registry pattern for enums. - Improved linting by draining magic number usages and tightening mock and constant checks. - Updated CI to retry Docker pushes on network timeout errors. - Updated apko lockfiles for dependency management.  :robot: I have created a release *beep* *boop* --- ## [0.8.3](v0.8.2...v0.8.3) (2026-05-12) ### Features * harden gate baseline protection + block em-dashes at write time ([#1860](#1860)) ([b41f151](b41f151)) * **web:** eye-toggle on every password / secret field ([#1873](#1873)) ([9070387](9070387)) ### Bug Fixes * **ci:** retry Docker push on Go net/http deadline + cancellation errors ([#1877](#1877)) ([23a0bfa](23a0bfa)) * **cli:** render running-no-probe containers as healthy in doctor ([#1870](#1870)) ([6263795](6263795)) * restore unloaded and missing PR-review agents ([#1875](#1875)) ([db004fd](db004fd)), closes [#1871](#1871) ### Refactoring * bind authenticated user via ContextVar ([#1858](#1858)) ([57ed0b4](57ed0b4)) * code-structure cleanup (sub-tasks D + F + G + H + I) ([#1859](#1859)) ([362e5c8](362e5c8)) * convert enum dispatch to registry pattern ([#1854](#1854)) ([e90550e](e90550e)) * drain no_magic_numbers baseline to zero via Final hoists ([#1856](#1856) phase 2) ([#1872](#1872)) ([ec8109e](ec8109e)) * drain pagination + loop-init + kill-switch baselines ([#1857](#1857)) ([#1868](#1868)) ([115c3c2](115c3c2)) * **persistence:** replace Atlas with yoyo-migrations ([#1876](#1876)) ([1b7e975](1b7e975)), closes [#1874](#1874) * protocols audit follow-up (REVIEW + fold pass) ([#1869](#1869)) ([af33ddb](af33ddb)) * protocols audit follow-up REMOVE pass ([#1867](#1867)) ([dd1eebc](dd1eebc)) * tighten check_mock_spec gate, add mock_of[T], drain baseline ([#1862](#1862)) ([240a253](240a253)) * tighten check_no_magic_numbers for named module constants ([#1856](#1856)) ([#1866](#1866)) ([90c933b](90c933b)) ### CI/CD * update apko lockfiles ([#1863](#1863)) ([2bd32e6](2bd32e6)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Aureliolo added 2 commits May 10, 2026 19:23

Aureliolo temporarily deployed to atlas May 10, 2026 22:07 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 10, 2026 22:09 — with GitHub Actions Inactive

gemini-code-assist Bot reviewed May 10, 2026

View reviewed changes

github-advanced-security AI found potential problems May 10, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Fixed

Comment thread scripts/check_mock_spec_ratchet.py Dismissed

Comment thread scripts/check_mock_spec_ratchet.py Dismissed

coderabbitai Bot requested changes May 10, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 11, 2026 04:58 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 04:59 — with GitHub Actions Inactive

github-advanced-security AI found potential problems May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Fixed

coderabbitai Bot requested changes May 11, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 11, 2026 05:43 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 05:44 — with GitHub Actions Inactive

github-advanced-security AI found potential problems May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Fixed

coderabbitai Bot requested changes May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py

Aureliolo temporarily deployed to atlas May 11, 2026 06:03 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 06:04 — with GitHub Actions Inactive

github-advanced-security AI found potential problems May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Dismissed

coderabbitai Bot requested changes May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Outdated

Comment thread scripts/check_mock_spec_ratchet.py Outdated

Aureliolo temporarily deployed to atlas May 11, 2026 06:34 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 06:35 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Outdated

Aureliolo temporarily deployed to atlas May 11, 2026 06:58 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 06:59 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Outdated

Comment thread tests/unit/scripts/test_check_mock_spec_ratchet.py

Aureliolo temporarily deployed to atlas May 11, 2026 07:29 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 07:31 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 11, 2026

View reviewed changes

Comment thread scripts/check_mock_spec_ratchet.py Outdated

Comment thread tests/unit/scripts/test_check_mock_spec_ratchet.py Outdated

Aureliolo temporarily deployed to atlas May 11, 2026 07:53 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 07:55 — with GitHub Actions Inactive

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 11, 2026 08:18 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 08:19 — with GitHub Actions Inactive

Aureliolo merged commit 240a253 into main May 11, 2026
76 checks passed

Aureliolo deleted the refactor/refine-check-mock-spec branch May 11, 2026 08:44

Aureliolo temporarily deployed to cloudflare-preview May 11, 2026 08:44 — with GitHub Actions Inactive

synthorg-repo-bot Bot mentioned this pull request May 11, 2026

chore(main): release 0.8.3 #1861

Merged

	except OSError, UnicodeDecodeError:
	except (OSError, UnicodeDecodeError):

	_MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec"})
	_MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec", "mock_of"})

Conversation

Aureliolo commented May 10, 2026

What changes

Gate quality (sub-task 1)

Helper + ladder (sub-task 2)

Drain (sub-task 3)

Pre-PR review

Test plan

Local cleanup note

Uh oh!

github-actions Bot commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

Scanned Files

Uh oh!

coderabbitai Bot commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 10, 2026 •

edited

Loading

coderabbitai Bot commented May 10, 2026 •

edited

Loading

codecov Bot commented May 10, 2026 •

edited

Loading