refactor: tighten check_mock_spec gate, add mock_of[T], drain baseline#1862
Conversation
#1855) - Rewrite scripts/check_mock_spec.py with AST-based detection of bare Mock at typed boundaries only. Catches direct cases (positional / kwarg to a non-Mock-class callable, m: T = Mock() annotated local, return Mock() from a typed fn). Indirect substitution via name binding sits on the test-double ladder rung but is not gated; precise detection there needs cross-module callee resolution. - Delete scripts/mock_spec_baseline.txt (3046 entries). Gate runs zero-tolerance against the live tree. - Add tests/_shared/mock_of.py with mock_of[T](**overrides) factory backed by create_autospec(T, instance=True, spec_set=True). Re-export from tests/_shared/__init__.py. - Document the test-double ladder in docs/reference/conventions.md section 12.1 (FakeClock, mock_of / create_autospec, SimpleNamespace, bare MagicMock forbidden at typed boundary). Update CLAUDE.md Testing bullet with the pointer. - Add scripts/check_mock_spec_ratchet.py PreToolUse hook plus registrations in .claude/settings.json and .opencode/plugins/ synthorg-hooks.ts. Blocks (a) edits to tests/*.py that raise the gate's CATCH count for the touched file, (b) edits to scripts/check_mock_spec.py that remove _Verdict.CATCH branches. Drives organic drive-by tightening: every edit reduces or holds. - Convert every surviving Pattern A site across tests/ to use spec=ConcreteType / mock_of[T] / FakeClock / SimpleNamespace. - Rewrite tests/unit/scripts/test_check_mock_spec.py with new rule-table coverage. - Add tests/_shared/test_mock_of.py with 8 unit tests for the helper. Closes #1855
- Test coverage: add YieldFrom and Set test cases to test_check_mock_spec.py; both AST branches were live but unverified. - Test coverage: add 18-test test_check_mock_spec_ratchet.py covering the new PreToolUse hook (envelope parsing, scope filters, ratchet enforcement, gate-edit weakening protection, fail-open path). - Hook: add stderr logging to the two fail-open broad-except branches so transitional failures surface instead of silently allowing. - Hook: parenthesise except (OSError, UnicodeDecodeError) for reader clarity. - Hook: add docstring on _load_gate. - Gate: add docstrings on five AST predicate helpers. - Docs: add check_mock_spec_ratchet.py to PreToolUse hooks section in convention-gates.md; backfill check_error_codes_ts_in_sync.py and check_no_controller_response_for_domain_errors.py in the inventory list; update runtime_stats macro and CLAUDE.md gate count. - OpenCode plugin: add hook to header comment list. - Imports: route VersionRepository / FineTune* repository imports in three test files through their source protocol modules; the protocol.py re-export was TYPE_CHECKING-only and mypy strict rejected it.
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
🧰 Additional context used📓 Path-based instructions (2)**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
tests/**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (2)📓 Common learnings📚 Learning: 2026-05-05T09:04:46.195ZApplied to files:
🔇 Additional comments (13)
WalkthroughAdds an AST two-pass typed-boundary mock-spec gate and a PreToolUse ratchet script; wires the ratchet into .claude settings and the OpenCode plugin; adds tests/_shared/mock_of with unit tests and exports; converts many tests to use spec-constrained mocks or mock_of/create_autospec; and updates docs and runtime_stats to reflect the new gate inventory. |
There was a problem hiding this comment.
Code Review
This pull request refactors the mock-spec enforcement mechanism to target bare mocks specifically at typed boundaries, such as constructor arguments and annotated locals, while introducing a "test-double ladder" convention. It includes a new mock_of[T] helper for typed substitutions and a PreToolUse ratchet hook to prevent the introduction of new violations. The review identifies a critical syntax error in the ratchet script's exception handling and suggests enhancing the gate's AST parsing to correctly recognize the mock_of[T] subscript syntax as a valid factory.
|
|
||
| try: | ||
| before = path.read_text(encoding="utf-8") if path.exists() else "" | ||
| except OSError, UnicodeDecodeError: |
There was a problem hiding this comment.
The comma-separated exception list syntax except OSError, UnicodeDecodeError: is a SyntaxError in all current stable Python versions (3.x < 3.14). While PEP 758 proposes this syntax for Python 3.14, it is not yet available in released versions. In Python 3, multiple exceptions must be enclosed in parentheses.
Additionally, in Python 3.x (before 3.14), except OSError, UnicodeDecodeError: would be interpreted as except OSError as UnicodeDecodeError:, which means UnicodeDecodeError would not be caught, and the exception object would be bound to that name, likely causing the script to crash if a UnicodeDecodeError is actually raised. Since this hook is intended to fail-open, a crash here would instead fail-closed and block all edits.
| except OSError, UnicodeDecodeError: | |
| except (OSError, UnicodeDecodeError): |
| def _terminal_callee_name(value: ast.expr) -> str | None: | ||
| """Terminal identifier of a callee expression, or None.""" | ||
| if isinstance(value, ast.Name): | ||
| return value.id | ||
| if isinstance(value, ast.Attribute): | ||
| return value.attr | ||
| return None |
There was a problem hiding this comment.
The _terminal_callee_name helper does not handle ast.Subscript nodes. This causes an inconsistency where create_autospec(T, return_value=Mock()) is correctly skipped (because create_autospec is in _MOCK_FACTORY_NAMES), but the new recommended helper mock_of[T](return_value=Mock()) is caught as a violation. This is because mock_of[T] is a Subscript node, and _terminal_callee_name returns None for it, causing _verdict_for_call_arg to fall through to CATCH.
| def _terminal_callee_name(value: ast.expr) -> str | None: | |
| """Terminal identifier of a callee expression, or None.""" | |
| if isinstance(value, ast.Name): | |
| return value.id | |
| if isinstance(value, ast.Attribute): | |
| return value.attr | |
| return None | |
| def _terminal_callee_name(value: ast.expr) -> str | None: | |
| """Terminal identifier of a callee expression, or None.""" | |
| if isinstance(value, ast.Name): | |
| return value.id | |
| if isinstance(value, ast.Attribute): | |
| return value.attr | |
| if isinstance(value, ast.Subscript): | |
| return _terminal_callee_name(value.value) | |
| return None |
| ) | ||
| """Class names whose bare-call form is the candidate set for Pass 1.""" | ||
|
|
||
| _MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec"}) |
There was a problem hiding this comment.
The mock_of helper should be added to _MOCK_FACTORY_NAMES to ensure that bare mocks used for internal configuration (like return_value=Mock()) are treated with the same exemption as create_autospec. This works in conjunction with updating _terminal_callee_name to handle subscripts.
| _MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec"}) | |
| _MOCK_FACTORY_NAMES: frozenset[str] = frozenset({"create_autospec", "mock_of"}) |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1862 +/- ##
==========================================
+ Coverage 84.87% 84.89% +0.01%
==========================================
Files 1806 1806
Lines 105211 105211
Branches 9214 9214
==========================================
+ Hits 89301 89321 +20
+ Misses 13679 13658 -21
- Partials 2231 2232 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 38
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
tests/unit/security/test_llm_evaluator.py (1)
122-126: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winPrefer
mock_of[ProviderRegistry]over rawMagicMock(spec=...)at this boundary.Line 122 creates a typed-boundary test double with
MagicMock(spec=ProviderRegistry). Please usemock_ofto keep the enforcement/migration path uniform.Proposed patch
+from tests._shared import mock_of @@ - registry = MagicMock(spec=ProviderRegistry) - registry.get = MagicMock(side_effect=lambda name: driver_map[name]) - registry.list_providers = MagicMock( - return_value=tuple(sorted(driver_map.keys())), - ) + registry = mock_of[ProviderRegistry]() + registry.get.side_effect = lambda name: driver_map[name] + registry.list_providers.return_value = tuple(sorted(driver_map.keys()))As per coding guidelines
tests/**/*.py: “mock_of[T](**overrides)for typed-boundary substitutions” and “Import …mock_offromtests._shared”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/security/test_llm_evaluator.py` around lines 122 - 126, Replace the typed MagicMock with the project's typed test-double helper: import mock_of from tests._shared and create registry = mock_of[ProviderRegistry]() instead of MagicMock(spec=ProviderRegistry); keep the existing overridden methods (registry.get and registry.list_providers) but set them as attributes on the mock_of instance (e.g., registry.get = MagicMock(side_effect=...), registry.list_providers = MagicMock(return_value=...)) so the test still uses driver_map while conforming to the mock_of[ProviderRegistry] pattern.tests/unit/meta/chief_of_staff/test_chat.py (1)
161-166: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winUse
mock_of[CompletionProvider]for this provider double.Line 161 still uses direct
AsyncMock(spec=CompletionProvider)at a typed boundary.Suggested change
from unittest.mock import AsyncMock +from tests._shared import mock_of @@ - provider = AsyncMock(spec=CompletionProvider) + provider = mock_of[CompletionProvider]() provider.complete.side_effect = RuntimeError("LLM unavailable")As per coding guidelines,
tests/**/*.py: “Test doubles hierarchy: use …mock_of[T](**overrides)for typed-boundary substitutions” and “Import FakeClock andmock_offromtests._shared”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/meta/chief_of_staff/test_chat.py` around lines 161 - 166, Replace the direct AsyncMock(spec=CompletionProvider) usage with the project test-double helper: create the provider using mock_of[CompletionProvider](...) instead of AsyncMock to satisfy the typed-boundary rule; update the test to construct provider = mock_of[CompletionProvider](complete=AsyncMock(side_effect=RuntimeError("LLM unavailable"))) and ensure mock_of and FakeClock are imported from tests._shared; leave the ChiefOfStaffChat(provider=provider, config=ChiefOfStaffConfig()) usage unchanged.tests/unit/engine/middleware/test_behavior_tagger.py (1)
146-150: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winPrefer
mock_of[AgentMiddlewareContext]for the typed context mock.Line 146 should use the shared typed-double helper instead of direct
MagicMock(spec=...).Suggested change
from unittest.mock import MagicMock +from tests._shared import mock_of @@ - ctx = MagicMock(spec=AgentMiddlewareContext) + ctx = mock_of[AgentMiddlewareContext]() ctx.agent_id = "test-agent" ctx.task_id = "test-task" ctx.metadata = None # Will cause AttributeError in _infer_tagsAs per coding guidelines,
tests/**/*.py: “Test doubles hierarchy: use …mock_of[T](**overrides)for typed-boundary substitutions” and “Import FakeClock andmock_offromtests._shared”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/engine/middleware/test_behavior_tagger.py` around lines 146 - 150, Replace the untyped MagicMock usage with the shared typed test-double helper: import mock_of from tests._shared and construct ctx = mock_of[AgentMiddlewareContext](agent_id="test-agent", task_id="test-task", metadata=None) so the typed mock is used when calling mw.after_model(ctx); keep the metadata=None to exercise the _infer_tags path and ensure AgentMiddlewareContext, mw.after_model, and _infer_tags are referenced by name when locating the change.tests/unit/meta/signals/test_scaling.py (1)
212-217:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse
mock_of[ScalingService]for these service doubles.These are typed-boundary substitutions and should use the repo-standard helper instead of raw
MagicMock(spec=...): it provides autospecced methods plusspec_set=True, enforcing the test doubles hierarchy across the codebase.Suggested change
+from tests._shared import mock_of + - service = MagicMock(spec=ScalingService) - service.get_recent_decisions = MagicMock( - side_effect=RuntimeError("boom"), - ) - service.get_recent_actions = MagicMock(return_value=()) + service = mock_of[ScalingService]( + get_recent_decisions=MagicMock(side_effect=RuntimeError("boom")), + get_recent_actions=MagicMock(return_value=()), + )Also applies to: 245-247
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/meta/signals/test_scaling.py` around lines 212 - 217, Tests create service doubles using MagicMock(spec=ScalingService) but repo standard requires the mock_of helper to produce autospecced, spec_set=True doubles; replace MagicMock(spec=ScalingService) with mock_of[ScalingService] for the service used in tests around ScalingSignalAggregator (and the other occurrence at lines ~245-247), and ensure the stubbed methods (get_recent_decisions, get_recent_actions) are attached to that mock_of instance rather than a raw MagicMock.tests/unit/meta/test_signals.py (1)
114-133: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winReplace typed-boundary mocks with
mock_of[T].
PerformanceTrackerandScalingServiceare typed-boundary seams. Usingmock_of[T]provides autospec'd methods andspec_set=True, enforcing the signature contract that the bareMagicMock(spec=...)pattern skips.Suggested change
+from tests._shared import mock_of + - tracker = MagicMock(spec=PerformanceTracker) + tracker = mock_of[PerformanceTracker]() ... - tracker = MagicMock(spec=PerformanceTracker) - tracker.get_snapshot = AsyncMock(side_effect=RuntimeError("tracker broken")) + tracker = mock_of[PerformanceTracker]( + get_snapshot=AsyncMock(side_effect=RuntimeError("tracker broken")) + ) ... - service = MagicMock(spec=ScalingService) - service.get_recent_decisions = MagicMock(return_value=()) - service.get_recent_actions = MagicMock(return_value=()) + service = mock_of[ScalingService]( + get_recent_decisions=MagicMock(return_value=()), + get_recent_actions=MagicMock(return_value=()), + ) ... - scaling_service = MagicMock(spec=ScalingService) - scaling_service.get_recent_decisions = MagicMock(return_value=()) - scaling_service.get_recent_actions = MagicMock(return_value=()) + scaling_service = mock_of[ScalingService]( + get_recent_decisions=MagicMock(return_value=()), + get_recent_actions=MagicMock(return_value=()), + )Also applies to: 171-172, 210-213, 254-256
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/meta/test_signals.py` around lines 114 - 133, Replace typed-boundary MagicMock usage with the project's typed mock helpers: change tracker = MagicMock(spec=PerformanceTracker) to tracker = mock_of[PerformanceTracker] (and similarly any MagicMock(spec=ScalingService) to mock_of[ScalingService]) so mocks are autospecced and spec_set=True; preserve the AsyncMock side effect on tracker.get_snapshot but create tracker as mock_of[PerformanceTracker] and set tracker.get_snapshot = AsyncMock(side_effect=[s1, s2]) (or use async_mock_of if your test helpers provide it) to keep the async behavior while enforcing the real interface for methods like get_snapshot.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/reference/conventions.md`:
- Around line 331-336: The docs show using create_autospec(T, instance=True)
which is less strict than our helper mock_of[T]; update the text to recommend
create_autospec(T, instance=True, spec_set=True) so the example matches
mock_of[T] behavior and keeps rung-2 strictness consistent with mock_of[T]
(refer to mock_of[T], tests._shared, and the instance=True/spec_set=True
options).
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 177-180: Remove the unnecessary try/except around
Path.is_relative_to: replace the try block with a direct assignment to in_tests
using path.is_relative_to(_TESTS_ROOT) since on supported Python versions
is_relative_to returns False for unrelated paths and never raises ValueError;
update the code in scripts/check_mock_spec_ratchet.py where path.is_relative_to,
_TESTS_ROOT, and in_tests are used to eliminate the dead except branch.
In `@scripts/check_mock_spec.py`:
- Around line 390-396: The function _iter_test_files currently compares an
unresolved shared_dir (constructed as _TESTS_ROOT / "_shared") against
path.parents from rglob, which can mismatch when symlinks or different mounts
are involved; fix this by resolving shared_dir (and/or resolving path when
checking membership) before the comparison so both sides use canonical paths —
update the shared_dir assignment in _iter_test_files to use .resolve(), or
compare using path.resolve() (or Path.is_relative_to with resolved bases) to
reliably exclude the _shared directory.
In `@tests/integration/hr/training/test_training_persistence.py`:
- Around line 80-81: Replace typed AsyncMock test doubles with the shared helper
mock_of from tests._shared: import mock_of and change mock_selector =
AsyncMock(spec=SourceSelector) to mock_selector = mock_of[SourceSelector]()
(then set mock_selector.select.return_value = (NotBlankStr("senior-1"),) as
before). Apply the same conversion for the other occurrences (the mocks around
lines 86-87 and 98), ensuring you use mock_of[SourceSelector] (or the
appropriate collaborator type) so typed collaborator doubles follow the
project's test-double policy.
In `@tests/integration/hr/training/test_training_pipeline.py`:
- Around line 126-127: The test uses AsyncMock(spec=AgentRegistryService) as a
typed boundary double; replace these with the repo-standard
mock_of[AgentRegistryService](...) from tests._shared (and import mock_of at the
top) in the _build_service helper and the other occurrences noted (the registry,
and the other mocks at the 130-137 and 144-149 blocks); keep the same
return_value/attribute overrides (e.g., registry.list_active.return_value =
(senior,)) but construct the mock via mock_of[...] so the helper follows the
canonical typed mock pattern.
In `@tests/integration/integrations/test_rate_limiter_shared_state.py`:
- Line 115: Replace the MagicMock-based bus test double with the repository
helper: import mock_of from tests._shared and change the instantiation from
MagicMock(spec=MessageBus) to mock_of[MessageBus]() (or
mock_of[MessageBus](**overrides) if you need to pass attributes); target the
test variable named "bus" so the mock uses spec_set/boundary semantics
consistent with the repo guidelines.
In `@tests/unit/api/auth/test_postgres_session_store.py`:
- Line 121: Replace the untyped MagicMock usage with the shared typed
test-double helper: change the sqlite_db test double (currently created via
MagicMock(spec=Connection)) to use mock_of[Connection](...) from tests._shared,
and add the import for mock_of at the top of the test module; ensure any
overrides passed to MagicMock are forwarded to mock_of to preserve behavior so
sqlite_db remains a Connection-typed mock.
In `@tests/unit/api/test_auto_wire_meetings.py`:
- Around line 16-17: The helper _fake_registries() returns bare MagicMocks and
must be converted to return typed test doubles using mock_of[...] for the
registry interfaces; update _fake_registries() so it constructs typed mocks for
AgentRegistryService and ProviderRegistry (e.g.
mock_of[AgentRegistryService](...) and mock_of[ProviderRegistry](...)) and apply
any existing overrides previously passed to the bare mocks, ensuring all tests
that call _fake_registries() (including the parametrised cases around lines
~173-179) receive properly typed mocks instead of bare MagicMock instances.
In `@tests/unit/api/test_health.py`:
- Around line 230-233: Replace uses of MagicMock(spec=AppState) with the shared
test helper mock_of[AppState] in the telemetry-status tests (e.g., where
app_state is created for _resolve_telemetry_status and related assertions using
TelemetryStatus); import mock_of (and FakeClock if used elsewhere) from
tests._shared at the top of the test module and update the other occurrences
noted (the blocks around the other assertions) to use
mock_of[AppState](**overrides) instead of MagicMock so the typed-boundary
substitution follows the project testing conventions.
In `@tests/unit/api/test_lifecycle_cleanup_kill_switch.py`:
- Around line 44-55: The tests use MagicMock/AsyncMock with
spec=_no_arg_sync/_no_arg_async for cleanup_expired (and similar mocks like
session_store.cleanup_expired, lockout_store.cleanup_expired, and
config_resolver.get_bool/get_float), which doesn't enforce call signatures;
replace those with unittest.mock.create_autospec targets (using spec_set=True)
so calls are validated against the real callable signatures—e.g.
create_autospec(_no_arg_sync, spec_set=True, return_value=None) for sync mocks
and create_autospec(_no_arg_async, spec_set=True, return_value=None) for async
ones, and similarly create_autospec(_config_get_async, spec_set=True,
return_value=...) for config_resolver methods at the flagged locations.
In `@tests/unit/engine/checkpoint/test_strategy.py`:
- Around line 504-508: Replace AsyncMock(spec=CheckpointRepository) and
AsyncMock(spec=HeartbeatRepository) test doubles with the repository-typed
helper mock_of from tests._shared: import mock_of and create cp_repo =
mock_of[CheckpointRepository](get_latest=..., delete_by_execution=...) and
hb_repo = mock_of[HeartbeatRepository](delete=...) so the doubles carry the
correct typed-boundary semantics; ensure you pass the same AsyncMock return
values (e.g., get_latest returning None, delete_by_execution returning 0) as
overrides to mock_of.
In `@tests/unit/engine/test_agent_engine_lifecycle.py`:
- Around line 116-119: The test uses MagicMock(spec=ExecutionLoop) for the
execution_loop doubles (e.g., variable mock_loop with methods
execute/get_loop_type) but per guidelines you should replace these with
mock_of[ExecutionLoop] and import mock_of (and FakeClock if needed) from
tests._shared; update each construction (all occurrences like the mock_loop at
lines referenced and similar variables at other spots) to use
mock_of[ExecutionLoop](**overrides) and keep the same AsyncMock/return_value
overrides for execute and MagicMock overrides for get_loop_type so the
typed-boundary test doubles follow the project's test-double hierarchy.
In `@tests/unit/engine/test_agent_engine.py`:
- Around line 484-487: Replace the typed test doubles that use
MagicMock/AsyncMock(spec=...) with the project's mock_of[...] helper: find the
mock_loop variable (currently using MagicMock/AsyncMock with spec=ExecutionLoop
and AsyncMock(return_value=mock_result)) and change it to use
mock_of[ExecutionLoop](execute=AsyncMock(return_value=mock_result),
get_loop_type=MagicMock(return_value="react")) or equivalent overrides via
mock_of; ensure tests import mock_of (and FakeClock where needed) from
tests._shared and remove direct MagicMock/AsyncMock(spec=...) usages at the
typed boundaries mentioned (e.g., mock_loop and the other listed locations).
In `@tests/unit/engine/test_personality_trim_notifier.py`:
- Around line 254-257: Replace the direct MagicMock(spec=ConfigResolver) typed
double with the shared helper mock_of[ConfigResolver](**overrides) and import
mock_of (and FakeClock if used elsewhere) from tests._shared; specifically,
update the resolver declarations in test_personality_trim_notifier (the resolver
variable using get_bool/get_int) to use
mock_of[ConfigResolver](get_bool=AsyncMock(side_effect=RuntimeError("db down")),
get_int=AsyncMock(return_value=10)) and make the same replacement for the other
resolver instance around lines 351-354 so all typed boundary doubles follow the
mock_of pattern.
In `@tests/unit/engine/test_workspace_git_worktree.py`:
- Around line 124-126: Create a small local helper function in the test module
(e.g., _make_run_git_mock) that returns an AsyncMock configured with
spec=PlannerWorktreeStrategy._run_git and the common return_value tuple (0, "",
""); replace all repeated AsyncMock(spec=PlannerWorktreeStrategy._run_git,
return_value=(0, "", "")) occurrences in
tests/unit/engine/test_workspace_git_worktree.py with calls to this helper to
centralize setup and reduce duplication while keeping existing behavior.
In `@tests/unit/engine/workflow/test_service_validate.py`:
- Around line 25-30: The test helper _service uses AsyncMock(spec=...) for
constructor-injected repos which violates the project's typed boundary rule;
replace those with the mock_of[...] helper to create autospec'd mocks with
spec_set=True. In _service, construct definition_repo and version_repo using
mock_of[WorkflowDefinitionRepository] and mock_of[VersionRepository] (or
mock_of[typing.AsyncMock[...] ] as appropriate) and pass them into
WorkflowService so the mocks are properly typed and locked to the interface.
In `@tests/unit/hr/evaluation/test_eval_loop_coordinator.py`:
- Around line 31-35: Replace MagicMock(spec=...) typed-boundary doubles with the
shared test-double helper: import mock_of from tests._shared and use
mock_of[PerformanceTracker](**overrides) for tracker and
mock_of[EvaluationService](**overrides) for evaluation (and the other similar
mocks later in the file that use MagicMock(spec=...)). Locate the instantiations
currently using MagicMock(spec=PerformanceTracker) and
MagicMock(spec=EvaluationService) and swap them to mock_of[...] while preserving
configured return values (e.g., get_task_metrics.return_value and evaluate
return behavior) by passing overrides or setting attributes on the returned
mock_of instance.
In `@tests/unit/hr/test_activity_list_recent.py`:
- Line 183: Replace the direct AsyncMock(spec=CostTracker) test double with the
shared helper mock_of[CostTracker] so the typed-boundary mock follows the
project's test-doubles hierarchy; update the test to import mock_of (and
FakeClock if needed) from tests._shared and create cost_tracker =
mock_of[CostTracker](...) instead of AsyncMock(spec=CostTracker) so the
substitution is consistent with the ladder/enforcement gate.
In `@tests/unit/hr/training/test_onboarding_integration.py`:
- Around line 52-59: Replace direct AsyncMock(spec=...) usages with the project
test helper mock_of to standardize typed-boundary doubles: change registry =
AsyncMock(spec=AgentRegistryService) to registry =
mock_of[AgentRegistryService]() (and keep the existing registry.get.return_value
= _make_identity()), training_service = AsyncMock(spec=TrainingService) to
training_service = mock_of[TrainingService]() (retain
training_service.execute.return_value = _make_empty_result()), and onboarding =
AsyncMock(spec=OnboardingService) to onboarding = mock_of[OnboardingService]();
apply the same replacement pattern at the other occurrences referenced (lines
around the blocks using registry, training_service, onboarding) so all typed
test doubles use mock_of[T](**overrides).
In `@tests/unit/hr/training/test_service.py`:
- Around line 78-79: Replace direct AsyncMock(spec=SourceSelector) usage in the
test helper with the project's typed helper mock_of[T]; specifically, change the
creation of selector (currently AsyncMock(spec=SourceSelector) returning
("senior-1",)) to use mock_of[SourceSelector](...) and set its select return
value accordingly, and do the same for the other occurrences noted (the mocks
created at the other two spots). Also import mock_of from tests._shared at the
top of the file and ensure the _make_service helper uses these mock_of-based
substitutes instead of AsyncMock(spec=...).
In `@tests/unit/memory/consolidation/test_dual_mode_strategy.py`:
- Around line 84-86: The helper _make_strategy is creating typed test doubles
using AsyncMock(spec=MemoryBackend) — replace those with the project's typed
mock helper: use
mock_of[MemoryBackend](store=AsyncMock(return_value="summary-1"),
delete=AsyncMock(return_value=True)) (and the analogous replacements for the
other doubles at lines 90-96) and add the import for mock_of from tests._shared;
update references where backend is constructed so the mock_of[...] call supplies
the same AsyncMock overrides (store, delete, etc.) to preserve behavior.
In `@tests/unit/memory/consolidation/test_service.py`:
- Line 24: The test helper _make_backend_mock currently returns an untyped
AsyncMock causing spec-less mocks at the MemoryBackend boundary; update
_make_backend_mock to return a typed test double using
mock_of[MemoryBackend](...) (or equivalent mock_of[MemoryBackend](**overrides))
so that all consumers like MemoryConsolidationService receive a properly-typed
backend mock; ensure any awaited methods used in tests are created as AsyncMock
on the typed mock and adjust call sites if necessary to use the typed mock's
attributes.
In `@tests/unit/memory/consolidation/test_strategy.py`:
- Line 41: Replace the ad-hoc AsyncMock(spec=MemoryBackend) test doubles with
the shared helper mock_of[MemoryBackend] and import mock_of from tests._shared;
specifically, change occurrences where backend = AsyncMock(spec=MemoryBackend)
(and the other listed occurrences) to use backend = mock_of[MemoryBackend]() so
typed-boundary mocks are consistent, and add the import "from tests._shared
import mock_of" at the top of the test module if not already present.
In `@tests/unit/memory/org/test_protocol.py`:
- Line 18: Replace the raw AsyncMock(spec=OrgFactRepository) test double with
the shared helper mock_of[OrgFactRepository] from tests._shared: import mock_of
from tests._shared and change the create call for the protocol-bound store (the
variable currently assigned via AsyncMock(spec=OrgFactRepository)) to use
mock_of[OrgFactRepository](**overrides if needed) so the test follows the repo
test-double helper pattern; keep the same variable name (store) and any override
kwargs currently passed to AsyncMock.
In `@tests/unit/memory/procedural/propagation/test_department_scoped.py`:
- Around line 43-53: Replace the direct AsyncMock/MagicMock(spec=...) test
doubles with the shared typed helper mock_of[...] from tests._shared: e.g., use
mock_of[AgentRegistryService](get=AsyncMock(return_value=source_agent),
list_by_department=AsyncMock(return_value=(target_agent1, target_agent2)))
instead of AsyncMock(spec=AgentRegistryService), use
mock_of[MemoryBackend](store=AsyncMock(return_value="mem-copy-1")) instead of
AsyncMock(spec=MemoryBackend), and use
mock_of[MemoryEntry](category=MemoryCategory.PROCEDURAL) instead of
MagicMock(spec=MemoryEntry); apply the same pattern to the other occurrences
noted (lines ~81-91, 116-124, 156-164, 189-197, 235-246, 275-285) and import
mock_of from tests._shared.
In `@tests/unit/memory/procedural/pruning/test_hybrid_strategy.py`:
- Around line 154-160: Replace the typed AsyncMock instances with the shared
test-double factory: instead of AsyncMock(spec=PruningStrategy) for mock_ttl and
mock_pareto, create them via mock_of[PruningStrategy](...) so the doubles follow
the project pattern; ensure you still set .name ("mock-ttl"/"mock-pareto") and
assign the async prune return value (e.g., prune returning ("mem-expired",) and
("mem-low-relevance",)) on the mock_of-created objects so existing assertions
using mock_ttl.prune and mock_pareto.prune continue to work.
In `@tests/unit/memory/retrieval/hierarchical/test_supervisor.py`:
- Around line 110-113: Replace the direct AsyncMock spec construction with the
project's typed mock helper: instead of AsyncMock(spec=CompletionProvider) for
the provider used in tests (the variable named provider and its .complete
AsyncMock side_effect), call mock_of[CompletionProvider](...) to create the
typed double and pass the same overrides (e.g., setting complete to an AsyncMock
side_effect=RuntimeError("LLM unavailable")); do the same replacement for the
other occurrence around lines 243-246 so all CompletionProvider test doubles use
mock_of[CompletionProvider].
In `@tests/unit/memory/test_architect_tools.py`:
- Line 197: Replace the direct AsyncMock usage for the typed-boundary repository
with the project test helper: where fact_store is created with
AsyncMock(spec=OrgFactRepository) (and the other occurrences at the noted
lines), use mock_of[OrgFactRepository](...) imported from tests._shared instead;
update imports to bring in mock_of (and FakeClock if needed elsewhere) and
instantiate fact_store via mock_of to follow the test doubles convention for
typed-boundary substitutions.
In `@tests/unit/meta/chief_of_staff/test_monitor.py`:
- Line 67: Replace the direct AsyncMock(spec=SnapshotBuilder) test double with
the typed-boundary helper mock_of[SnapshotBuilder]() (import mock_of from
tests._shared) so the double follows the shared test-doubles ladder; update the
builder assignment (and the other noted occurrences for snapshot_builder at the
same pattern) to use mock_of[SnapshotBuilder](**overrides) where needed and
ensure any async behavior is preserved by configuring the mock_of instance
rather than using AsyncMock(spec=...).
In `@tests/unit/meta/mcp/test_handlers_integrations.py`:
- Around line 64-65: Replace the direct MagicMock(spec=ArtifactStorageBackend)
with the shared typed-double helper: create the storage test double via
mock_of[ArtifactStorageBackend] and pass the AsyncMock override for the delete
method (e.g., provide delete=AsyncMock(return_value=True)) so the variable
storage retains the correct typed-spec behavior while still stubbing delete;
update the reference to storage used in tests accordingly.
In `@tests/unit/ontology/drift/test_service.py`:
- Around line 70-71: The test uses raw
AsyncMock(spec=OntologyDriftReportRepository) and then assigns
store.store_report, which breaks the project's typed-boundary mocking
convention; replace these with the project's
mock_of[OntologyDriftReportRepository](...) pattern (import mock_of where
needed) and pass overrides for async methods like store_report via the helper
instead of assigning attributes after creation so spec_set-like behavior is
preserved; update both occurrences (the store variable at the shown lines and
the other site around 141-143) to use mock_of with appropriate AsyncMock
overrides.
In `@tests/unit/ontology/drift/test_strategies.py`:
- Around line 248-251: Replace the AsyncMock(spec=DriftDetectionStrategy) test
doubles with the typed helper: use mock_of[DriftDetectionStrategy](...) and pass
detect overrides as AsyncMock instances (e.g. core =
mock_of[DriftDetectionStrategy](detect=AsyncMock(return_value=clean_report)) and
user = mock_of[DriftDetectionStrategy](detect=AsyncMock())). Do the same for the
other occurrences mentioned (the other core/user doubles) so the doubles use
mock_of with the typed spec_set behavior.
In `@tests/unit/scripts/test_check_mock_spec.py`:
- Around line 235-246: Add a new regression test that asserts attribute-chain
assignments like "x = Mock(); x.return_value = Mock()" are skipped (in addition
to the existing constructor kwarg case tested by
test_inner_mock_in_mock_class_call_skipped); create a function named e.g.
test_return_value_chain_skipped that builds source with a Mock() assigned to a
name and then a direct ".return_value = Mock()" assignment, then call
_MODULE._scan_file(write_test_file(src)) and assert it returns [] — also add the
same pattern test for the other location referenced (lines ~258-266) so both
attribute-assignment and kwarg forms are covered.
In `@tests/unit/security/test_service_safety_integration.py`:
- Around line 62-63: Replace the raw AsyncMock(spec=ApprovalStoreProtocol) usage
with the shared test helper mock_of from tests._shared: create approval_store
via mock_of[ApprovalStoreProtocol](add=AsyncMock()) instead of approval_store =
AsyncMock(spec=ApprovalStoreProtocol) and then assignment approval_store.add =
AsyncMock(); ensure you import mock_of from tests._shared and keep the same
variable name approval_store so typed-boundary expectations for
ApprovalStoreProtocol remain satisfied.
In `@tests/unit/security/test_shadow_mode.py`:
- Around line 82-85: Replace direct MagicMock(spec=...) typed substitutions with
the test helper mock_of from tests._shared: import mock_of and replace
occurrences like MagicMock(spec=RuleEngine), MagicMock(spec=AuditLog),
MagicMock(spec=OutputScanner) with mock_of[RuleEngine](**overrides),
mock_of[AuditLog](...), mock_of[OutputScanner](...) respectively so the mocks
follow the new test-double ladder and typing contract; apply the same change for
the other occurrences noted (around lines 100-104, 118-122, 138-142, 160-164,
179-190, 205-207) and preserve any explicit return_value or attribute overrides
by passing them as keyword overrides to mock_of.
In `@tests/unit/security/timeout/test_timeout_checker.py`:
- Around line 148-149: Replace the direct AsyncMock(spec=TimeoutPolicy) creation
with the shared typed helper: import mock_of from tests._shared and instantiate
mock_policy using mock_of[TimeoutPolicy](...) instead of AsyncMock(spec=...).
Preserve the existing behavior (set mock_policy.determine_action.side_effect =
RuntimeError("boom")) and ensure the variable name mock_policy and its usage in
the test remain unchanged so the typed boundary is used throughout.
In `@tests/unit/settings/test_service.py`:
- Around line 407-409: Replace the typed MagicMock instances with the project's
typed helper: import mock_of from tests._shared and change creations like
MagicMock(spec=MessageBus) to mock_of[MessageBus](...) (preserving any attribute
overrides such as is_running=True and publish=AsyncMock()); update each
occurrence (around the instances mentioned and at lines corresponding to the
other occurrences) so all test doubles for MessageBus use mock_of[MessageBus]
for consistent typed-boundary substitution.
In `@tests/unit/tools/test_process_cleanup.py`:
- Around line 18-19: Replace the MagicMock-based attribute-bag use with
SimpleNamespace: instead of proc = MagicMock(spec=Process) and setting
proc._transport = None, create proc = SimpleNamespace(_transport=None) (and for
the other test blocks that set attributes on proc) so tests model a mutable
attribute container rather than a behaviour mock; update all occurrences that
construct proc via MagicMock(spec=Process) and directly manipulate _transport or
similar attributes to use SimpleNamespace with the same attribute names.
---
Outside diff comments:
In `@tests/unit/engine/middleware/test_behavior_tagger.py`:
- Around line 146-150: Replace the untyped MagicMock usage with the shared typed
test-double helper: import mock_of from tests._shared and construct ctx =
mock_of[AgentMiddlewareContext](agent_id="test-agent", task_id="test-task",
metadata=None) so the typed mock is used when calling mw.after_model(ctx); keep
the metadata=None to exercise the _infer_tags path and ensure
AgentMiddlewareContext, mw.after_model, and _infer_tags are referenced by name
when locating the change.
In `@tests/unit/meta/chief_of_staff/test_chat.py`:
- Around line 161-166: Replace the direct AsyncMock(spec=CompletionProvider)
usage with the project test-double helper: create the provider using
mock_of[CompletionProvider](...) instead of AsyncMock to satisfy the
typed-boundary rule; update the test to construct provider =
mock_of[CompletionProvider](complete=AsyncMock(side_effect=RuntimeError("LLM
unavailable"))) and ensure mock_of and FakeClock are imported from
tests._shared; leave the ChiefOfStaffChat(provider=provider,
config=ChiefOfStaffConfig()) usage unchanged.
In `@tests/unit/meta/signals/test_scaling.py`:
- Around line 212-217: Tests create service doubles using
MagicMock(spec=ScalingService) but repo standard requires the mock_of helper to
produce autospecced, spec_set=True doubles; replace
MagicMock(spec=ScalingService) with mock_of[ScalingService] for the service used
in tests around ScalingSignalAggregator (and the other occurrence at lines
~245-247), and ensure the stubbed methods (get_recent_decisions,
get_recent_actions) are attached to that mock_of instance rather than a raw
MagicMock.
In `@tests/unit/meta/test_signals.py`:
- Around line 114-133: Replace typed-boundary MagicMock usage with the project's
typed mock helpers: change tracker = MagicMock(spec=PerformanceTracker) to
tracker = mock_of[PerformanceTracker] (and similarly any
MagicMock(spec=ScalingService) to mock_of[ScalingService]) so mocks are
autospecced and spec_set=True; preserve the AsyncMock side effect on
tracker.get_snapshot but create tracker as mock_of[PerformanceTracker] and set
tracker.get_snapshot = AsyncMock(side_effect=[s1, s2]) (or use async_mock_of if
your test helpers provide it) to keep the async behavior while enforcing the
real interface for methods like get_snapshot.
In `@tests/unit/security/test_llm_evaluator.py`:
- Around line 122-126: Replace the typed MagicMock with the project's typed
test-double helper: import mock_of from tests._shared and create registry =
mock_of[ProviderRegistry]() instead of MagicMock(spec=ProviderRegistry); keep
the existing overridden methods (registry.get and registry.list_providers) but
set them as attributes on the mock_of instance (e.g., registry.get =
MagicMock(side_effect=...), registry.list_providers =
MagicMock(return_value=...)) so the test still uses driver_map while conforming
to the mock_of[ProviderRegistry] pattern.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6ea84693-5e9b-41a9-955f-3062d3d378a9
📒 Files selected for processing (123)
.claude/settings.json.opencode/plugins/synthorg-hooks.tsCLAUDE.mddata/runtime_stats.yamldocs/reference/convention-gates.mddocs/reference/conventions.mdscripts/check_mock_spec.pyscripts/check_mock_spec_ratchet.pyscripts/mock_spec_baseline.txttests/_shared/__init__.pytests/_shared/mock_of.pytests/_shared/test_mock_of.pytests/evals/prompt/test_rerank_prompt.pytests/integration/hr/training/test_training_persistence.pytests/integration/hr/training/test_training_pipeline.pytests/integration/integrations/test_rate_limiter_shared_state.pytests/integration/mcp/test_tool_surface.pytests/unit/a2a/test_gateway.pytests/unit/api/auth/test_postgres_session_store.pytests/unit/api/controllers/test_memory_admin.pytests/unit/api/controllers/test_provider_health.pytests/unit/api/controllers/test_setup.pytests/unit/api/controllers/test_ws.pytests/unit/api/test_auto_wire_meetings.pytests/unit/api/test_health.pytests/unit/api/test_lifecycle_cleanup_kill_switch.pytests/unit/api/test_personality_trim_notifier_factory.pytests/unit/api/test_state.pytests/unit/budget/test_coordination_collector.pytests/unit/budget/test_coordination_collector_properties.pytests/unit/communication/async_tasks/test_service.pytests/unit/communication/meeting/test_scheduler_races.pytests/unit/communication/messages/test_service.pytests/unit/communication/test_nats_publish.pytests/unit/engine/checkpoint/test_strategy.pytests/unit/engine/classification/test_loaders.pytests/unit/engine/classification/test_semantic_detectors.pytests/unit/engine/classification/test_sinks.pytests/unit/engine/evolution/proposers/test_separate_analyzer.pytests/unit/engine/evolution/test_service.pytests/unit/engine/middleware/test_behavior_tagger.pytests/unit/engine/test_agent_engine.pytests/unit/engine/test_agent_engine_auto_loop.pytests/unit/engine/test_agent_engine_budget.pytests/unit/engine/test_agent_engine_degradation.pytests/unit/engine/test_agent_engine_errors.pytests/unit/engine/test_agent_engine_lifecycle.pytests/unit/engine/test_agent_engine_procedural.pytests/unit/engine/test_approval_gate_events.pytests/unit/engine/test_coordination_factory.pytests/unit/engine/test_coordination_service.pytests/unit/engine/test_personality_trim_notifier.pytests/unit/engine/test_prompt_safety_sites.pytests/unit/engine/test_security_factory.pytests/unit/engine/test_semantic_llm.pytests/unit/engine/test_workspace_git_worktree.pytests/unit/engine/test_workspace_merge.pytests/unit/engine/workflow/test_execution_observer.pytests/unit/engine/workflow/test_service_validate.pytests/unit/hr/evaluation/test_eval_loop_coordinator.pytests/unit/hr/performance/test_composite_quality_strategy.pytests/unit/hr/performance/test_llm_judge_quality_strategy.pytests/unit/hr/performance/test_tracker_enhancements.pytests/unit/hr/test_activity_list_recent.pytests/unit/hr/training/test_guards.pytests/unit/hr/training/test_llm_curated.pytests/unit/hr/training/test_onboarding_integration.pytests/unit/hr/training/test_service.pytests/unit/hr/training/test_source_selectors.pytests/unit/memory/consolidation/test_abstractive.pytests/unit/memory/consolidation/test_dual_mode_strategy.pytests/unit/memory/consolidation/test_retention.pytests/unit/memory/consolidation/test_service.pytests/unit/memory/consolidation/test_service_kill_switch.pytests/unit/memory/consolidation/test_strategy.pytests/unit/memory/consolidation/test_two_tier_strategy.pytests/unit/memory/consolidation/test_wiki_export.pytests/unit/memory/org/test_protocol.pytests/unit/memory/procedural/capture/test_factory.pytests/unit/memory/procedural/propagation/test_department_scoped.pytests/unit/memory/procedural/propagation/test_no_propagation.pytests/unit/memory/procedural/propagation/test_role_scoped.pytests/unit/memory/procedural/pruning/test_hybrid_strategy.pytests/unit/memory/procedural/test_evolver.pytests/unit/memory/procedural/test_pipeline.pytests/unit/memory/procedural/test_proposer.pytests/unit/memory/retrieval/hierarchical/test_supervisor.pytests/unit/memory/retrieval/reranking/test_llm_reranker.pytests/unit/memory/test_architect_tools.pytests/unit/memory/test_tool_retriever.pytests/unit/meta/chief_of_staff/test_chat.pytests/unit/meta/chief_of_staff/test_monitor.pytests/unit/meta/mcp/test_all_handlers_wired.pytests/unit/meta/mcp/test_handlers_integrations.pytests/unit/meta/signals/test_scaling.pytests/unit/meta/test_code_modification_strategy.pytests/unit/meta/test_signals.pytests/unit/ontology/drift/test_service.pytests/unit/ontology/drift/test_strategies.pytests/unit/ontology/test_entity_guard.pytests/unit/persistence/postgres/test_backend.pytests/unit/persistence/sqlite/test_preset_override_repo.pytests/unit/providers/resilience/test_retry.pytests/unit/providers/test_capability_mutations.pytests/unit/scripts/test_check_mock_spec.pytests/unit/scripts/test_check_mock_spec_ratchet.pytests/unit/scripts/test_generate_comparison.pytests/unit/security/test_llm_evaluator.pytests/unit/security/test_llm_fallback_integration.pytests/unit/security/test_safety_classifier.pytests/unit/security/test_service_safety_integration.pytests/unit/security/test_shadow_mode.pytests/unit/security/test_uncertainty_checker.pytests/unit/security/timeout/test_timeout_checker.pytests/unit/settings/test_bridge_config_wiring.pytests/unit/settings/test_observability_subscriber.pytests/unit/settings/test_per_op_rate_limit_subscriber.pytests/unit/settings/test_resolver.pytests/unit/settings/test_security_subscriber.pytests/unit/settings/test_service.pytests/unit/tools/git/test_git_url_validator.pytests/unit/tools/git/test_git_url_validator_toctou.pytests/unit/tools/test_process_cleanup.py
CodeRabbit ASSERTIVE review on f81fdde flagged 38 inline + 5 outside-diff typed-boundary substitutions plus several script / docs nits. Gemini called out a Subscript-handling miss in the gate and a path-injection shape in the ratchet. CodeQL surfaced three py/path-injection alerts. Script + docs - scripts/check_mock_spec.py: `_terminal_callee_name` now recurses through `ast.Subscript`, and `mock_of` joins `_MOCK_FACTORY_NAMES`, so `mock_of[T](return_value=Mock())` is treated as a factory call rather than a typed-boundary violation. `_iter_test_files` and `cmd_scan_paths` resolve `_TESTS_ROOT` and `shared_dir` before comparing parents so symlinks / bind-mounts cannot bypass the `_shared` exclusion. - scripts/check_mock_spec_ratchet.py: dropped the dead `is_relative_to` try/except (Python 3.14 never raises here), restructured the path validation so the allow-list check is inline-visible to CodeQL's data-flow analysis. Kept the PEP-758 `except OSError, UnicodeDecodeError:` form per CLAUDE.md. - docs/reference/conventions.md: bumped the rung-2 example from `create_autospec(T, instance=True)` to `create_autospec(T, instance=True, spec_set=True)` so the doc matches `mock_of[T]` strictness. Test conversions - Replaced `(Magic|Async)Mock(spec=T)` at typed-boundary sites with `mock_of[T](**overrides)` from tests/_shared across 35 test files (unit + integration). Where the override list was set via attribute-bag writes after construction, collapsed into kwargs at construction so the spec_set=True invariant is enforced. - tests/_shared/mock_of.py: `__call__` now returns `Any` and `_MockOfMeta.__getitem__` accepts `Any` instead of `type[T]`. The runtime invariants (autospec + spec_set=True + override-key validation) are unchanged; only the static-type ergonomics shift, so callers configure `.method.return_value` and call `.assert_*()` without per-call cast wrappers and Protocol specs no longer trip mypy strict's `type-abstract` check. - tests/unit/scripts/test_check_mock_spec.py: locked the Subscript-factory recognition and the attribute-chain `x.return_value = Mock()` SKIP path in two new regression tests. - tests/unit/tools/test_process_cleanup.py: pure attribute-bag doubles switched from `MagicMock(spec=Process)` to `SimpleNamespace` (with `cast` at the call site) per the test-doubles ladder; the one test that needs type-level `PropertyMock` descriptor wiring keeps the original mock. - tests/unit/api/test_lifecycle_cleanup_kill_switch.py: stub callables switched to `create_autospec(..., spec_set=True)`. Fixed `_config_get_async` signature to match the real `(namespace, key)` two-positional contract; `create_autospec` caught the earlier one-arg mismatch. Skipped findings (factually wrong against current code) - Gemini #1 (`except (OSError, UnicodeDecodeError):` parens): the CLAUDE.md project rule explicitly says PEP 758 `except A, B:` no parens unless binding, and the project is Python 3.14+. The current syntax is correct under the project interpreter. - tests/unit/api/test_health.py (AppState mock_of conversion): `AppState` uses `__slots__` plus property accessors that `create_autospec(spec_set=True)` does not expose, so the looser `MagicMock(spec=AppState)` is the right rung here. - tests/unit/engine/middleware/test_behavior_tagger.py (AgentMiddlewareContext mock_of conversion): Pydantic `Field()` annotations are invisible to `create_autospec`; spec_set=True rejects `agent_id`/`task_id`/`metadata` overrides at runtime. Reverted with an explanatory comment. - tests/unit/engine/test_workspace_git_worktree.py (`_make_run_git_mock` helper extraction): only two call sites with identical setup and no shared `return_value` to factor out; CodeRabbit's described `return_value=(0, '', '')` form is absent from current code. Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/ -m unit -n 8` -- 28,257 passed, 18 skipped in 81s. No regressions. CodeQL alerts 287 / 288 / 289 still open on `refs/pull/1862/head` because CodeQL has not re-run against the new HEAD yet; the next babysit round will re-evaluate.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
tests/unit/settings/test_service.py (1)
409-409: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winReplace remaining
MagicMock(spec=MessageBus)boundary doubles withmock_of[MessageBus].Line 409, Line 427, Line 501, Line 558, Line 692, and Line 708 still use the old pattern, so this file is only partially migrated to the enforced helper.
Suggested patch pattern
- bus = MagicMock(spec=MessageBus) - bus.is_running = True - bus.publish = AsyncMock() + bus = mock_of[MessageBus]( + is_running=True, + publish=AsyncMock(), + )Apply the same replacement to each remaining occurrence above.
As per coding guidelines
tests/**/*.py: “Test doubles: use …mock_of[T](**overrides)for typed-boundary substitutions…”.Also applies to: 427-427, 501-501, 558-558, 692-692, 708-708
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/settings/test_service.py` at line 409, Replace the remaining uses of MagicMock(spec=MessageBus) with the typed test double helper mock_of[MessageBus](**overrides) (i.e., change occurrences that create a "bus" MagicMock to mock_of[MessageBus]) and add the necessary import for mock_of; search for all remaining MagicMock(spec=MessageBus) usages (they reference MessageBus and variable names like bus) and update them to use mock_of[MessageBus] so the file consistently uses the typed-boundary substitution helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 135-149: The current `_check_gate_file` uses naive substring
counts (`before.count(_CATCH_MARKER)`) which miscounts docstrings/comments;
instead parse `before` and `after` with `ast.parse` and count actual ast.Return
nodes whose value represents `_Verdict.CATCH` (e.g., a Return node where value
is an ast.Attribute with value ast.Name id '_Verdict' and attr 'CATCH', and
optionally handle cases like a single Name with id '_Verdict.CATCH' if present);
replace the `before_count`/`after_count` assignments with these AST counts and,
to be robust, fall back to the original substring count only if `ast.parse`
raises a SyntaxError. Ensure you update the logic using the same
`_check_gate_file` function and keep the same return codes and error message
behavior.
In `@scripts/check_mock_spec.py`:
- Around line 311-316: _verdict_for_call_arg currently marks any call whose
callee isn't a known mock factory as _Verdict.CATCH, causing false positives for
helper(Mock()) and accepts_mock(Mock()); change it to first resolve the callee's
parameter annotation (use the same resolver you use elsewhere to inspect the
function signature of call.func via _terminal_callee_name or the project's type
resolver) and only return _Verdict.CATCH when that parameter is definitively
non-Mock-typed; if the annotation cannot be resolved, return _Verdict.SKIP; keep
the existing checks against _MOCK_NAMES and _MOCK_FACTORY_NAMES and return
_Verdict.SKIP for those cases.
---
Duplicate comments:
In `@tests/unit/settings/test_service.py`:
- Line 409: Replace the remaining uses of MagicMock(spec=MessageBus) with the
typed test double helper mock_of[MessageBus](**overrides) (i.e., change
occurrences that create a "bus" MagicMock to mock_of[MessageBus]) and add the
necessary import for mock_of; search for all remaining
MagicMock(spec=MessageBus) usages (they reference MessageBus and variable names
like bus) and update them to use mock_of[MessageBus] so the file consistently
uses the typed-boundary substitution helper.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c735376e-c6f3-440e-b979-f51aaa8a5dbb
📒 Files selected for processing (43)
docs/reference/conventions.mdscripts/check_mock_spec.pyscripts/check_mock_spec_ratchet.pytests/_shared/mock_of.pytests/_shared/test_mock_of.pytests/integration/hr/training/test_training_persistence.pytests/integration/hr/training/test_training_pipeline.pytests/integration/integrations/test_rate_limiter_shared_state.pytests/unit/api/auth/test_postgres_session_store.pytests/unit/api/test_auto_wire_meetings.pytests/unit/api/test_lifecycle_cleanup_kill_switch.pytests/unit/engine/checkpoint/test_strategy.pytests/unit/engine/middleware/test_behavior_tagger.pytests/unit/engine/test_agent_engine.pytests/unit/engine/test_agent_engine_lifecycle.pytests/unit/engine/test_personality_trim_notifier.pytests/unit/engine/workflow/test_service_validate.pytests/unit/hr/evaluation/test_eval_loop_coordinator.pytests/unit/hr/test_activity_list_recent.pytests/unit/hr/training/test_onboarding_integration.pytests/unit/hr/training/test_service.pytests/unit/memory/consolidation/test_dual_mode_strategy.pytests/unit/memory/consolidation/test_service.pytests/unit/memory/consolidation/test_strategy.pytests/unit/memory/org/test_protocol.pytests/unit/memory/procedural/propagation/test_department_scoped.pytests/unit/memory/procedural/pruning/test_hybrid_strategy.pytests/unit/memory/retrieval/hierarchical/test_supervisor.pytests/unit/memory/test_architect_tools.pytests/unit/meta/chief_of_staff/test_chat.pytests/unit/meta/chief_of_staff/test_monitor.pytests/unit/meta/mcp/test_handlers_integrations.pytests/unit/meta/signals/test_scaling.pytests/unit/meta/test_signals.pytests/unit/ontology/drift/test_service.pytests/unit/ontology/drift/test_strategies.pytests/unit/scripts/test_check_mock_spec.pytests/unit/security/test_llm_evaluator.pytests/unit/security/test_service_safety_integration.pytests/unit/security/test_shadow_mode.pytests/unit/security/timeout/test_timeout_checker.pytests/unit/settings/test_service.pytests/unit/tools/test_process_cleanup.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Read design spec from
docs/design/page before implementing; deviations need approval per DESIGN_SPEC.mdPresent every plan for accept/deny before coding
No region/currency/locale privileged; use metric units; use British English per
docs/reference/regional-defaults.mdEvery convention PR ships its enforcement gate per
docs/reference/convention-gates.mdConfiguration precedence: DB > env > YAML > code default via
SettingsService/ConfigResolver; noos.environ.getoutside startup perdocs/reference/configuration-precedence.mdNo hardcoded numeric values; numerics live in
settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2. Enforced byscripts/check_no_magic_numbers.pyComments must document WHY only; no reviewer citations / issue back-refs / migration framing. Enforced by
check_no_review_origin_in_code.py+check_no_migration_framing.pyNo
from __future__ import annotations(Python 3.14+ has PEP 649). Use PEP 758 except syntax (except A, B:no parens unless binding)Type hints on public functions; mypy strict mode. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines
Error classes must follow
<Domain><Condition>Errornaming fromDomainErrorbase; never inheritException/RuntimeErrordirectly. Enforced bycheck_domain_error_hierarchy.pyPydantic v2: frozen +
extra="forbid"on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); use@computed_fieldfor derived; useNotBlankStrfor identifiersArgs models required at every system boundary; use
parse_typed()for every external dict ingestion. Enforced bycheck_boundary_typed.pyImmutability: use
model_copy(update=...)orcopy.deepcopy(); deepcopy at system boundariesAsync fan-out/fan-in must use
asyncio.TaskGroup; helpers must catchException(re-raiseMemoryError/RecursionError)Clock seam: include
clock: Clock | None = Noneparameter; tests injectFakeClock. Li...
Files:
tests/unit/engine/workflow/test_service_validate.pytests/unit/ontology/drift/test_strategies.pytests/unit/memory/test_architect_tools.pytests/unit/api/auth/test_postgres_session_store.pytests/unit/meta/mcp/test_handlers_integrations.pytests/unit/memory/consolidation/test_dual_mode_strategy.pytests/unit/meta/chief_of_staff/test_chat.pytests/unit/security/test_llm_evaluator.pytests/unit/hr/evaluation/test_eval_loop_coordinator.pytests/unit/engine/checkpoint/test_strategy.pytests/unit/security/timeout/test_timeout_checker.pytests/unit/security/test_service_safety_integration.pytests/unit/engine/test_personality_trim_notifier.pytests/unit/memory/procedural/propagation/test_department_scoped.pytests/unit/memory/retrieval/hierarchical/test_supervisor.pytests/unit/hr/test_activity_list_recent.pytests/unit/memory/procedural/pruning/test_hybrid_strategy.pytests/unit/engine/middleware/test_behavior_tagger.pytests/integration/hr/training/test_training_pipeline.pytests/unit/security/test_shadow_mode.pytests/unit/api/test_lifecycle_cleanup_kill_switch.pytests/integration/hr/training/test_training_persistence.pytests/unit/ontology/drift/test_service.pytests/unit/engine/test_agent_engine.pytests/integration/integrations/test_rate_limiter_shared_state.pytests/unit/api/test_auto_wire_meetings.pytests/unit/meta/test_signals.pytests/unit/settings/test_service.pytests/unit/engine/test_agent_engine_lifecycle.pytests/unit/tools/test_process_cleanup.pytests/unit/hr/training/test_onboarding_integration.pytests/unit/hr/training/test_service.pytests/unit/meta/signals/test_scaling.pytests/_shared/mock_of.pytests/_shared/test_mock_of.pytests/unit/scripts/test_check_mock_spec.pytests/unit/meta/chief_of_staff/test_monitor.pytests/unit/memory/consolidation/test_strategy.pytests/unit/memory/org/test_protocol.pyscripts/check_mock_spec.pytests/unit/memory/consolidation/test_service.pyscripts/check_mock_spec_ratchet.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Timeout/slow test failures indicate source-code regression; never edit
tests/baselines/unit_timing.jsonor anyscripts/*_baseline.{txt,json}/scripts/_*_baseline.py. Both families are PreToolUse-blocked. Per-invocation bypass:ALLOW_BASELINE_GROWTH=1 git commit ...(requires explicit user approval)Test markers: use
@pytest.mark.{unit,integration,e2e,slow}. Async tests useauto. Global timeout 30s. Coverage minimum 80%xdist configuration: use
-n 8 --dist=loadfile(auto-applied via pyprojectaddopts).loadfileprevents Python 3.14+ Windows ProactorEventLoop leakWindows test compatibility: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: use
FakeClockfromtests._sharedfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags. BareMagicMockat typed boundaries is blocked byscripts/check_mock_spec.py(zero-tolerance, no baseline)Vendor-agnostic test names: NEVER use real vendor names in project code/tests. Use
example-provider,test-provider,example-{large,medium,small}-001. Real names allowed in.claude/, third-party imports,providers/presets.py,web/public/provider-logos/Hypothesis property tests: use 10 deterministic CI examples; test failures are real bugs (fix + add
@example(...)). Never skip/xfail flaky tests; fix fundamentally
Files:
tests/unit/engine/workflow/test_service_validate.pytests/unit/ontology/drift/test_strategies.pytests/unit/memory/test_architect_tools.pytests/unit/api/auth/test_postgres_session_store.pytests/unit/meta/mcp/test_handlers_integrations.pytests/unit/memory/consolidation/test_dual_mode_strategy.pytests/unit/meta/chief_of_staff/test_chat.pytests/unit/security/test_llm_evaluator.pytests/unit/hr/evaluation/test_eval_loop_coordinator.pytests/unit/engine/checkpoint/test_strategy.pytests/unit/security/timeout/test_timeout_checker.pytests/unit/security/test_service_safety_integration.pytests/unit/engine/test_personality_trim_notifier.pytests/unit/memory/procedural/propagation/test_department_scoped.pytests/unit/memory/retrieval/hierarchical/test_supervisor.pytests/unit/hr/test_activity_list_recent.pytests/unit/memory/procedural/pruning/test_hybrid_strategy.pytests/unit/engine/middleware/test_behavior_tagger.pytests/integration/hr/training/test_training_pipeline.pytests/unit/security/test_shadow_mode.pytests/unit/api/test_lifecycle_cleanup_kill_switch.pytests/integration/hr/training/test_training_persistence.pytests/unit/ontology/drift/test_service.pytests/unit/engine/test_agent_engine.pytests/integration/integrations/test_rate_limiter_shared_state.pytests/unit/api/test_auto_wire_meetings.pytests/unit/meta/test_signals.pytests/unit/settings/test_service.pytests/unit/engine/test_agent_engine_lifecycle.pytests/unit/tools/test_process_cleanup.pytests/unit/hr/training/test_onboarding_integration.pytests/unit/hr/training/test_service.pytests/unit/meta/signals/test_scaling.pytests/_shared/mock_of.pytests/_shared/test_mock_of.pytests/unit/scripts/test_check_mock_spec.pytests/unit/meta/chief_of_staff/test_monitor.pytests/unit/memory/consolidation/test_strategy.pytests/unit/memory/org/test_protocol.pytests/unit/memory/consolidation/test_service.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/engine/workflow/test_service_validate.pytests/unit/ontology/drift/test_strategies.pytests/unit/memory/test_architect_tools.pytests/unit/api/auth/test_postgres_session_store.pytests/unit/meta/mcp/test_handlers_integrations.pytests/unit/memory/consolidation/test_dual_mode_strategy.pytests/unit/meta/chief_of_staff/test_chat.pytests/unit/security/test_llm_evaluator.pytests/unit/hr/evaluation/test_eval_loop_coordinator.pytests/unit/engine/checkpoint/test_strategy.pytests/unit/security/timeout/test_timeout_checker.pytests/unit/security/test_service_safety_integration.pytests/unit/engine/test_personality_trim_notifier.pytests/unit/memory/procedural/propagation/test_department_scoped.pytests/unit/memory/retrieval/hierarchical/test_supervisor.pytests/unit/hr/test_activity_list_recent.pytests/unit/memory/procedural/pruning/test_hybrid_strategy.pytests/unit/engine/middleware/test_behavior_tagger.pytests/integration/hr/training/test_training_pipeline.pytests/unit/security/test_shadow_mode.pytests/unit/api/test_lifecycle_cleanup_kill_switch.pytests/integration/hr/training/test_training_persistence.pytests/unit/ontology/drift/test_service.pytests/unit/engine/test_agent_engine.pytests/integration/integrations/test_rate_limiter_shared_state.pytests/unit/api/test_auto_wire_meetings.pytests/unit/meta/test_signals.pytests/unit/settings/test_service.pytests/unit/engine/test_agent_engine_lifecycle.pytests/unit/tools/test_process_cleanup.pytests/unit/hr/training/test_onboarding_integration.pytests/unit/hr/training/test_service.pytests/unit/meta/signals/test_scaling.pytests/_shared/mock_of.pytests/_shared/test_mock_of.pytests/unit/scripts/test_check_mock_spec.pytests/unit/meta/chief_of_staff/test_monitor.pytests/unit/memory/consolidation/test_strategy.pytests/unit/memory/org/test_protocol.pytests/unit/memory/consolidation/test_service.py
{README.md,docs/**/*.md}
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers perdata/README.md
Files:
docs/reference/conventions.md
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T04:59:27.265Z
Learning: After issue: create branch + commit + push without auto-PR; use `/pre-pr-review` command (gh pr create is hookify-blocked). After PR: use `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T04:59:27.265Z
Learning: Git commits: use `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci). Commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T04:59:27.265Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T04:59:27.265Z
Learning: Branch naming: use `<type>/<slug>` format from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T04:59:27.265Z
Learning: Squash merge PRs; PR body becomes squash commit. Trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T04:59:27.265Z
Learning: After every squash merge → run `/post-merge-cleanup` command
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/engine/workflow/test_service_validate.pytests/unit/ontology/drift/test_strategies.pytests/unit/memory/test_architect_tools.pytests/unit/api/auth/test_postgres_session_store.pytests/unit/meta/mcp/test_handlers_integrations.pytests/unit/memory/consolidation/test_dual_mode_strategy.pytests/unit/meta/chief_of_staff/test_chat.pytests/unit/security/test_llm_evaluator.pytests/unit/hr/evaluation/test_eval_loop_coordinator.pytests/unit/engine/checkpoint/test_strategy.pytests/unit/security/timeout/test_timeout_checker.pytests/unit/security/test_service_safety_integration.pytests/unit/engine/test_personality_trim_notifier.pytests/unit/memory/procedural/propagation/test_department_scoped.pytests/unit/memory/retrieval/hierarchical/test_supervisor.pytests/unit/hr/test_activity_list_recent.pytests/unit/memory/procedural/pruning/test_hybrid_strategy.pytests/unit/engine/middleware/test_behavior_tagger.pytests/integration/hr/training/test_training_pipeline.pytests/unit/security/test_shadow_mode.pytests/unit/api/test_lifecycle_cleanup_kill_switch.pytests/integration/hr/training/test_training_persistence.pytests/unit/ontology/drift/test_service.pytests/unit/engine/test_agent_engine.pytests/integration/integrations/test_rate_limiter_shared_state.pytests/unit/api/test_auto_wire_meetings.pytests/unit/meta/test_signals.pytests/unit/settings/test_service.pytests/unit/engine/test_agent_engine_lifecycle.pytests/unit/tools/test_process_cleanup.pytests/unit/hr/training/test_onboarding_integration.pytests/unit/hr/training/test_service.pytests/unit/meta/signals/test_scaling.pytests/_shared/mock_of.pytests/_shared/test_mock_of.pytests/unit/scripts/test_check_mock_spec.pytests/unit/meta/chief_of_staff/test_monitor.pytests/unit/memory/consolidation/test_strategy.pytests/unit/memory/org/test_protocol.pyscripts/check_mock_spec.pytests/unit/memory/consolidation/test_service.pyscripts/check_mock_spec_ratchet.py
🔇 Additional comments (47)
tests/unit/engine/middleware/test_behavior_tagger.py (1)
146-150: Good typed-boundary mock adjustment.Using
MagicMock(spec=AgentMiddlewareContext)here keeps the exception-path test realistic while avoiding an unrestricted boundary mock.docs/reference/conventions.md (1)
319-363: Ladder documentation looks clear and consistent.The new rung definitions and zero-tolerance gate guidance are aligned with the introduced
mock_of[T]behavior and enforcement flow.tests/unit/tools/test_process_cleanup.py (1)
20-70: Nice cleanup to attribute-bag test doubles.The
SimpleNamespacemigration is appropriate for these_transportstate tests, and the remaining typedMagicMock(spec=Process)case is well-justified.tests/_shared/mock_of.py (1)
32-113:mock_of[T]helper implementation is solid.Autospec +
spec_set=Trueplus explicit override-key failure messaging gives the strict behavior this gate/migration needs.tests/unit/security/test_service_safety_integration.py (1)
63-65: Typed-boundary store double is correctly tightened.Switching to
mock_of[ApprovalStoreProtocol](...)here is a good fit for the new enforcement model.tests/unit/security/test_llm_evaluator.py (1)
123-127: Registry mock migration looks good.
mock_of[ProviderRegistry]with explicit method overrides preserves intent while enforcing boundary shape.tests/unit/api/auth/test_postgres_session_store.py (1)
122-122: Good consistency update for the SQLite boundary double.Using
mock_of[Connection]()here matches the repo’s typed test-double convention.tests/unit/engine/test_personality_trim_notifier.py (1)
255-258: Config resolver boundary doubles are correctly migrated.Using
mock_of[ConfigResolver](...)in both branches keeps these tests aligned with the new typed-boundary enforcement.Also applies to: 353-356
tests/unit/memory/org/test_protocol.py (1)
8-9: Typed-boundary mock migration looks good.Using
mock_of[OrgFactRepository]()here is consistent with the new gate and helper pattern.Also applies to: 17-17
tests/unit/memory/consolidation/test_strategy.py (1)
43-43: Consistentmock_of[MemoryBackend]usage is solid.The typed-boundary substitutions are now uniform across the strategy tests.
Also applies to: 51-51, 62-62, 77-77, 99-99, 118-118, 154-154, 173-173, 193-193
tests/unit/memory/retrieval/hierarchical/test_supervisor.py (1)
23-24: Provider mock migration is correct.The updated tests now use typed
mock_of[CompletionProvider]as intended.Also applies to: 111-113, 244-246
tests/integration/integrations/test_rate_limiter_shared_state.py (1)
109-109: Factory test now follows the typed mock helper pattern.The
MessageBusboundary mock migration here is clean.Also applies to: 115-115
tests/unit/meta/chief_of_staff/test_chat.py (1)
35-36:CompletionProvidertest double update is well done.This keeps the error-path assertion intact while using the typed helper.
Also applies to: 162-164
tests/unit/settings/test_service.py (1)
94-99:SettingsRepositoryfixture migration is correct.Using
mock_of[SettingsRepository](...)here is aligned with the typed-boundary gate direction.tests/unit/meta/signals/test_scaling.py (1)
213-216:ScalingServiceboundary mocks are correctly moved tomock_of.The updates preserve behaviour and align with the typed-double policy.
Also applies to: 245-248
tests/unit/meta/mcp/test_handlers_integrations.py (1)
65-67:ArtifactStorageBackendfixture migration is clean.Using
mock_of[ArtifactStorageBackend]with an asyncdeleteoverride is the right fit.tests/unit/ontology/drift/test_strategies.py (2)
301-302: Usemock_of[DriftDetectionStrategy]intest_strategy_nameas well.Line 301 and Line 302 still use
AsyncMock(spec=DriftDetectionStrategy)for constructor-injected typed collaborators; please align these withmock_of[...]like the other updated tests.As per coding guidelines
tests/**/*.py: “Test doubles: usemock_of[T](**overrides)for typed-boundary substitutions.”
249-285: CORE/USER routing tests now correctly use typed helper doubles.Good migration to
mock_of[DriftDetectionStrategy]with explicit asyncdetectoverrides.tests/unit/memory/procedural/pruning/test_hybrid_strategy.py (1)
155-161: Typed-boundary mock migration is correct.This keeps the same async
prunebehaviour while moving injected collaborators tomock_of[PruningStrategy](), which is exactly what the new gate expects.tests/unit/hr/test_activity_list_recent.py (1)
184-189:CostTrackermock now follows the typed-boundary helper pattern.Good update on Line 184–Line 189: behaviour stays the same and the substitution is now spec-constrained through
mock_of.tests/unit/security/test_shadow_mode.py (1)
82-207: Consistent typed mock migration across shadow-mode scenarios.All changed collaborator injections now use
mock_of[...]consistently, and the test intent for ACTIVE/SHADOW/DISABLED flows is preserved.tests/unit/memory/test_architect_tools.py (1)
198-200: Typed collaborator doubles are now aligned with the shared helper contract.Nice cleanup: repository/exporter boundaries now use
mock_of[...]while keeping the async method stubs unchanged.Also applies to: 217-219, 251-253, 272-274
tests/unit/engine/workflow/test_service_validate.py (1)
25-31:_service()now builds typed repository doubles correctly.The boundary mocks for definition/version repositories are now created through
mock_of[...], matching the enforced test-double path.tests/unit/security/timeout/test_timeout_checker.py (1)
149-154: Policy-failure test now uses a typed helper mock at the boundary.This keeps the same failure simulation while moving
TimeoutPolicymocking tomock_of[...].tests/unit/engine/test_agent_engine_lifecycle.py (1)
117-120: Large typed-boundary mock migration is consistent and solid.The
ExecutionLoop/CostTrackersubstitutions are now uniformly helper-based, with existing lifecycle/timeout test intent preserved.Also applies to: 152-155, 188-191, 224-227, 258-261, 290-293, 323-326, 436-439, 465-468, 470-473, 520-523, 562-565
tests/unit/meta/test_signals.py (1)
8-9: Typed-boundary mock migration is clean and consistent.These changes correctly move typed collaborators to
mock_of[...]with explicit overrides where needed, while preserving existing test intent.Also applies to: 32-32, 115-115, 172-174, 212-215, 257-260
tests/unit/memory/consolidation/test_dual_mode_strategy.py (1)
4-5:_make_strategydefault doubles are now correctly typed.Good migration to
mock_of[...]in the helper; this keeps typed seams strict without changing test behaviour.Also applies to: 10-11, 15-16, 19-20, 86-110
tests/unit/engine/test_agent_engine.py (1)
35-35: Execution-loop seam updates look solid.The
mock_of[ExecutionLoop]adoption is applied consistently in the updated tests and keeps the orchestration-path assertions clear.Also applies to: 40-43, 485-488, 697-700, 817-820, 1164-1167, 1222-1225, 1279-1282
tests/unit/hr/training/test_onboarding_integration.py (1)
4-5: Good typed-mock cleanup in onboarding bridge tests.The updated fixture setup and assertions keep intent unchanged while aligning the primary typed seams to
mock_of[...].Also applies to: 11-13, 17-19, 54-60, 70-72, 78-85, 94-96, 99-106, 118-119
tests/unit/api/test_auto_wire_meetings.py (1)
16-18: Registry test-double typing improvements are correctly applied.Updating
_fake_registries()to typed mocks is a strong improvement and the parametrised partial-missing cases remain explicit and readable.Also applies to: 25-27, 174-175, 179-180
tests/integration/hr/training/test_training_persistence.py (1)
23-26: Integration test collaborator typing is tightened correctly.Nice use of
mock_of[...]here—this keeps the persistence flow test realistic while enforcing typed collaborator boundaries.Also applies to: 32-32, 81-82, 87-88, 99-99
tests/unit/hr/evaluation/test_eval_loop_coordinator.py (1)
10-15:_make_coordinatortyped dependency setup looks correct.The helper now cleanly uses typed
mock_of[...]doubles with sensible defaults, improving contract fidelity in the suite.Also applies to: 20-23, 32-46
tests/unit/memory/procedural/propagation/test_department_scoped.py (1)
7-16: Department-scoped propagation test seams are consistently hardened.The typed
mock_of[...]migration for registry/backend dependencies is well applied across scenarios.Also applies to: 44-53, 84-93, 121-128, 163-170, 198-205, 246-256, 288-297
tests/unit/scripts/test_check_mock_spec.py (5)
277-285: Past review finding addressed.This test locks the
.return_value = Mock()attribute-chain skip, complementingtest_inner_mock_in_mock_class_call_skippedwhich covers the constructor-kwarg form. Both ways of wiring child mocks are now regression-protected.
1-63: LGTM!Clean module setup with:
- Clear docstring explaining the gate's purpose
- Proper Protocol definition for the module interface
- Dynamic module loading to test the script directly
- Reusable
write_test_filefixture
65-233: LGTM!Excellent coverage of Pattern A (typed-boundary catches):
- Direct mock substitutions at call sites (positional/keyword)
- Walrus expressions in call arguments
- Annotated assignments with concrete types
- Fixture return/yield with type annotations
- Async function returns
yield fromin typed generatorsPropertyMockrecognitionThe documented skips for indirect name-binding substitutions are well-justified—cross-module type resolution is out of scope for a pure-AST gate.
235-387: LGTM!Thorough coverage of Pattern B/C/D (non-typed-boundary skips):
- Inner mocks in mock/autospec/mock_of calls
.return_valuechain assignments- Attribute-bag scratch usage
- Collection literals (dict/list/tuple/set)
- Mock-typed annotations
- Unannotated returns
- Class/module scope assignments
These tests lock the "lower rungs" of the test-double ladder that the gate intentionally does not enforce.
389-453: LGTM!Legacy carry-over tests ensure backward compatibility:
- Specced mocks ignored
- First positional arg treated as spec
unittest.mockattribute-call form detection- Non-mock calls not flagged
- Parse error handling
tests/_shared/directory exclusiontests/unit/api/test_lifecycle_cleanup_kill_switch.py (5)
23-45: LGTM!The spec target functions are well-designed for use with
create_autospec():
_no_arg_sync()and_no_arg_async()correctly model zero-arg cleanup methods_config_get_async(_namespace, _key)correctly mirrors the two-positional-arg signature ofConfigResolver.get_*methods, ensuring signature enforcement catches regressionsThe docstrings clearly document the purpose without excessive explanation.
47-97: LGTM!The
_build_app_statefactory correctly applies the test-double ladder:
create_autospec(..., spec_set=True)enforces call signatures at typed boundariesSimpleNamespaceserves as a lightweight attribute-bag for store containers- Consistent pattern across all three stores and the config resolver
100-128: LGTM!The
_run_loop_tickshelper provides deterministic loop control by:
- Monkeypatching
asyncio.sleepto count iterations- Cancelling after exactly N ticks via
CancelledError- Yielding control via
await real_sleep(0)to allow task progressThis avoids wall-clock races and makes tests deterministic.
131-166: LGTM!Both test cases correctly verify the kill-switch behaviour:
enabled=True: All cleanup paths run on every tickenabled=False: No cleanup paths run, but resolver is still consulted (live re-enable)The assertions use appropriate
call_countfor sync andawait_countfor async methods.
206-306: LGTM!The exception isolation tests comprehensively verify:
- Ticket failure doesn't block session/lockout cleanup
- Session failure doesn't block lockout cleanup
MemoryErrorcorrectly propagates (OOM must not be swallowed)The
create_autospecwithside_effectis properly used to simulate failures.tests/_shared/test_mock_of.py (4)
1-11: LGTM!Clean module setup with appropriate imports and module-level
pytest.mark.unitmarker.
13-27: LGTM!The local spec classes provide good coverage:
_ConcreteServicetests both sync and async method autospeccing_Rendereras aProtocolverifies Protocol support (important for structural typing)
29-69: LGTM!Comprehensive test coverage for
mock_of[T]:
- Autospec behavior (call tracking, unknown attribute rejection)
- Override application and rejection of unknown keys
- Protocol support
- Async method awaiting
- Subscript syntax enforcement
72-79: LGTM!The state isolation test correctly verifies that separate
mock_of[T]()calls return independent instances with no shared state—critical for test reliability.
| def _verdict_for_call_arg(call: ast.Call) -> _Verdict: | ||
| """Return CATCH if *call*'s callee is non-Mock, else SKIP.""" | ||
| terminal = _terminal_callee_name(call.func) | ||
| if terminal in _MOCK_NAMES or terminal in _MOCK_FACTORY_NAMES: | ||
| return _Verdict.SKIP | ||
| return _Verdict.CATCH |
There was a problem hiding this comment.
Don't treat every call argument as a typed boundary.
Lines 311-316 currently return CATCH for any non-mock callee, so helper(Mock()) and even accepts_mock(Mock()) are flagged when the parameter is untyped or explicitly Mock-typed. That widens the gate beyond the PR contract of catching only non-Mock-typed boundaries and will create false positives. Please resolve the callee parameter annotation before catching, or default unresolved call sites to SKIP instead of unconditional CATCH.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/check_mock_spec.py` around lines 311 - 316, _verdict_for_call_arg
currently marks any call whose callee isn't a known mock factory as
_Verdict.CATCH, causing false positives for helper(Mock()) and
accepts_mock(Mock()); change it to first resolve the callee's parameter
annotation (use the same resolver you use elsewhere to inspect the function
signature of call.func via _terminal_callee_name or the project's type resolver)
and only return _Verdict.CATCH when that parameter is definitively
non-Mock-typed; if the annotation cannot be resolved, return _Verdict.SKIP; keep
the existing checks against _MOCK_NAMES and _MOCK_FACTORY_NAMES and return
_Verdict.SKIP for those cases.
… 3 codeql, 9 ci) CI broke on the round-1 push because TrainingService.__init__ deep-copies its extractors dict, and the mock_of[Protocol]() autospec instances the round-1 conversion threaded through those extractors carry references to _abc._abc_data (Protocol's abstract-method registry) which is unpicklable. Local unit-only verification missed it; the CI integration suite caught it with TypeError: cannot pickle 'module' object / cannot pickle '_abc._abc_data' object across nine tests. CodeRabbit's round-2 review flagged two additional gate-internals concerns and six remaining MessageBus migration sites; CodeQL re-evaluated and re-fired three path-injection alerts (288, 289, 290) on the restructured ratchet because the is_relative_to boolean form was not consistently recognised as a sanitiser. CI failure fix - tests/integration/hr/training/test_training_pipeline.py: reverted the four mock_of[Protocol] collaborators inside _build_service to AsyncMock(spec=Protocol). The looser form survives copy.deepcopy because the spec class is stored as a plain attribute rather than via the autospec proxy layer that pulls in _abc._abc_data. Reverting preserves CodeRabbit's substantive ask (typed-boundary discipline) -- the original code already had spec=Protocol -- and unblocks the integration suite. An explanatory comment on _build_service documents the deepcopy constraint so the next reviewer sees the reason. CodeRabbit round 2 - scripts/check_mock_spec_ratchet.py: replaced the substring _CATCH_MARKER count in _check_gate_file with _count_catch_returns, an AST walk that counts only ast.Return nodes whose value is _Verdict.CATCH. Docstring / comment / error-message edits that touch the literal string no longer trip the ratchet; a real branch removal is still caught. Falls back to substring count if the source fails to parse so a transient syntax error during interactive editing cannot wedge the gate. - tests/unit/scripts/test_check_mock_spec_ratchet.py: updated the fake_gate fixture to use real def f(): return _Verdict.CATCH branches (the substring-only stubs the previous tests relied on are no longer counted), and added test_gate_edit_docstring_wording_change_allows to lock the substring-vs-AST distinction so a future regression cannot re-introduce the false-positive class. - tests/unit/settings/test_service.py: migrated the remaining six MagicMock(spec=MessageBus) boundary doubles (lines 409, 427, 501, 558, 692, 708 in the prior file) to mock_of[MessageBus](is_running=..., publish=...). Round 1 only caught the first instance because the migration script's pattern required attribute writes whose RHS was a (Magic|Async)?Mock factory call; the MessageBus sites set is_running=True as a literal RHS and slipped past. Skipped CodeRabbit finding - scripts/check_mock_spec.py:311-316 (`_verdict_for_call_arg` too aggressive): CodeRabbit asked the gate to only CATCH when the callee's parameter annotation is definitively non-Mock-typed, and default to SKIP when the annotation cannot be resolved. The gate's own docstring documents that resolving callee parameter annotations is explicitly out of scope for a pure-stdlib AST gate (`docs/reference/conventions.md` section 12.1 covers the indirect-substitution discipline). The PR contract is conversion to mock_of[T] plus subscript-factory recognition, not a fundamental gate-semantics rework -- that would be a separate refactor with its own design discussion. CodeQL path-injection alerts (288, 289, 290) - scripts/check_mock_spec_ratchet.py: switched the allowlist check from Path.is_relative_to to Path.relative_to inside an explicit try/except. relative_to raises ValueError for out-of-tree paths, which is the form CodeQL's py/path-injection query recognises as a sanitiser; the boolean is_relative_to form combined with an adjacent conjunct (`shared_dir not in path.parents`) was not consistently picked up on the previous round. Resolution and the .py suffix check now run inside the same try/except that catches OSError/ ValueError so the read sink at line 220 only fires on a fully validated path. Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/ -m "unit or integration" -n 8` -- 29,735 passed, 49 skipped in 295s. The nine round-1 integration failures (test_training_pipeline.py) now pass.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 1-28: Update the module docstring to accurately reflect the
current gate-check behavior: replace the statement that the hook "counts the
literal substring `_Verdict.CATCH`" with a note that it uses an AST-based count
via the helper function `_count_catch_returns` (which walks the parsed AST and
counts `return _Verdict.CATCH` nodes), and briefly mention that parse/import
errors cause the hook to fail open as implemented; reference
`_count_catch_returns` and the ratchet behavior in the same paragraph so
maintainers know the implementation approach.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ca78b2cb-7c27-4f6c-bb37-9c970efd45e6
📒 Files selected for processing (4)
scripts/check_mock_spec_ratchet.pytests/integration/hr/training/test_training_pipeline.pytests/unit/scripts/test_check_mock_spec_ratchet.pytests/unit/settings/test_service.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Read
docs/design/page before implementing; deviations need approval per DESIGN_SPEC.mdPresent every plan for accept/deny before coding
No region/currency/locale privileged; use metric units; use British English
Every convention PR ships its enforcement gate
Configuration precedence: DB > env > YAML > code default via
SettingsService/ConfigResolver; noos.environ.getoutside startupNo hardcoded numerics; allow only 0/1/-1, HTTP codes, hex masks, powers-of-2; numerics live in
settings/definitions/After issue: branch + commit + push (no auto-PR); use
/pre-pr-review; after PR: use/aurelio-review-prfor external feedback; fix EVERYTHING valid; no deferringUse
uv syncfor dependencies;uv run ruff check/formatfor linting;uv run mypyfor strict type-checking;uv run pytestwith markers for testingNo
from __future__ import annotations(Python 3.14 has PEP 649); use PEP 758 exceptexcept A, B:requires parens if bindingType hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Errors must be
<Domain><Condition>Errorinheriting fromDomainError, neverException/RuntimeErrordirectly; enforced bycheck_domain_error_hierarchy.pyUse Pydantic v2 frozen +
extra="forbid"on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes);@computed_fieldfor derived;NotBlankStrfor identifiersArgs models at every system boundary; use
parse_typed()for every external dict ingestion; enforced bycheck_boundary_typed.pyImmutability: use
model_copy(update=...)orcopy.deepcopy(); deepcopy at system boundariesUse
asyncio.TaskGroupfor fan-out/fan-in; helpers catchExceptionand re-raiseMemoryError/RecursionErrorClock seam: use
clock: Clock | None = None; tests injectFakeClock; lifecycle: services own_lifecycle_lock; timed-out stops mark unrestartableUntrusted content (SEC-1): use
wrap_untrusted()from `eng...
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.pytests/unit/settings/test_service.pyscripts/check_mock_spec_ratchet.pytests/integration/hr/training/test_training_pipeline.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Timeout/slow failures indicate source-code regression; never edit
tests/baselines/unit_timing.jsonor anyscripts/*_baseline.{txt,json}/scripts/_*_baseline.py; bypass viaALLOW_BASELINE_GROWTH=1requiring explicit user approvalUse markers:
@pytest.mark.{unit,integration,e2e,slow}; async tests useauto; global timeout 30s; coverage minimum 80%xdist: use
-n 8 --dist=loadfileauto-applied via pyprojectaddopts(loadfileprevents 3.14+Windows ProactorEventLoop leak)Windows unit tests: use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags; never bareMagicMockat typed boundaries; enforced bycheck_mock_spec.pyImport
FakeClockandmock_offromtests._shared; inject viaclock=and helper's spec subscriptHypothesis: use 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)); never skip/xfail flaky tests; fix fundamentally usingasyncio.Event().wait()instead ofsleep(large)
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.pytests/unit/settings/test_service.pytests/integration/hr/training/test_training_pipeline.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.pytests/unit/settings/test_service.pytests/integration/hr/training/test_training_pipeline.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T05:43:55.831Z
Learning: Numerics in README and public docs must be sourced from `data/runtime_stats.yaml` via `<!--RS:NAME-->` markers
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T05:43:55.831Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced; signed commits required on protected refs (GPG/SSH or GitHub App)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T05:43:55.831Z
Learning: Branches: `<type>/<slug>` from main; pre-commit/pre-push hooks in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T05:43:55.831Z
Learning: Squash merge with PR body becoming commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T05:43:55.831Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T05:43:55.831Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/scripts/test_check_mock_spec_ratchet.pytests/unit/settings/test_service.pyscripts/check_mock_spec_ratchet.pytests/integration/hr/training/test_training_pipeline.py
🔇 Additional comments (22)
tests/unit/settings/test_service.py (3)
6-29: Typed-boundary mock imports are correctly aligned.Good migration away from direct
MagicMockusage tomock_of+MessageBustyping.
94-104: Repository fixture uses typed mock construction correctly.
mock_of[SettingsRepository]plus explicit async method stubs makes this fixture deterministic and aligned with the new gate semantics.
409-412: MessageBus test doubles are now consistently typed and explicit.Nice cleanup: all affected notification-path tests now use
mock_of[MessageBus]with clear asyncpublishbehaviour.Also applies to: 428-431, 503-506, 561-564, 696-699, 713-716
tests/integration/hr/training/test_training_pipeline.py (3)
20-21: Good dependency tightening for typed collaborators.These imports make the helper’s typed mock specs explicit and readable.
Also applies to: 48-49
123-136: Deepcopy exception is clearly justified.The docstring captures the concrete
mock_of/create_autospecdeepcopy failure mode and whyAsyncMock(spec=...)is the safe fallback in this helper.
139-140: Default doubles are now properly spec-constrained.Switching helper defaults from loose mocks to
AsyncMock(spec=...)materially improves boundary safety while preserving existing test setup ergonomics.Also applies to: 143-144, 149-150, 157-158
scripts/check_mock_spec_ratchet.py (7)
46-74: LGTM!The AST-based counting with substring fallback is correctly implemented. The walker properly identifies
return _Verdict.CATCHstatements while ignoring docstrings and comments.
77-88: LGTM!Dynamic module loading is implemented correctly with appropriate error handling.
91-112: LGTM!Fail-open behavior with proper temp file cleanup and stderr logging. The private API access is appropriately annotated.
115-137: LGTM!The function correctly handles both
EditandWritetool envelopes with appropriate no-op detection.
140-183: LGTM!Both check functions correctly implement their respective ratchet directions with clear error messages.
225-232: Confirmshared_dir in path.parentsworks as intended.The check
shared_dir in path.parentscorrectly identifies files undertests/_shared/becausePath.parentsreturns a sequence of parent directories asPathobjects. For/repo/tests/_shared/foo.py, the parents include the resolvedshared_dirpath.However, note that if
pathequalsshared_direxactly (editingtests/_shared/itself, which is a directory), this check would not match. Since the earlierpath.suffix != ".py"guard filters directories, this is safe.
234-237: CodeQL path-injection alert is mitigated by prior sanitization.The
path.read_text()at line 235 operates on a path that has already passed through therelative_to(tests_root)sanitizer (lines 227-230) or is confirmed equal togate_path. This ensures only files within the trustedtests/tree or the gate script itself are read.The CodeQL alerts in past reviews appear to be false positives given this control flow. The inline comment at lines 209-216 documents this design choice.
tests/unit/scripts/test_check_mock_spec_ratchet.py (9)
1-22: LGTM!Clear module docstring explaining the hook's dual protection scheme, appropriate imports, and correct
pytest.mark.unitmarker.
24-49: LGTM!The
_RatchetModuleProtocol cleanly documents the tested interface, and dynamic loading at module scope ensures early failure if the script is broken.
52-76: LGTM!Clean fixture design with proper monkeypatch usage for automatic teardown.
78-116: LGTM!Comprehensive coverage of all fail-open paths in the hook's guard cascade.
118-203: LGTM!Good coverage of path-filtering logic including the
tests/_shared/exemption.
206-275: LGTM!The test file edit scenarios correctly verify the ratchet behavior for holding, raising, and reducing violation counts.
278-307: LGTM!Write tool tests correctly mirror the Edit tool behavior verification.
310-426: LGTM!Excellent coverage of gate-editing scenarios, especially
test_gate_edit_docstring_wording_change_allowswhich locks the AST-based counting behavior against regression to substring matching.
429-457: LGTM!Properly verifies fail-open behavior and stderr diagnostics when the gate fails to load.
CI on round-2's head is green except for the standalone CodeQL status check, which stays FAILURE while py/path-injection alerts 288, 289, and 291 sit open against scripts/check_mock_spec_ratchet.py. CodeRabbit's round-3 review surfaced one quick docstring nit -- the module header still described the substring-counting gate strategy that round 2 replaced with the AST walk. CodeRabbit nit - scripts/check_mock_spec_ratchet.py: refreshed the module docstring to describe _count_catch_returns's AST walk and the substring fallback used when the parse fails open, matching what round 2 actually shipped. CodeQL dismissals (alerts 288, 289, 291 -- py/path-injection) The three alerts all point at the gate-input path-validation block. Rounds 1 and 2 each restructured the validation -- round 1 dropped the dead is_relative_to try/except and inlined the allowlist check, round 2 switched to Path.relative_to inside an explicit try/except so the sanitiser raises ValueError on out-of-tree paths. CodeQL re-fired on each push. The rule's data-flow analysis does not propagate the relative_to-raise sanitiser through the is_gate early-return branch to the downstream read_text sink, and is not following the shared_dir not in path.parents conjunct as a meaningful gate either. The runtime invariants are tight: path is always either the resolved gate file or a path under _TESTS_ROOT and outside tests/_shared/ by the time the read fires. The hook also runs under the harness's working-directory contract -- the file_path from the PreToolUse envelope is the path Claude is about to edit, already scoped to the repo. Dismissing as false positive with a one-line justification pointing at the in-code comment that documents the allowlist; the comment carries codeql[py/path-injection] markers at lines 218 and 235 so a future reviewer encounters the rationale at the cited site rather than only on the GitHub UI. Verification `uv run ruff check scripts/` clean. `uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py` -- all 19 ratchet tests pass.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 105-113: The except block that currently returns 0 on scan failure
should not treat an error as a clean scan; instead log the error and re-raise
(or return a distinct error sentinel that the caller checks) so callers can
distinguish "scan failed" from "0 violations". Update the two try/except sites
that call gate._scan_file (the block that prints "check_mock_spec_ratchet: gate
scan failed, allowing edit..." and the similar block at lines ~155-157) to stop
returning 0 on exception and either raise the caught exc again or return a
documented non-zero sentinel (and update the caller logic to handle that
sentinel) so scan failures are not conflated with zero-violation results.
- Around line 195-201: After json.loads(raw) assign to payload, ensure payload
is a mapping before using .get: check isinstance(payload, dict) and if not,
return 0 (or treat as empty dict), then safely read tool_name =
payload.get("tool_name", "") and tool_input = payload.get("tool_input") or {};
also coerce tool_input to a dict (e.g., if not isinstance(tool_input, dict) set
tool_input = {}) so accesses later in this function (using payload, tool_name,
tool_input) do not raise when JSON returns a non-object type.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7a1d88a8-23d7-42ab-a8b1-4ba0295864e2
📒 Files selected for processing (1)
scripts/check_mock_spec_ratchet.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Lighthouse Site
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: No region/currency/locale should be privileged; use metric units; British English per docs/reference/regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Every convention PR ships its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Timeout/slow test failures = source-code regression; never edit `tests/baselines/*` or `scripts/*_baseline.*` files or `scripts/_*_baseline.py`; no auto-PR; use `/pre-pr-review` after pushing
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Per-invocation bypass for gate baselines requires explicit user approval: `ALLOW_BASELINE_GROWTH=1 git commit ...`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: `/aurelio-review-pr` for external feedback; fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Run `uv sync` for all dependencies and `uv sync --group docs` for docs toolchain
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Use `uv run ruff check src/ tests/ --fix` for linting and auto-fixing
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Use `uv run ruff format src/ tests/` for code formatting
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Use `uv run mypy src/ tests/` for strict type checking
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: xdist `-n 8 --dist=loadfile` auto-applied via pyproject addopts (loadfile prevents 3.14+Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via synthorg-repo-bot)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Pre-commit/pre-push hooks via .pre-commit-config.yaml; hookify rules in .claude/hookify.*.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Squash merge; PR body becomes squash commit; trailers (Release-As, Closes `#N`) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: Web: see web/CLAUDE.md; CLI: see cli/CLAUDE.md (use `go -C cli`, never `cd cli`); Shell: see ~/.claude/rules/common/bash.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:03:47.682Z
Learning: GitHub queries: `gh issue list` via Bash, NOT MCP `list_issues`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_mock_spec_ratchet.py
CodeRabbit's round-4 review (commit-id 055251e) surfaced two robustness gaps in the ratchet hook: scan failures silently collapsed into a zero-violation count, and the top-level JSON payload was assumed to be a mapping. CodeQL re-fired alert 292 on the same path-injection block after round 3's docstring expansion shifted line numbers and the dismissal fingerprint did not match. CodeRabbit findings - scripts/check_mock_spec_ratchet.py: `_scan_text` now returns the `_SCAN_FAILED` sentinel (-1) when the gate raises rather than a literal 0, and `_check_test_file` short-circuits to exit 0 (fail open, with the existing stderr breadcrumb) when either the BEFORE or AFTER scan failed. The old shape masked a transient scan failure as "zero violations" -- if BEFORE failed silently and AFTER succeeded with a positive count, the ratchet would have wrongly blocked the edit. - scripts/check_mock_spec_ratchet.py: `main()` now guards `isinstance(payload, dict)` before calling `.get` on the decoded envelope, and coerces `tool_input` to an empty dict when the field is missing or not a mapping. A malformed envelope (or a future schema change that produces an array / scalar at the top level) no longer crashes the hook with AttributeError; it short- circuits open so the surrounding tool call is unaffected. - tests/unit/scripts/test_check_mock_spec_ratchet.py: two new regression tests -- `test_scan_failure_does_not_block_when_after_count_zero` patches a fake gate that raises on every scan and asserts exit 0 plus the stderr breadcrumb (locks the sentinel branch); `test_non_dict_payload_returns_zero` feeds a JSON array as the envelope and asserts exit 0 (locks the isinstance guard). CodeQL dismissal (alert 292, py/path-injection) - Same shape as 288 / 289 / 291: allowlist enforced via `Path.relative_to` (raises) + exact `gate_path` match, but CodeQL data-flow does not propagate the sanitiser through the `is_gate` early-return to the downstream `Path.resolve` sink. Round-3's docstring expansion shifted the cited line numbers in the file, so the per-alert fingerprint did not carry the previous dismissals forward. Dismissed via API as `dismissed_reason="false positive"` with a one-line pointer at the in-code comment that documents the allowlist. Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/ -m "unit or integration" -n 8` -- 29,737 passed, 49 skipped in 290s. The 21 ratchet-hook tests cover both new regression sites.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 145-154: Validate Edit envelope fields before using them: ensure
file_path is a str before calling Path(file_path), ensure
tool_input.get("old_string") and tool_input.get("new_string") are strings before
using "in" or str.replace, and only treat replace_all as True when
isinstance(tool_input.get("replace_all"), bool) and it's True; if any of these
type checks fail, short-circuit by returning None (keeping the fail-open
behavior). Apply the same explicit isinstance checks and early-return logic to
the other Edit-handling block that uses the same variables (old_string,
new_string, replace_all, before) so malformed JSON doesn't raise.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 90eae35d-75fd-44b4-91e7-1ea9ae1bb66d
📒 Files selected for processing (2)
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{py,ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use metric units; British English; no region/currency/locale privileged in code defaults
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.pyscripts/check_mock_spec_ratchet.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}. Async tests useauto. Timeout 30s global. Coverage 80% minimumWindows unit tests: use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags. BareMagicMockat typed boundaries is blocked bycheck_mock_spec.pyFakeClock and
mock_ofimport fromtests._shared; inject viaclock=and helper subscriptHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)). Never skip/xfail flaky tests; fix fundamentally. Useasyncio.Event().wait()notsleep(large)
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Read design spec from `docs/design/` before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Every convention PR ships its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Timeout/slow test failures = source-code regression; never edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`. Use `ALLOW_BASELINE_GROWTH=1` for explicit bypass
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: After issue completion: branch + commit + push; use `/pre-pr-review` (not auto-PR). After PR: `/aurelio-review-pr` for external feedback. Fix all valid issues; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: xdist: `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` to prevent 3.14+ Windows ProactorEventLoop leak
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Pre-commit/pre-push hooks via `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: After every squash merge → run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Web: see `web/CLAUDE.md`. CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`). Shell: see `~/.claude/rules/common/bash.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:34:40.283Z
Learning: Quick commands: `uv sync`, `uv run ruff check/format`, `uv run mypy`, `uv run pytest` with markers and coverage 80% min
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/scripts/test_check_mock_spec_ratchet.pyscripts/check_mock_spec_ratchet.py
CodeRabbit's round-5 review (commit-id f3d5d87) flagged one remaining robustness gap: _compute_after assumed tool_input["old_string"], tool_input["new_string"], and tool_input["file_path"] were strings without checking. A well-formed JSON envelope can still carry non-string values for those fields (list / int / null), in which case the downstream in / replace / Path(...) calls raise TypeError and the hook crashes instead of failing open. CodeRabbit finding - scripts/check_mock_spec_ratchet.py: _compute_after now isinstance-checks old_string and new_string before the in / replace calls and short-circuits to None on type mismatch. replace_all is now compared by identity to True so a non-bool truthy value (the previous bool(...) coercion accepted any truthy object) does not silently flip the flag. main() adds isinstance(file_path, str) before the Path(file_path) call so a malformed envelope cannot reach the path-validation block with a non-string payload. - tests/unit/scripts/test_check_mock_spec_ratchet.py: two new regression tests -- test_edit_with_non_string_fields_returns_zero feeds a list / int for old/new and asserts exit 0; test_edit_with_non_string_file_path_returns_zero feeds an int file_path and asserts exit 0 (locks both isinstance guards). Verification `uv run ruff check src/ tests/ scripts/` clean. `uv run ruff format src/ tests/ scripts/` clean. `uv run mypy src/ tests/` clean (3714 files). `uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py` -- all 23 ratchet tests pass (two new regression cases added).
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 58-65: The current except block in the AST parse
(ast.parse(source)) falls back to source.count(f"{_VERDICT_NAME}.{_CATCH_ATTR}")
which can still block edits; change the except SyntaxError path to return a
clear parse-failure sentinel (e.g., None or a dedicated constant) instead of a
count, and update _check_gate_file() to treat that sentinel as "unparsable" and
allow the edit when either side (old or new) is unparsable; refer to the
ast.parse call, the returned value from this function, the
_VERDICT_NAME/_CATCH_ATTR usage, and the _check_gate_file() function when making
the changes.
In `@tests/unit/scripts/test_check_mock_spec_ratchet.py`:
- Around line 430-452: The stub _broken_scan used in
test_scan_failure_does_not_block_when_after_count_zero currently raises on every
call; change it to raise only on the first invocation and return a positive scan
result on the second so the BEFORE scan simulates failure while the AFTER scan
succeeds. Implement this by making _broken_scan track call count (e.g. closure
variable or iterator) and on first call raise RuntimeError("simulated scan
failure"), and on the second return a list/tuple representing a positive count;
keep the existing fake_gate_module = SimpleNamespace(_scan_file=_broken_scan)
and monkeypatch.setattr(_MODULE, "_load_gate", ...) unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d01bbd19-e0b4-4536-8441-481d533d2910
📒 Files selected for processing (2)
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Configuration precedence: DB > env > YAML > code default via
SettingsService/ConfigResolver; noos.environ.getoutside startupNo
from __future__ import annotations(3.14 has PEP 649); PEP 758 except:except A, B:requires parens when binding
Files:
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers: use
@pytest.mark.{unit,integration,e2e,slow}; async tests useauto; 30s global timeout; 80% coverage min; xdist-n 8 --dist=loadfileauto-appliedWindows unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags; bareMagicMockat typed boundaries is blocked bycheck_mock_spec.pyHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)); dual-backend conformance intests/conformance/persistence/consumesbackendfixture (SQLite + Postgres); enforced bycheck_dual_backend_test_parity.pyNever skip/xfail flaky tests; fix fundamentally; use
asyncio.Event().wait()notsleep(large)
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: No region/currency/locale privileged; use metric units; use British English
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Every convention PR must ship its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: After implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` before `gh pr create`; after PR use `/aurelio-review-pr`; fix all valid feedback with no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Git commits: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced; signed commits required on protected refs
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Branch naming: `<type>/<slug>` from main; use `.pre-commit-config.yaml` hooks and `.claude/hookify.*.md` rules
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Squash merge only; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: After every squash merge → run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API, not CLI
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Use `go -C cli` for CLI commands, never `cd cli`; see `cli/CLAUDE.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T06:59:14.610Z
Learning: Reference documentation files: load on demand from docs/reference/ for design-spec, conventions, convention-gates, regional-defaults, persistence-boundary, configuration-precedence, errors, sec-prompt-safety, lifecycle-sync, mcp-handler-contract, typed-boundaries, retry-patterns, scaffolding, audit-category-gate-coverage, dead-api-endpoints, pluggable-subsystems, protocols-audit, telemetry
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
CodeRabbit's round-6 review (commit-id 1b8f9e3) flagged two robustness gaps left over from round 2's AST-counter refactor: the SyntaxError fallback could still flip the inequality on a transient mid-edit break, and the round-4 regression test was weaker than the failure mode it was meant to cover. CodeRabbit findings - scripts/check_mock_spec_ratchet.py: _count_catch_returns now returns int | None -- None on SyntaxError -- and _check_gate_file short-circuits to exit 0 when either side fails to parse. The substring fallback (which still miscounted docstrings / comments when both sides were unparseable) is gone. The narrowing is split into two is-None arms so mypy strict accepts the comparison on the joined-int path. - tests/unit/scripts/test_check_mock_spec_ratchet.py: the scan stub in test_scan_failure_does_not_block_when_after_count_zero now raises only on the first call and returns a positive count on the second, exercising the asymmetric BEFORE-fails / AFTER-succeeds path the regression test is meant to lock (previous stub raised on every call, so it did not actually distinguish the silent-zero failure mode from the new sentinel-based fail-open). - tests/unit/scripts/test_check_mock_spec_ratchet.py: new test_unparseable_gate_after_falls_open regression test -- drives the gate-file path with a syntactically broken AFTER state and asserts exit 0; locks the None sentinel. Verification uv run ruff check src/ tests/ scripts/ -- clean. uv run ruff format src/ tests/ scripts/ -- clean. uv run mypy src/ tests/ -- clean (3714 files). uv run python -m pytest tests/ -m unit or integration -n 8 -- 29,740 passed, 49 skipped in 290s. 24 ratchet-hook tests cover the unparseable-gate and asymmetric-scan paths.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
scripts/check_mock_spec_ratchet.py (1)
20-22:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winModule docstring still references a removed substring fallback.
Lines 20-22 say fail-open happens via a substring fallback, but
_count_catch_returns()now returnsNoneon parse failure. Please update this wording to match current behaviour.Suggested edit
- AFTER < BEFORE blocks; parse / import errors fall open via the - helper's substring fallback so a transiently broken gate cannot - wedge editing. + AFTER < BEFORE blocks; parse / import errors fail open via + explicit unparseable-sentinel handling so a transiently broken + gate cannot wedge editing.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/check_mock_spec_ratchet.py` around lines 20 - 22, Update the module docstring text that currently claims a "substring fallback" causes parse/import errors to fail open; instead reflect the current behavior of _count_catch_returns(): when parsing fails it returns None, and callers treat that None as a fail-open signal. Replace the substring-fallback wording with a concise description that _count_catch_returns() returns None on parse failure and that this None is used to allow editing to proceed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 58-63: Edit the inline comment that begins "Returns ``None`` when
the source fails to parse..." to remove any reviewer/alert provenance or
back-references and leave only the rationale (why the function returns None and
the behavior being avoided); do the same cleanup for the other comment block
around the code that covers the same topic (the block you referenced at the
other location). Ensure comments explain only the rationale/why and contain no
reviewer names, tool alerts, issue IDs, or migration framing.
In `@tests/unit/scripts/test_check_mock_spec_ratchet.py`:
- Around line 436-438: The docstring currently contains
reviewer-history/back-reference phrasing (mentions CodeRabbit and migration
framing) instead of a behavioural description; update the test docstring(s) that
mention the ``_SCAN_FAILED`` sentinel (and the similar occurrences around the
second block at lines referencing 551-553) to remove reviewer/tool
back-references and instead state the intended behaviour being tested (e.g.,
that the test locks the ``_SCAN_FAILED`` sentinel to prevent returning a literal
0 on scan failure), keeping the focus on "what" the test asserts rather than
"who/when" reviewed it.
---
Duplicate comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 20-22: Update the module docstring text that currently claims a
"substring fallback" causes parse/import errors to fail open; instead reflect
the current behavior of _count_catch_returns(): when parsing fails it returns
None, and callers treat that None as a fail-open signal. Replace the
substring-fallback wording with a concise description that
_count_catch_returns() returns None on parse failure and that this None is used
to allow editing to proceed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8119b577-feff-4099-9543-9c91849f8487
📒 Files selected for processing (2)
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Lighthouse Site
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{py,ts,tsx,go}
📄 CodeRabbit inference engine (CLAUDE.md)
No region/currency/locale privileged; use metric units; British English
Files:
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
**/*.{py,ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Comments explain WHY only; no reviewer citations / issue back-refs / migration framing
Vendor-agnostic: NEVER use real vendor names in project code/tests; use
example-provider,test-provider,example-{large,medium,small}-001
Files:
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; asyncauto; timeout 30s global; coverage 80% minWindows: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags; bareMagicMockat typed boundary is blockedFakeClock and
mock_ofimport fromtests._shared; inject viaclock=and helper's spec subscriptHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)); never skip/xfailFlaky tests: never skip/xfail; fix fundamentally; use
asyncio.Event().wait()notsleep(large)
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Read `docs/design/` page before implementing; deviations need approval
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Every convention PR ships its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: `/aurelio-review-pr` for external feedback; fix everything valid
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Pre-commit/pre-push hooks: `.pre-commit-config.yaml`; Hookify rules: `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Web: see `web/CLAUDE.md`; CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:30:19.560Z
Learning: Shell: see `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write rules)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
CodeRabbit's round-7 review (commit-id d9ebb44) flagged three docstring / comment hygiene issues left over from rounds 4 and 6: the module docstring still described the old substring fallback, and two helper / test docstrings carried reviewer-history phrasing (mentions of "CodeRabbit asked / flagged") which violates the project rule against reviewer citations in code (CLAUDE.md: no reviewer citations / issue back-refs / migration framing). CodeRabbit findings - scripts/check_mock_spec_ratchet.py: module-level docstring lines 20-22 now describe the actual unparseable-sentinel handling (returns None on parse failure, callers treat that as fail-open) instead of the removed substring fallback. - scripts/check_mock_spec_ratchet.py: _count_catch_returns docstring drops the "CodeRabbit asked the ratchet to avoid" sentence and states the behavioural rationale directly (a substring fallback would flip the inequality when a mid-edit save changes docstring text on only one side). - scripts/check_mock_spec_ratchet.py: _scan_text docstring loses the "the previous behaviour conflated" migration framing and states why the sentinel exists (asymmetric BEFORE-fails / AFTER-succeeds case computes after > before and wrongly blocks if zero is used as a silent failure signal). - tests/unit/scripts/test_check_mock_spec_ratchet.py: two regression-test docstrings reworded to describe the locked behaviour ("Locks the X sentinel: when ...") instead of the reviewer history that produced them. No functional code change in this round -- only the docstring / comment wording. The ratchet behaviour is unchanged. Verification uv run ruff check src/ tests/ scripts/ -- clean. uv run mypy src/ tests/ -- clean (3714 files). uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py -- 24 ratchet tests pass.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
scripts/check_mock_spec_ratchet.py (1)
160-166:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winReject malformed
replace_allvalues instead of treating them as single-replace edits.
replace_all_raw is Trueavoids truthy junk, but it still computes an AFTER state for non-boolean values like"false"or1. That breaks the hook’s fail-open contract for malformed envelopes and can block or allow based on a synthetic single-replace edit instead of short-circuiting open.Suggested fix
replace_all_raw = tool_input.get("replace_all", False) - replace_all = replace_all_raw is True + if not isinstance(replace_all_raw, bool): + return None + replace_all = replace_all_raw if old not in before: return None🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/check_mock_spec_ratchet.py` around lines 160 - 166, The code currently computes replace_all with "replace_all_raw is True" which still proceeds for non-boolean inputs (like "false" or 1) and yields a synthetic single-replace edit; change the validation so that replace_all_raw must be a boolean and reject malformed values: if replace_all_raw is not an instance of bool, return None (short-circuit) to preserve the fail-open contract; otherwise set replace_all = replace_all_raw and proceed with the existing logic that uses replace_all, old, before, new.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@scripts/check_mock_spec_ratchet.py`:
- Around line 160-166: The code currently computes replace_all with
"replace_all_raw is True" which still proceeds for non-boolean inputs (like
"false" or 1) and yields a synthetic single-replace edit; change the validation
so that replace_all_raw must be a boolean and reject malformed values: if
replace_all_raw is not an instance of bool, return None (short-circuit) to
preserve the fail-open contract; otherwise set replace_all = replace_all_raw and
proceed with the existing logic that uses replace_all, old, before, new.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c1c65b61-ed23-4a80-8ed1-9ce7586939bd
📒 Files selected for processing (2)
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Lighthouse Site
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (1)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min; xdist-n 8 --dist=loadfileauto-applied via pyproject addoptsWindows testing: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backTest doubles: use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags; bareMagicMockat typed boundaries forbidden (blocked by check_mock_spec.py with zero tolerance, no baseline)FakeClock and
mock_ofimport fromtests._shared; inject viaclock=and helper's spec subscriptHypothesis: 10 deterministic CI examples; test failures are real bugs (fix + add
@example(...))Flaky tests: NEVER skip/xfail; fix fundamentally; use
asyncio.Event().wait()notsleep(large)for waiting
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_mock_spec_ratchet.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Read design specification from `docs/design/` before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: No region/currency/locale privileging; use metric units and British English per docs/reference/regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Every convention PR ships its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: After issue implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` before creating PR via gh pr create (hookify-blocked); after PR merge, use `/aurelio-review-pr` for external feedback; fix all valid feedback, no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Use `uv sync` for dependency management; `uv run ruff check`, `uv run ruff format`, `uv run mypy`, `uv run pytest` with appropriate markers (unit/integration/e2e)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Pre-commit/pre-push hooks defined in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Squash merge only; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: After every squash merge → run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API, never in CLI binary
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Web guidelines and design system details in `web/CLAUDE.md`; CLI guidelines in `cli/CLAUDE.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-11T07:54:33.696Z
Learning: Shell, bash, and file-write rules in `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write); use `go -C cli`, never `cd cli`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_mock_spec_ratchet.pytests/unit/scripts/test_check_mock_spec_ratchet.py
CodeRabbit's round-8 review (commit-id 7bd83e2) flagged that the round-5 replace_all guard accepts non-boolean inputs and falls through to a synthetic single-replace edit instead of short- circuiting open. ``"false"`` (string truthy), ``1`` (int), or any other non-bool value would pass the ``replace_all_raw is True`` check (evaluates to False) and the function would still compute an AFTER state with a single replacement, giving the ratchet a misleading value to compare. CodeRabbit finding - scripts/check_mock_spec_ratchet.py: ``_compute_after`` now ``isinstance(replace_all_raw, bool)``-checks the value before using it, returning ``None`` for any non-bool input. Boolean ``True``/``False`` flow through unchanged. The strict-fail-open branch matches the rest of the function (old/new/file_path isinstance guards from rounds 4-5). - tests/unit/scripts/test_check_mock_spec_ratchet.py: new ``test_edit_with_non_bool_replace_all_returns_zero`` regression test feeds ``replace_all="false"`` and asserts exit 0 (locks the new isinstance branch). Verification uv run ruff check src/ tests/ scripts/ -- clean. uv run mypy src/ tests/ -- clean (3714 files). uv run python -m pytest tests/unit/scripts/test_check_mock_spec_ratchet.py -- 25 ratchet tests pass.
<!-- HIGHLIGHTS_START --> ## Highlights > _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._ ### What you'll notice - Password and secret fields now include an eye-toggle for easier visibility control. - Containers running without probes are shown as healthy in the doctor command. - Unloaded and missing PR-review agents are restored and available again. ### What's new - Gate baseline protection is enhanced to block em-dashes during writing. ### Under the hood - Replaced Atlas with yoyo-migrations for persistence management. - Refactored codebase extensively, including context-bound user authentication and registry pattern for enums. - Improved linting by draining magic number usages and tightening mock and constant checks. - Updated CI to retry Docker pushes on network timeout errors. - Updated apko lockfiles for dependency management. <!-- HIGHLIGHTS_END --> :robot: I have created a release *beep* *boop* --- ## [0.8.3](v0.8.2...v0.8.3) (2026-05-12) ### Features * harden gate baseline protection + block em-dashes at write time ([#1860](#1860)) ([b41f151](b41f151)) * **web:** eye-toggle on every password / secret field ([#1873](#1873)) ([9070387](9070387)) ### Bug Fixes * **ci:** retry Docker push on Go net/http deadline + cancellation errors ([#1877](#1877)) ([23a0bfa](23a0bfa)) * **cli:** render running-no-probe containers as healthy in doctor ([#1870](#1870)) ([6263795](6263795)) * restore unloaded and missing PR-review agents ([#1875](#1875)) ([db004fd](db004fd)), closes [#1871](#1871) ### Refactoring * bind authenticated user via ContextVar ([#1858](#1858)) ([57ed0b4](57ed0b4)) * code-structure cleanup (sub-tasks D + F + G + H + I) ([#1859](#1859)) ([362e5c8](362e5c8)) * convert enum dispatch to registry pattern ([#1854](#1854)) ([e90550e](e90550e)) * drain no_magic_numbers baseline to zero via Final hoists ([#1856](#1856) phase 2) ([#1872](#1872)) ([ec8109e](ec8109e)) * drain pagination + loop-init + kill-switch baselines ([#1857](#1857)) ([#1868](#1868)) ([115c3c2](115c3c2)) * **persistence:** replace Atlas with yoyo-migrations ([#1876](#1876)) ([1b7e975](1b7e975)), closes [#1874](#1874) * protocols audit follow-up (REVIEW + fold pass) ([#1869](#1869)) ([af33ddb](af33ddb)) * protocols audit follow-up REMOVE pass ([#1867](#1867)) ([dd1eebc](dd1eebc)) * tighten check_mock_spec gate, add mock_of[T], drain baseline ([#1862](#1862)) ([240a253](240a253)) * tighten check_no_magic_numbers for named module constants ([#1856](#1856)) ([#1866](#1866)) ([90c933b](90c933b)) ### CI/CD * update apko lockfiles ([#1863](#1863)) ([2bd32e6](2bd32e6)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Closes #1855.
What changes
Gate quality (sub-task 1)
scripts/check_mock_spec.pywith AST-based detection of bareMock/AsyncMock/MagicMockat typed boundaries only. Catches:m: T = Mock()annotated local whereTis not Mock-typed (Pattern A3),return Mock()/yield Mock()/yield from Mock()from a typed-return function (Pattern A4)..return_valuechains, attribute-bag scratch objects, dict / list / set / tuple-literal values, name-binding indirect substitutions (which would require cross-module callee resolution).scripts/mock_spec_baseline.txt(3046 entries). Gate runs zero-tolerance against the live tree.Helper + ladder (sub-task 2)
tests/_shared/mock_of.pywithmock_of[T](**overrides)factory backed bycreate_autospec(T, instance=True, spec_set=True). Re-export fromtests/_shared/__init__.py.tests/_shared/test_mock_of.pywith 8 unit tests (spec enforcement, override application, unknown-override rejection, Protocol support, async methods, subscript requirement, isinstance compatibility, no state leak between calls).docs/reference/conventions.mdsection 12.1 (FakeClock →create_autospec/mock_of→SimpleNamespace→ bareMagicMockforbidden at typed boundary). Update CLAUDE.md Testing bullet with the pointer.Drain (sub-task 3)
scripts/check_mock_spec_ratchet.pyPreToolUse hook plus registrations in.claude/settings.jsonand.opencode/plugins/synthorg-hooks.ts. Blocks (a) edits totests/*.pythat raise the gate's CATCH count for the touched file, (b) edits toscripts/check_mock_spec.pythat remove_Verdict.CATCHbranches. Drives organic drive-by tightening: every edit reduces or holds.tests/tospec=ConcreteType/mock_of[T]/FakeClock/SimpleNamespace.tests/unit/scripts/test_check_mock_spec.pywith rule-table coverage.Pre-PR review
14 review agents ran (12 main roster + 2 mini-pass). 9 valid findings consolidated, 9 implemented in the follow-up
fix: address pre-PR review feedbackcommit. 4 false positives rejected (chiefly PEP 758except OSError, UnicodeDecodeError:flagged by 3 agents unfamiliar with Python 3.14 syntax; empirically verified to parse and catch correctly).Implemented:
check_mock_spec_ratchet.pyto PreToolUse hooks section indocs/reference/convention-gates.md.check_error_codes_ts_in_sync.pyandcheck_no_controller_response_for_domain_errors.pyin the gate inventory; update<!--RS:convention_gates-->macro,data/runtime_stats.yamlsource, and the CLAUDE.md reference list count.tests/unit/scripts/test_check_mock_spec_ratchet.pywith 18 unit tests covering envelope parsing (Bash / unknown tool / no file_path / non-.py / out-of-scope /tests/_shared/exemption / no-op edit / old_string-not-in-before), ratchet enforcement (hold / raise / reduce / Write tool no-op + raise), gate-edit protection (preserve / remove / add branches), and fail-open path with stderr surfacing.test_yield_from_typed_return_caughtlocking theast.YieldFrombranch.test_set_element_skippedlocking theast.Setcollection-skip branch.except Exception:swallows in_scan_textand_check_test_fileso transitional failures surface instead of silently allowing edits.check_mock_spec_ratchet.pyin the OpenCode plugin header comment._load_gate()plus five AST predicate helpers (_is_empty_splat,_is_literal_none,_has_spec_positional,_has_spec_keyword,_has_dynamic_splat).Also fixed an unrelated mypy strict failure: three test files imported
VersionRepository/FineTuneCheckpointRepository/FineTuneRunRepositoryfromsynthorg.persistence.protocol, but those names are TYPE_CHECKING-only re-exports there; routed through the source protocol modules (version_protocol,fine_tune_protocol).Issue-resolution-verifier confirmed all 5 acceptance criteria RESOLVED at 92-100 % confidence. The stretch goal said "file kept as a structural marker" but the branch deletes
scripts/mock_spec_baseline.txtentirely; the CLAUDE.md text now explicitly documents "zero-tolerance, no baseline" so the intent is unambiguous.Test plan
uv run ruff check src/ tests/ scripts/— clean.uv run mypy src/ tests/—Success: no issues found in 3714 source files.uv run python -m pytest tests/ -m unit— 28,255 passed, 18 skipped (Windows / logfire-extra environmental skips).uv run python scripts/check_convention_gate_inventory.py— exit 0 (meta-gate parity preserved).Local cleanup note
Contributors pulling this branch should delete any stale
scripts/mock_spec_baseline.txtfrom their worktree; the gate now runs zero-tolerance and the file is no longer produced.