10 changes: 10 additions & 0 deletions .claude/settings.json
Original file line number Diff line number Diff line change
@@ -61,6 +61,16 @@
}
]
},
{
"matcher": "Edit|Write",
"hooks": [
{
"type": "command",
"command": "python3 scripts/check_mock_spec_ratchet.py",
"timeout": 10000
}
]
},
{
"matcher": "Edit|Write",
"hooks": [
35 changes: 34 additions & 1 deletion .opencode/plugins/synthorg-hooks.ts
@@ -11,6 +11,7 @@
* PreToolUse (Bash): scripts/check_bash_no_write.sh
* PreToolUse (Bash): scripts/check_git_c_cwd.sh
* PreToolUse (Bash | Edit): scripts/check_no_bulk_edit.py
* PreToolUse (Edit|Write): scripts/check_mock_spec_ratchet.py
* PreToolUse (Edit|Write): scripts/check_no_edit_migration.sh
* PreToolUse (Edit|Write): scripts/check_no_edit_baseline.sh
* PreToolUse (Edit|Write): scripts/check_no_em_dashes_hook.sh
@@ -196,6 +197,39 @@ export const SynthOrgHooks: Plugin = async ({ client, $, app }) => {
}

const filePathInput = { file_path: filePath } as Record<string, unknown>;
const args = (output.args ?? {}) as Record<string, unknown>;

// Mock-spec ratchet: blocks edits that would increase the
// gate's CATCH count in any tests/*.py file, and edits that
// weaken scripts/check_mock_spec.py. The hook needs the
// full Edit / Write payload (file_path + old_string /
// new_string / content) so it runs before the
// file-path-only checks below.
{
const ratchetInput = { ...filePathInput } as Record<string, unknown>;
if (typeof args.old_string === "string") {
ratchetInput.old_string = args.old_string;
}
if (typeof args.new_string === "string") {
ratchetInput.new_string = args.new_string;
}
if (typeof args.replace_all === "boolean") {
ratchetInput.replace_all = args.replace_all;
}
if (typeof args.content === "string") {
ratchetInput.content = args.content;
}
const outcome = runHookScript(
"scripts/check_mock_spec_ratchet.py",
ratchetInput,
10000,
input.tool === "edit" ? "Edit" : "Write",
);
const denyReason = denyReasonFromOutcome(outcome);
if (denyReason) {
throw new Error(denyReason);
}
}

// Order must match `.claude/settings.json` PreToolUse Edit|Write:
// migration, baseline, em-dash (richer payload), triage-gate.
@@ -218,7 +252,6 @@ export const SynthOrgHooks: Plugin = async ({ client, $, app }) => {

// check_no_em_dashes_hook.sh: inspects the candidate content
// before it lands on disk (mirrors scripts/check_no_em_dashes.py).
const args = (output.args ?? {}) as Record<string, unknown>;
const emDashInput = { ...filePathInput } as Record<string, unknown>;
if (typeof args.content === "string") {
emDashInput.content = args.content;
6 changes: 3 additions & 3 deletions CLAUDE.md
@@ -42,7 +42,7 @@ PYTHONPATH=. uv run zensical build # docs

- [docs/reference/claude-reference.md](docs/reference/claude-reference.md): Doc layout, Docker, releasing, CI, dependencies, Hypothesis deep-dive
- [docs/reference/conventions.md](docs/reference/conventions.md): repository CRUD, lifecycle, response wrapping, validators, event imports, domain errors, file structure, frozen ConfigDict, args models, Pydantic v2, async, Clock seam, observability event-name inventory, repository CRUD method names, MCP handler logging centralisation, repository file structure, registering MANDATORY rules, `activate_*` / `deactivate_*` lifecycle naming
- [docs/reference/convention-gates.md](docs/reference/convention-gates.md): gate inventory (35 enforcement gates + meta-gate)
- [docs/reference/convention-gates.md](docs/reference/convention-gates.md): gate inventory (39 enforcement gates + meta-gate + PreToolUse hooks)
- [docs/reference/regional-defaults.md](docs/reference/regional-defaults.md), [persistence-boundary.md](docs/reference/persistence-boundary.md), [configuration-precedence.md](docs/reference/configuration-precedence.md), [errors.md](docs/reference/errors.md), [sec-prompt-safety.md](docs/reference/sec-prompt-safety.md), [lifecycle-sync.md](docs/reference/lifecycle-sync.md), [mcp-handler-contract.md](docs/reference/mcp-handler-contract.md), [typed-boundaries.md](docs/reference/typed-boundaries.md), [retry-patterns.md](docs/reference/retry-patterns.md), [scaffolding.md](docs/reference/scaffolding.md), [audit-category-gate-coverage.md](docs/reference/audit-category-gate-coverage.md), [dead-api-endpoints.md](docs/reference/dead-api-endpoints.md), [pluggable-subsystems.md](docs/reference/pluggable-subsystems.md), [protocols-audit.md](docs/reference/protocols-audit.md), [telemetry.md](docs/reference/telemetry.md)

## Diagrams
@@ -82,8 +82,8 @@ PYTHONPATH=. uv run zensical build # docs
- Markers: `@pytest.mark.{unit,integration,e2e,slow}`. Async `auto`. Timeout 30s global. Coverage 80% min.
- xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (`loadfile` prevents 3.14+Windows ProactorEventLoop leak).
- Windows: unit tests use `WindowsSelectorEventLoopPolicy` (3.14 IOCP teardown race). Subprocess tests override back.
- Mock-spec: every Mock declares `spec=ConcreteClass`; baseline at `scripts/mock_spec_baseline.txt`.
- FakeClock from `tests._shared.fake_clock`; inject via `clock=`.
- Test doubles: ladder in [conventions.md](docs/reference/conventions.md) section 12.1. `FakeClock` for the Clock seam, `mock_of[T](**overrides)` for typed-boundary substitutions, `SimpleNamespace` for attribute-bags. Bare `MagicMock` at a typed boundary (constructor / fn arg / annotated local / typed fixture return) is blocked by `scripts/check_mock_spec.py` (zero-tolerance, no baseline).
- FakeClock and `mock_of` import from `tests._shared`; inject via `clock=` and the helper's spec subscript.
- Vendor-agnostic: NEVER use real vendor names in project code/tests. Use `example-provider`, `test-provider`, `example-{large,medium,small}-001`. Allowed in `.claude/`, third-party imports, `providers/presets.py`, `web/public/provider-logos/`.
- Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add `@example(...)`).
- Flaky: NEVER skip/xfail; fix fundamentally. Use `asyncio.Event().wait()` not `sleep(large)`.
4 changes: 2 additions & 2 deletions data/runtime_stats.yaml
@@ -43,8 +43,8 @@ stats:
raw: 7
display: "7"
convention_gates:
raw: 38
display: "38"
raw: 41
display: "41"

sources:
tests: "uv run python -m pytest --collect-only -q"
5 changes: 4 additions & 1 deletion docs/reference/convention-gates.md
@@ -21,6 +21,7 @@ All under `scripts/`. The list is generated by `ls scripts/check_*.py`; if an en
- `check_domain_error_hierarchy.py`
- `check_dto_forbid_extra.py`
- `check_dual_backend_test_parity.py`
- `check_error_codes_ts_in_sync.py`
- `check_forbidden_literals.py`
- `check_image_signatures.py`
- `check_list_pagination.py`
@@ -29,6 +30,7 @@ All under `scripts/`. The list is generated by `ls scripts/check_*.py`; if an en
- `check_mcp_admin_tool_guardrails.py`
- `check_mock_spec.py`
- `check_no_bulk_edit.py`
- `check_no_controller_response_for_domain_errors.py`
- `check_no_em_dashes.py`
- `check_no_loop_bound_init.py`
- `check_no_magic_numbers.py`
@@ -48,7 +50,7 @@ All under `scripts/`. The list is generated by `ls scripts/check_*.py`; if an en
- `check_workflow_shell_git_commits.py`
- `check_workflow_tag_lifecycle.py`

(<!--RS:convention_gates-->38<!--/RS--> total `check_*.py` scripts: enforcement gates plus the meta-gate below.)
(<!--RS:convention_gates-->41<!--/RS--> total `check_*.py` scripts: 39 enforcement gates listed above plus the meta-gate below, plus the `check_mock_spec_ratchet.py` PreToolUse hook.)

## PreToolUse hooks (Claude Code + OpenCode)

@@ -60,6 +62,7 @@ Some conventions are also enforced *before* the file lands on disk so the offend
- `check_no_edit_migration.sh`: blocks `Edit` / `Write` on `src/synthorg/persistence/{sqlite,postgres}/revisions/*.sql` (use `atlas migrate diff` instead).
- `check_no_atlas_rehash.sh`: blocks `Bash` invocations of `atlas migrate hash` (rehashing breaks installed databases).
- `check_pre_pr_review_triage_gate.sh`: blocks `Edit` / `Write` outside `_audit/` while a `/pre-pr-review` triage table is pending user approval.
- `check_mock_spec_ratchet.py`: blocks `Edit` / `Write` to `tests/*.py` that would raise the mock-spec gate's CATCH count for the touched file, and blocks `Edit` / `Write` to `scripts/check_mock_spec.py` that would remove `_Verdict.CATCH` branches. This encourages drive-by tightening: every edit either reduces the residual CATCH count or holds it steady.
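
The ratchet rule in the last item can be sketched as follows. This is a minimal illustration, not the contents of `scripts/check_mock_spec_ratchet.py`: the helper names `count_bare_mocks`, `apply_edit`, and `ratchet_allows` are hypothetical, and the counting rule (any `Mock` / `MagicMock` / `AsyncMock` call without `spec=` or `spec_set=`) is a deliberately coarse stand-in for the gate's CATCH verdict, which only targets typed boundaries.

```python
import ast

# Assumption: these constructor names approximate what the real gate inspects.
MOCK_NAMES = {"Mock", "MagicMock", "AsyncMock"}


def count_bare_mocks(source: str) -> int:
    """Count mock constructor calls that pass neither spec= nor spec_set=.

    Raises SyntaxError if the source does not parse.
    """
    total = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both `MagicMock()` and `mock.MagicMock()`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        has_spec = any(kw.arg in {"spec", "spec_set"} for kw in node.keywords)
        if name in MOCK_NAMES and not has_spec:
            total += 1
    return total


def apply_edit(current: str, old_string: str, new_string: str,
               replace_all: bool = False) -> str:
    """Build the candidate file content from an Edit-style payload."""
    if replace_all:
        return current.replace(old_string, new_string)
    return current.replace(old_string, new_string, 1)


def ratchet_allows(current: str, candidate: str) -> bool:
    """Allow the edit only if the count does not go up."""
    return count_bare_mocks(candidate) <= count_bare_mocks(current)
```

Under these assumptions, an edit that swaps `MagicMock(spec=dict)` for `MagicMock()` is rejected, while the reverse edit, or any edit that leaves the count unchanged, passes.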

The hook layer is fail-closed: the OpenCode plugin treats hook execution errors as denials, so a misbehaving hook script blocks the action rather than letting it through.

45 changes: 45 additions & 0 deletions docs/reference/conventions.md
@@ -316,6 +316,51 @@ churn there (~30 test sites passing callables) outweighs the
testability win. New code uses the `Clock` Protocol; do not add new
modules to the legacy-callable list without justification.

## 12.1. Test-double ladder

When a test needs to stand in for a real collaborator, prefer the
narrowest tool that still expresses the contract. The ladder, top to
bottom:

1. **Protocol fake**: a hand-written class that satisfies a Protocol
structurally, with deterministic state. Canonical example:
`tests/_shared/fake_clock.py` (`FakeClock` satisfies
`synthorg.core.clock.Clock`). Use this when the seam has more than
one method, the test asserts on observed effects (sleeps recorded,
time advanced), or virtual-time semantics matter.
2. **`create_autospec` / `mock_of[T]`**: a typed mock built from the
real class. Use `mock_of[T](**overrides)` from `tests._shared` for
the common case (autospec with `instance=True, spec_set=True`,
plus optional kwarg-overrides); reach for raw
`create_autospec(T, instance=True, spec_set=True)` when the call site needs the
lower-level API. Missing methods raise `AttributeError`; renames
in production fail tests immediately.
3. **`SimpleNamespace`**: a plain attribute bag for scratch data
that never crosses a typed boundary. Use when the test only
needs `obj.x = 1; obj.y = 2` semantics and does not care about
method behaviour.
4. **Bare `MagicMock` (forbidden at a typed boundary)**: a
`MagicMock()` with no `spec=` absorbs any attribute access. The
`scripts/check_mock_spec.py` gate blocks substituting a bare
mock for a typed parameter, fixture return, or annotated local.
Bare mocks remain allowed for `.return_value =` chains and
attribute-bag scratch data (rungs 3 and 4); the gate does not
scan those.
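
The four rungs can be compared side by side in a short sketch. It uses only the standard library: `UserRepo` and `FakeTicker` are hypothetical stand-ins invented for this example (the project's `FakeClock` and `mock_of` helpers are not reproduced here), and rung 2 uses raw `create_autospec` with the same flags the ladder names.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock, create_autospec


class UserRepo:
    """Hypothetical collaborator, used only for this illustration."""

    def get(self, user_id: str) -> str:
        raise NotImplementedError


class FakeTicker:
    """Rung 1: hand-written fake with deterministic, observable state."""

    def __init__(self) -> None:
        self.now = 0.0
        self.sleeps: list[float] = []

    def monotonic(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        # Record the request and advance virtual time instead of blocking.
        self.sleeps.append(seconds)
        self.now += seconds


# Rung 2: typed mock built from the real class.
repo = create_autospec(UserRepo, instance=True, spec_set=True)
repo.get.return_value = "alice"

# Rung 3: plain attribute bag, no method behaviour.
row = SimpleNamespace(x=1, y=2)

# Rung 4: bare mock, which absorbs any attribute access.
bare = MagicMock()

# The autospec rejects names that UserRepo does not define ...
autospec_knows_fetch = hasattr(repo, "fetch")
# ... while the bare mock invents them silently.
bare_knows_fetch = hasattr(bare, "fetch")
```

Here `autospec_knows_fetch` is `False` and `bare_knows_fetch` is `True`: if production code renamed `get` to `fetch`, a test written against the autospec would fail, whereas one written against the bare mock would keep passing.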

Picking a rung:

| Need | Use |
| ---------------------------------------------------------- | ---------------------------------- |
| Wall-clock / monotonic / sleep | `FakeClock` |
| Concrete service / repo at a constructor or fn argument | `mock_of[T](**overrides)` |
| Other Protocol with hand-rolled state | new Protocol fake under `tests/_shared/` |
| Throwaway namespace for `obj.x = 1` style | `types.SimpleNamespace(x=1, y=2)` |
| Inner mock for `parent.method.return_value = ...` chain | bare `MagicMock()` (not a typed boundary) |

The gate in `scripts/check_mock_spec.py` runs in zero-tolerance mode
(no baseline file). A new bare `Mock()` substituted for a typed
parameter fails pre-commit; the fix is one of the three upper rungs.

## 13. Observability event-name inventory

Every observability event is a `Final[str]` constant in a