Skip to content

feat: harden gate baseline protection + block em-dashes at write time#1860

Merged
Aureliolo merged 6 commits into
mainfrom
chore/baseline-edit-protection
May 10, 2026
Merged

feat: harden gate baseline protection + block em-dashes at write time#1860
Aureliolo merged 6 commits into
mainfrom
chore/baseline-edit-protection

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

Closes the "Claude silently grows debt" loophole and adds a write-time em-dash intercept so common writer mistakes never reach the diff.

Three new enforcement layers:

  1. scripts/check_no_edit_baseline.sh (extended): PreToolUse Edit/Write hook now blocks scripts/*_baseline.{txt,json} and scripts/_*_baseline.py in addition to tests/baselines/*.json.
  2. scripts/check_no_baseline_update.sh (new): PreToolUse Bash hook blocks scripts/check_*.py --update-baseline / --update / --refresh-baseline invocations.
  3. scripts/check_baseline_growth.py (new): pre-commit gate compares each staged scripts/*_baseline.* against HEAD and rejects commits that grow the entry count. Bypass via ALLOW_BASELINE_GROWTH=1 (per-invocation, requires explicit user approval). Raises InvalidBaselineError and exits 2 on corrupt JSON so a malformed baseline cannot silently slip through the comparison.

Plus an em-dash write-time intercept:

  1. scripts/check_no_em_dashes_hook.sh (new): PreToolUse Edit/Write hook rejects U+2014 and HTML entities (—, —, —) in the candidate content. Mirrors the existing diff-time check_no_em_dashes.py pre-commit gate but fires earlier, saving the writer-then-fixer round-trip.

All four hooks wired into both .claude/settings.json and .opencode/plugins/synthorg-hooks.ts (parity preserved, identical execution order).

Documentation

  • CLAUDE.md "Test Regression (MANDATORY)" line expanded to cover both protected baseline families and document the ALLOW_BASELINE_GROWTH=1 bypass.
  • docs/reference/convention-gates.md: added check_baseline_growth.py to the inventory (count 37 -> 38) and added a "PreToolUse hooks" subsection listing all six committed hooks.
  • data/runtime_stats.yaml: bumped convention_gates.raw from 37 to 38.

Test plan

  • 43 new unit tests in tests/unit/scripts/test_check_baseline_growth.py (30) and test_check_no_em_dashes_hook.py (13). Cover: path classification, JSON / text entry counting, growth detection, env bypass, missing-HEAD, FileNotFoundError vs unexpected OSError, invalid-JSON exit, unreadable staged file, multi-path mixed states; em-dash detection across content + new_string, HTML entity variants, ASCII hyphen pass, double-hyphen pass, CHANGELOG.md exclusion (relative + absolute), empty content, missing tool_input, malformed JSON envelope.
  • All 43 pass on Python 3.14.4 / Windows. Full unit suite green (xdist worker-crash flakes confirmed unrelated via --last-failed -n0 re-run).
  • Smoke-tested every hook by piping mock JSON envelopes through bash and verifying both deny and allow paths.

Review coverage

Pre-PR review by 9 agents (code-reviewer, python-reviewer, tool-parity-checker, docs-consistency, comment-quality-rot, conventions-enforcer, test-quality-reviewer, infra-reviewer, silent-failure-hunter). 15 valid findings actioned (2 Critical + 4 Major + 9 Medium). 3 false positives rejected with documented rationale. Triage table at _audit/pre-pr-review/triage.md.

Notable Critical fixes from the review:

  • _count_json_entries returned -1 on parse failure -> a corrupt JSON baseline would silently slip past staged > head. Now raises InvalidBaselineError and exits 2.
  • docs/reference/convention-gates.md inventory was missing the new gate (the file's own header requires updates in the same PR).
  • OpenCode plugin executed the em-dash hook AFTER the triage-gate, while Claude Code ran it BEFORE. Reordered for parity.

Self-test moment

The em-dash hook activated mid-session and tripped on its own commit attempt because the bash source contained literal — patterns. Fixed by reconstructing the patterns at runtime via ${AMP}mdash;, the same trick the Python pre-commit gate uses. Verified end-to-end by writing this PR description: every dash here is an ASCII hyphen.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 10, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 10, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5350a73a-a1bf-4595-afe0-489ece48f42a

📥 Commits

Reviewing files that changed from the base of the PR and between 5587575 and deb0926.

📒 Files selected for processing (13)
  • .claude/settings.json
  • .opencode/plugins/synthorg-hooks.ts
  • .pre-commit-config.yaml
  • CLAUDE.md
  • data/runtime_stats.yaml
  • docs/reference/convention-gates.md
  • scripts/check_baseline_growth.py
  • scripts/check_no_baseline_update.sh
  • scripts/check_no_edit_baseline.sh
  • scripts/check_no_em_dashes_hook.sh
  • tests/unit/scripts/test_check_baseline_growth.py
  • tests/unit/scripts/test_check_no_edit_baseline.py
  • tests/unit/scripts/test_check_no_em_dashes_hook.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers; see data/README.md

Files:

  • docs/reference/convention-gates.md
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Use d2 for architecture/nested containers, mermaid for flowcharts/sequence/pipelines. Markdown tables for tabular data. D2 theme 200, D2 CLI pinned to v0.7.1

Files:

  • docs/reference/convention-gates.md
.pre-commit-config.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

Pre-commit/pre-push hooks: see .pre-commit-config.yaml. Hookify rules: .claude/hookify.*.md

Files:

  • .pre-commit-config.yaml
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: use @pytest.mark.{unit,integration,e2e,slow}. Async tests use auto. Global timeout 30s. Coverage minimum 80%

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back

Mock-spec: every Mock declares spec=ConcreteClass; baseline at scripts/mock_spec_baseline.txt

Testing: inject FakeClock via clock= parameter; use from tests._shared.fake_clock

Hypothesis: 10 deterministic CI examples. Failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally. Use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_no_edit_baseline.py
  • tests/unit/scripts/test_check_baseline_growth.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_no_edit_baseline.py
  • tests/unit/scripts/test_check_baseline_growth.py
{src/synthorg/**/*.py,tests/**/*.py,!src/synthorg/providers/presets.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor names: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_no_edit_baseline.py
  • tests/unit/scripts/test_check_baseline_growth.py
{,.opencode/plugins/}synthorg-hooks.ts

📄 CodeRabbit inference engine (AGENTS.md)

synthorg-hooks.ts plugin uses PreToolUse/PostToolUse hooks that call the same validation scripts as Claude Code hooks

Files:

  • .opencode/plugins/synthorg-hooks.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Read `docs/design/` page before implementing; deviations need approval. See DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Use no region/currency/locale privilege; metric units; British English
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Every convention PR ships its enforcement gate; see `docs/reference/convention-gates.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Timeout/slow test failures are source-code regression; never edit `tests/baselines/unit_timing.json` or baseline files. Per-invocation bypass: `ALLOW_BASELINE_GROWTH=1 git commit`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: `/aurelio-review-pr` for external feedback. Fix all valid feedback; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Git commits: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Git: signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Git branches: `<type>/<slug>` format from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Squash merge: PR body becomes squash commit. Trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: After every squash merge → run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:56:16.462Z
Learning: Web: see `web/CLAUDE.md`. CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`). Shell: see `~/.claude/rules/common/bash.md`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • scripts/check_baseline_growth.py
  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_no_edit_baseline.py
  • tests/unit/scripts/test_check_baseline_growth.py
🔇 Additional comments (18)
data/runtime_stats.yaml (1)

45-47: LGTM!

The convention gate count bump from 37 to 38 correctly reflects the addition of scripts/check_baseline_growth.py. The source definition at line 56 (glob scripts/check_*.py) will include the new gate.

scripts/check_no_baseline_update.sh (1)

1-39: LGTM!

The hook correctly blocks baseline regeneration commands. The regex patterns properly match the --update-baseline, --refresh-baseline, and standalone --update flags (with end-of-string or whitespace anchor to avoid false positives on --update-foo). The fail-closed exit code 2 with JSON deny envelope follows the established hook contract.

scripts/check_no_edit_baseline.sh (1)

26-39: LGTM!

The Windows path normalisation correctly converts backslashes to forward slashes before the case match, ensuring protection applies regardless of path separator style. The case patterns comprehensively cover all documented baseline locations (tests/baselines/*.json, scripts/*_baseline.{txt,json}, scripts/_*_baseline.py) in both relative and absolute forms.

scripts/check_no_em_dashes_hook.sh (1)

1-65: LGTM!

The hook correctly blocks em-dashes at write time. The UTF-8 em-dash literal is constructed via $'\xe2\x80\x94' to keep the source file clean. The HTML entity patterns (&mdash;, &#8212;, &#x2014;) are properly assembled via concatenation. The changelog exemption correctly normalises Windows paths before matching. The grep -qF with multiple -e flags efficiently checks for any blocked pattern.

.claude/settings.json (1)

32-36: LGTM!

Both new hooks are correctly registered with appropriate matchers (Bash for baseline-update blocking, Edit|Write for em-dash blocking) and consistent 5000ms timeouts. The ordering aligns with the documented execution sequence in synthorg-hooks.ts.

Also applies to: 77-81

.opencode/plugins/synthorg-hooks.ts (2)

198-251: LGTM!

The Edit|Write handler correctly separates the envelope payloads: filePathInput for migration/baseline/triage checks, and emDashInput (with content/new_string fields) for the em-dash hook. The hook execution order matches .claude/settings.json: migration → baseline → em-dash → triage-gate.


342-356: LGTM!

The baseline-update hook is correctly invoked unconditionally for all Bash commands, matching the rationale that aliases or subprocess wrappers could obscure the literal flag tokens. The fail-closed error handling via denyReasonFromOutcome is consistent with other hooks.

tests/unit/scripts/test_check_no_edit_baseline.py (1)

1-156: LGTM!

Comprehensive test coverage for the baseline edit hook including all protected path patterns (.txt, .json, .py), Windows path normalisation, missing file_path, and malformed JSON handling. The Python baseline test at lines 75-86 addresses the prior review feedback. The fail-open rationale for malformed JSON (lines 140-145) is well-documented.

scripts/check_baseline_growth.py (4)

1-58: LGTM!

The unified _BASELINE_BASENAME_RE regex (lines 37-39) and its consistent use in _is_baseline_path (lines 46-58) address the prior review feedback about acceptance logic divergence. The regex correctly restricts basenames to lowercase alphanumeric with underscores and enforces the leading underscore requirement for .py baselines.


140-168: LGTM!

Reading staged content from the git index via git show :<path> (line 156) correctly addresses the prior review concern about working-tree drift. This also closes the path-injection vector since the script never touches the filesystem with user-controlled paths.


189-198: LGTM!

The warning for corrupt HEAD baselines (lines 192-196) addresses the prior review feedback about silent fallback to head_count=0. The message clearly explains the fallback behaviour and its implications.


203-244: LGTM!

The main function correctly implements the gate logic: early exit on ALLOW_BASELINE_GROWTH=1 bypass, path filtering via the unified _is_baseline_path, aggregated reporting for invalid baselines (exit 2) and growth violations (exit 1), and clear user-facing guidance for the bypass mechanism.

.pre-commit-config.yaml (1)

218-224: Baseline-growth hook wiring looks correct.

The new hook is scoped appropriately and the updated regex now covers underscore-prefixed *_baseline.{txt,json} alongside _*_baseline.py as intended.

CLAUDE.md (1)

18-18: MANDATORY regression rule update is clear and enforceable.

The expanded baseline protections plus explicit per-invocation bypass guidance are documented well.

docs/reference/convention-gates.md (2)

14-14: Gate inventory update is consistent.

Adding check_baseline_growth.py and bumping the RS-backed total keeps the inventory in sync.

Also applies to: 51-51


53-65: PreToolUse hook documentation is solid.

This section clearly captures the pre-write/pre-command controls and explicitly states fail-closed behaviour.

tests/unit/scripts/test_check_no_em_dashes_hook.py (1)

1-194: Strong unit coverage for em-dash PreToolUse behaviour.

The suite exercises deny paths, expected exemptions, malformed-envelope resilience, and Windows/POSIX path variants thoroughly.

tests/unit/scripts/test_check_baseline_growth.py (1)

1-460: Baseline-growth gate tests are comprehensive and high-value.

Great coverage of classification, parser failures, git-read edge cases, growth detection, and bypass semantics.


Walkthrough

Adds a baseline-growth pre-commit checker and three PreToolUse hook scripts (baseline-update blocking, em-dash blocking, expanded baseline edit protection), wires them into Claude (.claude/settings.json), OpenCode (.opencode/plugins/synthorg-hooks.ts), and pre-commit, updates documentation and a runtime gate count, and adds unit tests covering path classification, entry counting, growth detection, bypass behavior, and hook behaviors.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.15% which is insufficient. The required threshold is 40.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'feat: harden gate baseline protection + block em-dashes at write time' clearly and specifically summarizes the primary changes: hardening baseline protection and adding em-dash blocking at write time.
Description check ✅ Passed The PR description is comprehensive and directly related to the changeset, covering the four new enforcement layers, documentation updates, tests, and verification details.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 10, 2026 10:36 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several new enforcement gates and hooks designed to prevent the growth of gate-suppression baselines and the use of em-dashes. Key additions include a baseline-growth pre-commit hook, and PreToolUse hooks for blocking baseline updates and em-dash characters in candidate content. Documentation in CLAUDE.md and convention-gates.md has been updated accordingly. Feedback identifies a potential loophole in the JSON entry counting logic within scripts/check_baseline_growth.py that could allow growth to go undetected if the locations key is missing.

Comment thread scripts/check_baseline_growth.py Outdated
Comment on lines +58 to +64
if isinstance(payload, dict):
locations = payload.get("locations")
if isinstance(locations, dict):
return len(locations)
if isinstance(locations, list):
return len(locations)
return 0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of _count_json_entries returns 0 for any JSON dictionary that does not contain a "locations" key. This creates a loophole where growth in a baseline file using a flat dictionary structure (or a different key) would not be detected (since 0 > 0 is false).

It is safer to fall back to counting the top-level keys of the dictionary if the expected "locations" key is missing or is not a collection. This ensures that any addition to the baseline is captured as growth.

    if isinstance(payload, dict):
        locations = payload.get("locations")
        if isinstance(locations, (dict, list)):
            return len(locations)
        return len(payload)

Comment thread scripts/check_baseline_growth.py Fixed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.pre-commit-config.yaml:
- Line 223: The hook regex in .pre-commit-config.yaml currently requires the
first character of scripts/*_baseline.(txt|json) to be a lowercase letter, so
underscore-prefixed baselines like
scripts/_workflow_shell_git_commits_baseline.json are not matched; update the
files regex to allow an optional leading underscore for the txt/json basename
(e.g., change the part matching ([a-z][a-z_]*_baseline\.(txt|json)) to allow
[_a-z] as the first character or add an alternative that starts with _),
ensuring both families (scripts/*_baseline.{txt,json} and
scripts/_*_baseline.py) are covered so underscore-prefixed JSON/TXT baselines
trigger the hook.

In `@scripts/check_no_em_dashes_hook.sh`:
- Around line 35-40: The case pattern only matches POSIX separators so Windows
paths with backslashes are missed; normalize FILE_PATH by replacing backslashes
with forward slashes before the case check (create a normalized variable from
FILE_PATH, e.g., replace '\' with '/'), then use that normalized variable in the
case statement that checks for .github/CHANGELOG.md so both POSIX and
Windows-style paths are exempt; update references to FILE_PATH in this block to
use the normalized name (look for FILE_PATH and the case that matches
*/.github/CHANGELOG.md).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 09efa3ff-7b69-4519-abb9-610d41bee7b3

📥 Commits

Reviewing files that changed from the base of the PR and between e90550e and db0a470.

📒 Files selected for processing (12)
  • .claude/settings.json
  • .opencode/plugins/synthorg-hooks.ts
  • .pre-commit-config.yaml
  • CLAUDE.md
  • data/runtime_stats.yaml
  • docs/reference/convention-gates.md
  • scripts/check_baseline_growth.py
  • scripts/check_no_baseline_update.sh
  • scripts/check_no_edit_baseline.sh
  • scripts/check_no_em_dashes_hook.sh
  • tests/unit/scripts/test_check_baseline_growth.py
  • tests/unit/scripts/test_check_no_em_dashes_hook.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers per data/README.md

Files:

  • docs/reference/convention-gates.md
docs/**/*.{md,d2}

📄 CodeRabbit inference engine (CLAUDE.md)

Use d2 for architecture and nested container diagrams; use mermaid for flowcharts, sequences, and pipelines. Use markdown tables for tabular data. Pin D2 CLI to v0.7.1 in CI

Files:

  • docs/reference/convention-gates.md
{,.opencode/plugins/}synthorg-hooks.ts

📄 CodeRabbit inference engine (AGENTS.md)

synthorg-hooks.ts plugin uses PreToolUse/PostToolUse hooks that call the same validation scripts as Claude Code hooks

Files:

  • .opencode/plugins/synthorg-hooks.ts
.pre-commit-config.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

Pre-commit/pre-push hooks: use .pre-commit-config.yaml. Hookify rules: .claude/hookify.*.md

Files:

  • .pre-commit-config.yaml
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: use @pytest.mark.{unit,integration,e2e,slow}. Async auto. Timeout 30s global. Coverage 80% min

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back

Mock-spec: every Mock declares spec=ConcreteClass; baseline at scripts/mock_spec_baseline.txt

Use FakeClock from tests._shared.fake_clock; inject via clock=

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally. Use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_baseline_growth.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_baseline_growth.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: No region/currency/locale privileged; use metric units; British English per regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Every convention PR must ship its enforcement gate per convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.{txt,json}` or `scripts/_*_baseline.py`. Both families PreToolUse-blocked. Bypass via `ALLOW_BASELINE_GROWTH=1 git commit` with explicit user approval
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: use `/aurelio-review-pr` for external feedback. Fix all valid issues; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: xdist: use `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (loadfile prevents 3.14+ Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Git commits: use `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Branches: use `<type>/<slug>` naming from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Squash merge. PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: After every squash merge → run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Web: see `web/CLAUDE.md` for web-specific guidelines
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: CLI: see `cli/CLAUDE.md` for CLI-specific guidelines; use `go -C cli`, never `cd cli`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Shell: see `~/.claude/rules/common/bash.md` for canonical Bash file-write rules (`cd` / `git -C` / Bash file-write rules)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: License: BUSL-1.1 → Apache 2.0 after Change Date
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T10:36:04.008Z
Learning: Project layout: `src/synthorg/` (src layout), `tests/`, `web/` (React 19), `cli/` (Go binary)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_baseline_growth.py
  • scripts/check_baseline_growth.py

Comment thread .pre-commit-config.yaml Outdated
Comment thread scripts/check_no_em_dashes_hook.sh
@codecov
Copy link
Copy Markdown

codecov Bot commented May 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.88%. Comparing base (57ed0b4) to head (deb0926).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1860   +/-   ##
=======================================
  Coverage   84.88%   84.88%           
=======================================
  Files        1806     1806           
  Lines      105078   105078           
  Branches     9200     9200           
=======================================
  Hits        89199    89199           
  Misses      13651    13651           
  Partials     2228     2228           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Aureliolo Aureliolo force-pushed the chore/baseline-edit-protection branch from db0a470 to 6f6aa7f Compare May 10, 2026 11:05
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 10, 2026 11:07 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_baseline_growth.py`:
- Around line 162-165: The except block that catches InvalidBaselineError around
the call to _staged_entries(head_text, suffix) silently sets head_count = 0;
change it to emit a warning (via the module logger or stderr) that the HEAD
baseline is corrupt and that you're falling back to 0, and include the exception
message (and optionally a short snippet or identifier of head_text) so the
silent fallback is visible for debugging while preserving the fallback behavior.

In `@scripts/check_no_edit_baseline.sh`:
- Around line 28-35: The case block only handles POSIX '/' separators so Windows
paths like 'C:\repo\scripts\foo_baseline.txt' can bypass it; normalize FILE_PATH
before the case match by replacing backslashes with forward slashes (e.g.,
transform FILE_PATH using a string replace of '\' to '/') so the existing case
patterns match reliably; update the top of the script (before the case) where
FILE_PATH is set/used, ensuring the normalized FILE_PATH value is then used in
the case statement in scripts/check_no_edit_baseline.sh.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ffbb963a-7ece-438c-b54d-f0dc46ac0a2c

📥 Commits

Reviewing files that changed from the base of the PR and between db0a470 and 6f6aa7f.

📒 Files selected for processing (12)
  • .claude/settings.json
  • .opencode/plugins/synthorg-hooks.ts
  • .pre-commit-config.yaml
  • CLAUDE.md
  • data/runtime_stats.yaml
  • docs/reference/convention-gates.md
  • scripts/check_baseline_growth.py
  • scripts/check_no_baseline_update.sh
  • scripts/check_no_edit_baseline.sh
  • scripts/check_no_em_dashes_hook.sh
  • tests/unit/scripts/test_check_baseline_growth.py
  • tests/unit/scripts/test_check_no_em_dashes_hook.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers

Files:

  • CLAUDE.md
{docs,**}/**/*.{d2,md}

📄 CodeRabbit inference engine (CLAUDE.md)

Use D2 for architecture/nested container diagrams; mermaid for flowcharts/sequence/pipelines; markdown tables for tabular data; D2 theme 200 (Dark Mauve); D2 CLI pinned to v0.7.1 in CI

Files:

  • CLAUDE.md
  • docs/reference/convention-gates.md
{,.opencode/plugins/}synthorg-hooks.ts

📄 CodeRabbit inference engine (AGENTS.md)

synthorg-hooks.ts plugin uses PreToolUse/PostToolUse hooks that call the same validation scripts as Claude Code hooks

Files:

  • .opencode/plugins/synthorg-hooks.ts
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: use @pytest.mark.{unit,integration,e2e,slow}; async tests use auto; global timeout 30s; coverage 80% minimum

Windows tests: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Mock-spec: every Mock must declare spec=ConcreteClass; baseline at scripts/mock_spec_baseline.txt

Use FakeClock from tests._shared.fake_clock for time seams; inject via clock= parameter

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_baseline_growth.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • tests/unit/scripts/test_check_baseline_growth.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Read design specification in `docs/design/` before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Every convention PR must ship its enforcement gate; see docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: After issue completion: create branch, commit, push (no auto-PR); use `/pre-pr-review` before `gh pr create`; after PR: use `/aurelio-review-pr` for external feedback; fix all valid feedback
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: xdist: apply `-n 8 --dist=loadfile` via pyproject `addopts` (loadfile prevents 3.14+Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Use pre-commit/pre-push hooks from `.pre-commit-config.yaml`; follow hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Use GitHub Bash queries (`gh issue list`) not MCP `list_issues` for querying issues
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: After every squash merge: run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: CLI is Docker-only (init/start/stop/status); all features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:06:33.223Z
Learning: Use `go -C cli` command syntax, never `cd cli` when working with CLI; see cli/CLAUDE.md
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/scripts/test_check_no_em_dashes_hook.py
  • scripts/check_baseline_growth.py
  • tests/unit/scripts/test_check_baseline_growth.py
🔇 Additional comments (8)
docs/reference/convention-gates.md (1)

13-51: LGTM!

The gate inventory correctly adds check_baseline_growth.py and updates the count to 38. The new "PreToolUse hooks" section clearly documents all six fail-closed hooks with their behavior.

.opencode/plugins/synthorg-hooks.ts (2)

198-251: LGTM!

The refactored PreToolUse handling correctly:

  • Maintains hook execution order matching .claude/settings.json
  • Passes content (Write) or new_string (Edit) to the em-dash hook for content inspection
  • Continues to pass filePathInput to the triage gate

342-356: LGTM!

The check_no_baseline_update.sh hook is correctly invoked unconditionally for all Bash commands (like the atlas rehash hook), ensuring aliased or wrapped invocations cannot bypass the gate.

scripts/check_baseline_growth.py (2)

125-137: Path traversal mitigation is sound.

The _safe_baseline_path function correctly defends against path injection by resolving symlinks and verifying the result stays within REPO_ROOT. The CodeQL alert on line 151 is mitigated by this validation—read_text is only called after confirming the resolved path is repo-relative.


1-211: LGTM overall!

The gate implementation is well-structured with clear separation of concerns:

  • Proper exit codes for different failure modes
  • Defensive path validation before filesystem access
  • Explicit bypass mechanism with ALLOW_BASELINE_GROWTH=1
  • Informative error messages guiding users toward fixes
tests/unit/scripts/test_check_baseline_growth.py (3)

1-35: LGTM!

The test module correctly uses pytest.mark.unit and loads the script dynamically via importlib.util to access private helpers. The cast("Any", ...) pattern is a pragmatic approach for dynamic module typing.


41-109: LGTM!

Comprehensive parametrized tests for path classification and entry counting, including edge cases for JSON with/without locations, top-level lists, and the fallback to top-level keys.


130-357: LGTM!

The end-to-end tests thoroughly cover:

  • Bypass via ALLOW_BASELINE_GROWTH=1
  • Growth detection and shrink allowance
  • Missing HEAD treated as empty
  • Invalid JSON blocking with EXIT_INVALID_BASELINE
  • Path traversal rejection via _safe_baseline_path
  • Multi-path mixed states (grew, shrank, missing)

Comment thread scripts/check_baseline_growth.py
Comment thread scripts/check_no_edit_baseline.sh Outdated
Comment thread scripts/check_baseline_growth.py Fixed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_baseline_growth.py`:
- Line 31: The baseline acceptance logic is inconsistent between
_is_baseline_path and _safe_baseline_path causing some scripts/*_baseline.*
files to be silently skipped; unify them to a single regex-driven rule by using
the existing _BASELINE_BASENAME_RE as the canonical check in both places (or
factor it into a new helper used by both). Update _is_baseline_path (and any
code paths inside _inspect_path that call _safe_baseline_path) to rely on that
regex test for the filename basename rather than duplicating different checks,
ensuring all baseline files matching _BASELINE_BASENAME_RE are treated
equivalently.
- Around line 151-164: The function _inspect_path currently uses
absolute.read_text to get staged_text from the working tree; change it to read
the staged blob from the Git index instead (so comparisons reflect what is
actually staged). Replace the Path.read_text usage in _inspect_path with a
git-index read (e.g., run git show :<path> or use a Git library to read the
index blob for the same relative path), decode the returned bytes to UTF-8 into
staged_text, and handle OSError/CalledProcessError or missing-index cases the
same way you handled read failures; keep existing variables (suffix, absolute,
grown, invalid) and the rest of the function logic unchanged.

In `@tests/unit/scripts/test_check_no_edit_baseline.py`:
- Around line 36-141: Add a focused unit test in test_check_no_edit_baseline.py
that asserts Python-format baseline paths under scripts (matching the pattern
scripts/_*_baseline.py) are blocked: create a new `@_BASH_AVAILABLE` test (e.g.,
test_blocks_edit_of_python_baseline_posix) that calls the existing helper _run
with tool_input.file_path set to a representative scripts/_something_baseline.py
path and new_string set to some edit, then assert result.returncode == 2; reuse
the same conventions/decorators as the other tests to ensure parity with
test_blocks_edit_of_text_baseline_posix and
test_blocks_edit_of_json_baseline_posix.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 28d334fb-d1e7-4614-81a3-8b7bb81b34f8

📥 Commits

Reviewing files that changed from the base of the PR and between 6f6aa7f and 5587575.

📒 Files selected for processing (4)
  • scripts/check_baseline_growth.py
  • scripts/check_no_edit_baseline.sh
  • tests/unit/scripts/test_check_baseline_growth.py
  • tests/unit/scripts/test_check_no_edit_baseline.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Deploy Preview
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{py,ts,tsx,md}

📄 CodeRabbit inference engine (CLAUDE.md)

Never privilege any region/currency/locale; use metric units and British English per docs/reference/regional-defaults.md

Files:

  • tests/unit/scripts/test_check_no_edit_baseline.py
  • scripts/check_baseline_growth.py
  • tests/unit/scripts/test_check_baseline_growth.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use uv sync for dependency management; format with uv run ruff format src/ tests/; lint with uv run ruff check --fix; type-check with uv run mypy src/ tests/ in strict mode

Files:

  • tests/unit/scripts/test_check_no_edit_baseline.py
  • scripts/check_baseline_growth.py
  • tests/unit/scripts/test_check_baseline_growth.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Run pytest with -m unit, -m integration, or -m e2e markers; enforce 80% coverage minimum; use xdist with -n 8 --dist=loadfile

Test markers: use @pytest.mark.{unit,integration,e2e,slow}; async tests use auto mode; global timeout 30s; coverage minimum 80%

Windows testing: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Mock specifications: every Mock declares spec=ConcreteClass; baseline at scripts/mock_spec_baseline.txt

Use FakeClock from tests._shared.fake_clock; inject via clock= parameter in tests

Hypothesis testing: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)); use HYPOTHESIS_PROFILE=fuzz for fuzzing

Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() instead of sleep(large)

Files:

  • tests/unit/scripts/test_check_no_edit_baseline.py
  • tests/unit/scripts/test_check_baseline_growth.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/scripts/test_check_no_edit_baseline.py
  • tests/unit/scripts/test_check_baseline_growth.py
**/*.sh

📄 CodeRabbit inference engine (CLAUDE.md)

Shell scripts: see ~/.claude/rules/common/bash.md for canonical cd / git -C / file-write rules

Files:

  • scripts/check_no_edit_baseline.sh
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Read design spec from `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Test regression: timeout/slow failures indicate source-code regression; never manually edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.*` files; use `ALLOW_BASELINE_GROWTH=1` for explicit approval
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: After issue implementation: branch + commit + push without auto-PR; use `/pre-pr-review` before creating PR; use `/aurelio-review-pr` for external feedback; fix all valid feedback before merging
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Commit messages: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced; signed commits required on protected refs (GPG/SSH or GitHub App)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Branches: use `<type>/<slug>` naming from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Pre-commit/pre-push hooks configured in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: Squash merge PRs; PR body becomes squash commit message; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: After every squash merge, run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-10T11:31:12.392Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API; use `go -C cli` never `cd cli` per cli/CLAUDE.md
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/scripts/test_check_no_edit_baseline.py
  • scripts/check_baseline_growth.py
  • tests/unit/scripts/test_check_baseline_growth.py
🔇 Additional comments (3)
scripts/check_no_edit_baseline.sh (1)

26-39: Path normalisation and protected-pattern matching are solid.

This closes the Windows separator bypass class and keeps the deny logic clear and deterministic.

scripts/check_baseline_growth.py (1)

173-182: Good corrupt-HEAD warning path.

The warning on parse failure plus fail-closed fallback (head_count=0) makes this failure mode visible without weakening enforcement.

tests/unit/scripts/test_check_baseline_growth.py (1)

304-339: Excellent regression coverage for hardening paths.

These tests directly protect the two risky areas in this PR: path-injection handling and corrupt-HEAD warning behaviour.

Also applies to: 341-373

Comment thread scripts/check_baseline_growth.py Outdated
Comment thread scripts/check_baseline_growth.py
Comment thread tests/unit/scripts/test_check_no_edit_baseline.py
Aureliolo added 6 commits May 10, 2026 13:53
- scripts/check_baseline_growth.py:_count_json_entries fallback to len(payload) when 'locations' key is missing, so flat-dict baselines surface growth instead of returning sentinel 0 (gemini)

- scripts/check_baseline_growth.py: validate resolved baseline path stays under REPO_ROOT before reading; closes CodeQL alert #285 (py/path-injection)

- .pre-commit-config.yaml: hook regex now accepts underscore-prefixed JSON/TXT baselines like scripts/_workflow_shell_git_commits_baseline.json (CodeRabbit)

- scripts/check_no_em_dashes_hook.sh: normalise backslashes before changelog exemption case-match so Windows paths hit the same skip path (CodeRabbit)

- Tests: update json fallback expectation, add path-traversal guard test, add Windows-path changelog exemption parametrized cases
- scripts/check_baseline_growth.py: replace _safe_baseline_path resolve()/is_relative_to() with regex-validated basename approach so the user-supplied directory is discarded and only a known-safe REPO_ROOT/scripts/<basename> path is constructed; closes CodeQL alerts #285 (line 151) and #286 (line 134)

- scripts/check_baseline_growth.py: emit stderr WARNING when InvalidBaselineError is raised parsing HEAD content instead of silently falling back to head_count=0 (coderabbit)

- scripts/check_no_edit_baseline.sh: normalise backslashes before case-match so Windows paths like C:\repo\scripts\foo_baseline.txt hit the same protection as POSIX (coderabbit)

- Tests: rewrite path-injection test for regex sanitisation, add corrupt-HEAD warning test, new tests/unit/scripts/test_check_no_edit_baseline.py covering POSIX + Windows-path + non-baseline + malformed-envelope cases
- scripts/check_baseline_growth.py: unify _is_baseline_path and _safe_baseline_path under a single _BASELINE_BASENAME_RE (canonical regex matching txt/json with optional leading underscore + py with required leading underscore); remove _safe_baseline_path entirely so the gate never touches the user-controlled filesystem path (coderabbit)

- scripts/check_baseline_growth.py: read staged content via 'git show :<path>' instead of Path.read_text(); index reads reflect what is actually staged and remove the path-injection sink, closing CodeQL alert #285 (coderabbit + codeql)

- tests/unit/scripts/test_check_baseline_growth.py: switch tmp_path/REPO_ROOT-monkeypatched fixtures to mocks of _read_staged + _read_head; broaden test_is_baseline_path with traversal/Windows-separator/uppercase/extension cases; add _read_staged subprocess.run tests

- tests/unit/scripts/test_check_no_edit_baseline.py: add test_blocks_edit_of_python_baseline_posix and a Windows-path .py case (coderabbit)
@Aureliolo Aureliolo force-pushed the chore/baseline-edit-protection branch from 5587575 to deb0926 Compare May 10, 2026 11:55
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 10, 2026 11:56 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit b41f151 into main May 10, 2026
104 of 105 checks passed
@Aureliolo Aureliolo deleted the chore/baseline-edit-protection branch May 10, 2026 12:25
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 10, 2026 12:25 — with GitHub Actions Inactive
Aureliolo pushed a commit that referenced this pull request May 12, 2026
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub
Models). Commit-based changelog below._

### What you'll notice
- Password and secret fields now include an eye-toggle for easier
visibility control.
- Containers running without probes are shown as healthy in the doctor
command.
- Unloaded and missing PR-review agents are restored and available
again.

### What's new
- Gate baseline protection is enhanced to block em-dashes during
writing.

### Under the hood
- Replaced Atlas with yoyo-migrations for persistence management.
- Refactored codebase extensively, including context-bound user
authentication and registry pattern for enums.
- Improved linting by draining magic number usages and tightening mock
and constant checks.
- Updated CI to retry Docker pushes on network timeout errors.
- Updated apko lockfiles for dependency management.

<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*
---


##
[0.8.3](v0.8.2...v0.8.3)
(2026-05-12)


### Features

* harden gate baseline protection + block em-dashes at write time
([#1860](#1860))
([b41f151](b41f151))
* **web:** eye-toggle on every password / secret field
([#1873](#1873))
([9070387](9070387))


### Bug Fixes

* **ci:** retry Docker push on Go net/http deadline + cancellation
errors ([#1877](#1877))
([23a0bfa](23a0bfa))
* **cli:** render running-no-probe containers as healthy in doctor
([#1870](#1870))
([6263795](6263795))
* restore unloaded and missing PR-review agents
([#1875](#1875))
([db004fd](db004fd)),
closes [#1871](#1871)


### Refactoring

* bind authenticated user via ContextVar
([#1858](#1858))
([57ed0b4](57ed0b4))
* code-structure cleanup (sub-tasks D + F + G + H + I)
([#1859](#1859))
([362e5c8](362e5c8))
* convert enum dispatch to registry pattern
([#1854](#1854))
([e90550e](e90550e))
* drain no_magic_numbers baseline to zero via Final hoists
([#1856](#1856) phase 2)
([#1872](#1872))
([ec8109e](ec8109e))
* drain pagination + loop-init + kill-switch baselines
([#1857](#1857))
([#1868](#1868))
([115c3c2](115c3c2))
* **persistence:** replace Atlas with yoyo-migrations
([#1876](#1876))
([1b7e975](1b7e975)),
closes [#1874](#1874)
* protocols audit follow-up (REVIEW + fold pass)
([#1869](#1869))
([af33ddb](af33ddb))
* protocols audit follow-up REMOVE pass
([#1867](#1867))
([dd1eebc](dd1eebc))
* tighten check_mock_spec gate, add mock_of[T], drain baseline
([#1862](#1862))
([240a253](240a253))
* tighten check_no_magic_numbers for named module constants
([#1856](#1856))
([#1866](#1866))
([90c933b](90c933b))


### CI/CD

* update apko lockfiles
([#1863](#1863))
([2bd32e6](2bd32e6))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants