feat: harden gate baseline protection + block em-dashes at write time#1860
Conversation
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
There was a problem hiding this comment.
Code Review
This pull request introduces several new enforcement gates and hooks designed to prevent the growth of gate-suppression baselines and the use of em-dashes. Key additions include a baseline-growth pre-commit hook, and PreToolUse hooks for blocking baseline updates and em-dash characters in candidate content. Documentation in CLAUDE.md and convention-gates.md has been updated accordingly. Feedback identifies a potential loophole in the JSON entry counting logic within scripts/check_baseline_growth.py that could allow growth to go undetected if the locations key is missing.
| if isinstance(payload, dict): | ||
| locations = payload.get("locations") | ||
| if isinstance(locations, dict): | ||
| return len(locations) | ||
| if isinstance(locations, list): | ||
| return len(locations) | ||
| return 0 |
There was a problem hiding this comment.
The current implementation of _count_json_entries returns 0 for any JSON dictionary that does not contain a "locations" key. This creates a loophole where growth in a baseline file using a flat dictionary structure (or a different key) would not be detected (since 0 > 0 is false).
It is safer to fall back to counting the top-level keys of the dictionary if the expected "locations" key is missing or is not a collection. This ensures that any addition to the baseline is captured as growth.
if isinstance(payload, dict):
locations = payload.get("locations")
if isinstance(locations, (dict, list)):
return len(locations)
return len(payload)There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.pre-commit-config.yaml:
- Line 223: The hook regex in .pre-commit-config.yaml currently requires the
first character of scripts/*_baseline.(txt|json) to be a lowercase letter, so
underscore-prefixed baselines like
scripts/_workflow_shell_git_commits_baseline.json are not matched; update the
files regex to allow an optional leading underscore for the txt/json basename
(e.g., change the part matching ([a-z][a-z_]*_baseline\.(txt|json)) to allow
[_a-z] as the first character or add an alternative that starts with _),
ensuring both families (scripts/*_baseline.{txt,json} and
scripts/_*_baseline.py) are covered so underscore-prefixed JSON/TXT baselines
trigger the hook.
In `@scripts/check_no_em_dashes_hook.sh`:
- Around line 35-40: The case pattern only matches POSIX separators so Windows
paths with backslashes are missed; normalize FILE_PATH by replacing backslashes
with forward slashes before the case check (create a normalized variable from
FILE_PATH, e.g., replace '\' with '/'), then use that normalized variable in the
case statement that checks for .github/CHANGELOG.md so both POSIX and
Windows-style paths are exempt; update references to FILE_PATH in this block to
use the normalized name (look for FILE_PATH and the case that matches
*/.github/CHANGELOG.md).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 09efa3ff-7b69-4519-abb9-610d41bee7b3
📒 Files selected for processing (12)
.claude/settings.json.opencode/plugins/synthorg-hooks.ts.pre-commit-config.yamlCLAUDE.mddata/runtime_stats.yamldocs/reference/convention-gates.mdscripts/check_baseline_growth.pyscripts/check_no_baseline_update.shscripts/check_no_edit_baseline.shscripts/check_no_em_dashes_hook.shtests/unit/scripts/test_check_baseline_growth.pytests/unit/scripts/test_check_no_em_dashes_hook.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Lighthouse Site
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
docs/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers per data/README.md
Files:
docs/reference/convention-gates.md
docs/**/*.{md,d2}
📄 CodeRabbit inference engine (CLAUDE.md)
Use
d2for architecture and nested container diagrams; usemermaidfor flowcharts, sequences, and pipelines. Use markdown tables for tabular data. Pin D2 CLI to v0.7.1 in CI
Files:
docs/reference/convention-gates.md
{,.opencode/plugins/}synthorg-hooks.ts
📄 CodeRabbit inference engine (AGENTS.md)
synthorg-hooks.ts plugin uses PreToolUse/PostToolUse hooks that call the same validation scripts as Claude Code hooks
Files:
.opencode/plugins/synthorg-hooks.ts
.pre-commit-config.yaml
📄 CodeRabbit inference engine (CLAUDE.md)
Pre-commit/pre-push hooks: use
.pre-commit-config.yaml. Hookify rules:.claude/hookify.*.md
Files:
.pre-commit-config.yaml
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers: use
@pytest.mark.{unit,integration,e2e,slow}. Asyncauto. Timeout 30s global. Coverage 80% minWindows: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race). Subprocess tests override backMock-spec: every Mock declares
spec=ConcreteClass; baseline atscripts/mock_spec_baseline.txtUse
FakeClockfromtests._shared.fake_clock; inject viaclock=Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...))Flaky tests: NEVER skip/xfail; fix fundamentally. Use
asyncio.Event().wait()notsleep(large)
Files:
tests/unit/scripts/test_check_no_em_dashes_hook.pytests/unit/scripts/test_check_baseline_growth.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_no_em_dashes_hook.pytests/unit/scripts/test_check_baseline_growth.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: No region/currency/locale privileged; use metric units; British English per regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Every convention PR must ship its enforcement gate per convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.{txt,json}` or `scripts/_*_baseline.py`. Both families PreToolUse-blocked. Bypass via `ALLOW_BASELINE_GROWTH=1 git commit` with explicit user approval
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: use `/aurelio-review-pr` for external feedback. Fix all valid issues; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: xdist: use `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (loadfile prevents 3.14+ Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Git commits: use `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Branches: use `<type>/<slug>` naming from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Squash merge. PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: After every squash merge → run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Web: see `web/CLAUDE.md` for web-specific guidelines
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: CLI: see `cli/CLAUDE.md` for CLI-specific guidelines; use `go -C cli`, never `cd cli`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Shell: see `~/.claude/rules/common/bash.md` for canonical Bash file-write rules (`cd` / `git -C` / Bash file-write rules)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: License: BUSL-1.1 → Apache 2.0 after Change Date
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T10:36:04.008Z
Learning: Project layout: `src/synthorg/` (src layout), `tests/`, `web/` (React 19), `cli/` (Go binary)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/scripts/test_check_no_em_dashes_hook.pytests/unit/scripts/test_check_baseline_growth.pyscripts/check_baseline_growth.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1860 +/- ##
=======================================
Coverage 84.88% 84.88%
=======================================
Files 1806 1806
Lines 105078 105078
Branches 9200 9200
=======================================
Hits 89199 89199
Misses 13651 13651
Partials 2228 2228 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
db0a470 to
6f6aa7f
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_baseline_growth.py`:
- Around line 162-165: The except block that catches InvalidBaselineError around
the call to _staged_entries(head_text, suffix) silently sets head_count = 0;
change it to emit a warning (via the module logger or stderr) that the HEAD
baseline is corrupt and that you're falling back to 0, and include the exception
message (and optionally a short snippet or identifier of head_text) so the
silent fallback is visible for debugging while preserving the fallback behavior.
In `@scripts/check_no_edit_baseline.sh`:
- Around line 28-35: The case block only handles POSIX '/' separators so Windows
paths like 'C:\repo\scripts\foo_baseline.txt' can bypass it; normalize FILE_PATH
before the case match by replacing backslashes with forward slashes (e.g.,
transform FILE_PATH using a string replace of '\' to '/') so the existing case
patterns match reliably; update the top of the script (before the case) where
FILE_PATH is set/used, ensuring the normalized FILE_PATH value is then used in
the case statement in scripts/check_no_edit_baseline.sh.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ffbb963a-7ece-438c-b54d-f0dc46ac0a2c
📒 Files selected for processing (12)
.claude/settings.json.opencode/plugins/synthorg-hooks.ts.pre-commit-config.yamlCLAUDE.mddata/runtime_stats.yamldocs/reference/convention-gates.mdscripts/check_baseline_growth.pyscripts/check_no_baseline_update.shscripts/check_no_edit_baseline.shscripts/check_no_em_dashes_hook.shtests/unit/scripts/test_check_baseline_growth.pytests/unit/scripts/test_check_no_em_dashes_hook.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Lighthouse Site
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers
Files:
CLAUDE.md
{docs,**}/**/*.{d2,md}
📄 CodeRabbit inference engine (CLAUDE.md)
Use D2 for architecture/nested container diagrams; mermaid for flowcharts/sequence/pipelines; markdown tables for tabular data; D2 theme 200 (Dark Mauve); D2 CLI pinned to v0.7.1 in CI
Files:
CLAUDE.mddocs/reference/convention-gates.md
{,.opencode/plugins/}synthorg-hooks.ts
📄 CodeRabbit inference engine (AGENTS.md)
synthorg-hooks.ts plugin uses PreToolUse/PostToolUse hooks that call the same validation scripts as Claude Code hooks
Files:
.opencode/plugins/synthorg-hooks.ts
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers: use
@pytest.mark.{unit,integration,e2e,slow}; async tests use auto; global timeout 30s; coverage 80% minimumWindows tests: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backMock-spec: every Mock must declare
spec=ConcreteClass; baseline atscripts/mock_spec_baseline.txtUse
FakeClockfromtests._shared.fake_clockfor time seams; inject viaclock=parameterHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...))Flaky tests: NEVER skip/xfail; fix fundamentally; use
asyncio.Event().wait()notsleep(large)
Files:
tests/unit/scripts/test_check_no_em_dashes_hook.pytests/unit/scripts/test_check_baseline_growth.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_no_em_dashes_hook.pytests/unit/scripts/test_check_baseline_growth.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Read design specification in `docs/design/` before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Every convention PR must ship its enforcement gate; see docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: After issue completion: create branch, commit, push (no auto-PR); use `/pre-pr-review` before `gh pr create`; after PR: use `/aurelio-review-pr` for external feedback; fix all valid feedback
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: xdist: apply `-n 8 --dist=loadfile` via pyproject `addopts` (loadfile prevents 3.14+Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Use pre-commit/pre-push hooks from `.pre-commit-config.yaml`; follow hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Use GitHub Bash queries (`gh issue list`) not MCP `list_issues` for querying issues
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: After every squash merge: run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: CLI is Docker-only (init/start/stop/status); all features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:06:33.223Z
Learning: Use `go -C cli` command syntax, never `cd cli` when working with CLI; see cli/CLAUDE.md
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/scripts/test_check_no_em_dashes_hook.pyscripts/check_baseline_growth.pytests/unit/scripts/test_check_baseline_growth.py
🔇 Additional comments (8)
docs/reference/convention-gates.md (1)
13-51: LGTM!The gate inventory correctly adds
check_baseline_growth.pyand updates the count to 38. The new "PreToolUse hooks" section clearly documents all six fail-closed hooks with their behavior..opencode/plugins/synthorg-hooks.ts (2)
198-251: LGTM!The refactored PreToolUse handling correctly:
- Maintains hook execution order matching
.claude/settings.json- Passes
content(Write) ornew_string(Edit) to the em-dash hook for content inspection- Continues to pass
filePathInputto the triage gate
342-356: LGTM!The
check_no_baseline_update.shhook is correctly invoked unconditionally for all Bash commands (like the atlas rehash hook), ensuring aliased or wrapped invocations cannot bypass the gate.scripts/check_baseline_growth.py (2)
125-137: Path traversal mitigation is sound.The
_safe_baseline_pathfunction correctly defends against path injection by resolving symlinks and verifying the result stays withinREPO_ROOT. The CodeQL alert on line 151 is mitigated by this validation—read_textis only called after confirming the resolved path is repo-relative.
1-211: LGTM overall!The gate implementation is well-structured with clear separation of concerns:
- Proper exit codes for different failure modes
- Defensive path validation before filesystem access
- Explicit bypass mechanism with
ALLOW_BASELINE_GROWTH=1- Informative error messages guiding users toward fixes
tests/unit/scripts/test_check_baseline_growth.py (3)
1-35: LGTM!The test module correctly uses
pytest.mark.unitand loads the script dynamically viaimportlib.utilto access private helpers. Thecast("Any", ...)pattern is a pragmatic approach for dynamic module typing.
41-109: LGTM!Comprehensive parametrized tests for path classification and entry counting, including edge cases for JSON with/without
locations, top-level lists, and the fallback to top-level keys.
130-357: LGTM!The end-to-end tests thoroughly cover:
- Bypass via
ALLOW_BASELINE_GROWTH=1- Growth detection and shrink allowance
- Missing HEAD treated as empty
- Invalid JSON blocking with
EXIT_INVALID_BASELINE- Path traversal rejection via
_safe_baseline_path- Multi-path mixed states (grew, shrank, missing)
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/check_baseline_growth.py`:
- Line 31: The baseline acceptance logic is inconsistent between
_is_baseline_path and _safe_baseline_path causing some scripts/*_baseline.*
files to be silently skipped; unify them to a single regex-driven rule by using
the existing _BASELINE_BASENAME_RE as the canonical check in both places (or
factor it into a new helper used by both). Update _is_baseline_path (and any
code paths inside _inspect_path that call _safe_baseline_path) to rely on that
regex test for the filename basename rather than duplicating different checks,
ensuring all baseline files matching _BASELINE_BASENAME_RE are treated
equivalently.
- Around line 151-164: The function _inspect_path currently uses
absolute.read_text to get staged_text from the working tree; change it to read
the staged blob from the Git index instead (so comparisons reflect what is
actually staged). Replace the Path.read_text usage in _inspect_path with a
git-index read (e.g., run git show :<path> or use a Git library to read the
index blob for the same relative path), decode the returned bytes to UTF-8 into
staged_text, and handle OSError/CalledProcessError or missing-index cases the
same way you handled read failures; keep existing variables (suffix, absolute,
grown, invalid) and the rest of the function logic unchanged.
In `@tests/unit/scripts/test_check_no_edit_baseline.py`:
- Around line 36-141: Add a focused unit test in test_check_no_edit_baseline.py
that asserts Python-format baseline paths under scripts (matching the pattern
scripts/_*_baseline.py) are blocked: create a new `@_BASH_AVAILABLE` test (e.g.,
test_blocks_edit_of_python_baseline_posix) that calls the existing helper _run
with tool_input.file_path set to a representative scripts/_something_baseline.py
path and new_string set to some edit, then assert result.returncode == 2; reuse
the same conventions/decorators as the other tests to ensure parity with
test_blocks_edit_of_text_baseline_posix and
test_blocks_edit_of_json_baseline_posix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 28d334fb-d1e7-4614-81a3-8b7bb81b34f8
📒 Files selected for processing (4)
scripts/check_baseline_growth.pyscripts/check_no_edit_baseline.shtests/unit/scripts/test_check_baseline_growth.pytests/unit/scripts/test_check_no_edit_baseline.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Deploy Preview
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{py,ts,tsx,md}
📄 CodeRabbit inference engine (CLAUDE.md)
Never privilege any region/currency/locale; use metric units and British English per docs/reference/regional-defaults.md
Files:
tests/unit/scripts/test_check_no_edit_baseline.pyscripts/check_baseline_growth.pytests/unit/scripts/test_check_baseline_growth.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use
uv syncfor dependency management; format withuv run ruff format src/ tests/; lint withuv run ruff check --fix; type-check withuv run mypy src/ tests/in strict mode
Files:
tests/unit/scripts/test_check_no_edit_baseline.pyscripts/check_baseline_growth.pytests/unit/scripts/test_check_baseline_growth.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Run pytest with
-m unit,-m integration, or-m e2emarkers; enforce 80% coverage minimum; use xdist with-n 8 --dist=loadfileTest markers: use
@pytest.mark.{unit,integration,e2e,slow}; async tests useautomode; global timeout 30s; coverage minimum 80%Windows testing: unit tests use
WindowsSelectorEventLoopPolicy(3.14 IOCP teardown race); subprocess tests override backMock specifications: every Mock declares
spec=ConcreteClass; baseline atscripts/mock_spec_baseline.txtUse
FakeClockfromtests._shared.fake_clock; inject viaclock=parameter in testsHypothesis testing: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)); useHYPOTHESIS_PROFILE=fuzzfor fuzzingFlaky tests: NEVER skip/xfail; fix fundamentally; use
asyncio.Event().wait()instead ofsleep(large)
Files:
tests/unit/scripts/test_check_no_edit_baseline.pytests/unit/scripts/test_check_baseline_growth.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/scripts/test_check_no_edit_baseline.pytests/unit/scripts/test_check_baseline_growth.py
**/*.sh
📄 CodeRabbit inference engine (CLAUDE.md)
Shell scripts: see
~/.claude/rules/common/bash.mdfor canonicalcd/git -C/ file-write rules
Files:
scripts/check_no_edit_baseline.sh
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Read design spec from `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Test regression: timeout/slow failures indicate source-code regression; never manually edit `tests/baselines/unit_timing.json` or `scripts/*_baseline.*` files; use `ALLOW_BASELINE_GROWTH=1` for explicit approval
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: After issue implementation: branch + commit + push without auto-PR; use `/pre-pr-review` before creating PR; use `/aurelio-review-pr` for external feedback; fix all valid feedback before merging
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Commit messages: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced; signed commits required on protected refs (GPG/SSH or GitHub App)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Branches: use `<type>/<slug>` naming from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Pre-commit/pre-push hooks configured in `.pre-commit-config.yaml`; hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: Squash merge PRs; PR body becomes squash commit message; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: After every squash merge, run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-10T11:31:12.392Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API; use `go -C cli` never `cd cli` per cli/CLAUDE.md
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/scripts/test_check_no_edit_baseline.pyscripts/check_baseline_growth.pytests/unit/scripts/test_check_baseline_growth.py
🔇 Additional comments (3)
scripts/check_no_edit_baseline.sh (1)
26-39: Path normalisation and protected-pattern matching are solid.This closes the Windows separator bypass class and keeps the deny logic clear and deterministic.
scripts/check_baseline_growth.py (1)
173-182: Good corrupt-HEAD warning path.The warning on parse failure plus fail-closed fallback (
head_count=0) makes this failure mode visible without weakening enforcement.tests/unit/scripts/test_check_baseline_growth.py (1)
304-339: Excellent regression coverage for hardening paths.These tests directly protect the two risky areas in this PR: path-injection handling and corrupt-HEAD warning behaviour.
Also applies to: 341-373
- scripts/check_baseline_growth.py:_count_json_entries fallback to len(payload) when 'locations' key is missing, so flat-dict baselines surface growth instead of returning sentinel 0 (gemini) - scripts/check_baseline_growth.py: validate resolved baseline path stays under REPO_ROOT before reading; closes CodeQL alert #285 (py/path-injection) - .pre-commit-config.yaml: hook regex now accepts underscore-prefixed JSON/TXT baselines like scripts/_workflow_shell_git_commits_baseline.json (CodeRabbit) - scripts/check_no_em_dashes_hook.sh: normalise backslashes before changelog exemption case-match so Windows paths hit the same skip path (CodeRabbit) - Tests: update json fallback expectation, add path-traversal guard test, add Windows-path changelog exemption parametrized cases
- scripts/check_baseline_growth.py: replace _safe_baseline_path resolve()/is_relative_to() with regex-validated basename approach so the user-supplied directory is discarded and only a known-safe REPO_ROOT/scripts/<basename> path is constructed; closes CodeQL alerts #285 (line 151) and #286 (line 134) - scripts/check_baseline_growth.py: emit stderr WARNING when InvalidBaselineError is raised parsing HEAD content instead of silently falling back to head_count=0 (coderabbit) - scripts/check_no_edit_baseline.sh: normalise backslashes before case-match so Windows paths like C:\repo\scripts\foo_baseline.txt hit the same protection as POSIX (coderabbit) - Tests: rewrite path-injection test for regex sanitisation, add corrupt-HEAD warning test, new tests/unit/scripts/test_check_no_edit_baseline.py covering POSIX + Windows-path + non-baseline + malformed-envelope cases
- scripts/check_baseline_growth.py: unify _is_baseline_path and _safe_baseline_path under a single _BASELINE_BASENAME_RE (canonical regex matching txt/json with optional leading underscore + py with required leading underscore); remove _safe_baseline_path entirely so the gate never touches the user-controlled filesystem path (coderabbit) - scripts/check_baseline_growth.py: read staged content via 'git show :<path>' instead of Path.read_text(); index reads reflect what is actually staged and remove the path-injection sink, closing CodeQL alert #285 (coderabbit + codeql) - tests/unit/scripts/test_check_baseline_growth.py: switch tmp_path/REPO_ROOT-monkeypatched fixtures to mocks of _read_staged + _read_head; broaden test_is_baseline_path with traversal/Windows-separator/uppercase/extension cases; add _read_staged subprocess.run tests - tests/unit/scripts/test_check_no_edit_baseline.py: add test_blocks_edit_of_python_baseline_posix and a Windows-path .py case (coderabbit)
5587575 to
deb0926
Compare
<!-- HIGHLIGHTS_START --> ## Highlights > _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._ ### What you'll notice - Password and secret fields now include an eye-toggle for easier visibility control. - Containers running without probes are shown as healthy in the doctor command. - Unloaded and missing PR-review agents are restored and available again. ### What's new - Gate baseline protection is enhanced to block em-dashes during writing. ### Under the hood - Replaced Atlas with yoyo-migrations for persistence management. - Refactored codebase extensively, including context-bound user authentication and registry pattern for enums. - Improved linting by draining magic number usages and tightening mock and constant checks. - Updated CI to retry Docker pushes on network timeout errors. - Updated apko lockfiles for dependency management. <!-- HIGHLIGHTS_END --> :robot: I have created a release *beep* *boop* --- ## [0.8.3](v0.8.2...v0.8.3) (2026-05-12) ### Features * harden gate baseline protection + block em-dashes at write time ([#1860](#1860)) ([b41f151](b41f151)) * **web:** eye-toggle on every password / secret field ([#1873](#1873)) ([9070387](9070387)) ### Bug Fixes * **ci:** retry Docker push on Go net/http deadline + cancellation errors ([#1877](#1877)) ([23a0bfa](23a0bfa)) * **cli:** render running-no-probe containers as healthy in doctor ([#1870](#1870)) ([6263795](6263795)) * restore unloaded and missing PR-review agents ([#1875](#1875)) ([db004fd](db004fd)), closes [#1871](#1871) ### Refactoring * bind authenticated user via ContextVar ([#1858](#1858)) ([57ed0b4](57ed0b4)) * code-structure cleanup (sub-tasks D + F + G + H + I) ([#1859](#1859)) ([362e5c8](362e5c8)) * convert enum dispatch to registry pattern ([#1854](#1854)) ([e90550e](e90550e)) * drain no_magic_numbers baseline to zero via Final hoists ([#1856](#1856) phase 2) ([#1872](#1872)) ([ec8109e](ec8109e)) * drain pagination + loop-init + kill-switch baselines ([#1857](#1857)) ([#1868](#1868)) ([115c3c2](115c3c2)) * **persistence:** replace Atlas with yoyo-migrations ([#1876](#1876)) ([1b7e975](1b7e975)), closes [#1874](#1874) * protocols audit follow-up (REVIEW + fold pass) ([#1869](#1869)) ([af33ddb](af33ddb)) * protocols audit follow-up REMOVE pass ([#1867](#1867)) ([dd1eebc](dd1eebc)) * tighten check_mock_spec gate, add mock_of[T], drain baseline ([#1862](#1862)) ([240a253](240a253)) * tighten check_no_magic_numbers for named module constants ([#1856](#1856)) ([#1866](#1866)) ([90c933b](90c933b)) ### CI/CD * update apko lockfiles ([#1863](#1863)) ([2bd32e6](2bd32e6)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
Closes the "Claude silently grows debt" loophole and adds a write-time em-dash intercept so common writer mistakes never reach the diff.
Three new enforcement layers:
scripts/check_no_edit_baseline.sh(extended): PreToolUse Edit/Write hook now blocksscripts/*_baseline.{txt,json}andscripts/_*_baseline.pyin addition totests/baselines/*.json.scripts/check_no_baseline_update.sh(new): PreToolUse Bash hook blocksscripts/check_*.py --update-baseline/--update/--refresh-baselineinvocations.scripts/check_baseline_growth.py(new): pre-commit gate compares each stagedscripts/*_baseline.*against HEAD and rejects commits that grow the entry count. Bypass viaALLOW_BASELINE_GROWTH=1(per-invocation, requires explicit user approval). RaisesInvalidBaselineErrorand exits 2 on corrupt JSON so a malformed baseline cannot silently slip through the comparison.Plus an em-dash write-time intercept:
scripts/check_no_em_dashes_hook.sh(new): PreToolUse Edit/Write hook rejects U+2014 and HTML entities (—,—,—) in the candidate content. Mirrors the existing diff-timecheck_no_em_dashes.pypre-commit gate but fires earlier, saving the writer-then-fixer round-trip.All four hooks wired into both
.claude/settings.jsonand.opencode/plugins/synthorg-hooks.ts(parity preserved, identical execution order).Documentation
CLAUDE.md"Test Regression (MANDATORY)" line expanded to cover both protected baseline families and document theALLOW_BASELINE_GROWTH=1bypass.docs/reference/convention-gates.md: addedcheck_baseline_growth.pyto the inventory (count 37 -> 38) and added a "PreToolUse hooks" subsection listing all six committed hooks.data/runtime_stats.yaml: bumpedconvention_gates.rawfrom 37 to 38.Test plan
tests/unit/scripts/test_check_baseline_growth.py(30) andtest_check_no_em_dashes_hook.py(13). Cover: path classification, JSON / text entry counting, growth detection, env bypass, missing-HEAD, FileNotFoundError vs unexpected OSError, invalid-JSON exit, unreadable staged file, multi-path mixed states; em-dash detection across content + new_string, HTML entity variants, ASCII hyphen pass, double-hyphen pass, CHANGELOG.md exclusion (relative + absolute), empty content, missing tool_input, malformed JSON envelope.--last-failed -n0re-run).Review coverage
Pre-PR review by 9 agents (code-reviewer, python-reviewer, tool-parity-checker, docs-consistency, comment-quality-rot, conventions-enforcer, test-quality-reviewer, infra-reviewer, silent-failure-hunter). 15 valid findings actioned (2 Critical + 4 Major + 9 Medium). 3 false positives rejected with documented rationale. Triage table at
_audit/pre-pr-review/triage.md.Notable Critical fixes from the review:
_count_json_entriesreturned-1on parse failure -> a corrupt JSON baseline would silently slip paststaged > head. Now raisesInvalidBaselineErrorand exits 2.docs/reference/convention-gates.mdinventory was missing the new gate (the file's own header requires updates in the same PR).Self-test moment
The em-dash hook activated mid-session and tripped on its own commit attempt because the bash source contained literal
—patterns. Fixed by reconstructing the patterns at runtime via${AMP}mdash;, the same trick the Python pre-commit gate uses. Verified end-to-end by writing this PR description: every dash here is an ASCII hyphen.