chore: improve codebase-audit skill from 2026-05-03 lessons by Aureliolo · Pull Request #1732 · Aureliolo/synthorg

Aureliolo · 2026-05-03T15:42:59Z

Summary

Tightens the /codebase-audit skill based on lessons from the 2026-05-03 full run. Skill-only change (.claude/skills/codebase-audit/SKILL.md); no source code touched.

Changes

Agent 09 (unwired-settings) prompt rewrite -- was the dominant FP source (38% rate, 39 of 39 false positives in last run). New prompt enumerates all six legitimate consumption patterns: bridge config methods, composed-config methods (get_coordination_config()), direct ConfigResolver.get_int/float/str/bool calls, Pydantic config-model embedding, subscriber pattern, bootstrap-only / read-only-post-init flags. Adds a verification requirement that forces three independent searches before flagging.
Hard ban on agent scratch scripts -- in agent-prompt template + skill-level Rule Implement agent-to-agent messaging with channels and topics #10. Past run leaked 14+ helper .py files to project root, scripts/, and c:\tmp\ that triggered Pyright diagnostic floods in the main thread.
New Phase 7: Cleanup (mandatory) -- sweeps known leak filenames + greps for any new .py at root. Documents the 14 files seen in 2026-05-03 run as the seed sweep list.
Phase 0 robustness -- handles existing _audit/latest directory (Junction collision on Windows); explicitly bans Bash redirects (project's check_bash_no_write.sh hook blocks >, >>, 2>, &>); explicit no-cd reminder.
Phase 1 Renovate filter -- drops bot-managed Dependency Dashboard from injected open-issues list per feedback_open_issues_exclude_renovate.md memory rule.
Phase 4 mandatory file enumeration -- previous run hallucinated agent filenames (14-web-store-architecture-drift.md, 19-benchmark-regression.md) that never existed. INDEX-builder must enumerate actual files via Glob and grep counts.
Phase 5 DEFAULT: deduped ISSUES.md -- one row per issue class (e.g. all 107 09-unwired-settings.md findings collapse into one "Unwired settings" row). User shouldn't have to wade through 300+ raw findings to make a decision.
Validation batching strategy -- explicit heavy / mid / small file batching with target ~20-30 findings per validator.
Lessons-from-prior-runs section -- dated subsections (starting with 2026-05-03) so each rule's rationale is traceable. Future runs add new dated subsections; older entries stay.
Rules expansion -- Implement agent-to-agent messaging with channels and topics #10 no scratch scripts, Implement agent engine core with ExecutionLoop protocol integration (DESIGN_SPEC §3.1, §6.1, §6.5) #11 no Bash redirects, Implement hierarchical delegation (task flows down, results flow up) #12 surface FP-rate signals end-of-run.

Test plan

Skill is a docs-only file (no source code). Validated by re-reading the skill end-to-end and grep-checking that all four key edits landed (DO NOT write helper, Phase 7: Cleanup, Lessons from prior runs, agent 09 FP guard).

Review coverage

Quick mode: docs-only change auto-skipped agent review. Automated checks N/A (no Python, web, or Go files changed). Pre-commit hooks passed at commit time (em-dash, large-file, secrets, branch protection, etc.).

Closes

No linked issue (skill self-improvement from observed run).

- Fix agent 09 (unwired-settings) FP rate (was 38%, drove 39 of 39 FPs). Recognise direct ConfigResolver.get_int/float/str calls, composed-config methods (get_coordination_config()), Pydantic config-model embedding, subscriber pattern, and bootstrap-only flags as legitimate consumption. - Hard-prohibit agent scratch scripts: ban writing helper *.py to project root, scripts/, c:/tmp/, /tmp/. Past run leaked 14+ scratch files that triggered Pyright diagnostic floods. - Add Phase 7: Cleanup (mandatory) to sweep agent-leaked scripts. - Phase 0: handle existing _audit/latest dir (Junction collision); ban Bash redirects (project hook blocks them); explicit no-cd reminder. - Phase 1: filter Renovate Dependency Dashboard from injected issue list. - Phase 4: MANDATORY enumerate actual finding files via Glob; previous run hallucinated agent filenames in INDEX. - Phase 5 DEFAULT: produce deduped ISSUES.md (one row per issue class) before triage prompt. User shouldn't have to wade through 300+ raw findings to make a decision. - Validation batching strategy made concrete (heavy/mid/small files). - Lessons-from-prior-runs section so each rule's rationale is traceable. - Rules expanded: no scratch scripts (#10), no Bash redirects (#11), surface FP-rate signals end-of-run (#12).

github-actions · 2026-05-03T15:43:10Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-05-03T15:43:11Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e4124ea0-d7d0-4780-a3cc-6daf766a7ecb

📥 Commits

Reviewing files that changed from the base of the PR and between 4a1d5e1 and 76d97ac.

📒 Files selected for processing (1)

.claude/skills/codebase-audit/SKILL.md

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Analyze (javascript-typescript)
GitHub Check: Analyze (python)

🧰 Additional context used

🪛 LanguageTool

.claude/skills/codebase-audit/SKILL.md

[style] ~3238-~3238: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...

(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)

🔇 Additional comments (1)

.claude/skills/codebase-audit/SKILL.md (1)

42-50: LGTM! Comprehensive and well-documented improvements.

The skill updates systematically address all issues from the 2026-05-03 audit run:

Phase 0 robustness: The setup command now handles Windows junctions, real directories, and symlinks correctly with the conditional removal pattern.

Agent 09 FP prevention: Expanding from one consumption pattern to six enumerated patterns with mandatory three-search verification directly targets the 38% false-positive rate observed in the prior run.

Batching strategy: The concrete Bash example and split guidance (heavy/mid/small files targeting ~20-30 findings per validator) are clear and mathematically correct.

INDEX.md enumeration: The mandatory file-enumeration requirement prevents the hallucinated-filename issue that occurred previously.

Phase 7 cleanup: The two-stage sweep (known leaks + time-based find) with robust prev_run logic will catch agent script leakage that contaminated the last run.

Lessons section: Documenting the specific 2026-05-03 issues with cross-references to the fixes ensures future maintainers understand the rationale.

All past review comments have been correctly addressed in the code, and the new rules (#10, #11, #12) formalize the constraints that prevent recurring problems.

Also applies to: 53-53, 68-68, 401-442, 3125-3143, 3180-3283, 3501-3530, 3548-3568

Walkthrough

Updated the codebase-audit skill spec across phases: Phase 0 exports RUN_DIR, creates "$RUN_DIR/findings", and relinks _audit/latest with conflict-aware handling and reinforced PreToolUse Bash constraints. Phase 1 now injects GitHub open-issue data (including author) and filters Renovate/bot noise. Agent prompts forbid writing helper/analysis Python scripts outside _audit/latest/findings/. Phase 3 adds grep-driven batching targeting ~20–30 findings per validator. Phase 4 requires INDEX.md from real filenames and grep counts. Phase 5 always emits _audit/latest/ISSUES.md and computes confirmed counts. Phase 7 mandates cleanup of leaked and newer .py scripts. Global rules disallow scratch scripts and Bash redirects and require surfacing high FP rates.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately reflects the main change: tightening the codebase-audit skill based on lessons from the 2026-05-03 incident.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, detailing all major improvements made to the SKILL.md file.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Review rate limit: 2/5 reviews remaining, refill in 24 minutes and 17 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Around line 3503-3504: The cleanup command in the snippet contains the
forbidden stderr redirect symbol "2>/dev/null"; remove that redirect and rely on
the existing "|| true" to swallow errors (or explicitly handle logging to an
allowed destination), so update the rm -f ... command to omit "2>/dev/null"
wherever it appears; specifically change the line containing the rm invocation
that includes "2>/dev/null" so it no longer uses stderr redirection.
- Line 42: The one-liner that creates RUN_DIR then tries to remove and relink
_audit/latest fails when _audit/latest is a directory/junction; update the
cleanup step to robustly remove whatever _audit/latest is before creating the
new symlink by replacing the plain "rm -f _audit/latest" usage with a safe
existence check and force recursive removal (e.g., test for existence of
_audit/latest and run a recursive remove) so the relink step always succeeds;
reference the RUN_DIR assignment and the `_audit/latest` target in the same
one-liner.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 36ad0deb-e5aa-4283-a3dc-32a977efbe74

📥 Commits

Reviewing files that changed from the base of the PR and between de0a1e1 and f6b72b6.

📒 Files selected for processing (1)

.claude/skills/codebase-audit/SKILL.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Analyze (go)
GitHub Check: Analyze (javascript-typescript)
GitHub Check: Analyze (python)
GitHub Check: Analyze (actions)

🧰 Additional context used

🪛 LanguageTool

.claude/skills/codebase-audit/SKILL.md

[style] ~3231-~3231: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...

(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)

[uncategorized] ~3496-~3496: Do not mix variants of the same word (‘finalize’ and ‘finalise’) within a single text.
Context: ...ns after triage / report-only output is finalized. Sweep agent-leaked scratch files from ...

(EN_WORD_COHERENCY)

gemini-code-assist

Code Review

This pull request enhances the codebase audit skill by implementing a mandatory cleanup phase, prohibiting scratch scripts and Bash redirects, and refining the 'unwired-settings' agent to minimize false positives. It also introduces a structured batching strategy for validation and a mandatory deduped issue list for triage. Feedback suggests improving shell command robustness by quoting variables, removing a prohibited stderr redirect in the cleanup script, and correcting the search logic to include the scripts directory as intended.

gemini-code-assist · 2026-05-03T15:44:53Z

-RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)"
-mkdir -p "$RUN_DIR/findings"
-ln -sfn "runs/$(basename "$RUN_DIR")" _audit/latest
+RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename $RUN_DIR)" _audit/latest && echo "RUN_DIR=$RUN_DIR"


The setup command has a regression where $RUN_DIR is no longer quoted inside the basename call. While the generated timestamp is unlikely to contain spaces, it is better practice to maintain quoting for robustness and consistency with the previous version of this skill.

Suggested change

RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename $RUN_DIR)" _audit/latest && echo "RUN_DIR=$RUN_DIR"

RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename "$RUN_DIR")" _audit/latest && echo "RUN_DIR=$RUN_DIR"

gemini-code-assist · 2026-05-03T15:44:53Z


+### Batching strategy (concrete)
+
+Run `Bash for f in _audit/latest/findings/*.md; do count=$(grep -cE "^### (critical|high|medium|low|info)" "$f"); echo "$count $(basename $f)"; done | sort -rn` to produce a sorted list of (count, filename). Then build batches:


The batching command uses $(basename $f) without quotes. This is unsafe if any filenames in the findings directory contain spaces. Although the agents are instructed to use specific filenames, quoting variables in shell commands is a best practice for robustness.

gemini-code-assist · 2026-05-03T15:44:53Z

+Run this Bash sweep:
+
+```bash
+rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py 2>/dev/null || true


The cleanup command uses a stderr redirect (2>/dev/null), which violates Rule 11 (line 3536) and the setup notes in Phase 0 (line 47) that explicitly ban all Bash redirects. Per the instructions in Rule 11, you should let stderr surface or rely on rm -f being silent for non-existent files, combined with || true to ensure the command chain continues.

Suggested change

rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py 2>/dev/null || true

rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py || true

gemini-code-assist · 2026-05-03T15:44:53Z

+Then list and remove anything else suspicious:
+
+```bash
+find . -maxdepth 2 -name "*.py" -newer _audit/runs/$(ls -1 _audit/runs/ | tail -2 | head -1)/findings -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*" -not -path "./scripts/*"


The find command excludes the ./scripts/* path, which contradicts the subsequent notes (line 3512) stating that files inside scripts/ should be listed and the user prompted before removal. Since the description (line 3498) specifically mentions that scripts leaked into scripts/ during previous runs, this exclusion will prevent the cleanup phase from identifying those leaks.

Suggested change

find . -maxdepth 2 -name "*.py" -newer _audit/runs/$(ls -1 _audit/runs/ | tail -2 | head -1)/findings -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*" -not -path "./scripts/*"

find . -maxdepth 2 -name "*.py" -newer _audit/runs/$(ls -1 _audit/runs/ | tail -2 | head -1)/findings -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*"

- Phase 0 setup: handle Junction/real-directory case for _audit/latest via inline conditional (rm -rf if directory but not symlink, else rm -f). CodeRabbit flagged plain rm -f would abort the chain on directory targets. - Phase 0 setup: quote $RUN_DIR in basename for defensive habit. Gemini. - Phase 7 cleanup: drop the stderr-discard redirect (violates new Rule #11 banning redirects). rm -f is silent on missing paths, so the redirect was redundant. CodeRabbit + Gemini. - Phase 7 cleanup: drop -not -path './scripts/*' from find sweep so the scripts/audit_pydantic_models leakage detected last run actually surfaces. Gemini. - Phase 7 batching strategy: quote $f in basename for filename safety. Gemini. Pre-reviewed by 2 external agents (CodeRabbit, Gemini), all 4 findings addressed.

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.claude/skills/codebase-audit/SKILL.md (1)

2-2: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep Claude/OpenCode command metadata in sync (152 vs 155 agents).

This file now advertises 155 agents, but .opencode/commands/codebase-audit.md still says 152. That breaks the documented parity contract and can mislead users depending on entrypoint.

Suggested parity update

--- a/.opencode/commands/codebase-audit.md
+++ b/.opencode/commands/codebase-audit.md
-description: "Full codebase audit: launches 152 specialized agents to find issues across Python/React/Go/docs/website, writes findings to _audit/latest/findings/, then triages with user"
+description: "Full codebase audit: launches 155 specialized agents to find issues across Python/React/Go/docs/website, writes findings to _audit/latest/findings/, then triages with user"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.claude/skills/codebase-audit/SKILL.md at line 2, The SKILL description
advertises "Full codebase audit: launches 155 specialized agents..." but the
command metadata still lists 152; update the command metadata value that
currently reads "152" to "155" so the numeric agent count matches the
description string, and then verify both the description string and the command
metadata are identical (155) across the two files.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 3127: The example command in the SKILL.md snippet incorrectly begins with
the literal token "Bash for f in ..." which is shell-invalid; update the example
so it either starts with a standard shell loop keyword (replace the leading
"Bash " with nothing so it begins "for f in ...") or show it executed via an
explicit shell invocation (e.g., prefix with a proper "bash -lc" invocation) and
ensure the rest of the pipeline (grep -cE, echo, sort -rn) and variable usage
(f, count) remain unchanged; modify the example line containing "Bash for f in
_audit/latest/findings/*.md; do count=$(...)" accordingly.
- Line 68: Update the gh query to include author metadata so Renovate-author
filtering can work: change the command string "gh issue list --state open
--limit 200 --json number,title,labels" to include author (e.g., --json
number,title,labels,author) and then apply the filter against author.login ===
"app/renovate" as described in feedback_open_issues_exclude_renovate.md; locate
usages of the literal "gh issue list --state open --limit 200 --json
number,title,labels" (and any helper that parses its JSON) and ensure they read
author.login before dropping issues authored by "app/renovate".
- Around line 3129-3134: The guidance is inconsistent: update the "Heavy files
(>25 findings)" bullet and its example so the example matches the stated chunk
size (~25) and target (~20-30). Locate the line starting with "Heavy files (>25
findings): dedicate 1 batch per ~25-finding chunk." and replace the example "a
107-finding file splits into batches of 35 + 36 + 36." with a correctly summed
split that uses ~25-finding chunks (for example "a 107-finding file splits into
batches of 25 + 25 + 25 + 32.") so the numbers align with the "~25" chunk
guidance and the overall "~20-30" target.
- Line 3556: The lesson note incorrectly states that `rm -f _audit/latest` fixes
symlink collisions; update the note to explain and show the robust,
platform-aware approach: when relinking `_audit/latest` handle symlinks, files,
and directory junctions separately (e.g., check if `_audit/latest` is a symlink
and unlink it, if it's a directory/junction remove it with rmdir/rm -rf only
after verifying it's not a real audit dir) and mention that `ln -sfn` alone can
fail on Windows junctions—reference the existing `ln -sfn` and `rm -f
_audit/latest` phrasing and the "Phase 0 setup command" so readers replace the
old single `rm -f` advice with the conditional removal pattern appropriate for
their OS.

---

Outside diff comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 2: The SKILL description advertises "Full codebase audit: launches 155
specialized agents..." but the command metadata still lists 152; update the
command metadata value that currently reads "152" to "155" so the numeric agent
count matches the description string, and then verify both the description
string and the command metadata are identical (155) across the two files.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 379acbe4-f492-4f2f-a062-509e336287f8

📥 Commits

Reviewing files that changed from the base of the PR and between f6b72b6 and 8585a58.

📒 Files selected for processing (1)

.claude/skills/codebase-audit/SKILL.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Analyze (go)
GitHub Check: Analyze (javascript-typescript)
GitHub Check: Analyze (python)

🧰 Additional context used

🪛 LanguageTool

.claude/skills/codebase-audit/SKILL.md

[style] ~3232-~3232: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...

(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)

[uncategorized] ~3497-~3497: Do not mix variants of the same word (‘finalize’ and ‘finalise’) within a single text.
Context: ...ns after triage / report-only output is finalized. Sweep agent-leaked scratch files from ...

(EN_WORD_COHERENCY)

- Phase 1 issue list: include author in --json fields so the Renovate filter (drops author.login == 'app/renovate') has metadata to match on. Without --json author the filter is a no-op. CodeRabbit. - Phase 7 batching strategy: drop the literal leading 'Bash' token from the example command (was 'Run `Bash for f in ...`'); shell-invalid as written. Reformatted as a proper fenced bash block. CodeRabbit. - Phase 7 batching strategy: align heavy-files example with stated ~25-finding chunk size. Was '107 splits into 35+36+36'; now '107 splits into 25+25+25+32' (4 batches). CodeRabbit. - Lessons section: rewrite the symlink-collision note to describe the actual conditional fix landed in round-1 (handles all 3 states: symlink, real-dir/Junction, missing) instead of the obsolete 'rm -f alone' claim. CodeRabbit. - .opencode mirror: bump '152' to '155' in description and adapter prose to match SKILL.md. CodeRabbit.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 3503: The sentence "Runs after triage / report-only output is finalized.
Sweep agent-leaked scratch files from the working tree." uses US spelling;
update it to British English by changing "finalized" to "finalised" so it reads
"Runs after triage / report-only output is finalised..." and ensure this
spelling aligns with the project's internationalisation convention (see the
string containing "Runs after triage / report-only output is finalized").

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7953b614-4e50-4b8e-8d98-c4ac9ca353c2

📥 Commits

Reviewing files that changed from the base of the PR and between 8585a58 and dd746be.

📒 Files selected for processing (2)

.claude/skills/codebase-audit/SKILL.md
.opencode/commands/codebase-audit.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Analyze (go)
GitHub Check: Analyze (javascript-typescript)
GitHub Check: Analyze (python)

🧰 Additional context used

🪛 LanguageTool

.claude/skills/codebase-audit/SKILL.md

[style] ~3238-~3238: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...

(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)

[uncategorized] ~3503-~3503: Do not mix variants of the same word (‘finalize’ and ‘finalise’) within a single text.
Context: ...ns after triage / report-only output is finalized. Sweep agent-leaked scratch files from ...

(EN_WORD_COHERENCY)

🔇 Additional comments (2)

.claude/skills/codebase-audit/SKILL.md (1)

42-42: Past review feedback genuinely addressed.

Verified that all 6 items flagged in previous reviews are now correctly fixed in the current code:

Line 42: Conditional _audit/latest removal handles symlink vs directory/junction

Line 68: gh issue list includes --json ...author for Renovate filtering

Line 3130: Bash loop starts with for f in (no invalid leading token)

Line 3135: Batch example math correct (25+25+25+32 = 107)

Line 3510: Cleanup command uses || true, no stderr redirect

Line 3562: Lesson note documents the full conditional-removal approach

Also applies to: 68-68, 3130-3130, 3135-3135, 3510-3510, 3562-3562

.opencode/commands/codebase-audit.md (1)

2-2: LGTM: Agent count updated consistently.

The adapter correctly reflects the skill's updated agent count (155) in both the description and the spawning instructions. The change aligns with the main skill file's expansion from 152 to 155 agents.

Also applies to: 11-11

- Phase 7: 'finalized' (US spelling) becomes 'finalised' (UK). Project ships in British English per docs/design/internationalization.md and the internal Agent 37 directive. CodeRabbit + LanguageTool flagged. - Phase 7: harden the find -newer command for first/second-run cases. Was 'tail -2 | head -1' which compares against itself on first run (no findings) and is fragile in general. Now 'ls -1t | sed -n 2p' picks the second-newest run by mtime, the '[ -n $prev_run ] &&' guard short-circuits the first-run case where no previous run exists, and the path is now quoted. CodeRabbit.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

.claude/skills/codebase-audit/SKILL.md (1)

3518-3523: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

First-run cleanup probe can still return non-zero and fail the phase.

When there is no previous run, [ -n "$prev_run" ] returns exit code 1; in Bash-tool execution this is treated as failure even though “no previous run” should be a valid no-op path.

Suggested robust command form

-prev_run=$(ls -1t _audit/runs/ | sed -n '2p') && [ -n "$prev_run" ] && find . -maxdepth 2 -name "*.py" -newer "_audit/runs/$prev_run/findings" -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*"
+prev_run=$(ls -1t _audit/runs/ | sed -n '2p'); if [ -n "$prev_run" ]; then find . -maxdepth 2 -name "*.py" -newer "_audit/runs/$prev_run/findings" -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*"; fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.claude/skills/codebase-audit/SKILL.md around lines 3518 - 3523, The current
one-liner using prev_run=$(...) && [ -n "$prev_run" ] && find ... can return a
non-zero exit when there is no previous run; change the logic to explicitly
handle the "no previous run" case so the step returns success: after computing
prev_run, use an if/then check (if [ -n "$prev_run" ]; then run the find command
referencing "_audit/runs/$prev_run/findings"; fi) or ensure the entire probe
always exits zero by appending a safe fallback (e.g., run the find only when
prev_run is non-empty or wrap the test in a construct that does not propagate a
non-zero exit). Reference the prev_run variable and the find invocation to
locate where to apply this defensive check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 53: Update the `.gitignore` check to match entries with optional
leading/trailing slashes so entries like `/_audit/` aren't missed: replace the
current grep pattern that only anchors `^_audit` with a regex that allows an
optional leading slash and either a trailing slash or end-of-line (i.e., match
`/_audit/`, `_audit/`, or `_audit`), and update the doc line that runs the grep
accordingly; refer to the `_audit` token and the `.gitignore` check in the
SKILL.md content when making the change.

---

Duplicate comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Around line 3518-3523: The current one-liner using prev_run=$(...) && [ -n
"$prev_run" ] && find ... can return a non-zero exit when there is no previous
run; change the logic to explicitly handle the "no previous run" case so the
step returns success: after computing prev_run, use an if/then check (if [ -n
"$prev_run" ]; then run the find command referencing
"_audit/runs/$prev_run/findings"; fi) or ensure the entire probe always exits
zero by appending a safe fallback (e.g., run the find only when prev_run is
non-empty or wrap the test in a construct that does not propagate a non-zero
exit). Reference the prev_run variable and the find invocation to locate where
to apply this defensive check.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 039dfad0-1a14-4acc-bf27-a010aaa8d237

📥 Commits

Reviewing files that changed from the base of the PR and between dd746be and 4a1d5e1.

📒 Files selected for processing (1)

.claude/skills/codebase-audit/SKILL.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Analyze (javascript-typescript)
GitHub Check: Analyze (go)
GitHub Check: Analyze (python)

🧰 Additional context used

🪛 LanguageTool

.claude/skills/codebase-audit/SKILL.md

[style] ~3238-~3238: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...

(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)

🔇 Additional comments (1)

.claude/skills/codebase-audit/SKILL.md (1)

401-442: Strong FP guardrail rewrite for Agent 09.

The six consumption-pattern checklist plus mandatory multi-search verification is a solid improvement and should materially reduce false positives in unwired-settings.

- Phase 0 .gitignore check: broaden grep pattern from '^_audit' (which misses '/_audit/' style entries) to a regex that matches all four common forms ('_audit', '_audit/', '/_audit', '/_audit/'). Prevents false-negative duplicate-add actions. CodeRabbit.

) ## Summary Adds Rules **R-A through R-E** to the `/codebase-audit` agent-prompt template. Each is grounded in a specific FP category observed in the 2026-05-03 audit run (12 categories tracked during the post-run walk-through). Skill-only change; no source code touched. ## Why After PR #1732 (round-1 fixes), four agents still produced FPs at significant rates because their prompts shared the same underlying gaps: - Agents flag patterns without verifying the surrounding state (existing fixes, docstring carve-outs, caller distribution, semantic equivalence). - Convention-scope findings cite N cases without estimating total population, so the user can't choose between surgical fix and convention rollout. - Mock-drift / API-duplication findings calibrate severity without sampling actual usage. These five rules are mandatory for every audit agent. ## The 5 rules ### R-A: Verify the proposed fix isn't already in place Catches: - Memory-leak findings whose cleanup hook IS already registered (agent 97 Site 2 was a FP because `cancelPendingPersist` is in `test-setup.tsx:272`). - NotImplementedError-in-mixin findings overridden in concrete subclass (agent 18 had 2/3 sample FP rate). - **Ghost-wired settings (most severe class)**: agent 09 flagged 13 settings dead; 11 of 13 were consumed by living machinery whose owning service was simply never instantiated at boot. Import-graph trace missed this entirely. The new rule requires tracing through `lifecycle_helpers.py` / `app.py` startup wiring. ### R-B: Read existing helper docstrings for design carve-outs Catches: agent 138 flagged 5 inline retry loops as needing centralisation, but `GeneralRetryHandler`'s module docstring explicitly carved out 4 of the 5 (semantic self-correction, contention, sync-thread). Reading the docstring catches design intent. ### R-C: Scope estimate alongside cited cases Catches: agent 34 cited 12 plain-Exception classes; actual project-wide population is ~80-120 across 19 modules. User's surgical-vs-convention-rollout decision depends on knowing the gap. New rule: agents emit "12 cited; sample suggests ~80-120 across N modules". ### R-D: Semantic equivalence before flagging duplicates Catches: agent 110 flagged a pairwise-compare cluster as duplicate when one used field-equality and the other used text-similarity (identical iteration, divergent inner predicate). Agent 146 conflated intentional bootstrap multi-surface settings with accidental duplication (4 of 8 flagged settings were intentional). ### R-E: Caller-context distribution for API duplication findings Catches: AuthService mixed sync/async API was high-severity in raw findings; actual distribution was 100% production-async + test-fixture-sync, which made the choice obvious. New rule: agents emit caller distribution before recommending the fix shape. ## Test plan Skill is docs-only. Validated by: - Reading the 5 new rules end-to-end for clarity + grounding. - Each rule cites the concrete 2026-05-03 example so future agents can recognise the pattern. ## Out of scope A separate master tracking issue (filed alongside this PR) captures the remaining session learnings: 5 follow-up issues for larger workstreams (config layering, error-handling convention rollout, no-hardcoded-values rule, bootstrap-wiring lint, scoring research), Group 1+2+3 decisions with research-backed rationale, and the meta-process plan (scaffolding templates, pre-PR mini-audit, worktree do-not-introduce, convention-must-ship-its-gate). ## Closes No linked issue (skill self-improvement).

## Highlights > _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._ ### What you'll notice - Release notes now include AI-generated Highlights and commit subject/body in dev releases. ### What's new - Trace added to detect ghost-wired settings for better linting feedback. ### Under the hood - Docker publish step now retries transient GHCR errors to improve release stability. - Codebase audit incorporated 21 mechanical fixes from recent reviews to improve quality. - Added false-positive prevention rules to codebase audit prompts to reduce errors. - Enhanced codebase audit skill based on insights from latest audits. - Web component tightened async-leak detection and improved jsdom Storage timer handling.  :robot: I have created a release *beep* *boop* --- ## [0.7.9](v0.7.8...v0.7.9) (2026-05-04) ### Features * **ci:** promote AI Highlights to release body, add commit subject/body to dev releases ([#1743](#1743)) ([70bffe9](70bffe9)) * **lint:** bootstrap-wiring trace for ghost-wired settings ([#1742](#1742)) ([6ddbb86](6ddbb86)), closes [#1737](#1737) ### Bug Fixes * **ci:** retry transient GHCR errors in docker publish + retag ([#1741](#1741)) ([18e5349](18e5349)) ### Maintenance * 2026-05-03 Bucket A audit ([#1733](#1733)): 21 mechanical fixes ([#1744](#1744)) ([334262b](334262b)) * add 5 FP-prevention rules to /codebase-audit agent prompts ([#1734](#1734)) ([8f33ad6](8f33ad6)) * improve codebase-audit skill from 2026-05-03 lessons ([#1732](#1732)) ([7281cd7](7281cd7)) * **web:** tighten async-leak ceiling, bypass jsdom Storage timer dispatch ([#1728](#1728)) ([de0a1e1](de0a1e1)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

coderabbitai Bot requested changes May 3, 2026

View reviewed changes

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

gemini-code-assist Bot reviewed May 3, 2026

View reviewed changes

coderabbitai Bot requested changes May 3, 2026

View reviewed changes

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

coderabbitai Bot requested changes May 3, 2026

View reviewed changes

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

Comment thread .claude/skills/codebase-audit/SKILL.md

coderabbitai Bot requested changes May 3, 2026

View reviewed changes

Comment thread .claude/skills/codebase-audit/SKILL.md Outdated

coderabbitai Bot approved these changes May 3, 2026

View reviewed changes

Aureliolo merged commit 7281cd7 into main May 3, 2026
41 checks passed

Aureliolo deleted the chore/codebase-audit-skill-improvements branch May 3, 2026 16:50

synthorg-repo-bot Bot mentioned this pull request May 3, 2026

chore(main): release 0.7.9 #1731

Merged

This was referenced May 3, 2026

chore: add 5 FP-prevention rules to /codebase-audit agent prompts #1734

Merged

Audit 2026-05-03 follow-ups: decisions, FP-prevention, meta-process plan (handoff) #1735

Closed

	RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename $RUN_DIR)" _audit/latest && echo "RUN_DIR=$RUN_DIR"
	RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename "$RUN_DIR")" _audit/latest && echo "RUN_DIR=$RUN_DIR"


		### Batching strategy (concrete)

		Run `Bash for f in _audit/latest/findings/*.md; do count=$(grep -cE "^### (critical\|high\|medium\|low\|info)" "$f"); echo "$count $(basename $f)"; done \| sort -rn` to produce a sorted list of (count, filename). Then build batches:

	rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py 2>/dev/null \|\| true
	rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py \|\| true

	find . -maxdepth 2 -name ".py" -newer _audit/runs/$(ls -1 _audit/runs/ \| tail -2 \| head -1)/findings -not -path "./src/" -not -path "./tests/" -not -path "./web/" -not -path "./cli/" -not -path "./scripts/"
	find . -maxdepth 2 -name ".py" -newer _audit/runs/$(ls -1 _audit/runs/ \| tail -2 \| head -1)/findings -not -path "./src/" -not -path "./tests/" -not -path "./web/" -not -path "./cli/*"

Conversation

Aureliolo commented May 3, 2026

Summary

Changes

Test plan

Review coverage

Closes

Uh oh!

github-actions Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

Scanned Files

Uh oh!

coderabbitai Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented May 3, 2026 •

edited

Loading

coderabbitai Bot commented May 3, 2026 •

edited

Loading