chore: improve codebase-audit skill from 2026-05-03 lessons#1732
Conversation
- Fix agent 09 (unwired-settings) FP rate (was 38%, drove 39 of 39 FPs). Recognise direct ConfigResolver.get_int/float/str calls, composed-config methods (get_coordination_config()), Pydantic config-model embedding, subscriber pattern, and bootstrap-only flags as legitimate consumption. - Hard-prohibit agent scratch scripts: ban writing helper *.py to project root, scripts/, c:/tmp/, /tmp/. Past run leaked 14+ scratch files that triggered Pyright diagnostic floods. - Add Phase 7: Cleanup (mandatory) to sweep agent-leaked scripts. - Phase 0: handle existing _audit/latest dir (Junction collision); ban Bash redirects (project hook blocks them); explicit no-cd reminder. - Phase 1: filter Renovate Dependency Dashboard from injected issue list. - Phase 4: MANDATORY enumerate actual finding files via Glob; previous run hallucinated agent filenames in INDEX. - Phase 5 DEFAULT: produce deduped ISSUES.md (one row per issue class) before triage prompt. User shouldn't have to wade through 300+ raw findings to make a decision. - Validation batching strategy made concrete (heavy/mid/small files). - Lessons-from-prior-runs section so each rule's rationale is traceable. - Rules expanded: no scratch scripts (#10), no Bash redirects (#11), surface FP-rate signals end-of-run (#12).
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🧰 Additional context used🪛 LanguageTool.claude/skills/codebase-audit/SKILL.md[style] ~3238-~3238: ‘make a decision’ might be wordy. Consider a shorter alternative. (EN_WORDINESS_PREMIUM_MAKE_A_DECISION) 🔇 Additional comments (1)
WalkthroughUpdated the codebase-audit skill spec across phases: Phase 0 exports 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 2/5 reviews remaining, refill in 24 minutes and 17 seconds. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Around line 3503-3504: The cleanup command in the snippet contains the
forbidden stderr redirect symbol "2>/dev/null"; remove that redirect and rely on
the existing "|| true" to swallow errors (or explicitly handle logging to an
allowed destination), so update the rm -f ... command to omit "2>/dev/null"
wherever it appears; specifically change the line containing the rm invocation
that includes "2>/dev/null" so it no longer uses stderr redirection.
- Line 42: The one-liner that creates RUN_DIR then tries to remove and relink
_audit/latest fails when _audit/latest is a directory/junction; update the
cleanup step to robustly remove whatever _audit/latest is before creating the
new symlink by replacing the plain "rm -f _audit/latest" usage with a safe
existence check and force recursive removal (e.g., test for existence of
_audit/latest and run a recursive remove) so the relink step always succeeds;
reference the RUN_DIR assignment and the `_audit/latest` target in the same
one-liner.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 36ad0deb-e5aa-4283-a3dc-32a977efbe74
📒 Files selected for processing (1)
.claude/skills/codebase-audit/SKILL.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (actions)
🧰 Additional context used
🪛 LanguageTool
.claude/skills/codebase-audit/SKILL.md
[style] ~3231-~3231: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...
(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)
[uncategorized] ~3496-~3496: Do not mix variants of the same word (‘finalize’ and ‘finalise’) within a single text.
Context: ...ns after triage / report-only output is finalized. Sweep agent-leaked scratch files from ...
(EN_WORD_COHERENCY)
There was a problem hiding this comment.
Code Review
This pull request enhances the codebase audit skill by implementing a mandatory cleanup phase, prohibiting scratch scripts and Bash redirects, and refining the 'unwired-settings' agent to minimize false positives. It also introduces a structured batching strategy for validation and a mandatory deduped issue list for triage. Feedback suggests improving shell command robustness by quoting variables, removing a prohibited stderr redirect in the cleanup script, and correcting the search logic to include the scripts directory as intended.
| RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" | ||
| mkdir -p "$RUN_DIR/findings" | ||
| ln -sfn "runs/$(basename "$RUN_DIR")" _audit/latest | ||
| RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename $RUN_DIR)" _audit/latest && echo "RUN_DIR=$RUN_DIR" |
There was a problem hiding this comment.
The setup command has a regression where $RUN_DIR is no longer quoted inside the basename call. While the generated timestamp is unlikely to contain spaces, it is better practice to maintain quoting for robustness and consistency with the previous version of this skill.
| RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename $RUN_DIR)" _audit/latest && echo "RUN_DIR=$RUN_DIR" | |
| RUN_DIR="_audit/runs/$(date +%Y-%m-%d-%H%M%S)" && mkdir -p "$RUN_DIR/findings" && rm -f _audit/latest && ln -sfn "runs/$(basename "$RUN_DIR")" _audit/latest && echo "RUN_DIR=$RUN_DIR" |
|
|
||
| ### Batching strategy (concrete) | ||
|
|
||
| Run `Bash for f in _audit/latest/findings/*.md; do count=$(grep -cE "^### (critical|high|medium|low|info)" "$f"); echo "$count $(basename $f)"; done | sort -rn` to produce a sorted list of (count, filename). Then build batches: |
There was a problem hiding this comment.
| Run this Bash sweep: | ||
|
|
||
| ```bash | ||
| rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py 2>/dev/null || true |
There was a problem hiding this comment.
The cleanup command uses a stderr redirect (2>/dev/null), which violates Rule 11 (line 3536) and the setup notes in Phase 0 (line 47) that explicitly ban all Bash redirects. Per the instructions in Rule 11, you should let stderr surface or rely on rm -f being silent for non-existent files, combined with || true to ensure the command chain continues.
| rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py 2>/dev/null || true | |
| rm -f find_missing_logging.py find_missing_logging_filtered.py parse_audit.py validate_config_examples.py audit_diff.py audit_parity.py check_docs.py check_rate_limits.py circular_dep_analyzer.py check_protocols.py debug_scanner.py detailed_check.py final_audit.py find_unwired.py test_regex.py validate_configs.py verify_final.py verify_protocols.py || true |
| Then list and remove anything else suspicious: | ||
|
|
||
| ```bash | ||
| find . -maxdepth 2 -name "*.py" -newer _audit/runs/$(ls -1 _audit/runs/ | tail -2 | head -1)/findings -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*" -not -path "./scripts/*" |
There was a problem hiding this comment.
The find command excludes the ./scripts/* path, which contradicts the subsequent notes (line 3512) stating that files inside scripts/ should be listed and the user prompted before removal. Since the description (line 3498) specifically mentions that scripts leaked into scripts/ during previous runs, this exclusion will prevent the cleanup phase from identifying those leaks.
| find . -maxdepth 2 -name "*.py" -newer _audit/runs/$(ls -1 _audit/runs/ | tail -2 | head -1)/findings -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*" -not -path "./scripts/*" | |
| find . -maxdepth 2 -name "*.py" -newer _audit/runs/$(ls -1 _audit/runs/ | tail -2 | head -1)/findings -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*" |
- Phase 0 setup: handle Junction/real-directory case for _audit/latest via inline conditional (rm -rf if directory but not symlink, else rm -f). CodeRabbit flagged plain rm -f would abort the chain on directory targets. - Phase 0 setup: quote $RUN_DIR in basename for defensive habit. Gemini. - Phase 7 cleanup: drop the stderr-discard redirect (violates new Rule #11 banning redirects). rm -f is silent on missing paths, so the redirect was redundant. CodeRabbit + Gemini. - Phase 7 cleanup: drop -not -path './scripts/*' from find sweep so the scripts/audit_pydantic_models leakage detected last run actually surfaces. Gemini. - Phase 7 batching strategy: quote $f in basename for filename safety. Gemini. Pre-reviewed by 2 external agents (CodeRabbit, Gemini), all 4 findings addressed.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.claude/skills/codebase-audit/SKILL.md (1)
2-2:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep Claude/OpenCode command metadata in sync (152 vs 155 agents).
This file now advertises 155 agents, but
.opencode/commands/codebase-audit.mdstill says 152. That breaks the documented parity contract and can mislead users depending on entrypoint.Suggested parity update
--- a/.opencode/commands/codebase-audit.md +++ b/.opencode/commands/codebase-audit.md -description: "Full codebase audit: launches 152 specialized agents to find issues across Python/React/Go/docs/website, writes findings to _audit/latest/findings/, then triages with user" +description: "Full codebase audit: launches 155 specialized agents to find issues across Python/React/Go/docs/website, writes findings to _audit/latest/findings/, then triages with user"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/codebase-audit/SKILL.md at line 2, The SKILL description advertises "Full codebase audit: launches 155 specialized agents..." but the command metadata still lists 152; update the command metadata value that currently reads "152" to "155" so the numeric agent count matches the description string, and then verify both the description string and the command metadata are identical (155) across the two files.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 3127: The example command in the SKILL.md snippet incorrectly begins with
the literal token "Bash for f in ..." which is shell-invalid; update the example
so it either starts with a standard shell loop keyword (replace the leading
"Bash " with nothing so it begins "for f in ...") or show it executed via an
explicit shell invocation (e.g., prefix with a proper "bash -lc" invocation) and
ensure the rest of the pipeline (grep -cE, echo, sort -rn) and variable usage
(f, count) remain unchanged; modify the example line containing "Bash for f in
_audit/latest/findings/*.md; do count=$(...)" accordingly.
- Line 68: Update the gh query to include author metadata so Renovate-author
filtering can work: change the command string "gh issue list --state open
--limit 200 --json number,title,labels" to include author (e.g., --json
number,title,labels,author) and then apply the filter against author.login ===
"app/renovate" as described in feedback_open_issues_exclude_renovate.md; locate
usages of the literal "gh issue list --state open --limit 200 --json
number,title,labels" (and any helper that parses its JSON) and ensure they read
author.login before dropping issues authored by "app/renovate".
- Around line 3129-3134: The guidance is inconsistent: update the "Heavy files
(>25 findings)" bullet and its example so the example matches the stated chunk
size (~25) and target (~20-30). Locate the line starting with "Heavy files (>25
findings): dedicate 1 batch per ~25-finding chunk." and replace the example "a
107-finding file splits into batches of 35 + 36 + 36." with a correctly summed
split that uses ~25-finding chunks (for example "a 107-finding file splits into
batches of 25 + 25 + 25 + 32.") so the numbers align with the "~25" chunk
guidance and the overall "~20-30" target.
- Line 3556: The lesson note incorrectly states that `rm -f _audit/latest` fixes
symlink collisions; update the note to explain and show the robust,
platform-aware approach: when relinking `_audit/latest` handle symlinks, files,
and directory junctions separately (e.g., check if `_audit/latest` is a symlink
and unlink it, if it's a directory/junction remove it with rmdir/rm -rf only
after verifying it's not a real audit dir) and mention that `ln -sfn` alone can
fail on Windows junctions—reference the existing `ln -sfn` and `rm -f
_audit/latest` phrasing and the "Phase 0 setup command" so readers replace the
old single `rm -f` advice with the conditional removal pattern appropriate for
their OS.
---
Outside diff comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 2: The SKILL description advertises "Full codebase audit: launches 155
specialized agents..." but the command metadata still lists 152; update the
command metadata value that currently reads "152" to "155" so the numeric agent
count matches the description string, and then verify both the description
string and the command metadata are identical (155) across the two files.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 379acbe4-f492-4f2f-a062-509e336287f8
📒 Files selected for processing (1)
.claude/skills/codebase-audit/SKILL.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (python)
🧰 Additional context used
🪛 LanguageTool
.claude/skills/codebase-audit/SKILL.md
[style] ~3232-~3232: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...
(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)
[uncategorized] ~3497-~3497: Do not mix variants of the same word (‘finalize’ and ‘finalise’) within a single text.
Context: ...ns after triage / report-only output is finalized. Sweep agent-leaked scratch files from ...
(EN_WORD_COHERENCY)
- Phase 1 issue list: include author in --json fields so the Renovate filter (drops author.login == 'app/renovate') has metadata to match on. Without --json author the filter is a no-op. CodeRabbit. - Phase 7 batching strategy: drop the literal leading 'Bash' token from the example command (was 'Run `Bash for f in ...`'); shell-invalid as written. Reformatted as a proper fenced bash block. CodeRabbit. - Phase 7 batching strategy: align heavy-files example with stated ~25-finding chunk size. Was '107 splits into 35+36+36'; now '107 splits into 25+25+25+32' (4 batches). CodeRabbit. - Lessons section: rewrite the symlink-collision note to describe the actual conditional fix landed in round-1 (handles all 3 states: symlink, real-dir/Junction, missing) instead of the obsolete 'rm -f alone' claim. CodeRabbit. - .opencode mirror: bump '152' to '155' in description and adapter prose to match SKILL.md. CodeRabbit.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 3503: The sentence "Runs after triage / report-only output is finalized.
Sweep agent-leaked scratch files from the working tree." uses US spelling;
update it to British English by changing "finalized" to "finalised" so it reads
"Runs after triage / report-only output is finalised..." and ensure this
spelling aligns with the project's internationalisation convention (see the
string containing "Runs after triage / report-only output is finalized").
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7953b614-4e50-4b8e-8d98-c4ac9ca353c2
📒 Files selected for processing (2)
.claude/skills/codebase-audit/SKILL.md.opencode/commands/codebase-audit.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (python)
🧰 Additional context used
🪛 LanguageTool
.claude/skills/codebase-audit/SKILL.md
[style] ~3238-~3238: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...
(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)
[uncategorized] ~3503-~3503: Do not mix variants of the same word (‘finalize’ and ‘finalise’) within a single text.
Context: ...ns after triage / report-only output is finalized. Sweep agent-leaked scratch files from ...
(EN_WORD_COHERENCY)
🔇 Additional comments (2)
.claude/skills/codebase-audit/SKILL.md (1)
42-42: Past review feedback genuinely addressed.Verified that all 6 items flagged in previous reviews are now correctly fixed in the current code:
- Line 42: Conditional
_audit/latestremoval handles symlink vs directory/junction- Line 68:
gh issue listincludes--json ...authorfor Renovate filtering- Line 3130: Bash loop starts with
for f in(no invalid leading token)- Line 3135: Batch example math correct (25+25+25+32 = 107)
- Line 3510: Cleanup command uses
|| true, no stderr redirect- Line 3562: Lesson note documents the full conditional-removal approach
Also applies to: 68-68, 3130-3130, 3135-3135, 3510-3510, 3562-3562
.opencode/commands/codebase-audit.md (1)
2-2: LGTM: Agent count updated consistently.The adapter correctly reflects the skill's updated agent count (155) in both the description and the spawning instructions. The change aligns with the main skill file's expansion from 152 to 155 agents.
Also applies to: 11-11
- Phase 7: 'finalized' (US spelling) becomes 'finalised' (UK). Project ships in British English per docs/design/internationalization.md and the internal Agent 37 directive. CodeRabbit + LanguageTool flagged. - Phase 7: harden the find -newer command for first/second-run cases. Was 'tail -2 | head -1' which compares against itself on first run (no findings) and is fragile in general. Now 'ls -1t | sed -n 2p' picks the second-newest run by mtime, the '[ -n $prev_run ] &&' guard short-circuits the first-run case where no previous run exists, and the path is now quoted. CodeRabbit.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
.claude/skills/codebase-audit/SKILL.md (1)
3518-3523:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFirst-run cleanup probe can still return non-zero and fail the phase.
When there is no previous run,
[ -n "$prev_run" ]returns exit code1; in Bash-tool execution this is treated as failure even though “no previous run” should be a valid no-op path.Suggested robust command form
-prev_run=$(ls -1t _audit/runs/ | sed -n '2p') && [ -n "$prev_run" ] && find . -maxdepth 2 -name "*.py" -newer "_audit/runs/$prev_run/findings" -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*" +prev_run=$(ls -1t _audit/runs/ | sed -n '2p'); if [ -n "$prev_run" ]; then find . -maxdepth 2 -name "*.py" -newer "_audit/runs/$prev_run/findings" -not -path "./src/*" -not -path "./tests/*" -not -path "./web/*" -not -path "./cli/*"; fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/codebase-audit/SKILL.md around lines 3518 - 3523, The current one-liner using prev_run=$(...) && [ -n "$prev_run" ] && find ... can return a non-zero exit when there is no previous run; change the logic to explicitly handle the "no previous run" case so the step returns success: after computing prev_run, use an if/then check (if [ -n "$prev_run" ]; then run the find command referencing "_audit/runs/$prev_run/findings"; fi) or ensure the entire probe always exits zero by appending a safe fallback (e.g., run the find only when prev_run is non-empty or wrap the test in a construct that does not propagate a non-zero exit). Reference the prev_run variable and the find invocation to locate where to apply this defensive check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Line 53: Update the `.gitignore` check to match entries with optional
leading/trailing slashes so entries like `/_audit/` aren't missed: replace the
current grep pattern that only anchors `^_audit` with a regex that allows an
optional leading slash and either a trailing slash or end-of-line (i.e., match
`/_audit/`, `_audit/`, or `_audit`), and update the doc line that runs the grep
accordingly; refer to the `_audit` token and the `.gitignore` check in the
SKILL.md content when making the change.
---
Duplicate comments:
In @.claude/skills/codebase-audit/SKILL.md:
- Around line 3518-3523: The current one-liner using prev_run=$(...) && [ -n
"$prev_run" ] && find ... can return a non-zero exit when there is no previous
run; change the logic to explicitly handle the "no previous run" case so the
step returns success: after computing prev_run, use an if/then check (if [ -n
"$prev_run" ]; then run the find command referencing
"_audit/runs/$prev_run/findings"; fi) or ensure the entire probe always exits
zero by appending a safe fallback (e.g., run the find only when prev_run is
non-empty or wrap the test in a construct that does not propagate a non-zero
exit). Reference the prev_run variable and the find invocation to locate where
to apply this defensive check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 039dfad0-1a14-4acc-bf27-a010aaa8d237
📒 Files selected for processing (1)
.claude/skills/codebase-audit/SKILL.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (python)
🧰 Additional context used
🪛 LanguageTool
.claude/skills/codebase-audit/SKILL.md
[style] ~3238-~3238: ‘make a decision’ might be wordy. Consider a shorter alternative.
Context: ...staring at 300+ raw findings and cannot make a decision. ### Phase 5 DEFAULT: Deduped Issue Li...
(EN_WORDINESS_PREMIUM_MAKE_A_DECISION)
🔇 Additional comments (1)
.claude/skills/codebase-audit/SKILL.md (1)
401-442: Strong FP guardrail rewrite for Agent 09.The six consumption-pattern checklist plus mandatory multi-search verification is a solid improvement and should materially reduce false positives in
unwired-settings.
- Phase 0 .gitignore check: broaden grep pattern from '^_audit'
(which misses '/_audit/' style entries) to a regex that matches
all four common forms ('_audit', '_audit/', '/_audit', '/_audit/').
Prevents false-negative duplicate-add actions. CodeRabbit.
) ## Summary Adds Rules **R-A through R-E** to the `/codebase-audit` agent-prompt template. Each is grounded in a specific FP category observed in the 2026-05-03 audit run (12 categories tracked during the post-run walk-through). Skill-only change; no source code touched. ## Why After PR #1732 (round-1 fixes), four agents still produced FPs at significant rates because their prompts shared the same underlying gaps: - Agents flag patterns without verifying the surrounding state (existing fixes, docstring carve-outs, caller distribution, semantic equivalence). - Convention-scope findings cite N cases without estimating total population, so the user can't choose between surgical fix and convention rollout. - Mock-drift / API-duplication findings calibrate severity without sampling actual usage. These five rules are mandatory for every audit agent. ## The 5 rules ### R-A: Verify the proposed fix isn't already in place Catches: - Memory-leak findings whose cleanup hook IS already registered (agent 97 Site 2 was a FP because `cancelPendingPersist` is in `test-setup.tsx:272`). - NotImplementedError-in-mixin findings overridden in concrete subclass (agent 18 had 2/3 sample FP rate). - **Ghost-wired settings (most severe class)**: agent 09 flagged 13 settings dead; 11 of 13 were consumed by living machinery whose owning service was simply never instantiated at boot. Import-graph trace missed this entirely. The new rule requires tracing through `lifecycle_helpers.py` / `app.py` startup wiring. ### R-B: Read existing helper docstrings for design carve-outs Catches: agent 138 flagged 5 inline retry loops as needing centralisation, but `GeneralRetryHandler`'s module docstring explicitly carved out 4 of the 5 (semantic self-correction, contention, sync-thread). Reading the docstring catches design intent. ### R-C: Scope estimate alongside cited cases Catches: agent 34 cited 12 plain-Exception classes; actual project-wide population is ~80-120 across 19 modules. User's surgical-vs-convention-rollout decision depends on knowing the gap. New rule: agents emit "12 cited; sample suggests ~80-120 across N modules". ### R-D: Semantic equivalence before flagging duplicates Catches: agent 110 flagged a pairwise-compare cluster as duplicate when one used field-equality and the other used text-similarity (identical iteration, divergent inner predicate). Agent 146 conflated intentional bootstrap multi-surface settings with accidental duplication (4 of 8 flagged settings were intentional). ### R-E: Caller-context distribution for API duplication findings Catches: AuthService mixed sync/async API was high-severity in raw findings; actual distribution was 100% production-async + test-fixture-sync, which made the choice obvious. New rule: agents emit caller distribution before recommending the fix shape. ## Test plan Skill is docs-only. Validated by: - Reading the 5 new rules end-to-end for clarity + grounding. - Each rule cites the concrete 2026-05-03 example so future agents can recognise the pattern. ## Out of scope A separate master tracking issue (filed alongside this PR) captures the remaining session learnings: 5 follow-up issues for larger workstreams (config layering, error-handling convention rollout, no-hardcoded-values rule, bootstrap-wiring lint, scoring research), Group 1+2+3 decisions with research-backed rationale, and the meta-process plan (scaffolding templates, pre-PR mini-audit, worktree do-not-introduce, convention-must-ship-its-gate). ## Closes No linked issue (skill self-improvement).
<!-- HIGHLIGHTS_START --> ## Highlights > _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._ ### What you'll notice - Release notes now include AI-generated Highlights and commit subject/body in dev releases. ### What's new - Trace added to detect ghost-wired settings for better linting feedback. ### Under the hood - Docker publish step now retries transient GHCR errors to improve release stability. - Codebase audit incorporated 21 mechanical fixes from recent reviews to improve quality. - Added false-positive prevention rules to codebase audit prompts to reduce errors. - Enhanced codebase audit skill based on insights from latest audits. - Web component tightened async-leak detection and improved jsdom Storage timer handling. <!-- HIGHLIGHTS_END --> :robot: I have created a release *beep* *boop* --- ## [0.7.9](v0.7.8...v0.7.9) (2026-05-04) ### Features * **ci:** promote AI Highlights to release body, add commit subject/body to dev releases ([#1743](#1743)) ([70bffe9](70bffe9)) * **lint:** bootstrap-wiring trace for ghost-wired settings ([#1742](#1742)) ([6ddbb86](6ddbb86)), closes [#1737](#1737) ### Bug Fixes * **ci:** retry transient GHCR errors in docker publish + retag ([#1741](#1741)) ([18e5349](18e5349)) ### Maintenance * 2026-05-03 Bucket A audit ([#1733](#1733)): 21 mechanical fixes ([#1744](#1744)) ([334262b](334262b)) * add 5 FP-prevention rules to /codebase-audit agent prompts ([#1734](#1734)) ([8f33ad6](8f33ad6)) * improve codebase-audit skill from 2026-05-03 lessons ([#1732](#1732)) ([7281cd7](7281cd7)) * **web:** tighten async-leak ceiling, bypass jsdom Storage timer dispatch ([#1728](#1728)) ([de0a1e1](de0a1e1)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
Tightens the
/codebase-auditskill based on lessons from the 2026-05-03 full run. Skill-only change (.claude/skills/codebase-audit/SKILL.md); no source code touched.Changes
unwired-settings) prompt rewrite -- was the dominant FP source (38% rate, 39 of 39 false positives in last run). New prompt enumerates all six legitimate consumption patterns: bridge config methods, composed-config methods (get_coordination_config()), directConfigResolver.get_int/float/str/boolcalls, Pydantic config-model embedding, subscriber pattern, bootstrap-only / read-only-post-init flags. Adds a verification requirement that forces three independent searches before flagging..pyfiles to project root,scripts/, andc:\tmp\that triggered Pyright diagnostic floods in the main thread..pyat root. Documents the 14 files seen in 2026-05-03 run as the seed sweep list._audit/latestdirectory (Junction collision on Windows); explicitly bans Bash redirects (project'scheck_bash_no_write.shhook blocks>,>>,2>,&>); explicit no-cdreminder.feedback_open_issues_exclude_renovate.mdmemory rule.14-web-store-architecture-drift.md,19-benchmark-regression.md) that never existed. INDEX-builder must enumerate actual files via Glob and grep counts.ISSUES.md-- one row per issue class (e.g. all 10709-unwired-settings.mdfindings collapse into one "Unwired settings" row). User shouldn't have to wade through 300+ raw findings to make a decision.Test plan
Skill is a docs-only file (no source code). Validated by re-reading the skill end-to-end and grep-checking that all four key edits landed (
DO NOT write helper,Phase 7: Cleanup,Lessons from prior runs, agent 09 FP guard).Review coverage
Quick mode: docs-only change auto-skipped agent review. Automated checks N/A (no Python, web, or Go files changed). Pre-commit hooks passed at commit time (em-dash, large-file, secrets, branch protection, etc.).
Closes
No linked issue (skill self-improvement from observed run).