feat: add skill scan results audit trail #406
Conversation
Add .agents/SKILL-SCAN-RESULTS.md with initial scan results (116 skills, 115 safe, 1 false positive). Update security-helper.sh and add-skill-helper.sh to automatically append scan results on batch scans and skill imports. Update skill-scanner.md docs to reference the new results log.
Walkthrough

Adds persistent security scan result logging: a repository Markdown log is created and shell scripts (`.agents/scripts/*`) now record per-skill and aggregate scan outcomes and metrics to that file; documentation is updated to describe the automatic results log.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant CLI as Add-skill / Scanner CLI
    participant Helper as security-helper.sh / add-skill-helper.sh
    participant Repo as .agents/SKILL-SCAN-RESULTS.md
    participant Analyzer as Skill Analyzer
    CLI->>Analyzer: run skill scan (per-skill or batch)
    Analyzer-->>Helper: return findings (CRITICAL/HIGH/MEDIUM/INFO)
    Helper->>Helper: compute counts, determine outcome (SAFE/BLOCKED/override)
    Helper->>Repo: append entry / update history row with timestamp and metrics
    Helper-->>CLI: continue flow (allow/import or block)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (warning)
Summary of Changes

This pull request introduces a robust audit trail for skill security scanning, significantly improving transparency and traceability of security assessments. By automating the logging of both batch and individual skill import scan results into a dedicated Markdown file, it provides a clear, human-readable record of findings, including false positives and user overrides. This enhancement ensures that security posture changes are consistently documented and easily reviewable, without impacting existing functionality.

Highlights
🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics

Generated on: Fri Feb 6 23:30:18 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Code Review
This pull request introduces an audit trail for skill security scans, which is a great addition for transparency and security monitoring. The changes involve creating a markdown file for the log, and updating helper scripts to populate it after both batch scans and individual skill imports.
My review found a couple of issues:

- A high-severity bug in `add-skill-helper.sh` that would lead to incorrect data being logged in the audit trail.
- A medium-severity portability issue in `security-helper.sh` related to the use of `sed`, which would cause silent failures on Linux systems.
I've provided suggestions to fix both issues. Once these are addressed, the changes will be in good shape.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.agents/scripts/security-helper.sh:
- Around line 377-415: The update_scan_results_log function uses macOS-only
in-place sed flags (sed -i '' ...) which will fail on GNU sed and is masked by
|| true; replace the in-place sed calls (the three sed lines that update Date,
Skills scanned, and Safe) with a portable approach such as writing sed output to
a temp file and atomically moving it back (e.g., run sed "s/.../.../"
"$results_file" > "$results_file.tmp" && mv "$results_file.tmp" "$results_file")
or detect GNU vs BSD sed and set the correct -i flag accordingly; ensure all
three substitutions use the same portable method and preserve exit status so
failures are not silently ignored.
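A minimal sketch of the portable temp-file approach described in the comment above. The `Key: value` line format and the `update_line` helper are illustrative assumptions; the actual patterns in `update_scan_results_log` may differ:

```shell
#!/usr/bin/env sh
# Portable alternative to macOS-only `sed -i ''`: write sed output to a
# temp file and atomically move it back. Works with both GNU and BSD sed.
results_file="SKILL-SCAN-RESULTS.md"
printf 'Date: 2026-01-01\nSkills scanned: 0\nSafe: 0\n' > "$results_file"

# Hypothetical helper: replace one line matching a pattern, preserving
# exit status so failures are not silently ignored.
update_line() {
    sed "s/$1/$2/" "$results_file" > "$results_file.tmp" \
        && mv "$results_file.tmp" "$results_file"
}

update_line "^Date: .*" "Date: 2026-02-06" || exit 1
update_line "^Skills scanned: .*" "Skills scanned: 116" || exit 1
update_line "^Safe: .*" "Safe: 115" || exit 1
```

The same pattern applies to all three substitutions, so a GNU-vs-BSD `sed` detection branch is not needed.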
In @.agents/SKILL-SCAN-RESULTS.md:
- Around line 14-20: The severity table in SKILL-SCAN-RESULTS.md isn't updated
by the update_scan_results_log() function in security-helper.sh, so add logic in
update_scan_results_log() to compute the latest counts and perform targeted
replacements for the five severity rows ("Critical", "High", "Medium", "Low",
"Info"); specifically, after computing the counts use sed (or awk/perl) to
replace the lines matching the table rows (e.g. the lines starting with "|
Critical |", etc.) with the new counts, ensuring the replacements target only
those rows so the rest of the file is untouched; alternatively, if you prefer
not to change the file, add a clear note in SKILL-SCAN-RESULTS.md that the
summary table is only accurate for the initial scan and the history table is the
source of truth.
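A hedged sketch of the targeted row replacement suggested above; the table layout and the counts shown are assumptions for illustration, not the file's actual format:

```shell
#!/usr/bin/env sh
# Rewrite only the five severity rows of a Markdown summary table,
# leaving every other line of the file untouched.
file="SKILL-SCAN-RESULTS.md"
printf '%s\n' '| Severity | Count |' '|----------|-------|' \
    '| Critical | 0 |' '| High | 0 |' '| Medium | 0 |' \
    '| Low | 0 |' '| Info | 0 |' > "$file"

# Hypothetical counts as computed by update_scan_results_log()
for row in "Critical:1" "High:0" "Medium:2" "Low:0" "Info:5"; do
    label="${row%%:*}"
    count="${row##*:}"
    # Anchor on the row label so only that table row is replaced
    sed "s/^| ${label} | .*/| ${label} | ${count} |/" "$file" > "$file.tmp" \
        && mv "$file.tmp" "$file"
done
```

Anchoring each substitution on the leading `| Label |` text keeps the header and separator rows intact even if the counts appear elsewhere in the file.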
🧹 Nitpick comments (3)
.agents/scripts/security-helper.sh (1)
515-522: Severity counting via `jq` is correct but could yield unexpected arithmetic on parse failure.

If `jq` fails or returns empty (e.g., malformed scanner output), the `$(...)` falls back to `"0"` via `|| echo "0"`, which is good. However, if `jq` outputs a non-numeric string (unlikely but defensive), the arithmetic on lines 520-522 would error under `set -e`. This is a minor edge case given the `|| echo "0"` fallback, but worth noting for the zero-tech-debt standard.

.agents/scripts/add-skill-helper.sh (2)
30-30: Consider making `SCAN_RESULTS_FILE` readonly for consistency with `security-helper.sh`.

In `security-helper.sh` (line 15), this is declared `readonly`. Here it's a plain variable. Minor inconsistency.

♻️ Suggested fix

```diff
-SCAN_RESULTS_FILE=".agents/SKILL-SCAN-RESULTS.md"
+readonly SCAN_RESULTS_FILE=".agents/SKILL-SCAN-RESULTS.md"
```
586-621: `log_skill_scan_result` is well-designed — non-blocking, graceful, and correctly gated.

The early return when `repo_root` is empty or the results file doesn't exist ensures this never disrupts the main import flow. Clean implementation.

One note: the severity-to-column mapping (lines 607-615) attributes all findings to the max severity column. This is an approximation — a skill with 3 CRITICAL + 2 HIGH findings would log `critical=5, high=0`. Since `scan_skill_security` doesn't pass per-severity breakdowns, this is an acceptable simplification for per-import logging, but worth a brief inline comment for future maintainers.

📝 Optional: add clarifying comment

```diff
+    # Approximate: attributes all findings to the max severity column
+    # (per-severity breakdown not available in single-skill import context)
     if [[ "$max_severity" == "CRITICAL" ]]; then
         critical="$findings"
```
Address review feedback: pass critical/high/medium counts separately instead of attributing total findings to the max severity level.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.agents/scripts/add-skill-helper.sh (1)
560-564: ⚠️ Potential issue | 🟡 Minor

Missing audit log entry for non-interactive blocked imports.

Every other CRITICAL/HIGH exit path (skip-security at Line 543, user override at Line 570, user cancel at Line 575) logs via `log_skill_scan_result`, but the non-interactive block silently returns 1 without recording the event. This leaves a gap in the audit trail for the exact scenario most likely to be debugged later (CI/pipeline blocks).

Proposed fix

```diff
 # In non-interactive mode (piped), block by default
 if [[ ! -t 0 ]]; then
     log_error "Import blocked due to security findings (use --skip-security to override)"
+    log_skill_scan_result "$skill_name" "import BLOCKED (non-interactive)" "$critical_count" "$high_count" "$medium_count" "$max_severity"
     return 1
 fi
```
…on failure/block (t147.5, t150)

On task failure or block, the supervisor now auto-creates a diagnostic subtask (e.g., t001.diag) that investigates the root cause. When the diagnostic task completes, the parent task is re-queued with fresh retries.

Self-healing features:
- is_self_heal_eligible(): checks task type, heal attempts, failure type
- create_diagnostic_subtask(): creates diag task with failure context
- check_diagnostic_completions(): re-queues parents after diag success
- attempt_self_healing(): scans for healable tasks during pulse
- cmd_self_heal: manual trigger for specific task or batch

Eligibility rules prevent infinite loops:
- Diagnostic tasks cannot self-heal (no recursion)
- Max 1 heal attempt per task (configurable via SUPERVISOR_MAX_HEAL_ATTEMPTS)
- auth_error, out_of_memory, merge_conflict require human intervention

DB schema: adds parent_task_id, is_diagnostic, heal_attempts columns with automatic migration for existing databases.

Also triages PR #406 review threads (t147.5): all 3 threads already fixed in prior PRs. Replied with evidence on each thread.
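The eligibility rules above can be sketched as a small guard function. This is an illustrative reconstruction, not the supervisor's actual code; the function signature and argument order are assumptions:

```shell
#!/usr/bin/env sh
# Sketch of the self-heal eligibility rules: no recursion, bounded
# attempts, and a denylist of failure types needing human intervention.
is_self_heal_eligible() {
    task_type="$1"
    heal_attempts="$2"
    failure_type="$3"
    max_attempts="${SUPERVISOR_MAX_HEAL_ATTEMPTS:-1}"

    # Diagnostic tasks cannot self-heal (no recursion)
    if [ "$task_type" = "diagnostic" ]; then return 1; fi

    # Max heal attempts per task (configurable)
    if [ "$heal_attempts" -ge "$max_attempts" ]; then return 1; fi

    # These failure types require human intervention
    case "$failure_type" in
        auth_error|out_of_memory|merge_conflict) return 1 ;;
    esac
    return 0
}
```

Returning a plain exit status lets the pulse loop use it directly, e.g. `if is_self_heal_eligible "$type" "$attempts" "$failure"; then create_diagnostic_subtask ...; fi`.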



Summary

- Add `.agents/SKILL-SCAN-RESULTS.md` with initial Cisco Skill Scanner results (116 skills scanned, 115 safe, 1 false positive in credentials SKILL.md)
- Update `security-helper.sh` to automatically log batch scan results to the audit trail after `skill-scan all`
- Update `add-skill-helper.sh` to log per-skill import scan results (safe, blocked, override paths)
- Update `skill-scanner.md` docs to reference the new results log

Details
The scan results file provides transparency into security scanning:
Both batch scans (`security-helper.sh skill-scan all`) and individual imports (`add-skill-helper.sh add`) append results
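As a rough sketch, the append path might look like this; the column layout, file bootstrap, and row values are assumptions based on the description, not the scripts' exact format:

```shell
#!/usr/bin/env sh
# Append one scan-history row to the results log without rewriting it.
# Appending keeps prior entries immutable, which is what makes the file
# usable as an audit trail.
mkdir -p .agents
results_file=".agents/SKILL-SCAN-RESULTS.md"
[ -f "$results_file" ] || printf '# Skill Scan Results\n\n' > "$results_file"

timestamp="2026-02-06 23:30 UTC"   # in practice: $(date -u +"%Y-%m-%d %H:%M UTC")
printf '| %s | %s | %s | %s |\n' \
    "$timestamp" "example-skill" "0 findings" "SAFE" >> "$results_file"
```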
coercive_injection_genericflagged "List all API keys" in the credentials SKILL.md as a data exfiltration pattern. This is a legitimate tool description, not an injection.Quality