
t230: Prune duplicate FAILURE_PATTERN memories and prevent recurrence #954

Merged
marcusquinn merged 2 commits into main from feature/t230 on Feb 10, 2026

Conversation

marcusquinn (Owner) commented on Feb 10, 2026

Summary

  • Prunes ~70 duplicate clean_exit_no_signal memory entries (FAILURE_PATTERN, ERROR_FIX, etc.) down to consolidated summaries
  • Prevents future accumulation via two mechanisms in store_failure_pattern()
  • Adds prune-patterns subcommand to memory-helper.sh for targeted pattern cleanup

Changes

Prevention (supervisor-helper.sh)

  • Skip clean_exit_no_signal retries in store_failure_pattern — these are infrastructure noise, not actionable failure patterns
  • Add 24h rate-limit: max 3 entries per outcome_detail to prevent memory pollution from any repetitive error
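For illustration, a minimal sketch of the two guards (the argument position and surrounding storage logic are assumptions for this sketch; sql_escape, MEMORY_DB, and the rate-limit query mirror the code quoted in the review comments below):

store_failure_pattern() {
    local outcome_detail="$1"   # argument position assumed for this sketch

    # Guard 1: clean_exit_no_signal retries are infrastructure noise,
    # not actionable failure patterns -- skip storage entirely.
    if [[ "$outcome_detail" == "clean_exit_no_signal" ]]; then
        return 0
    fi

    # Guard 2: 24h rate-limit -- skip if 3+ entries with the same
    # outcome_detail were stored in the last day.
    local recent_count=0
    local escaped_detail
    escaped_detail="$(sql_escape "$outcome_detail")"
    if [[ -r "$MEMORY_DB" ]]; then
        recent_count=$(sqlite3 "$MEMORY_DB" \
            "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
            2>/dev/null || echo "0")
    fi
    if (( recent_count >= 3 )); then
        return 0
    fi

    # ... existing storage logic continues here ...
}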

Cleanup (memory-helper.sh)

  • Add prune-patterns subcommand for bulk consolidation of repetitive pattern entries by error keyword
  • Keeps newest entry per type as representative, deletes the rest
  • Supports --dry-run for safe preview
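For example, typical invocations (flags as described in this PR and its commit message; path relative to the repo root):

# Preview what would be pruned, without touching the database
.agents/scripts/memory-helper.sh prune-patterns clean_exit_no_signal --dry-run

# Consolidate for real: keep the 3 newest matching entries, delete the rest
.agents/scripts/memory-helper.sh prune-patterns clean_exit_no_signal --keep 3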

Execution

  • Ran prune-patterns against the live memory DB to clean up 83 existing duplicate entries

Testing

  • ShellCheck validation on modified scripts
  • Dry-run verification before actual pruning
  • Before/after memory stats comparison

Closes #t230

Summary by CodeRabbit

  • New Features

    • Added a new pattern pruning utility to remove repetitive entries.
  • Bug Fixes

    • Improved handling of transient infrastructure failures to prevent false error reporting.
    • Added safeguards to prevent duplicate failure patterns from being stored.

…no_signal (t230)

- Skip clean_exit_no_signal retries in store_failure_pattern (infrastructure noise)
- Add 24h rate-limit: max 3 entries per outcome_detail to prevent memory pollution
- Blocked/failed outcomes still stored but capped to avoid 55+ identical entries
coderabbitai bot commented on Feb 10, 2026

Walkthrough

Added a new cmd_prune_patterns() command to memory-helper.sh for pruning repetitive pattern entries with validation and index rebuilding. Enhanced store_failure_pattern() in supervisor-helper.sh with expanded transient failure detection and rate-limit guards to prevent memory pollution from repeated infrastructure failures.

Changes

  • Pattern Pruning Command (.agents/scripts/memory-helper.sh): Introduced the cmd_prune_patterns() function, implementing pattern pruning by keyword with backup, selective deletion, and index rebuilding. Integrated into the CLI dispatcher with prune help documentation.
  • Failure Pattern Rate-Limiting (.agents/scripts/supervisor-helper.sh): Extended store_failure_pattern() to treat clean_exit_no_signal as a transient failure pattern and added a 24-hour rate-limit guard that skips storage when 3+ recent outcome matches are detected.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🧹 The patterns now prune themselves with care,
Rate-limiters guard against the noise so rare,
Infrastructure whispers fade without a trace,
Memory stays pure—a well-kept DevOps space! ✨

🚥 Pre-merge checks: ✅ 3 passed
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically describes the main changes: pruning duplicate FAILURE_PATTERN memories and implementing prevention measures (t230 reference). It accurately reflects both the cleanup action and the forward-looking safeguard introduced in the PR.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient (required threshold: 80.00%).


github-actions bot commented:

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 36 code smells

[INFO] Recent monitoring activity:
Tue Feb 10 17:14:01 UTC 2026: Code review monitoring started
Tue Feb 10 17:14:01 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 36

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 36
  • VULNERABILITIES: 0

Generated on: Tue Feb 10 17:14:04 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Targeted cleanup for repetitive pattern entries by keyword. Keeps N newest
entries per keyword, removes the rest. Supports --dry-run for safe preview.
Example: memory-helper.sh prune-patterns clean_exit_no_signal --keep 3


@marcusquinn marcusquinn marked this pull request as ready for review February 10, 2026 17:16
gemini-code-assist bot commented:

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the issue of memory pollution caused by repetitive and non-actionable failure patterns. It introduces both proactive prevention measures to limit the storage of such patterns and a reactive cleanup tool to consolidate existing duplicates. The changes aim to improve the signal-to-noise ratio in the memory database, making it more efficient and relevant for identifying genuine issues, particularly those related to infrastructure noise.

Highlights

  • New prune-patterns subcommand: Introduced a new prune-patterns subcommand to memory-helper.sh for consolidating repetitive memory entries based on a keyword, offering options for dry-runs, specifying types, and controlling the number of entries to keep.
  • Prevention of future memory pollution: Implemented prevention mechanisms in supervisor-helper.sh to stop future accumulation of noisy failure patterns by ignoring clean_exit_no_signal retries and adding a 24-hour rate limit (max 3 entries) for similar FAILURE_PATTERN entries.
  • Initial cleanup of existing entries: Performed an initial cleanup of approximately 70 existing duplicate memory entries using the newly developed prune-patterns tool, targeting clean_exit_no_signal patterns.
Changelog
  • .agents/scripts/memory-helper.sh
    • Added cmd_prune_patterns function for targeted memory cleanup, allowing consolidation of repetitive pattern entries by keyword.
    • Implemented argument parsing for prune-patterns including keyword, dry-run, keep count, and types.
    • Included database backup and FTS index rebuild as part of the pruning process.
    • Registered prune-patterns as a new subcommand in the memory-helper.sh script.
    • Added detailed help documentation for prune-patterns options.
    • Hooked cmd_prune_patterns into the main command dispatcher.
  • .agents/scripts/supervisor-helper.sh
    • Modified store_failure_pattern to prevent storing clean_exit_no_signal as a retry pattern, classifying it as infrastructure noise.
    • Introduced a rate-limiting mechanism to store_failure_pattern to limit entries to 3 per outcome_detail within 24 hours, preventing memory pollution from repetitive errors.
Activity
  • ShellCheck validation was performed on the modified scripts to ensure code quality.
  • Dry-run verification was conducted using the new prune-patterns command before actual database modifications.
  • Before and after memory statistics were compared to confirm the effectiveness of the pruning operations.
  • The prune-patterns command was executed against the live memory database to clean up existing duplicate entries.

gemini-code-assist bot left a comment:

Code Review

This pull request introduces a new utility for pruning duplicate memory entries and adds rate-limiting to prevent their future accumulation. The changes are well-structured and include necessary updates to documentation and command handling. My review focuses on adherence to the repository's shell scripting style guide, particularly around error handling. I've identified a couple of instances where error suppression could hide underlying issues and have suggested modifications to improve robustness.

Comment on lines +1858 to +1859
db "$MEMORY_DB" "DELETE FROM learning_relations WHERE id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true
db "$MEMORY_DB" "DELETE FROM learning_relations WHERE supersedes_id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true


Severity: medium

The use of 2>/dev/null || true here suppresses all errors from the database operations, which violates the repository's style guide. This can hide important issues like database corruption, permission errors, or schema changes. If these operations are expected to fail sometimes (e.g., if the learning_relations table is optional), it's better to handle that possibility explicitly rather than silencing all errors.

Suggested change
-db "$MEMORY_DB" "DELETE FROM learning_relations WHERE id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true
-db "$MEMORY_DB" "DELETE FROM learning_relations WHERE supersedes_id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true
+db "$MEMORY_DB" "DELETE FROM learning_relations WHERE id IN (SELECT id FROM learnings WHERE $delete_where);"
+db "$MEMORY_DB" "DELETE FROM learning_relations WHERE supersedes_id IN (SELECT id FROM learnings WHERE $delete_where);"
References
  1. Rule #50: 2>/dev/null is acceptable ONLY when redirecting to log files, not blanket suppression. The code uses 2>/dev/null to suppress errors from database commands, which can hide important issues. (link)

Comment on lines +10054 to +10056
recent_count=$(sqlite3 "$MEMORY_DB" \
    "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
    2>/dev/null || echo "0")


Severity: medium

The use of 2>/dev/null suppresses potential errors from sqlite3, which violates the repository's style guide. While the || echo "0" provides a fallback, it's better to let errors be visible so that issues like database corruption or permission problems can be diagnosed. Silently failing could cause this rate-limiting check to be bypassed, defeating its purpose.

Suggested change
-recent_count=$(sqlite3 "$MEMORY_DB" \
-    "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
-    2>/dev/null || echo "0")
+recent_count=$(sqlite3 "$MEMORY_DB" \
+    "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
+    || echo "0")
References
  1. Rule #50: 2>/dev/null is acceptable ONLY when redirecting to log files, not blanket suppression. The code uses 2>/dev/null to suppress errors from a database query, which can hide important issues. (link)

@marcusquinn marcusquinn merged commit ceac16c into main Feb 10, 2026
16 of 18 checks passed
@marcusquinn marcusquinn deleted the feature/t230 branch February 10, 2026 17:20
marcusquinn added a commit that referenced this pull request Feb 10, 2026
coderabbitai bot left a comment:

Actionable comments posted: 2

🧹 Nitpick comments (1)
.agents/scripts/memory-helper.sh (1)

1785-1790: Consider case-insensitive keyword matching.

The LIKE operator with '%${escaped_keyword}%' is case-sensitive in SQLite by default. Users might expect case-insensitive matching when searching for error patterns.

💡 Optional enhancement for case-insensitive matching

If case-insensitive matching is desired:

-    local escaped_keyword="${keyword//"'"/"''"}"
+    local escaped_keyword
+    escaped_keyword=$(echo "${keyword//"'"/"''"}" | tr '[:upper:]' '[:lower:]')

And update the queries to use LOWER():

     total_count=$(db "$MEMORY_DB" \
-        "SELECT COUNT(*) FROM learnings WHERE type IN ($type_sql) AND content LIKE '%${escaped_keyword}%';")
+        "SELECT COUNT(*) FROM learnings WHERE type IN ($type_sql) AND LOWER(content) LIKE '%${escaped_keyword}%';")

This would make matching consistent regardless of case variations in stored content. However, this is a nice-to-have improvement and can be deferred based on actual usage patterns.

Comment on lines +1773 to +1783
local type_sql=""
local IFS=','
local type_parts=()
read -ra type_parts <<< "$types"
unset IFS
local type_conditions=()
for t in "${type_parts[@]}"; do
    type_conditions+=("'$t'")
done
type_sql=$(printf "%s," "${type_conditions[@]}")
type_sql="${type_sql%,}"

⚠️ Potential issue | 🔴 Critical

CRITICAL: SQL injection vulnerability via unvalidated --types parameter.

The types from the --types parameter are wrapped in quotes but never validated against VALID_TYPES before being used in SQL queries. An attacker could inject arbitrary SQL by providing a malicious type value.

Example attack: --types "X',DROP TABLE learnings--"
Would generate: type IN ('X',DROP TABLE learnings--')

🔒 Proposed fix: Validate types before SQL use
     type_sql="${type_sql%,}"
+
+    # Validate all types against VALID_TYPES (prevent SQL injection)
+    for t in "${type_parts[@]}"; do
+        local type_pattern=" $t "
+        if [[ ! " $VALID_TYPES " =~ $type_pattern ]]; then
+            log_error "Invalid type in --types: '$t'"
+            log_error "Valid types: $VALID_TYPES"
+            return 1
+        fi
+    done
 
     local escaped_keyword="${keyword//"'"/"''"}"

This validation ensures that only known, safe type values are used in SQL queries, preventing injection attacks.

As per coding guidelines: "Security issues such as authentication/authorization flaws, injection (SQL/NoSQL)" must be addressed.

🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis

[warning] 1774-1774: .agents/scripts/memory-helper.sh#L1774
The special variable IFS affects how splitting takes place when expanding unquoted variables.

🤖 Prompt for AI Agents
In .agents/scripts/memory-helper.sh around lines 1773-1783: the code builds type_sql from the user-provided --types string without validating values, enabling SQL injection. Update the loop over type_parts to validate each t against the allowed set (e.g., VALID_TYPES) before adding it to type_conditions: check membership in VALID_TYPES and either skip invalid entries or abort with an error if any are present, then build type_sql from the validated type_conditions (preserving the quoting/join logic) so that only known safe types reach the SQL IN clause.

Comment on lines +10048 to +10056
# Rate-limit: skip if 3+ entries with the same outcome_detail exist in last 24h (t230)
# Prevents memory pollution from repetitive infrastructure failures
local recent_count=0
local escaped_detail
escaped_detail="$(sql_escape "$outcome_detail")"
if [[ -r "$MEMORY_DB" ]]; then
    recent_count=$(sqlite3 "$MEMORY_DB" \
        "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
        2>/dev/null || echo "0")

⚠️ Potential issue | 🟡 Minor

Harden the rate‑limit query against LIKE wildcard over‑matching.

Line 10055: LIKE '%${escaped_detail}%' treats _ and % as wildcards, so common outcome_detail values (e.g., clean_exit_no_signal) can over‑match and prematurely suppress storage. Use instr() or escape LIKE wildcards; also guard empty detail to avoid matching everything.
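To illustrate the over-match (hypothetical sqlite3 one-liners, not from the PR):

# '_' matches any single character in a LIKE pattern, so the underscores in
# the stored detail also match unrelated content:
sqlite3 :memory: "SELECT 'cleanXexitXnoXsignal' LIKE '%clean_exit_no_signal%';"   # prints 1
# instr() does a literal substring test instead:
sqlite3 :memory: "SELECT instr('cleanXexitXnoXsignal', 'clean_exit_no_signal');"  # prints 0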

🔧 Suggested fix (avoid LIKE wildcards)
-    local escaped_detail
-    escaped_detail="$(sql_escape "$outcome_detail")"
-    if [[ -r "$MEMORY_DB" ]]; then
-        recent_count=$(sqlite3 "$MEMORY_DB" \
-            "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
-            2>/dev/null || echo "0")
-    fi
+    local escaped_detail
+    escaped_detail="$(sql_escape "$outcome_detail")"
+    if [[ -r "$MEMORY_DB" && -n "$escaped_detail" ]]; then
+        recent_count=$(sqlite3 "$MEMORY_DB" \
+            "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND instr(content, '${escaped_detail}') > 0 AND created_at > datetime('now', '-1 day');" \
+            2>/dev/null || echo "0")
+    fi
🤖 Prompt for AI Agents
In .agents/scripts/supervisor-helper.sh around lines 10048-10056: the LIKE '%${escaped_detail}%' comparison can over-match because % and _ are wildcards, and an empty detail matches everything. Guard against an empty outcome_detail and use a non-wildcard substring test such as instr(content, '...'). Concretely: after computing escaped_detail (via sql_escape), if it is empty, set recent_count=0 and skip the sqlite3 call; otherwise change the SQL to "SELECT COUNT(*) FROM learnings WHERE type='FAILURE_PATTERN' AND instr(content, '${escaped_detail}') > 0 AND created_at > datetime('now', '-1 day');" so LIKE wildcard semantics are avoided while sql_escape still protects quoting; assign the result to recent_count as before.
