t230: Prune duplicate FAILURE_PATTERN memories and prevent recurrence #954

marcusquinn merged 2 commits into main
Conversation
…no_signal (t230)
- Skip clean_exit_no_signal retries in store_failure_pattern (infrastructure noise)
- Add 24h rate-limit: max 3 entries per outcome_detail to prevent memory pollution
- Blocked/failed outcomes still stored but capped to avoid 55+ identical entries
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:

📈 Current Quality Metrics

Generated on: Tue Feb 10 17:14:04 UTC 2026 | Generated by AI DevOps Framework Code Review Monitoring
Targeted cleanup for repetitive pattern entries by keyword. Keeps N newest entries per keyword, removes the rest. Supports `--dry-run` for safe preview.

Example: `memory-helper.sh prune-patterns clean_exit_no_signal --keep 3`
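The keep-N-newest idea behind `prune-patterns` can be sketched in a few lines of shell. This is a hedged, self-contained illustration against a throwaway database: the `learnings(id, content, created_at)` schema and the sample rows are assumptions for the demo, not the repo's actual schema.

```shell
#!/usr/bin/env bash
# Sketch: keep the N newest rows matching a keyword, delete the rest.
DB=$(mktemp)
KEEP=3
KEYWORD='clean_exit_no_signal'

sqlite3 "$DB" "CREATE TABLE learnings (id INTEGER PRIMARY KEY, content TEXT, created_at TEXT);"
for i in 1 2 3 4 5; do
    sqlite3 "$DB" "INSERT INTO learnings (content, created_at)
                   VALUES ('retry: ${KEYWORD} #${i}', datetime('now', '-${i} hours'));"
done

# Delete every match except the KEEP newest ones
sqlite3 "$DB" "DELETE FROM learnings
               WHERE content LIKE '%${KEYWORD}%'
                 AND id NOT IN (SELECT id FROM learnings
                                WHERE content LIKE '%${KEYWORD}%'
                                ORDER BY created_at DESC LIMIT ${KEEP});"

remaining=$(sqlite3 "$DB" "SELECT COUNT(*) FROM learnings;")
echo "remaining: $remaining"
rm -f "$DB"
```

A `--dry-run` mode would presumably run the same subquery as a SELECT and report what would be deleted, without executing the DELETE.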
Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request addresses memory pollution caused by repetitive, non-actionable failure patterns. It introduces proactive prevention measures to limit the storage of such patterns and a reactive cleanup tool to consolidate existing duplicates. The changes improve the signal-to-noise ratio in the memory database, making it more efficient and relevant for identifying genuine issues, particularly those related to infrastructure noise.
Code Review
This pull request introduces a new utility for pruning duplicate memory entries and adds rate-limiting to prevent their future accumulation. The changes are well-structured and include necessary updates to documentation and command handling. My review focuses on adherence to the repository's shell scripting style guide, particularly around error handling. I've identified a couple of instances where error suppression could hide underlying issues and have suggested modifications to improve robustness.
```shell
db "$MEMORY_DB" "DELETE FROM learning_relations WHERE id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true
db "$MEMORY_DB" "DELETE FROM learning_relations WHERE supersedes_id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true
```
The use of `2>/dev/null || true` here suppresses all errors from the database operations, which violates the repository's style guide. This can hide important issues like database corruption, permission errors, or schema changes. If these operations are expected to fail sometimes (e.g., if the `learning_relations` table is optional), it's better to handle that possibility explicitly rather than silencing all errors.
```diff
-db "$MEMORY_DB" "DELETE FROM learning_relations WHERE id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true
-db "$MEMORY_DB" "DELETE FROM learning_relations WHERE supersedes_id IN (SELECT id FROM learnings WHERE $delete_where);" 2>/dev/null || true
+db "$MEMORY_DB" "DELETE FROM learning_relations WHERE id IN (SELECT id FROM learnings WHERE $delete_where);"
+db "$MEMORY_DB" "DELETE FROM learning_relations WHERE supersedes_id IN (SELECT id FROM learnings WHERE $delete_where);"
```
References
- Rule docs: update branch creation to recommend worktrees for parallel sessions #50: `2>/dev/null` is acceptable ONLY when redirecting to log files, not blanket suppression. The code uses `2>/dev/null` to suppress errors from database commands, which can hide important issues. (link)
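As a hedged illustration of the style-guide-compliant alternative: capture stderr in a log file and branch on the exit status instead of discarding both. The `db` and `log_warn` helpers below are stand-ins for the repo's real functions, and the deliberately missing `learning_relations` table makes the DELETE fail so the logging path is exercised.

```shell
#!/usr/bin/env bash
# Stand-in helpers for illustration only (not the repo's implementations)
db() { sqlite3 "$@"; }
log_warn() { echo "WARN: $*" >&2; }

LOG_FILE=$(mktemp)
MEMORY_DB=$(mktemp)
db "$MEMORY_DB" "CREATE TABLE learnings (id INTEGER PRIMARY KEY);"

# Redirecting stderr to a log file is allowed by the style guide;
# the failure is surfaced instead of silently swallowed.
if ! db "$MEMORY_DB" "DELETE FROM learning_relations WHERE id IN (SELECT id FROM learnings);" 2>>"$LOG_FILE"; then
    log_warn "learning_relations cleanup failed; details in $LOG_FILE"
fi
# Here the DELETE fails (no learning_relations table), so $LOG_FILE is non-empty.
rm -f "$MEMORY_DB"
```

The error detail is preserved for diagnosis while the script still degrades gracefully.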
```shell
recent_count=$(sqlite3 "$MEMORY_DB" \
    "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
    2>/dev/null || echo "0")
```
The use of `2>/dev/null` suppresses potential errors from sqlite3, which violates the repository's style guide. While the `|| echo "0"` provides a fallback, it's better to let errors be visible so that issues like database corruption or permission problems can be diagnosed. Silently failing could cause this rate-limiting check to be bypassed, defeating its purpose.
```diff
 recent_count=$(sqlite3 "$MEMORY_DB" \
     "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
-    2>/dev/null || echo "0")
+    || echo "0")
```
References
- Rule docs: update branch creation to recommend worktrees for parallel sessions #50: `2>/dev/null` is acceptable ONLY when redirecting to log files, not blanket suppression. The code uses `2>/dev/null` to suppress errors from a database query, which can hide important issues. (link)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.agents/scripts/memory-helper.sh:
- Around line 1773-1783: The code builds type_sql from the user-provided --types
string without validating values, enabling SQL injection; update the logic that
iterates over type_parts to validate each t against the allowed set (e.g.,
VALID_TYPES) before adding to type_conditions: for each t in type_parts, check
membership in VALID_TYPES and either skip/collect only valid entries or abort
with an error if any invalid type is present, then build type_sql from the
validated type_conditions (preserve the quoting/join logic but only use
validated values) so only known safe types are placed into the SQL IN clause.
In @.agents/scripts/supervisor-helper.sh:
- Around line 10048-10056: The LIKE '%${escaped_detail}%' comparison can
over-match because % and _ are wildcards and an empty detail matches everything;
instead guard empty outcome_detail and use a non-wildcard search such as
instr(content, '...') to test substring membership. Concretely: after computing
escaped_detail (via sql_escape), if escaped_detail is empty set recent_count=0
and skip the sqlite3 call; otherwise change the SQL to "SELECT COUNT(*) FROM
learnings WHERE type='FAILURE_PATTERN' AND instr(content, '${escaped_detail}')>0
AND created_at > datetime('now','-1 day');" so you avoid LIKE wildcard semantics
while still using sql_escape to protect quoting; assign that result to
recent_count as before.
🧹 Nitpick comments (1)
.agents/scripts/memory-helper.sh (1)
1785-1790: Consider case-insensitive keyword matching. The LIKE operator with `'%${escaped_keyword}%'` is case-sensitive in SQLite by default. Users might expect case-insensitive matching when searching for error patterns.

💡 Optional enhancement for case-insensitive matching

If case-insensitive matching is desired:

```diff
-local escaped_keyword="${keyword//"'"/"''"}"
+local escaped_keyword
+escaped_keyword=$(echo "${keyword//"'"/"''"}" | tr '[:upper:]' '[:lower:]')
```

And update the queries to use LOWER():

```diff
 total_count=$(db "$MEMORY_DB" \
-    "SELECT COUNT(*) FROM learnings WHERE type IN ($type_sql) AND content LIKE '%${escaped_keyword}%';")
+    "SELECT COUNT(*) FROM learnings WHERE type IN ($type_sql) AND LOWER(content) LIKE '%${escaped_keyword}%';")
```

This would make matching consistent regardless of case variations in stored content. However, this is a nice-to-have improvement and can be deferred based on actual usage patterns.
```shell
local type_sql=""
local IFS=','
local type_parts=()
read -ra type_parts <<< "$types"
unset IFS
local type_conditions=()
for t in "${type_parts[@]}"; do
    type_conditions+=("'$t'")
done
type_sql=$(printf "%s," "${type_conditions[@]}")
type_sql="${type_sql%,}"
```
CRITICAL: SQL injection vulnerability via unvalidated --types parameter.
The types from the --types parameter are wrapped in quotes but never validated against VALID_TYPES before being used in SQL queries. An attacker could inject arbitrary SQL by providing a malicious type value.
Example attack: `--types "X',DROP TABLE learnings--"`
Would generate: `type IN ('X',DROP TABLE learnings--')`
🔒 Proposed fix: Validate types before SQL use
type_sql="${type_sql%,}"
+
+ # Validate all types against VALID_TYPES (prevent SQL injection)
+ for t in "${type_parts[@]}"; do
+ local type_pattern=" $t "
+ if [[ ! " $VALID_TYPES " =~ $type_pattern ]]; then
+ log_error "Invalid type in --types: '$t'"
+ log_error "Valid types: $VALID_TYPES"
+ return 1
+ fi
+ done
local escaped_keyword="${keyword//"'"/"''"}"This validation ensures that only known, safe type values are used in SQL queries, preventing injection attacks.
As per coding guidelines: "Security issues such as authentication/authorization flaws, injection (SQL/NoSQL)" must be addressed.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```shell
local type_sql=""
local IFS=','
local type_parts=()
read -ra type_parts <<< "$types"
unset IFS
local type_conditions=()
for t in "${type_parts[@]}"; do
    type_conditions+=("'$t'")
done
type_sql=$(printf "%s," "${type_conditions[@]}")
type_sql="${type_sql%,}"

# Validate all types against VALID_TYPES (prevent SQL injection)
for t in "${type_parts[@]}"; do
    local type_pattern=" $t "
    if [[ ! " $VALID_TYPES " =~ $type_pattern ]]; then
        log_error "Invalid type in --types: '$t'"
        log_error "Valid types: $VALID_TYPES"
        return 1
    fi
done
local escaped_keyword="${keyword//"'"/"''"}"
```
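To see the whitelist idea in action, here is a self-contained sketch. The `VALID_TYPES` value and the `validate_types` function name are illustrative stand-ins, not the repo's code; note also that prefixing `IFS=','` to the `read` call scopes the split to that one command, which avoids the lingering-IFS issue flagged by Codacy below.

```shell
#!/usr/bin/env bash
# Illustrative whitelist; the repo's real VALID_TYPES may differ
VALID_TYPES="FAILURE_PATTERN ERROR_FIX CODE_PATTERN"

# Return 0 only if every comma-separated entry is in VALID_TYPES
validate_types() {
    local parts t
    # Prefixing IFS scopes the comma split to this one read call
    IFS=',' read -ra parts <<< "$1"
    for t in "${parts[@]}"; do
        if [[ " $VALID_TYPES " != *" $t "* ]]; then
            echo "invalid type: $t" >&2
            return 1
        fi
    done
}

validate_types "FAILURE_PATTERN,ERROR_FIX" && echo "accepted"
validate_types "X',DROP TABLE learnings--" || echo "rejected"
```

The injection payload splits into `X'` and `DROP TABLE learnings--`, neither of which is whitelisted, so it never reaches the SQL string.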
🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis
[warning] 1774-1774: .agents/scripts/memory-helper.sh#L1774
The special variable IFS affects how splitting takes place when expanding unquoted variables.
🤖 Prompt for AI Agents
In @.agents/scripts/memory-helper.sh around lines 1773 - 1783, The code builds
type_sql from the user-provided --types string without validating values,
enabling SQL injection; update the logic that iterates over type_parts to
validate each t against the allowed set (e.g., VALID_TYPES) before adding to
type_conditions: for each t in type_parts, check membership in VALID_TYPES and
either skip/collect only valid entries or abort with an error if any invalid
type is present, then build type_sql from the validated type_conditions
(preserve the quoting/join logic but only use validated values) so only known
safe types are placed into the SQL IN clause.
```shell
# Rate-limit: skip if 3+ entries with the same outcome_detail exist in last 24h (t230)
# Prevents memory pollution from repetitive infrastructure failures
local recent_count=0
local escaped_detail
escaped_detail="$(sql_escape "$outcome_detail")"
if [[ -r "$MEMORY_DB" ]]; then
    recent_count=$(sqlite3 "$MEMORY_DB" \
        "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
        2>/dev/null || echo "0")
```
Harden the rate‑limit query against LIKE wildcard over‑matching.
Line 10055: `LIKE '%${escaped_detail}%'` treats `_` and `%` as wildcards, so common `outcome_detail` values (e.g., `clean_exit_no_signal`) can over-match and prematurely suppress storage. Use `instr()` or escape LIKE wildcards; also guard empty detail to avoid matching everything.
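The wildcard semantics are easy to demonstrate with a throwaway table (illustrative schema and data): each `_` in the pattern matches any single character, so a space-separated variant of the detail also matches under LIKE, while `instr()` performs a literal substring test.

```shell
#!/usr/bin/env bash
DB=$(mktemp)
sqlite3 "$DB" "CREATE TABLE t (content TEXT);
               INSERT INTO t VALUES ('clean exit no signal'), ('clean_exit_no_signal');"

# LIKE: '_' matches any single character, so BOTH rows match the pattern
like_count=$(sqlite3 "$DB" "SELECT COUNT(*) FROM t WHERE content LIKE '%clean_exit_no_signal%';")

# instr(): literal substring test, so only the exact detail matches
instr_count=$(sqlite3 "$DB" "SELECT COUNT(*) FROM t WHERE instr(content, 'clean_exit_no_signal') > 0;")

echo "LIKE: $like_count, instr: $instr_count"   # LIKE: 2, instr: 1
rm -f "$DB"
```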
🔧 Suggested fix (avoid LIKE wildcards)

```diff
-local escaped_detail
-escaped_detail="$(sql_escape "$outcome_detail")"
-if [[ -r "$MEMORY_DB" ]]; then
-  recent_count=$(sqlite3 "$MEMORY_DB" \
-    "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND content LIKE '%${escaped_detail}%' AND created_at > datetime('now', '-1 day');" \
-    2>/dev/null || echo "0")
-fi
+local escaped_detail
+escaped_detail="$(sql_escape "$outcome_detail")"
+if [[ -r "$MEMORY_DB" && -n "$escaped_detail" ]]; then
+  recent_count=$(sqlite3 "$MEMORY_DB" \
+    "SELECT COUNT(*) FROM learnings WHERE type = 'FAILURE_PATTERN' AND instr(content, '${escaped_detail}') > 0 AND created_at > datetime('now', '-1 day');" \
+    2>/dev/null || echo "0")
+fi
```

🤖 Prompt for AI Agents
In @.agents/scripts/supervisor-helper.sh around lines 10048 - 10056, The LIKE
'%${escaped_detail}%' comparison can over-match because % and _ are wildcards
and an empty detail matches everything; instead guard empty outcome_detail and
use a non-wildcard search such as instr(content, '...') to test substring
membership. Concretely: after computing escaped_detail (via sql_escape), if
escaped_detail is empty set recent_count=0 and skip the sqlite3 call; otherwise
change the SQL to "SELECT COUNT(*) FROM learnings WHERE type='FAILURE_PATTERN'
AND instr(content, '${escaped_detail}')>0 AND created_at > datetime('now','-1
day');" so you avoid LIKE wildcard semantics while still using sql_escape to
protect quoting; assign that result to recent_count as before.



Summary
- Pruned duplicate `clean_exit_no_signal` memory entries (FAILURE_PATTERN, ERROR_FIX, etc.) down to consolidated summaries
- Hardened `store_failure_pattern()` against repetitive infrastructure noise
- Added a `prune-patterns` subcommand to `memory-helper.sh` for targeted pattern cleanup

Changes
Prevention (supervisor-helper.sh)
- Skip `clean_exit_no_signal` retries in `store_failure_pattern`; these are infrastructure noise, not actionable failure patterns
- Rate-limit to max 3 entries per `outcome_detail` in 24h to prevent memory pollution from any repetitive error

Cleanup (memory-helper.sh)
- New `prune-patterns` subcommand for bulk consolidation of repetitive pattern entries by error keyword
- Supports `--dry-run` for safe preview

Execution
- Ran `prune-patterns` against the live memory DB to clean up the existing 83 duplicate entries

Testing
Closes #t230
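The prevention behavior described above can be sketched end-to-end. This is a simplified stand-in for `store_failure_pattern`, with an assumed schema, the threshold of 3 from the commit message, and the `instr()` matching variant suggested in review; it is a demo of the technique, not the repo's implementation.

```shell
#!/usr/bin/env bash
DB=$(mktemp)
sqlite3 "$DB" "CREATE TABLE learnings (type TEXT, content TEXT, created_at TEXT);"

# Simplified stand-in: refuse to store a 4th entry for the same detail within 24h
store_failure_pattern_demo() {
    local detail="$1" n
    n=$(sqlite3 "$DB" "SELECT COUNT(*) FROM learnings
                       WHERE type = 'FAILURE_PATTERN'
                         AND instr(content, '$detail') > 0
                         AND created_at > datetime('now', '-1 day');")
    if [ "$n" -ge 3 ]; then
        return 0   # rate-limited: skip storing this repeat
    fi
    sqlite3 "$DB" "INSERT INTO learnings VALUES ('FAILURE_PATTERN', 'outcome: $detail', datetime('now'));"
}

for i in 1 2 3 4 5; do store_failure_pattern_demo "ssh_timeout"; done
stored=$(sqlite3 "$DB" "SELECT COUNT(*) FROM learnings;")
echo "stored: $stored"   # stored: 3 (the 4th and 5th calls were rate-limited)
rm -f "$DB"
```

Five identical failures yield only three stored entries, which is exactly the cap that prevents the 55+ duplicate pile-ups this PR cleans up.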