feat: Agent Design Pattern Improvements (t052-t057, t067)#140
feat: Agent Design Pattern Improvements (t052-t057, t067)#140marcusquinn merged 3 commits intomainfrom
Conversation
- t053: Verified all 195 subagents have YAML frontmatter (already complete) - t054: Add session-distill-helper.sh for automatic session reflection to memory - t055: Document cache-aware prompt patterns in build-agent.md - t056: Add mcp-index-helper.sh for on-demand MCP tool discovery with SQLite FTS5 - t057: Add consolidate command to memory-helper.sh for merging similar memories - t067: Implement on-demand MCP loading pattern in generate-opencode-agents.sh Key improvements: - MCP tools indexed in SQLite for fast search without loading all definitions - Session learnings auto-extracted from git history (ERROR_FIX, WORKING_SOLUTION, etc.) - Cache-aware prompt patterns documented for better prompt caching - Memory consolidation reduces redundant entries - AGENTS.md updated with MCP on-demand loading section Token savings: On-demand MCP loading can reduce context by 50-90% by only loading tools when their subagent is invoked.
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (4)
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThe PR introduces three core automation features: on-demand MCP tool discovery via SQLite FTS5 indexing, automatic session-based learning distillation, and memory entry consolidation with deduplication. It extends CLI surfaces for memory operations, integrates MCP index syncing into agent generation, and adds comprehensive documentation. No breaking changes. Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as Agent Generator
participant MCP as mcp-index-helper.sh
participant SQLite as SQLite FTS5 Index
participant Config as opencode.json
participant User as User/Subagent
Agent->>Agent: Generate subagent
Agent->>MCP: Invoke sync command
MCP->>Config: Read MCP & tool definitions
Config-->>MCP: JSON config data
MCP->>SQLite: Initialize/refresh index
MCP->>SQLite: Populate mcp_tools & mcp_tools_fts
SQLite-->>MCP: Index updated
MCP-->>Agent: Sync complete (green confirmation)
Agent-->>User: Display MCP loading guidance
User->>MCP: search "desired_capability"
MCP->>SQLite: Query mcp_tools_fts (FTS5)
SQLite-->>MCP: Ranked search results
MCP-->>User: Display matching MCPs & tools
sequenceDiagram
participant Session as Session Directory
participant Analyzer as analyze_session()
participant Extractor as extract_learnings()
participant Memory as memory-helper.sh
participant DB as Memory Store
Analyzer->>Session: Collect git context (branch, commits, files)
Session-->>Analyzer: Git metadata
Analyzer->>Session: Detect errors, check TODO.md
Session-->>Analyzer: Error fixes, completed tasks
Analyzer->>Analyzer: Generate session-analysis.json
Extractor->>Analyzer: Read session-analysis.json
Analyzer-->>Extractor: Analysis data
Extractor->>Extractor: Derive learning patterns (ERROR_FIX, WORKING_SOLUTION, etc.)
Extractor->>Extractor: Generate extracted-learnings.json
Memory->>DB: Store each learning (store command)
DB-->>Memory: Entries committed
Memory-->>Extractor: Storage confirmed
Extractor->>Session: Clean up temp files
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @marcusquinn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces significant improvements to the agent design pattern, primarily focusing on enhancing context efficiency and intelligent memory management. It implements an on-demand loading mechanism for MCP tools, drastically reducing token consumption by only loading necessary tools. Complementing this, the PR automates the capture of session learnings into a persistent memory system and provides a new utility to consolidate these memories, ensuring a streamlined and relevant knowledge base. Additionally, it includes crucial documentation for optimizing prompt design to leverage LLM caching, collectively aiming for more performant and cost-effective agent operations. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Wed Jan 21 05:17:17 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Code Review
This pull request introduces significant improvements to the agent design pattern, focusing on on-demand MCP loading and memory management to optimize context usage. The new scripts mcp-index-helper.sh and session-distill-helper.sh are well-structured and the documentation updates are clear. However, I've identified several critical security vulnerabilities related to SQL injection in mcp-index-helper.sh due to unescaped user input. There are also critical issues with unsafe JSON construction in session-distill-helper.sh and correctness bugs in memory-helper.sh and generate-opencode-agents.sh. These issues should be addressed to ensure the stability and security of the new features.
.agent/scripts/mcp-index-helper.sh
Outdated
| category as Category, | ||
| CASE enabled_globally WHEN 1 THEN 'Yes' ELSE 'No' END as Global | ||
| FROM mcp_tools_fts | ||
| WHERE mcp_tools_fts MATCH '$query' |
There was a problem hiding this comment.
The $query variable is directly injected into the SQL query, creating a SQL injection vulnerability. An attacker could craft a malicious search query to manipulate or damage the database. The input must be sanitized before being used in the query.
| WHERE mcp_tools_fts MATCH '$query' | |
| WHERE mcp_tools_fts MATCH '${query//'/'"'"'}' |
.agent/scripts/mcp-index-helper.sh
Outdated
| description as Description, | ||
| CASE enabled_globally WHEN 1 THEN 'Yes' ELSE 'No' END as Global | ||
| FROM mcp_tools | ||
| WHERE mcp_name = '$mcp_name' |
There was a problem hiding this comment.
.agent/scripts/mcp-index-helper.sh
Outdated
| sqlite3 "$INDEX_DB" <<EOF | ||
| SELECT DISTINCT mcp_name | ||
| FROM mcp_tools | ||
| WHERE tool_name LIKE '%$tool_query%' |
There was a problem hiding this comment.
The $tool_query variable is directly embedded in the LIKE clause, creating a SQL injection vulnerability. A malicious tool query could be used to alter the query's logic and potentially access or modify unintended data. The input should be sanitized.
| WHERE tool_name LIKE '%$tool_query%' | |
| WHERE tool_name LIKE '%${tool_query//'/'"'"'}%' |
| if [[ -n "$fix_commits" ]]; then | ||
| while IFS= read -r commit_msg; do | ||
| if [[ -n "$commit_msg" ]]; then | ||
| learnings+=("{\"type\": \"ERROR_FIX\", \"content\": \"$commit_msg\", \"tags\": \"session,auto-distill,$branch\"}") |
There was a problem hiding this comment.
Manually constructing JSON with unescaped variables is unsafe and will create invalid JSON if the variables contain special characters like quotes. This pattern is repeated for different learning types in this function.
You should use jq to safely construct the JSON objects. For example:
learning_json=$(jq -n --arg type "ERROR_FIX" --arg content "$commit_msg" --arg tags "session,auto-distill,$branch" '{type: $type, content: $content, tags: $tags}')
learnings+=("$learning_json")This should be applied to all places where JSON is manually constructed in this function (lines 138, 147, 156, 167).
| "$MCP_INDEX_HELPER" sync 2>/dev/null || echo -e " ${YELLOW}⚠${NC} MCP index sync skipped (non-critical)" | ||
| echo -e " ${GREEN}✓${NC} MCP tool index updated" |
There was a problem hiding this comment.
The success message MCP tool index updated is always printed, even if the sync command fails. This is misleading and can hide failures in the agent generation process. The success message should only be shown when the command actually succeeds.
| "$MCP_INDEX_HELPER" sync 2>/dev/null || echo -e " ${YELLOW}⚠${NC} MCP index sync skipped (non-critical)" | |
| echo -e " ${GREEN}✓${NC} MCP tool index updated" | |
| if "$MCP_INDEX_HELPER" sync 2>/dev/null; then | |
| echo -e " ${GREEN}✓${NC} MCP tool index updated" | |
| else | |
| echo -e " ${YELLOW}⚠${NC} MCP index sync skipped (non-critical)" | |
| fi |
.agent/scripts/memory-helper.sh
Outdated
| )) * 2.0 / ( | ||
| length(l1.content) - length(replace(l1.content, ' ', '')) + 1 + | ||
| length(l2.content) - length(replace(l2.content, ' ', '')) + 1 | ||
| ) > 0.5 |
There was a problem hiding this comment.
The similarity threshold is hardcoded to 0.5, which means the --threshold command-line option is ignored. The $similarity_threshold variable (defined on line 529) should be used here instead.
To fix this, you'll need to change the heredoc on line 546 from <<'EOF' to <<EOF to allow shell variable expansion, and then replace 0.5 with $similarity_threshold.
| (SELECT COUNT(*) FROM ( | ||
| SELECT value FROM json_each('["' || replace(lower(l1.content), ' ', '","') || '"]') | ||
| INTERSECT | ||
| SELECT value FROM json_each('["' || replace(lower(l2.content), ' ', '","') || '"]') | ||
| )) * 2.0 / ( |
There was a problem hiding this comment.
The method used to construct a JSON array from the content string is not robust. If the content contains special characters like a double quote ("), it will create invalid JSON and cause the SQL query to fail. The content should be properly escaped before being used in the json_each function. This applies to both l1.content and l2.content.
| (SELECT COUNT(*) FROM ( | |
| SELECT value FROM json_each('["' || replace(lower(l1.content), ' ', '","') || '"]') | |
| INTERSECT | |
| SELECT value FROM json_each('["' || replace(lower(l2.content), ' ', '","') || '"]') | |
| )) * 2.0 / ( | |
| (SELECT COUNT(*) FROM ( | |
| SELECT value FROM json_each('["' || replace(replace(lower(l1.content), '"', '""'), ' ', '","') || '"]') | |
| INTERSECT | |
| SELECT value FROM json_each('["' || replace(replace(lower(l2.content), '"', '""'), ' ', '","') || '"]') | |
| )) |
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Fix all issues with AI agents
In @.agent/scripts/mcp-index-helper.sh:
- Around line 357-366: The WHERE clause uses the unescaped shell variable
$mcp_name which allows SQL injection; before invoking sqlite3, sanitize/escape
single quotes in mcp_name (e.g. mcp_name_esc="${mcp_name//\'/\'\'}") and then
use that escaped variable in the heredoc SQL (replace '$mcp_name' with
'$mcp_name_esc'), ensuring the sqlite3 invocation still references $INDEX_DB and
the mcp_tools table; this prevents injected quotes from breaking the query.
- Around line 316-327: The SQL injection risk comes from interpolating $query
directly into the FTS5 MATCH clause; update the script to avoid direct
interpolation by either using sqlite prepared/parameterized queries or by
sanitizing/escaping $query before embedding it: reference the variables
INDEX_DB, query and limit and the table mcp_tools_fts; implement one of (a) call
sqlite3 with a prepared statement / parameter binding for the MATCH and LIMIT
values, or (b) sanitize query by escaping single quotes (replace ' with '') and
removing/escaping FTS5 metacharacters before placing it into the here-doc so the
WHERE mcp_tools_fts MATCH '$query' is safe.
- Around line 440-446: The SQL in get_mcp_for_tool interpolates $tool_query
directly into a LIKE pattern causing SQL injection; fix by avoiding direct
interpolation: either bind the parameter to the sqlite3 query (use a
parameterized statement for the SELECT against the mcp_tools table) or properly
escape/quote $tool_query (escape single quotes and SQL wildcard characters '%'
and '_' and then wrap in the literal pattern) before passing it into the LIKE
clause so tool_name LIKE uses a safe, sanitized value instead of raw
$tool_query.
In @.agent/scripts/memory-helper.sh:
- Around line 607-616: The SQL update is vulnerable because merged_tags is
interpolated raw into sqlite3 and unescaped apostrophes will break the query;
before calling sqlite3 in the merge block (where older_tags, newer_tags,
merged_tags, older_id, newer_id are used) escape single quotes in merged_tags
(and similarly sanitize older_id if it can contain quotes) by replacing each '
with '' and then use that escaped value in the UPDATE statement so the SQL
string is safely formed.
- Around line 528-537: The script parses --threshold into the
similarity_threshold variable but never uses it; update the SQL that currently
hardcodes 0.5 (the similarity cutoff used when selecting similar embeddings) to
use the similarity_threshold variable instead, making sure to safely
expand/quote the shell variable into the here-doc or SQL execution command (and
validate/cast it as a numeric value if necessary); search for
similarity_threshold and the SQL block that references 0.5 in memory-helper.sh
to locate the exact place to substitute the hardcoded value.
- Around line 594-626: The consolidated counter is incremented inside a subshell
because the loop is fed via a pipeline (echo "$duplicates" | while ...), so the
parent shell never observes updates; change the loop to avoid the subshell by
using process substitution or here-string (e.g. replace the pipeline feeding the
while loop with while IFS='|' read -r ...; do ...; done <<< "$duplicates" or
done < <(printf '%s\n' "$duplicates")), keep the same body (variables
older_id/newer_id, merging tags, sqlite3 updates, and the ((consolidated++))
increment) so the consolidated variable in the parent shell reflects the actual
count used later for the success message.
In @.agent/scripts/session-distill-helper.sh:
- Around line 216-231: The post-increment arithmetic ((count++)) and
((stored++)) can cause the script to exit under set -e because the arithmetic
expression returns the previous value; replace those with safe assignment
increments using count=$((count + 1)) and stored=$((stored + 1)) in the loop
that processes learnings (variables: count, stored) and keep the MEMORY_HELPER
store call logic unchanged so increments always succeed regardless of set -e.
- Around line 85-96: The heredoc that writes to "$analysis_file" is vulnerable
to JSON injection because unescaped variables (branch, commits_today,
files_changed, error_fixes, completed_tasks, recent_commits) can contain
quotes/newlines; replace the raw heredoc with a safe jq-based builder: use jq -n
with --arg for string vars (e.g., branch), --argjson for numeric counters
(commits_today, files_changed, error_fixes, completed_tasks) and convert
recent_commits into a JSON array via jq -R -s 'split("\n") |
map(select(length>0))' piped into jq or passed as --slurpfile, then write jq's
output to "$analysis_file" so all values are properly escaped and typed.
- Around line 136-141: The code pushes raw commit messages into the learnings
JSON array (e.g., the loop that appends ERROR_FIX entries using
learnings+=(...)) which allows quotes/special chars to break JSON; change these
three append sites (the ERROR_FIX loop handling commit_msg/fix_commits and the
similar blocks that add CODEBASE_PATTERN and CONTEXT entries) to construct JSON
safely using a JSON-serializer (e.g., call jq --arg type "ERROR_FIX" --arg
content "$commit_msg" --arg tags "session,auto-distill,$branch" to build the
object string) or use a printf+jq pattern so content is properly escaped before
adding to learnings; ensure you replace the direct string interpolation in the
learnings+=(...) expressions for ERROR_FIX, CODEBASE_PATTERN, and CONTEXT with
the safe-constructed JSON string.
🤖 Augment PR SummarySummary: Improves the aidevops agent design patterns to reduce prompt context usage and better capture/reuse learnings across sessions. Changes:
Technical Notes: The MCP discovery workflow is designed to keep MCPs disabled globally and enable them per-subagent to save ~50–90% of context tokens depending on the task. 🤖 Was this summary useful? React with 👍 or 👎 |
| MCP_INDEX_HELPER="$AGENTS_DIR/scripts/mcp-index-helper.sh" | ||
| if [[ -x "$MCP_INDEX_HELPER" ]]; then | ||
| "$MCP_INDEX_HELPER" sync 2>/dev/null || echo -e " ${YELLOW}⚠${NC} MCP index sync skipped (non-critical)" | ||
| echo -e " ${GREEN}✓${NC} MCP tool index updated" |
There was a problem hiding this comment.
| import os | ||
| import sys | ||
|
|
||
| config_path = os.path.expanduser("~/.config/opencode/opencode.json") |
There was a problem hiding this comment.
The bash layer supports configurable paths via AIDEVOPS_MCP_INDEX_DIR/OPENCODE_CONFIG, but the embedded Python uses hard-coded ~/.config/opencode/opencode.json and ~/.aidevops/.../mcp-tools.db, so overrides won’t take effect. This can write/read the index from an unexpected location when the env vars are set.
🤖 Was this useful? React with 👍 or 👎
| CASE enabled_globally WHEN 1 THEN 'Yes' ELSE 'No' END as Global | ||
| FROM mcp_tools_fts | ||
| WHERE mcp_tools_fts MATCH '$query' | ||
| ORDER BY rank |
There was a problem hiding this comment.
.agent/scripts/mcp-index-helper.sh
Outdated
| category as Category, | ||
| CASE enabled_globally WHEN 1 THEN 'Yes' ELSE 'No' END as Global | ||
| FROM mcp_tools_fts | ||
| WHERE mcp_tools_fts MATCH '$query' |
There was a problem hiding this comment.
User input is interpolated directly into SQL (MATCH '$query'), so queries containing '/FTS syntax can break the search (and the same pattern appears in other commands). Consider escaping/sanitizing user input before embedding it into SQL strings.
Other Locations
.agent/scripts/mcp-index-helper.sh:363.agent/scripts/mcp-index-helper.sh:443
🤖 Was this useful? React with 👍 or 👎
| while [[ $# -gt 0 ]]; do | ||
| case "$1" in | ||
| --dry-run) dry_run=true; shift ;; | ||
| --threshold) similarity_threshold="$2"; shift 2 ;; |
| WHERE ( | ||
| -- Check for significant word overlap | ||
| (SELECT COUNT(*) FROM ( | ||
| SELECT value FROM json_each('["' || replace(lower(l1.content), ' ', '","') || '"]') |
There was a problem hiding this comment.
This builds JSON for json_each() by concatenating raw content into a JSON literal, which will break if a memory contains quotes/backslashes/newlines (invalid JSON → sqlite error) and abort consolidation. Consider a safer tokenization approach that doesn’t require constructing JSON from unescaped content.
🤖 Was this useful? React with 👍 or 👎
| if [[ -n "$fix_commits" ]]; then | ||
| while IFS= read -r commit_msg; do | ||
| if [[ -n "$commit_msg" ]]; then | ||
| learnings+=("{\"type\": \"ERROR_FIX\", \"content\": \"$commit_msg\", \"tags\": \"session,auto-distill,$branch\"}") |
There was a problem hiding this comment.
These learning objects are assembled by string-interpolating commit_msg into JSON without escaping, so commit messages containing quotes/backslashes can produce invalid JSON and cause the jq formatting/storage step to fail. Consider generating the JSON via a proper escaper (e.g., jq --arg ...) rather than manual string building.
Other Locations
.agent/scripts/session-distill-helper.sh:156.agent/scripts/session-distill-helper.sh:167
🤖 Was this useful? React with 👍 or 👎
- mcp-index-helper.sh: Escape query, mcp_name, tool_query before SQL interpolation to prevent SQL injection in FTS5 MATCH and LIKE clauses - memory-helper.sh: Escape merged_tags before SQL UPDATE, use similarity_threshold variable instead of hardcoded 0.5, fix subshell variable scope by using here-string instead of pipe - session-distill-helper.sh: Use jq for safe JSON building to prevent injection from commit messages, replace ((count++)) with count=$((count + 1)) for set -e compatibility Addresses CodeRabbit review comments on PR #140
|
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Wed Jan 21 05:32:12 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |



Summary
Implements agent design pattern improvements for better context efficiency and MCP on-demand loading:
Key Changes
New Scripts
mcp-index-helper.sh- SQLite FTS5 index for MCP tool discoverysession-distill-helper.sh- Extract learnings from session for memory storageUpdated Files
AGENTS.md- Added MCP on-demand loading sectionbuild-agent.md- Added cache-aware prompt patterns sectionmemory-helper.sh- Addedconsolidatecommandgenerate-opencode-agents.sh- Syncs MCP index on agent generationNew Documentation
tools/context/mcp-discovery.md- On-demand MCP discovery subagentToken Savings
On-demand MCP loading can reduce context by 50-90% by only loading tools when their subagent is invoked, rather than preloading all MCP definitions globally.
Testing
Related Tasks
Closes t052, t053, t054, t055, t056, t057, t067
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.