feat: MCP on-demand loading - reduce startup tokens by ~4.6K#275
feat: MCP on-demand loading - reduce startup tokens by ~4.6K#275marcusquinn merged 1 commit intomainfrom
Conversation
…obally Reduces session startup tokens by ~4.6K by disabling heavy MCPs globally and enabling them on-demand via subagents: - playwriter (~3K tokens): enable via @playwriter subagent - augment-context-engine (~1K tokens): enable via @augment-context-engine - gh_grep (~600 tokens): replaced by @github-search (uses rg/bash) Changes: - Add mcp: frontmatter to playwriter.md and augment-context-engine.md - Create github-search.md subagent (rg/bash patterns, no MCP needed) - Update generate-opencode-agents.sh to remove MCPs from DEFAULT_TOOLS - Update mcp-discovery.md with on-demand pattern documentation - Add github-search to subagent-index.toon osgrep remains enabled as primary semantic search (local, no auth).
Summary of ChangesHello @marcusquinn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a crucial optimization for token management by shifting several resource-intensive Multi-Context Providers (MCPs) to an on-demand loading model. By doing so, it drastically cuts down the initial token overhead during session startup, making the system more efficient and responsive. The change also refines how certain functionalities, like GitHub code search, are integrated, favoring lighter, more performant alternatives. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
WalkthroughThis PR implements an explicit MCP On-Demand Loading Strategy by disabling heavy MCPs globally (playwriter, augment-context-engine, gh_grep) to reduce token usage, keeping osgrep as the primary semantic search tool, and extending the subagent index with new tool and service categories while updating MCP discovery documentation accordingly. Changes
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
There was a problem hiding this comment.
Code Review
This pull request effectively reduces the startup context size by disabling several heavy MCPs and enabling them on-demand through subagents. The changes are well-documented through comments in the configuration script and updates to the markdown documentation. The introduction of the @github-search subagent to replace the gh_grep MCP is a smart move. I have a couple of suggestions to improve the robustness of a shell command example and to clarify the documentation around the new on-demand loading mechanism.
| # Clone a specific repo | ||
| gh repo clone vercel/next.js -- --depth 1 | ||
|
|
||
| # Search within it | ||
| rg "getServerSession" next.js/ | ||
|
|
||
| # Clean up | ||
| rm -rf next.js |
There was a problem hiding this comment.
The Clone and Search Pattern example uses a fixed directory name (next.js) and rm -rf for cleanup. This could lead to accidental data loss if a directory with the same name already exists for other reasons. A more robust approach is to use a temporary directory created with mktemp to avoid name collisions and ensure the cleanup is safe.
| # Clone a specific repo | |
| gh repo clone vercel/next.js -- --depth 1 | |
| # Search within it | |
| rg "getServerSession" next.js/ | |
| # Clean up | |
| rm -rf next.js | |
| # Create a temporary directory to avoid name collisions | |
| CLONE_DIR=$(mktemp -d) | |
| # Clone a specific repo into the temp directory | |
| gh repo clone vercel/next.js "$CLONE_DIR" -- --depth 1 | |
| # Search within it | |
| rg "getServerSession" "$CLONE_DIR" | |
| # Clean up | |
| rm -rf "$CLONE_DIR" |
| **Note**: The `mcp:` field is declarative - it documents which MCP the subagent | ||
| requires. The actual enabling happens in `generate-opencode-agents.sh` which | ||
| configures OpenCode's per-agent tool permissions. |
There was a problem hiding this comment.
This note appears to contradict the changes in generate-opencode-agents.sh. The script is modified to disable MCPs from the default configuration, while this note claims the script is where the "actual enabling happens". This is confusing for anyone trying to understand the on-demand loading mechanism.
Could you clarify how the enabling works? If it's a dynamic, runtime process triggered by the mcp: frontmatter, the documentation should be updated to reflect that, rather than pointing to the static generation script. A clearer note might be: "The mcp: field is a directive for the agent runtime to dynamically load the required MCP tools when this subagent is invoked."
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Mon Feb 2 17:47:49 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
.agent/tools/context/augment-context-engine.md (1)
48-48:⚠️ Potential issue | 🟡 MinorDocumentation inconsistency with the on-demand loading strategy.
Line 48 states "Enabled for Agents: All 14 primary agents (as fallback to osgrep)" but the PR's on-demand loading strategy disables
augment-context-engine_*globally. The MCP is now only enabled when invoking the@augment-context-enginesubagent.Consider updating this line to reflect the new on-demand pattern:
-**Enabled for Agents**: All 14 primary agents (as fallback to osgrep) +**Enabled via**: `@augment-context-engine` subagent (on-demand, ~1K tokens saved).agent/scripts/generate-opencode-agents.sh (5)
507-530:⚠️ Potential issue | 🔴 CriticalEAGER_MCPS list contradicts the on-demand loading strategy.
Line 510 includes
playwriter,augment-context-engine, andgh_grepinEAGER_MCPS, but lines 125-133 document these as disabled globally to save tokens. This inconsistency means the MCPs will be eagerly loaded at startup, negating the ~4.6K token savings.Proposed fix - move to LAZY_MCPS
# Eager-loaded (enabled: True): Used by all main agents, start at launch -EAGER_MCPS = {'osgrep', 'augment-context-engine', 'context7', 'playwriter', 'gh_grep', 'sentry', 'socket'} +EAGER_MCPS = {'osgrep', 'context7', 'sentry', 'socket'} # Lazy-loaded (enabled: False): Subagent-only, start on-demand LAZY_MCPS = {'claude-code-mcp', 'outscraper', 'dataforseo', 'shadcn', 'macos-automator', - 'gsc', 'localwp', 'chrome-devtools', 'quickfile', 'amazon-order-history', - 'google-analytics-mcp', 'MCP_DOCKER', 'ahrefs'} + 'gsc', 'localwp', 'chrome-devtools', 'quickfile', 'amazon-order-history', + 'google-analytics-mcp', 'MCP_DOCKER', 'ahrefs', 'playwriter', 'augment-context-engine', 'gh_grep'}
546-582:⚠️ Potential issue | 🔴 CriticalGlobal tool enablement contradicts on-demand strategy.
Lines 547, 567, and 580-582 enable
osgrep_*,playwriter_*, and setgh_grep_*globally. Per the documented strategy, onlyosgrep_*should be enabled globally; the others should remain disabled (or not be set toTrue).Proposed fix
# osgrep_* enabled globally (used by all main agents) config['tools']['osgrep_*'] = True -# playwriter_* enabled globally (used by all main agents) -config['tools']['playwriter_*'] = True +# playwriter_* disabled globally (enabled via `@playwriter` subagent) +config['tools']['playwriter_*'] = False ... -# gh_grep tools disabled globally, enabled for specific agents -if 'gh_grep_*' not in config['tools']: - config['tools']['gh_grep_*'] = False - print(" Set gh_grep_* disabled globally (enabled for Build+)") +# gh_grep tools disabled globally (replaced by `@github-search` subagent) +config['tools']['gh_grep_*'] = False +print(" Set gh_grep_* disabled globally (use `@github-search` subagent)")
549-564:⚠️ Potential issue | 🔴 CriticalPlaywriter MCP configuration inconsistent with on-demand goal.
The playwriter MCP is added with
"enabled": True(lines 556, 562), but the PR objective is to disable it globally. The MCP server should be configured but disabled until the@playwritersubagent is invoked.Proposed fix
if bun_path: config['mcp']['playwriter'] = { "type": "local", "command": ["bun", "x", "playwriter@latest"], - "enabled": True + "enabled": False } else: config['mcp']['playwriter'] = { "type": "local", "command": ["npx", "playwriter@latest"], - "enabled": True + "enabled": False } - print(" Added playwriter MCP (eager load - used by all agents)") + print(" Added playwriter MCP (lazy load - `@playwriter` subagent only)")
571-577:⚠️ Potential issue | 🔴 Criticalgh_grep MCP should be disabled for on-demand loading.
The gh_grep MCP is added with
"enabled": True(line 575), contradicting the documented strategy where it's replaced by the@github-searchsubagent using rg/bash.Proposed fix
if 'gh_grep' not in config['mcp']: config['mcp']['gh_grep'] = { "type": "remote", "url": "https://mcp.grep.app", - "enabled": True + "enabled": False } - print(" Added gh_grep MCP (eager load - used by Build+)") + print(" Added gh_grep MCP (lazy load - use `@github-search` instead)")
401-424:⚠️ Potential issue | 🟠 MajorOmO agents (Sisyphus, Planner-Sisyphus) still enable augment-context-engine_*.
Lines 409 and 421 enable
augment-context-engine_*: Truefor Sisyphus and Planner-Sisyphus, which contradicts the on-demand loading strategy. These should use the@augment-context-enginesubagent pattern instead.Proposed fix
sorted_agents["Sisyphus"] = { ... "tools": { "write": True, "edit": True, "bash": True, "read": True, "glob": True, "grep": True, "webfetch": True, "task": True, "todoread": True, "todowrite": True, - "context7_*": True, "osgrep_*": True, "augment-context-engine_*": True + "context7_*": True, "osgrep_*": True }, ... } sorted_agents["Planner-Sisyphus"] = { ... "tools": { "write": False, "edit": False, "bash": False, "read": True, "glob": True, "grep": True, "webfetch": True, "task": False, - "context7_*": True, "osgrep_*": True, "augment-context-engine_*": True + "context7_*": True, "osgrep_*": True }, ... }
🤖 Fix all issues with AI agents
In @.agent/tools/context/github-search.md:
- Line 74: Replace ripgrep usages that specify the non-existent built-in type
"tsx" with the existing "ts" type so both .ts and .tsx files are matched; update
the occurrences of --type tsx and -t tsx in the document (e.g., the command
shown as rg "useState\(.*loading" --type tsx and the -t tsx mention) to --type
ts and -t ts respectively, and also change the other instance around line 104 to
use --type ts.
| rg -C 3 "pattern" --type js | ||
|
|
||
| # Regex patterns | ||
| rg "useState\(.*loading" --type tsx |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, check if the file exists and read the relevant lines
cat -n .agent/tools/context/github-search.md | sed -n '70,110p'Repository: marcusquinn/aidevops
Length of output: 1050
🏁 Script executed:
# Verify ripgrep's type system
rg --type-list | head -100Repository: marcusquinn/aidevops
Length of output: 2914
🏁 Script executed:
# Specifically search for ts/tsx types
rg --type-list | grep -i "ts\|tsx"Repository: marcusquinn/aidevops
Length of output: 755
Use --type ts to match both TypeScript and TSX files — the ts type built-in type in ripgrep explicitly includes *.tsx files, so -t ts or --type ts covers both.
The --type tsx flag doesn't exist as a built-in ripgrep type. While the proposed glob pattern fix (-g '*.tsx') would work, the more direct approach is to use the existing ts type which already encompasses .tsx files.
Apply at lines 74 and 104:
Recommended approach
-rg "useState\(.*loading" --type tsx
+rg "useState\(.*loading" --type ts-rg "useEffect\(\(\) => \{" --type tsx -C 2
+rg "useEffect\(\(\) => \{" --type ts -C 2Also correct line 77: -t tsx should be -t ts.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rg "useState\(.*loading" --type tsx | |
| rg "useState\(.*loading" --type ts |
🤖 Prompt for AI Agents
In @.agent/tools/context/github-search.md at line 74, Replace ripgrep usages
that specify the non-existent built-in type "tsx" with the existing "ts" type so
both .ts and .tsx files are matched; update the occurrences of --type tsx and -t
tsx in the document (e.g., the command shown as rg "useState\(.*loading" --type
tsx and the -t tsx mention) to --type ts and -t ts respectively, and also change
the other instance around line 104 to use --type ts.
🤖 Augment PR SummarySummary: This PR shifts several heavy MCP integrations to an on-demand subagent model to reduce session startup context (claimed ~4.6K tokens). Changes:
Technical Notes: 🤖 Was this summary useful? React with 👍 or 👎 |
| # to reduce context bloat from verbose tool descriptions | ||
| # | ||
| # MCP On-Demand Loading Strategy: | ||
| # The following MCPs are DISABLED globally to reduce context token usage (~4.6K saved): |
There was a problem hiding this comment.
The comments here say playwriter_* is disabled globally, but later in this script config['tools']['playwriter_*'] = True and the Playwriter MCP is configured as eager-loaded, so it will still load for all agents and likely negate the intended token savings.
Other Locations
.agent/scripts/generate-opencode-agents.sh:549.agent/scripts/generate-opencode-agents.sh:566
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| # MCP On-Demand Loading Strategy: | ||
| # The following MCPs are DISABLED globally to reduce context token usage (~4.6K saved): | ||
| # - playwriter_*: ~3K tokens - enable via @playwriter subagent | ||
| # - augment-context-engine_*: ~1K tokens - enable via @augment-context-engine subagent |
There was a problem hiding this comment.
After removing augment-context-engine_* from DEFAULT_TOOLS, it looks like @augment-context-engine won’t actually be granted augment-context-engine_* anywhere during subagent generation (only playwriter is handled in the case block), so invoking that subagent may not work as described.
Other Locations
.agent/scripts/generate-opencode-agents.sh:691.agent/scripts/generate-opencode-agents.sh:712
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| # The following MCPs are DISABLED globally to reduce context token usage (~4.6K saved): | ||
| # - playwriter_*: ~3K tokens - enable via @playwriter subagent | ||
| # - augment-context-engine_*: ~1K tokens - enable via @augment-context-engine subagent | ||
| # - gh_grep_*: ~600 tokens - replaced by @github-search subagent (uses rg/bash) |
There was a problem hiding this comment.
This claims gh_grep_* is disabled and replaced by @github-search, but the script still configures gh_grep as an eager-loaded MCP server; if the goal is to remove its startup overhead, this likely needs to be made lazy/optional too.
Other Locations
.agent/scripts/generate-opencode-agents.sh:569.agent/scripts/generate-opencode-agents.sh:510
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| 3. Use the MCP tools as documented in the subagent | ||
|
|
||
| **Note**: The `mcp:` field is declarative - it documents which MCP the subagent | ||
| requires. The actual enabling happens in `generate-opencode-agents.sh` which |
There was a problem hiding this comment.
The doc says the mcp: field is declarative and "actual enabling happens" in generate-opencode-agents.sh, but the generator currently doesn’t appear to read mcp: frontmatter (subagent tool enablement is hardcoded by subagent name). This mismatch could confuse users about how on-demand MCPs are actually activated.
Other Locations
.agent/scripts/generate-opencode-agents.sh:238.agent/scripts/generate-opencode-agents.sh:691
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| gh search code "getServerSession" --repo nextauthjs/next-auth --limit 10 | ||
|
|
||
| # Filter by file path | ||
| gh search code "middleware" --filename "*.ts" --limit 10 |
There was a problem hiding this comment.



Summary
Changes
playwriter@playwritersubagentaugment-context-engine@augment-context-enginesubagentgh_grep@github-searchsubagent (rg/bash)Files changed:
playwriter.md- Addmcp:frontmatteraugment-context-engine.md- Addmcp:frontmattergithub-search.md- New subagent (replaces gh_grep with rg/bash)generate-opencode-agents.sh- Remove MCPs from DEFAULT_TOOLSmcp-discovery.md- Document on-demand patternsubagent-index.toon- Add github-searchTesting
osgrepremains enabled as primary semantic search./setup.shafter merge to regenerate OpenCode config@playwriter,@augment-context-engine,@github-searchsubagentsRelated
Addresses token efficiency discussion in session about reducing startup overhead.
Summary by CodeRabbit
New Features
Documentation
Chores