Litellm docs mcp filtering semantic#20316
Merged
ishaan-jaff merged 22 commits intomainfrom Feb 3, 2026
Merged
Conversation
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Contributor
Greptile OverviewGreptile SummaryThis PR introduces MCP Semantic Tool Filtering, a new feature that reduces context window usage by semantically filtering MCP tools before sending them to LLMs. The implementation adds a pre-call hook that uses semantic-router to match user queries against tool descriptions and returns only the top-K most relevant tools. Key Changes:
The implementation correctly follows the custom rule about avoiding Router object creation in the request path - the SemanticRouter is built once at startup and reused for all requests. Error handling is graceful, falling back to unfiltered tools if any step fails. Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| litellm/proxy/proxy_server.py | Adds semantic tool filter initialization during proxy startup - properly placed outside request path |
| litellm/proxy/hooks/mcp_semantic_filter/hook.py | Implements pre-call hook for semantic tool filtering - efficient design with startup initialization |
| litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py | Core semantic filtering logic using semantic-router - router built at startup, not per-request |
| tests/mcp_tests/test_semantic_tool_filter_e2e.py | End-to-end tests validating semantic filter behavior with real proxy server |
| tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py | Comprehensive unit tests for semantic tool filtering logic with various scenarios |
Sequence Diagram
sequenceDiagram
participant Client
participant ProxyServer as Proxy Server
participant Hook as SemanticToolFilterHook
participant Filter as SemanticMCPToolFilter
participant MCPHandler as LiteLLM_Proxy_MCP_Handler
participant SemanticRouter as semantic-router
participant LLM as LLM Provider
Note over ProxyServer,Filter: Startup Phase
ProxyServer->>Hook: _initialize_semantic_tool_filter()
Hook->>Filter: initialize_from_config(config, llm_router)
Filter->>Filter: Create SemanticMCPToolFilter instance
Filter->>Filter: build_router_from_mcp_registry()
Filter->>SemanticRouter: Build SemanticRouter with tool embeddings
SemanticRouter->>LLM: Generate embeddings for tool descriptions
LLM-->>SemanticRouter: Return embeddings
SemanticRouter-->>Filter: Router ready with indexed tools
Filter-->>Hook: SemanticToolFilterHook instance
Hook-->>ProxyServer: Register hook with litellm.logging_callback_manager
Note over Client,LLM: Request Phase - Semantic Filtering
Client->>ProxyServer: POST /v1/chat/completions (with MCP tools)
ProxyServer->>Hook: async_pre_call_hook(data, user_api_key_dict)
alt MCP references need expansion
Hook->>Hook: _should_expand_mcp_tools(tools)
Hook->>MCPHandler: _expand_mcp_tools(tools, user_api_key_dict)
MCPHandler->>MCPHandler: _parse_mcp_tools() → separate MCP from others
MCPHandler->>MCPHandler: _process_mcp_tools_to_openai_format()
MCPHandler-->>Hook: Expanded tools (OpenAI format dicts)
end
Hook->>Filter: extract_user_query(messages)
Filter-->>Hook: User query string
Hook->>Filter: filter_tools(query, available_tools)
Filter->>SemanticRouter: router(text=query, limit=top_k)
SemanticRouter->>LLM: Generate query embedding
LLM-->>SemanticRouter: Query embedding
SemanticRouter->>SemanticRouter: Calculate similarity scores
SemanticRouter-->>Filter: Top-K matched tool names
Filter->>Filter: _get_tools_by_names() → preserve format
Filter-->>Hook: Filtered tools (top-K most relevant)
Hook->>Hook: Update data["tools"] with filtered tools
Hook->>Hook: Store metadata for response headers
Hook-->>ProxyServer: Modified request data
ProxyServer->>LLM: Forward request with filtered tools
LLM-->>ProxyServer: LLM response
ProxyServer->>Hook: async_post_call_response_headers_hook()
Hook-->>ProxyServer: Add x-litellm-semantic-filter headers
ProxyServer-->>Client: Response with filter stats in headers
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/litellm/directory, Adding at least 1 test is a hard requirement - see detailsmake test-unitCI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes