Litellm docs mcp filtering semantic by ishaan-jaff · Pull Request #20316 · BerriAI/litellm

ishaan-jaff · 2026-02-03T02:28:45Z

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention.

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

vercel · 2026-02-03T02:28:50Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
litellm	Building	Preview, Comment	Feb 3, 2026 2:28am

greptile-apps · 2026-02-03T02:32:52Z

Greptile Overview

Greptile Summary

This PR introduces MCP Semantic Tool Filtering, a new feature that reduces context window usage by semantically filtering MCP tools before sending them to LLMs. The implementation adds a pre-call hook that uses semantic-router to match user queries against tool descriptions and returns only the top-K most relevant tools.

Key Changes:

Startup initialization in proxy_server.py:793-809 builds the semantic router once during proxy startup, which keeps router construction off the request path
Hook implementation (hook.py) intercepts requests, expands MCP tool references, extracts user queries, and applies semantic filtering
Core filtering logic (semantic_tool_filter.py) uses semantic-router with embeddings to rank and select relevant tools
Configuration via litellm_settings.mcp_semantic_tool_filter with defaults for embedding model, top_k, and similarity threshold
Comprehensive testing with both unit tests and end-to-end tests covering various scenarios
Documentation includes usage examples, architecture diagrams, and configuration options
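
The configuration item above can be pictured as a small settings object read from the proxy config. This is a minimal sketch only: the summary names the `litellm_settings.mcp_semantic_tool_filter` key and says it carries an embedding model, a top_k, and a similarity threshold, but the field names (`enabled`, `embedding_model`, `top_k`, `similarity_threshold`), the default values, and the `load_filter_config` helper below are all hypothetical and not taken from the PR.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass(frozen=True)
class SemanticFilterConfig:
    # All field names and defaults here are illustrative assumptions,
    # not the values shipped in the PR.
    enabled: bool = False  # the review describes the feature as opt-in
    embedding_model: str = "example-embedding-model"
    top_k: int = 10
    similarity_threshold: float = 0.3


def load_filter_config(proxy_config: Dict[str, Any]) -> SemanticFilterConfig:
    """Read litellm_settings.mcp_semantic_tool_filter, applying defaults."""
    settings = proxy_config.get("litellm_settings") or {}
    raw = settings.get("mcp_semantic_tool_filter") or {}
    # Ignore unknown keys so a typo in the config does not crash startup.
    known = {f.name for f in fields(SemanticFilterConfig)}
    return SemanticFilterConfig(**{k: v for k, v in raw.items() if k in known})
```

A config that omits the section entirely yields the defaults, so the filter stays disabled unless it is explicitly turned on.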

The implementation follows the custom rule against creating Router objects in the request path: the SemanticRouter is built once at startup and reused for all requests. Error handling is graceful: if any step fails, the hook falls back to the unfiltered tool list.
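
The build-once, filter-per-request pattern with a fallback can be sketched in a few lines. This is not the PR's code: the real feature uses semantic-router with a proper embedding model, whereas the sketch below swaps in a toy bag-of-words vector and cosine similarity so that it runs with the standard library alone. The class name `ToyToolFilter` and its parameters are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Any, Dict, List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyToolFilter:
    """Hypothetical filter: index tools once, rank them per query."""

    def __init__(self, tools: List[Dict[str, Any]], top_k: int = 3,
                 threshold: float = 0.1) -> None:
        self.top_k = top_k
        self.threshold = threshold
        # Built once (at startup in the real feature), reused for every request.
        self.index = {t["name"]: embed(t["description"]) for t in tools}

    def filter_tools(self, query: str,
                     available_tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        try:
            q = embed(query)
            scored = [
                (cosine(q, self.index[t["name"]]), t)
                for t in available_tools
                if t["name"] in self.index
            ]
            # Keep tools at or above the threshold, best match first.
            kept = sorted(
                (pair for pair in scored if pair[0] >= self.threshold),
                key=lambda pair: pair[0],
                reverse=True,
            )
            return [tool for _, tool in kept[: self.top_k]]
        except Exception:
            # Graceful fallback: on any failure, return the tools unfiltered.
            return available_tools


tools = [
    {"name": "send_email", "description": "send an email message to a recipient"},
    {"name": "get_weather", "description": "get the weather forecast for a city"},
    {"name": "create_issue", "description": "create a github issue in a repository"},
]
tool_filter = ToyToolFilter(tools, top_k=1)
result = tool_filter.filter_tools("what is the weather forecast in paris", tools)
```

With `top_k=1`, only the weather tool survives for the weather query, and a query that cannot be embedded at all (for example `None`) returns the full tool list rather than raising.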

Confidence Score: 4/5

This PR is safe to merge with minimal risk: it is a well-designed feature with proper testing and error handling.
The score of 4 reflects a solid implementation with comprehensive tests and proper architectural separation. Router initialization happens at startup (not in the request path), error handling falls back to unfiltered tools, and the feature is opt-in. Minor consideration: the feature adds a new dependency on the semantic-router library and generates embeddings during requests, but this is by design and documented.
No files require special attention: all changes are well structured and tested.

Important Files Changed

Filename	Overview
litellm/proxy/proxy_server.py	Adds semantic tool filter initialization during proxy startup - properly placed outside request path
litellm/proxy/hooks/mcp_semantic_filter/hook.py	Implements pre-call hook for semantic tool filtering - efficient design with startup initialization
litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py	Core semantic filtering logic using semantic-router - router built at startup, not per-request
tests/mcp_tests/test_semantic_tool_filter_e2e.py	End-to-end tests validating semantic filter behavior with real proxy server
tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py	Comprehensive unit tests for semantic tool filtering logic with various scenarios

Sequence Diagram

sequenceDiagram
    participant Client
    participant ProxyServer as Proxy Server
    participant Hook as SemanticToolFilterHook
    participant Filter as SemanticMCPToolFilter
    participant MCPHandler as LiteLLM_Proxy_MCP_Handler
    participant SemanticRouter as semantic-router
    participant LLM as LLM Provider

    Note over ProxyServer,Filter: Startup Phase
    ProxyServer->>Hook: _initialize_semantic_tool_filter()
    Hook->>Filter: initialize_from_config(config, llm_router)
    Filter->>Filter: Create SemanticMCPToolFilter instance
    Filter->>Filter: build_router_from_mcp_registry()
    Filter->>SemanticRouter: Build SemanticRouter with tool embeddings
    SemanticRouter->>LLM: Generate embeddings for tool descriptions
    LLM-->>SemanticRouter: Return embeddings
    SemanticRouter-->>Filter: Router ready with indexed tools
    Filter-->>Hook: SemanticToolFilterHook instance
    Hook-->>ProxyServer: Register hook with litellm.logging_callback_manager

    Note over Client,LLM: Request Phase - Semantic Filtering
    Client->>ProxyServer: POST /v1/chat/completions (with MCP tools)
    ProxyServer->>Hook: async_pre_call_hook(data, user_api_key_dict)
    
    alt MCP references need expansion
        Hook->>Hook: _should_expand_mcp_tools(tools)
        Hook->>MCPHandler: _expand_mcp_tools(tools, user_api_key_dict)
        MCPHandler->>MCPHandler: _parse_mcp_tools() → separate MCP from others
        MCPHandler->>MCPHandler: _process_mcp_tools_to_openai_format()
        MCPHandler-->>Hook: Expanded tools (OpenAI format dicts)
    end
    
    Hook->>Filter: extract_user_query(messages)
    Filter-->>Hook: User query string
    
    Hook->>Filter: filter_tools(query, available_tools)
    Filter->>SemanticRouter: router(text=query, limit=top_k)
    SemanticRouter->>LLM: Generate query embedding
    LLM-->>SemanticRouter: Query embedding
    SemanticRouter->>SemanticRouter: Calculate similarity scores
    SemanticRouter-->>Filter: Top-K matched tool names
    Filter->>Filter: _get_tools_by_names() → preserve format
    Filter-->>Hook: Filtered tools (top-K most relevant)
    
    Hook->>Hook: Update data["tools"] with filtered tools
    Hook->>Hook: Store metadata for response headers
    Hook-->>ProxyServer: Modified request data
    
    ProxyServer->>LLM: Forward request with filtered tools
    LLM-->>ProxyServer: LLM response
    
    ProxyServer->>Hook: async_post_call_response_headers_hook()
    Hook-->>ProxyServer: Add x-litellm-semantic-filter headers
    
    ProxyServer-->>Client: Response with filter stats in headers
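
The request phase of the diagram (extract the user query, filter the tools, record stats, emit a response header) can be sketched as plain functions. This is an illustrative outline, not the PR's hook: the function names are simplified, taking the most recent user message as the query is an assumption, and the `semantic_filter_stats` metadata key and the `"before->after"` header value format are hypothetical. Only the `x-litellm-semantic-filter` header prefix comes from the diagram.

```python
from typing import Any, Callable, Dict, List

Tool = Dict[str, Any]
FilterFn = Callable[[str, List[Tool]], List[Tool]]


def extract_user_query(messages: List[Dict[str, Any]]) -> str:
    """Return the text of the most recent user message, or '' if none."""
    for message in reversed(messages):
        if message.get("role") != "user":
            continue
        content = message.get("content")
        if isinstance(content, str):
            return content
        if isinstance(content, list):
            # OpenAI-style content parts: join the text parts only.
            return " ".join(
                part.get("text", "")
                for part in content
                if isinstance(part, dict) and part.get("type") == "text"
            )
    return ""


def pre_call_hook(data: Dict[str, Any], filter_fn: FilterFn) -> Dict[str, Any]:
    """Replace data['tools'] with the filtered subset; never raise."""
    tools = data.get("tools")
    query = extract_user_query(data.get("messages", []))
    if not tools or not query:
        return data
    try:
        filtered = filter_fn(query, tools)
    except Exception:
        return data  # fall back to the unfiltered request
    data["tools"] = filtered
    # Hypothetical metadata key, read back when building response headers.
    data.setdefault("metadata", {})["semantic_filter_stats"] = (
        f"{len(tools)}->{len(filtered)}"
    )
    return data


def response_headers(data: Dict[str, Any]) -> Dict[str, str]:
    """Build the filter header from stored stats, if any were recorded."""
    stats = data.get("metadata", {}).get("semantic_filter_stats")
    return {"x-litellm-semantic-filter": stats} if stats else {}
```

Because the hook returns the request unchanged when the filter raises, a broken embedding backend degrades to sending all tools instead of failing the call.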

greptile-apps

5 files reviewed, no comments

ishaan-jaff and others added 22 commits February 2, 2026 13:29

init: SemanticMCPToolFilter

b2c342f

init: SemanticToolFilterHook

1df971f

test_e2e_semantic_filter

bdd7bcb

mock tests: test_semantic_filter_basic_filtering

a14ee74

Update litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py

3e80bb2

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

refactor folder/file organization

c5304dc

docs fix

722b1f4

fix filter

dfb26c7

fix: filter_tools

ad3b06f

fix linting tool filrer

f452038

initialize_from_config

8a02888

fix: _expand_mcp_tools

8913390

_initialize_semantic_tool_filter

6751c73

working: async_post_call_response_headers_hook

3967326

clean up semantic tool filter

9f6257c

add _initialize_semantic_tool_filter

9464055

build_router_from_mcp_registry

1b84d0f

_get_tools_by_names

3825d93

fiix config

d8343f8

async_post_call_response_headers_hook

839e114

docs mcp filter

8386190

docs fix

fd1c9ba

ishaan-jaff merged commit 0ef506a into main Feb 3, 2026
7 of 12 checks passed

vercel bot deployed to Preview February 3, 2026 02:30 View deployment

greptile-apps bot reviewed Feb 3, 2026

ishaan-jaff merged 22 commits into main from litellm_docs_mcp_filtering_semantic
