diff --git a/docs/my-website/docs/tutorials/claude_code_websearch.md b/docs/my-website/docs/tutorials/claude_code_websearch.md
new file mode 100644
index 00000000000..cc2f79666da
--- /dev/null
+++ b/docs/my-website/docs/tutorials/claude_code_websearch.md
@@ -0,0 +1,192 @@
+# Claude Code - WebSearch Across All Providers
+
+Enable Claude Code's web search tool to work with any provider (Bedrock, Azure, Vertex, etc.). LiteLLM automatically intercepts web search requests and executes them server-side.
+
+## Proxy Configuration
+
+Add WebSearch interception to your `litellm_config.yaml`:
+
+```yaml
+model_list:
+ - model_name: bedrock-sonnet
+ litellm_params:
+ model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0
+ aws_region_name: us-east-1
+
+# Enable WebSearch interception for providers
+litellm_settings:
+ callbacks:
+ - websearch_interception:
+ enabled_providers:
+ - bedrock
+ - azure
+ - vertex_ai
+ search_tool_name: perplexity-search # Optional: specific search tool
+
+# Configure search provider
+search_tools:
+ - search_tool_name: perplexity-search
+ litellm_params:
+ search_provider: perplexity
+ api_key: os.environ/PERPLEXITY_API_KEY
+```
+
+## Quick Start
+
+### 1. Configure LiteLLM Proxy
+
+Create `config.yaml`:
+
+```yaml
+model_list:
+ - model_name: bedrock-sonnet
+ litellm_params:
+ model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0
+ aws_region_name: us-east-1
+
+litellm_settings:
+ callbacks:
+ - websearch_interception:
+ enabled_providers: [bedrock]
+
+search_tools:
+ - search_tool_name: perplexity-search
+ litellm_params:
+ search_provider: perplexity
+ api_key: os.environ/PERPLEXITY_API_KEY
+```
+
+### 2. Start Proxy
+
+```bash
+export PERPLEXITY_API_KEY=your-key
+litellm --config config.yaml
+```
+
+### 3. Use with Claude Code
+
+```bash
+export ANTHROPIC_BASE_URL=http://localhost:4000
+export ANTHROPIC_API_KEY=sk-1234
+claude
+```
+
+Now use web search in Claude Code. It works with any provider.
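+
+To verify the setup end to end, you can also call the proxy's `/v1/messages` endpoint directly. This is a minimal sketch; the model name and key match the config above:
+
+```bash
+curl http://localhost:4000/v1/messages \
+  -H "x-api-key: sk-1234" \
+  -H "anthropic-version: 2023-06-01" \
+  -H "content-type: application/json" \
+  -d '{
+    "model": "bedrock-sonnet",
+    "max_tokens": 1024,
+    "tools": [{"type": "web_search_20250305", "name": "web_search", "max_uses": 8}],
+    "messages": [{"role": "user", "content": "What is LiteLLM?"}]
+  }'
+```
+
+The response is the final answer, with the search already executed server-side.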
+
+## How It Works
+
+When Claude Code sends a web search request, LiteLLM:
+1. Intercepts the native `web_search` tool
+2. Converts it to LiteLLM's standard format
+3. Executes the search via Perplexity/Tavily
+4. Returns the final answer to Claude Code
+
+```mermaid
+sequenceDiagram
+ participant CC as Claude Code
+ participant LP as LiteLLM Proxy
+ participant B as Bedrock/Azure/etc
+ participant P as Perplexity/Tavily
+
+ CC->>LP: Request with web_search tool
+ Note over LP: Convert native tool to LiteLLM format
+ LP->>B: Request with converted tool
+ B-->>LP: Response: tool_use
+ Note over LP: Detect web search tool_use
+ LP->>P: Execute search
+ P-->>LP: Search results
+ LP->>B: Follow-up with results
+ B-->>LP: Final answer
+ LP-->>CC: Final answer with search results
+```
+
+**Result**: One API call from Claude Code → Complete answer with search results
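+
+The same single-call flow can be driven from the official `anthropic` Python SDK pointed at the proxy. A sketch, assuming the quick-start config above:
+
+```python
+import anthropic
+
+# Point the SDK at the LiteLLM proxy instead of api.anthropic.com
+client = anthropic.Anthropic(base_url="http://localhost:4000", api_key="sk-1234")
+
+response = client.messages.create(
+    model="bedrock-sonnet",
+    max_tokens=1024,
+    tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 8}],
+    messages=[{"role": "user", "content": "What is LiteLLM?"}],
+)
+# One call: the search was executed server-side before this returned
+print(response.content[0].text)
+```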
+
+## Supported Providers
+
+| Provider | Native Web Search | With LiteLLM |
+|----------|-------------------|--------------|
+| **Anthropic** | ✅ Yes | ✅ Yes |
+| **Bedrock** | ❌ No | ✅ Yes |
+| **Azure** | ❌ No | ✅ Yes |
+| **Vertex AI** | ❌ No | ✅ Yes |
+| **Other Providers** | ❌ No | ✅ Yes |
+
+## Search Providers
+
+LiteLLM supports multiple search providers. Configure which one to use:
+
+| Provider | Configuration |
+|----------|---------------|
+| **Perplexity** | `search_provider: perplexity` |
+| **Tavily** | `search_provider: tavily` |
+
+See [all supported search providers](../search/index.md) for the complete list.
+
+## Configuration Options
+
+### WebSearch Interception Parameters
+
+| Parameter | Type | Required | Description | Example |
+|-----------|------|----------|-------------|---------|
+| `enabled_providers` | List[String] | Yes | List of providers to enable web search interception for | `[bedrock, azure, vertex_ai]` |
+| `search_tool_name` | String | No | Specific search tool from `search_tools` config. If not set, uses first available search tool. | `perplexity-search` |
+
+### Supported Provider Values
+
+Use these values in `enabled_providers`:
+
+| Provider | Value | Description |
+|----------|-------|-------------|
+| AWS Bedrock | `bedrock` | Amazon Bedrock Claude models |
+| Azure OpenAI | `azure` | Azure-hosted models |
+| Google Vertex AI | `vertex_ai` | Google Cloud Vertex AI |
+| Any Other | Provider name | Any LiteLLM-supported provider |
+
+### Complete Configuration Example
+
+```yaml
+model_list:
+ - model_name: bedrock-sonnet
+ litellm_params:
+ model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0
+ aws_region_name: us-east-1
+
+ - model_name: azure-gpt4
+ litellm_params:
+ model: azure/gpt-4
+ api_base: https://my-azure.openai.azure.com
+ api_key: os.environ/AZURE_API_KEY
+
+litellm_settings:
+ callbacks:
+ - websearch_interception:
+ enabled_providers:
+ - bedrock # Enable for AWS Bedrock
+ - azure # Enable for Azure OpenAI
+ - vertex_ai # Enable for Google Vertex
+ search_tool_name: perplexity-search # Optional: use specific search tool
+
+# Configure search tools
+search_tools:
+ - search_tool_name: perplexity-search
+ litellm_params:
+ search_provider: perplexity
+ api_key: os.environ/PERPLEXITY_API_KEY
+
+ - search_tool_name: tavily-search
+ litellm_params:
+ search_provider: tavily
+ api_key: os.environ/TAVILY_API_KEY
+```
+
+**How search tool selection works:**
+- If `search_tool_name` is specified → Uses that specific search tool
+- If `search_tool_name` is not specified → Uses first search tool in `search_tools` list
+- In the example above: without `search_tool_name`, the proxy would use `perplexity-search` (first in the list)
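+
+The selection rules above amount to a name lookup with a first-entry fallback. A sketch of the behavior (the helper below is illustrative, not LiteLLM's actual implementation):
+
+```python
+def pick_search_tool(search_tools, search_tool_name=None):
+    """Return the configured search tool, or the first one as a fallback."""
+    if search_tool_name:
+        for tool in search_tools:
+            if tool["search_tool_name"] == search_tool_name:
+                return tool
+        raise ValueError(f"search tool {search_tool_name!r} not found")
+    return search_tools[0]  # no name given: first entry in the list
+```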
+
+## Related
+
+- [Claude Code Quickstart](./claude_responses_api.md)
+- [Claude Code Cost Tracking](./claude_code_customer_tracking.md)
+- [Using Non-Anthropic Models](./claude_non_anthropic_models.md)
diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js
index fd651c3adc5..102e3dfe1c5 100644
--- a/docs/my-website/sidebars.js
+++ b/docs/my-website/sidebars.js
@@ -122,6 +122,7 @@ const sidebars = {
items: [
"tutorials/claude_responses_api",
"tutorials/claude_code_customer_tracking",
+ "tutorials/claude_code_websearch",
"tutorials/claude_mcp",
"tutorials/claude_non_anthropic_models",
]
diff --git a/litellm/constants.py b/litellm/constants.py
index dba79b2f186..3bdd943481e 100644
--- a/litellm/constants.py
+++ b/litellm/constants.py
@@ -329,6 +329,11 @@
"medium": 5,
"high": 10,
}
+
+# LiteLLM standard web search tool name
+# Used for web search interception across providers
+LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search"
+
DEFAULT_IMAGE_ENDPOINT_MODEL = "dall-e-2"
DEFAULT_VIDEO_ENDPOINT_MODEL = "sora-2"
diff --git a/litellm/integrations/custom_logger.py b/litellm/integrations/custom_logger.py
index 317613420a5..12243a19184 100644
--- a/litellm/integrations/custom_logger.py
+++ b/litellm/integrations/custom_logger.py
@@ -143,6 +143,34 @@ async def async_log_stream_event(self, kwargs, response_obj, start_time, end_tim
async def async_log_pre_api_call(self, model, messages, kwargs):
pass
+ async def async_pre_request_hook(
+ self, model: str, messages: List, kwargs: Dict
+ ) -> Optional[Dict]:
+ """
+ Hook called before making the API request to allow modifying request parameters.
+
+ This is specifically designed for modifying the request before it's sent to the provider.
+ Unlike async_log_pre_api_call (which is for logging), this hook is meant for transformations.
+
+ Args:
+ model: The model name
+ messages: The messages list
+ kwargs: The request parameters (tools, stream, temperature, etc.)
+
+ Returns:
+ Optional[Dict]: Modified kwargs to use for the request, or None if no modifications
+
+ Example:
+ ```python
+ async def async_pre_request_hook(self, model, messages, kwargs):
+ # Convert native tools to standard format
+ if kwargs.get("tools"):
+ kwargs["tools"] = convert_tools(kwargs["tools"])
+ return kwargs
+ ```
+ """
+ pass
+
async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
pass
diff --git a/litellm/integrations/prometheus.py b/litellm/integrations/prometheus.py
index 1e1da803e48..b490c21174f 100644
--- a/litellm/integrations/prometheus.py
+++ b/litellm/integrations/prometheus.py
@@ -21,7 +21,12 @@
import litellm
from litellm._logging import print_verbose, verbose_logger
from litellm.integrations.custom_logger import CustomLogger
-from litellm.proxy._types import LiteLLM_TeamTable, LiteLLM_UserTable, UserAPIKeyAuth
+from litellm.proxy._types import (
+ LiteLLM_DeletedVerificationToken,
+ LiteLLM_TeamTable,
+ LiteLLM_UserTable,
+ UserAPIKeyAuth,
+)
from litellm.types.integrations.prometheus import *
from litellm.types.integrations.prometheus import _sanitize_prometheus_label_name
from litellm.types.utils import StandardLoggingPayload
@@ -2153,7 +2158,7 @@ async def _initialize_budget_metrics(
self,
data_fetch_function: Callable[..., Awaitable[Tuple[List[Any], Optional[int]]]],
set_metrics_function: Callable[[List[Any]], Awaitable[None]],
- data_type: Literal["teams", "keys"],
+ data_type: Literal["teams", "keys", "users"],
):
"""
Generic method to initialize budget metrics for teams or API keys.
@@ -2245,7 +2250,7 @@ async def _initialize_api_key_budget_metrics(self):
async def fetch_keys(
page_size: int, page: int
- ) -> Tuple[List[Union[str, UserAPIKeyAuth]], Optional[int]]:
+ ) -> Tuple[List[Union[str, UserAPIKeyAuth, LiteLLM_DeletedVerificationToken]], Optional[int]]:
key_list_response = await _list_key_helper(
prisma_client=prisma_client,
page=page,
diff --git a/litellm/integrations/websearch_interception/ARCHITECTURE.md b/litellm/integrations/websearch_interception/ARCHITECTURE.md
index 345741c3c03..3aa0a1558d7 100644
--- a/litellm/integrations/websearch_interception/ARCHITECTURE.md
+++ b/litellm/integrations/websearch_interception/ARCHITECTURE.md
@@ -7,6 +7,98 @@ Server-side WebSearch tool execution for models that don't natively support it (
User makes **ONE** `litellm.messages.acreate()` call → Gets final answer with search results.
The agentic loop happens transparently on the server.
+## LiteLLM Standard Web Search Tool
+
+LiteLLM defines a standard web search tool format (`litellm_web_search`) that all native provider tools are converted to. This enables consistent interception across providers.
+
+**Standard Tool Definition** (defined in `tools.py`):
+```python
+{
+ "name": "litellm_web_search",
+ "description": "Search the web for information...",
+ "input_schema": {
+ "type": "object",
+ "properties": {
+ "query": {"type": "string", "description": "The search query"}
+ },
+ "required": ["query"]
+ }
+}
+```
+
+**Tool Name Constant**: `LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search"` (defined in `litellm/constants.py`)
+
+### Supported Tool Formats
+
+The interception system automatically detects and handles:
+
+| Tool Format | Example | Provider | Detection Method | Future-Proof |
+|-------------|---------|----------|------------------|-------------|
+| **LiteLLM Standard** | `name="litellm_web_search"` | Any | Direct name match | N/A |
+| **Anthropic Native** | `type="web_search_20250305"` | Bedrock, Claude API | Type prefix: `startswith("web_search_")` | ✅ Yes (web_search_2026, etc.) |
+| **Claude Code CLI** | `name="web_search"`, `type="web_search_20250305"` | Claude Code | Name + type check | ✅ Yes (version-agnostic) |
+| **Legacy** | `name="WebSearch"` | Custom | Name match | N/A (backwards compat) |
+
+**Future Compatibility**: The `startswith("web_search_")` check in `tools.py` automatically supports future Anthropic web search versions.
+
+### Claude Code CLI Integration
+
+Claude Code (Anthropic's official CLI) sends web search requests using Anthropic's native tool format:
+
+```python
+{
+ "type": "web_search_20250305",
+ "name": "web_search",
+ "max_uses": 8
+}
+```
+
+**What Happens:**
+1. Claude Code sends native `web_search_20250305` tool to LiteLLM proxy
+2. LiteLLM intercepts and converts to `litellm_web_search` standard format
+3. Bedrock receives converted tool (NOT native format)
+4. Model returns `tool_use` block for `litellm_web_search` (not `server_tool_use`)
+5. LiteLLM's agentic loop intercepts the `tool_use`
+6. Executes `litellm.asearch()` using configured provider (Perplexity, Tavily, etc.)
+7. Returns final answer to Claude Code user
+
+**Without Interception**: Bedrock would receive native tool → try to execute natively → return `web_search_tool_result_error` with `invalid_tool_input`
+
+**With Interception**: LiteLLM converts → Bedrock returns tool_use → LiteLLM executes search → Returns final answer ✅
+
+### Native Tool Conversion
+
+Native tools are converted to LiteLLM standard format **before** sending to the provider:
+
+1. **Conversion Point** (`litellm/llms/anthropic/experimental_pass_through/messages/handler.py`):
+ - In `anthropic_messages()` function (lines 60-127)
+ - Runs BEFORE the API request is made
+ - Detects native web search tools using `is_web_search_tool()`
+ - Converts to `litellm_web_search` format using `get_litellm_web_search_tool()`
+ - Prevents provider from executing search natively (avoids `web_search_tool_result_error`)
+
+2. **Response Detection** (`transformation.py`):
+ - Detects `tool_use` blocks with any web search tool name
+ - Handles: `litellm_web_search`, `WebSearch`, `web_search`
+ - Extracts search queries for execution
+
+**Example Conversion**:
+```python
+# Input (Claude Code's native tool)
+{
+ "type": "web_search_20250305",
+ "name": "web_search",
+ "max_uses": 8
+}
+
+# Output (LiteLLM standard)
+{
+ "name": "litellm_web_search",
+ "description": "Search the web for information...",
+ "input_schema": {...}
+}
+```
+
---
## Request Flow
@@ -63,6 +155,9 @@ sequenceDiagram
| Component | File | Purpose |
|-----------|------|---------|
| **WebSearchInterceptionLogger** | `handler.py` | CustomLogger that implements agentic loop hooks |
+| **Tool Standardization** | `tools.py` | Standard tool definition, detection, and utilities |
+| **Tool Name Constant** | `constants.py` | `LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search"` |
+| **Tool Conversion** | `anthropic/.../handler.py` | Converts native tools to LiteLLM standard before API call |
| **Transformation Logic** | `transformation.py` | Detect tool_use, build tool_result messages, format search responses |
| **Agentic Loop Hooks** | `integrations/custom_logger.py` | Base hooks: `async_should_run_agentic_loop()`, `async_run_agentic_loop()` |
| **Hook Orchestration** | `llms/custom_httpx/llm_http_handler.py` | `_call_agentic_completion_hooks()` - calls hooks after response |
@@ -74,7 +169,10 @@ sequenceDiagram
## Configuration
```python
-from litellm.integrations.websearch_interception import WebSearchInterceptionLogger
+from litellm.integrations.websearch_interception import (
+ WebSearchInterceptionLogger,
+ get_litellm_web_search_tool,
+)
from litellm.types.utils import LlmProviders
# Enable for Bedrock with specific search tool
@@ -85,13 +183,25 @@ litellm.callbacks = [
)
]
-# Make request (streaming or non-streaming both work)
+# Make request with LiteLLM standard tool (recommended)
+response = await litellm.messages.acreate(
+ model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
+ messages=[{"role": "user", "content": "What is LiteLLM?"}],
+ tools=[get_litellm_web_search_tool()], # LiteLLM standard
+ max_tokens=1024,
+ stream=True # Auto-converted to non-streaming
+)
+
+# OR send native tools - they're auto-converted to LiteLLM standard
response = await litellm.messages.acreate(
- model="bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
+ model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
messages=[{"role": "user", "content": "What is LiteLLM?"}],
- tools=[{"name": "WebSearch", ...}],
+ tools=[{
+ "type": "web_search_20250305", # Native Anthropic format
+ "name": "web_search",
+ "max_uses": 8
+ }],
max_tokens=1024,
- stream=True # Streaming is automatically converted to non-streaming for WebSearch
)
```
diff --git a/litellm/integrations/websearch_interception/__init__.py b/litellm/integrations/websearch_interception/__init__.py
index c0feb5235e2..f5b1963c1cf 100644
--- a/litellm/integrations/websearch_interception/__init__.py
+++ b/litellm/integrations/websearch_interception/__init__.py
@@ -8,5 +8,13 @@
from litellm.integrations.websearch_interception.handler import (
WebSearchInterceptionLogger,
)
+from litellm.integrations.websearch_interception.tools import (
+ get_litellm_web_search_tool,
+ is_web_search_tool,
+)
-__all__ = ["WebSearchInterceptionLogger"]
+__all__ = [
+ "WebSearchInterceptionLogger",
+ "get_litellm_web_search_tool",
+ "is_web_search_tool",
+]
diff --git a/litellm/integrations/websearch_interception/handler.py b/litellm/integrations/websearch_interception/handler.py
index 0b08bc2312a..943a2bb4f36 100644
--- a/litellm/integrations/websearch_interception/handler.py
+++ b/litellm/integrations/websearch_interception/handler.py
@@ -12,7 +12,12 @@
import litellm
from litellm._logging import verbose_logger
from litellm.anthropic_interface import messages as anthropic_messages
+from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME
from litellm.integrations.custom_logger import CustomLogger
+from litellm.integrations.websearch_interception.tools import (
+ get_litellm_web_search_tool,
+ is_web_search_tool,
+)
from litellm.integrations.websearch_interception.transformation import (
WebSearchTransformation,
)
@@ -57,6 +62,55 @@ def __init__(
for p in enabled_providers
]
self.search_tool_name = search_tool_name
+ self._request_has_websearch = False # Track if current request has web search
+
+ async def async_pre_call_deployment_hook(
+ self, kwargs: Dict[str, Any], call_type: Optional[Any]
+ ) -> Optional[dict]:
+ """
+ Pre-call hook to convert native Anthropic web_search tools to regular tools.
+
+ This prevents Bedrock from trying to execute web search server-side (which fails).
+ Instead, we convert it to a regular tool so the model returns tool_use blocks
+ that we can intercept and execute ourselves.
+ """
+ # Check if this is for an enabled provider
+ custom_llm_provider = kwargs.get("litellm_params", {}).get("custom_llm_provider", "")
+ if custom_llm_provider not in self.enabled_providers:
+ return None
+
+ # Check if request has tools with native web_search
+ tools = kwargs.get("tools")
+ if not tools:
+ return None
+
+ # Check if any tool is a web search tool (native or already LiteLLM standard)
+ has_websearch = any(is_web_search_tool(t) for t in tools)
+
+ if not has_websearch:
+ return None
+
+ verbose_logger.debug(
+ "WebSearchInterception: Converting native web_search tools to LiteLLM standard"
+ )
+
+ # Convert native/custom web_search tools to LiteLLM standard
+ converted_tools = []
+ for tool in tools:
+ if is_web_search_tool(tool):
+ # Convert to LiteLLM standard web search tool
+ converted_tool = get_litellm_web_search_tool()
+ converted_tools.append(converted_tool)
+ verbose_logger.debug(
+ f"WebSearchInterception: Converted {tool.get('name', 'unknown')} "
+ f"(type={tool.get('type', 'none')}) to {LITELLM_WEB_SEARCH_TOOL_NAME}"
+ )
+ else:
+ # Keep other tools as-is
+ converted_tools.append(tool)
+
+ # Return modified kwargs with converted tools
+ return {"tools": converted_tools}
@classmethod
def from_config_yaml(
@@ -104,6 +158,83 @@ def from_config_yaml(
search_tool_name=search_tool_name,
)
+ async def async_pre_request_hook(
+ self, model: str, messages: List[Dict], kwargs: Dict
+ ) -> Optional[Dict]:
+ """
+ Pre-request hook to convert native web search tools to LiteLLM standard.
+
+ This hook is called before the API request is made, allowing us to:
+ 1. Detect native web search tools (web_search_20250305, etc.)
+ 2. Convert them to LiteLLM standard format (litellm_web_search)
+ 3. Convert stream=True to stream=False for interception
+
+ This prevents providers like Bedrock from trying to execute web search
+ natively (which fails), and ensures our agentic loop can intercept tool_use.
+
+ Returns:
+ Modified kwargs dict with converted tools, or None if no modifications needed
+ """
+ # Check if this request is for an enabled provider
+ custom_llm_provider = kwargs.get("litellm_params", {}).get(
+ "custom_llm_provider", ""
+ )
+
+ verbose_logger.debug(
+ f"WebSearchInterception: Pre-request hook called"
+ f" - custom_llm_provider={custom_llm_provider}"
+ f" - enabled_providers={self.enabled_providers}"
+ )
+
+ if custom_llm_provider not in self.enabled_providers:
+ verbose_logger.debug(
+ f"WebSearchInterception: Skipping - provider {custom_llm_provider} not in {self.enabled_providers}"
+ )
+ return None
+
+ # Check if request has tools
+ tools = kwargs.get("tools")
+ if not tools:
+ return None
+
+ # Check if any tool is a web search tool
+ has_websearch = any(is_web_search_tool(t) for t in tools)
+ if not has_websearch:
+ return None
+
+ verbose_logger.debug(
+ f"WebSearchInterception: Pre-request hook triggered for provider={custom_llm_provider}"
+ )
+
+ # Convert native web search tools to LiteLLM standard
+ converted_tools = []
+ for tool in tools:
+ if is_web_search_tool(tool):
+ standard_tool = get_litellm_web_search_tool()
+ converted_tools.append(standard_tool)
+ verbose_logger.debug(
+ f"WebSearchInterception: Converted {tool.get('name', 'unknown')} "
+ f"(type={tool.get('type', 'none')}) to {LITELLM_WEB_SEARCH_TOOL_NAME}"
+ )
+ else:
+ converted_tools.append(tool)
+
+ # Update kwargs with converted tools
+ kwargs["tools"] = converted_tools
+ verbose_logger.debug(
+ f"WebSearchInterception: Tools after conversion: {[t.get('name') for t in converted_tools]}"
+ )
+
+ # Convert stream=True to stream=False for WebSearch interception
+ if kwargs.get("stream"):
+ verbose_logger.debug(
+ "WebSearchInterception: Converting stream=True to stream=False"
+ )
+ kwargs["stream"] = False
+ kwargs["_websearch_interception_converted_stream"] = True
+
+ return kwargs
+
async def async_should_run_agentic_loop(
self,
response: Any,
@@ -128,11 +259,11 @@ async def async_should_run_agentic_loop(
)
return False, {}
- # Check if tools include WebSearch
- has_websearch_tool = any(t.get("name") == "WebSearch" for t in (tools or []))
+ # Check if tools include any web search tool (LiteLLM standard or native)
+ has_websearch_tool = any(is_web_search_tool(t) for t in (tools or []))
if not has_websearch_tool:
verbose_logger.debug(
- "WebSearchInterception: No WebSearch tool in request"
+ "WebSearchInterception: No web search tool in request"
)
return False, {}
diff --git a/litellm/integrations/websearch_interception/tools.py b/litellm/integrations/websearch_interception/tools.py
new file mode 100644
index 00000000000..4f8b7372fe3
--- /dev/null
+++ b/litellm/integrations/websearch_interception/tools.py
@@ -0,0 +1,95 @@
+"""
+LiteLLM Web Search Tool Definition
+
+This module defines the standard web search tool used across LiteLLM.
+Native provider tools (like Anthropic's web_search_20250305) are converted
+to this format for consistent interception and execution.
+"""
+
+from typing import Any, Dict
+
+from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME
+
+
+def get_litellm_web_search_tool() -> Dict[str, Any]:
+ """
+ Get the standard LiteLLM web search tool definition.
+
+ This is the canonical tool definition that all native web search tools
+ (like Anthropic's web_search_20250305, Claude Code's web_search, etc.)
+ are converted to for interception.
+
+ Returns:
+ Dict containing the Anthropic-style tool definition with:
+ - name: Tool name
+ - description: What the tool does
+ - input_schema: JSON schema for tool parameters
+
+ Example:
+ >>> tool = get_litellm_web_search_tool()
+ >>> tool['name']
+ 'litellm_web_search'
+ """
+ return {
+ "name": LITELLM_WEB_SEARCH_TOOL_NAME,
+ "description": (
+ "Search the web for information. Use this when you need current "
+ "information or answers to questions that require up-to-date data."
+ ),
+ "input_schema": {
+ "type": "object",
+ "properties": {
+ "query": {
+ "type": "string",
+ "description": "The search query to execute"
+ }
+ },
+ "required": ["query"]
+ }
+ }
+
+
+def is_web_search_tool(tool: Dict[str, Any]) -> bool:
+ """
+ Check if a tool is a web search tool (native or LiteLLM standard).
+
+ Detects:
+ - LiteLLM standard: name == "litellm_web_search"
+ - Anthropic native: type starts with "web_search_" (e.g., "web_search_20250305")
+ - Claude Code: name == "web_search" with a type field
+ - Custom: name == "WebSearch" (legacy format)
+
+ Args:
+ tool: Tool dictionary to check
+
+ Returns:
+ True if tool is a web search tool
+
+ Example:
+ >>> is_web_search_tool({"name": "litellm_web_search"})
+ True
+ >>> is_web_search_tool({"type": "web_search_20250305", "name": "web_search"})
+ True
+ >>> is_web_search_tool({"name": "calculator"})
+ False
+ """
+ tool_name = tool.get("name", "")
+ tool_type = tool.get("type", "")
+
+ # Check for LiteLLM standard tool
+ if tool_name == LITELLM_WEB_SEARCH_TOOL_NAME:
+ return True
+
+ # Check for native Anthropic web_search_* types
+ if tool_type.startswith("web_search_"):
+ return True
+
+ # Check for Claude Code's web_search with a type field
+ if tool_name == "web_search" and tool_type:
+ return True
+
+ # Check for legacy WebSearch format
+ if tool_name == "WebSearch":
+ return True
+
+ return False
diff --git a/litellm/integrations/websearch_interception/transformation.py b/litellm/integrations/websearch_interception/transformation.py
index e8211311281..313358822a5 100644
--- a/litellm/integrations/websearch_interception/transformation.py
+++ b/litellm/integrations/websearch_interception/transformation.py
@@ -7,6 +7,7 @@
from typing import Any, Dict, List, Tuple
from litellm._logging import verbose_logger
+from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME
from litellm.llms.base_llm.search.transformation import SearchResponse
@@ -94,17 +95,21 @@ def _detect_from_non_streaming_response(
block_id = getattr(block, "id", None)
block_input = getattr(block, "input", {})
- if block_type == "tool_use" and block_name == "WebSearch":
+ # Check for LiteLLM standard or legacy web search tools
+ # Handles: litellm_web_search, WebSearch, web_search
+ if block_type == "tool_use" and block_name in (
+ LITELLM_WEB_SEARCH_TOOL_NAME, "WebSearch", "web_search"
+ ):
# Convert to dict for easier handling
tool_call = {
"id": block_id,
"type": "tool_use",
- "name": "WebSearch",
+ "name": block_name, # Preserve original name
"input": block_input,
}
tool_calls.append(tool_call)
verbose_logger.debug(
- f"WebSearchInterception: Found WebSearch tool_use with id={tool_call['id']}"
+ f"WebSearchInterception: Found {block_name} tool_use with id={tool_call['id']}"
)
return len(tool_calls) > 0, tool_calls
diff --git a/litellm/llms/anthropic/experimental_pass_through/messages/fake_stream_iterator.py b/litellm/llms/anthropic/experimental_pass_through/messages/fake_stream_iterator.py
new file mode 100644
index 00000000000..542ae20b602
--- /dev/null
+++ b/litellm/llms/anthropic/experimental_pass_through/messages/fake_stream_iterator.py
@@ -0,0 +1,246 @@
+"""
+Fake Streaming Iterator for Anthropic Messages
+
+This module provides a fake streaming iterator that converts non-streaming
+Anthropic Messages responses into proper streaming format.
+
+Used when WebSearch interception converts stream=True to stream=False but
+the LLM doesn't make a tool call, and we need to return a stream to the user.
+"""
+
+import json
+from typing import Any, Dict, List, cast
+
+from litellm.types.llms.anthropic_messages.anthropic_response import (
+ AnthropicMessagesResponse,
+)
+
+
+class FakeAnthropicMessagesStreamIterator:
+ """
+ Fake streaming iterator for Anthropic Messages responses.
+
+ Used when we need to convert a non-streaming response to a streaming format,
+ such as when WebSearch interception converts stream=True to stream=False but
+ the LLM doesn't make a tool call.
+
+ This creates a proper Anthropic-style streaming response with multiple events:
+ - message_start
+ - content_block_start (for each content block)
+ - content_block_delta (for text content, chunked)
+ - content_block_stop
+ - message_delta (for usage)
+ - message_stop
+ """
+
+ def __init__(self, response: AnthropicMessagesResponse):
+ self.response = response
+ self.chunks = self._create_streaming_chunks()
+ self.current_index = 0
+
+ def _create_streaming_chunks(self) -> List[bytes]:
+ """Convert the non-streaming response to streaming chunks"""
+ chunks = []
+
+ # Cast response to dict for easier access
+ response_dict = cast(Dict[str, Any], self.response)
+
+ # 1. message_start event
+ usage = response_dict.get("usage", {})
+ message_start = {
+ "type": "message_start",
+ "message": {
+ "id": response_dict.get("id"),
+ "type": "message",
+ "role": response_dict.get("role", "assistant"),
+ "model": response_dict.get("model"),
+ "content": [],
+ "stop_reason": None,
+ "stop_sequence": None,
+ "usage": {
+ "input_tokens": usage.get("input_tokens", 0) if usage else 0,
+ "output_tokens": 0
+ }
+ }
+ }
+ chunks.append(f"event: message_start\ndata: {json.dumps(message_start)}\n\n".encode())
+
+ # 2-4. For each content block, send start/delta/stop events
+ content_blocks = response_dict.get("content", [])
+ if content_blocks:
+ for index, block in enumerate(content_blocks):
+ # Cast block to dict for easier access
+ block_dict = cast(Dict[str, Any], block)
+ block_type = block_dict.get("type")
+
+ if block_type == "text":
+ # content_block_start
+ content_block_start = {
+ "type": "content_block_start",
+ "index": index,
+ "content_block": {
+ "type": "text",
+ "text": ""
+ }
+ }
+ chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode())
+
+ # content_block_delta (send full text as one delta for simplicity)
+ text = block_dict.get("text", "")
+ content_block_delta = {
+ "type": "content_block_delta",
+ "index": index,
+ "delta": {
+ "type": "text_delta",
+ "text": text
+ }
+ }
+ chunks.append(f"event: content_block_delta\ndata: {json.dumps(content_block_delta)}\n\n".encode())
+
+ # content_block_stop
+ content_block_stop = {
+ "type": "content_block_stop",
+ "index": index
+ }
+ chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode())
+
+ elif block_type == "thinking":
+ # content_block_start for thinking
+ content_block_start = {
+ "type": "content_block_start",
+ "index": index,
+ "content_block": {
+ "type": "thinking",
+ "thinking": "",
+ "signature": ""
+ }
+ }
+ chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode())
+
+ # content_block_delta for thinking text
+ thinking_text = block_dict.get("thinking", "")
+ if thinking_text:
+ content_block_delta = {
+ "type": "content_block_delta",
+ "index": index,
+ "delta": {
+ "type": "thinking_delta",
+ "thinking": thinking_text
+ }
+ }
+ chunks.append(f"event: content_block_delta\ndata: {json.dumps(content_block_delta)}\n\n".encode())
+
+ # content_block_delta for signature (if present)
+ signature = block_dict.get("signature", "")
+ if signature:
+ signature_delta = {
+ "type": "content_block_delta",
+ "index": index,
+ "delta": {
+ "type": "signature_delta",
+ "signature": signature
+ }
+ }
+ chunks.append(f"event: content_block_delta\ndata: {json.dumps(signature_delta)}\n\n".encode())
+
+ # content_block_stop
+ content_block_stop = {
+ "type": "content_block_stop",
+ "index": index
+ }
+ chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode())
+
+ elif block_type == "redacted_thinking":
+ # content_block_start for redacted_thinking
+ content_block_start = {
+ "type": "content_block_start",
+ "index": index,
+ "content_block": {
+ "type": "redacted_thinking"
+ }
+ }
+ chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode())
+
+ # content_block_stop (no delta for redacted thinking)
+ content_block_stop = {
+ "type": "content_block_stop",
+ "index": index
+ }
+ chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode())
+
+ elif block_type == "tool_use":
+ # content_block_start
+ content_block_start = {
+ "type": "content_block_start",
+ "index": index,
+ "content_block": {
+ "type": "tool_use",
+ "id": block_dict.get("id"),
+ "name": block_dict.get("name"),
+ "input": {}
+ }
+ }
+ chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode())
+
+ # content_block_delta (send input as JSON delta)
+ input_data = block_dict.get("input", {})
+ content_block_delta = {
+ "type": "content_block_delta",
+ "index": index,
+ "delta": {
+ "type": "input_json_delta",
+ "partial_json": json.dumps(input_data)
+ }
+ }
+ chunks.append(f"event: content_block_delta\ndata: {json.dumps(content_block_delta)}\n\n".encode())
+
+ # content_block_stop
+ content_block_stop = {
+ "type": "content_block_stop",
+ "index": index
+ }
+ chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode())
+
+ # 5. message_delta event (with final usage and stop_reason)
+ message_delta = {
+ "type": "message_delta",
+ "delta": {
+ "stop_reason": response_dict.get("stop_reason"),
+ "stop_sequence": response_dict.get("stop_sequence")
+ },
+ "usage": {
+ "output_tokens": usage.get("output_tokens", 0) if usage else 0
+ }
+ }
+ chunks.append(f"event: message_delta\ndata: {json.dumps(message_delta)}\n\n".encode())
+
+ # 6. message_stop event
+ message_stop = {
+ "type": "message_stop",
+ "usage": usage if usage else {}
+ }
+ chunks.append(f"event: message_stop\ndata: {json.dumps(message_stop)}\n\n".encode())
+
+ return chunks
+
+ def __aiter__(self):
+ return self
+
+ async def __anext__(self):
+ if self.current_index >= len(self.chunks):
+ raise StopAsyncIteration
+
+ chunk = self.chunks[self.current_index]
+ self.current_index += 1
+ return chunk
+
+ def __iter__(self):
+ return self
+
+ def __next__(self):
+ if self.current_index >= len(self.chunks):
+ raise StopIteration
+
+ chunk = self.chunks[self.current_index]
+ self.current_index += 1
+ return chunk
diff --git a/litellm/llms/anthropic/experimental_pass_through/messages/handler.py b/litellm/llms/anthropic/experimental_pass_through/messages/handler.py
index 11245b1bdba..7e5a4f22a7f 100644
--- a/litellm/llms/anthropic/experimental_pass_through/messages/handler.py
+++ b/litellm/llms/anthropic/experimental_pass_through/messages/handler.py
@@ -33,6 +33,70 @@
#################################################
+async def _execute_pre_request_hooks(
+ model: str,
+ messages: List[Dict],
+ tools: Optional[List[Dict]],
+ stream: Optional[bool],
+ custom_llm_provider: Optional[str],
+ **kwargs,
+) -> Dict:
+ """
+ Execute pre-request hooks from CustomLogger callbacks.
+
+ Allows CustomLoggers to modify request parameters before the API call.
+ Used for WebSearch tool conversion, stream modification, etc.
+
+ Args:
+ model: Model name
+ messages: List of messages
+ tools: Optional tools list
+ stream: Optional stream flag
+ custom_llm_provider: Provider name (if not set, will be extracted from model)
+ **kwargs: Additional request parameters
+
+ Returns:
+ Dict containing all (potentially modified) request parameters including tools, stream
+        Dict containing all (potentially modified) request parameters including tools, stream
+    """
+ # If custom_llm_provider not provided, extract from model
+ if not custom_llm_provider:
+ try:
+ _, custom_llm_provider, _, _ = litellm.get_llm_provider(model=model)
+ except Exception:
+ # If extraction fails, continue without provider
+ pass
+
+ # Build complete request kwargs dict
+ request_kwargs = {
+ "tools": tools,
+ "stream": stream,
+ "litellm_params": {
+ "custom_llm_provider": custom_llm_provider,
+ },
+ **kwargs,
+ }
+
+ if not litellm.callbacks:
+ return request_kwargs
+
+ from litellm.integrations.custom_logger import CustomLogger as _CustomLogger
+
+ for callback in litellm.callbacks:
+ if not isinstance(callback, _CustomLogger):
+ continue
+
+ # Call the pre-request hook
+ modified_kwargs = await callback.async_pre_request_hook(
+ model, messages, request_kwargs
+ )
+
+ # If hook returned modified kwargs, use them
+ if modified_kwargs is not None:
+ request_kwargs = modified_kwargs
+
+ return request_kwargs
+
+
@client
async def anthropic_messages(
max_tokens: int,
@@ -57,39 +121,24 @@ async def anthropic_messages(
"""
Async: Make llm api request in Anthropic /messages API spec
"""
- # WebSearch Interception: Convert stream=True to stream=False if WebSearch interception is enabled
- # This allows transparent server-side agentic loop execution for streaming requests
- if stream and tools and any(t.get("name") == "WebSearch" for t in tools):
- # Extract provider using litellm's helper function
- try:
- _, provider, _, _ = litellm.get_llm_provider(
- model=model,
- custom_llm_provider=custom_llm_provider,
- api_base=api_base,
- api_key=api_key,
- )
- except Exception:
- # Fallback to simple split if helper fails
- provider = model.split("/")[0] if "/" in model else ""
+ # Execute pre-request hooks to allow CustomLoggers to modify request
+ request_kwargs = await _execute_pre_request_hooks(
+ model=model,
+ messages=messages,
+ tools=tools,
+ stream=stream,
+ custom_llm_provider=custom_llm_provider,
+ **kwargs,
+ )
- # Check if WebSearch interception is enabled in callbacks
- from litellm._logging import verbose_logger
- from litellm.integrations.websearch_interception import (
- WebSearchInterceptionLogger,
- )
- if litellm.callbacks:
- for callback in litellm.callbacks:
- if isinstance(callback, WebSearchInterceptionLogger):
- # Check if provider is enabled for interception
- if provider in callback.enabled_providers:
- verbose_logger.debug(
- f"WebSearchInterception: Converting stream=True to stream=False for WebSearch interception "
- f"(provider={provider})"
- )
- stream = False
- break
+ # Extract modified parameters
+ tools = request_kwargs.pop("tools", tools)
+ stream = request_kwargs.pop("stream", stream)
+ # Remove litellm_params from kwargs (only needed for hooks)
+ request_kwargs.pop("litellm_params", None)
+ # Merge back any other modifications
+ kwargs.update(request_kwargs)
- local_vars = locals()
loop = asyncio.get_event_loop()
kwargs["is_async"] = True
@@ -206,6 +255,11 @@ def anthropic_messages_handler(
"model": original_model,
"custom_llm_provider": custom_llm_provider,
}
+
+ # Check if stream was converted for WebSearch interception
+ # This is set in the async wrapper above when stream=True is converted to stream=False
+ if kwargs.get("_websearch_interception_converted_stream", False):
+ litellm_logging_obj.model_call_details["websearch_interception_converted_stream"] = True
if litellm_params.mock_response and isinstance(litellm_params.mock_response, str):
diff --git a/litellm/llms/custom_httpx/llm_http_handler.py b/litellm/llms/custom_httpx/llm_http_handler.py
index 490786155c6..ab1e735fca7 100644
--- a/litellm/llms/custom_httpx/llm_http_handler.py
+++ b/litellm/llms/custom_httpx/llm_http_handler.py
@@ -4418,6 +4418,41 @@ async def _call_agentic_completion_hooks(
f"LiteLLM.AgenticHookError: Exception in agentic completion hooks: {str(e)}"
)
+ # Check if we need to convert response to fake stream
+ # This happens when:
+ # 1. Stream was originally True but converted to False for WebSearch interception
+ # 2. No agentic loop ran (LLM didn't use the tool)
+ # 3. We have a non-streaming response that needs to be converted to streaming
+ websearch_converted_stream = (
+ logging_obj.model_call_details.get("websearch_interception_converted_stream", False)
+ if logging_obj is not None
+ else False
+ )
+
+ if websearch_converted_stream:
+ from typing import cast
+
+ from litellm._logging import verbose_logger
+ from litellm.llms.anthropic.experimental_pass_through.messages.fake_stream_iterator import (
+ FakeAnthropicMessagesStreamIterator,
+ )
+ from litellm.types.llms.anthropic_messages.anthropic_response import (
+ AnthropicMessagesResponse,
+ )
+
+ verbose_logger.debug(
+ "WebSearchInterception: No tool call made, converting non-streaming response to fake stream"
+ )
+
+ # Convert the non-streaming response to a fake stream
+ # The response should be an AnthropicMessagesResponse (dict)
+ if isinstance(response, dict):
+ # Create a fake streaming iterator
+ fake_stream = FakeAnthropicMessagesStreamIterator(
+ response=cast(AnthropicMessagesResponse, response)
+ )
+ return fake_stream
+
return None
def _handle_error(
diff --git a/litellm/proxy/proxy_config.yaml b/litellm/proxy/proxy_config.yaml
index 87e02a142ee..cf852805f83 100644
--- a/litellm/proxy/proxy_config.yaml
+++ b/litellm/proxy/proxy_config.yaml
@@ -46,7 +46,21 @@ model_list:
api_base: https://krish-mh44t553-eastus2.services.ai.azure.com
api_key: os.environ/AZURE_ANTHROPIC_API_KEY
+# Search Tools Configuration - Define search providers for WebSearch interception
+# search_tools:
+# - search_tool_name: "my-perplexity-search"
+# litellm_params:
+# search_provider: "perplexity" # Can be: perplexity, brave, etc.
+#     api_key: os.environ/PERPLEXITY_API_KEY
+
+litellm_settings:
+ callbacks: ["websearch_interception"]
+ # WebSearch Interception - Automatically intercepts and executes WebSearch tool calls
+ # for models that don't natively support web search (e.g., Bedrock/Claude)
+ websearch_interception_params:
+ enabled_providers: ["bedrock"] # List of providers to enable interception for
+ search_tool_name: "my-perplexity-search" # Optional: Name of search tool from search_tools config
general_settings:
store_prompts_in_spend_logs: true
- forward_client_headers_to_llm_api: true
\ No newline at end of file
+ forward_client_headers_to_llm_api: true
+
diff --git a/tests/pass_through_unit_tests/test_websearch_interception_e2e.py b/tests/pass_through_unit_tests/test_websearch_interception_e2e.py
index 2dec9da8b70..bf50c1c9cd2 100644
--- a/tests/pass_through_unit_tests/test_websearch_interception_e2e.py
+++ b/tests/pass_through_unit_tests/test_websearch_interception_e2e.py
@@ -323,3 +323,632 @@ async def test_websearch_interception_streaming():
import traceback
traceback.print_exc()
return False
+
+
+async def test_websearch_interception_no_tool_call_streaming():
+ """
+ Test WebSearch interception when LLM doesn't make a tool call with streaming.
+
+ This tests the scenario where:
+ 1. User requests stream=True
+ 2. WebSearch tool is provided
+ 3. LLM decides NOT to use the tool (just responds with text)
+ 4. System should return a fake stream
+ """
+ print("\n" + "="*80)
+ print("E2E TEST 3: WebSearch Interception (No Tool Call, Streaming)")
+ print("="*80)
+
+ # Router already initialized from test 1
+ print("\n✅ Using existing router configuration")
+ print("✅ WebSearch interception already enabled for Bedrock")
+
+ try:
+ # Make request with WebSearch tool AND stream=True
+ # Use a query that the LLM will answer directly without using the tool
+ print("\n📞 Making litellm.messages.acreate() call with stream=True...")
+ print(f" Model: bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0")
+ print(f" Query: 'What is 2+2?'")
+ print(f" Tools: WebSearch")
+ print(f" Stream: True")
+
+ response = await messages.acreate(
+ model="bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
+ messages=[{"role": "user", "content": "What is 2+2? Just give me the answer, no need to search."}],
+ tools=[
+ {
+ "name": "WebSearch",
+ "description": "Search the web for information",
+ "input_schema": {
+ "type": "object",
+ "properties": {
+ "query": {
+ "type": "string",
+ "description": "The search query",
+ }
+ },
+ "required": ["query"],
+ },
+ }
+ ],
+ max_tokens=1024,
+ stream=True, # REQUEST STREAMING
+ )
+
+ print("\n✅ Received response!")
+
+ # Check if response is actually a stream (async generator or async iterator)
+ import inspect
+ is_async_gen = inspect.isasyncgen(response)
+ is_async_iter = hasattr(response, '__aiter__') and hasattr(response, '__anext__')
+ is_stream = is_async_gen or is_async_iter
+
+ if not is_stream:
+ print("\n❌ TEST 3 FAILED: Response is NOT a stream")
+ print(f"❌ Expected a fake stream when LLM doesn't use the tool")
+ print(f"❌ Response type: {type(response)}")
+ return False
+
+ print(f"✅ Response is a stream (async_gen={is_async_gen}, async_iter={is_async_iter})")
+ print("\n📦 Consuming stream chunks:")
+
+ chunks = []
+ chunk_count = 0
+ async for chunk in response:
+ chunk_count += 1
+ print(f"\n--- Chunk {chunk_count} ---")
+ print(f" Type: {type(chunk)}")
+ print(f" Content: {chunk[:200] if isinstance(chunk, bytes) else str(chunk)[:200]}...")
+ chunks.append(chunk)
+
+ print(f"\n✅ Received {len(chunks)} stream chunk(s)")
+
+ if len(chunks) > 0:
+ print("\n" + "="*80)
+ print("✅ TEST 3 PASSED!")
+ print("="*80)
+ print("✅ User made ONE litellm.messages.acreate() call with stream=True")
+ print("✅ LLM didn't use the WebSearch tool")
+ print("✅ Got back a fake stream (not a non-streaming response)")
+ print("✅ WebSearch interception handles no-tool-call case correctly!")
+ print("="*80)
+ return True
+ else:
+ print("\n❌ TEST 3 FAILED: No chunks received")
+ return False
+
+ except Exception as e:
+ print(f"\n❌ Test 3 failed with error: {str(e)}")
+ import traceback
+ traceback.print_exc()
+ return False
+
+
+async def test_claude_code_native_websearch():
+ """
+ Test WebSearch interception with Claude Code's native web_search_20250305 tool.
+
+ This tests the exact request format that Claude Code sends:
+ - tools: [{'type': 'web_search_20250305', 'name': 'web_search', 'max_uses': 8}]
+ - Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0
+ """
+ print("\n" + "="*80)
+ print("E2E TEST: Claude Code Native WebSearch (web_search_20250305)")
+ print("="*80)
+
+ # Router already initialized from test 1
+ print("\n✅ Using existing router configuration")
+ print("✅ WebSearch interception already enabled for Bedrock")
+
+ try:
+ # Make request with Claude Code's exact native web_search tool format
+ print("\n📞 Making litellm.messages.acreate() call...")
+ print(f" Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0")
+ print(f" Query: 'Perform a web search for the query: litellm what is it'")
+ print(f" Tools: Native web_search_20250305")
+ print(f" Stream: False")
+
+ response = await messages.acreate(
+ model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
+ messages=[{"role": "user", "content": "Perform a web search for the query: litellm what is it"}],
+ tools=[
+ {
+ "type": "web_search_20250305",
+ "name": "web_search",
+ "max_uses": 8
+ }
+ ],
+ max_tokens=1024,
+ stream=False,
+ )
+
+ print("\n✅ Received response!")
+
+ # Handle both dict and object responses
+ if isinstance(response, dict):
+ response_id = response.get("id")
+ response_model = response.get("model")
+ response_stop_reason = response.get("stop_reason")
+ response_content = response.get("content", [])
+ else:
+ response_id = response.id
+ response_model = response.model
+ response_stop_reason = response.stop_reason
+ response_content = response.content
+
+ print(f"\n📄 Response ID: {response_id}")
+ print(f"📄 Model: {response_model}")
+ print(f"📄 Stop Reason: {response_stop_reason}")
+ print(f"📄 Content blocks: {len(response_content)}")
+
+ # Debug: Print all content block types
+ for i, block in enumerate(response_content):
+ block_type = block.get("type") if isinstance(block, dict) else block.type
+ print(f" Block {i}: type={block_type}")
+ if block_type == "tool_use":
+ block_name = block.get("name") if isinstance(block, dict) else block.name
+ print(f" name={block_name}")
+
+ # Validate response
+ assert response is not None, "Response should not be None"
+ assert response_content is not None, "Response should have content"
+ assert len(response_content) > 0, "Response should have at least one content block"
+
+ # Check if response contains tool_use (means interception didn't work)
+ has_tool_use = any(
+ (block.get("type") if isinstance(block, dict) else block.type) == "tool_use"
+ for block in response_content
+ )
+
+ # Check if we got a text response
+ has_text = any(
+ (block.get("type") if isinstance(block, dict) else block.type) == "text"
+ for block in response_content
+ )
+
+ if has_tool_use:
+ print("\n❌ TEST FAILED: Interception did not work")
+ print(f"❌ Stop reason: {response_stop_reason}")
+ print("❌ Response contains tool_use blocks")
+ return False
+
+ elif has_text and response_stop_reason != "tool_use":
+ text_block = next(
+ block for block in response_content
+ if (block.get("type") if isinstance(block, dict) else block.type) == "text"
+ )
+ text_content = text_block.get("text") if isinstance(text_block, dict) else text_block.text
+
+ print(f"\n📝 Response Text:")
+ print(f" {text_content[:200]}...")
+
+ if "litellm" in text_content.lower():
+ print("\n" + "="*80)
+ print("✅ TEST PASSED!")
+ print("="*80)
+ print("✅ Claude Code's native web_search_20250305 tool was intercepted")
+ print("✅ Tool was converted to LiteLLM standard format")
+ print("✅ User made ONE litellm.messages.acreate() call")
+ print("✅ Got back final answer with search results")
+ print("✅ Agentic loop executed transparently")
+ print("✅ WebSearch interception working with Claude Code!")
+ print("="*80)
+ return True
+ else:
+ print("\n⚠️ Got text response but doesn't mention LiteLLM")
+ return False
+ else:
+ print("\n❌ Unexpected response format")
+ return False
+
+ except Exception as e:
+ print(f"\n❌ Test failed with error: {str(e)}")
+ import traceback
+ traceback.print_exc()
+ return False
+
+
+if __name__ == "__main__":
+ import asyncio
+
+ async def run_all_tests():
+ """Run all E2E tests"""
+ test_results = []
+
+ # Test 1: Non-streaming
+ result1 = await test_websearch_interception_non_streaming()
+ test_results.append(("Non-Streaming", result1))
+
+ # Test 2: Streaming
+ result2 = await test_websearch_interception_streaming()
+ test_results.append(("Streaming", result2))
+
+ # Test 3: No tool call with streaming
+ result3 = await test_websearch_interception_no_tool_call_streaming()
+ test_results.append(("No Tool Call Streaming", result3))
+
+ # Test 4: Claude Code native web_search
+ result4 = await test_claude_code_native_websearch()
+ test_results.append(("Claude Code Native WebSearch", result4))
+
+ # Print summary
+ print("\n" + "="*80)
+ print("TEST SUMMARY")
+ print("="*80)
+ for test_name, result in test_results:
+ status = "✅ PASSED" if result else "❌ FAILED"
+ print(f"{test_name}: {status}")
+ print("="*80)
+
+ # Return overall result
+ return all(result for _, result in test_results)
+
+ result = asyncio.run(run_all_tests())
+ import sys
+ sys.exit(0 if result else 1)
+
+
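+async def test_execute_pre_request_hooks_no_callbacks():
+    """
+    Sketch of a unit test for the new _execute_pre_request_hooks helper (assumes
+    it remains importable from the handler module): with no callbacks registered,
+    the request kwargs should pass through unchanged.
+    """
+    from litellm.llms.anthropic.experimental_pass_through.messages.handler import (
+        _execute_pre_request_hooks,
+    )
+
+    saved_callbacks = litellm.callbacks
+    litellm.callbacks = []
+    try:
+        result = await _execute_pre_request_hooks(
+            model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
+            messages=[{"role": "user", "content": "hi"}],
+            tools=None,
+            stream=True,
+            custom_llm_provider="bedrock",
+        )
+        # With no callbacks, tools/stream/provider pass through untouched
+        assert result["stream"] is True
+        assert result["tools"] is None
+        assert result["litellm_params"]["custom_llm_provider"] == "bedrock"
+    finally:
+        litellm.callbacks = saved_callbacks
+
+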
+async def test_litellm_standard_websearch_tool():
+ """
+ PRIORITY TEST #1: Test with the canonical litellm_web_search tool format.
+
+ This validates that using get_litellm_web_search_tool() directly
+ works end-to-end without any conversion needed.
+ """
+ print("\n" + "="*80)
+ print("E2E TEST: LiteLLM Standard WebSearch Tool")
+ print("="*80)
+
+ from litellm.integrations.websearch_interception import get_litellm_web_search_tool
+
+ print("\n✅ Using existing router configuration")
+ print("✅ WebSearch interception already enabled for Bedrock")
+
+ try:
+ print("\n📞 Making litellm.messages.acreate() call...")
+ print(f" Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0")
+ print(f" Query: 'What is the latest news about AI?'")
+ print(f" Tool: litellm_web_search (standard format, no conversion needed)")
+ print(f" Stream: False")
+
+ response = await messages.acreate(
+ model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
+ messages=[{"role": "user", "content": "What is the latest news about AI? Give me a brief overview."}],
+ tools=[get_litellm_web_search_tool()],
+ max_tokens=1024,
+ stream=False,
+ )
+
+ print("\n✅ Received response!")
+
+ if isinstance(response, dict):
+ response_id = response.get("id")
+ response_stop_reason = response.get("stop_reason")
+ response_content = response.get("content", [])
+ else:
+ response_id = response.id
+ response_stop_reason = response.stop_reason
+ response_content = response.content
+
+ print(f"\n📄 Response ID: {response_id}")
+ print(f"📄 Stop Reason: {response_stop_reason}")
+ print(f"📄 Content blocks: {len(response_content)}")
+
+ for i, block in enumerate(response_content):
+ block_type = block.get("type") if isinstance(block, dict) else block.type
+ print(f" Block {i}: type={block_type}")
+
+ has_tool_use = any(
+ (block.get("type") if isinstance(block, dict) else block.type) == "tool_use"
+ for block in response_content
+ )
+
+ has_text = any(
+ (block.get("type") if isinstance(block, dict) else block.type) == "text"
+ for block in response_content
+ )
+
+ if has_tool_use:
+ print("\n❌ TEST FAILED: Interception did not work")
+ return False
+
+ elif has_text and response_stop_reason != "tool_use":
+ text_block = next(
+ block for block in response_content
+ if (block.get("type") if isinstance(block, dict) else block.type) == "text"
+ )
+ text_content = text_block.get("text") if isinstance(text_block, dict) else text_block.text
+
+ print(f"\n📝 Response Text: {text_content[:200]}...")
+
+ print("\n" + "="*80)
+ print("✅ TEST PASSED!")
+ print("="*80)
+ print("✅ LiteLLM standard tool format works without conversion")
+ print("✅ Agentic loop executed transparently")
+ print("="*80)
+ return True
+ else:
+ print("\n❌ Unexpected response format")
+ return False
+
+ except Exception as e:
+ print(f"\n❌ Test failed with error: {str(e)}")
+ import traceback
+ traceback.print_exc()
+ return False
+
+
+async def test_claude_code_native_websearch_streaming():
+ """
+ PRIORITY TEST #2: Test Claude Code's native tool WITH stream=True.
+
+ Validates:
+ - Native tool conversion (web_search_20250305 → litellm_web_search)
+ - Stream=True → Stream=False conversion
+ - Agentic loop executes with both conversions
+ """
+ print("\n" + "="*80)
+ print("E2E TEST: Claude Code Native WebSearch + Streaming")
+ print("="*80)
+
+ print("\n✅ Using existing router configuration")
+ print("✅ WebSearch interception already enabled for Bedrock")
+
+ try:
+ print("\n📞 Making litellm.messages.acreate() call with stream=True...")
+ print(f" Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0")
+ print(f" Tool: Native web_search_20250305")
+ print(f" Stream: True (will be converted to False)")
+
+ response = await messages.acreate(
+ model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
+ messages=[{"role": "user", "content": "Search for the latest AI developments."}],
+ tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 8}],
+ max_tokens=1024,
+ stream=True,
+ )
+
+ print("\n✅ Received response!")
+
+        import inspect
+        is_stream = inspect.isasyncgen(response) or (
+            hasattr(response, "__aiter__") and hasattr(response, "__anext__")
+        )
+
+ if is_stream:
+ print("\n⚠️ Response is a stream (stream conversion didn't work)")
+ return False
+
+ print("✅ Response is NOT a stream (conversion worked!)")
+
+ if isinstance(response, dict):
+ response_stop_reason = response.get("stop_reason")
+ response_content = response.get("content", [])
+ else:
+ response_stop_reason = response.stop_reason
+ response_content = response.content
+
+ has_tool_use = any(
+ (block.get("type") if isinstance(block, dict) else block.type) == "tool_use"
+ for block in response_content
+ )
+
+ has_text = any(
+ (block.get("type") if isinstance(block, dict) else block.type) == "text"
+ for block in response_content
+ )
+
+ if has_tool_use:
+ print("\n❌ TEST FAILED: Interception did not work")
+ return False
+
+ elif has_text and response_stop_reason != "tool_use":
+ print("\n" + "="*80)
+ print("✅ TEST PASSED!")
+ print("="*80)
+ print("✅ Native tool converted to litellm_web_search")
+ print("✅ Stream=True converted to Stream=False")
+ print("✅ Both conversions working together!")
+ print("="*80)
+ return True
+ else:
+ print("\n❌ Unexpected response format")
+ return False
+
+ except Exception as e:
+ print(f"\n❌ Test failed with error: {str(e)}")
+ import traceback
+ traceback.print_exc()
+ return False
+
+
+def test_is_web_search_tool_detection():
+ """
+ PRIORITY TEST #3: Unit test for is_web_search_tool() utility.
+
+ Validates detection of all supported formats including future versions.
+ """
+ print("\n" + "="*80)
+ print("UNIT TEST: Web Search Tool Detection")
+ print("="*80)
+
+ from litellm.integrations.websearch_interception import is_web_search_tool
+
+ test_cases = [
+ ({"name": "litellm_web_search"}, True, "LiteLLM standard tool"),
+ ({"type": "web_search_20250305", "name": "web_search", "max_uses": 8}, True, "Current Anthropic native (2025)"),
+ ({"type": "web_search_2026", "name": "web_search"}, True, "Future Anthropic native (2026)"),
+ ({"type": "web_search_20270615", "name": "web_search"}, True, "Future Anthropic native (2027)"),
+ ({"name": "web_search", "type": "web_search_20250305"}, True, "Claude Code format"),
+ ({"name": "WebSearch"}, True, "Legacy WebSearch"),
+ ({"name": "calculator"}, False, "Non-web-search tool"),
+ ({"name": "some_tool", "type": "function"}, False, "Other tool with type"),
+ ({"type": "custom_tool"}, False, "Custom tool type"),
+ ]
+
+ passed = 0
+ failed = 0
+
+ for tool, expected, description in test_cases:
+ result = is_web_search_tool(tool)
+ if result == expected:
+ print(f" ✅ PASS: {description}")
+ passed += 1
+ else:
+ print(f" ❌ FAIL: {description}")
+ print(f" Tool: {tool}")
+ print(f" Expected: {expected}, Got: {result}")
+ failed += 1
+
+ print(f"\n📊 Results: {passed} passed, {failed} failed")
+
+ if failed == 0:
+ print("\n" + "="*80)
+ print("✅ ALL DETECTION TESTS PASSED!")
+ print("="*80)
+ print("✅ Detects all current formats")
+ print("✅ Future-proof for new web_search_* versions")
+ print("="*80)
+ return True
+ else:
+ print("\n❌ Some detection tests failed")
+ return False
+
+
+async def test_pre_request_hook_modifies_request_body():
+ """
+ Unit test to verify async_pre_request_hook correctly modifies request body.
+
+ Tests that:
+ 1. WebSearchInterceptionLogger is active
+ 2. Native web_search_20250305 tool is converted to litellm_web_search
+ 3. Stream is converted from True to False
+ 4. Modified parameters reach the API call
+ """
+    from unittest.mock import patch
+ from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME
+
+ litellm._turn_on_debug()
+
+ print("\n" + "="*80)
+ print("UNIT TEST: Pre-Request Hook Modifies Request Body")
+ print("="*80)
+
+ # Initialize WebSearchInterceptionLogger
+ litellm.callbacks = [
+ WebSearchInterceptionLogger(
+ enabled_providers=[LlmProviders.BEDROCK],
+ search_tool_name="test-search-tool"
+ )
+ ]
+
+ print("✅ WebSearchInterceptionLogger initialized")
+
+ # Track what actually gets sent to the API
+ captured_request = {}
+
+ def mock_anthropic_messages_handler(
+ max_tokens,
+ messages,
+ model,
+ metadata=None,
+ stop_sequences=None,
+ stream=None,
+ system=None,
+ temperature=None,
+ thinking=None,
+ tool_choice=None,
+ tools=None,
+ top_k=None,
+ top_p=None,
+ container=None,
+ api_key=None,
+ api_base=None,
+ client=None,
+ custom_llm_provider=None,
+ **kwargs
+ ):
+ """Mock handler that captures the actual request parameters"""
+ # Capture what gets sent to the handler (after hook modifications)
+ captured_request['tools'] = tools
+ captured_request['stream'] = stream
+ captured_request['max_tokens'] = max_tokens
+ captured_request['model'] = model
+
+ # Return a mock response (non-streaming)
+ from litellm.types.llms.anthropic_messages.anthropic_response import AnthropicMessagesResponse
+ return AnthropicMessagesResponse(
+ id="msg_test",
+ type="message",
+ role="assistant",
+ content=[{
+ "type": "text",
+ "text": "Test response"
+ }],
+ model="claude-sonnet-4-5",
+ stop_reason="end_turn",
+ usage={
+ "input_tokens": 10,
+ "output_tokens": 20
+ }
+ )
+
+ # Patch the anthropic_messages_handler function (called after hooks)
+ with patch('litellm.llms.anthropic.experimental_pass_through.messages.handler.anthropic_messages_handler',
+ side_effect=mock_anthropic_messages_handler):
+
+ print("\n📝 Making request with native web_search_20250305 tool (stream=True)...")
+
+ # Make the request with native tool format
+ response = await messages.acreate(
+ model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
+ messages=[{"role": "user", "content": "Test query"}],
+ tools=[{
+ "type": "web_search_20250305",
+ "name": "web_search",
+ "max_uses": 8
+ }],
+ max_tokens=100,
+ stream=True # Should be converted to False
+ )
+
+ print("\n🔍 Verifying request modifications...")
+
+ # Verify tool was converted
+ tools = captured_request.get('tools')
+ print(f"\n Captured tools: {tools}")
+
+ if tools and len(tools) > 0:
+ tool = tools[0]
+ tool_name = tool.get('name')
+
+ if tool_name == LITELLM_WEB_SEARCH_TOOL_NAME:
+ print(f" ✅ Tool converted: web_search_20250305 → {LITELLM_WEB_SEARCH_TOOL_NAME}")
+ else:
+ print(f" ❌ Tool NOT converted: expected {LITELLM_WEB_SEARCH_TOOL_NAME}, got {tool_name}")
+ return False
+ else:
+ print(" ❌ No tools captured in request")
+ return False
+
+ # Verify stream was converted
+ stream = captured_request.get('stream')
+ print(f" Captured stream: {stream}")
+
+ if stream is False:
+ print(" ✅ Stream converted: True → False")
+ else:
+ print(f" ❌ Stream NOT converted: expected False, got {stream}")
+ return False
+
+ print("\n" + "="*80)
+ print("✅ PRE-REQUEST HOOK TEST PASSED!")
+ print("="*80)
+ print("✅ CustomLogger is active")
+ print("✅ async_pre_request_hook modifies request body")
+ print("✅ Tool conversion works correctly")
+ print("✅ Stream conversion works correctly")
+ print("="*80)
+
+ return True
+
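+
+async def test_fake_stream_iterator_chunks():
+    """
+    Sketch of a unit test for FakeAnthropicMessagesStreamIterator (assumes the
+    iterator pre-builds its SSE chunks from a plain response dict): both the
+    sync and async iteration protocols should yield the same bytes chunks,
+    ending with a message_stop event.
+    """
+    from litellm.llms.anthropic.experimental_pass_through.messages.fake_stream_iterator import (
+        FakeAnthropicMessagesStreamIterator,
+    )
+
+    response = {
+        "id": "msg_test",
+        "type": "message",
+        "role": "assistant",
+        "content": [{"type": "text", "text": "Hello"}],
+        "model": "claude-sonnet-4-5",
+        "stop_reason": "end_turn",
+        "stop_sequence": None,
+        "usage": {"input_tokens": 1, "output_tokens": 2},
+    }
+
+    # Sync iteration (__iter__/__next__)
+    sync_chunks = list(FakeAnthropicMessagesStreamIterator(response=response))
+    assert len(sync_chunks) > 0
+    assert all(isinstance(c, bytes) for c in sync_chunks)
+    assert b"message_stop" in sync_chunks[-1]
+
+    # Async iteration (__aiter__/__anext__) yields the same chunks
+    async_chunks = []
+    async for chunk in FakeAnthropicMessagesStreamIterator(response=response):
+        async_chunks.append(chunk)
+    assert async_chunks == sync_chunks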