diff --git a/docs/my-website/docs/tutorials/claude_code_websearch.md b/docs/my-website/docs/tutorials/claude_code_websearch.md new file mode 100644 index 00000000000..cc2f79666da --- /dev/null +++ b/docs/my-website/docs/tutorials/claude_code_websearch.md @@ -0,0 +1,192 @@ +# Claude Code - WebSearch Across All Providers + +Enable Claude Code's web search tool to work with any provider (Bedrock, Azure, Vertex, etc.). LiteLLM automatically intercepts web search requests and executes them server-side. + +## Proxy Configuration + +Add WebSearch interception to your `litellm_config.yaml`: + +```yaml +model_list: + - model_name: bedrock-sonnet + litellm_params: + model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0 + aws_region_name: us-east-1 + +# Enable WebSearch interception for providers +litellm_settings: + callbacks: + - websearch_interception: + enabled_providers: + - bedrock + - azure + - vertex_ai + search_tool_name: perplexity-search # Optional: specific search tool + +# Configure search provider +search_tools: + - search_tool_name: perplexity-search + litellm_params: + search_provider: perplexity + api_key: os.environ/PERPLEXITY_API_KEY +``` + +## Quick Start + +### 1. Configure LiteLLM Proxy + +Create `config.yaml`: + +```yaml +model_list: + - model_name: bedrock-sonnet + litellm_params: + model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0 + aws_region_name: us-east-1 + +litellm_settings: + callbacks: + - websearch_interception: + enabled_providers: [bedrock] + +search_tools: + - search_tool_name: perplexity-search + litellm_params: + search_provider: perplexity + api_key: os.environ/PERPLEXITY_API_KEY +``` + +### 2. Start Proxy + +```bash +export PERPLEXITY_API_KEY=your-key +litellm --config config.yaml +``` + +### 3. Use with Claude Code + +```bash +export ANTHROPIC_BASE_URL=http://localhost:4000 +export ANTHROPIC_API_KEY=sk-1234 +claude +``` + +Now use web search in Claude Code - it works with any provider! 
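To sanity-check the setup without Claude Code, you can send the same native web search tool to the proxy directly from Python. A minimal stdlib-only sketch — the `/v1/messages` path and `x-api-key` header follow the Anthropic Messages API convention the proxy mirrors, and the model alias, base URL, and key come from the quick start above (adjust if your config differs):

```python
import json
import urllib.request

# Native Anthropic web search tool, in the same format Claude Code sends.
# LiteLLM intercepts it server-side and runs the search via the
# configured search provider (Perplexity/Tavily).
payload = {
    "model": "bedrock-sonnet",  # model_name from config.yaml
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "What is LiteLLM?"}],
    "tools": [{"type": "web_search_20250305", "name": "web_search", "max_uses": 5}],
}


def ask(base_url: str = "http://localhost:4000", api_key: str = "sk-1234") -> dict:
    """POST the request to the proxy's Anthropic-style messages endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/messages",
        data=json.dumps(payload).encode(),
        headers={"x-api-key": api_key, "content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires the proxy from step 2 to be running.
    print(ask()["content"][0]["text"])
```

The response is a standard Anthropic Messages object: the search round-trips happen server-side, so the client only ever sees the final answer.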
+ +## How It Works + +When Claude Code sends a web search request, LiteLLM: +1. Intercepts the native `web_search` tool +2. Converts it to LiteLLM's standard format +3. Executes the search via Perplexity/Tavily +4. Returns the final answer to Claude Code + +```mermaid +sequenceDiagram + participant CC as Claude Code + participant LP as LiteLLM Proxy + participant B as Bedrock/Azure/etc + participant P as Perplexity/Tavily + + CC->>LP: Request with web_search tool + Note over LP: Convert native tool
to LiteLLM format + LP->>B: Request with converted tool + B-->>LP: Response: tool_use + Note over LP: Detect web search
tool_use + LP->>P: Execute search + P-->>LP: Search results + LP->>B: Follow-up with results + B-->>LP: Final answer + LP-->>CC: Final answer with search results +``` + +**Result**: One API call from Claude Code → Complete answer with search results + +## Supported Providers + +| Provider | Native Web Search | With LiteLLM | +|----------|-------------------|--------------| +| **Anthropic** | ✅ Yes | ✅ Yes | +| **Bedrock** | ❌ No | ✅ Yes | +| **Azure** | ❌ No | ✅ Yes | +| **Vertex AI** | ❌ No | ✅ Yes | +| **Other Providers** | ❌ No | ✅ Yes | + +## Search Providers + +Configure which search provider to use. LiteLLM supports multiple search providers: + +| Provider | Configuration | +|----------|---------------| +| **Perplexity** | `search_provider: perplexity` | +| **Tavily** | `search_provider: tavily` | + +See [all supported search providers](../search/index.md) for the complete list. + +## Configuration Options + +### WebSearch Interception Parameters + +| Parameter | Type | Required | Description | Example | +|-----------|------|----------|-------------|---------| +| `enabled_providers` | List[String] | Yes | List of providers to enable web search interception for | `[bedrock, azure, vertex_ai]` | +| `search_tool_name` | String | No | Specific search tool from `search_tools` config. If not set, uses first available search tool. 
| `perplexity-search` | + +### Supported Provider Values + +Use these values in `enabled_providers`: + +| Provider | Value | Description | +|----------|-------|-------------| +| AWS Bedrock | `bedrock` | Amazon Bedrock Claude models | +| Azure OpenAI | `azure` | Azure-hosted models | +| Google Vertex AI | `vertex_ai` | Google Cloud Vertex AI | +| Any Other | Provider name | Any LiteLLM-supported provider | + +### Complete Configuration Example + +```yaml +model_list: + - model_name: bedrock-sonnet + litellm_params: + model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0 + aws_region_name: us-east-1 + + - model_name: azure-gpt4 + litellm_params: + model: azure/gpt-4 + api_base: https://my-azure.openai.azure.com + api_key: os.environ/AZURE_API_KEY + +litellm_settings: + callbacks: + - websearch_interception: + enabled_providers: + - bedrock # Enable for AWS Bedrock + - azure # Enable for Azure OpenAI + - vertex_ai # Enable for Google Vertex + search_tool_name: perplexity-search # Optional: use specific search tool + +# Configure search tools +search_tools: + - search_tool_name: perplexity-search + litellm_params: + search_provider: perplexity + api_key: os.environ/PERPLEXITY_API_KEY + + - search_tool_name: tavily-search + litellm_params: + search_provider: tavily + api_key: os.environ/TAVILY_API_KEY +``` + +**How search tool selection works:** +- If `search_tool_name` is specified → Uses that specific search tool +- If `search_tool_name` is not specified → Uses first search tool in `search_tools` list +- In example above: Without `search_tool_name`, would use `perplexity-search` (first in list) + +## Related + +- [Claude Code Quickstart](./claude_responses_api.md) +- [Claude Code Cost Tracking](./claude_code_customer_tracking.md) +- [Using Non-Anthropic Models](./claude_non_anthropic_models.md) diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js index fd651c3adc5..102e3dfe1c5 100644 --- a/docs/my-website/sidebars.js +++ 
b/docs/my-website/sidebars.js @@ -122,6 +122,7 @@ const sidebars = { items: [ "tutorials/claude_responses_api", "tutorials/claude_code_customer_tracking", + "tutorials/claude_code_websearch", "tutorials/claude_mcp", "tutorials/claude_non_anthropic_models", ] diff --git a/litellm/constants.py b/litellm/constants.py index dba79b2f186..3bdd943481e 100644 --- a/litellm/constants.py +++ b/litellm/constants.py @@ -329,6 +329,11 @@ "medium": 5, "high": 10, } + +# LiteLLM standard web search tool name +# Used for web search interception across providers +LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search" + DEFAULT_IMAGE_ENDPOINT_MODEL = "dall-e-2" DEFAULT_VIDEO_ENDPOINT_MODEL = "sora-2" diff --git a/litellm/integrations/custom_logger.py b/litellm/integrations/custom_logger.py index 317613420a5..12243a19184 100644 --- a/litellm/integrations/custom_logger.py +++ b/litellm/integrations/custom_logger.py @@ -143,6 +143,34 @@ async def async_log_stream_event(self, kwargs, response_obj, start_time, end_tim async def async_log_pre_api_call(self, model, messages, kwargs): pass + async def async_pre_request_hook( + self, model: str, messages: List, kwargs: Dict + ) -> Optional[Dict]: + """ + Hook called before making the API request to allow modifying request parameters. + + This is specifically designed for modifying the request before it's sent to the provider. + Unlike async_log_pre_api_call (which is for logging), this hook is meant for transformations. + + Args: + model: The model name + messages: The messages list + kwargs: The request parameters (tools, stream, temperature, etc.) 
+ + Returns: + Optional[Dict]: Modified kwargs to use for the request, or None if no modifications + + Example: + ```python + async def async_pre_request_hook(self, model, messages, kwargs): + # Convert native tools to standard format + if kwargs.get("tools"): + kwargs["tools"] = convert_tools(kwargs["tools"]) + return kwargs + ``` + """ + pass + async def async_log_success_event(self, kwargs, response_obj, start_time, end_time): pass diff --git a/litellm/integrations/prometheus.py b/litellm/integrations/prometheus.py index 1e1da803e48..b490c21174f 100644 --- a/litellm/integrations/prometheus.py +++ b/litellm/integrations/prometheus.py @@ -21,7 +21,12 @@ import litellm from litellm._logging import print_verbose, verbose_logger from litellm.integrations.custom_logger import CustomLogger -from litellm.proxy._types import LiteLLM_TeamTable, LiteLLM_UserTable, UserAPIKeyAuth +from litellm.proxy._types import ( + LiteLLM_DeletedVerificationToken, + LiteLLM_TeamTable, + LiteLLM_UserTable, + UserAPIKeyAuth, +) from litellm.types.integrations.prometheus import * from litellm.types.integrations.prometheus import _sanitize_prometheus_label_name from litellm.types.utils import StandardLoggingPayload @@ -2153,7 +2158,7 @@ async def _initialize_budget_metrics( self, data_fetch_function: Callable[..., Awaitable[Tuple[List[Any], Optional[int]]]], set_metrics_function: Callable[[List[Any]], Awaitable[None]], - data_type: Literal["teams", "keys"], + data_type: Literal["teams", "keys", "users"], ): """ Generic method to initialize budget metrics for teams or API keys. 
@@ -2245,7 +2250,7 @@ async def _initialize_api_key_budget_metrics(self): async def fetch_keys( page_size: int, page: int - ) -> Tuple[List[Union[str, UserAPIKeyAuth]], Optional[int]]: + ) -> Tuple[List[Union[str, UserAPIKeyAuth, LiteLLM_DeletedVerificationToken]], Optional[int]]: key_list_response = await _list_key_helper( prisma_client=prisma_client, page=page, diff --git a/litellm/integrations/websearch_interception/ARCHITECTURE.md b/litellm/integrations/websearch_interception/ARCHITECTURE.md index 345741c3c03..3aa0a1558d7 100644 --- a/litellm/integrations/websearch_interception/ARCHITECTURE.md +++ b/litellm/integrations/websearch_interception/ARCHITECTURE.md @@ -7,6 +7,98 @@ Server-side WebSearch tool execution for models that don't natively support it ( User makes **ONE** `litellm.messages.acreate()` call → Gets final answer with search results. The agentic loop happens transparently on the server. +## LiteLLM Standard Web Search Tool + +LiteLLM defines a standard web search tool format (`litellm_web_search`) that all native provider tools are converted to. This enables consistent interception across providers. 
+ +**Standard Tool Definition** (defined in `tools.py`): +```python +{ + "name": "litellm_web_search", + "description": "Search the web for information...", + "input_schema": { + "type": "object", + "properties": { + "query": {"type": "string", "description": "The search query"} + }, + "required": ["query"] + } +} +``` + +**Tool Name Constant**: `LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search"` (defined in `litellm/constants.py`) + +### Supported Tool Formats + +The interception system automatically detects and handles: + +| Tool Format | Example | Provider | Detection Method | Future-Proof | +|-------------|---------|----------|------------------|-------------| +| **LiteLLM Standard** | `name="litellm_web_search"` | Any | Direct name match | N/A | +| **Anthropic Native** | `type="web_search_20250305"` | Bedrock, Claude API | Type prefix: `startswith("web_search_")` | ✅ Yes (web_search_2026, etc.) | +| **Claude Code CLI** | `name="web_search"`, `type="web_search_20250305"` | Claude Code | Name + type check | ✅ Yes (version-agnostic) | +| **Legacy** | `name="WebSearch"` | Custom | Name match | N/A (backwards compat) | + +**Future Compatibility**: The `startswith("web_search_")` check in `tools.py` automatically supports future Anthropic web search versions. + +### Claude Code CLI Integration + +Claude Code (Anthropic's official CLI) sends web search requests using Anthropic's native tool format: + +```python +{ + "type": "web_search_20250305", + "name": "web_search", + "max_uses": 8 +} +``` + +**What Happens:** +1. Claude Code sends native `web_search_20250305` tool to LiteLLM proxy +2. LiteLLM intercepts and converts to `litellm_web_search` standard format +3. Bedrock receives converted tool (NOT native format) +4. Model returns `tool_use` block for `litellm_web_search` (not `server_tool_use`) +5. LiteLLM's agentic loop intercepts the `tool_use` +6. Executes `litellm.asearch()` using configured provider (Perplexity, Tavily, etc.) +7. 
Returns final answer to Claude Code user + +**Without Interception**: Bedrock would receive native tool → try to execute natively → return `web_search_tool_result_error` with `invalid_tool_input` + +**With Interception**: LiteLLM converts → Bedrock returns tool_use → LiteLLM executes search → Returns final answer ✅ + +### Native Tool Conversion + +Native tools are converted to LiteLLM standard format **before** sending to the provider: + +1. **Conversion Point** (`litellm/llms/anthropic/experimental_pass_through/messages/handler.py`): + - In `anthropic_messages()` function (lines 60-127) + - Runs BEFORE the API request is made + - Detects native web search tools using `is_web_search_tool()` + - Converts to `litellm_web_search` format using `get_litellm_web_search_tool()` + - Prevents provider from executing search natively (avoids `web_search_tool_result_error`) + +2. **Response Detection** (`transformation.py`): + - Detects `tool_use` blocks with any web search tool name + - Handles: `litellm_web_search`, `WebSearch`, `web_search` + - Extracts search queries for execution + +**Example Conversion**: +```python +# Input (Claude Code's native tool) +{ + "type": "web_search_20250305", + "name": "web_search", + "max_uses": 8 +} + +# Output (LiteLLM standard) +{ + "name": "litellm_web_search", + "description": "Search the web for information...", + "input_schema": {...} +} +``` + --- ## Request Flow @@ -63,6 +155,9 @@ sequenceDiagram | Component | File | Purpose | |-----------|------|---------| | **WebSearchInterceptionLogger** | `handler.py` | CustomLogger that implements agentic loop hooks | +| **Tool Standardization** | `tools.py` | Standard tool definition, detection, and utilities | +| **Tool Name Constant** | `constants.py` | `LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search"` | +| **Tool Conversion** | `anthropic/.../ handler.py` | Converts native tools to LiteLLM standard before API call | | **Transformation Logic** | `transformation.py` | Detect tool_use, 
build tool_result messages, format search responses | | **Agentic Loop Hooks** | `integrations/custom_logger.py` | Base hooks: `async_should_run_agentic_loop()`, `async_run_agentic_loop()` | | **Hook Orchestration** | `llms/custom_httpx/llm_http_handler.py` | `_call_agentic_completion_hooks()` - calls hooks after response | @@ -74,7 +169,10 @@ sequenceDiagram ## Configuration ```python -from litellm.integrations.websearch_interception import WebSearchInterceptionLogger +from litellm.integrations.websearch_interception import ( + WebSearchInterceptionLogger, + get_litellm_web_search_tool, +) from litellm.types.utils import LlmProviders # Enable for Bedrock with specific search tool @@ -85,13 +183,25 @@ litellm.callbacks = [ ) ] -# Make request (streaming or non-streaming both work) +# Make request with LiteLLM standard tool (recommended) +response = await litellm.messages.acreate( + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", + messages=[{"role": "user", "content": "What is LiteLLM?"}], + tools=[get_litellm_web_search_tool()], # LiteLLM standard + max_tokens=1024, + stream=True # Auto-converted to non-streaming +) + +# OR send native tools - they're auto-converted to LiteLLM standard response = await litellm.messages.acreate( - model="bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0", + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", messages=[{"role": "user", "content": "What is LiteLLM?"}], - tools=[{"name": "WebSearch", ...}], + tools=[{ + "type": "web_search_20250305", # Native Anthropic format + "name": "web_search", + "max_uses": 8 + }], max_tokens=1024, - stream=True # Streaming is automatically converted to non-streaming for WebSearch ) ``` diff --git a/litellm/integrations/websearch_interception/__init__.py b/litellm/integrations/websearch_interception/__init__.py index c0feb5235e2..f5b1963c1cf 100644 --- a/litellm/integrations/websearch_interception/__init__.py +++ 
b/litellm/integrations/websearch_interception/__init__.py @@ -8,5 +8,13 @@ from litellm.integrations.websearch_interception.handler import ( WebSearchInterceptionLogger, ) +from litellm.integrations.websearch_interception.tools import ( + get_litellm_web_search_tool, + is_web_search_tool, +) -__all__ = ["WebSearchInterceptionLogger"] +__all__ = [ + "WebSearchInterceptionLogger", + "get_litellm_web_search_tool", + "is_web_search_tool", +] diff --git a/litellm/integrations/websearch_interception/handler.py b/litellm/integrations/websearch_interception/handler.py index 0b08bc2312a..943a2bb4f36 100644 --- a/litellm/integrations/websearch_interception/handler.py +++ b/litellm/integrations/websearch_interception/handler.py @@ -12,7 +12,12 @@ import litellm from litellm._logging import verbose_logger from litellm.anthropic_interface import messages as anthropic_messages +from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME from litellm.integrations.custom_logger import CustomLogger +from litellm.integrations.websearch_interception.tools import ( + get_litellm_web_search_tool, + is_web_search_tool, +) from litellm.integrations.websearch_interception.transformation import ( WebSearchTransformation, ) @@ -57,6 +62,55 @@ def __init__( for p in enabled_providers ] self.search_tool_name = search_tool_name + self._request_has_websearch = False # Track if current request has web search + + async def async_pre_call_deployment_hook( + self, kwargs: Dict[str, Any], call_type: Optional[Any] + ) -> Optional[dict]: + """ + Pre-call hook to convert native Anthropic web_search tools to regular tools. + + This prevents Bedrock from trying to execute web search server-side (which fails). + Instead, we convert it to a regular tool so the model returns tool_use blocks + that we can intercept and execute ourselves. 
+ """ + # Check if this is for an enabled provider + custom_llm_provider = kwargs.get("litellm_params", {}).get("custom_llm_provider", "") + if custom_llm_provider not in self.enabled_providers: + return None + + # Check if request has tools with native web_search + tools = kwargs.get("tools") + if not tools: + return None + + # Check if any tool is a web search tool (native or already LiteLLM standard) + has_websearch = any(is_web_search_tool(t) for t in tools) + + if not has_websearch: + return None + + verbose_logger.debug( + "WebSearchInterception: Converting native web_search tools to LiteLLM standard" + ) + + # Convert native/custom web_search tools to LiteLLM standard + converted_tools = [] + for tool in tools: + if is_web_search_tool(tool): + # Convert to LiteLLM standard web search tool + converted_tool = get_litellm_web_search_tool() + converted_tools.append(converted_tool) + verbose_logger.debug( + f"WebSearchInterception: Converted {tool.get('name', 'unknown')} " + f"(type={tool.get('type', 'none')}) to {LITELLM_WEB_SEARCH_TOOL_NAME}" + ) + else: + # Keep other tools as-is + converted_tools.append(tool) + + # Return modified kwargs with converted tools + return {"tools": converted_tools} @classmethod def from_config_yaml( @@ -104,6 +158,83 @@ def from_config_yaml( search_tool_name=search_tool_name, ) + async def async_pre_request_hook( + self, model: str, messages: List[Dict], kwargs: Dict + ) -> Optional[Dict]: + """ + Pre-request hook to convert native web search tools to LiteLLM standard. + + This hook is called before the API request is made, allowing us to: + 1. Detect native web search tools (web_search_20250305, etc.) + 2. Convert them to LiteLLM standard format (litellm_web_search) + 3. Convert stream=True to stream=False for interception + + This prevents providers like Bedrock from trying to execute web search + natively (which fails), and ensures our agentic loop can intercept tool_use. 
+ + Returns: + Modified kwargs dict with converted tools, or None if no modifications needed + """ + # Check if this request is for an enabled provider + custom_llm_provider = kwargs.get("litellm_params", {}).get( + "custom_llm_provider", "" + ) + + verbose_logger.debug( + f"WebSearchInterception: Pre-request hook called" + f" - custom_llm_provider={custom_llm_provider}" + f" - enabled_providers={self.enabled_providers}" + ) + + if custom_llm_provider not in self.enabled_providers: + verbose_logger.debug( + f"WebSearchInterception: Skipping - provider {custom_llm_provider} not in {self.enabled_providers}" + ) + return None + + # Check if request has tools + tools = kwargs.get("tools") + if not tools: + return None + + # Check if any tool is a web search tool + has_websearch = any(is_web_search_tool(t) for t in tools) + if not has_websearch: + return None + + verbose_logger.debug( + f"WebSearchInterception: Pre-request hook triggered for provider={custom_llm_provider}" + ) + + # Convert native web search tools to LiteLLM standard + converted_tools = [] + for tool in tools: + if is_web_search_tool(tool): + standard_tool = get_litellm_web_search_tool() + converted_tools.append(standard_tool) + verbose_logger.debug( + f"WebSearchInterception: Converted {tool.get('name', 'unknown')} " + f"(type={tool.get('type', 'none')}) to {LITELLM_WEB_SEARCH_TOOL_NAME}" + ) + else: + converted_tools.append(tool) + + # Update kwargs with converted tools + kwargs["tools"] = converted_tools + verbose_logger.debug( + f"WebSearchInterception: Tools after conversion: {[t.get('name') for t in converted_tools]}" + ) + + # Convert stream=True to stream=False for WebSearch interception + if kwargs.get("stream"): + verbose_logger.debug( + "WebSearchInterception: Converting stream=True to stream=False" + ) + kwargs["stream"] = False + kwargs["_websearch_interception_converted_stream"] = True + + return kwargs + async def async_should_run_agentic_loop( self, response: Any, @@ -128,11 +259,11 @@ 
async def async_should_run_agentic_loop( ) return False, {} - # Check if tools include WebSearch - has_websearch_tool = any(t.get("name") == "WebSearch" for t in (tools or [])) + # Check if tools include any web search tool (LiteLLM standard or native) + has_websearch_tool = any(is_web_search_tool(t) for t in (tools or [])) if not has_websearch_tool: verbose_logger.debug( - "WebSearchInterception: No WebSearch tool in request" + "WebSearchInterception: No web search tool in request" ) return False, {} diff --git a/litellm/integrations/websearch_interception/tools.py b/litellm/integrations/websearch_interception/tools.py new file mode 100644 index 00000000000..4f8b7372fe3 --- /dev/null +++ b/litellm/integrations/websearch_interception/tools.py @@ -0,0 +1,95 @@ +""" +LiteLLM Web Search Tool Definition + +This module defines the standard web search tool used across LiteLLM. +Native provider tools (like Anthropic's web_search_20250305) are converted +to this format for consistent interception and execution. +""" + +from typing import Any, Dict + +from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME + + +def get_litellm_web_search_tool() -> Dict[str, Any]: + """ + Get the standard LiteLLM web search tool definition. + + This is the canonical tool definition that all native web search tools + (like Anthropic's web_search_20250305, Claude Code's web_search, etc.) + are converted to for interception. + + Returns: + Dict containing the Anthropic-style tool definition with: + - name: Tool name + - description: What the tool does + - input_schema: JSON schema for tool parameters + + Example: + >>> tool = get_litellm_web_search_tool() + >>> tool['name'] + 'litellm_web_search' + """ + return { + "name": LITELLM_WEB_SEARCH_TOOL_NAME, + "description": ( + "Search the web for information. Use this when you need current " + "information or answers to questions that require up-to-date data." 
+ ), + "input_schema": { + "type": "object", + "properties": { + "query": { + "type": "string", + "description": "The search query to execute" + } + }, + "required": ["query"] + } + } + + +def is_web_search_tool(tool: Dict[str, Any]) -> bool: + """ + Check if a tool is a web search tool (native or LiteLLM standard). + + Detects: + - LiteLLM standard: name == "litellm_web_search" + - Anthropic native: type starts with "web_search_" (e.g., "web_search_20250305") + - Claude Code: name == "web_search" with a type field + - Custom: name == "WebSearch" (legacy format) + + Args: + tool: Tool dictionary to check + + Returns: + True if tool is a web search tool + + Example: + >>> is_web_search_tool({"name": "litellm_web_search"}) + True + >>> is_web_search_tool({"type": "web_search_20250305", "name": "web_search"}) + True + >>> is_web_search_tool({"name": "calculator"}) + False + """ + tool_name = tool.get("name", "") + tool_type = tool.get("type", "") + + # Check for LiteLLM standard tool + if tool_name == LITELLM_WEB_SEARCH_TOOL_NAME: + return True + + # Check for native Anthropic web_search_* types + if tool_type.startswith("web_search_"): + return True + + # Check for Claude Code's web_search with a type field + if tool_name == "web_search" and tool_type: + return True + + # Check for legacy WebSearch format + if tool_name == "WebSearch": + return True + + return False diff --git a/litellm/integrations/websearch_interception/transformation.py b/litellm/integrations/websearch_interception/transformation.py index e8211311281..313358822a5 100644 --- a/litellm/integrations/websearch_interception/transformation.py +++ b/litellm/integrations/websearch_interception/transformation.py @@ -7,6 +7,7 @@ from typing import Any, Dict, List, Tuple from litellm._logging import verbose_logger +from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME from litellm.llms.base_llm.search.transformation import SearchResponse @@ -94,17 +95,21 @@ def _detect_from_non_streaming_response( 
block_id = getattr(block, "id", None) block_input = getattr(block, "input", {}) - if block_type == "tool_use" and block_name == "WebSearch": + # Check for LiteLLM standard or legacy web search tools + # Handles: litellm_web_search, WebSearch, web_search + if block_type == "tool_use" and block_name in ( + LITELLM_WEB_SEARCH_TOOL_NAME, "WebSearch", "web_search" + ): # Convert to dict for easier handling tool_call = { "id": block_id, "type": "tool_use", - "name": "WebSearch", + "name": block_name, # Preserve original name "input": block_input, } tool_calls.append(tool_call) verbose_logger.debug( - f"WebSearchInterception: Found WebSearch tool_use with id={tool_call['id']}" + f"WebSearchInterception: Found {block_name} tool_use with id={tool_call['id']}" ) return len(tool_calls) > 0, tool_calls diff --git a/litellm/llms/anthropic/experimental_pass_through/messages/fake_stream_iterator.py b/litellm/llms/anthropic/experimental_pass_through/messages/fake_stream_iterator.py new file mode 100644 index 00000000000..542ae20b602 --- /dev/null +++ b/litellm/llms/anthropic/experimental_pass_through/messages/fake_stream_iterator.py @@ -0,0 +1,246 @@ +""" +Fake Streaming Iterator for Anthropic Messages + +This module provides a fake streaming iterator that converts non-streaming +Anthropic Messages responses into proper streaming format. + +Used when WebSearch interception converts stream=True to stream=False but +the LLM doesn't make a tool call, and we need to return a stream to the user. +""" + +import json +from typing import Any, Dict, List, cast + +from litellm.types.llms.anthropic_messages.anthropic_response import ( + AnthropicMessagesResponse, +) + + +class FakeAnthropicMessagesStreamIterator: + """ + Fake streaming iterator for Anthropic Messages responses. + + Used when we need to convert a non-streaming response to a streaming format, + such as when WebSearch interception converts stream=True to stream=False but + the LLM doesn't make a tool call. 
+ + This creates a proper Anthropic-style streaming response with multiple events: + - message_start + - content_block_start (for each content block) + - content_block_delta (for text content, chunked) + - content_block_stop + - message_delta (for usage) + - message_stop + """ + + def __init__(self, response: AnthropicMessagesResponse): + self.response = response + self.chunks = self._create_streaming_chunks() + self.current_index = 0 + + def _create_streaming_chunks(self) -> List[bytes]: + """Convert the non-streaming response to streaming chunks""" + chunks = [] + + # Cast response to dict for easier access + response_dict = cast(Dict[str, Any], self.response) + + # 1. message_start event + usage = response_dict.get("usage", {}) + message_start = { + "type": "message_start", + "message": { + "id": response_dict.get("id"), + "type": "message", + "role": response_dict.get("role", "assistant"), + "model": response_dict.get("model"), + "content": [], + "stop_reason": None, + "stop_sequence": None, + "usage": { + "input_tokens": usage.get("input_tokens", 0) if usage else 0, + "output_tokens": 0 + } + } + } + chunks.append(f"event: message_start\ndata: {json.dumps(message_start)}\n\n".encode()) + + # 2-4. 
For each content block, send start/delta/stop events + content_blocks = response_dict.get("content", []) + if content_blocks: + for index, block in enumerate(content_blocks): + # Cast block to dict for easier access + block_dict = cast(Dict[str, Any], block) + block_type = block_dict.get("type") + + if block_type == "text": + # content_block_start + content_block_start = { + "type": "content_block_start", + "index": index, + "content_block": { + "type": "text", + "text": "" + } + } + chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode()) + + # content_block_delta (send full text as one delta for simplicity) + text = block_dict.get("text", "") + content_block_delta = { + "type": "content_block_delta", + "index": index, + "delta": { + "type": "text_delta", + "text": text + } + } + chunks.append(f"event: content_block_delta\ndata: {json.dumps(content_block_delta)}\n\n".encode()) + + # content_block_stop + content_block_stop = { + "type": "content_block_stop", + "index": index + } + chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode()) + + elif block_type == "thinking": + # content_block_start for thinking + content_block_start = { + "type": "content_block_start", + "index": index, + "content_block": { + "type": "thinking", + "thinking": "", + "signature": "" + } + } + chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode()) + + # content_block_delta for thinking text + thinking_text = block_dict.get("thinking", "") + if thinking_text: + content_block_delta = { + "type": "content_block_delta", + "index": index, + "delta": { + "type": "thinking_delta", + "thinking": thinking_text + } + } + chunks.append(f"event: content_block_delta\ndata: {json.dumps(content_block_delta)}\n\n".encode()) + + # content_block_delta for signature (if present) + signature = block_dict.get("signature", "") + if signature: + signature_delta = { + "type": 
"content_block_delta", + "index": index, + "delta": { + "type": "signature_delta", + "signature": signature + } + } + chunks.append(f"event: content_block_delta\ndata: {json.dumps(signature_delta)}\n\n".encode()) + + # content_block_stop + content_block_stop = { + "type": "content_block_stop", + "index": index + } + chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode()) + + elif block_type == "redacted_thinking": + # content_block_start for redacted_thinking + content_block_start = { + "type": "content_block_start", + "index": index, + "content_block": { + "type": "redacted_thinking" + } + } + chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode()) + + # content_block_stop (no delta for redacted thinking) + content_block_stop = { + "type": "content_block_stop", + "index": index + } + chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode()) + + elif block_type == "tool_use": + # content_block_start + content_block_start = { + "type": "content_block_start", + "index": index, + "content_block": { + "type": "tool_use", + "id": block_dict.get("id"), + "name": block_dict.get("name"), + "input": {} + } + } + chunks.append(f"event: content_block_start\ndata: {json.dumps(content_block_start)}\n\n".encode()) + + # content_block_delta (send input as JSON delta) + input_data = block_dict.get("input", {}) + content_block_delta = { + "type": "content_block_delta", + "index": index, + "delta": { + "type": "input_json_delta", + "partial_json": json.dumps(input_data) + } + } + chunks.append(f"event: content_block_delta\ndata: {json.dumps(content_block_delta)}\n\n".encode()) + + # content_block_stop + content_block_stop = { + "type": "content_block_stop", + "index": index + } + chunks.append(f"event: content_block_stop\ndata: {json.dumps(content_block_stop)}\n\n".encode()) + + # 5. 
message_delta event (with final usage and stop_reason) + message_delta = { + "type": "message_delta", + "delta": { + "stop_reason": response_dict.get("stop_reason"), + "stop_sequence": response_dict.get("stop_sequence") + }, + "usage": { + "output_tokens": usage.get("output_tokens", 0) if usage else 0 + } + } + chunks.append(f"event: message_delta\ndata: {json.dumps(message_delta)}\n\n".encode()) + + # 6. message_stop event + message_stop = { + "type": "message_stop", + "usage": usage if usage else {} + } + chunks.append(f"event: message_stop\ndata: {json.dumps(message_stop)}\n\n".encode()) + + return chunks + + def __aiter__(self): + return self + + async def __anext__(self): + if self.current_index >= len(self.chunks): + raise StopAsyncIteration + + chunk = self.chunks[self.current_index] + self.current_index += 1 + return chunk + + def __iter__(self): + return self + + def __next__(self): + if self.current_index >= len(self.chunks): + raise StopIteration + + chunk = self.chunks[self.current_index] + self.current_index += 1 + return chunk diff --git a/litellm/llms/anthropic/experimental_pass_through/messages/handler.py b/litellm/llms/anthropic/experimental_pass_through/messages/handler.py index 11245b1bdba..7e5a4f22a7f 100644 --- a/litellm/llms/anthropic/experimental_pass_through/messages/handler.py +++ b/litellm/llms/anthropic/experimental_pass_through/messages/handler.py @@ -33,6 +33,70 @@ ################################################# +async def _execute_pre_request_hooks( + model: str, + messages: List[Dict], + tools: Optional[List[Dict]], + stream: Optional[bool], + custom_llm_provider: Optional[str], + **kwargs, +) -> Dict: + """ + Execute pre-request hooks from CustomLogger callbacks. + + Allows CustomLoggers to modify request parameters before the API call. + Used for WebSearch tool conversion, stream modification, etc. 
+ + Args: + model: Model name + messages: List of messages + tools: Optional tools list + stream: Optional stream flag + custom_llm_provider: Provider name (if not set, will be extracted from model) + **kwargs: Additional request parameters + + Returns: + Dict containing all (potentially modified) request parameters including tools, stream + """ + # If custom_llm_provider not provided, extract from model + if not custom_llm_provider: + try: + _, custom_llm_provider, _, _ = litellm.get_llm_provider(model=model) + except Exception: + # If extraction fails, continue without provider + pass + + # Build complete request kwargs dict + request_kwargs = { + "tools": tools, + "stream": stream, + "litellm_params": { + "custom_llm_provider": custom_llm_provider, + }, + **kwargs, + } + + if not litellm.callbacks: + return request_kwargs + + from litellm.integrations.custom_logger import CustomLogger as _CustomLogger + + for callback in litellm.callbacks: + if not isinstance(callback, _CustomLogger): + continue + + # Call the pre-request hook + modified_kwargs = await callback.async_pre_request_hook( + model, messages, request_kwargs + ) + + # If hook returned modified kwargs, use them + if modified_kwargs is not None: + request_kwargs = modified_kwargs + + return request_kwargs + + @client async def anthropic_messages( max_tokens: int, @@ -57,39 +121,24 @@ async def anthropic_messages( """ Async: Make llm api request in Anthropic /messages API spec """ - # WebSearch Interception: Convert stream=True to stream=False if WebSearch interception is enabled - # This allows transparent server-side agentic loop execution for streaming requests - if stream and tools and any(t.get("name") == "WebSearch" for t in tools): - # Extract provider using litellm's helper function - try: - _, provider, _, _ = litellm.get_llm_provider( - model=model, - custom_llm_provider=custom_llm_provider, - api_base=api_base, - api_key=api_key, - ) - except Exception: - # Fallback to simple split if helper 
fails - provider = model.split("/")[0] if "/" in model else "" + # Execute pre-request hooks to allow CustomLoggers to modify request + request_kwargs = await _execute_pre_request_hooks( + model=model, + messages=messages, + tools=tools, + stream=stream, + custom_llm_provider=custom_llm_provider, + **kwargs, + ) - # Check if WebSearch interception is enabled in callbacks - from litellm._logging import verbose_logger - from litellm.integrations.websearch_interception import ( - WebSearchInterceptionLogger, - ) - if litellm.callbacks: - for callback in litellm.callbacks: - if isinstance(callback, WebSearchInterceptionLogger): - # Check if provider is enabled for interception - if provider in callback.enabled_providers: - verbose_logger.debug( - f"WebSearchInterception: Converting stream=True to stream=False for WebSearch interception " - f"(provider={provider})" - ) - stream = False - break + # Extract modified parameters + tools = request_kwargs.pop("tools", tools) + stream = request_kwargs.pop("stream", stream) + # Remove litellm_params from kwargs (only needed for hooks) + request_kwargs.pop("litellm_params", None) + # Merge back any other modifications + kwargs.update(request_kwargs) - local_vars = locals() loop = asyncio.get_event_loop() kwargs["is_async"] = True @@ -206,6 +255,11 @@ def anthropic_messages_handler( "model": original_model, "custom_llm_provider": custom_llm_provider, } + + # Check if stream was converted for WebSearch interception + # This is set in the async wrapper above when stream=True is converted to stream=False + if kwargs.get("_websearch_interception_converted_stream", False): + litellm_logging_obj.model_call_details["websearch_interception_converted_stream"] = True if litellm_params.mock_response and isinstance(litellm_params.mock_response, str): diff --git a/litellm/llms/custom_httpx/llm_http_handler.py b/litellm/llms/custom_httpx/llm_http_handler.py index 490786155c6..ab1e735fca7 100644 --- 
a/litellm/llms/custom_httpx/llm_http_handler.py +++ b/litellm/llms/custom_httpx/llm_http_handler.py @@ -4418,6 +4418,41 @@ async def _call_agentic_completion_hooks( f"LiteLLM.AgenticHookError: Exception in agentic completion hooks: {str(e)}" ) + # Check if we need to convert response to fake stream + # This happens when: + # 1. Stream was originally True but converted to False for WebSearch interception + # 2. No agentic loop ran (LLM didn't use the tool) + # 3. We have a non-streaming response that needs to be converted to streaming + websearch_converted_stream = ( + logging_obj.model_call_details.get("websearch_interception_converted_stream", False) + if logging_obj is not None + else False + ) + + if websearch_converted_stream: + from typing import cast + + from litellm._logging import verbose_logger + from litellm.llms.anthropic.experimental_pass_through.messages.fake_stream_iterator import ( + FakeAnthropicMessagesStreamIterator, + ) + from litellm.types.llms.anthropic_messages.anthropic_response import ( + AnthropicMessagesResponse, + ) + + verbose_logger.debug( + "WebSearchInterception: No tool call made, converting non-streaming response to fake stream" + ) + + # Convert the non-streaming response to a fake stream + # The response should be an AnthropicMessagesResponse (dict) + if isinstance(response, dict): + # Create a fake streaming iterator + fake_stream = FakeAnthropicMessagesStreamIterator( + response=cast(AnthropicMessagesResponse, response) + ) + return fake_stream + return None def _handle_error( diff --git a/litellm/proxy/proxy_config.yaml b/litellm/proxy/proxy_config.yaml index 87e02a142ee..cf852805f83 100644 --- a/litellm/proxy/proxy_config.yaml +++ b/litellm/proxy/proxy_config.yaml @@ -46,7 +46,21 @@ model_list: api_base: https://krish-mh44t553-eastus2.services.ai.azure.com api_key: os.environ/AZURE_ANTHROPIC_API_KEY +# Search Tools Configuration - Define search providers for WebSearch interception +# search_tools: +# - search_tool_name: 
"my-perplexity-search" +# litellm_params: +# search_provider: "perplexity" # Can be: perplexity, brave, etc. + +litellm_settings: + callbacks: ["websearch_interception"] + # WebSearch Interception - Automatically intercepts and executes WebSearch tool calls + # for models that don't natively support web search (e.g., Bedrock/Claude) + websearch_interception_params: + enabled_providers: ["bedrock"] # List of providers to enable interception for + search_tool_name: "my-perplexity-search" # Optional: Name of search tool from search_tools config general_settings: store_prompts_in_spend_logs: true - forward_client_headers_to_llm_api: true \ No newline at end of file + forward_client_headers_to_llm_api: true + diff --git a/tests/pass_through_unit_tests/test_websearch_interception_e2e.py b/tests/pass_through_unit_tests/test_websearch_interception_e2e.py index 2dec9da8b70..bf50c1c9cd2 100644 --- a/tests/pass_through_unit_tests/test_websearch_interception_e2e.py +++ b/tests/pass_through_unit_tests/test_websearch_interception_e2e.py @@ -323,3 +323,632 @@ async def test_websearch_interception_streaming(): import traceback traceback.print_exc() return False + + +async def test_websearch_interception_no_tool_call_streaming(): + """ + Test WebSearch interception when LLM doesn't make a tool call with streaming. + + This tests the scenario where: + 1. User requests stream=True + 2. WebSearch tool is provided + 3. LLM decides NOT to use the tool (just responds with text) + 4. 
System should return a fake stream + """ + print("\n" + "="*80) + print("E2E TEST 3: WebSearch Interception (No Tool Call, Streaming)") + print("="*80) + + # Router already initialized from test 1 + print("\n✅ Using existing router configuration") + print("✅ WebSearch interception already enabled for Bedrock") + + try: + # Make request with WebSearch tool AND stream=True + # Use a query that the LLM will answer directly without using the tool + print("\n📞 Making litellm.messages.acreate() call with stream=True...") + print(f" Model: bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0") + print(f" Query: 'What is 2+2?'") + print(f" Tools: WebSearch") + print(f" Stream: True") + + response = await messages.acreate( + model="bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0", + messages=[{"role": "user", "content": "What is 2+2? Just give me the answer, no need to search."}], + tools=[ + { + "name": "WebSearch", + "description": "Search the web for information", + "input_schema": { + "type": "object", + "properties": { + "query": { + "type": "string", + "description": "The search query", + } + }, + "required": ["query"], + }, + } + ], + max_tokens=1024, + stream=True, # REQUEST STREAMING + ) + + print("\n✅ Received response!") + + # Check if response is actually a stream (async generator or async iterator) + import inspect + is_async_gen = inspect.isasyncgen(response) + is_async_iter = hasattr(response, '__aiter__') and hasattr(response, '__anext__') + is_stream = is_async_gen or is_async_iter + + if not is_stream: + print("\n❌ TEST 3 FAILED: Response is NOT a stream") + print(f"❌ Expected a fake stream when LLM doesn't use the tool") + print(f"❌ Response type: {type(response)}") + return False + + print(f"✅ Response is a stream (async_gen={is_async_gen}, async_iter={is_async_iter})") + print("\n📦 Consuming stream chunks:") + + chunks = [] + chunk_count = 0 + async for chunk in response: + chunk_count += 1 + print(f"\n--- Chunk {chunk_count} ---") + print(f" Type: 
{type(chunk)}") + print(f" Content: {chunk[:200] if isinstance(chunk, bytes) else str(chunk)[:200]}...") + chunks.append(chunk) + + print(f"\n✅ Received {len(chunks)} stream chunk(s)") + + if len(chunks) > 0: + print("\n" + "="*80) + print("✅ TEST 3 PASSED!") + print("="*80) + print("✅ User made ONE litellm.messages.acreate() call with stream=True") + print("✅ LLM didn't use the WebSearch tool") + print("✅ Got back a fake stream (not a non-streaming response)") + print("✅ WebSearch interception handles no-tool-call case correctly!") + print("="*80) + return True + else: + print("\n❌ TEST 3 FAILED: No chunks received") + return False + + except Exception as e: + print(f"\n❌ Test 3 failed with error: {str(e)}") + import traceback + traceback.print_exc() + return False + + +async def test_claude_code_native_websearch(): + """ + Test WebSearch interception with Claude Code's native web_search_20250305 tool. + + This tests the exact request format that Claude Code sends: + - tools: [{'type': 'web_search_20250305', 'name': 'web_search', 'max_uses': 8}] + - Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0 + """ + print("\n" + "="*80) + print("E2E TEST: Claude Code Native WebSearch (web_search_20250305)") + print("="*80) + + # Router already initialized from test 1 + print("\n✅ Using existing router configuration") + print("✅ WebSearch interception already enabled for Bedrock") + + try: + # Make request with Claude Code's exact native web_search tool format + print("\n📞 Making litellm.messages.acreate() call...") + print(f" Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0") + print(f" Query: 'Perform a web search for the query: litellm what is it'") + print(f" Tools: Native web_search_20250305") + print(f" Stream: False") + + response = await messages.acreate( + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", + messages=[{"role": "user", "content": "Perform a web search for the query: litellm what is it"}], + tools=[ + { + "type": 
"web_search_20250305", + "name": "web_search", + "max_uses": 8 + } + ], + max_tokens=1024, + stream=False, + ) + + print("\n✅ Received response!") + + # Handle both dict and object responses + if isinstance(response, dict): + response_id = response.get("id") + response_model = response.get("model") + response_stop_reason = response.get("stop_reason") + response_content = response.get("content", []) + else: + response_id = response.id + response_model = response.model + response_stop_reason = response.stop_reason + response_content = response.content + + print(f"\n📄 Response ID: {response_id}") + print(f"📄 Model: {response_model}") + print(f"📄 Stop Reason: {response_stop_reason}") + print(f"📄 Content blocks: {len(response_content)}") + + # Debug: Print all content block types + for i, block in enumerate(response_content): + block_type = block.get("type") if isinstance(block, dict) else block.type + print(f" Block {i}: type={block_type}") + if block_type == "tool_use": + block_name = block.get("name") if isinstance(block, dict) else block.name + print(f" name={block_name}") + + # Validate response + assert response is not None, "Response should not be None" + assert response_content is not None, "Response should have content" + assert len(response_content) > 0, "Response should have at least one content block" + + # Check if response contains tool_use (means interception didn't work) + has_tool_use = any( + (block.get("type") if isinstance(block, dict) else block.type) == "tool_use" + for block in response_content + ) + + # Check if we got a text response + has_text = any( + (block.get("type") if isinstance(block, dict) else block.type) == "text" + for block in response_content + ) + + if has_tool_use: + print("\n❌ TEST FAILED: Interception did not work") + print(f"❌ Stop reason: {response_stop_reason}") + print("❌ Response contains tool_use blocks") + return False + + elif has_text and response_stop_reason != "tool_use": + text_block = next( + block for block in 
response_content + if (block.get("type") if isinstance(block, dict) else block.type) == "text" + ) + text_content = text_block.get("text") if isinstance(text_block, dict) else text_block.text + + print(f"\n📝 Response Text:") + print(f" {text_content[:200]}...") + + if "litellm" in text_content.lower(): + print("\n" + "="*80) + print("✅ TEST PASSED!") + print("="*80) + print("✅ Claude Code's native web_search_20250305 tool was intercepted") + print("✅ Tool was converted to LiteLLM standard format") + print("✅ User made ONE litellm.messages.acreate() call") + print("✅ Got back final answer with search results") + print("✅ Agentic loop executed transparently") + print("✅ WebSearch interception working with Claude Code!") + print("="*80) + return True + else: + print("\n⚠️ Got text response but doesn't mention LiteLLM") + return False + else: + print("\n❌ Unexpected response format") + return False + + except Exception as e: + print(f"\n❌ Test failed with error: {str(e)}") + import traceback + traceback.print_exc() + return False + + +if __name__ == "__main__": + import asyncio + + async def run_all_tests(): + """Run all E2E tests""" + test_results = [] + + # Test 1: Non-streaming + result1 = await test_websearch_interception_non_streaming() + test_results.append(("Non-Streaming", result1)) + + # Test 2: Streaming + result2 = await test_websearch_interception_streaming() + test_results.append(("Streaming", result2)) + + # Test 3: No tool call with streaming + result3 = await test_websearch_interception_no_tool_call_streaming() + test_results.append(("No Tool Call Streaming", result3)) + + # Test 4: Claude Code native web_search + result4 = await test_claude_code_native_websearch() + test_results.append(("Claude Code Native WebSearch", result4)) + + # Print summary + print("\n" + "="*80) + print("TEST SUMMARY") + print("="*80) + for test_name, result in test_results: + status = "✅ PASSED" if result else "❌ FAILED" + print(f"{test_name}: {status}") + print("="*80) + + # 
Return overall result + return all(result for _, result in test_results) + + result = asyncio.run(run_all_tests()) + import sys + sys.exit(0 if result else 1) + + +async def test_litellm_standard_websearch_tool(): + """ + PRIORITY TEST #1: Test with the canonical litellm_web_search tool format. + + This validates that using get_litellm_web_search_tool() directly + works end-to-end without any conversion needed. + """ + print("\n" + "="*80) + print("E2E TEST: LiteLLM Standard WebSearch Tool") + print("="*80) + + from litellm.integrations.websearch_interception import get_litellm_web_search_tool + + print("\n✅ Using existing router configuration") + print("✅ WebSearch interception already enabled for Bedrock") + + try: + print("\n📞 Making litellm.messages.acreate() call...") + print(f" Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0") + print(f" Query: 'What is the latest news about AI?'") + print(f" Tool: litellm_web_search (standard format, no conversion needed)") + print(f" Stream: False") + + response = await messages.acreate( + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", + messages=[{"role": "user", "content": "What is the latest news about AI? 
Give me a brief overview."}], + tools=[get_litellm_web_search_tool()], + max_tokens=1024, + stream=False, + ) + + print("\n✅ Received response!") + + if isinstance(response, dict): + response_id = response.get("id") + response_stop_reason = response.get("stop_reason") + response_content = response.get("content", []) + else: + response_id = response.id + response_stop_reason = response.stop_reason + response_content = response.content + + print(f"\n📄 Response ID: {response_id}") + print(f"📄 Stop Reason: {response_stop_reason}") + print(f"📄 Content blocks: {len(response_content)}") + + for i, block in enumerate(response_content): + block_type = block.get("type") if isinstance(block, dict) else block.type + print(f" Block {i}: type={block_type}") + + has_tool_use = any( + (block.get("type") if isinstance(block, dict) else block.type) == "tool_use" + for block in response_content + ) + + has_text = any( + (block.get("type") if isinstance(block, dict) else block.type) == "text" + for block in response_content + ) + + if has_tool_use: + print("\n❌ TEST FAILED: Interception did not work") + return False + + elif has_text and response_stop_reason != "tool_use": + text_block = next( + block for block in response_content + if (block.get("type") if isinstance(block, dict) else block.type) == "text" + ) + text_content = text_block.get("text") if isinstance(text_block, dict) else text_block.text + + print(f"\n📝 Response Text: {text_content[:200]}...") + + print("\n" + "="*80) + print("✅ TEST PASSED!") + print("="*80) + print("✅ LiteLLM standard tool format works without conversion") + print("✅ Agentic loop executed transparently") + print("="*80) + return True + else: + print("\n❌ Unexpected response format") + return False + + except Exception as e: + print(f"\n❌ Test failed with error: {str(e)}") + import traceback + traceback.print_exc() + return False + + +async def test_claude_code_native_websearch_streaming(): + """ + PRIORITY TEST #2: Test Claude Code's native tool WITH 
stream=True. + + Validates: + - Native tool conversion (web_search_20250305 → litellm_web_search) + - Stream=True → Stream=False conversion + - Agentic loop executes with both conversions + """ + print("\n" + "="*80) + print("E2E TEST: Claude Code Native WebSearch + Streaming") + print("="*80) + + print("\n✅ Using existing router configuration") + print("✅ WebSearch interception already enabled for Bedrock") + + try: + print("\n📞 Making litellm.messages.acreate() call with stream=True...") + print(f" Model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0") + print(f" Tool: Native web_search_20250305") + print(f" Stream: True (will be converted to False)") + + response = await messages.acreate( + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", + messages=[{"role": "user", "content": "Search for the latest AI developments."}], + tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 8}], + max_tokens=1024, + stream=True, + ) + + print("\n✅ Received response!") + + import inspect + is_stream = inspect.isasyncgen(response) + + if is_stream: + print("\n⚠️ Response is a stream (stream conversion didn't work)") + return False + + print("✅ Response is NOT a stream (conversion worked!)") + + if isinstance(response, dict): + response_stop_reason = response.get("stop_reason") + response_content = response.get("content", []) + else: + response_stop_reason = response.stop_reason + response_content = response.content + + has_tool_use = any( + (block.get("type") if isinstance(block, dict) else block.type) == "tool_use" + for block in response_content + ) + + has_text = any( + (block.get("type") if isinstance(block, dict) else block.type) == "text" + for block in response_content + ) + + if has_tool_use: + print("\n❌ TEST FAILED: Interception did not work") + return False + + elif has_text and response_stop_reason != "tool_use": + print("\n" + "="*80) + print("✅ TEST PASSED!") + print("="*80) + print("✅ Native tool converted to 
litellm_web_search") + print("✅ Stream=True converted to Stream=False") + print("✅ Both conversions working together!") + print("="*80) + return True + else: + print("\n❌ Unexpected response format") + return False + + except Exception as e: + print(f"\n❌ Test failed with error: {str(e)}") + import traceback + traceback.print_exc() + return False + + +def test_is_web_search_tool_detection(): + """ + PRIORITY TEST #3: Unit test for is_web_search_tool() utility. + + Validates detection of all supported formats including future versions. + """ + print("\n" + "="*80) + print("UNIT TEST: Web Search Tool Detection") + print("="*80) + + from litellm.integrations.websearch_interception import is_web_search_tool + + test_cases = [ + ({"name": "litellm_web_search"}, True, "LiteLLM standard tool"), + ({"type": "web_search_20250305", "name": "web_search", "max_uses": 8}, True, "Current Anthropic native (2025)"), + ({"type": "web_search_2026", "name": "web_search"}, True, "Future Anthropic native (2026)"), + ({"type": "web_search_20270615", "name": "web_search"}, True, "Future Anthropic native (2027)"), + ({"name": "web_search", "type": "web_search_20250305"}, True, "Claude Code format"), + ({"name": "WebSearch"}, True, "Legacy WebSearch"), + ({"name": "calculator"}, False, "Non-web-search tool"), + ({"name": "some_tool", "type": "function"}, False, "Other tool with type"), + ({"type": "custom_tool"}, False, "Custom tool type"), + ] + + passed = 0 + failed = 0 + + for tool, expected, description in test_cases: + result = is_web_search_tool(tool) + if result == expected: + print(f" ✅ PASS: {description}") + passed += 1 + else: + print(f" ❌ FAIL: {description}") + print(f" Tool: {tool}") + print(f" Expected: {expected}, Got: {result}") + failed += 1 + + print(f"\n📊 Results: {passed} passed, {failed} failed") + + if failed == 0: + print("\n" + "="*80) + print("✅ ALL DETECTION TESTS PASSED!") + print("="*80) + print("✅ Detects all current formats") + print("✅ Future-proof for new 
web_search_* versions") + print("="*80) + return True + else: + print("\n❌ Some detection tests failed") + return False + + +async def test_pre_request_hook_modifies_request_body(): + """ + Unit test to verify async_pre_request_hook correctly modifies request body. + + Tests that: + 1. WebSearchInterceptionLogger is active + 2. Native web_search_20250305 tool is converted to litellm_web_search + 3. Stream is converted from True to False + 4. Modified parameters reach the API call + """ + import asyncio + from unittest.mock import AsyncMock, patch, MagicMock + from litellm.constants import LITELLM_WEB_SEARCH_TOOL_NAME + + litellm._turn_on_debug() + + print("\n" + "="*80) + print("UNIT TEST: Pre-Request Hook Modifies Request Body") + print("="*80) + + # Initialize WebSearchInterceptionLogger + litellm.callbacks = [ + WebSearchInterceptionLogger( + enabled_providers=[LlmProviders.BEDROCK], + search_tool_name="test-search-tool" + ) + ] + + print("✅ WebSearchInterceptionLogger initialized") + + # Track what actually gets sent to the API + captured_request = {} + + def mock_anthropic_messages_handler( + max_tokens, + messages, + model, + metadata=None, + stop_sequences=None, + stream=None, + system=None, + temperature=None, + thinking=None, + tool_choice=None, + tools=None, + top_k=None, + top_p=None, + container=None, + api_key=None, + api_base=None, + client=None, + custom_llm_provider=None, + **kwargs + ): + """Mock handler that captures the actual request parameters""" + # Capture what gets sent to the handler (after hook modifications) + captured_request['tools'] = tools + captured_request['stream'] = stream + captured_request['max_tokens'] = max_tokens + captured_request['model'] = model + + # Return a mock response (non-streaming) + from litellm.types.llms.anthropic_messages.anthropic_response import AnthropicMessagesResponse + return AnthropicMessagesResponse( + id="msg_test", + type="message", + role="assistant", + content=[{ + "type": "text", + "text": "Test 
response" + }], + model="claude-sonnet-4-5", + stop_reason="end_turn", + usage={ + "input_tokens": 10, + "output_tokens": 20 + } + ) + + # Patch the anthropic_messages_handler function (called after hooks) + with patch('litellm.llms.anthropic.experimental_pass_through.messages.handler.anthropic_messages_handler', + side_effect=mock_anthropic_messages_handler): + + print("\n📝 Making request with native web_search_20250305 tool (stream=True)...") + + # Make the request with native tool format + response = await messages.acreate( + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", + messages=[{"role": "user", "content": "Test query"}], + tools=[{ + "type": "web_search_20250305", + "name": "web_search", + "max_uses": 8 + }], + max_tokens=100, + stream=True # Should be converted to False + ) + + print("\n🔍 Verifying request modifications...") + + # Verify tool was converted + tools = captured_request.get('tools') + print(f"\n Captured tools: {tools}") + + if tools and len(tools) > 0: + tool = tools[0] + tool_name = tool.get('name') + + if tool_name == LITELLM_WEB_SEARCH_TOOL_NAME: + print(f" ✅ Tool converted: web_search_20250305 → {LITELLM_WEB_SEARCH_TOOL_NAME}") + else: + print(f" ❌ Tool NOT converted: expected {LITELLM_WEB_SEARCH_TOOL_NAME}, got {tool_name}") + return False + else: + print(" ❌ No tools captured in request") + return False + + # Verify stream was converted + stream = captured_request.get('stream') + print(f" Captured stream: {stream}") + + if stream is False: + print(" ✅ Stream converted: True → False") + else: + print(f" ❌ Stream NOT converted: expected False, got {stream}") + return False + + print("\n" + "="*80) + print("✅ PRE-REQUEST HOOK TEST PASSED!") + print("="*80) + print("✅ CustomLogger is active") + print("✅ async_pre_request_hook modifies request body") + print("✅ Tool conversion works correctly") + print("✅ Stream conversion works correctly") + print("="*80) + + return True +