docs/clients/sampling.mdx (206 additions, 1 deletion)
@@ -109,6 +109,14 @@ The sampling handler receives three parameters:
<ResponseField name="metadata" type="dict[str, Any] | None">
Optional metadata to pass through to the LLM provider.
</ResponseField>

<ResponseField name="tools" type="list[Tool] | None">
Optional list of tools the LLM can use during sampling. See [Handling Tool Requests](#handling-tool-requests).
</ResponseField>

<ResponseField name="toolChoice" type="ToolChoice | None">
Optional control over tool usage behavior (`auto`, `required`, or `none`).
</ResponseField>
</Expandable>

</ResponseField>
@@ -153,4 +161,201 @@ client = Client(

<Note>
If the client doesn't provide a sampling handler, servers can optionally configure a fallback handler. See [Server Sampling](/servers/sampling#sampling-fallback-handler) for details.
</Note>
</Note>

<Tip>
When you provide a `sampling_handler`, FastMCP automatically advertises full sampling capabilities, including tool support, to the server.
</Tip>

## Handling Tool Requests

<VersionBadge version="2.14.0" />

Servers may request sampling with tools, allowing the LLM to make tool calls during generation. When tools are provided in `params.tools`, your handler should return a `CreateMessageResultWithTools` object instead of a simple string.

### Checking for Tools

```python
from fastmcp import Client
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext
from mcp.types import (
    CreateMessageResultWithTools,
    TextContent,
)

async def sampling_handler_with_tools(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str | CreateMessageResultWithTools:
    # Check if tools were provided
    if params.tools:
        # Call your LLM with tool support and return a
        # CreateMessageResultWithTools with the appropriate content
        return CreateMessageResultWithTools(
            role="assistant",
            content=[TextContent(type="text", text="I'll help you with that.")],
            model="gpt-4",
            stopReason="endTurn",
        )

    # Standard text response when no tools are provided
    return "Generated response"

client = Client(
    "my_mcp_server.py",
    sampling_handler=sampling_handler_with_tools,
)
```

### Returning Tool Use

When the LLM wants to call a tool, return a `CreateMessageResultWithTools` with `stopReason="toolUse"` and `ToolUseContent` in the content:

```python
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext
from mcp.types import CreateMessageResultWithTools, ToolUseContent

async def sampling_handler_with_tools(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str | CreateMessageResultWithTools:
    if params.tools:
        # Your LLM decided to call a tool
        return CreateMessageResultWithTools(
            role="assistant",
            content=[
                ToolUseContent(
                    type="toolUse",
                    id="call_123",
                    name="search",
                    input={"query": "Python tutorials"},
                )
            ],
            model="gpt-4",
            stopReason="toolUse",  # Indicates tool use
        )

    return "Response without tools"
```

### Tool Choice

The `params.toolChoice` field indicates how the server wants tools to be used:

- **`auto`**: The LLM decides whether to use tools
- **`required`**: The LLM must use at least one tool
- **`none`**: The LLM should not use any tools

```python
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext
from mcp.types import CreateMessageResultWithTools

async def sampling_handler_with_tools(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str | CreateMessageResultWithTools:
    if params.tools and params.toolChoice:
        if params.toolChoice.mode == "required":
            # Must return a tool call
            pass
        elif params.toolChoice.mode == "none":
            # Should not return tool calls even though tools are available
            pass
        # "auto" - let the LLM decide

    return "Response"
```

<Note>
Tool execution happens on the server side. The client's role is to pass tools to the LLM and return the LLM's response (which may include tool use requests). The server then executes the tools and may send follow-up sampling requests with tool results.
</Note>
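
When the server sends such a follow-up request, the earlier tool call and its result arrive as part of `messages`. The sketch below shows one way a handler might detect them; it is a minimal illustration that assumes tool results appear as content blocks whose `type` is `"toolResult"`, so check the content types your `mcp` package version actually defines:

```python
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext
from mcp.types import CreateMessageResultWithTools

async def sampling_handler_with_tools(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str | CreateMessageResultWithTools:
    # Hypothetical inspection of a follow-up request: collect any tool
    # results the server attached after executing the tool calls.
    tool_results = []
    for message in messages:
        blocks = message.content if isinstance(message.content, list) else [message.content]
        for block in blocks:
            if getattr(block, "type", None) == "toolResult":
                tool_results.append(block)

    if tool_results:
        # Feed the results back to your LLM so it can produce a final answer
        return "Final answer incorporating the tool results"

    return "Initial response"
```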

## Pre-built Handlers

<VersionBadge version="2.14.0" />

FastMCP provides ready-to-use sampling handlers for Anthropic and OpenAI that handle all the complexity of message conversion, tool formatting, and response parsing.

### Anthropic Handler

The Anthropic handler uses the Claude API:

```python
from anthropic import Anthropic
from fastmcp import Client
from fastmcp.server.sampling.anthropic import AnthropicSamplingHandler

handler = AnthropicSamplingHandler(
    default_model="claude-sonnet-4-5-20250929",
    client=Anthropic(),  # Uses ANTHROPIC_API_KEY env var
)

async with Client("server.py", sampling_handler=handler) as client:
    result = await client.call_tool("summarize", {"text": "..."})
```

The handler automatically:
- Converts MCP messages to Anthropic's format
- Translates tool definitions to Anthropic's tool format
- Maps stop reasons (`tool_use` → `toolUse`, `end_turn` → `endTurn`)
- Selects models based on server preferences (any model starting with `claude` is accepted), as sketched below
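
To see that last point in action, a server can attach a model hint when it requests sampling, and the handler matches it against available Claude models. A minimal server-side sketch, assuming `ctx.sample` accepts a `model_preferences` hint as described in the Server Sampling docs (the tool and hint here are illustrative):

```python
from fastmcp import FastMCP, Context

mcp = FastMCP("demo")

@mcp.tool
async def summarize(text: str, ctx: Context) -> str:
    # The model hint travels with the sampling request; the client's
    # AnthropicSamplingHandler matches it against available Claude models.
    result = await ctx.sample(
        f"Summarize in one sentence: {text}",
        model_preferences="claude-sonnet-4-5-20250929",
    )
    return result.text
```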

<Note>
Requires the `anthropic` package. Install with `pip install fastmcp[anthropic]` or `pip install anthropic`.
</Note>

### OpenAI Handler

The OpenAI handler works with the OpenAI API and compatible providers:

```python
from openai import OpenAI
from fastmcp import Client
from fastmcp.server.sampling.openai import OpenAISamplingHandler

handler = OpenAISamplingHandler(
    default_model="gpt-4o-mini",
    client=OpenAI(),  # Uses OPENAI_API_KEY env var
)

async with Client("server.py", sampling_handler=handler) as client:
    result = await client.call_tool("analyze", {"data": "..."})
```

The handler automatically:
- Converts MCP messages to OpenAI's chat format
- Translates tool definitions to OpenAI's function calling format
- Maps stop reasons (`tool_calls` → `toolUse`, `stop` → `endTurn`)
- Selects models based on server preferences

<Note>
Requires the `openai` package. Install with `pip install fastmcp[openai]` or `pip install openai`.
</Note>

### Using with Compatible Providers

The OpenAI handler works with any OpenAI-compatible API:

```python
import os

from openai import OpenAI
from fastmcp.server.sampling.openai import OpenAISamplingHandler

# Azure OpenAI
handler = OpenAISamplingHandler(
    default_model="gpt-4o",
    client=OpenAI(
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        base_url="https://your-resource.openai.azure.com/openai/deployments/your-deployment",
    ),
)

# Local models (Ollama, vLLM, etc.)
handler = OpenAISamplingHandler(
    default_model="llama3",
    client=OpenAI(
        api_key="not-needed",
        base_url="http://localhost:11434/v1",
    ),
)
```