Support ImageContent and AudioContent in sampling handlers #3550
Co-authored-by: Claude <noreply@anthropic.com>
Test Failure Analysis

Summary: two failures — one real bug and one pre-existing flaky test.

Suggested solution: for the real bug (#1), bump the minimum `anthropic` version:

```toml
# pyproject.toml
anthropic = ["anthropic>=0.48.0"]
```

The integration test timeout (#2) is a pre-existing flaky test; no action required from this PR.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 15196dabe6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```python
# When there are tool calls, content must be text-only
text_str = "\n".join(text_parts) or None
openai_messages.append(
    ChatCompletionAssistantMessageParam(
        role="assistant",
```
Reject assistant image/audio parts when tool_calls are present
If an assistant turn contains ToolUseContent plus ImageContent/AudioContent, this branch only serializes text_parts and never validates or forwards the non-text content_parts. The non-tool_calls path already raises for assistant multimodal content, and OpenAI's chat-completions assistant messages only accept text/refusal parts, so this now silently drops image/audio context instead of failing fast. That will change the conversation state for any mixed assistant turn and is very hard to diagnose.
Good catch — fixed in aa76c47. Both the tool_calls and non-tool_calls branches now raise ValueError for image/audio in assistant messages.
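A minimal sketch of the validation described above, using illustrative stand-in types (not the actual FastMCP classes or converter code):

```python
from dataclasses import dataclass


@dataclass
class TextContent:
    text: str


@dataclass
class ImageContent:
    data: str  # base64-encoded payload
    mimeType: str


@dataclass
class AudioContent:
    data: str
    mimeType: str


def collect_assistant_text(parts: list) -> list[str]:
    """Collect text for an assistant turn, failing fast on media parts.

    The same check runs in both the tool_calls and non-tool_calls branches:
    OpenAI chat-completions assistant messages accept only text/refusal
    parts, so image/audio must raise instead of being silently dropped.
    """
    text_parts: list[str] = []
    for part in parts:
        if isinstance(part, (ImageContent, AudioContent)):
            raise ValueError(
                f"{type(part).__name__} is not supported in assistant "
                "messages for OpenAI"
            )
        if isinstance(part, TextContent):
            text_parts.append(part.text)
    return text_parts
```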
```python
if isinstance(content, ImageContent):
    return Part(
        inline_data=Blob(
            data=base64.b64decode(content.data),
            mime_type=content.mimeType,
```
Validate Gemini media MIME types before building inline_data
Gemini only accepts a fixed allowlist of image and audio MIME types, but this new conversion path passes content.mimeType straight through without any checks (the audio branch immediately below does the same). Inputs like image/bmp or common mobile recordings such as audio/m4a will now get all the way to generate_content() and fail with provider-side 400s instead of the early, clear ValueError the other handlers return for unsupported media.
Gemini supports a much wider range of media types than OpenAI/Anthropic — a restrictive allowlist would reject valid inputs. The API validates and returns a clear error for unsupported types.
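For illustration, the passthrough behavior looks roughly like this (plain-dict stand-ins for the SDK's `Part`/`Blob` types; the real code uses `google.genai` objects):

```python
import base64


def image_content_to_gemini_part(data_b64: str, mime_type: str) -> dict:
    """Decode the base64 payload and pass the MIME type through unchanged.

    No client-side allowlist: Gemini validates the MIME type server-side
    and returns its own error for unsupported media.
    """
    return {
        "inline_data": {
            "data": base64.b64decode(data_b64),
            "mime_type": mime_type,
        }
    }
```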
Force-pushed 15196da to db9c1e2
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aa76c47704
```python
if isinstance(content, ImageContent):
    anthropic_messages.append(
        MessageParam(
            role=message.role,
            content=[_image_content_to_anthropic_block(content)],
```
Reject assistant-side image blocks before calling Anthropic
_convert_to_anthropic_messages() now forwards ImageContent with role=message.role, but Anthropic's Messages API only accepts image blocks in user turns; assistant turns are limited to model text/tool-use content. If a sampling history includes an assistant ImageContent (or a list-content assistant turn with an image), this code will build an invalid payload and the next request fails with a provider-side 400 instead of a local validation error.
Good catch — fixed in 2298e76. Both single-content and list-content paths now validate that ImageContent is only in user messages.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2298e7639a
```python
elif isinstance(item, ImageContent):
    content_parts.append(_image_content_to_openai_part(item))
elif isinstance(item, AudioContent):
    content_parts.append(_audio_content_to_openai_part(item))
```
Reject non-text media in OpenAI tool results
This change makes ImageContent/AudioContent valid elsewhere in _convert_to_openai_messages(), but both ToolResultContent branches still build role="tool" messages from TextContent only. When an MCP tool returns content=[ImageContent(...)] or AudioContent(...), the payload is silently stripped and OpenAI receives an empty tool result, so the model never sees the screenshot/audio it asked for. OpenAI's chat-completions docs say tool messages only support text parts, so this path should fail fast instead of dropping the media.
Pre-existing limitation — ToolResultContent has always only extracted TextContent. This PR adds multimodal support for messages, not tool results. Valid follow-up but out of scope here.
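A simplified sketch of that pre-existing limitation (stand-in dataclasses, not the real converter): only `TextContent` survives into the `role="tool"` message, so media parts vanish silently:

```python
from dataclasses import dataclass


@dataclass
class TextContent:
    text: str


@dataclass
class ImageContent:
    data: str
    mimeType: str


def tool_result_to_openai_text(parts: list) -> str:
    """Current behavior: extract only text; any media part is dropped."""
    return "\n".join(p.text for p in parts if isinstance(p, TextContent))
```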
```python
elif isinstance(item, ImageContent):
    if message.role != "user":
        raise ValueError(
            "ImageContent is only supported in user messages "
            "for Anthropic"
```
Preserve images inside Anthropic tool_result blocks
Anthropic's tool-use API allows tool_result.content to contain nested text or image blocks, but _convert_to_anthropic_messages() still serializes ToolResultContent.content by collecting only TextContent. If a FastMCP tool returns content=[ImageContent(...)], Claude will receive an empty tool result instead of the image it requested. Since this commit adds ImageContent support for user messages, the same media needs to be forwarded (or explicitly rejected) in ToolResultContent as well.
Same — pre-existing, unchanged by this PR. Anthropic does support images in tool_result blocks so it's a valid follow-up.
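As a sketch of that follow-up, a tool_result could carry nested blocks in the dict shape Anthropic's Messages API accepts (illustrative code, not part of this PR; stand-in dataclasses for the MCP content types):

```python
from dataclasses import dataclass


@dataclass
class TextContent:
    text: str


@dataclass
class ImageContent:
    data: str  # base64-encoded
    mimeType: str


def tool_result_content_blocks(parts: list) -> list[dict]:
    """Forward both text and image parts as nested tool_result blocks."""
    blocks: list[dict] = []
    for p in parts:
        if isinstance(p, TextContent):
            blocks.append({"type": "text", "text": p.text})
        elif isinstance(p, ImageContent):
            blocks.append({
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": p.mimeType,
                    "data": p.data,
                },
            })
    return blocks
```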
```python
if isinstance(content, ImageContent):
    return Part(
        inline_data=Blob(
            data=base64.b64decode(content.data),
            mime_type=content.mimeType,
```
Handle media blocks in Gemini function responses too
The new ImageContent/AudioContent branches only cover top-level message parts. _sampling_content_to_google_genai_part() still raises from its ToolResultContent branch on any non-TextContent, so a tool returning ImageContent or AudioContent will still fail during sampling. Gemini's FunctionResponsePart supports inlineData blobs, so this leaves the new multimodal support incomplete for the common tool-result path.
Same — pre-existing, unchanged by this PR. Multimodal tool results are a valid follow-up.
The MCP specification explicitly includes `ImageContent` and `AudioContent` in sampling messages — a server can send an image to the client and ask the LLM to analyze it via `sampling/createMessage`. But all three sampling handlers (OpenAI, Anthropic, Google GenAI) only handled `TextContent`, silently dropping multimodal content in list contexts or raising opaque errors for single content.

Each handler now converts these content types to its provider's native format, with MIME type validation that fails early with clear messages rather than letting the provider API reject them later. Audio support varies by provider — OpenAI supports wav/mp3, Google GenAI accepts any audio MIME type, and Anthropic doesn't support audio at all (raises `ValueError` immediately).

Closes #3544
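For reference, the wire-level shape of a `sampling/createMessage` request carrying an image, per the MCP specification (plain dicts for illustration; field names follow the spec, the helper itself is hypothetical):

```python
import base64


def make_sampling_request(image_bytes: bytes, prompt: str) -> dict:
    """Build the params for a sampling/createMessage request with an image."""
    return {
        "messages": [
            {
                "role": "user",
                "content": {
                    "type": "image",
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                    "mimeType": "image/png",
                },
            },
            {
                "role": "user",
                "content": {"type": "text", "text": prompt},
            },
        ],
        "maxTokens": 512,
    }
```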