feat: add reasoning/thinking support to Anthropic /v1/messages endpoint by timon0305 · Pull Request #35035 · vllm-project/vllm

timon0305 · 2026-02-22T00:07:01Z

Summary

Adds support for extended thinking / reasoning output in the Anthropic Messages API (/v1/messages), resolving a feature gap where reasoning tokens from models like QwQ, DeepSeek-R1, and other thinking-capable models were not exposed through the Anthropic-compatible endpoint.

Changes:

protocol.py: Added AnthropicThinkingConfig model for the thinking request parameter (matching Anthropic's {"type": "enabled", "budget_tokens": N} format), added "thinking" content block type to AnthropicContentBlock, and added "thinking_delta" delta type to AnthropicDelta
serving.py:
- Request conversion: Maps thinking.type == "enabled" to include_reasoning=True on the OpenAI request; handles incoming thinking content blocks from prior assistant turns by converting them to text for the model
- Non-streaming: Extracts message.reasoning from the OpenAI response and prepends a thinking content block before the text content block
- Streaming: Handles delta.reasoning by emitting proper content_block_start/content_block_delta/content_block_stop events with thinking type blocks, correctly transitioning between thinking → text → tool_use block types
test_anthropic_reasoning.py: Unit tests covering protocol validation, request conversion, non-streaming response conversion, streaming response conversion with reasoning→text transitions, and serialization round-trips
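
As a rough illustration of the request-side change, the sketch below validates a thinking parameter and maps it to include_reasoning. It uses stdlib dataclasses rather than the PR's actual models; `ThinkingConfig` and `to_openai_request` are hypothetical names, and accepting a "disabled" type is an assumption not confirmed by this PR.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class ThinkingConfig:
    """Hypothetical stand-in for the PR's AnthropicThinkingConfig."""

    type: Literal["enabled", "disabled"]  # "disabled" is an assumption
    budget_tokens: Optional[int] = None

    def __post_init__(self) -> None:
        if self.type not in ("enabled", "disabled"):
            raise ValueError(f"unsupported thinking type: {self.type!r}")
        # Per the test plan, "enabled" requires budget_tokens.
        if self.type == "enabled" and self.budget_tokens is None:
            raise ValueError("budget_tokens is required when thinking is enabled")


def to_openai_request(anthropic_request: dict) -> dict:
    """Map the Anthropic-style `thinking` parameter to include_reasoning."""
    openai_request = {
        "model": anthropic_request["model"],
        "messages": anthropic_request["messages"],
    }
    thinking = anthropic_request.get("thinking")
    if thinking is not None:
        config = ThinkingConfig(**thinking)
        if config.type == "enabled":
            # Only the "enabled" case is described in the PR summary.
            openai_request["include_reasoning"] = True
    return openai_request
```

Requests without a thinking parameter pass through unchanged in this sketch, so existing clients would see no difference.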

Protocol compatibility:
The implementation follows the Anthropic API spec for extended thinking:

Request: {"thinking": {"type": "enabled", "budget_tokens": 4096}}
Response content blocks: [{"type": "thinking", "thinking": "..."}, {"type": "text", "text": "..."}]
Streaming: thinking_delta events with thinking field
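
For the non-streaming case, the response shape above could be assembled roughly as follows. This is a minimal sketch using plain dicts; `build_content_blocks` is a hypothetical helper, not a function from serving.py.

```python
from typing import Optional


def build_content_blocks(reasoning: Optional[str], text: Optional[str]) -> list:
    """Return Anthropic-style content blocks, thinking first, then text."""
    blocks = []
    if reasoning:
        # The thinking block is placed before the text block.
        blocks.append({"type": "thinking", "thinking": reasoning})
    if text:
        blocks.append({"type": "text", "text": text})
    return blocks
```

When the model produces no reasoning, only the text block is returned, which matches the pre-existing response shape.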

Closes #29915

Test plan

Unit tests for AnthropicThinkingConfig validation (enabled requires budget_tokens)
Unit tests for request conversion (include_reasoning flag propagation)
Unit tests for non-streaming response with/without reasoning
Unit tests for streaming response with reasoning→text block transitions
Unit tests for serialization round-trips of thinking content blocks
Ruff lint and format checks pass

dosubot · 2026-02-22T00:07:10Z

Related Documentation

Checked 0 published document(s) in 1 knowledge base(s). No updates required.

gemini-code-assist

Code Review

The pull request successfully implements reasoning/thinking support for the Anthropic Messages API. The protocol models are correctly updated to include the thinking content type and configuration, and the serving logic handles both streaming and non-streaming responses. However, there are several issues in the streaming converter logic where continue and elif statements could lead to data loss if multiple types of deltas (reasoning, content, tool calls) are present in a single chunk from the engine. These should be addressed to ensure robustness.

gemini-code-assist · 2026-02-22T00:08:39Z

vllm/entrypoints/anthropic/serving.py

+                            if delta.content == "":
                                continue
                            chunk = AnthropicStreamEvent(
                                index=content_block_index,
                                type="content_block_delta",
                                delta=AnthropicDelta(
                                    type="text_delta",
-                                    text=origin_chunk.choices[0].delta.content,
+                                    text=delta.content,
                                ),
                            )
                            data = chunk.model_dump_json(exclude_unset=True)
                            yield wrap_data_with_event(data, "content_block_delta")
                            continue


As with the reasoning block, the continue here prevents tool_calls from being processed when they arrive in the same chunk as a content delta. Removing the continue and guarding the text event with a check for non-empty content ensures that every part of the delta is handled.

Suggested change

-if delta.content == "":
-    continue
-chunk = AnthropicStreamEvent(
-    index=content_block_index,
-    type="content_block_delta",
-    delta=AnthropicDelta(
-        type="text_delta",
-        text=delta.content,
-    ),
-)
-data = chunk.model_dump_json(exclude_unset=True)
-yield wrap_data_with_event(data, "content_block_delta")
-continue
+if delta.content != "":
+    chunk = AnthropicStreamEvent(
+        index=content_block_index,
+        type="content_block_delta",
+        delta=AnthropicDelta(
+            type="text_delta",
+            text=delta.content,
+        ),
+    )
+    data = chunk.model_dump_json(exclude_unset=True)
+    yield wrap_data_with_event(data, "content_block_delta")
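
The data-loss concern can be reproduced with a toy loop. The sketch below is not the serving.py code: `Delta` is a hypothetical simplified stand-in for a stream delta, and whether the engine actually emits content and tool calls in one chunk is the reviewer's hypothesis rather than something confirmed here.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    """Hypothetical simplified stand-in for an OpenAI-style stream delta."""

    content: str = ""
    tool_calls: list = field(default_factory=list)


def handled_with_continue(deltas):
    """Mimic the reviewed control flow: `continue` after the text branch."""
    handled = []
    for delta in deltas:
        if delta.content != "":
            handled.append(("text", delta.content))
            continue  # any tool_calls in this same delta are skipped
        if delta.tool_calls:
            handled.append(("tool", delta.tool_calls[0]))
    return handled


def handled_without_continue(deltas):
    """Mimic the suggested control flow: each part is checked independently."""
    handled = []
    for delta in deltas:
        if delta.content != "":
            handled.append(("text", delta.content))
        if delta.tool_calls:
            handled.append(("tool", delta.tool_calls[0]))
    return handled
```

For a delta carrying both content and a tool call, the first function records only the text, while the second records both.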

gemini-code-assist · 2026-02-22T00:08:39Z

vllm/entrypoints/anthropic/serving.py

                        # tool calls
-                        elif len(origin_chunk.choices[0].delta.tool_calls) > 0:
-                            tool_call = origin_chunk.choices[0].delta.tool_calls[0]
+                        elif len(delta.tool_calls) > 0:


Using elif here means that tool calls will be ignored if delta.content was also present in the same chunk (and the continue above was removed). Changing this to an if allows the converter to sequentially close the text block and open a tool block within the same iteration if the engine emits them together.

Suggested change

-elif len(delta.tool_calls) > 0:
+if len(delta.tool_calls) > 0:
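
To make the thinking → text → tool_use transitions concrete, here is a simplified sketch of a stream converter that uses independent if checks, as both review comments suggest. It emits plain dicts instead of AnthropicStreamEvent objects; `Delta`, `convert_stream`, and `ensure_block` are hypothetical names, and tool calls are reduced to a single partial-JSON string per delta, which is a simplification of the real tool_use block handling.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Delta:
    """Hypothetical simplified stand-in for an OpenAI-style stream delta."""

    reasoning: Optional[str] = None
    content: Optional[str] = None
    tool_calls: list = field(default_factory=list)


def convert_stream(deltas):
    """Yield Anthropic-style events, switching block types as needed."""
    index = -1
    open_type = None  # type of the currently open content block, if any

    def ensure_block(block_type):
        """Close the open block and start a new one if the type changes."""
        nonlocal index, open_type
        events = []
        if open_type != block_type:
            if open_type is not None:
                events.append({"type": "content_block_stop", "index": index})
            index += 1
            open_type = block_type
            events.append({
                "type": "content_block_start",
                "index": index,
                "content_block": {"type": block_type},
            })
        return events

    for delta in deltas:
        # Independent `if` checks: one delta may carry several kinds of data.
        if delta.reasoning:
            yield from ensure_block("thinking")
            yield {"type": "content_block_delta", "index": index,
                   "delta": {"type": "thinking_delta", "thinking": delta.reasoning}}
        if delta.content:
            yield from ensure_block("text")
            yield {"type": "content_block_delta", "index": index,
                   "delta": {"type": "text_delta", "text": delta.content}}
        if delta.tool_calls:
            yield from ensure_block("tool_use")
            yield {"type": "content_block_delta", "index": index,
                   "delta": {"type": "input_json_delta",
                             "partial_json": delta.tool_calls[0]}}
    if open_type is not None:
        yield {"type": "content_block_stop", "index": index}
```

With this structure, a delta that carries both text and a tool call closes the text block and opens a tool_use block within the same iteration.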

Signed-off-by: timon0305 <timon0305@outlook.com>

mergify · 2026-02-26T08:17:07Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @timon0305.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

chaunceyjiang · 2026-02-26T08:38:14Z

Thanks~ @timon0305 see #33671

timon0305 requested review from DarkLight1337, NickLucche, aarnphm, mgoin and robertgshaw2-redhat as code owners February 22, 2026 00:07

mergify bot added the frontend label Feb 22, 2026

gemini-code-assist bot reviewed Feb 22, 2026

feat: add reasoning/thinking support to Anthropic /v1/messages endpoint

e680cbf

Signed-off-by: timon0305 <timon0305@outlook.com>

timon0305 force-pushed the add-anthropic-reasoning-support branch from d53fbec to e680cbf Compare February 22, 2026 01:13

DarkLight1337 requested a review from chaunceyjiang February 23, 2026 02:29

mergify bot added the needs-rebase label Feb 26, 2026

chaunceyjiang self-assigned this Feb 26, 2026

ehfd mentioned this pull request Feb 26, 2026

[Bug]: Qwen3.5 (NVIDIA H200) Pointer argument (at 0) cannot be accessed from Triton #35390

Closed

1 task

bbartels mentioned this pull request Mar 13, 2026

[Bugfix] accept redacted thinking blocks in Anthropic messages #36992

Merged

timon0305 wants to merge 1 commit into vllm-project:main from timon0305:add-anthropic-reasoning-support
