Fix DeepSeek V4 DSML tool argument parsing #41241
QwertyJack wants to merge 1 commit into vllm-project:main from
Conversation
Code Review
This pull request adds support for DeepSeek V4, including a 'thinking' mode toggle in the chat completion protocol and a specialized DeepSeekV4ToolParser for DSML tool calls. The implementation handles parameter name escaping for 'arguments' and normalizes wrapped tool call inputs. Comprehensive tests verify the new protocol fields, CLI arguments, and tool parsing logic in both streaming and non-streaming modes. I have no feedback to provide.
Thanks @chaunceyjiang, that makes sense. I removed the top-level `thinking` request field.
Looking forward to seeing this fix merged and released in a stable release.
Handle DeepSeek V4 DSML parameters with typed values, unwrap model-emitted input/arguments wrapper objects when they are not schema fields, and preserve real tool parameters named arguments by escaping them in rendered schemas/history and unescaping them during parsing. Also flush held plain text at the end of a stream when it only looked like a partial DSML marker, and add focused parser, tokenizer, and CLI validation coverage for the DeepSeek V4 serving flags. Co-authored-by: OpenAI Codex <codex@openai.com> Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
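The typed-value handling described above can be sketched as follows. This is an illustrative stand-alone version, not the parser's actual code; `coerce_dsml_value` is a hypothetical helper name showing how a DSML `string="true|false"` attribute might drive coercion:

```python
import json


def coerce_dsml_value(raw: str, is_string: bool):
    """Sketch: keep the literal text when the DSML attribute says
    string="true"; otherwise attempt JSON decoding, falling back to
    the raw text if it is not valid JSON."""
    if is_string:
        return raw
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


# string="false" parses the payload as JSON; string="true" preserves it
print(coerce_dsml_value('{"location":"Beijing"}', is_string=False))  # {'location': 'Beijing'}
print(coerce_dsml_value("Beijing", is_string=True))  # Beijing
```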
@chaunceyjiang Here is a minimal deterministic case for why the PR is still needed after #41198. A DeepSeek V4 tool call may emit the real arguments object inside a single DSML `parameter` named `arguments`:

```python
import json
from unittest.mock import MagicMock

from vllm.tool_parsers.deepseekv4_tool_parser import DeepSeekV4ToolParser

mock_tokenizer = MagicMock()
mock_tokenizer.get_vocab.return_value = {}

tool = MagicMock()
tool.function.name = "get_weather"
tool.function.parameters = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
}

parser = DeepSeekV4ToolParser(mock_tokenizer, tools=[tool])
request = MagicMock()
request.tools = [tool]

model_output = (
    '<|DSML|tool_calls>'
    '<|DSML|invoke name="get_weather">'
    '<|DSML|parameter name="arguments" string="false">{"location":"Beijing"}</|DSML|parameter>'
    '</|DSML|invoke>'
    '</|DSML|tool_calls>'
)

result = parser.extract_tool_calls(model_output, request)
print(json.loads(result.tool_calls[0].function.arguments))
```

On current upstream main, this prints a wrapper object instead of the schema fields:

```json
{"arguments": "{\"location\":\"Beijing\"}"}
```

With this PR, it prints the OpenAI-compatible arguments object expected by the tool schema:

```json
{"location": "Beijing"}
```

This is covered by the added parser tests.
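The wrapper repair above can be sketched as a stand-alone heuristic. This is a simplified illustration, not the parser's actual code; `maybe_unwrap_arguments` is a hypothetical name:

```python
import json


def maybe_unwrap_arguments(params: dict, schema_properties: set) -> dict:
    """Sketch: if the model wrapped the real arguments in a single
    'arguments'/'input' parameter that is not itself a schema field, and
    the wrapped object's keys match the schema, return the inner object."""
    if len(params) != 1:
        return params
    (name, value), = params.items()
    if name not in ("arguments", "input") or name in schema_properties:
        return params
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return params
    if isinstance(value, dict) and set(value) <= schema_properties:
        return value
    return params


print(maybe_unwrap_arguments(
    {"arguments": '{"location":"Beijing"}'}, {"location"}))  # {'location': 'Beijing'}
```

Note the guard on `name in schema_properties`: a tool whose schema genuinely declares a parameter called `arguments` is left untouched, which is why the PR pairs this with the escaping scheme.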
```python
logger = init_logger(__name__)

ESCAPED_ARGUMENTS_PARAM_NAME = "__vllm_param_arguments__"
```
Thanks @QwertyJack for identifying this issue. I think your fix is a bit of a hack.
I’ve submitted a new implementation here: #41801.
Could you help test it? @QwertyJack @UmutAlihan
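The escaping idea from the diff hunk can be sketched as a round-trip: rename a real tool parameter literally called `arguments` to a sentinel while rendering schemas/history, and rename it back when parsing model output. A minimal illustration — the sentinel value matches the diff, but the helper names are hypothetical:

```python
ESCAPED_ARGUMENTS_PARAM_NAME = "__vllm_param_arguments__"


def escape_schema_params(properties: dict) -> dict:
    """Rename a real parameter named 'arguments' before rendering, so it
    cannot collide with the model's wrapper convention."""
    return {
        (ESCAPED_ARGUMENTS_PARAM_NAME if k == "arguments" else k): v
        for k, v in properties.items()
    }


def unescape_parsed_params(params: dict) -> dict:
    """Undo the rename when parsing model output back into tool arguments."""
    return {
        ("arguments" if k == ESCAPED_ARGUMENTS_PARAM_NAME else k): v
        for k, v in params.items()
    }


props = {"arguments": {"type": "string"}, "city": {"type": "string"}}
assert unescape_parsed_params(escape_schema_params(props)) == props
```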
Thanks @QwertyJack, I added you as co-author on #41801.
Related to #41240
Summary

- Update the `deepseek_v4` tool parser to respect DSML `string="true|false"` parameter metadata, preserving literal strings and coercing non-string values through the request schema or JSON fallback
- Unwrap `arguments`/`input` wrapper parameters when the wrapper is not part of the requested tool schema and the wrapped object matches the schema fields
- Escape real tool parameters named `arguments` while rendering DeepSeek V4 tool schemas/history, then unescape them when parsing model output

This PR does not add the `deepseek_v4` parser from scratch; upstream already has the parser registered. Recent upstream #41198 also added generic DSV3.2/V4 non-streaming type conversion. This PR remains narrower and covers V4 DSML string-attribute handling, wrapper repair, real `arguments` field escaping, and streaming final flush behavior.

This PR also does not add top-level `thinking={...}` request support. Per review feedback, DeepSeek V4 thinking toggles should continue to use `chat_template_kwargs`, matching the vLLM DeepSeek V4 recipe.

Duplicate-work check

- `gh issue view 41240 --repo vllm-project/vllm --comments`
- `gh pr list --repo vllm-project/vllm --state open --search "41240 in:body"`. No existing PR references the issue.
- `gh pr list --repo vllm-project/vllm --state open --search "deepseek_v4 DSML tool parser"` and `gh pr list --repo vllm-project/vllm --state open --search "DeepSeek V4 tool argument parsing"`. The relevant matches are this PR and unrelated DSv4 backend/model work such as [DSv4][Nvidia] SM12x DeepSeek V4 support #40991.
- No open PR covers V4 DSML string-attribute handling, wrapper repair, real `arguments` field escaping, or streaming final flush behavior.

Tests

- `.venv/bin/python -m pytest tests/tool_parsers/test_deepseekv4_tool_parser.py tests/tokenizers_/test_deepseek_v4.py tests/reasoning/test_deepseekv3_reasoning_parser.py::test_deepseek_v4_reasoning_parser_alias tests/entrypoints/openai/test_cli_args.py::test_deepseek_v4_agentic_flags_pass_validation -q`
- `.venv/bin/python -m pytest tests/entrypoints/openai/chat_completion/test_chat.py::test_chat_completion_request_accepts_model_specific_reasoning_effort -q`
- `.venv/bin/ruff check vllm/entrypoints/openai/chat_completion/protocol.py tests/entrypoints/openai/chat_completion/test_chat.py vllm/tool_parsers/deepseekv4_tool_parser.py vllm/tokenizers/deepseek_v4_encoding.py tests/tool_parsers/test_deepseekv4_tool_parser.py tests/tokenizers_/test_deepseek_v4.py tests/entrypoints/openai/test_cli_args.py`
- Amended with `git commit --amend` and the above passed

AI assistance
AI assistance was used to develop and validate this change. I reviewed the changed lines and test results.