Fix DeepSeek V4 DSML tool argument parsing #41241
QwertyJack wants to merge 1 commit into vllm-project:main from
Conversation
Code Review
This pull request adds support for DeepSeek V4, including a 'thinking' mode toggle in the chat completion protocol and a specialized DeepSeekV4ToolParser for DSML tool calls. The implementation handles parameter name escaping for 'arguments' and normalizes wrapped tool call inputs. Comprehensive tests verify the new protocol fields, CLI arguments, and tool parsing logic in both streaming and non-streaming modes. I have no feedback to provide.
Thanks @chaunceyjiang, that makes sense. I removed the top-level `thinking` request field.
Looking forward to seeing this fix merged and released in a stable release.
Handle DeepSeek V4 DSML parameters with typed values, unwrap model-emitted input/arguments wrapper objects when they are not schema fields, and preserve real tool parameters named arguments by escaping them in rendered schemas/history and unescaping them during parsing. Also flush held plain text at the end of a stream when it only looked like a partial DSML marker, and add focused parser, tokenizer, and CLI validation coverage for the DeepSeek V4 serving flags. Co-authored-by: OpenAI Codex <codex@openai.com> Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
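The typed-value handling described above can be sketched as follows. This is an illustrative stand-alone version, not the parser's actual code; `coerce_dsml_value` is a hypothetical helper name showing how a DSML `string="true|false"` attribute might drive coercion:

```python
import json


def coerce_dsml_value(raw: str, is_string: bool):
    """Sketch: keep the literal text when the DSML attribute says
    string="true"; otherwise attempt JSON decoding, falling back to
    the raw text if it is not valid JSON."""
    if is_string:
        return raw
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


# string="false" parses the payload as JSON; string="true" preserves it
print(coerce_dsml_value('{"location":"Beijing"}', is_string=False))  # {'location': 'Beijing'}
print(coerce_dsml_value("Beijing", is_string=True))  # Beijing
```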
@chaunceyjiang Here is a minimal deterministic case for why the PR is still needed after #41198. A DeepSeek V4 tool call may emit the real arguments object inside a single DSML `parameter` named `arguments`:

```python
import json
from unittest.mock import MagicMock

from vllm.tool_parsers.deepseekv4_tool_parser import DeepSeekV4ToolParser

mock_tokenizer = MagicMock()
mock_tokenizer.get_vocab.return_value = {}

tool = MagicMock()
tool.function.name = "get_weather"
tool.function.parameters = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
}

parser = DeepSeekV4ToolParser(mock_tokenizer, tools=[tool])
request = MagicMock()
request.tools = [tool]

model_output = (
    '<|DSML|tool_calls>'
    '<|DSML|invoke name="get_weather">'
    '<|DSML|parameter name="arguments" string="false">{"location":"Beijing"}</|DSML|parameter>'
    '</|DSML|invoke>'
    '</|DSML|tool_calls>'
)

result = parser.extract_tool_calls(model_output, request)
print(json.loads(result.tool_calls[0].function.arguments))
```

On current upstream main, this prints a wrapper object instead of the schema fields:

```json
{"arguments": "{\"location\":\"Beijing\"}"}
```

With this PR, it prints the OpenAI-compatible arguments object expected by the tool schema:

```json
{"location": "Beijing"}
```

This is covered by the added parser tests.
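The wrapper repair above can be sketched as a stand-alone heuristic. This is a simplified illustration, not the parser's actual code; `maybe_unwrap_arguments` is a hypothetical name:

```python
import json


def maybe_unwrap_arguments(params: dict, schema_properties: set) -> dict:
    """Sketch: if the model wrapped the real arguments in a single
    'arguments'/'input' parameter that is not itself a schema field, and
    the wrapped object's keys match the schema, return the inner object."""
    if len(params) != 1:
        return params
    (name, value), = params.items()
    if name not in ("arguments", "input") or name in schema_properties:
        return params
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return params
    if isinstance(value, dict) and set(value) <= schema_properties:
        return value
    return params


print(maybe_unwrap_arguments(
    {"arguments": '{"location":"Beijing"}'}, {"location"}))  # {'location': 'Beijing'}
```

Note the guard on `name in schema_properties`: a tool whose schema genuinely declares a parameter called `arguments` is left untouched, which is why the PR pairs this with the escaping scheme.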
```python
logger = init_logger(__name__)

ESCAPED_ARGUMENTS_PARAM_NAME = "__vllm_param_arguments__"
```
Thanks @QwertyJack for identifying this issue. I think your fix is a bit of a hack.
I’ve submitted a new implementation here: #41801.
Could you help test it? @QwertyJack @UmutAlihan
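The escaping idea from the diff hunk can be sketched as a round-trip: rename a real tool parameter literally called `arguments` to a sentinel while rendering schemas/history, and rename it back when parsing model output. A minimal illustration — the sentinel value matches the diff, but the helper names are hypothetical:

```python
ESCAPED_ARGUMENTS_PARAM_NAME = "__vllm_param_arguments__"


def escape_schema_params(properties: dict) -> dict:
    """Rename a real parameter named 'arguments' before rendering, so it
    cannot collide with the model's wrapper convention."""
    return {
        (ESCAPED_ARGUMENTS_PARAM_NAME if k == "arguments" else k): v
        for k, v in properties.items()
    }


def unescape_parsed_params(params: dict) -> dict:
    """Undo the rename when parsing model output back into tool arguments."""
    return {
        ("arguments" if k == ESCAPED_ARGUMENTS_PARAM_NAME else k): v
        for k, v in params.items()
    }


props = {"arguments": {"type": "string"}, "city": {"type": "string"}}
assert unescape_parsed_params(escape_schema_params(props)) == props
```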
Thanks @QwertyJack, I added you as co-author on #41801.
Related to #41240
Summary

- Update the `deepseek_v4` tool parser to respect DSML `string="true|false"` parameter metadata, preserving literal strings and coercing non-string values through the request schema or JSON fallback
- Unwrap `arguments`/`input` wrapper parameters when the wrapper is not part of the requested tool schema and the wrapped object matches the schema fields
- Escape real tool parameters named `arguments` while rendering DeepSeek V4 tool schemas/history, then unescape them when parsing model output

This PR does not add the `deepseek_v4` parser from scratch; upstream already has the parser registered. Recent upstream #41198 also added generic DSV3.2/V4 non-streaming type conversion. This PR remains narrower and covers V4 DSML string-attribute handling, wrapper repair, real `arguments` field escaping, and streaming final flush behavior.

This PR also does not add top-level `thinking={...}` request support. Per review feedback, DeepSeek V4 thinking toggles should continue to use `chat_template_kwargs`, matching the vLLM DeepSeek V4 recipe.

Duplicate-work check

- `gh issue view 41240 --repo vllm-project/vllm --comments`
- `gh pr list --repo vllm-project/vllm --state open --search "41240 in:body"`. No existing PR references the issue.
- `gh pr list --repo vllm-project/vllm --state open --search "deepseek_v4 DSML tool parser"` and `gh pr list --repo vllm-project/vllm --state open --search "DeepSeek V4 tool argument parsing"`. The relevant matches are this PR and unrelated DSv4 backend/model work such as [DSv4][Nvidia] SM12x DeepSeek V4 support #40991.
- No open PR covers V4 DSML string-attribute handling, wrapper repair, real `arguments` field escaping, or streaming final flush behavior.

Tests

- `.venv/bin/python -m pytest tests/tool_parsers/test_deepseekv4_tool_parser.py tests/tokenizers_/test_deepseek_v4.py tests/reasoning/test_deepseekv3_reasoning_parser.py::test_deepseek_v4_reasoning_parser_alias tests/entrypoints/openai/test_cli_args.py::test_deepseek_v4_agentic_flags_pass_validation -q`
- `.venv/bin/python -m pytest tests/entrypoints/openai/chat_completion/test_chat.py::test_chat_completion_request_accepts_model_specific_reasoning_effort -q`
- `.venv/bin/ruff check vllm/entrypoints/openai/chat_completion/protocol.py tests/entrypoints/openai/chat_completion/test_chat.py vllm/tool_parsers/deepseekv4_tool_parser.py vllm/tokenizers/deepseek_v4_encoding.py tests/tool_parsers/test_deepseekv4_tool_parser.py tests/tokenizers_/test_deepseek_v4.py tests/entrypoints/openai/test_cli_args.py`
- Amended with `git commit --amend` and the above passed

AI assistance
AI assistance was used to develop and validate this change. I reviewed the changed lines and test results.