Skip to content

Fix DeepSeek V4 DSML tool argument parsing#41241

Closed
QwertyJack wants to merge 1 commit intovllm-project:mainfrom
QwertyJack:deepseek-v4-tool-thinking-parser
Closed

Fix DeepSeek V4 DSML tool argument parsing#41241
QwertyJack wants to merge 1 commit intovllm-project:mainfrom
QwertyJack:deepseek-v4-tool-thinking-parser

Conversation

@QwertyJack
Copy link
Copy Markdown
Contributor

@QwertyJack QwertyJack commented Apr 29, 2026

Related to #41240

Summary

  • fix the existing deepseek_v4 tool parser to respect DSML string="true|false" parameter metadata, preserving literal strings and coercing non-string values through the request schema or JSON fallback
  • unwrap single arguments / input wrapper parameters when the wrapper is not part of the requested tool schema and the wrapped object matches the schema fields
  • escape real tool parameters named arguments while rendering DeepSeek V4 tool schemas/history, then unescape them when parsing model output
  • flush held plain text at the end of streaming when it only looked like the start of a DSML marker, such as 2 <
  • keep upstream DeepSeek V4 structural-tag support and add focused parser, tokenizer, and CLI validation coverage for the DeepSeek V4 serving flag combination

This PR does not add the deepseek_v4 parser from scratch; upstream already has the parser registered. Recent upstream #41198 also added generic DSV3.2/V4 non-streaming type conversion. This PR remains narrower and covers V4 DSML string-attribute handling, wrapper repair, real arguments field escaping, and streaming final flush behavior.

This PR also does not add top-level thinking={...} request support. Per review feedback, DeepSeek V4 thinking toggles should continue to use chat_template_kwargs, matching the vLLM DeepSeek V4 recipe.

Duplicate-work check

Tests

  • .venv/bin/python -m pytest tests/tool_parsers/test_deepseekv4_tool_parser.py tests/tokenizers_/test_deepseek_v4.py tests/reasoning/test_deepseekv3_reasoning_parser.py::test_deepseek_v4_reasoning_parser_alias tests/entrypoints/openai/test_cli_args.py::test_deepseek_v4_agentic_flags_pass_validation -q
  • .venv/bin/python -m pytest tests/entrypoints/openai/chat_completion/test_chat.py::test_chat_completion_request_accepts_model_specific_reasoning_effort -q
  • .venv/bin/ruff check vllm/entrypoints/openai/chat_completion/protocol.py tests/entrypoints/openai/chat_completion/test_chat.py vllm/tool_parsers/deepseekv4_tool_parser.py vllm/tokenizers/deepseek_v4_encoding.py tests/tool_parsers/test_deepseekv4_tool_parser.py tests/tokenizers_/test_deepseek_v4.py tests/entrypoints/openai/test_cli_args.py
  • pre-commit hooks run during git commit --amend and passed

AI assistance

AI assistance was used to develop and validate this change. I reviewed the changed lines and test results.

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for DeepSeek V4, including a 'thinking' mode toggle in the chat completion protocol and a specialized DeepSeekV4ToolParser for DSML tool calls. The implementation handles parameter name escaping for 'arguments' and normalizes wrapped tool call inputs. Comprehensive tests verify the new protocol fields, CLI arguments, and tool parsing logic in both streaming and non-streaming modes. I have no feedback to provide.

@QwertyJack QwertyJack force-pushed the deepseek-v4-tool-thinking-parser branch from 0234224 to 44ac6cc Compare April 29, 2026 14:15
Comment thread tests/entrypoints/openai/chat_completion/test_chat.py Outdated
@QwertyJack QwertyJack force-pushed the deepseek-v4-tool-thinking-parser branch from 44ac6cc to 86dfa2f Compare April 29, 2026 14:44
@QwertyJack
Copy link
Copy Markdown
Contributor Author

Thanks @chaunceyjiang, that makes sense. I removed the top-level thinking request-field mapping and kept the PR scoped to the DeepSeek V4 parser/tokenizer compatibility work. The PR body now notes that reasoning toggles should use chat_template_kwargs, matching the recipe.

@QwertyJack QwertyJack changed the title Add DeepSeek V4 agentic parser support Fix DeepSeek V4 DSML tool argument parsing Apr 29, 2026
@QwertyJack QwertyJack force-pushed the deepseek-v4-tool-thinking-parser branch 2 times, most recently from 9c548ce to 16cc8f5 Compare April 29, 2026 16:27
@UmutAlihan
Copy link
Copy Markdown

looking very forward for this fix to be merged and release in stable releases

Handle DeepSeek V4 DSML parameters with typed values, unwrap model-emitted input/arguments wrapper objects when they are not schema fields, and preserve real tool parameters named arguments by escaping them in rendered schemas/history and unescaping them during parsing.

Also flush held plain text at the end of a stream when it only looked like a partial DSML marker, and add focused parser, tokenizer, and CLI validation coverage for the DeepSeek V4 serving flags.

Co-authored-by: OpenAI Codex <codex@openai.com>

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
@QwertyJack QwertyJack force-pushed the deepseek-v4-tool-thinking-parser branch from 16cc8f5 to cf6a573 Compare May 6, 2026 06:52
@QwertyJack
Copy link
Copy Markdown
Contributor Author

QwertyJack commented May 6, 2026

@chaunceyjiang Here is a minimal deterministic case for why the PR is still needed after #41198.

A DeepSeek V4 tool call may emit the real arguments object inside a single DSML arguments parameter:

import json
from unittest.mock import MagicMock
from vllm.tool_parsers.deepseekv4_tool_parser import DeepSeekV4ToolParser

mock_tokenizer = MagicMock()
mock_tokenizer.get_vocab.return_value = {}

tool = MagicMock()
tool.function.name = "get_weather"
tool.function.parameters = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
}

parser = DeepSeekV4ToolParser(mock_tokenizer, tools=[tool])
request = MagicMock()
request.tools = [tool]

model_output = (
    '<|DSML|tool_calls>'
    '<|DSML|invoke name="get_weather">'
    '<|DSML|parameter name="arguments" string="false">{"location":"Beijing"}</|DSML|parameter>'
    '</|DSML|invoke>'
    '</|DSML|tool_calls>'
)

result = parser.extract_tool_calls(model_output, request)
print(json.loads(result.tool_calls[0].function.arguments))

On current upstream main (b53c507bc9), this prints:

{"arguments": "{\"location\":\"Beijing\"}"}

With this PR, it prints the OpenAI-compatible arguments object expected by the tool schema:

{"location": "Beijing"}

This is covered by test_extract_tool_calls_repairs_arguments_wrapper_object. The rest of the PR handles the same edge class for input, real schema fields named arguments, and final streaming flush when held text only looked like a partial DSML marker.


logger = init_logger(__name__)

ESCAPED_ARGUMENTS_PARAM_NAME = "__vllm_param_arguments__"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QwertyJack for identifying this issue. I think your fix is a bit of a hack.

I’ve submitted a new implementation here: #41801.

Could you help test it? @QwertyJack @UmutAlihan

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


logger = init_logger(__name__)

ESCAPED_ARGUMENTS_PARAM_NAME = "__vllm_param_arguments__"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QwertyJack for identifying this issue. I think your fix is a bit of a hack.

I’ve submitted a new implementation here: #41801.

Could you help test it? @QwertyJack @UmutAlihan

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sfeng33
Copy link
Copy Markdown
Collaborator

sfeng33 commented May 6, 2026

Thanks @QwertyJack, I added you as co-author on #41801.
Closing this PR as fixed.

@sfeng33 sfeng33 closed this May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deepseek Related to DeepSeek models frontend tool-calling

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants