
Conversation

@Xu-Wenqing (Contributor) commented May 7, 2025

Support DeepSeek-V3-0324 function calling.

usage:

vllm serve ... --enable-auto-tool-choice --tool-call-parser deepseek_v3 --chat-template examples/tool_chat_template_deepseekv3.jinja
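
For example, with the published checkpoint (model name illustrative; substitute your own path):

vllm serve deepseek-ai/DeepSeek-V3-0324 --enable-auto-tool-choice --tool-call-parser deepseek_v3 --chat-template examples/tool_chat_template_deepseekv3.jinja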

test script (no streaming):

from openai import OpenAI

# Point these at the running vLLM server, e.g. "http://localhost:8000".
# vLLM only checks the API key if the server was started with --api-key.
openai_api_base = ""
openai_api_key = ""

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base + "/v1",
)


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_temperature",
            "description": "Get current temperature at a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_temperature_date",
            "description": "Get temperature at a location and date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "date": {
                        "type": "string",
                        "description": 'The date to get the temperature for, in the format "Year-Month-Day".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location", "date"],
            },
        },
    },
]


response = client.chat.completions.create(
    # vLLM serves a single model here; take the first entry from /v1/models.
    model=client.models.list().data[0].id,
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant.\n\nCurrent Date: 2024-09-30",
        },
        {
            "role": "user",
            "content": "What's the temperature in San Francisco now? How about tomorrow?",
        },
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

print(response.choices[0].message.content)

tool_calls = response.choices[0].message.tool_calls
for c in tool_calls:
    print(c.function.name, c.function.arguments)

Output:

None
get_current_temperature {"location": "San Francisco, CA, USA", "unit": "celsius"}
get_temperature_date {"location": "San Francisco, CA, USA", "date": "2024-10-01", "unit": "celsius"}
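
To close the loop after the model emits tool calls (not covered by this PR's tests), append the tool results as role "tool" messages and call the model again. A minimal sketch, assuming the request's message list is bound to a messages variable and using stub implementations of the two tools:

import json

def get_current_temperature(location, unit="celsius"):
    # Stub: a real implementation would query a weather service.
    return {"location": location, "temperature": 18, "unit": unit}

def get_temperature_date(location, date, unit="celsius"):
    # Stub: a real implementation would query a forecast service.
    return {"location": location, "date": date, "temperature": 17, "unit": unit}

available_tools = {
    "get_current_temperature": get_current_temperature,
    "get_temperature_date": get_temperature_date,
}

messages.append(response.choices[0].message)  # assistant turn carrying the tool_calls
for call in tool_calls:
    result = available_tools[call.function.name](**json.loads(call.function.arguments))
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result),
    })

followup = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=messages,
    tools=tools,
)
print(followup.choices[0].message.content)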

test script (streaming):

from openai import OpenAI

openai_api_base = ""
openai_api_key = ""

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base + "/v1",
)

# ANSI escape codes for colored terminal output.
class bcolors:
    HEADER = '\033[95m'
    OKBLUE = '\033[94m'
    OKCYAN = '\033[96m'
    OKGREEN = '\033[92m'
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_temperature",
            "description": "Get current temperature at a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_temperature_date",
            "description": "Get temperature at a location and date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "date": {
                        "type": "string",
                        "description": 'The date to get the temperature for, in the format "Year-Month-Day".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location", "date"],
            },
        },
    },
]


tool_calls_stream = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant.\n\nCurrent Date: 2024-09-30",
        },
        {
            "role": "user",
            "content": "What's the temperature in San Francisco now? How about tomorrow?",
        },
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
    extra_body={"chat_template_kwargs": {"enable_thinking": True}},
)

print("reasoning content(Blue) and content(Green):")
chunks = []
for chunk in tool_calls_stream:
    chunks.append(chunk)
    delta = chunk.choices[0].delta
    # reasoning_content and content are separate fields and not mutually
    # exclusive, so check both rather than chaining elif.
    if getattr(delta, "reasoning_content", None):
        print(bcolors.OKBLUE + delta.reasoning_content, end="", flush=True)
    if getattr(delta, "content", None):
        print(bcolors.OKGREEN + delta.content, end="", flush=True)

print(bcolors.ENDC + "\n### end of reasoning content and content. ###\n")

arguments = []
tool_call_idx = -1
for chunk in chunks:
    if chunk.choices[0].delta.tool_calls:
        tool_call = chunk.choices[0].delta.tool_calls[0]

        # A new index marks the start of the next tool call; flush the
        # previous call's accumulated arguments first.
        if tool_call.index != tool_call_idx:
            if tool_call_idx >= 0:
                print(f"streamed tool call arguments: {arguments[tool_call_idx]}")
            tool_call_idx = tool_call.index
            arguments.append("")
        if tool_call.id:
            print(f"streamed tool call id: {tool_call.id} ")

        if tool_call.function:
            if tool_call.function.name:
                print(f"streamed tool call name: {tool_call.function.name}")

            # Argument text streams in fragments; concatenate them per call index.
            if tool_call.function.arguments:
                arguments[tool_call_idx] += tool_call.function.arguments

if arguments:
    print(f"streamed tool call arguments: {arguments[-1]}")

Output:

reasoning content(Blue) and content(Green):


### end of reasoning content and content. ###

streamed tool call id: chatcmpl-tool-e2c23f26c3fa4a45a5eb029babe8ba12 
streamed tool call name: get_current_temperature
streamed tool call arguments: {"location": "San Francisco, USA", "unit": "celsius"}
streamed tool call id: chatcmpl-tool-c4945b99fcbd4134971134c302eaf6a9 
streamed tool call name: get_temperature_date
streamed tool call arguments: {"location": "San Francisco, USA", "date": "2024-10-01", "unit": "celsius"}

@github-actions bot commented May 7, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default; only the fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@mergify bot added the documentation, frontend, and tool-calling labels May 7, 2025
@Xu-Wenqing Xu-Wenqing marked this pull request as ready for review May 7, 2025 11:20
@Xu-Wenqing Xu-Wenqing changed the title [Feature] Support DeepSeekV3 Function Call [WIP][Feature] Support DeepSeekV3 Function Call May 7, 2025
@Xu-Wenqing Xu-Wenqing marked this pull request as draft May 7, 2025 11:22
@Xu-Wenqing Xu-Wenqing changed the title [WIP][Feature] Support DeepSeekV3 Function Call [Feature] Support DeepSeekV3 Function Call May 7, 2025
@Xu-Wenqing Xu-Wenqing marked this pull request as ready for review May 7, 2025 11:42
@Xu-Wenqing (Contributor Author) commented:

#14745

@houseroad (Collaborator) commented:

Could you lint and attach a test plan?

@aarnphm (Collaborator) left a comment:

qq. The parser looks good for merging for now, but do we need the tool chat template?

iirc this is already included in the default chat templates from the model repo?

@Xu-Wenqing (Contributor Author) commented May 8, 2025

qq. The parser looks good for merging for now, but do we need the tool chat template?

iirc this is already included in the default chat templates from the model repo?

@aarnphm yeah, the DeepSeek-V3-0324 default chat template doesn't work with vLLM, so I fixed some issues in it, e.g. tool['function']['arguments'] was changed to tool['function']['arguments']|tojson: https://github.com/Xu-Wenqing/vllm/blob/dev/dpsk_r1_tool_parser/examples/tool_chat_template_deepseekv3.jinja#L56-L62
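
The |tojson filter matters because Jinja's default stringification of a dict is Python's repr, which is not valid JSON. A standalone jinja2 illustration (not the actual template):

from jinja2 import Environment

env = Environment()
args = {"location": "San Francisco, CA, USA", "unit": "celsius"}

# Without |tojson: single-quoted Python repr, which json.loads rejects.
print(env.from_string("{{ arguments }}").render(arguments=args))
# {'location': 'San Francisco, CA, USA', 'unit': 'celsius'}

# With |tojson: valid JSON the tool-call parser can load.
print(env.from_string("{{ arguments | tojson }}").render(arguments=args))
# {"location": "San Francisco, CA, USA", "unit": "celsius"}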

I updated the code, need review again, thanks.

@Xu-Wenqing (Contributor Author) commented May 8, 2025

Could you lint and attach a test plan?

@houseroad updated the description, including some test cases.

@Xu-Wenqing Xu-Wenqing requested a review from aarnphm May 8, 2025 07:00
logger = init_logger(__name__)


@ToolParserManager.register_module("deepseekv3")
(Collaborator) Suggested change:
- @ToolParserManager.register_module("deepseekv3")
+ @ToolParserManager.register_module("deepseek_v3")

minor s/deepseekv3/deepseek_v3

@Xu-Wenqing (Contributor Author) replied:
Done.

@aarnphm (Collaborator) left a comment:

Tiny

@Xu-Wenqing (Contributor Author) replied:

Tiny

@aarnphm done.

@Xu-Wenqing (Contributor Author) commented May 10, 2025

@DarkLight1337 @simon-mo @houseroad @russellb this needs a review from someone with write access, thanks.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) May 12, 2025 05:54
@github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) May 12, 2025
@vllm-bot vllm-bot merged commit 3a5ea75 into vllm-project:main May 12, 2025
11 of 14 checks passed
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: 许文卿 <[email protected]>
Signed-off-by: Xu Wenqing <[email protected]>
Signed-off-by: Mu Huai <[email protected]>
mawong-amd pushed a commit to ROCm/vllm that referenced this pull request May 14, 2025
zzzyq pushed a commit to zzzyq/vllm that referenced this pull request May 24, 2025
Signed-off-by: 许文卿 <[email protected]>
Signed-off-by: Xu Wenqing <[email protected]>
Signed-off-by: Yuqi Zhang <[email protected]>
@WangErXiao (Contributor) commented:

@Xu-Wenqing Can this support deepseek_r1 tool calling?

@Xu-Wenqing (Contributor Author) replied:

@Xu-Wenqing Can this support deepseek_r1 tool calling?

@WangErXiao Yes. DeepSeek just released the DeepSeek-R1-0528 model, which supports function calling. We can use the "deepseek_v3" tool call parser with it, but the chat template tool_chat_template_deepseekv3.jinja here doesn't support DeepSeek-R1-0528, so I created PR #18874 to add a DeepSeek-R1-0528 chat template.

@WangErXiao (Contributor) commented:

LGTM 👍 @Xu-Wenqing

@xueshuai0922 commented:

@Xu-Wenqing According to the official documentation (https://docs.vllm.ai/en/stable/features/tool_calling.html?h=function+call#deepseek-v3-models-deepseek_v3), vLLM versions above 0.8.3 already support function calling for DeepSeek-V3 models. What is the difference between that and this update?

huiqiwa pushed a commit to huiqiwa/vllm-fork that referenced this pull request Oct 21, 2025
huiqiwa pushed a commit to huiqiwa/vllm-fork that referenced this pull request Oct 22, 2025

Labels: documentation, frontend, ready, tool-calling