Support DeepSeek-V3.1 tool call #9446
Merged: zhyncs merged 14 commits into sgl-project:main from Xu-Wenqing:add_deepseek_v31_chat_template on Aug 27, 2025
Commits
9234425 Add DeepSeek-V3.1 tool call chat template (Xu-Wenqing)
b9fb19f Add DeepSeek-V3.1 tool call chat template (Xu-Wenqing)
7242d0f Add DeepSeek-V3.1 tool call chat template (Xu-Wenqing)
443ee7f Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
debbf69 Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
8b8b503 Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
55661f6 Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
6a46646 Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
5b4258b Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
8d606e7 Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
c82983a Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
6f0a2e7 Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
7eda5f2 Merge branch 'main' into add_deepseek_v31_chat_template (Xu-Wenqing)
a8bf4e3 fix lint (JustinTong0323)
91 changes: 91 additions & 0 deletions
examples/chat_template/tool_chat_template_deepseekv31.jinja
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,91 @@ | ||
| {% if not add_generation_prompt is defined %} | ||
| {% set add_generation_prompt = false %} | ||
| {% endif %} | ||
| {% if not thinking is defined %} | ||
| {% set thinking = false %} | ||
| {% endif %} | ||
| {% set ns = namespace(is_first=false, is_tool=false, system_prompt='', is_first_sp=true, is_last_user=false) %} | ||
| {%- for message in messages %} | ||
| {%- if message['role'] == 'system' %} | ||
| {%- if ns.is_first_sp %} | ||
| {% set ns.system_prompt = ns.system_prompt + message['content'] %} | ||
| {% set ns.is_first_sp = false %} | ||
| {%- else %} | ||
| {% set ns.system_prompt = ns.system_prompt + '\n\n' + message['content'] %} | ||
| {%- endif %} | ||
| {%- endif %} | ||
| {%- endfor %} | ||
|
|
||
| {% if tools is defined and tools is not none %} | ||
| {% set tool_ns = namespace(text='## Tools\nYou have access to the following tools:\n') %} | ||
| {% for tool in tools %} | ||
| {% set tool_ns.text = tool_ns.text + '\n### ' + tool.function.name + '\nDescription: ' + tool.function.description + '\n\nParameters: ' + (tool.function.parameters | tojson) + '\n' %} | ||
| {% endfor %} | ||
| {% set tool_ns.text = tool_ns.text + "\nIMPORTANT: ALWAYS adhere to this exact format for tool use:\n<|tool▁calls▁begin|><|tool▁call▁begin|>tool_call_name<|tool▁sep|>tool_call_arguments<|tool▁call▁end|>{{additional_tool_calls}}<|tool▁calls▁end|>\n\nWhere:\n\n- `tool_call_name` must be an exact match to one of the available tools\n- `tool_call_arguments` must be valid JSON that strictly follows the tool's Parameters Schema\n- For multiple tool calls, chain them directly without separators or spaces\n" %} | ||
| {% set ns.system_prompt = ns.system_prompt + '\n\n' + tool_ns.text %} | ||
| {% endif %} | ||
|
|
||
| {{ bos_token }}{{ ns.system_prompt }} | ||
| {%- for message in messages %} | ||
| {%- if message['role'] == 'user' %} | ||
| {%- set ns.is_tool = false -%} | ||
| {%- set ns.is_first = false -%} | ||
| {%- set ns.is_last_user = true -%} | ||
| {{'<|User|>' + message['content']}} | ||
| {%- endif %} | ||
| {%- if message['role'] == 'assistant' and message['tool_calls'] is defined and message['tool_calls'] is not none %} | ||
| {%- if ns.is_last_user %} | ||
| {{'<|Assistant|></think>'}} | ||
| {%- endif %} | ||
| {%- set ns.is_last_user = false -%} | ||
| {%- set ns.is_first = false %} | ||
| {%- set ns.is_tool = false -%} | ||
| {%- for tool in message['tool_calls'] %} | ||
| {%- if not ns.is_first %} | ||
| {%- if message['content'] is none %} | ||
| {{'<|tool▁calls▁begin|><|tool▁call▁begin|>'+ tool['function']['name'] + '<|tool▁sep|>' + tool['function']['arguments'] + '<|tool▁call▁end|>'}} | ||
| {%- else %} | ||
| {{message['content'] + '<|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['function']['name'] + '<|tool▁sep|>' + tool['function']['arguments'] + '<|tool▁call▁end|>'}} | ||
| {%- endif %} | ||
| {%- set ns.is_first = true -%} | ||
| {%- else %} | ||
| {{'<|tool▁call▁begin|>'+ tool['function']['name'] + '<|tool▁sep|>' + tool['function']['arguments'] + '<|tool▁call▁end|>'}} | ||
| {%- endif %} | ||
| {%- endfor %} | ||
| {{'<|tool▁calls▁end|><|end▁of▁sentence|>'}} | ||
| {%- endif %} | ||
| {%- if message['role'] == 'assistant' and (message['tool_calls'] is not defined or message['tool_calls'] is none) %} | ||
| {%- if ns.is_last_user %} | ||
| {{'<|Assistant|>'}} | ||
| {%- if message['prefix'] is defined and message['prefix'] and thinking %} | ||
| {{'<think>'}} | ||
| {%- else %} | ||
| {{'</think>'}} | ||
| {%- endif %} | ||
| {%- endif %} | ||
| {%- set ns.is_last_user = false -%} | ||
| {%- if ns.is_tool %} | ||
| {{message['content'] + '<|end▁of▁sentence|>'}} | ||
| {%- set ns.is_tool = false -%} | ||
| {%- else %} | ||
| {%- set content = message['content'] -%} | ||
| {%- if '</think>' in content %} | ||
| {%- set content = content.split('</think>', 1)[1] -%} | ||
| {%- endif %} | ||
| {{content + '<|end▁of▁sentence|>'}} | ||
| {%- endif %} | ||
| {%- endif %} | ||
| {%- if message['role'] == 'tool' %} | ||
| {%- set ns.is_last_user = false -%} | ||
| {%- set ns.is_tool = true -%} | ||
| {{'<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}} | ||
| {%- endif %} | ||
| {%- endfor -%} | ||
| {%- if add_generation_prompt and ns.is_last_user and not ns.is_tool %} | ||
| {{'<|Assistant|>'}} | ||
| {%- if not thinking %} | ||
| {{'</think>'}} | ||
| {%- else %} | ||
| {{'<think>'}} | ||
| {%- endif %} | ||
| {% endif %} | ||
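The assistant tool-call branch of the template above concatenates each call as name, separator, arguments, all wrapped in the section tokens. A minimal Python sketch of that concatenation follows, as an illustration rather than the project's code. The helper name `render_tool_calls` is hypothetical, the token strings are copied as they appear on this page (verify the exact characters against the model's tokenizer), and the sketch ignores both the `<|Assistant|></think>` prefix and any whitespace the Jinja trim markers leave behind.

```python
import json

# Token strings copied from this page; check them against the tokenizer before use.
CALLS_BEGIN = "<|tool▁calls▁begin|>"
CALLS_END = "<|tool▁calls▁end|>"
CALL_BEGIN = "<|tool▁call▁begin|>"
CALL_END = "<|tool▁call▁end|>"
SEP = "<|tool▁sep|>"
EOS = "<|end▁of▁sentence|>"


def render_tool_calls(tool_calls, content=None):
    """Hypothetical helper: build the text of one assistant tool-call turn."""
    # Optional assistant text comes first, then the section-begin token.
    parts = [content or "", CALLS_BEGIN]
    for call in tool_calls:
        fn = call["function"]
        args = fn["arguments"]
        # The template concatenates `arguments` as a string. Serializing dicts
        # here is a convenience added by this sketch, not template behavior.
        if not isinstance(args, str):
            args = json.dumps(args, ensure_ascii=False)
        # Calls are chained directly, with no separator between them.
        parts.append(CALL_BEGIN + fn["name"] + SEP + args + CALL_END)
    parts.append(CALLS_END + EOS)
    return "".join(parts)


if __name__ == "__main__":
    print(render_tool_calls([
        {"function": {"name": "get_current_weather",
                      "arguments": '{"location": "Tokyo"}'}},
        {"function": {"name": "get_current_weather",
                      "arguments": {"location": "Paris"}}},
    ]))
```

Running the module prints a two-call section of the same shape as the example in the detector's docstring.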
222 changes: 222 additions & 0 deletions
python/sglang/srt/function_call/deepseekv31_detector.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,222 @@ | ||
| import json | ||
| import logging | ||
| import re | ||
| from typing import List | ||
|
|
||
| from sglang.srt.entrypoints.openai.protocol import Tool | ||
| from sglang.srt.function_call.base_format_detector import BaseFormatDetector | ||
| from sglang.srt.function_call.core_types import ( | ||
| StreamingParseResult, | ||
| StructureInfo, | ||
| ToolCallItem, | ||
| _GetInfoFunc, | ||
| ) | ||
| from sglang.srt.function_call.ebnf_composer import EBNFComposer | ||
| from sglang.srt.function_call.utils import _is_complete_json | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| class DeepSeekV31Detector(BaseFormatDetector): | ||
| """ | ||
| Detector for the DeepSeek-V3.1 model function call format. | |
|
|
||
| The DeepSeek-V3.1 format uses special Unicode tokens to delimit function calls, | |
| with the arguments given as a raw JSON object after the separator token. | |
|
|
||
| Format Structure: | ||
| ``` | ||
| <|tool▁calls▁begin|><|tool▁call▁begin|>{function_name}<|tool▁sep|>{json_arguments}<|tool▁call▁end|><|tool▁calls▁end|><|end▁of▁sentence|> | |
| ``` | ||
| Examples: | ||
| ``` | ||
| <|tool▁calls▁begin|><|tool▁call▁begin|>get_current_weather<|tool▁sep|>{"location": "Tokyo"}<|tool▁call▁end|><|tool▁call▁begin|>get_current_weather<|tool▁sep|>{"location": "Paris"}<|tool▁call▁end|><|tool▁calls▁end|><|end▁of▁sentence|> | ||
| ``` | ||
|
|
||
| Key Components: | ||
| - Tool Calls Section: Wrapped between `<|tool▁calls▁begin|>` and `<|tool▁calls▁end|>` | ||
| - Individual Tool Call: Wrapped between `<|tool▁call▁begin|>` and `<|tool▁call▁end|>` | ||
| - Function Declaration: `<|tool▁call▁begin|>{function_name}<|tool▁sep|>` | ||
| - Arguments: JSON object between `<|tool▁sep|>` and `<|tool▁call▁end|>` | |
| - Supports multiple tool calls | ||
|
|
||
| Reference: https://www.modelscope.cn/models/deepseek-ai/DeepSeek-V3.1 | ||
| """ | ||
|
|
||
| def __init__(self): | ||
| super().__init__() | ||
| self.bot_token = "<|tool▁calls▁begin|>" | ||
| self.eot_token = "<|tool▁calls▁end|>" | ||
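| # "|" is a regex alternation operator, so the literal tokens are escaped. | |
| self.func_call_regex = ( | |
| re.escape("<|tool▁call▁begin|>") + r".*?" + re.escape("<|tool▁call▁end|>") | |
| ) | |
| self.func_detail_regex = ( | |
| re.escape("<|tool▁call▁begin|>") + r"(.*)" + re.escape("<|tool▁sep|>") | |
| + r"(.*)" + re.escape("<|tool▁call▁end|>") | |
| ) | |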
| self._last_arguments = "" | ||
| self.current_tool_id = -1 | ||
|
|
||
| def has_tool_call(self, text: str) -> bool: | ||
| """Check if the text contains a deepseek format tool call.""" | ||
| return self.bot_token in text | ||
|
|
||
| def detect_and_parse(self, text: str, tools: List[Tool]) -> StreamingParseResult: | ||
| """ | ||
| One-time parsing: Detects and parses tool calls in the provided text. | ||
|
|
||
| :param text: The complete text to parse. | ||
| :param tools: List of available tools. | ||
| :return: StreamingParseResult holding the text before the tool-call section and the parsed calls; if parsing fails, the full text is returned as normal text with no calls. | |
| """ | ||
| idx = text.find(self.bot_token) | ||
| normal_text = text[:idx].strip() if idx != -1 else text | ||
| if idx == -1: | |
| return StreamingParseResult(normal_text=normal_text, calls=[]) | ||
| match_result_list = re.findall(self.func_call_regex, text, re.DOTALL) | ||
| calls = [] | ||
| try: | ||
| for match_result in match_result_list: | ||
| # Extract the function name and its JSON arguments | |
| func_detail = re.search(self.func_detail_regex, match_result, re.DOTALL) | ||
| func_name = func_detail.group(1) | ||
| func_args = func_detail.group(2) | ||
| func_args = json.loads(func_args) | ||
| # construct match_result for parse_base_json | ||
| match_result = {"name": func_name, "parameters": func_args} | ||
| calls.extend(self.parse_base_json(match_result, tools)) | ||
| return StreamingParseResult(normal_text=normal_text, calls=calls) | ||
| except Exception as e: | ||
| logger.error(f"Error in detect_and_parse: {e}") | ||
| # return the normal text if parsing fails | ||
| return StreamingParseResult(normal_text=text) | ||
|
|
||
| def parse_streaming_increment( | ||
| self, new_text: str, tools: List[Tool] | ||
| ) -> StreamingParseResult: | ||
| """ | ||
| Incrementally parse streamed tool calls in the DeepSeek-V3.1 format. | |
| """ | ||
| self._buffer += new_text | ||
| current_text = self._buffer | ||
|
|
||
| # Check if we have a tool call (either the start token or individual tool call) | ||
| has_tool_call = ( | ||
| self.bot_token in current_text or "<|tool▁call▁begin|>" in current_text | ||
| ) | ||
|
|
||
| if not has_tool_call: | ||
| self._buffer = "" | ||
| for e_token in [self.eot_token, "<|tool▁call▁end|>"]: | ||
| if e_token in new_text: | ||
| new_text = new_text.replace(e_token, "") | ||
| return StreamingParseResult(normal_text=new_text) | ||
|
|
||
| if not hasattr(self, "_tool_indices"): | ||
| self._tool_indices = self._get_tool_indices(tools) | ||
|
|
||
| calls: list[ToolCallItem] = [] | ||
| try: | ||
| partial_match = re.search( | ||
| pattern=( | |
| re.escape("<|tool▁call▁begin|>") + r"(.*)" + re.escape("<|tool▁sep|>") | |
| + r"(.*)" + re.escape("<|tool▁call▁end|>") | |
| ), | |
| string=current_text, | ||
| flags=re.DOTALL, | ||
| ) | ||
| if partial_match: | ||
| func_name = partial_match.group(1).strip() | ||
| func_args_raw = partial_match.group(2).strip() | ||
|
|
||
| # Initialize state if this is the first tool call | ||
| if self.current_tool_id == -1: | ||
| self.current_tool_id = 0 | ||
| self.prev_tool_call_arr = [] | ||
| self.streamed_args_for_tool = [""] | ||
|
|
||
| # Ensure we have enough entries in our tracking arrays | ||
| while len(self.prev_tool_call_arr) <= self.current_tool_id: | ||
| self.prev_tool_call_arr.append({}) | ||
| while len(self.streamed_args_for_tool) <= self.current_tool_id: | ||
| self.streamed_args_for_tool.append("") | ||
|
|
||
| if not self.current_tool_name_sent: | ||
| calls.append( | ||
| ToolCallItem( | ||
| tool_index=self.current_tool_id, | ||
| name=func_name, | ||
| parameters="", | ||
| ) | ||
| ) | ||
| self.current_tool_name_sent = True | ||
| # Store the tool call info for serving layer completions endpoint | ||
| self.prev_tool_call_arr[self.current_tool_id] = { | ||
| "name": func_name, | ||
| "arguments": {}, | ||
| } | ||
| else: | ||
| argument_diff = ( | ||
| func_args_raw[len(self._last_arguments) :] | ||
| if func_args_raw.startswith(self._last_arguments) | ||
| else func_args_raw | ||
| ) | ||
|
|
||
| if argument_diff: | ||
| calls.append( | ||
| ToolCallItem( | ||
| tool_index=self.current_tool_id, | ||
| name=None, | ||
| parameters=argument_diff, | ||
| ) | ||
| ) | ||
| self._last_arguments += argument_diff | ||
| self.streamed_args_for_tool[ | ||
| self.current_tool_id | ||
| ] += argument_diff | ||
|
|
||
| if _is_complete_json(func_args_raw): | ||
| # Update the stored arguments | ||
| try: | ||
| parsed_args = json.loads(func_args_raw) | ||
| self.prev_tool_call_arr[self.current_tool_id][ | ||
| "arguments" | ||
| ] = parsed_args | ||
| except json.JSONDecodeError: | ||
| pass | ||
|
|
||
| # Find the end of the current tool call and remove only that part from buffer | ||
| tool_call_end_pattern = ( | |
| re.escape("<|tool▁call▁begin|>") + r".*?" + re.escape("<|tool▁call▁end|>") | |
| ) | |
| match = re.search( | ||
| tool_call_end_pattern, current_text, re.DOTALL | ||
| ) | ||
| if match: | ||
| # Remove the completed tool call from buffer, keep any remaining content | ||
| self._buffer = current_text[match.end() :] | ||
| else: | ||
| self._buffer = "" | ||
|
|
||
| result = StreamingParseResult(normal_text="", calls=calls) | ||
| self.current_tool_id += 1 | ||
| self._last_arguments = "" | ||
| self.current_tool_name_sent = False | ||
| return result | ||
|
|
||
| return StreamingParseResult(normal_text="", calls=calls) | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Error in parse_streaming_increment: {e}") | ||
| return StreamingParseResult(normal_text=current_text) | ||
|
|
||
| def structure_info(self) -> _GetInfoFunc: | ||
| return lambda name: StructureInfo( | ||
| begin="<|tool▁call▁begin|>" + name + "<|tool▁sep|>", | ||
| end="<|tool▁call▁end|>", | ||
| trigger="<|tool▁call▁begin|>" + name + "<|tool▁sep|>", | ||
| ) | ||
|
|
||
| def build_ebnf(self, tools: List[Tool]): | ||
| return EBNFComposer.build_ebnf( | ||
| tools, | ||
| sequence_start_token=self.bot_token, | ||
| sequence_end_token=self.eot_token, | ||
| tool_call_separator="", | ||
| call_rule_fmt='"<|tool▁call▁begin|>{name}<|tool▁sep|>{arguments_rule}<|tool▁call▁end|>"', | ||
| function_format="json", | ||
| ) |
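The one-shot path in `detect_and_parse` can be illustrated outside the detector class with a small standalone sketch. It is a simplified approximation under stated assumptions: `parse_tool_calls` is a hypothetical function name, the token strings are copied as shown on this page, and the sketch does not validate names against the available tools the way `parse_base_json` presumably does. It passes the tokens through `re.escape` so the bar characters are matched literally, and uses non-greedy groups so two adjacent calls are not merged into one match.

```python
import json
import re

# Token strings copied from this page; check them against the tokenizer before use.
CALLS_BEGIN = "<|tool▁calls▁begin|>"
CALL_BEGIN = "<|tool▁call▁begin|>"
CALL_END = "<|tool▁call▁end|>"
SEP = "<|tool▁sep|>"

# re.escape keeps "|" from being read as regex alternation.
CALL_RE = re.compile(
    re.escape(CALL_BEGIN) + r"(.*?)" + re.escape(SEP) + r"(.*?)" + re.escape(CALL_END),
    re.DOTALL,
)


def parse_tool_calls(text):
    """Hypothetical helper: return (normal_text, calls) for a complete output."""
    idx = text.find(CALLS_BEGIN)
    if idx == -1:
        # No tool-call section: everything is normal text.
        return text, []
    normal_text = text[:idx].strip()
    calls = []
    for name, raw_args in CALL_RE.findall(text[idx:]):
        # json.loads raises on malformed arguments; callers decide how to recover.
        calls.append({"name": name.strip(), "parameters": json.loads(raw_args)})
    return normal_text, calls


if __name__ == "__main__":
    sample = (
        "Checking the weather. "
        "<|tool▁calls▁begin|>"
        '<|tool▁call▁begin|>get_current_weather<|tool▁sep|>{"location": "Tokyo"}<|tool▁call▁end|>'
        '<|tool▁call▁begin|>get_current_weather<|tool▁sep|>{"location": "Paris"}<|tool▁call▁end|>'
        "<|tool▁calls▁end|><|end▁of▁sentence|>"
    )
    print(parse_tool_calls(sample))
```

Unlike the detector, this sketch lets a JSON error propagate instead of logging it and falling back to returning the whole text as normal text.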