Skip to content

Conversation

@Frank-Jie
Copy link
Contributor

@Frank-Jie Frank-Jie commented Apr 29, 2025

Motivation

Modifications

The position of function call prompts in the messages has been modified to ensure correct function response.

Copy link
Collaborator

@CatherineSue CatherineSue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please correct me if I'm wrong: the order of messages have an effect in deepseekv3 function call results?

Can you add a system role under https://github.com/sgl-project/sglang/blob/main/test/srt/test_function_calling.py#L76 to make sure this doesn't affect other models' behavior?

@Frank-Jie
Copy link
Contributor Author

Merged the system messages ( if provided by the user) instead of simply changing the position to make the structure more consistent with the typical message format.

@Frank-Jie
Copy link
Contributor Author

LGTM. Please correct me if I'm wrong: the order of messages have an effect in deepseekv3 function call results?

Can you add a system role under https://github.com/sgl-project/sglang/blob/main/test/srt/test_function_calling.py#L76 to make sure this doesn't affect other models' behavior?

Thank you for your feedback.

  1. Based on my testing results, the position of multiple system messages( if provided by the user ) has an impact on the performance of function calls. In the latest version, I merged the system messages to make the structure more consistent with the typical message format.

  2. I think adding system role content to the test case may affect other model output, but the changes in this PR is only effective where deepseekv3 is used as the tool_call_parser. https://github.com/sgl-project/sglang/pull/5882/files#diff-ae0b859c80ded41d0ffdbff5dbf98e603308d87fc976128187d6c346b706bbd4R968

  3. Additionally, I noticed that in the latest PR feat: support pythonic tool call and index in tool call streaming #5725, that chat template is being used as the parsing method. (Thanks for your idea ) I agree it is a better way, and I am considering submitting another PR to use the chat template(instead of format system prompt) for parsing DeepSeek V3 as well.

@CatherineSue
Copy link
Collaborator

@Frank-Jie thanks for the detailed info and thorough tests. I opened #5908 to add the chat template. I also attached a few test cases there. Can you take a look to see if it would cover the case you mentioned above? thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants