[Bugfix] Fix tool call streaming for gpt-oss/Harmony models#33520
[Bugfix] Fix tool call streaming for gpt-oss/Harmony models#33520alexbi29 wants to merge 2 commits intovllm-project:mainfrom
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run You ask your reviewers to trigger select CI tests on top of Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀 |
|
Related Documentation No published documentation to review for changes on this repository. |
4c43d67 to
fa4c429
Compare
|
This pull request has merge conflicts that must be resolved before it can be |
There was a problem hiding this comment.
Code Review
This pull request introduces several important bug fixes for tool call handling in streaming and non-streaming modes for gpt-oss/Harmony models. The changes address an IndexError during streaming, add missing tool call IDs in non-streaming responses, and fix issues with split tool calls in streaming by improving how tool call IDs are managed and by merging delta tool calls. The implementation looks solid and correctly addresses the described issues. The code is well-structured and the fixes are robust. I have no major concerns.
fa4c429 to
d78fb82
Compare
|
Hi @alexbi29, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
chaunceyjiang
left a comment
There was a problem hiding this comment.
Could you share a minimal example to reproduce the problem?
|
@chaunceyjiang here it is. |
This PR fixes several issues with tool call handling for gpt-oss models
using the Harmony streaming parser:
1. **IndexError in streaming generator**: Added `auto_tools_called` check
before accessing `prev_tool_call_arr` to prevent IndexError when the
array is empty.
2. **Missing tool call IDs in non-streaming responses**: Added proper ID
generation for named tool choice and auto tool choice cases that were
missing the required `id` field.
3. **Split tool calls in streaming**: Fixed an issue where a single tool
call was being split into multiple entries because:
- Tool call IDs are now stored by recipient name (e.g., "functions.glob")
instead of index number, since `base_index` changes between streaming
calls as messages complete.
- Continuation chunks now include the same ID as the opening chunk,
allowing clients to properly merge them.
- DeltaToolCalls with the same index are merged before sending to avoid
multiple entries in one SSE chunk.
Tested with opencode client against gpt-oss-120b model.
d78fb82 to
2e1d598
Compare
|
Hi @alexbi29, the pre-commit checks have failed. Please run: uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
Summary
This PR fixes several issues with tool call handling for gpt-oss models using the Harmony streaming parser:
IndexError in streaming generator: Added
auto_tools_calledcheck before accessingprev_tool_call_arrto prevent IndexError when the array is empty.Missing tool call IDs in non-streaming responses: Added proper ID generation for named tool choice and auto tool choice cases that were missing the required
idfield.Split tool calls in streaming: Fixed an issue where a single tool call was being split into multiple entries because:
functions.glob) instead of index number, sincebase_indexchanges between streaming calls as messages complete.Test plan
🤖 Generated with Claude Code
This makes opencode tool calls work properly with v1 chat completion api