fix: Responses API streaming tool call support for non-harmony models by herve-ves · Pull Request #36445 · vllm-project/vllm

herve-ves · 2026-03-09T04:10:52Z

Purpose

Fix Responses API streaming tool call support for non-harmony models.

_process_simple_streaming_events currently does not emit proper response.function_call_arguments.delta / response.function_call_arguments.done events during streaming. Instead, raw tool call XML (e.g. <tool_call><function=name>...) leaks into response.output_text.delta events.

Two root causes:

reasoning_parser and tool_parser are handled in a mutually exclusive if/elif chain. When a model has both (e.g. Qwen3.5 with <think> tags), the elif tool_parser: branch is never reached, so after reasoning ends the tool call XML falls through as plain content.
Even without a reasoning parser, there was no code to convert DeltaMessage.tool_calls into Responses API function call events. The original code only carried the comment # todo(kebe7jun) tool call support.
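
The first root cause can be illustrated with a toy model. This is a minimal sketch, not vLLM code: route_exclusive, route_sequential, and the string checks standing in for the parsers are all hypothetical, and the stream is assumed to start inside the reasoning block. It only shows why an if/elif chain never reaches the tool branch once a reasoning parser is configured.

```python
def route_exclusive(chunks, has_reasoning_parser, has_tool_parser):
    """Toy if/elif chain: only one parser branch ever sees the stream."""
    out = []
    in_reasoning = has_reasoning_parser
    for chunk in chunks:
        if has_reasoning_parser:
            if in_reasoning:
                if chunk == "</think>":
                    in_reasoning = False
                else:
                    out.append(("reasoning", chunk))
            else:
                # Reasoning has ended, but the elif below is unreachable,
                # so tool call markup is emitted as plain text.
                out.append(("text", chunk))
        elif has_tool_parser:
            kind = "tool" if chunk.startswith("<tool_call>") else "text"
            out.append((kind, chunk))
        else:
            out.append(("text", chunk))
    return out


def route_sequential(chunks, has_reasoning_parser, has_tool_parser):
    """Toy sequential handling: reasoning first, then tool parsing."""
    out = []
    in_reasoning = has_reasoning_parser
    for chunk in chunks:
        if in_reasoning:
            if chunk == "</think>":
                in_reasoning = False
            else:
                out.append(("reasoning", chunk))
            continue
        if has_tool_parser and chunk.startswith("<tool_call>"):
            out.append(("tool", chunk))
        else:
            out.append(("text", chunk))
    return out


chunks = ["plan the call", "</think>", "<tool_call>get_current_weather</tool_call>"]
exclusive = route_exclusive(chunks, True, True)
sequential = route_sequential(chunks, True, True)
```

With both parsers enabled, the last chunk is classified as text by the exclusive chain and as a tool call by the sequential version.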

This PR modifies _process_simple_streaming_events to:

Handle both reasoning and tool calls together (reasoning first, then tool calls after is_reasoning_end(), matching the Chat Completions streaming behavior)
Handle tool-call-only models (no reasoning parser)
Emit ResponseFunctionCallArgumentsDeltaEvent / ResponseFunctionCallArgumentsDoneEvent via existing emit_function_call_delta_events / emit_function_call_done_events helpers
Properly close message output items before function call events when content precedes tool calls
Sync tool_streaming_state.current_output_index with the main output index
Suppress ResponseTextDeltaEvent once tool calls are detected
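
The ordering rules in the list above can be sketched as a small planner. This is a simplified illustration under stated assumptions, not the PR's implementation: plan_events and its ("text" / "tool_args") input tuples are hypothetical, content part events are omitted, indexing starts at 0 rather than after a reasoning item, and the function call is closed at end of stream instead of on a parser signal.

```python
def plan_events(deltas):
    """Return (output_index, event_type) pairs for a toy delta stream.

    deltas is a list of ("text", str) or ("tool_args", str) tuples.
    """
    events = []
    output_index = 0
    message_open = False
    tool_open = False
    for kind, _payload in deltas:
        if kind == "text":
            if tool_open:
                # Text deltas are suppressed once a tool call was detected.
                continue
            if not message_open:
                events.append((output_index, "response.output_item.added"))
                message_open = True
            events.append((output_index, "response.output_text.delta"))
        elif kind == "tool_args":
            if message_open:
                # Close the message item before any function call event.
                events.append((output_index, "response.output_text.done"))
                events.append((output_index, "response.output_item.done"))
                message_open = False
                output_index += 1  # the function call gets the next index
            if not tool_open:
                events.append((output_index, "response.output_item.added"))
                tool_open = True
            events.append((output_index, "response.function_call_arguments.delta"))
    if tool_open:
        events.append((output_index, "response.function_call_arguments.done"))
        events.append((output_index, "response.output_item.done"))
    elif message_open:
        events.append((output_index, "response.output_text.done"))
        events.append((output_index, "response.output_item.done"))
    return events


planned = plan_events([("text", "\n\n"), ("tool_args", "{"), ("tool_args", "}"), ("text", "stray")])
```

In this toy run the message item is fully closed at index 0 before the function call item opens at index 1, and the trailing text delta produces no event.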

Test Plan

vllm serve Qwen/Qwen3.5-9B \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

with client.responses.create(
    model="Qwen/Qwen3.5-9B",
    input="What's the weather in Boston today?",
    tools=[{
        "type": "function",
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location", "unit"]
        }
    }],
    stream=True,
) as stream:
    for event in stream:
        print(event)

Test Result

Before (broken): tool call XML in `response.output_text.delta`

ResponseOutputItemAddedEvent(
    item=ResponseOutputMessage(
        id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
        content=[],
        role='assistant',
        status='in_progress',
        type='message',
        phase=None
    ),
    output_index=1,
    sequence_number=102,
    type='response.output_item.added'
)
ResponseContentPartAddedEvent(
    content_index=0,
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    output_index=1,
    part=ResponseOutputText(annotations=[], text='', type='output_text', logprobs=[]),
    sequence_number=103,
    type='response.content_part.added'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=104,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<tool_call>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=105,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=106,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=107,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='function',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=108,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='=get',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=109,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='_current',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=110,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='_weather',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=111,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=112,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=113,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=114,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=115,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='=',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=116,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='location',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=117,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=118,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=119,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='Boston',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=120,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta=',',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=121,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta=' MA',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=122,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=123,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=124,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=125,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=126,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=127,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=128,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=129,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='=',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=130,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='unit',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=131,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=132,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=133,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='f',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=134,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='ahrenheit',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=135,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=136,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=137,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=138,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=139,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=140,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=141,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='function',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=142,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=143,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=144,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</tool_call>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=145,
    type='response.output_text.delta'
)
ResponseTextDoneEvent(
    content_index=0,
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=146,
    text='\n\n<tool_call>\n<function=get_current_weather>\n<parameter=location>\nBoston, MA\n</parameter>\n<parameter=unit>\nfahrenheit\n</parameter>\n</function>\n</tool_call>',
    type='response.output_text.done'
)
ResponseContentPartDoneEvent(
    content_index=0,
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    output_index=1,
    part=ResponseOutputText(
        annotations=[],
        text='\n\n<tool_call>\n<function=get_current_weather>\n<parameter=location>\nBoston, MA\n</parameter>\n<parameter=unit>\nfahrenheit\n</parameter>\n</function>\n</tool_call>',
        type='output_text',
        logprobs=None
    ),
    sequence_number=147,
    type='response.content_part.done'
)
ResponseOutputItemDoneEvent(
    item=ResponseOutputMessage(
        id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
        content=[
            ResponseOutputText(
                annotations=[],
                text='\n\n<tool_call>\n<function=get_current_weather>\n<parameter=location>\nBoston, MA\n</parameter>\n<parameter=unit>\nfahrenheit\n</parameter>\n</function>\n</tool_call>',
                type='output_text',
                logprobs=None
            )
        ],
        role='assistant',
        status='completed',
        type='message',
        phase=None,
        summary=[]
    ),
    output_index=1,
    sequence_number=148,
    type='response.output_item.done'
)

After (fixed): proper function call events

ResponseOutputItemAddedEvent(
    item=ResponseOutputMessage(
        id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
        content=[],
        role='assistant',
        status='in_progress',
        type='message',
        phase=None
    ),
    output_index=1,
    sequence_number=102,
    type='response.output_item.added'
)
ResponseContentPartAddedEvent(
    content_index=0,
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    output_index=1,
    part=ResponseOutputText(annotations=[], text='', type='output_text', logprobs=[]),
    sequence_number=103,
    type='response.content_part.added'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n\n',
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    logprobs=[],
    output_index=1,
    sequence_number=104,
    type='response.output_text.delta'
)
ResponseTextDoneEvent(
    content_index=0,
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    logprobs=[],
    output_index=1,
    sequence_number=105,
    text='\n\n',
    type='response.output_text.done'
)
ResponseContentPartDoneEvent(
    content_index=0,
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    output_index=1,
    part=ResponseOutputText(annotations=[], text='\n\n', type='output_text', logprobs=None),
    sequence_number=106,
    type='response.content_part.done'
)
ResponseOutputItemDoneEvent(
    item=ResponseOutputMessage(
        id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
        content=[ResponseOutputText(annotations=[], text='\n\n', type='output_text', logprobs=None)],
        role='assistant',
        status='completed',
        type='message',
        phase=None
    ),
    output_index=1,
    sequence_number=107,
    type='response.output_item.done'
)
ResponseOutputItemAddedEvent(
    item=ResponseFunctionToolCall(
        arguments='',
        call_id='call_8606b74d97bf7ffe',
        name='get_current_weather',
        type='function_call',
        id='fc_aae462a5c84e3692',
        status='in_progress'
    ),
    output_index=2,
    sequence_number=108,
    type='response.output_item.added'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=109,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='{',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=110,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='"location": "Boston, MA"',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=111,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta=', "unit": "fahrenheit"',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=112,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='}',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=113,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDoneEvent(
    arguments='{"location": "Boston, MA", "unit": "fahrenheit"}',
    item_id='fc_aae462a5c84e3692',
    name='get_current_weather',
    output_index=2,
    sequence_number=114,
    type='response.function_call_arguments.done'
)
ResponseOutputItemDoneEvent(
    item=ResponseFunctionToolCall(
        arguments='{"location": "Boston, MA", "unit": "fahrenheit"}',
        call_id='call_8606b74d97bf7ffe',
        name='get_current_weather',
        type='function_call',
        id=None,
        status='completed',
        item_id='fc_aae462a5c84e3692',
        output_index=2,
        sequence_number=-1
    ),
    output_index=2,
    sequence_number=115,
    type='response.output_item.done'
)

ResponseCompletedEvent also correctly contains ResponseFunctionToolCall output items (via the existing non-streaming _make_response_output_items path).
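
One way a client could sanity-check the fixed stream is to reassemble the argument deltas and compare them with the done event. The sketch below is an assumption-laden illustration: collect_function_calls is a hypothetical helper, and the events are plain dicts carrying only the fields shown in the log above rather than SDK event objects.

```python
import json


def collect_function_calls(events):
    """Join argument deltas per item_id and verify them against the done event."""
    buffers = {}
    calls = {}
    for ev in events:
        if ev["type"] == "response.function_call_arguments.delta":
            buffers.setdefault(ev["item_id"], []).append(ev["delta"])
        elif ev["type"] == "response.function_call_arguments.done":
            streamed = "".join(buffers.get(ev["item_id"], []))
            if streamed != ev["arguments"]:
                raise ValueError("streamed deltas do not match final arguments")
            calls[ev["item_id"]] = {
                "name": ev["name"],
                "arguments": json.loads(ev["arguments"]),
            }
    return calls


# Deltas copied from the "After" log above.
item_id = "fc_aae462a5c84e3692"
delta_type = "response.function_call_arguments.delta"
events = [
    {"type": delta_type, "item_id": item_id, "delta": ""},
    {"type": delta_type, "item_id": item_id, "delta": "{"},
    {"type": delta_type, "item_id": item_id, "delta": '"location": "Boston, MA"'},
    {"type": delta_type, "item_id": item_id, "delta": ', "unit": "fahrenheit"'},
    {"type": delta_type, "item_id": item_id, "delta": "}"},
    {
        "type": "response.function_call_arguments.done",
        "item_id": item_id,
        "name": "get_current_weather",
        "arguments": '{"location": "Boston, MA", "unit": "fahrenheit"}',
    },
]
calls = collect_function_calls(events)
```

For the logged stream, the joined deltas equal the arguments string of the done event, and the result parses as JSON.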


mergify · 2026-03-09T04:15:06Z

Hi @herve-ves, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

gemini-code-assist

Code Review

This pull request effectively addresses a bug in the Responses API where streaming tool calls for non-harmony models were not working correctly. The changes correctly handle models with both reasoning and tool parsers by processing them sequentially and introduce the necessary logic to emit proper function call events. My main feedback concerns a potentially fragile mechanism for detecting the end of tool call arguments, which could be improved for better robustness and modularity.

Note: Security Review did not run due to the size of the PR.

gemini-code-assist · 2026-03-09T04:15:59Z

vllm/entrypoints/openai/responses/serving.py

+                            if args_delta == "}":
+                                tc_idx = tc.index if tc.index is not None else 0
+                                if tc_idx < len(
+                                    tool_parser.prev_tool_call_arr
+                                ):
+                                    tc_info = (
+                                        tool_parser.prev_tool_call_arr[
+                                            tc_idx
+                                        ]
+                                    )
+                                    for event in (
+                                        emit_function_call_done_events(
+                                            tc_info.get("name", fn_name),
+                                            tc_info.get(
+                                                "arguments", "{}"
+                                            ),
+                                            tool_streaming_state,
+                                        )
+                                    ):
+                                        yield (
+                                            _increment_sequence_number_and_return(
+                                                event
+                                            )
+                                        )


The logic to detect the end of a function call's arguments by checking if args_delta == "}" is quite fragile. It creates a tight coupling between this serving logic and the specific implementation details of the tool parser.

This approach assumes that a tool parser will always emit the final closing brace of arguments as a standalone delta. This contract is implicit and might be violated by other current or future tool parsers, which could lead to ResponseFunctionCallArgumentsDoneEvent not being emitted. For instance, a parser might stream "{}" for an empty object, or the final } might be part of a larger token.

A more robust design would be to have the parser explicitly signal the completion of a tool call's arguments within the DeltaMessage. For example, the ToolCall object in DeltaMessage could have a done: bool flag. This would decouple the serving logic from the parser's output tokenization and create a clearer interface.
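
The difference between the two completion signals can be shown with a toy comparison. Everything here is hypothetical: ToolCallDelta is an illustrative dataclass (not vLLM's DeltaToolCall), and the done field is the flag proposed above, which does not exist in the current protocol.

```python
from dataclasses import dataclass


@dataclass
class ToolCallDelta:
    """Hypothetical delta shape carrying an explicit completion flag."""
    index: int
    arguments: str
    done: bool = False


def completed_by_brace(deltas):
    # Implicit contract: the parser emits the closing brace as its own delta.
    return [d.index for d in deltas if d.arguments == "}"]


def completed_by_flag(deltas):
    # Explicit contract: the parser marks the final delta of a tool call.
    return [d.index for d in deltas if d.done]


# A parser that emits "}" on its own satisfies both checks.
standalone = [
    ToolCallDelta(0, "{"),
    ToolCallDelta(0, '"location": "Boston, MA"'),
    ToolCallDelta(0, "}", done=True),
]
# A parser that emits an empty object in one delta only satisfies the flag.
fused = [ToolCallDelta(0, "{}", done=True)]
```

The brace check misses the fused case, which is the scenario in which a done event would never be emitted.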

herve-ves · Mar 9, 2026

Good observation. The args_delta == "}" check mirrors how the Chat Completions streaming path handles tool call completion. It is the existing contract between Qwen3CoderToolParser and the serving layer: the parser explicitly emits "}" as a standalone delta (qwen3coder_tool_parser.py:689-696).

I agree that an explicit done signal in DeltaMessage / DeltaToolCall would be a cleaner interface, but that would require changes to the tool parser protocol and all existing parsers — a larger refactor better suited for a follow-up PR. This PR focuses on bringing the Responses API streaming path to parity with the existing Chat Completions streaming behavior.

Add tool parser integration to _process_simple_streaming_events so that tool call XML is parsed and emitted as proper function_call_arguments delta/done events instead of leaking into output_text events.

- Initialize tool_parser from self.parser.tool_parser_cls
- Call extract_tool_calls_streaming for each delta
- Emit ResponseFunctionCallArgumentsDeltaEvent / DoneEvent via existing emit_function_call_delta_events / emit_function_call_done_events helpers
- Suppress text events once tool calls are detected (tools_streamed flag)
- Close message output item before function call events when content precedes tool calls
- Sync tool_streaming_state.current_output_index with main output index

mergify · 2026-03-09T06:00:13Z

Hi @herve-ves, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

chaunceyjiang · 2026-03-09T06:18:43Z

see #29947

chaunceyjiang · 2026-03-09T06:43:21Z

vllm/entrypoints/openai/responses/serving.py

+                            if args_delta == "}" and tc_idx < len(
+                                tool_parser.prev_tool_call_arr
+                            ):
+                                tc_info = tool_parser.prev_tool_call_arr[tc_idx]


I’m not sure we should use the prev_tool_call_arr attribute here.

I remember implementing an earlier version in #29947 that didn’t require this attribute. I forgot to continue pushing that PR — thanks for the reminder. I’ll revisit it and try to incorporate your changes to move #29947 forward.

mergify · 2026-03-12T07:21:26Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @herve-ves.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

herve-ves requested review from DarkLight1337, aarnphm, chaunceyjiang and russellb as code owners March 9, 2026 04:10

mergify bot added the frontend and gpt-oss labels Mar 9, 2026

github-project-automation bot added this to gpt-oss Issues & Enhancements Mar 9, 2026

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Mar 9, 2026

gemini-code-assist bot reviewed Mar 9, 2026

herve-ves force-pushed the fix/responses-streaming-tool-call branch from 776d948 to aaa917d March 9, 2026 05:55

chaunceyjiang self-assigned this Mar 9, 2026

chaunceyjiang reviewed Mar 9, 2026

mergify bot added the needs-rebase label Mar 12, 2026

herve-ves wants to merge 1 commit into vllm-project:main from herve-ves:fix/responses-streaming-tool-call
