fix: Responses API streaming tool call support for non-harmony models #36445

Open
herve-ves wants to merge 1 commit into vllm-project:main from herve-ves:fix/responses-streaming-tool-call

herve-ves commented Mar 9, 2026

Purpose

Fix Responses API streaming tool call support for non-harmony models.

Fixes #36435

_process_simple_streaming_events currently does not emit proper response.function_call_arguments.delta / response.function_call_arguments.done events during streaming. Instead, raw tool call XML (e.g. <tool_call><function=name>...) leaks into response.output_text.delta events.

Two root causes:

  1. reasoning_parser and tool_parser are in a mutually exclusive if/elif chain. When a model has both (e.g. Qwen3.5 with <think> tags), the elif tool_parser: branch is never reached — after reasoning ends, tool call XML falls through as plain content.

  2. Even without a reasoning parser, no code converts DeltaMessage.tool_calls into Responses API function call events. The existing code only carries a # todo(kebe7jun) tool call support comment.
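
The first root cause can be illustrated with a toy model. The sketch below is not vLLM code: route_exclusive and route_sequential are hypothetical names, and the "</think>" and "<tool_call>" string checks merely stand in for the real reasoning and tool parsers.

```python
def route_exclusive(chunks, has_reasoning_parser, has_tool_parser):
    """Mimic an if/elif chain: only one parser ever sees the stream."""
    out = []
    reasoning_done = False
    for chunk in chunks:
        if has_reasoning_parser:
            if not reasoning_done:
                if chunk == "</think>":
                    reasoning_done = True
                else:
                    out.append(("reasoning", chunk))
            else:
                # The tool parser is never consulted, so markup leaks as text.
                out.append(("text", chunk))
        elif has_tool_parser:
            kind = "tool_call" if chunk.startswith("<tool_call>") else "text"
            out.append((kind, chunk))
        else:
            out.append(("text", chunk))
    return out


def route_sequential(chunks, has_reasoning_parser, has_tool_parser):
    """Reasoning first, then hand the remainder to the tool parser."""
    out = []
    reasoning_done = not has_reasoning_parser
    for chunk in chunks:
        if not reasoning_done:
            if chunk == "</think>":
                reasoning_done = True
            else:
                out.append(("reasoning", chunk))
        elif has_tool_parser and chunk.startswith("<tool_call>"):
            out.append(("tool_call", chunk))
        else:
            out.append(("text", chunk))
    return out


chunks = ["thinking...", "</think>", "<tool_call>get_current_weather</tool_call>"]
print(route_exclusive(chunks, True, True)[-1][0])   # text
print(route_sequential(chunks, True, True)[-1][0])  # tool_call
```

With both parsers enabled, the exclusive chain labels the final chunk as text, which corresponds to the leak described above; the sequential version hands it to the tool branch.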

This PR modifies _process_simple_streaming_events to:

  • Handle both reasoning and tool calls together (reasoning first, then tool calls after is_reasoning_end(), matching the Chat Completions streaming behavior)
  • Handle tool-call-only models (no reasoning parser)
  • Emit ResponseFunctionCallArgumentsDeltaEvent / ResponseFunctionCallArgumentsDoneEvent via existing emit_function_call_delta_events / emit_function_call_done_events helpers
  • Properly close message output items before function call events when content precedes tool calls
  • Sync tool_streaming_state.current_output_index with the main output index
  • Suppress ResponseTextDeltaEvent once tool calls are detected
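
The ordering rules in this list can be sketched as a small state machine. This is only an illustration under assumptions: StreamState and its methods are hypothetical, and events are simplified to tuples rather than the real Responses API event classes.

```python
from dataclasses import dataclass, field


@dataclass
class StreamState:
    output_index: int = 0
    message_open: bool = False
    tools_streamed: bool = False
    events: list = field(default_factory=list)

    def on_text(self, delta: str) -> None:
        if self.tools_streamed:
            return  # suppress text once a tool call has started
        if not self.message_open:
            self.events.append(("output_item.added", "message", self.output_index))
            self.message_open = True
        self.events.append(("output_text.delta", delta, self.output_index))

    def on_tool_call(self, name: str, args_delta: str) -> None:
        if self.message_open:
            # Close the pending message item before any function call event.
            self.events.append(("output_item.done", "message", self.output_index))
            self.message_open = False
            self.output_index += 1
        if not self.tools_streamed:
            self.events.append(("output_item.added", name, self.output_index))
            self.tools_streamed = True
        self.events.append(
            ("function_call_arguments.delta", args_delta, self.output_index)
        )


state = StreamState(output_index=1)
state.on_text("\n\n")
state.on_tool_call("get_current_weather", "{")
state.on_text("ignored after tool call")
for event in state.events:
    print(event)
```

In this sketch the message item closes at output index 1 and the function call opens at index 2, and the late text delta produces no event.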

Test Plan

vllm serve Qwen/Qwen3.5-9B \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

with client.responses.create(
    model="Qwen/Qwen3.5-9B",
    input="What's the weather in Boston today?",
    tools=[{
        "type": "function",
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location", "unit"]
        }
    }],
    stream=True,
) as stream:
    for event in stream:
        print(event)
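
Instead of printing every event, a client could group the function call events by item. The following is a minimal sketch, not part of this PR: collect_function_calls is a hypothetical helper, and the fake events are SimpleNamespace stand-ins that carry only the fields used here (type, item_id, delta, arguments).

```python
from types import SimpleNamespace


def collect_function_calls(events):
    """Group argument deltas by item_id and return {item_id: arguments}."""
    partial = {}
    finished = {}
    for event in events:
        if event.type == "response.function_call_arguments.delta":
            partial.setdefault(event.item_id, []).append(event.delta)
        elif event.type == "response.function_call_arguments.done":
            finished[event.item_id] = event.arguments
    # Fall back to the joined deltas if no done event arrived for an item.
    for item_id, parts in partial.items():
        finished.setdefault(item_id, "".join(parts))
    return finished


fake_stream = [
    SimpleNamespace(
        type="response.function_call_arguments.delta", item_id="fc_1", delta="{"
    ),
    SimpleNamespace(
        type="response.function_call_arguments.delta",
        item_id="fc_1",
        delta='"unit": "celsius"}',
    ),
    SimpleNamespace(
        type="response.function_call_arguments.done",
        item_id="fc_1",
        arguments='{"unit": "celsius"}',
    ),
]
print(collect_function_calls(fake_stream))
```

The same function should accept the real stream object from the test script, since it only reads attributes that appear on the events printed in the results.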

Test Result

Before (broken): tool call XML in `response.output_text.delta`
ResponseOutputItemAddedEvent(
    item=ResponseOutputMessage(
        id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
        content=[],
        role='assistant',
        status='in_progress',
        type='message',
        phase=None
    ),
    output_index=1,
    sequence_number=102,
    type='response.output_item.added'
)
ResponseContentPartAddedEvent(
    content_index=0,
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    output_index=1,
    part=ResponseOutputText(annotations=[], text='', type='output_text', logprobs=[]),
    sequence_number=103,
    type='response.content_part.added'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=104,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<tool_call>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=105,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=106,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=107,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='function',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=108,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='=get',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=109,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='_current',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=110,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='_weather',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=111,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=112,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=113,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=114,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=115,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='=',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=116,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='location',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=117,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=118,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=119,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='Boston',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=120,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta=',',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=121,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta=' MA',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=122,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=123,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=124,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=125,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=126,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=127,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='<',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=128,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=129,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='=',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=130,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='unit',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=131,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=132,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=133,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='f',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=134,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='ahrenheit',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=135,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=136,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=137,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='parameter',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=138,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=139,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=140,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=141,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='function',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=142,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=143,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=144,
    type='response.output_text.delta'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='</tool_call>',
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=145,
    type='response.output_text.delta'
)
ResponseTextDoneEvent(
    content_index=0,
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    logprobs=[],
    output_index=1,
    sequence_number=146,
    text='\n\n<tool_call>\n<function=get_current_weather>\n<parameter=location>\nBoston, MA\n</parameter>\n<parameter=unit>\nfahrenheit\n</parameter>\n</function>\n</tool_call>',
    type='response.output_text.done'
)
ResponseContentPartDoneEvent(
    content_index=0,
    item_id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
    output_index=1,
    part=ResponseOutputText(
        annotations=[],
        text='\n\n<tool_call>\n<function=get_current_weather>\n<parameter=location>\nBoston, MA\n</parameter>\n<parameter=unit>\nfahrenheit\n</parameter>\n</function>\n</tool_call>',
        type='output_text',
        logprobs=None
    ),
    sequence_number=147,
    type='response.content_part.done'
)
ResponseOutputItemDoneEvent(
    item=ResponseOutputMessage(
        id='24cf2c7d-f2cb-406b-bb99-fe807143e38f',
        content=[
            ResponseOutputText(
                annotations=[],
                text='\n\n<tool_call>\n<function=get_current_weather>\n<parameter=location>\nBoston, MA\n</parameter>\n<parameter=unit>\nfahrenheit\n</parameter>\n</function>\n</tool_call>',
                type='output_text',
                logprobs=None
            )
        ],
        role='assistant',
        status='completed',
        type='message',
        phase=None,
        summary=[]
    ),
    output_index=1,
    sequence_number=148,
    type='response.output_item.done'
)
After (fixed): proper function call events
ResponseOutputItemAddedEvent(
    item=ResponseOutputMessage(
        id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
        content=[],
        role='assistant',
        status='in_progress',
        type='message',
        phase=None
    ),
    output_index=1,
    sequence_number=102,
    type='response.output_item.added'
)
ResponseContentPartAddedEvent(
    content_index=0,
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    output_index=1,
    part=ResponseOutputText(annotations=[], text='', type='output_text', logprobs=[]),
    sequence_number=103,
    type='response.content_part.added'
)
ResponseTextDeltaEvent(
    content_index=0,
    delta='\n\n',
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    logprobs=[],
    output_index=1,
    sequence_number=104,
    type='response.output_text.delta'
)
ResponseTextDoneEvent(
    content_index=0,
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    logprobs=[],
    output_index=1,
    sequence_number=105,
    text='\n\n',
    type='response.output_text.done'
)
ResponseContentPartDoneEvent(
    content_index=0,
    item_id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
    output_index=1,
    part=ResponseOutputText(annotations=[], text='\n\n', type='output_text', logprobs=None),
    sequence_number=106,
    type='response.content_part.done'
)
ResponseOutputItemDoneEvent(
    item=ResponseOutputMessage(
        id='4992e480-0e55-4d72-9cc8-f348218ef1cc',
        content=[ResponseOutputText(annotations=[], text='\n\n', type='output_text', logprobs=None)],
        role='assistant',
        status='completed',
        type='message',
        phase=None
    ),
    output_index=1,
    sequence_number=107,
    type='response.output_item.done'
)
ResponseOutputItemAddedEvent(
    item=ResponseFunctionToolCall(
        arguments='',
        call_id='call_8606b74d97bf7ffe',
        name='get_current_weather',
        type='function_call',
        id='fc_aae462a5c84e3692',
        status='in_progress'
    ),
    output_index=2,
    sequence_number=108,
    type='response.output_item.added'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=109,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='{',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=110,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='"location": "Boston, MA"',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=111,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta=', "unit": "fahrenheit"',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=112,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDeltaEvent(
    delta='}',
    item_id='fc_aae462a5c84e3692',
    output_index=2,
    sequence_number=113,
    type='response.function_call_arguments.delta'
)
ResponseFunctionCallArgumentsDoneEvent(
    arguments='{"location": "Boston, MA", "unit": "fahrenheit"}',
    item_id='fc_aae462a5c84e3692',
    name='get_current_weather',
    output_index=2,
    sequence_number=114,
    type='response.function_call_arguments.done'
)
ResponseOutputItemDoneEvent(
    item=ResponseFunctionToolCall(
        arguments='{"location": "Boston, MA", "unit": "fahrenheit"}',
        call_id='call_8606b74d97bf7ffe',
        name='get_current_weather',
        type='function_call',
        id=None,
        status='completed',
        item_id='fc_aae462a5c84e3692',
        output_index=2,
        sequence_number=-1
    ),
    output_index=2,
    sequence_number=115,
    type='response.output_item.done'
)

ResponseCompletedEvent also correctly contains ResponseFunctionToolCall output items (via the existing non-streaming _make_response_output_items path).
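
As a quick consistency check on the fixed trace, the argument deltas should concatenate to the payload of the done event, and no text delta should contain tool call markup. The sketch below uses values copied from the "After (fixed)" output; check_arguments_stream is a hypothetical helper, not something in the PR.

```python
import json


def check_arguments_stream(deltas, done_arguments, text_deltas):
    """Return the parsed arguments if the stream is self-consistent."""
    joined = "".join(deltas)
    if joined != done_arguments:
        raise ValueError("argument deltas do not add up to the done payload")
    if any("<tool_call>" in d for d in text_deltas):
        raise ValueError("tool call markup leaked into text deltas")
    return json.loads(joined)


# Values copied from the "After (fixed)" trace.
deltas = ["", "{", '"location": "Boston, MA"', ', "unit": "fahrenheit"', "}"]
done_arguments = '{"location": "Boston, MA", "unit": "fahrenheit"}'
parsed = check_arguments_stream(deltas, done_arguments, text_deltas=["\n\n"])
print(parsed)
```

Feeding the text deltas from the "Before (broken)" trace into the same helper would raise, because one of them is '<tool_call>'.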


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify bot commented Mar 9, 2026

Hi @herve-ves, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses a bug in the Responses API where streaming tool calls for non-harmony models were not working correctly. The changes correctly handle models with both reasoning and tool parsers by processing them sequentially and introduce the necessary logic to emit proper function call events. My main feedback concerns a potentially fragile mechanism for detecting the end of tool call arguments, which could be improved for better robustness and modularity.

Note: Security Review did not run due to the size of the PR.

Comment on lines +1521 to +1544
if args_delta == "}":
    tc_idx = tc.index if tc.index is not None else 0
    if tc_idx < len(
        tool_parser.prev_tool_call_arr
    ):
        tc_info = (
            tool_parser.prev_tool_call_arr[
                tc_idx
            ]
        )
        for event in (
            emit_function_call_done_events(
                tc_info.get("name", fn_name),
                tc_info.get(
                    "arguments", "{}"
                ),
                tool_streaming_state,
            )
        ):
            yield (
                _increment_sequence_number_and_return(
                    event
                )
            )

high

The logic to detect the end of a function call's arguments by checking if args_delta == "}" is quite fragile. It creates a tight coupling between this serving logic and the specific implementation details of the tool parser.

This approach assumes that a tool parser will always emit the final closing brace of arguments as a standalone delta. This contract is implicit and might be violated by other current or future tool parsers, which could lead to ResponseFunctionCallArgumentsDoneEvent not being emitted. For instance, a parser might stream "{}" for an empty object, or the final } might be part of a larger token.

A more robust design would be to have the parser explicitly signal the completion of a tool call's arguments within the DeltaMessage. For example, the ToolCall object in DeltaMessage could have a done: bool flag. This would decouple the serving logic from the parser's output tokenization and create a clearer interface.
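
The difference between the two contracts can be sketched as follows. This is a hypothetical illustration of the suggestion: ToolCallDelta and its done field do not exist in vLLM as written here, and the two helper functions are made up for the comparison.

```python
from dataclasses import dataclass


@dataclass
class ToolCallDelta:
    """Hypothetical delta shape; `done` is the flag proposed in this review."""
    index: int
    arguments: str
    done: bool = False


def done_by_brace(deltas):
    """Heuristic under review: completion inferred from a standalone '}'."""
    return [d.index for d in deltas if d.arguments == "}"]


def done_by_flag(deltas):
    """Proposed contract: the parser states completion explicitly."""
    return [d.index for d in deltas if d.done]


# A parser that streams an empty argument object in a single delta.
empty_args = [ToolCallDelta(index=0, arguments="{}", done=True)]
print(done_by_brace(empty_args))  # []
print(done_by_flag(empty_args))   # [0]
```

The brace heuristic misses the single-delta "{}" case, so no done event would be emitted for it, while the explicit flag does not depend on how the parser splits its output.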

Author

Good observation. The args_delta == "}" check mirrors how the Chat Completions streaming path handles tool call completion — it's the existing contract between Qwen3CoderToolParser and the serving layer (the parser explicitly emits "}" as a standalone delta at qwen3coder_tool_parser.py:689-696).

I agree that an explicit done signal in DeltaMessage / DeltaToolCall would be a cleaner interface, but that would require changes to the tool parser protocol and all existing parsers — a larger refactor better suited for a follow-up PR. This PR focuses on bringing the Responses API streaming path to parity with the existing Chat Completions streaming behavior.

Add tool parser integration to _process_simple_streaming_events so that
tool call XML is parsed and emitted as proper function_call_arguments
delta/done events instead of leaking into output_text events.

- Initialize tool_parser from self.parser.tool_parser_cls
- Call extract_tool_calls_streaming for each delta
- Emit ResponseFunctionCallArgumentsDeltaEvent / DoneEvent via existing
  emit_function_call_delta_events / emit_function_call_done_events helpers
- Suppress text events once tool calls are detected (tools_streamed flag)
- Close message output item before function call events when content
  precedes tool calls
- Sync tool_streaming_state.current_output_index with main output index

herve-ves force-pushed the fix/responses-streaming-tool-call branch from 776d948 to aaa917d on March 9, 2026 05:55

chaunceyjiang self-assigned this Mar 9, 2026

@chaunceyjiang
Collaborator

see #29947

if args_delta == "}" and tc_idx < len(
    tool_parser.prev_tool_call_arr
):
    tc_info = tool_parser.prev_tool_call_arr[tc_idx]
Collaborator

I’m not sure we should use the prev_tool_call_arr attribute here.

I remember implementing an earlier version in #29947 that didn’t require this attribute. I forgot to continue pushing that PR — thanks for the reminder. I’ll revisit it and try to incorporate your changes to move #29947 forward.
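
One way to avoid reading parser state, sketched under assumptions, is for the serving loop to keep its own per-index buffer of names and argument deltas. This is only an illustration and is not claimed to be the approach taken in #29947; ToolCallAccumulator and its methods are hypothetical.

```python
from collections import defaultdict


class ToolCallAccumulator:
    """Collect tool call names and argument deltas per tool call index."""

    def __init__(self):
        self._names = {}
        self._args = defaultdict(list)

    def add(self, index, name=None, args_delta=""):
        if name is not None:
            self._names[index] = name
        if args_delta:
            self._args[index].append(args_delta)

    def finish(self, index):
        # Fall back to an empty JSON object when no arguments were streamed.
        return self._names.get(index), "".join(self._args[index]) or "{}"


acc = ToolCallAccumulator()
acc.add(0, name="get_current_weather")
for piece in ["{", '"location": "Boston, MA"', "}"]:
    acc.add(0, args_delta=piece)
print(acc.finish(0))
```

With a buffer like this, the done event could be built from locally accumulated data, leaving only the question of when to call finish, which is the completion signal discussed in the earlier review comment.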

mergify bot commented Mar 12, 2026

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @herve-ves.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Mar 12, 2026

Labels

frontend, gpt-oss (Related to GPT-OSS models), needs-rebase

Projects

Status: To Triage

2 participants