
[Responses] Decouple SSE event helpers from Harmony context#35148

Merged
vllm-bot merged 4 commits intovllm-project:mainfrom
sfeng33:sse_interface
Feb 25, 2026

Conversation


@sfeng33 sfeng33 commented Feb 23, 2026

Purpose

Architecture: Two-layer design in streaming_events.py

The core refactor splits streaming_events.py into two layers: dispatchers that understand Harmony context objects, and leaf helpers that only accept plain strings. Dispatchers extract values from StreamingHarmonyContext / HarmonyMessage and delegate to leaf helpers, which build SSE events from primitive types. This means the event-building logic is reusable by any future backend without depending on Harmony.

     serving.py                streaming_events.py
    ┌──────────┐       ┌────────────────────────────────────────────────┐
    │          │       │                                                │
    │  ctx ────┼──────▶│  DISPATCHERS (Harmony-specific)               │
    │          │       │  ┌──────────────────────────────────────┐      │
    │          │       │  │ emit_content_delta_events(ctx,state) │      │
    │          │       │  │ emit_previous_item_done_events(prev) │      │
    │          │       │  │ emit_tool_action_events(ctx,state,ts)│      │
    │          │       │  └──────────────┬───────────────────────┘      │
    │          │       │                 │ extract plain values         │
    │          │       │                 ▼                              │
    │          │       │  LEAF HELPERS (backend-agnostic)               │
    │          │       │  ┌──────────────────────────────────────┐      │
    │          │       │  │  Delta:                              │      │
    │          │       │  │    emit_text_delta_events(str,.)     │      │
    │          │       │  │    emit_reasoning_delta_events(str,.)│      │
    │          │       │  │    emit_function_call_delta_ev(str,.)│      │
    │          │       │  │    emit_mcp_delta_events(str,.)      │      │
    │          │       │  │    emit_code_interp_delta_ev(str,.)  │      │
    │          │       │  │                                      │      │
    │          │       │  │  Done:                               │      │
    │          │       │  │    emit_text_output_done_events(str) │      │
    │          │       │  │    emit_reasoning_done_events(str)   │      │
    │          │       │  │    emit_function_call_done_ev(str,.) │      │
    │          │       │  │    emit_mcp_completion_events(str,.) │      │
    │          │       │  │    emit_code_interp_completion(..)   │      │
    │          │       │  │    emit_browser_tool_events(..)      │      │
    │          │       │  └──────────────────────────────────────┘      │
    │          │       │                 │                              │
    │          │       │                 ▼                              │
    │  events◄─┼───────│  list[StreamingResponsesResponse]             │
    │          │       │                                                │
    └──────────┘       └────────────────────────────────────────────────┘
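The dispatcher/leaf split in the diagram can be sketched roughly as follows. This is an illustrative sketch only — the class, function names, and event fields are simplified stand-ins, not the actual signatures in `streaming_events.py`:

```python
# Sketch of the two-layer split: a Harmony-aware dispatcher extracts
# plain values, then delegates to a backend-agnostic leaf helper.
# Names and fields are simplified illustrations, not the vLLM API.
from dataclasses import dataclass


@dataclass
class TextDeltaEvent:
    type: str
    item_id: str
    delta: str


def emit_text_delta_events(text: str, item_id: str) -> list[TextDeltaEvent]:
    """Leaf helper: accepts only primitive types, knows nothing of Harmony."""
    return [
        TextDeltaEvent(
            type="response.output_text.delta", item_id=item_id, delta=text
        )
    ]


def emit_content_delta_events(ctx, item_id: str) -> list[TextDeltaEvent]:
    """Dispatcher: understands the Harmony context object, pulls out the
    plain string, and hands it to the leaf helper."""
    delta = ctx.last_content_delta  # hypothetical Harmony-specific field
    return emit_text_delta_events(delta, item_id)
```

Because the leaf helper takes only a `str`, a future non-Harmony backend can call it directly with its own extracted values and get identical SSE events.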

Test Plan

pytest tests/entrypoints/openai/responses/test_harmony.py

@mergify mergify bot added frontend gpt-oss Related to GPT-OSS models labels Feb 23, 2026

mergify bot commented Feb 23, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @sfeng33.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request is a significant and well-executed refactoring of the SSE event generation logic. The decoupling of event helpers from the Harmony-specific context into dispatchers and backend-agnostic leaf helpers is a great architectural improvement that enhances reusability and maintainability. The accompanying enhancements to the test suite, particularly the more robust validation of event stream pairing, ordering, and field consistency, are also excellent and will help ensure correctness going forward.

However, I've found one critical issue in the refactored logic for function calls. The call_id for a ResponseFunctionToolCall is not consistent between its in_progress and completed states in a streaming response. This breaks the tool-calling protocol for clients. Please see the detailed comment for more information.
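The consistency requirement flagged here can be illustrated with a small sketch (hypothetical helper names, not vLLM's actual code): the `call_id` must be generated once per tool call and threaded through every event for that call, rather than regenerated when each event is built.

```python
# Illustrative sketch: create call_id once and reuse it, so the
# in_progress and completed events for one function call agree.
import uuid


def make_call_id() -> str:
    return f"call_{uuid.uuid4().hex}"


def function_call_events(name: str, arguments: str) -> list[dict]:
    call_id = make_call_id()  # created exactly once per tool call
    return [
        {
            "type": "response.output_item.added",
            "item": {
                "type": "function_call",
                "call_id": call_id,
                "name": name,
                "status": "in_progress",
            },
        },
        {
            "type": "response.output_item.done",
            "item": {
                "type": "function_call",
                "call_id": call_id,  # same id as the in_progress event
                "name": name,
                "arguments": arguments,
                "status": "completed",
            },
        },
    ]
```

If `make_call_id()` were instead called inside each event builder, a streaming client could not match the completed item back to the in-progress one, which is the protocol break the review describes.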

Signed-off-by: sfeng33 <4florafeng@gmail.com>

sfeng33 commented Feb 23, 2026

PTAL: @qandrew @mgoin
cc @bbrowning


@chaunceyjiang chaunceyjiang left a comment


Thanks~

@github-project-automation github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Feb 24, 2026
@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Feb 24, 2026
@chaunceyjiang

/cc @qandrew PTAL.

@chaunceyjiang

https://buildkite.com/vllm/ci/builds/52856#019c8e44-d8e8-42f7-a2a8-b6b25c0767d9 @sfeng33 PTAL.


@qandrew qandrew left a comment


Thanks for the PR! LGTM, I trust Chauncey's review. Is the eventual goal to have simpleContext/parsableContext use the logic in streaming_events too?

Also cc @daniel-salib, who is working on #35184 for streaming.


sfeng33 commented Feb 24, 2026

> thanks for the PR! lgtm, i trust Chauncey's review. is the eventual goal to have simpleContext/parsableContext use the logic in streaming_events too?
>
> also cc @daniel-salib who is working on #35184 for streaming

Thanks for taking a look. Yes, the goal is for parsableContext to reuse streaming_events for SSE events, and to deprecate simpleContext eventually.

Signed-off-by: sfeng33 <4florafeng@gmail.com>
@bbrowning

This looks like a reasonable cleanup that also fixes some bugs in our Responses streaming events. I don't think it necessarily fixes all the bugs (left a comment in one place about browser and container events), but I don't think fixing all bugs is necessarily the bar. Were you able to test this with a live model and a real client just to verify streaming behavior outside of what the unit test has?


sfeng33 commented Feb 24, 2026

> This looks like a reasonable cleanup that also fixes some bugs in our Responses streaming events. I don't think it necessarily fixes all the bugs (left a comment in one place about browser and container events), but I don't think fixing all bugs is necessarily the bar. Were you able to test this with a live model and a real client just to verify streaming behavior outside of what the unit test has?

Totally agree there are remaining bugs. In this PR I tried to keep the functionality the same while fixing the one obvious bug I list in the PR summary. In terms of manual testing, I tested with the gpt-oss-20b model in the basic text and reasoning cases, and saw that the stream events are the same as main. For the tool call/MCP/function call events, I actually think the events aren't emitted the way the OpenAI Responses API specifies; e.g., we emit all browser-related events at once, when we should emit them right after each tool call is executed.

@vllm-bot vllm-bot merged commit ec1d30c into vllm-project:main Feb 25, 2026
47 of 50 checks passed
@sfeng33 sfeng33 deleted the sse_interface branch February 25, 2026 04:10
tom-zju pushed a commit to tom-zju/vllm that referenced this pull request Feb 26, 2026
flutist pushed a commit to flutist/vllm_custom_dataset_img_support_base64 that referenced this pull request Feb 28, 2026
llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026
askliar pushed a commit to askliar/vllm that referenced this pull request Mar 9, 2026
Copilot AI pushed a commit to machov/vllm that referenced this pull request Mar 10, 2026
EricccYang pushed a commit to EricccYang/vllm that referenced this pull request Apr 1, 2026

Labels

frontend gpt-oss Related to GPT-OSS models ready ONLY add when PR is ready to merge/full CI is needed

Projects

Status: Done


5 participants