[Refactor] Extract Harmony streaming SSE event builders into streaming_events.py#34909

Merged
vllm-bot merged 4 commits into vllm-project:main from sfeng33:streaming_parser
Feb 20, 2026

Conversation

@sfeng33
Contributor

@sfeng33 sfeng33 commented Feb 19, 2026

Purpose

serving.py is ~2800 lines. The _emit_* methods are pure functions (state + data → events) that don't use instance state, yet they live on the class. Extracting them:

  1. Reduces serving.py by ~800 lines
  2. Makes event builders independently testable
  3. Prepares for StreamingParsableContext support — the upcoming streaming parsable implementation will reuse these same event builders instead of duplicating them (as _process_simple_streaming_events currently does)
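
For illustration, a pure event builder in the extracted module takes explicit state plus data and returns events, with no reference to `self`. This is a minimal sketch of the shape, not the actual vLLM API — `StreamState`, `emit_text_delta`, and `OutputTextDeltaEvent` are hypothetical names:

```python
from dataclasses import dataclass

# Illustrative stand-in for a real SSE event type; name is hypothetical.
@dataclass
class OutputTextDeltaEvent:
    type: str
    item_id: str
    delta: str

@dataclass
class StreamState:
    """Explicit state threaded through the builders instead of instance state."""
    item_id: str
    sequence_number: int = 0

def emit_text_delta(state: StreamState, delta: str) -> list[OutputTextDeltaEvent]:
    """Pure function: (state, data) -> events; testable without a serving class."""
    state.sequence_number += 1
    return [
        OutputTextDeltaEvent(
            type="response.output_text.delta",
            item_id=state.item_id,
            delta=delta,
        )
    ]

state = StreamState(item_id="msg_0")
events = emit_text_delta(state, "Hello")
```

Because the builder depends only on its arguments, a unit test can exercise it directly, which is the independent-testability point above.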

Future plan

This is the first step toward unifying SSE event emission across all three streaming paths (Harmony, Simple, Parsable):

  1. This PR: Extract event builders into shared module
  2. Follow-up: Further refactor SSE events so that StreamingParsableContext and HarmonyContext share the same event processor loop

Test Plan

pytest tests/entrypoints/openai/responses/test_harmony.py

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request successfully refactors the Harmony streaming SSE event builders by extracting them from serving.py into a dedicated streaming_events.py module. This significantly improves the maintainability and testability of the code, and sets a solid foundation for unifying event emission across different streaming contexts. The refactor correctly handles the transition of state and dependencies (like tool_server). I have identified a robustness issue in the browser tool event builder that could lead to runtime crashes on unexpected model output.

Comment on lines +636 to +658
parsed_args = json.loads(previous_item.content[0].text)
action = None

if function_name == "search":
action = response_function_web_search.ActionSearch(
type="search",
query=parsed_args["query"],
)
elif function_name == "open":
action = response_function_web_search.ActionOpenPage(
type="open_page",
# TODO: translate to url
url=f"cursor:{parsed_args.get('cursor', '')}",
)
elif function_name == "find":
action = response_function_web_search.ActionFind(
type="find",
pattern=parsed_args["pattern"],
# TODO: translate to url
url=f"cursor:{parsed_args.get('cursor', '')}",
)
else:
raise ValueError(f"Unknown function name: {function_name}")

Severity: high

The emit_browser_tool_events function lacks robustness against unexpected model output. Specifically:

  1. json.loads will raise a JSONDecodeError if the model produces malformed JSON.
  2. Accessing parsed_args["query"] and parsed_args["pattern"] will raise a KeyError if those keys are missing.
  3. The ValueError at line 658 will propagate and cause a 500 error for the streaming request.

Since this is an async generator context, these unhandled exceptions will crash the stream for the client. Consider implementing robust parsing with error handling similar to _parse_browser_tool_call in harmony_utils.py.

    try:
        parsed_args = json.loads(previous_item.content[0].text)
    except (json.JSONDecodeError, IndexError):
        return []

    if function_name == "search":
        query = parsed_args.get("query")
        if not query:
            return []
        action = response_function_web_search.ActionSearch(
            type="search",
            query=query,
        )
    elif function_name == "open":
        action = response_function_web_search.ActionOpenPage(
            type="open_page",
            # TODO: translate to url
            url=f"cursor:{parsed_args.get('cursor', '')}",
        )
    elif function_name == "find":
        pattern = parsed_args.get("pattern")
        if not pattern:
            return []
        action = response_function_web_search.ActionFind(
            type="find",
            pattern=pattern,
            # TODO: translate to url
            url=f"cursor:{parsed_args.get('cursor', '')}",
        )
    else:
        return []

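The defensive pattern the suggestion describes can be exercised standalone. A minimal runnable sketch, with plain dicts standing in for the `response_function_web_search` action objects and a hypothetical helper name `parse_browser_action`:

```python
import json
from typing import Optional

def parse_browser_action(function_name: str, raw_args: str) -> Optional[dict]:
    """Return an action dict, or None when the model output is malformed.

    Mirrors the suggested try/except + .get() pattern; the real code would
    build response_function_web_search action objects instead of dicts.
    """
    try:
        parsed_args = json.loads(raw_args)
    except json.JSONDecodeError:
        return None

    if function_name == "search":
        query = parsed_args.get("query")
        if not query:
            return None
        return {"type": "search", "query": query}
    if function_name == "open":
        return {"type": "open_page", "url": f"cursor:{parsed_args.get('cursor', '')}"}
    if function_name == "find":
        pattern = parsed_args.get("pattern")
        if not pattern:
            return None
        return {"type": "find", "pattern": pattern,
                "url": f"cursor:{parsed_args.get('cursor', '')}"}
    return None  # unknown tool name: skip instead of raising
```

With this shape, malformed JSON or a missing key yields `None` (the caller emits no event) rather than an exception that would tear down the client's stream.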
Contributor Author
This piece of code is copied over as-is; I'd prefer to keep the logic unchanged in this PR since it's a pure refactor, but let me know if folks think otherwise.

@sfeng33
Contributor Author

sfeng33 commented Feb 19, 2026

PTAL: @qandrew @daniel-salib

Contributor

@qandrew qandrew left a comment


lgtm, thanks! cc @houseroad can we add "ready" / automerge?

@mgoin added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) Feb 19, 2026
Member

@mgoin mgoin left a comment


LGTM, just two nits. They can wait if the goal is to make this purely a movement PR.

@github-project-automation github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Feb 19, 2026
@sfeng33
Contributor Author

sfeng33 commented Feb 20, 2026

Verified that the entrypoints-integration-responses-api tests passed locally (test_mcp_tools, test_parsable_context). The other failing tests are unrelated.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) February 20, 2026 04:15
@vllm-bot vllm-bot merged commit ed31a02 into vllm-project:main Feb 20, 2026
47 of 50 checks passed
DarkLight1337 added a commit to DarkLight1337/vllm that referenced this pull request Feb 21, 2026
…g_events.py (vllm-project#34909)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
@sfeng33 sfeng33 deleted the streaming_parser branch February 21, 2026 18:24
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Feb 22, 2026
…g_events.py (vllm-project#34909)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
jmamou pushed a commit to jmamou/vllm that referenced this pull request Feb 23, 2026
…g_events.py (vllm-project#34909)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026
…g_events.py (vllm-project#34909)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026
…g_events.py (vllm-project#34909)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
askliar pushed a commit to askliar/vllm that referenced this pull request Mar 9, 2026
…g_events.py (vllm-project#34909)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Copilot AI pushed a commit to machov/vllm that referenced this pull request Mar 10, 2026
…g_events.py (vllm-project#34909)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

Labels

frontend · gpt-oss (Related to GPT-OSS models) · ready (ONLY add when PR is ready to merge/full CI is needed)

Projects

Status: Done

5 participants