
fix(streaming): preserve custom attributes on final stream chunk#23530

Merged
Sameerlite merged 1 commit into BerriAI:main from Sameerlite:litellm_preserve-final-streaming-attributes
Mar 16, 2026

Conversation


@Sameerlite Sameerlite commented Mar 13, 2026

Summary

  • preserve upstream non-OpenAI attributes on final finish_reason chunks in CustomStreamWrapper.return_processed_chunk_logic
  • run attribute preservation before _is_delta_empty branching so both empty-delta and holding-chunk-flush paths keep custom fields
  • add regression tests for both final-chunk paths in test_streaming_handler.py

Test plan

  • poetry run pytest tests/test_litellm/litellm_core_utils/test_streaming_handler.py::test_finish_reason_chunk_preserves_non_openai_attributes -v
  • poetry run pytest tests/test_litellm/litellm_core_utils/test_streaming_handler.py::test_finish_reason_with_holding_chunk_preserves_non_openai_attributes -v

Fixes #23444

Ensure final finish_reason chunks retain non-OpenAI attributes from original provider chunks, including the holding_chunk flush path where delta is non-empty. Add regression tests for both final-chunk branches.
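The shape of the fix can be sketched in plain Python. This is an illustrative simplification, not LiteLLM's actual code: the function names mirror the PR (`preserve_non_openai_attributes` stands in for `preserve_upstream_non_openai_attributes`), the dict-based chunk structure is a stand-in for the real model objects, and the key point is that preservation runs before the `_is_delta_empty` check, so both final-chunk paths keep the custom fields:

```python
# Illustrative sketch of the fix, assuming a simplified dict-based chunk shape.
# Non-OpenAI attributes live at the top level of the response, so replacing
# choices[0]["delta"] afterwards cannot clobber them.

OPENAI_FIELDS = {"id", "object", "created", "model", "choices", "usage"}

def preserve_non_openai_attributes(response: dict, original_chunk: dict) -> dict:
    """Copy provider-specific (non-OpenAI) top-level fields onto the response."""
    for key, value in original_chunk.items():
        if key not in OPENAI_FIELDS:
            response[key] = value
    return response

def finalize_chunk(response: dict, original_chunk: dict, holding_text: str) -> dict:
    # The fix: preserve custom attributes FIRST, before the empty-delta branch,
    # so both the empty-delta and holding-chunk-flush paths benefit.
    preserve_non_openai_attributes(response, original_chunk)

    choice = response["choices"][0]
    delta = choice["delta"]
    is_delta_empty = not delta.get("content") and not holding_text

    if is_delta_empty:
        # Empty-delta path: delta is replaced, but preserved fields survive
        # because they sit on the response itself.
        choice["delta"] = {"content": None}
    else:
        # Holding-chunk flush path: pending text is emitted with the final chunk.
        delta["content"] = (delta.get("content") or "") + holding_text

    choice["finish_reason"] = "stop"
    return response
```

Before the fix, the equivalent preservation call only ran on the non-empty-chunk path, so a provider field such as `citations` was silently dropped from the final `finish_reason` chunk.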

Made-with: Cursor

vercel bot commented Mar 13, 2026

The latest updates on your projects:

Project: litellm
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Mar 13, 2026 7:43am



greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR fixes a bug where custom (non-OpenAI) attributes on final finish_reason stream chunks were being dropped. The fix moves the preserve_upstream_non_openai_attributes call in CustomStreamWrapper.return_processed_chunk_logic to before the _is_delta_empty branching, ensuring both the empty-delta path and the holding-chunk-flush path correctly carry through provider-specific attributes. The import of ChatCompletionChunk is also updated to use the more precise litellm-specific OpenAIChatCompletionChunk subclass.

Key changes:

  • streaming_handler.py: Adds attribute preservation in the received_finish_reason branch (lines 1015–1022), placed before _is_delta_empty so both code paths benefit. The placement is correct — custom fields live on model_response itself, so the subsequent Delta(content=None) replacement in the empty-delta path does not clobber them.
  • test_streaming_handler.py: Two regression tests added covering the empty-delta final chunk path and the holding-chunk-flush path; both are fully mocked with no real network calls.
  • The ChatCompletionChunk → OpenAIChatCompletionChunk rename aligns the type annotation in copy_model_response_level_provider_specific_fields with the actual litellm-subclassed type used at runtime.

Confidence Score: 4/5

  • This PR is safe to merge — the change is narrowly scoped to a single new call in an elif branch that cannot double-fire with the existing call in the is_chunk_non_empty branch.
  • The fix is minimal and well-placed. The two call sites for preserve_upstream_non_openai_attributes are in mutually exclusive if/elif branches, so there is no risk of double-application. Regression tests cover both affected code paths and are fully mocked. The only minor caveat is that sent_last_chunk is not set when _is_delta_empty=False (holding-chunk flush), but this is pre-existing behavior unaffected by the PR.
  • No files require special attention.

Important Files Changed

  • litellm/litellm_core_utils/streaming_handler.py — Adds the preserve_upstream_non_openai_attributes call in the received_finish_reason path before the _is_delta_empty branching; also renames the import from ChatCompletionChunk to OpenAIChatCompletionChunk. Logic is sound and placement is correct for both code paths.
  • tests/test_litellm/litellm_core_utils/test_streaming_handler.py — Adds two regression tests covering the empty-delta and holding-chunk-flush code paths. Tests are fully mocked with no real network calls, consistent with the folder's policy.
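The two regression tests can be sketched roughly as follows. Names, the `preserve_custom_fields` helper, and the dict-based chunk shapes are all hypothetical stand-ins for the PR's actual mocked tests; the point is one test per final-chunk path, with no network calls:

```python
# Hypothetical shape of the two regression tests (illustrative only, not
# LiteLLM's actual test code). Both run fully offline on plain dicts.

def preserve_custom_fields(response: dict, original: dict) -> dict:
    """Copy top-level fields the OpenAI chunk schema does not define."""
    openai_fields = {"id", "object", "created", "model", "choices", "usage"}
    response.update({k: v for k, v in original.items() if k not in openai_fields})
    return response

def test_empty_delta_final_chunk_keeps_custom_attribute():
    # Final chunk whose delta was replaced with content=None.
    original = {"model": "provider-x", "citations": ["https://example.com/doc"]}
    response = {"choices": [{"delta": {"content": None}, "finish_reason": "stop"}]}
    out = preserve_custom_fields(response, original)
    assert out["citations"] == ["https://example.com/doc"]

def test_holding_chunk_flush_keeps_custom_attribute():
    # Final chunk that flushes buffered holding-chunk text, so delta is non-empty.
    original = {"citations": ["https://example.com/doc"]}
    response = {"choices": [{"delta": {"content": "tail text"}, "finish_reason": "stop"}]}
    out = preserve_custom_fields(response, original)
    assert out["citations"] == ["https://example.com/doc"]
    assert out["choices"][0]["delta"]["content"] == "tail text"
```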

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[return_processed_chunk_logic called] --> B{is_chunk_non_empty?}
    B -- Yes --> C[Non-empty delta path]
    C --> D{original_chunk in response_obj?}
    D -- Yes --> E[preserve_upstream_non_openai_attributes]
    E --> F[Return model_response]
    D -- No --> F
    B -- No --> G{received_finish_reason is not None?}
    G -- No --> H[Other paths / return None]
    G -- Yes --> I{sent_last_chunk is True?}
    I -- Yes --> J[Raise StopIteration]
    I -- No --> K{holding_chunk not empty?}
    K -- Yes --> L[Flush holding_chunk into delta.content]
    K -- No --> M[Compute _is_delta_empty]
    L --> M
    M --> N["✨ NEW: preserve_upstream_non_openai_attributes\n(before branching - applies to both paths)"]
    N --> O{_is_delta_empty?}
    O -- Yes --> P[Set delta=Delta content=None\nSet finish_reason\nSet sent_last_chunk=True]
    O -- No --> Q[Non-empty delta returned\nwith holding chunk content]
    P --> R[Return model_response]
    Q --> R
```

Last reviewed commit: a012486

@Sameerlite Sameerlite merged commit b796ee9 into BerriAI:main Mar 16, 2026
28 of 37 checks passed


Development

Successfully merging this pull request may close these issues.

[Bug]: Streaming final chunk drops non-OpenAI attributes (preserve_upstream_non_openai_attributes not called)
