
fix(responses): strip empty text content parts before chat/completion…#22977

Open
dsteeley wants to merge 8 commits into BerriAI:main from dsteeley:fix/strip-empty-text-content-parts-responses-api

Conversation

@dsteeley
Contributor

@dsteeley dsteeley commented Mar 6, 2026

Strict OpenAI-compatible models (e.g. Kimi-K2.5, gpt-oss-120b on Azure AI) reject messages whose content array contains {"type": "text", "text": ""}. These empty parts are produced by the Responses API -> chat/completions transformation when an assistant turn contains a tool call alongside an empty text block.

Add _strip_empty_text_content_parts() to handler.py and call it in both the sync (response_api_handler) and async (async_response_api_handler) paths before litellm.completion() / litellm.acompletion(). The helper is a no-op when no empty parts are present (returns the original dict).

Relevant issues

Related to #22933 which fixes the same class of bug in the /v1/messages path.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests with make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix

Changes

Problem

When LiteLLM's Responses API → chat/completions fallback path (LiteLLMCompletionTransformationHandler) converts a Responses API request containing tool results into chat completion messages, it can produce content arrays with empty text parts:

[
  {"type": "text", "text": ""},
  {"type": "tool_calls", "tool_calls": [...]}
]
Strict OpenAI-compatible endpoints reject these with:

422 Unprocessable Entity: "Content part type is ContentPartType.text but text is not provided"
This affects any model that uses the chat/completions fallback path — i.e. get_provider_responses_api_config() returns None — including azure_ai/kimi-k2.5, azure_ai/gpt-oss-120b, and similar. It reproduces reliably in multi-turn conversations with tool calls (e.g. Claude Code / Codex routing through these providers).

The root cause is in transformation.py where text_value is None is checked but text_value == "" is not, so empty strings pass through.
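The corrected guard can be sketched as follows (a simplified illustration; the actual variable and function names in transformation.py may differ):

```python
def should_skip_text_part(text_value) -> bool:
    # Before the fix only `text_value is None` was checked, so an empty
    # string slipped through and was emitted as {"type": "text", "text": ""}.
    return text_value is None or text_value == ""
```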

Fix

_strip_empty_text_content_parts() is added to handler.py and called in both the sync (response_api_handler) and async (async_response_api_handler) paths immediately before litellm.completion() / litellm.acompletion(). In the async path it runs after async_responses_api_session_handler, covering messages injected by session replay as well.

The helper:

  • iterates messages and filters {"type": "text", "text": ""} parts from any list-valued content
  • falls back to content="" (not []) when filtering would empty the list, avoiding a different validation error
  • returns the original dict unchanged when no modifications are needed (no allocation overhead on the happy path)
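The behaviour described above can be sketched like this (a hypothetical reconstruction; the PR's actual helper name, signature, and internals may differ):

```python
from typing import Any, Dict, List


def strip_empty_text_content_parts(request: Dict[str, Any]) -> Dict[str, Any]:
    """Drop {"type": "text", "text": ""} parts from list-valued message content."""
    messages = request.get("messages")
    if not isinstance(messages, list):
        return request

    modified = False
    new_messages: List[Any] = []
    for msg in messages:
        content = msg.get("content") if isinstance(msg, dict) else None
        if isinstance(content, list):
            filtered = [
                part
                for part in content
                if not (
                    isinstance(part, dict)
                    and part.get("type") == "text"
                    and part.get("text") == ""
                )
            ]
            if len(filtered) != len(content):
                modified = True
                new_msg = dict(msg)
                # Fall back to "" rather than [] so strict endpoints that
                # reject empty content arrays still accept the message.
                new_msg["content"] = filtered if filtered else ""
                new_messages.append(new_msg)
                continue
        new_messages.append(msg)

    if not modified:
        return request  # identity return on the happy path

    result = dict(request)
    result["messages"] = new_messages
    return result
```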

Tests
6 unit tests added in tests/test_litellm/responses/litellm_completion_transformation/test_strip_empty_text_content_parts.py:


@greptile-apps
Contributor

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR fixes a 422 Unprocessable Entity error thrown by strict OpenAI-compatible endpoints (e.g. azure_ai/kimi-k2.5, azure_ai/gpt-oss-120b) when the Responses API → chat/completions fallback path emits assistant messages whose content array contains {"type": "text", "text": ""} entries. Two complementary layers of defence are introduced:

  • Root-cause fix (transformation.py): adds or text_value == "" to the existing None guard so empty-string text parts are dropped at the point they are created during transformation.
  • Defence-in-depth (handler.py): _strip_empty_text_content_parts() is called immediately before both litellm.completion and litellm.acompletion, acting as a safety net for empty parts that can arrive via other paths (e.g. messages injected by async_responses_api_session_handler during session replay). The helper also handles the edge case where removing all parts would leave an empty content array by falling back to content="".

A secondary, unrelated change in discoverable_endpoints.py refactors the MCP OAuth /token endpoint to parse form bodies manually instead of using FastAPI Form() injection, eliminating the hard python-multipart dependency for the URL-encoded case.

Key points:

  • Both fix layers are correct and non-regressive; the _strip_empty_text_content_parts helper is a no-op (returns the original dict) when no empty parts are present.
  • Six pure unit tests are added covering the primary scenario, edge cases, and identity-return behaviour.
  • new_messages in _strip_empty_text_content_parts is unconditionally allocated inside the loop even on the no-op path; on a high-throughput route this is a minor unnecessary allocation per request.
  • grant_type and client_id extracted from form_data in token_endpoint are Optional[str] at their declaration site but are passed to exchange_token_with_server which expects str; runtime-safe but will produce type-checker warnings.

Confidence Score: 4/5

  • Safe to merge — the fix is well-scoped, non-regressive, and covered by unit tests; only minor style/type issues remain.
  • Both fix layers (transformation.py root-cause + handler.py safety net) are logically correct and align with the described failure mode. The helper is a no-op on the happy path and the test suite covers the key scenarios. The minor deductions are: (1) new_messages is always allocated even when no modification is needed, creating unnecessary GC pressure on the hot path; (2) the discoverable_endpoints.py type annotation mismatch will cause mypy/pyright warnings without impacting runtime; (3) no unit tests for the new manual form-parsing logic in token_endpoint.
  • litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py — type narrowing issue after Optional[str] guard, and no tests for the new manual body-parsing path.

Important Files Changed

Filename Overview
litellm/responses/litellm_completion_transformation/handler.py Adds _strip_empty_text_content_parts() as a defensive, post-transformation filter applied in both sync and async paths before calling litellm.completion/litellm.acompletion. Logic is correct and well-scoped; async placement (after async_responses_api_session_handler) intentionally covers session-replay-injected messages too.
litellm/responses/litellm_completion_transformation/transformation.py Root-cause one-liner fix: adds or text_value == "" to the existing text_value is None guard, preventing empty-string text parts from ever being emitted by the transformation. Clean and minimal change.
litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py Replaces FastAPI Form() injection in token_endpoint with manual body parsing to remove the hard python-multipart dependency. Logic correctly handles both multipart/form-data and application/x-www-form-urlencoded; however, grant_type and client_id are typed as Optional[str] at the call-site but exchange_token_with_server expects str.
tests/test_litellm/responses/litellm_completion_transformation/test_litellm_completion_responses.py Six focused unit tests cover the key scenarios: adjacent tool-calls stripping, non-empty part preservation, empty-list fallback to "", identity return when unchanged, non-text part preservation, and string-content pass-through. All tests are pure unit tests with no network calls — compliant with repository testing policy.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Responses API Request] --> B[transform_responses_api_request_to_chat_completion_request]
    B --> C{text_value is None\nor empty string?}
    C -- Yes --> D[Skip / filter out\nempty text part\ntransformation.py fix]
    C -- No --> E[Add to content_list]
    E --> F[litellm_completion_request dict]
    D --> F

    F --> G{_is_async?}
    G -- No / sync --> H[_strip_empty_text_content_parts\nhandler.py sync fix]
    G -- Yes / async --> I[async_response_api_handler]

    I --> J{previous_response_id?}
    J -- Yes --> K[async_responses_api_session_handler\nappends session messages]
    J -- No --> L[_strip_empty_text_content_parts\nhandler.py async fix]
    K --> L

    H --> M[litellm.completion]
    L --> N[litellm.acompletion]

    M --> O[Transform response\nback to Responses API]
    N --> O

Comments Outside Diff (2)

  1. litellm/responses/litellm_completion_transformation/handler.py, line 39-57 (link)

    new_messages list always allocated, even on the no-op path

    new_messages is unconditionally built by the for loop (an O(n) allocation), but is only ever used when modified is True. On the happy path — which is every request that contains no empty text parts — the entire list is constructed and then discarded at the if not modified: return check.

    Since this function sits in the critical request path (called on every sync and async completion), you may want to delay the list construction to only when a modification is actually required:

        modified_messages: Optional[List[Any]] = None
        for i, msg in enumerate(messages):
            content = msg.get("content") if isinstance(msg, dict) else None
            if isinstance(content, list):
                filtered = [
                    part
                    for part in content
                    if not (
                        isinstance(part, dict)
                        and part.get("type") == "text"
                        and part.get("text") == ""
                    )
                ]
                if len(filtered) != len(content):
                    if modified_messages is None:
                        modified_messages = list(messages[:i])
                    new_msg = dict(msg)
                    new_msg["content"] = filtered if filtered else ""
                    modified_messages.append(new_msg)
                    continue
            if modified_messages is not None:
                modified_messages.append(msg)
    
        if modified_messages is None:
            return litellm_completion_request
    
        result = dict(litellm_completion_request)
        result["messages"] = modified_messages
        return result

    This avoids any heap allocation on the no-op path entirely.

  2. litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py, line 404-412 (link)

    Type mismatch passed to exchange_token_with_server

    form_data.get(...) returns Optional[str], so grant_type and client_id are inferred as Optional[str] here. However, exchange_token_with_server declares both as str (non-optional). Even though the runtime guard on lines 406–407 ensures neither is None at the call site, static type-checkers (mypy/pyright) cannot narrow through the early raise and will flag the downstream call as a type error.

    Adding a cast(str, ...) (or an assert guard) after the validation block would make the types explicit and suppress the false-positive diagnostics:

        grant_type_str: str = grant_type  # type: ignore[assignment]  # narrowed by guard above
        client_id_str: str = client_id    # type: ignore[assignment]

    or via typing.cast. No runtime behaviour changes.
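The typing.cast alternative can be illustrated like this (a standalone sketch of the pattern, not the actual token_endpoint code; the guard mirrors the one described in the review):

```python
from typing import Optional, Tuple, cast


def narrow_credentials(
    grant_type: Optional[str], client_id: Optional[str]
) -> Tuple[str, str]:
    # Runtime guard: raise before either value is used, as in token_endpoint.
    if grant_type is None or client_id is None:
        raise ValueError("grant_type and client_id are required")
    # cast() makes the narrowing explicit for mypy/pyright; it has no
    # runtime effect and returns its argument unchanged.
    return cast(str, grant_type), cast(str, client_id)
```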

Last reviewed commit: b7b79be

(e.g. Kimi-K2.5, gpt-oss-120b on Azure AI).
"""

import pytest

Unused pytest import

pytest is imported but never used in this file — no pytest.raises, pytest.mark, pytest.fixture, or any other pytest.* references appear. This will trigger a linter/flake8 F401 warning. Test discovery works via function naming conventions without the import.

Suggested change
import pytest

dsteeley added 2 commits March 9, 2026 11:15
…xt parts

- Apply Black formatting to handler.py, transformation.py, and test file
  to fix CI lint failures
- Fix root cause in transformation.py: also skip text items where text == ""
  (previously only text is None was guarded), preventing empty text parts
  from being generated in the first place
Collaborator

@Sameerlite Sameerlite left a comment

nit and please resolve merge conflict

…ing completion transformation test file

Remove unused pytest import per reviewer nit. Move 6 tests from standalone
test_strip_empty_text_content_parts.py into TestStripEmptyTextContentParts
class in test_litellm_completion_responses.py and delete the now-empty file.
@dsteeley
Contributor Author

@Sameerlite Conflicts resolved again, please could you look at this.
