fix(responses): strip empty text content parts before chat/completion… #22977
dsteeley wants to merge 8 commits into BerriAI:main from
Conversation
fix(responses): strip empty text content parts before chat/completions fallback
Strict OpenAI-compatible models (e.g. Kimi-K2.5, gpt-oss-120b on Azure AI)
reject messages whose content array contains {"type": "text", "text": ""}.
These empty parts are produced by the Responses API -> chat/completions
transformation when an assistant turn contains a tool call alongside an
empty text block.
Add _strip_empty_text_content_parts() to handler.py and call it in both
the sync (response_api_handler) and async (async_response_api_handler)
paths before litellm.completion() / litellm.acompletion(). The helper is
a no-op when no empty parts are present (returns the original dict).
Related to PR BerriAI#22933 which fixes the same class of bug in the /v1/messages
path.
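The filter predicate the commit describes can be illustrated with a small standalone sketch (not the actual handler.py code — the data and comprehension below are illustrative):

```python
# An assistant turn where a tool call sits next to an empty text block
# produces a content array like this after transformation (illustrative).
content = [
    {"type": "text", "text": ""},  # strict validators reject this part
    {"type": "text", "text": "Calling the tool now."},
]

# Keep only parts that are not empty-string text blocks.
filtered = [
    part
    for part in content
    if not (
        isinstance(part, dict)
        and part.get("type") == "text"
        and part.get("text") == ""
    )
]
print(filtered)  # [{'type': 'text', 'text': 'Calling the tool now.'}]
```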
Greptile Summary

This PR fixes a bug where the Responses API → chat/completions transformation can emit empty {"type": "text", "text": ""} content parts that strict OpenAI-compatible models reject. A secondary, unrelated change in litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py removes the hard python-multipart dependency.

Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/responses/litellm_completion_transformation/handler.py | Adds _strip_empty_text_content_parts() as a defensive, post-transformation filter applied in both sync and async paths before calling litellm.completion/litellm.acompletion. Logic is correct and well-scoped; async placement (after async_responses_api_session_handler) intentionally covers session-replay-injected messages too. |
| litellm/responses/litellm_completion_transformation/transformation.py | Root-cause one-liner fix: adds or text_value == "" to the existing text_value is None guard, preventing empty-string text parts from ever being emitted by the transformation. Clean and minimal change. |
| litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py | Replaces FastAPI Form() injection in token_endpoint with manual body parsing to remove the hard python-multipart dependency. Logic correctly handles both multipart/form-data and application/x-www-form-urlencoded; however, grant_type and client_id are typed as Optional[str] at the call-site but exchange_token_with_server expects str. |
| tests/test_litellm/responses/litellm_completion_transformation/test_litellm_completion_responses.py | Six focused unit tests cover the key scenarios: adjacent tool-calls stripping, non-empty part preservation, empty-list fallback to "", identity return when unchanged, non-text part preservation, and string-content pass-through. All tests are pure unit tests with no network calls — compliant with repository testing policy. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Responses API Request] --> B[transform_responses_api_request_to_chat_completion_request]
    B --> C{text_value is None\nor empty string?}
    C -- Yes --> D[Skip / filter out\nempty text part\ntransformation.py fix]
    C -- No --> E[Add to content_list]
    E --> F[litellm_completion_request dict]
    D --> F
    F --> G{_is_async?}
    G -- No / sync --> H[_strip_empty_text_content_parts\nhandler.py sync fix]
    G -- Yes / async --> I[async_response_api_handler]
    I --> J{previous_response_id?}
    J -- Yes --> K[async_responses_api_session_handler\nappends session messages]
    J -- No --> L[_strip_empty_text_content_parts\nhandler.py async fix]
    K --> L
    H --> M[litellm.completion]
    L --> N[litellm.acompletion]
    M --> O[Transform response\nback to Responses API]
    N --> O
```
Comments Outside Diff (2)
- litellm/responses/litellm_completion_transformation/handler.py, lines 39-57 (link)

  `new_messages` list always allocated, even on the no-op path

  `new_messages` is unconditionally built by the `for` loop (an O(n) allocation), but is only ever used when `modified` is `True`. On the happy path, which is every request that contains no empty text parts, the entire list is constructed and then discarded at the `if not modified: return` check. Since this function sits in the critical request path (called on every sync and async completion), you may want to delay the list construction until a modification is actually required:

  ```python
  modified_messages: Optional[List[Any]] = None
  for i, msg in enumerate(messages):
      content = msg.get("content") if isinstance(msg, dict) else None
      if isinstance(content, list):
          filtered = [
              part
              for part in content
              if not (
                  isinstance(part, dict)
                  and part.get("type") == "text"
                  and part.get("text") == ""
              )
          ]
          if len(filtered) != len(content):
              if modified_messages is None:
                  modified_messages = list(messages[:i])
              new_msg = dict(msg)
              new_msg["content"] = filtered if filtered else ""
              modified_messages.append(new_msg)
              continue
      if modified_messages is not None:
          modified_messages.append(msg)

  if modified_messages is None:
      return litellm_completion_request

  result = dict(litellm_completion_request)
  result["messages"] = modified_messages
  return result
  ```

  This avoids any heap allocation on the no-op path entirely.
- litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py, lines 404-412 (link)

  Type mismatch passed to `exchange_token_with_server`

  `form_data.get(...)` returns `Optional[str]`, so `grant_type` and `client_id` are inferred as `Optional[str]` here. However, `exchange_token_with_server` declares both as `str` (non-optional). Even though the runtime guard on lines 406-407 ensures neither is `None` at the call site, static type checkers (mypy/pyright) cannot narrow through the early `raise` and will flag the downstream call as a type error. Adding a `cast(str, ...)` (or an `assert` guard) after the validation block would make the types explicit and suppress the false-positive diagnostics:

  ```python
  grant_type_str: str = grant_type  # type: ignore[assignment]  # narrowed by guard above
  client_id_str: str = client_id  # type: ignore[assignment]
  ```

  or via `typing.cast`. No runtime behaviour changes.
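A self-contained sketch of the `typing.cast` alternative (function and variable names are assumed for illustration, not taken from discoverable_endpoints.py):

```python
from typing import Optional, Tuple, cast


def narrow_form_fields(form_data: dict) -> Tuple[str, str]:
    """Validate required OAuth form fields, then narrow their types."""
    grant_type: Optional[str] = form_data.get("grant_type")
    client_id: Optional[str] = form_data.get("client_id")
    if grant_type is None or client_id is None:
        raise ValueError("grant_type and client_id are required")
    # cast() is a runtime no-op; it only tells the type checker that the
    # guard above has narrowed Optional[str] to str.
    return cast(str, grant_type), cast(str, client_id)
```

An `assert grant_type is not None` pair would achieve the same narrowing, but asserts can be stripped with `python -O`, so `cast` after an explicit `raise` is the safer choice here.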
Last reviewed commit: b7b79be
> import pytest

Unused pytest import

pytest is imported but never used in this file: no pytest.raises, pytest.mark, pytest.fixture, or any other pytest.* references appear. This will trigger a flake8 F401 warning. Test discovery works via function naming conventions without the import.
…xt parts

- Apply Black formatting to handler.py, transformation.py, and the test file to fix CI lint failures
- Fix root cause in transformation.py: also skip text items where text == "" (previously only text is None was guarded), preventing empty text parts from being generated in the first place
…t-content-parts-responses-api
7f815f2 to f73e53f
…t cause fix and apply Black
Sameerlite left a comment
nit and please resolve merge conflict
...t_litellm/responses/litellm_completion_transformation/test_strip_empty_text_content_parts.py
…ing completion transformation test file

Remove unused pytest import per reviewer nit. Move 6 tests from standalone test_strip_empty_text_content_parts.py into TestStripEmptyTextContentParts class in test_litellm_completion_responses.py and delete the now-empty file.
@Sameerlite Conflicts resolved again, please could you look at this.
Strict OpenAI-compatible models (e.g. Kimi-K2.5, gpt-oss-120b on Azure AI) reject messages whose content array contains {"type": "text", "text": ""}. These empty parts are produced by the Responses API -> chat/completions transformation when an assistant turn contains a tool call alongside an empty text block.
Add _strip_empty_text_content_parts() to handler.py and call it in both the sync (response_api_handler) and async (async_response_api_handler) paths before litellm.completion() / litellm.acompletion(). The helper is a no-op when no empty parts are present (returns the original dict).
Relevant issues
Related to #22933 which fixes the same class of bug in the /v1/messages path.
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Tests added in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement - see details)
- make test-unit passes
- Reviewed by @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🐛 Bug Fix
Changes
Problem
When LiteLLM's Responses API → chat/completions fallback path (LiteLLMCompletionTransformationHandler) converts a Responses API request containing tool results into chat completion messages, it can produce content arrays with empty text parts:
```json
[
  {"type": "text", "text": ""},
  {"type": "tool_calls", "tool_calls": [...]}
]
```
Strict OpenAI-compatible endpoints reject these with:
422 Unprocessable Entity: "Content part type is ContentPartType.text but text is not provided"
This affects any model that uses the chat/completions fallback path — i.e. get_provider_responses_api_config() returns None — including azure_ai/kimi-k2.5, azure_ai/gpt-oss-120b, and similar. It reproduces reliably in multi-turn conversations with tool calls (e.g. Claude Code / Codex routing through these providers).
The root cause is in transformation.py where text_value is None is checked but text_value == "" is not, so empty strings pass through.
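A minimal sketch of the corrected guard (the loop structure and item shape below are assumed for illustration, not the actual transformation.py code):

```python
def build_content_list(parts: list) -> list:
    """Convert raw text parts into chat-completion content entries."""
    content_list = []
    for item in parts:
        text_value = item.get("text")
        # Before the fix only `text_value is None` was skipped; the added
        # `or text_value == ""` stops empty-string parts at the source.
        if text_value is None or text_value == "":
            continue
        content_list.append({"type": "text", "text": text_value})
    return content_list
```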
Fix
_strip_empty_text_content_parts() is added to handler.py and called in both the sync (response_api_handler) and async (async_response_api_handler) paths immediately before litellm.completion() / litellm.acompletion(). In the async path it runs after async_responses_api_session_handler, covering messages injected by session replay as well.
The helper:
- iterates messages and filters {"type": "text", "text": ""} parts from any list-valued content
- falls back to content="" (not []) when filtering would empty the list, avoiding a different validation error
- returns the original dict unchanged when no modifications are needed (no allocation overhead on the happy path)
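Putting those three behaviours together, a simplified self-contained sketch (not the exact handler.py implementation; the real helper is named `_strip_empty_text_content_parts`):

```python
from typing import Any, Dict, List


def strip_empty_text_content_parts(request: Dict[str, Any]) -> Dict[str, Any]:
    """Drop {"type": "text", "text": ""} parts from list-valued message content."""
    messages: List[Any] = request.get("messages", [])
    modified = False
    new_messages: List[Any] = []
    for msg in messages:
        content = msg.get("content") if isinstance(msg, dict) else None
        if isinstance(content, list):
            filtered = [
                p
                for p in content
                if not (
                    isinstance(p, dict)
                    and p.get("type") == "text"
                    and p.get("text") == ""
                )
            ]
            if len(filtered) != len(content):
                msg = dict(msg)
                # Fall back to "" (not []) when every part was empty.
                msg["content"] = filtered if filtered else ""
                modified = True
        new_messages.append(msg)
    if not modified:
        return request  # identity on the happy path
    result = dict(request)
    result["messages"] = new_messages
    return result
```

Note that an emptied content list collapses to `content=""` rather than `content=[]`, since some validators also reject empty content arrays.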
Tests
6 unit tests added in tests/test_litellm/responses/litellm_completion_transformation/test_strip_empty_text_content_parts.py: