
fix(bedrock): normalise streaming choice index=0 for extended-thinkin… #23248

Closed
awais786 wants to merge 1 commit into BerriAI:main from awais786:fix/bedrock-extended-thinking-stream-choice-index

Conversation

@awais786
Contributor

…g blocks (issue #23178)

When Claude extended thinking is enabled on Bedrock, the Converse API emits two content-block types in the same response:
contentBlockIndex=0 → reasoning / thinking block
contentBlockIndex=1 → text block

The existing converse_chunk_parser hardcodes StreamingChoices(index=0) for every event (the tool-calls fix from #22867), so normalisation is already in place for the converse path. The AmazonAnthropicClaudeStreamDecoder (invoke/anthropic path) likewise always sets index=0 via AnthropicModelResponseIterator.chunk_parser.

This commit adds explicit regression tests for both paths covering the full thinking-block event sequence (start, delta, signature, stop) and the subsequent text-block events that arrive on contentBlockIndex=1, ensuring choices[0].index is always 0 and OpenAI-compatible clients do not crash.
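To make the normalisation concrete, here is a minimal, self-contained sketch. This is not LiteLLM's actual implementation; normalise_chunk and its event keys are hypothetical stand-ins showing why a single-choice stream must pin the choice index to 0 even though Bedrock tags the blocks with contentBlockIndex 0 and 1:

```python
# Illustrative sketch (hypothetical names, not LiteLLM's real code):
# Bedrock tags thinking and text blocks with contentBlockIndex 0 and 1,
# but OpenAI-style clients accumulate deltas per choice index, so a
# single-choice stream must always report choices[0].index == 0.

def normalise_chunk(event: dict) -> dict:
    """Map a simplified Bedrock stream event to an OpenAI-style chunk."""
    delta = {}
    if "reasoning_text" in event:   # contentBlockIndex=0 (thinking block)
        delta["reasoning_content"] = event["reasoning_text"]
    if "text" in event:             # contentBlockIndex=1 (text block)
        delta["content"] = event["text"]
    # Bedrock's contentBlockIndex is deliberately NOT forwarded:
    # a single-choice stream always uses choice index 0.
    return {"choices": [{"index": 0, "delta": delta}]}


chunks = [
    {"contentBlockIndex": 0, "reasoning_text": "I should multiply."},
    {"contentBlockIndex": 1, "text": "27 x 453 = 12231"},
]
assert all(normalise_chunk(c)["choices"][0]["index"] == 0 for c in chunks)
```

Forwarding contentBlockIndex=1 as the choice index would make OpenAI-compatible clients treat the answer text as a second choice, which is the breakage these tests guard against.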

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details).
  • My PR passes all unit tests on make test-unit.
  • My PR's scope is as isolated as possible; it solves only 1 specific problem.
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review.

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

…g blocks (issue BerriAI#23178)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel bot commented Mar 10, 2026

The latest updates on your projects.

Project: litellm
Deployment: Error
Actions: Error
Updated (UTC): Mar 10, 2026 9:47am


@greptile-apps
Contributor

greptile-apps bot commented Mar 10, 2026

Greptile Summary

This PR adds regression tests to verify that Bedrock streaming responses always normalise choices[0].index to 0 when Claude extended-thinking is enabled — covering both the converse path (AWSEventStreamDecoder.converse_chunk_parser) and the invoke/anthropic path (AmazonAnthropicClaudeStreamDecoder._chunk_parser). No production code is modified; the existing normalisation (index=0 is hardcoded in both paths) was already in place from prior fixes.

Key observations:

  • The three new tests (test_thinking_block_chunks_use_choice_index_zero, test_thinking_signature_chunk_uses_choice_index_zero, test_anthropic_invoke_decoder_thinking_uses_choice_index_zero) are correct unit tests with no real network calls, satisfying the mock-only test requirement for this folder.
  • All tested assertions should pass against the current production code, confirming these are pure regression/documentation tests.
  • The PR title and description use "fix" language, but no production code is changed. The actual normalisation fix was shipped in prior PRs (#22867, "fix(bedrock,azure_ai): strip scope from cache_control for Anthropic messages", for the converse path; the Anthropic iterator hardcodes index = 0 on line 688 of handler.py). This PR's value is in formalising regression coverage.
  • The "Relevant Issues" section in the PR description is left empty. Issue #23178 ("Bedrock Claude extended thinking streams mixed choice indices (0 and 1), breaking OpenAI compatibility") is referenced only in code comments and docstrings, not as a formal Fixes #23178 link. Per project guidelines, PRs claiming to fix an issue should include traceable evidence in the description body.
  • The pre-submission checklist is entirely unchecked (all [ ]), including the testing item, despite tests being present.
  • test_anthropic_invoke_decoder_thinking_uses_choice_index_zero only validates index == 0 per chunk and omits the content-value assertions that the converse-path test includes for comparable chunks. This makes it a weaker guard against value-propagation regressions in AmazonAnthropicClaudeStreamDecoder.

Confidence Score: 4/5

  • Safe to merge — only adds regression tests, no production code is modified.
  • All new tests are pure unit tests with no network calls, the logic is correct, and no existing behaviour is altered. Score is 4 rather than 5 primarily because the PR description lacks a formal "Fixes #23178" issue link and the pre-submission checklist is entirely unchecked, which are process gaps the team typically requires before merge. The test_anthropic_invoke_decoder_thinking_uses_choice_index_zero test would also benefit from content-value assertions to serve as a stronger regression guard.
  • No production files were changed. tests/test_litellm/llms/bedrock/chat/test_streaming_choice_index.py could be strengthened with content-value assertions in the invoke-path test.

Important Files Changed

Filename: tests/test_litellm/llms/bedrock/chat/test_streaming_choice_index.py
Overview: Adds three regression tests for choices[0].index == 0 normalisation during extended-thinking (reasoning) streaming, covering both the converse path (AWSEventStreamDecoder.converse_chunk_parser) and the invoke/anthropic path (AmazonAnthropicClaudeStreamDecoder._chunk_parser). No production code is changed; all tested behaviour was already in place. Tests are purely unit tests with no real network calls.
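The test pattern the file follows can be sketched as below. Here parse_chunk is a hypothetical stand-in for the real decoder methods, and the event dicts are simplified, so this is illustrative only:

```python
# Hedged sketch of the regression-test pattern: feed the full
# thinking-block event sequence (index 0) and the subsequent
# text-block events (index 1), asserting the choice index stays 0.

def parse_chunk(event: dict) -> dict:
    # Stand-in for the real chunk parsers, which hardcode index=0.
    return {"choices": [{"index": 0, "delta": {"content": event.get("text", "")}}]}

EVENTS = [
    {"contentBlockIndex": 0, "type": "start"},                      # thinking start
    {"contentBlockIndex": 0, "type": "delta", "text": "thinking"},  # thinking delta
    {"contentBlockIndex": 0, "type": "stop"},                       # thinking stop
    {"contentBlockIndex": 1, "type": "start"},                      # text start
    {"contentBlockIndex": 1, "type": "delta", "text": "answer"},    # text delta
    {"contentBlockIndex": 1, "type": "stop"},                       # text stop
]

def test_choice_index_is_always_zero():
    for event in EVENTS:
        result = parse_chunk(event)
        assert result["choices"][0]["index"] == 0, (
            f"contentBlockIndex={event['contentBlockIndex']} "
            f"leaked into choices[0].index"
        )

test_choice_index_is_always_zero()
```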

Sequence Diagram

sequenceDiagram
    participant Client as OpenAI-compat Client
    participant LiteLLM as LiteLLM Streaming Layer
    participant Bedrock as Bedrock API

    Note over Bedrock,LiteLLM: Extended-thinking response stream

    Bedrock->>LiteLLM: contentBlockIndex=0, start {reasoningContent}
    LiteLLM->>Client: choices[0].index=0 (thinking start)

    Bedrock->>LiteLLM: contentBlockIndex=0, delta {reasoningContent.text}
    LiteLLM->>Client: choices[0].index=0 (thinking delta → reasoning_content)

    Bedrock->>LiteLLM: contentBlockIndex=0, delta {reasoningContent.signature}
    LiteLLM->>Client: choices[0].index=0 (signature → thinking_blocks)

    Bedrock->>LiteLLM: contentBlockIndex=0 (stop)
    LiteLLM->>Client: choices[0].index=0 (thinking stop)

    Note over Bedrock,LiteLLM: ⚠️ Previously broken: index=1 was forwarded

    Bedrock->>LiteLLM: contentBlockIndex=1, start {} (text block)
    LiteLLM->>Client: choices[0].index=0 (normalised ✓)

    Bedrock->>LiteLLM: contentBlockIndex=1, delta {text}
    LiteLLM->>Client: choices[0].index=0 (normalised ✓)

    Bedrock->>LiteLLM: contentBlockIndex=1 (stop)
    LiteLLM->>Client: choices[0].index=0 (normalised ✓)

    Bedrock->>LiteLLM: stopReason=end_turn
    LiteLLM->>Client: choices[0].index=0 (finish)

Last reviewed commit: 5521f95

Comment on lines +244 to +248
result = handler._chunk_parser(chunk)
assert result.choices[0].index == 0, (
    f"chunk type={chunk.get('type')} index={chunk.get('index')} "
    f"produced choices[0].index={result.choices[0].index}, expected 0"
)

Missing assertion on thinking/content delta values

The test_anthropic_invoke_decoder_thinking_uses_choice_index_zero test only validates choices[0].index == 0 for all chunks, including the content_block_delta chunks that carry actual thinking text ("I should multiply.") and text content ("27 x 453 = 12231"). The existing test_thinking_block_chunks_use_choice_index_zero counterpart already asserts delta.reasoning_content and delta.content on comparable chunks.

For completeness and to catch regressions in value propagation through AmazonAnthropicClaudeStreamDecoder, consider adding content assertions for the delta chunks, e.g.:

for chunk in anthropic_chunks:
    result = handler._chunk_parser(chunk)
    assert result.choices[0].index == 0, (
        f"chunk type={chunk.get('type')} index={chunk.get('index')} "
        f"produced choices[0].index={result.choices[0].index}, expected 0"
    )
    # Validate that thinking delta propagates reasoning_content
    if chunk.get("type") == "content_block_delta" and chunk.get("index") == 0:
        assert result.choices[0].delta.reasoning_content is not None
    # Validate that text delta propagates content
    if chunk.get("type") == "content_block_delta" and chunk.get("index") == 1:
        assert result.choices[0].delta.content == "27 x 453 = 12231"
