feat(bedrock): support native structured outputs for Invoke API (Claude 4.5+) #23778

ndgigliotti wants to merge 2 commits into BerriAI:main
Conversation
Greptile Summary

This PR adds native structured output support (via Bedrock's output_config.format) for Claude 4.5+ models on the Bedrock InvokeModel API path. Key changes:

Outstanding concern: the hardcoded BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS set.

Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/llms/bedrock/chat/invoke_transformations/anthropic_claude3_transformation.py | Core implementation of native structured output support for Bedrock InvokeModel; uses a hardcoded model set (acknowledged in PR discussion) and an opaque model-name override hack for the tool-based fallback path (also acknowledged). Logic is sound and well-tested. |
| litellm/llms/bedrock/common_utils.py | Correctly extracts add_additional_properties_to_schema into a shared utility, resolving the cross-class private-method dependency. Implementation is identical to the original and functionally correct. |
| litellm/llms/bedrock/chat/converse_transformation.py | Clean refactor: _add_additional_properties_to_schema now delegates to the shared common_utils implementation, maintaining backward compatibility for existing callers. |
| tests/test_litellm/llms/bedrock/chat/invoke_transformations/test_bedrock_chat_invoke_transformations_anthropic_claude3_transformation.py | 8 solid unit tests covering model detection, native vs fallback paths, end-to-end transform, schema normalization, and json_object fallback — all mock-based, appropriate for the CI folder. |
| tests/llm_translation/test_bedrock_completion.py | The added test_bedrock_invoke_native_structured_output makes live AWS network calls with no mocking, violating the project policy that only mock-based tests can be added to this folder (this was flagged in a previous review thread and remains unresolved). |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["map_openai_params(response_format, model)"] --> B{response_format is dict\nAND model is in\nBEDROCK_INVOKE_NATIVE_\nSTRUCTURED_OUTPUT_MODELS?}
    B -- Yes --> C["map_response_format_to_\nanthropic_output_format()"]
    C --> D{output_format\nis not None?}
    D -- Yes --> E["Set optional_params\n[output_format, json_mode=True]\nCall parent without response_format"]
    D -- No --> F
    B -- No --> F{response_format\nis not None?}
    F -- Yes --> G["Override model=\n'claude-3-sonnet-20240229'\n(tool-based fallback)"]
    G --> H["AnthropicConfig.map_openai_params\n(with original response_format)"]
    F -- No --> H
    E --> I["transform_request()"]
    H --> I
    I --> J["AnthropicConfig.transform_request()"]
    J --> K{output_format present\nin anthropic_request\nAND type==json_schema?}
    K -- Yes --> L["add_additional_properties\n_to_schema(schema)\nBuild output_config.format"]
    L --> M["Set request\noutput_config.format"]
    K -- No --> N["Pop output_config\nentirely from request\n(Bedrock rejects extraneous keys)"]
```
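The dispatch above can be sketched in a few lines. This is a simplified stand-in, not litellm's real code: the model set, helper names, and the `used_tool_fallback` marker are illustrative assumptions.

```python
# Simplified sketch of the flowchart's dispatch; names and the model set are
# illustrative stand-ins for the real litellm implementation.
NATIVE_MODELS = {"claude-haiku-4-5", "claude-sonnet-4-5", "claude-opus-4-5"}

def to_anthropic_output_format(response_format: dict):
    # Only json_schema maps to a native output_format; json_object returns None.
    if response_format.get("type") == "json_schema":
        return {"type": "json_schema",
                "schema": response_format["json_schema"]["schema"]}
    return None

def map_openai_params(response_format, model: str, optional_params: dict) -> dict:
    if isinstance(response_format, dict) and any(m in model for m in NATIVE_MODELS):
        output_format = to_anthropic_output_format(response_format)
        if output_format is not None:
            # Native path: pass output_format through and skip tool injection.
            optional_params["output_format"] = output_format
            optional_params["json_mode"] = True
            return optional_params
    if response_format is not None:
        # Fallback: an old model name steers the parent class into the
        # tool-based structured-output path.
        model = "claude-3-sonnet-20240229"
        optional_params["used_tool_fallback"] = True  # marker for illustration
    return optional_params
```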
Last reviewed commit: f7287f0
```python
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS = {
    "claude-haiku-4-5",
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-opus-4-5",
    "claude-opus-4-6",
}
```
Hardcoded model list violates project policy
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS is a hardcoded set of model name substrings, which means every time AWS adds a new Claude model that supports native structured outputs on the Invoke API, users must upgrade LiteLLM to get support.
The project convention (also violated by the existing BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS in converse_transformation.py) is to store these flags in model_prices_and_context_window.json and read them via get_model_info. This lets users pick up new model support without an SDK upgrade.
The recommended fix is to:

- Add a "supports_bedrock_invoke_structured_outputs": true key to each model entry in model_prices_and_context_window.json
- Replace _supports_native_structured_outputs with a lookup through get_model_info (similar to how supports_reasoning is used for the reasoning effort feature)
```python
# Instead of:
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS = {
    "claude-haiku-4-5",
    "claude-sonnet-4-5",
    ...
}

# Do something like:
from litellm.utils import get_model_info

def _supports_native_structured_outputs(model: str) -> bool:
    try:
        info = get_model_info(model=model, custom_llm_provider="bedrock")
        return bool(info.get("supports_bedrock_invoke_structured_outputs"))
    except Exception:
        return False
```

Note: the same issue exists in converse_transformation.py's BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS, but that's pre-existing. Fixing it here would be a good opportunity to align with the policy.
Rule Used: What: Do not hardcode model-specific flags in the ... (source)
Valid point. The model_prices_and_context_window.json approach would be cleaner long-term, but there are a couple of blockers for this PR:
- The Invoke path doesn't have its own model entries in the JSON -- there's only one bedrock/invoke/ entry and it's for an old model. The lookup would need to strip the bedrock/invoke/ prefix, handle inference profile IDs (us.anthropic.claude-sonnet-4-6), and fall back to the base Bedrock entry. That model ID resolution logic doesn't exist yet.
- The existing Converse path (BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS) uses the same hardcoded set pattern.
Happy to follow up with a separate PR to migrate both Invoke and Converse sets to JSON lookups if the maintainers prefer that approach.
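The model ID resolution the reply describes could look roughly like this. A hypothetical sketch only: the function name and the set of region prefixes are assumptions, and the real lookup would also need to consult model_prices_and_context_window.json.

```python
# Hypothetical sketch of the missing model-ID resolution: strip the
# bedrock/invoke/ prefix and any region-scoped inference profile prefix
# (e.g. "us.") before falling back to the base Bedrock entry.
def resolve_base_model(model: str) -> str:
    if model.startswith("bedrock/invoke/"):
        model = model[len("bedrock/invoke/"):]
    # Inference profile IDs look like "us.anthropic.claude-sonnet-4-6".
    for prefix in ("us.", "eu.", "apac."):
        if model.startswith(prefix):
            model = model[len(prefix):]
            break
    return model
```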
```python
else:
    # Non-native path: strip output_config entirely
    # Fixes: https://github.com/BerriAI/litellm/issues/22797
    _anthropic_request.pop("output_config", None)
```
output_config stripped unconditionally on non-native path
The else branch removes output_config from the request regardless of what put it there. This means that if a user passes both reasoning_effort (which sets output_config.effort) and a response_format on a model that does not support native structured outputs, the output_config (with the effort key) will be silently dropped.
Concretely, if Bedrock ever supports output_config.effort on the Invoke API, this else-branch will silently discard it for any mixed-mode request. Even today, if AnthropicConfig.transform_request populates output_config from other optional params, it would be wiped here.
The if branch already handles the merge case correctly (it calls _anthropic_request.get("output_config") or {} and merges format in), so the else branch should at minimum avoid stripping keys it didn't set. Consider only removing the format key (or the whole output_config only when it came from an output_format source):
```python
else:
    # Non-native path: remove only the format key that we never populated.
    # Leave any other output_config keys (e.g. effort) intact.
    output_config = _anthropic_request.get("output_config")
    if output_config and "format" in output_config:
        output_config.pop("format")
    if not output_config:
        _anthropic_request.pop("output_config", None)
```

If the intent is truly that Bedrock Invoke does not support output_config at all (as fixed in #22797), then at least add a comment explaining that any future output_config keys (like effort) should be handled here explicitly.
The suggestion to only strip the format key won't work here. Bedrock Invoke rejects the output_config key itself for unsupported models -- not just sub-keys inside it. The error from #22797 is: extraneous key [output_config] is not permitted. So any output_config content (including effort) would cause a 400 on unsupported models.
Added a comment explaining this.
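A minimal sketch of that behavior, assuming the request-dict shape used in the transform (the helper name is hypothetical):

```python
# Hypothetical helper illustrating the non-native branch: Bedrock Invoke
# rejects the output_config key itself on unsupported models
# ("extraneous key [output_config] is not permitted", per issue #22797),
# so the whole key is removed -- dropping only sub-keys such as "format"
# would still leave a 400-triggering payload if e.g. "effort" remained.
def strip_output_config(anthropic_request: dict) -> dict:
    anthropic_request.pop("output_config", None)
    return anthropic_request
```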
```python
# Fallback: force tool-based structured outputs for unsupported models
# (or json_object without schema on a supported model).
if response_format is not None:
    model = "claude-3-sonnet-20240229"
```
Fallback silently overrides model for tool injection
Setting model = "claude-3-sonnet-20240229" is a local variable override — it tricks the parent's map_openai_params into choosing the tool-based path by selecting an old model name that is known not to support native outputs. This is a subtle and fragile approach.
If the parent's native-supported model set ever changes (e.g., adds "claude-3-sonnet-20240229" to the native list — unlikely but possible), this fallback would break silently. A more explicit approach would be to directly call map_response_format_to_anthropic_tool on the fallback path rather than relying on an opaque model override.
Agreed this is fragile. It's the pre-existing pattern from the code this PR refactored -- the previous implementation also overrode the model name the same way. Calling map_response_format_to_anthropic_tool directly would be cleaner, but that method also handles tool_choice injection and thinking-mode checks that are coupled to the parent's internal state. Extracting just the tool-based path without duplicating logic would require refactoring the parent class, which is out of scope here.
Open to revisiting if the maintainers want to refactor the parent class.
…de 4.5+)

For Bedrock InvokeModel Claude models that support native structured outputs (Haiku 4.5, Sonnet 4.5, Opus 4.5, Opus 4.6), use output_config.format with json_schema instead of the synthetic json_tool_call workaround. Unsupported models automatically fall back to the existing tool-call approach. Completes the Invoke API portion of BerriAI#21208 (Converse was merged in BerriAI#21222).
3d00984 to af16f12
```python
from litellm.llms.bedrock.chat.converse_transformation import (
    AmazonConverseConfig,
)
```
Cross-class use of a private static method creates fragile coupling
AmazonConverseConfig._add_additional_properties_to_schema is a _-prefixed (private) static method on the Converse transformer. Calling it directly from the Invoke transformer creates an implicit dependency between two sibling classes: if the method is ever renamed, moved, or its contract changed, the Invoke path will break silently at runtime rather than at import time.
The cleaner fix is to extract the utility into a shared location (e.g., litellm/llms/bedrock/common_utils.py) so both transformers can import it without either depending on the other.
```python
# In bedrock/common_utils.py
def add_additional_properties_to_schema(schema: dict) -> dict:
    """Recursively ensure all object types have additionalProperties: false."""
    ...
```

Then update both converse_transformation.py and anthropic_claude3_transformation.py to import from the shared location.
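A self-contained sketch of what that shared helper could look like. The recursion over properties and items is an assumption about the real litellm implementation, which may cover more schema shapes:

```python
# Sketch of the proposed shared utility for bedrock/common_utils.py; the
# exact recursion cases in litellm's version may differ.
def add_additional_properties_to_schema(schema: dict) -> dict:
    """Recursively ensure all object types set additionalProperties: false."""
    if schema.get("type") == "object":
        schema.setdefault("additionalProperties", False)
    # Recurse into nested object properties.
    for prop in schema.get("properties", {}).values():
        if isinstance(prop, dict):
            add_additional_properties_to_schema(prop)
    # Recurse into array item schemas.
    items = schema.get("items")
    if isinstance(items, dict):
        add_additional_properties_to_schema(items)
    return schema
```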
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
- Move _add_additional_properties_to_schema from AmazonConverseConfig to bedrock/common_utils.py as a shared top-level function. Both Converse and Invoke paths now import from the shared location, removing the cross-class private method dependency.
- Remove 65-line commented-out test block (test_structured_outputs_beta_header_filtered_for_bedrock_invoke).
- Update stale docstring on test_output_config_removed_from_bedrock_chat_invoke_request to reflect that native-path models now keep output_config.
```python
@pytest.mark.parametrize(
    "model",
    [
        "bedrock/invoke/us.anthropic.claude-haiku-4-5-20251001-v1:0",
        "bedrock/invoke/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "bedrock/invoke/us.anthropic.claude-sonnet-4-6",
        "bedrock/invoke/us.anthropic.claude-opus-4-5-20251101-v1:0",
        "bedrock/invoke/us.anthropic.claude-opus-4-6-v1",
    ],
)
def test_bedrock_invoke_native_structured_output(model):
    """
    Integration test: verify native structured outputs via Bedrock InvokeModel
    for each supported Claude model.

    Uses output_config.format (Bedrock InvokeModel native API) instead of
    the synthetic json_tool_call workaround.
    """
    response = completion(
        model=model,
        messages=[
            {
                "role": "user",
                "content": "Classify the sentiment of 'I love this product!' as positive, negative, or neutral.",
            }
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "sentiment",
                "schema": {
                    "type": "object",
                    "properties": {
                        "sentiment": {
                            "type": "string",
                            "enum": ["positive", "negative", "neutral"],
                        }
                    },
                    "required": ["sentiment"],
                    "additionalProperties": False,
                },
            },
        },
        max_tokens=100,
    )
    print(f"Response for {model}: {response}")

    # Validate the response is a ModelResponse with valid JSON content
    assert isinstance(response, ModelResponse)
    content = response.choices[0].message.content
    assert content is not None

    parsed = json.loads(content)
    assert "sentiment" in parsed
    assert parsed["sentiment"] in ["positive", "negative", "neutral"]
```
Real network call in mock-only folder
test_bedrock_invoke_native_structured_output makes a live call to AWS Bedrock (completion(model=model, ...)) with no @patch or mock applied. The project policy for tests/llm_translation/ is that only mock-based tests may be added, to ensure the suite passes on GitHub CI and for all developers locally without AWS credentials.
This test should either be moved to an integration-test directory that is excluded from the standard CI run, or converted to use unittest.mock.patch to intercept the HTTP call and return a canned response. A typical approach used elsewhere in this file:
```python
@pytest.mark.parametrize("model", [...])
@patch("litellm.llms.custom_httpx.http_handler.HTTPHandler.post")
def test_bedrock_invoke_native_structured_output(mock_post, model):
    mock_response = Mock()
    mock_response.json.return_value = {...}  # canned Bedrock response
    mock_response.status_code = 200
    mock_post.return_value = mock_response
    ...
```

Rule Used: What: prevent any tests from being added here that... (source)
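The pattern can be demonstrated standalone. HTTPHandler and invoke_structured_output below are hypothetical stand-ins for litellm's HTTP layer and the code path under test; the point is that patch.object keeps the network call out of CI entirely:

```python
import json
from unittest.mock import Mock, patch

class HTTPHandler:
    """Stand-in for litellm's HTTP layer (the real class lives in
    litellm.llms.custom_httpx.http_handler)."""
    def post(self, url, data=None):
        raise RuntimeError("real network call -- must never run in CI")

def invoke_structured_output(handler, payload):
    # Stand-in for the Bedrock Invoke code path under test.
    resp = handler.post("https://bedrock.example/invoke", data=json.dumps(payload))
    return resp.json()

def test_invoke_uses_canned_response():
    canned = {"content": [{"text": '{"sentiment": "positive"}'}]}
    with patch.object(HTTPHandler, "post") as mock_post:
        mock_resp = Mock()
        mock_resp.status_code = 200
        mock_resp.json.return_value = canned
        mock_post.return_value = mock_resp
        out = invoke_structured_output(HTTPHandler(), {"type": "json_schema"})
    assert out == canned
    mock_post.assert_called_once()
```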
Hardcoded model list should live in model_prices_and_context_window.json
Per project convention, model-capability flags should be stored in model_prices_and_context_window.json and read via get_model_info, not hardcoded here. Hardcoding means users must upgrade LiteLLM every time AWS adds a new Claude model to the Invoke native structured-output feature set.
The same pattern exists for the Converse path (BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS); both should eventually be migrated. The PR author has already noted this as a follow-up concern in the discussion thread.
Rule Used: What: Do not hardcode model-specific flags in the ... (source)
Live network call in mock-only test folder
test_bedrock_invoke_native_structured_output issues a real completion() call to AWS Bedrock with no @patch decorator. The project policy for tests/llm_translation/ is that only mock-based tests may live here, so that the suite passes on GitHub CI and for every developer without AWS credentials.
This test will fail for any contributor without live AWS access. It should either be:
- Moved to an integration-test directory that is gated on AWS credentials, or
- Converted to use unittest.mock.patch (the same approach used extensively elsewhere in this file), intercepting the HTTP call and returning a canned Bedrock response body.
Rule Used: What: prevent any tests from being added here that... (source)
Relevant issues
Fixes #21208
Related: #19652, #22797
Pre-Submission checklist
tests/test_litellm/directorymake test-unit@greptileaiand received a Confidence Score of at least 4/5 before requesting a maintainer reviewType
New Feature
Changes
Adds native structured output support for Claude 4.5+ models on the Bedrock InvokeModel API path. Previously, all Bedrock Invoke structured output requests were forced through a synthetic tool-call workaround (injecting a fake tool and extracting the arguments). Models that support Bedrock's output_config.format parameter now use it directly.

Supported models
How it works
- map_openai_params checks whether the model supports native structured outputs
- If a json_schema response format is provided, it builds an Anthropic output_format and passes it through (instead of injecting a tool)
- transform_request converts the Anthropic output_format into Bedrock's output_config.format, with recursive additionalProperties: false normalization (required by Bedrock)
- Requests using json_object without a schema fall back to the existing tool-call approach

Why a separate model set from Converse
The Invoke and Converse APIs are different Bedrock endpoints with independent feature rollouts.
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS is maintained separately from the Converse path's BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS to allow them to diverge.

Tests