
feat(bedrock): support native structured outputs for Invoke API (Claude 4.5+)#23778

Draft
ndgigliotti wants to merge 2 commits into BerriAI:main from ndgigliotti:bedrock-invoke-structured-output

Conversation

@ndgigliotti
Contributor

@ndgigliotti ndgigliotti commented Mar 16, 2026

Relevant issues

Fixes #21208
Related: #19652, #22797

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

New Feature

Changes

Adds native structured output support for Claude 4.5+ models on the Bedrock InvokeModel API path. Previously, all Bedrock Invoke structured output requests were forced through a synthetic tool-call workaround (injecting a fake tool and extracting the arguments). Models that support Bedrock's output_config.format parameter now use it directly.
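For context, here is a minimal sketch of the mapping this enables. The request-body shape is assumed from the PR description (`output_config.format`), not taken from Bedrock's API reference, so treat the field layout as illustrative:

```python
# An OpenAI-style structured-output request, as a LiteLLM user would pass it:
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "sentiment",
        "schema": {
            "type": "object",
            "properties": {"sentiment": {"type": "string"}},
            "required": ["sentiment"],
            "additionalProperties": False,
        },
    },
}

# On supported Claude 4.5+ models, the Invoke request body now carries the
# schema natively (assumed shape). Previously, a synthetic tool definition
# was injected and its arguments extracted from the tool call.
output_config = {
    "format": {
        "type": "json_schema",
        "schema": response_format["json_schema"]["schema"],
    }
}
```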

Supported models

  • Claude Haiku 4.5
  • Claude Sonnet 4.5
  • Claude Sonnet 4.6
  • Claude Opus 4.5
  • Claude Opus 4.6

How it works

  1. map_openai_params checks whether the model supports native structured outputs
  2. If supported and a json_schema response format is provided, it builds an Anthropic output_format and passes it through (instead of injecting a tool)
  3. transform_request converts the Anthropic output_format into Bedrock's output_config.format, with recursive additionalProperties: false normalization (required by Bedrock)
  4. Unsupported models and json_object without a schema fall back to the existing tool-call approach
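The dispatch in steps 1-4 can be sketched as follows. The model set is copied from this PR; the helper names (`supports_native_structured_outputs`, `choose_path`) are illustrative, not LiteLLM's actual internals:

```python
from typing import Optional

# Model set from this PR; matched by substring so inference-profile IDs
# like "us.anthropic.claude-sonnet-4-6" are also recognized.
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS = {
    "claude-haiku-4-5",
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-opus-4-5",
    "claude-opus-4-6",
}

def supports_native_structured_outputs(model: str) -> bool:
    return any(m in model for m in BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS)

def choose_path(model: str, response_format: Optional[dict]) -> str:
    if (
        response_format is not None
        and response_format.get("type") == "json_schema"
        and supports_native_structured_outputs(model)
    ):
        return "native-output-config"  # steps 2-3: build output_config.format
    if response_format is not None:
        return "tool-call-fallback"    # step 4: synthetic tool injection
    return "passthrough"               # no structured output requested
```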

Why a separate model set from Converse

The Invoke and Converse APIs are different Bedrock endpoints with independent feature rollouts. BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS is maintained separately from the Converse path's BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS to allow them to diverge.

Tests

  • 8 unit tests covering: model support detection, native vs fallback paths, end-to-end transform_request, schema normalization, and json_object fallback
  • 1 parametrized integration test across all 5 supported models (all passing against live Bedrock)
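The schema normalization exercised by these tests (step 3 above) can be sketched as below. The real helper lives in `bedrock/common_utils.py`; this version only illustrates the recursive traversal and is not the actual implementation:

```python
# Illustrative sketch: recursively ensure every object schema carries
# additionalProperties: false, as Bedrock requires. Only sets the key
# where it is absent, leaving explicit user values untouched.
def add_additional_properties_to_schema(schema: dict) -> dict:
    if schema.get("type") == "object" and "additionalProperties" not in schema:
        schema["additionalProperties"] = False
    # Recurse into nested property schemas and array item schemas.
    for sub in schema.get("properties", {}).values():
        if isinstance(sub, dict):
            add_additional_properties_to_schema(sub)
    items = schema.get("items")
    if isinstance(items, dict):
        add_additional_properties_to_schema(items)
    return schema
```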


@ndgigliotti
Contributor Author

@greptileai

@codspeed-hq
Contributor

codspeed-hq bot commented Mar 16, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing ndgigliotti:bedrock-invoke-structured-output (f7287f0) with main (3dccdde)


@greptile-apps
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR adds native structured output support (via Bedrock's output_config.format) for Claude 4.5+ models on the Bedrock InvokeModel API path, replacing the previous synthetic tool-call workaround for supported models. It also correctly refactors _add_additional_properties_to_schema out of AmazonConverseConfig and into bedrock/common_utils.py, resolving the cross-class private-method coupling flagged in the previous review round.

Key changes:

  • AmazonAnthropicClaudeConfig.map_openai_params detects supported models and builds an Anthropic output_format directly, bypassing the tool-injection hack; unsupported models and json_object-without-schema fall back to the existing model-name override approach.
  • AmazonAnthropicClaudeConfig.transform_request converts the Anthropic output_format into Bedrock's output_config.format with recursive additionalProperties: false normalization, and strips output_config entirely for unsupported models (required because Bedrock rejects the key itself for those models).
  • add_additional_properties_to_schema is now a module-level function in common_utils.py; AmazonConverseConfig._add_additional_properties_to_schema delegates to it for backward compatibility.
  • 8 focused mock-based unit tests cover the new logic end-to-end.

Outstanding concern: test_bedrock_invoke_native_structured_output in tests/llm_translation/test_bedrock_completion.py makes live AWS network calls without any mocking, violating the folder's mock-only policy. This will break CI for contributors without AWS credentials and should be converted to a mocked test or moved to an integration-test directory.

The hardcoded BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS set and the model-name override fallback were acknowledged in the previous review discussion; both are pre-existing patterns and the author has offered to follow up in a separate PR.

Confidence Score: 3/5

  • Safe to merge once the live-network test in tests/llm_translation/ is either mocked or relocated; the feature logic itself is correct and well-tested.
  • The implementation is sound and all unit tests are mock-based and passing. The score is held at 3 because test_bedrock_invoke_native_structured_output in tests/llm_translation/test_bedrock_completion.py makes real AWS calls with no mock, violating the project's CI policy for that folder — it will silently fail or be skipped for any developer without live AWS credentials, which undermines the test suite's reliability guarantee. The hardcoded model list is a known concern but has been acknowledged with a follow-up plan.
  • tests/llm_translation/test_bedrock_completion.py — the new integration test needs to be mocked or moved before merge.

Important Files Changed

Filename Overview
litellm/llms/bedrock/chat/invoke_transformations/anthropic_claude3_transformation.py Core implementation of native structured output support for Bedrock InvokeModel; uses a hardcoded model set (acknowledged in PR discussion) and an opaque model-name override hack for the tool-based fallback path (also acknowledged). Logic is sound and well-tested.
litellm/llms/bedrock/common_utils.py Correctly extracts add_additional_properties_to_schema into a shared utility, resolving the cross-class private-method dependency. Implementation is identical to the original and functionally correct.
litellm/llms/bedrock/chat/converse_transformation.py Clean refactor: _add_additional_properties_to_schema now delegates to the shared common_utils implementation, maintaining backward compatibility for existing callers.
tests/test_litellm/llms/bedrock/chat/invoke_transformations/test_bedrock_chat_invoke_transformations_anthropic_claude3_transformation.py 8 solid unit tests covering model detection, native vs fallback paths, end-to-end transform, schema normalization, and json_object fallback — all mock-based, appropriate for the CI folder.
tests/llm_translation/test_bedrock_completion.py The added test_bedrock_invoke_native_structured_output makes live AWS network calls with no mocking, violating the project policy that only mock-based tests can be added to this folder (this was flagged in a previous review thread and remains unresolved).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["map_openai_params(response_format, model)"] --> B{response_format is dict\nAND model is in\nBEDROCK_INVOKE_NATIVE_\nSTRUCTURED_OUTPUT_MODELS?}
    B -- Yes --> C["map_response_format_to_\nanthropic_output_format()"]
    C --> D{output_format\nis not None?}
    D -- Yes --> E["Set optional_params\n[output_format, json_mode=True]\nCall parent without response_format"]
    D -- No --> F
    B -- No --> F{response_format\nis not None?}
    F -- Yes --> G["Override model=\n'claude-3-sonnet-20240229'\n(tool-based fallback)"]
    G --> H["AnthropicConfig.map_openai_params\n(with original response_format)"]
    F -- No --> H

    E --> I["transform_request()"]
    H --> I

    I --> J["AnthropicConfig.transform_request()"]
    J --> K{output_format present\nin anthropic_request\nAND type==json_schema?}
    K -- Yes --> L["add_additional_properties\n_to_schema(schema)\nBuild output_config.format"]
    L --> M["Set request\noutput_config.format"]
    K -- No --> N["Pop output_config\nentirely from request\n(Bedrock rejects extraneous keys)"]

Last reviewed commit: f7287f0

Comment on lines +31 to +37
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS = {
    "claude-haiku-4-5",
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-opus-4-5",
    "claude-opus-4-6",
}
Contributor

Hardcoded model list violates project policy

BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS is a hardcoded set of model name substrings, which means every time AWS adds a new Claude model that supports native structured outputs on the Invoke API, users must upgrade LiteLLM to get support.

The project convention (also violated by the existing BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS in converse_transformation.py) is to store these flags in model_prices_and_context_window.json and read them via get_model_info. This lets users pick up new model support without an SDK upgrade.

The recommended fix is to:

  1. Add a "supports_bedrock_invoke_structured_outputs": true key to each model entry in model_prices_and_context_window.json
  2. Replace _supports_native_structured_outputs with a lookup through get_model_info (similar to how supports_reasoning is used for the reasoning effort feature)
# Instead of:
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS = {
    "claude-haiku-4-5",
    "claude-sonnet-4-5",
    ...
}

# Do something like:
from litellm.utils import get_model_info

def _supports_native_structured_outputs(model: str) -> bool:
    try:
        info = get_model_info(model=model, custom_llm_provider="bedrock")
        return bool(info.get("supports_bedrock_invoke_structured_outputs"))
    except Exception:
        return False

Note: the same issue exists in converse_transformation.py's BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS, but that's pre-existing. Fixing it here would be a good opportunity to align with the policy.

Rule Used: What: Do not hardcode model-specific flags in the ... (source)

Contributor Author

Valid point. The model_prices_and_context_window.json approach would be cleaner long-term, but there are a couple of blockers for this PR:

  1. The Invoke path doesn't have its own model entries in the JSON -- there's only one bedrock/invoke/ entry and it's for an old model. The lookup would need to strip the bedrock/invoke/ prefix, handle inference profile IDs (us.anthropic.claude-sonnet-4-6), and fall back to the base Bedrock entry. That model ID resolution logic doesn't exist yet.
  2. The existing Converse path (BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS) uses the same hardcoded set pattern.

Happy to follow up with a separate PR to migrate both Invoke and Converse sets to JSON lookups if the maintainers prefer that approach.

Comment on lines +168 to +171
else:
    # Non-native path: strip output_config entirely
    # Fixes: https://github.com/BerriAI/litellm/issues/22797
    _anthropic_request.pop("output_config", None)
Contributor

output_config stripped unconditionally on non-native path

The else branch removes output_config from the request regardless of what put it there. This means that if a user passes both reasoning_effort (which sets output_config.effort) and a response_format on a model that does not support native structured outputs, the output_config (with the effort key) will be silently dropped.

Concretely, if Bedrock ever supports output_config.effort on the Invoke API, this else-branch will silently discard it for any mixed-mode request. Even today, if AnthropicConfig.transform_request populates output_config from other optional params, it would be wiped here.

The if branch already handles the merge case correctly (it calls _anthropic_request.get("output_config") or {} and merges format in), so the else branch should at minimum avoid stripping keys it didn't set. Consider only removing the format key (or the whole output_config only when it came from an output_format source):

else:
    # Non-native path: remove only the format key that we never populated.
    # Leave any other output_config keys (e.g. effort) intact.
    output_config = _anthropic_request.get("output_config")
    if output_config and "format" in output_config:
        output_config.pop("format")
        if not output_config:
            _anthropic_request.pop("output_config", None)

If the intent is truly that Bedrock Invoke does not support output_config at all (as fixed in #22797), then at least add a comment explaining that any future output_config keys (like effort) should be handled here explicitly.

Contributor Author

The suggestion to only strip the format key won't work here. Bedrock Invoke rejects the output_config key itself for unsupported models -- not just sub-keys inside it. The error from #22797 is: extraneous key [output_config] is not permitted. So any output_config content (including effort) would cause a 400 on unsupported models.

Added a comment explaining this.

Comment on lines +108 to 111
# Fallback: force tool-based structured outputs for unsupported models
# (or json_object without schema on a supported model).
if response_format is not None:
    model = "claude-3-sonnet-20240229"
Contributor

Fallback silently overrides model for tool injection

Setting model = "claude-3-sonnet-20240229" is a local variable override — it tricks the parent's map_openai_params into choosing the tool-based path by selecting an old model name that is known not to support native outputs. This is a subtle and fragile approach.

If the parent's native-supported model set ever changes (e.g., adds "claude-3-sonnet-20240229" to the native list — unlikely but possible), this fallback would break silently. A more explicit approach would be to directly call map_response_format_to_anthropic_tool on the fallback path rather than relying on an opaque model override.

Contributor Author

Agreed this is fragile. It's the pre-existing pattern from the code this PR refactored -- the previous implementation also overrode the model name the same way. Calling map_response_format_to_anthropic_tool directly would be cleaner, but that method also handles tool_choice injection and thinking-mode checks that are coupled to the parent's internal state. Extracting just the tool-based path without duplicating logic would require refactoring the parent class, which is out of scope here.

Open to revisiting if the maintainers want to refactor the parent class.

feat(bedrock): support native structured outputs for Invoke API (Claude 4.5+)

For Bedrock InvokeModel Claude models that support native structured outputs
(Haiku 4.5, Sonnet 4.5, Opus 4.5, Opus 4.6), use output_config.format with
json_schema instead of the synthetic json_tool_call workaround. Unsupported
models automatically fall back to the existing tool-call approach.

Completes the Invoke API portion of BerriAI#21208 (Converse was merged in BerriAI#21222).
Comment on lines +6 to +8
from litellm.llms.bedrock.chat.converse_transformation import (
AmazonConverseConfig,
)
Contributor

Cross-class use of a private static method creates fragile coupling

AmazonConverseConfig._add_additional_properties_to_schema is a _-prefixed (private) static method on the Converse transformer. Calling it directly from the Invoke transformer creates an implicit dependency between two sibling classes: if the method is ever renamed, moved, or its contract changed, the Invoke path will break silently at runtime rather than at import time.

The cleaner fix is to extract the utility into a shared location (e.g., litellm/llms/bedrock/common_utils.py) so both transformers can import it without either depending on the other.

# In bedrock/common_utils.py
def add_additional_properties_to_schema(schema: dict) -> dict:
    """Recursively ensure all object types have additionalProperties: false."""
    ...

Then update both converse_transformation.py and anthropic_claude3_transformation.py to import from the shared location.


- Move _add_additional_properties_to_schema from AmazonConverseConfig
  to bedrock/common_utils.py as a shared top-level function. Both
  Converse and Invoke paths now import from the shared location,
  removing the cross-class private method dependency.
- Remove 65-line commented-out test block
  (test_structured_outputs_beta_header_filtered_for_bedrock_invoke).
- Update stale docstring on
  test_output_config_removed_from_bedrock_chat_invoke_request to
  reflect that native-path models now keep output_config.
@ndgigliotti
Contributor Author

@greptileai

Comment on lines +3963 to +4017
@pytest.mark.parametrize(
    "model",
    [
        "bedrock/invoke/us.anthropic.claude-haiku-4-5-20251001-v1:0",
        "bedrock/invoke/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "bedrock/invoke/us.anthropic.claude-sonnet-4-6",
        "bedrock/invoke/us.anthropic.claude-opus-4-5-20251101-v1:0",
        "bedrock/invoke/us.anthropic.claude-opus-4-6-v1",
    ],
)
def test_bedrock_invoke_native_structured_output(model):
    """
    Integration test: verify native structured outputs via Bedrock InvokeModel
    for each supported Claude model.

    Uses output_config.format (Bedrock InvokeModel native API) instead of
    the synthetic json_tool_call workaround.
    """
    response = completion(
        model=model,
        messages=[
            {
                "role": "user",
                "content": "Classify the sentiment of 'I love this product!' as positive, negative, or neutral.",
            }
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "sentiment",
                "schema": {
                    "type": "object",
                    "properties": {
                        "sentiment": {
                            "type": "string",
                            "enum": ["positive", "negative", "neutral"],
                        }
                    },
                    "required": ["sentiment"],
                    "additionalProperties": False,
                },
            },
        },
        max_tokens=100,
    )
    print(f"Response for {model}: {response}")

    # Validate the response is a ModelResponse with valid JSON content
    assert isinstance(response, ModelResponse)
    content = response.choices[0].message.content
    assert content is not None

    parsed = json.loads(content)
    assert "sentiment" in parsed
    assert parsed["sentiment"] in ["positive", "negative", "neutral"]
Contributor

Real network call in mock-only folder

test_bedrock_invoke_native_structured_output makes a live call to AWS Bedrock (completion(model=model, ...)) with no @patch or mock applied. The project policy for tests/llm_translation/ is that only mock-based tests may be added, to ensure the suite passes on GitHub CI and for all developers locally without AWS credentials.

This test should either be moved to an integration-test directory that is excluded from the standard CI run, or converted to use unittest.mock.patch to intercept the HTTP call and return a canned response. A typical approach used elsewhere in this file:

@pytest.mark.parametrize("model", [...])
@patch("litellm.llms.custom_httpx.http_handler.HTTPHandler.post")
def test_bedrock_invoke_native_structured_output(mock_post, model):
    mock_response = Mock()
    mock_response.json.return_value = {...}   # canned Bedrock response
    mock_response.status_code = 200
    mock_post.return_value = mock_response
    ...

Rule Used: What: prevent any tests from being added here that... (source)

@ndgigliotti
Contributor Author

tests/llm_translation/ is the standard location for live integration tests in this project -- the CLAUDE.md says "Integration tests for each provider in tests/llm_translation/". The test_bedrock_completion.py file has a long history of live Bedrock tests from many contributors (e.g. test_completion_bedrock_claude_aws_session_token, test_bedrock_nova_grounding_web_search_options_non_streaming, etc.). This is not a mock-only folder.

@greptileai

Comment on lines +29 to +35
BEDROCK_INVOKE_NATIVE_STRUCTURED_OUTPUT_MODELS = {
    "claude-haiku-4-5",
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-opus-4-5",
    "claude-opus-4-6",
}
Contributor

Hardcoded model list should live in model_prices_and_context_window.json

Per project convention, model-capability flags should be stored in model_prices_and_context_window.json and read via get_model_info, not hardcoded here. Hardcoding means users must upgrade LiteLLM every time AWS adds a new Claude model to the Invoke native structured-output feature set.

The same pattern exists for the Converse path (BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS); both should eventually be migrated. The PR author has already noted this as a follow-up concern in the discussion thread.

Rule Used: What: Do not hardcode model-specific flags in the ... (source)

Comment on lines +3963 to +4017
Contributor

Live network call in mock-only test folder

test_bedrock_invoke_native_structured_output issues a real completion() call to AWS Bedrock with no @patch decorator. The project policy for tests/llm_translation/ is that only mock-based tests may live here, so that the suite passes on GitHub CI and for every developer without AWS credentials.

This test will fail for any contributor without live AWS access. It should either be:

  1. Moved to an integration-test directory that is gated on AWS credentials, or
  2. Converted to use unittest.mock.patch — the same approach used extensively elsewhere in this file — intercepting the HTTP call and returning a canned Bedrock response body.

Rule Used: What: prevent any tests from being added here that... (source)



Development

Successfully merging this pull request may close these issues.

[Feature]: Support AWS Bedrock native structured outputs API (outputConfig.textFormat)
