Bedrock: move native structured output model list to cost JSON, add Sonnet 4.6 #23794
ndgigliotti wants to merge 7 commits into BerriAI:main from
Conversation
Greptile Summary

This PR replaces a hardcoded set of Bedrock model IDs with a `supports_native_structured_output` flag looked up in the cost JSON. Key changes:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/llms/bedrock/chat/converse_transformation.py | Removes hardcoded BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS set and replaces _supports_native_structured_outputs with a cost-JSON lookup via litellm.model_cost + get_bedrock_base_model, with a version-suffix fallback. The rest of the diff is pure whitespace/line-wrapping reformatting with no logic changes. |
| model_prices_and_context_window.json | Adds "supports_native_structured_output": true to 44 Bedrock models (Claude 4.5/4.6 variants, Qwen3, Mistral, MiniMax, Moonshot, NVIDIA, DeepSeek). Also includes an unrelated correction to vertex_ai/gemini-embedding-2-preview (discussed in a previous thread). |
| litellm/model_prices_and_context_window_backup.json | Mirror of the main JSON changes: same 44 models flagged with supports_native_structured_output: true, and the same vertex_ai/gemini-embedding-2-preview correction. |
| tests/test_litellm/llms/bedrock/chat/test_converse_transformation.py | Adds tests for the new cost-JSON-backed _supports_native_structured_outputs logic. Three new tests properly save/restore litellm.model_cost with try/finally, but test_translate_response_format_native_output_config (line 2729) skips that setup, making it potentially flaky when the remote cost map hasn't been updated yet. The rest of the diff is cosmetic reformatting. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["_translate_response_format_param(model, response_format, ...)"] --> B{"json_schema present?"}
    B -- No --> E["Tool-call fallback path"]
    B -- Yes --> C["_supports_native_structured_outputs(model)"]
    C --> C1["get_bedrock_base_model(model)\n(strip region prefix, routing prefix, throughput suffix, ARN)"]
    C1 --> C2["litellm.model_cost.get(base_model)"]
    C2 -- found --> C4["return info.get('supports_native_structured_output', False) is True"]
    C2 -- "not found & ':' in model" --> C3["Retry with version suffix stripped\n(e.g. 'model-v1:0' → 'model-v1')"]
    C3 --> C4
    C2 -- "not found & no ':'" --> C5["return False"]
    C4 -- True --> D["Native path: build outputConfig.textFormat\nNo tool injection, no fake_stream"]
    C4 -- False --> E
    C5 --> E
    E["Tool-call fallback: inject synthetic tool,\nset tool_choice, possibly set fake_stream"]
```
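The lookup the flowchart describes can be sketched as a small standalone function. This is an illustrative approximation, not the actual litellm implementation: the region-prefix stripping here is a simplified stand-in for `get_bedrock_base_model`, which also handles routing prefixes, throughput suffixes, and ARNs.

```python
def supports_native_structured_outputs(model: str, model_cost: dict) -> bool:
    """Sketch of the cost-JSON capability lookup described in the flowchart.

    The region-prefix handling below is a simplified stand-in for
    litellm's get_bedrock_base_model helper.
    """
    # Strip a region prefix like "us." / "eu." / "ap." (simplified)
    head, _, tail = model.partition(".")
    base = tail if head in {"us", "eu", "ap"} and tail else model

    info = model_cost.get(base)
    if info is None and ":" in base:
        # Fallback: retry with the version suffix stripped,
        # e.g. "model-v1:0" -> "model-v1"
        info = model_cost.get(base.rsplit(":", 1)[0])
    if info is None:
        return False
    return info.get("supports_native_structured_output", False) is True


# Example with a minimal, hypothetical cost map
cost_map = {
    "anthropic.claude-sonnet-4-5-20250929-v1": {
        "supports_native_structured_output": True,
    },
}
print(supports_native_structured_outputs(
    "us.anthropic.claude-sonnet-4-5-20250929-v1:0", cost_map))
```

Both the region prefix and the version suffix are stripped before the flag is read, so a fully qualified inference-profile ID resolves to the same entry as the bare model ID.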
Comments Outside Diff (1)
- tests/test_litellm/llms/bedrock/chat/test_converse_transformation.py, lines 2729-2770 (link): `test_translate_response_format_native_output_config` may be flaky without local cost map setup.

  Unlike `test_supports_native_structured_outputs`, `test_native_structured_output_no_fake_stream`, and `test_json_object_no_schema_falls_back_to_tool_call` — all of which explicitly set `LITELLM_LOCAL_MODEL_COST_MAP` and reload `litellm.model_cost` from the local backup — this test relies on whatever `litellm.model_cost` was loaded at import time. If the CI environment fetches the remote JSON (i.e., `LITELLM_LOCAL_MODEL_COST_MAP` is not set in the process environment at import), and the remote CDN hasn't yet been updated with the `"supports_native_structured_output": true` flag added by this PR, then `_supports_native_structured_outputs("anthropic.claude-sonnet-4-5-20250929-v1:0")` will return `False`, `outputConfig` won't be added, and the assertion on line 2758 will fail.

  Consider applying the same pattern as the other tests:

  ```python
  def test_translate_response_format_native_output_config():
      old_env = os.environ.get("LITELLM_LOCAL_MODEL_COST_MAP")
      old_cost = litellm.model_cost
      os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"
      litellm.model_cost = litellm.get_model_cost_map(url="")
      try:
          config = AmazonConverseConfig()
          # ... rest of test ...
      finally:
          litellm.model_cost = old_cost
          if old_env is None:
              os.environ.pop("LITELLM_LOCAL_MODEL_COST_MAP", None)
          else:
              os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = old_env
  ```
Last reviewed commit: 3b1e124
```python
os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"
litellm.model_cost = litellm.get_model_cost_map(url="")
```
Missing cleanup of LITELLM_LOCAL_MODEL_COST_MAP and litellm.model_cost
os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] and litellm.model_cost are set here without any teardown, which means these side effects persist for the rest of the test session. Since litellm.model_cost is a module-level global, any test that runs afterward and relies on the original (remote) cost map—or on the env var not being set—could behave differently or produce false positives/negatives.
The same pattern appears in test_native_structured_output_no_fake_stream (line 2797) and test_json_object_no_schema_falls_back_to_tool_call (line 3066).
Each of these tests should save and restore both values with try/finally, or use pytest's monkeypatch fixture:
```python
def test_supports_native_structured_outputs(monkeypatch):
    monkeypatch.setenv("LITELLM_LOCAL_MODEL_COST_MAP", "True")
    original_model_cost = litellm.model_cost
    litellm.model_cost = litellm.get_model_cost_map(url="")
    try:
        # ... assertions ...
        pass
    finally:
        litellm.model_cost = original_model_cost
```

Or more concisely with monkeypatch for the attribute too:

```python
def test_supports_native_structured_outputs(monkeypatch):
    monkeypatch.setenv("LITELLM_LOCAL_MODEL_COST_MAP", "True")
    monkeypatch.setattr(litellm, "model_cost", litellm.get_model_cost_map(url=""))
    # ... assertions ...
```
Fixed the test state leakage in 97e7fb9. All three tests now restore both `os.environ["LITELLM_LOCAL_MODEL_COST_MAP"]` and `litellm.model_cost` in a `try`/`finally`.
```diff
  "mode": "chat",
  "output_cost_per_token": 3.2e-06,
  "source": "https://cloud.google.com/vertex-ai/generative-ai/pricing#glm-models",
- "supported_regions": ["global"],
+ "supported_regions": [
+     "global"
+ ],
```
Unrelated pricing change modifies vertex_ai/gemini-embedding-2-preview
This PR also updates `vertex_ai/gemini-embedding-2-preview` in ways unrelated to Bedrock native structured outputs:

- Drops the `input_cost_per_audio_per_second`, `input_cost_per_image`, and `input_cost_per_video_per_second` fields
- Changes `input_cost_per_token` from `2e-07` to `1.5e-07`
- Switches the `source` URL from the Google Cloud pricing page to the AI Studio embeddings page
The identical change is in litellm/model_prices_and_context_window_backup.json.
This seems to be fixing a pre-existing duplicate key in the JSON (there were two vertex_ai/gemini-embedding-2-preview entries), but by removing audio/video/image costs it changes the cost calculation result for users calling this embedding model. If this change is intentional, it should either be documented in this PR or separated into its own PR to make the impact clear.
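To make the cost impact concrete, here is a toy input-token cost calculation using only the two per-token rates quoted in the comment above. The real litellm cost calculator also accounts for completion tokens and the dropped audio/image/video rates; this is purely an illustration of the token-rate change.

```python
def input_token_cost(tokens: int, rate_per_token: float) -> float:
    """Toy input-cost estimate: token count times per-token rate (USD)."""
    return tokens * rate_per_token

# Rates from the diff under discussion
old_rate = 2e-07    # input_cost_per_token before this PR
new_rate = 1.5e-07  # input_cost_per_token after this PR

# For a 1M-token embedding workload, the billed estimate drops by 25%
old_cost = input_token_cost(1_000_000, old_rate)  # about $0.20
new_cost = input_token_cost(1_000_000, new_rate)  # about $0.15
print(old_cost, new_cost)
```

For callers that were billing audio/image/video inputs against the removed fields, those components would now contribute nothing, which is a larger behavioral change than the rate adjustment itself.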
Thanks for the follow-up review.
…t JSON lookup

Move the source of truth for which Bedrock models support native structured outputs (`outputConfig.textFormat`) from a hardcoded substring set (`BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS`) to the cost JSON via a new `"supports_native_structured_output"` flag. This makes it possible to add support for new models (including Claude Sonnet 4.6, which was missing) by updating the JSON alone, with no code changes needed.
Integration testing confirmed gemma-3 (4b/12b/27b) ignores the JSON schema and returns free text, and nemotron-nano (9b/12b) errors with "Tool calling is not supported in streaming mode" even on sync calls. Remove the flag so these models fall back to the tool-call approach. Also fix test assertions to match (nemotron-nano-3-30b is supported, gemma-3 and nemotron-nano-12b are not).
Integration tested 28/28 (10 sync + 10 streaming + extras) on the native outputConfig.textFormat path in us-west-2. deepseek.v3.2 does not support native structured output (Bedrock returns 400).
…en3-coder-next

Integration testing confirmed:
- minimax.minimax-m2.1: Bedrock rejects outputConfig.textFormat (400)
- moonshotai.kimi-k2.5: Bedrock rejects outputConfig.textFormat (400)
- qwen.qwen3-coder-next: unavailable in us-east-1 and us-west-2
Wrap cost-map-dependent tests in try/finally to restore os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] and litellm.model_cost, preventing test-ordering sensitivity.
…I#23599

Main has two duplicate keys for vertex_ai/gemini-embedding-2-preview. Our JSON round-trip collapsed them to the second (text-only) entry, but PR BerriAI#23599 intentionally keeps the first (multimodal pricing) entry. Restore the multimodal entry to avoid conflicts.
Force-pushed e404349 to 3b1e124 (Compare)
Rebased on latest main (3b1e124).
@krrishdholakia ready for review when you get a chance. Greptile gave 4/5 and all feedback has been addressed. |
Relevant issues
Addresses Greptile feedback on #21222 and #23778 recommending the hardcoded model set be moved to the cost JSON.
Pre-Submission checklist
- [x] Added tests in the `tests/test_litellm/` directory
- [x] Ran `make test-unit`
- [x] Tagged `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type
Refactoring / New Feature
Changes
Claude Sonnet 4.6 was released after the native structured output feature landed (#21222) and was not included. DeepSeek v3 was listed in the hardcoded set, but the substring didn't match the actual model ID, so it was silently broken. This PR flags both for native structured output and moves the model capability check from a hardcoded set to the cost JSON, so future models are supported without code changes or releases (since `litellm.model_cost` is fetched from the remote JSON at import time).

- `litellm/llms/bedrock/chat/converse_transformation.py`: Removed the hardcoded `BEDROCK_NATIVE_STRUCTURED_OUTPUT_MODELS` set. `_supports_native_structured_outputs()` now looks up the `supports_native_structured_output` flag in `litellm.model_cost` via `get_bedrock_base_model()`, with a fallback that strips version suffixes (e.g. `:0`).
- `model_prices_and_context_window.json` / `litellm/model_prices_and_context_window_backup.json`: Added `"supports_native_structured_output": true` to 44 Bedrock models.
- `tests/test_litellm/llms/bedrock/chat/test_converse_transformation.py`: Updated tests to load the local cost map and assert against real model IDs.

All 44 flagged models were integration tested against real Bedrock endpoints (sync + streaming) on the native `outputConfig.textFormat` path.

Models deliberately excluded after integration testing:
- `google.gemma-3` (4b/12b/27b): ignores schema, returns free text
- `nvidia.nemotron-nano` (9b/12b): errors with "Tool calling is not supported in streaming mode" even on sync
- `deepseek.v3.2`: Bedrock returns 400 on `outputConfig.textFormat`
- `minimax.minimax-m2.1`: Bedrock returns 400 on `outputConfig.textFormat`
- `moonshotai.kimi-k2.5`: Bedrock returns 400 on `outputConfig.textFormat`
- `qwen.qwen3-coder-next`: unavailable in both us-east-1 and us-west-2