
fix(bedrock): omit toolChoice.tool on Llama for synthetic structured-output tool#3196

Merged
akshaydeo merged 1 commit into maximhq:main from Orphic-AI:upstream/llama-toolchoice-gate
May 4, 2026

Conversation

@ryan-orphic
Contributor

Summary

Bedrock Converse rejects toolConfig.toolChoice.tool on Meta Llama variants with HTTP 400:

This model doesn't support the toolConfig.toolChoice.tool field. Remove toolConfig.toolChoice.tool and try again.

The synthetic bf_so_* tool path used for structured output (introduced in #3184) unconditionally pinned the synthetic tool via toolChoice.tool, which broke every with_structured_output(...) / response_format=json_schema caller against Llama 4 Maverick / Scout / Llama 3.x on Bedrock.

This PR completes the per-family gating story #3184 started for Anthropic.

Changes

Adds an IsLlamaModel helper next to the existing IsNovaModel / IsAnthropicModel family in core/schemas/utils.go, then gates the forced-tool emission off it in three places:

  1. core/providers/bedrock/utils.go — ChatCompletions synthetic-tool injection (`response_format=json_schema` → `bf_so_*` tool path). The synthetic tool is still injected; only the forced `toolChoice.tool` is suppressed.
  2. core/providers/bedrock/responses.go — OpenAI Responses API mirror path (`text.format=json_schema` → `bf_so_*` tool path). Same shape as #1.
  3. core/providers/bedrock/utils.go — `convertToolConfigFromFiltered`'s explicit `tool_choice` path (defense-in-depth for callers using `bind_tools(tool_choice={"type": "function", "function": {"name": "X"}})` directly).
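As a rough sketch of the pattern described above (the `IsLlamaModel` name and the substring check are from this PR; the `toolChoice` type here is a stand-in, not the actual Bedrock provider types):

```go
package main

import (
	"fmt"
	"strings"
)

// IsLlamaModel reports whether a Bedrock model ID belongs to the Meta Llama
// family, matching the described pattern of IsNovaModel / IsAnthropicModel:
// a simple substring check on the model identifier.
func IsLlamaModel(model string) bool {
	return strings.Contains(model, "llama")
}

// toolChoice is a simplified stand-in for Bedrock's toolConfig.toolChoice;
// the real types in core/providers/bedrock differ.
type toolChoice struct {
	Tool string // name of a pinned tool
}

// forcedToolChoice sketches the gate: pin the synthetic bf_so_* tool only
// for families that accept toolChoice.tool, and return nil (omit the field,
// falling back to Bedrock's default "auto") for Llama.
func forcedToolChoice(model, syntheticTool string) *toolChoice {
	if IsLlamaModel(model) {
		return nil
	}
	return &toolChoice{Tool: syntheticTool}
}

func main() {
	fmt.Println(forcedToolChoice("us.meta.llama4-maverick-17b-instruct-v1:0", "bf_so_x") == nil) // true: omitted
	fmt.Println(forcedToolChoice("us.amazon.nova-pro-v1:0", "bf_so_x") == nil)                   // false: forced
}
```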

With one synthetic tool bound and Bedrock's default "auto" behavior, omitting tool_choice yields the same outcome on Llama because there's exactly one tool the model can call. This mirrors the gate langchain-aws ships (supports_tool_choice_values only allows "auto" for llama4, llama3-1, llama3-3 model families, causing with_structured_output to emit no tool_choice at all).

Type of change

  • [x] Bug fix
  • [ ] Feature
  • [ ] Refactor
  • [ ] Documentation
  • [ ] Chore/CI

Affected areas

  • [x] Core (Go)
  • [ ] Transports (HTTP)
  • [x] Providers/Integrations
  • [ ] Plugins
  • [ ] UI (React)
  • [ ] Docs

How to test

Tests added (mirroring #3184's `TestBedrockAnthropicChatStructuredOutputUsesSyntheticTool` naming + structure):

  • `TestBedrockLlamaChatStructuredOutputOmitsForcedToolChoice` — synthetic tool present, `ToolChoice` nil for Llama on the ChatCompletions path.
  • `TestBedrockNonLlamaChatStructuredOutputForcesToolChoice` — regression guard: Nova / Anthropic still get the forced `tool_choice` (gate doesn't over-fire).
  • `TestToBedrockResponsesRequest_LlamaStructuredOutputOmitsForcedToolChoice` — same negative+positive on the Responses API path.
  • `TestBedrockLlamaConvertToolConfigOmitsForcedToolChoice` — defense-in-depth coverage for explicit `tool_choice` struct callers.

```sh
go test -count=1 -run "TestBedrockLlama|TestBedrockNonLlama|TestToBedrockResponsesRequest_Llama" ./core/providers/bedrock/...
```

Live verification: ran a forked build of this patch behind production CRS traffic (LangChain `with_structured_output(strict=True)` against `us.meta.llama4-maverick-17b-instruct-v1:0` on Bedrock) — previously 400 ValidationException on every planner request; with this gate, structured output succeeds end-to-end.

Breaking changes

  • [ ] Yes
  • [x] No

The gate only narrows behavior for Llama (where the prior code path was already failing 100% of requests with HTTP 400). Nova / Anthropic / non-Llama paths are unchanged and covered by the negative regression test.

Related issues

Companion to #3184 — that PR added the synthetic-tool fallback for Bedrock Converse `response_format=json_schema`, which made the forced-tool emission load-bearing for all structured-output callers. This PR completes that work for the Llama family, which has a stricter `toolChoice` support matrix than Anthropic / Nova.

Security considerations

No new auth surfaces, secrets, PII, or sandboxing changes. Internal request shaping only.

Checklist

  • I added/updated tests where appropriate (4 new tests covering both positive and negative cases on both API paths)
  • I verified builds succeed (Go) — patched build is running in production CRS traffic
  • I verified the CI pipeline passes locally if applicable (Go not installed locally; relying on PR CI)

References

@CLAassistant

CLAassistant commented May 4, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai
Contributor

coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between bf85e94 and a49a95f.

📒 Files selected for processing (4)
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/utils.go
  • core/schemas/utils.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/responses.go
  • core/schemas/utils.go
  • core/providers/bedrock/bedrock_test.go

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Prevents Bedrock from forcing a specific tool choice for Meta Llama models when using structured-output, improving Converse compatibility.
    • Ensures the synthetic structured-output tool is only force-applied for non-Llama models.
  • Tests

    • Added regression tests covering structured-output tool-choice behavior across Llama and non-Llama models.

Walkthrough

This PR adds Llama-aware tool-choice handling in the Bedrock provider: a new IsLlamaModel() helper detects Llama variants, and conversion logic now omits forced synthetic structured-output toolChoice.tool for Llama models while preserving the injected synthetic tool in the tools list. Multiple regression tests cover chat and Responses conversions and defensive cases.

Changes

Bedrock Llama Tool-Choice Fix

| Layer / File(s) | Summary |
| --- | --- |
| Helper function — `core/schemas/utils.go` | Added `IsLlamaModel(model string) bool`, which returns true when the model ID contains "llama". |
| Data/conversion guard — `core/providers/bedrock/utils.go` | `convertChatParameters` and `convertToolConfigFromFiltered` now clear/omit a forced `ToolChoice` (specific pinned tool) when the target model is a Llama variant. |
| Responses API conversion — `core/providers/bedrock/responses.go` | `ToBedrockResponsesRequest` clears the converted `bedrockToolChoice` if it contains a concrete `Tool` for Llama models, and only forces the synthetic structured-output `toolChoice` for non-Llama models. |
| Regression tests — `core/providers/bedrock/bedrock_test.go` | Added tests covering chat and Responses conversions plus defensive cases: `TestBedrockLlamaChatStructuredOutputOmitsForcedToolChoice`, `TestBedrockNonLlamaChatStructuredOutputForcesToolChoice`, `TestToBedrockResponsesRequest_LlamaStructuredOutputOmitsForcedToolChoice`, `TestBedrockLlamaConvertToolConfigOmitsForcedToolChoice`, `TestToBedrockResponsesRequest_LlamaConvertResponsesToolChoiceOmitsForcedToolChoice`, and `TestToBedrockResponsesRequest_NonLlamaConvertResponsesToolChoiceForcesToolChoice`. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰
A rabbit hops through Bedrock code so spry,
It whispers, "Llama — free to roam, don't tie."
Synthetic tools still line the trail,
But forced pins fade where Llama sail.
Small tests applaud with a cheerful cry.

🚥 Pre-merge checks — ✅ 5 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely describes the main fix: omitting toolChoice.tool on Llama models for the synthetic structured-output tool, which directly addresses the HTTP 400 validation error described. |
| Description check | ✅ Passed | The description covers all required template sections, including Summary, Changes, Type of change, Affected areas, How to test, Breaking changes, Related issues, Security considerations, and Checklist. |
| Docstring coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage; skipping the check. |
| Linked issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out-of-scope changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"




@greptile-apps
Contributor

greptile-apps Bot commented May 4, 2026

Confidence Score: 5/5

Safe to merge — the fix is narrowly scoped to Llama models, all previously failing paths now have gates, non-Llama behavior is unchanged and regression-guarded by tests.

No P0/P1 findings. The prior P1 comment (missing Llama gate for explicit tool_choice on the Responses API path) is fully addressed in this version. The IsLlamaModel helper follows the established pattern of other family helpers, the gate logic is applied consistently across all four affected call sites, and the test suite covers both positive and negative cases for every path.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `core/schemas/utils.go` | Adds the `IsLlamaModel` helper using `strings.Contains(model, "llama")`, consistent with the existing `IsNovaModel`/`IsAnthropicModel` patterns; no issues. |
| `core/providers/bedrock/utils.go` | Gates `toolChoice.tool` emission on Llama in both the synthetic-tool path (`convertChatParameters`) and the explicit `tool_choice` path (`convertToolConfigFromFiltered`); logic is correct and well-commented. |
| `core/providers/bedrock/responses.go` | Adds Llama gates for both the explicit `tool_choice` path and the synthetic structured-output tool path on the Responses API; the previous comment about the missing explicit-`tool_choice` gate is fully addressed in this version. |
| `core/providers/bedrock/bedrock_test.go` | Adds 5 well-structured tests covering positive (Llama gate fires) and negative (non-Llama regression guard) cases across all affected code paths; good coverage. |

Reviews (2): Last reviewed commit: "fix(bedrock): omit toolChoice.tool on Ll..."

fix(bedrock): omit toolChoice.tool on Llama for synthetic structured-output tool

Bedrock Converse rejects toolConfig.toolChoice.tool on Meta Llama variants
with HTTP 400 ("This model doesn't support the toolConfig.toolChoice.tool
field. Remove toolConfig.toolChoice.tool and try again."). The synthetic
bf_so_* tool path used for structured output unconditionally pinned the
synthetic tool via toolChoice.tool, which broke every with_structured_output
caller against Llama 4 Maverick / Scout / Llama 3.x on Bedrock.

This patch adds an IsLlamaModel helper next to the existing IsNovaModel /
IsAnthropicModel family and gates the forced-tool emission off it in four
places (covering both API surfaces and both the synthetic and explicit
tool_choice paths on each):

  1. core/providers/bedrock/utils.go: ChatCompletions synthetic-tool injection
     (response_format=json_schema -> bf_so_* tool path).
  2. core/providers/bedrock/responses.go: OpenAI Responses API mirror of maximhq#1
     (text.format=json_schema -> bf_so_* tool path).
  3. core/providers/bedrock/utils.go: convertToolConfigFromFiltered's explicit
     tool_choice path (defense-in-depth for ChatCompletions callers using
     bind_tools(tool_choice={"type": "function", "function": {"name": "X"}})
     directly).
  4. core/providers/bedrock/responses.go: Responses API mirror of maximhq#3 — the
     explicit tool_choice path runs convertResponsesToolChoice and writes
     BedrockToolChoice{Tool: ...} straight onto the request, which trips the
     same Llama 400 without the gate.

With one synthetic tool bound and Bedrock's default "auto" behavior, omitting
tool_choice yields the same outcome on Llama because there's exactly one tool
the model can call. This mirrors the gate langchain-aws ships
(supports_tool_choice_values for llama4 / llama3-1 / llama3-3 sets only
"auto", causing with_structured_output to emit no tool_choice at all).

Tests added (mirroring PR maximhq#3184's TestBedrockAnthropicChatStructuredOutputUsesSyntheticTool
naming and structure):

  - TestBedrockLlamaChatStructuredOutputOmitsForcedToolChoice — synthetic tool
    present, ToolChoice nil for Llama on the ChatCompletions path.
  - TestBedrockNonLlamaChatStructuredOutputForcesToolChoice — regression guard:
    Nova / Anthropic still get the forced tool_choice (gate doesn't over-fire).
  - TestToBedrockResponsesRequest_LlamaStructuredOutputOmitsForcedToolChoice —
    same negative+positive on the Responses API synthetic-tool path.
  - TestBedrockLlamaConvertToolConfigOmitsForcedToolChoice — defense-in-depth
    coverage for ChatCompletions explicit tool_choice struct callers.
  - TestToBedrockResponsesRequest_LlamaConvertResponsesToolChoiceOmitsForcedToolChoice
    — defense-in-depth coverage for Responses API explicit tool_choice
    (function-typed) struct callers — the gap surfaced in code review.
  - TestToBedrockResponsesRequest_NonLlamaConvertResponsesToolChoiceForcesToolChoice
    — Anthropic regression guard for maximhq#5 so the Responses-side Llama gate
    doesn't over-fire.

Refs:
  - https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolChoice.html
  - https://github.com/langchain-ai/langchain-aws (bedrock_converse.py
    supports_tool_choice_values family-detection lines 678-711).
  - PR maximhq#3184 (Anthropic structured-output fallback to synthetic tool path) —
    immediate predecessor; this PR completes the per-family gating story.
@ryan-orphic ryan-orphic force-pushed the upstream/llama-toolchoice-gate branch from bf85e94 to a49a95f on May 4, 2026 05:47

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/bedrock/responses.go (1)

1959-1993: ⚠️ Potential issue | 🔴 Critical

Add Llama guard for explicit user tool_choice in Responses path.

The chat completions path (convertToolConfigFromFiltered, utils.go:1575–1580) drops .Tool-type choices on Llama models to prevent the HTTP 400 rejection. The Responses path lacks this defense for explicit user-provided tool_choice.

When a caller passes both text.format=json_schema and an explicit tool_choice: {type:"function", name:"some_fn"} against Llama:

  1. Lines 1960–1968 convert the explicit choice → bedrockReq.ToolConfig.ToolChoice = {Tool: {Name:"some_fn"}}
  2. Line 1986 guard only protects the synthetic tool injection, not the user choice from step 1
  3. Result: Bedrock still receives toolChoice.tool → HTTP 400 on Llama

Add a Llama gate immediately after line 1968 to strip .Tool choices on Llama models, mirroring the chat path logic.
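A heavily hedged sketch of the suggested guard follows; the type and field names here are illustrative stand-ins, not the actual definitions in `core/providers/bedrock`:

```go
package main

import (
	"fmt"
	"strings"
)

func isLlamaModel(model string) bool { return strings.Contains(model, "llama") }

// Stand-ins for the Bedrock request types; the real definitions differ.
type bedrockTool struct{ Name string }
type bedrockToolChoice struct {
	Tool *bedrockTool // concrete pinned tool, if any
}

// applyExplicitToolChoice mirrors the proposed gate: after converting an
// explicit user tool_choice, drop a concrete .Tool pin when the target is a
// Llama model, so toolChoice.tool is never sent to Bedrock for that family.
func applyExplicitToolChoice(model string, tc *bedrockToolChoice) *bedrockToolChoice {
	if tc != nil && tc.Tool != nil && isLlamaModel(model) {
		return nil // omit toolChoice entirely; Bedrock defaults to "auto"
	}
	return tc
}

func main() {
	pinned := &bedrockToolChoice{Tool: &bedrockTool{Name: "some_fn"}}
	fmt.Println(applyExplicitToolChoice("us.meta.llama4-scout-17b-instruct-v1:0", pinned) == nil) // true: pin dropped
	fmt.Println(applyExplicitToolChoice("anthropic.claude-3-haiku", pinned) == pinned)            // true: pin kept
}
```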

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/responses.go` around lines 1959 - 1993, After
converting an explicit user tool choice via convertResponsesToolChoice (when
bifrostReq.Params.ToolChoice != nil) ensure you strip any .Tool entries for
Llama models: after obtaining bedrockToolChoice and before assigning it to
bedrockReq.ToolConfig.ToolChoice, check schemas.IsLlamaModel(bifrostReq.Model)
and if true remove/null out the Tool field on bedrockToolChoice (so only
non-Tool choices remain) then continue assigning; this mirrors the chat-path
guard and prevents sending toolChoice.tool to Bedrock for Llama models.


📥 Commits

Reviewing files that changed from the base of the PR and between c257460 and bf85e94.

📒 Files selected for processing (4)
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/utils.go
  • core/schemas/utils.go

@ryan-orphic
Contributor Author

Thanks @greptile-apps — good catch on the Responses API explicit `tool_choice` gap. The Chat Completions path had defense-in-depth via `convertToolConfigFromFiltered`, but the analogous gate on `responses.go` lines 1962–1971 was missing.

Just pushed an amend that closes that gap:

  • `core/providers/bedrock/responses.go` — adds the same per-model gate immediately after `convertResponsesToolChoice` returns (drops `bedrockToolChoice` when `Tool != nil` AND `schemas.IsLlamaModel(bifrostReq.Model)`).
  • Two new tests in `bedrock_test.go`:
    • `TestToBedrockResponsesRequest_LlamaConvertResponsesToolChoiceOmitsForcedToolChoice` — Llama + explicit `{type: function, name: X}` tool_choice → `ToolConfig.Tools` survives, `ToolConfig.ToolChoice == nil`.
    • `TestToBedrockResponsesRequest_NonLlamaConvertResponsesToolChoiceForcesToolChoice` — Anthropic regression guard so the new gate doesn't over-fire.

Commit message updated to call out four gate sites instead of three (two API surfaces × two paths each: synthetic-tool injection + explicit `tool_choice`).

@greptile-apps
Contributor

greptile-apps Bot commented May 4, 2026

The diff confirms all four gate sites are now covered:

| Site | File | Gated? |
| --- | --- | --- |
| `convertChatParameters` synthetic-tool `ToolChoice` | utils.go | `!IsLlamaModel` |
| `convertToolConfigFromFiltered` explicit `tool_choice` | utils.go | `toolChoice.Tool != nil && IsLlamaModel` |
| `ToBedrockResponsesRequest` synthetic-tool `ToolChoice` | responses.go | `!IsLlamaModel` |
| `ToBedrockResponsesRequest` explicit `tool_choice` via `convertResponsesToolChoice` | responses.go | `bedrockToolChoice.Tool != nil && IsLlamaModel` |

The two gates in responses.go are ordered correctly — the explicit tool_choice gate fires first (lines ~1962–1971), then the synthetic-tool gate fires later in the same function. The two conditions are independent, so there's no risk of one clobbering the other.

One small note on IsLlamaModel: strings.Contains(model, "llama") is case-sensitive. Bedrock model IDs appear to always be lowercase (confirmed by the AWS model ID format us.meta.llama4-maverick-17b-instruct-v1:0), so this is fine in practice — just worth keeping in mind if AWS ever introduces a mixed-case alias. The pattern is consistent with the existing IsNovaModel/IsAnthropicModel helpers so it's the right call to be consistent here.

Six tests covering both positive (tool survives) and negative (forced pin dropped) assertions across all four sites — that's thorough. LGTM.


@akshaydeo akshaydeo merged commit 7eb2631 into maximhq:main May 4, 2026
3 of 4 checks passed
@ryan-orphic ryan-orphic deleted the upstream/llama-toolchoice-gate branch May 4, 2026 14:41
@ryan-orphic
Contributor Author

@akshaydeo Thank you. How long does it usually take for changes to be included in your enterprise builds?

