feat: add tool_choice support #4722
@ayushag-nv Could you do an early review of the approach here?

I noticed that the usual FC (function calling) in dynamo is not streamed the way OpenAI streams it; instead it waits for the entire FC to be generated.
Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>
Force-pushed from b732106 to cdddc24
To be honest, I expected that fewer changes would be required.

Will review today!
@vladnosiv Yes, that's the jail streaming part that allows us to use non-streaming parsers. Normal content is streamed as-is, but the stream is blocked as soon as a potential start of a tool call is detected. It then tries to find the complete tool call and streams it.
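A minimal sketch of the blocking behavior described above. This is illustrative only; the marker strings and types are hypothetical, not dynamo's actual code, and a real implementation must also hold back chunks that end in a partial marker prefix.

```rust
const START: &str = "<tool_call>";
const END: &str = "</tool_call>";

#[derive(Default)]
struct Jail {
    buffer: String,
    jailed: bool,
}

impl Jail {
    /// Feed one streamed chunk; returns text that is safe to forward now.
    fn push(&mut self, chunk: &str) -> Option<String> {
        self.buffer.push_str(chunk);
        if !self.jailed {
            if let Some(pos) = self.buffer.find(START) {
                // Forward everything before the marker; jail the rest.
                let rest = self.buffer.split_off(pos);
                let passthrough = std::mem::replace(&mut self.buffer, rest);
                self.jailed = true;
                return (!passthrough.is_empty()).then_some(passthrough);
            }
            // No marker seen: forward the whole buffer.
            let out = std::mem::take(&mut self.buffer);
            return (!out.is_empty()).then_some(out);
        }
        // Jailed: release only once the complete tool call has arrived.
        if self.buffer.contains(END) {
            self.jailed = false;
            return Some(std::mem::take(&mut self.buffer));
        }
        None
    }
}
```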
Okay, I really overlooked that the current FC implementation does not do true FC streaming. In the current version of this PR, for the case of tool_choice != none/auto, I added true streaming. If dynamo does not plan to support FC streaming, then it makes sense to remove this part. Otherwise, the partial-JSON-parsing approach for streaming FC deltas could be generalized in the jail streams component. In addition, I realized that the JSON-schema FC case could perhaps be squeezed into the jail-streaming code path. It may turn out to be a little more hacky than with explicit tags, especially for a parallel call in the required scenario.
Yeah, right now there are no plans or requirements to support true FC streaming, so you can update your PR accordingly.
Hi @ayushag-nv! True FC streaming has been removed. In addition, to simplify the current PR, in required mode with parallel FC in streaming, all chunks with FC are returned at the end. This simplification can be changed to per-call streaming in another PR.
Walkthrough

This PR implements comprehensive tool_choice support for the OpenAI-compatible API. It extends the jail-based tool call interception mechanism with configurable modes (Named, Required, MarkerBased), converts tool_choice configurations into JSON schemas for guided decoding, and updates multiple public function signatures to accept optional parameters for tool parsing and tool choice handling.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🧹 Nitpick comments (6)
lib/llm/src/protocols/openai/tools.rs (2)
58-63: Consider whether the default schema needs `additionalProperties: false`.

The fallback schema for tools without parameters allows any additional properties by default. If strict guided decoding is intended, consider adding `"additionalProperties": false`:

```diff
 fn clone_parameters(function: &FunctionObject) -> Value {
     function
         .parameters
         .clone()
-        .unwrap_or_else(|| json!({"type": "object", "properties": {}}))
+        .unwrap_or_else(|| json!({"type": "object", "properties": {}, "additionalProperties": false}))
 }
```

However, this may be intentional to allow flexibility when no schema is defined.
201-262: Good test coverage for core scenarios.

Tests cover the main happy paths and key error conditions. Consider adding tests for edge cases like `tool_choice=None` returning `Ok(None)` and `tool_choice=required` with `tools=None` returning a `MissingTools` error for completeness.

lib/llm/src/protocols/openai/chat_completions/jail.rs (1)
889-897: Tool call IDs are non-unique across requests.

The hardcoded `"call-1"` and `format!("call-{}", idx + 1)` IDs differ from the marker-based path, which uses parser-generated IDs. Consider using UUIDs for consistency with typical OpenAI responses:

```diff
-Ok(vec![ChatCompletionMessageToolCallChunk {
-    index: 0,
-    id: Some("call-1".to_string()),
+Ok(vec![ChatCompletionMessageToolCallChunk {
+    index: 0,
+    id: Some(format!("call_{}", uuid::Uuid::new_v4())),
```

This ensures tool call IDs are globally unique, which some clients may rely on for tracking.
lib/llm/tests/tool_choice_finish_reasons.rs (3)
54-87: Consider adding error context to unwraps for easier test debugging.

The chained unwraps at line 86 could make it hard to diagnose test failures. Consider using `expect()` with descriptive messages:

```diff
-output_stream.next().await.unwrap().data.unwrap()
+output_stream
+    .next()
+    .await
+    .expect("jail stream should emit at least one item")
+    .data
+    .expect("jail output should contain data")
```
198-225: Missing jail-based test for named tool choice with Stop finish reason.

This test verifies the generator behavior (Stop remains Stop before jail), but doesn't verify the jail behavior. Test 6 shows that Required+Stop becomes ToolCalls through jail. There should be a corresponding async test verifying that Named+Stop remains Stop (or becomes ToolCalls) through jail.

Consider adding an async version that applies jail transformation:

```rust
#[tokio::test]
async fn test_named_tool_choice_normal_stop_becomes_stop_through_jail() {
    let mut request = create_test_request();
    let tool_choice = Some(ChatCompletionToolChoiceOption::Named(
        ChatCompletionNamedToolChoice {
            r#type: ChatCompletionToolType::Function,
            function: FunctionName {
                name: "get_weather".to_string(),
            },
        },
    ));
    request.inner.tool_choice = tool_choice.clone();
    let mut generator = request.response_generator("req-stop-jail".to_string());
    let backend_output = build_backend_output_with_finish(
        r#"{"location":"Paris","unit":"celsius"}"#,
        common::FinishReason::Stop,
    );
    let raw_response = generator
        .choice_from_postprocessor(backend_output)
        .expect("choice generation");
    let response = apply_jail_transformation(raw_response, tool_choice).await;
    // Verify jail behavior for named tool choice with Stop
    assert_eq!(
        response.choices[0].finish_reason,
        Some(dynamo_async_openai::types::FinishReason::Stop), // or ToolCalls?
    );
}
```
122-196: Consider adding jail-based tests for ContentFilter preservation.

Tests 3 and 4 verify ContentFilter preservation at the generator level, but don't verify jail behavior. If the jail has different handling for ContentFilter vs Length, these cases would be uncovered. Consider adding async versions similar to test 1.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- lib/llm/src/preprocessor.rs (3 hunks)
- lib/llm/src/protocols/openai.rs (2 hunks)
- lib/llm/src/protocols/openai/chat_completions.rs (2 hunks)
- lib/llm/src/protocols/openai/chat_completions/jail.rs (17 hunks)
- lib/llm/src/protocols/openai/common_ext.rs (1 hunks)
- lib/llm/src/protocols/openai/completions.rs (1 hunks)
- lib/llm/src/protocols/openai/tools.rs (1 hunks)
- lib/llm/tests/test_common_ext.rs (1 hunks)
- lib/llm/tests/test_reasoning_parser.rs (2 hunks)
- lib/llm/tests/test_streaming_tool_parsers.rs (1 hunks)
- lib/llm/tests/tool_choice.rs (1 hunks)
- lib/llm/tests/tool_choice_finish_reasons.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (10)
📓 Common learnings
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: The create_choice method exists on multiple different objects in the codebase. The DeltaGenerator::create_choice in lib/llm/src/protocols/openai/chat_completions/delta.rs has its own signature that was updated to include reasoning_content, but other objects in lib/llm/src/engines.rs have their own separate create_choice methods with different signatures that are not related to chat completions.
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: There are two separate DeltaGenerator classes in the codebase: one for chat completions (lib/llm/src/protocols/openai/chat_completions/delta.rs with object "chat.completion.chunk") and one for text completions (lib/llm/src/protocols/openai/completions/delta.rs with object "text_completion"). They have different create_choice method signatures and serve different OpenAI API endpoints. The reasoning parsing functionality is only relevant to the chat completions DeltaGenerator.
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 2932
File: lib/llm/src/protocols/openai/chat_completions/aggregator.rs:66-86
Timestamp: 2025-09-10T05:04:58.417Z
Learning: In the dynamo codebase, tool call chunks from streaming responses always contain complete tool calls (one chunk = one tool call), unlike standard OpenAI streaming where tool calls can be fragmented across multiple chunks. The convert_tool_chunk_to_message_tool_call function correctly assumes complete tool call data in each chunk.
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 2932
File: lib/llm/src/preprocessor.rs:768-844
Timestamp: 2025-09-10T15:27:42.511Z
Learning: In the tool calling jail implementation in lib/llm/src/preprocessor.rs, the design intentionally emits only the first accumulated choice that contains tool calls during unjailing, dropping other accumulated choices. This is a deliberate design decision, not a bug.
📚 Learning: 2025-08-22T19:55:41.608Z
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: The create_choice method exists on multiple different objects in the codebase. The DeltaGenerator::create_choice in lib/llm/src/protocols/openai/chat_completions/delta.rs has its own signature that was updated to include reasoning_content, but other objects in lib/llm/src/engines.rs have their own separate create_choice methods with different signatures that are not related to chat completions.
Applied to files:
lib/llm/tests/tool_choice.rs, lib/llm/src/protocols/openai/completions.rs, lib/llm/src/protocols/openai/chat_completions.rs, lib/llm/tests/tool_choice_finish_reasons.rs, lib/llm/src/protocols/openai/tools.rs, lib/llm/src/protocols/openai/chat_completions/jail.rs
📚 Learning: 2025-09-10T15:27:42.511Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 2932
File: lib/llm/src/preprocessor.rs:768-844
Timestamp: 2025-09-10T15:27:42.511Z
Learning: In the tool calling jail implementation in lib/llm/src/preprocessor.rs, the design intentionally emits only the first accumulated choice that contains tool calls during unjailing, dropping other accumulated choices. This is a deliberate design decision, not a bug.
Applied to files:
lib/llm/tests/tool_choice.rs, lib/llm/tests/test_streaming_tool_parsers.rs, lib/llm/src/protocols/openai/tools.rs, lib/llm/tests/test_reasoning_parser.rs, lib/llm/src/protocols/openai/chat_completions/jail.rs, lib/llm/src/preprocessor.rs
📚 Learning: 2025-08-22T19:55:41.608Z
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: There are two separate DeltaGenerator classes in the codebase: one for chat completions (lib/llm/src/protocols/openai/chat_completions/delta.rs with object "chat.completion.chunk") and one for text completions (lib/llm/src/protocols/openai/completions/delta.rs with object "text_completion"). They have different create_choice method signatures and serve different OpenAI API endpoints. The reasoning parsing functionality is only relevant to the chat completions DeltaGenerator.
Applied to files:
lib/llm/tests/tool_choice.rs, lib/llm/src/protocols/openai/completions.rs, lib/llm/src/protocols/openai/chat_completions.rs, lib/llm/src/protocols/openai/tools.rs, lib/llm/src/protocols/openai/chat_completions/jail.rs
📚 Learning: 2025-09-10T05:04:58.417Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 2932
File: lib/llm/src/protocols/openai/chat_completions/aggregator.rs:66-86
Timestamp: 2025-09-10T05:04:58.417Z
Learning: In the dynamo codebase, tool call chunks from streaming responses always contain complete tool calls (one chunk = one tool call), unlike standard OpenAI streaming where tool calls can be fragmented across multiple chunks. The convert_tool_chunk_to_message_tool_call function correctly assumes complete tool call data in each chunk.
Applied to files:
lib/llm/tests/tool_choice.rs, lib/llm/src/protocols/openai/chat_completions/jail.rs, lib/llm/src/preprocessor.rs
📚 Learning: 2025-09-16T19:47:30.312Z
Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3067
File: lib/llm/src/preprocessor/prompt/template/oai.rs:87-134
Timestamp: 2025-09-16T19:47:30.312Z
Learning: In Dynamo, multimodal requests (containing image_url or other non-text content) are processed through a completely different workflow than text-only requests, so the may_be_fix_msg_content function in lib/llm/src/preprocessor/prompt/template/oai.rs will only encounter text-only content arrays.
Applied to files:
lib/llm/tests/tool_choice.rs
📚 Learning: 2025-09-02T16:46:54.015Z
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 2714
File: lib/llm/src/discovery/model_entry.rs:38-42
Timestamp: 2025-09-02T16:46:54.015Z
Learning: In lib/llm/src/discovery/model_entry.rs, GuanLuo prefers not to add serde defaults for model_type and model_input fields to keep the specification explicit and avoid user errors, relying on atomic deployment strategy to avoid backward compatibility issues.
Applied to files:
lib/llm/src/protocols/openai/common_ext.rs, lib/llm/src/protocols/openai/chat_completions.rs, lib/llm/tests/test_common_ext.rs, lib/llm/src/protocols/openai/tools.rs
📚 Learning: 2025-09-10T22:32:12.978Z
Learnt from: zhongdaor-nv
Repo: ai-dynamo/dynamo PR: 2999
File: lib/parsers/src/tool_calling/harmony/harmony_parser.rs:250-256
Timestamp: 2025-09-10T22:32:12.978Z
Learning: In lib/parsers/src/tool_calling/harmony/harmony_parser.rs, the team prefers to maintain identical code patterns between parse_tool_calls_harmony and parse_tool_calls_harmony_complete functions, including message.content[0] indexing, to ensure consistency between streaming and complete parser implementations.
Applied to files:
lib/llm/tests/test_streaming_tool_parsers.rs, lib/llm/tests/test_reasoning_parser.rs, lib/llm/src/protocols/openai/chat_completions/jail.rs, lib/llm/src/preprocessor.rs
📚 Learning: 2025-08-22T20:10:09.345Z
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/parsers/src/reasoning/gpt_oss_parser.rs:31-37
Timestamp: 2025-08-22T20:10:09.345Z
Learning: StreamableParser from openai-harmony does not implement the Debug trait, which prevents storing it as a field in structs that derive Debug in lib/parsers/src/reasoning/gpt_oss_parser.rs.
Applied to files:
lib/llm/tests/test_streaming_tool_parsers.rs, lib/llm/tests/test_reasoning_parser.rs
📚 Learning: 2025-09-08T21:18:43.478Z
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2936
File: lib/parsers/src/reasoning/granite_parser.rs:42-46
Timestamp: 2025-09-08T21:18:43.478Z
Learning: In GraniteReasoningParser in lib/parsers/src/reasoning/granite_parser.rs, the think_start_tokens and think_end_tokens are hardcoded in the constructor with fixed values, so unwrap() calls on these vectors are safe and won't panic.
Applied to files:
lib/llm/tests/test_reasoning_parser.rs
🧬 Code graph analysis (4)
lib/llm/tests/tool_choice.rs (2)
lib/llm/src/protocols/openai/chat_completions/delta.rs (1)
new (103-134)

lib/llm/src/protocols/openai/chat_completions/jail.rs (5)
default (1192-1194), builder (471-473), new (126-135), new (425-427), new (1026-1034)
lib/llm/src/protocols/openai/common_ext.rs (2)
lib/llm/src/protocols/openai/chat_completions.rs (1)
get_guided_json (162-180)

lib/llm/src/protocols/openai/completions.rs (1)
get_guided_json (186-188)
lib/llm/tests/tool_choice_finish_reasons.rs (1)
lib/llm/src/protocols/openai/chat_completions/jail.rs (2)
default (1192-1194), builder (471-473)
lib/llm/src/preprocessor.rs (3)
lib/llm/src/protocols/openai/chat_completions/jail.rs (2)
tool_call_parser (1069-1072), builder (471-473)

lib/bindings/python/rust/llm/local_model.rs (1)
tool_call_parser (97-99)

lib/llm/src/preprocessor/prompt/template/oai.rs (1)
tool_choice (188-194)
🔇 Additional comments (30)
lib/llm/src/protocols/openai/common_ext.rs (1)
97-97: Trait signature change to owned `Value` is appropriate.

Returning an owned `serde_json::Value` enables implementations to construct new schemas dynamically (e.g., deriving from `tool_choice` + `tools`), which wouldn't be possible with a borrowed reference. The relevant implementations in `chat_completions.rs` and `completions.rs` have been updated accordingly.

lib/llm/src/protocols/openai/tools.rs (1)
65-106: Well-structured schema generation for `required` tool_choice.

The `anyOf` approach correctly allows the model to select any tool while enforcing the schema for each tool's parameters. The `$defs` merging with conflict detection is a good safeguard.

lib/llm/src/protocols/openai/chat_completions/jail.rs (4)
767-792: Immediate mode jail termination logic is correct.

The JSON parsing approach correctly waits for complete, valid JSON before ending the jail. The non-empty array check for `ArrayOfTools` aligns with the `minItems: 1` constraint in the generated schema.
987-1000: Finish reason handling aligns with OpenAI semantics.

Named tool choice keeping `Stop` and required tool choice changing to `ToolCalls` matches OpenAI's behavior, where forced tool invocation via named choice is considered a normal completion, while `required` mode indicates explicit tool call handling.
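The mapping described above can be sketched as a small post-processing function. The enum and function names here are illustrative, not dynamo's actual API:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum ToolChoiceMode {
    Named,
    Required,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum FinishReason {
    Stop,
    Length,
    ContentFilter,
    ToolCalls,
}

/// Adjust the finish reason after the jail stream, per the semantics above:
/// Required + Stop becomes ToolCalls; everything else is preserved as-is.
fn fix_finish_reason(mode: ToolChoiceMode, reason: FinishReason) -> FinishReason {
    match (mode, reason) {
        // Required mode: a normal Stop means the forced tool call completed,
        // so report ToolCalls to the client.
        (ToolChoiceMode::Required, FinishReason::Stop) => FinishReason::ToolCalls,
        // Named mode keeps Stop; Length/ContentFilter are always preserved.
        (_, other) => other,
    }
}
```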
543-562: Emission filtering for jailed choices prevents aggregator issues.

The logic correctly filters out empty deltas for choices that were jailed, avoiding issues where the aggregator receives deltas without roles after unjailing. Based on learnings, this aligns with the jail stream's design.
1092-1106: Builder methods for immediate jail modes are clean.

The `tool_choice_named` and `tool_choice_required` methods provide a clear API for configuring immediate jail modes. Note that calling these will override any previously set `jail_mode`.

lib/llm/src/protocols/openai.rs (2)
20-20: New `tools` module correctly exported.

The public export enables access to `get_json_schema_from_tools` and `ToolChoiceError` for external schema derivation needs.
134-141: Guided JSON now passed as owned value.

The change from `.cloned()` to direct pass-through is correct since `get_guided_json()` now returns an owned `Option<serde_json::Value>`. This simplifies the code and avoids an unnecessary clone.

lib/llm/src/protocols/openai/completions.rs (1)
186-188: Implementation correctly updated for owned return type.

The `.clone()` is necessary since the field is stored as `Option<serde_json::Value>` and the trait now requires returning an owned value.

lib/llm/tests/test_common_ext.rs (1)
93-96: Test expectation correctly updated for owned return type.

The assertion now expects `Some(serde_json::json!({"key": "value"}))` (owned) instead of a borrowed reference, aligning with the updated `get_guided_json()` signature that returns `Option<serde_json::Value>`.

lib/llm/tests/test_reasoning_parser.rs (2)
486-490: Call site correctly updated for new API signature.

The `apply_tool_calling_jail` call now passes the tool parser as `Some(...)` and `None` for `tool_choice`, matching the updated function signature. This is appropriate since this test focuses on reasoning/tool parsing without tool_choice configuration.
599-603: Consistent API adaptation for the ignored test.

Same signature update applied correctly. The `None` for tool_choice maintains the existing test behavior.

lib/llm/src/protocols/openai/chat_completions.rs (1)
162-180: Well-structured guided JSON derivation with tool_choice fallback.

The implementation correctly:
- Prioritizes explicit `guided_json` from the request (cloned for ownership)
- Falls back to deriving a JSON schema from `tool_choice` + `tools` when available
- Gracefully handles derivation failures with a warning log and a `None` return

The ownership change from borrowed to owned is necessary to support the new derivation path that constructs values rather than referencing existing fields.

lib/llm/tests/test_streaming_tool_parsers.rs (1)
lib/llm/tests/test_streaming_tool_parsers.rs (1)
159-164: Call site correctly adapted with clear documentation.

The update wraps `tool_parser` in `Some(...)` and passes `None` for `tool_choice`, matching the new API. The inline comment clarifies intent for future readers.

lib/llm/tests/tool_choice.rs (8)
1-35: Clean test setup with minimal request construction.

The imports and `create_test_request` helper provide a solid foundation for the test suite. The request construction uses sensible defaults with only required fields populated.
37-70: Single-response jail transformation helper is well-structured.

The helper correctly uses the builder pattern and handles both `Named` and `Required` tool choice variants. The `unwrap()` on line 69 is acceptable in test code.
72-110: Streaming variant correctly collects all transformed responses.

The implementation mirrors the single-response variant but collects all outputs via `filter_map` and `collect`, which is appropriate for streaming tests.
173-234: Thorough coverage of parallel tool calls with required tool choice.

The test properly validates:
- Correct `FinishReason::ToolCalls`
- Two tool calls with sequential indices (0, 1)
- Correct IDs (`call-1`, `call-2`)
- Proper function names and arguments extraction
The JSON array parsing for parallel tool calls is well-tested.
236-255: Good coverage of parse failure fallback behavior.

Tests the important edge case where JSON parsing fails, ensuring the raw content is returned rather than being silently dropped. The comment on lines 251-252 clarifies this matches marker-based FC behavior.
257-325: Streaming buffering test validates correct accumulation.

Tests that partial JSON chunks are buffered until complete, then emitted as a single tool call. The chunk boundaries, splitting `{"location":"Paris","unit":"celsius"}` mid-string across chunks, make this a realistic test case.
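A simple way to decide when buffered chunks form complete JSON, as this test exercises, is to track brace/bracket depth outside of string literals. This is a hedged, dependency-free sketch of the idea, not dynamo's actual completeness check:

```rust
/// Returns true once the buffered text forms a complete JSON value:
/// depth of `{}`/`[]` returns to zero and no string literal is left open.
fn json_is_complete(buf: &str) -> bool {
    let (mut depth, mut in_str, mut escape) = (0i32, false, false);
    let mut seen_any = false;
    for c in buf.chars() {
        if escape {
            // Previous char was a backslash inside a string: skip this one.
            escape = false;
            continue;
        }
        match c {
            '\\' if in_str => escape = true,
            '"' => in_str = !in_str,
            '{' | '[' if !in_str => {
                depth += 1;
                seen_any = true;
            }
            '}' | ']' if !in_str => depth -= 1,
            _ => {}
        }
    }
    seen_any && depth == 0 && !in_str
}
```

A jail in immediate mode can call this on its accumulated buffer after each chunk and release the buffered tool call only when it returns true.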
327-406: Parallel tool calls in streaming mode well-tested.

Validates that the jail stream correctly buffers and parses a JSON array split across chunks, producing multiple tool calls in a single response.
408-436: Synchronous test for no-tool-choice baseline behavior.

Good to have this baseline test confirming normal text output flows through without tool call interpretation when `tool_choice` is not set.

lib/llm/tests/tool_choice_finish_reasons.rs (5)
1-15: LGTM on imports and license header.

Imports are well-organized and include all necessary types for testing the jail transformation and finish_reason handling.
16-37: LGTM on test request helper.

Clean minimal setup for test requests.
39-52: LGTM on backend output helper.

Correctly constructs minimal backend output for finish_reason testing.
89-120: LGTM on named tool choice length preservation test.

Correctly verifies that the jail transformation preserves the `Length` finish reason for named tool choice.
227-250: LGTM on required tool choice stop-to-tool_calls transformation test.

Correctly verifies that the jail transforms `Stop` to `ToolCalls` for required tool choice.

lib/llm/src/preprocessor.rs (3)
754-757: LGTM on enabling jail for Named/Required without parser.

This correctly enables immediate jail mode for Named and Required tool_choice, which don't require a parser for marker-based detection since they operate in immediate mode.
780-814: LGTM on jail configuration logic.

The match-based configuration correctly routes:
- Named → immediate jail with a specific function name
- Required → immediate jail in required mode
- Auto/None/unspecified → traditional marker-based jail with parser

The fallthrough case at lines 802-809 is safe because `should_apply_tool_jail` already returns `false` when there's no parser and tool_choice is Auto/None.
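The routing described in that review item can be sketched as a small match. Enum and function names here are hypothetical illustrations of the logic, not dynamo's actual types:

```rust
#[derive(Debug, PartialEq)]
enum ToolChoice {
    Named(String),
    Required,
    Auto,
    None,
}

#[derive(Debug, PartialEq)]
enum JailMode {
    ImmediateNamed(String),
    ImmediateRequired,
    MarkerBased,
}

/// Choose a jail mode from tool_choice; marker-based jailing needs a parser,
/// while the immediate modes do not.
fn select_jail_mode(choice: &ToolChoice, has_parser: bool) -> Option<JailMode> {
    match choice {
        ToolChoice::Named(name) => Some(JailMode::ImmediateNamed(name.clone())),
        ToolChoice::Required => Some(JailMode::ImmediateRequired),
        // Auto/None use the traditional marker-based jail, which needs a parser.
        ToolChoice::Auto | ToolChoice::None if has_parser => Some(JailMode::MarkerBased),
        // No parser and no forced tool choice: no jail is applied at all.
        _ => None,
    }
}
```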
974-979: LGTM on updated call site.

The call correctly passes both `tool_call_parser` and `tool_choice` to the updated `apply_tool_calling_jail` function.
ayushag-nv left a comment
I am OK with the implementation. @grahamking, can you give your Rust expert view on this once?
/ok to test b3c69af

Thanks @vladnosiv!
Merged 238 commits from main branch to bring the feature branch up to date.

Key conflicts resolved:
- Removed lib/kvbm-kernels references (deleted in main)
- Kept nova/nova-backend/kvbm workspace members from feature branch
- Maintained v2 module API refactoring from feature branch
- Updated Cargo.lock files to reflect new dependencies

Major updates from main include:
- LoRA support for vLLM (#4810)
- Multimodal documentation (#4510)
- Scaling adapter features (#4699, #4825)
- Tool calling support (#4822, #4722)
- NIXL connect improvements (#4433)

Signed-off-by: Ryan Olson <rolson@nvidia.com>
Overview:

Implements OpenAI `tool_choice` parameter support for the Chat Completions API.

Details:

Jail Stream Enhancement:
- `JailMode` enum with `MarkerBased` (traditional) and `Immediate` (tool_choice) variants

JSON Schema Generation:
- Derives a JSON schema from `tool_choice` + `tools` via `CommonExtProvider::get_guided_json()` for guided decoding
- Handles `none`, `auto`, `required`, and named tool choice modes

Request Preprocessing:
- Selects the `JailMode` based on the `tool_choice` option

Finish Reasons Handling:
- `fix_finish_reason()` post-processor adjusts finish reasons after the jail stream

Where should the reviewer start?
- lib/llm/src/protocols/openai/chat_completions/jail.rs - `JailMode::Immediate` implementation
- lib/llm/src/protocols/openai/tools.rs - JSON schema generation for guided decoding

Future Work

Per-call streaming for `required` mode to emit individual tool calls as they complete, rather than waiting for the entire array.

Related Issues: