feat: reasoning parser transformation #3295

ayushag-nv · 2025-09-29T22:43:08Z

Overview:

Earlier reasoning parsing was happening as a part of DeltaCreation Process. Ideally, it is a separate post/pre processing step that should happen either after delta creation or before it. Keeping the design of the code impact, this PR moves the reasoning parsing to a separate transformation step similar to tool call parsing step.

The transformation step happens after delta creation and applies standard stream folding logic.

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: [REFACTOR]: Relocation tool call and reasoning parser to the same location #3176

Summary by CodeRabbit

New Features
- Optional reasoning-content extraction in streaming responses, exposing separated reasoning and normal text when enabled (supports Basic, Mistral, Kimi formats).
Bug Fixes
- More reliable end-of-stream signaling in chat streaming responses.
- Preserves whitespace in reasoning content for accurate display.
Refactor
- Simplified streaming path by removing embedded reasoning parsing from the generator; end-user behavior unchanged unless reasoning parsing is enabled.
Tests
- Added comprehensive tests covering reasoning-content parsing across multiple parser styles.

Signed-off-by: ayushag <[email protected]>

copy-pr-bot · 2025-09-29T22:43:13Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: ayushag <[email protected]>

coderabbitai · 2025-09-30T17:47:16Z

Walkthrough

Implements a streaming API signature change for choice creation, relocates reasoning-content parsing from the delta generator to the preprocessor with a new streaming wrapper, expands preprocessor public fields and methods, adjusts tests accordingly, and alters a parser to preserve whitespace in reasoning text.

Changes

Cohort / File(s)	Summary of Changes
Choice API update (streaming) `lib/llm/src/engines.rs`, `lib/llm/tests/http-service.rs`, `lib/llm/tests/http_metrics.rs`	Updated calls to create_choice to a four-argument form; finish reason repositioned; removed trailing None(s). Tests aligned with new signature.
Preprocessor reasoning parsing integration `lib/llm/src/preprocessor.rs`, `lib/llm/tests/test_reasoning_parser.rs`	Added on-demand reasoning parsing via parse_reasoning_content_from_stream, ReasoningState, and ReasoningParser integration; expanded OpenAIPreprocessor public fields; added comprehensive tests for multiple parser styles and passthrough behavior.
Delta generator simplification (remove reasoning handling) `lib/llm/src/protocols/openai/chat_completions/delta.rs`	Removed reasoning_parser state and related methods; create_choice no longer accepts reasoning_content; streaming deltas no longer include reasoning_content; post-processing path simplified.
Parser behavior adjustment (whitespace) `lib/parsers/src/reasoning/base_parser.rs`	BasicReasoningParser now preserves leading/trailing whitespace in reasoning_text (removed trim).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Preprocessor
  participant Stream as Model Stream
  participant DeltaGen as DeltaGenerator
  participant API as OpenAI-Compatible API

  Note over Client,API: New flow (reasoning parsed in preprocessor)

  Client->>Preprocessor: Request chat completion (stream)
  Preprocessor->>Preprocessor: should_parse_reasoning?
  alt Reasoning parser configured
    Preprocessor->>Preprocessor: parse_reasoning_content_from_stream()
    Preprocessor->>Stream: Initialize wrapped stream with ReasoningParser
    loop For each chunk
      Stream-->>Preprocessor: Chunk { content, reasoning_content? }
      Preprocessor->>Preprocessor: Update delta.content and delta.reasoning_content
      Preprocessor-->>DeltaGen: Annotated chunk (content/reasoning split)
      DeltaGen->>API: create_choice(index, text, finish_reason, logprobs)
    end
  else No parser configured
    Preprocessor-->>DeltaGen: Pass-through stream chunks
    loop For each chunk
      DeltaGen->>API: create_choice(index, text, finish_reason, logprobs)
    end
  end
  API-->>Client: Streamed responses

sequenceDiagram
  autonumber
  participant Client
  participant DeltaGen as DeltaGenerator
  participant API as OpenAI-Compatible API

  Note over Client,API: Previous flow (reasoning parsed in DeltaGenerator)

  Client->>DeltaGen: Request chat completion (stream)
  loop For each chunk
    DeltaGen->>DeltaGen: create_reasoning_content(text, tokens)
    DeltaGen->>API: create_choice(index, normal_text, finish_reason, logprobs)
    Note right of API: reasoning_content previously embedded/handled here
  end
  API-->>Client: Streamed responses

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I thump my paw: the streams now flow,
With thoughts up front, not far below.
The delta’s light, its load made lean,
While preprocessors parse unseen.
Whitespace kept, no trims tonight—
Four args hop in, and all’s alright. 🥕🐇

Pre-merge checks

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description Check	⚠️ Warning	The pull request description follows the required template by including Overview and Related Issues sections, but the Details and Where should the reviewer start? sections remain unfilled with placeholder text instead of concrete change descriptions and file references, leaving critical information incomplete.	Please update the Details section to describe the specific code changes introduced by this PR and populate the Where should the reviewer start? section with the paths or modules that reviewers should examine first.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title “feat: reasoning parser transformation” succinctly conveys that this pull request introduces a feature-level change centering on refactoring or transforming the reasoning parser, which aligns with the main modifications observed across multiple modules. It follows the conventional commit style with a clear prefix and accurately reflects the primary intent without extraneous details.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 97a9854 and a1cc09a.

📒 Files selected for processing (7)

lib/llm/src/engines.rs (1 hunks)
lib/llm/src/preprocessor.rs (4 hunks)
lib/llm/src/protocols/openai/chat_completions/delta.rs (2 hunks)
lib/llm/tests/http-service.rs (1 hunks)
lib/llm/tests/http_metrics.rs (1 hunks)
lib/llm/tests/test_reasoning_parser.rs (1 hunks)
lib/parsers/src/reasoning/base_parser.rs (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-08-22T19:55:41.608Z

Learnt from: nachiketb-nvidia
PR: ai-dynamo/dynamo#2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: The create_choice method exists on multiple different objects in the codebase. The DeltaGenerator::create_choice in lib/llm/src/protocols/openai/chat_completions/delta.rs has its own signature that was updated to include reasoning_content, but other objects in lib/llm/src/engines.rs have their own separate create_choice methods with different signatures that are not related to chat completions.

Applied to files:

lib/llm/src/engines.rs
lib/llm/tests/http-service.rs
lib/llm/src/protocols/openai/chat_completions/delta.rs

📚 Learning: 2025-08-22T19:55:41.608Z

Learnt from: nachiketb-nvidia
PR: ai-dynamo/dynamo#2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: There are two separate DeltaGenerator classes in the codebase: one for chat completions (lib/llm/src/protocols/openai/chat_completions/delta.rs with object "chat.completion.chunk") and one for text completions (lib/llm/src/protocols/openai/completions/delta.rs with object "text_completion"). They have different create_choice method signatures and serve different OpenAI API endpoints. The reasoning parsing functionality is only relevant to the chat completions DeltaGenerator.

Applied to files:

lib/llm/src/protocols/openai/chat_completions/delta.rs

🧬 Code graph analysis (5)

lib/llm/tests/http_metrics.rs (1)

lib/llm/src/protocols/openai/chat_completions/delta.rs (1)

create_choice (197-239)

lib/llm/src/preprocessor.rs (2)

lib/llm/src/protocols/openai.rs (1)

new (242-247)

lib/parsers/src/reasoning/mod.rs (1)

get_reasoning_parser_from_name (178-194)

lib/llm/src/engines.rs (2)

lib/llm/src/protocols/openai/chat_completions/delta.rs (1)

create_choice (197-239)

lib/llm/src/protocols/openai/completions/delta.rs (1)

create_choice (142-170)

lib/llm/tests/http-service.rs (2)

lib/llm/src/protocols/openai/chat_completions/delta.rs (1)

create_choice (197-239)

lib/llm/src/protocols/openai/completions/delta.rs (1)

create_choice (142-170)

lib/llm/tests/test_reasoning_parser.rs (2)

lib/llm/src/preprocessor.rs (2)

parse_reasoning_content_from_stream (682-727)

new (114-120)

lib/parsers/src/reasoning/base_parser.rs (1)

new (17-31)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build and Test - dynamo

🔇 Additional comments (1)

lib/llm/src/protocols/openai/chat_completions/delta.rs (1)

197-239: Updated four-arg create_choice looks good

Signature and call-site updates line up with the streamlined streaming API, and the delta payload still mirrors the OpenAI contract.

lib/llm/src/preprocessor.rs

Signed-off-by: ayushag <[email protected]>

elyasmnvidian

Can we add a very simple test case with reasoning + tool calling for nemotron and gpt oss. We can try to even rearrange it with (Reasoning, text, tool calling) and do all permutations of order to see if it fails in any case.

Signed-off-by: ayushag <[email protected]>

…arser-refactor' into ayushag/reasoning-parser-refactor

ayushag-nv · 2025-10-01T05:16:07Z

@zhongdaor-nv

DEBUG: Reasoning parsed chunk: Annotated {
    data: Some(
        CreateChatCompletionStreamResponse {
            id: "test-id",
            choices: [
                ChatChoiceStream {
                    index: 0,
                    delta: ChatCompletionStreamResponseDelta {
                        content: None,
                        function_call: None,
                        tool_calls: None,
                        role: Some(
                            Assistant,
                        ),
                        refusal: None,
                        reasoning_content: Some(
                            "Let me analyze this request. I need to get the current weather for San Francisco.",
                        ),
                    },
                    finish_reason: None,
                    logprobs: None,
                },
            ],
            created: 1234567890,
            model: "test-model",
            service_tier: None,
            system_fingerprint: Some(
                "test-fingerprint",
            ),
            object: "chat.completion.chunk",
            usage: None,
        },
    ),
    id: Some(
        "test-id",
    ),
    event: None,
    comment: None,
}
DEBUG: Reasoning parsed chunk: Annotated {
    data: Some(
        CreateChatCompletionStreamResponse {
            id: "test-id",
            choices: [
                ChatChoiceStream {
                    index: 0,
                    delta: ChatCompletionStreamResponseDelta {
                        content: None,
                        function_call: None,
                        tool_calls: None,
                        role: Some(
                            Assistant,
                        ),
                        refusal: None,
                        reasoning_content: None,
                    },
                    finish_reason: None,
                    logprobs: None,
                },
            ],
            created: 1234567890,
            model: "test-model",
            service_tier: None,
            system_fingerprint: Some(
                "test-fingerprint",
            ),
            object: "chat.completion.chunk",
            usage: None,
        },
    ),
    id: Some(
        "test-id",
    ),
    event: None,
    comment: None,
}
DEBUG: Reasoning parsed chunk: Annotated {
    data: Some(
        CreateChatCompletionStreamResponse {
            id: "test-id",
            choices: [
                ChatChoiceStream {
                    index: 0,
                    delta: ChatCompletionStreamResponseDelta {
                        content: Some(
                            "I'll check the weather for you.",
                        ),
                        function_call: None,
                        tool_calls: None,
                        role: Some(
                            Assistant,
                        ),
                        refusal: None,
                        reasoning_content: None,
                    },
                    finish_reason: None,
                    logprobs: None,
                },
            ],
            created: 1234567890,
            model: "test-model",
            service_tier: None,
            system_fingerprint: Some(
                "test-fingerprint",
            ),
            object: "chat.completion.chunk",
            usage: None,
        },
    ),
    id: Some(
        "test-id",
    ),
    event: None,
    comment: None,
}

Current reasoning parser implementation does not work well when there is tool calling output in the input stream

Signed-off-by: ayushag <[email protected]>

ayushag-nv · 2025-10-01T16:17:23Z

Can we add a very simple test case with reasoning + tool calling for nemotron and gpt oss. We can try to even rearrange it with (Reasoning, text, tool calling) and do all permutations of order to see if it fails in any case.

@elyasmnvidian Added nemotron and gpt-oss test. Gpt-oss test fails , I think there is some fundamental issue with the parsing. I will fix in follow up

done

elyasmnvidian

Nice, approved

ayushag-nv · 2025-10-01T16:23:37Z

/ok to test cfd7f8a

zhongdaor-nv

Thanks Ayush!

Signed-off-by: ayushag <[email protected]>

feat: reasoning parser transformation

ebeeab9

Signed-off-by: ayushag <[email protected]>

ayushag-nv requested a review from GuanLuo September 29, 2025 22:43

ayushag-nv requested a review from a team as a code owner September 29, 2025 22:43

pull-request-size bot added the size/L label Sep 29, 2025

ayushag-nv marked this pull request as draft September 29, 2025 22:43

github-actions bot added the feat label Sep 29, 2025

ayushag-nv changed the title ~~feat: reasoning parser transformation~~ [WIP] feat: reasoning parser transformation Sep 29, 2025

ayushag-nv added 2 commits September 29, 2025 23:26

fix: tests

a8410c2

Signed-off-by: ayushag <[email protected]>

chore: better code now

a1cc09a

Signed-off-by: ayushag <[email protected]>

pull-request-size bot added size/XL and removed size/L labels Sep 30, 2025

ayushag-nv requested a review from grahamking September 30, 2025 17:23

ayushag-nv changed the title ~~[WIP] feat: reasoning parser transformation~~ feat: reasoning parser transformation Sep 30, 2025

ayushag-nv marked this pull request as ready for review September 30, 2025 17:41

coderabbitai bot reviewed Sep 30, 2025

View reviewed changes

lib/llm/src/preprocessor.rs Show resolved Hide resolved

fix: coderabbit

4193cc0

Signed-off-by: ayushag <[email protected]>

elyasmnvidian previously requested changes Sep 30, 2025

View reviewed changes

ayushag-nv added 4 commits September 30, 2025 21:35

feat: reasoning parser transformation

4d8feb9

Signed-off-by: ayushag <[email protected]>

fix: tests

ffb7e26

Signed-off-by: ayushag <[email protected]>

chore: better code now

7b58b38

Signed-off-by: ayushag <[email protected]>

fix: coderabbit

ddd0bdf

Signed-off-by: ayushag <[email protected]>

ayushag-nv force-pushed the ayushag/reasoning-parser-refactor branch from 4193cc0 to ddd0bdf Compare October 1, 2025 04:35

Merge remote-tracking branch 'refs/remotes/origin/ayushag/reasoning-p…

aae879b

…arser-refactor' into ayushag/reasoning-parser-refactor

ayushag-nv added 2 commits October 1, 2025 05:43

chore: add e2e reasoning + tool tests

2e10fdf

Signed-off-by: ayushag <[email protected]>

fix: clippy

cfd7f8a

Signed-off-by: ayushag <[email protected]>

elyasmnvidian self-requested a review October 1, 2025 16:18

elyasmnvidian approved these changes Oct 1, 2025

View reviewed changes

ayushag-nv requested a review from rmccorm4 October 1, 2025 17:03

rmccorm4 requested a review from zhongdaor-nv October 1, 2025 17:05

zhongdaor-nv approved these changes Oct 1, 2025

View reviewed changes

rmccorm4 approved these changes Oct 1, 2025

View reviewed changes

rmccorm4 merged commit 2a2bf58 into main Oct 1, 2025
23 checks passed

rmccorm4 deleted the ayushag/reasoning-parser-refactor branch October 1, 2025 17:43

GuanLuo mentioned this pull request Oct 6, 2025

[REFACTOR]: Relocation tool call and reasoning parser to the same location #3176

Closed

nv-tusharma pushed a commit that referenced this pull request Oct 20, 2025

feat: reasoning parser transformation (#3295)

f582d90

Signed-off-by: ayushag <[email protected]>

rmccorm4 added the frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` label Nov 6, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: reasoning parser transformation #3295

feat: reasoning parser transformation #3295

Uh oh!

ayushag-nv commented Sep 29, 2025 •

edited

Loading

Uh oh!

copy-pr-bot bot commented Sep 29, 2025

Uh oh!

coderabbitai bot commented Sep 30, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

elyasmnvidian left a comment

Uh oh!

ayushag-nv commented Oct 1, 2025

Uh oh!

ayushag-nv commented Oct 1, 2025

Uh oh!

elyasmnvidian left a comment

Uh oh!

ayushag-nv commented Oct 1, 2025

Uh oh!

zhongdaor-nv left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

feat: reasoning parser transformation #3295

feat: reasoning parser transformation #3295

Uh oh!

Conversation

ayushag-nv commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Uh oh!

copy-pr-bot bot commented Sep 29, 2025

Uh oh!

coderabbitai bot commented Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

elyasmnvidian left a comment

Choose a reason for hiding this comment

Uh oh!

ayushag-nv commented Oct 1, 2025

Uh oh!

ayushag-nv commented Oct 1, 2025

Uh oh!

elyasmnvidian left a comment

Choose a reason for hiding this comment

Uh oh!

ayushag-nv commented Oct 1, 2025

Uh oh!

zhongdaor-nv left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

ayushag-nv commented Sep 29, 2025 •

edited

Loading

coderabbitai bot commented Sep 30, 2025 •

edited

Loading