feat: reasoning parser transformation #3295
Signed-off-by: ayushag <[email protected]>
Walkthrough
Implements a streaming API signature change for choice creation, relocates reasoning-content parsing from the delta generator to the preprocessor with a new streaming wrapper, expands preprocessor public fields and methods, adjusts tests accordingly, and alters a parser to preserve whitespace in reasoning text.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Client
participant Preprocessor
participant Stream as Model Stream
participant DeltaGen as DeltaGenerator
participant API as OpenAI-Compatible API
Note over Client,API: New flow (reasoning parsed in preprocessor)
Client->>Preprocessor: Request chat completion (stream)
Preprocessor->>Preprocessor: should_parse_reasoning?
alt Reasoning parser configured
Preprocessor->>Preprocessor: parse_reasoning_content_from_stream()
Preprocessor->>Stream: Initialize wrapped stream with ReasoningParser
loop For each chunk
Stream-->>Preprocessor: Chunk { content, reasoning_content? }
Preprocessor->>Preprocessor: Update delta.content and delta.reasoning_content
Preprocessor-->>DeltaGen: Annotated chunk (content/reasoning split)
DeltaGen->>API: create_choice(index, text, finish_reason, logprobs)
end
else No parser configured
Preprocessor-->>DeltaGen: Pass-through stream chunks
loop For each chunk
DeltaGen->>API: create_choice(index, text, finish_reason, logprobs)
end
end
API-->>Client: Streamed responses
sequenceDiagram
autonumber
participant Client
participant DeltaGen as DeltaGenerator
participant API as OpenAI-Compatible API
Note over Client,API: Previous flow (reasoning parsed in DeltaGenerator)
Client->>DeltaGen: Request chat completion (stream)
loop For each chunk
DeltaGen->>DeltaGen: create_reasoning_content(text, tokens)
DeltaGen->>API: create_choice(index, normal_text, finish_reason, logprobs)
Note right of API: reasoning_content previously embedded/handled here
end
API-->>Client: Streamed responses
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
lib/llm/src/engines.rs (1 hunks)
lib/llm/src/preprocessor.rs (4 hunks)
lib/llm/src/protocols/openai/chat_completions/delta.rs (2 hunks)
lib/llm/tests/http-service.rs (1 hunks)
lib/llm/tests/http_metrics.rs (1 hunks)
lib/llm/tests/test_reasoning_parser.rs (1 hunks)
lib/parsers/src/reasoning/base_parser.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-22T19:55:41.608Z
Learnt from: nachiketb-nvidia
PR: ai-dynamo/dynamo#2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: The create_choice method exists on multiple different objects in the codebase. The DeltaGenerator::create_choice in lib/llm/src/protocols/openai/chat_completions/delta.rs has its own signature that was updated to include reasoning_content, but other objects in lib/llm/src/engines.rs have their own separate create_choice methods with different signatures that are not related to chat completions.
Applied to files:
lib/llm/src/engines.rs
lib/llm/tests/http-service.rs
lib/llm/src/protocols/openai/chat_completions/delta.rs
📚 Learning: 2025-08-22T19:55:41.608Z
Learnt from: nachiketb-nvidia
PR: ai-dynamo/dynamo#2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: There are two separate DeltaGenerator classes in the codebase: one for chat completions (lib/llm/src/protocols/openai/chat_completions/delta.rs with object "chat.completion.chunk") and one for text completions (lib/llm/src/protocols/openai/completions/delta.rs with object "text_completion"). They have different create_choice method signatures and serve different OpenAI API endpoints. The reasoning parsing functionality is only relevant to the chat completions DeltaGenerator.
Applied to files:
lib/llm/src/protocols/openai/chat_completions/delta.rs
🧬 Code graph analysis (5)
lib/llm/tests/http_metrics.rs (1)
lib/llm/src/protocols/openai/chat_completions/delta.rs (1)
create_choice(197-239)
lib/llm/src/preprocessor.rs (2)
lib/llm/src/protocols/openai.rs (1)
new(242-247)
lib/parsers/src/reasoning/mod.rs (1)
get_reasoning_parser_from_name(178-194)
lib/llm/src/engines.rs (2)
lib/llm/src/protocols/openai/chat_completions/delta.rs (1)
create_choice(197-239)
lib/llm/src/protocols/openai/completions/delta.rs (1)
create_choice(142-170)
lib/llm/tests/http-service.rs (2)
lib/llm/src/protocols/openai/chat_completions/delta.rs (1)
create_choice(197-239)
lib/llm/src/protocols/openai/completions/delta.rs (1)
create_choice(142-170)
lib/llm/tests/test_reasoning_parser.rs (2)
lib/llm/src/preprocessor.rs (2)
parse_reasoning_content_from_stream(682-727)
new(114-120)
lib/parsers/src/reasoning/base_parser.rs (1)
new(17-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
lib/llm/src/protocols/openai/chat_completions/delta.rs (1)
197-239: Updated four-arg create_choice looks good
Signature and call-site updates line up with the streamlined streaming API, and the delta payload still mirrors the OpenAI contract.
elyasmnvidian
left a comment
Can we add a very simple test case with reasoning + tool calling for nemotron and gpt-oss? We could even rearrange the pieces (reasoning, text, tool calling) and run every permutation of their order to see whether any case fails.
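A rough sketch of the permutation idea, not the test that was added in this PR. It assumes hypothetical `<think>…</think>` and `<TOOLCALL>…</TOOLCALL>` markers and a toy whole-output parser (`parse_all`, `take_tagged`, `permutations`, and `check_all_orders` are illustrative names, not functions from this repository). A real test would drive the actual nemotron and gpt-oss parsers through the streaming path instead.

```rust
/// Toy result type: the three kinds of output the reviewer wants to tell apart.
#[derive(Debug, Default, PartialEq)]
struct Parsed {
    reasoning: String,
    tool_calls: String,
    text: String,
}

/// If `rest` starts with `open`, consumes through `close` and returns the body.
fn take_tagged<'a>(rest: &mut &'a str, open: &str, close: &str) -> Option<&'a str> {
    let current: &'a str = *rest;
    let after_open = current.strip_prefix(open)?;
    let end = after_open.find(close)?;
    *rest = &after_open[end + close.len()..];
    Some(&after_open[..end])
}

/// Hypothetical whole-output parser using assumed marker strings.
fn parse_all(input: &str) -> Parsed {
    let mut out = Parsed::default();
    let mut rest = input;
    while !rest.is_empty() {
        if let Some(body) = take_tagged(&mut rest, "<think>", "</think>") {
            out.reasoning.push_str(body);
        } else if let Some(body) = take_tagged(&mut rest, "<TOOLCALL>", "</TOOLCALL>") {
            out.tool_calls.push_str(body);
        } else {
            // Anything outside a recognized block is plain text.
            let ch = rest.chars().next().unwrap();
            out.text.push(ch);
            rest = &rest[ch.len_utf8()..];
        }
    }
    out
}

/// Returns every ordering of `items` (fine for a handful of segments).
fn permutations<T: Clone>(items: &[T]) -> Vec<Vec<T>> {
    if items.len() <= 1 {
        return vec![items.to_vec()];
    }
    let mut out = Vec::new();
    for i in 0..items.len() {
        let mut rest = items.to_vec();
        let head = rest.remove(i);
        for mut tail in permutations(&rest) {
            tail.insert(0, head.clone());
            out.push(tail);
        }
    }
    out
}

/// Checks that each segment is recovered regardless of order; returns the number of orders tried.
fn check_all_orders() -> usize {
    let segments = [
        "<think>plan the call</think>",
        "Here is the answer.",
        "<TOOLCALL>[{\"name\":\"get_weather\"}]</TOOLCALL>",
    ];
    let orders = permutations(&segments);
    for order in &orders {
        let parsed = parse_all(&order.concat());
        assert_eq!(parsed.reasoning, "plan the call", "order: {order:?}");
        assert_eq!(parsed.text, "Here is the answer.", "order: {order:?}");
        assert_eq!(parsed.tool_calls, "[{\"name\":\"get_weather\"}]", "order: {order:?}");
    }
    orders.len()
}
```

Calling `check_all_orders()` from a `#[test]` function exercises all six orderings; the failing order is printed in the assertion message, which makes it easier to see which arrangement a parser mishandles.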
4193cc0 to ddd0bdf
…arser-refactor' into ayushag/reasoning-parser-refactor
The current reasoning parser implementation does not work well when the input stream contains tool-calling output.
@elyasmnvidian Added nemotron and gpt-oss tests. The gpt-oss test fails; I think there is some fundamental issue with the parsing. I will fix it in a follow-up.
elyasmnvidian
left a comment
Nice, approved
/ok to test cfd7f8a
zhongdaor-nv
left a comment
Thanks Ayush!
Overview:
Previously, reasoning parsing happened as part of the delta creation process. Ideally, it is a separate pre- or post-processing step that runs either before or after delta creation. With the impact on the existing code design in mind, this PR moves reasoning parsing into a separate transformation step, similar to the tool call parsing step.
The transformation step runs after delta creation and applies standard stream folding logic.
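To make "stream folding" concrete, here is a minimal sketch under stated assumptions: `Delta`, `TagParser`, and `split_reasoning_stream` are hypothetical stand-ins (the PR's real entry point is `parse_reasoning_content_from_stream`, whose signature is not shown here), the stream is modeled as a plain `Iterator` rather than an async stream, and `<think>`/`</think>` are assumed markers. The point is only that parser state is carried across chunks while each delta is rewritten in place.

```rust
/// Hypothetical stand-in for a streamed chat delta; not the PR's real type.
#[derive(Debug, Default, Clone, PartialEq)]
struct Delta {
    content: Option<String>,
    reasoning_content: Option<String>,
}

/// Toy parser: text between `<think>` and `</think>` counts as reasoning.
#[derive(Default)]
struct TagParser {
    in_reasoning: bool,
    /// Tail of the previous chunk that might be the start of a tag.
    pending: String,
}

impl TagParser {
    /// Splits one chunk into (reasoning, normal) text, carrying state across calls.
    fn parse_chunk(&mut self, text: &str) -> (String, String) {
        let mut buf = std::mem::take(&mut self.pending);
        buf.push_str(text);
        let (mut reasoning, mut normal) = (String::new(), String::new());
        loop {
            let tag = if self.in_reasoning { "</think>" } else { "<think>" };
            let sink = if self.in_reasoning { &mut reasoning } else { &mut normal };
            if let Some(pos) = buf.find(tag) {
                // Emit everything before the tag, drop the tag, flip the mode.
                sink.push_str(&buf[..pos]);
                buf = buf.split_off(pos + tag.len());
                self.in_reasoning = !self.in_reasoning;
                continue;
            }
            // Hold back a trailing partial tag (e.g. "<thi") until the next chunk.
            let keep = (1..tag.len())
                .rev()
                .find(|&k| buf.ends_with(&tag[..k]))
                .unwrap_or(0);
            let split = buf.len() - keep;
            sink.push_str(&buf[..split]);
            self.pending = buf.split_off(split);
            return (reasoning, normal);
        }
    }
}

/// Wraps a stream of deltas and splits `content` into reasoning and normal text.
fn split_reasoning_stream<I>(chunks: I) -> impl Iterator<Item = Delta>
where
    I: Iterator<Item = Delta>,
{
    let mut parser = TagParser::default();
    chunks.map(move |mut delta| {
        if let Some(text) = delta.content.take() {
            let (reasoning, normal) = parser.parse_chunk(&text);
            delta.reasoning_content = (!reasoning.is_empty()).then_some(reasoning);
            delta.content = (!normal.is_empty()).then_some(normal);
        }
        delta
    })
}
```

Because the wrapper only consumes and produces deltas, it can sit after delta creation without the delta generator knowing about it. This sketch never flushes `pending` at end of stream, which a real transformation would need to handle.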
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests