feat: DeepSeek V3.2 chat template support #4797
Conversation
Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>
👋 Hi vladnosiv! Thank you for contributing to ai-dynamo/dynamo.
Walkthrough
This change introduces native DeepSeek V3.2 prompt formatting support to the LLM library.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (2)
lib/llm/tests/deepseek_v32_encoding.rs (1)
41-47: Potential panic on malformed test input.

The code uses `.unwrap()` on `as_object_mut()`, which will panic if the first message isn't an object. While acceptable for tests, consider using `expect()` with a descriptive message for better debugging:

```diff
- first_msg
-     .as_object_mut()
-     .unwrap()
+ first_msg
+     .as_object_mut()
+     .expect("First message should be an object")
```

lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
66-68: Remove unused constant or add usage.

`TOOL_CALLS_TEMPLATE` is marked with `#[allow(dead_code)]` but appears unused. If it's reserved for future use, consider adding a TODO comment explaining the intent. Otherwise, remove it to reduce code clutter.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- lib/llm/src/preprocessor/prompt.rs (1 hunks)
- lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1 hunks)
- lib/llm/src/preprocessor/prompt/template.rs (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_input.json (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_output.txt (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt (1 hunks)
- lib/llm/tests/deepseek_v32_encoding.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.
Applied to files:
lib/llm/src/preprocessor/prompt.rs
🧬 Code graph analysis (3)
lib/llm/src/preprocessor/prompt/template.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (2)
new (420-422), new_thinking (425-427)
lib/llm/tests/deepseek_v32_encoding.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
encode_messages (390-409)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
lib/llm/src/preprocessor/prompt.rs (6)
tools (53-55), messages (52-52), supports_add_generation_prompt (83-83), supports_add_generation_prompt (96-98), render (84-84), render (100-116)
🪛 GitHub Actions: Copyright Checks
lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt
[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt.
lib/llm/tests/data/deepseek-v3.2/test_output.txt
[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output.txt.
🪛 LanguageTool
lib/llm/tests/data/deepseek-v3.2/test_output.txt
[grammar] ~17-~17: Use a hyphen to join words.
Context: ...ibute should be set to "true" for string type parameters and "false" for other ty...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: clippy (.)
- GitHub Check: clippy (launch/dynamo-run)
- GitHub Check: clippy (lib/bindings/python)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (11)
lib/llm/src/preprocessor/prompt.rs (1)
26-26: LGTM! The new `deepseek_v32` module is correctly exposed as public, following the existing pattern in this file.

lib/llm/src/preprocessor/prompt/template.rs (1)
21-31: Model detection relies on `display_name` string matching.

The detection logic is reasonable for the current use case. However, string-based model detection can be fragile if model naming conventions change. Consider documenting the expected model name patterns that will match this logic (e.g., "DeepSeek-V3.2", "DeepSeek-V3.2-Speciale") either in a comment here or in the module documentation.
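For context, substring-based detection along these lines is what the review is describing; the function name and patterns below are illustrative assumptions, not the PR's actual code:

```rust
/// Hypothetical sketch of display-name-based renderer detection.
/// The substring pattern is an assumption for illustration; the PR's
/// actual matching logic may differ.
fn is_deepseek_v32(display_name: &str) -> bool {
    let name = display_name.to_ascii_lowercase();
    // Matches e.g. "DeepSeek-V3.2" and "DeepSeek-V3.2-Speciale", but also
    // any future model whose name embeds the same substring; this is the
    // fragility the review comment points out.
    name.contains("deepseek-v3.2")
}

fn main() {
    assert!(is_deepseek_v32("deepseek-ai/DeepSeek-V3.2"));
    assert!(is_deepseek_v32("DeepSeek-V3.2-Speciale"));
    assert!(!is_deepseek_v32("DeepSeek-V3"));
    println!("ok");
}
```

Documenting which names are expected to match (as the comment suggests) would make accidental matches like new model variants easier to spot.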
lib/llm/tests/data/deepseek-v3.2/test_input.json (1)
1-149: LGTM!Comprehensive test input covering the weather workflow scenario with tools, tool calls, reasoning content, and multi-turn conversation. The JSON structure is valid and exercises the key encoding features.
lib/llm/tests/deepseek_v32_encoding.rs (2)
403-408: Verify chat mode behavior.

The comment states "Chat mode should have `</think>`, thinking mode should have `<think>`", but the assertion on line 407 only checks that thinking mode contains `<think>` without checking that it also contains `</think>`. This may be intentional based on the encoding logic, but it is worth verifying the expected behavior.
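To make the contrast concrete, the two assertions can be written side by side; the sample strings below are invented for illustration, not taken from the PR's test data:

```rust
fn main() {
    // Invented chat-mode output: reasoning is closed off before the answer.
    let chat_out = "question\nreasoning</think>answer";
    // Invented thinking-mode output: the prompt ends with an open tag so
    // generation continues inside the thinking block.
    let thinking_out = "question\n<think>";

    // The assertion the review describes: thinking mode contains <think>.
    assert!(thinking_out.contains("<think>"));
    // The comment's claim about chat mode, which the test apparently
    // does not assert:
    assert!(chat_out.contains("</think>"));
    // Whether thinking mode should *also* contain </think> is the open
    // question; with these sample strings it does not.
    assert!(!thinking_out.contains("</think>"));
    println!("ok");
}
```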
100-106: Verify that referenced test data files exist.

The test references `test_input_search_w_date.json` and `test_output_search_w_date.txt`. Confirm these test data files are present in the repository's test data directory.

lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1)
1-533: LGTM! Comprehensive test input exercising the developer role, multi-step tool invocations (search, open, find), and extensive reasoning content. This provides good coverage for complex agentic workflows.
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (5)
88-119: JSON formatting handles common cases but has edge cases.

The `to_json()` function adds spaces after colons and commas to match Python's `json.dumps()` format. The string detection logic (`prev_char != '\\'`) handles simple escaped quotes but won't correctly handle sequences like `\\"` (escaped backslash followed by quote). For test data compatibility this is likely sufficient, but if edge cases arise, consider using `serde_json::to_string_pretty` with custom formatting or a more robust state machine.
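The edge case is easy to demonstrate in isolation. The sketch below is not the PR's code; it contrasts a naive previous-character check with a backslash-run count, which is the robust rule (a quote is escaped only if preceded by an odd number of backslashes):

```rust
/// Naive: treat a quote as escaped whenever the preceding char is a backslash.
/// Returns whether we believe we are still inside a string at the end of `s`.
fn in_string_at_end_naive(s: &str) -> bool {
    let chars: Vec<char> = s.chars().collect();
    let mut in_string = false;
    for i in 0..chars.len() {
        if chars[i] == '"' && (i == 0 || chars[i - 1] != '\\') {
            in_string = !in_string;
        }
    }
    in_string
}

/// Robust: a quote is escaped only if preceded by an odd number of backslashes.
fn in_string_at_end_robust(s: &str) -> bool {
    let chars: Vec<char> = s.chars().collect();
    let mut in_string = false;
    for i in 0..chars.len() {
        if chars[i] == '"' {
            let mut backslashes = 0;
            let mut j = i;
            while j > 0 && chars[j - 1] == '\\' {
                backslashes += 1;
                j -= 1;
            }
            if backslashes % 2 == 0 {
                in_string = !in_string;
            }
        }
    }
    in_string
}

fn main() {
    // JSON string "a\\": an escaped backslash followed by a real closing
    // quote, so we are *outside* the string afterwards.
    let fragment = r#""a\\""#;
    assert!(in_string_at_end_naive(fragment)); // naive: thinks the quote was escaped
    assert!(!in_string_at_end_robust(fragment)); // robust: correctly sees it closed
    println!("ok");
}
```

Misclassifying the string boundary here would cause the formatter to insert (or skip) spaces inside string literals for any test data containing `\\"`.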
278-285: Verify thinking mode logic for assistant messages.

The condition `last_user_idx.is_some_and(|idx| index > idx)` correctly checks if this assistant message comes after the last user message. The comment on lines 276-277 clarifies that `<think>` was already added in the user message rendering. This is correct behavior; just noting for reviewers that the thinking tag handling is split across user and assistant message rendering.
360-371: Verify tool result thinking tag logic.

The condition `index >= idx` (line 362) triggers thinking start when the tool result index is greater than or equal to `last_user_idx`. This differs from the assistant handling, which uses `index > idx`. Is the `>=` intentional? If the tool result is at exactly the same index as the last user message (which shouldn't happen structurally), this could produce unexpected output. The `>` comparison would be more consistent with the assistant handling.
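The two predicates diverge only when the message index equals the last-user index, which is why the review calls the case structurally impossible but worth confirming. A tiny standalone sketch (function names are illustrative, not the PR's code):

```rust
/// The two predicates contrasted in the review, as standalone functions.
fn assistant_needs_think(index: usize, last_user_idx: usize) -> bool {
    index > last_user_idx // used for assistant messages
}

fn tool_needs_think(index: usize, last_user_idx: usize) -> bool {
    index >= last_user_idx // used for tool results
}

fn main() {
    let last_user_idx = 3;
    for index in 0..6 {
        if index == last_user_idx {
            // The only divergence: a tool result at the last-user index.
            assert!(tool_needs_think(index, last_user_idx));
            assert!(!assistant_needs_think(index, last_user_idx));
        } else {
            assert_eq!(
                assistant_needs_think(index, last_user_idx),
                tool_needs_think(index, last_user_idx)
            );
        }
    }
    println!("ok");
}
```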
1-3: LGTM on overall implementation. Clean native Rust implementation of DeepSeek V3.2 prompt formatting. The code follows the official Python reference implementation structure and handles the key features: system/user/developer/assistant/tool messages, DSML token encoding, thinking mode, and tool call formatting.
440-458: Verify tool handling in the `render()` method.

The `render()` method converts messages but doesn't call `req.tools()`. If the request provides tools separately via `req.tools()`, they won't be rendered unless pre-injected into messages. Confirm whether tools are always guaranteed to be embedded in messages before `render()` is called, or whether tool injection logic is needed.
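If injection does turn out to be needed, one option is to fold separately supplied tools into the message list before encoding. A rough standalone sketch; every type and name here is hypothetical, and the PR's actual request/formatter API may look quite different:

```rust
/// Hypothetical request shape, invented for illustration.
struct Request {
    messages: Vec<String>,
    tools: Option<Vec<String>>, // serialized tool schemas
}

/// Fold separately supplied tools into the message list before encoding,
/// so a renderer that only walks messages still sees them.
fn inject_tools(req: &Request) -> Vec<String> {
    let mut messages = req.messages.clone();
    if let Some(tools) = &req.tools {
        // Prepend a synthetic system-style entry carrying the tool schemas.
        messages.insert(0, format!("tools: [{}]", tools.join(", ")));
    }
    messages
}

fn main() {
    let req = Request {
        messages: vec!["user: what's the weather?".to_string()],
        tools: Some(vec!["get_weather".to_string()]),
    };
    let rendered = inject_tools(&req);
    assert_eq!(rendered.len(), 2);
    assert!(rendered[0].contains("get_weather"));
    println!("ok");
}
```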
Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>
/ok to test 82faa91
@grahamking, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/ |
/ok to test 4b6b97b
dmitry-tokarev-nv left a comment:

added a comment wrt attribution
/ok to test 6652fc9
/ok to test c462de6
Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Overview:
Add DeepSeek V3.2 chat template support (tool call parser support is not included yet)
NOTE: ~5k lines here are the official test files (json|txt) from DeepSeek: https://huggingface.co/deepseek-ai/DeepSeek-V3.2/tree/main/encoding
Details:
- `from_mdc` for dsv 3.2 and dsv 3.2-Speciale

Where should the reviewer start?
- `lib/llm/src/preprocessor/prompt/deepseek_v32.rs`: main implementation
- `lib/llm/tests/deepseek_v32_encoding.rs`: tests with official test data
- `lib/llm/src/preprocessor/prompt/template.rs`: renderer auto-detection

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Relates to #4796