
Conversation


@elyasmnvidian elyasmnvidian commented Oct 10, 2025

Fixes: #3599

Summary

This PR fixes critical issues with the DeepSeek tool call parser for streaming JSON responses and adds comprehensive integration testing infrastructure.

Key Changes

  • Fixed DeepSeek tool call parser to correctly handle streaming JSON responses
  • Added parallel tool call integration tests to validate multi-tool scenarios
  • Enhanced tool calling configuration for improved parser integration
  • Created helper tooling for generating and capturing tool call stream test data

Problem Statement

The DeepSeek tool call parser was failing to correctly parse streaming JSON responses, particularly when handling tool calls with complex arguments. The parser underwent several iterations to handle edge cases in the streaming format, requiring significant refactoring of the JSON parsing logic.
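The core failure mode can be illustrated with a minimal sketch (Python for illustration; the actual parser is Rust, and `try_parse_buffered` is a hypothetical helper, not code from this PR): tool-call arguments often arrive split across several stream chunks, so parsing any single chunk in isolation fails, and the parser must buffer until the accumulated text forms complete JSON.

```python
import json

def try_parse_buffered(chunks):
    """Accumulate stream chunks and attempt a parse only once the
    buffered text forms complete JSON. Parsing any single chunk in
    isolation raises json.JSONDecodeError."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        try:
            return json.loads(buffer)  # succeeds only when complete
        except json.JSONDecodeError:
            continue  # arguments still fragmented; keep buffering
    return None  # stream ended before the JSON was complete

# Tool-call arguments fragmented across three stream chunks:
chunks = ['{"name": "get_weather", ', '"arguments": {"city": ', '"Paris"}}']
print(try_parse_buffered(chunks))
# → {'name': 'get_weather', 'arguments': {'city': 'Paris'}}
```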

Changes

1. DeepSeek Parser Refactoring (lib/parsers/src/tool_calling/json/deepseek_parser.rs)

  • ~328 lines modified, substantially refactoring the JSON parsing logic
  • Correctly handles incomplete JSON chunks during streaming
  • Fixed edge cases in tool call argument parsing, including arguments that arrive fragmented across multiple chunks
  • Manages state transitions between reasoning and tool call content
  • Distinguishes between regular content and structured tool calls
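The token-based block extraction described in the review below can be pictured roughly as follows. This is an illustrative Python sketch under stated assumptions: the real implementation is Rust, the function name mirrors the PR's `extract_tool_call_blocks` but the signature is simplified, and ASCII stand-ins are used for DeepSeek's start/end tokens.

```python
def extract_tool_call_blocks(text, start_token, end_token):
    """Return the substrings between each start/end token pair.
    Mirrors (in spirit) the token-based extraction that replaced
    the earlier regex approach."""
    blocks = []
    pos = 0
    while (start := text.find(start_token, pos)) != -1:
        start += len(start_token)
        end = text.find(end_token, start)
        if end == -1:
            break  # incomplete block; wait for more stream data
        blocks.append(text[start:end])
        pos = end + len(end_token)
    return blocks

text = '<|tool_call_begin|>get_weather<|tool_sep|>{"city": "Paris"}<|tool_call_end|>'
print(extract_tool_call_blocks(text, "<|tool_call_begin|>", "<|tool_call_end|>"))
```

Each extracted block is then split on a separator token into a function name and its JSON arguments, which is where the per-block parsing and UUID assignment come in.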

2. Configuration Updates (lib/parsers/src/tool_calling/config.rs)

  • 28 lines modified for better integration with updated parser
  • Adjusted tool calling configuration to support improved parsing

3. Parallel Tool Call Integration Tests (lib/llm/tests/parallel_tool_call_integration.rs)

  • New 96-line integration test suite for parallel tool calls
  • Validates multi-tool call scenarios
  • Ensures parser correctly handles concurrent tool invocations
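In spirit, the multi-tool assertions check something like the following (a Python sketch; the actual test is Rust, and the dict shape assumes the OpenAI-style tool-call structure the parser emits):

```python
import json

def validate_parallel_tool_calls(tool_calls, expected_names):
    """Check that each parsed call has a unique id, type 'function',
    valid JSON arguments, and the expected function names in order."""
    assert [c["function"]["name"] for c in tool_calls] == expected_names
    ids = [c["id"] for c in tool_calls]
    assert len(set(ids)) == len(ids)  # generated IDs must be distinct
    for call in tool_calls:
        assert call["type"] == "function"
        json.loads(call["function"]["arguments"])  # must be valid JSON

calls = [
    {"id": "a1", "type": "function",
     "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
    {"id": "b2", "type": "function",
     "function": {"name": "get_time", "arguments": '{"tz": "UTC"}'}},
]
validate_parallel_tool_calls(calls, ["get_weather", "get_time"])
print("ok")
```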

4. Helper Tooling (lib/llm/tests/scripts/tool_call_stream_request_helper.py)

  • 167-line Python script for generating and capturing streaming test data
  • Enables easy creation of new test cases from live SGLang endpoints
  • Facilitates regression testing with real-world data
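The helper's core loop, per the sequence diagram later in this thread, streams a chat completion and counts chunks. A condensed sketch, with the chunk counting factored out so it can be exercised without a live endpoint; `run_streaming_request` and its parameters are illustrative, not the script's exact interface:

```python
def count_stream_chunks(stream):
    """Consume an iterable of chat.completion.chunk objects and
    return the total chunk count (what the helper summarizes)."""
    count = 0
    for _chunk in stream:
        count += 1
    return count

def run_streaming_request(base_url, model, messages, tools=None):
    """Issue a streaming chat request against an OpenAI-compatible
    endpoint (e.g. a local SGLang server) and return the chunk count."""
    from openai import OpenAI  # deferred: only needed for live requests
    client = OpenAI(base_url=base_url, api_key="not-needed")
    kwargs = {"model": model, "messages": messages, "stream": True}
    if tools:
        kwargs["tools"] = tools
    return count_stream_chunks(client.chat.completions.create(**kwargs))

print(count_stream_chunks(iter(["chunk"] * 5)))  # → 5
```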

Testing

The changes have been validated with:

  • Integration tests for parallel tool calls
  • Manual testing with streaming endpoints
  • Validation against real SGLang responses


Summary by CodeRabbit

  • New Features
    • Expanded tool-call parsing compatibility for DeepSeek v3.1, recognizing more start/end/separator tokens.
  • Bug Fixes
    • More robust extraction of normal content and parsing of multiline or imperfect JSON in tool calls.
  • Refactor
    • Reworked parser to token-based extraction for improved reliability across formats.
  • Tests
    • Added comprehensive streaming fixtures and end-to-end cases covering tool and no-tool scenarios across multiple models.
  • Chores
    • Introduced a lightweight CLI helper to issue streaming requests and summarize chunk counts.

@github-actions github-actions bot added the chore label Oct 10, 2025
@elyasmnvidian elyasmnvidian force-pushed the elyas/sglang_stream_test branch from ccb040b to cb06ca4 Compare October 13, 2025 21:20
@elyasmnvidian elyasmnvidian force-pushed the elyas/sglang_stream_test branch from cb06ca4 to 31c8143 Compare October 13, 2025 21:22
@elyasmnvidian elyasmnvidian force-pushed the elyas/sglang_stream_test branch from 31c8143 to 59cb41d Compare October 13, 2025 21:22
@elyasmnvidian elyasmnvidian force-pushed the elyas/sglang_stream_test branch from 59cb41d to 5b3a031 Compare October 14, 2025 17:45
@elyasmnvidian elyasmnvidian force-pushed the elyas/sglang_stream_test branch from 9c8fa6b to 4aa911c Compare October 15, 2025 17:40
@elyasmnvidian elyasmnvidian changed the title from "chore: add sglang e2e parsing test data" to "fix: deepseek tool parsing fixed && added some sglang e2e tests" Oct 15, 2025
@elyasmnvidian elyasmnvidian marked this pull request as ready for review October 15, 2025 17:43
@elyasmnvidian elyasmnvidian requested review from a team as code owners October 15, 2025 17:43
@github-actions github-actions bot added fix and removed chore labels Oct 15, 2025

coderabbitai bot commented Oct 15, 2025

Walkthrough

Adds multiple streaming chat-completion test fixtures (SGLang and vLLM), standardizes their JSON schema to use expected_output and input_stream, introduces a helper CLI script for streaming with optional tools, extends parser config with separator tokens, and refactors DeepSeek v3.1 tool-call parsing with new block extraction and single-call parsing. Updates tests accordingly.

Changes

Cohort / File(s) Summary
SGLang streaming fixtures (new)
lib/llm/tests/data/sglang/gpt-oss-20b/chat_completion_stream_19c97899-tool.json, lib/llm/tests/data/sglang/gpt-oss-20b/chat_completion_stream_675195a8-no-tool.json, lib/llm/tests/data/sglang/qwen3-0.6B/chat_completion_stream_c42ba578-tool.json
Add streaming test payloads (with/without tool calls) including expected_output and input_stream sequences for GPT-OSS and Qwen models.
vLLM fixtures: schema reorg and content tweaks
lib/llm/tests/data/vllm/gpt-oss-20b/chat_completion_stream_49f581c1-no-tool.json, lib/llm/tests/data/vllm/gpt-oss-20b/chat_completion_stream_f0c86d72-tool.json, lib/llm/tests/data/vllm/nemotron-49b/chat_completion_stream_3d40f925-tool.json, lib/llm/tests/data/vllm/qwen3-0.6B/chat_completion_stream_8f33c28b-tool.json, lib/llm/tests/data/vllm/qwen3-0.6B/chat_completion_stream_5627a4c6-no-tool.json
Migrate to expected_output container and input_stream root field; minor textual/emoji updates; no logic changes.
E2E parser tests
lib/llm/tests/test_parsers_e2e.rs
Adapt tests to new fixture schema; construct annotated stream chunks; add multiple async tests covering SGLang/Qwen with/without tools; aggregate and assert reasoning, content, and tool_calls.
Parallel tool-call parsing test
lib/llm/tests/parallel_tool_call_integration.rs
Add test for DeepSeek v3.1 tool-call parsing, validating IDs, types, JSON args, and remaining content.
Streaming helper script (new)
lib/llm/tests/scripts/tool_call_stream_request_helper.py
Add CLI utility to send streaming chat requests with optional tools; returns chunk count.
Parser config: separators and tokens
lib/parsers/src/tool_calling/config.rs
Add public field tool_call_separator_tokens to JsonParserConfig; expand DeepSeek v3.1 start/end/separator token sets; update defaults.
DeepSeek parser refactor
lib/parsers/src/tool_calling/json/deepseek_parser.rs
Replace regex approach with token-based block extraction and per-block parsing; add extract_tool_call_blocks (public); generate UUIDs for tool-call IDs; expand tests and normalize normal_text handling.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Tester
  participant TestRunner as Test runner
  participant Fixture as Test fixture (JSON)
  participant Parser as Stream parser
  participant ToolParser as DeepSeek v3.1 tool-call parser

  Tester->>TestRunner: Run E2E streaming tests
  TestRunner->>Fixture: Load expected_output + input_stream
  Note over Fixture: expected_output.{normal_content, reasoning_content, tool_calls}
  TestRunner->>Parser: parse_response_stream(input_stream, opts)
  loop For each chunk
    Parser-->>TestRunner: deltas (reasoning/content/tool signals)
  end
  alt Tool-call tokens present
    Parser->>ToolParser: parse_tool_calls_deepseek_v3_1(text)
    ToolParser->>ToolParser: extract_tool_call_blocks(start_tokens, end_tokens)
    ToolParser->>ToolParser: parse_single_tool_call(block, separator_tokens)
    ToolParser-->>Parser: ToolCallResponse(s) with UUIDs
  else No tool-call tokens
    Parser-->>TestRunner: Aggregated text only
  end
  TestRunner->>Fixture: Compare aggregated vs expected_output
  TestRunner-->>Tester: Assertions pass/fail
sequenceDiagram
  autonumber
  participant Client as CLI helper
  participant OpenAI as OpenAI client
  participant Server as LLM server

  Client->>OpenAI: chat.completions.create(stream=true, tools?)
  OpenAI->>Server: Request (model, messages, params)
  loop Stream
    Server-->>OpenAI: chat.completion.chunk
    OpenAI-->>Client: yield chunk
    Client->>Client: count += 1
  end
  Client-->>Client: print total chunks

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I thump the keys with whiskered grace,
New streams align, in tidy place.
Tokens hop, then tools appear—
A UUID for every gear.
Parsers sniff the start and end,
Carrots cached in tests we send.
Ship it? Nose wiggle: aye, my friend. 🥕🐇

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description does not follow the repository’s required template structure; it omits the mandatory “#### Overview:”, “#### Details:”, and “#### Where should the reviewer start?” headings and the Related Issues section does not use the “- closes GitHub issue: #xxx” format. Although the description includes substantive information under custom headings, it diverges from the prescribed layout and may confuse reviewers expecting the template. Please revise the description to use the repository’s template by adding the “#### Overview:”, “#### Details:”, and “#### Where should the reviewer start?” sections, and update Related Issues to “- closes GitHub issue: #3599” so that it aligns with the required structure.
✅ Passed checks (2 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title Check ✅ Passed The title indicates the pull request addresses a fix to the DeepSeek tool parsing logic, which aligns with a core code change. It omits the bulk of new SGLang end-to-end tests, parser configuration additions, and multiple data fixtures, so it only reflects part of the scope. While concise, it does not convey the comprehensive set of updates across tests and configurations.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (9)
lib/llm/tests/test_parsers_e2e.rs (3)

29-57: Loader enforces new format only; comment is misleading.

Code requires expected_output and input_stream; the “supports both new and legacy formats” comment is inaccurate. Either update the comment or add a legacy fallback.

Apply one of:

  • Update comment:
-    // Extract expected values (supports both new and legacy formats)
+    // Extract expected values (new format only: requires `expected_output` and `input_stream`)
  • Or add tolerant fallback (example sketch):
-    let expected = parsed_json
-        .get("expected_output")
-        .expect("No 'expected_output' object found in JSON");
+    let expected = parsed_json
+        .get("expected_output")
+        .or_else(|| Some(&parsed_json))
+        .expect("No 'expected_output' object found in JSON or legacy root fields missing");
...
-    let data_chunks = parsed_json
-        .get("input_stream")
-        .and_then(|v| v.as_array())
-        .expect("No 'input_stream' array found in JSON");
+    let data_chunks = parsed_json
+        .get("input_stream")
+        .or_else(|| parsed_json.get("data"))
+        .and_then(|v| v.as_array())
+        .expect("No 'input_stream' (new) or 'data' (legacy) array found in JSON");

165-184: Tool-call equality is brittle across streamed chunks.

Streamed tool_calls may repeat across chunks. Consider deduping by (function.name, arguments) before comparing lengths.

Example:

-    assert_eq!(actual_tool_calls.len(), expected_tool_calls.len());
+    let mut unique_actual = actual_tool_calls.to_vec();
+    unique_actual.dedup_by(|a, b| a["function"]["name"] == b["function"]["name"]
+        && a["function"]["arguments"] == b["function"]["arguments"]);
+    assert_eq!(unique_actual.len(), expected_tool_calls.len());

353-360: Test flag/comment mismatch.

Comment says “parsing disabled” but you pass true,true (tool_parse_enable, reasoning_enable). Fix the comment or set both flags to false.

-        // Parse the response stream with reasoning and tool parsing disabled
+        // Parse the response stream with reasoning and tool parsing enabled

Or:

-            true,
-            true,
+            false,
+            false,
lib/llm/tests/scripts/tool_call_stream_request_helper.py (3)

17-20: Remove unused import and let isort/black settle imports.

pathlib.Path is unused; ruff will flag it.

-import argparse
-from openai import OpenAI
-from pathlib import Path
+import argparse
+from openai import OpenAI

64-69: Rename unused loop variable to appease ruff (B007).

-    for chunk in stream:
+    for _chunk in stream:
         chunk_count += 1

162-164: Avoid f-string without placeholders (F541).

-    print(f"✓ Request completed successfully")
+    print("✓ Request completed successfully")
lib/parsers/src/tool_calling/json/deepseek_parser.rs (3)

63-66: Early return can skip mixed-token matches.

Returning after first token-pair match can skip valid blocks if models mix token variants across chunks. Aggregate across pairs and dedup.

-                // If we found matches with this token pair, don't try other combinations
-                if !blocks.is_empty() {
-                    return blocks;
-                }
+                // Continue scanning other token pairs; we'll aggregate and dedup below

Optionally dedup at end:

blocks.sort();
blocks.dedup();

150-166: Pick earliest start index for normal_text.

Current find_map returns the first configured token found, not the earliest occurrence. Compute min index across tokens to avoid off-by-selection when multiple variants exist.

-    let normal_text = if !wrapper_tokens.is_empty() {
-        wrapper_tokens
-            .iter()
-            .find_map(|token| {
-                trimmed
-                    .find(token.as_str())
-                    .map(|idx| trimmed[..idx].to_string())
-            })
-            .unwrap_or_else(String::new)
+    let normal_text = if !wrapper_tokens.is_empty() {
+        let idx = wrapper_tokens
+            .iter()
+            .filter_map(|t| trimmed.find(t.as_str()))
+            .min();
+        idx.map(|i| trimmed[..i].to_string()).unwrap_or_default()
     } else {
         // Fallback to first individual call token if no wrapper found
-        tool_call_start_tokens
-            .iter()
-            .filter(|token| !token.is_empty())
-            .find_map(|token| trimmed.find(token).map(|idx| trimmed[..idx].to_string()))
-            .unwrap_or_else(String::new)
+        let idx = tool_call_start_tokens
+            .iter()
+            .filter(|token| !token.is_empty())
+            .filter_map(|t| trimmed.find(t))
+            .min();
+        idx.map(|i| trimmed[..i].to_string()).unwrap_or_default()
     };

215-233: Simplify partial-token detection and reduce allocations.

Current loop builds Strings per char. Use starts_with/ends_with checks.

-    config.tool_call_start_tokens.iter().any(|token| {
-        if token.is_empty() {
-            return false;
-        }
-        // Check if the chunk could be a prefix of this start token
-        // Handle Unicode character boundaries properly
-        for i in 1..=token.chars().count() {
-            if let Some(prefix) = token.chars().take(i).collect::<String>().get(..) {
-                let prefix_str = &prefix[..prefix.len()];
-                if trimmed == prefix_str || trimmed.ends_with(prefix_str) {
-                    return true;
-                }
-            }
-        }
-        false
-    })
+    config.tool_call_start_tokens.iter().any(|token| {
+        !token.is_empty() && (token.starts_with(trimmed) || trimmed.ends_with(token))
+    })
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5854e1 and 4aa911c.

📒 Files selected for processing (13)
  • lib/llm/tests/data/sglang/gpt-oss-20b/chat_completion_stream_19c97899-tool.json (1 hunks)
  • lib/llm/tests/data/sglang/gpt-oss-20b/chat_completion_stream_675195a8-no-tool.json (1 hunks)
  • lib/llm/tests/data/sglang/qwen3-0.6B/chat_completion_stream_c42ba578-tool.json (1 hunks)
  • lib/llm/tests/data/vllm/gpt-oss-20b/chat_completion_stream_49f581c1-no-tool.json (11 hunks)
  • lib/llm/tests/data/vllm/gpt-oss-20b/chat_completion_stream_f0c86d72-tool.json (1 hunks)
  • lib/llm/tests/data/vllm/nemotron-49b/chat_completion_stream_3d40f925-tool.json (1 hunks)
  • lib/llm/tests/data/vllm/qwen3-0.6B/chat_completion_stream_5627a4c6-no-tool.json (2 hunks)
  • lib/llm/tests/data/vllm/qwen3-0.6B/chat_completion_stream_8f33c28b-tool.json (1 hunks)
  • lib/llm/tests/parallel_tool_call_integration.rs (1 hunks)
  • lib/llm/tests/scripts/tool_call_stream_request_helper.py (1 hunks)
  • lib/llm/tests/test_parsers_e2e.rs (2 hunks)
  • lib/parsers/src/tool_calling/config.rs (3 hunks)
  • lib/parsers/src/tool_calling/json/deepseek_parser.rs (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
lib/llm/tests/parallel_tool_call_integration.rs (2)
lib/parsers/src/tool_calling/parsers.rs (1)
  • detect_and_parse_tool_call (80-104)
lib/parsers/src/tool_calling/json/deepseek_parser.rs (2)
  • serde_json (95-95)
  • serde_json (107-107)
lib/llm/tests/scripts/tool_call_stream_request_helper.py (1)
lib/async-openai/src/client.rs (1)
  • chat (96-98)
lib/parsers/src/tool_calling/json/deepseek_parser.rs (1)
lib/parsers/src/tool_calling/config.rs (1)
  • deepseek_v3_1 (155-189)
🪛 GitHub Actions: Copyright Checks
lib/llm/tests/scripts/tool_call_stream_request_helper.py

[error] 1-1: Copyright header check failed. Missing/invalid header detected in this file.

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/3557/merge) by elyasmnvidian.
lib/llm/tests/scripts/tool_call_stream_request_helper.py

[error] 1-1: isort reformatted the file during pre-commit (files were modified by this hook).


[error] 1-1: Black formatting reformatted the file during pre-commit (reformatted lib/llm/tests/scripts/tool_call_stream_request_helper.py).


[error] 1-1: Ruff linting found issues and fixed them during pre-commit (2 errors fixed).

🪛 Ruff (0.14.0)
lib/llm/tests/scripts/tool_call_stream_request_helper.py

65-65: Loop control variable chunk not used within loop body

Rename unused chunk to _chunk

(B007)


162-162: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: sglang
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: tests (.)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (12)
lib/llm/tests/data/vllm/nemotron-49b/chat_completion_stream_3d40f925-tool.json (1)

3-18: LGTM! Test data restructuring is consistent.

The restructuring wraps response fields into expected_output and introduces input_stream, aligning with the broader test fixture standardization across the PR.

lib/llm/tests/data/vllm/qwen3-0.6B/chat_completion_stream_8f33c28b-tool.json (1)

3-18: LGTM! Consistent test data restructuring.

The changes align with the standardized test fixture format across the PR.

lib/llm/tests/data/sglang/qwen3-0.6B/chat_completion_stream_c42ba578-tool.json (1)

1-132: LGTM! Well-structured SGLang test fixture.

The new test fixture follows the standardized format with expected_output and input_stream, providing comprehensive test coverage for SGLang streaming with tool calls.

lib/llm/tests/data/vllm/qwen3-0.6B/chat_completion_stream_5627a4c6-no-tool.json (1)

4-4: LGTM! Improved emoji representation.

The change from Unicode escape sequence to direct emoji character is valid and improves readability.

Also applies to: 167-167

lib/llm/tests/data/vllm/gpt-oss-20b/chat_completion_stream_f0c86d72-tool.json (1)

3-18: LGTM! Consistent test fixture restructuring.

The standardized expected_output and input_stream structure aligns with the broader test fixture updates.

lib/llm/tests/parallel_tool_call_integration.rs (1)

385-479: LGTM! Comprehensive DeepSeek v3.1 parsing test.

The new test provides thorough coverage of DeepSeek v3.1 tool-call parsing, including:

  • Parser invocation and error handling
  • Content extraction (remaining vs. tool calls)
  • Tool call structure validation
  • Detailed argument validation with nested array checks

Error handling with .expect() is appropriate for tests, and the assertions are specific and meaningful.

lib/llm/tests/data/sglang/gpt-oss-20b/chat_completion_stream_19c97899-tool.json (1)

1-83: LGTM! Complete SGLang streaming fixture.

The new test fixture is well-structured and follows the standardized format with expected_output and input_stream, supporting end-to-end testing of SGLang streaming with tool invocation.

lib/parsers/src/tool_calling/config.rs (2)

26-29: LGTM! Well-designed separator token support.

The new tool_call_separator_tokens field is:

  • Properly documented with clear examples
  • Backward compatible (defaults to empty vector)
  • Appropriately scoped for models that need it (like DeepSeek v3.1)

Also applies to: 49-49


159-184: LGTM! Comprehensive DeepSeek v3.1 token variants.

The expanded token variants support different tokenizer representations:

  • Fullwidth Unicode bar brackets: <｜...｜>
  • ASCII bar brackets: <|...|>
  • Underscore vs. Unicode spaces: tool_calls_begin vs. tool▁calls▁begin

This ensures robust parsing across tokenizer variations.

lib/llm/tests/data/vllm/gpt-oss-20b/chat_completion_stream_49f581c1-no-tool.json (1)

3-7: Fixture shape LGTM (expected_output + input_stream).

Structure matches loader (expected_output.normal_content/reasoning_content/tool_calls + input_stream). Unicode punctuation looks consistent; repeated id is correct. No issues from me.

Also applies to: 8-256

lib/llm/tests/data/sglang/gpt-oss-20b/chat_completion_stream_675195a8-no-tool.json (1)

1-16: SGLang fixture looks good.

expected_output + input_stream align with tests. Extra event/comment fields are harmless. No tool_calls present as expected.

Also applies to: 8-135

lib/parsers/src/tool_calling/json/deepseek_parser.rs (1)

19-24: Helper visibility differs from summary.

extract_tool_call_blocks is private (fn …), but the PR summary claims it’s public. Confirm intended visibility and adjust if needed.

@elyasmnvidian elyasmnvidian force-pushed the elyas/sglang_stream_test branch from 4aa911c to a268cfd Compare October 15, 2025 18:27
@elyasmnvidian elyasmnvidian changed the title from "fix: deepseek tool parsing fixed && added some sglang e2e tests" to "fix: deepseek tool parsing fixed" Oct 15, 2025
This branch contains parser implementation changes:
- Updated deepseek_parser.rs with bug fixes
- Modified tool_calling/config.rs
- Added parallel_tool_call_integration.rs test
- Added tool_call_stream_request_helper.py script

@ayushag-nv ayushag-nv left a comment


Looks Good

@elyasmnvidian elyasmnvidian enabled auto-merge (squash) October 16, 2025 21:24
@elyasmnvidian elyasmnvidian merged commit e41fa1a into main Oct 16, 2025
30 of 31 checks passed
@elyasmnvidian elyasmnvidian deleted the elyas/sglang_stream_test branch October 16, 2025 22:11
athreesh pushed a commit that referenced this pull request Oct 16, 2025
athreesh pushed a commit that referenced this pull request Oct 16, 2025
ziqifan617 pushed a commit that referenced this pull request Oct 20, 2025


Development

Successfully merging this pull request may close these issues.

[BUG]: tool_call_end is not handled for DeepSeek V3.1 parser

2 participants