Fix some Mistral parser issues #37209
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Code Review
This pull request introduces several fixes to the Mistral parser to improve its robustness. The changes include moving imports for better compatibility, adding type hints for clarity, and making the tool parsing logic more resilient. Specifically, assertions are replaced with conditional checks to prevent crashes, and checks for tool call tokens are made more robust by considering both token IDs and their text representation.
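The two robustness changes described above can be illustrated with a minimal sketch. This is not the code from the PR: the function names, the `TOOL_CALLS_TOKEN_ID` value, and the return shapes are hypothetical and chosen only for illustration. Only the `[TOOL_CALLS]` marker text comes from the review.

```python
from typing import Optional, Sequence, Tuple

TOOL_CALLS_TEXT = "[TOOL_CALLS]"  # marker text named in the review
TOOL_CALLS_TOKEN_ID = 9  # hypothetical token ID, for illustration only


def has_tool_call(token_ids: Sequence[int], text: str,
                  tool_token_id: int = TOOL_CALLS_TOKEN_ID) -> bool:
    """Detect a tool call from either the token ID or the decoded text."""
    return tool_token_id in token_ids or TOOL_CALLS_TEXT in text


def split_content_and_tool_calls(text: str) -> Tuple[str, Optional[str]]:
    """Split at the first marker.

    Returns (content, None) when no marker is present, rather than
    asserting and crashing on unexpected model output.
    """
    if TOOL_CALLS_TEXT not in text:
        return text, None
    content, _, raw_calls = text.partition(TOOL_CALLS_TEXT)
    return content, raw_calls
```

Checking both representations matters when the decoded text may omit the special token or when the token IDs are not available to the parser; either signal alone is then enough to route the output to tool parsing.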
My review focuses on a potential remaining issue in the streaming tool call parsing logic. While the changes are improvements, the parser might still fail if the special [TOOL_CALLS] token is split across multiple streaming chunks, which could lead to incorrect parsing. I've provided a detailed comment on this.
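One way to guard against a marker split across chunks is to hold back any trailing text that could still be the start of `[TOOL_CALLS]`. The sketch below is a generic buffering technique, not the parser's actual implementation; the `MarkerBuffer` class and its attributes are hypothetical names.

```python
class MarkerBuffer:
    """Buffer streamed text so a marker split across chunks is still detected."""

    def __init__(self, marker: str = "[TOOL_CALLS]") -> None:
        self.marker = marker
        self.marker_seen = False
        self.tool_text = ""  # text received after the marker
        self._pending = ""   # held-back suffix that may be a partial marker

    def feed(self, delta: str) -> str:
        """Consume one chunk and return the text that is safe to emit as content."""
        if self.marker_seen:
            self.tool_text += delta
            return ""
        buf = self._pending + delta
        idx = buf.find(self.marker)
        if idx != -1:
            self.marker_seen = True
            self._pending = ""
            self.tool_text = buf[idx + len(self.marker):]
            return buf[:idx]
        # Hold back the longest suffix of buf that is a proper prefix of the marker.
        keep = 0
        for n in range(min(len(buf), len(self.marker) - 1), 0, -1):
            if buf.endswith(self.marker[:n]):
                keep = n
                break
        self._pending = buf[len(buf) - keep:]
        return buf[:len(buf) - keep]

    def flush(self) -> str:
        """At end of stream, release held-back text that never became a marker."""
        pending, self._pending = self._pending, ""
        return pending


buf = MarkerBuffer()
chunks = ["Hello ", "[TOOL", "_CALLS]", '[{"name": "f"}]']
emitted = [buf.feed(chunk) for chunk in chunks]
```

The trade-off is a small delay: text that looks like the start of the marker is emitted one chunk late, or at `flush()`, if it turns out to be ordinary content.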
Audited recent tool parser bug-fix PRs and found that several landed without corresponding test coverage. Added unit tests for each fix to prevent regressions.

- Mistral: fast detokenization text detection (PR vllm-project#37209)
- Qwen3Coder: malformed XML crash, anyOf double-encoding, speculative decode streaming (PRs vllm-project#36774, vllm-project#36032, vllm-project#35615)
- DeepSeekV32: delimiter preservation with fast detokenization, skip_special_tokens adjustment (PR vllm-project#33964)
- GLM-4 MoE: zero-argument tool calls, transformers 5.x delimiter handling, Unicode character preservation (PRs vllm-project#32321, vllm-project#31622, vllm-project#30920)
- MiniMax M2: anyOf nullable parameter handling for non-null and null values (PR vllm-project#32342)
- Step3p5: MTP-style variable-chunk and multi-token streaming (PR vllm-project#33690)
- Kimi K2: native tool call ID extraction and multi-turn ID continuity (PR vllm-project#32768)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Purpose
This PR fixes some parser issues before refactoring how Mistral handles requests, inspired by #37081.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.