feat: Add MiniMax-M2 tool call parser #98

Closed
janhilgard wants to merge 1 commit into waybarrios:main from janhilgard:feat/minimax-tool-parser

@janhilgard
Collaborator

@janhilgard janhilgard commented Feb 18, 2026

Summary

  • Add MiniMaxToolParser for MiniMax-M2 models' native XML tool call format
  • Integrate tool parser with reasoning parser path in streaming mode
  • Strip MiniMax special tokens from output

Details

MiniMax-M2 models use a custom XML-based tool call format distinct from other parsers. This PR adds:

  1. minimax_tool_parser.py — Full parser with regex-based extraction of <minimax:tool_call> blocks, <invoke> elements, and <parameter> elements. Supports both non-streaming (extract_tool_calls) and streaming (extract_tool_calls_streaming) modes. Parameter values are auto-parsed as JSON when possible for proper typing.

  2. CLI integration: minimax added to --tool-call-parser choices.

  3. Special token stripping — MiniMax end-of-turn token [e~[, role markers ]~b]assistant etc., and BOS token ]~!b[ are added to the global SPECIAL_TOKENS_PATTERN regex.

  4. Streaming: reasoning + tool parser integration — MiniMax wraps tool calls inside <think> blocks, so when --reasoning-parser is enabled, the reasoning parser captures tool call XML as reasoning content. This PR adds detection of tool call markers in the reasoning stream and redirects them to the content stream for proper tool parser processing. This fixes streaming tool calls when both --reasoning-parser and --tool-call-parser minimax are used together.
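
A minimal sketch of the regex-based extraction described in item 1, not the code from minimax_tool_parser.py. It assumes that <invoke> and <parameter> carry a name="..." attribute, which the description above does not spell out. The function extract_tool_calls_sketch, the helper coerce, and the sample tool get_weather are hypothetical.

```python
import json
import re

# Assumed tag layout: the name="..." attribute syntax is an assumption.
TOOL_CALL_BLOCK = re.compile(r"<minimax:tool_call>(.*?)</minimax:tool_call>", re.DOTALL)
INVOKE = re.compile(r'<invoke name="([^"]+)">(.*?)</invoke>', re.DOTALL)
PARAMETER = re.compile(r'<parameter name="([^"]+)">(.*?)</parameter>', re.DOTALL)


def coerce(raw: str):
    """Return the JSON-decoded value when raw is valid JSON, else the stripped string."""
    text = raw.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def extract_tool_calls_sketch(output: str) -> list[dict]:
    """Collect every <invoke> inside every <minimax:tool_call> block."""
    calls = []
    for block in TOOL_CALL_BLOCK.findall(output):
        for name, body in INVOKE.findall(block):
            args = {key: coerce(value) for key, value in PARAMETER.findall(body)}
            calls.append({"name": name, "arguments": args})
    return calls


# Hypothetical model output used only for illustration.
sample = (
    "<minimax:tool_call>\n"
    '<invoke name="get_weather">\n'
    '<parameter name="city">Paris</parameter>\n'
    '<parameter name="days">3</parameter>\n'
    '<parameter name="units">{"temp": "C"}</parameter>\n'
    "</invoke>\n"
    "</minimax:tool_call>"
)
print(extract_tool_calls_sketch(sample))
```

Trying json.loads first and falling back to the raw string is one way to get the "proper typing" mentioned above: numbers, booleans, objects, and arrays come back typed, while plain words such as Paris stay strings.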

Changes

| File | Change |
|---|---|
| vllm_mlx/tool_parsers/minimax_tool_parser.py | New parser for MiniMax XML format |
| vllm_mlx/tool_parsers/__init__.py | Import and register MiniMaxToolParser |
| vllm_mlx/cli.py | Add "minimax" to --tool-call-parser choices |
| vllm_mlx/api/utils.py | Add MiniMax special tokens to SPECIAL_TOKENS_PATTERN |
| vllm_mlx/server.py | Integrate tool parser within reasoning parser streaming path |
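
The server.py change listed above could look roughly like the following sketch. It is a hypothetical helper (split_reasoning and TOOL_MARKERS are not names from this PR) that works on accumulated reasoning text: everything from the first tool call marker onward is handed to the content stream, where the tool parser can see it. A real streaming path would also have to cope with a marker split across two deltas, which this sketch ignores.

```python
# Markers that signal the start of a tool call; "<invoke" covers the bare form
# emitted without the <minimax:tool_call> wrapper.
TOOL_MARKERS = ("<minimax:tool_call>", "<invoke")


def split_reasoning(reasoning_text: str) -> tuple[str, str]:
    """Split accumulated reasoning into (reasoning, redirected_content)."""
    positions = [
        pos
        for pos in (reasoning_text.find(marker) for marker in TOOL_MARKERS)
        if pos != -1
    ]
    if not positions:
        # No tool call marker seen: everything stays reasoning.
        return reasoning_text, ""
    cut = min(positions)
    return reasoning_text[:cut], reasoning_text[cut:]


# Hypothetical accumulated reasoning text used only for illustration.
reasoning, content = split_reasoning(
    'I should look this up.<minimax:tool_call><invoke name="search">'
)
print(repr(reasoning))
print(repr(content))
```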

Test plan

  • Tested with MiniMax-M2-5-REAP-39 (8-bit, 138GB MoE) on Apple M3 Ultra
  • Non-streaming tool calls parse correctly
  • Streaming tool calls: no XML leaking, proper tool_calls delta emission
  • Streaming with --reasoning-parser deepseek_r1: tool calls detected in reasoning and redirected
  • Streaming content (no tool call): reasoning/content split works correctly
  • Special tokens stripped from responses
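
For the last item, a rough sketch of what stripping these tokens could look like. MINIMAX_TOKENS and strip_minimax_tokens are hypothetical names, not the actual SPECIAL_TOKENS_PATTERN from vllm_mlx/api/utils.py, and the list of role names after ]~b] is an assumption (the description only gives "assistant etc.").

```python
import re

# Hypothetical stand-in for the MiniMax part of the special-token regex.
MINIMAX_TOKENS = re.compile(
    r"\[e~\["          # end-of-turn token
    r"|\]~!b\["        # BOS token
    r"|\]~b\](?:system|user|assistant|ai|tool)?"  # role marker; role names assumed
)


def strip_minimax_tokens(text: str) -> str:
    """Remove MiniMax special tokens, leaving the surrounding text untouched."""
    return MINIMAX_TOKENS.sub("", text)


# Hypothetical raw model output used only for illustration.
print(repr(strip_minimax_tokens("]~!b[]~b]assistant\nHello[e~[")))
```

Because [, ], and ~ are literal characters in these tokens, the brackets must be escaped in the pattern; an unescaped [e~[ would be read as the start of a character class.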

Usage

vllm-mlx serve <minimax-model> \
    --enable-auto-tool-choice \
    --tool-call-parser minimax \
    --reasoning-parser deepseek_r1

🤖 Generated with Claude Code

@janhilgard janhilgard force-pushed the feat/minimax-tool-parser branch from 4e7af16 to 4ab0d64 Compare February 18, 2026 22:56
Add full tool call parsing for MiniMax-M2 models' native XML format,
including streaming integration with reasoning parsers.

Changes:
- minimax_tool_parser.py: Parser for <minimax:tool_call>/<invoke> XML
  format with streaming support; handles bare <invoke> without wrapper
  (model sometimes emits inside <think> blocks); filters hallucinated
  <invoke> tags without parameters
- cli.py: Add "minimax" to --tool-call-parser choices
- tool_parsers/__init__.py: Register MiniMaxToolParser
- api/utils.py: Strip MiniMax special tokens ([e~[, ]~b]role, ]~!b[)
- server.py: Integrate tool parser within reasoning parser streaming
  path — detect tool call markers in reasoning stream and redirect to
  content for parsing; suppress whitespace-only content before tool
  calls to avoid confusing clients

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@janhilgard janhilgard force-pushed the feat/minimax-tool-parser branch from 4ab0d64 to 9f64a24 Compare February 19, 2026 18:43
@raullenchai

Thanks for this PR. It works really well on an M3 Ultra with "MiniMax-M2.5-MLX-4bit".

python -m vllm_mlx.cli serve "~\MiniMax-M2.5-MLX-4bit" \
  --host 0.0.0.0 \
  --port 8000 \
  --enable-prefix-cache \
  --max-num-seqs 2 \
  --tool-call-parser minimax \
  --enable-auto-tool-choice

@janhilgard janhilgard mentioned this pull request Mar 21, 2026
@Thump604 Thump604 mentioned this pull request Apr 8, 2026
@Thump604
Collaborator

Thump604 commented Apr 8, 2026

@waybarrios, @janhilgard: brief endorsement plus cross-references.

This addresses issue #35 (jverkoey, "Minimax M2.1 support?") which is still open. The PR adds a full MiniMaxToolParser for the <minimax:tool_call> XML format with both non-streaming and streaming modes, parameter coercion for nested object/array values, and special-token stripping.

Coordination note: there is also PR #231 (sjswerdloff, "feat: add MiniMax tool call parsing support") which appears to target the same general area. The two are not currently cross-linked. Wayne can pick whichever is more complete, or merge both if they cover different model variants.

The PR is MERGEABLE on current main. The last activity was Feb 24 (~6 weeks ago), which is very stale for a feature with an open issue requesting it.

@janhilgard
Collaborator Author

@Thump604 Superseded — MiniMax-M2 tool parser is already in main via #278. Closing.

@janhilgard janhilgard closed this Apr 11, 2026