Skip to content

feat: add Gemma 4 tool call and reasoning parsers#274

Closed
mikepixelmagic-dev wants to merge 1 commit intowaybarrios:mainfrom
mikepixelmagic-dev:feature/gemma4-parsers-upstream
Closed

feat: add Gemma 4 tool call and reasoning parsers#274
mikepixelmagic-dev wants to merge 1 commit intowaybarrios:mainfrom
mikepixelmagic-dev:feature/gemma4-parsers-upstream

Conversation

@mikepixelmagic-dev
Copy link
Copy Markdown
Contributor

Summary

Adds native Gemma 4 support for both tool calling and reasoning extraction.

  • Tool call parser (--tool-call-parser gemma4): Parses Gemma 4's <|tool_call>call:func{key:<|"|>value<|"|>}<tool_call|> format. Supports hyphenated/dotted function names (e.g. searxng-search).
  • Reasoning parser (--reasoning-parser gemma4): Extracts <|channel>thought...<channel|> tags into the reasoning_content API field. Stateful streaming parser handles tags that span multiple MLX tokens.

Usage

vllm-mlx serve mlx-community/gemma-4-26b-a4b-it-4bit \
  --enable-auto-tool-choice \
  --tool-call-parser gemma4 \
  --reasoning-parser gemma4

Files changed

File What
vllm_mlx/tool_parsers/gemma4_tool_parser.py Tool call parser (new)
vllm_mlx/tool_parsers/__init__.py Register import
vllm_mlx/reasoning/gemma4_parser.py Reasoning parser (new)
vllm_mlx/reasoning/__init__.py Register in builtin parsers

Test plan

  • Non-streaming tool calling with standard function names
  • Non-streaming tool calling with hyphenated names (e.g. searxng-search)
  • Streaming tool call extraction
  • Non-streaming reasoning extraction (reasoning_content populated, content clean)
  • Streaming reasoning extraction (no <|channel> tag leakage)
  • Tested on Apple M5 Max with mlx-community/gemma-4-26b-a4b-it-4bit

Fixes #264

🤖 Generated with Claude Code

Adds native support for Gemma 4's unique token format:

Tool calling:
- Parses <|tool_call>call:func_name{key:<|"|>value<|"|>}<tool_call|>
- Supports hyphenated/dotted function names (e.g. searxng-search)
- Streaming support with multi-token tag handling

Reasoning:
- Extracts <|channel>thought...<channel|> into reasoning_content field
- Stateful streaming parser handles tags that span multiple MLX tokens
- Works with both explicit and implicit thinking modes

Usage:
  vllm-mlx serve mlx-community/gemma-4-26b-a4b-it-4bit \
    --enable-auto-tool-choice \
    --tool-call-parser gemma4 \
    --reasoning-parser gemma4

Tested on Apple M5 Max with Gemma 4 26B-A4B (4-bit MLX).

Fixes waybarrios#264

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@waybarrios
Copy link
Copy Markdown
Owner

The reasoning parser landed via #268 and the tool call parser is covered by #269. Closing in favor of those two. Thanks for putting this together.

@waybarrios waybarrios closed this Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Gemma 4 tool-call-parser and reasoning-parser support

2 participants