Skip to content

feat: add Gemma 4 tool call parser#254

Closed
keegoid wants to merge 2 commits intowaybarrios:mainfrom
keegoid:feature/gemma4-tool-parser
Closed

feat: add Gemma 4 tool call parser#254
keegoid wants to merge 2 commits intowaybarrios:mainfrom
keegoid:feature/gemma4-tool-parser

Conversation

@keegoid
Copy link
Copy Markdown

@keegoid keegoid commented Apr 4, 2026

Summary

  • Adds a new gemma4_tool_parser.py that handles Gemma 4's unique tool calling format (<|tool_call>call:name{key:<|"|>value<|"|>}<tool_call|>)
  • Implements a state-machine converter for Gemma's custom argument syntax (unquoted keys, <|"|> string delimiters) to valid JSON
  • Integrates Gemma 4 detection into the auto parser (both streaming and non-streaming paths)
  • Adds gemma4 to CLI --tool-call-parser choices
  • 21 tests covering arg conversion, nested objects, arrays, hyphenated tool names, streaming, think tag stripping, and auto-detection

Background

Gemma 4 models use a non-standard tool call format defined in their chat template. Unlike other model families that use JSON or XML, Gemma 4 uses special tokens (<|tool_call>, <tool_call|>, <|"|>) for structure and string quoting. This parser converts that format to OpenAI-compatible tool_calls responses.

Tested end-to-end with mlx-community/gemma-4-31b-it-8bit on Apple Silicon (M5 Max 128GB):

  • Single and multi-tool calls
  • Multi-turn with tool responses
  • Vision + tool calling combined
  • Streaming

Usage

vllm-mlx serve mlx-community/gemma-4-31b-it-8bit --port 8000 --mllm   --enable-auto-tool-choice --tool-call-parser gemma4

Test plan

  • All 21 new Gemma 4 parser tests pass
  • All 78 existing tool parser tests pass (99 total, 0 failures)
  • End-to-end tool calling verified against live Gemma 4 31B model
  • Vision + tool calling combo verified
  • Streaming tool call buffering verified
  • Gemma 4 E4B model (not yet tested, same format expected)

🤖 Generated with Claude Code

Gemma 4 uses a unique tool calling format with special tokens for
delimiters instead of JSON quotes. This adds a state-machine parser
that converts the custom argument format to valid JSON.

- New gemma4_tool_parser.py with streaming support
- Auto-detection in auto_tool_parser.py (both non-streaming and streaming)
- 21 tests covering arg conversion, nested objects, arrays, hyphenated
  names, streaming, think tags, and auto-detection
- CLI --tool-call-parser gemma4 option

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
janhilgard added a commit to janhilgard/vllm-mlx that referenced this pull request Apr 5, 2026
State-machine parser for Gemma 4's custom tool call format:
<|tool_call>call:name{key:<|"|>value<|"|>}<tool_call|>

Includes dedicated parser, auto-detection in AutoToolParser,
CLI --tool-call-parser gemma4 option, and 21 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gajesh2007 added a commit to Gajesh2007/vllm-mlx that referenced this pull request Apr 7, 2026
Port from upstream PRs waybarrios#250 and waybarrios#254:
- gemma4_parser.py: extracts <|channel>thought...<channel|> into reasoning_content
- gemma4_tool_parser.py: parses <|tool_call>call:name{...}<tool_call|> format
- Auto-detection in AutoToolParser for Gemma 4 markers
- Tests for tool parser
keegoid added a commit to keegoid/vllm-mlx that referenced this pull request Apr 10, 2026
waybarrios added a commit to jackneil/vllm-mlx that referenced this pull request Apr 10, 2026
integrates Gemma 4 format as the first format tried in auto-detection,
adds streaming markers for tool call start/end. based on keegoid's
approach in waybarrios#254.
@waybarrios
Copy link
Copy Markdown
Owner

The auto-detection logic you built for AutoToolParser was the right call. I pulled that approach into #269 so Gemma 4 tool calls work out of the box with --enable-auto-tool-choice. The standalone parser is also covered there. Thanks for flagging the RotatingKVCache issue on #256 too, that was a key catch. Closing in favor of #269.

@waybarrios waybarrios closed this Apr 10, 2026
Thump604 pushed a commit that referenced this pull request Apr 10, 2026
* test: add Gemma 4 tool parser tests (red)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Gemma 4 tool call parser

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: register Gemma 4 parser, add streaming tests and wiring

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add edge case tests for DC review findings

- Unclosed tool call block (server fallback path)
- String containing colon (step-ordering guard)
- String with real newline and double quote (JSON escaping)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: verify Gemma 4 tool calls produce exact OpenAI format for Claude Code

Integration tests that verify the full pipeline (parser → server models →
JSON serialization) matches what Claude Code expects: tool_calls structure,
null content, function.arguments as JSON string, correct finish_reason.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* add Gemma 4 auto-detection to AutoToolParser

integrates Gemma 4 format as the first format tried in auto-detection,
adds streaming markers for tool call start/end. based on keegoid's
approach in #254.

* remove unused pytest imports

* run black on tool parser, tests, and server

---------

Co-authored-by: Jack Neil <jackneil@Jacks-Mac-Studio.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Wayner Barrios <waybarrios@gmail.com>
@keegoid keegoid deleted the feature/gemma4-tool-parser branch April 11, 2026 04:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants