feat: add Gemma 4 tool call parser#254
Closed
keegoid wants to merge 2 commits intowaybarrios:mainfrom
Closed
Conversation
Gemma 4 uses a unique tool calling format with special tokens for delimiters instead of JSON quotes. This adds a state-machine parser that converts the custom argument format to valid JSON. - New gemma4_tool_parser.py with streaming support - Auto-detection in auto_tool_parser.py (both non-streaming and streaming) - 21 tests covering arg conversion, nested objects, arrays, hyphenated names, streaming, think tags, and auto-detection - CLI --tool-call-parser gemma4 option Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
11 tasks
janhilgard
added a commit
to janhilgard/vllm-mlx
that referenced
this pull request
Apr 5, 2026
State-machine parser for Gemma 4's custom tool call format:
<|tool_call>call:name{key:<|"|>value<|"|>}<tool_call|>
Includes dedicated parser, auto-detection in AutoToolParser,
CLI --tool-call-parser gemma4 option, and 21 tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gajesh2007
added a commit
to Gajesh2007/vllm-mlx
that referenced
this pull request
Apr 7, 2026
Port from upstream PRs waybarrios#250 and waybarrios#254: - gemma4_parser.py: extracts <|channel>thought...<channel|> into reasoning_content - gemma4_tool_parser.py: parses <|tool_call>call:name{...}<tool_call|> format - Auto-detection in AutoToolParser for Gemma 4 markers - Tests for tool parser
keegoid
added a commit
to keegoid/vllm-mlx
that referenced
this pull request
Apr 10, 2026
waybarrios
added a commit
to jackneil/vllm-mlx
that referenced
this pull request
Apr 10, 2026
integrates Gemma 4 format as the first format tried in auto-detection, adds streaming markers for tool call start/end. based on keegoid's approach in waybarrios#254.
3 tasks
Owner
|
The auto-detection logic you built for AutoToolParser was the right call. I pulled that approach into #269 so Gemma 4 tool calls work out of the box with --enable-auto-tool-choice. The standalone parser is also covered there. Thanks for flagging the RotatingKVCache issue on #256 too, that was a key catch. Closing in favor of #269. |
Thump604
pushed a commit
that referenced
this pull request
Apr 10, 2026
* test: add Gemma 4 tool parser tests (red) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Gemma 4 tool call parser Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: register Gemma 4 parser, add streaming tests and wiring Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add edge case tests for DC review findings - Unclosed tool call block (server fallback path) - String containing colon (step-ordering guard) - String with real newline and double quote (JSON escaping) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: verify Gemma 4 tool calls produce exact OpenAI format for Claude Code Integration tests that verify the full pipeline (parser → server models → JSON serialization) matches what Claude Code expects: tool_calls structure, null content, function.arguments as JSON string, correct finish_reason. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * add Gemma 4 auto-detection to AutoToolParser integrates Gemma 4 format as the first format tried in auto-detection, adds streaming markers for tool call start/end. based on keegoid's approach in #254. * remove unused pytest imports * run black on tool parser, tests, and server --------- Co-authored-by: Jack Neil <jackneil@Jacks-Mac-Studio.local> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Wayner Barrios <waybarrios@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
gemma4_tool_parser.pythat handles Gemma 4's unique tool calling format (<|tool_call>call:name{key:<|"|>value<|"|>}<tool_call|>)<|"|>string delimiters) to valid JSONgemma4to CLI--tool-call-parserchoicesBackground
Gemma 4 models use a non-standard tool call format defined in their chat template. Unlike other model families that use JSON or XML, Gemma 4 uses special tokens (
<|tool_call>,<tool_call|>,<|"|>) for structure and string quoting. This parser converts that format to OpenAI-compatibletool_callsresponses.Tested end-to-end with
mlx-community/gemma-4-31b-it-8biton Apple Silicon (M5 Max 128GB):Usage
Test plan
🤖 Generated with Claude Code