fix: Handle Gemma 3 required pixel_values parameter in continuous batching #4

Merged
lubauss merged 1 commit into main from fix/gemma3-pixel-values on Jan 19, 2026

@lubauss lubauss commented Jan 19, 2026

Summary

  • Fixes continuous batching for Gemma 3, which failed with TypeError: Model.__call__() missing 1 required positional argument: 'pixel_values'
  • Gemma 3's model requires pixel_values as a positional argument, unlike Qwen2-VL, which makes it optional
  • The MLLMModelWrapper now injects pixel_values=None for text-only requests

Technical Details

The issue comes from how different MLLMs declare the pixel_values parameter in __call__():

Qwen2-VL:

def __call__(self, input_ids, pixel_values: Optional = None, ...)

Gemma 3:

def __call__(self, input_ids, pixel_values, ...)  # Required!

BatchGenerator calls the model as model(input_ids, cache=cache). That call works for Qwen2-VL, whose pixel_values defaults to None, but Gemma 3 raises a TypeError because no pixel_values argument is supplied.
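
The failure and the fix can be sketched with stand-in classes. This is a minimal illustration, not the actual MLLMModelWrapper code: OptionalPixelModel, RequiredPixelModel, PixelValuesWrapper, and call_text_only are hypothetical names that only mimic the two signatures and the call shape described above.

```python
from typing import Any, Optional


class OptionalPixelModel:
    """Stand-in for a model whose pixel_values is optional (Qwen2-VL style)."""

    def __call__(self, input_ids, pixel_values: Optional[Any] = None, cache=None):
        return ("optional", input_ids, pixel_values, cache)


class RequiredPixelModel:
    """Stand-in for a model whose pixel_values is required (Gemma 3 style)."""

    def __call__(self, input_ids, pixel_values, cache=None):
        return ("required", input_ids, pixel_values, cache)


class PixelValuesWrapper:
    """Hypothetical wrapper: supplies pixel_values=None when the caller omits it."""

    def __init__(self, model):
        self._model = model

    def __call__(self, input_ids, *args, **kwargs):
        # Inject only when pixel_values was passed neither positionally
        # nor by keyword, so image requests keep their real pixel_values.
        if not args and "pixel_values" not in kwargs:
            kwargs["pixel_values"] = None
        return self._model(input_ids, *args, **kwargs)


def call_text_only(model, input_ids, cache):
    """Mimics the text-only call shape: model(input_ids, cache=cache)."""
    return model(input_ids, cache=cache)
```

Calling call_text_only on a bare RequiredPixelModel raises a TypeError, while wrapping either model in PixelValuesWrapper lets the same call succeed. Injecting at the wrapper keeps the text-only caller unchanged, which matches the approach the summary describes.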

Test plan

  • Server starts successfully with Gemma 3 27B 4-bit
  • Text generation works
  • Streaming works
  • Prefix caching works (2x speedup observed)
  • Concurrent requests work

🤖 Generated with Claude Code

…ching

Gemma 3's model __call__() requires pixel_values as a positional argument,
unlike Qwen2-VL which makes it optional. This caused "missing required
positional argument: 'pixel_values'" errors when using continuous batching
with text-only requests.

The MLLMModelWrapper now injects pixel_values=None for text-only requests,
enabling Gemma 3 to work with continuous batching and prefix caching.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@lubauss lubauss merged commit c283b49 into main Jan 19, 2026
@lubauss lubauss deleted the fix/gemma3-pixel-values branch January 19, 2026 20:43
lubauss added a commit that referenced this pull request Jan 20, 2026
…ching (#4)