Conversation
Did you confirm the logits are the same for both text-only and vision prompts before and after the patch? Consider adding tests to test_patched_models.py in line with the Qwen 3.5 heavy tests to verify.
I verified that the logits match exactly for text-only prompts, and that they are close enough within a tolerance for image+text prompts. I added a test for each of these cases.
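The tolerance-based check described above can be sketched as a small helper. This is a hypothetical sketch: the function name is invented, and the top-k/rtol/floor defaults mirror the GEMMA4_IMAGE_TOPK* constants shown in the diff below, but the real test's logic may differ.

```python
import numpy as np


def assert_topk_probs_close(ref_logits, new_logits, k=5, rtol=0.25, ref_floor=1e-3):
    """Hypothetical helper: check that the patched model's next-token
    probabilities agree with the reference model's for the reference's
    top-k tokens, within a relative tolerance, skipping entries whose
    reference probability is below a floor."""
    def softmax(x):
        x = np.asarray(x, dtype=np.float64)
        e = np.exp(x - x.max())
        return e / e.sum()

    ref_p, new_p = softmax(ref_logits), softmax(new_logits)
    for tok in np.argsort(ref_p)[::-1][:k]:
        if ref_p[tok] < ref_floor:
            continue  # reference probability too small to compare meaningfully
        np.testing.assert_allclose(new_p[tok], ref_p[tok], rtol=rtol)
```

Text-only prompts can then use strict equality on the raw logits, while image+text prompts use this looser top-k probability comparison.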
tests/patched_model_test_utils.py
    REAL_MODEL_CASES = [
        pytest.param("lmstudio-community/Qwen3.5-2B-MLX-4bit", id="dense"),
        pytest.param(
            "lmstudio-community/Qwen3.5-35B-A3B-MLX-4bit",
            marks=pytest.mark.heavy,
            id="moe",
        ),
    ]
    GEMMA4_MODEL_NAME = "lmstudio-community/gemma-4-E2B-it-MLX-4bit"
    GEMMA4_IMAGE_TOPK = 5
    GEMMA4_IMAGE_TOPK_PROB_RTOL = 0.25
    GEMMA4_IMAGE_TOPK_PROB_REF_FLOOR = 1e-3
Feels like these should be in the model-specific test files, not the shared utils.
Moved these to the model-specific files
    from tests.shared import read_image_b64


    def test_gemma4_text_only_generation_patched_matches_unpatched():
Should this be heavy? I see the VLM test is.
Yes, I missed it. Added pytestmark = pytest.mark.heavy.
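For context, a module-level pytestmark applies the mark to every test in the file, so the whole module can be deselected with `-m "not heavy"`. A minimal sketch (the test body here is a placeholder, not the actual test):

```python
import pytest

# Module-level mark: every test collected from this file carries the
# "heavy" mark, matching the fix described above.
pytestmark = pytest.mark.heavy


def test_gemma4_text_only_generation_patched_matches_unpatched():
    # Placeholder body; the real test compares patched vs. unpatched output.
    assert True
```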
Summary of changes:
- _BatchedLogitsProcessorAdapter, now that "Align batch logits processor token contract" (ml-explore/mlx-lm#1115) has landed.
- per_layer_inputs is wired into the language model.
- embed_scale is un-applied before passing embeddings into the language model, because mlx-lm will apply it.

To verify that this implementation is correct, I ensured the following things:

- The input_ids reaching the first mlx-lm pass match native mlx-vlm.
- The per_layer_inputs reaching that same pass also match native mlx-vlm.
- The embeddings reaching mlx-lm match native behavior. The only difference is a slight rounding error for image embeddings from the un-application of the embed_scale followed by a later re-application.
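The embed_scale round trip can be illustrated with a small sketch (the function name is hypothetical; in the actual patch this happens on the vision embeddings before they are handed to mlx-lm, and the rounding error mentioned above comes from doing this divide/re-multiply in low precision):

```python
import numpy as np


def prepare_inputs_embeds(scaled_embeds, embed_scale):
    """Hypothetical sketch: the vision path produces already-scaled
    embeddings, but mlx-lm multiplies by embed_scale itself, so we
    divide the scale out first to avoid double-scaling."""
    return np.asarray(scaled_embeds, dtype=np.float64) / embed_scale


embed_scale = 32.0
raw = np.array([0.5, -1.25, 2.0])
scaled = raw * embed_scale
# Un-apply the scale, then simulate mlx-lm re-applying it internally.
roundtrip = prepare_inputs_embeds(scaled, embed_scale) * embed_scale
assert np.allclose(roundtrip, scaled)
```

In full precision the round trip is exact; in a quantized/half-precision pipeline it introduces the slight rounding error noted above.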
