mtmd: fix gemma 4 audio rms norm eps by ngxson · Pull Request #23815 · ggml-org/llama.cpp

ngxson · 2026-05-28T11:55:29Z

Overview

Seems to be a mistake from #21421

All gemma 4 models (text / vision / audio) use 1e-6 for norm eps.

In the GGUF conversion code, however, it is hard-coded to 1e-5.
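To give a rough sense of why the epsilon value matters, here is a minimal sketch. It assumes the usual RMS norm definition, `x / sqrt(mean(x^2) + eps)`, and leaves out the learned weight. The input vectors are made-up values and the helper names are hypothetical; none of this is taken from the llama.cpp source.

```python
import math

def rms_norm(x, eps):
    """RMS-normalize a vector: x / sqrt(mean(x^2) + eps). Learned weight omitted."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]

def rel_diff(x, eps_a, eps_b):
    """Largest relative difference between the outputs for two eps values."""
    a = rms_norm(x, eps_a)
    b = rms_norm(x, eps_b)
    return max(abs(p - q) / abs(q) for p, q in zip(a, b) if q != 0.0)

# Hypothetical activations at two magnitudes (illustrative only).
large = [1.0, -2.0, 0.5, 3.0]
small = [v * 1e-3 for v in large]

diff_large = rel_diff(large, 1e-5, 1e-6)  # mean(x^2) is far larger than either eps
diff_small = rel_diff(small, 1e-5, 1e-6)  # mean(x^2) is comparable to eps

print(f"large inputs: {diff_large:.2e}")
print(f"small inputs: {diff_small:.2e}")
```

In this sketch the two epsilon values give nearly identical outputs when the mean square of the input is much larger than eps, and visibly different outputs when it is of similar size to eps. How much that matters for the Gemma 4 audio encoder depends on the real activation magnitudes, which this example does not measure.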

I guess that's why we have trust issues with AI-generated code

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure:

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

CISC

Wow, no \r\n mess this time? 😆

ngxson · 2026-05-28T14:26:07Z

hmm, something goes wrong with the CI?

* mtmd: fix gemma 4 audio rms norm eps

* Update tools/mtmd/clip.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* origin/master: (32 commits)
  hexagon: basic/generic op fusion support and RMS_NORM+MUL fusion (ggml-org#23835)
  mtmd-debug: add color and rainbow mode (ggml-org#23829)
  mtmd: fix gemma 4 projector pre_norm (ggml-org#23822)
  opencl: move backend info printing into its own function (ggml-org#23702)
  ci : run ui publish on ubuntu-slim (ggml-org#23818)
  ui: fix audio and video modality detection (ggml-org#23756)
  ci : releases use Github-hosted builds for the UI (ggml-org#23823)
  app : improve help output (ggml-org#23805)
  mtmd: n_head_kv defaults to n_head (ggml-org#23782)
  mtmd: fix gemma 4 audio rms norm eps (ggml-org#23815)
  ci : change Vulkan builds to Release to reduce ccache (ggml-org#23820)
  arg: Add LLAMA_ARG_API_KEY_FILE environment variable for --api-key-file (ggml-org#23167)
  test-llama-archs: fix table format [no release] (ggml-org#23810)
  ggml: auto apply iGPU flag CUDA/HIP if integrated device (ggml-org#23007)
  mmvq Optim: add MMVQ_PARAMETERS_TURING(mmvq_parameter_table_id) for … (ggml-org#23729)
  CUDA: route batch>=4 quantized matmul to MMQ on AMD MFMA hardware (ggml-org#23227)
  server: minor tweaks to use more cpp features (ggml-org#23785)
  hexagon: minor refresh for HMX FA and MM (ggml-org#23796)
  vulkan: fast path for walsh-hadamard transform (ggml-org#23687)
  chat : add Granite 4.1 chat template (ggml-org#23518)
  ...

joseph777111 · 2026-06-02T04:08:01Z

@danielhanchen How would this affect the computation of Gemma-4's imatrices? Would hard-coding norm_eps to 1e-5, instead of the 1e-6 that Gemma-4 expects, cause enough representational loss that Gemma-4's imatrices need to be recomputed? I'm curious whether we should compute a new imatrix for Gemma-4 from freshly converted Gemma-4 models. 🤔

Obviously this PR is already merged, so you definitely want to use the latest version of llama.cpp (main). But my question still stands: norm_eps seems significant enough that it needs to be configured correctly. With norm_eps at 1e-5 (which is wrong for Gemma-4), we lose accuracy.

Update: I have been playing with a newly converted and quantized version of Gemma-4-E4B-it, and the model seems more capable in its responses (I haven't run into refusals, even with reasoning on and a custom system prompt). This is notable because I still hit refusals with a newly converted, imatrix-quantized (Unsloth's imatrix) version of Gemma-4-26B-4B-it, even though the proper norm_eps (1e-6) seems to make the model behave better. 🤔

ngxson · 2026-06-02T13:20:15Z

imatrix doesn't cover multimodal input or mmproj, so it's unaffected

mtmd: fix gemma 4 audio rms norm eps

2528c65

ngxson requested review from a team and CISC as code owners May 28, 2026 11:55

danbev approved these changes May 28, 2026

CISC approved these changes May 28, 2026

github-actions Bot added examples python python script changes labels May 28, 2026

CISC reviewed May 28, 2026

Comment thread tools/mtmd/clip.cpp Outdated

Update tools/mtmd/clip.cpp

c807c2c

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

CISC approved these changes May 28, 2026

ggerganov approved these changes May 28, 2026

ngxson merged commit d6be315 into master May 28, 2026
22 of 29 checks passed

ngxson merged 2 commits into master from xsn/fix_gemma4a_esp

5 participants

Conversation

ngxson commented May 28, 2026

Overview

Requirements

Uh oh!

Uh oh!

CISC left a comment

Choose a reason for hiding this comment

Uh oh!

ngxson commented May 28, 2026

Uh oh!

Uh oh!

joseph777111 commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ngxson commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

joseph777111 commented Jun 2, 2026 •

edited

Loading