danbev
approved these changes
Jun 3, 2026
ggerganov
approved these changes
Jun 3, 2026
Contributor
Author
@CISC I need to merge this now, but I can push fixes in a follow-up PR if you spot any problems.
Contributor
For those of you waiting for more information: https://developers.googleblog.com/gemma-4-12b-the-developer-guide/
LostRuins
reviewed
Jun 4, 2026
            return ctx->model.mm_fc_w->ne[1];
        case PROJECTOR_TYPE_LFM2A:
            return ctx->model.position_embeddings->ne[0];
        case PROJECTOR_TYPE_GEMMA4A:
Collaborator
Hello, why is case PROJECTOR_TYPE_GEMMA4A: removed? Now loading gemma E4B will give GGML_ABORT("Unknown projector type");
> Hello, why is case PROJECTOR_TYPE_GEMMA4A: removed? Now loading gemma E4B will give GGML_ABORT("Unknown projector type");
Same issue here.
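For readers following the thread, here is a minimal sketch of why dropping a case label from this kind of switch turns a previously working model into an abort. Everything in it is a hypothetical stand-in: `fake_tensor`, `fake_model`, `audio_proj_w` and `n_mmproj_embd` are illustration names, the exception replaces GGML_ABORT so the behaviour can be tested, and the dimension returned for GEMMA4A is an assumption because the diff does not show it.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-ins for illustration only; not the real mtmd/clip types.
enum projector_type {
    PROJECTOR_TYPE_LFM2A,
    PROJECTOR_TYPE_GEMMA4A,
    PROJECTOR_TYPE_OTHER, // any type that has no case label below
};

struct fake_tensor {
    int64_t ne[4]; // dimensions, laid out like ggml_tensor::ne
};

struct fake_model {
    fake_tensor position_embeddings;
    fake_tensor audio_proj_w; // hypothetical tensor name
};

// Returns the projector embedding size for a given projector type.
// A type without a case label falls through to the default branch,
// which stands in for GGML_ABORT("Unknown projector type").
int64_t n_mmproj_embd(const fake_model & model, projector_type type) {
    switch (type) {
        case PROJECTOR_TYPE_LFM2A:
            return model.position_embeddings.ne[0];
        case PROJECTOR_TYPE_GEMMA4A:
            // Assumption: the tensor and dimension read by the real code
            // are not visible in the diff above.
            return model.audio_proj_w.ne[1];
        default:
            throw std::runtime_error("Unknown projector type");
    }
}
```

With the GEMMA4A label present the call returns a size; removing it sends the same call to the default branch, which matches the failure reported above.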
LostRuins
added a commit
to LostRuins/koboldcpp
that referenced
this pull request
Jun 4, 2026
CISC
reviewed
Jun 4, 2026
            } break;
        case PROJECTOR_TYPE_GEMMA4UV:
            {
                model.mm_input_proj_w = get_tensor(TN_MM_INP_PROJ);
Contributor
Author
it's already included in the name, but I think that should be refactored at some point:
#define TN_MM_INP_PROJ "mm.input_projection.weight"
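One possible shape for the refactor mentioned here, assuming the remark refers to the ".weight" suffix being hard-coded in the macro. This is only a sketch: `TN_MM_INP_PROJ_BASE` and `tensor_name` are hypothetical names and are not taken from the codebase.

```cpp
#include <string>

// Hypothetical refactor: the macro keeps only the base name ...
#define TN_MM_INP_PROJ_BASE "mm.input_projection"

// ... and the suffix ("weight", "bias", ...) is appended at the call site.
std::string tensor_name(const std::string & base, const std::string & suffix) {
    return base + "." + suffix;
}
```

Calling `tensor_name(TN_MM_INP_PROJ_BASE, "weight")` yields the same string as the current macro, so existing lookups would keep resolving to the same tensor.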
LostRuins
added a commit
to LostRuins/koboldcpp
that referenced
this pull request
Jun 4, 2026
This reverts commit da0bb97.
TheTom
pushed a commit
to TheTom/llama-cpp-turboquant
that referenced
this pull request
Jun 5, 2026
* add model
* nits
(cherry picked from commit a731805)
TheTom
added a commit
to TheTom/llama-cpp-turboquant
that referenced
this pull request
Jun 5, 2026
#168)
* mtmd, model: allow skip build_vit() (ggml-org#24077)
  * add model
  * nits
  (cherry picked from commit a731805)
* mtmd: fix Gemma 4 unified FPE (ggml-org#24088)
  (cherry picked from commit 94a220c)
* mtmd: enable non-causal vision for gemma 4 unified (ggml-org#24082)
  (cherry picked from commit c8d6a00)
* fix(mtmd): handle Gemma 4 audio projector embedding size (ggml-org#24091)
  * mtmd: handle Gemma 4 audio projector embedding size
  * rm projection_dim from clip_n_mmproj_embd
  ---------
  Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
  (cherry picked from commit e3ba22d)
* convert: Fix Gemma 4 Unified conversion (ggml-org#24118)
  * Fix Gemma 4 Unified conversion
  * Set audio hidden size to audio_embed_dim
  (cherry picked from commit e802356)
* ggml-metal: fall back to CPU for im2col when KH*KW exceeds threadgroup limit

  The Metal im2col kernel launches KH*KW threads per threadgroup (one per kernel element). For large conv kernels, e.g. the Gemma 4 unified vision (gemma4uv) patch embedding, KH*KW exceeds the Apple GPU 1024-thread cap and the kernel hits a runtime GGML_ASSERT instead of producing a result.

  Guard supports_op so an oversized im2col is declined; the backend scheduler then runs that one op on CPU while the rest of the graph stays on the GPU.

  Fixes Gemma 4 12B vision on the Metal backend (verified end-to-end: loads mmproj + describes an image correctly on an M5 Max).
---------
Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
Co-authored-by: Andrei <abetlen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
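The im2col guard described in the last commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ggml-metal change: `im2col_fits_threadgroup` and `MAX_THREADS_PER_THREADGROUP` are hypothetical names, and the 1024 figure is the cap quoted in the commit message.

```cpp
#include <cstdint>

// Thread cap per threadgroup, as quoted in the commit message (assumption:
// the real code may query this limit from the device instead).
constexpr int64_t MAX_THREADS_PER_THREADGROUP = 1024;

// Hypothetical helper: the kernel launches one thread per kernel element,
// so it needs KH*KW threads in a single threadgroup. Returning false here
// models supports_op declining the op so the scheduler can run it on CPU.
bool im2col_fits_threadgroup(int64_t kh, int64_t kw,
                             int64_t max_threads = MAX_THREADS_PER_THREADGROUP) {
    return kh * kw <= max_threads;
}
```

A 3x3 kernel needs 9 threads and a 32x32 kernel needs exactly 1024, so both would stay on the GPU; a 48x48 kernel needs 2304 and would be declined.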
Overview
More info about this PR will be added soon
Requirements