mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos (breaking change)#22082

Merged
ngxson merged 2 commits into ggml-org:master from ngxson:xsn/decoder_pos_0
Apr 19, 2026

Contributor

@ngxson ngxson commented Apr 18, 2026

Overview

Add a pos_0 parameter to mtmd_image_tokens_get_decoder_pos. This allows the model to have total control over the other dimensions of the RoPE positions.

Tested on Qwen3 and confirmed that it doesn't break anything
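
To illustrate why receiving pos_0 gives a model control over every position dimension, here is a minimal sketch. It does not reproduce the real mtmd signature or struct: `decoder_pos_sketch`, `grid_pos`, `grid_pos_local`, and the grid layouts are all hypothetical names and assumptions made up for this example.

```cpp
#include <cstdint>

// Hypothetical result type for illustration only; not the mtmd struct.
struct decoder_pos_sketch {
    int32_t t; // temporal position
    int32_t x; // column position
    int32_t y; // row position
};

// Assumed layout A: every image token shares the temporal position pos_0,
// while x and y are offset from pos_0 by the token's column and row.
// i is the token index inside the image, nx is the number of tokens per row.
decoder_pos_sketch grid_pos(int32_t pos_0, int i, int nx) {
    return { pos_0, pos_0 + i % nx, pos_0 + i / nx };
}

// Assumed layout B: only t depends on pos_0; x and y restart from zero.
// A callee that never sees pos_0 could not choose between A and B itself.
decoder_pos_sketch grid_pos_local(int32_t pos_0, int i, int nx) {
    return { pos_0, i % nx, i / nx };
}
```

With pos_0 = 10 and a row width of 4, token 5 sits at row 1, column 1: layout A yields (t, x, y) = (10, 11, 11) and layout B yields (10, 1, 1). Because the starting position is an input, the choice between such layouts can live inside the per-model position function instead of in the caller.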

Requirements

@ngxson ngxson requested a review from a team April 18, 2026 13:00
@ngxson ngxson requested a review from a team as a code owner April 18, 2026 13:00
Member

@CISC CISC left a comment

You need to fix the failing test obviously. :)

  pos[i + batch.n_tokens    ] = pos_0 + i;
  pos[i + batch.n_tokens * 2] = pos_0 + i;
- pos[i + batch.n_tokens * 3] = 0; // last pos dim is unused
+ pos[i + batch.n_tokens * 3] = pos_0 + i;
Member

Is this change ok?

Contributor Author

Yes, it should not change anything: mrope_sections is always configured so that the last dimension is 0, so the backend always skips the positional data being set here.
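
The argument in this reply can be checked with a small sketch. It is a simplified stand-in, not the backend's RoPE code: `fill_text_pos`, `positions_read`, and the section sizes used in the usage note are assumptions for illustration. It models the position buffer as dimension-major (4 * n_tokens entries) and models the sections as "how many rotary pairs read each dimension".

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical helper: fill a dimension-major position buffer for n_tokens
// text tokens starting at pos_0. set_last_dim selects between the old
// behaviour (last dim = 0) and the new one (last dim = pos_0 + i).
std::vector<int32_t> fill_text_pos(int32_t pos_0, int n_tokens, bool set_last_dim) {
    std::vector<int32_t> pos(4 * n_tokens);
    for (int i = 0; i < n_tokens; ++i) {
        pos[i]                = pos_0 + i;
        pos[i + n_tokens]     = pos_0 + i;
        pos[i + n_tokens * 2] = pos_0 + i;
        pos[i + n_tokens * 3] = set_last_dim ? pos_0 + i : 0;
    }
    return pos;
}

// Simplified model of section-based selection: sections[d] is the number of
// rotary pairs that read dimension d. The result lists, pair by pair, the
// position value one token would feed into the rotation.
std::vector<int32_t> positions_read(const std::vector<int32_t> & pos, int n_tokens,
                                    int token, const std::array<int, 4> & sections) {
    std::vector<int32_t> out;
    for (int d = 0; d < 4; ++d) {
        for (int k = 0; k < sections[d]; ++k) {
            out.push_back(pos[token + n_tokens * d]);
        }
    }
    return out;
}
```

With illustrative sections such as {16, 24, 24, 0}, the two buffers differ in memory, yet `positions_read` returns the same values for every token, because no rotary pair is assigned to the last dimension. That is the sense in which the change to the fourth row has no effect under this model.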

@github-actions github-actions Bot added the testing Everything test related label Apr 18, 2026
@ngxson ngxson requested a review from CISC April 18, 2026 21:38
@ngxson ngxson merged commit 1912407 into ggml-org:master Apr 19, 2026
50 of 51 checks passed
wendadawen pushed a commit to ManaEstras/llama.cpp that referenced this pull request Apr 20, 2026
- decoder_pos: move HunyuanVL BOI/EOI/newline layout into mtmd_image_tokens_get_decoder_pos (matches ggml-org#22082)
- remove set_position_mrope_hunyuanvl and mtmd_decode_use_mrope_hunyuanvl; mtmd-helper.cpp now identical to master
- image_tokens: replace n_tokens_total with n_boi/n_eoi/n_newline/image_idx
- convert: drop hardcoded vit.perceive.* remapping, use standard tensor mapping
- clip: temporarily use ggml_interpolate (without custom sf)
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Apr 21, 2026
ggml-org#22082)

* mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos

* fix build
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Apr 23, 2026
ggml-org#22082)

* mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos

* fix build
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
ggml-org#22082)

* mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos

* fix build
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
ggml-org#22082)

* mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos

* fix build
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
ggml-org#22082)

* mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos

* fix build
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
ggml-org#22082)

* mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos

* fix build
Labels

examples, testing

3 participants