Skip to content

mtmd: correct get_n_pos / get_decoder_pos#22175

Merged
ngxson merged 1 commit into
ggml-org:masterfrom
ngxson:xsn/mtmd_get_n_pos_get_decoder_pos
Apr 20, 2026
Merged

mtmd: correct get_n_pos / get_decoder_pos#22175
ngxson merged 1 commit into
ggml-org:masterfrom
ngxson:xsn/mtmd_get_n_pos_get_decoder_pos

Conversation

@ngxson
Copy link
Copy Markdown
Contributor

@ngxson ngxson commented Apr 20, 2026

Overview

Continue #22161

  • Add a new enum mtmd_pos_type for using internally (the idea is to support falcon-ocr positioning style in near future)
  • Also update get_n_pos and get_decoder_pos to return the correct positioning style

Test results:

[vision] OK:   ggml-org/SmolVLM-500M-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/SmolVLM2-2.2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/SmolVLM2-500M-Video-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
[vision] OK:   THUDM/glm-edge-v-5b-gguf:Q4_K_M
[vision] OK:   second-state/Llava-v1.5-7B-GGUF:Q2_K
[vision] OK:   cjpais/llava-1.6-mistral-7b-gguf:Q3_K_M
[vision] OK:   ibm-research/granite-vision-3.2-2b-GGUF:Q4_K_M
[vision] OK:   second-state/MiniCPM-Llama3-V-2_5-GGUF:Q2_K
[vision] OK:   openbmb/MiniCPM-V-2_6-gguf:Q2_K
[vision] OK:   openbmb/MiniCPM-o-2_6-gguf:Q4_0
[vision] OK:   bartowski/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-3B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/InternVL2_5-1B-GGUF:Q8_0
[vision] OK:   ggml-org/InternVL3-1B-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/Qwen2.5-Omni-3B-GGUF:Q4_K_M
[vision] OK:   ggml-org/LFM2-VL-450M-GGUF:Q8_0
[vision] OK:   ggml-org/granite-docling-258M-GGUF:Q8_0
[vision] OK:   ggml-org/LightOnOCR-1B-1025-GGUF:Q8_0
[vision] OK:   ggml-org/DeepSeek-OCR-GGUF:Q8_0
[vision] OK:   ggml-org/dots.ocr-GGUF:Q8_0
[vision] OK:   ggml-org/HunyuanOCR-GGUF:Q8_0
[vision] OK:   ggml-org/gemma-4-E2B-it-GGUF:Q8_0
[audio]  OK:   ggml-org/ultravox-v0_5-llama-3_2-1b-GGUF:Q8_0
[audio]  OK:   ggml-org/Qwen2.5-Omni-3B-GGUF:Q4_K_M
[audio]  OK:   ggml-org/Voxtral-Mini-3B-2507-GGUF:Q4_K_M
[audio]  OK:   ggml-org/LFM2-Audio-1.5B-GGUF:Q8_0
[audio]  OK:   ggml-org/gemma-4-E2B-it-GGUF:Q8_0
[audio]  OK:   ggml-org/Qwen3-ASR-0.6B-GGUF:Q8_0
[vision] OK:   ggml-org/pixtral-12b-GGUF:Q4_K_M
[vision] OK:   ggml-org/Mistral-Small-3.1-24B-Instruct-2503-GGUF
[vision] OK:   ggml-org/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2-VL-7B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-3B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-7B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen3-VL-2B-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/InternVL3-8B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/InternVL3-14B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-Omni-7B-GGUF:Q4_K_M
[vision] OK:   ggml-org/GLM-4.6V-Flash-GGUF:Q4_K_M
[audio]  OK:   ggml-org/ultravox-v0_5-llama-3_1-8b-GGUF:Q4_K_M
[audio]  OK:   ggml-org/Qwen2.5-Omni-7B-GGUF:Q4_K_M

Requirements

@ngxson ngxson requested a review from a team April 20, 2026 15:49
@ngxson ngxson requested a review from a team as a code owner April 20, 2026 15:49
@ngxson ngxson merged commit 86f8daa into ggml-org:master Apr 20, 2026
50 of 51 checks passed
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Apr 21, 2026
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Apr 23, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants