feat: add Vulkan REPEAT op support for f16 to f16.#23298
Merged
0cc4m merged 7 commits intoMay 27, 2026
Conversation
jeffbolznv
requested changes
May 18, 2026
0cc4m
reviewed
May 19, 2026
jeffbolznv
reviewed
May 19, 2026
jeffbolznv
reviewed
May 20, 2026
Contributor
Author
|
Hi, @jeffbolznv can I help with the failing CI actions? |
Contributor
|
They're probably unrelated, I wouldn't worry about them. |
0cc4m
approved these changes
May 27, 2026
Contributor
|
@jeffbolznv can you approve as well? |
jeffbolznv
approved these changes
May 27, 2026
gabe-l-hart
added a commit
to gabe-l-hart/llama.cpp
that referenced
this pull request
May 27, 2026
* origin/master: hexagon: add support for Q4_1 in MUL_MAT and MUL_MAT_ID (ggml-org#23647) ggml-webgpu: Fix how to dispatch WG to some ops (ggml-org#23750) vulkan: Switch MUL_MAT_VEC to 4 K per iteration for F16/32 (ggml-org#22887) vulkan: use GL_NV_cooperative_matrix_decode_vector for faster matmul (ggml-org#23541) vulkan: add REPEAT op support for f16 to f16. (ggml-org#23298) ci : move ARM jobs to self-hosted + disable kleidiai mac release (ggml-org#23780) vendor : update cpp-httplib to 0.46.0 (ggml-org#23650) pyproject : add conversion folder and update dependencies (ggml-org#23746) CUDA: restrict PDL to CTK >= 12.3 due to MSVC issues (ggml-org#23742) ci : bump cuda release to 13.3 (ggml-org#23749) common : fix env names to all have LLAMA_ARG_ prefix (ggml-org#23778) ci : fix windows ccaches (ggml-org#23777) ci : remove wasm test (ggml-org#23733) vulkan: avoid preferring transfer queue on AMD UMA devices (ggml-org#22455) ci : add ccache to server builds + fix undefined sanitizer build (ggml-org#23763) docs : fix duplicated "the" in granitevision and model-conversion docs (ggml-org#23767) convert: add MiniCPM5 tokenizer support (ggml-org#23384) server : fix the log message when using SSL (ggml-org#23393)
fewtarius
pushed a commit
to fewtarius/llama.cpp
that referenced
this pull request
May 30, 2026
* feat: extend repeat op for vulkan * feat: add repeat_f16 vulkan pipeline * fix: ensure same dst and src types * fix: use type_size instead of data types * fix: use int16 and int32 for repeat shader op * chore: rename repeat_f* to repeat_i* * chore: rename repeat vulkan pipelines
turbo-tan
pushed a commit
to turbo-tan/llama.cpp-tq3
that referenced
this pull request
Jun 2, 2026
* feat: extend repeat op for vulkan * feat: add repeat_f16 vulkan pipeline * fix: ensure same dst and src types * fix: use type_size instead of data types * fix: use int16 and int32 for repeat shader op * chore: rename repeat_f* to repeat_i* * chore: rename repeat vulkan pipelines
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview
Add Vulkan REPEAT op support for f16 to f16.
(Please advise if the PR is redundant and/or missing steps to full implementation)
Additional information
Getting:
while running video generation(stable-diffusion.cpp - relies on ggml) with the following models:
ltx-2.3-22b-dev-UD-Q4_K_M.ggufltx-2.3-22b-dev_video_vae.safetensorsltx-2.3-22b-dev_audio_vae.safetensorsaudio-vaeencoding works fine on CPU, but breaks with the message above on Linux-VulkanReusing
pipeline_repeat_f32worked fine, but looks dubious.Requirements