feat: add Vulkan REPEAT op support for f16 to f16. #23298

Merged: 0cc4m merged 7 commits into ggml-org:master from l8bloom:feat/add-repeat-op-for-f16-to-f16 on May 27, 2026

@l8bloom l8bloom commented May 18, 2026

Overview

Add Vulkan REPEAT op support for f16 to f16.

(Please advise if the PR is redundant and/or is missing steps toward a full implementation.)

Additional information

Getting:

[INFO ] stable-diffusion.cpp:4777 - sampling completed, taking 277.99s
[DEBUG] ggml_extend.hpp:1904 - ltx_audio_vae compute buffer size: 90.68 MB(VRAM)
[INFO ] ggml_extend.hpp:2142 - ltx_audio_vae offload params (339.87 MB, 1285 tensors) to runtime backend (Vulkan0), taking 0.08s
[ERROR] ggml_vulkan: Error: Missing op: REPEAT for f16 to f16
~/Projects/local_ai/stable-diffusion.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:9968: fatal error
[New LWP 233602]
[New LWP 233600]
[New LWP 233586]
[New LWP 233471]
[New LWP 233470]
[New LWP 233469]
[New LWP 233468] 

while running video generation (stable-diffusion.cpp, which relies on ggml) with the following models:

  • Diffusion Model: ltx-2.3-22b-dev-UD-Q4_K_M.gguf
  • Video VAE: ltx-2.3-22b-dev_video_vae.safetensors
  • Audio VAE: ltx-2.3-22b-dev_audio_vae.safetensors

Audio VAE encoding works fine on CPU, but fails with the message above on Linux with the Vulkan backend.

Reusing pipeline_repeat_f32 worked fine, but looks dubious.
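
One way to see why the element type matters here only through its size: REPEAT tiles the source into the destination without doing any arithmetic on the values. The following is a minimal CPU-side sketch of that idea, not the Vulkan shader or any ggml code; `repeat_2d` and its parameters are hypothetical names made up for illustration, and it handles only the row-major 2D case.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ggml): tile a row-major 2D tensor `src`
// of shape rows x cols into a destination of shape (rows*rep_r) x (cols*rep_c).
// T only has to have the right size; elements are copied bit-for-bit and
// never interpreted as numbers.
template <typename T>
std::vector<T> repeat_2d(const std::vector<T>& src,
                         std::size_t rows, std::size_t cols,
                         std::size_t rep_r, std::size_t rep_c) {
    const std::size_t out_rows = rows * rep_r;
    const std::size_t out_cols = cols * rep_c;
    std::vector<T> dst(out_rows * out_cols);
    for (std::size_t r = 0; r < out_rows; ++r) {
        for (std::size_t c = 0; c < out_cols; ++c) {
            // Wrap the destination index back into the source tensor.
            dst[r * out_cols + c] = src[(r % rows) * cols + (c % cols)];
        }
    }
    return dst;
}
```

Under this view, an f16 tensor can be passed as `std::vector<std::uint16_t>` holding the raw half-precision bit patterns (for example `0x3C00` for 1.0 and `0x4000` for 2.0), and an f32 tensor as `std::vector<std::uint32_t>`. A kernel written for 32-bit elements would read and write the wrong number of bytes per element if pointed at 16-bit data, so a separate 16-bit variant is the natural fix.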

Requirements

@l8bloom l8bloom requested a review from a team as a code owner May 18, 2026 21:51
@github-actions github-actions bot added the Vulkan (issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels May 18, 2026
Outdated comment threads: 4 on ggml/src/ggml-vulkan/ggml-vulkan.cpp, 1 on ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp

l8bloom commented May 25, 2026

Hi @jeffbolznv, can I help with the failing CI actions?
ubuntu-cpu-riscv64-native and ubuntu-cpu (x64, ubuntu-22.04) look like they could pass if re-run.

@l8bloom l8bloom requested a review from jeffbolznv May 25, 2026 19:05
jeffbolznv commented

They're probably unrelated, I wouldn't worry about them.


0cc4m commented May 27, 2026

@jeffbolznv can you approve as well?

@0cc4m 0cc4m merged commit 837bb6b into ggml-org:master May 27, 2026
47 of 50 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 27, 2026
* origin/master:
hexagon: add support for Q4_1 in MUL_MAT and MUL_MAT_ID (ggml-org#23647)
ggml-webgpu: Fix how to dispatch WG to some ops (ggml-org#23750)
vulkan: Switch MUL_MAT_VEC to 4 K per iteration for F16/32 (ggml-org#22887)
vulkan: use GL_NV_cooperative_matrix_decode_vector for faster matmul (ggml-org#23541)
vulkan: add REPEAT op support for f16 to f16. (ggml-org#23298)
ci : move ARM jobs to self-hosted + disable kleidiai mac release (ggml-org#23780)
vendor : update cpp-httplib to 0.46.0 (ggml-org#23650)
pyproject : add conversion folder and update dependencies (ggml-org#23746)
CUDA: restrict PDL to CTK >= 12.3 due to MSVC issues (ggml-org#23742)
ci : bump cuda release to 13.3 (ggml-org#23749)
common : fix env names to all have LLAMA_ARG_ prefix (ggml-org#23778)
ci : fix windows ccaches (ggml-org#23777)
ci : remove wasm test (ggml-org#23733)
vulkan: avoid preferring transfer queue on AMD UMA devices (ggml-org#22455)
ci : add ccache to server builds + fix undefined sanitizer build (ggml-org#23763)
docs : fix duplicated "the" in granitevision and model-conversion docs (ggml-org#23767)
convert: add MiniCPM5 tokenizer support (ggml-org#23384)
server : fix the log message when using SSL (ggml-org#23393)
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
* feat: extend repeat op for vulkan

* feat: add repeat_f16 vulkan pipeline

* fix: ensure same dst and src types

* fix: use type_size instead of data types

* fix: use int16 and int32 for repeat shader op

* chore: rename repeat_f* to repeat_i*

* chore: rename repeat vulkan pipelines
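
Read together, the commit titles above suggest a selection rule along these lines: require matching source and destination types, then pick the copy pipeline by element size rather than by data type. The sketch below is only an illustration of that rule under those assumptions; `ElemType`, `type_size`, `pick_repeat_pipeline` and the returned pipeline names are all hypothetical placeholders and do not reflect the identifiers used in ggml-vulkan.cpp.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration only; none of these names come from ggml.
enum class ElemType { F32, F16, I32, I16 };

// Size in bytes of one element of the given type.
std::size_t type_size(ElemType t) {
    switch (t) {
        case ElemType::F32:
        case ElemType::I32:
            return 4;
        case ElemType::F16:
        case ElemType::I16:
            return 2;
    }
    return 0;
}

// Returns a placeholder pipeline name, or nullopt when REPEAT is unsupported.
std::optional<std::string> pick_repeat_pipeline(ElemType src, ElemType dst) {
    if (src != dst) {
        return std::nullopt;  // REPEAT copies data, it does not convert types
    }
    switch (type_size(src)) {
        case 4:  return std::string("repeat_i32");  // placeholder name
        case 2:  return std::string("repeat_i16");  // placeholder name
        default: return std::nullopt;
    }
}
```

With a rule of this shape, f16 and any other 2-byte type share one pipeline, and f32 shares one with other 4-byte types, which would explain the move from float-named to integer-named variants in the commit titles.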
turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026