ggml-cuda: refactor fusion code by am17an · Pull Request #22468 · ggml-org/llama.cpp

am17an · 2026-04-28T11:19:59Z

Overview

Refactor the fusion code into a single function. Also fix a bug where the fusion code does not check the value of the environment variable that disables fusion (GGML_CUDA_DISABLE_FUSION).
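To illustrate the "single function" shape, here is a minimal sketch of a graph loop driven by one fusion probe that returns the number of nodes to skip. Everything in it is hypothetical (`toy_node`, `toy_try_fuse`, `toy_compute`, and the MUL+ADD pattern are invented for illustration and are not ggml types or the PR's code). It also assumes the return value counts the extra nodes consumed by a fused kernel, which the PR text does not spell out.

```cpp
#include <vector>

// Hypothetical stand-ins; these are NOT the real ggml types or ops.
constexpr int OP_ADD   = 0;
constexpr int OP_MUL   = 1;
constexpr int OP_OTHER = 2;

struct toy_node {
    int op;
};

// Toy fusion probe: returns how many nodes after index i are covered by a
// fused kernel, or 0 when no fusion pattern matches at i.
static int toy_try_fuse(const std::vector<toy_node> & nodes, int i) {
    const int n = (int) nodes.size();
    if (i + 1 < n && nodes[i].op == OP_MUL && nodes[i + 1].op == OP_ADD) {
        return 1; // MUL followed by ADD: one extra node is consumed
    }
    return 0;
}

// Walks the node list and returns the number of kernel launches.
static int toy_compute(const std::vector<toy_node> & nodes) {
    int launches = 0;
    for (int i = 0; i < (int) nodes.size(); ++i) {
        const int skip = toy_try_fuse(nodes, i);
        launches++; // one launch, fused or not
        i += skip;  // jump over the nodes the fused kernel already handled
    }
    return launches;
}
```

With this shape the caller needs no knowledge of individual fusion patterns; it only advances the loop index by whatever the probe reports.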

Additional information

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: No

gaugarg-nv · 2026-04-28T12:12:18Z

+// try and fuse nodes and return the number of nodes to skip
+static int ggml_cuda_try_fuse(ggml_backend_cuda_context * cuda_ctx, ggml_cgraph * cgraph, int i) {
+
+    static bool disable_fusion = getenv("GGML_CUDA_DISABLE_FUSION") != nullptr && std::atoi(getenv("GGML_CUDA_DISABLE_FUSION")) == 1;


gaugarg-nv · 2026-04-28

I'm fine with doing an explicit check for the value, but this seems inconsistent with how GGML checks env variables in other backends and within CUDA.

For example, consider GGML_VK_DISABLE_FUSION or GGML_CUDA_DISABLE_GRAPHS.

am17an · 2026-04-28

I think it's just generally inconsistent across the codebase. I prefer explicitly checking the value because it lets me benchmark using 0 and 1. There are already cases where it is done like that (e.g. GGML_CUDA_GRAPH_OPT, LLAMA_ATTN_ROT_DISABLE), so there is no convention per se.

gaugarg-nv · 2026-04-28

Yes, I agree. I don't have a strong opinion on this.
So, the PR is fine as is.
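The two styles discussed in this thread can be sketched side by side. This is only an illustration: the helper names `env_flag_present` and `env_flag_is_one` are hypothetical, and the presence-only form is an assumption about what the other variables mentioned above do, based on the comment rather than on their source.

```cpp
#include <cstdlib>

// Presence-only style (assumed): the flag is on whenever the variable exists,
// so VAR=0 still turns it on. `value` is what std::getenv returned.
static bool env_flag_present(const char * value) {
    return value != nullptr;
}

// Explicit-value style from the diff above: the flag is on only for the value 1,
// which allows toggling between VAR=0 and VAR=1 when benchmarking.
static bool env_flag_is_one(const char * value) {
    return value != nullptr && std::atoi(value) == 1;
}

// Reading the real variable with the explicit-value style.
static bool fusion_disabled_by_env() {
    return env_flag_is_one(std::getenv("GGML_CUDA_DISABLE_FUSION"));
}
```

Under the presence-only style a benchmark script has to unset the variable to re-enable the feature; under the explicit-value style it can keep the variable and flip it between 0 and 1.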


JohannesGaessler

Regarding the formatting: my preference is for there to be a visual distinction between the () and {} blocks of a conditional statement but I don't feel particularly strongly about it either way.

JohannesGaessler · 2026-04-29T07:49:08Z

+// try and fuse nodes and return the number of nodes to skip
+static int ggml_cuda_try_fuse(ggml_backend_cuda_context * cuda_ctx, ggml_cgraph * cgraph, int i) {
+
+    static bool disable_fusion = getenv("GGML_CUDA_DISABLE_FUSION") != nullptr && std::atoi(getenv("GGML_CUDA_DISABLE_FUSION")) == 1;


I definitely would not expect you to refactor environment variables in this PR but my preference would be to have the same semantics as for the truth values of integers in C/C++. Meaning that a value of 0 is false and all other values are true. With this implementation a value of 2 would be evaluated as false.
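The difference between the reviewed check and the C/C++-style truthiness suggested here can be sketched as follows. The helper names `parse_flag_strict` and `parse_flag_truthy` are hypothetical, and the truthy variant is one plausible reading of the suggestion, not necessarily the code that was merged.

```cpp
#include <cstdlib>

// Check as written in the reviewed diff: only the integer value 1 counts as true.
static bool parse_flag_strict(const char * value) {
    return value != nullptr && std::atoi(value) == 1;
}

// C/C++-style truthiness: 0 is false, every other integer value is true.
// Note that std::atoi returns 0 for non-numeric text, so "abc" is false here.
static bool parse_flag_truthy(const char * value) {
    return value != nullptr && std::atoi(value) != 0;
}
```

For the value "2", `parse_flag_strict` yields false while `parse_flag_truthy` yields true; both agree on "0", "1", and an unset variable (a null pointer).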


ggml-cuda: refactor fusion code

6553b91

am17an requested a review from a team as a code owner April 28, 2026 11:20

gaugarg-nv reviewed Apr 28, 2026


github-actions bot added the labels "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) Apr 28, 2026

gaugarg-nv approved these changes Apr 28, 2026


am17an mentioned this pull request Apr 28, 2026

CUDA: fuse SSM_CONV + ADD(bias) + SILU #22478

Merged

JohannesGaessler approved these changes Apr 29, 2026


apply formatting + make env variable truthy

a50acae

JohannesGaessler approved these changes Apr 29, 2026


am17an merged commit 3142f1d into ggml-org:master Apr 29, 2026
47 checks passed

am17an deleted the cuda-fusion-detect-dispatch branch April 29, 2026 08:19

cnsiva added a commit to saas-home/llama.cpp that referenced this pull request May 1, 2026

Revert "ggml-cuda: refactor fusion code (ggml-org#22468)"

08d8c6f

This reverts commit 3142f1d.

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026

ggml-cuda: refactor fusion code (ggml-org#22468)

4e1eb97

* ggml-cuda: refactor fusion code
* apply formatting + make env variable truthy

samuraieng pushed a commit to samuraieng/llama.cpp that referenced this pull request May 6, 2026

ggml-cuda: refactor fusion code (ggml-org#22468)

26d78b2

* ggml-cuda: refactor fusion code
* apply formatting + make env variable truthy

ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026

ggml-cuda: refactor fusion code (ggml-org#22468)

59a6a0f

* ggml-cuda: refactor fusion code
* apply formatting + make env variable truthy

meh pushed a commit to meh/llama.cpp that referenced this pull request May 10, 2026

ggml-cuda: refactor fusion code (ggml-org#22468)

f574fda

* ggml-cuda: refactor fusion code
* apply formatting + make env variable truthy

baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026

ggml-cuda: refactor fusion code (ggml-org#22468)

c266477

* ggml-cuda: refactor fusion code
* apply formatting + make env variable truthy

winstonma pushed a commit to winstonma/llama.cpp that referenced this pull request May 27, 2026

ggml-cuda: refactor fusion code (ggml-org#22468)

7b94dd0

* ggml-cuda: refactor fusion code
* apply formatting + make env variable truthy

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

ggml-cuda: refactor fusion code (ggml-org#22468)

3c01bda

* ggml-cuda: refactor fusion code
* apply formatting + make env variable truthy

am17an merged 2 commits into ggml-org:master from am17an:cuda-fusion-detect-dispatch
