UPSTREAM PR #18102: ggml-cuda: Delta-Net linear attention for Qwen3-Next #593
Mirrored from ggml-org/llama.cpp#18102
CUDA kernel for the Delta-Net linear attention layers in Qwen3-Next.
Adds GGML_OP_DELTA_NET plus a recurrent kernel for decode, and a Blackwell path (sm 12.0+) for prefill using 64 KiB of shared memory. Also improves solve_tri for the chunked prefill path.
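For context, the recurrent decode step boils down to the standard delta-rule update per head: S_t = S_{t-1} + beta_t * (v_t - S_{t-1} k_t) k_t^T, then o_t = S_t q_t. Below is a minimal sketch of what such a single-token kernel can look like; the kernel name, template parameters, and tensor layout are invented for illustration, and the PR's actual kernel (including Qwen3-Next's gating/decay terms) will differ.

```cuda
#include <cuda_runtime.h>

// Hypothetical sketch, not the PR's kernel. One block per attention
// head, one thread per row of the d_v x d_k state. For one decoded
// token: S += beta * (v - S k) k^T, then o = S q.
template <int DK, int DV>
__global__ void delta_net_recurrent_step(
        const float * __restrict__ q,     // [n_heads, DK]
        const float * __restrict__ k,     // [n_heads, DK]
        const float * __restrict__ v,     // [n_heads, DV]
        const float * __restrict__ beta,  // [n_heads]
        float       * __restrict__ S,     // [n_heads, DV, DK] running state
        float       * __restrict__ o) {   // [n_heads, DV]
    const int h   = blockIdx.x;   // head index
    const int row = threadIdx.x;  // state row, 0..DV-1
    if (row >= DV) return;

    const float * qh = q + h*DK;
    const float * kh = k + h*DK;
    float       * Sh = S + (size_t)h*DV*DK + (size_t)row*DK;

    // err = v[row] - (S k)[row]: the "delta" against the state's prediction
    float sk = 0.0f;
    for (int j = 0; j < DK; ++j) sk += Sh[j]*kh[j];
    const float err = v[h*DV + row] - sk;

    // rank-1 state update fused with the output projection o[row] = (S q)[row]
    float oq = 0.0f;
    const float b = beta[h];
    for (int j = 0; j < DK; ++j) {
        Sh[j] += b*err*kh[j];
        oq    += Sh[j]*qh[j];
    }
    o[h*DV + row] = oq;
}

// e.g. delta_net_recurrent_step<128, 128><<<n_heads, 128>>>(...);
```

Fusing the rank-1 update with the output projection keeps decode to a single pass over the state per token, which is why a dedicated recurrent kernel pays off over the chunked path at batch size 1.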
Getting ~45-55 t/s on Q4/MXFP4 and ~40 t/s on BF16 with 80B-A3B (Blackwell). Pre-Blackwell cards get ~38-40 t/s from the solve_tri improvements alone (up from the original ~20 t/s baseline).
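The solve_tri step that the chunked prefill path leans on is, at its core, a batched triangular solve by forward substitution. A minimal sketch under assumed names and layout (one block per system, one thread per right-hand-side column; unit diagonal, so no divisions):

```cuda
#include <cuda_runtime.h>

// Hypothetical sketch of a batched unit-lower-triangular solve of the
// kind solve_tri performs. Solves L X = B in place for an N x N chunk
// with M right-hand-side columns; columns are independent, so each
// thread walks the rows of its own column with no synchronization.
template <int N>
__global__ void solve_tri_unit_lower(
        const float * __restrict__ L, // [batch, N, N], strictly lower part used
        float       * __restrict__ X, // [batch, N, M], B on entry, X on exit
        int M) {
    const int b   = blockIdx.x;
    const int col = threadIdx.x;
    if (col >= M) return;

    const float * Lb = L + (size_t)b*N*N;
    float       * Xb = X + (size_t)b*N*M;

    // x_i = b_i - sum_{j<i} L[i][j] * x_j  (unit diagonal: no divide)
    for (int i = 0; i < N; ++i) {
        float acc = Xb[i*M + col];
        for (int j = 0; j < i; ++j) {
            acc -= Lb[i*N + j] * Xb[j*M + col];
        }
        Xb[i*M + col] = acc;
    }
}
```

Since the solve is inherently sequential across rows but embarrassingly parallel across columns and batch entries, most of the available speedup comes from memory layout and occupancy rather than algorithmic changes, which is consistent with the gains carrying over to pre-Blackwell cards.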
Edit: omitted some small bits.