Minor delta-net tweak #1337

Merged: ikawrakow merged 1 commit into main from ik/qkvz_tweak1 (Mar 1, 2026)

@ikawrakow
Owner

Worth ~1% better TG (token generation) performance on both CUDA and CPU-only.

@magikRUKKOLA

magikRUKKOLA commented Feb 28, 2026

Whoa!

IQ2_KL Qwen3.5 397B; 8x3090:

| PP | TG | N_KV | T_PP s | S_PP t/s | T_TG s | S_TG t/s |
| ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| 4096 | 1024 | 0 | 3.762 | 1088.69 | 26.000 | 39.38 |
| 4096 | 1024 | 4096 | 3.887 | 1053.72 | 26.060 | 39.29 |
| 4096 | 1024 | 8192 | 4.017 | 1019.60 | 26.525 | 38.60 |
| 4096 | 1024 | 12288 | 4.144 | 988.36 | 27.040 | 37.87 |
| 4096 | 1024 | 16384 | 4.274 | 958.27 | 27.461 | 37.29 |
| 4096 | 1024 | 20480 | 4.401 | 930.60 | 27.973 | 36.61 |
| 4096 | 1024 | 24576 | 4.534 | 903.31 | 28.108 | 36.43 |
| 4096 | 1024 | 28672 | 4.666 | 877.90 | 28.524 | 35.90 |
| 4096 | 1024 | 32768 | 4.793 | 854.59 | 29.253 | 35.00 |
| 4096 | 1024 | 36864 | 4.918 | 832.81 | 29.230 | 35.03 |
| 4096 | 1024 | 40960 | 5.041 | 812.57 | 29.671 | 34.51 |
| 4096 | 1024 | 45056 | 5.165 | 793.01 | 30.189 | 33.92 |
| 4096 | 1024 | 49152 | 5.302 | 772.52 | 30.575 | 33.49 |
| 4096 | 1024 | 53248 | 5.428 | 754.63 | 30.868 | 33.17 |
| 4096 | 1024 | 57344 | 5.549 | 738.22 | 31.350 | 32.66 |
| 4096 | 1024 | 61440 | 5.694 | 719.41 | 31.510 | 32.50 |
| 4096 | 1024 | 65536 | 5.803 | 705.79 | 31.931 | 32.07 |
| 4096 | 1024 | 69632 | 5.926 | 691.19 | 32.492 | 31.52 |
| 4096 | 1024 | 73728 | 6.059 | 676.02 | 32.775 | 31.24 |
| 4096 | 1024 | 77824 | 6.189 | 661.86 | 32.997 | 31.03 |
| 4096 | 1024 | 81920 | 6.313 | 648.86 | 33.296 | 30.75 |
| 4096 | 1024 | 86016 | 6.455 | 634.50 | 33.817 | 30.28 |
| 4096 | 1024 | 90112 | 6.573 | 623.19 | 34.101 | 30.03 |
| 4096 | 1024 | 94208 | 6.690 | 612.23 | 34.378 | 29.79 |
| 4096 | 1024 | 98304 | 6.828 | 599.92 | 34.939 | 29.31 |
| 4096 | 1024 | 102400 | 6.955 | 588.94 | 35.140 | 29.14 |
| 4096 | 1024 | 106496 | 7.085 | 578.13 | 35.639 | 28.73 |
| 4096 | 1024 | 110592 | 7.218 | 567.46 | 35.946 | 28.49 |
| 4096 | 1024 | 114688 | 7.334 | 558.53 | 36.429 | 28.11 |
| 4096 | 1024 | 118784 | 7.461 | 548.96 | 36.542 | 28.02 |
| 4096 | 1024 | 122880 | 7.601 | 538.87 | 37.010 | 27.67 |
| 4096 | 1024 | 126976 | 7.766 | 527.42 | 37.403 | 27.38 |
| 4096 | 1024 | 131072 | 7.865 | 520.80 | 37.735 | 27.14 |
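
For readers checking the numbers: the speed columns look like batch size divided by elapsed time, i.e. S_PP ≈ PP / T_PP and S_TG ≈ TG / T_TG. That reading is an assumption based on the column names, not something stated in this thread. The sketch below recomputes the first and last rows under that assumption; `throughput` is a hypothetical helper. Small differences from the posted values are expected because the times are printed to three decimals.

```python
# Assumption: S_PP = PP / T_PP and S_TG = TG / T_TG (inferred from column names).

def throughput(tokens: int, seconds: float) -> float:
    """Tokens per second for `tokens` processed in `seconds` (hypothetical helper)."""
    return tokens / seconds

PP, TG = 4096, 1024  # prompt and generation batch sizes from the table

# (N_KV, T_PP s, T_TG s) copied from the first and last rows of the table
rows = [(0, 3.762, 26.000), (131072, 7.865, 37.735)]

for n_kv, t_pp, t_tg in rows:
    s_pp = throughput(PP, t_pp)
    s_tg = throughput(TG, t_tg)
    print(f"N_KV={n_kv:>6}  S_PP={s_pp:8.2f} t/s  S_TG={s_tg:6.2f} t/s")

# Relative TG slowdown between an empty KV cache and N_KV = 131072
tg_first = throughput(TG, rows[0][2])
tg_last = throughput(TG, rows[-1][2])
slowdown = 1 - tg_last / tg_first
print(f"TG slowdown at 131072 context: {slowdown:.1%}")
```

Under this reading, TG speed at N_KV = 131072 is roughly 31% lower than with an empty KV cache.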

references:

#1320 (comment)
