
@ikawrakow (Owner)

The ffn_down_exps row sizes are not a multiple of 256 in DeepSeek-Lite. When using --pure with llama-quantize, this leads to a crash. I got tired of having to add custom quantization overrides in that case, so this PR applies the divisibility-by-quantization-block-size check to --pure as well, and falls back to a compatible quantization type when necessary.
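For context, a minimal sketch of the kind of check described above, not the actual llama-quantize code: k- and i-quants pack weights in blocks of 256, so a tensor whose row size is not divisible by the block size cannot use them and needs a fallback type. The helper names `block_size_for`, `fallback_type_for`, and `pick_pure_type` are hypothetical and only for illustration.

```cpp
#include <cstdint>

// Illustrative subset of quantization types.
enum class qtype { Q4_K, Q5_K, IQ4_XS, Q4_0, Q5_0, Q8_0 };

// Hypothetical helper: block size a given quant type packs weights into.
static int64_t block_size_for(qtype t) {
    switch (t) {
        case qtype::Q4_K:
        case qtype::Q5_K:
        case qtype::IQ4_XS: return 256; // k- and i-quants use 256-wide blocks
        default:            return 32;  // legacy quants use smaller blocks
    }
}

// Hypothetical helper: fallback type when the requested one does not fit.
static qtype fallback_type_for(qtype requested) {
    switch (requested) {
        case qtype::Q4_K: return qtype::Q4_0;
        case qtype::Q5_K: return qtype::Q5_0;
        default:          return qtype::Q8_0;
    }
}

// With --pure every tensor would normally get `requested`; this check
// switches to the fallback when the row size is not divisible by the
// block size (e.g. ffn_down_exps rows in DeepSeek-Lite).
static qtype pick_pure_type(qtype requested, int64_t row_size) {
    if (row_size % block_size_for(requested) != 0) {
        return fallback_type_for(requested);
    }
    return requested;
}
```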

Iwan Kawrakow added 10 commits March 26, 2025 12:32
I often want to quantize with --pure to see quantization performance
without quantization mixes. But for models where there are tensors
with row sizes that are not a multiple of 256, this results in a crash
for k- and i-quants. Hence, let's add a check whether the quant selected
via --pure is applicable, and change it if not.