Fix SER (CPU) #415

ikawrakow · 2025-05-13T12:25:46Z

There have been reports that Smart Expert Reduction (SER) can produce garbage.

This PR (hopefully and finally) fixes the CPU implementation.

The issue was that when fewer experts are used than specified by the number of active experts, there are some rows in the experts matrix multiplication results that have not been set to any value. Normally this should not be an issue as these rows get multiplied by zero before being summed up to obtain the final experts result. But if there are Inf or NaN values in the rows that were not computed, then we get NaNs, and this leads to garbage output. If there are Inf and NaN values is a matter of luck that depends on what happened before in the computation, as the same memory is used by other operations to store results. This is why the issue does not always manifests itself (but yes, if one has a long enough conversation, the DDDDDDDD or GGGGGGGGG output will eventually show up).

A similar fix is required for the CUDA implementation. This is left for a follow up PR.

ikawrakow · 2025-05-13T14:54:54Z

I couldn't break it, so I'll just merge it without waiting for feedback.

ubergarm · 2025-05-13T15:04:17Z

Hah, our sleep schedules are just off, I just tested this compiling CPU only and it indeed fixes the issue when using -ser 6,1.

Without the fix I saw:

/home/w/projects/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:16600: GGML_ASSERT(fms.S[j] > 0) failed/home/w/projects/ik_llama.cpp/ggml/src/iqk/iqk_mul_m
at.cpp:16600: GGML_ASSERT(fms.S[j] > 0) failed
/home/w/projects/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:16600: GGML_ASSERT(fms.S[j] > 0) failed

With this PR it works fine.

Thanks!

I'll peep the CUDA one next.

Iwan Kawrakow added 2 commits May 13, 2025 14:55

Fixing SER bugs

10ae78e

Cleanup

2c18ef1

ikawrakow mentioned this pull request May 13, 2025

Fix SER (CUDA) #416

Merged

ikawrakow merged commit 1374062 into main May 13, 2025

saood06 mentioned this pull request May 16, 2025

Bug: llama-batched-bench crashed with batch size >2 #389

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix SER (CPU) #415

Fix SER (CPU) #415

Uh oh!

ikawrakow commented May 13, 2025

Uh oh!

ikawrakow commented May 13, 2025

Uh oh!

ubergarm commented May 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Fix SER (CPU) #415

Fix SER (CPU) #415

Uh oh!

Conversation

ikawrakow commented May 13, 2025

Uh oh!

ikawrakow commented May 13, 2025

Uh oh!

ubergarm commented May 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants