@OuadiElfarouki
Contributor

@OuadiElfarouki OuadiElfarouki commented Sep 2, 2024

The MUL_MAT tests in test-backend-ops currently fail on Intel GPUs for Q4_1, Q5_0, Q5_1 and Q8_0 due to a small edge case in the dequantize_mul_mat_vec kernel (specifically when ncols <= GGML_SYCL_DMMV_X).

This minor fix prevents the kernel's reduction step from accessing out-of-bounds (extra) quant elements.

All unit tests pass with this fix.
Performance on Intel GPUs is almost unaffected.
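
For illustration only, here is a minimal CPU-side sketch of the kind of guard described above. It is not the actual SYCL kernel: `kChunk` is a hypothetical stand-in for GGML_SYCL_DMMV_X, and `chunked_row_dot` and `lanes_past_end` are made-up helper names. It assumes a row is processed in fixed-size chunks, so when ncols is not a multiple of the chunk size (including when ncols is smaller than one chunk), some lanes would index past the end of the row unless they are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for GGML_SYCL_DMMV_X: columns covered per outer iteration.
constexpr int kChunk = 32;

// Number of lanes in the last chunk that would index past the end of a row
// of `ncols` columns if no bounds check were applied.
int lanes_past_end(int ncols, int chunk = kChunk) {
    assert(chunk > 0 && ncols >= 0);
    const int covered = ((ncols + chunk - 1) / chunk) * chunk;
    return covered - ncols;
}

// Dot product of one row with y, walking the row chunk by chunk.
// Lanes whose column index falls outside the row are skipped, so any
// padding or neighbouring data stored after `ncols` is never read.
float chunked_row_dot(const std::vector<float>& row,
                      const std::vector<float>& y,
                      int ncols, int chunk = kChunk) {
    assert(chunk > 0 && ncols >= 0);
    assert(row.size() >= static_cast<std::size_t>(ncols));
    assert(y.size() >= static_cast<std::size_t>(ncols));

    float acc = 0.0f;
    for (int base = 0; base < ncols; base += chunk) {
        for (int lane = 0; lane < chunk; ++lane) {
            const int col = base + lane;
            if (col >= ncols) {
                break;  // guard: this lane is past the end of the row
            }
            acc += row[col] * y[col];
        }
    }
    return acc;
}
```

With a 3-column row and a chunk size of 4, `lanes_past_end(3, 4)` reports one lane that needs guarding, and `chunked_row_dot` ignores whatever is stored after the third element.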

@github-actions github-actions bot added the SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language label Sep 2, 2024
@joeatodd
Contributor

joeatodd commented Sep 3, 2024

I think given the perf implications of this bounds checking, we should dig a little deeper.

@OuadiElfarouki
Contributor Author

@joeatodd Agree

@OuadiElfarouki
Contributor Author

I have updated the fix, and performance is now preserved.

Contributor

@joeatodd joeatodd left a comment

LGTM

@OuadiElfarouki OuadiElfarouki merged commit 5910ea9 into ggml-org:master Sep 4, 2024
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
Fixed dmmv dequant for ncols== GGML_SYCL_DMMV_X
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
Fixed dmmv dequant for ncols== GGML_SYCL_DMMV_X
3 participants