[OneDNN] add mxfp8, mxfp4 onednn gemm #235
Merged
Conversation
Force-pushed from b655464 to 2ddc61e
Pull request overview
Updates the oneDNN backend and XPU bindings to enable MXFP8/MXFP4 (FP8/FP4 with block scaling) GEMM paths, along with expanded test coverage.
Changes:
- Bump the oneDNN submodule to a commit that provides MXFP8/MXFP4 GEMM support.
- Add FP4 GEMM operator plumbing (C++ op, torch binding, Python wrapper) and new FP4 GEMM tests.
- Extend FP8 GEMM tests and update FP8 matmul scaling-attribute handling for MXFP8 block scales.
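For context, the "block scaling" in MXFP8/MXFP4 means each 32-element block shares a single power-of-two (e8m0) scale. A minimal NumPy sketch of how a test reference might compute and apply such scales; the names `mx_block_scales`/`mx_dequant` and the exponent clamp are assumptions for illustration, not this PR's code:

```python
import numpy as np

BLOCK = 32  # MX spec block size: one shared e8m0 scale per 32 elements

def mx_block_scales(x: np.ndarray) -> np.ndarray:
    """Compute one power-of-two (e8m0-style) scale per 32-element block.

    Hypothetical reference helper; real MX quantizers also budget for the
    element format's own exponent range, which is omitted here.
    """
    blocks = x.reshape(-1, BLOCK)
    amax = np.abs(blocks).max(axis=1)
    # e8m0 stores only an exponent, so the scale must be a power of two;
    # clamp to avoid log2(0) on all-zero blocks.
    exp = np.floor(np.log2(np.maximum(amax, 2.0 ** -127)))
    return 2.0 ** exp

def mx_dequant(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct real values from quantized blocks and shared scales."""
    return (q.reshape(-1, BLOCK) * scales[:, None]).reshape(q.shape)
```

With this layout, a row of K elements carries K / 32 scale bytes, which is the metadata the new e8m0 dtype mappings in `onednn_ext.h` exist to describe.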
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| third_party/oneDNN | Updates oneDNN submodule commit to pick up MXFP8/MXFP4 GEMM support. |
| tests/test_fp8_gemm_onednn.py | Expands fp8 GEMM test matrices and adds an MXFP8 GEMM test. |
| tests/test_fp4_gemm_onednn.py | Adds coverage for MXFP4 GEMM including reference reconstruction. |
| tests/register_ops.py | Adds a Python-level wrapper for the new fp4_gemm op. |
| csrc/xpu/torch_bindings.cpp | Registers the fp4_gemm operator schema and XPU implementation. |
| csrc/xpu/ops.h | Extends API surface with fp4_gemm declaration and updates shape comment. |
| csrc/xpu/onednn/onednn_matmul.cpp | Implements fp4_gemm entry point and routes to oneDNN FP4 matmul. |
| csrc/xpu/onednn/onednn_ext.h | Adds oneDNN dtype mappings for e8m0 scales and FP4, plus joint dtype cases. |
| csrc/xpu/onednn/fp8_gemm_w8a8.h | Adds MXFP8 scale handling via e8m0 block-wise scales. |
| csrc/xpu/onednn/fp4_gemm_w4a4.h | New oneDNN FP4 matmul implementation using block-wise e8m0 scales. |
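The "reference reconstruction" mentioned for the FP4 tests presumably dequantizes both operands with their per-block scales and then runs a plain float matmul. A hedged sketch of that pattern; the function name, argument shapes, and scale layout are assumptions, not the test suite's actual API:

```python
import numpy as np

BLOCK = 32  # MX block size: one shared scale per 32 elements along K

def reference_mx_gemm(a_q, a_scales, b_q, b_scales):
    """Reference GEMM for block-scaled operands (hypothetical helper).

    a_q: (M, K) quantized values, a_scales: (M, K // BLOCK)
    b_q: (K, N) quantized values, b_scales: (K // BLOCK, N)
    """
    M, K = a_q.shape
    _, N = b_q.shape
    # Broadcast each block's scale over its 32 elements, then dequantize.
    a = (a_q.reshape(M, K // BLOCK, BLOCK) * a_scales[:, :, None]).reshape(M, K)
    b = (b_q.reshape(K // BLOCK, BLOCK, N) * b_scales[:, None, :]).reshape(K, N)
    # Plain float matmul serves as the ground truth for the fused op.
    return a @ b
```

A test would then compare the oneDNN op's output against this reconstruction within a tolerance appropriate to the FP4/FP8 element format.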
Force-pushed from ca13f77 to e33478e
Force-pushed from 2f9930c to 2857c28
xinyu-intel approved these changes Apr 8, 2026
Yejing-Lai reviewed Apr 8, 2026
* add mxfp4 onednn gemm
* add ut for mx
* fix
* format with pre-commit
* thanks copilot

---------

Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
Signed-off-by: mayuyuace <qiming1.zhang@intel.com> Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
vllm-project#232) Signed-off-by: Qiao, Zhefeng <zhefeng.qiao@intel.com> Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
Force-pushed from faa5c73 to 8f97d49
jikunshang approved these changes Apr 8, 2026