[MXFP4] Add scale generation utils #503

dsikka · 2025-10-22T18:35:33Z

Summary:

Adds the utilities required to make scales MXFP4-compliant.
MXFP4 scales have the following properties:

Scales are rounded to powers of 2
Converted to exponents
Stored in uint8
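
As a rough illustration of the three properties above, here is a minimal pure-Python sketch. It is not the code added in this PR: the helper names are hypothetical, rounding *down* to a power of 2 is an assumption (the PR's round_to_power_2 may round differently), and the bias of 127 with 255 reserved for NaN follows the E8M0 scale format of the OCP Microscaling specification rather than anything stated in this description.

```python
import math

# Assumption: E8M0 scale format from the OCP MX spec (bias 127, code 255 = NaN).
E8M0_BIAS = 127
E8M0_MAX_CODE = 254


def round_down_to_power_of_2(x: float) -> float:
    """Return the largest power of two that is <= x (hypothetical helper)."""
    if not (x > 0 and math.isfinite(x)):
        raise ValueError("scale must be positive and finite")
    # frexp gives x == mantissa * 2**exponent with 0.5 <= mantissa < 1,
    # so 2**(exponent - 1) is the power of two at or just below x.
    _, exponent = math.frexp(x)
    return math.ldexp(1.0, exponent - 1)


def scale_to_uint8_exponent(scale: float) -> int:
    """Round a scale to a power of 2 and return its biased exponent (0..254)."""
    power = round_down_to_power_of_2(scale)
    exponent = math.frexp(power)[1] - 1  # power == 2**exponent
    biased = exponent + E8M0_BIAS
    # Clamp so the result fits in a uint8 and avoids the reserved NaN code.
    return max(0, min(E8M0_MAX_CODE, biased))


if __name__ == "__main__":
    for s in (0.3, 1.0, 3.0):
        print(s, round_down_to_power_of_2(s), scale_to_uint8_exponent(s))
```

Storing only the exponent works because a power-of-2 scale has no mantissa information left to keep, so one byte per scale is enough.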

Reference: https://github.com/vllm-project/vllm/blob/main/tests/quantization/reference_mxfp4.py

Added functions:

Add round_to_power_2 to round scales to powers of 2
This is called by generate_mxfp4_scales, which will in turn be called by calculate_qparams. generate_mxfp4_scales also converts the scales to exponents and stores them as uint8
Add convert_mxfp4_exp_scale to convert the exponent scales back to dense scales so that they can be applied to the weights / activations during QDQ
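
To make the round trip concrete, the sketch below turns a stored uint8 exponent back into a dense scale and uses it in a toy quantize-dequantize (QDQ) pass. It is only an illustration under stated assumptions: the function names are hypothetical and are not the signatures of convert_mxfp4_exp_scale or any other function in this PR, the bias of 127 is taken from the E8M0 format in the OCP Microscaling specification, and the FP4 grid is the standard E2M1 set of magnitudes.

```python
import math

E8M0_BIAS = 127  # assumption: E8M0 exponent bias from the OCP MX spec
# Magnitudes representable by FP4 E2M1.
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def exponent_to_dense_scale(stored: int) -> float:
    """Convert a biased uint8 exponent back to the dense scale 2**(stored - bias)."""
    if not 0 <= stored <= 254:
        raise ValueError("expected a uint8 exponent other than the NaN code 255")
    return math.ldexp(1.0, stored - E8M0_BIAS)


def fake_quantize(values, stored):
    """Toy QDQ: divide by the scale, snap to the FP4 grid, multiply back."""
    scale = exponent_to_dense_scale(stored)
    out = []
    for v in values:
        scaled = abs(v) / scale
        # Nearest grid point; on exact ties min() keeps the smaller magnitude,
        # which is a simplification of real round-to-nearest-even behaviour.
        nearest = min(FP4_E2M1_MAGNITUDES, key=lambda m: abs(m - scaled))
        out.append(math.copysign(nearest * scale, v))
    return out


if __name__ == "__main__":
    print(exponent_to_dense_scale(125))
    print(fake_quantize([0.3, -1.4, 10.0], 127))
```

Values larger than 6 times the scale saturate at the top of the grid, which is why the scale for each group has to be chosen from that group's maximum magnitude.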

Testing

Added unit tests and some e2e tests
Will potentially update depending on the vLLM integration, specifically the requirement to subtract 2 when generating the exponent

brian-dellabetta

Some nice bit-shift wizardry 🧙, tests look well-covered though

A few nits

src/compressed_tensors/quantization/utils/mxfp4_utils.py

tests/test_quantization/test_utils/test_mxfp4_utils.py

brian-dellabetta

🔥 🔥 🔥

rahul-tuli

Couple very small nits, but LGTM!

src/compressed_tensors/quantization/utils/mxfp4_utils.py

dsikka added 9 commits October 22, 2025 14:34

add mxfp4 scale generation

a7401a0

add rounding test

3d9fa3b

update

c72bf4c

update

6a36cbf

update

b03d7c8

update

1b5099d

update

5a6e290

Merge branch 'main' into mxfp4_utils

e00283d

add additional case

d66463b

dsikka marked this pull request as ready for review October 23, 2025 19:57

brian-dellabetta reviewed Oct 23, 2025

src/compressed_tensors/quantization/utils/mxfp4_utils.py (outdated, resolved)

tests/test_quantization/test_utils/test_mxfp4_utils.py (outdated, resolved)

tests/test_quantization/test_utils/test_mxfp4_utils.py (outdated, resolved)

update

db85238

brian-dellabetta approved these changes Oct 23, 2025

rahul-tuli approved these changes Oct 27, 2025

src/compressed_tensors/quantization/utils/mxfp4_utils.py (resolved)

src/compressed_tensors/quantization/utils/mxfp4_utils.py (resolved)

dsikka merged commit d104bf2 into main Oct 28, 2025
3 checks passed

dsikka deleted the mxfp4_utils branch October 28, 2025 12:26

4 participants
