
@dsikka dsikka commented Oct 22, 2025

Summary:

  • Adds the utilities required to make scales MXFP4-compliant
  • MXFP4 scales have the following properties:
  1. Scales are rounded to powers of 2
  2. The rounded scales are converted to exponents
  3. The exponents are stored as uint8
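
To make the three properties concrete, here is a minimal pure-Python sketch of the scale pipeline. It is not the code added in this PR: the function names are hypothetical, the rounding rule (nearest power of 2 by linear distance, ties rounded up) is an assumption, and the bias of 127 is assumed from the E8M0 convention for MX scales. The sketch applies no additional exponent offset.

```python
import math

E8M0_BIAS = 127  # assumed bias (E8M0 convention); not taken from this PR


def round_to_power_of_2_sketch(scale: float) -> float:
    """Round a positive scale to the nearest power of 2 (assumed rule: ties go up)."""
    if scale <= 0 or not math.isfinite(scale):
        raise ValueError("scale must be positive and finite")
    # frexp returns scale = m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(scale)
    # m < 0.75 means scale is closer to 2**(e - 1); otherwise it is closer to 2**e.
    return math.ldexp(1.0, e - 1 if m < 0.75 else e)


def to_uint8_exponent_sketch(scale: float) -> int:
    """Convert a scale to a biased exponent that fits in one unsigned byte."""
    power = round_to_power_of_2_sketch(scale)
    exponent = math.frexp(power)[1] - 1  # exact log2 of a power of 2
    return max(0, min(255, exponent + E8M0_BIAS))  # clamp to the uint8 range


scales = [0.3, 1.0, 6.0]
# bytes() stands in for a uint8 tensor in this sketch.
stored = bytes(to_uint8_exponent_sketch(s) for s in scales)
print(list(stored))  # [125, 127, 130]
```

Here 0.3 rounds to 0.25 (exponent -2), 1.0 stays at 1.0 (exponent 0), and 6.0 rounds to 8.0 (exponent 3). After adding the assumed bias, each exponent fits in a single byte.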

Reference: https://github.com/vllm-project/vllm/blob/main/tests/quantization/reference_mxfp4.py

Added functions:

  • Add round_to_power_2, a utility that rounds scales to powers of 2
  • round_to_power_2 is called by generate_mxfp4_scales, which in turn will be called by calculate_qparams. generate_mxfp4_scales also converts the scales to exponents and stores them as uint8
  • Add convert_mxfp4_exp_scale to convert the exponent scales back to dense scales so that they can be applied to the weights / activations during quantize-dequantize (QDQ)
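
As a rough illustration of the last step, the sketch below turns a stored uint8 exponent back into a dense scale and applies it in a quantize-dequantize pass over a few values. It is a sketch under stated assumptions, not the implementation of convert_mxfp4_exp_scale: the helper names are hypothetical, the bias of 127 is assumed from the E8M0 convention, and the value grid is the set of FP4 (E2M1) magnitudes.

```python
import math

E8M0_BIAS = 127  # assumed bias (E8M0 convention); not taken from this PR
# Magnitudes representable in FP4 E2M1; the sign is handled separately below.
FP4_E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def exponent_to_dense_scale_sketch(stored: int) -> float:
    """Map a biased uint8 exponent back to a dense power-of-2 scale."""
    if not 0 <= stored <= 255:
        raise ValueError("stored exponent must fit in uint8")
    return math.ldexp(1.0, stored - E8M0_BIAS)  # 2 ** (stored - bias)


def qdq_block_sketch(values, stored_exponent):
    """Quantize-dequantize: scale down, snap to the FP4 grid, scale back up."""
    scale = exponent_to_dense_scale_sketch(stored_exponent)
    result = []
    for v in values:
        # Pick the grid magnitude closest to |v| / scale.
        magnitude = min(FP4_E2M1_VALUES, key=lambda g: abs(abs(v) / scale - g))
        result.append(math.copysign(magnitude * scale, v))
    return result


print(exponent_to_dense_scale_sketch(125))        # 0.25
print(qdq_block_sketch([0.3, -1.4, 0.05], 125))   # [0.25, -1.5, 0.0]
```

Because the dense scale is an exact power of 2, multiplying and dividing by it changes only the floating-point exponent, so all rounding error in this sketch comes from snapping to the FP4 grid.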

Testing

  • Added unit tests and some end-to-end (e2e) tests
  • The tests may be updated depending on the vLLM integration, specifically the requirement to subtract 2 when generating the exponent

@dsikka dsikka marked this pull request as ready for review October 23, 2025 19:57

@brian-dellabetta brian-dellabetta left a comment

Some nice bit shift wizardry 🧙 , tests look well-covered though

A few nits

@brian-dellabetta brian-dellabetta left a comment

🔥 🔥 🔥

@rahul-tuli rahul-tuli left a comment

Couple very small nits, but LGTM!

@dsikka dsikka merged commit d104bf2 into main Oct 28, 2025
3 checks passed
@dsikka dsikka deleted the mxfp4_utils branch October 28, 2025 12:26

4 participants