Revise supported formats to reflect new quantization types #2678

Merged: dsikka merged 2 commits into main from dsikka-update-precision on May 1, 2026

Conversation

@dsikka (Collaborator) commented Apr 30, 2026

Updated the supported formats section to include new precisions and types for quantization.

Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>
@dsikka added the "ready" label (When a PR is ready for review) Apr 30, 2026
@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@coderabbitai (Bot, Contributor) commented Apr 30, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

Updates the README documentation to reflect expanded quantization capabilities by renaming the "Supported Formats" section to "Supported Precisions and Types," introducing new precision variants (W4AFP8, NVFP4, MXFP4, MXFP8), adding mixed-precision combinations, explicitly documenting attention and KV cache quantization, and extending the "Supported Algorithms" list with rotation-based methods.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| README Documentation (README.md) | Updated quantization capability documentation with renamed section ("Supported Precisions and Types"), new precision names and microscale variants (W4AFP8, NVFP4, MXFP4, MXFP8), mixed-precision entries (MXFP4A16, NVFP4A16), explicit attention and KV-cache quantization callouts, and rotation-based algorithms (SpinQuant, QuIP). |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested labels

enhancement, nvfp4, fp8, w4a16

Suggested reviewers

  • brian-dellabetta
  • kylesayrs

🚥 Pre-merge checks: ✅ 5 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately reflects the main change in the pull request, which updates the README documentation to reflect new quantization types and expanded precision names. |
| Description check | ✅ Passed | The description relates to the changeset by mentioning the update to the supported formats section with new precisions and types, though it contains placeholder text indicating incomplete documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@mergify (Bot) added the "documentation" label (Improvements or additions to documentation) Apr 30, 2026
@coderabbitai (Bot) added the labels "enhancement" (New feature or request), "fp8" (For any issue / PR related to FP8 support), "nvfp4" (For any PR / issue related to NVFP4 support), and "w4a16" Apr 30, 2026

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request updates the README.md to reflect expanded support for various quantization precisions, types, and algorithms, including microscale formats and rotation-based methods. Review feedback suggests correcting the naming convention for 4-bit weight/8-bit activation quantization and specifying the supported formats for attention and KV cache quantization to improve clarity and consistency.

Comment thread README.md
Comment thread README.md Outdated

@brian-dellabetta (Collaborator) left a comment

nice!

Comment thread README.md Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Dipika Sikka <ds3822@columbia.edu>
@dsikka merged commit 76b28ce into main May 1, 2026
11 of 13 checks passed
@dsikka deleted the dsikka-update-precision branch May 1, 2026 14:18

Labels

  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • fp8: For any issue / PR related to FP8 support
  • nvfp4: For any PR / issue related to NVFP4 support
  • ready: When a PR is ready for review
  • w4a16

2 participants