Updates to docs; move mxfp8 examples by dsikka · Pull Request #2673 · vllm-project/llm-compressor

dsikka · 2026-04-30T19:22:41Z

SUMMARY:

Move mxfp8 out of the experimental folder, as it is now fully supported in vLLM for CT
Update repo README with example and guide links
Update contributor docs with Developer Guide
Remove old sparse24 empty examples folder

coderabbitai · 2026-04-30T19:22:54Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b760ae62-c3eb-45c1-a006-dc4692bdd8bc

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Documentation updates across contributing and main README files. CONTRIBUTING.md adds a developer guide link for extending LLM Compressor. README.md reorganizes quantization sections with thematic example groupings and updates user guides. A deprecated sparse quantization example README is removed.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation Resources: `CONTRIBUTING.md`, `README.md` | CONTRIBUTING.md adds a link to the Developer Guide for extending LLM Compressor. README.md replaces "optimization selection" section with "step-by-step" quantization guidance, reorganizes "End-to-End Examples" into thematic subsections (Weight & Activation, Weight Only, Attention & KV Cache, Architectural Specific, Non-Uniform), updates example links from file-specific to directory-style paths, and replaces "User Guides" section with new entries covering big-model support, model-free quantization, and DDP quantization. |
| Deprecated Examples: `examples/sparse_2of4_quantization_fp8/README.md` | Removes deprecated Sparse2of4 example README due to lack of support in vLLM and LLM Compressor. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

enhancement, fp8, nvfp4, model_free_ptq

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately captures the main changes: documentation updates and moving mxfp8 examples, which aligns with the changeset modifications to CONTRIBUTING.md, README.md, and the sparse24 example folder deletion. |
| Description check | ✅ Passed | The description is directly related to the changeset, covering the documentation updates to README and CONTRIBUTING.md, mxfp8 examples handling, and removal of the sparse24 examples folder. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

github-actions · 2026-04-30T19:22:58Z

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite. Please only add the label once the PR is code complete and local testing has been performed.

gemini-code-assist

Code Review

This pull request updates the documentation by adding a link to the Developer Guide in CONTRIBUTING.md and significantly reorganizing the End-to-End Examples in the README.md into categorized sections like Weight and Activation, Weight Only, and Architecture-Specific quantization. It also updates several documentation links and removes a file related to unsupported Sparse24 models. The review feedback identifies several opportunities for improvement in the README, including correcting grammatical errors in the quantization guide description, ensuring consistent punctuation and capitalization (specifically for NVFP4), and maintaining uniform link formatting by removing trailing slashes.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Line 70: The sentence in README.md contains a grammar error: change "selecting
a quantization schemes" to either "selecting a quantization scheme" (singular)
or "selecting quantization schemes" (plural) in the line that reads "Please
refer to our [step-by-step compression
guide](https://docs.vllm.ai/projects/llm-compressor/en/latest/steps/choosing-model/)
for detailed information about selecting a quantization schemes, algorithm, and
their use cases." Update the phrase so the article ("a") matches singular
"scheme" or remove "a" to use the plural "schemes", and ensure the rest of the
clause ("algorithm, and their use cases") reads consistently with that choice.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4433ad92-ca66-4cec-8c70-f000acc7f98d

📥 Commits

Reviewing files that changed from the base of the PR and between 7a52e9e and a99ef00.

📒 Files selected for processing (6)

CONTRIBUTING.md
README.md
examples/quantization_w8a8_mxfp8/autoround_qwen3_example.py
examples/quantization_w8a8_mxfp8/qwen3_example_w8a16_mxfp8.py
examples/quantization_w8a8_mxfp8/qwen3_example_w8a8_mxfp8.py
examples/sparse_2of4_quantization_fp8/README.md

💤 Files with no reviewable changes (1)

examples/sparse_2of4_quantization_fp8/README.md

mergify · 2026-04-30T19:25:04Z

The quality checks have failed. Please run make style and make quality under
the root directory to address the lint failures. You will need to install the
dev optional install to get the required linting packages:
https://github.com/vllm-project/llm-compressor/blob/main/CONTRIBUTING.md

mergify · 2026-04-30T19:26:48Z

The quality checks have failed. Please run make style and make quality under
the root directory to address the lint failures. You will need to install the
dev optional install to get the required linting packages:
https://github.com/vllm-project/llm-compressor/blob/main/CONTRIBUTING.md

mergify · 2026-04-30T19:30:55Z

The quality checks have failed. Please run make style and make quality under
the root directory to address the lint failures. You will need to install the
dev optional install to get the required linting packages:
https://github.com/vllm-project/llm-compressor/blob/main/CONTRIBUTING.md

dsikka · 2026-04-30T19:41:51Z

The links in the autoround readme are failing - I can just remove them?
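
For context on the failing links mentioned above, here is a minimal sketch of how relative Markdown links could be checked locally before pushing. It is not the repository's actual link check: the helper name `find_broken_relative_links`, the regex, and the demo file layout are all illustrative assumptions, and the regex only handles simple inline `[text](target)` links.

```python
import re
import tempfile
from pathlib import Path

# Matches inline Markdown links of the form [text](target).
# Illustrative only; this is not a full Markdown parser.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_relative_links(markdown_file: Path) -> list[str]:
    """Return relative link targets in markdown_file that do not exist on disk."""
    broken = []
    text = markdown_file.read_text(encoding="utf-8")
    for target in LINK_RE.findall(text):
        # Skip URLs with a scheme (https:, mailto:, ...) and in-page anchors;
        # only local paths are checked here.
        if re.match(r"^[a-zA-Z][a-zA-Z0-9+.-]*:", target) or target.startswith("#"):
            continue
        # Drop any trailing "#section" fragment before resolving the path.
        path_part = target.split("#", 1)[0]
        if not (markdown_file.parent / path_part).exists():
            broken.append(target)
    return broken


def demo() -> list[str]:
    """Build a throwaway README with one valid and one dangling relative link."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "examples").mkdir()
        readme = root / "README.md"
        readme.write_text(
            "[ok](examples/) [missing](examples/gone.py) "
            "[web](https://docs.vllm.ai/projects/llm-compressor/en/latest/)\n",
            encoding="utf-8",
        )
        return find_broken_relative_links(readme)


if __name__ == "__main__":
    print(demo())
```

A check like this only covers relative paths, which is the kind of link affected when examples move between folders; external URLs would need a separate network-based check.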

brian-dellabetta

🥇

update readme; move mxfp8 out of experimental

a99ef00

dsikka added the ready (When a PR is ready for review) label Apr 30, 2026

mergify Bot added the documentation (Improvements or additions to documentation) label Apr 30, 2026

coderabbitai Bot added the enhancement (New feature or request), fp8 (For any issue / PR related to FP8 support), nvfp4 (For any PR / issue related to NVFP4 support), and model_free_ptq (For any PR/issue related to the `model_free_ptq` pathway) labels Apr 30, 2026

gemini-code-assist Bot reviewed Apr 30, 2026

coderabbitai Bot reviewed Apr 30, 2026

small update to repo summary

f75e953

mergify Bot added quality-failed and removed quality-failed labels Apr 30, 2026

mergify Bot added the quality-failed label Apr 30, 2026

fix readme links

ac11414

mergify Bot removed the quality-failed label Apr 30, 2026

Apply suggestions from code review

821faf9

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Dipika Sikka <ds3822@columbia.edu>

mergify Bot added the quality-failed label Apr 30, 2026

fix wording

39f041f

mergify Bot removed the quality-failed label Apr 30, 2026

fix formatting

1ee2ef6

brian-dellabetta approved these changes Apr 30, 2026

fix link

7f8f829

dsikka enabled auto-merge (squash) April 30, 2026 19:55

dsikka disabled auto-merge April 30, 2026 19:56

dsikka merged commit 494430e into main Apr 30, 2026
12 of 14 checks passed

dsikka deleted the mxfp8_examples branch April 30, 2026 19:56

This was referenced Apr 30, 2026

Revise supported formats to reflect new quantization types #2678

Merged

Fix AutoRound ignore-layer metadata handling and add Qwen3-30B to mxfp8 example. #2687

Closed
