Fix AutoRound ignore-layer metadata handling and add Qwen3-30B to mxfp8 example. #2687
changwangss wants to merge 0 commit into
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🔴 Require two reviews
Waiting for
This rule is failing. PRs labelled "two-reviews" must have at least two approving reviews before merging.
Code Review
This pull request updates the examples/autoround/README.md file to include MXFP8 and MXFP4 quantization schemes in the examples table and the list of supported schemes. Review feedback suggests improving the consistency of file paths for the new examples and correcting the casing of 'WNA16' along with punctuation in the documentation.
Walkthrough
The PR updates the examples/autoround README by adding three new quantization precision rows (MXFP8, MXFP4, NVFP4) to the Support Matrix table with their associated example script paths, and updates the Known Issues section to reflect expanded AutoRound scheme support.
Changes
README Documentation Update
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
🧹 Nitpick comments (1)
examples/autoround/README.md (1)
78-78: 💤 Low value
Consider standardizing scheme capitalization.
Line 78 uses "WNA16" while the Support Matrix table (lines 66–68) uses "wNa16". For consistency, consider matching the casing used in the table.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/autoround/README.md` at line 78, The README uses inconsistent capitalization for the quantization scheme names: change the occurrence of "WNA16" to match the Support Matrix's casing "wNa16" (or alternatively normalize both entries to a single chosen casing) so that the term "wNa16" is used consistently across the README (reference symbols: "WNA16" and "wNa16", and the Support Matrix table).
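A casing mismatch like the one flagged above can also be caught mechanically. The following is a minimal sketch, not part of this PR: the helper name `casing_variants` and the `sample` text are hypothetical, and the sample only imitates the README wording described in the comment.

```python
import re
from collections import defaultdict


def casing_variants(text, names):
    """Return names that appear in `text` with more than one spelling.

    Keys are the lowercased names; values are the sets of spellings found.
    """
    found = defaultdict(set)
    for name in names:
        # Match the name as a whole word, ignoring case.
        pattern = rf"\b{re.escape(name)}\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found[name.lower()].add(match.group(0))
    # Keep only names that were spelled in at least two different ways.
    return {key: spellings for key, spellings in found.items() if len(spellings) > 1}


# Hypothetical excerpt standing in for the README; not the real file contents.
sample = "| wNa16 | example.py |\nAutoRound supports WNA16, MXFP8 and MXFP4."
variants = casing_variants(sample, ["wNa16", "MXFP8"])
```

Here `variants` would contain only the `wna16` entry, because `MXFP8` is spelled one way in the sample.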
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0618543b-9bcc-45ea-8cf4-0e8534319205
📒 Files selected for processing (1)
examples/autoround/README.md
Summary
This PR mainly includes two targeted fixes:
`iters=0` for `Qwen/Qwen3-30B-A3B-Instruct-2507`.
Test
Test mxfp8 with Qwen/Qwen3-30B-A3B-Instruct-2507.
Load the mxfp8 model quantized by LLMC AutoRound with vLLM.
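For reference, the vLLM load step could look roughly like the sketch below. This is an assumption about how the check might be scripted, not the exact commands used for this PR: `generate_with_vllm` and `all_nonempty` are hypothetical helper names, and the model directory is whatever path the quantized checkpoint was saved to.

```python
def generate_with_vllm(model_dir, prompts, max_tokens=32):
    """Load a saved checkpoint with vLLM and return one completion per prompt."""
    # Imported lazily so this module can be imported without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_dir)
    # Greedy decoding keeps the smoke test repeatable.
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)
    return [request.outputs[0].text for request in outputs]


def all_nonempty(texts):
    """Cheap sanity check: at least one completion, and none are blank."""
    return bool(texts) and all(text.strip() for text in texts)
```

A run would then be `all_nonempty(generate_with_vllm("path/to/saved-mxfp8-model", ["Hello, my name is"]))`, where the path is a placeholder. Non-empty output only shows that the checkpoint loads and generates; it says nothing about accuracy.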