
[Quantization] Deprecate Long Tail of Schemes#31688

Merged
robertgshaw2-redhat merged 11 commits into main from deprecate-quantization-schemes
Jan 9, 2026

@robertgshaw2-redhat
Collaborator

@robertgshaw2-redhat robertgshaw2-redhat commented Jan 4, 2026

SUMMARY

  • Start the deprecation process for the long tail of quantization schemes.
  • For one release, the deprecated schemes can still be enabled explicitly; after that they will be removed completely.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Robert Shaw added 2 commits January 4, 2026 15:12
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a deprecation mechanism for a number of quantization schemes. A new allow_deprecated_quantization flag is added to ModelConfig and exposed as a command-line argument, allowing users to continue using these schemes with a warning. The implementation is clear and the deprecation logic is sound. I have one suggestion to improve code clarity by removing an unused parameter.
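The gating behaviour described in this review can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the scheme names, the `verify_quantization` helper, and the error and warning messages are all hypothetical, and it assumes that a deprecated scheme is rejected unless `allow_deprecated_quantization` is set.

```python
import warnings

# Hypothetical scheme names, for illustration only; this page does not list
# which schemes are actually deprecated.
DEPRECATED_SCHEMES = frozenset({"legacy_scheme_a", "legacy_scheme_b"})


def verify_quantization(
    quantization: str, allow_deprecated_quantization: bool = False
) -> str:
    """Reject deprecated schemes unless the caller opted in; warn when they did.

    Assumption: without the opt-in flag a deprecated scheme is an error.
    """
    if quantization not in DEPRECATED_SCHEMES:
        # Non-deprecated schemes pass through untouched.
        return quantization
    if not allow_deprecated_quantization:
        raise ValueError(
            f"Quantization scheme {quantization!r} is deprecated. "
            "Set allow_deprecated_quantization=True to keep using it for now."
        )
    # Opted in: allow the scheme but tell the user it is going away.
    warnings.warn(
        f"Quantization scheme {quantization!r} is deprecated and will be removed.",
        DeprecationWarning,
        stacklevel=2,
    )
    return quantization
```

Under these assumptions, `verify_quantization("legacy_scheme_a")` raises `ValueError`, while passing `allow_deprecated_quantization=True` returns the name and emits a `DeprecationWarning`.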

Comment on lines +116 to +118
def get_quantization_config(
    quantization: str, allow_deprecated: bool = False
) -> type[QuantizationConfig]:

high

The allow_deprecated parameter is introduced in this function's signature but it's not used within the function body. The deprecation logic is correctly handled in vllm.config.model.ModelConfig._verify_quantization. To avoid confusion for future developers who might expect this parameter to have an effect, it should be removed.

def get_quantization_config(quantization: str) -> type[QuantizationConfig]:
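
To illustrate the separation the reviewer is asking for, here is a rough sketch in which the lookup function takes only the scheme name and carries no deprecation policy. The registry, the stand-in `QuantizationConfig` base class, the `ExampleSchemeConfig` class, and the error message are hypothetical placeholders, not vLLM's actual implementation.

```python
class QuantizationConfig:
    """Stand-in base class for this sketch; not vLLM's real class."""


class ExampleSchemeConfig(QuantizationConfig):
    """Hypothetical config class for a hypothetical scheme."""


# Hypothetical registry mapping scheme names to config classes.
_REGISTRY: dict[str, type[QuantizationConfig]] = {
    "example_scheme": ExampleSchemeConfig,
}


def get_quantization_config(quantization: str) -> type[QuantizationConfig]:
    """Pure name-to-class lookup; deprecation checks live elsewhere."""
    try:
        return _REGISTRY[quantization]
    except KeyError:
        # Unknown names are a lookup error, independent of any deprecation flag.
        raise ValueError(f"Invalid quantization method: {quantization}") from None
```

Keeping the signature to a single argument means a caller cannot pass a flag that silently does nothing; whether a scheme is allowed is decided once, in the config verification step.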

Robert Shaw added 2 commits January 4, 2026 15:46
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
@robertgshaw2-redhat robertgshaw2-redhat marked this pull request as ready for review January 4, 2026 20:48
@robertgshaw2-redhat robertgshaw2-redhat added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Jan 4, 2026
Robert Shaw added 2 commits January 4, 2026 16:14
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Member

@yewentao256 yewentao256 left a comment

Thanks for the work! Will setting allow_deprecated_quantization=True only in test_auto_round.py be enough?

Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
@mergify

mergify bot commented Jan 6, 2026

Hi @robertgshaw2-redhat, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@github-project-automation github-project-automation bot moved this to Backlog in MoE Refactor Jan 6, 2026
@robertgshaw2-redhat robertgshaw2-redhat moved this from Backlog to In progress in MoE Refactor Jan 6, 2026
Robert Shaw added 2 commits January 8, 2026 17:40
Signed-off-by: Robert Shaw <robshaw@redhat.com>
@robertgshaw2-redhat robertgshaw2-redhat moved this from In progress to In review in MoE Refactor Jan 8, 2026
Robert Shaw added 2 commits January 8, 2026 19:10
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
@robertgshaw2-redhat robertgshaw2-redhat merged commit 5825bbc into main Jan 9, 2026
63 checks passed
@robertgshaw2-redhat robertgshaw2-redhat deleted the deprecate-quantization-schemes branch January 9, 2026 00:07
@github-project-automation github-project-automation bot moved this from In review to Done in MoE Refactor Jan 9, 2026
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
@micah-wil
Contributor

Hi, I noticed LM Eval Large Models failed on the nightly CI run after this PR: https://buildkite.com/vllm/ci/builds/46314/steps/canvas?sid=019ba18f-0a1a-4482-82d8-b8c102fd92c0. I've made a PR to allow for deprecated quantization similar to what you've done here for the other relevant tests: #32065.

akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed)

Projects

Status: Done

5 participants