[Quantization] Deprecate Long Tail of Schemes #31688
robertgshaw2-redhat merged 11 commits into main
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Code Review
This pull request introduces a deprecation mechanism for a number of quantization schemes. A new allow_deprecated_quantization flag is added to ModelConfig and exposed as a command-line argument, allowing users to continue using these schemes with a warning. The implementation is clear and the deprecation logic is sound. I have one suggestion to improve code clarity by removing an unused parameter.
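The gate described in this summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `DEPRECATED_SCHEMES` set, its contents, and the standalone `verify_quantization` function are hypothetical, and the sketch assumes a deprecated scheme is rejected when the flag is unset and only warned about when it is set.

```python
import warnings
from typing import Optional

# Hypothetical scheme names, for illustration only; not the list from this PR.
DEPRECATED_SCHEMES = frozenset({"legacy_scheme_a", "legacy_scheme_b"})


def verify_quantization(
    quantization: Optional[str], allow_deprecated: bool = False
) -> None:
    """Reject deprecated schemes unless the caller opted in; warn if it did."""
    # Nothing to check for unquantized models or non-deprecated schemes.
    if quantization is None or quantization not in DEPRECATED_SCHEMES:
        return

    if not allow_deprecated:
        # Assumed behavior: fail fast and tell the user how to opt in.
        raise ValueError(
            f"Quantization scheme {quantization!r} is deprecated. "
            "Set allow_deprecated_quantization=True to keep using it."
        )

    # Opt-in path: the scheme still loads, but the user is warned.
    warnings.warn(
        f"Quantization scheme {quantization!r} is deprecated and may be "
        "removed in a future release.",
        DeprecationWarning,
        stacklevel=2,
    )
```

Keeping the check in one validation step means every entry point that builds the config gets the same behavior, whether the flag arrives from Python or from the command line.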
def get_quantization_config(
    quantization: str, allow_deprecated: bool = False
) -> type[QuantizationConfig]:
The allow_deprecated parameter is introduced in this function's signature but it's not used within the function body. The deprecation logic is correctly handled in vllm.config.model.ModelConfig._verify_quantization. To avoid confusion for future developers who might expect this parameter to have an effect, it should be removed.
def get_quantization_config(quantization: str) -> type[QuantizationConfig]:
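To illustrate why the lookup needs no flag, here is a rough sketch of a name-to-class lookup that only resolves the scheme and leaves any deprecation policy to the caller. The stand-in `QuantizationConfig` base class, the `ExampleAConfig` and `ExampleBConfig` classes, and the `_REGISTRY` dict are all hypothetical; the real function body is not shown in this thread.

```python
class QuantizationConfig:
    """Stand-in base class for this sketch only."""


class ExampleAConfig(QuantizationConfig):
    """Hypothetical config class for an illustrative scheme."""


class ExampleBConfig(QuantizationConfig):
    """Hypothetical config class for a second illustrative scheme."""


# Hypothetical registry mapping scheme names to their config classes.
_REGISTRY: dict[str, type[QuantizationConfig]] = {
    "example_a": ExampleAConfig,
    "example_b": ExampleBConfig,
}


def get_quantization_config(quantization: str) -> type[QuantizationConfig]:
    """Resolve a scheme name to its config class; no policy decisions here."""
    try:
        return _REGISTRY[quantization]
    except KeyError:
        raise ValueError(
            f"Invalid quantization method: {quantization}"
        ) from None
```

With this split, the lookup has a single responsibility, and a signature that carries only `quantization` cannot mislead a reader into expecting the flag to change the result.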
yewentao256 left a comment
Thanks for the work! Will setting allow_deprecated_quantization=True only in test_auto_round.py be enough?
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Hi @robertgshaw2-redhat, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
Signed-off-by: Robert Shaw <robshaw@redhat.com> Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Hi, I noticed LM Eval Large Models failed on the nightly CI run after this PR: https://buildkite.com/vllm/ci/builds/46314/steps/canvas?sid=019ba18f-0a1a-4482-82d8-b8c102fd92c0. I've made a PR to allow for deprecated quantization similar to what you've done here for the other relevant tests: #32065.
Signed-off-by: Robert Shaw <robshaw@redhat.com> Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>