Support RL online quantization with torchao #23014
Merged
vllm-bot merged 1 commit into vllm-project:main on Oct 1, 2025
Summary:
This enables online quantization for verl. The PR adds support for
initializing a TorchAOConfig object in vllm from either a serialized
JSON file that specifies the desired type of quantization, or a
JSON-serialized TorchAOConfig object.
Code for serializing the config to JSON:
Code for serializing the config to a file:
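A minimal sketch of both serialization steps, assuming torchao exposes a `config_to_dict` helper in `torchao.core.config` and that `Float8DynamicActivationFloat8WeightConfig` with `PerRow` granularity is the config of interest (both are illustrative choices, not confirmed by this PR description). The file name `torchao_config.json` and the fallback payload are hypothetical; how the resulting string or file is handed to vllm is not shown here.

```python
import json
import tempfile
from pathlib import Path

try:
    # Assumption: torchao provides config_to_dict, which converts a torchao
    # config object into a JSON-friendly dict.
    from torchao.core.config import config_to_dict
    from torchao.quantization import (
        Float8DynamicActivationFloat8WeightConfig,
        PerRow,
    )

    config_dict = config_to_dict(
        Float8DynamicActivationFloat8WeightConfig(granularity=PerRow())
    )
except ImportError:
    # Placeholder payload so the sketch still runs without torchao installed.
    # These keys are illustrative only, not the real torchao schema.
    config_dict = {"type": "placeholder_quant_config", "params": {}}

# Serialize the config to a JSON string.
json_str = json.dumps(config_dict)

# Serialize the config to a file (hypothetical file name in a temp directory).
config_path = Path(tempfile.mkdtemp()) / "torchao_config.json"
config_path.write_text(json_str)

# Read the file back to confirm the round trip preserves the config.
loaded = json.loads(config_path.read_text())
```

Either `json_str` or the path to the written file can then serve as the serialized input described above.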
This also supports module-level configuration through the
ModuleFqnToConfig config (https://huggingface.co/docs/transformers/main/en/quantization/torchao#per-module-quantization),
although this has not been tested yet.
More configs: https://docs.pytorch.org/ao/main/api_ref_quantization.html#inference-apis-for-quantize
Note: this incorporates changes from @LiyuanLucasLiu's PR #23901. The vllm fp8 quant method is not supported yet; we can add that in a separate PR.
Test Plan:
pytest tests/quantization/test_torchao.py -k test_on_the_fly_quant
pytest tests/quantization/test_torchao.py -k test_reload_weights
Regression tests:
pytest tests/quantization/test_torchao.py