[main][Quant] Remove unused rotation functions and parameters from W4A4 LAOS quantization #6648
Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
Summary of Changes

This pull request refines the W4A4 LAOS dynamic quantization implementation by removing dead code. It drops the rotation-related functions and parameters that are never invoked in the current workflow, which streamlines the codebase and also reduces the memory footprint.
👋 Hi! Thank you for contributing to the vLLM Ascend project. To speed up your PR merge: if CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Code Review
This pull request removes unused rotation-related functions and parameters from the AscendW4A4LaosDynamicLinearMethod. The changes simplify the codebase by removing dead code, which is a good improvement for maintainability. The changes are correct and I have no further comments on the code.
Following the repository's style guide, I've provided suggestions for the PR title and summary:
Suggested PR Title:
[main][Quant][Refactor] Remove unused rotation logic in W4A4 LAOS quantization

Suggested PR Summary:
### What this PR does / why we need it?
This PR refactors the `AscendW4A4LaosDynamicLinearMethod` by removing unused rotation-related code. The following components are removed:
- The `set_rotation_config` and `apply_rotation` methods.
- The `rotation_type` instance variable.
- Unused quantization parameters: `heads_rotation`, `kronecker_rotation_n`, and `kronecker_rotation_m`.
These components are not utilized in the current W4A4 LAOS dynamic quantization workflow. Their removal simplifies the codebase and eliminates unnecessary memory allocation for large rotation matrices, improving maintainability and efficiency.
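One way to sanity-check a "never called" claim like this before deleting code is a static scan for remaining reads or calls of the removed names. The sketch below is a heuristic under stated assumptions, not the check used in this PR: `find_references` and `REMOVED_NAMES` are hypothetical helper names, the scan only sees direct name and attribute access in load context, and it will miss dynamic access such as `getattr(obj, "apply_rotation")` or string dictionary keys.

```python
from __future__ import annotations

import ast

# Names removed by this PR (taken from the list above).
REMOVED_NAMES = {
    "set_rotation_config",
    "apply_rotation",
    "rotation_type",
    "heads_rotation",
    "kronecker_rotation_n",
    "kronecker_rotation_m",
}


def find_references(source: str, names: set[str]) -> list[tuple[int, str]]:
    """Return sorted (line, name) pairs where a name is read or called.

    Definitions (`def apply_rotation`) and plain assignments
    (`self.rotation_type = ...`) are not reported, because they do not
    show that anything actually uses the name.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id in names:
                hits.append((node.lineno, node.id))
        elif isinstance(node, ast.Attribute) and isinstance(node.ctx, ast.Load):
            if node.attr in names:
                hits.append((node.lineno, node.attr))
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Usage: python find_refs.py path/to/module.py [more files...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as fh:
            for lineno, name in find_references(fh.read(), REMOVED_NAMES):
                print(f"{path}:{lineno}: {name}")
```

An empty result across the package is consistent with the names being dead code, but it is not proof; the existing unit tests remain the stronger signal.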
### Does this PR introduce _any_ user-facing change?
No, this is an internal refactoring and does not introduce any user-facing changes.
### How was this patch tested?
CI is expected to pass with existing unit tests.

…A4 LAOS quantization (vllm-project#6648)

## Summary
- Remove unused `set_rotation_config` and `apply_rotation` methods from `AscendW4A4LaosDynamicLinearMethod`
- Remove unused `rotation_type` field and associated conditional quantization parameters (`heads_rotation`, `kronecker_rotation_n`, `kronecker_rotation_m`)

These rotation-related functions and parameters are never called in the current W4A4 LAOS dynamic quantization workflow.

- vLLM version: v0.15.0
- vLLM main: vllm-project/vllm@d7e17aa

Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
Signed-off-by: mikequan0425 <mikequan0425@foxmail.com>
…to qwen3next_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [Docs] Fix GLM-5 deploy command (vllm-project#6711)
  [npugraph_ex]enable npugraph_ex by default (vllm-project#6664)
  [doc]add GLM5.md (vllm-project#6709)
  [Model] GLM5 adaptation (vllm-project#6642)
  [Bugfix] Update target probs to target logits in rejection sample (vllm-project#6685)
  [Main][Ops] Make triton rope support index_selecting from cos_sin_cache (vllm-project#5450)
  [CI]fix nightly multi node test error for wait for pod ready (vllm-project#6675)
  [main to main] upgrade main 0210 (vllm-project#6673)
  [main][Quant] Remove unused rotation functions and parameters from W4A4 LAOS quantization (vllm-project#6648)
  [Test][BugFix] Fix torch.rand usage in triton penalty test (vllm-project#6680)
  Add Worker Interface:check_health (vllm-project#6681)
Summary
- Remove unused `set_rotation_config` and `apply_rotation` methods from `AscendW4A4LaosDynamicLinearMethod`
- Remove unused `rotation_type` field and associated conditional quantization parameters (`heads_rotation`, `kronecker_rotation_n`, `kronecker_rotation_m`)

These rotation-related functions and parameters are never called in the current W4A4 LAOS dynamic quantization workflow.