
Conversation

@rahul-tuli
Collaborator

SUMMARY:
"please provide a brief summary"

TEST PLAN:
"please outline how the changes were tested"

@github-actions

github-actions bot commented Oct 7, 2024

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

@kylesayrs
Collaborator

Note that we should update the list of supported algorithms in the readme when this lands

@brian-dellabetta brian-dellabetta mentioned this pull request Feb 19, 2025
dsikka pushed a commit that referenced this pull request Apr 21, 2025
SUMMARY:
Addition of [`AWQModifier`](https://arxiv.org/pdf/2306.00978), based on the [AutoAWQ implementation](https://github.com/casper-hansen/AutoAWQ/blob/main/awq/quantize/quantizer.py#L28).

Should be reviewed/merged in conjunction with
vllm-project/compressed-tensors#269

Replaces #181 and #824 
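
As a rough usage sketch (not code from this PR), assuming the modifier plugs into the usual llm-compressor `oneshot` recipe flow; the import paths, parameter names, and dataset below are assumptions and may differ from what actually landed:

```python
# Hypothetical sketch of applying the new AWQModifier via oneshot.
# Import paths and argument names are assumptions, not taken from this PR.
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

# 4-bit weight quantization, group size 128 -- the setting evaluated in the TEST PLAN
recipe = AWQModifier(scheme="W4A16", targets=["Linear"], ignore=["lm_head"])

oneshot(
    model="meta-llama/Llama-2-7b-hf",
    dataset="open_platypus",          # any small calibration dataset
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=256,
)
```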

TEST PLAN:
Some unit tests are included, but since this was mostly a port from AutoAWQ, we
validated the code by ensuring we could reproduce the evaluation metrics
in Table 4 of [the paper](https://arxiv.org/pdf/2306.00978). We achieve
the following wikitext PPL scores (a generic sketch of this kind of
perplexity evaluation follows the result lists below):

Llama-2 7B Group 128:
1. Paper: 5.60
2. AutoAWQ: 5.615
3. This implementation: 5.612
4. We also match what the paper reports for RTN alone: 5.73
5. We get reasonable results for channel-wise quantization: 6.788. AutoAWQ errors
out for this setting (`"q_group_size": -1` in the quant_config), and
results are not reported in the paper.

Llama-2 13B Group 128:
1. We match both AutoAWQ and the result shown in the paper: 4.97
2. We also match what the paper reports for RTN alone: 4.984
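
For reference, wikitext-2 perplexity numbers like those above are typically computed as a sliding-window negative log-likelihood over the test split. A minimal, generic sketch (not the exact harness used for these results; the model path is a placeholder):

```python
# Generic wikitext-2 perplexity sketch; not the evaluation code from this PR.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/awq-quantized-llama-2-7b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

seq_len = 2048  # non-overlapping windows of the model's context length
nlls, n_tokens = [], 0
for begin in range(0, input_ids.size(1), seq_len):
    chunk = input_ids[:, begin : begin + seq_len].to(model.device)
    if chunk.size(1) < 2:
        break
    with torch.no_grad():
        # labels == inputs gives the mean next-token NLL over chunk.size(1) - 1 targets
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss * (chunk.size(1) - 1))
    n_tokens += chunk.size(1) - 1

ppl = torch.exp(torch.stack(nlls).sum() / n_tokens)
print(f"wikitext-2 PPL: {ppl.item():.3f}")
```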

NOTE: We are excluding the clipping logic in this implementation. If we
want to add it, it should be added as a separate modifier: the two are
mutually exclusive, and the data model for AWQ doesn't align well with clipping.
That may explain the slight deviation between the results reported in
the paper and those of our implementation.

---------

Signed-off-by: Brian Dellabetta <[email protected]>
@dsikka dsikka closed this Apr 22, 2025