[Int4-AWQ] Torch Int-4 AWQ Dequantization and Configuration Options#146

Merged
hegemanjw4amd merged 1 commit into main from hegeman/basic-sdpa-attention-int4-awq-interim on Aug 21, 2024
Conversation

@hegemanjw4amd commented:

This PR adds a fully general Int4-AWQ dequantization function implemented in plain torch, along with environment flags for selecting between the torch and Triton codepaths.
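A torch-only dequantization path of this kind can be sketched roughly as below. This is a minimal illustration, not the PR's actual code: the function names, tensor layouts, and the nibble interleaving order are assumptions based on the common AWQ packing convention (eight 4-bit values per int32).

```python
import torch

# Hypothetical sketch of Int4-AWQ dequantization in plain torch.
# Assumed AWQ packing convention: eight 4-bit values per int32, stored
# in the interleaved nibble order [0, 4, 1, 5, 2, 6, 3, 7].
AWQ_REVERSE_ORDER = [0, 4, 1, 5, 2, 6, 3, 7]

def _reverse_awq_order(t: torch.Tensor) -> torch.Tensor:
    """Undo the assumed AWQ nibble interleaving along the last dimension."""
    order = torch.arange(t.shape[-1], dtype=torch.int64, device=t.device)
    order = order.view(-1, 8)[:, AWQ_REVERSE_ORDER].reshape(-1)
    return t[:, order]

def awq_dequantize_torch(qweight: torch.Tensor,
                         scales: torch.Tensor,
                         qzeros: torch.Tensor) -> torch.Tensor:
    """Dequantize AWQ int4 weights using torch ops only.

    qweight: (K, N // 8) int32 packed weights
    scales:  (K // group_size, N) float per-group scales
    qzeros:  (K // group_size, N // 8) int32 packed zero points
    """
    bits = 4
    group_size = qweight.shape[0] // scales.shape[0]
    shifts = torch.arange(0, 32, bits, device=qweight.device)

    # Unpack eight nibbles from each int32, then undo the interleaving.
    iweights = (qweight[:, :, None] >> shifts[None, None, :]).view(qweight.shape[0], -1)
    izeros = (qzeros[:, :, None] >> shifts[None, None, :]).view(qzeros.shape[0], -1)
    iweights = _reverse_awq_order(iweights) & 0xF
    izeros = _reverse_awq_order(izeros) & 0xF

    # Broadcast per-group scales and zero points over each group of K rows.
    scales = scales.repeat_interleave(group_size, dim=0)
    izeros = izeros.repeat_interleave(group_size, dim=0)
    return (iweights - izeros) * scales
```

A Triton kernel does the same unpack-shift-mask arithmetic per tile; the torch version trades kernel-launch efficiency for portability, which is why a runtime flag to choose between the two codepaths is useful.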

Testing: Two HuggingFace models quantized in Int4-AWQ format have been run successfully:

- Qwen2-7B-Instruct-AWQ (latency benchmarking)
- Phi-3-mini-4k-instruct-AWQ (input verification)

For the latter model, specific input prompts were supplied and the output was examined as a sanity check for correctness.

Unit testing is accomplished via tests/kernels/test_awq_triton.py.

Resolves: https://github.com/ROCm/FasterTransformer-Internal/issues/287

@hegemanjw4amd force-pushed the hegeman/basic-sdpa-attention-int4-awq-interim branch 3 times, most recently from 0b78568 to dd9a148, on August 21, 2024 at 10:47

@shajrawi left a comment:

ship it

@hegemanjw4amd force-pushed the hegeman/basic-sdpa-attention-int4-awq-interim branch from dd9a148 to d4332ec on August 21, 2024 at 16:15
@hegemanjw4amd merged commit 4e9830e into main on Aug 21, 2024
@gshtras deleted the hegeman/basic-sdpa-attention-int4-awq-interim branch on September 10, 2024