[KV Cache] support kv cache int8 per channel quant #398

Eviannn · 2025-07-19T08:34:37Z

This PR adds support for int8 per-channel quantization of the KV cache.
Besides this, llm-compressor needs to be updated as well: vllm-project/llm-compressor#1663

Signed-off-by: evian <[email protected]>
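
For readers unfamiliar with the technique, here is a minimal sketch of what int8 per-channel quantization means in general. It is not the code from this PR: it assumes a symmetric scheme (scale = max |x| / 127, no zero point), and the function names `quantize_per_channel_int8` and `dequantize_per_channel` are hypothetical. The point is only that each channel gets its own scale instead of one scale for the whole tensor.

```python
def quantize_per_channel_int8(channels):
    """Symmetric int8 quantization with one scale per channel.

    `channels` is a list of channels, each a list of floats.
    Returns (quantized_channels, scales).
    Assumption: symmetric scheme, no zero point.
    """
    quantized, scales = [], []
    for channel in channels:
        max_abs = max(abs(x) for x in channel)
        # Guard against an all-zero channel to avoid dividing by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the int8 range.
        q = [max(-128, min(127, round(x / scale))) for x in channel]
        quantized.append(q)
        scales.append(scale)
    return quantized, scales


def dequantize_per_channel(quantized, scales):
    """Map int8 values back to floats using each channel's own scale."""
    return [[v * s for v in q] for q, s in zip(quantized, scales)]


if __name__ == "__main__":
    data = [[0.5, -1.0, 0.25], [10.0, 5.0, -2.5]]
    q, s = quantize_per_channel_int8(data)
    print(q, s)
    print(dequantize_per_channel(q, s))
```

Because the second channel has a much larger range than the first, a single per-tensor scale would spend most of the int8 range on it; per-channel scales keep the rounding error of each channel proportional to that channel's own magnitude.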

dsikka · 2025-07-31T15:21:14Z

src/compressed_tensors/quantization/lifecycle/initialize.py


-    expected_shape = 1  # per tensor
+    if quantization_args.strategy == QuantizationStrategy.CHANNEL:
+        expected_shape = module.k_proj.out_features


For channel-wise quantization, the expected shape should be 2D.
Please refer to the init here: https://github.com/neuralmagic/compressed-tensors/blob/3d49764c02d4d9437e59d35f8c49abb5bc94636c/src/compressed_tensors/quantization/lifecycle/initialize.py#L175

Eviannn · Aug 5, 2025

Done, thanks a lot!
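
The shape question raised in this review thread can be illustrated with a small sketch. It is an assumption, not the merged code: the `Strategy` enum is a hypothetical stand-in for `QuantizationStrategy`, the helper `kv_scale_shape` does not exist in the library, and the `(out_features, 1)` layout is assumed to follow the convention the linked init uses for channel-wise weight scales.

```python
from enum import Enum


class Strategy(Enum):
    """Hypothetical stand-in for QuantizationStrategy."""
    TENSOR = "tensor"
    CHANNEL = "channel"


def kv_scale_shape(strategy, out_features):
    """Return the assumed shape of a k/v scale parameter.

    Assumption: channel-wise scales are 2D, (out_features, 1),
    while per-tensor scales hold a single element.
    """
    if strategy is Strategy.CHANNEL:
        # One scale per output channel, kept 2D rather than a flat
        # vector of length out_features.
        return (out_features, 1)
    # Per tensor: a single scale for the whole cache.
    return (1,)


if __name__ == "__main__":
    print(kv_scale_shape(Strategy.TENSOR, 1024))
    print(kv_scale_shape(Strategy.CHANNEL, 1024))
```

The difference from the first revision of the diff is that `module.k_proj.out_features` alone would give a 1D shape; wrapping it as a 2D shape is what the review comment asks for.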

This was referenced Jul 19, 2025

[KV Cache] support kv cache int8 per channel quant vllm-project/llm-compressor#1662

Closed

[KV Cache] support kv cache int8 per channel quantization vllm-project/llm-compressor#1663

Open

Eviannn force-pushed the main branch from 7eb22dc to 2c96312 Compare July 19, 2025 11:19

[KV Cache] support kv cache int8 per channel quant

2c96312

Signed-off-by: evian <[email protected]>

dsikka self-requested a review July 19, 2025 12:27

dsikka requested changes Jul 31, 2025


[KV Cache] fix per channel shape

fc6b734

Eviannn requested a review from dsikka August 7, 2025 03:43
