Refactor: review improvements for KV cache quantization by janhilgard · Pull Request #72 · waybarrios/vllm-mlx

janhilgard · 2026-02-11T16:41:30Z

Summary

- Extract `_maybe_dequantize()` helper method to eliminate 5 repeated conditional dequantize blocks in `fetch()` (DRY)
- Replace duck-typed `hasattr`/`isinstance(keys, (list, tuple))` check with explicit `isinstance(layer_cache, QuantizedKVCache)` in `estimate_kv_cache_memory()`: more robust and self-documenting
- Replace `__new__()` bypass with normal `QuantizedKVCache(group_size=..., bits=...)` constructor in `_trim_cache_offset()`: forward-compatible if `__init__` adds new attributes
- Extract `_add_kv_cache_quantization_args()` helper to deduplicate identical CLI argument definitions between serve and bench parsers
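
As a rough illustration of the last item, a shared helper can register the same flags on both subparsers. This is a minimal sketch rather than the code in this PR: the flag names, defaults, and the `build_parser()` function are hypothetical placeholders, and only three flags are shown where the PR deduplicates four.

```python
import argparse


def _add_kv_cache_quantization_args(parser: argparse.ArgumentParser) -> None:
    """Register the shared KV cache quantization flags on one parser.

    Flag names and defaults are placeholders for illustration only;
    they are not taken from the project's actual CLI.
    """
    group = parser.add_argument_group("KV cache quantization")
    group.add_argument(
        "--kv-cache-quantization",
        action="store_true",
        help="Store cached keys/values in quantized form.",
    )
    group.add_argument("--kv-cache-bits", type=int, default=8, choices=(4, 8))
    group.add_argument("--kv-cache-group-size", type=int, default=64)


def build_parser() -> argparse.ArgumentParser:
    """Build a toy CLI with `serve` and `bench` subcommands (hypothetical)."""
    parser = argparse.ArgumentParser(prog="demo")
    subparsers = parser.add_subparsers(dest="command", required=True)
    serve = subparsers.add_parser("serve")
    bench = subparsers.add_parser("bench")
    # One definition, applied to both parsers, so the flags cannot drift apart.
    for sub in (serve, bench):
        _add_kv_cache_quantization_args(sub)
    return parser


args = build_parser().parse_args(
    ["bench", "--kv-cache-quantization", "--kv-cache-bits", "4"]
)
print(args.command, args.kv_cache_bits, args.kv_cache_group_size)
```

Because both subparsers call the same helper, a change to a default or help string only has to be made in one place.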

Test plan

- All 16 `test_kv_cache_quantization.py` tests pass
- `ruff check` clean on changed files (pre-existing N806 on `_QKVCache` unrelated)
- Pure refactoring, no behavioral changes
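
To make the "no behavioral changes" claim concrete for the constructor swap, the sketch below compares a `__new__()` bypass with a normal constructor call. It uses a hypothetical stand-in class, `QuantizedKVCacheStub`, not the real `QuantizedKVCache`; the `offset` handling and the `step` attribute are assumptions added only to show what happens when `__init__` later gains an attribute.

```python
class QuantizedKVCacheStub:
    """Hypothetical stand-in for QuantizedKVCache (not the real class)."""

    def __init__(self, group_size: int = 64, bits: int = 8):
        self.group_size = group_size
        self.bits = bits
        self.offset = 0
        # Assumption: imagine a later release adds this attribute in __init__.
        self.step = 256


def trim_with_new_bypass(cache: QuantizedKVCacheStub, new_offset: int):
    """Old style: allocate without running __init__, then copy fields by hand."""
    clone = QuantizedKVCacheStub.__new__(QuantizedKVCacheStub)
    clone.group_size = cache.group_size
    clone.bits = cache.bits
    clone.offset = new_offset
    return clone  # any attribute added to __init__ later is silently missing


def trim_with_constructor(cache: QuantizedKVCacheStub, new_offset: int):
    """New style: let __init__ set every attribute, then override the offset."""
    clone = QuantizedKVCacheStub(group_size=cache.group_size, bits=cache.bits)
    clone.offset = new_offset
    return clone


original = QuantizedKVCacheStub(group_size=32, bits=4)
bypassed = trim_with_new_bypass(original, 10)
constructed = trim_with_constructor(original, 10)

# The fields both versions set explicitly are identical.
print((bypassed.group_size, bypassed.bits, bypassed.offset)
      == (constructed.group_size, constructed.bits, constructed.offset))  # True
# Only the constructor path picks up attributes introduced in __init__.
print(hasattr(bypassed, "step"), hasattr(constructed, "step"))  # False True
```

For the attributes that exist today the two paths agree, which is why the swap can be treated as a pure refactor; the difference only appears once `__init__` grows.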

🤖 Generated with Claude Code

janhilgard · 2026-02-11T16:42:45Z

Hey @waybarrios — ideally I'd push these changes directly into #62, but I don't have write access and maintainerCanModify is disabled on PR #62. It would be cleaner if you could either grant me write access to the repo or enable maintainerCanModify on #62, so I can push directly to your branch instead of creating separate PRs. Thanks!

waybarrios · 2026-02-13T02:11:03Z

@janhilgard Sorry for the late reply, I've been pretty busy. But now you have access!! :)

janhilgard · 2026-02-13T10:50:20Z

Thank you very much @waybarrios! Really appreciate it 🙏

- Add `_maybe_dequantize()` method to replace 5 repeated dequantize patterns in fetch()
- Use explicit `isinstance(QuantizedKVCache)` instead of duck-typing in estimate_kv_cache_memory()
- Use normal QuantizedKVCache constructor instead of __new__() bypass in _trim_cache_offset()
- Extract `_add_kv_cache_quantization_args()` to deduplicate 4 CLI args in serve/bench parsers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

janhilgard · 2026-03-21T22:18:41Z

Closing: Related PRs #73 and #69 have already been merged. This refactor is now outdated.

janhilgard force-pushed the review-improvements branch from f805997 to 36164d1 on February 14, 2026 at 16:43

janhilgard changed the base branch from feat/kv-cache-quantization to main on February 14, 2026 at 16:44

janhilgard closed this on Mar 21, 2026

janhilgard wants to merge 1 commit into waybarrios:main from janhilgard:review-improvements
