Revert "[HMA] [KVEvent] Enable GPU-side KV events for HMA" (#37688)#39658
Revert "[HMA] [KVEvent] Enable GPU-side KV events for HMA" (#37688)#39658vllm-agent wants to merge 1 commit intovllm-project:mainfrom
Conversation
|
Documentation preview: https://vllm--39658.org.readthedocs.build/en/39658/ |
There was a problem hiding this comment.
Code Review
This pull request removes the group_idx field from KV cache events (BlockStored and BlockRemoved) and updates associated tests and logic across the codebase. It also introduces a change to disable the hybrid KV cache manager when KV events are configured, as they are currently incompatible. Feedback suggests refining the logic in vllm/config/vllm.py to only disable the hybrid KV cache manager if KV events are explicitly enabled, rather than just checking for the presence of the configuration object.
| if self.kv_events_config is not None: | ||
| # Hybrid KV cache manager is not compatible with KV events. | ||
| need_disable_hybrid_kv_cache_manager = True |
There was a problem hiding this comment.
The current logic disables the hybrid KV cache manager (HMA) whenever kv_events_config is present, regardless of whether KV events are actually enabled. This can lead to unnecessary performance regressions for hybrid models if the configuration object exists but enable_kv_cache_events is set to False. It is better to check the enable_kv_cache_events flag specifically.
| if self.kv_events_config is not None: | |
| # Hybrid KV cache manager is not compatible with KV events. | |
| need_disable_hybrid_kv_cache_manager = True | |
| if self.kv_events_config is not None and self.kv_events_config.enable_kv_cache_events: | |
| # Hybrid KV cache manager is not compatible with KV events. | |
| need_disable_hybrid_kv_cache_manager = True |
Revert of #37688
This reverts commit cc07dad.
Reason: This PR is linked to 1 new CI failure in build #60970:
test_full_decode_gpu_cache_hit_metrics— local_cache_hit off-by-one (expected 128, got 127.0)Original PR: #37688
Auto-generated by CI failure analyzer