Merged
7 changes: 5 additions & 2 deletions vllm/v1/worker/gpu/attn_utils.py
@@ -115,9 +115,12 @@ def _reshape_kv_cache(
 ) -> dict[str, torch.Tensor]:
     kv_caches: dict[str, torch.Tensor] = {}
     for kv_cache_group_spec in kv_cache_config.kv_cache_groups:
-        kv_cache_spec = kv_cache_group_spec.kv_cache_spec
-        assert isinstance(kv_cache_spec, AttentionSpec)
         for layer_name in kv_cache_group_spec.layer_names:
+            kv_cache_spec = kv_cache_group_spec.kv_cache_spec
+            if isinstance(kv_cache_spec, UniformTypeKVCacheSpecs):
+                kv_cache_spec = kv_cache_spec.kv_cache_specs[layer_name]
+            assert isinstance(kv_cache_spec, AttentionSpec)
Contributor
high

The assert statement on this line performs a critical type check. If assertions are disabled in a production environment (e.g. when Python runs with the -O flag), this check is skipped, potentially leading to an AttributeError or TypeError in subsequent operations if kv_cache_spec is not an AttentionSpec. For robust error handling, consider replacing this assert with an explicit TypeError or ValueError so that type validation always occurs, regardless of assertion settings.

Suggested change
-            assert isinstance(kv_cache_spec, AttentionSpec)
+            if not isinstance(kv_cache_spec, AttentionSpec):
+                raise TypeError(f"Expected kv_cache_spec to be AttentionSpec, but got {type(kv_cache_spec)}")


             raw_tensor = kv_cache_raw_tensors[layer_name]
             assert raw_tensor.numel() % kv_cache_spec.page_size_bytes == 0
             num_blocks = raw_tensor.numel() // kv_cache_spec.page_size_bytes
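The distinction the review comment draws matters because CPython strips `assert` statements entirely when run with the `-O` flag. A minimal, self-contained sketch of the explicit check the suggestion proposes — the spec classes here are stand-ins, not vLLM's real `AttentionSpec`:

```python
# Stand-in classes used only to illustrate the check; the real
# AttentionSpec / kv-cache specs live inside vLLM.
class AttentionSpec:
    pass


class OtherSpec:  # any non-attention spec type
    pass


def validate_spec(kv_cache_spec):
    """Explicit type validation that still runs under `python -O`,
    unlike a bare `assert isinstance(...)`."""
    if not isinstance(kv_cache_spec, AttentionSpec):
        raise TypeError(
            f"Expected kv_cache_spec to be AttentionSpec, "
            f"but got {type(kv_cache_spec)}"
        )
    return kv_cache_spec


validate_spec(AttentionSpec())  # passes silently

try:
    validate_spec(OtherSpec())
except TypeError as exc:
    print(exc)  # reports the unexpected spec type
```

The trade-off is the usual one: `assert` documents an internal invariant and costs nothing in optimized runs, while an explicit `raise` guarantees the failure is caught at the check site with a clear message rather than surfacing later as an unrelated `AttributeError`.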