1 change: 1 addition & 0 deletions vllm/model_executor/layers/fused_moe/fused_moe.py
@@ -1217,6 +1217,7 @@ def should_moe_wna16_use_cuda(
 ):
     return (
         current_platform.is_cuda()
+        and not current_platform.is_rocm()
Comment on lines 1219 to +1220
Contributor
Severity: high

To make the platform check more direct and robust against potential inconsistencies in is_cuda() behavior across environments, consider using current_platform.device_name == 'cuda'. This directly checks for the CUDA platform and is less prone to misinterpretation.

Suggested change:
-        current_platform.is_cuda()
-        and not current_platform.is_rocm()
+        current_platform.device_name == "cuda"

and bit == 4
and group_size in [32, 64, 128]
and num_valid_tokens / num_experts <= 6
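Taken together, the guard after this change can be sketched as a standalone function. This is a minimal sketch for illustration only: `PlatformStub` stands in for `vllm.platforms.current_platform`, and the parameter order shown is illustrative rather than vLLM's exact signature.

```python
class PlatformStub:
    """Stand-in for vllm.platforms.current_platform (assumption for this sketch)."""

    def __init__(self, cuda: bool, rocm: bool):
        self._cuda = cuda
        self._rocm = rocm

    def is_cuda(self) -> bool:
        # The PR guards against environments where is_cuda() may report True
        # on a ROCm build, which is why the explicit is_rocm() check is added.
        return self._cuda

    def is_rocm(self) -> bool:
        return self._rocm


def should_moe_wna16_use_cuda(current_platform, bit, group_size,
                              num_valid_tokens, num_experts):
    # Mirrors the condition in the diff: CUDA but not ROCm, 4-bit weights,
    # a supported group size, and a low tokens-per-expert ratio.
    return (
        current_platform.is_cuda()
        and not current_platform.is_rocm()
        and bit == 4
        and group_size in [32, 64, 128]
        and num_valid_tokens / num_experts <= 6
    )
```

With this guard, a ROCm platform is excluded even if its `is_cuda()` were to report True, which is exactly the inconsistency the review comment above is concerned with.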
1 change: 1 addition & 0 deletions vllm/platforms/rocm.py
@@ -336,6 +336,7 @@ class RocmPlatform(Platform):
"petit_nvfp4",
"torchao",
"bitsandbytes",
+    "moe_wna16",
]

@classmethod
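The second hunk adds `"moe_wna16"` to `RocmPlatform`'s allowlist of quantization methods. A hedged sketch of how such an allowlist might be consulted is below; the names `SUPPORTED_QUANTIZATION` and `verify_quantization` are illustrative, not vLLM's exact API.

```python
# Illustrative allowlist mirroring the diff in vllm/platforms/rocm.py.
SUPPORTED_QUANTIZATION = [
    "petit_nvfp4",
    "torchao",
    "bitsandbytes",
    "moe_wna16",  # newly allowed on ROCm by this PR
]


def verify_quantization(method: str) -> None:
    # Fail fast if a model config requests a method this platform can't run.
    if method not in SUPPORTED_QUANTIZATION:
        raise ValueError(
            f"Quantization method {method!r} is not supported on ROCm.")
```

Without this hunk, a ROCm deployment requesting `moe_wna16` quantization would be rejected at startup even though the kernel-dispatch change in `fused_moe.py` already routes ROCm away from the CUDA-only path.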