Merged
8 changes: 6 additions & 2 deletions vllm/_aiter_ops.py
@@ -1052,12 +1052,16 @@ def is_fp8bmm_enabled(cls) -> bool:
     @classmethod
     @if_aiter_supported
     def is_fp4bmm_enabled(cls) -> bool:
-        return cls._AITER_ENABLED and cls._FP4BMM_ENABLED
+        from vllm.platforms.rocm import on_gfx950
+
+        return cls._AITER_ENABLED and cls._FP4BMM_ENABLED and on_gfx950()

     @classmethod
     @if_aiter_supported
     def is_asm_fp4_gemm_dynamic_quant_enabled(cls) -> bool:
-        return cls._AITER_ENABLED and cls._FP4_GEMM_DYNAMIC_QUANT_ASM
+        from vllm.platforms.rocm import on_gfx950
+
+        return cls._AITER_ENABLED and cls._FP4_GEMM_DYNAMIC_QUANT_ASM and on_gfx950()
Comment on lines 1054 to +1064
Contributor

high

The import `from vllm.platforms.rocm import on_gfx950` is duplicated in both `is_fp4bmm_enabled` and `is_asm_fp4_gemm_dynamic_quant_enabled`. To improve performance and avoid code duplication, consider moving this import to a higher scope, such as the module level. This would prevent repeated import overhead in these check functions, which may be in a hot path.

Contributor Author

The local import is intentional — it follows the existing pattern used throughout the codebase to avoid circular imports (e.g., mxfp4_utils.py:43, layers/utils.py:146). Python caches imports after the first call, so there's no meaningful performance overhead.
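For readers weighing the two positions in this thread, here is a minimal sketch that compares a function-local import with a module-level one. It assumes a stand-in module named `fake_rocm_platform` (hypothetical, registered in `sys.modules` so the snippet runs without vLLM or ROCm) in place of `vllm.platforms.rocm`; the helper names are also illustrative and not part of vLLM. After the first call, the local import is served from the `sys.modules` cache, so the remaining per-call cost is a lookup and a name binding rather than a module load.

```python
import sys
import timeit
import types

# Hypothetical stand-in for vllm.platforms.rocm so the example is self-contained.
_fake = types.ModuleType("fake_rocm_platform")
_fake.on_gfx950 = lambda: True
sys.modules["fake_rocm_platform"] = _fake

# Module-level import: the name is bound once, when this file is loaded.
from fake_rocm_platform import on_gfx950 as _on_gfx950_module_level


def check_with_local_import() -> bool:
    # Deferred import: executed on every call, but resolved from the
    # sys.modules cache, so the module body is not re-run.
    from fake_rocm_platform import on_gfx950

    return on_gfx950()


def check_with_module_import() -> bool:
    return _on_gfx950_module_level()


def per_call_seconds(fn, number: int = 100_000) -> float:
    # Average wall-clock time of one call; numbers vary by machine.
    return timeit.timeit(fn, number=number) / number


if __name__ == "__main__":
    print(f"local import : {per_call_seconds(check_with_local_import):.2e} s/call")
    print(f"module import: {per_call_seconds(check_with_module_import):.2e} s/call")
```

The local form keeps the import out of module load time, which is what makes it useful for breaking import cycles. Whether the small per-call difference matters depends on how often the check runs, which this sketch does not measure for vLLM itself.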


@classmethod
@if_aiter_supported