[ROCm] Fix broken import in platform attention backend dispatching #30432
gshtras merged 2 commits into vllm-project:main from
Conversation
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Code Review
This pull request fixes a broken import in the ROCm platform configuration by replacing a call to get_env_variable_attn_backend with direct environment variable checks. While this resolves the import issue, the new implementation for checking environment variables is inconsistent with how boolean flags are handled elsewhere in the codebase, which could lead to incorrect behavior. I've provided a suggestion to align the implementation with the project's standards.
vllm/platforms/rocm.py
Outdated
```python
os.environ.get("VLLM_ROCM_USE_AITER_UNIFIED_ATTENTION")
and os.environ.get("VLLM_ROCM_USE_AITER")
```
The direct use of os.environ.get() to check for these boolean environment variables is inconsistent with the established pattern in vllm.envs. This can lead to incorrect behavior. For instance, if a user sets VLLM_ROCM_USE_AITER=0 to disable it, os.environ.get() will return the string "0", which is truthy, causing the block size to be incorrectly set to 64.
To ensure consistent and correct behavior, you should use the vllm.envs module, which is already imported in this file. This module correctly parses these environment variables as booleans.
```diff
-os.environ.get("VLLM_ROCM_USE_AITER_UNIFIED_ATTENTION")
-and os.environ.get("VLLM_ROCM_USE_AITER")
+envs.VLLM_ROCM_USE_AITER_UNIFIED_ATTENTION
+and envs.VLLM_ROCM_USE_AITER
```
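The truthiness pitfall the bot describes can be reproduced in isolation. This is a minimal sketch; the parsing expression mirrors the boolean-flag pattern used in `vllm/envs.py`, but treat the exact set of accepted "true" spellings as an assumption:

```python
import os

# A user sets the flag to "0", intending "disabled".
os.environ["VLLM_ROCM_USE_AITER"] = "0"

# Raw os.environ.get() returns the string "0", which is truthy,
# so a plain `if` check would wrongly treat the flag as enabled.
raw = os.environ.get("VLLM_ROCM_USE_AITER")
print(bool(raw))  # True

# Parsing in the style of vllm.envs: compare the lowered string
# against accepted true-spellings (assumed: "true" and "1").
parsed = os.environ.get("VLLM_ROCM_USE_AITER", "False").lower() in ("true", "1")
print(parsed)  # False -- "0" is correctly treated as disabled
```

This is exactly why `VLLM_ROCM_USE_AITER=0` would incorrectly trigger the block-size-64 path under the `os.environ.get()` version.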
This is out of scope for this PR; we can address it in a future PR. For now, the purpose of this PR is to resolve an import error and deprecate get_env_variable_attn_backend.
No, the bot is right, don't use os.environ
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
…llm-project#30432) Signed-off-by: Andreas Karatzas <akaratza@amd.com> Signed-off-by: Ubuntu <mjtaheri68@gmail.com>
…llm-project#30432) Signed-off-by: Andreas Karatzas <akaratza@amd.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Summary
Removes the broken dependency on `get_env_variable_attn_backend` from `vllm.attention.selector` in the ROCm platform configuration.
Problem
The import `from vllm.attention.selector import get_env_variable_attn_backend` was causing failures on ROCm. It was used to check for `ROCM_AITER_UNIFIED_ATTN` backend selection when setting the KV cache block size.
Fix
Deprecates the `get_env_variable_attn_backend` check and relies directly on the environment variables (`VLLM_ROCM_USE_AITER_UNIFIED_ATTENTION`, `VLLM_ROCM_USE_AITER`) for block size configuration. Tracking upstream PR #30396 for the proper way to handle attention backend selection going forward.
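The fix described above amounts to gating the block size on the two env flags instead of calling into the attention selector. The sketch below is illustrative only: the helper names (`_env_flag`, `pick_kv_cache_block_size`) and the default block size are assumptions, and the flag parsing follows the `vllm.envs` style discussed in the review rather than raw `os.environ.get()`:

```python
import os

def _env_flag(name: str) -> bool:
    # Boolean parsing in the style of vllm.envs (assumed accepted
    # true-spellings: "true" and "1", case-insensitive).
    return os.environ.get(name, "False").lower() in ("true", "1")

def pick_kv_cache_block_size(default: int = 16) -> int:
    # Hypothetical helper: per the diff, the AITER unified-attention
    # path on ROCm uses a 64-token KV cache block size.
    if (_env_flag("VLLM_ROCM_USE_AITER_UNIFIED_ATTENTION")
            and _env_flag("VLLM_ROCM_USE_AITER")):
        return 64
    return default

os.environ["VLLM_ROCM_USE_AITER"] = "1"
os.environ["VLLM_ROCM_USE_AITER_UNIFIED_ATTENTION"] = "1"
print(pick_kv_cache_block_size())  # 64

os.environ["VLLM_ROCM_USE_AITER"] = "0"
print(pick_kv_cache_block_size())  # 16 -- "0" parses as disabled
```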
Testing
Verified ROCm platform initializes correctly without import errors.