[ROCm][CI] fix get_valid_backends (#32787)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
Code Review
This pull request correctly addresses a potential issue where hasattr might incorrectly evaluate due to the custom __getattr__ implementation in vllm/platforms/interface.py. By using getattr(current_platform.__class__, "get_valid_backends", None), the code now reliably checks for the presence of the get_valid_backends method on the class itself. The platform-specific fallbacks for ROCm and other platforms are also appropriate. This is a good fix that improves the robustness of the backend selection logic.
tjtanaa left a comment:
We will go with this first, however, we should start supporting the new abstraction on ROCm as well and define get_valid_backends in the platform/rocm.py.
```python
get_valid_backends = getattr(current_platform.__class__, "get_valid_backends", None)
if get_valid_backends is None:
    if current_platform.is_rocm():
        return ["TRITON_ATTN"]
```
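The `hasattr` pitfall that motivates this change can be reproduced with a minimal sketch. The `Platform` class below is a hypothetical stand-in for the real class in `vllm/platforms/interface.py`, reduced to the one behavior that matters here: a custom `__getattr__` that returns `None` for missing attributes instead of raising `AttributeError`.

```python
class Platform:
    # Stand-in for the custom __getattr__ described in the review:
    # unknown attributes resolve to None instead of raising AttributeError.
    def __getattr__(self, name):
        return None

p = Platform()

# hasattr() calls getattr() and reports True whenever no exception is
# raised, so every attribute appears to exist on the instance...
assert hasattr(p, "get_valid_backends")
# ...even though the "method" is actually just None.
assert p.get_valid_backends is None

# Looking the name up on the class bypasses the instance-level
# __getattr__, so a missing method is reliably reported as None.
assert getattr(p.__class__, "get_valid_backends", None) is None
```

This is why the fix checks `current_platform.__class__` rather than the instance.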
Can you explain why TRITON_ATTN is used here with a code comment?
@DarkLight1337 Triton is AMD's default attention backend, and we are not compatible with the flash attn backend that is used for the CUDA equivalent tests. Let us know if you'd like us to put a comment there still.
Yes, the purpose of the comment is to explain this to other developers who are less familiar with ROCm.
Signed-off-by: Divakar Verma <divakar.verma@amd.com> Signed-off-by: mohammad najafi <mohammad.najafi@amd.com>
Signed-off-by: Divakar Verma <divakar.verma@amd.com> Signed-off-by: 陈建华 <1647430658@qq.com>
This PR fixes the `get_valid_backends` check.

Details: `__getattr__` is overridden in interface.py and returns `None` if the attribute is not found, instead of raising an `AttributeError`.

Test: `pytest -s -v tests/v1/spec_decode/test_acceptance_length.py`

Note: There is more debugging ongoing; another PR might be needed to fully fix this test. Preliminary analysis hints that the refactored cudagraph PR is causing issues for this test.
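Put together, the fixed check from the diff above could look roughly like the sketch below. The function name and the non-ROCm fallback value (`FLASH_ATTN`) are illustrative assumptions, not taken from the PR; the conversation only states that the CUDA equivalent tests use the flash-attn backend.

```python
def valid_backends_for_test(current_platform):
    # Look the method up on the class to sidestep the instance-level
    # __getattr__, which returns None for missing attributes and would
    # make a plain hasattr() check always succeed.
    get_valid_backends = getattr(
        current_platform.__class__, "get_valid_backends", None)
    if get_valid_backends is None:
        if current_platform.is_rocm():
            # Triton is AMD's default attention backend; the flash-attn
            # backend used by the CUDA equivalent tests is not
            # compatible with ROCm (per the reviewer discussion above).
            return ["TRITON_ATTN"]
        # Assumed fallback for other platforms, for illustration only.
        return ["FLASH_ATTN"]
    return get_valid_backends(current_platform)
```

A platform that later defines `get_valid_backends` on its class (as tjtanaa suggests for `rocm.py`) would automatically take the first branch's alternative and skip the hard-coded fallbacks.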