[Platform] import activation_quant_fusion for CUDA only#23882
[Platform] import activation_quant_fusion for CUDA only#23882vllm-bot merged 1 commit intovllm-project:mainfrom
Conversation
94e8e36 to
16767d0
Compare
There was a problem hiding this comment.
Code Review
The pull request aims to fix an AttributeError on platforms without custom ops by adding hasattr checks. While the intention is correct, the implementation is incomplete and introduces new runtime errors because other parts of the code unconditionally assume these ops are available. The pattern registration logic needs to be made conditional to fully resolve the issue.
16767d0 to
b96e1cf
Compare
|
https://github.com/vllm-project/vllm/blob/main/vllm/compilation/fusion.py#L41-L54 |
|
@elvischenv there is a platform check when import the file, so it's fine with custom platform vllm/vllm/compilation/pass_manager.py Lines 10 to 12 in 006477e And I'm not sure if |
jikunshang
left a comment
There was a problem hiding this comment.
Thanks for fixing this! maybe we need some refactor to avoid op register issue on multiple platforms.
ProExpertProg
left a comment
There was a problem hiding this comment.
This is very ugly, but I understand that we can't set these globals due to other platforms. I think we've already dealt with this in other files and in a nicer way, could you look around briefly?
|
Yeah I see the link now, can you just make the pass a conditional import like it's done for fusion? |
Yes, this is only for cuda. Let's follow this conditional import as @ProExpertProg mentioned. Thanks. |
b96e1cf to
280c59a
Compare
|
@elvischenv @ProExpertProg Done, and tested with vllm-ascend again, it works |
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
280c59a to
8d8c2c5
Compare
…#23882) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
…#23882) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Purpose
The commit #23671 add the custom init logic like
SILU_MUL_OP = torch.ops._C.silu_and_mul.default. But it doesn't work on the platform which doesn't support custom ops. It will raise error when setup vLLM.This PR let activation_quant_fusion be imported only for CUDA case
Test Plan
This is a change for custom platform. No need for new test. and all test should not break. vLLM Ascend e2e test will be happy after this change.
Test Result
Custom platform works well then.