[csrc][bugfix] Add compile-time Ascend950/910_95 compatibility for custom ops between CANN8.5 and 9.0#6936
Conversation
Signed-off-by: linfeng-yuan <1102311262@qq.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses compatibility issues when compiling against different versions of the CANN toolkit (specifically 8.5 and 9.0). It implements a robust mechanism to dynamically detect the specific Ascend SoC enum (either ASCEND950 or ASCEND910_95) available in the installed CANN headers. This detected version is then used consistently throughout the C++ codebase, ensuring that operators and tiling configurations are correctly compiled regardless of the CANN version, thereby preventing build failures. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Code Review
This pull request introduces a compile-time mechanism to handle different SoC enum values between CANN versions, improving build compatibility. The changes are logically sound, but I've found a critical issue in the runtime check that could break compatibility between different compile-time and run-time CANN versions. I've provided suggestions to fix this in csrc/cmake/intf_pub.cmake and csrc/moe_gating_top_k/op_host/moe_gating_top_k_tiling_arch35.cpp.
Additionally, the pull request description is currently empty. Per the repository's style guide, I've drafted a suggested title and summary:
Suggested PR Title:
[Build][BugFix] Fix kernel compilation compatibility for different CANN versions
Suggested PR Summary:
### What this PR does / why we need it?
This PR addresses a compilation issue when building vLLM kernels against different versions of Huawei's CANN toolkit (e.g., 8.5 and 9.0). These versions use different enum values for the Ascend 950 series SoC (`ASCEND910_95` vs `ASCEND950`).
To resolve this, the PR introduces a compile-time check in CMake to detect which enum value is supported by the CANN headers. The result is then passed to the C++ code via preprocessor definitions (`VLLM_ASCEND_950_SOC_ENUM` and `VLLM_ASCEND_950_SOC_CONFIG`). This allows the code to use the correct SoC identifier, ensuring compilation compatibility across different CANN versions.
### Does this PR introduce _any_ user-facing change?
No. This is a build-time compatibility fix and does not change any user-facing APIs or behavior.
### How was this patch tested?
The changes were tested by compiling the kernels against different CANN versions to ensure compatibility. CI tests should be run to confirm correctness and prevent runtime regressions.|
This solution cannot cover the cross-version case of “build with CANN 9.0 + run with CANN 8.5,” and that scenario itself is high-risk. Why: The current selection logic is decided once at build time (config.cmake probes headers and fixes macros); it is not re-evaluated at runtime. |
…to qwen3next_graph * 'main' of https://github.com/vllm-project/vllm-ascend: (40 commits) [Feature] Add docs of batch invariance and make some extra operators patch (vllm-project#6910) [bugfix]Qwen2.5VL accurate question (vllm-project#6975) [CI] Add DeepSeek-V3.2 large EP nightly ci (vllm-project#6378) [Ops][BugFix] Fix RoPE shape mismatch for mtp models with flashcomm v1 enabled (vllm-project#6939) [bugfix]fix file not found error in nightly of single-node (vllm-project#6976) [Bugfix] Fix the acceptance rates dorp issue when applying eagle3 to QuaRot model (vllm-project#6914) [CI] Enable auto upgrade e2e estimated time for auto-partition suites (vllm-project#6840) [Doc][Misc] Fix msprobe_guide.md documentation issues (vllm-project#6965) [Nightly][Refactor]Migrate nightly single-node model tests from `.py` to `.yaml` (vllm-project#6503) [BugFix] Improve GDN layer detection for multimodal models (vllm-project#6941) [feat]ds3.2 pcp support mtp and chunkprefill (vllm-project#6917) [CPU binding] Implement global CPU slicing and improve IRQ binding for Ascend NPUs (vllm-project#6945) [Triton] Centralize Ascend extension op dispatch in triton_utils (vllm-project#6937) [csrc][bugfix] Add compile-time Ascend950/910_95 compatibility for custom ops between CANN8.5 and 9.0 (vllm-project#6936) [300I][Bugfix] fix unquant model weight nd2nz error (vllm-project#6851) [doc] fix supported_models (vllm-project#6930) [CI] nightly test timeout (vllm-project#6912) [CI] Upgrade CANN to 8.5.1 (vllm-project#6897) [Model]Add Qwen3-Omni quantization Ascend NPU adaptation and optimization (vllm-project#6828) [P/D][v0.16.0]Adapt to RecomputeScheduler in vLLM 0.16.0 (vllm-project#6898) ...
…stom ops between CANN8.5 and 9.0 (vllm-project#6936) ### What this PR does / why we need it? Remove hardcoded ASCEND910_95 usage in csrc custom-op host/tiling code and select the SoC target at CMake configure time. - Probe CANN headers with check_cxx_source_compiles: prefer platform_ascendc::SocVersion::ASCEND950, fallback to ASCEND910_95. - Export the selected enum/config string via shared compile definitions (VLLM_ASCEND_950_SOC_ENUM / VLLM_ASCEND_950_SOC_CONFIG). - Apply the shared macros to affected paths (moe_gating_top_k, add_rms_norm_bias) to avoid per-file hardcoding. - Keep behavior unchanged; this is an internal build-compatibility fix for CANN 8.5 and 9.x. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? - vLLM version: v0.16.0 - vLLM main: vllm-project/vllm@15d76f7 --------- Signed-off-by: linfeng-yuan <1102311262@qq.com>
…stom ops between CANN8.5 and 9.0 (vllm-project#6936) ### What this PR does / why we need it? Remove hardcoded ASCEND910_95 usage in csrc custom-op host/tiling code and select the SoC target at CMake configure time. - Probe CANN headers with check_cxx_source_compiles: prefer platform_ascendc::SocVersion::ASCEND950, fallback to ASCEND910_95. - Export the selected enum/config string via shared compile definitions (VLLM_ASCEND_950_SOC_ENUM / VLLM_ASCEND_950_SOC_CONFIG). - Apply the shared macros to affected paths (moe_gating_top_k, add_rms_norm_bias) to avoid per-file hardcoding. - Keep behavior unchanged; this is an internal build-compatibility fix for CANN 8.5 and 9.x. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? - vLLM version: v0.16.0 - vLLM main: vllm-project/vllm@15d76f7 --------- Signed-off-by: linfeng-yuan <1102311262@qq.com>
… compatibility for custom ops between CANN8.5 and 9.0 (#7116) cherry-pick from #6936 to fix pip install error ### What this PR does / why we need it? Remove hardcoded ASCEND910_95 usage in csrc custom-op host/tiling code and select the SoC target at CMake configure time. - Probe CANN headers with check_cxx_source_compiles: prefer platform_ascendc::SocVersion::ASCEND950, fallback to ASCEND910_95. - Export the selected enum/config string via shared compile definitions (VLLM_ASCEND_950_SOC_ENUM / VLLM_ASCEND_950_SOC_CONFIG). - Apply the shared macros to affected paths (moe_gating_top_k, add_rms_norm_bias) to avoid per-file hardcoding. - Keep behavior unchanged; this is an internal build-compatibility fix for CANN 8.5 and 9.x. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Signed-off-by: linfeng-yuan <1102311262@qq.com> Co-authored-by: linfeng-yuan <1102311262@qq.com> Co-authored-by: weijinqian0 <1184188277@qq.com>
What this PR does / why we need it?
Remove hardcoded ASCEND910_95 usage in csrc custom-op host/tiling code and
select the SoC target at CMake configure time.
prefer platform_ascendc::SocVersion::ASCEND950, fallback to ASCEND910_95.
Does this PR introduce any user-facing change?
No.
How was this patch tested?