[bugfix] fix test_camem failed with triton-ascend #5492
wangxiyuan merged 2 commits into vllm-project:main
Code Review
This pull request refactors the codebase to address an e2e test failure involving Triton. The core change involves centralizing the import of torch_npu._inductor from multiple Triton kernel files into a single location within the NPUWorker's initialization. This side-effect-only import is now performed once when a worker starts, ensuring proper and timely initialization before any Triton operations are executed. This change improves code structure and resolves potential issues arising from multiple or improperly timed imports. The implementation appears correct and aligns with the project's established coding practices.
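The pattern summarized above, a guarded, side-effect-only import performed once at worker start-up, can be sketched roughly as follows. This is only an illustration: `init_side_effect_imports` and its parameters are hypothetical names that do not come from vllm-ascend, and a standard-library module can stand in for `torch_npu._inductor` so the snippet runs without NPU software installed.

```python
import importlib
import sys


def init_side_effect_imports(enabled: bool, module_name: str) -> bool:
    """Import `module_name` purely for its side effects when `enabled` is true.

    Hypothetical helper for illustration only. In the PR, the corresponding
    logic is a plain `if HAS_TRITON: import torch_npu._inductor` statement
    placed in the worker's initialization.
    Returns True if the module is loaded after the call, False otherwise.
    """
    if not enabled:
        return False
    # import_module reuses the cached entry in sys.modules on repeated calls,
    # so the module's top-level code (its side effects) runs only once.
    importlib.import_module(module_name)
    return module_name in sys.modules
```

Calling the helper once from a single start-up path, rather than importing the module at the top of every kernel file, keeps the timing of the side effect explicit.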
d1bffdc to 41cd56e
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
adapt_patch()
from vllm.triton_utils import HAS_TRITON
if HAS_TRITON:
    import torch_npu._inductor  # noqa: F401
Would you mind adding a note explaining why this import is needed?
Done! I've added a comment explaining.
if HAS_TRITON:
    import torch_npu._inductor  # noqa: F401
from vllm.triton_utils import tl, triton
It seems we need to add a check that forbids importing torch_npu._inductor in vllm_ascend/ops/triton/, similar to:
https://github.com/vllm-project/vllm/blob/main/.pre-commit-config.yaml#L132
This import of torch_npu._inductor fixes graph mode running errors with triton-ascend. Once it is added in the worker, the issue no longer occurs, so future Triton ops won't need similar imports in ops/triton. Therefore, we don't need a dedicated pre-commit check for this.
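For reference, should such a check ever be wanted, a minimal sketch of what it could look like is below. It is not part of this PR or of the linked pre-commit configuration; the function name `find_forbidden_imports` and the regular expression are assumptions made for illustration, and the pattern only catches imports that start a line (it would miss, for example, a comma-separated `import a, torch_npu._inductor`).

```python
import re
from pathlib import Path

# Matches `import torch_npu._inductor` and `from torch_npu._inductor import ...`
# at the start of a line, allowing leading indentation.
FORBIDDEN = re.compile(r"^\s*(?:import|from)\s+torch_npu\._inductor\b")


def find_forbidden_imports(root: Path) -> list[tuple[Path, int]]:
    """Return (file, line number) pairs for forbidden imports under `root`."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if FORBIDDEN.match(line):
                hits.append((path, lineno))
    return hits
```

A hook could call `find_forbidden_imports(Path("vllm_ascend/ops/triton"))` and fail when the returned list is non-empty.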
Please fill in the commit message and change the commit title to something meaningful.
Thanks, I have modified it.
03d96a9 to 723aed4
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
### What this PR does / why we need it?
This fixes a bug that occurred when running `test_camem.py` in the triton-ascend environment: `NPU function error: aclrtGetMemInfo(ACL_HBM_MEM, &device_free, &device_total)`

- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@5326c89

---------

Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
What this PR does / why we need it?
This fixes a bug that occurred when running `test_camem.py` in the triton-ascend environment: `NPU function error: aclrtGetMemInfo(ACL_HBM_MEM, &device_free, &device_total)`
Does this PR introduce any user-facing change?
How was this patch tested?