👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request applies a formatting fix to resolve a linting issue in `vllm_ascend/platform.py`. The change refactors a long `if` condition into multiple lines for better readability, without altering any functionality. While reviewing the change, I noticed a potential bug in the logic that constructs the `PYTORCH_NPU_ALLOC_CONF` environment variable, which could lead to a malformed value. I've added a comment with a suggestion to fix it. Otherwise, the formatting change is correct.
        and "max_split_size_mb" not in npu_alloc_configs
        and "garbage_collection_threshold" not in npu_alloc_configs
    ):
        npu_alloc_configs += ",expandable_segments:True"
The current logic for appending `,expandable_segments:True` can lead to a malformed environment variable if `PYTORCH_NPU_ALLOC_CONF` is an empty string. In that case, `npu_alloc_configs` would become `',expandable_segments:True'`, which is likely incorrect due to the leading comma. It's better to handle the case where `npu_alloc_configs` is empty separately.
    - npu_alloc_configs += ",expandable_segments:True"
    + if npu_alloc_configs:
    +     npu_alloc_configs += ",expandable_segments:True"
    + else:
    +     npu_alloc_configs = "expandable_segments:True"
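The suggestion above can be sketched as a small standalone helper. This is only an illustration, not the code in `vllm_ascend/platform.py`: the function name `with_expandable_segments` and the constants are hypothetical, the two blocking keys are the ones visible in the reviewed condition, and the duplicate check on `expandable_segments` is an assumption added for the example.

```python
OPTION = "expandable_segments:True"

# Keys visible in the reviewed condition; when either is present,
# the option is not appended.
BLOCKING_KEYS = ("max_split_size_mb", "garbage_collection_threshold")


def with_expandable_segments(conf: str) -> str:
    """Return conf with OPTION appended, never producing a leading comma.

    Hypothetical helper for illustration only.
    """
    if any(key in conf for key in BLOCKING_KEYS):
        return conf
    # Assumption (not confirmed by the snippet): skip if already configured.
    if "expandable_segments" in conf:
        return conf
    # An empty string gets the bare option; anything else gets ",option".
    return f"{conf},{OPTION}" if conf else OPTION
```

Called with the current value of the variable, for example `with_expandable_segments(os.environ.get("PYTORCH_NPU_ALLOC_CONF", ""))`, the helper returns a value that is safe to write back, including when the variable is unset or empty.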
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (110 commits)
  [Performance] Remove index opetation when VLLM_ASCEND_FLASHCOMM2_PARALLEL_SIZE=1 (vllm-project#5936)
  [main][bugfix] fix mooncake kv cache transfer when one P has multi nodes (vllm-project#5960)
  [Feature] Adapt DispathGmmCombineDecode opertor to align with weight scale dtype of small operators. [RFC: issue 5476] (vllm-project#5755)
  [Refactor] Move AttentionSpec initialization to Attention module (vllm-project#5834)
  [EPLB][Bugfix] policy_swift_balancer bugfix and renaming (vllm-project#5897)
  [CI]fix for lint CI (vllm-project#5982)
  [Fusion] [Graph]Add Matmul Allreduce Rmsnorm fusion Pass (vllm-project#5034)
  [Refactor] Migrate profiler config from env vars to explicit ProfilerConfig (vllm-project#5928)
  [EPLB][Bugfix] Dispatch Allgather use log2phy if enable eplb (vllm-project#5933)
  [EPLB][Nightly][Bugfix] Get expert from moe layer only (vllm-project#5908)
  [Bugfix][MM] Fix multi-modal inference OOM issues by setting `expandable_segments:True` (vllm-project#5855)
  [doc]Table split (vllm-project#5929)
  [Doc] Upgrade outdated ut doc (vllm-project#5937)
  [Lint]Style: Convert `vllm-ascend/` to ruff format(Batch vllm-project#2) (vllm-project#5977)
  Eagle3 mm support, enablement on qwen3vl (vllm-project#4848)
  [Doc] Remove Chinese characters from the icons in the doc. (vllm-project#5959)
  [P/D]The issue of solving the force-free secondary release request, which causes the node to crash. (vllm-project#5968)
  [Feature] Support fine-grained shared expert overlap (vllm-project#5482)
  [Bugfix] fix cpu offload hang with tp=1 (vllm-project#5963)
  [Feature]: Support 310P device run qwen2.5/3 dense and qwen2.5vl models (vllm-project#5776)
  ...
…to qwen3next_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (637 commits)
  ...
### What this PR does / why we need it?
fix lint CI
- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@2c24bc6

Signed-off-by: MrZ20 <2609716663@qq.com>

### What this PR does / why we need it?
fix lint CI
- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@2c24bc6

Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
What this PR does / why we need it?
fix lint CI
Does this PR introduce any user-facing change?
How was this patch tested?