[0.13.0][Bugfix] Fix setting of speculative_config.enforce_eager for dsv32#5958
Conversation
…r dsv32 Signed-off-by: Zetong Li <slippersss@126.com>
There was a problem hiding this comment.
Code Review
This pull request aims to fix an issue with speculative_config.enforce_eager being incorrectly forced for deepseek_v32 models on the Ascend platform, which supports graph mode. The change correctly overrides this setting. However, the implementation introduces a potential AttributeError that could lead to a crash. I have provided a critical-severity comment with a suggested code change to make the logic more robust and prevent this potential runtime error.
| if model_config and speculative_config and \ | ||
| hasattr(model_config.hf_text_config, "model_type") and \ | ||
| model_config.hf_text_config.model_type == "deepseek_v32" and \ | ||
| speculative_config.enforce_eager: | ||
| speculative_config.enforce_eager = False |
There was a problem hiding this comment.
The condition to check for the deepseek_v32 model type is not fully robust. It could raise an AttributeError if model_config.hf_text_config is None, which would cause the application to crash. It's safer to use getattr to access nested attributes with a default value to prevent such crashes. I've refactored the logic to be more robust and readable.
| if model_config and speculative_config and \ | |
| hasattr(model_config.hf_text_config, "model_type") and \ | |
| model_config.hf_text_config.model_type == "deepseek_v32" and \ | |
| speculative_config.enforce_eager: | |
| speculative_config.enforce_eager = False | |
| if model_config and speculative_config and speculative_config.enforce_eager: | |
| hf_text_config = getattr(model_config, "hf_text_config", None) | |
| if (hf_text_config and | |
| getattr(hf_text_config, "model_type", None) == "deepseek_v32"): | |
| speculative_config.enforce_eager = False |
…lm-ascend into FIA_v0.13.0 * 'releases/v0.13.0' of https://github.com/vllm-project/vllm-ascend: [0.13.0][Bugfix] Fix setting of `speculative_config.enforce_eager` for dsv32 (vllm-project#5958) [v0.13.0][Bugfix] Fix XliteModelRunner init failed when aclgraph is enabled (vllm-project#5887) [0.13.0][Bugfix] Fixed an problem related to embeddings sharing (vllm-project#5972) [Bugfix]Fixed precision issues caused by pooled request pooling (vllm-project#6057) [0.13.0][Bugfix] fix pcp aclgraph qwen FIA bug (vllm-project#6038) [0.13.0][cherry-pick][bugfix] fix bug of triton mrope (vllm-project#6009) 【0.13.0】【bugfix】Resolved memory deallocation failure in the pooling layer under re-computation workloads. (vllm-project#6056)
…r dsv32 (vllm-project#5958) ### What this PR does / why we need it? This PR is cherry-picked from vllm-project#5945. We leave only changes in platform.py. This PR aims to fix setting of `speculative_config.enforce_eager` in deepseek v3.2 mtp. The point is that, vllm sets `speculative_config.enforce_eager` as True if using deepseek_v32 with mtp. Since we support graph mode, we simply ignore it here. However, this fix will also implicitly ignore user setting of `speculative_config.enforce_eager`, we need to take care and remove it once vllm supports this feature. ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? by ci Signed-off-by: Zetong Li <slippersss@126.com> Signed-off-by: Mercykid-bash <ruanche0218@gmail.com>
…r dsv32 (vllm-project#5958) ### What this PR does / why we need it? This PR is cherry-picked from vllm-project#5945. We leave only changes in platform.py. This PR aims to fix setting of `speculative_config.enforce_eager` in deepseek v3.2 mtp. The point is that, vllm sets `speculative_config.enforce_eager` as True if using deepseek_v32 with mtp. Since we support graph mode, we simply ignore it here. However, this fix will also implicitly ignore user setting of `speculative_config.enforce_eager`, we need to take care and remove it once vllm supports this feature. ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? by ci Signed-off-by: Zetong Li <slippersss@126.com>
…r dsv32 (vllm-project#5958) ### What this PR does / why we need it? This PR is cherry-picked from vllm-project#5945. We leave only changes in platform.py. This PR aims to fix setting of `speculative_config.enforce_eager` in deepseek v3.2 mtp. The point is that, vllm sets `speculative_config.enforce_eager` as True if using deepseek_v32 with mtp. Since we support graph mode, we simply ignore it here. However, this fix will also implicitly ignore user setting of `speculative_config.enforce_eager`, we need to take care and remove it once vllm supports this feature. ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? by ci Signed-off-by: Zetong Li <slippersss@126.com>
What this PR does / why we need it?
This PR is cherry-picked from #5945. We leave only changes in platform.py.
This PR aims to fix setting of
speculative_config.enforce_eagerin deepseek v3.2 mtp. The point is that, vllm setsspeculative_config.enforce_eageras True if using deepseek_v32 with mtp. Since we support graph mode, we simply ignore it here. However, this fix will also implicitly ignore user setting ofspeculative_config.enforce_eager, we need to take care and remove it once vllm supports this feature.Does this PR introduce any user-facing change?
N/A
How was this patch tested?
by ci