[Bugfix] fix greedy temperature detection #5417
MengqingCao merged 3 commits into vllm-project:main
Conversation
Signed-off-by: realliujiaxu <realliujiaxu@163.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request correctly fixes a bug in greedy temperature detection by changing GREEDY_TEMPERATURE from -1 to 0. This change aligns the behavior with the standard convention where a temperature of 0 indicates greedy sampling. The logic correctly handles this by replacing the temperature with 1 before division to avoid divide-by-zero errors, while still identifying greedy requests correctly. I have one suggestion to improve type consistency, which is important for robustness in numerical code.
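The pattern the review describes can be sketched in plain Python. This is a hypothetical, simplified illustration of the convention, not the actual vLLM sampler code; the name `scale_logits` and the list-based representation are made up for this example:

```python
GREEDY_TEMPERATURE = 0.0  # 0 means greedy sampling, per the PR's convention


def scale_logits(logits, temperature):
    """Divide logits by temperature, treating temperature == 0 as greedy.

    Returns the scaled logits and a flag marking the request as greedy.
    """
    is_greedy = temperature == GREEDY_TEMPERATURE
    # Replace 0 with 1 before dividing to avoid divide-by-zero; the
    # greedy flag is what downstream code uses to pick argmax instead.
    safe_temperature = 1.0 if is_greedy else temperature
    return [x / safe_temperature for x in logits], is_greedy
```

A greedy request (temperature 0) leaves the logits unscaled, while a sampled request divides through as usual.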
```diff
  PLACEHOLDER_TOKEN_ID = -1
- GREEDY_TEMPERATURE = -1
+ GREEDY_TEMPERATURE = 0
```
For consistency and to avoid potential floating-point comparison issues, it's better to define GREEDY_TEMPERATURE as a float (0.0). Temperature values are typically floats, and the corresponding test file (tests/ut/sample/test_rejection_sampler.py) also defines this constant as 0.0. While 0 == 0.0 evaluates to true in most cases, using the same type improves code clarity and robustness, which is critical in a numerical library.
```diff
- GREEDY_TEMPERATURE = 0
+ GREEDY_TEMPERATURE = 0.0
```
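The type-consistency point is easy to verify in isolation. A minimal demonstration (standalone, not vLLM code):

```python
# Python's == treats int 0 and float 0.0 as equal, so the int
# constant happens to work for the greedy check...
assert 0 == 0.0

# ...but defining the constant as a float matches the float dtype of
# temperature values (and the 0.0 used in test_rejection_sampler.py).
GREEDY_TEMPERATURE = 0.0
assert isinstance(GREEDY_TEMPERATURE, float)
```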
Do we need this change in inputbatch?
```diff
  PLACEHOLDER_TOKEN_ID = -1
- GREEDY_TEMPERATURE = -1
+ GREEDY_TEMPERATURE = 0
```
Import from vLLM directly?
After checking, we don't need this change. Just ignore this comment.
Signed-off-by: realliujiaxu <realliujiaxu@163.com>
CI failed due to the infrastructure upgrade; I think this PR is tested enough.
…to eplb_refactor

* 'main' of https://github.com/vllm-project/vllm-ascend: (46 commits)
  [Feature] Support to use fullgraph with eagle (vllm-project#5118)
  [EPLB][refactor] Modification of the initialization logic for expert_map and log2phy(depend on pr5285) (vllm-project#5311)
  [Refactor]6/N Extract common code of class AscendMLAImpl (vllm-project#5314)
  [Refactor] cache cos/sin in mla & remove parameter model in builder. (vllm-project#5277)
  update vllm pin to 12.27 (vllm-project#5412)
  [ReleaseNote] Add release note for v0.13.0rc1 (vllm-project#5334)
  [Bugfix] Correctly handle the output shape in multimodal attention (vllm-project#5443)
  Fix nightly (vllm-project#5413)
  [bugfix] fix typo of _skip_all_reduce_across_dp_group (vllm-project#5435)
  [Doc] modify pcp tutorial doc (vllm-project#5440)
  [Misc] fast fail for exiting if tools/install_flash_infer_attention_score_ops_a2.sh (vllm-project#5422)
  [Doc] Update DeepSeek V3.1/R1 2P1D doc (vllm-project#5387)
  [DOC] Fix model weight download links (vllm-project#5436)
  [Doc] Modify DeepSeek-R1/V3.1 documentation (vllm-project#5426)
  Revert "[feat] enable hierarchical mc2 ops on A2 by default (vllm-project#5300)" (vllm-project#5434)
  [Bugfix] fix greedy temperature detection (vllm-project#5417)
  [doc] Update Qwen3-235B doc for reproducing latest performance (vllm-project#5323)
  [feat] enable hierarchical mc2 ops on A2 by default (vllm-project#5300)
  [Doc] delete environment variable HCCL_OP_EXPANSION_MODE in DeepSeekV3.1/R1 (vllm-project#5419)
  [Doc] add long_sequence feature user guide (vllm-project#5343)
  ...
### What this PR does / why we need it?

fix greedy temperature detection from vllm-project/vllm#27077

- vLLM version: release/v0.13.0
- vLLM main: vllm-project/vllm@81786c8

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
### Does this PR introduce any user-facing change?

No

### How was this patch tested?