[Sampler] Support returning final logprobs#22387
Conversation
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Code Review
This PR introduces a new sampled_logprobs mode. I've identified a critical issue in vllm/v1/sample/sampler.py where the sample() method returns inconsistent types, and a high-severity issue in vllm/v1/sample/ops/topk_topp_sampler.py where the CUDA implementation's behavior is inconsistent with the native and TPU implementations. These issues could lead to incorrect log-probability values and subtle bugs.
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for a new logprobs_mode, final_logprobs, which allows retrieving log probabilities after all sampling processors (including temperature, top-k, and top-p) have been applied. The overall implementation is good, but I've identified a critical issue that could lead to incorrect sampling results due to an in-place modification of the logits tensor. Additionally, there's a minor error in the documentation for the new mode. My review provides specific suggestions to address these points.
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Duncan Moss <djm.moss@gmail.com>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Xiao Yu <xiao.yu@amd.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it? 1. use action/checkout@v5 instead of v4 2. remove dbo test case because there is issue with it and will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by vllm-project/vllm#22387 6. fix qwen3 moe config changes intruoduced by vllm-project/vllm#20562 7. fix kvcache block changes introduced by vllm-project/vllm#23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: vllm-project/vllm@0c6e40b --------- Signed-off-by: MengqingCao <cmq0113@163.com>
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.Purpose
RL workflows may need to get the final logprobs, after applying all logit processors including top_k/top_p, temperature. This PR updates the
logprobs_mode'sprocessed_logprobs,processed_logitsto return the final values.Test Plan
pytest tests/v1/sample/test_logprobs.py -k 'test_logprobs_mode'Test Result
Unit tests passed
(Optional) Documentation Update
Added detailed sampling flow illustration in v1
Samplerclass.Updated v1 guide.
https://vllm--22387.org.readthedocs.build/en/22387/api/vllm/v1/sample/sampler.html#vllm.v1.sample.sampler.Sampler
https://vllm--22387.org.readthedocs.build/en/22387/usage/v1_guide.html#logprobs-calculation