[Model Runner V2] Bug fix: logprob dtype int64/int32 issue #41761
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Hi @yewentao256, the pre-commit checks have failed. Please run:
uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
Code Review
This pull request modifies vllm/v1/worker/gpu/sample/logprob.py to refine the handling of top-k logprobs, including initializing topk_token_ids from the GPU state and applying type casting in both host code and the Triton kernel. Feedback suggests explicitly setting the dtype of logprob_token_ids to int64 to ensure memory safety and consistency with the kernel's 8-byte write operations.
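The memory-safety concern in this review can be illustrated with a small standard-library analogy. This is a sketch only, not the vLLM or Triton code: the helper `write_ids` and the sample token IDs are hypothetical, and `struct.pack_into` stands in for the kernel's 8-byte stores. It shows that a buffer sized for 4-byte (int32) elements cannot hold the same number of 8-byte (int64) writes, which is why allocating the destination as int64 keeps the allocation consistent with the writer.

```python
import struct


def write_ids(buf: bytearray, ids: list[int], fmt: str) -> None:
    """Write each id at consecutive offsets using the struct format `fmt`.

    Hypothetical helper: it mimics a writer that stores one element per
    index, with the element width fixed by `fmt`.
    """
    size = struct.calcsize(fmt)
    for i, token_id in enumerate(ids):
        struct.pack_into(fmt, buf, i * size, token_id)


ids = [101, 2023, 50256]  # made-up token ids for illustration

# Destination sized for 4-byte (int32) elements: 3 * 4 = 12 bytes.
int32_buf = bytearray(len(ids) * struct.calcsize("<i"))
# Destination sized for 8-byte (int64) elements: 3 * 8 = 24 bytes.
int64_buf = bytearray(len(ids) * struct.calcsize("<q"))

# 8-byte writes into the int64-sized buffer fit exactly.
write_ids(int64_buf, ids, "<q")

# The same 8-byte writes into the int32-sized buffer run past its end.
# Python raises here; a GPU kernel would not necessarily detect the
# overrun and could write into adjacent memory.
try:
    write_ids(int32_buf, ids, "<q")
    overflowed = False
except struct.error:
    overflowed = True

round_trip = list(struct.unpack_from("<3q", int64_buf))
```

Python's bounds check makes the mismatch visible as an exception, which is the point of the analogy: the element width used by the writer and the width used to size the buffer have to agree.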
Signed-off-by: Nick Hill <nickhill123@gmail.com>
njhill left a comment:
Thanks @yewentao256. I pushed a couple of minor adjustments and also added a validation check and test, since I was digging into the behavior here.
…ect#41761) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nick Hill <nickhill123@gmail.com> Co-authored-by: Nick Hill <nickhill123@gmail.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Purpose
Part of #41286
Originally
VLLM_USE_V2_MODEL_RUNNER=1 pytest tests/entrypoints/openai/generative_scoring/test_generative_scoring_e2e.py::TestGenerativeScoringAPI::test_basic_score_and_response_structure
Now
================================== 1 passed, 16 warnings in 22.28s ==================================