[Model Runner V2] Bug fix: logprob dtype int64/int32 issue #41761

Merged
yewentao256 merged 4 commits into main from wentao-fix-mrv2-logprob-dtype-issue
May 11, 2026

Conversation

@yewentao256
Member

Purpose

Part of #41286

Originally

VLLM_USE_V2_MODEL_RUNNER=1 pytest tests/entrypoints/openai/generative_scoring/test_generative_scoring_e2e.py::TestGenerativeScoringAPI::test_basic_score_and_response_structure

(EngineCore pid=3942696)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/triton/runtime/jit.py", line 720, in run
(EngineCore pid=3942696)     kernel = self._do_compile(key, signature, device, constexprs, options, attrs, warmup)
(EngineCore pid=3942696)              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=3942696)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/triton/runtime/jit.py", line 849, in _do_compile
(EngineCore pid=3942696)     kernel = self.compile(src, target=target, options=options.__dict__)
(EngineCore pid=3942696)              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=3942696)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/triton/compiler/compiler.py", line 304, in compile
(EngineCore pid=3942696)     module = src.make_ir(target, options, codegen_fns, module_map, context)
(EngineCore pid=3942696)              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=3942696)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/triton/compiler/compiler.py", line 80, in make_ir
(EngineCore pid=3942696)     return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
(EngineCore pid=3942696)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=3942696) triton.compiler.errors.CompilationError: at 32:4:
(EngineCore pid=3942696)     sampled = tl.load(sampled_token_ids_ptr + batch_idx)
(EngineCore pid=3942696)     tl.store(out_token_ids_ptr + batch_idx * out_token_ids_stride, sampled)
(EngineCore pid=3942696)     tl.store(out_valid_mask_ptr + batch_idx * out_valid_mask_stride, 1)
(EngineCore pid=3942696) 
(EngineCore pid=3942696)     req_state_idx = tl.load(expanded_idx_mapping_ptr + batch_idx)
(EngineCore pid=3942696)     num_custom = tl.load(num_per_req_token_ids_ptr + req_state_idx)
(EngineCore pid=3942696) 
(EngineCore pid=3942696)     col = tl.arange(0, PADDED_COLS)
(EngineCore pid=3942696)     tid_base = out_token_ids_ptr + batch_idx * out_token_ids_stride + 1
(EngineCore pid=3942696)     mask_base = out_valid_mask_ptr + batch_idx * out_valid_mask_stride + 1
(EngineCore pid=3942696) 
(EngineCore pid=3942696)     if num_custom > 0:
(EngineCore pid=3942696)     ^
(EngineCore pid=3942696) AssertionError('Mismatched type for src between then block (pointer<int32>) and else block (pointer<int64>)')
(APIServer pid=3942045) INFO:     Shutting down
(APIServer pid=3942045) INFO:     Waiting for application shutdown.
(APIServer pid=3942045) INFO:     Application shutdown complete.
(APIServer pid=3942045) INFO:     Finished server process [3942045]
================================== 1 failed, 16 warnings in 21.59s ==================================
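
The assertion in the log above says that a variable named `src` ended up as `pointer<int32>` in one branch of the `if` and `pointer<int64>` in the other. The following is only a toy model in plain Python of that kind of branch-type check, not Triton's actual implementation; `PointerType`, `merge_branches`, `cast`, `custom_ids`, and `topk_ids` are hypothetical names introduced for illustration. It sketches why the merge fails and how casting one side to a common element type would let it succeed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointerType:
    """Toy stand-in for a typed device pointer such as pointer<int32>."""

    elem: str

    def __str__(self) -> str:
        return f"pointer<{self.elem}>"


def merge_branches(name: str, then_t: PointerType, else_t: PointerType) -> PointerType:
    """Return the single type a variable has after an if/else, or raise."""
    if then_t != else_t:
        raise TypeError(
            f"Mismatched type for {name} between then block ({then_t}) "
            f"and else block ({else_t})"
        )
    return then_t


def cast(ptr: PointerType, elem: str) -> PointerType:
    """Model an explicit pointer cast to a chosen element type."""
    return PointerType(elem)


# Hypothetical buffers: one holds int32 ids, the other int64 ids.
custom_ids = PointerType("int32")
topk_ids = PointerType("int64")

try:
    merge_branches("src", custom_ids, topk_ids)
    error = None
except TypeError as exc:
    error = str(exc)

# Casting one side so both branches agree lets the merge succeed.
merged = merge_branches("src", cast(custom_ids, "int64"), topk_ids)
print(error)
print(merged)
```

In this model the fix is to pick one element type and cast before the branches join; whether the real change casts on the host, in the kernel, or both is described only at a high level in this thread.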

Now

================================== 1 passed, 16 warnings in 22.28s ==================================

Signed-off-by: yewentao256 <zhyanwentao@126.com>

@yewentao256 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on May 5, 2026
@mergify
Contributor

mergify Bot commented May 5, 2026

Hi @yewentao256, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

@mergify mergify Bot added the "v1" and "bug" (Something isn't working) labels on May 5, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request modifies vllm/v1/worker/gpu/sample/logprob.py to refine the handling of top-k logprobs, including initializing topk_token_ids from the GPU state and applying type casting in both host code and the Triton kernel. Feedback suggests explicitly setting the dtype of logprob_token_ids to int64 to ensure memory safety and consistency with the kernel's 8-byte write operations.
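
The memory-safety concern in the review summary can be illustrated without a GPU. The sketch below uses only Python's standard `struct` module and is not the vLLM code: `alloc`, `store_ids`, and `load_ids` are hypothetical helpers, and the byte buffers stand in for device tensors. It assumes a writer that emits 8-byte (int64) values, and shows that a buffer sized for 4-byte (int32) elements is too small for the same number of ids.

```python
import struct

INT32 = struct.Struct("<i")  # 4-byte signed integer
INT64 = struct.Struct("<q")  # 8-byte signed integer


def alloc(num_slots: int, elem: struct.Struct) -> bytearray:
    """Allocate a zeroed buffer with room for num_slots elements of type elem."""
    return bytearray(num_slots * elem.size)


def store_ids(buf: bytearray, ids: list[int], elem: struct.Struct) -> None:
    """Write each id as `elem`, advancing by elem.size bytes per slot."""
    for i, token_id in enumerate(ids):
        elem.pack_into(buf, i * elem.size, token_id)


def load_ids(buf: bytearray, count: int, elem: struct.Struct) -> list[int]:
    """Read back `count` ids stored as `elem`."""
    return [elem.unpack_from(buf, i * elem.size)[0] for i in range(count)]


ids = [11, 22, 33]

# Matching dtypes: buffer sized for int64, written as int64.
ok_buf = alloc(len(ids), INT64)
store_ids(ok_buf, ids, INT64)
roundtrip = load_ids(ok_buf, len(ids), INT64)

# Mismatch: buffer sized for int32 (12 bytes), written as int64 (needs 24).
small_buf = alloc(len(ids), INT32)
try:
    store_ids(small_buf, ids, INT64)
    overflowed = False
except struct.error:
    # Python bounds-checks the write; a GPU kernel would not.
    overflowed = True

print(roundtrip, overflowed)
```

Python raises here because `pack_into` checks the buffer length. A device kernel has no such guard, which is presumably why the reviewer asks for the output tensor's dtype to be set explicitly to match what the kernel writes.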

Comment thread vllm/v1/worker/gpu/sample/logprob.py
Member Author

@yewentao256 yewentao256 left a comment


CC @njhill

njhill added 2 commits May 11, 2026 12:27
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
@njhill njhill requested a review from NickLucche as a code owner May 11, 2026 19:59
Member

@njhill njhill left a comment


Thanks @yewentao256. I pushed a couple of minor adjustments and also added a validation check and test, since I was digging into the behavior here.

@yewentao256 yewentao256 enabled auto-merge (squash) May 11, 2026 20:28
@yewentao256 yewentao256 merged commit d7af6b3 into main May 11, 2026
69 checks passed
@yewentao256 yewentao256 deleted the wentao-fix-mrv2-logprob-dtype-issue branch May 11, 2026 21:55
weifang231 pushed a commit to weifang231/eb-vllm that referenced this pull request May 13, 2026
…ect#41761)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
…ect#41761)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
…ect#41761)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
@njhill njhill added the v2 label May 20, 2026
h1t35h pushed a commit to h1t35h/vllm that referenced this pull request May 21, 2026
…ect#41761)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

Labels

bug (Something isn't working), ready (ONLY add when PR is ready to merge/full CI is needed), v1, v2


2 participants