# [Bugfix][V1] Warm up slot mapping before JIT monitor (#61)
## Purpose
vllm-project#40137 added Triton JIT monitoring that activates after warmup finishes. However, the V1 warmup path (`_dummy_run()`) never calls `BlockTable.compute_slot_mapping()`, so when the first real request arrives, `_compute_slot_mapping_kernel` compiles while the JIT monitor is already active, and users see an unexpected compilation warning during normal inference.

The problem has two sides. First, V1 warmup simply does not exercise the slot mapping path. Second, `_compute_slot_mapping_kernel` was specialized on the `num_tokens` parameter, so even if warmup compiled it for one token count, a request of a different size could trigger recompilation; the sketch below illustrates this.
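As a minimal illustration (this toy kernel and its names are hypothetical, not the vLLM kernel): Triton specializes integer arguments, for example on equality to 1 and on divisibility by 16, so launches whose `num_tokens` falls into a different specialization bucket can compile a separate kernel variant.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def fill_kernel(out_ptr, num_tokens, BLOCK: tl.constexpr):
    # Write indices for the first num_tokens positions.
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    tl.store(out_ptr + offs, offs, mask=offs < num_tokens)

out = torch.empty(64, dtype=torch.int32, device="cuda")
for n in (1, 5, 16):  # ==1, non-multiple of 16, multiple of 16
    # Each value lands in a different integer-specialization bucket,
    # so each launch may trigger a fresh compilation.
    fill_kernel[(1,)](out, n, BLOCK=64)
```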
The fix also has two parts.

First, I add `do_not_specialize=["num_tokens"]` to the kernel so that one compilation covers all request sizes. `max_num_tokens` stays specialized: it is constant for the lifetime of the engine, and Triton can use it to optimize the padding loop. See the sketch after this paragraph.
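A minimal sketch of the decorator change; the kernel body and argument list here are placeholders, not the real upstream kernel:

```python
import triton
import triton.language as tl

# num_tokens is excluded from specialization, so one compiled variant
# serves every request size. max_num_tokens is left specialized because
# it is fixed for the engine lifetime.
@triton.jit(do_not_specialize=["num_tokens"])
def _slot_mapping_sketch(out_ptr, num_tokens, max_num_tokens,
                         BLOCK: tl.constexpr):
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    tl.store(out_ptr + offs, offs, mask=offs < num_tokens)
```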
Second, I add a small warmup in `warmup_v1_slot_mapping_kernel()` that calls `compute_slot_mapping()` directly, before the JIT monitor activates. It temporarily uses block id 1 (block 0 is the null block), then clears the entry in a `finally` block. The warmup runs on all PP ranks because every rank calls `compute_slot_mapping()` during input preparation. A sketch follows.
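A minimal sketch of the warmup, assuming a `BlockTable` with a `block_table` tensor and a `compute_slot_mapping(req_indices, positions)` method; the argument names, dtypes, and shapes here are assumptions, not the exact upstream signature:

```python
import torch

def warmup_v1_slot_mapping_kernel(block_table) -> None:
    # Compile the slot-mapping kernel once, before the JIT monitor
    # starts flagging compilations as unexpected.
    # Block 0 is the null block, so temporarily point request 0 at block 1.
    block_table.block_table[0, 0] = 1
    try:
        block_table.compute_slot_mapping(
            torch.zeros(1, dtype=torch.int32, device="cuda"),  # req_indices
            torch.zeros(1, dtype=torch.int64, device="cuda"),  # positions
        )
    finally:
        # Clear the temporary entry so real requests see a clean table.
        block_table.block_table[0, 0] = 0
```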
I did not add a synthetic `execute_model()` warmup: that would require model-specific dummy inputs and is not safe for all model types. This PR covers only the slot mapping kernel. The V2 warmup path is not touched, and the V1 sampler warmup is not touched.
I checked the open PRs; there is no existing PR for this issue.
## Test Plan

Run the GPU model runner unit tests, the lint and type checks, and a local smoke test against the V1 runner.
## Test Result
- `tests/v1/worker/test_gpu_model_runner.py`: 34 passed, 16 warnings.
- `ruff format` / `ruff check`: passed.
- `mypy` (Python 3.10): passed.
- `git diff --check`: passed.
- Local smoke test on the V1 runner with Qwen3-8B (text-only): HTTP 200, and no `_compute_slot_mapping_kernel` compilation warning on the first request. A sketch of this check appears below.
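For reference, a minimal sketch of the smoke check (the port and model name are assumptions): send one completion request to a running vLLM OpenAI-compatible server and confirm a 200 response, while the server log stays free of the kernel compilation warning.

```python
import requests

# Assumes a server started with something like:
#   vllm serve Qwen/Qwen3-8B
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={"model": "Qwen/Qwen3-8B", "prompt": "Hello", "max_tokens": 8},
)
assert resp.status_code == 200  # then grep the server log for the warning
```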
AI assistance was used (Codex, Claude).