
Add nightly b200 test for spec decode eagle correctness #38577

Merged
benchislett merged 16 commits into vllm-project:main from puririshi98:patch-8 on Apr 9, 2026

Conversation

@puririshi98

Contributor

Signed-off-by: Rishi Puri <riship@nvidia.com>

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the ci/build label Mar 30, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request adds a new Buildkite test step for 'Spec Decode Eagle Nightly B200' to execute correctness tests on B200 hardware using nightly PyTorch builds. A review comment pointed out that the pytest command uses an incorrect path, missing the 'tests/' prefix required to correctly locate the test suite.

Comment thread .buildkite/test_areas/spec_decode.yaml
Comment thread .buildkite/test_areas/spec_decode.yaml Outdated
@benchislett benchislett added the verified (Run pre-commit for new contributors without triggering other tests) label Apr 7, 2026
- vllm/v1/worker/gpu/spec_decode/
- tests/v1/e2e/spec_decode/
commands:
- pytest -v -s v1/e2e/spec_decode -k "eagle_correctness"

Member

This nightly should cover more than just the eagle correctness. Ideally we'd check at least MTP, maybe also draft model.
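
To illustrate the coverage concern, here is a minimal Python sketch that approximates a single-keyword `-k` filter as a case-insensitive substring match on test names. The helper `matches_keyword` is hypothetical and much simpler than pytest's real keyword-expression matching; the two test IDs are the ones quoted later in this thread.

```python
# Simplified stand-in for a single-keyword `pytest -k` filter.
# Assumption: a plain case-insensitive substring match on the test name.
def matches_keyword(test_name: str, keyword: str) -> bool:
    return keyword.lower() in test_name.lower()


# Test IDs quoted later in this conversation.
test_ids = [
    "test_eagle_correctness_light[FLASH_ATTN-deepseek_eagle]",
    "test_mtp_correctness[deepseek]",
]

# The filter from the commented line of the Buildkite step.
selected = [t for t in test_ids if matches_keyword(t, "eagle_correctness")]
print(selected)
```

Under this approximation only the eagle test is selected, so an MTP test would not run unless the filter is widened or further commands are added.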

Contributor Author

addressed

Comment thread .buildkite/test_areas/spec_decode.yaml
@robertgshaw2-redhat robertgshaw2-redhat enabled auto-merge (squash) April 7, 2026 17:43
@github-actions github-actions Bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Apr 7, 2026

@ProExpertProg ProExpertProg left a comment

Collaborator

LGTM

Signed-off-by: Rishi Puri <riship@nvidia.com>
auto-merge was automatically disabled April 7, 2026 17:46

Head branch was pushed to by a user without write access

@benchislett benchislett enabled auto-merge (squash) April 7, 2026 17:53
@mgoin

mgoin commented Apr 7, 2026

Member

Why did you rebase :( It canceled the tests

@benchislett benchislett merged commit adaabb8 into vllm-project:main Apr 9, 2026
16 checks passed
benchislett added a commit to CentML/vllm that referenced this pull request Apr 10, 2026
…-project#38577)"

This reverts commit adaabb8.

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
benchislett added a commit that referenced this pull request Apr 11, 2026
…)" (#39512)

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
wojciech-wais pushed a commit to wojciech-wais/vllm that referenced this pull request Apr 13, 2026
wojciech-wais pushed a commit to wojciech-wais/vllm that referenced this pull request Apr 13, 2026
stecasta added a commit to stecasta/vllm that referenced this pull request Apr 21, 2026
Re-applies the 3 optional nightly B200 buildkite steps originally added
in vllm-project#38577 and reverted in vllm-project#39512. The revert was due
to the Blackwell specdec correctness regression; the preceding commit
in this PR fixes the underlying bug.

Addresses Matthew Bonanni's review ask to re-enable the previously
failing tests and confirm they pass CI.

Co-authored-by: Rishi Puri <riship@nvidia.com>
Signed-off-by: Stefano Castagnetta <scastagnetta@nvidia.com>
stecasta added a commit to stecasta/vllm that referenced this pull request Apr 21, 2026
…m-project#38577)" (vllm-project#39512)

This reverts commit af661a1.

Signed-off-by: Stefano Castagnetta <scastagnetta@nvidia.com>
stecasta added a commit to puririshi98/vllm that referenced this pull request Apr 21, 2026
…m-project#38577)" (vllm-project#39512)

This reverts commit af661a1.

Signed-off-by: Stefano Castagnetta <scastagnetta@nvidia.com>
whk-lab pushed a commit to whk-lab/vllm that referenced this pull request Apr 23, 2026
whk-lab pushed a commit to whk-lab/vllm that referenced this pull request Apr 23, 2026
avinashsingh77 pushed a commit to avinashsingh77/vllm that referenced this pull request Apr 27, 2026
…-project#38577)" (vllm-project#39512)

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Signed-off-by: Avinash Singh <avinashsingh.rcoem@gmail.com>
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
@khluu

khluu commented May 12, 2026

Member

Hey @puririshi98 — heads up: while migrating B200 jobs from the old b200 queue to b200-k8s (Kubernetes), we found that test_mtp_correctness[deepseek] now fails on Blackwell with:

RuntimeError: Check failed: args->top_k < (args->topk_group * args->num_experts / args->n_group) (4 vs. 4)
: top_k must be less than total number of experts in selected groups

This comes from FlashInfer's TRTLLM fused MoE kernel (trtllm_fused_moe_kernel_launcher.cu:991). The kernel uses a strict < check, while vLLM's own check in csrc/moe/grouped_topk_kernels.cu:1023 uses <= for the same constraint, so the boundary case reported above (4 vs. 4) passes vLLM's check but fails FlashInfer's.
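
The disagreement at the boundary can be sketched in a few lines of Python. This is only an illustration of the two comparisons, not the kernels' actual code: the function names are hypothetical, integer arithmetic is assumed, and the individual parameter values are invented so that the limit comes out to 4, matching the "(4 vs. 4)" in the error.

```python
def group_limit(topk_group: int, num_experts: int, n_group: int) -> int:
    # Same expression as in the error message, assuming integer arithmetic.
    return topk_group * num_experts // n_group


def strict_check(top_k: int, limit: int) -> bool:
    # Strict form, as in the failing check: top_k < limit.
    return top_k < limit


def inclusive_check(top_k: int, limit: int) -> bool:
    # Inclusive form, as described for the other check: top_k <= limit.
    return top_k <= limit


# Hypothetical values chosen only so that the limit equals 4.
limit = group_limit(topk_group=1, num_experts=8, n_group=2)
top_k = 4

print(strict_check(top_k, limit), inclusive_check(top_k, limit))
```

With `top_k` equal to the limit, the strict form rejects the configuration while the inclusive form accepts it.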

This test was previously being skipped because it's decorated with @single_gpu_only and the old b200 runners had 2 GPUs. Now that b200-k8s runners have 1 GPU, the test actually runs and hits this failure.
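
A rough sketch of why the runner change exposed the failure, assuming the decorator skips unless exactly one GPU is visible. `single_gpu_only_sketch` is a hypothetical stand-in written for this note, not the real `@single_gpu_only` helper, and the GPU count is passed in explicitly instead of being detected.

```python
import functools


def single_gpu_only_sketch(visible_gpus: int):
    """Hypothetical gate: run the test only when exactly one GPU is visible."""
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            if visible_gpus != 1:
                return "skipped"
            return test_fn(*args, **kwargs)
        return wrapper
    return decorator


def make_test(visible_gpus: int):
    # Build the same test body against a runner with the given GPU count.
    @single_gpu_only_sketch(visible_gpus)
    def test_mtp_correctness():
        return "ran"
    return test_mtp_correctness


old_b200 = make_test(2)()      # old b200 runners: 2 GPUs
new_b200_k8s = make_test(1)()  # b200-k8s runners: 1 GPU
print(old_b200, new_b200_k8s)
```

Under that assumption the test body never executed on the 2-GPU runners, which is why the failure only surfaced after the migration.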

For now we've added a skipif on Blackwell in #42387 so the migration can proceed.

Build with the failure: https://buildkite.com/vllm/ci/builds/65711#019e1b4e-7f4c-41c5-8f0e-82cbde49317a

@khluu

khluu commented May 12, 2026

Member

Also, test_eagle_correctness_light[FLASH_ATTN-deepseek_eagle] has the same story — it was previously skipped on B200 because @single_gpu_only + 2-GPU runners meant it never ran. Now with 1-GPU b200-k8s runners it actually executes and fails with:

AssertionError: (head_dim, head_dim_v)=(192, 192) is not supported on SM100/SM110.
head_dim and head_dim_v must be between 8 and 128 and divisible by 8, or (192, 128) for DeepSeek.

Added a skipif on Blackwell for this test as well in #42387.

Build with the failure: https://buildkite.com/vllm/ci/builds/65711#019e19b3-b994-44e0-a1cc-ce8614caa13a
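
The rule stated in that assertion message can be restated as a small predicate. This is a sketch derived only from the wording of the message above; `head_dims_supported` is a hypothetical name and the actual check in the attention backend may differ.

```python
def head_dims_supported(head_dim: int, head_dim_v: int) -> bool:
    """Restates the rule from the assertion message for SM100/SM110."""
    def in_general_range(dim: int) -> bool:
        # "between 8 and 128 and divisible by 8"
        return 8 <= dim <= 128 and dim % 8 == 0

    # "or (192, 128) for DeepSeek"
    if (head_dim, head_dim_v) == (192, 128):
        return True
    return in_general_range(head_dim) and in_general_range(head_dim_v)


print(head_dims_supported(192, 192))  # the failing combination
print(head_dims_supported(192, 128))  # the DeepSeek special case
```

Read this way, (192, 192) fails because 192 is outside the general range and the pair is not the one special case the message allows.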

my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…#38577)

Signed-off-by: Rishi Puri <riship@nvidia.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…-project#38577)" (vllm-project#39512)

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>

Labels

ci/build
ready (ONLY add when PR is ready to merge/full CI is needed)
verified (Run pre-commit for new contributors without triggering other tests)

6 participants