Add more CI for MoE refactor (B200) #31769
Conversation
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Code Review
This pull request adds new CI tests for a Mixture-of-Experts (MoE) refactor. While adding test coverage is valuable, the implementation contains several critical and high-severity issues, primarily due to copy-paste errors. There's a critical error in the Buildkite pipeline configuration where the B200 test incorrectly uses the H100 configuration file. Additionally, there are multiple inconsistencies in the YAML test configuration files, including typos in environment variable names, invalid YAML syntax, and mismatches between filenames and their corresponding test settings. These issues will likely cause CI failures or lead to tests running with incorrect configurations, undermining their purpose.
```yaml
optional: true
num_gpus: 2
commands:
  - pytest -s -v evals/gsm8k/test_gsm8k_correctness.py --config-list-file=evals/gsm8k/configs/moe-refactor/config-h100.txt
```
The B200 integration test incorrectly uses the H100 configuration file (`config-h100.txt`), so the wrong set of tests will run on B200 hardware. It should use `config-b200.txt` to run the tests intended for B200.
Suggested change:

```yaml
- pytest -s -v evals/gsm8k/test_gsm8k_correctness.py --config-list-file=evals/gsm8k/configs/moe-refactor/config-b200.txt
```

```yaml
server_args: "--enforce-eager --max-model-len 8192 --tensor-parallel-size 2"
env:
  VLLM_USE_FLASHINFER_MOE_FP8: "1"
  VLLM_FLASHINFER_MaOE_BACKEND: "latency"
```
The environment variable name `VLLM_FLASHINFER_MaOE_BACKEND` contains a typo; it should be `VLLM_FLASHINFER_MOE_BACKEND`. As written, the setting is silently ignored and the test will not run with the intended backend.
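A misspelled key like `VLLM_FLASHINFER_MaOE_BACKEND` fails silently: the server simply never reads it. As a hedged sketch, a CI lint could reject unrecognized `VLLM_` keys in these eval configs. The set of known keys below is partial and assumed for illustration, not vLLM's authoritative list:

```python
# Hypothetical CI lint: flag unknown VLLM_* env keys in eval configs,
# catching typos such as VLLM_FLASHINFER_MaOE_BACKEND.
# NOTE: this whitelist is a partial, illustrative set, not vLLM's full list.
KNOWN_VLLM_ENV_KEYS = {
    "VLLM_USE_FLASHINFER_MOE_FP8",
    "VLLM_USE_FLASHINFER_MOE_FP4",
    "VLLM_FLASHINFER_MOE_BACKEND",
    "VLLM_TEST_FORCE_FP8_MARLIN",
    "VLLM_USE_DEEP_GEMM",
    "VLLM_USE_DEEP_GEMM_MOE",
}


def unknown_env_keys(env: dict) -> list[str]:
    """Return env keys that look like vLLM settings but are not recognized."""
    return [k for k in env if k.startswith("VLLM_") and k not in KNOWN_VLLM_ENV_KEYS]
```

Running this over each config's `env` block at CI collection time would turn a silent no-op into a hard failure.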
```yaml
num_questions: 1319
num_fewshot: 5
server_args: "--enforce-eager --max-model-len 8192 --tensor-parallel-size 2"
env:
  VLLM_TEST_FORCE_FP8_MARLIN: "1"
```
```yaml
env:
  VLLM_USE_FLASHINFER_MOE_FP4: "1"
  VLLM_FLASHINFER_MOE_BACKEND: "throughput"
```
The filename indicates a Marlin test configuration, but the environment variables enable FlashInfer. This is inconsistent and will not exercise the Marlin kernel. Please update the environment variables to match a Marlin test for this model type.
Suggested change:

```yaml
env:
  VLLM_USE_DEEP_GEMM: "0"
  VLLM_USE_DEEP_GEMM_MOE: "0"
  VLLM_TEST_FORCE_FP8_MARLIN: "1"
```
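Beyond fixing this one file, a small consistency check could catch kernel/backend mismatches like this across every config. The sketch below is hypothetical (the filename convention and helper are assumptions, not existing vLLM tooling):

```python
# Hypothetical consistency check: a config whose filename names a kernel
# backend should actually enable that backend in its env block.
def backend_mismatch(filename: str, env: dict) -> bool:
    """Return True if the filename says 'marlin' but the env block
    enables FlashInfer, or never forces Marlin -- a copy-paste smell."""
    if "marlin" not in filename.lower():
        return False  # only check configs that claim to be Marlin tests
    uses_flashinfer = any(
        k.startswith("VLLM_USE_FLASHINFER") and v == "1" for k, v in env.items()
    )
    forces_marlin = env.get("VLLM_TEST_FORCE_FP8_MARLIN") == "1"
    return uses_flashinfer or not forces_marlin
```

A pytest collection hook could run this over every YAML in `configs/moe-refactor/` and fail fast on mismatches.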
```text
@@ -0,0 +1,12 @@
Llama-4-Scout-Fp8-ModelOpt-fi-trtllm.yaml
Qwen3-30B-A3B-Fp8-AutoFp8-fi-trtllm.yaml
```
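Since `--config-list-file` names one YAML config per line, a quick sanity check can verify that every listed file actually exists before the eval suite starts. This helper is a hypothetical sketch, not part of the vLLM test harness:

```python
# Hypothetical pre-flight check for a --config-list-file: every listed
# YAML config must exist next to the list file, so a renamed or missing
# config fails CI immediately instead of being silently skipped.
from pathlib import Path


def missing_configs(list_file: Path) -> list[str]:
    """Return names from the config list that have no matching file."""
    names = [
        line.strip()
        for line in list_file.read_text().splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return [n for n in names if not (list_file.parent / n).is_file()]
```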
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.