Skip to content

[CI][NIXL] Split DPEP tests#31491

Merged
robertgshaw2-redhat merged 1 commit intovllm-project:mainfrom
NickLucche:nixl-split-ci-tests
Dec 30, 2025
Merged

[CI][NIXL] Split DPEP tests#31491
robertgshaw2-redhat merged 1 commit intovllm-project:mainfrom
NickLucche:nixl-split-ci-tests

Conversation

@NickLucche
Copy link
Copy Markdown
Collaborator

Separate TP and DPEP end to end tests to better make use of parallel resources (when available), while also enabling DPEP to scale independently in num_gpus for future test scenarios.

Signed-off-by: NickLucche <nlucches@redhat.com>
@NickLucche NickLucche requested a review from ApostaC as a code owner December 29, 2025 10:16
@NickLucche NickLucche added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 29, 2025
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively separates the Tensor Parallelism (TP) and Data Parallelism/Expert Parallelism (DPEP) tests into distinct CI jobs. This is achieved by modifying the .buildkite/test-pipeline.yaml to include a new job for DPEP tests and refactoring the config_sweep_accuracy_test.sh script to select the appropriate test configurations based on the DP_EP environment variable. The changes are well-structured and correctly implement the separation of tests, which should improve CI parallelism and resource utilization. The implementation looks solid.

Comment on lines +22 to +27
if [[ -n "${DP_EP:-}" ]]; then
configs=("${dp_ep_configs[@]}")
echo "DP_EP is set, using dp_ep_configs"
else
configs=("${tp_configs[@]}")
fi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While the logic for selecting test configurations is correct, there is a potential robustness issue in the run_tests function that follows this block (outside the diff). The function passes extra_args to the test script without quotes:

if ! env ${cfg} bash "${SCRIPT}" ${extra_args}; then

This will cause issues if extra_args ever contains arguments with spaces, due to shell word splitting. This is a latent bug that could cause future test failures.

To make the script more robust, I recommend modifying run_tests to handle arguments as an array. For example:

run_tests() {
  local label=$1
  shift
  local extra_args=("$@")

  # ...
    if ! env ${cfg} bash "${SCRIPT}" "${extra_args[@]}"; then
  # ...
}

# ...
run_tests "FLASHINFER backend" --attention-backend FLASHINFER

Since this is outside the diff, I'm adding this comment here for visibility. Addressing this would improve the script's maintainability.

@robertgshaw2-redhat robertgshaw2-redhat merged commit ab1af6a into vllm-project:main Dec 30, 2025
23 of 26 checks passed
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
sxl1993 pushed a commit to sxl1993/vllm that referenced this pull request Jan 19, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: shixingliang.sxl <shixingliang.sxl@antgroup.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/build kv-connector ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants