Revert "[CI] Add Async Eplb nightly CI tests" #30086

Closed
SageMoore wants to merge 1 commit into vllm-project:main from neuralmagic:revert-29385-eplb_nightly_ci

SageMoore (Contributor) commented Dec 4, 2025

Reverts #29385

This test appears to be OOMing in CI. Let's revert until we figure out what's going on. CC @david6666666

mergify bot added the ci/build label Dec 4, 2025

gemini-code-assist bot left a comment

Code Review

This pull request aims to revert the addition of asynchronous EPLB nightly CI tests, which are reportedly causing out-of-memory issues. The changes largely consist of removing the problematic test scripts and their corresponding CI pipeline configurations. This is a reasonable step to stabilize the CI. However, the PR also includes an unrelated addition of comments in vllm/distributed/eplb/rebalance_execute.py. For maintainability and a clean Git history, revert PRs should be atomic. I've recommended moving this unrelated change to a separate PR.

Comment on lines +325 to +327
# A buffer to hold the expert weights in one layer during the exchange.
# NOTE: Currently we assume the same weights across different layers
# have the same shape.

high

This pull request is intended to be a revert, but it includes the addition of these new comments. To maintain a clean and atomic Git history, it's a best practice for a revert to only contain the reverted changes. Please consider moving this documentation improvement to a separate pull request.

SageMoore (Contributor, Author) replied:

This PR is exclusively a reversion of #29385. I have made no other changes.

david6666666 (Contributor) commented Dec 5, 2025

I will take a look, https://buildkite.com/vllm/ci/builds/41883/steps/canvas?sid=019ae82a-231f-4893-aa07-bd72e4ed5bbf
Qwen3-Next-80B-A3B-Instruct MTP Async EPLB Accuracy

LucasWilkinson (Collaborator) commented

Was green on the nightly, so this feels environmental 🤔 https://buildkite.com/vllm/ci/builds/42071

LucasWilkinson self-assigned this Dec 5, 2025
khluu added the ready (ONLY add when PR is ready to merge/full CI is needed) label Dec 10, 2025
khluu enabled auto-merge (squash) December 10, 2025 04:55
khluu disabled auto-merge December 10, 2025 20:39
SageMoore closed this Dec 10, 2025
Labels

ci/build, ready (ONLY add when PR is ready to merge/full CI is needed)

5 participants