Skip to content

Conversation

@jeejeelee
Copy link
Collaborator

@jeejeelee jeejeelee commented Aug 1, 2025

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

The current code doesn't check whether intermediate_size can be evenly divided by TP, which may lead to generating some useless configs. This PR mainly ensures that intermediate_size can be divided by TP.

Test Plan

Test Result

(Optional) Documentation Update

Signed-off-by: Jee Jee Li <[email protected]>
@jeejeelee jeejeelee requested a review from WoosukKwon August 1, 2025 07:57
@mergify mergify bot added the performance Performance-related issues label Aug 1, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds a check to ensure intermediate_size is divisible by the tensor parallelism size in the MoE benchmark, which is a good correctness fix. My review focuses on improving the implementation of this check. I've suggested inlining the assertion, which simplifies the code and removes a new helper function that has a misleading generic name for a specific purpose.

Comment on lines +25 to +30
def ensure_divisibility(numerator, denominator):
"""Ensure that numerator is divisible by the denominator."""
assert numerator % denominator == 0, (
"intermediate_size {} is not divisible by tp {}.".format(numerator, denominator)
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Since this function is only used once, it can be inlined at the call site for simplicity. This also avoids introducing a new helper function with a generic name but a specific implementation, which could be misleading for future maintenance.

intermediate_size = config.intermediate_size
shard_intermediate_size = 2 * intermediate_size // args.tp_size

ensure_divisibility(intermediate_size, args.tp_size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Inlining the divisibility check here makes the code more direct and removes the need for a separate helper function.

Suggested change
ensure_divisibility(intermediate_size, args.tp_size)
assert intermediate_size % args.tp_size == 0, f"intermediate_size {intermediate_size} is not divisible by tp {args.tp_size}"

@github-actions
Copy link

github-actions bot commented Aug 1, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@houseroad houseroad added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 1, 2025
@DarkLight1337 DarkLight1337 merged commit 8d70599 into vllm-project:main Aug 1, 2025
29 checks passed
@jeejeelee jeejeelee deleted the improve-benchmark-moe branch August 2, 2025 06:59
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
noamgat pushed a commit to noamgat/vllm that referenced this pull request Aug 9, 2025
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Performance-related issues ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants