-
-
Notifications
You must be signed in to change notification settings - Fork 11.5k
[Tests] Fixing bug inside MultiModalProfiler. #21842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tests] Fixing bug inside MultiModalProfiler. #21842
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This PR fixes a critical bug in the MultiModalProfiler that caused hangs with large image requests. The fix involves using the length parameter on PlaceHolderRange for accurate token budget calculation. A new test suite has been added to validate the fix. I've identified a high-severity issue in the test file related to incorrect token calculation, which needs to be addressed.
|
See my comments in #21798 |
|
Maybe you can add the tests while the other author focuses on the fix itself |
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
Makes sense 👍 |
05da435 to
ff1c085
Compare
ff1c085 to
2a80323
Compare
|
Can you fix pre-commit? |
Signed-off-by: Varun Shenoy <[email protected]>
2a80323 to
caf4475
Compare
DarkLight1337
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM given tests pass
Fixed. Please note that the failure now is not due to changes in the PR, but, due to file below. I will rebase and check if that fixes the issue. docs/design/fused_moe_modular_kernel.md |
|
No need to rebase, I can force-merge |
Signed-off-by: Varun Shenoy <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]> Signed-off-by: x22x22 <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]> Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]> Signed-off-by: Noam Gat <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]> Signed-off-by: Paul Pak <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]> Signed-off-by: Diego-Castan <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]>
Signed-off-by: Varun Shenoy <[email protected]>
Purpose
This only adds tests as the fix is addressed as part of #21798
This PR fixes the issue #21344 where large image requests hangs in vllm waiting for the request to timeout.Bug: MultiModalProfiler counts only
patchtokens but, there are other bookeeping tokens liketile_seperator,image_start,image_endtokens in the input. This causes the encoder_budget to be slightly lower the actual budget. Whenever an image that uses all tiles is sent, VLLM accept the request but, scheduler can never schedule it because there is not enough encoder budget. Silent issue and No error is producedTest Plan
Test Result
(Optional) Documentation Update