[Bugfix][VLM] Fix incompatibility between #7902 and #7230 #7948
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge). To run full CI, you can do one of these:
```python
expected_dims = (2, )

def _validate_shape(d: torch.Tensor):
    actual_dims = tuple(d.shape)

    if actual_dims != expected_dims:
        expected_expr = str(expected_dims)
        raise ValueError(
            f"The expected shape of image sizes per image per batch "
            f"is {expected_expr}. You supplied {tuple(d.shape)}.")

for d in data:
    _validate_shape(d)
```
While this is more complex, the error message becomes consistent with the one for `pixel_values`.
LGTM! Thank you for the fix!
Thanks for the fix!
Thank you for the fix! Sorry about that.
…project#7230 (vllm-project#7948) Signed-off-by: Alvant <[email protected]>
…project#7230 (vllm-project#7948) Signed-off-by: LeiWang1999 <[email protected]>
This PR fixes an incompatibility between multimodal tensor stacking and the multi-image support in LLaVA-NeXT and InternVL that causes the VLM CI to fail.
It also updates the docstrings describing the input shapes, which have become outdated since #7230.
The issue was not caught in #7902 because the changes from #7230 had not been merged into its feature branch before #7902 was merged.
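As a rough illustration of the incompatibility (a sketch with made-up shapes, not the actual vLLM batching code): once a single prompt can carry several images, the per-prompt `image_sizes` tensors no longer share a common shape, so naively stacking them fails, while validating each image's (height, width) entry individually, as in the review snippet above, still works.

```python
import torch
from typing import List

# Rough sketch of the shape problem, not the actual vLLM implementation.
# With multi-image support, one prompt may contribute several image_sizes
# rows, so the per-prompt tensors can have different first dimensions.

def naive_stack(image_sizes: List[torch.Tensor]) -> torch.Tensor:
    # Only works when every prompt contributes the same number of images.
    return torch.stack(image_sizes)

# One image per prompt: stacking yields a clean (batch_size, 2) tensor.
single_image = [torch.tensor([336, 336]), torch.tensor([672, 336])]
print(naive_stack(single_image).shape)  # torch.Size([2, 2])

# Mixed image counts per prompt: stacking raises because shapes differ.
multi_image = [
    torch.tensor([[336, 336], [672, 336]]),  # prompt with two images
    torch.tensor([[336, 672]]),              # prompt with one image
]
try:
    naive_stack(multi_image)
except RuntimeError as err:
    print(err)  # stack expects each tensor to be equal size
```

Keeping the data as a list and validating each image's (height, width) entry on its own, as the reviewed snippet does, avoids this mismatch.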