[BugFix][Multi Modal] Fix TensorSchema shape mismatch in Molmo#24559
vllm-bot merged 2 commits into vllm-project:main
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
Code Review
This pull request addresses a shape mismatch issue in the Molmo model by correctly marking the 'number of crops' (nc) dimension as dynamic. This is a crucial fix for handling batches of images that produce a variable number of crops. My review identifies a related issue where the 'token sequence positions' (tp) dimension, which can also vary between images, was not marked as dynamic. I've suggested a fix to prevent potential validation errors.
feat_is_patch: Annotated[
    Union[torch.Tensor, list[torch.Tensor]],
    TensorShape("bn", "nc", "tp", dynamic_dims={"nc"})]
The number of token sequence positions (tp) can also vary between images, just like the number of crops (nc), because the number of tokens depends on the image's height and width. To prevent a potential ValueError during tensor shape validation when a batch contains images of different sizes, tp should also be marked as a dynamic dimension.
- feat_is_patch: Annotated[
-     Union[torch.Tensor, list[torch.Tensor]],
-     TensorShape("bn", "nc", "tp", dynamic_dims={"nc"})]
+ feat_is_patch: Annotated[
+     Union[torch.Tensor, list[torch.Tensor]],
+     TensorShape("bn", "nc", "tp", dynamic_dims={"nc", "tp"})]
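To make the effect of `dynamic_dims` concrete, here is a minimal sketch of the kind of check being discussed. It is not vLLM's `TensorSchema` implementation: `validate_shapes` is a hypothetical helper, it works on plain shape tuples rather than tensors, and the sizes (5 and 9 crops, 144 and 100 positions) are invented for illustration. It assumes that a non-dynamic dimension must have the same size in every per-image tensor, while a dynamic one may differ.

```python
from typing import Sequence


def validate_shapes(
    shapes: Sequence[tuple[int, ...]],
    dims: tuple[str, ...],
    dynamic_dims: frozenset[str] = frozenset(),
) -> None:
    """Hypothetical stand-in for a schema shape check (not vLLM's code).

    `shapes` holds one shape per image; `dims` names the per-image
    dimensions. The batch dimension "bn" is the list length and is
    therefore not checked here.
    """
    expected: dict[str, int] = {}
    for idx, shape in enumerate(shapes):
        if len(shape) != len(dims):
            raise ValueError(
                f"item {idx}: expected rank {len(dims)}, got {len(shape)}"
            )
        for name, size in zip(dims, shape):
            if name in dynamic_dims:
                continue  # dynamic dims are allowed to differ between items
            # The first item fixes the expected size of each static dim.
            if expected.setdefault(name, size) != size:
                raise ValueError(
                    f"item {idx}: dim '{name}' is {size}, "
                    f"expected {expected[name]}"
                )


# Two images with different crop counts but the same number of positions:
# marking only "nc" as dynamic is enough.
validate_shapes([(5, 144), (9, 144)], ("nc", "tp"), frozenset({"nc"}))

# If "tp" also differed between images, both dims would need to be dynamic.
validate_shapes([(5, 144), (9, 100)], ("nc", "tp"), frozenset({"nc", "tp"}))
```

Under these assumptions, the second batch would raise a `ValueError` if only `"nc"` were marked dynamic, which is the failure mode the suggestion is meant to rule out. Whether `tp` actually varies in Molmo depends on the model's preprocessing and is not established by this sketch.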
Isotr0py left a comment
Thanks for fixing! This model wasn't covered by CI because of its requirement conflicts 😅
Thanks,
Thanks @wwl2755 for the fix!
…project#24559) Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
The number of crops can differ between images, so the nc dimension needs to be marked as dynamic.
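As a rough illustration of how one might spot which dimensions need to be dynamic, the sketch below reports the named dimensions whose sizes differ across a batch. `varying_dims` is a hypothetical helper, not part of vLLM, and the example shapes are invented.

```python
from typing import Sequence


def varying_dims(
    shapes: Sequence[tuple[int, ...]],
    dims: tuple[str, ...],
) -> set[str]:
    """Return the names of dims whose size differs across items.

    Hypothetical helper for illustration only; assumes every shape has
    the same rank as `dims`.
    """
    seen: dict[str, set[int]] = {}
    for shape in shapes:
        for name, size in zip(dims, shape):
            seen.setdefault(name, set()).add(size)
    return {name for name, sizes in seen.items() if len(sizes) > 1}


# Two images that yield 5 and 9 crops with the same number of positions.
print(varying_dims([(5, 144), (9, 144)], ("nc", "tp")))  # prints {'nc'}
```

Any dimension this reports for real batches is a candidate for `dynamic_dims`.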
Fix: #24544
Related: #22022
cc: @bbeckca @DarkLight1337