[Bugfix] handle alignment of encoder_seq_lens in mllama.py#14784
[Bugfix] handle alignment of encoder_seq_lens in mllama.py#14784heheda12345 merged 5 commits intovllm-project:mainfrom
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
d289902 to
8e6903c
Compare
8e6903c to
fb1d347
Compare
vllm/model_executor/models/mllama.py
Outdated
There was a problem hiding this comment.
input_processor_for_mllama has been removed and I don't see our use of the Transformers processor doing this same trick to only process the last image group.
I'm working to undersatnd this better, but I think we either should re-implement the "cheat" or remove the extra complexity (i.e. this check comparing to num tiles and remove kv_range_for_decode).
There was a problem hiding this comment.
I'm checking with the author of PR #11427 which removes the previous input processor. See the discussion here.
https://vllm-dev.slack.com/archives/C07QCGVDNUF/p1743001000770969
|
This pull request has merge conflicts that must be resolved before it can be |
fb1d347 to
582552e
Compare
582552e to
5b257d9
Compare
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
5b257d9 to
aa6d40d
Compare
|
@heheda12345 I rebased and updated the branch. Ready for another look! |
heheda12345
left a comment
There was a problem hiding this comment.
LGTM in general. Only some nits.
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
|
@heheda12345 Thanks for the review! I pushed the suggested changes (the linter didn't like |
heheda12345
left a comment
There was a problem hiding this comment.
Thanks for the bug fix!
…ect#14784) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> Signed-off-by: Yang Wang <elainewy@meta.com>
…ect#14784) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
…ect#14784) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
…ect#14784) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Fix for the crash repro reported in this comment. The bug and fix are pretty similar to #12347 in that the problem arises in a batch of mixed text and image requests leading to the attn metadata having some lists with an element for each sequence and others with an element only for each sequence-with-images.
Another fix that came out of #10648
FIX #10648