fix image-to-text batch incorrect output issue #29342
|
@amyeroberts please help review |
|
Hi @sywangyi, thanks for opening a PR! To show what this PR addresses could you:
Have you tested with non-batched input to confirm it's equivalent? |
|
|
before the fix: after the fix |
|
I have tested with non-batched input and confirm it's equivalent. |
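The equivalence check described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the PR: `outputs_match` and `fake_captioner` are hypothetical names, and the stand-in callable only mimics the output shape seen in this thread (one `[{'generated_text': ...}]` list per image) so the sketch runs without downloading a model.

```python
from typing import Any, Callable, List


def outputs_match(pipe: Callable[..., Any], inputs: List[Any], batch_size: int) -> bool:
    """Return True when a batched call and one-at-a-time calls give the same results."""
    # One call per input: the non-batched reference.
    one_by_one = [pipe(item) for item in inputs]
    # A single call over the whole list with an explicit batch size.
    batched = pipe(inputs, batch_size=batch_size)
    return list(batched) == one_by_one


# Hypothetical stand-in for a real pipeline (assumption: it is not part of transformers).
def fake_captioner(images, batch_size=1):
    def caption(img):
        return [{"generated_text": f"caption for {img}"}]

    if isinstance(images, list):
        return [caption(img) for img in images]
    return caption(images)


print(outputs_match(fake_captioner, ["a.png", "b.png"], batch_size=2))
```

With a real pipeline object passed in place of `fake_captioner`, the same helper would compare `generator(image)` per image against `generator(images, batch_size=2)`, assuming generation is deterministic.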
|
Thanks for providing the info @sywangyi. Looking at your example, I think the issue is coming from the argument

from transformers import pipeline
import torch
import requests
import PIL.Image
import time

image_url = "https://ankur3107.github.io/assets/images/image-captioning-example.png"
image = []
image.append(PIL.Image.open(requests.get(image_url, stream=True, timeout=3000).raw))
image.append(PIL.Image.open(requests.get(image_url, stream=True, timeout=3000).raw))

generator = pipeline(
    "image-to-text",
    model="Salesforce/blip-image-captioning-large",
)

result = generator(image)
print(f"{result}")

/Users/amyroberts/code/transformers/src/transformers/generation/utils.py:1181: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
[[{'generated_text': 'arafed image of a soccer player kicking a soccer ball'}], [{'generated_text': 'arafed image of a soccer player kicking a soccer ball'}]] |
|
Hi @amyeroberts, with and without batch_size=2 the pipeline goes through different running logic. |
|
@sywangyi Right. My understanding is that My concern is that this is going to change the behaviour for the cases when datasets are passed - it's not obvious from the change or PR description what is being fixed here. Could you add some tests to make sure the pipeline still has the intended behaviour for the following cases:
|
Hi @amyeroberts. I wrote the test for the third case you mentioned without the PR. The behavior of the 3 cases you mentioned is
with the PR.
|
|
@sywangyi Sorry, what I meant was add a test to our testing suite. |
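A test of this kind might look roughly like the sketch below. It is only an illustration under stated assumptions: `stub_image_to_text` and `BatchShapeTest` are hypothetical names, the stub replaces the real pipeline so no model is needed, and the expected shape (one `[{'generated_text': ...}]` list per input image) is taken from the output shown earlier in this thread. It is not the test that was added in the PR.

```python
import unittest


def stub_image_to_text(images, batch_size=1):
    """Hypothetical stand-in for an image-to-text pipeline (no model involved).

    Processes `images` in chunks of `batch_size` and returns one
    [{"generated_text": ...}] list per input image.
    """
    results = []
    for start in range(0, len(images), batch_size):
        chunk = images[start:start + batch_size]
        results.extend([{"generated_text": f"caption {name}"}] for name in chunk)
    return results


class BatchShapeTest(unittest.TestCase):
    def test_one_result_per_image(self):
        images = ["a.png", "b.png", "c.png"]
        # batch_size=1 serves as the non-batched reference.
        expected = stub_image_to_text(images, batch_size=1)
        for batch_size in (1, 2, 3):
            with self.subTest(batch_size=batch_size):
                outputs = stub_image_to_text(images, batch_size=batch_size)
                # One result list per input image, whatever the batch size.
                self.assertEqual(len(outputs), len(images))
                self.assertEqual(outputs, expected)
                for per_image in outputs:
                    self.assertIn("generated_text", per_image[0])
```

Saved in a test module, this can be run with `python -m unittest`. In a real test, the stub would be replaced by the pipeline under test.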
|
I added some tests. Is that enough? @amyeroberts |
amyeroberts
left a comment
Thanks for adding this fix and tests!
Just a small note on the tests. Once addressed, I think we'll be good to merge!
38898b6 to 8965fc9
amyeroberts
left a comment
Great - thanks a lot for working on this and adding tests!
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>


fix incorrect output for image-to-text with multi-batch input