Add ImageTextToText pipeline #29572
Closed
33 commits
cff2439 Add pipeline (NielsRogge)
d254f58 More improvements (NielsRogge)
40051db More improvements (NielsRogge)
55a40ad Add support for Donut (NielsRogge)
cd0f3ac More improvements (NielsRogge)
21c6fb9 More improvements (NielsRogge)
056e363 More improvements (NielsRogge)
0d6d7df Fix tests (NielsRogge)
b48add3 Fix tests (NielsRogge)
e4541aa Fix git tests (NielsRogge)
7cbb644 Fix merge (NielsRogge)
cfc8a13 Fix merge (NielsRogge)
04fcbfe Merge branch 'feature/use_processor' of github.com:NielsRogge/transfo… (NielsRogge)
021334d Fix merge (NielsRogge)
9c384cc Update metadata (NielsRogge)
f6ba64d Add support for idefics (NielsRogge)
fc4363a Add pipeline (NielsRogge)
dc2ca31 More improvements (NielsRogge)
d575921 More improvements (NielsRogge)
fd77e76 Add support for Donut (NielsRogge)
5f772f1 More improvements (NielsRogge)
59855ad More improvements (NielsRogge)
c2067b9 More improvements (NielsRogge)
40fe2f8 Fix tests (NielsRogge)
8b06c67 Fix tests (NielsRogge)
81db879 Fix git tests (NielsRogge)
7382075 Update metadata (NielsRogge)
8acf164 Add support for idefics (NielsRogge)
89ac5c4 Fix documentation test (NielsRogge)
b64070c Merge remote-tracking branch 'upstream/main' into feature/use_processor (NielsRogge)
40bd731 Remove script (NielsRogge)
743a967 Fix merge (NielsRogge)
22d3d70 Address comments (NielsRogge)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,25 @@ |
| | | from transformers import pipeline |
| | | |
| | | |
| | | # OK: |
| | | # model_id = "microsoft/git-base-coco" |
| | | model_id = "Salesforce/blip-image-captioning-base" |
| | | # model_id = "Salesforce/blip2-opt-2.7b" ok, although it doesn't include the text prompt in the output |
| | | # model_id = "Salesforce/instructblip-flan-t5-xl" ok, although it doesn't include the text prompt in the output |
| | | # model_id = "llava-hf/llava-1.5-7b-hf" |
| | | # model_id = "adept/fuyu-8b" |
| | | # model_id = "google/pix2struct-textcaps-base" |
| | | # model_id = "microsoft/udop-large" |
| | | # model_id = "naver-clova-ix/donut-base-finetuned-docvqa" |
| | | # model_id = "microsoft/kosmos-2-patch14-224" |
| | | |
| | | pipe = pipeline(task="image-text-to-text", model=model_id) |
| | | |
| | | outputs = pipe( |
| | | images=["http://images.cocodataset.org/val2017/000000039769.jpg"], |
| | | # text="USER: <image>\nWhat does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud\nASSISTANT:", |
| | | text=["A photo of", "The cats are"], |
| | | max_new_tokens=200, |
| | | ) |
| | | |
| | | print(outputs) |
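
The script above passes one image URL together with two text prompts, and its commented-out `text=` line uses a `USER: <image>\n...\nASSISTANT:` template next to the LLaVA checkpoint. The PR does not say how the pipeline pairs inputs of different lengths, so the following is only a minimal sketch of how a caller might prepare inputs explicitly before invoking the pipeline. `build_llava_prompt` and `pair_inputs` are hypothetical helpers, not part of `transformers`, and the broadcasting rule (one image repeated for every prompt) is an assumption made for illustration.

```python
from typing import List, Tuple

# Template copied from the commented-out `text=` line in the script above.
# Whether other checkpoints accept this format is not stated in the PR.
LLAVA_TEMPLATE = "USER: <image>\n{question}\nASSISTANT:"


def build_llava_prompt(question: str) -> str:
    """Wrap a question in the LLaVA-1.5-style template (hypothetical helper)."""
    return LLAVA_TEMPLATE.format(question=question.strip())


def pair_inputs(images: List[str], texts: List[str]) -> List[Tuple[str, str]]:
    """Pair images with prompts, repeating a single image for every prompt.

    Hypothetical helper: it makes the pairing explicit on the caller's side
    and does not describe how the pipeline handles mismatched lengths.
    """
    if len(images) == 1:
        images = images * len(texts)
    if len(images) != len(texts):
        raise ValueError(f"got {len(images)} images for {len(texts)} prompts")
    return list(zip(images, texts))


url = "http://images.cocodataset.org/val2017/000000039769.jpg"
pairs = pair_inputs([url], ["A photo of", "The cats are"])
print(pairs)
print(build_llava_prompt("What animals are in the picture?"))
```

If the pipeline accepts equal-length lists for both arguments, the pairs can be split back with `images, text = map(list, zip(*pairs))` and passed as `pipe(images=images, text=text, max_new_tokens=200)`.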