Pass text and images as kwargs to VLM processor #1126

Merged on Aug 30, 2024 (1 commit)

lapp0 commented Aug 30, 2024

Fixes #1123

Problem

A new VLM processor in transformers takes its text and image arguments in the reverse order (https://github.com/huggingface/transformers/pull/32473/files#r1734779753). As a result, outlines.models.transformers_vision hands the text to the processor where it expects the images, and vice versa.

Solution

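Pass the text and images to the processor as keyword arguments, so the call no longer depends on the processor's parameter order.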
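
To illustrate why keyword arguments sidestep the ordering problem, here is a minimal sketch. The two functions are hypothetical stand-ins with opposite parameter orders, not the actual transformers processors, and the placeholder `image` object stands in for a real image.

```python
# Hypothetical stand-ins for two VLM processors with opposite parameter
# orders. These are NOT real transformers classes; they only echo back
# what they received so the routing of arguments is visible.
def processor_text_first(text=None, images=None):
    return {"text": text, "images": images}


def processor_images_first(images=None, text=None):
    return {"text": text, "images": images}


prompt = "<image> Describe this picture."
image = object()  # placeholder for a real image (assumption for the sketch)

# Positional call: correct for one signature, silently swapped for the other.
positional_ok = processor_text_first(prompt, image)
positional_swapped = processor_images_first(prompt, image)

# Keyword call: arguments land in the right slot regardless of order.
kw_text_first = processor_text_first(text=prompt, images=image)
kw_images_first = processor_images_first(text=prompt, images=image)
```

With the positional call, `processor_images_first` receives the prompt as `images` and the image as `text`; with keywords, both stand-ins receive the same, correct inputs.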

Testing

  • CI does not run the transformers_vision tests because they require a GPU. They pass locally, however.
  • CI ran while huggingface.co was experiencing instability: the 3.8 job passed, but the 3.10 job failed because a tokenizer could not be retrieved.

lapp0 marked this pull request as ready for review August 30, 2024 18:43
lapp0 requested a review from rlouf August 30, 2024 18:43
rlouf merged commit 18eafb0 into dottxt-ai:main Aug 30, 2024
5 of 7 checks passed
Development

Successfully merging this pull request may close these issues.

Running outlines with Idefics3