-
Notifications
You must be signed in to change notification settings - Fork 31.9k
[ProcessingIdefics] Attention mask bug with padding
#29449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
amyeroberts
merged 16 commits into
huggingface:main
from
byi8220:attention-mask-bug-with-padding
Apr 4, 2024
Merged
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
43119af
Defaulted IdeficsProcessor padding to 'longest', removed manual padding
byi8220 8326681
make fixup
byi8220 6728770
Defaulted processor call to padding=False
byi8220 bebd7c1
Add padding to processor call in IdeficsModelIntegrationTest as well
byi8220 8d8cc2f
Defaulted IdeficsProcessor padding to 'longest', removed manual padding
byi8220 26571b2
make fixup
byi8220 72d9e00
Defaulted processor call to padding=False
byi8220 8675211
Add padding to processor call in IdeficsModelIntegrationTest as well
byi8220 9014e8b
Merge branch 'huggingface:main' into attention-mask-bug-with-padding
byi8220 1be6fff
Merge branch 'huggingface:main' into attention-mask-bug-with-padding
byi8220 ba647ea
Merge branch 'attention-mask-bug-with-padding' of https://github.com/…
byi8220 5e27c2c
redefaulted padding=longest again
byi8220 a597d6d
fixup/doc
byi8220 8a62722
Merge branch 'huggingface:main' into attention-mask-bug-with-padding
byi8220 ba2f2da
Merge branch 'huggingface:main' into attention-mask-bug-with-padding
byi8220 f79ea7f
Merge branch 'huggingface:main' into attention-mask-bug-with-padding
byi8220 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should change the default here for two reasons:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may have tagged the wrong person 🙃
Does the idefics model support non-padded inputs? From my understanding of the original issue, it seems they desire expect some padding even when the argument is not passed.
I'm not sure if this is a bug, but the default behavior appears to have been inaccurate to begin with. Even if the user passes in
padding=False, lines 347-354 seems to be forcibly padding the text input to the maximum sequence length anyways.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops, yes, sorry about that
No, but no models support non-padded inputs if batch_size > 1 and the input sequences are of different lengths, but all processors and tokenizers do not pad the inputs by default.
In this case, the forcible padding when
padding=Falseshould be removedUh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this behavior was removed as part of this PR. After this PR,
padding=Falseor not settingpaddingwill not pad the input.I have modified the PR to default to
padding=False, and the unit tests (and one of the integration tests) to explicitly specifypadding='longest'. I guess my only concern was that by changing this behavior, a user which was relying on the default behavior would find their code broken overnight.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@byi8220 Hmm, yes, this is tricky and that's a good point. OK, in this case, I think your original solution of setting
padding='longest'as default is best, ideally with a comment linking to this issue to explain why the default is different and adding a description for theFalseoption in the docstringThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done