Adding support for multiple mask tokens. #14716
Conversation
- Original implem: huggingface#10222 Co-authored-by: njafer <[email protected]>
We add type information to the tasks so that, for tasks where we know for sure, we can specify whether the tokenizer/feature_extractor is needed or not.
LysandreJik
left a comment
```python
NO_FEATURE_EXTRACTOR_TASKS = set()
NO_TOKENIZER_TASKS = set()
for task, values in SUPPORTED_TASKS.items():
    if values["type"] == "text":
        NO_FEATURE_EXTRACTOR_TASKS.add(task)
    elif values["type"] in {"audio", "image"}:
        NO_TOKENIZER_TASKS.add(task)
    elif values["type"] != "multimodal":
        raise ValueError(f"SUPPORTED_TASK {task} contains invalid type {values['type']}")
```
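The routing above only runs against the pipelines' `SUPPORTED_TASKS` registry. A self-contained toy version of the same logic (the task names and types below are invented for illustration, not the real registry contents):

```python
# Invented stand-in for the real SUPPORTED_TASKS registry, mapping each
# task name to its input modality.
SUPPORTED_TASKS = {
    "fill-mask": {"type": "text"},
    "image-classification": {"type": "image"},
    "automatic-speech-recognition": {"type": "audio"},
    "visual-question-answering": {"type": "multimodal"},
}

NO_FEATURE_EXTRACTOR_TASKS = set()
NO_TOKENIZER_TASKS = set()
for task, values in SUPPORTED_TASKS.items():
    if values["type"] == "text":
        # Pure text tasks never need a feature extractor.
        NO_FEATURE_EXTRACTOR_TASKS.add(task)
    elif values["type"] in {"audio", "image"}:
        # Pure audio/image tasks never need a tokenizer.
        NO_TOKENIZER_TASKS.add(task)
    elif values["type"] != "multimodal":
        # Multimodal tasks may need both; anything else is a registry bug.
        raise ValueError(f"SUPPORTED_TASK {task} contains invalid type {values['type']}")

print(sorted(NO_TOKENIZER_TASKS))
print(sorted(NO_FEATURE_EXTRACTOR_TASKS))
```

Multimodal tasks deliberately land in neither set, so the pipeline factory keeps loading both preprocessors for them.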
I like this approach, it should make the pipelines more robust to models with different capabilities in terms of preprocessors.
Looks good to me as well!
```python
# than them
self.assertEqual(len(outputs), 3)


def fill_mask_with_multiple_masks(self, model, tokenizer):
```
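A minimal sketch of what such a multi-mask test could assert. The pipeline stub below is invented for illustration; the real test exercises the transformers `FillMaskPipeline`, but the shape check is the same — one `top_k` candidate list per `[MASK]` occurrence:

```python
def fake_fill_mask(text, top_k=5):
    # Stub standing in for the fill-mask pipeline: returns one list of
    # candidate dicts per [MASK] occurrence in the input string.
    n_masks = text.count("[MASK]")
    return [
        [{"token_str": f"tok{i}", "score": 1.0 / top_k} for i in range(top_k)]
        for _ in range(n_masks)
    ]

outputs = fake_fill_mask("My [MASK] is a [MASK].", top_k=3)
assert len(outputs) == 2                          # one entry per mask position
assert all(len(cands) == 3 for cands in outputs)  # top_k candidates for each
```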
Can we perhaps add a test for Perceiver (similar to the image classification models)?
Or is this not required here?
I think this PR is pretty orthogonal to Perceiver.
We could add a slow test for sure, but it doesn't have to be Perceiver specific.
In fact, I'll add something on the random model (it just needs to be consistent; the actual values are less important).
LysandreJik
left a comment
Ok this looks good to me! Looking forward to the additional test, feel free to merge whenever.
What does this PR do?
When presented with multiple masks, it's impossible to retrieve the joint probabilities.
Instead of trying to work around that (see discussions in the previous PR), this PR
simply outputs the raw `top_k` propositions at each locus, since it gets tricky to find a good proxy for "joint probabilities". Instead of trying to solve this impossible problem we simply
show exactly what the model outputs.
@naveenjafer is mentioned as co-author since much of this PR was pulled from there.
This PR was resurrected partly because Perceiver (a byte-level model) needs to do this type of masking to be useful.
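The design choice can be sketched without any model at all. Below, the per-position probability tables are invented stand-ins for the model's softmax output at each `[MASK]` locus; each position is ranked independently, with no attempt to combine positions into joint sequences:

```python
# Invented per-mask-position probability tables, standing in for the
# model's softmax output at each [MASK] in "This [MASK] is [MASK].".
per_position_probs = [
    {"man": 0.4, "dog": 0.35, "car": 0.25},      # first [MASK]
    {"great": 0.5, "fast": 0.3, "broken": 0.2},  # second [MASK]
]

def top_k_per_position(position_probs, top_k=2):
    # One independent ranking per mask position, mirroring the PR's
    # design: raw top_k at each locus, no joint-probability proxy.
    return [
        sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        for probs in position_probs
    ]

outputs = top_k_per_position(per_position_probs)
print(outputs)  # one list of (token, score) pairs per [MASK] position
```

The caller sees exactly what the model produced at each position and can apply whatever joint-scoring heuristic suits their use case downstream.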
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.