Diverse number of return sequences for greedy search and sampling generation #8840
Conversation
```python
probs = F.softmax(scores, dim=-1)
next_tokens = torch.multinomial(probs, num_samples=num_return_sequences).reshape(-1)
# Once we got next_tokens, we have to expand the metadata
unfinished_sequences = unfinished_sequences.repeat_interleave(num_return_sequences, dim=0)
```
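A minimal sketch of what the snippet above does to tensor shapes (the batch size, vocabulary size, and surrounding setup are invented for illustration; only the lines matching the diff come from the PR):

```python
import torch
import torch.nn.functional as F

batch_size, vocab_size, num_return_sequences = 2, 5, 3
scores = torch.randn(batch_size, vocab_size)        # first-step logits, one row per prompt
unfinished_sequences = torch.ones(batch_size, dtype=torch.long)

probs = F.softmax(scores, dim=-1)
# torch.multinomial draws without replacement by default, so each prompt gets
# `num_return_sequences` distinct first tokens; reshape(-1) flattens them into
# one expanded batch dimension of size batch_size * num_return_sequences
next_tokens = torch.multinomial(probs, num_samples=num_return_sequences).reshape(-1)
print(next_tokens.shape)                             # torch.Size([6])

# bookkeeping tensors are expanded to match the new batch size
unfinished_sequences = unfinished_sequences.repeat_interleave(num_return_sequences, dim=0)
print(unfinished_sequences.shape)                    # torch.Size([6])
```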
Won't this quickly explode the number of current sequences?
Its purpose is to explode the sequences. Instead of exploding them before calling `sample` and starting each sequence from scratch (where nothing stops the first token of one sequence from being equal to the first token of another), this ensures that the first tokens differ.
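A hedged illustration of this point (the toy distribution and variable names are made up for this note; it is not code from the PR): duplicating a prompt `num_return_sequences` times and sampling each copy independently can yield identical first tokens, whereas drawing `num_return_sequences` samples from one distribution with `torch.multinomial` (without replacement, its default) guarantees that the first tokens differ.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
probs = F.softmax(torch.randn(5), dim=-1)  # toy next-token distribution for one prompt
k = 3                                      # num_return_sequences

# duplicate-then-sample: k independent draws, repeated token ids are possible
independent = torch.multinomial(probs.repeat(k, 1), num_samples=1).squeeze(-1)

# sample-then-expand (the approach discussed here): k draws without replacement,
# so the k first tokens are guaranteed to be distinct
diverse = torch.multinomial(probs, num_samples=k)

print(independent)  # may contain duplicates across the k sequences
print(diverse)      # always k distinct token ids
```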
Hey @LSinev, thanks for your PR! Since the […] If this is not sufficient, we have to think a bit more about how to add this PR.
Ok. I will check if it is possible (but this can move […])
Never thought about this. Will check.
Nothing openly available as far as I know. Because of […]
This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions. If you think this still needs to be addressed, please comment on this thread.
What does this PR do?
A new option is proposed, `diverse_sequences`, for cases when one wants really different sequences to be generated (a conversational bot, for example). For greedy search, it starts generating the new sequences from the top `num_return_sequences` tokens (used as the first token of each sequence). In sample generation mode, the `num_return_sequences` first tokens are drawn from a multinomial distribution. The default `diverse_sequences=False` leaves generation exactly as it was before this PR.
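As a hedged sketch of the two modes described above (illustrative only; it does not reproduce the PR's actual changes inside `generate`):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, num_return_sequences = 10, 3
first_step_logits = torch.randn(1, vocab_size)   # logits for a single prompt

# greedy search with diverse_sequences=True: each returned sequence is seeded
# with one of the top `num_return_sequences` first tokens instead of the argmax
greedy_first_tokens = torch.topk(first_step_logits, k=num_return_sequences, dim=-1).indices
print(greedy_first_tokens)       # shape (1, 3): three distinct starting tokens

# sampling with diverse_sequences=True: the first tokens are drawn from a
# multinomial over the first-step distribution (distinct, since replacement=False)
probs = F.softmax(first_step_logits, dim=-1)
sampled_first_tokens = torch.multinomial(probs, num_samples=num_return_sequences)
print(sampled_first_tokens)      # shape (1, 3)

# generation would then continue from each of these first tokens separately
```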
Who can review?
GPT2: @patrickvonplaten
Text Generation: @TevenLeScao
T5: @patrickvonplaten