Fix GPT_neox incorrect output with batch query #1358
@dvarshney-habana @libinta Can you please help with the review? The change is in text generation, so you may have a better idea of its effects.
    if next_tokens[i] == eos_token_id:
        idx_bs[i] = token_idx.item()
    if token_idx > idx_bs[i]:
        next_tokens[i] = pad_token_id
Please do not add any logic inside the generation loop; it will impact performance. Instead, you can do it after generation and return the right content based on EOS.
Moved the logic to after the generation loop and before the return.
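The post-generation approach discussed in this thread can be sketched roughly as follows. This is a minimal illustration in plain Python lists, not the code from the PR: the helper name `pad_after_first_eos` and the token ids in the example are hypothetical, and the real change operates on the tensors returned by the generation loop.

```python
def pad_after_first_eos(sequences, eos_token_id, pad_token_id):
    """Return copies of `sequences` with every token after the first EOS replaced by padding.

    Hypothetical helper: it runs once, after generation has finished,
    so nothing extra is executed inside the per-token generation loop.
    """
    padded = []
    for seq in sequences:
        out = list(seq)
        first_eos = None  # index of the first EOS in this sequence, if any
        for idx, token in enumerate(out):
            if first_eos is None:
                if token == eos_token_id:
                    first_eos = idx
            else:
                # Everything after the first EOS becomes padding.
                out[idx] = pad_token_id
        padded.append(out)
    return padded


# Illustrative ids only: 0 stands for EOS and 1 for padding.
batch = [[5, 6, 0, 7, 8], [5, 6, 7, 8, 9]]
result = pad_after_first_eos(batch, eos_token_id=0, pad_token_id=1)
print(result)
```

The first sequence keeps its EOS token and has the two trailing tokens replaced, while the second sequence, which contains no EOS, is returned unchanged.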
Force-pushed from a8b7750 to 8c0ead3.
    idx_bs = idx
    if idx > idx_bs:
        input_ids[i][idx] = pad_token_id
    idx_bs = generation_config.max_length
Do we still want to return padding tokens in this case, or just stop here?
According to the customer requirement, we need to emit padding tokens after <|endoftext|>.
https://huggingface.co/facebook/bart-large-cnn/blob/main/config.json
For bart, decoder_start_token_id and eos_token_id are the same.
This breaks bart generation.
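To illustrate why a shared start/EOS id is a problem for "pad everything after the first EOS" logic, here is a small sketch in plain Python. The token ids are illustrative, and the `skip_leading` parameter is a hypothetical mitigation shown only to make the failure mode concrete; it is not taken from the PR and is not claimed to be the fix that was adopted.

```python
def pad_after_first_eos(sequences, eos_token_id, pad_token_id, skip_leading=0):
    """Pad every token after the first EOS found at index >= skip_leading.

    `skip_leading` is a hypothetical knob: it ignores the first N positions
    when searching for EOS, so a decoder start token that shares the EOS id
    is not mistaken for the end of the sequence.
    """
    padded = []
    for seq in sequences:
        out = list(seq)
        first_eos = None
        for idx, token in enumerate(out):
            if first_eos is None:
                if idx >= skip_leading and token == eos_token_id:
                    first_eos = idx
            else:
                out[idx] = pad_token_id
        padded.append(out)
    return padded


# Illustrative ids: 2 is both the decoder start token and EOS, 1 is padding.
generated = [[2, 0, 31, 42, 2, 7]]

# Naive search: the start token at index 0 is treated as EOS,
# so the whole generated summary is overwritten with padding.
naive = pad_after_first_eos(generated, eos_token_id=2, pad_token_id=1)

# Skipping the start position keeps the content up to the real EOS.
guarded = pad_after_first_eos(generated, eos_token_id=2, pad_token_id=1, skip_leading=1)

print(naive)
print(guarded)
```

In the naive call only the start token survives, which matches the breakage described above; the guarded call preserves the tokens up to and including the real EOS.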
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
PR huggingface#1358 from upstream introduced a nested for loop, which caused a performance drop observed in T5 summarization. This commit translates the loop into tensor operations and restores performance.
Co-authored-by: Adam Stachowicz <105052242+astachowiczhabana@users.noreply.github.com>
Signed-off-by: Urszula Golowicz <urszula.golowicz@intel.com>
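The idea of replacing the nested loop with array operations can be sketched as below. This is an assumption-laden illustration, not the code from the commit: it uses NumPy as a stand-in for the tensor library, and the function name and token ids are hypothetical. The same pattern can be written with `torch.cumsum` and `torch.where`.

```python
import numpy as np


def pad_after_first_eos_vectorized(input_ids, eos_token_id, pad_token_id):
    """Replace every token after the first EOS in each row with padding, without Python loops."""
    ids = np.asarray(input_ids)
    is_eos = ids == eos_token_id
    # Running count of EOS tokens seen up to and including each position.
    eos_seen = np.cumsum(is_eos, axis=1)
    # A position is "after the first EOS" if at least one EOS has been seen
    # and the position is not that first EOS itself.
    is_first_eos = is_eos & (eos_seen == 1)
    after_first = (eos_seen > 0) & ~is_first_eos
    return np.where(after_first, pad_token_id, ids)


# Illustrative ids only: 0 stands for EOS and 1 for padding.
batch = [[5, 6, 0, 7, 0], [5, 6, 7, 8, 9]]
result = pad_after_first_eos_vectorized(batch, eos_token_id=0, pad_token_id=1)
print(result.tolist())
```

The cumulative sum marks all positions from the first EOS onward in one pass over the whole batch, so the per-row, per-token Python loop disappears.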
PR huggingface#1358 introduced a nested for loop, which caused a performance drop observed e.g. in T5 summarization. This commit translates the loop into tensor operations and restores performance. Co-authored-by: Adam Stachowicz <105052242+astachowiczhabana@users.noreply.github.com> Signed-off-by: Urszula Golowicz <urszula.golowicz@intel.com>
PR huggingface#1358 from upstream introduced a nested for loop, which caused a performance drop observed in T5 summarization. This commit translates the loop into tensor operations and restores performance. Co-authored-by: Adam Stachowicz <105052242+astachowiczhabana@users.noreply.github.com>