Fix GPT_neox incorrect output with batch query by Jianhong-Zhang · Pull Request #1358 · huggingface/optimum-habana

Jianhong-Zhang · 2024-09-25T00:46:34Z

No description provided.

vivekgoe · 2024-09-27T07:18:42Z

@dvarshney-habana @libinta Can you please help with the review? The change is in text-generation, so you may have a better idea of the effect of the changes here.

libinta · 2024-10-01T20:16:00Z

+                        if next_tokens[i] == eos_token_id:
+                            idx_bs[i] = token_idx.item()
+                        if token_idx > idx_bs[i]:
+                            next_tokens[i] = pad_token_id


Please do not add any logic inside the generation loop; it will hurt performance. Instead, you can do this after generation, so that the right content is returned based on EOS.

Jianhong-Zhang · 2024-10-02

Moved the logic to after the generation loop, just before the return.
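To make the reviewer's suggestion concrete, here is a minimal sketch of padding after EOS as a post-processing step instead of inside the generation loop. It is not the code merged in this PR: the function name `pad_after_eos` and the plain-list representation are assumptions chosen for illustration, and it assumes the intent is to pad everything after the first EOS in each sequence.

```python
from typing import List


def pad_after_eos(
    sequences: List[List[int]], eos_token_id: int, pad_token_id: int
) -> List[List[int]]:
    """Return copies of the sequences with every token after the first EOS padded.

    Hypothetical helper for illustration; it runs once, after generation ends.
    """
    padded = []
    for row in sequences:
        new_row = list(row)  # copy so the caller's data is left untouched
        if eos_token_id in new_row:
            first_eos = new_row.index(eos_token_id)
            # Keep the EOS itself, overwrite everything that follows it.
            for idx in range(first_eos + 1, len(new_row)):
                new_row[idx] = pad_token_id
        padded.append(new_row)
    return padded


# Illustrative token IDs: 0 plays the role of EOS, 1 the role of PAD.
batch = [[5, 6, 0, 9, 9], [5, 6, 7, 8, 0]]
result = pad_after_eos(batch, eos_token_id=0, pad_token_id=1)
```

In a batch, sequences that finish early keep generating until the longest one is done, so the first row above carries stray tokens after its EOS. The post-processing pass replaces only those tokens and leaves the second row unchanged.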

libinta · 2024-10-14T21:49:45Z

+                        idx_bs = idx
+                    if idx > idx_bs:
+                        input_ids[i][idx] = pad_token_id
+                idx_bs = generation_config.max_length


Do we still want to return padding tokens in this case, or just stop here?

Jianhong-Zhang · 2024-10-14

According to the customer requirement, we need to pad with the padding token after <|endoftext|>.

bhargaveede · 2024-11-21

https://huggingface.co/facebook/bart-large-cnn/blob/main/config.json
For BART, decoder_start_token_id and eos_token_id are the same.
This change breaks BART generation.
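The failure mode described above can be reproduced in a few lines. This is a sketch under stated assumptions, not the fix adopted upstream: the helper `pad_after_first_eos` and its `search_from` parameter are hypothetical, and the token IDs are illustrative. It shows why a "pad everything after the first EOS" rule wipes out the whole output when the sequence begins with a decoder start token equal to EOS, and one possible guard (skipping the leading position when searching).

```python
from typing import List


def pad_after_first_eos(
    row: List[int], eos_token_id: int, pad_token_id: int, search_from: int = 0
) -> List[int]:
    """Pad every token after the first EOS found at or after `search_from`.

    Hypothetical helper; `search_from` is an illustrative guard, not a
    parameter of any library.
    """
    for idx in range(search_from, len(row)):
        if row[idx] == eos_token_id:
            return list(row[: idx + 1]) + [pad_token_id] * (len(row) - idx - 1)
    return list(row)  # no EOS found: nothing to pad


# Illustrative IDs in which the decoder start token equals EOS.
eos_token_id = 2
decoder_start_token_id = 2
pad_token_id = 1

# A decoder output that begins with the start token and ends with a real EOS.
decoder_output = [decoder_start_token_id, 0, 713, 16, eos_token_id]

# Naive search treats the start token at position 0 as the end of the text.
naive = pad_after_first_eos(decoder_output, eos_token_id, pad_token_id)

# Skipping the leading start token preserves the generated content.
guarded = pad_after_first_eos(
    decoder_output, eos_token_id, pad_token_id, search_from=1
)
```

With the naive search every generated token is replaced by padding, which matches the reported breakage. Whether skipping a fixed prefix is the right guard depends on the model family, so treat it as one option to evaluate.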

HuggingFaceDocBuilderDev · 2024-10-16T16:53:06Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

PR huggingface#1358 from upstream introduced a nested for loop which caused performance drop observed in T5 summarization. This commit translates the loop into tensor operations and restores performance. Co-authored-by: Adam Stachowicz <105052242+astachowiczhabana@users.noreply.github.com> Signed-off-by: Urszula Golowicz <urszula.golowicz@intel.com>
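As an illustration of what "translating the loop into tensor operations" can look like, here is a minimal PyTorch sketch. It is not the code from the follow-up commit: the function name `pad_after_eos_vectorized` is an assumption, and it assumes the goal is to replace every token strictly after the first EOS in each row with the padding token.

```python
import torch


def pad_after_eos_vectorized(
    input_ids: torch.Tensor, eos_token_id: int, pad_token_id: int
) -> torch.Tensor:
    """Pad tokens after the first EOS per row without Python-level loops.

    Hypothetical helper for illustration; expects a 2D (batch, length) tensor.
    """
    is_eos = (input_ids == eos_token_id).int()
    # Running count of EOS tokens up to and including each position.
    eos_count = is_eos.cumsum(dim=1)
    # Subtracting the flag at the current position gives the number of EOS
    # tokens strictly before it; any positive value means "after an EOS".
    after_eos = (eos_count - is_eos) > 0
    return input_ids.masked_fill(after_eos, pad_token_id)


# Illustrative token IDs: 0 plays the role of EOS, 1 the role of PAD.
batch = torch.tensor([[5, 6, 0, 9, 0], [5, 6, 7, 8, 0]])
out = pad_after_eos_vectorized(batch, eos_token_id=0, pad_token_id=1)
```

The comparison, cumulative sum, and masked fill each operate on the whole batch at once, so the cost no longer grows with a nested Python loop over batch entries and positions. `masked_fill` returns a new tensor, leaving `batch` unmodified.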


Jianhong-Zhang requested review from bhargaveede, ssarkar2 and vivekgoe as code owners September 25, 2024 00:46

vivekgoe requested review from a user and libinta September 27, 2024 07:16

libinta reviewed Oct 1, 2024


Fix GPT_neox incorrect output with batch query

8c0ead3

Jianhong-Zhang force-pushed the gpt_neox_output branch from a8b7750 to 8c0ead3 October 2, 2024 22:09

Jianhong-Zhang requested a review from libinta October 2, 2024 22:23

libinta reviewed Oct 14, 2024


Jianhong-Zhang requested a review from libinta October 14, 2024 23:24

libinta added the run-test label (Run CI for PRs from external contributors) Oct 16, 2024

regisss approved these changes Oct 16, 2024


regisss merged commit ddd4e8d into huggingface:main Oct 16, 2024

regisss pushed a commit that referenced this pull request Oct 17, 2024

Fix GPT_neox incorrect output with batch query (#1358)

94aab1c

ugolowic mentioned this pull request Dec 3, 2024

Restore performance in generate #1546

Merged


regisss merged 1 commit into huggingface:main from Jianhong-Zhang:gpt_neox_output

Jianhong-Zhang commented Sep 25, 2024

vivekgoe commented Sep 27, 2024

libinta Oct 1, 2024

Jianhong-Zhang Oct 2, 2024 (edited)

libinta Oct 14, 2024

Jianhong-Zhang Oct 14, 2024

bhargaveede Nov 21, 2024

HuggingFaceDocBuilderDev commented Oct 16, 2024


6 participants
