A generation issue: when ignore_eos=False and the model's pad_token==eos_token (like Llama3), the generated results in the same batch are erased. #1539
YunLiu1 wants to merge 1 commit into
Conversation
@YunLiu1 Can you provide an example command that reproduces this issue, please?
Sure. Because ignore_eos is always True in run_generation.py, you need to change the code first, then run:
python3 ~/optimum-habana/examples/text-generation/run_generation.py
There is no output for the short prompt "Hello world,".
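For reference, a full invocation would look roughly like the sketch below; the model name, batch size, and other flag values are assumptions for illustration, not the exact command used here:

```
# Sketch only: flag values are assumed, not the exact repro command.
python3 ~/optimum-habana/examples/text-generation/run_generation.py \
    --model_name_or_path meta-llama/Meta-Llama-3.1-8B-Instruct \
    --batch_size 2 \
    --max_new_tokens 128 \
    --prompt "Hello world," "How are you?"
```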
@YunLiu1 When I run this command, I get output that looks fine. Can you try again on the latest main branch and let me know if you still see it, please? Besides, you can set …
A generation issue: when ignore_eos=False and the model's pad_token==eos_token (like Llama3), the generated results in the same batch are erased.
What does this PR do?
When generating text with Optimum-habana, if BS>1, ignore_eos=False, and the model's pad_token==eos_token (like Llama3.1-8B), the generated results of the shorter prompts in the batch are erased.

This is an example:
I submit 2 prompts ("Hello world,", "How are you?") with BS=2; the shorter one is padded on the left.
After generation, the first pad_token is recognized as an eos_token, and the response is erased.
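A minimal sketch of the failure, using illustrative token IDs (this is not the actual optimum-habana post-processing code):

```python
EOS = 128009  # illustrative ID; for Llama 3.1, pad_token_id == eos_token_id
PAD = EOS

# Batch of 2 after generation; row 0 is the short prompt, left-padded.
batch = [
    [PAD, PAD, 11, 12, 13, 101, 102, 103, EOS],  # pads + prompt + new tokens
    [21, 22, 23, 24, 25, 201, 202, 203, EOS],    # prompt + new tokens
]

# Buggy post-processing: truncate at the FIRST eos found in the sequence.
for row in batch:
    first_eos = row.index(EOS)  # for row 0 this matches the left pad at index 0
    print(row[:first_eos])      # row 0 prints [] -- the whole response is erased
```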
Fixes
I modified the post-processing to ignore the left pad_tokens and to erase only the tokens after the real eos_token, as sketched below.
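A minimal sketch of the fixed logic under the same assumptions (an illustration of the idea, not the exact diff):

```python
def trim_output(row, pad_id, eos_id):
    # Skip the left padding instead of treating it as eos.
    start = 0
    while start < len(row) and row[start] == pad_id:
        start += 1
    # Erase only what follows the first eos_token after the padding.
    try:
        end = row.index(eos_id, start)
    except ValueError:
        end = len(row)  # no eos generated: keep the whole sequence
    return row[start:end]

EOS = PAD = 128009  # pad_token == eos_token, as in Llama 3.1
row = [PAD, PAD, 11, 12, 13, 101, 102, EOS]
print(trim_output(row, PAD, EOS))  # [11, 12, 13, 101, 102] -- response kept
```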
Unit Tests
The changed code passes the unit tests in the attached eos_test_py.txt.
Function Tests
It also passes the function tests:
lm_eval mmlu_pro_business for Meta-Llama-3.1-8B-Instruct (pad_token=eos_token, bs=8):
lm_eval mmlu_pro_business for llama2-7b (pad_token=0, bs=8):