Fix TF generation (especially for TFMarian) #20853
Conversation
Leaving a few comments to help the review.
```diff
  # 2. can the new beams still improve?
- best_running_score = running_scores[:, :1] / (max_length**length_penalty)
+ best_running_score = running_scores[:, :1] / tf.cast(cur_len, dtype=running_scores.dtype) ** length_penalty
```
In the current `main` branch, `max_length` is used instead of `cur_len`. However, our PyTorch generation's `BeamHypotheses` uses `cur_len`, see:

```python
cur_score = best_sum_logprobs / cur_len**self.length_penalty
```

When running the code snippet in the reported TFMarian issue (#18149), `max_length` is a constant of 512, whereas the PyTorch generation code runs with `cur_len`, which goes from 1 (or 2) to 5.

(However, this is not the root cause of the issue in #18149.)
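To make the difference concrete, here is a minimal standalone sketch (the log-probability values and lengths are hypothetical, chosen to match the scale in #18149) of how normalizing by a constant `max_length` instead of `cur_len` distorts the score:

```python
# Hypothetical running beam: sum of log-probs after cur_len = 5 generated tokens.
sum_logprobs = -4.0
length_penalty = 1.0
cur_len, max_length = 5, 512

score_with_cur_len = sum_logprobs / cur_len**length_penalty        # -0.8
score_with_max_length = sum_logprobs / max_length**length_penalty  # ~ -0.0078

# Dividing by the constant max_length (512 in #18149) pushes every running
# score close to 0, so the "can the new beams still improve?" bound stays
# over-optimistic and beam search keeps running for far too long.
print(score_with_cur_len, score_with_max_length)
```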
```diff
- still_open_beam = ~(tf.math.reduce_all(is_sent_finished) & early_stopping)
+ # still_open_beam = ~(tf.math.reduce_all(is_sent_finished) & early_stopping)
+ still_open_beam = ~(tf.math.reduce_all(is_sent_finished))

- return not_max_length_yet & (still_open_beam | improvement_still_possible)
+ _early_stopping = tf.constant(early_stopping > 0, dtype=tf.bool)
+ # return not_max_length_yet & (still_open_beam | improvement_still_possible)
+ return not_max_length_yet & (still_open_beam | (~_early_stopping & improvement_still_possible))
```
The method `beam_search_cond_fn` corresponds to `BeamHypotheses.is_done` in our PyTorch generation code (although the meaning is reversed: generation done vs. not done).
In the current `main` branch, the logic here is:

```python
still_open_beam = ~(tf.math.reduce_all(is_sent_finished) & early_stopping)
return not_max_length_yet & (still_open_beam | improvement_still_possible)
```
When `early_stopping` is `False`, `still_open_beam` will be `True` and the return value becomes `True` (if `not_max_length_yet` is `True`), i.e. it should continue the generation.

However, in `BeamHypotheses.is_done`, if `early_stopping` is `False` (and if `len(self) >= self.num_beams`), it will compare the scores:
From transformers/src/transformers/generation/beam_search.py, lines 895 to 897 at 3be028b:

```python
cur_score = best_sum_logprobs / cur_len**self.length_penalty
ret = self.worst_score >= cur_score
return ret
```
In the code snippet from "Inference for TFMarianMTModel (en to Romance language translation) is slow and inaccurate" (#18149), it returns `True` for `is_done` after 5 or 6 generation steps, i.e. it should **NOT** continue the generation.
The above suggests (see the sanity-check sketch below):
- The main issue in `TFMarian`'s super slow generation comes from the condition around `early_stopping`.
- With the changes in this PR, it can generate quickly, just as the PyTorch `Marian` does.
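As a sanity check, here is a standalone sketch (not the library code; the variable names mirror `beam_search_cond_fn`, and the tensor values are made up) showing how the old and new conditions disagree when `early_stopping=False`, all beams are finished, and no score improvement is possible:

```python
import tensorflow as tf

not_max_length_yet = tf.constant(True)
is_sent_finished = tf.constant([[True, True]])   # every beam has hit EOS
early_stopping = tf.constant(False)
improvement_still_possible = tf.constant(False)  # best possible running score <= worst finished score

# Old condition: with early_stopping=False, still_open_beam is always True,
# so the loop continues until max_length even though nothing can improve.
still_open_beam_old = ~(tf.math.reduce_all(is_sent_finished) & early_stopping)
old_continue = not_max_length_yet & (still_open_beam_old | improvement_still_possible)

# New condition: the score-based check is only skipped when early_stopping=True.
still_open_beam_new = ~tf.math.reduce_all(is_sent_finished)
new_continue = not_max_length_yet & (still_open_beam_new | (~early_stopping & improvement_still_possible))

print(old_continue.numpy(), new_continue.numpy())  # True False
```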
I ran the slow tests for bert, gpt2, bart, t5. One test needs to be fixed: `tests/models/bart/test_modeling_tf_bart.py::TFBartModelTest::test_xla_generate_slow`.
However, one thing I don't understand very well is this part in `BeamHypotheses.is_done`:

```python
if len(self) < self.num_beams:
```

vs. `tf.math.reduce_all(is_sent_finished)` and/or `not_max_length_yet` in `beam_search_cond_fn`. These don't seem to be 100% equivalent conditions. (But I didn't really go into the details around this part.)
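For reference, a condensed sketch of the PyTorch side, consistent with the `beam_search.py` lines quoted above (not a verbatim copy of the method):

```python
def is_done(self, best_sum_logprobs: float, cur_len: int) -> bool:
    # Guard: until num_beams finished hypotheses have been collected, we are
    # never done. The closest TF analogue is tf.math.reduce_all(is_sent_finished),
    # but that tracks finished *beams* in the batch, not collected *hypotheses*,
    # which is why the two conditions are not obviously equivalent.
    if len(self) < self.num_beams:
        return False
    if self.early_stopping:
        return True
    cur_score = best_sum_logprobs / cur_len**self.length_penalty
    return self.worst_score >= cur_score
```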
```diff
  # 3. is there still a beam that has not finished?
- still_open_beam = ~(tf.math.reduce_all(is_sent_finished) & early_stopping)
+ # still_open_beam = ~(tf.math.reduce_all(is_sent_finished) & early_stopping)
```
This line should be removed before merge
```diff
- return not_max_length_yet & (still_open_beam | improvement_still_possible)
+ _early_stopping = tf.constant(early_stopping > 0, dtype=tf.bool)
+ # return not_max_length_yet & (still_open_beam | improvement_still_possible)
```
This line should also be removed before merge.
Hey @ydshieh 👋 Thank you for opening this PR, it made me realize a detail that is wrong in both frameworks 👀 We know that `logprobs` is a negative value, and we want to maximize it in beam search (i.e. make it as close to 0 as possible). Since `logprobs` is always negative, and the final score is the sum of the logprobs, we can anticipate the best possible score and use it to end beam search early with no drawback. Well, it turns out that the method to compute the best possible score depends on the sign of `length_penalty`.

On top of this incomplete best-score computation on both ends, your PR made me realize that the stopping condition for TF also had a problem (after factoring in the correct length penalty computation, a few tests failed). I'm opening a PR to compare against this one with what I think is the correct solution to this bug 🐛
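A sketch of that dependence (my reading of the argument above; the helper name and the numbers are illustrative, and the actual fix lives in #20901):

```python
def best_possible_score(sum_logprobs: float, cur_len: int, max_length: int, length_penalty: float) -> float:
    """Upper bound on a running beam's final score, with score = sum_logprobs / length**length_penalty.

    sum_logprobs is <= 0 and only decreases as tokens are appended, so the most
    optimistic normalization length depends on the sign of length_penalty:
    - length_penalty > 0: dividing a negative number by a larger value raises it,
      so max_length is the most favorable length.
    - length_penalty <= 0: the current length is the most favorable one.
    """
    length = max_length if length_penalty > 0.0 else cur_len
    return sum_logprobs / length**length_penalty

# With a positive penalty, normalizing by max_length gives the higher (better) bound.
assert best_possible_score(-4.0, 5, 512, 1.0) > -4.0 / 5**1.0
```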
Closing in favor of #20901.
What does this PR do?
Fix TF generation (especially for the `TFMarian` generation issue in #18149).

Fixes #18149