make `test_eager_matches_sdpa_inference` less flaky by ydshieh · Pull Request #34512 · huggingface/transformers

ydshieh · 2024-10-30T15:11:07Z

What does this PR do?

With torch.bfloat16 the numerical difference/instability occurs quite often, especially with multiple hidden layers.
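One plausible contributor to this instability is that bfloat16 keeps only 8 significant bits, so a result depends on the order in which intermediate values are rounded, and eager attention and SDPA do not necessarily perform their arithmetic in the same order. The sketch below is an illustration only, not code from this PR: `to_bfloat16` and `bf16_add` are hypothetical helpers that emulate bfloat16 rounding in pure Python.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even); NaN is not handled."""
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of float32; add a bias so that the
    # truncation below rounds to nearest, with ties going to the even value.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", rounded & 0xFFFFFFFF))[0]


def bf16_add(a: float, b: float) -> float:
    """Add two numbers and round the result to bfloat16."""
    return to_bfloat16(a + b)


tiny = 2.0 ** -8  # half the spacing between bfloat16 values just above 1.0

left_to_right = bf16_add(bf16_add(1.0, tiny), tiny)  # each step rounds back down
small_first = bf16_add(1.0, bf16_add(tiny, tiny))    # 2**-7 survives the rounding

print(left_to_right, small_first)
```

The same three numbers give two different sums depending on how they are grouped. Every additional layer adds more rounding steps of this kind.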

This PR first changes test_eager_matches_sdpa_inference to create models with only 1 hidden layer.

Number of failures per 500 runs:

| model | main | n_layer=1 |
|---|---|---|
| llama | 16 | 2 |
| idefics2 | 391 | 15 |
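A minimal sketch of the single-layer setup, assuming the config exposes its depth as `num_hidden_layers` and that composite models nest sub-configs under `text_config` and `vision_config`. The helper name `use_single_hidden_layer` is hypothetical and the configs are `SimpleNamespace` stand-ins; the actual change in `tests/test_modeling_common.py` may look different.

```python
from types import SimpleNamespace


def use_single_hidden_layer(config):
    """Set num_hidden_layers to 1 on a config and on its nested sub-configs, in place."""
    # Assumed attribute names; a real config class may use others.
    if hasattr(config, "num_hidden_layers"):
        config.num_hidden_layers = 1
    for sub_name in ("text_config", "vision_config"):
        sub_config = getattr(config, sub_name, None)
        if sub_config is not None:
            use_single_hidden_layer(sub_config)
    return config


# Stand-in for a text-only config.
llama_like = use_single_hidden_layer(SimpleNamespace(num_hidden_layers=2))

# Stand-in for a vision-language config with nested sub-configs.
vlm_like = use_single_hidden_layer(
    SimpleNamespace(
        text_config=SimpleNamespace(num_hidden_layers=2),
        vision_config=SimpleNamespace(num_hidden_layers=3),
    )
)
```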

Then it relaxes the condition a bit: the test only checks 80% of the sequences. If the results match on those 80%, the test passes.

This makes the test much less flaky. Over 500 runs, it passes (for llama, mistral, idefics2 and Llava).
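One reading of the relaxed condition is "at least 80% of the sequences in the batch must match within tolerance". The sketch below follows that reading with plain Python lists instead of tensors. The function names, the tolerances, and the per-sequence comparison are assumptions for illustration, not the code merged in this PR.

```python
import math


def sequence_matches(seq_a, seq_b, atol=1e-2, rtol=1e-2):
    """True if two equal-length sequences agree element-wise within tolerance."""
    # Note: math.isclose combines the tolerances differently from torch.allclose.
    return len(seq_a) == len(seq_b) and all(
        math.isclose(a, b, rel_tol=rtol, abs_tol=atol) for a, b in zip(seq_a, seq_b)
    )


def enough_sequences_match(eager_out, sdpa_out, min_fraction=0.8, atol=1e-2, rtol=1e-2):
    """True if at least min_fraction of the batch sequences match."""
    if len(eager_out) != len(sdpa_out) or not eager_out:
        return False
    n_match = sum(
        sequence_matches(a, b, atol=atol, rtol=rtol) for a, b in zip(eager_out, sdpa_out)
    )
    return n_match / len(eager_out) >= min_fraction


eager = [[0.10, 0.20], [0.30, 0.40], [0.50, 0.60], [0.70, 0.80], [0.90, 1.00]]
# Same batch, except that the last sequence diverges.
sdpa = [[0.10, 0.20], [0.30, 0.40], [0.50, 0.60], [0.70, 0.80], [0.90, 1.50]]

print(enough_sequences_match(eager, sdpa))  # 4 of 5 sequences match
```

A threshold on the fraction of matching sequences tolerates an occasional outlier sequence while still failing when most of the batch disagrees.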

Finally, it changes the image size of Llava and VipLlava from 30 to 8 so that the sequence length is much smaller, which avoids numerical issues.
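To see why a smaller image shortens the sequence: in a ViT-style vision tower, each image contributes roughly `(image_size // patch_size) ** 2` patch tokens. The sketch below assumes square images and a patch size of 2, which is an illustrative value not stated in this PR; `num_image_patches` is a hypothetical helper.

```python
def num_image_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches for a square image."""
    return (image_size // patch_size) ** 2


PATCH_SIZE = 2  # assumed value for illustration; not stated in the PR

for image_size in (30, 8):
    print(image_size, num_image_patches(image_size, PATCH_SIZE))
```

Under that assumption, going from 30 to 8 reduces the image tokens from 225 to 16.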

tests/test_modeling_common.py

HuggingFaceDocBuilderDev · 2024-10-30T15:39:20Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

gante

LGTM 👍 Thank you for fixing

Extra note: L4170 (model_sdpa = model_class.from_pretrained(tmpdirname, torch_dtype=torch_dtype)) should also have attn_implementation="sdpa", in case we update the default.
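The suggestion above can be sketched as follows. `load_eager_and_sdpa` is a hypothetical helper, not the test's actual code; it only relies on `from_pretrained` accepting `torch_dtype` and `attn_implementation`. `_FakeModel` is a stand-in that lets the snippet run without a real checkpoint.

```python
def load_eager_and_sdpa(model_class, tmpdirname, torch_dtype):
    """Load the same checkpoint twice, naming the attention backend explicitly each time."""
    model_eager = model_class.from_pretrained(
        tmpdirname, torch_dtype=torch_dtype, attn_implementation="eager"
    )
    model_sdpa = model_class.from_pretrained(
        tmpdirname, torch_dtype=torch_dtype, attn_implementation="sdpa"
    )
    return model_eager, model_sdpa


class _FakeModel:
    """Stand-in for a model class; it just records how it was loaded."""

    def __init__(self, path, **kwargs):
        self.path = path
        self.kwargs = kwargs

    @classmethod
    def from_pretrained(cls, path, **kwargs):
        return cls(path, **kwargs)


eager, sdpa = load_eager_and_sdpa(_FakeModel, "/tmp/checkpoint", torch_dtype="bfloat16")
print(eager.kwargs["attn_implementation"], sdpa.kwargs["attn_implementation"])
```

Naming the backend on both loads keeps the comparison meaningful even if the library's default attention implementation changes later.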

tests/generation/test_utils.py

tests/test_modeling_common.py

ArthurZucker

thanks !

ArthurZucker · 2024-10-31T16:03:00Z

tests/generation/test_utils.py

            if model.get_output_embeddings() is None:
                self.skipTest("DoLa is not supported for models that don't have output embeddings")
+
+            logits_processor_kwargs = self._get_logits_processor_kwargs(do_sample=True, config=model.config)


`do_sample` is random, no?

I am using the same value as in generation_kwargs = {...} a few lines below.

Yes, it is random, but this method is test_...._sample, so it makes sense.

ydshieh requested review from Rocketknight1 and gante October 30, 2024 15:13

gante approved these changes Oct 31, 2024

ydshieh added 13 commits October 31, 2024 15:55

try (194adfb)
try (33022d4)
try (a1eb7c3)
try (a3d0b3c)
try (dde9a4b)
try (4684815)
update (1e5bff9)
update (390cc75)
update (4535dd5)
update (224b922)
update (563c71b)
update (e68f450)
update (83175e9)

ydshieh force-pushed the less_flaky branch from b51ca31 to 83175e9 Compare October 31, 2024 14:59

ArthurZucker approved these changes Oct 31, 2024

ydshieh merged commit 114dd81 into main Oct 31, 2024

ydshieh deleted the less_flaky branch October 31, 2024 17:34
