Paligemma: fix static cache test #33941
Conversation
```python
causal_mask = torch.triu(causal_mask, diagonal=1)
else:
    causal_mask = torch.zeros_like(causal_mask)
causal_mask[:, :sequence_length] = 0.0
```
This was the cause: the mask was not hiding the dummy token positions of the static cache, so we always ended up with no mask on those positions.
aah gotcha. good catch
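For context, a minimal sketch of the failure mode (illustrative values only, not the exact PaliGemma masking code): with a static cache the mask covers `target_length` slots, of which only the first `sequence_length` hold real tokens, and zeroing the whole mask also unmasks the unused cache slots.

```python
import torch

# Illustrative sizes, assumed for this sketch.
sequence_length = 4   # real tokens in the current input
target_length = 8     # static cache length, includes unused (dummy) slots
min_dtype = torch.finfo(torch.float32).min

# Start fully masked.
causal_mask = torch.full((sequence_length, target_length), min_dtype)

# Correct idea: only unmask the real-token columns (causally) and leave the
# dummy static-cache columns at min_dtype.
causal_mask[:, :sequence_length] = torch.triu(
    causal_mask[:, :sequence_length], diagonal=1
)

# Buggy idea: zeroing the whole mask also unmasks the dummy columns,
# so attention can read uninitialized static-cache positions.
buggy_mask = torch.zeros_like(causal_mask)

print((causal_mask[:, sequence_length:] == min_dtype).all())  # tensor(True): dummy slots stay masked
print((buggy_mask[:, sequence_length:] == 0).all())           # tensor(True): dummy slots become visible
```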
```python
min_dtype=min_dtype,
cache_position=cache_position,
batch_size=batch_size,
is_training=is_training,
```
If we come here to prepare the static cache, then we cannot be in training mode. I don't think it's common to pass labels through generation, right?
I'm not seeing many use-cases indeed, except for maybe constrained generation and RL?
guess so, let's see what generation master (gante) thinks 😄
If labels in PaliGemma have the usual meaning (a tensor with which we compute the loss, with no further uses), then generate will never use labels :D
nice, yes those are normal labels :)
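To make the point concrete, here is a small hypothetical sketch (the helper name and signature are made up for illustration and are not the transformers API): training mode is inferred from whether `labels` are present, and since `generate()` only ever uses labels for loss computation it never forwards them, so the static-cache mask built in the generation path can assume `is_training=False`.

```python
from typing import Optional

import torch


def build_mask_kwargs(labels: Optional[torch.Tensor], cache_position, batch_size, min_dtype):
    """Hypothetical helper collecting the kwargs shown in the diff above."""
    # Training mode is inferred from the presence of labels.
    is_training = labels is not None
    return dict(
        min_dtype=min_dtype,
        cache_position=cache_position,
        batch_size=batch_size,
        is_training=is_training,
    )


# Generation path: labels are only used to compute a loss, so generate() does not
# pass them, and the static-cache mask ends up prepared with is_training=False.
kwargs = build_mask_kwargs(
    labels=None,
    cache_position=torch.arange(4),
    batch_size=1,
    min_dtype=torch.finfo(torch.float32).min,
)
assert kwargs["is_training"] is False
```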
molbap
left a comment
LGTM, added a comment on the training case for generation :)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
gante
left a comment
LGTM, thank you for fixing 🤗
ArthurZucker
left a comment
Thanks 🤗
* fix
* not flaky anymore + style
What does this PR do?
Fixes the flaky PaliGemma static cache test from #33630.