

@ydshieh ydshieh commented May 26, 2025

What does this PR do?

All tests now pass except one. After

Detect and fix most _init_weights() issues - make it work for composite models (#37070)

the following test fails:

tests/models/gemma/test_modeling_gemma.py::GemmaModelTest::test_sdpa_equivalence

The difference went from 2e-3 to 4e-3, which is larger than the 3e-3 limit.

Let's deal with this in a separate PR.
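
For readers unfamiliar with this kind of failure, here is a minimal sketch of an absolute-tolerance check. It is not the actual test code. It assumes the three numbers above are a maximum absolute difference between two sets of outputs (2e-3 before, 4e-3 after) and a 3e-3 threshold. The helper names and the sample values are hypothetical and chosen only to reproduce those gaps.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def within_tolerance(a, b, atol):
    """True when every element of `a` is within `atol` of the matching element of `b`."""
    return max_abs_diff(a, b) <= atol


ATOL = 3e-3  # threshold quoted above (assumed to be an absolute tolerance)

# Made-up values, not real model outputs.
eager_out = [0.100, 0.250, -0.500]
sdpa_before = [0.102, 0.250, -0.500]  # gap of about 2e-3
sdpa_after = [0.104, 0.250, -0.500]   # gap of about 4e-3

print(within_tolerance(eager_out, sdpa_before, ATOL))  # True: 2e-3 <= 3e-3
print(within_tolerance(eager_out, sdpa_after, ATOL))   # False: 4e-3 > 3e-3
```

Under these assumptions, a gap of 2e-3 passes and a gap of 4e-3 does not, which matches the failure described above.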


ydshieh commented May 26, 2025

run-slow: gemma

@github-actions

This comment contains run-slow, running the specified jobs:

models: ['models/gemma']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ydshieh ydshieh requested a review from Cyrilvallez May 26, 2025 16:31
@ydshieh ydshieh changed the title from "[not ready] update gemma tests" to "update gemma tests" May 26, 2025
ydshieh added 4 commits May 26, 2025 19:00
…ransformers into update_gemma_tests


@Cyrilvallez Cyrilvallez left a comment

LGTM, thanks! This was discussed offline: test_model_2b_bf16 and test_model_2b_sdpa were essentially identical, so we removed the latter. The switch from bf16 to fp16 for eager is justified because, on T4 specifically (not on A100), eager gives nonsensical outputs for this prompt in bf16.

@ydshieh ydshieh merged commit 07848a8 into main May 26, 2025
15 checks passed
@ydshieh ydshieh deleted the update_gemma_tests branch May 26, 2025 17:54

4 participants