Fix kwargs handling in generate_with_fallback
#29225
Conversation
ylacombe
left a comment
Hey @cifkao, thanks for the great work here, it's a nice catch.
The fix seems okay to me. I don't think we have a way to test whether it works, otherwise I'd have asked you for that!
@sanchit-gandhi could we have your review here as well?
generation_config.num_beams = kwargs.get("num_beams", 1) if not generation_config.do_sample else 1
generate_kwargs = dict(kwargs)
for key in ["do_sample", "temperature", "num_beams"]:
temperature shouldn't be in kwargs as it's already an argument of .generate here, right?
It seems okay to check for do_sample and num_beams here.
Just wanted to be extra cautious here and make sure everything is safe locally, rather than relying on what gets passed down from 2 call frames up the stack. But I can remove temperature if you prefer.
Looks good to me as is - there's a preference for more explicit handling of kwargs over more buried handling.
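For context, here is a tiny illustration of why that explicit stripping matters (ToyGenerationConfig is a made-up stand-in, not the transformers class): keyword arguments passed to .generate() take precedence over the generation config, so any of these keys left in kwargs would silently override what the fallback loop just set.

class ToyGenerationConfig:
    # Made-up stand-in mimicking how explicit generate() kwargs update the config.
    def __init__(self, num_beams=1, do_sample=False):
        self.num_beams = num_beams
        self.do_sample = do_sample

    def update(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)  # explicit kwargs win over existing config values

config = ToyGenerationConfig(num_beams=1)  # value chosen by the fallback loop
config.update(num_beams=5)                 # a num_beams left in kwargs would win here
print(config.num_beams)                    # 5 -- the loop's per-iteration choice is lost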
do_condition_on_prev_tokens,
kwargs,
):
    kwargs = dict(kwargs)
Why do you use dict(...) here and below? Is it to copy? If yes, shouldn't we use copy.deepcopy instead?
Yes, it's just to make a copy. My thinking here was that a shallow copy (using dict() or copy.copy()) has the same effect as using the **kwargs syntax.
copy.deepcopy should do the trick then, right? I'm just afraid that using dict might not be self-explanatory.
I still don't think we want to make a deep copy (what if kwargs contains a large object like assistant_model, for example?). So I changed the dict to copy.copy, which is equivalent and more readable.
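A small self-contained illustration of the copy semantics being discussed (the assistant_model entry is only a placeholder for a large object):

import copy

kwargs = {"num_beams": 2, "assistant_model": object()}  # object() stands in for a big model

shallow = copy.copy(kwargs)            # same effect as dict(kwargs) or forwarding **kwargs
assert shallow is not kwargs           # a new mapping...
assert shallow["assistant_model"] is kwargs["assistant_model"]  # ...but values are shared

shallow.pop("num_beams")               # mutating the copy leaves the caller's dict intact
assert "num_beams" in kwargs
# copy.deepcopy(kwargs) would also duplicate every value, which is unnecessary here and
# wasteful for large entries such as an assistant model.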
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
ArthurZucker
left a comment
Thanks @cifkao! Sounds good, just want to make sure you have a reproducer!
sanchit-gandhi
left a comment
Thanks for the great issue and super clear PR description @cifkao! The PR looks good to me. My only request is that we add a test to confirm beam search is working as expected. Could we modify your reproducer to do this, possibly with something like the following?
import datasets
from transformers import AutoProcessor, GenerationMixin, WhisperForConditionalGeneration
import numpy as np

processor = AutoProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

orig_generate = GenerationMixin.generate
NUM_BEAMS = 2

def generate(self, *args, **kwargs):
    assert args[1].num_beams == NUM_BEAMS
    return orig_generate(self, *args, **kwargs)

GenerationMixin.generate = generate

ds = datasets.load_dataset(
    "google/fleurs", "en_us", split="test", trust_remote_code=True
)
ds = ds.cast_column("audio", datasets.Audio(sampling_rate=16000))
raw_audio = np.concatenate([x["array"].astype(np.float32) for x in ds[:16]["audio"]])

inputs = processor(
    [raw_audio],
    return_tensors="pt",
    truncation=False,
    padding="longest",
    return_attention_mask=True,
    sampling_rate=16_000,
)

model.generate(
    **inputs,
    num_beams=NUM_BEAMS,
    task="transcribe",
    language="en",
)
@sanchit-gandhi Test added!
ArthurZucker
left a comment
Thanks for iterating and adding a test!
What does this PR do?
Fixes #29312.
- pop() is changed to get() to avoid modifying kwargs between loop iterations.
- A copy of kwargs is made as the first step in generate_with_fallback() to prevent any changes to it from propagating outside the method call.
- Keys that are already set on generation_config are removed from the keyword arguments to super().generate() (to avoid overriding the former), but this is done in a copy of kwargs that is not reused between iterations.
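For illustration, a minimal sketch of the kwargs handling described above, using copy.copy and a SimpleNamespace as a simplified stand-in for the real generate_with_fallback() signature and generation config (not the actual Whisper source):

import copy
from types import SimpleNamespace

def generate_with_fallback(generation_config, temperatures, kwargs):
    # Copy once at the top so per-segment changes never leak back to the caller.
    kwargs = copy.copy(kwargs)
    beams_per_attempt = []
    for temperature in temperatures:
        generation_config.do_sample = temperature is not None and temperature > 0.0
        # get() (not pop()) keeps num_beams available on every fallback iteration.
        generation_config.num_beams = kwargs.get("num_beams", 1) if not generation_config.do_sample else 1
        generate_kwargs = copy.copy(kwargs)
        for key in ["do_sample", "temperature", "num_beams"]:
            generate_kwargs.pop(key, None)  # already captured on generation_config
        beams_per_attempt.append(generation_config.num_beams)
    return beams_per_attempt

caller_kwargs = {"num_beams": 2}
print(generate_with_fallback(SimpleNamespace(), (0.0, 0.2), caller_kwargs))  # [2, 1]
print(caller_kwargs)  # {'num_beams': 2} -- unchanged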
Who can review?
@patrickvonplaten @sanchit-gandhi @ylacombe