Add Flash Attention 2 support to Musicgen and Musicgen Melody #29939
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
sanchit-gandhi left a comment
Thanks for adding this!
return self.audio_encoder.sampling_rate

@property
def _attn_implementation(self):
This method is identical to the one in the PretrainedConfig class:

def _attn_implementation(self):

Can we remove it from here?
Not if we want to keep the setter part!
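For context, a minimal sketch of why the setter matters on a composite config like Musicgen's, assuming the chosen value has to be propagated to the decoder sub-config (the class and attribute names below are illustrative, not the actual implementation):

```python
from types import SimpleNamespace


class ToyCompositeConfig:
    """Toy stand-in for a composite config such as Musicgen's (illustrative only)."""

    def __init__(self, decoder):
        self.decoder = decoder
        self._attn_implementation_internal = None

    @property
    def _attn_implementation(self):
        # fall back to eager when nothing was set explicitly
        return self._attn_implementation_internal or "eager"

    @_attn_implementation.setter
    def _attn_implementation(self, value):
        # the setter is what lets the composite config forward the choice to its
        # sub-config; dropping it would lose that propagation
        self._attn_implementation_internal = value
        self.decoder._attn_implementation = value


config = ToyCompositeConfig(decoder=SimpleNamespace(_attn_implementation="eager"))
config._attn_implementation = "flash_attention_2"
assert config.decoder._attn_implementation == "flash_attention_2"
```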
MUSICGEN_ATTENTION_CLASSES = {
    "eager": MusicgenAttention,
    "flash_attention_2": MusicgenFlashAttention2,
Worth adding sdpa in one go as well? It would let you showcase the attention implementation through sdpa on a free-tier Colab T4 GPU (where FA2 is not available).
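If SDPA is added, the mapping would presumably grow a third entry along these lines (the `MusicgenSdpaAttention` class name is an assumption, mirroring how other models name their SDPA variants):

```python
MUSICGEN_ATTENTION_CLASSES = {
    "eager": MusicgenAttention,
    "sdpa": MusicgenSdpaAttention,  # assumed class name for the SDPA variant
    "flash_attention_2": MusicgenFlashAttention2,
}
```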
return self.audio_encoder.sampling_rate

@property
def _attn_implementation(self):
Same here
    else outputs_fa.decoder_hidden_states[-1]
)

assert torch.allclose(logits_fa[1:], logits[1:], atol=4e-2, rtol=4e-2)
Good enough for a generative audio model with FA2
I've copied over the same tolerance threshold as the other models (regardless of modality), btw.
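For reference, `torch.allclose` passes when `|a - b| <= atol + rtol * |b|` holds elementwise, so `atol=rtol=4e-2` allows a few percent of drift between the fp16 eager and FA2 kernels; a toy illustration of the kind of drift it tolerates (random data, not model logits):

```python
import torch

# Simulate small numerical drift between two attention kernels and check it
# against the loose tolerance used in the test.
logits = torch.randn(4, 2048)
logits_fa = logits + 5e-3 * torch.randn_like(logits)

assert torch.allclose(logits_fa, logits, atol=4e-2, rtol=4e-2)
assert not torch.allclose(logits_fa, logits, atol=1e-5, rtol=1e-5)
```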
I've also added SDPA! cc @amyeroberts or @ArthurZucker, could you review when you have time?
ArthurZucker left a comment
LGTM! Tests are ... huge, it would be nice if you could use copied from, it would help the review 😅
self._use_flash_attention_2 = config._attn_implementation == "flash_attention_2"
self._use_sdpa = config._attn_implementation == "sdpa"
let's only save self._attn_implementation please
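In other words, keep only the implementation string and branch on it where needed; a rough sketch of the suggested simplification (not the exact diff):

```python
# Store the implementation string once instead of one boolean per backend;
# the attention layer can then compare against it wherever it needs to branch.
self._attn_implementation = config._attn_implementation
```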
return attn_output, None, past_key_value
copied from can be used here as well!
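For context, transformers marks duplicated blocks with a `# Copied from` comment so that `make fix-copies` can keep them in sync with their source; roughly like this (the exact source class path below is an assumption, not taken from the diff):

```python
# Copied from transformers.models.bart.modeling_bart.BartFlashAttention2 with Bart->Musicgen
class MusicgenFlashAttention2(MusicgenAttention):
    ...
```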
ArthurZucker left a comment
Ouf! Thanks for the big PR and adding those tests!
What does this PR do?
Supersedes #27924
The attention tests all pass, but there is no integration-level equivalence between the original attention models and the FA2 ones. I don't hear any difference in quality, though, even if the generated song is not exactly the same.
cc @sanchit-gandhi and @amyeroberts, could you review please?
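For anyone wanting to try the feature once merged, usage should look roughly like the standard Musicgen example with `attn_implementation` switched to FA2 (a sketch, assuming a GPU with `flash-attn` installed; the checkpoint name is just one of the public Musicgen checkpoints):

```python
import torch
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
model = MusicgenForConditionalGeneration.from_pretrained(
    "facebook/musicgen-small",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to("cuda")

inputs = processor(
    text=["80s pop track with bassy drums and synth"],
    padding=True,
    return_tensors="pt",
).to("cuda")

# Generate a few seconds of audio; decode/save with the usual audio tooling.
audio_values = model.generate(**inputs, max_new_tokens=256)
```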