Convert MusicLDM #4579

sanchit-gandhi · 2023-08-11T15:08:14Z

What does this PR do?

Adds the conversion script for MusicLDM and a new pipeline class, closely based on the existing AudioLDM pipeline.

Changes compared to the existing AudioLDM pipeline:

AudioLDM only uses the CLAP text branch. MusicLDM uses the full CLAP model (text + audio branches) for similarity scoring: the cosine similarity is computed between the generated waveforms and the text inputs, and the audios are ranked by these scores (most similar -> least similar). For MusicLDM, this scoring has quite a big effect on the quality of the generated audios when num_waveforms_per_prompt > 1.
Addition of the CLAP feature extractor for pre-processing the audio waveforms for the CLAP audio branch: the feature extractor is registered as a new module in the __init__ and is used in the score_waveforms method.
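
For illustration, the ranking step can be sketched in plain Python. This is only a sketch under stated assumptions, not the pipeline's score_waveforms implementation: cosine_similarity and rank_by_similarity are hypothetical helpers, and the toy vectors stand in for the CLAP text and audio embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: similarity with a zero vector is defined as 0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    text_embedding: Sequence[float],
    audio_embeddings: Sequence[Sequence[float]],
) -> List[int]:
    """Return candidate indices ordered from most to least similar to the text."""
    scores = [cosine_similarity(text_embedding, emb) for emb in audio_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy embeddings standing in for CLAP outputs (made-up values, one text prompt
# and three candidate waveforms).
text_emb = [1.0, 0.0]
audio_embs = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
order = rank_by_similarity(text_emb, audio_embs)
print(order)  # -> [1, 2, 0]
```

With several candidates per prompt, the first index in the returned order would correspond to the waveform that best matches the text.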

TODO:

Finalise design: are we happy with adding a new pipeline as described above, or do we want to make the existing pipeline compatible with the two changes described above, possibly at the expense of greater code complexity (we would need to condition every call to the text_encoder)?
Add tests & update docs

cc @Vaibhavs10 @sayakpaul

sayakpaul · 2023-08-14T16:27:22Z

Depending on the complexity, I think it's okay to add a separate pipeline for this.

docs/source/en/api/pipelines/musicldm.md

tests/pipelines/musicldm/test_musicldm.py

src/diffusers/pipelines/musicldm/pipeline_musicldm.py

patrickvonplaten

Very clean! Only left some nits


* from audioldm
* fix vae
* move to new pipeline
* copied from audioldm
* remove redundant control flow
* iterate
* fix docstring
* finish pipeline
* tests: from audioldm2
* iterate
* finish fast tests
* finish slow integration tests
* add docs
* remove dtype test
* update toctree
* "copied from" in conversion (where possible)
* Update docs/source/en/api/pipelines/musicldm.md

Co-authored-by: Patrick von Platen <[email protected]>

* fix docstring
* make nightly
* style
* fix dtype test

---------

Co-authored-by: Patrick von Platen <[email protected]>

sanchit-gandhi force-pushed the convert-musicldm branch from 5076425 to 29398c6 on August 14, 2023 15:18

sanchit-gandhi force-pushed the convert-musicldm branch from 885d754 to 3443e65 on August 23, 2023 10:20

sanchit-gandhi added 15 commits August 23, 2023 13:18

from audioldm

4ba59be

fix vae

0df964a

move to new pipeline

ac9095f

copied from audioldm

d068510

remove redundant control flow

9afccf3

iterate

05cd6c4

fix docstring

9309c96

finish pipeline

a6532ad

tests: from audioldm2

78ed6f6

iterate

c2d8374

finish fast tests

1c6e13e

finish slow integration tests

2cfd9bb

add docs

517ad2f

remove dtype test

d9ed6f2

update toctree

e4a9bd8

sanchit-gandhi force-pushed the convert-musicldm branch from 67d3f3e to e4a9bd8 on August 23, 2023 12:22

sanchit-gandhi requested review from sayakpaul and patrickvonplaten August 23, 2023 13:20

"copied from" in conversion (where possible)

a073325

patrickvonplaten reviewed Aug 23, 2023

docs/source/en/api/pipelines/musicldm.md (outdated, resolved)

patrickvonplaten reviewed Aug 23, 2023

tests/pipelines/musicldm/test_musicldm.py (resolved)

patrickvonplaten reviewed Aug 23, 2023

src/diffusers/pipelines/musicldm/pipeline_musicldm.py (outdated, resolved)

patrickvonplaten approved these changes Aug 23, 2023

sanchit-gandhi and others added 3 commits August 25, 2023 11:42

Update docs/source/en/api/pipelines/musicldm.md

fac57f0

Co-authored-by: Patrick von Platen <[email protected]>

fix docstring

c30502c

make nightly

98b8394

sanchit-gandhi added 3 commits August 25, 2023 11:46

Merge remote-tracking branch 'origin/convert-musicldm' into convert-m…

cbc8b18

…usicldm

style

2dfe6e8

fix dtype test

2b19b5c

sanchit-gandhi merged commit b1290d3 into huggingface:main Aug 25, 2023

sanchit-gandhi deleted the convert-musicldm branch August 25, 2023 12:31
