Conversation

@zuazo (Contributor) commented Oct 16, 2023

What does this PR do?

This PR introduces a new script, convert_hf_to_openai.py, that converts Hugging Face Whisper models back to the original OpenAI Whisper format. It is simply the inverse of the existing convert_openai_to_hf.py script.
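For illustration only, here is a hypothetical round-trip check; the CLI flags and output path below are assumptions for this sketch, not the script's final interface:

```python
# Hypothetical invocation (flag names are illustrative assumptions):
#   python convert_hf_to_openai.py \
#       --checkpoint openai/whisper-tiny \
#       --whisper_dump_path whisper-tiny-openai.pt
#
# The converted checkpoint can then be loaded with the original library:
import whisper  # pip install openai-whisper

model = whisper.load_model("whisper-tiny-openai.pt")  # assumed dump path from above
result = model.transcribe("audio.wav")
print(result["text"])
```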

While Hugging Face is easier to use for tasks such as fine-tuning and offers many integrations, the original OpenAI Whisper library provides more fine-grained control over this specific model, which makes it easier to test new approaches and certain algorithms (at least in our case).

Doctests

I added a doctest at the beginning that passes, but it requires the openai-whisper package to be installed, so I left it disabled by using a double >> instead of the usual >>> prompt. I am not sure how you prefer to handle this case: leave it as is, add the Whisper package somewhere in the CI (e.g. .github/workflows/doctests.yml), or handle it in some other way.
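As a minimal sketch of what I mean by the disabled doctest (the function name here is made up for illustration), the `>>` prefix keeps the doctest runner from collecting the example while still documenting the usage:

```python
def convert_hf_to_openai(checkpoint: str, whisper_dump_path: str):
    """Convert a Hugging Face Whisper checkpoint to the OpenAI Whisper format.

    Example (disabled: the ``>>`` prefix instead of ``>>>`` means doctest skips it,
    so the optional openai-whisper dependency is not required in CI):

        >> convert_hf_to_openai("openai/whisper-tiny", "whisper-tiny-openai.pt")
    """
    ...
```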

Also, even though the original convert_openai_to_hf.py script does not have tests, let me know if you would like me to add some here. I have verified myself that the conversion works with all the Whisper model sizes, including Large V2.

Before submitting

Who can review?

Possible candidates:

@ArthurZucker (Collaborator) commented:

Linking #20953, as this was requested quite a while ago. We don't usually add these to transformers and would rather add it to the ## Resources section as a link to your repo with the script. WDYT @sanchit-gandhi?

@sanchit-gandhi (Contributor) commented:

The Resources section would probably be best here, @zuazo! It becomes difficult to maintain Transformers if we become a one-to-many export library (i.e. exporting the Transformers format to any number of other libraries).

I'm curious to hear which parameters you need from the OpenAI implementation that we don't offer in Transformers, though! We can certainly discuss adding them to Transformers on GitHub to improve the experience for you. Currently, we're a lot faster than OpenAI: https://twitter.com/reach_vb/status/1714741554030481786

@zuazo (Contributor, Author) commented Oct 25, 2023

Absolutely, that sounds reasonable. I will open a new PR to add it to the ## Resources section once we finish PR #26834, to avoid any merge conflicts.

Regarding our usage, we have experimented with coupling HF fine-tuned Whisper models with n-gram LMs. This seemed straightforward in the whisper library thanks to its existing BeamSearchDecoder, which makes it simple to incorporate a KenLM model.
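As a rough illustration of the shallow fusion we mean (not our actual code; the LM path, weighting, and scoring function are assumptions for this sketch), the idea is to add a weighted KenLM score to the decoder's cumulative log-probability for each beam hypothesis:

```python
import kenlm  # https://github.com/kpu/kenlm

lm = kenlm.Model("lm.arpa")  # assumed path to a trained n-gram LM

def fused_score(decoder_logprob: float, hypothesis_text: str, lm_weight: float = 0.5) -> float:
    """Combine the Whisper decoder's cumulative log-probability with an n-gram LM score."""
    # kenlm.Model.score returns a log10 probability; units must be made consistent
    # with the decoder's scores in real code.
    lm_logprob = lm.score(hypothesis_text, bos=True, eos=False)
    return decoder_logprob + lm_weight * lm_logprob
```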

If there is a similar feature in Transformers that I overlooked, I apologize. Navigating through such a comprehensive library can sometimes be challenging.

@sanchit-gandhi (Contributor) commented:

Interesting use case! Did you find that the Whisper decoder was not enough to predict accurate spellings/transcriptions? The Whisper decoder is in effect an "internal" LM, since it plays the role of generating the text conditional on the encoder hidden states. Is your n-gram LM trained with the same vocab size as Whisper, i.e. do you use the Whisper logits in combination with the n-gram model to get your final transcription? We have something similar to this with Wav2Vec2 + n-gram here:
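(A minimal sketch of the Wav2Vec2 + n-gram setup in Transformers, using Wav2Vec2ProcessorWithLM; the checkpoint below is one public example that bundles a KenLM model, not necessarily the one meant above:)

```python
import torch
from datasets import load_dataset
from transformers import Wav2Vec2ForCTC, Wav2Vec2ProcessorWithLM

model_id = "patrickvonplaten/wav2vec2-base-100h-with-lm"  # example checkpoint with a bundled n-gram LM
processor = Wav2Vec2ProcessorWithLM.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

sample = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")[0]
inputs = processor(sample["audio"]["array"], sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# batch_decode runs a pyctcdecode beam search that fuses the CTC logits with the KenLM model.
transcription = processor.batch_decode(logits.numpy()).text
print(transcription[0])
```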
