Skip to content
This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Commit

Permalink
ASR HUGGINGFACE_BACKBONES use AutoModel (#874)
Browse files Browse the repository at this point in the history
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
4 people authored Nov 24, 2021

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
1 parent 5f11d8f commit 4cda031
Showing 3 changed files with 5 additions and 3 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -28,6 +28,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

- Updated `FlashFinetuning` callback to use separate hooks that lets users use the freezing logic provided out-of-the-box from flash, route FlashFinetuning through a registry. ([#830](https://github.com/PyTorchLightning/lightning-flash/pull/830))

- Changed the `SpeechRecognition` task to use `AutoModelForCTC` rather than just `Wav2Vec2ForCTC` ([#874](https://github.com/PyTorchLightning/lightning-flash/pull/874))

### Deprecated

- Deprecated `flash.core.data.process.Serializer` in favour of `flash.core.data.io.output.Output` ([#927](https://github.com/PyTorchLightning/lightning-flash/pull/927))
4 changes: 2 additions & 2 deletions flash/audio/speech_recognition/backbone.py
Original file line number Diff line number Diff line change
@@ -20,7 +20,7 @@
SPEECH_RECOGNITION_BACKBONES = FlashRegistry("backbones")

if _AUDIO_AVAILABLE:
from transformers import Wav2Vec2ForCTC
from transformers import AutoModelForCTC, Wav2Vec2ForCTC

WAV2VEC_MODELS = ["facebook/wav2vec2-base-960h", "facebook/wav2vec2-large-960h-lv60"]

@@ -31,6 +31,6 @@
providers=[_HUGGINGFACE, _FAIRSEQ],
)

HUGGINGFACE_BACKBONES = ExternalRegistry(Wav2Vec2ForCTC.from_pretrained, "backbones", providers=_HUGGINGFACE)
HUGGINGFACE_BACKBONES = ExternalRegistry(AutoModelForCTC.from_pretrained, "backbones", providers=_HUGGINGFACE)

SPEECH_RECOGNITION_BACKBONES += HUGGINGFACE_BACKBONES
2 changes: 1 addition & 1 deletion requirements/datatype_audio.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
torchaudio
librosa>=0.8.1
transformers>=4.5
transformers>=4.11.0
datasets>=1.8

0 comments on commit 4cda031

Please sign in to comment.