Ichigo Whisper v0.1

@hahuyhoang411 released this 30 Dec 13:16
2124598

Ichigo Whisper v0.1 Release Notes:

  • 🎉 Introducing Ichigo Whisper v0.1
    We are thrilled to announce our very first speech tokenizer, built on the Whisper-medium model!
    Ichigo Whisper is a lightweight (22M parameters), open-source speech tokenizer designed to improve multilingual performance while maintaining strong English capabilities. Unlike models that produce continuous embeddings, Ichigo Whisper compresses speech into discrete tokens, enabling seamless integration with large language models (LLMs) for advanced speech understanding.
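To make "compresses speech into discrete tokens" concrete, here is a minimal sketch of codebook quantization. It assumes a plain nearest-neighbour lookup against a codebook, which this release does not confirm as Ichigo Whisper's exact method. The function names, the 2-D toy vectors, and the 3-entry codebook are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each continuous frame embedding to the index of its nearest codebook entry.

    Assumption: nearest-neighbour lookup; the real tokenizer may differ.
    """
    tokens = []
    for frame in frames:
        nearest = min(range(len(codebook)),
                      key=lambda i: squared_distance(frame, codebook[i]))
        tokens.append(nearest)
    return tokens


# Toy data (hypothetical): 3 codebook entries and 4 frame embeddings in 2-D.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [-0.8, 0.7], [0.4, 0.4]]

tokens = quantize(frames, codebook)
print(tokens)  # one integer token id per frame
```

The resulting integer ids are what an LLM can consume as ordinary vocabulary items, which is the integration benefit the notes describe.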

🚀 Performance Highlights:

1. Vietnamese

| Model | Codebook Size | Test Dataset | Test Samples | WER |
|---|---|---|---|---|
| Ichigo Whisper | 2561 | viVoice | 1000 | 11.36 |
| Whisper Medium | - | viVoice | 1000 | 18.64 |

2. English

| Model | Codebook Size | Test Dataset | Test Samples | WER |
|---|---|---|---|---|
| Ichigo Whisper | 2561 | LibriTTS-R | 1000 | 12.96 |
| Whisper Medium | - | LibriTTS-R | 1000 | 12.99 |
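For readers less familiar with the metric, the sketch below shows how word error rate (WER, lower is better) is commonly computed and how two scores can be compared. It is not the evaluation script used for this release. It assumes WER is reported as a percentage and applies no text normalization. The function names and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Dynamic-programming edit distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical transcripts: one substitution and one deletion over 6 reference words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"example WER: {example:.2f}")

# Scores taken from the tables above.
print(f"viVoice relative reduction: {relative_reduction(18.64, 11.36):.1f}%")
print(f"LibriTTS-R relative reduction: {relative_reduction(12.99, 12.96):.1f}%")
```

Read this way, the Vietnamese result is a large relative improvement over Whisper Medium, while the English result is essentially on par.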

🔗 Resources: