WeNet 3.0.1
❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤
What's Changed
- Fix loss returned by CTC model in RNNT by @kobenaxie in #2327
- [dataset] new IO design for code reuse across many speech tasks by @Mddct in #2316
  - (!! breaking changes, please update to torch 2.x / torchaudio 2.x !!) 🚀🚀🚀
- Fix eot by @Qiaochu-Song in #2330
- [decode] support length penalty by @xingchensong in #2331 (see the length-penalty sketch after this list)
- [bin] limit steps when averaging models by @xingchensong in #2332
- fix missing 'th_accuracy' in transducer by @DaobinZhu in #2337
- [dataset] support bucket by seq length by @Mddct in #2333
- [examples] remove unused yaml by @xingchensong in #2343
- [whisper] support arbitrary language and task by @xingchensong in #2342
  - (!! breaking changes, happy whisper happy life !!) 💯💯💯
- Minor fix for decode_wav by @kobenaxie in #2340
- fix comment by @Mddct in #2344
- [w2vbert] support w2vbert fbank by @Mddct in #2346
- [dataset] fix typo by @Mddct in #2347
- [wenet] fix args.enc by @Mddct in #2354
- [examples] Initial whisper results on wenetspeech by @xingchensong in #2356
- [examples] fix --penalty by @xingchensong in #2358
- [paraformer] add decoding args by @xingchensong in #2359
- [transformer] support flash attention via torch scaled_dot_product_attention (SDPA) by @Mddct in #2351 (see the SDPA sketch after this list)
  - (!! breaking changes, please update to torch 2.x / torchaudio 2.x !!) 🚀🚀🚀
- [conformer] support flash attention via torch SDPA by @Mddct in #2360
  - (!! breaking changes, please update to torch 2.x / torchaudio 2.x !!) 🚀🚀🚀
- [conformer] sdpa defaults to false by @Mddct in #2362
- [transformer] fix bidecoder sdpa by @Mddct in #2368
- [runtime] Configurable blank token idx by @zhr1201 in #2366
- [wenet] speed up runtime/core/decoder by @Sang-Hoon-Pakr in #2367
  - (!! Significant improvement in warmup time when using libtorch !!) 🚀🚀🚀
- [lint] fix lint by @cdliang11 in #2373
- [examples] better results on wenetspeech using revised transcripts by @xingchensong in #2371
  - (!! Significant improvement in whisper results !!) 💯💯💯
- [dataset] support pad or trim for whisper decoding by @Mddct in #2378
- [bin/recognize.py] support num_workers and compute dtype by @Mddct in #2379 (see the fp16 sketch after this list)
  - (!! Significant improvement in inference speed when using fp16 !!) 🚀🚀🚀
- [whisper] fix decoding maxlen by @Mddct in #2380
- fix error when modifying whisper ckpt by @fclearner in #2381
- Update recognize.py by @Mddct in #2383
- [transformer] add cross attention by @Mddct in #2388
  - (!! Significant improvement in the inference speed of attention_beam_search !!) 🚀🚀🚀
- [paraformer] fix some bugs by @Mddct in #2389
- new modules and methods by @Mddct 🤩🤩🤩
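
For the flash-attention items (#2351, #2360): a minimal sketch, not WeNet's actual code, of how torch.nn.functional.scaled_dot_product_attention (torch >= 2.0, hence the torch 2.x requirement) replaces a hand-written softmax(QKᵀ/√d)·V and lets PyTorch dispatch to a fused flash-attention kernel where supported.

```python
# Minimal sketch (not WeNet's implementation) comparing manual attention
# with torch.nn.functional.scaled_dot_product_attention (torch >= 2.0).
import math
import torch
import torch.nn.functional as F


def manual_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, time, d_k); mask: True = attend, False = ignore
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    return torch.matmul(torch.softmax(scores, dim=-1), v)


def sdpa_attention(q, k, v, mask=None):
    # Same math; PyTorch may dispatch to a fused flash-attention kernel.
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)


q = k = v = torch.randn(2, 4, 16, 64)
assert torch.allclose(manual_attention(q, k, v), sdpa_attention(q, k, v), atol=1e-4)
```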
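
For the length-penalty item (#2331): a hedged sketch of one common formulation, the GNMT-style penalty of Wu et al. (2016), used to rescore beam-search hypotheses so longer outputs are not unduly punished; the exact formula and flag semantics in WeNet may differ.

```python
# Illustrative GNMT-style length penalty for rescoring beam hypotheses;
# WeNet's actual formula and flag semantics may differ.
def length_penalty(length: int, alpha: float) -> float:
    # Wu et al. 2016: lp(Y) = ((5 + |Y|) / 6) ** alpha
    return ((5.0 + length) / 6.0) ** alpha


def rescore(hyps, alpha=0.5):
    """hyps: list of (token_ids, log_prob) pairs; returns them sorted by
    log_prob / length_penalty, best hypothesis first."""
    return sorted(hyps, key=lambda h: h[1] / length_penalty(len(h[0]), alpha),
                  reverse=True)


print(rescore([([1, 2, 3], -3.0), ([1, 2, 3, 4, 5, 6], -4.5)], alpha=1.0))
```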
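
For the compute-dtype item (#2379): a generic sketch of fp16 inference under torch.autocast, assuming a CUDA device; the model and tensor shapes are stand-ins, and the real flag names in wenet/bin/recognize.py may differ.

```python
# Generic fp16-inference sketch with torch.autocast (assumes a CUDA device);
# the stand-in model below is NOT a WeNet model.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(80, 256), torch.nn.ReLU(), torch.nn.Linear(256, 4000)
).cuda().eval()
feats = torch.randn(4, 1000, 80, device="cuda")  # (batch, frames, fbank dim)

with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
    logits = model(feats)  # matmuls run in half precision on the GPU

print(logits.dtype)  # torch.float16 under autocast
```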
New Contributors
- @Qiaochu-Song made their first contribution in #2330
- @Sang-Hoon-Pakr made their first contribution in #2367
❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤
Full Changelog: v3.0.0...v3.0.1