WeNet 3.0.1

@xingchensong xingchensong released this 09 Mar 06:50
· 117 commits to main since this release
a93af33

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

What's Changed

  • Fix loss returned by CTC model in RNNT by @kobenaxie in #2327
  • [dataset] new io for code reuse for many speech tasks by @Mddct in #2316
    • (!! breaking changes, please update to torch2.x torchaudio2.x !!) 🚀🚀🚀
  • Fix eot by @Qiaochu-Song in #2330
  • [decode] support length penalty by @xingchensong in #2331
  • [bin] limit step when averaging model by @xingchensong in #2332
  • fix 'th_accuracy' not in transducer by @DaobinZhu in #2337
  • [dataset] support bucket by seq length by @Mddct in #2333
  • [examples] remove useless yaml by @xingchensong in #2343
  • [whisper] support arbitrary language and task by @xingchensong in #2342
    • (!! breaking changes, happy whisper happy life !!) 💯💯💯
  • Minor fix decode_wav by @kobenaxie in #2340
  • fix comment by @Mddct in #2344
  • [w2vbert] support w2vbert fbank by @Mddct in #2346
  • [dataset] fix typo by @Mddct in #2347
  • [wenet] fix args.enc by @Mddct in #2354
  • [examples] Initial whisper results on wenetspeech by @xingchensong in #2356
  • [examples] fix --penalty by @xingchensong in #2358
  • [paraformer] add decoding args by @xingchensong in #2359
  • [transformer] support flash att by 'torch scaled dot attention' by @Mddct in #2351
    • (!! breaking changes, please update to torch2.x torchaudio2.x !!) 🚀🚀🚀
  • [conformer] support flash att by torch sdpa by @Mddct in #2360
    • (!! breaking changes, please update to torch2.x torchaudio2.x !!) 🚀🚀🚀
  • [conformer] sdpa default to false by @Mddct in #2362
  • [transformer] fix bidecoder sdpa by @Mddct in #2368
  • [runtime] Configurable blank token idx by @zhr1201 in #2366
  • [wenet] make runtime/code/decoder faster by @Sang-Hoon-Pakr in #2367
    • (!! Significant improvement on warmup when using libtorch !!) 🚀🚀🚀
  • [lint] fix lint by @cdliang11 in #2373
  • [examples] better results on wenetspeech using revised transcripts by @xingchensong in #2371
    • (!! Significant improvement on results of whisper !!) 💯💯💯
  • [dataset] support pad or trim for whisper decoding by @Mddct in #2378
  • [bin/recognize.py] support numworkers and compute dtype by @Mddct in #2379
    • (!! Significant improvement on inference speed when using fp16 !!) 🚀🚀🚀
  • [whisper] fix decoding maxlen by @Mddct in #2380
  • fix whisper ckpt modify error by @fclearner in #2381
  • Update recognize.py by @Mddct in #2383
  • [transformer] add cross attention by @Mddct in #2388
    • (!! Significant improvement on inference speed of attention_beam_search !!) 🚀🚀🚀
  • [paraformer] fix some bugs by @Mddct in #2389
  • new modules and methods by @Mddct in 🤩🤩🤩
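
Several entries above are marked as breaking and ask for torch2.x and torchaudio2.x. A minimal sketch for checking an environment before upgrading could look like the following. The helper names `major_version` and `check_min_major` are hypothetical and not part of WeNet, and the sketch assumes both packages are installed under the distribution names `torch` and `torchaudio`.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.1.0+cu118'."""
    return int(version.split(".", 1)[0])


def check_min_major(packages=("torch", "torchaudio"), min_major=2):
    """Map each package name to True, False, or None (None: not installed).

    Hypothetical helper for illustration only; WeNet does not ship it.
    """
    status = {}
    for name in packages:
        try:
            status[name] = major_version(metadata.version(name)) >= min_major
        except metadata.PackageNotFoundError:
            status[name] = None
    return status


if __name__ == "__main__":
    labels = {True: "OK", False: "too old, please upgrade", None: "not installed"}
    for name, ok in check_min_major().items():
        print(f"{name}: {labels[ok]}")
```

Only the major version is compared, since the notes ask for 2.x without naming a minimum minor release.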

New Contributors

❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

Full Changelog: v3.0.0...v3.0.1