
[examples] update whisper results on aishell #2322

Merged: 1 commit into main from xcsong-whisper-aishell, Jan 25, 2024

Conversation

xingchensong (Member)

  1. Initializing `ctc.ctc_lo.weight` with `decoder.embed.0.weight` improves convergence (see the sketch after this list).
  2. Zero dropout: 0.1 -> 0.0.
  3. Smaller learning rate: 5e-4 -> 1e-5.
  4. More epochs: 40 -> 80.
  5. More optimizer updates: gradient accumulation 4 -> 1.
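Item 1 above amounts to copying the decoder's input embedding matrix into the CTC output projection before training. A minimal PyTorch sketch, assuming a wenet-style model where `ctc.ctc_lo` is an `nn.Linear(d_model, vocab_size)` and `decoder.embed[0]` is an `nn.Embedding(vocab_size, d_model)` (so both weights have shape `(vocab_size, d_model)`); `init_ctc_from_decoder_embed` is a hypothetical helper name:

```python
import torch


def init_ctc_from_decoder_embed(model: torch.nn.Module) -> None:
    """Copy the decoder input embedding into the CTC output projection.

    Attribute paths follow the names in the PR description
    (ctc.ctc_lo.weight, decoder.embed.0.weight).
    """
    embed_weight = model.decoder.embed[0].weight   # (vocab_size, d_model)
    ctc_weight = model.ctc.ctc_lo.weight           # (vocab_size, d_model)
    assert embed_weight.shape == ctc_weight.shape, \
        "CTC projection and decoder embedding must share (vocab_size, d_model)"
    with torch.no_grad():
        ctc_weight.copy_(embed_weight)
```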

@robin1001 merged commit e600e87 into main on Jan 25, 2024
5 checks passed
@robin1001 deleted the xcsong-whisper-aishell branch on January 25, 2024 01:28
| decoding mode          | CER                                        |
|------------------------|--------------------------------------------|
| attention decoder      | 2.78 % N=104765 C=101943 S=2711 D=111 I=87 |
| ctc greedy search      | 6.89 % N=104765 C=98386 S=6210 D=169 I=839 |
| ctc prefix beam search | 6.86 % N=104765 C=98410 S=6194 D=161 I=830 |
| attention rescoring    | 5.00 % N=104765 C=99771 S=4874 D=120 I=245 |

## Whisper-largev3 (conv2d4, full-parameter tuning) Result
Contributor:

xingchensong, I'm a bit curious here: when the frame rate is reduced, the downsampling layer has to be re-initialized. Could that cause the model to forget what it learned? Only the in-distribution aishell test set was evaluated here, right? Is it possible that generalization has gotten worse?

xingchensong (Member, Author):

You'd need to add more data. The aishell example is only meant to verify that training works and that training is efficient.

Contributor:

> You'd need to add more data. The aishell example is only meant to verify that training works and that training is efficient.

Right. I'm curious whether [re-initializing only the downsampling layer, freezing the original model, and fine-tuning with LoRA] would keep generalization from degrading too badly. I'll run some experiments first. The main scenario is fine-tuning on low-resource languages such as Cantonese, where Whisper needs to forget less.
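For reference, the strategy described in this comment (re-initialize only the downsampling frontend, freeze the pretrained weights, train small LoRA adapters) could look roughly like the PyTorch sketch below. The module path `encoder.embed` and the `linear_q`/`linear_v` projection names are assumptions, and `LoRALinear` / `prepare_for_lora_finetune` are hypothetical helpers, not wenet or Whisper API:

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


def prepare_for_lora_finetune(model: nn.Module) -> nn.Module:
    # 1) Freeze every pretrained parameter.
    for p in model.parameters():
        p.requires_grad_(False)
    # 2) Re-initialize and unfreeze only the (hypothetical) downsampling frontend.
    for p in model.encoder.embed.parameters():
        if p.dim() > 1:
            nn.init.xavier_uniform_(p)
        else:
            nn.init.zeros_(p)
        p.requires_grad_(True)
    # 3) Wrap selected attention projections with trainable LoRA adapters.
    for _, module in list(model.named_modules()):
        for child_name, child in list(module.named_children()):
            if isinstance(child, nn.Linear) and child_name in ("linear_q", "linear_v"):
                setattr(module, child_name, LoRALinear(child))
    return model
```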
