Streaming TTS: multi-speaker inference fails in synthesize_streaming.py #2965
Comments
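The processing is mostly fine, but there is one detail to watch: if inference_dir is passed in args at inference time, the script enters the following branch.
It first converts the modules from dynamic graph to static graph, then loads the static model for inference. My guess is that your dynamic-to-static conversion did not pass in the speaker-related parameters. There are two ways to handle this: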
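After dynamic-to-static conversion, arguments can no longer be passed by parameter name, so you may need to change the order of the function's parameters. The second parameter is currently alpha, not spk_id, so the way you call it now most likely passes the value to the wrong parameter.
Right, it was the parameter position. After changing the second parameter of encoder_infer() to spk_id, the problem went away.
@443127316 Could you share your modified synthesize_streaming.py?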
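The parameter-position problem described in the comments above can be illustrated without PaddleSpeech. The following is a minimal sketch in plain Python, assuming that the converted static model only accepts positional inputs, as the thread suggests. The two `encoder_infer_*` functions and `call_positionally` are hypothetical stand-ins, not the real PaddleSpeech API; only the parameter order (alpha second versus spk_id second) is taken from the discussion.

```python
def encoder_infer_original(text, alpha=1.0, spk_id=None):
    """Hypothetical stand-in: alpha is the second parameter, as described in the thread."""
    return {"text": text, "alpha": alpha, "spk_id": spk_id}


def encoder_infer_reordered(text, spk_id=None, alpha=1.0):
    """Hypothetical stand-in after the fix: spk_id is the second parameter."""
    return {"text": text, "alpha": alpha, "spk_id": spk_id}


def call_positionally(fn, *inputs):
    """Mimic a converted model that binds its inputs by position only (assumption)."""
    return fn(*inputs)


phone_ids = [12, 7, 45]  # illustrative phone ids
speaker = 3              # illustrative speaker id

# With the original order, the speaker id is silently bound to alpha.
wrong = call_positionally(encoder_infer_original, phone_ids, speaker)
# With spk_id in second position, the same call binds it correctly.
right = call_positionally(encoder_infer_reordered, phone_ids, speaker)

print(wrong)
print(right)
```

In the first call the speaker id ends up in `alpha` and `spk_id` stays `None`, which matches the "wrong parameter" diagnosis given above. Reordering the parameters is the fix reported in the thread.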
General Question
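Hello, I am developing a streaming TTS (multi-speaker) service. A summary of my setup:
1. I used the aishell3 dataset and trained with the scripts in PaddleSpeech/examples/aishell3/tts3;
2. For training, I replaced the config file with conf/cnndecoder.yaml;
3. Training ran normally, and multi-speaker inference works correctly in synthesize_e2e.py.
I am now modifying synthesize_streaming.py to support streaming inference. The changes are:
a) Added the --speaker_dict and --spk_id arguments;
b) Loaded speaker_dict up front and computed spk_num;
c) When initializing am, changed line 77 to: am = am_class(idim=vocab_size, odim=odim, spk_num=spk_num, **am_config["model"])
that is, added spk_num=spk_num;
d) For sentence inference, changed lines 157-159 to: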
with paddle.no_grad():
    # acoustic model
    spk_id = paddle.to_tensor(args.spk_id)
    orig_hs = am_encoder_infer(phone_ids, spk_id=spk_id)
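Steps a) and b) above, loading the speaker dictionary and deriving spk_num, can be sketched as follows. This is only an illustration under stated assumptions: the file is assumed to hold one `<speaker_name> <integer_id>` pair per line, and the helper names `load_speaker_dict` and `check_spk_id` as well as the demo file contents are hypothetical, not taken from the original script.

```python
import tempfile
from pathlib import Path


def load_speaker_dict(path):
    """Parse lines of the assumed form '<speaker_name> <integer_id>'."""
    speakers = {}
    with open(path, "rt", encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            name, idx = parts
            speakers[name] = int(idx)
    return speakers


def check_spk_id(spk_id, spk_num):
    """Reject speaker ids outside the range the model was trained with."""
    if not 0 <= spk_id < spk_num:
        raise ValueError(f"spk_id {spk_id} is outside [0, {spk_num})")
    return spk_id


# Demo with a temporary file holding illustrative contents.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "speaker_id_map.txt"
    demo_path.write_text("SSB0005 0\nSSB0009 1\nSSB0011 2\n", encoding="utf-8")
    speaker_dict = load_speaker_dict(demo_path)

spk_num = len(speaker_dict)       # value to pass as spk_num when building the model
spk_id = check_spk_id(1, spk_num)  # value that would come from --spk_id
print(spk_num, spk_id)
```

Validating the id before inference is a design choice of this sketch; it turns an out-of-range --spk_id into an explicit error instead of a failure deep inside the model.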
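Problems encountered:
1) Inference returns an error:
2) If I remove spk_id, the generated audio is silent.
My guess about the error:
a) At line 85, am_encoder_infer = am.encoder_infer: does anything speaker-related need to be added here? If so, how should it be added?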