General Question

How can the efficiency of the streaming (online) model be improved in real-world applications?
For language models, inference efficiency can be improved by batching (increasing the batch size), concurrent inference requests can be handled by running multiple instances, and inference speed can be optimized with TensorRT.
Which of these approaches are feasible for the PaddleSpeech streaming models, and are there better recommendations?
Multi-threaded inference is not supported at the moment, but you can run multiple instances instead.
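Since in-process multi-threaded inference is not supported, "multiple instances" usually means one model per process. Below is a minimal sketch of that pattern, not PaddleSpeech's official API: `load_streaming_model`, the queue layout, and `NUM_INSTANCES` are placeholder assumptions you would replace with your actual streaming-engine setup.

```python
# Multi-instance sketch: each worker process loads its own copy of the
# streaming model, so concurrent requests run in parallel without
# multi-threaded inference inside a single process.
import multiprocessing as mp

NUM_INSTANCES = 4  # assumption: tune to available CPU cores / GPU memory


def load_streaming_model():
    """Hypothetical loader -- replace with your actual PaddleSpeech
    streaming-model initialization."""
    return lambda chunk: f"partial result for {len(chunk)} bytes"


def worker(request_queue: mp.Queue, result_queue: mp.Queue) -> None:
    # One model instance per process; nothing is shared across processes.
    model = load_streaming_model()
    while True:
        item = request_queue.get()
        if item is None:          # poison pill -> shut this worker down
            break
        request_id, audio_chunk = item
        result_queue.put((request_id, model(audio_chunk)))


if __name__ == "__main__":
    requests, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(requests, results))
             for _ in range(NUM_INSTANCES)]
    for p in procs:
        p.start()

    # Simulate concurrent requests; in production these would come from a
    # front-end server (HTTP/WebSocket) that distributes work over workers.
    for i in range(8):
        requests.put((i, b"\x00" * 3200))   # fake 100 ms of 16 kHz 16-bit PCM
    for _ in range(8):
        print(results.get())

    for _ in procs:
        requests.put(None)
    for p in procs:
        p.join()
```

For streaming ASR specifically, the front end would normally pin each session to one worker so that the chunks of a given stream always reach the same instance and its decoder state is preserved.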