Support for large-scale data training (millions or even tens of millions of hours)? #1505

brainbpe · 2024-02-19T08:51:39Z

The K2 framework does not support ASR training on large-scale data (millions or even tens of millions of hours) well. For example, multi-machine, multi-GPU training is not efficient enough. Are there any plans to improve this in the future?

kobenaxie · 2024-02-19T10:00:37Z

Actually, we can use Lhotse, the way NeMo does with its Lhotse-based dataset, to load large-scale data.
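
To illustrate why this kind of data loading can scale, here is a rough, pure-Python sketch of the general idea as I understand it: stream the manifest lazily instead of loading it into memory, give each GPU rank a disjoint slice, and form batches by total audio duration. This is not Lhotse's or NeMo's actual API. The function names (`iter_manifest`, `shard_for_rank`, `batch_by_duration`), the JSONL record layout, and the `max_duration` value are all assumptions made up for the example.

```python
import io
import json
from typing import Dict, Iterable, Iterator, List


def iter_manifest(lines: Iterable[str]) -> Iterator[Dict]:
    """Lazily yield one utterance record per JSONL line (hypothetical format)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def shard_for_rank(items: Iterable[Dict], rank: int, world_size: int) -> Iterator[Dict]:
    """Round-robin split so that each rank streams a disjoint subset."""
    for i, item in enumerate(items):
        if i % world_size == rank:
            yield item


def batch_by_duration(items: Iterable[Dict], max_duration: float) -> Iterator[List[Dict]]:
    """Group consecutive utterances; start a new batch when the next
    utterance would push the total past max_duration seconds."""
    batch: List[Dict] = []
    total = 0.0
    for item in items:
        dur = item["duration"]
        if batch and total + dur > max_duration:
            yield batch
            batch, total = [], 0.0
        batch.append(item)
        total += dur
    if batch:
        yield batch


# In-memory stand-in for a JSONL manifest file on disk (assumed data).
manifest = io.StringIO("\n".join(
    json.dumps({"id": f"utt{i}", "duration": d})
    for i, d in enumerate([4.0, 6.0, 3.0, 7.0, 5.0, 5.0])
))

# Rank 0 of 2 sees utt0, utt2, utt4 and batches them under a 10 s budget.
batches = list(
    batch_by_duration(
        shard_for_rank(iter_manifest(manifest), rank=0, world_size=2),
        max_duration=10.0,
    )
)
```

Here rank 0 ends up with two batches, `[utt0, utt2]` and `[utt4]`. Because every stage is a generator, memory use stays flat regardless of how many hours the manifest describes. A real setup would also need shuffling and bucketing by length, which this sketch leaves out.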

JinZr closed this as completed Feb 23, 2024
