Sample rate impact on RTF #140

kafan1986 · 2021-12-19T09:34:45Z

I have seen the minor accuracy impact of 8 kHz vs 16 kHz. I wanted to know the impact of sample rate on RTF.

My audio is originally in 8 kHz. Would resampling to 16 kHz gain some accuracy, or, given that the data is already lost, would I just be processing 2x the samples without any gain in accuracy?

Also, why does the ONNX model-based inference, which runs faster than PyTorch, not support the 8 kHz sample rate? Any plans to support it?

snakers4 · 2021-12-19T09:45:25Z

Maintainer

> I have seen the minor accuracy impact of 8 kHz vs 16 kHz.

As it should be, according to our tests.

> I wanted to know the impact of sample rate on RTF.

There should be very little impact, because the JIT model actually contains separate but very similar models for 8 and 16 kHz (different model splits were optimal for different scenarios, but for the 30 ms+ models we ended up with this).
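For reference, RTF (real-time factor) is processing time divided by the duration of the audio processed, so it can be measured for each sample rate directly. A minimal sketch; `real_time_factor`, `measure_rtf`, and the `process` callable are hypothetical names for illustration, not part of the library:

```python
import time
from typing import Callable, Sequence


def real_time_factor(processing_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = time spent processing / duration of the audio processed."""
    audio_seconds = num_samples / sample_rate
    return processing_seconds / audio_seconds


def measure_rtf(process: Callable[[Sequence[float]], object],
                audio: Sequence[float], sample_rate: int, repeats: int = 5) -> float:
    """Time `process(audio)` and return the best (lowest) RTF over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        process(audio)
        best = min(best, time.perf_counter() - start)
    return real_time_factor(best, len(audio), sample_rate)
```

Note that one second of audio is 8000 samples at 8 kHz and 16000 samples at 16 kHz, so the denominator is the same; any RTF difference comes only from how long the model takes on each input.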

> My audio is originally in 8 kHz. Would resampling to 16 kHz gain some accuracy, or, given that the data is already lost, would I just be processing 2x the samples without any gain in accuracy?

You may try, but most likely not: the information does not just appear. It may be estimated or hallucinated, but that probably will not help either.

In any case, the difference in quality is not that big, really.
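To illustrate why upsampling does not add information, here is a toy sketch using naive linear interpolation (an assumption for illustration only; real resamplers use proper low-pass filtering). Every original sample survives unchanged, and every new sample is computed purely from its neighbours:

```python
from typing import List, Sequence


def upsample_2x_linear(samples: Sequence[float]) -> List[float]:
    """Naive 2x upsampling: insert the midpoint between neighbouring samples."""
    out: List[float] = []
    for i, current in enumerate(samples):
        out.append(float(current))
        # The last sample has no right neighbour, so it is simply repeated.
        nxt = samples[i + 1] if i + 1 < len(samples) else current
        out.append((current + nxt) / 2.0)
    return out


audio_8k = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]
audio_16k = upsample_2x_linear(audio_8k)

# Twice as many samples to process...
assert len(audio_16k) == 2 * len(audio_8k)
# ...but dropping every second one gives back exactly the 8 kHz signal.
assert audio_16k[::2] == audio_8k
```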

> Also, why does the ONNX model-based inference, which runs faster than PyTorch, not support the 8 kHz sample rate? Any plans to support it?

The fact that it runs faster is due to ONNX itself, its compiler, or the static graph; I don't know exactly. I personally observed this only for very small models on very short inputs (i.e. exactly like this one). For longer inputs on STT models, this difference was not meaningful or stable.

When we export an ONNX model via tracing, it works, but without ifs.
If we need ifs (as with 8 or 16 kHz), then we need scripting. And with scripting, support for ifs is kind of meh and some cryptic errors arise (and that is after we had already plowed through a number of similar cryptic errors).
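As a rough illustration of why tracing loses ifs, here is a toy tracer in plain Python. It is not the PyTorch or ONNX export API; `TracedValue`, `model`, `trace`, and `replay` are all hypothetical. Tracing runs the function once on an example input and records only the operations that actually executed, so the branch that was not taken never reaches the exported graph:

```python
from typing import Callable, List, Tuple

Op = Tuple[str, float]


class TracedValue:
    """Wraps a number and records every multiplication applied to it."""

    def __init__(self, value: float, tape: List[Op]) -> None:
        self.value = value
        self.tape = tape

    def __mul__(self, factor: float) -> "TracedValue":
        self.tape.append(("mul", factor))
        return TracedValue(self.value * factor, self.tape)


def model(x, sample_rate: int):
    # Hypothetical model with a sample-rate branch.
    if sample_rate == 8000:
        return x * 0.5
    return x * 2.0


def trace(fn: Callable, example_input: float, sample_rate: int) -> List[Op]:
    """Run `fn` once and return the list of operations that actually executed."""
    tape: List[Op] = []
    fn(TracedValue(example_input, tape), sample_rate)
    return tape


def replay(tape: List[Op], x: float) -> float:
    """Re-apply the recorded operations; no branching is possible here."""
    for _name, factor in tape:
        x *= factor
    return x


tape_16k = trace(model, 1.0, 16000)
# Only the 16 kHz path was recorded; the 8 kHz branch is gone.
assert tape_16k == [("mul", 2.0)]
assert replay(tape_16k, 3.0) == 6.0
```

Keeping both branches requires compiling the source of the function rather than recording one run, which is what scripting does.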

So, since we decided to have only one model in the future to avoid chaos and complexity, we opted for 16 kHz. Since all of these sampling rates (8, 16, 32, 48 kHz) are related to 16 kHz by an integer factor, simple resampling strategies can be half-assed (e.g. for 48 kHz, just average each group of 3 values).
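The averaging trick mentioned above could look roughly like this. It is a minimal sketch in plain Python; `downsample_by_averaging` is a hypothetical helper, not a function from the library:

```python
from typing import List, Sequence


def downsample_by_averaging(samples: Sequence[float], factor: int) -> List[float]:
    """Average each consecutive group of `factor` samples into one output sample.

    Trailing samples that do not fill a whole group are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


audio_48k = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0, 99.0]
audio_16k = downsample_by_averaging(audio_48k, 3)  # 48 kHz / 3 = 16 kHz
assert audio_16k == [3.0, 12.0]
```

Averaging acts as a crude low-pass filter before decimation, so it is cheap but not equivalent to a proper resampler; for 32 kHz input the same helper would be called with a factor of 2.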
