🐛 Bug

To Reproduce

Just run the examples in silero-vad.ipynb and set window_size_samples of the Stream imitation example to 512. The stream result differs from the full audio result.

Stream result:
{'start': 2080} {'end': 31712} {'start': 43040} {'end': 74208} {'start': 79904} {'end': 109024} {'start': 149536} {'end': 164320} {'start': 167456} {'end': 182240} {'start': 183840} {'end': 212448} {'start': 217120} {'end': 228320} {'start': 230432} {'end': 241632} {'start': 245792} {'end': 253408} {'start': 261152} {'end': 286176} {'start': 294944} {'end': 301536} {'start': 304160} {'end': 312288} {'start': 326176} {'end': 420832} {'start': 422944} {'end': 455648} {'start': 459296} {'end': 491488} {'start': 493600} {'end': 520672} {'start': 524320} {'end': 567264} {'start': 572960} {'end': 601568} {'start': 607776} {'end': 621536} {'start': 639008} {'end': 669664} {'start': 672288} {'end': 692192} {'start': 698400} {'end': 713184} {'start': 721440} {'end': 749024} {'start': 782368} {'end': 799200} {'start': 818208} {'end': 854496} {'start': 857120} {'end': 865760} {'start': 872480} {'end': 904160} {'start': 906784} {'end': 917472} {'start': 920608} {'end': 952800} {'start': 958496}

Full audio result:
[{'start': 1568, 'end': 31200}, {'start': 42528, 'end': 73696}, {'start': 79392, 'end': 108512}, {'start': 149024, 'end': 163808}, {'start': 166944, 'end': 181728}, {'start': 183328, 'end': 211936}, {'start': 216608, 'end': 227808}, {'start': 229920, 'end': 241120}, {'start': 245280, 'end': 252896}, {'start': 260640, 'end': 285664}, {'start': 294432, 'end': 301024}, {'start': 303648, 'end': 311776}, {'start': 325664, 'end': 420320}, {'start': 422432, 'end': 455136}, {'start': 458784, 'end': 490976}, {'start': 493088, 'end': 520160}, {'start': 523808, 'end': 566752}, {'start': 572448, 'end': 601056}, {'start': 607264, 'end': 621024}, {'start': 638496, 'end': 669152}, {'start': 671776, 'end': 691680}, {'start': 697888, 'end': 712672}, {'start': 720928, 'end': 748512}, {'start': 781856, 'end': 798688}, {'start': 817696, 'end': 853984}, {'start': 856608, 'end': 865248}, {'start': 871968, 'end': 903648}, {'start': 906272, 'end': 916960}, {'start': 920096, 'end': 952288}]

As you can see, each stream timestamp is exactly 512 larger than the corresponding full audio timestamp. The stream result is wrong.
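For anyone who wants to check the offset themselves, here is a minimal sketch that pairs the streamed `start`/`end` events into segments and compares them with the full audio segments. It uses only the first three segments from the output above; `pair_events` is a hypothetical helper written for this comparison and is not part of silero-vad.

```python
# First three segments copied from the two outputs above.
stream_events = [
    {'start': 2080}, {'end': 31712},
    {'start': 43040}, {'end': 74208},
    {'start': 79904}, {'end': 109024},
]
full_segments = [
    {'start': 1568, 'end': 31200},
    {'start': 42528, 'end': 73696},
    {'start': 79392, 'end': 108512},
]


def pair_events(events):
    """Hypothetical helper: merge consecutive {'start'} / {'end'} events
    into {'start', 'end'} segments. A trailing unmatched start is dropped."""
    segments, current = [], {}
    for event in events:
        current.update(event)
        if 'end' in current:
            segments.append(current)
            current = {}
    return segments


stream_segments = pair_events(stream_events)

# Collect every distinct difference between matching timestamps.
offsets = {
    s[key] - f[key]
    for s, f in zip(stream_segments, full_segments)
    for key in ('start', 'end')
}
print(offsets)
```

If the set contains a single value, the two results differ only by a constant shift rather than by which segments were detected.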
Replies: 1 comment
Hi, this is not a bug, but a feature.
The streaming method does not have the luxury of looking into the future.
The ordinary and the streaming methods are very different.
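To illustrate the general idea, here is a toy sketch of how a chunk-by-chunk detector can end up one window behind an offline pass. This is not silero-vad's actual code: `offline_segments`, `ToyStreamDetector`, and the per-window speech flags are all made up for illustration, and the assumption is that the streaming side only knows its position after it has consumed a chunk.

```python
WINDOW = 512  # samples per chunk, matching window_size_samples in the report


def offline_segments(flags, window):
    """Toy offline pass: the whole signal is available, so each boundary is
    stamped with the sample index where the relevant window begins."""
    segments, start = [], None
    for i, is_speech in enumerate(flags):
        if is_speech and start is None:
            start = i * window
        elif not is_speech and start is not None:
            segments.append({'start': start, 'end': i * window})
            start = None
    return segments


class ToyStreamDetector:
    """Toy streaming pass: sees one chunk at a time and reports the stream
    position reached after consuming the chunk that changed the state."""

    def __init__(self, window):
        self.window = window
        self.current_sample = 0
        self.triggered = False

    def feed(self, is_speech):
        self.current_sample += self.window  # position after this chunk
        if is_speech and not self.triggered:
            self.triggered = True
            return {'start': self.current_sample}
        if not is_speech and self.triggered:
            self.triggered = False
            return {'end': self.current_sample}
        return None


# Hypothetical per-window speech decisions (True = speech).
flags = [False, True, True, False, False, True, False]

offline = offline_segments(flags, WINDOW)
detector = ToyStreamDetector(WINDOW)
streamed = [e for e in (detector.feed(f) for f in flags) if e is not None]

print(offline)
print(streamed)
```

In this toy setup every streamed timestamp lands exactly one window after the offline one, which is the same kind of constant shift as in the report.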