I fine-tuned the stt_en_fastconformer_hybrid_large_pc model for a new language, az (100 h of data, WER 7.4%, 350k steps). When transcribing ordinary speech (microphone recordings, audiobooks, movies, etc.) the results are very good, but on phone speech (good quality, no noise) the results are very poor. All audio is 16 kHz / mono / 16-bit. Are there any training parameters I should change, or is another NeMo model more accurate for phone recordings?
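
For reference, here is a minimal sketch of how the stated format (16 kHz / mono / 16-bit) could be checked for a given recording. It uses only Python's standard `wave` module. The helper names `describe_wav` and `matches_expected_format` are hypothetical and are not part of NeMo. It assumes uncompressed PCM WAV files.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, bits_per_sample) read from a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        # getsampwidth() is in bytes per sample, so multiply by 8 to get bits.
        return wav.getframerate(), wav.getnchannels(), wav.getsampwidth() * 8


def matches_expected_format(path, rate=16000, channels=1, bits=16):
    """Check the header against the expected format (defaults: 16 kHz, mono, 16-bit)."""
    return describe_wav(path) == (rate, channels, bits)
```

Calling `matches_expected_format("call.wav")` on a phone recording (the file name is a placeholder) only confirms what the header declares. It cannot tell whether the audio was recorded at a lower rate and upsampled to 16 kHz afterwards.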