different results with fine-tuned model

Hi,

I fine-tuned a model for a new language (az) starting from the _stt_en_fastconformer_hybrid_large_pc_ model (100 h of data, WER 7.4%, 350k steps). When transcribing normal microphone speech, audiobooks, movies, etc., the results are very good, but on phone speech (good quality, no noise) the results are very poor. All audio is 16 kHz, mono, 16-bit. Are there any training parameters I should change, or is another NeMo model more accurate for phone recordings?
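One thing that may be worth checking is whether the phone recordings really carry wideband content. This is an assumption, not something confirmed in this report: telephone audio is often captured at 8 kHz and later upsampled, so a file can be labelled 16 kHz while containing almost no energy above 4 kHz. The sketch below uses only the Python standard library; `wav_format` and `high_band_energy_ratio` are hypothetical helper names, not NeMo APIs, and the 4 kHz cutoff and 512-sample window are illustrative choices.

```python
import cmath
import math
import struct
import wave


def wav_format(path):
    """Return (sample_rate_hz, channels, bits_per_sample) of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnchannels(), wf.getsampwidth() * 8


def high_band_energy_ratio(path, cutoff_hz=4000.0, n=512):
    """Fraction of spectral energy at or above cutoff_hz in the first n frames.

    Assumes 16-bit mono PCM. Uses a naive DFT, so keep n small.
    """
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        rate = wf.getframerate()
        raw = wf.readframes(n)

    count = len(raw) // 2
    samples = struct.unpack("<%dh" % count, raw)

    low = high = 0.0
    # Skip bin 0 (DC offset) and stop below the Nyquist bin.
    for k in range(1, count // 2):
        coeff = -2j * math.pi * k / count
        x_k = sum(s * cmath.exp(coeff * t) for t, s in enumerate(samples))
        energy = abs(x_k) ** 2
        if k * rate / count >= cutoff_hz:
            high += energy
        else:
            low += energy

    total = low + high
    return high / total if total > 0 else 0.0
```

Calling `wav_format("call.wav")` (a hypothetical file name) should return `(16000, 1, 16)` for audio matching the description above. A `high_band_energy_ratio` close to zero on the phone files but not on the microphone files would suggest narrowband audio stored at 16 kHz. The function only reads the first `n` frames, so leading silence can skew the result.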

different results with fine-tuned model #14700

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

different results with fine-tuned model #14700

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions