Research papers

CNN ARCHITECTURES FOR LARGE-SCALE AUDIO CLASSIFICATION

UTTERANCE-LEVEL AGGREGATION FOR SPEAKER RECOGNITION IN THE WILD

SPEAKER DIARIZATION WITH LSTM

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing

WAVE-U-NET: A MULTI-SCALE NEURAL NETWORK FOR END-TO-END AUDIO SOURCE SEPARATION

Deep Speaker: an End-to-End Neural Speaker Embedding System

X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION