From 91905973846731075abf961ae63e1c59f6e923f0 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com>
Date: Wed, 11 Oct 2023 11:55:47 -0700
Subject: [PATCH] Remove PUBLICATIONS.md, point to github.io NeMo page instead (#7694) (#7695)

* update publications section to point to blog website page
* add hyphen
* use double backquotes for code formatting

---------

Signed-off-by: Elena Rastorgueva
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
---
 PUBLICATIONS.md | 214 ------------------------------------------------
 README.rst      |   5 +-
 2 files changed, 4 insertions(+), 215 deletions(-)
 delete mode 100644 PUBLICATIONS.md

diff --git a/PUBLICATIONS.md b/PUBLICATIONS.md
deleted file mode 100644
index cd120efc7e7b..000000000000
--- a/PUBLICATIONS.md
+++ /dev/null
@@ -1,214 +0,0 @@
-# Publications
-
-Here, we list a collection of research articles that utilize the NeMo Toolkit. If you would like to include your paper in this collection, please submit a PR updating this document.
-
--------
-
-# Automatic Speech Recognition (ASR)
-
-
- 2023
-
- * [Confidence-based Ensembles of End-to-End Speech Recognition Models](https://arxiv.org/abs/2306.15824)
- * [Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-to-End Automatic Speech Recognition](https://ieeexplore.ieee.org/abstract/document/10022960)
- * [Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition](https://ieeexplore.ieee.org/abstract/document/10023219)
-
-
-
-
- 2022
-
- * [Multi-blank Transducers for Speech Recognition](https://arxiv.org/abs/2211.03541)
-
-
-
-
- 2021
-
- * [Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition](https://arxiv.org/abs/2104.01721)
- * [SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition](https://www.isca-speech.org/archive/interspeech_2021/oneill21_interspeech.html)
- * [CarneliNet: Neural Mixture Model for Automatic Speech Recognition](https://arxiv.org/abs/2107.10708)
- * [CTC Variations Through New WFST Topologies](https://arxiv.org/abs/2110.03098)
- * [A Toolbox for Construction and Analysis of Speech Datasets](https://openreview.net/pdf?id=oJ0oHQtAld)
-
-
-
-
-
- 2020
-
- * [Cross-Language Transfer Learning, Continuous Learning, and Domain Adaptation for End-to-End Automatic Speech Recognition](https://ieeexplore.ieee.org/document/9428334)
- * [Correction of Automatic Speech Recognition with Transformer Sequence-To-Sequence Model](https://ieeexplore.ieee.org/abstract/document/9053051)
- * [Improving Noise Robustness of an End-to-End Neural Model for Automatic Speech Recognition](https://arxiv.org/abs/2010.12715)
-
-
-
-
-
- 2019
-
- * [Jasper: An End-to-End Convolutional Neural Acoustic Model](https://arxiv.org/abs/1904.03288)
- * [QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions](https://arxiv.org/abs/1910.10261)
-
-
-
-
-
---------
-
-
-## Speaker Recognition (SpkR)
-
-
- 2022
-
- * [TitaNet: Neural Model for Speaker Representation with 1D Depth-Wise Separable Convolutions and Global Context](https://ieeexplore.ieee.org/abstract/document/9746806)
-
-
-
-
-
- 2020
-
- * [SpeakerNet: 1D Depth-wise Separable Convolutional Network for Text-Independent Speaker Recognition and Verification]( https://arxiv.org/pdf/2010.12653.pdf)
-
-
-
---------
-
-## Speech Classification
-
-
- 2022
-
- * [AmberNet: A Compact End-to-End Model for Spoken Language Identification](https://arxiv.org/abs/2210.15781)
- * [Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models](https://arxiv.org/abs/2211.05103)
-
-
-
-
-
- 2021
-
- * [MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection](https://ieeexplore.ieee.org/abstract/document/9414470/)
-
-
-
-
-
- 2020
-
- * [MatchboxNet - 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition](http://www.interspeech2020.org/index.php?m=content&c=index&a=show&catid=337&id=993)
-
-
-
-
---------
-
-## Speech Translation
-
-
- 2022
-
- * [NVIDIA NeMo Offline Speech Translation Systems for IWSLT 2022](https://aclanthology.org/2022.iwslt-1.18/)
-
-
-
-
---------
-
-# Natural Language Processing (NLP)
-
-## Language Modeling
-
-
- 2022
-
- * [Evaluating Parameter Efficient Learning for Generation](https://arxiv.org/abs/2210.13673)
- * [Text Mining Drug/Chemical-Protein Interactions using an Ensemble of BERT and T5 Based Models](https://arxiv.org/abs/2111.15617)
-
-
-
-
- 2021
-
- * [BioMegatron: Larger Biomedical Domain Language Model ](https://aclanthology.org/2020.emnlp-main.379/)
-
-
-
-## Neural Machine Translation
-
-
- 2022
-
- * [Finding the Right Recipe for Low Resource Domain Adaptation in Neural Machine Translation](https://arxiv.org/abs/2206.01137)
-
-
-
-
- 2021
-
- * [NVIDIA NeMo Neural Machine Translatio Systems for English-German and English-Russian News and Biomedical Tasks at WMT21](https://arxiv.org/pdf/2111.08634.pdf)
-
-
-
---------
-
-## Dialogue State Tracking
-
-
- 2021
-
- * [SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services](https://arxiv.org/abs/2105.08049)
-
-
-
-
- 2020
-
- * [A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided Dialogue Dataset](https://arxiv.org/abs/2008.12335)
-
-
---------
-
-
-# Text To Speech (TTS)
-
-
- 2022
-
- * [Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers](https://arxiv.org/abs/2211.00585)
-
-
-
-
- 2021
-
- * [TalkNet: Fully-Convolutional Non-Autoregressive Speech Synthesis Model](https://www.isca-speech.org/archive/interspeech_2021/beliaev21_interspeech.html)
- * [TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction](https://arxiv.org/abs/2104.08189)
- * [Hi-Fi Multi-Speaker English TTS Dataset](https://www.isca-speech.org/archive/pdfs/interspeech_2021/bakhturina21_interspeech.pdf)
- * [Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings](https://arxiv.org/abs/2110.03584)
-
-
-
-
---------
-
-# (Inverse) Text Normalization
-
- 2022
-
- * [Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization](https://arxiv.org/abs/2203.15917)
- * [Thutmose Tagger: Single-pass neural model for Inverse Text Normalization](https://arxiv.org/abs/2208.00064)
-
-
-
-
- 2021
-
- * [NeMo Inverse Text Normalization: From Development to Production](https://www.isca-speech.org/archive/pdfs/interspeech_2021/zhang21ga_interspeech.pdf)
- * [A Unified Transformer-based Framework for Duplex Text Normalization](https://arxiv.org/pdf/2108.09889.pdf )
-
-
-
---------
\ No newline at end of file
diff --git a/README.rst b/README.rst
index 15a2120d30b5..acc30a68e610 100644
--- a/README.rst
+++ b/README.rst
@@ -360,7 +360,10 @@ We welcome community contributions! Please refer to the `CONTRIBUTING.md `_. We welcome the addition of your own articles to this list !
+We provide an ever-growing list of `publications `_ that utilize the NeMo framework.
+
+If you would like to add your own article to the list, you are welcome to do so via a pull request to this repository's ``gh-pages-src`` branch.
+Please refer to the instructions in the `README of that branch `_.

 License
 -------