Generating Repetitions with Appropriate Repeated Words

Code repository for the NAACL 2022 paper "Generating Repetitions with Appropriate Repeated Words": https://aclanthology.org/2022.naacl-main.62/

Usage

Setup

git clone https://github.com/titech-nlp/repetition-generation

cd repetition-generation

poetry install

Download the repetition dataset from URL (dataset license: CC BY-NC 4.0).

Unzip the dataset and place it in data/repetition/
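If you prefer to do the unzip step from Python, the following is a minimal sketch using only the standard library. The helper name `extract_archive` and the archive file name in the usage comment are hypothetical and not part of this repository; substitute the name of the file you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a zip archive into dest_dir, creating the directory if needed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
    # Return the relative paths of all files now under dest,
    # so the resulting layout can be checked.
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )


# Example usage (hypothetical archive name):
# print(extract_archive("repetition.zip", "data/repetition"))
```

If the archive contains its own top-level folder, the files will land one level deeper than data/repetition/; inspect the returned paths and move the files up if needed.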

Preprocess and calculate repeat score

export PYTHONPATH='./'
poetry run python scripts/preprocess.py

This step trains a language model that is used to compute the repeat score.

Training

poetry run python scripts/train.py

Alternatively, download our trained model from URL, unzip it, and place it in models/

Test

poetry run python scripts/test.py
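The three steps above can also be chained from a small driver script. This is only a sketch: the names `STEPS`, `pipeline_env`, and `run_pipeline` are hypothetical helpers, not part of the repository, and it assumes you run it from the repository root with `poetry` on your PATH. The commands and the PYTHONPATH value are the ones shown in this README.

```python
import os
import subprocess

# The three commands from this README, in the order they are meant to run.
STEPS = [
    ["poetry", "run", "python", "scripts/preprocess.py"],
    ["poetry", "run", "python", "scripts/train.py"],
    ["poetry", "run", "python", "scripts/test.py"],
]


def pipeline_env(base_env=None):
    """Return a copy of the environment with PYTHONPATH set as in the README."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONPATH"] = "./"
    return env


def run_pipeline(skip_training=False, dry_run=False):
    """Run each step in order; with check=True the first failure raises."""
    steps = [
        cmd for cmd in STEPS
        if not (skip_training and cmd[-1] == "scripts/train.py")
    ]
    for cmd in steps:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, env=pipeline_env(), check=True)
    return steps
```

Calling `run_pipeline(dry_run=True)` prints the commands without executing them. `skip_training=True` is meant for the case where the trained model has already been placed in models/; whether preprocessing is still required in that case is not stated here, so the sketch keeps it.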

Citation

@inproceedings{kawamoto-etal-2022-generating,
    title = "Generating Repetitions with Appropriate Repeated Words",
    author = "Kawamoto, Toshiki  and
      Kamigaito, Hidetaka  and
      Funakoshi, Kotaro  and
      Okumura, Manabu",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.naacl-main.62",
    pages = "852--859",
    abstract = "A repetition is a response that repeats words in the previous speaker{'}s utterance in a dialogue. Repetitions are essential in communication to build trust with others, as investigated in linguistic studies. In this work, we focus on repetition generation. To the best of our knowledge, this is the first neural approach to address repetition generation. We propose Weighted Label Smoothing, a smoothing method for explicitly learning which words to repeat during fine-tuning, and a repetition scoring method that can output more appropriate repetitions during decoding. We conducted automatic and human evaluations involving applying these methods to the pre-trained language model T5 for generating repetitions. The experimental results indicate that our methods outperformed baselines in both evaluations.",
}
