OpenLTG-MLM is a code repository for open-ended text generation with bidirectional pre-trained language models, such as the BERT family (e.g., BERT and RoBERTa), through a non-autoregressive generation paradigm.
The main advantages of our work (accepted at ACL 2023!) are enhancing the diversity of generated text and improving the generation speed for long text. We aim to bring renewed attention to bidirectional attention models, as they still hold potential in the field of text generation!
The codebase relies on Fairseq and PyTorch. As of March 12, 2024, Fairseq version 0.12.2 is verified to be compatible:
pip install fairseq==0.12.2
We conducted experiments on the open-domain WritingPrompts task, reporting results on datasets of different sizes:
Dataset | Test set | Size |
---|---|---|
WritingPrompts (Slim) | | 26k |
WritingPrompts | download (.tar.bz2) | 272k |
WritingPromptsX | | 587k |
We provide two generation methods, Direct Generation (DirectGen) and Recursive Span Generation (RecSpanGen):
- DirectGen generates the target text directly in its entirety.
- RecSpanGen generates the target text span by span, with a user-specified number of spans (see the sketch below).
> Recursive span generation helps our model remain competitive in scenarios involving longer text generation.
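As a rough illustration of the recursive idea (a minimal sketch under our own naming; `mlm_fill` is a hypothetical stand-in for one round of non-autoregressive mask filling, not a function from this repo):

```python
def recursive_span_generate(prompt_tokens, num_spans, span_len, mlm_fill):
    """Conceptual sketch of recursive span generation (illustration only,
    not the OpenLTG-MLM implementation).

    `mlm_fill(context, n)` stands in for one round of parallel mask
    filling with a bidirectional LM: given the tokens so far plus n
    [MASK] slots, it returns the n predicted tokens. Each new span
    conditions on the prompt and every previously generated span."""
    output = list(prompt_tokens)
    for _ in range(num_spans):
        output.extend(mlm_fill(output, span_len))
    return output[len(prompt_tokens):]
```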
Sampling Parameters:
- DSWAttn (Dynamic Sliding Window Attention) helps the attention mechanism focus on crucial information within a broader local context, preventing interference from distant noise.
- NSampling (Nucleus Sampling) helps mitigate degeneration issues of language models on open-domain tasks.
- LTD (Linear Temperature Decay) is a crucial technique that keeps outputs high-quality throughout the iterative refinement process (a combined sketch of NSampling and LTD follows this list).
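For intuition, here is a minimal sketch of how nucleus sampling and linear temperature decay could fit together across refinement iterations (illustrative code with assumed hyperparameters, not the repo's implementation; DSWAttn is an attention-masking technique and is not shown):

```python
import torch

def nucleus_sample(logits, p=0.95, temperature=1.0):
    """Top-p (nucleus) sampling: restrict sampling to the smallest set
    of tokens whose cumulative probability exceeds p."""
    probs = torch.softmax(logits / temperature, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep = cumulative - sorted_probs < p  # the top token is always kept
    sorted_probs[~keep] = 0.0
    choice = torch.multinomial(sorted_probs, 1)  # renormalizes internally
    return sorted_idx[choice]

def linear_temperature(step, total_steps, t_start=1.0, t_end=0.1):
    """Linear temperature decay: explore early, then sharpen the
    distribution as the iterative refinement converges."""
    frac = step / max(total_steps - 1, 1)
    return t_start + frac * (t_end - t_start)
```

At refinement step `t` out of `T`, a masked position would then be re-sampled via `nucleus_sample(logits, temperature=linear_temperature(t, T))`.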
Run inference:
bash openltg_mlm/scripts/tasks/xsum/run_inf.sh
We can extend the maximum encoding length of the RoBERTa model with --hierarchical-pos to support inputs longer than 1k tokens.
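The general idea behind hierarchical position embeddings is sketched below (our illustrative reading, with assumed names and an assumed coarse-level weight `alpha`; not the exact --hierarchical-pos code): a position beyond the pre-trained range is decomposed into a quotient and remainder over the original position table, and the two embeddings are combined.

```python
import torch
import torch.nn as nn

class HierarchicalPositionEmbedding(nn.Module):
    """Illustrative sketch (not the repo's --hierarchical-pos code).

    Reuses a pre-trained position table of size M (e.g. RoBERTa's 512
    positions) for longer inputs: position i is decomposed into a coarse
    block index i // M and a fine offset i % M, and the two embeddings
    are mixed, covering up to M * M positions."""

    def __init__(self, pretrained_table: nn.Embedding, alpha: float = 0.4):
        super().__init__()
        self.table = pretrained_table
        self.max_pos = pretrained_table.num_embeddings
        self.alpha = alpha  # assumed weight for the coarse level

    def forward(self, positions: torch.LongTensor) -> torch.Tensor:
        high = positions // self.max_pos  # coarse block index
        low = positions % self.max_pos    # fine offset within a block
        return self.alpha * self.table(high) + self.table(low)
```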
# Prepare Data
bash openltg_mlm/scripts/process/xsum/binarize.sh
# DirectGen
bash openltg_mlm/scripts/tasks/xsum/run_train.sh
# or RecSpanGen
# bash openltg_mlm/scripts/tasks/xsum/run_rec_train.sh
@inproceedings{liang-etal-2023-open,
title = "Open-ended Long Text Generation via Masked Language Modeling",
author = "Liang, Xiaobo and
Tang, Zecheng and
Li, Juntao and
Zhang, Min",
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.acl-long.13",
doi = "10.18653/v1/2023.acl-long.13",
pages = "223--241",
}