
Commit fe330c3

initial commit
1 parent 5ad3757 commit fe330c3

15 files changed: +3313 -0 lines changed

README.md

+52
@@ -0,0 +1,52 @@
# MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems

<img src="plot/pytorch-logo-dark.png" width="10%"> [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

<img align="right" src="plot/HKUST.jpg" width="12%">

This is the implementation of the **EMNLP 2020** paper:

**MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems**. [**Zhaojiang Lin**](https://zlinao.github.io/), [**Andrea Madotto**](https://andreamad8.github.io), [**Genta Indra Winata**](https://gentawinata.com), Pascale Fung [[PDF]]()

## Citation:
If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:
<pre>

</pre>

## Abstract:
In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. MinTL is a simple yet effective transfer learning framework that allows us to plug-and-play pre-trained seq2seq models and jointly learn dialogue state tracking and dialogue response generation. Unlike previous approaches, which use a copy mechanism to "carry over" the old dialogue state to the new one, we introduce Levenshtein belief spans (Lev), which allow efficient dialogue state tracking with a minimal generation length. We instantiate our learning framework with two pre-trained backbones, T5 (Raffel et al., 2019) and BART (Lewis et al., 2019), and evaluate them on MultiWOZ. Extensive experiments demonstrate that: 1) our systems establish new state-of-the-art results on end-to-end response generation, 2) MinTL-based systems are more robust than baseline methods in the low-resource setting, achieving competitive results with only 20% of the training data, and 3) Lev greatly improves inference efficiency.
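To make the Lev idea concrete, here is a minimal, hypothetical sketch (the slot names, the `<null>` marker, and the dict-based format are illustrative, not the repo's actual data format): instead of regenerating the full belief state at every turn, the decoder produces only the edits for the current turn, which are then applied to the previous state.

```python
# Illustrative sketch of Levenshtein belief spans (Lev); not the repo's actual format.
NULL = "<null>"  # hypothetical marker meaning "remove this slot"

def apply_lev(prev_state: dict, lev_edits: dict) -> dict:
    """Apply the generated per-turn slot edits to the previous belief state."""
    new_state = dict(prev_state)
    for slot, value in lev_edits.items():
        if value == NULL:
            new_state.pop(slot, None)   # slot dropped this turn
        else:
            new_state[slot] = value     # slot added or updated
    return new_state

prev = {("hotel", "area"): "north", ("hotel", "stars"): "4"}
lev = {("hotel", "stars"): NULL, ("restaurant", "food"): "thai"}  # short edit, not the full state
print(apply_lev(prev, lev))
# {('hotel', 'area'): 'north', ('restaurant', 'food'): 'thai'}
```

Because the Lev is usually much shorter than the full belief state, decoding it keeps the generation length minimal, which is where the inference-efficiency gain comes from.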
## Dependency
Check the packages listed in requirements.txt, or simply run:
```console
❱❱❱ pip install -r requirements.txt
```

## Experiments Setup
We use the preprocessing script from [**DAMD**](https://gitlab.com/ucdavisnlp/damd-multiwoz).
Please check setup.sh for data preprocessing.

## Experiments
**T5 End2End**
```console
❱❱❱ python train.py --mode train --context_window 2 --pretrained_checkpoint t5-small --cfg seed=557 batch_size=32
```
**T5 DST**
```console
❱❱❱ python DST.py --mode train --context_window 3 --cfg seed=557 batch_size=32
```

**BART End2End**
```console
❱❱❱ python train.py --mode train --context_window 2 --pretrained_checkpoint bart-large-cnn --gradient_accumulation_steps 8 --lr 3e-5 --back_bone bart --cfg seed=557 batch_size=8
```
**BART DST**
```console
❱❱❱ python DST.py --mode train --context_window 3 --gradient_accumulation_steps 10 --pretrained_checkpoint bart-large-cnn --back_bone bart --lr 1e-5 --cfg seed=557 batch_size=4
```

Check src/run.py for more information.
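The `--pretrained_checkpoint` values above (`t5-small`, `bart-large-cnn`) are standard Hugging Face Transformers checkpoints. As a rough sketch of how such backbones can be loaded with the pinned `transformers>=2.8.0` (the repo's own model wrappers, selected via `--back_bone`, may add task-specific logic; see src/run.py):

```python
# Hedged sketch: loading the two seq2seq backbones named by --pretrained_checkpoint.
from transformers import (
    T5ForConditionalGeneration, T5Tokenizer,
    BartForConditionalGeneration, BartTokenizer,
)

def load_backbone(back_bone: str, checkpoint: str):
    """Return (tokenizer, model) for the requested backbone."""
    if back_bone == "t5":
        return T5Tokenizer.from_pretrained(checkpoint), T5ForConditionalGeneration.from_pretrained(checkpoint)
    if back_bone == "bart":
        # older transformers releases accept the short name "bart-large-cnn";
        # newer ones expect "facebook/bart-large-cnn"
        return BartTokenizer.from_pretrained(checkpoint), BartForConditionalGeneration.from_pretrained(checkpoint)
    raise ValueError(f"unknown backbone: {back_bone}")

# e.g. tokenizer, model = load_backbone("t5", "t5-small")
```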

plot/HKUST.jpg

36.2 KB

plot/MinTL.png

338 KB

plot/pytorch-logo-dark.png

15.3 KB

requirements.txt

+5
@@ -0,0 +1,5 @@
torch==1.4.0
transformers>=2.8.0
spacy==2.2.2
nltk==3.4.5

setup.sh

+9
@@ -0,0 +1,9 @@
cd src/damd_multiwoz
python -m spacy download en_core_web_sm
# preprocessing
python data_analysis.py
python preprocess.py
# set up the Python path
# run `pwd` inside damd_multiwoz to find the absolute path of the damd_multiwoz folder
export PYTHONPATH='path of damd_multiwoz folder'
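The last line of setup.sh leaves the PYTHONPATH value as a placeholder to be filled in by hand. A hypothetical alternative (not part of the repo) is to make damd_multiwoz importable programmatically from a driver script:

```python
# Hypothetical alternative to exporting PYTHONPATH by hand; not part of the repo.
# Prepend the absolute path of damd_multiwoz to sys.path before importing from it.
import sys
from pathlib import Path

# assumes this snippet lives in the repository root; adjust the relative path if not
DAMD_DIR = Path(__file__).resolve().parent / "src" / "damd_multiwoz"
sys.path.insert(0, str(DAMD_DIR))
```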

src/.DS_Store

8 KB
Binary file not shown.
