Pascalson
diff --git a/Qadpt_model.png b/Qadpt_model.png (349 KB)
diff --git a/Readme.md b/Readme.md (+30 -43)
diff --git a/args.py b/args.py (+52)
@@ -1,51 +1,38 @@
 # DyKGChat
-This project is the implementation of our paper **DyKgChat: A Multi-domain Chit-chat Dialogue Generation Dataset Grounding on Dynamic Knowledge Graphs**.
+This project contains the collected data and code for our paper **Yi-Lin Tuan, Yun-Nung Chen, Hung-yi Lee. "DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs", EMNLP 2019**.
 
+* our proposed approach: (Qadpt) **Q**uick **Ad**a**pt**ive Dynamic Knowledge-Grounded Neural Conversation Model (pronounced: Q-adapt)
 
-## Requirements
-* jieba
-* python3
+![Qadpt](Qadpt_model.png)
+
+## Setup
+### Installation (my environment)
+* python 3.6
 * tensorflow r1.13
+* jieba
+* nltk 3.2.5
 
-## Files
-* `data/`: the collected data `hgzhz/` and `friends/` as well as their trained TransE
-* `model_ckpts/`: the trained models
-* `Qadpt/`: the programs
+### Files
+* `data/`: the collected data `hgzhz/` and `friends/` as well as the trained TransE embeddings
+* `model_ckpts/`: the trained models in the paper
 
 
 ## Usage
-* clone the repository and switch to directory `Qadpt/`
-```
-$cd Qadpt/
-```
-
-* testing hgzhz (the following commands must be in order)
-```
-$bash run.sh -1 pred_acc Qadpt
-$bash run.sh -1 ifchange Qadpt
-$bash run.sh -1 eval_pred_acc Qadpt
-```
-
-* testing friends
-```
-$bash frun.sh -1 pred_acc Qadpt
-$bash frun.sh -1 ifchange Qadpt
-$bash frun.sh -1 eval_pred_acc Qadpt
-```
-
-The automatic evaluation results will be printed on the screen, and some files will be outputed to `Qadpt/hgzhz_results/` or `Qadpt/friends_results/`.
-
-The default `ifchange` evaluates **Last-1** score. To change to **random** or **Last-2**, modify the `line 464` in `main.py` to `level=-1` or `level=1`.
-
-
-* training hgzhz
-```
-$bash run.sh 0 None Qadpt_new
-```
-
-* testing friends
-```
-$bash frun.sh 0 None Qadpt_new
-```
-
-The trained model will be stored in `model_ckpts/hgzhz/Qadpt_new/` or `model_ckpts/friends/Qadpt_new/`
+* clone the repository
+* run the script `run.sh`
+```
+$bash run.sh <GPU_ID> <method> <model> <data> <exp_name>
+```
+  * for `<GPU_ID>`, check your device availability with `nvidia-smi`
+  * for `<method>`, choose from `train`, `pred_acc`, `eval_pred_acc`, `ifchange`
+  * for `<model>`, choose from `seq2seq`, `MemNet`, `TAware`, `KAware`, `Qadpt`
+  * for `<data>`, choose from `friends`, `hgzhz_v1_0` (used in our paper), `hgzhz` (the newest version)
+  * for `<exp_name>`, check the directory `model_ckpts` for the available names
+
+## More description
+* testing method
+  * `pred_acc`: for metrics `Generated-KW`, `BLEU-2`, `distinct-n`
+  * `eval_pred_acc`: for metrics `KW-Acc`, `KW/Generic`, `perplexity`
+  * `ifchange`: for change rates / accurate change rates
+* script options
+  * `hops_num` and `change_level` are not command-line arguments of `run.sh`; change them by editing `run.sh`
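
The `run.sh` argument order described in the README can be checked before launching a job. The following is a minimal sketch, not part of the repository: `build_run_command` is a hypothetical helper, the choice lists are copied from the README bullets, and the experiment name `Qadpt` is only an assumption about what exists under `model_ckpts`.

```python
import shlex

# Allowed values, copied from the README options for run.sh.
METHODS = ("train", "pred_acc", "eval_pred_acc", "ifchange")
MODELS = ("seq2seq", "MemNet", "TAware", "KAware", "Qadpt")
DATASETS = ("friends", "hgzhz_v1_0", "hgzhz")


def build_run_command(gpu_id, method, model, data, exp_name):
    """Hypothetical helper: validate the arguments and return an argv list."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if data not in DATASETS:
        raise ValueError(f"unknown data: {data!r}")
    # Order follows: run.sh <GPU_ID> <method> <model> <data> <exp_name>
    return ["bash", "run.sh", str(gpu_id), method, model, data, exp_name]


# "Qadpt" as exp_name is an assumed directory name under model_ckpts.
cmd = build_run_command(0, "pred_acc", "Qadpt", "friends", "Qadpt")
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd)` from the repository root; validating first avoids starting a job with a misspelled method or dataset name.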
@@ -0,0 +1,52 @@
+import argparse
+import re
+
+def parse():
+    parser = argparse.ArgumentParser(
+        description='You have to set the parameters for the model.')
+
+    # directory related
+    parser.add_argument("--model", type=str, default='Qadpt')
+    parser.add_argument("--model-dir", type=str, default='model_ckpts')
+    parser.add_argument("--results-dir", type=str, default='results')
+    parser.add_argument("--data-dir", type=str, default='data')
+    parser.add_argument("--data-path", type=str, default='data/friends/friends.txt')
+    parser.add_argument("--data-type", type=str, default='test')
+    # parameters related
+    parser.add_argument("--size", type=int, default=128)
+    parser.add_argument("--num-layers", type=int, default=1)
+    parser.add_argument("--hops-num", type=int, default=1)
+    parser.add_argument("--kgpath-len", type=int, default=6)
+    parser.add_argument("--vocab-size", type=int, default=20000)
+    parser.add_argument("--fact-size", type=int, default=100)
+    # for training setting
+    parser.add_argument("--lr", type=float, default=0.5)
+    parser.add_argument("--lr-decay", type=float, default=0.99)
+    parser.add_argument("--grad-norm", type=float, default=5.0)
+    parser.add_argument("--buckets", type=str, default='[(10, 5)]')
+    parser.add_argument("--batch-size", type=int, default=128)
+    parser.add_argument("--max-seq-len", type=int, default=50)
+    parser.add_argument("--max-train-data-size", type=int, default=0)# 0: no limit
+    parser.add_argument("--steps-per-checkpoint", type=int, default=200)
+    # test
+    parser.add_argument("--test-type", type=str, default='train')
+    parser.add_argument("--change-level", type=int, default=0)
+    
+    return parser.parse_args()
+
+def parse_buckets(str_buck):
+    # Allow optional whitespace around the comma so the default '[(10, 5)]' is matched.
+    _pair = re.compile(r"(\d+\s*,\s*\d+)")
+    _num = re.compile(r"\d+")
+    buck_list = _pair.findall(str_buck)
+    if len(buck_list) < 1:
+        raise ValueError("The buckets string should have at least 1 component.")
+    buckets = []
+    for buck in buck_list:
+        tmp = _num.findall(buck)
+        d_tmp = (int(tmp[0]), int(tmp[1]))
+        buckets.append(d_tmp)
+    return buckets
+
+FLAGS = parse()
+FLAGS.data_dir, _ = FLAGS.data_path.rsplit('/',1)
+_buckets = parse_buckets(FLAGS.buckets)
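
To illustrate how the flags in `args.py` end up as attributes, here is a small self-contained sketch covering a subset of the options. It rests on stated assumptions: `build_parser` is a hypothetical name, the argument list is passed explicitly instead of being read from `sys.argv`, the bucket regex tolerates whitespace around the comma, and `os.path.dirname` stands in for the `rsplit('/', 1)` call.

```python
import argparse
import os
import re


def build_parser():
    # Only a subset of the options defined in args.py, for illustration.
    parser = argparse.ArgumentParser()
    parser.add_argument("--data-path", type=str, default="data/friends/friends.txt")
    parser.add_argument("--buckets", type=str, default="[(10, 5)]")
    parser.add_argument("--hops-num", type=int, default=1)
    return parser


def parse_buckets(str_buck):
    # Optional whitespace around the comma: "(10, 5)" and "(10,5)" both match.
    pair = re.compile(r"(\d+)\s*,\s*(\d+)")
    buckets = [(int(a), int(b)) for a, b in pair.findall(str_buck)]
    if not buckets:
        raise ValueError("expected at least one (int, int) pair")
    return buckets


# An explicit list keeps the example independent of the real command line.
flags = build_parser().parse_args(["--hops-num", "2", "--buckets", "[(10, 5), (20,10)]"])

# argparse turns dashes in option names into underscores on the namespace.
data_dir = os.path.dirname(flags.data_path)
buckets = parse_buckets(flags.buckets)
print(flags.hops_num, data_dir, buckets)
```

Because `args.py` calls `parse()` at module level, importing it reads the real command line; passing a list to `parse_args`, as in this sketch, is one way to exercise the same options from a test or notebook.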