This repository contains code and models for the paper: Semantic Graphs for Generating Deep Questions (ACL 2020). Below is the framework of our proposed model (on the right) together with an input example (on the left).
allennlp 1.0.0
allennlp-models 1.0.0
pytorch 1.4.0
nltk 3.4.4
numpy 1.18.1
tqdm 4.32.2
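To confirm that an environment matches the versions pinned above, a small check along these lines may help. It is only a sketch: mapping the `pytorch` entry to the pip distribution name `torch` is an assumption, and `check_versions` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Versions listed in the requirements above. Assumption: the "pytorch"
# entry corresponds to the pip distribution named "torch".
PINNED = {
    "allennlp": "1.0.0",
    "allennlp-models": "1.0.0",
    "torch": "1.4.0",
    "nltk": "3.4.4",
    "numpy": "1.18.1",
    "tqdm": "4.32.2",
}


def check_versions(pinned, get_version=metadata.version):
    """Return {package: (wanted, found, matches)} for each pinned package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        # Ignore local build tags such as "+cu101" when comparing.
        base = found.split("+")[0] if found else None
        report[name] = (wanted, found, base == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found} [{status}]")
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment.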
We release all the datasets below, which were processed from HotpotQA.
- get tokenized data files of documents, questions, answers
  - get results in folder text-data
- prepare the json files ready as illustrated in build-semantic-graphs
  - get results in folder json-data
- run scripts/preprocess_data.sh to get the preprocessed data ready for training
  - get results in folder preprocessed-data and folder Datasets
- utilize glove.840B.300d.txt from GloVe to initialize the word-embeddings
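As an illustration of the GloVe step, the sketch below reads vectors in the GloVe text format for a given vocabulary. It is not the code used by scripts/preprocess_data.sh: `load_glove_vectors` is a hypothetical helper, and the sample uses 3 dimensions in place of 300 so it can run without the real file. The right-hand split reflects the assumption that a few tokens in the file may themselves contain spaces.

```python
import io


def load_glove_vectors(lines, vocab, dim=300):
    """Collect GloVe vectors for the words in `vocab`.

    `lines` is any iterable of text lines in GloVe format
    ("token v1 v2 ... v_dim"). Returns {word: [float, ...]}.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) <= dim:
            continue  # malformed or truncated line
        # Take the last `dim` fields as the vector so that tokens which
        # themselves contain spaces are still parsed correctly.
        word = " ".join(parts[:-dim])
        if word in vocab and word not in vectors:
            vectors[word] = [float(x) for x in parts[-dim:]]
    return vectors


# Tiny in-memory stand-in for the real file (illustrative values only).
sample = io.StringIO("the 0.1 0.2 0.3\n. . . 1.0 2.0 3.0\ncat 0.5 0.5 0.5\n")
vectors = load_glove_vectors(sample, vocab={"the", ". . .", "dog"}, dim=3)
print(sorted(vectors))
```

With the real file, the same function could be called as `load_glove_vectors(open("glove.840B.300d.txt", encoding="utf-8"), vocab)`. Words missing from GloVe (here `dog`) are simply absent from the result, so how to initialize them is left to the caller.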
We release both classifier and generator models in this work. The models are built on a sequence-to-sequence architecture. Typically, we use GRU and GNN in the encoder and GRU in the decoder; you can choose other methods (e.g. Transformer), which are also implemented in our repository.
- classifier: accuracy - 84.06773%
- generator: BLEU-4 - 15.28304
- run scripts/train_classifier.sh to train on the Content Selection task
- run scripts/train_generator.sh to train on the Question Generation task; the default is to finetune based on the pretrained classifier
- run scripts/translate.sh to get the prediction on the validation dataset
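The three steps above can also be driven from Python, which is convenient when chaining them in one job. This is a minimal sketch, assuming the scripts are run with `bash` from the repository root; `build_commands` and `run_pipeline` are hypothetical helpers and do not exist in this repository.

```python
import subprocess
from pathlib import Path

# Script paths as named in this README, in the order they are described.
PIPELINE = [
    "scripts/train_classifier.sh",  # Content Selection
    "scripts/train_generator.sh",   # Question Generation
    "scripts/translate.sh",         # predictions on the validation set
]


def build_commands(scripts, shell="bash"):
    """Turn script paths into argument lists suitable for subprocess.run."""
    return [[shell, script] for script in scripts]


def run_pipeline(scripts, repo_root=".", runner=subprocess.run):
    """Run each script from `repo_root`, stopping at the first failure."""
    root = Path(repo_root)
    for cmd in build_commands(scripts):
        if not (root / cmd[1]).is_file():
            raise FileNotFoundError(f"missing script: {root / cmd[1]}")
        # check=True raises CalledProcessError on a non-zero exit code.
        runner(cmd, cwd=str(root), check=True)
```

Calling `run_pipeline(PIPELINE)` from the repository root would run the classifier training first, so that the generator can finetune from it as described above.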
We use the evaluation code for MS COCO caption generation to compute automatic metrics.
- To install pycocoevalcap and the pycocotools dependency, run:
pip install git+https://github.com/salaniz/pycocoevalcap
- To evaluate the results in the translated file, e.g. prediction.txt, run:
python evaluate_metrics.py prediction.txt
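For readers who want to call the pycocoevalcap scorers directly, the sketch below shows the input shape they expect: two dicts keyed by example id, each mapping to a list of sentences. Whether evaluate_metrics.py prepares its inputs this way, and where it reads references from, is not stated here. `to_coco_format` is a hypothetical helper, the lowercasing step is an assumption, and the sample sentences are made up for illustration.

```python
def _normalize(sentence):
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(sentence.lower().split())


def to_coco_format(predictions, references):
    """Pair line-aligned predictions and references by index.

    Returns (gts, res): two dicts keyed by example index, each mapping
    to a one-element list holding the normalized sentence.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must be line-aligned")
    gts = {i: [_normalize(r)] for i, r in enumerate(references)}
    res = {i: [_normalize(p)] for i, p in enumerate(predictions)}
    return gts, res


# Illustrative lines, as they might be read from prediction.txt and a
# reference file with one question per line.
preds = ["What city hosted the event ?\n", "who wrote  the book ?\n"]
refs = ["Which city hosted the event ?\n", "Who wrote the book ?\n"]
gts, res = to_coco_format(preds, refs)

# With pycocoevalcap installed, the scorers accept this pair directly, e.g.:
#   from pycocoevalcap.bleu.bleu import Bleu
#   score, _ = Bleu(4).compute_score(gts, res)  # score[3] is BLEU-4
```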
    @inproceedings{pan-etal-2020-DQG,
      title = {Semantic Graphs for Generating Deep Questions},
      author = {Pan, Liangming and Xie, Yuxi and Feng, Yansong and Chua, Tat-Seng and Kan, Min-Yen},
      booktitle = {Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL)},
      year = {2020}
    }
