Multilingual Grammar Induction with Continuous Language Identification

This is a PyTorch implementation of the paper:

Multilingual Grammar Induction with Continuous Language Identification
Wenjuan Han, Ge Wang, Yong Jiang, Kewei Tu
EMNLP 2019

The code performs unsupervised grammar learning on a dataset of 15 languages selected from the Universal Dependencies (UD) treebanks v1.4.

Please contact [email protected] or [email protected] if you have any questions.

Environments

Python 2.7
PyTorch >=1.0
Numpy 1.15.4
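
To confirm that an installed package satisfies the versions listed above, a small comparison helper can be handy. The following is a minimal sketch, not part of this repository: the names version_tuple and meets_minimum are hypothetical, and it assumes version strings begin with a dotted-number prefix (local build suffixes such as "+cu117" are ignored). It avoids syntax that Python 2.7 lacks.

```python
import re


def version_tuple(version):
    """Parse the leading dotted-number part of a version string into a tuple of ints."""
    match = re.match(r"\d+(\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (version,))
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum`, comparing numeric components."""
    have, need = version_tuple(version), version_tuple(minimum)
    # Pad the shorter tuple with zeros so that "1" and "1.0" compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

With PyTorch installed, meets_minimum(torch.__version__, "1.0") would check the PyTorch requirement. NumPy is listed as an exact version, so a direct comparison of numpy.__version__ against "1.15.4" is the closer check there.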

Data

Run:

python ml_dmv_parser.py --cvalency 2 --do_eval --neural_epoch 1 --function_mask --lr 0.001 --epoch 70 --use_neural --embed_languages --em_type em --child_only --language_predict
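
When launching several runs, for example with different learning rates, it can help to build the command above programmatically. The sketch below is an assumption-laden wrapper, not part of this repository: build_command and run_training are hypothetical helpers, the flags are copied verbatim from the command above in the same order, and only --lr and --epoch are exposed as parameters. It should be run from the repository root so that ml_dmv_parser.py is found.

```python
import subprocess
import sys


def build_command(lr=0.001, epoch=70, python=sys.executable):
    """Return the argument list for one ml_dmv_parser.py run (hypothetical helper)."""
    # Flags and their order are taken from the README command; only --lr and
    # --epoch are parameterized here.
    return [
        python, "ml_dmv_parser.py",
        "--cvalency", "2",
        "--do_eval",
        "--neural_epoch", "1",
        "--function_mask",
        "--lr", str(lr),
        "--epoch", str(epoch),
        "--use_neural",
        "--embed_languages",
        "--em_type", "em",
        "--child_only",
        "--language_predict",
    ]


def run_training(lr=0.001, epoch=70):
    """Launch the parser as a subprocess (hypothetical helper)."""
    # check_call raises CalledProcessError if the script exits with a non-zero status.
    subprocess.check_call(build_command(lr=lr, epoch=epoch))
```

Calling run_training() with no arguments reproduces the command shown above, using the interpreter that is running the wrapper. Passing a different lr or epoch changes only those two values.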

By default, grammar induction is performed on the 15 selected languages. To use a different set of languages, edit the contents of the language_list file.

Reference

@inproceedings{han2019multilingual,
  title={Multilingual grammar induction with continuous language identification},
  author={Han, Wenjuan and Wang, Ge and Jiang, Yong and Tu, Kewei},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
  pages={5728--5733},
  year={2019}
}
