Korean Parallel Corpora

Jungyeul Park, Jeen-Pyo Hong and Jeong-Won Cha (2016) Korean Language Resources for Everyone. In Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation (PACLIC 30). October 28 - 30, 2016. Seoul, Korea. https://www.aclweb.org/anthology/Y16-2002/

See also JHE evaluation data (dev and eval), available at https://doi.org/10.5281/zenodo.891295

These corpora are made available under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.

Please send any questions about the corpora to jungyeul.park (AT) gmail.com

We welcome contributions of Korean-English parallel data that you would like to share with others.

(December 2020) 31K sentences added to the bible folder.

(December 2020) A Korean raw text collection for creating a language model is available at http://doi.org/10.5281/zenodo.4317288

(July 2020) North Korean dev and test files added (korean-english-news-v1-NK).