We have developed a Japanese version of the MS COCO caption dataset, called the YJ Captions 26k Dataset. It was created to facilitate the development of image captioning in Japanese. Each Japanese caption describes an image from the MS COCO dataset, and each image has 5 captions.
The annotations are stored in JSON format. The annotation scheme is the same as that of MS COCO. Please see the section on Image Caption Annotations in the MS COCO documentation.
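Since the annotations follow the MS COCO caption scheme, they can be read with any JSON parser. The following is a minimal sketch, assuming the standard MS COCO caption layout (a top-level "annotations" list whose entries carry "image_id" and "caption"). The function names, the sample captions, and the file name "example.jpg" are illustrative and not part of the dataset.

```python
import json
from collections import defaultdict


def group_captions_by_image(dataset):
    """Map each image_id to the list of captions attached to it."""
    captions = defaultdict(list)
    for ann in dataset["annotations"]:
        captions[ann["image_id"]].append(ann["caption"])
    return dict(captions)


def load_captions(path):
    """Read an annotation file (path is supplied by the caller) and group it."""
    # Japanese text requires an explicit UTF-8 decode on some platforms.
    with open(path, encoding="utf-8") as f:
        return group_captions_by_image(json.load(f))


# Tiny in-memory sample in the assumed MS COCO caption layout (made-up values).
sample = {
    "images": [{"id": 1, "file_name": "example.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "caption": "猫がソファの上で寝ている。"},
        {"id": 11, "image_id": 1, "caption": "ソファで眠る猫。"},
    ],
}
grouped = group_captions_by_image(sample)
```

In the sample, `grouped[1]` holds both captions for image 1. With the real annotation file, each image should map to 5 captions, as described above.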
Creative Commons Attribution 4.0 License
@InProceedings{P16-1168,
author = "Miyazaki, Takashi and Shimizu, Nobuyuki",
title = "Cross-Lingual Image Caption Generation",
booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
year = "2016",
publisher = "Association for Computational Linguistics",
pages = "1780--1790",
location = "Berlin, Germany",
doi = "10.18653/v1/P16-1168",
url = "http://aclweb.org/anthology/P16-1168"
}