- Python: >=3.9,<3.12
- Python dependencies: See pyproject.toml.
- nobu-g/GLIP
- nobu-g/cohesion-analysis
poetry env use /path/to/python
poetry install
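Before running poetry env use, it can help to confirm that the chosen interpreter satisfies the >=3.9,<3.12 constraint listed above. The following is a minimal sketch; the function name is_supported_python is hypothetical and not part of this repository.

```python
import sys

# Bounds copied from the requirement listed above: >=3.9,<3.12.
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported_python(version=None):
    """Return True if the (major, minor) pair lies in the supported range.

    `version` is any tuple-like value such as sys.version_info; when omitted,
    the running interpreter is checked.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "NOT supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```

Running this script with the interpreter you intend to pass to poetry env use reports whether it falls inside the range.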
- Download the J-CRe3 annotations.
git clone git@github.com:riken-grp/J-CRe3.git /somewhere/J-CRe3
- Download the video and audio files following the instructions in J-CRe3.
- Place the data files under the data directory. You can use cp -r instead of ln -s.
mkdir -p data
ln -s /somewhere/J-CRe3/textual_annotations ./data/knp
ln -s /somewhere/J-CRe3/visual_annotations ./data/image_text_annotation
ln -s /somewhere/J-CRe3/id ./data/id
ln -s /somewhere/J-CRe3/recording ./data/recording
ln -s /somewhere/J-CRe3/recording ./data/dataset
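The same layout can be created from Python, which may be convenient when the setup is scripted. This is a sketch only: the source-to-link mapping is copied from the ln -s commands above, while the function name link_data and its behavior of skipping existing entries are assumptions made for illustration.

```python
from pathlib import Path

# (directory in the J-CRe3 checkout, link name under data/), copied from the
# ln -s commands above. "recording" is linked twice, as in those commands.
LINKS = [
    ("textual_annotations", "knp"),
    ("visual_annotations", "image_text_annotation"),
    ("id", "id"),
    ("recording", "recording"),
    ("recording", "dataset"),
]


def link_data(jcre3_root, data_dir="data"):
    """Create the symlinks under data_dir and return the links that were created."""
    jcre3_root = Path(jcre3_root).resolve()
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)  # same as: mkdir -p data
    created = []
    for source_name, link_name in LINKS:
        source = jcre3_root / source_name
        link = data_dir / link_name
        if not source.exists():
            raise FileNotFoundError(f"missing J-CRe3 directory: {source}")
        if link.is_symlink() or link.exists():
            continue  # hypothetical policy: leave existing entries untouched
        link.symlink_to(source, target_is_directory=True)
        created.append(link)
    return created


# Example (paths as used in this README):
# link_data("/somewhere/J-CRe3", "data")
```

Raising on a missing source directory surfaces an incomplete download early instead of leaving a dangling link.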
- Follow the instructions in nobu-g/GLIP to set up the GLIP environment. We recommend using Poetry to install the dependencies.
- Download the checkpoint files under the GLIP root directory.
cd /somewhere/GLIP
wget https://lotus.kuee.kyoto-u.ac.jp/~ueda/dist/GLIP/OUTPUT.zip
unzip OUTPUT.zip
- Clone nobu-g/cohesion-analysis and check out the jcre3 branch.
git clone git@github.com:nobu-g/cohesion-analysis.git
cd cohesion-analysis
git checkout jcre3
- Follow the instructions in nobu-g/cohesion-analysis to set up the Python virtual environment.
- Download the pre-trained checkpoint file.
wget https://lotus.kuee.kyoto-u.ac.jp/~ueda/dist/cohesion_analysis_v2/model_jcre3_large.bin
- Make a config file for cohesion analysis.
- Copy the example config file and modify it as needed.
cp configs/cohesion/example.yaml configs/cohesion/default.yaml
- Modify the python, project_root, and checkpoint fields in configs/cohesion/default.yaml as needed.
- Make a config file for phrase grounding by GLIP.
- Copy the example config file and modify it as needed.
cp configs/glip/example.yaml configs/glip/ft2_deberta_b24_u3s1_b48_1e.yaml
- Modify the python, project_root, name, exp_name, and checkpoint fields in configs/glip/ft2_deberta_b24_u3s1_b48_1e.yaml as needed. The name field should be the same as the file name of the config file (ft2_deberta_b24_u3s1_b48_1e).
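The rule that the name field must equal the config file name (without the .yaml extension) is easy to check mechanically. The sketch below shows one way to do so; both helper functions are hypothetical, and the name value is passed in directly rather than parsed from the YAML file, so that the example needs only the standard library.

```python
from pathlib import Path


def expected_name(config_path):
    """Return the value the name field should have: the file name without its extension."""
    return Path(config_path).stem


def name_matches_file(config_path, name):
    """Return True if the given name value agrees with the config file name."""
    return name == expected_name(config_path)


if __name__ == "__main__":
    path = "configs/glip/ft2_deberta_b24_u3s1_b48_1e.yaml"
    print(expected_name(path))
    print(name_matches_file(path, "ft2_deberta_b24_u3s1_b48_1e"))
```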
- Make a config file for prediction writer.
- Copy the example config file and modify it as needed.
cp configs/example.yaml configs/default.yaml
- Modify the cohesion, glip, and checkpoint fields in configs/prediction_writer/default.yaml as needed.
- Run prediction_writer.py.
[AVAILABLE_GPUS=0,1,2,3] python src/prediction_writer.py -cn default \
phrase_grounding_model=glip \
glip=ft2_deberta_b24_u3s1_b48_1e \
id_file=<(cat data/id/test.id data/id/valid.id) \
luigi.workers=4
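The id_file argument in the command above uses process substitution (<(cat ...)), which bash and zsh support but some other shells do not. As a possible alternative, the sketch below writes the concatenated IDs to a regular file whose path could then be passed as id_file. It assumes each ID file lists one scenario ID per line and that id_file accepts an ordinary file path; the function merge_id_files and the output file name are hypothetical.

```python
from pathlib import Path


def merge_id_files(id_files, output_path):
    """Concatenate ID files in the given order and write the result to output_path.

    Mirrors `cat a b`, except that blank lines are dropped. Duplicates are kept.
    Returns the list of IDs written.
    """
    ids = []
    for id_file in id_files:
        for line in Path(id_file).read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line:
                ids.append(line)
    Path(output_path).write_text("\n".join(ids) + "\n", encoding="utf-8")
    return ids


# Example (the output file name is a placeholder):
# merge_id_files(["data/id/test.id", "data/id/valid.id"], "data/id/test_valid.id")
```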
python src/evaluation.py \
--dataset-dir data/dataset \
--gold-knp-dir data/knp \
--gold-annotation-dir data/image_text_annotation \
--prediction-mmref-dir result/mmref/glip_ft2_deberta_b24_u3s1_b48_1e \
--prediction-knp-dir result/cohesion \
--scenario-ids $(cat data/id/test.id) \
--recall-topk -1 1 5 10 \
--confidence-threshold 0.0 \
--column-prefixes rel_type prec rec
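When the evaluation is launched from a script rather than an interactive shell, the $(cat data/id/test.id) expansion has to be done by hand. The sketch below builds the same argument list as the command above; the function name build_evaluation_command is hypothetical, and all flags and paths are copied from that command rather than taken from the evaluation.py source.

```python
from pathlib import Path


def build_evaluation_command(id_file, python="python"):
    """Return the argv list for src/evaluation.py, reading scenario IDs from id_file."""
    # Equivalent of $(cat id_file): split the file content on whitespace.
    scenario_ids = Path(id_file).read_text(encoding="utf-8").split()
    return [
        python, "src/evaluation.py",
        "--dataset-dir", "data/dataset",
        "--gold-knp-dir", "data/knp",
        "--gold-annotation-dir", "data/image_text_annotation",
        "--prediction-mmref-dir", "result/mmref/glip_ft2_deberta_b24_u3s1_b48_1e",
        "--prediction-knp-dir", "result/cohesion",
        "--scenario-ids", *scenario_ids,
        "--recall-topk", "-1", "1", "5", "10",
        "--confidence-threshold", "0.0",
        "--column-prefixes", "rel_type", "prec", "rec",
    ]


# Example:
# import subprocess
# subprocess.run(build_evaluation_command("data/id/test.id"), check=True)
```

Passing the arguments as a list avoids shell quoting issues, since no shell is involved in the call.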
@inproceedings{ueda-2024-jcre3,
title={J-CRe3: A Japanese Conversation Dataset for Real-world Reference Resolution},
author={Nobuhiro Ueda and Hideko Habe and Yoko Matsui and Akishige Yuguchi and Seiya Kawano and Yasutomo Kawanishi and Sadao Kurohashi and Koichiro Yoshino},
booktitle={Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
year={2024},
pages={},
address={Turin, Italy},
}
@inproceedings{植田2023a,
author = {植田 暢大 and 波部 英子 and 湯口 彰重 and 河野 誠也 and 川西 康友 and 黒橋 禎夫 and 吉野 幸一郎},
title = {実世界における総合的参照解析を目的としたマルチモーダル対話データセットの構築},
booktitle = {言語処理学会 第29回年次大会},
year = {2023},
address = {沖縄},
}