PersRefEx is a dataset and platform for studying multi-perspective referential communication in photorealistic 3D environments. It supports research on how embodied agents (both human and AI) refer to objects in their surroundings from different viewpoints, with the goal of communicative success: the listener correctly identifying the speaker's intended referent.
## 📁 Dataset ([Hugging Face Datasets](https://huggingface.co/datasets/ZinengTang/PersReFex))
```python
from datasets import load_dataset

# Load the validation split of PersReFex from the Hugging Face Hub.
dataset = load_dataset("ZinengTang/PersReFex", split="validation")
```
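To see what each example contains, you can inspect the schema and the first record. This is a minimal sketch; the exact field names are defined on the dataset card:

```python
# Inspect the column schema and one example record.
print(dataset.features)
print(dataset[0])
```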
## Finetuned Model with PPO (Hugging Face Models)

Follow the instructions on the Hugging Face model page.
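As a starting point, the checkpoint can be fetched with `huggingface_hub`. This is a minimal sketch, and the repo ID below is hypothetical; substitute the one listed on the model page:

```python
from huggingface_hub import snapshot_download

# Hypothetical repo ID -- replace with the actual PPO-finetuned checkpoint repo.
local_dir = snapshot_download(repo_id="ZinengTang/PersReFex-PPO")
print("Checkpoint downloaded to:", local_dir)
```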
The PersRefEx dataset comprises:
- 2,970 human-written referring expressions.
- 1,485 generated scenes.
- 27,504 sampled scenes with varying agent perspectives and referent placements.
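These counts can be sanity-checked locally. The sketch below assumes the split exposes a scene identifier; the `scene_id` field name is hypothetical, so consult `dataset.features` for the actual key:

```python
from collections import Counter
from datasets import load_dataset

dataset = load_dataset("ZinengTang/PersReFex", split="validation")

# Group expressions by scene; "scene_id" is a hypothetical field name.
scene_counts = Counter(example["scene_id"] for example in dataset)
print(f"{len(dataset)} referring expressions across {len(scene_counts)} scenes")
```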
## Citation

```bibtex
@misc{tang2024groundinglanguagemultiperspectivereferential,
  title={Grounding Language in Multi-Perspective Referential Communication},
  author={Zineng Tang and Lingjun Mao and Alane Suhr},
  year={2024},
  eprint={2410.03959},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2410.03959},
}
```