This repository reproduces the results of the PW-VQA paper. The code is largely based on the implementations of RUBi and CF-VQA.
Install the Anaconda or Miniconda distribution for Python 3+ from the official download page, then create the environment and install the dependencies:
conda create --name pwvqa python=3.9
conda activate pwvqa
pip install -r requirements.txt
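Once the requirements are installed, an optional sanity check is to confirm that the two core packages the codebase builds on, bootstrap and block (see the file-replacement paths below), import cleanly:

python -c "import bootstrap, block; print('environment OK')"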
Depending on where you created the environment, you may also need to replace several files inside it with commands like the following:
cp accuracy.py /usr/local/envs/pwvqa/lib/python3.9/site-packages/bootstrap/models/metrics
cp -r external /usr/local/envs/pwvqa/lib/python3.9/site-packages/block/
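The /usr/local/envs/pwvqa prefix above matches a Colab-style install. If your environment lives elsewhere, a small sketch that locates the active environment's site-packages directory automatically (same destination subdirectories as above; run it with the pwvqa environment activated):

# resolve site-packages of the currently active Python environment
SITE=$(python -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")
cp accuracy.py "$SITE/bootstrap/models/metrics"
cp -r external "$SITE/block/"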
Download annotations, images and features for VQA experiments:
bash pwvqa/datasets/scripts/download_vqa2.sh
bash pwvqa/datasets/scripts/download_vqacp2.sh
You can train our best model on VQA-CP v2 (PWVQA+SMRL) by running:
python -m bootstrap.run -o pwvqa/options/vqacp2/smrl_pwvqa.yaml
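bootstrap.pytorch accepts command-line overrides of the options in the YAML file; for instance, an interrupted run can typically be resumed from its last checkpoint with the same --exp.resume flag used in the evaluation command below:

python -m bootstrap.run -o pwvqa/options/vqacp2/smrl_pwvqa.yaml --exp.resume last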
During training, several files are created in logs/vqacp2/smrl_pwvqa/:
- [options.yaml] (copy of options)
- [logs.txt] (history of printed output)
- [logs.json] (batch and epoch statistics)
- [_vq_val_oe.json] (statistics for the language-prior-based strategy)
- [_pwvqa_val_oe.json] (statistics for PW-VQA)
- [_q_val_oe.json] (statistics for the language-only branch)
- [_v_val_oe.json] (statistics for the vision-only branch)
- [_all_val_oe.json] (statistics for the ensembled branch)
- ckpt_last_engine.pth.tar (checkpoints of the last epoch)
- ckpt_last_model.pth.tar
- ckpt_last_optimizer.pth.tar
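Training progress can be monitored by inspecting logs.json; in bootstrap.pytorch it is a flat JSON dictionary mapping statistic names to lists of values (a sketch; the exact keys depend on the options file):

python -c 'import json; logs = json.load(open("logs/vqacp2/smrl_pwvqa/logs.json")); print(sorted(logs.keys()))'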
VQA-CP v2, our main dataset, has no test set, so evaluation is conducted on the validation set:
python -m bootstrap.run \
-o ./logs/vqacp2/smrl_pwvqa/options.yaml \
--exp.resume last \
--dataset.train_split '' \
--dataset.eval_split val \
--misc.logs_name test
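The *_val_oe.json files listed above are expected to follow the official open-ended VQA results format, i.e. a list of {"question_id": ..., "answer": ...} records; assuming that format, a quick way to peek at a few predictions:

python -c 'import json; preds = json.load(open("logs/vqacp2/smrl_pwvqa/_pwvqa_val_oe.json")); print(preds[:3])'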
We thank the authors of RUBi and CF-VQA for their excellent work.
If you find Possible Worlds VQA (PW-VQA) useful in your research, please consider citing our work:
@article{vosoughi2024cross,
  title={Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA},
  author={Vosoughi*, Ali and Deng*, Shijian and Zhang, Songyang and Tian, Yapeng and Xu, Chenliang and Luo, Jiebo},
  journal={IEEE Transactions on Multimedia},
  year={2024},
  publisher={IEEE}
}