This repository contains the code for the NeurIPS 2024 Datasets and Benchmarks paper:
"BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays"
Create a conda environment and activate it:
conda create -n BenchX python=3.11
conda activate BenchX
Clone the BenchX repository and install the requirements:
git clone https://github.com/yangzhou12/BenchX
cd BenchX
pip install -r requirements.txt
Install MMSegmentation following the official instructions:
# Install the dependencies for MMSegmentation
pip install -U openmim
mim install mmengine
mim install "mmcv==2.1.0"
# Install MMSegmentation
git clone -b main https://github.com/open-mmlab/mmsegmentation.git
cd mmsegmentation
pip install -v -e .
Supported Datasets and Tasks (click to expand)
- COVIDx CXR-4 (Binary Classification)
- NIH Chest X-Rays (Multi-Label Classification)
- Object-CXR (Binary Classification, Segmentation)
- RSNA Pneumonia (Binary Classification, Segmentation)
- SIIM-ACR Pneumothorax Segmentation (Binary Classification, Segmentation)
- TBX11K (Segmentation)
- VinDr-CXR (Multi-Label Classification, Segmentation)
- IU X-Ray (Report Generation)
We provide scripts to process the existing datasets. Please refer to
datasets/README.md
for more details. Note that we do not distribute any datasets in this repository, and we do not own the copyright to any of them.
Supported MedVLP methods (click to expand)
- ConVIRT: "Contrastive Learning of Medical Visual Representations from Paired Images and Text" [Ours]
- GLoRIA: "GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition" [Official] [Ours]
- MedCLIP: "MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts" [Official]
- MedKLIP: "MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology" [Official] [Ours]
- M-FLAG: "M-FLAG: Medical Vision-Language Pre-training with Frozen Language Models and Latent Space Geometry Optimization" [Ours]
- MGCA: "Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning" [Official] [Ours-ResNet50] [Ours-ViT]
- PTUnifier: "Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts" [Ours]
- MRM: "Advancing Radiograph Representation Learning with Masked Record Modeling" [Official] [Ours]
- REFERS: "Generalized Radiograph Representation Learning via Cross-Supervision Between Images and Free-Text Radiology Reports" [Official] [Ours]
We pre-trained 8 MedVLP models on the same training set from MIMIC-CXR and release our pre-trained checkpoints. Since the labels used to train MedCLIP are not publicly available, we use its official checkpoint for evaluation.
For segmentation, please run the scripts in preprocess/model_converters to convert the keys in MedVLP checkpoints to the MMSegmentation format before training.
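The exact scripts and key mappings in preprocess/model_converters depend on the MedVLP model. As an illustration only, this kind of conversion typically renames the entries of a checkpoint's state dict so the image encoder weights carry the prefix MMSegmentation expects; the `image_encoder.` and `backbone.` prefixes below are assumptions for the sketch, not BenchX's actual mapping:

```python
# Sketch: rename checkpoint keys into an MMSegmentation-style layout.
# The "image_encoder." source prefix and "backbone." target prefix are
# illustrative assumptions, not BenchX's actual conversion scheme.
def convert_keys(state_dict, src_prefix="image_encoder.", dst_prefix="backbone."):
    converted = {}
    for key, value in state_dict.items():
        if key.startswith(src_prefix):
            # e.g. "image_encoder.layer1.weight" -> "backbone.layer1.weight"
            converted[dst_prefix + key[len(src_prefix):]] = value
        # keys outside the image encoder (e.g. the text tower) are dropped,
        # since segmentation fine-tuning only needs the vision backbone
    return converted

if __name__ == "__main__":
    ckpt = {"image_encoder.layer1.weight": 1, "text_encoder.embed.weight": 2}
    print(convert_keys(ckpt))  # {'backbone.layer1.weight': 1}
```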
You can run any benchmark task supported by BenchX using the following commands. The MedVLP model, dataset, and training parameters are specified by a config file in configs/. Additional command-line arguments can be passed to override settings in the config file.
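The overrides use a dotted key=value syntax, as in validator.splits=[test] or ckpt_dir=<path_to_checkpoint>. A minimal sketch of how such overrides can be merged into a nested config dict follows; the parsing details are assumptions for illustration, not BenchX's actual implementation:

```python
# Sketch: merge "a.b=value" style command-line overrides into a nested config.
# Illustrative re-implementation only, not BenchX's actual parser.
def apply_overrides(config, overrides):
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        node = config
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # descend, creating intermediate dicts as needed
            node = node.setdefault(part, {})
        # very loose value parsing: "[a,b]" -> list of strings, else keep the string
        if raw.startswith("[") and raw.endswith("]"):
            node[leaf] = [v.strip() for v in raw[1:-1].split(",") if v.strip()]
        else:
            node[leaf] = raw
    return config

if __name__ == "__main__":
    cfg = {"validator": {"splits": ["val"]}}
    print(apply_overrides(cfg, ["validator.splits=[test]", "ckpt_dir=out/run1"]))
```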
To fine-tune a MedVLP model for classification, run this command:
python bin/train.py configs/classification/<dataset_name>/<model_name>.yml
To fine-tune a MedVLP model for segmentation, run this command:
python mmsegmentation/tools/train.py mmsegmentation/configs/benchmark/<dataset_name>/<model_name>.py
Adapting MMSegmentation: We provide the necessary files for adapting MMSegmentation in preprocess/mmsegmentation/. Please add the provided files to the installed MMSegmentation framework in mmsegmentation/ before training and evaluation.
To fine-tune a MedVLP model for report generation, run this command:
python bin/train.py configs/report_generation/<dataset_name>/<model_name>.yml
To evaluate fine-tuned MedVLP models, run:
# For classification and report generation
python bin/test.py configs/<task_name>/<dataset_name>/<model_name>.yml validator.splits=[test] ckpt_dir=<path_to_checkpoint>
# For segmentation
python mmsegmentation/tools/my_test.py mmsegmentation/configs/benchmark/<dataset_name>/<model_name>.py <path_to_checkpoint>
To reproduce the benchmark results in the paper, run the following scripts:
# For binary classification on COVIDx
sh scripts/classification/run_BenchX_COVIDx.sh
# For binary classification on RSNA
sh scripts/classification/run_BenchX_RSNA.sh
# For binary classification on SIIM
sh scripts/classification/run_BenchX_SIIM.sh
# For multi-label classification on NIH Chest X-Ray
sh scripts/classification/run_BenchX_NIH.sh
# For multi-label classification on VinDr-CXR
sh scripts/classification/run_BenchX_VinDr.sh
# For segmentation on Object-CXR
sh scripts/segmentation/run_BenchX_Object_CXR.sh
# For segmentation on RSNA
sh scripts/segmentation/run_BenchX_RSNA.sh
# For segmentation on SIIM
sh scripts/segmentation/run_BenchX_SIIM.sh
# For segmentation on TBX11K
sh scripts/segmentation/run_BenchX_TBX11K.sh
# For report generation on IU X-Ray
sh scripts/report_generation/run_BenchX_IU_XRay.sh
We thank the following projects, which we referenced when creating BenchX:
If you find BenchX useful for your research and applications, please cite using this BibTeX:
@inproceedings{zhou2024benchx,
title={BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays},
author={Yang Zhou and Tan Li Hui Faith and Yanyu Xu and Sicong Leng and Xinxing Xu and Yong Liu and Rick Siow Mong Goh},
booktitle={Advances in Neural Information Processing Systems},
year={2024}
}