This repository is the official implementation of the following paper accepted by the Thirty-ninth International Conference on Machine Learning (ICML) 2022:
Zhaoxuan Wu, Yao Shu, Bryan Kian Hsiang Low
DAVINZ: Data Valuation using Deep Neural Networks at Initialization
To install requirements:
conda env create -f environment.yml
MNIST and CIFAR-10: The code automatically downloads the required datasets.
MNISTM: It can be downloaded here. Then, place the extracted keras_mnistm.pkl file under the data/ directory.
Ising Physical Model Dataset: It can be downloaded here. Then, place the ising_data.h5 file under the data/ directory.
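Before launching an experiment, it can help to confirm that the two manually downloaded files are where the code expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_datasets` is hypothetical, and it assumes the script is run from the repository root so that `data/` resolves correctly.

```python
from pathlib import Path

# File names come from the download instructions above.
# The short dataset labels used as keys are illustrative only.
MANUAL_DATASETS = {
    "MNISTM": "keras_mnistm.pkl",
    "Ising": "ising_data.h5",
}


def missing_datasets(data_dir="data"):
    """Return labels of manually downloaded datasets not found in data_dir."""
    root = Path(data_dir)
    return [
        name
        for name, filename in MANUAL_DATASETS.items()
        if not (root / filename).is_file()
    ]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing dataset files for:", ", ".join(missing))
    else:
        print("All manually downloaded dataset files are in place.")
```

MNIST and CIFAR-10 are not checked here because the code downloads them automatically.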
At the beginning of the main.py and main_reg.py files, you can find example usages of DAVINZ for classification and regression tasks, respectively.
We give one example here:
mkdir data results checkpoints
python main.py --dataset=MNIST_baseline --model=ResNet18 --num_parties=10 --split_method=by_class --seed=0 --gpu=0
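To repeat the example above over several random seeds, a small driver script is one option. The sketch below is an assumption-laden illustration rather than part of the repository: `build_command` and `run_seeds` are hypothetical helpers, the seed values other than 0 are arbitrary, and only the flags shown in the example command are used. It defaults to a dry run that prints each command without executing it.

```python
import subprocess


def build_command(seed, gpu=0):
    """Build the example command for one seed, using only the flags shown above."""
    return [
        "python", "main.py",
        "--dataset=MNIST_baseline",
        "--model=ResNet18",
        "--num_parties=10",
        "--split_method=by_class",
        f"--seed={seed}",
        f"--gpu={gpu}",
    ]


def run_seeds(seeds, gpu=0, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for seed in seeds:
        cmd = build_command(seed, gpu)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep if a run exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Seeds 1 and 2 are illustrative; the example above only shows seed 0.
    run_seeds([0, 1, 2], dry_run=True)
```

Set `dry_run=False` once the data, results, and checkpoints directories exist and the printed commands look right.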
We implemented validation performance (VP), influence function (IF), and robust volume (RV) as baselines for comparison. The code, including the example usages, can be found under the baselines/ directory.