The prototype code for the paper Data Origin Inference in Machine Learning.
This repository targets mobile users as the data origin in the OpenImage dataset.
To check that this repository works, run
python script/oi_user_tiny.py
The intermediate and final results are saved in log/res/oi/user_tiny/.
All configuration files are in config/. The entry configuration file is config/*.yaml (e.g. config/oi_user_tiny.yaml), which redirects to the configuration files of the individual functional modules.
There are four functional modules in this repository:
- dataset: how to extract the raw data of data origin from the original dataset
- metadata: how to split the extracted raw data to facilitate the shadow training
- model: the details about how to train the target model and shadow model
- infer: the details about how to train and test the meta model for the final data origin inference
To customize the parameters of any of the above modules, edit the corresponding config/*/*.yaml file (e.g. config/dataset/oi_user_tiny.yaml).
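To see which module configuration files are available before editing them, a small helper like the one below may be handy. It is only a sketch and not part of the repository: the function name group_module_configs is hypothetical, and it assumes nothing beyond the config/*/*.yaml layout described above.

```python
from collections import defaultdict
from pathlib import Path


def group_module_configs(config_root="config"):
    """Map each module directory under config_root to its YAML file names.

    Hypothetical helper: it only relies on the config/*/*.yaml layout.
    """
    grouped = defaultdict(list)
    # "*/*.yaml" matches files exactly one directory below config_root,
    # so entry files such as config/oi_user_tiny.yaml are skipped.
    for path in sorted(Path(config_root).glob("*/*.yaml")):
        grouped[path.parent.name].append(path.name)
    return dict(grouped)
```

Calling group_module_configs() from the repository root returns a dictionary keyed by module directory name, with the YAML file names found in each.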
Note: The current save directory is data/, where the raw data, the metadata and the models (DNNs and meta models) are saved. To reorganize the save directories, change the values of the keys suffixed with path, dir or csv in config/*/*.yaml.
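If many values need to change at once, the rule above can be applied programmatically to a configuration that has already been loaded into a Python dictionary. The following is a minimal sketch under stated assumptions: retarget_save_paths is a hypothetical helper, not a function of this repository, the key names in the example are made up for illustration, and it assumes the affected values start with the data/ prefix.

```python
def retarget_save_paths(config, new_root, old_root="data/"):
    """Return a copy of config with save locations moved to new_root.

    Hypothetical helper: only string values whose key ends with
    "path", "dir" or "csv" and that start with old_root are rewritten.
    """
    suffixes = ("path", "dir", "csv")
    if isinstance(config, dict):
        updated = {}
        for key, value in config.items():
            is_save_key = str(key).endswith(suffixes)
            if is_save_key and isinstance(value, str) and value.startswith(old_root):
                # Swap the old root prefix for the new one.
                updated[key] = new_root + value[len(old_root):]
            else:
                # Recurse into nested sections; other values pass through.
                updated[key] = retarget_save_paths(value, new_root, old_root)
        return updated
    if isinstance(config, list):
        return [retarget_save_paths(item, new_root, old_root) for item in config]
    return config


# Illustrative configuration; these key names are assumptions.
example = {
    "raw_dir": "data/raw/",
    "split": {"meta_csv": "data/meta/split.csv", "seed": 0},
    "model_path": "data/model/target.pt",
    "note": "data/ is the default",
}
moved = retarget_save_paths(example, "/mnt/storage/")
```

Here only raw_dir, meta_csv and model_path are rewritten; note and seed are left alone because their keys do not carry one of the three suffixes. The result still has to be written back to the YAML files by whatever means you prefer.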
If you have any questions about this repository or the paper, please don't hesitate to contact the repository owner or ping [email protected].
If you would like to cite this work, please use the following information:
@article{xu2022data,
title={Data Origin Inference in Machine Learning},
author={Xu, Mingxue and Li, Xiang-Yang},
journal={arXiv preprint arXiv:2211.13416},
year={2022}
}