Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods

This is the code for our paper, "Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods."

Read the paper.

Getting started

Set up a virtual environment and install the requirements:

conda create -n fooling_limeshap python=3.7
source activate fooling_limeshap
pip install -r requirements.txt

You should be able to run the code now!

We provide a short walkthrough on COMPAS in COMPAS_Example.ipynb. It is a good place to start if you want to see how our method works. The attack is applied to each data set in compas_experiment.py, cc_experiment.py, and german_experiment.py.
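For readers who want the core idea before opening the notebook, here is a minimal sketch of the scaffolding described in the paper: a wrapper that answers with a biased model on inputs that look like real data and with an innocuous model on inputs that look like the perturbed samples LIME and SHAP generate. This is an illustration only, not the API of adversarial_models.py. The class name `ScaffoldedClassifier`, the toy models, and the `looks_perturbed` rule are all hypothetical; in the paper the detector is a classifier trained to separate real rows from perturbed ones.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


class ScaffoldedClassifier:
    """Route each row to one of two models based on an out-of-distribution check.

    Hypothetical illustration; not the interface used in adversarial_models.py.
    """

    def __init__(
        self,
        biased: Callable[[Row], int],
        innocuous: Callable[[Row], int],
        looks_perturbed: Callable[[Row], bool],
    ) -> None:
        self.biased = biased
        self.innocuous = innocuous
        self.looks_perturbed = looks_perturbed

    def predict(self, rows: Sequence[Row]) -> List[int]:
        # Real-looking rows get the biased answer; perturbed-looking rows
        # (the ones an explainer queries) get the innocuous answer.
        return [
            self.innocuous(row) if self.looks_perturbed(row) else self.biased(row)
            for row in rows
        ]


# Toy setup (assumed): column 0 is a sensitive attribute, column 1 is unrelated.
def biased(row: Row) -> int:
    return int(row[0] > 0)


def innocuous(row: Row) -> int:
    return int(row[1] > 0)


def looks_perturbed(row: Row) -> bool:
    # Stand-in detector: real rows in this toy data are integer-valued,
    # so any fractional value is treated as a perturbation.
    return any(float(v) != round(v) for v in row)


model = ScaffoldedClassifier(biased, innocuous, looks_perturbed)
real_preds = model.predict([[1, 0], [0, 1]])               # decided by column 0
perturbed_preds = model.predict([[1.3, 0.0], [0.2, 1.0]])  # decided by column 1
```

On the real-looking rows the predictions follow the sensitive column, while on the perturbed rows they follow the unrelated column, which is why an explainer that only probes perturbed points can attribute the model's behavior to the innocuous feature.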

References

Please consider citing our paper if you found this work useful!

@inproceedings{advlime:aies20,
  author = {Dylan Slack and Sophie Hilgard and Emily Jia and Sameer Singh and Himabindu Lakkaraju},
  title = {Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods},
  booktitle = {AAAI/ACM Conference on AI, Ethics, and Society (AIES)},
  year = {2020}
}

Contact

This code was developed by Dylan Slack, Sophie Hilgard, and Emily Jia. Reach out to us with any questions!

Our emails are: [email protected], [email protected], and [email protected].
