loris3/evaluation_explanation_quality

A collection of experiments to assess the quality of explanations for detectors of machine-generated text.

Setup

git clone --recurse-submodules repo_url

cd repo_dir

Models

cd repo_dir

wget -O "models/radford et al/detector-base.pt" https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-base.pt

Cache

Unzip explanation_cache.zip to explanation_cache. The filenames contain the SHA256 hash of the input string. See fi_explainer.py.
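To look up a cached explanation by hand, the hash of an input string can be computed with the standard library. The following is a minimal sketch, not the code from fi_explainer.py: the helper names `sha256_of_input` and `find_cached_files` are hypothetical, and it assumes only that the hex digest of the UTF-8 encoded string appears somewhere in the filename.

```python
import hashlib
from pathlib import Path

CACHE_DIR = Path("explanation_cache")  # directory the zip is extracted to


def sha256_of_input(text: str) -> str:
    """Return the hex SHA256 digest of the UTF-8 encoded input string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_cached_files(text: str, cache_dir: Path = CACHE_DIR) -> list[Path]:
    """List cache files whose name contains the digest of `text`.

    Assumption: the digest appears somewhere in the filename. The exact
    naming pattern is defined in fi_explainer.py and is not reproduced here.
    """
    digest = sha256_of_input(text)
    return sorted(cache_dir.glob(f"*{digest}*"))
```

If the lookup returns nothing for a string you expect to be cached, check fi_explainer.py for the exact encoding and any preprocessing applied before hashing.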

Python

Create a .venv and activate it.

pip install -r requirements.txt

python -m spacy download en_core_web_sm

python -m spacy download en_core_web_lg

You may want to install PyTorch manually.

Make sure to select the .venv in all notebooks.

Dataset

Run dataset_sampling.ipynb to obtain the base dataset from Guo et al. 2023 (CC-BY-SA). The individual notebooks derive additional synthetic datasets (CC-BY-SA).

Detectors

All detectors are extended to support masked input.

Explanation Methods

SHAP is used as-is.

Forks of LIME and Anchor are provided as submodules.

Anchor

  • Added a sampling budget to cap runtime: 200 samples per candidate during search, unlimited samples in the final "best of each size" round
  • DistilBERT was replaced with DistilRoBERTa, and the mask probability was adjusted to increase the coherence of perturbations
  • Changes to the rendering functions for the user study (used to share JS and CSS scope with LIME)
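The sampling budget described above can be pictured as a counter wrapped around the sampling function. This is a hypothetical sketch of the idea only, not the code in the Anchor fork: the class `BudgetedSampler` and the `sample_fn` callable (assumed to return a list of `n` samples) are invented for illustration.

```python
class BudgetedSampler:
    """Wrap a sampling function and stop handing out samples once a budget is spent."""

    def __init__(self, sample_fn, budget=None):
        self.sample_fn = sample_fn  # hypothetical: callable(n) -> list of n samples
        self.budget = budget        # None means unlimited
        self.used = 0

    def draw(self, n):
        """Return up to `n` samples, never exceeding the remaining budget."""
        if self.budget is not None:
            n = min(n, self.budget - self.used)
        if n <= 0:
            return []
        self.used += n
        return self.sample_fn(n)


# One capped sampler per candidate during search, an uncapped one for the final round.
search_sampler = BudgetedSampler(lambda n: [0] * n, budget=200)
final_sampler = BudgetedSampler(lambda n: [0] * n, budget=None)
```

Capping only the search phase keeps the runtime bounded while still letting the final comparison between candidates use as many samples as it needs.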

LIME

  • Cosmetic changes to the bar charts for the user study

Experiments

The explanations are provided as a zip file. All experiments are designed so that any subset of the dataset can be processed in parallel by executing the notebooks with different offsets.
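Offset-based parallelism of this kind usually means that each notebook run takes a disjoint slice of the dataset. The sketch below illustrates the idea under assumptions: the names `offset` and `chunk_size` and both helper functions are hypothetical, and the notebooks may use a different convention.

```python
def chunk_offsets(n_items, chunk_size):
    """Return the start offset of every chunk needed to cover `n_items`."""
    return list(range(0, n_items, chunk_size))


def select_chunk(items, offset, chunk_size):
    """Return the slice of `items` that the run starting at `offset` should process."""
    return items[offset:offset + chunk_size]


# Example: 10 documents split across runs of at most 4 documents each.
documents = list(range(10))
for offset in chunk_offsets(len(documents), 4):
    chunk = select_chunk(documents, offset, 4)
    # Each run would execute the notebook on `chunk` only.
```

Because the slices do not overlap, the runs can execute independently.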
