🕵🏻 Detecting Training Data of Large Language Models via Expectation Maximization


This is the repository for the official implementation of Detecting Training Data of Large Language Models via Expectation Maximization.


pip install -r requirements.txt

📘 Data

We mainly use WikiMIA and OLMoMIA (will be uploaded to 🤗 soon, or you can create your own using our scripts) as our benchmark datasets. You may use MIMIR or your own dataset by modifying the prepare_data function in to return a list of dictionaries in the format {"input": text, "label": label}, where label is either 1 (member) or 0 (non-member).
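As a rough illustration of that return format, a custom loader might look like the sketch below. Only the output format comes from the description above; the function signature, the JSONL input file, and its field names (`text`, `is_member`) are assumptions made for this example and may not match the repository's code.

```python
import json


def prepare_data(path):
    """Hypothetical loader: read a JSONL file and return records in the
    {"input": text, "label": label} format."""
    data = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            data.append({
                "input": record["text"],            # assumed source field name
                "label": int(record["is_member"]),  # 1 = member, 0 = non-member
            })
    return data
```

Each returned dictionary carries exactly the two keys the pipeline expects, so swapping in a different source only requires changing how `record` is read.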

🚀 Running Experiments

Running python (or with --target_model ${MODEL} --dataset_name ${DATASET} performs a membership inference attack on the target model ${MODEL} and the target dataset ${DATASET}, and stores membership scores in output/${MODEL}/${DATASET}/score/${METHOD}.jsonl for each ${METHOD}. For OLMo models, specify --olmo_step to select an intermediate checkpoint. You can choose which methods to run with the -m argument.


When the -m argument is omitted, the default baseline methods are Loss, Ref, Zlib, Min-K, and Min-K++. Ref uses a reference model, which you can set with the --ref_model argument. The default reference model is EleutherAI/pythia-70m for WikiMIA and stabilityai/stablelm-base-alpha-3b-v2 for OLMoMIA.
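For intuition, the simpler baselines can be sketched from per-token log-probabilities as follows. This is a minimal sketch under stated assumptions, not the repository's implementation: the function names are hypothetical, the sign convention (a lower value suggests a member) is chosen for this example, and Min-K++ is left out because it needs the full next-token distribution.

```python
import zlib


def loss_score(token_logprobs):
    """Loss baseline: average negative log-likelihood over tokens."""
    return -sum(token_logprobs) / len(token_logprobs)


def ref_score(target_loss, reference_loss):
    """Ref baseline: target-model loss calibrated by a reference model's loss."""
    return target_loss - reference_loss


def zlib_score(target_loss, text):
    """Zlib baseline: loss normalized by the zlib-compressed size of the text."""
    return target_loss / len(zlib.compress(text.encode("utf-8")))


def min_k_score(token_logprobs, k=0.2):
    """Min-K% baseline: negated mean of the lowest k fraction of token log-probs."""
    n = max(1, int(len(token_logprobs) * k))  # always keep at least one token
    lowest = sorted(token_logprobs)[:n]
    return -sum(lowest) / n
```

For example, `min_k_score(logprobs, k=0.2)` looks only at the 20% least likely tokens, which is the intuition behind a method name such as Min-K_20.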

ReCaLL-based Baselines

You can apply ReCaLL with a prefix built by concatenating n randomly selected shots, where n is set by the --num_shots argument. Depending on where the shots come from, there are three methods: ReCaLL-Rand draws from the entire dataset, ReCaLL-RandM from members in the dataset, and ReCaLL-RandNM from non-members in the dataset. For example, you can add arguments like -m ReCaLL-RandM ReCaLL-Rand ReCaLL-RandNM --num_shots "[1,2,4,8,12]".
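The three prefix sources can be illustrated with the sketch below. It assumes the {"input": text, "label": label} record format from the Data section; the function names, the space separator, and the seeding are assumptions for illustration. `recall_score` follows the usual ReCaLL definition, the ratio of the prefix-conditioned log-likelihood to the unconditional one.

```python
import random


def build_prefix(dataset, num_shots, source="Rand", seed=0):
    """Concatenate `num_shots` randomly chosen texts into a prefix.

    source: "Rand" (entire dataset), "RandM" (members only),
            "RandNM" (non-members only).
    """
    if source == "Rand":
        pool = dataset
    elif source == "RandM":
        pool = [d for d in dataset if d["label"] == 1]
    elif source == "RandNM":
        pool = [d for d in dataset if d["label"] == 0]
    else:
        raise ValueError(f"unknown source: {source}")
    rng = random.Random(seed)  # local RNG keeps the selection reproducible
    shots = rng.sample(pool, num_shots)
    return " ".join(d["input"] for d in shots)  # separator is an assumption


def recall_score(ll_with_prefix, ll_without_prefix):
    """ReCaLL-style score: conditional log-likelihood over unconditional."""
    return ll_with_prefix / ll_without_prefix
```

Note that `rng.sample` raises a `ValueError` if `num_shots` exceeds the size of the chosen pool, which is a useful guard for small datasets.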

With the -m ReCaLL-all argument, you can apply ReCaLL to all data in the dataset, using each data point in turn as a prefix. After that, you can apply the average baselines (Avg, AvgP) with the -m Avg argument and the TopPref baseline with the -m TopPref -n $n argument in


You can run EM-MIA by specifying initialization method(s) and prefix score update function(s) such as -i Loss Min-K++_20 -p AUC-ROC in
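The interplay between an initial membership score (-i) and a prefix score update function (-p) can be pictured with the loose sketch below. It is an assumption about the general shape of such an alternating loop, not the repository's implementation: the precomputed `recall[p][x]` matrix, the median split into pseudo-labels, the fixed iteration count, and the rule that a discriminative prefix is more likely a non-member are all choices made for this illustration, and every name is hypothetical.

```python
def auc(scores, labels):
    """Pairwise AUC-ROC: chance that a positive outscores a negative (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        return 0.5  # undefined without both classes; fall back to chance level
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def alternate_scores_sketch(recall, init_membership, num_iters=5):
    """recall[p][x]: score of target x when data point p is the prefix
    (higher = more member-like). Returns refined membership scores."""
    membership = list(init_membership)
    n = len(membership)
    for _ in range(num_iters):
        # Pseudo-labels from the current membership estimate (median split).
        threshold = sorted(membership)[n // 2]
        pseudo = [1 if m >= threshold else 0 for m in membership]
        # Prefix score: how well each prefix separates the pseudo-labels.
        prefix = [auc(recall[p], pseudo) for p in range(n)]
        # Assumed duality: discriminative prefixes tend to be non-members.
        membership = [-r for r in prefix]
    return membership
```

In this toy form the initial scores only seed the first set of pseudo-labels; later iterations depend entirely on how well each data point works as a prefix.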


To compare different MIA methods, you can plot ROC curves and calculate evaluation metrics: AUC-ROC and TPR @ k% FPR (k=0.1, 1, 5, 10, 20). You can also get score statistics and draw histograms for members and non-members. You can specify the methods to compare by their names or by their prefixes, like python output/${MODEL}/${DATASET} -m Loss Zlib Min-K_20 Min-K++_20 -p Ref ReCaLL-Avg ReCaLL-Rand --keep_used.
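For reference, the two metrics can be computed from scores and labels as in the dependency-free sketch below. It assumes a higher score means "more likely a member" and takes the best TPR among ROC points at or below the FPR budget. The function names are hypothetical, and the repository's evaluation script may interpolate or break ties differently.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) points, sweeping a threshold from high to low score."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Process tied scores together so ties do not inflate the curve.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc_roc(scores, labels):
    """Area under the ROC curve via the trapezoidal rule."""
    pts = roc_points(scores, labels)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def tpr_at_fpr(scores, labels, max_fpr):
    """Highest TPR among ROC points whose FPR does not exceed max_fpr."""
    return max(tpr for fpr, tpr in roc_points(scores, labels) if fpr <= max_fpr)
```

TPR @ k% FPR corresponds to `tpr_at_fpr(scores, labels, k / 100)`, for example `max_fpr=0.01` for k=1.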


This codebase is adapted from the following repositories: Min-K%, Min-K%++, and ReCaLL.


⭐ If you find our work (paper, implementation, and datasets) helpful, please consider citing our paper:

  title={Detecting Training Data of Large Language Models via Expectation Maximization},
  author={Kim, Gyuwan and Li, Yang and Spiliopoulou, Evangelia and Ma, Jie and Ballesteros, Miguel and Wang, William Yang},
  journal={arXiv preprint arXiv:2410.07582},


For any inquiry, please open an issue or contact the authors directly.