MGTBench 2.0: Rethinking the Machine-Generated Text Detection

MGTBench 2.0 provides reference implementations of machine-generated text (MGT) detection methods. It is under active development, and we will add more detection methods and analysis tools in the future.

Quick Start

Installation

git clone -b release https://github.com/Y-L-LIU/MGTBench-2.0
cd MGTBench-2.0
conda env create -f environment.yml
conda activate mgtbench2

Check out demo.ipynb for a quick start.

from mgtbench import AutoDetector, AutoExperiment
from mgtbench.loading.dataloader import load

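# Local path to the scoring model, or a model name on Hugging Face
model_name_or_path = '/data1/zzy/gpt2-medium'
# 'll' selects the log-likelihood detector
metric = AutoDetector.from_detector_name('ll',
                                         model_name_or_path=model_name_or_path)
experiment = AutoExperiment.from_experiment_name('threshold', detector=[metric])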

data_name = 'AITextDetect'
detectLLM = 'gpt35'
category = 'Art'
data = load(data_name, detectLLM, category)
experiment.load_data(data)
res = experiment.launch()

print('train:', res[0].train)
print('test:', res[0].test)
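
The 'threshold' experiment name suggests that a metric-based detector's scalar score is turned into a decision by comparing it with a cutoff fitted on the training split. The following is a minimal sketch of that idea under stated assumptions, not MGTBench's implementation: `fit_threshold` and `predict` are hypothetical helpers, the scores are made up, and higher scores are assumed to indicate machine-generated text (label 1).

```python
from typing import List, Sequence


def predict(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a text as machine-generated (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def fit_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the candidate cutoff with the highest training accuracy.

    Candidates are the observed scores themselves; ties keep the lowest cutoff.
    """
    best_threshold, best_accuracy = float("inf"), -1.0
    for candidate in sorted(set(scores)):
        predictions = predict(scores, candidate)
        accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, accuracy
    return best_threshold


# Illustrative (made-up) scores: label 0 = human, label 1 = machine-generated.
train_scores = [-3.9, -3.5, -3.2, -2.4, -2.1, -1.8]
train_labels = [0, 0, 0, 1, 1, 1]

threshold = fit_threshold(train_scores, train_labels)
test_predictions = predict([-3.6, -2.0], threshold)
```

Fitting the cutoff on the training split only and then applying it unchanged to the test split keeps the two reported numbers comparable.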

Supported Methods

Currently, we support the following methods (continuously updated):

Metric-based methods:
- Log-Likelihood [Ref];
- Rank [Ref];
- Log-Rank [Ref];
- Entropy [Ref];
- GLTR Test 2 Features (Rank Counting) [Ref];
- DetectGPT [Ref];
- LRR [Ref];
- NPR [Ref];
- DNA-GPT [Ref];
- Fast-DetectGPT [Ref];
- Binoculars [Ref];

Model-based methods:
- OpenAI Detector [Ref];
- ChatGPT Detector [Ref];
- ConDA [Ref] [Model Weights];
- GPTZero [Ref];
- RADAR [Ref];
- LM Detector [Ref];
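
Several of the metric-based methods above (Log-Likelihood, Rank, Log-Rank, Entropy) are simple statistics of a language model's next-token distributions. The sketch below illustrates them on a toy vocabulary with hand-written probabilities. It is an assumption-laden illustration rather than the code in this repository: the function names are hypothetical, no real model is involved, and actual detectors may use different sign conventions or normalisation.

```python
import math
from typing import Dict, List

# Each position holds a toy next-token distribution (probabilities sum to 1).
Distribution = Dict[str, float]


def token_rank(dist: Distribution, token: str) -> int:
    """1-based rank of the observed token; tokens with equal probability share a rank."""
    return 1 + sum(1 for p in dist.values() if p > dist[token])


def log_likelihood(dists: List[Distribution], tokens: List[str]) -> float:
    """Mean log-probability of the observed tokens."""
    return sum(math.log(d[t]) for d, t in zip(dists, tokens)) / len(tokens)


def mean_rank(dists: List[Distribution], tokens: List[str]) -> float:
    """Mean rank of the observed tokens."""
    return sum(token_rank(d, t) for d, t in zip(dists, tokens)) / len(tokens)


def mean_log_rank(dists: List[Distribution], tokens: List[str]) -> float:
    """Mean natural log of the rank of the observed tokens."""
    return sum(math.log(token_rank(d, t)) for d, t in zip(dists, tokens)) / len(tokens)


def mean_entropy(dists: List[Distribution]) -> float:
    """Mean Shannon entropy (in nats) of the predicted distributions."""
    per_position = [-sum(p * math.log(p) for p in d.values() if p > 0) for d in dists]
    return sum(per_position) / len(per_position)


# Hand-written distributions for a two-token text.
dists = [
    {"the": 0.5, "a": 0.25, "cat": 0.25},
    {"the": 0.25, "a": 0.25, "cat": 0.5},
]
tokens = ["the", "a"]

scores = {
    "log_likelihood": log_likelihood(dists, tokens),
    "rank": mean_rank(dists, tokens),
    "log_rank": mean_log_rank(dists, tokens),
    "entropy": mean_entropy(dists),
}
```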

Supported Datasets

AITextDetect

It contains human-written and AI-polished text in the following categories:

- STEM (Physics, Math, Computer, Biology, Chemistry, Electrical, Medicine, Statistics)
- Social Sciences (Education, Management, Economy and Finance)
- Humanities (Art, History, Literature, Philosophy, Law)

The texts are collected from Wikipedia, arXiv, and Project Gutenberg.

To check the dataset:

'''
supported LLMs and detect categories:

categories = ['Physics', 'Medicine', 'Biology', 'Electrical_engineering', 'Computer_science', 'Literature', 'History', 'Education', 'Art', 'Law', 'Management', 'Philosophy', 'Economy', 'Math', 'Statistics', 'Chemistry']

llms = ['Moonshot', 'gpt35', 'Mixtral', 'Llama3']

'Human' for human written data
'''

detectLLM = 'Llama3'
category = 'Math'

from datasets import load_dataset

# ai polished
polish = load_dataset("AITextDetect/AI_Polish_clean",
                      name=detectLLM,
                      split=category,
                      trust_remote_code=True
                    )

# human written
human = load_dataset("AITextDetect/AI_Polish_clean",
                     name='Human',
                     split=category,
                     trust_remote_code=True
                    )

You can also download the dataset from Hugging Face and examine it locally:

from datasets import load_dataset

# for human data, chemistry category
human_chemistry = load_dataset("path/to/AITextDetect/AI_Polish_clean/Human/Chemistry")
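
Because the configuration name and split must match the supported names exactly (for example 'Computer_science', not 'Computer'), it can help to validate them before calling load_dataset. The helper below is a hypothetical sketch, not part of MGTBench; it only builds the keyword arguments used in the snippets above and assumes the lists of categories and LLMs given in this section are complete.

```python
CATEGORIES = [
    "Physics", "Medicine", "Biology", "Electrical_engineering",
    "Computer_science", "Literature", "History", "Education", "Art", "Law",
    "Management", "Philosophy", "Economy", "Math", "Statistics", "Chemistry",
]
LLMS = ["Moonshot", "gpt35", "Mixtral", "Llama3"]
HUMAN = "Human"


def polish_dataset_args(source: str, category: str) -> dict:
    """Build load_dataset keyword arguments after checking the names.

    `source` is one of the supported LLMs, or 'Human' for human-written text.
    """
    if source != HUMAN and source not in LLMS:
        raise ValueError(f"unknown source {source!r}; expected 'Human' or one of {LLMS}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category {category!r}; expected one of {CATEGORIES}")
    return {
        "path": "AITextDetect/AI_Polish_clean",
        "name": source,
        "split": category,
        "trust_remote_code": True,
    }


args = polish_dataset_args("Llama3", "Math")
# With the `datasets` package installed and network access available:
# from datasets import load_dataset
# polish = load_dataset(**args)
```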

Usage

To run the benchmark on the AITextDetect dataset:

# specify the model as a local path or a model name on Hugging Face

# distinguish Human vs. Llama3 using LM-D detector
python benchmark.py --detectLLM Llama3 \
                    --method LM-D \
                    --model /path/to/distilbert-base-uncased \
                    --epochs 1 \
                    --batch_size 64 \
                    --lr 5e-6


# distinguish Human vs. gpt3.5 using log-likelihood detector
python benchmark.py --detectLLM gpt35 --method ll --model /path/to/gpt2-medium
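
To run the same detector against every supported LLM, the commands can be generated in a loop. The following is a minimal sketch that only uses the flags shown above (--detectLLM, --method, --model); `benchmark_command` is a hypothetical helper, and the subprocess call is left commented out so that the snippet merely prints the commands.

```python
import shlex
from typing import List

LLMS = ["Moonshot", "gpt35", "Mixtral", "Llama3"]


def benchmark_command(detect_llm: str, method: str, model: str) -> List[str]:
    """Build the argument list for one benchmark.py run."""
    return [
        "python", "benchmark.py",
        "--detectLLM", detect_llm,
        "--method", method,
        "--model", model,
    ]


# One log-likelihood run per LLM; replace the placeholder model path with your own.
commands = [benchmark_command(llm, "ll", "/path/to/gpt2-medium") for llm in LLMS]

for command in commands:
    print(shlex.join(command))
    # To actually launch the run from the repository root:
    # import subprocess
    # subprocess.run(command, check=True)
```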

To run model attribution on the AITextDetect dataset:

# distinguish Human, Moonshot, gpt3.5, Mixtral, Llama3 using LM-D detector

# --model_save_dir is the path where the trained models are saved
python attribution_train_all.py \
    --model_save_dir /data1/model_attribution \
    --output_csv attribution_results_new.csv && \
python attribution_eval_all.py \
    --result_csv eval_result.csv

This produces an attribution_results_new.csv file with all results and an eval_result.csv file with the highest F1 score for each category. The figure folder contains the confusion matrix for each category.
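
Selecting the highest F1 score per category amounts to a group-by-and-maximum over the results table. The sketch below shows that step on made-up rows. The column names (category, model, f1) and the values are assumptions for illustration only; the actual layout of attribution_results_new.csv may differ.

```python
import csv
import io
from typing import Dict


def best_f1_per_category(csv_text: str) -> Dict[str, dict]:
    """Keep, for each category, the row with the highest F1 score."""
    best: Dict[str, dict] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        category = row["category"]
        if category not in best or float(row["f1"]) > float(best[category]["f1"]):
            best[category] = row
    return best


# Made-up results with hypothetical column names and model labels.
sample = """category,model,f1
Art,model_a,0.81
Art,model_b,0.86
Math,model_a,0.92
Math,model_b,0.90
"""

best = best_f1_per_category(sample)
for category, row in best.items():
    print(category, row["model"], row["f1"])
```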

Note that you can also specify your own datasets in dataloader.py.

Cite

If you find this repo and dataset useful, please consider citing our work:

@inproceedings{he2024mgtbench,
author = {He, Xinlei and Shen, Xinyue and Chen, Zeyuan and Backes, Michael and Zhang, Yang},
title = {{Mgtbench: Benchmarking machine-generated text detection}},
booktitle = {{ACM SIGSAC Conference on Computer and Communications Security (CCS)}},
pages = {},
publisher = {ACM},
year = {2024}
}

@software{liu2024rethinkingMGT,
  author = {Liu, Yule and Zhong, Zhiyuan and Liao, Yifan and Leng, Jiaqi and Sun, Zhen and Chen, Yang and Gong, Qingyuan and Zhang, Yang and He, Xinlei},
  month = {10},
  title = {{MGTBench-2.0: Rethinking the Machine-Generated Text Detection}},
  url = {https://github.com/Y-L-LIU/MGTBench-2.0},
  version = {2.0.0},
  year = {2024}
}
