phr

replication code for Estimating the Hallucination Rate of Generative AI

citation

@misc{jesson2024estimating,
      title={Estimating the Hallucination Rate of Generative AI},
      author={Andrew Jesson and Nicolas Beltran-Velez and Quentin Chu and Sweta Karlekar and Jannik Kossen and Yarin Gal and John P. Cunningham and David Blei},
      year={2024},
      eprint={2406.07457},
      archivePrefix={arXiv},
}

installation

git clone https://github.com/blei-lab/PHR.git
cd PHR
conda env create -f environment.yaml
pip install [-e] .

examples

regression

fit the model

python3 phr/regression/fit.py --job-dir output/ --experiment-id baseline --seed 0

run the evaluation script. get the run ID from output/wandb (e.g. 20240602_173936-biy3p7qq) and pass it as --run-id

python3 phr/regression/evaluate.py --job-dir output/ --run-id 20240602_173936-biy3p7qq --set test --N 100
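
to avoid copying the run ID by hand, a small helper can pick the most recent run directory. this is a minimal sketch, not part of the repository: the function name latest_run_id is hypothetical, and it assumes run directories are named like the example above (timestamp, dash, ID), optionally with the "run-" or "offline-run-" prefix that wandb commonly adds.

```python
import re
from pathlib import Path

# Assumed directory naming: "<YYYYMMDD>_<HHMMSS>-<id>", optionally prefixed
# with "run-" or "offline-run-". Adjust if your output/wandb looks different.
RUN_DIR = re.compile(r"^(?:offline-)?(?:run-)?(\d{8}_\d{6}-\w+)$")


def latest_run_id(wandb_dir="output/wandb"):
    """Return the newest run ID under wandb_dir, or None if none is found."""
    root = Path(wandb_dir)
    if not root.is_dir():
        return None
    run_ids = []
    for path in root.iterdir():
        match = RUN_DIR.match(path.name)
        if path.is_dir() and match:
            run_ids.append(match.group(1))
    # The timestamp prefix sorts chronologically as a plain string.
    return max(run_ids, default=None)


if __name__ == "__main__":
    print(latest_run_id())
```

the printed value can then be passed to --run-id in the evaluation command.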

for plotting results, use notebooks/regression/plotting.ipynb

language

run the evaluation script for Llama-2 7B. the command below runs a single configuration; repeat it for seeds [0, ..., 49], context lengths [2, 4, 8, 16, 32], and datasets [sst2, SetFit/subj, ag_news, medical_questions_pairs, rte, wnli]. slurm job arrays are recommended.

python3 \
  phr/language/evaluate.py \
    --job-dir output/ \
    --model meta-llama/Llama-2-7b-hf \
    --dataset sst2 \
    --context-length 2 \
    --N 5 \
    --num-reps 10 \
    --num-y-samples 50 \
    --top-p 0.9 \
    --seed 0
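
the sweep over seeds, context lengths, and datasets can be sketched as a mapping from a job-array index to one command. this is an illustrative sketch, not a script shipped with the repository: the function name build_command and the ordering of the grid are assumptions, and every flag value other than dataset, context length, and seed is copied from the command above. SLURM_ARRAY_TASK_ID is the environment variable slurm sets for each array task.

```python
import itertools
import os
import shlex

SEEDS = range(50)  # seeds 0, ..., 49
CONTEXT_LENGTHS = [2, 4, 8, 16, 32]
DATASETS = [
    "sst2",
    "SetFit/subj",
    "ag_news",
    "medical_questions_pairs",
    "rte",
    "wnli",
]

# Full grid: 6 datasets x 5 context lengths x 50 seeds = 1500 tasks.
# The seed varies fastest (an arbitrary choice for this sketch).
GRID = list(itertools.product(DATASETS, CONTEXT_LENGTHS, SEEDS))


def build_command(task_id: int) -> list[str]:
    """Return the evaluate.py command line for one grid index."""
    dataset, context_length, seed = GRID[task_id]
    return [
        "python3", "phr/language/evaluate.py",
        "--job-dir", "output/",
        "--model", "meta-llama/Llama-2-7b-hf",
        "--dataset", dataset,
        "--context-length", str(context_length),
        "--N", "5",
        "--num-reps", "10",
        "--num-y-samples", "50",
        "--top-p", "0.9",
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Falls back to index 0 when run outside a slurm job array.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    print(shlex.join(build_command(task_id)))
```

saved under a hypothetical name such as sweep.py, the script prints one command per array index, so a job submitted with an array range of 0-1499 could run the printed command in each task. some clusters cap the job-array size, in which case the range may need to be split across several submissions.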

for plotting results, use notebooks/language/plotting.ipynb
