🤗 Preference Dataset | 📚 Documentation | 📄 Paper
This repository contains the source code for the paper, Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback, where we introduce a routing framework that creates hybrid preference datasets by combining LLM and human preference annotations so as to maximize performance on a given evaluation metric (e.g., RewardBench). We release this codebase to improve the reproducibility of our work and to aid researchers in constructing preference datasets for their own projects.
![main_figure](https://private-user-images.githubusercontent.com/12949683/373064876-3bfb7c42-ec9c-4457-9949-367dc6270269.png)
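To give a rough sense of the core idea, here is a minimal, illustrative sketch (not the repository's actual API): the router decides, per instance, whether a human preference label is worth acquiring over the existing LLM label, subject to a fixed human-annotation budget. The `Instance` fields and the `predicted_gain` score below are hypothetical stand-ins for the learned routing model.

```python
# Illustrative sketch only -- the class, field names, and scoring below are
# hypothetical and do not correspond to this repository's actual interface.
from dataclasses import dataclass


@dataclass
class Instance:
    prompt: str
    predicted_gain: float  # estimated benefit of a human label over the LLM label


def route(instances: list[Instance], human_budget: int) -> dict[str, list[Instance]]:
    """Send the instances expected to benefit most from human feedback to human
    annotators; the rest keep their LLM preference labels."""
    ranked = sorted(instances, key=lambda x: x.predicted_gain, reverse=True)
    return {"human": ranked[:human_budget], "llm": ranked[human_budget:]}


# With a budget of 1, only the highest-gain instance is sent to human annotators.
pool = [Instance("Explain recursion simply.", 0.8), Instance("What is 2 + 2?", 0.1)]
routing = route(pool, human_budget=1)
```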
Install the dependencies within your Python environment:
```sh
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Running the full pipeline involves several steps, some of which need to be run on a TPU machine. We provide scripts that automate the different parts of the pipeline; please head over to the docs directory for more information.
```bibtex
@article{miranda2024hybrid,
  title={{Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback}},
  author={Miranda, Lester James V. and Wang, Yizhong and Elazar, Yanai and Kumar, Sachin and Pyatkin, Valentina and Brahman, Faeze and Smith, Noah A. and Hajishirzi, Hannaneh and Dasigi, Pradeep},
  journal={arXiv preprint arXiv:2410.19133},
  year={2024}
}
```