First, you need to set a HuggingFace token as an environment variable (HF_TOKEN):
export HF_TOKEN=<your huggingface token>
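To quickly verify the token is picked up, you can query the Hub with huggingface_hub, which reads HF_TOKEN from the environment. This is only a sanity check, not part of the scripts:
import os
from huggingface_hub import HfApi

# HfApi falls back to the HF_TOKEN environment variable if no token is passed explicitly
api = HfApi(token=os.environ.get("HF_TOKEN"))
print(api.whoami()["name"])  # prints your Hub username if the token is valid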
You can find all runnable experiments in the scripts directory; each script's filename should make its purpose clear.
We currently use facebook/nllb-200-3.3B for translation. First, install the sentence splitter using:
pip install git+https://github.com/mediacloud/sentence-splitter.git
To translate RewardBench into 22 Aya languages, run the following:
cd scripts
bash run_nllb.sh
You can also translate a specific preference dataset from HuggingFace into a specific target language using scripts/translate_preference_pairs_nllb.py.
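For reference, here's a rough sketch of what sentence-level NLLB translation looks like with transformers and sentence-splitter. This is only illustrative; the function names and generation settings are not taken from scripts/translate_preference_pairs_nllb.py:
from sentence_splitter import SentenceSplitter
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL = "facebook/nllb-200-3.3B"
tokenizer = AutoTokenizer.from_pretrained(MODEL, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)
splitter = SentenceSplitter(language="en")

def translate(text: str, tgt_lang: str = "deu_Latn") -> str:
    # split into sentences, translate each, and join the results back together
    sentences = splitter.split(text)
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(
        **inputs,
        # force the decoder to start generating in the target language
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_length=512,
    )
    return " ".join(tokenizer.batch_decode(outputs, skip_special_tokens=True))

print(translate("The quick brown fox jumps over the lazy dog."))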
Here, we use the scripts/run_rewardbench.py command-line interface and pass a HuggingFace dataset.
This is useful if the reward model is trained as a Custom classifier (🛠️), Sequence classifier (🔢), or via DPO (🎯).
For example, if we want to get the reward score of the UltraRM-13b reward model on a preference dataset, we run:
python -m scripts.run_rewardbench \
--model openbmb/UltraRM-13b \
--chat_template openbmb \
--dataset_name $DATASET \
--lang_code $LANG_CODE \
--split "filtered" \
--output_dir $OUTDIR \
--batch_size 8 \
--trust_remote_code \
--force_truncation \
--save_all
The evaluation parameters can be found in the allenai/reward-bench repository.
This runs the reward model on the (prompt, chosen, rejected) triples and gives us the reward score for each instance.
The results are saved into a JSON file inside the $OUTDIR directory.
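Conceptually, the scoring loop for a Sequence classifier RM boils down to the sketch below; the actual script additionally handles chat templates, batching, and truncation, and the model used here is only an illustrative stand-in:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# illustrative stand-in; run_rewardbench.py supports many reward model architectures
MODEL = "OpenAssistant/reward-model-deberta-v3-large-v2"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

def score(prompt: str, response: str) -> float:
    # the scalar logit of the classifier head is the reward
    inputs = tokenizer(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits[0].item()

# an instance counts as correct when the chosen response outscores the rejected one
prompt = "How do I boil an egg?"
print(score(prompt, "Simmer it for about 8 minutes.") > score(prompt, "Eggs cannot be boiled."))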
Finally, you can find some experiments in the experiments/run_rm_evals.sh script.
Here we use scripts/run_generative.py, a modified version of the same script in RewardBench, to obtain rewards from a Generative RM (🗨️). The only difference is that this script accepts any arbitrary HuggingFace preference dataset instead of just the RewardBench dataset (a change we plan to contribute upstream later on).
For Generative RMs, we prompt a model in a style akin to LLM-as-a-judge and then parse the output to obtain the preference. This works both for closed-source APIs (e.g., GPT-4, Claude) and for open-source LMs (served via vLLM). If you plan to use closed-source APIs, you also need to set the corresponding API keys:
export OPENAI_API_KEY=<your openai token>
export CO_API_KEY=<your cohere api token>
export ANTHROPIC_API_KEY=<your anthropic token>
You can also store all your API keys in a .env file.
It will be loaded using the python-dotenv library.
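Loading the keys then amounts to a single call at the top of a script. A minimal sketch of what python-dotenv does:
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from a local .env file into os.environ
print("OPENAI_API_KEY" in os.environ)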
Say we want to obtain the preferences of gpt-4-turbo-2024-04-09:
export OPENAI_API_KEY=<your openai token>
python -m scripts.run_generative \
--dataset_name $DATASET \
--model gpt-4-turbo-2024-04-09 \
--split "filtered" \
--lang_code $LANG_CODE \
--output_dir $OUTDIR
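Under the hood, a Generative RM evaluation prompts the model as a judge over the two candidate responses and parses its verdict. Below is a simplified sketch of that pattern using the OpenAI client; the actual judge prompt and parsing logic in scripts/run_generative.py differ:
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_TEMPLATE = (
    "You are a judge. Given a user prompt and two assistant answers, reply with "
    "[[A]] if answer A is better or [[B]] if answer B is better.\n\n"
    "Prompt: {prompt}\n\nAnswer A: {a}\n\nAnswer B: {b}"
)

def judge(prompt: str, answer_a: str, answer_b: str) -> str:
    # ask the judge model which answer it prefers and parse the verdict
    completion = client.chat.completions.create(
        model="gpt-4-turbo-2024-04-09",
        messages=[{"role": "user", "content": JUDGE_TEMPLATE.format(prompt=prompt, a=answer_a, b=answer_b)}],
        temperature=0,
    )
    text = completion.choices[0].message.content
    return "A" if "[[A]]" in text else "B"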
You can also run open-source LMs in a generative fashion.
The inference is then routed through vLLM.
Here's an example using meta-llama/Meta-Llama-3-70B-Instruct:
python -m scripts.run_generative \
--dataset_name $DATASET \
--lang_code $LANG_CODE \
--split "filtered" \
--model "meta-llama/Meta-Llama-3-70B-Instruct" \
--num_gpus 4 \
--output_dir $OUTDIR
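When an open-source model is passed, the judge prompts are generated locally through vLLM instead of an API call. A minimal sketch of that path (the actual script builds and formats the judge prompts for you; this only illustrates the vLLM call):
from vllm import LLM, SamplingParams

# tensor_parallel_size mirrors the --num_gpus argument above
llm = LLM(model="meta-llama/Meta-Llama-3-70B-Instruct", tensor_parallel_size=4)
params = SamplingParams(temperature=0.0, max_tokens=512)

# judge_prompts would hold one formatted LLM-as-a-judge prompt per (prompt, chosen, rejected) triple
judge_prompts = ["<formatted judge prompt for one instance>"]
outputs = llm.generate(judge_prompts, params)
print(outputs[0].outputs[0].text)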
To improve the model's output, especially in multilingual cases, we recommend passing a pair of languages to the --include_languages parameter. The first value should be the language the prompt was written in, and the second the language the assistant should use in its answer.
python -m scripts.run_generative \
--dataset_name $DATASET \
--lang_code deu_Latn \
--split $SPLIT \
--model "meta-llama/Meta-Llama-3-70B-Instruct" \
--num_gpus 4 \
--include_languages German English \
--output_dir $OUTDIR