Code for the paper Robust Search with Uncertainty-Aware Value Models for Language Model Reasoning
To run train_generator.sh (under scripts/gsm8k or scripts/math), first set WANDB_API_KEY, WANDB_ENTITY, model_name_or_path, and save_dir. The generator is named by save_generator_id.
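Before launching a long training job, it can help to confirm that the Weights & Biases settings are actually present. The following is a minimal sketch, not part of the repository: require_vars is a hypothetical helper, and the sketch assumes WANDB_API_KEY and WANDB_ENTITY are exported in your shell. If the script defines them itself, along with model_name_or_path and save_dir, edit them there instead.

```shell
# Hypothetical helper, not part of the repository: succeeds only if every
# named variable is set to a non-empty value.
require_vars() {
    missing=0
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (placeholders, replace with your own values):
#   export WANDB_API_KEY=<your key>
#   export WANDB_ENTITY=<your entity>
if require_vars WANDB_API_KEY WANDB_ENTITY; then
    echo "W&B settings found; ready to run train_generator.sh"
else
    echo "set the variables listed above first"
fi
```

The helper only reports what is missing; it does not modify the training script.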
cd UVM
bash scripts/math/train_generator.sh
First, use the generator generator_id to generate n_solutions for each question in the training set:
cd UVM
bash scripts/math/generate.sh
Before running, set model_name_or_path to the path of your generator checkpoint.
The output will be saved to data/math/model_generation/
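To confirm that generation wrote something, you can count the files in the output directory. This is a rough sketch that assumes you are inside the UVM directory; count_files is a hypothetical helper, and the names and format of the generated files are not specified here, so it only counts them.

```shell
# Hypothetical helper, not part of the repository: print how many regular
# files a directory holds (0 if the directory does not exist).
count_files() {
    if [ -d "$1" ]; then
        find "$1" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

out_dir="data/math/model_generation"
echo "files under $out_dir: $(count_files "$out_dir")"
```

A count of 0 after generate.sh finishes suggests the run failed or wrote elsewhere.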
Train UVM using train_hyperscorer.sh. First set WANDB_API_KEY, WANDB_ENTITY, save_dir, and model_name_or_path (the path of the generator checkpoint).
cd UVM
bash scripts/math/train_hyperscorer.sh
Experiments are accelerated by vLLM: both generation and UVM scoring are implemented with vLLM.
Configure your generator checkpoint path model_name_or_path and UVM checkpoint path scorer_model_name_or_path in search.sh.
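A quick check that both checkpoint paths exist can save a failed search run. The sketch below assumes both checkpoints are local directories; check_dirs is a hypothetical helper, and the two paths are placeholders for whatever values you put in search.sh.

```shell
# Hypothetical helper, not part of the repository: succeed only if every
# given path is an existing directory.
check_dirs() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            status=1
        fi
    done
    return "$status"
}

# Placeholder paths; substitute the values you set in search.sh.
generator_ckpt="path/to/generator_checkpoint"
uvm_ckpt="path/to/uvm_checkpoint"
if check_dirs "$generator_ckpt" "$uvm_ckpt"; then
    echo "both checkpoints found"
else
    echo "fix model_name_or_path or scorer_model_name_or_path in search.sh"
fi
```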
cd UVM
bash scripts/math/search.sh
The output will be saved to, for example, eval_results/math/generator_with_scorer/test500.
First set --action_noise_type none in search.sh, then run:
cd UVM
bash scripts/math/search.sh
@misc{yu2025robustsearchuncertaintyawarevalue,
title={Robust Search with Uncertainty-Aware Value Models for Language Model Reasoning},
author={Fei Yu and Yingru Li and Benyou Wang},
year={2025},
eprint={2502.11155},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2502.11155},
}