UVM, Uncertainty-Aware Value Models for Language Model Reasoning

Code for the paper Robust Search with Uncertainty-Aware Value Models for Language Model Reasoning

Training

Train the generator

Before running train_generator.sh (under scripts/gsm8k or scripts/math), set WANDB_API_KEY, WANDB_ENTITY, model_name_or_path, and save_dir. The trained generator is identified by save_generator_id.

cd UVM
bash scripts/math/train_generator.sh
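
Because training jobs are long, it can help to check the required variables before launching. The following is a minimal sketch, not part of the repository: require_vars is a hypothetical helper, and it assumes the variables are exported in the calling shell. If train_generator.sh assigns them internally, edit the script instead.

```shell
# Hypothetical helper (not part of the UVM repository): fail fast when a
# required variable is unset or empty. Written for POSIX sh and bash.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when every variable is set, 1 otherwise.
  return "$missing"
}
```

For example: require_vars WANDB_API_KEY WANDB_ENTITY && bash scripts/math/train_generator.sh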

Train the UVM

Generation

First, use the generator identified by generator_id to sample n_solutions solutions for each question in the training set:

cd UVM
bash scripts/math/generate.sh

Before running generate.sh, set model_name_or_path to the path of your generator checkpoint.

The output is saved to data/math/model_generation/.

Training

Train the UVM with train_hyperscorer.sh. First set WANDB_API_KEY, WANDB_ENTITY, save_dir, and model_name_or_path (the path to the generator checkpoint).

cd UVM
bash scripts/math/train_hyperscorer.sh

Inference

Inference is accelerated with vLLM: both generation and UVM scoring run on vLLM.

UVM-Guided Beam Search

In search.sh, set model_name_or_path to your generator checkpoint path and scorer_model_name_or_path to your UVM checkpoint path.

cd UVM
bash scripts/math/search.sh

The output is saved under a path such as eval_results/math/generator_with_scorer/test500.
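
Once every script has been configured, the four stages described so far (train the generator, generate solutions, train the UVM, search) can be chained. This is only a sketch under stated assumptions: run_pipeline and DRY_RUN are hypothetical names, not part of the repository, and it assumes that scripts/gsm8k contains the same four script names as scripts/math.

```shell
# Hypothetical wrapper (not part of the UVM repository): run the four stages
# in order for one dataset directory under scripts/ (e.g. math or gsm8k).
# Set DRY_RUN=1 to print the commands instead of executing them.
run_pipeline() {
  dataset="$1"
  for stage in train_generator generate train_hyperscorer search; do
    script="scripts/$dataset/$stage.sh"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "bash $script"
    else
      # Stop at the first failing stage; later stages depend on earlier outputs.
      bash "$script" || return 1
    fi
  done
}
```

For example, DRY_RUN=1 run_pipeline math lists the commands without running them. Each stage still needs its checkpoint paths set in the corresponding script beforehand.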

OVM-Guided Beam Search

First set --action_noise_type none in search.sh, then run:

cd UVM
bash scripts/math/search.sh
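
To confirm which noise setting a run will use, it may help to print the relevant line of search.sh before launching. A minimal sketch, assuming the flag appears literally in the script; show_flag is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper (not part of the UVM repository): print the lines of a
# script that mention a given flag, prefixed with their line numbers.
show_flag() {
  # -n: show line numbers; -F: match the flag as a fixed string;
  # -e: take the next argument as the pattern even though it starts with "--".
  grep -n -F -e "$2" -- "$1"
}
```

For example: show_flag scripts/math/search.sh --action_noise_type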

Citation

@misc{yu2025robustsearchuncertaintyawarevalue,
      title={Robust Search with Uncertainty-Aware Value Models for Language Model Reasoning}, 
      author={Fei Yu and Yingru Li and Benyou Wang},
      year={2025},
      eprint={2502.11155},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2502.11155}, 
}
