ARM (Adaptive Reasoning Model): a reasoning model that adaptively selects the appropriate reasoning format based on the task at hand.
- 2025/05/27: Thrilled to release ARM: A reasoning model capable of adaptively selecting reasoning formats based on the task, achieving a better trade-off between effectiveness and efficiency!
You can download our dataset and model from 🤗HuggingFace.
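As a minimal sketch, a released checkpoint can be loaded with `transformers` like any causal LM. The repo id below is a hypothetical placeholder; substitute the actual id from our HuggingFace page:

```python
# Minimal loading sketch; "TEAM-ARM/ARM-7B" is a hypothetical repo id,
# replace it with the actual model id from the HuggingFace page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TEAM-ARM/ARM-7B"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
```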
This repository contains the codebase for SFT and RL, built on LLaMA-Factory and VeRL. We use a separate conda environment for each stage:
```bash
# SFT
conda env create -f environment/llama_factory_env.yaml
conda activate arm_llama_factory

# RL
conda env create -f environment/verl_env.yaml
conda activate arm_verl
pip3 install --force-reinstall torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124
pip3 install flash-attn --no-build-isolation
```
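After setting up the RL environment, a quick sanity check (an illustrative snippet, not a script from this repo) can confirm that the pinned torch build and flash-attn import correctly:

```python
# Illustrative sanity check for the arm_verl environment (not a repo script).
import torch
import flash_attn  # fails here if flash-attn did not build against this torch

print(torch.__version__)          # expect 2.4.0
print(torch.version.cuda)         # expect 12.4
print(torch.cuda.is_available())  # should be True on a GPU machine
```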
For SFT (stage 1), make sure to specify the correct model path in the `.yaml` file, then train and merge:

```bash
conda activate arm_llama_factory
cd LLaMA-Factory
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train stage1_scripts/qwen2.5_7b/train.yaml
llamafactory-cli export stage1_scripts/qwen2.5_7b/merge.yaml
```
For RL (stage 2), make sure to specify the correct model and data paths in the `.sh` file:

```bash
conda activate arm_verl
cd verl
# The training data is located in arm/verl/data/parquet.
# Alternatively, you can prepare your own training data, e.g.:
python3 stage2_scripts/data_preprocess/gsm8k.py
# You can also prepare data for the instruction-guided mode used in evaluation, e.g.:
python3 stage2_scripts/data_preprocess/instruction_guided/gsm8k.py
# Launch RL training:
bash stage2_scripts/trainer/run.sh
```
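If you write your own preprocessing script, the sketch below shows the general shape of a VeRL-style GSM8K-to-parquet converter, modeled on VeRL's bundled examples. The field names are assumptions and the exact schema expected by `stage2_scripts/data_preprocess/gsm8k.py` may differ:

```python
# Sketch of a VeRL-style GSM8K -> parquet converter (field names follow
# verl's example preprocessors and are assumptions, not this repo's schema).
from datasets import load_dataset
import pandas as pd

def extract_answer(solution: str) -> str:
    # GSM8K gold answers end with "#### <number>"
    return solution.split("####")[-1].strip().replace(",", "")

train = load_dataset("openai/gsm8k", "main", split="train")

rows = [{
    "data_source": "openai/gsm8k",
    "prompt": [{"role": "user", "content": ex["question"]}],
    "ability": "math",
    "reward_model": {"style": "rule", "ground_truth": extract_answer(ex["answer"])},
} for ex in train]

pd.DataFrame(rows).to_parquet("data/parquet/gsm8k_train.parquet")
```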
For generation, ARM supports an adaptive mode and an instruction-guided mode:

```bash
# Adaptive Mode
bash stage2_scripts/generation/adaptive_run.sh
# Instruction-Guided Mode: specify the reasoning format in the .sh file.
bash stage2_scripts/generation/instruction_guided_run.sh
```

To evaluate the generated outputs:

```bash
bash stage2_scripts/evaluation/run.sh
```

[Work in Progress] Stay tuned!
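For reference, evaluation on math benchmarks typically reduces to rule-based answer matching. The sketch below is illustrative only; the actual logic lives in `stage2_scripts/evaluation/` and may normalize or extract answers differently:

```python
# Illustrative rule-based scorer; the repo's evaluation in
# stage2_scripts/evaluation/ may differ in normalization and extraction.
import re

def extract_answer(text: str):
    """Return \\boxed{...} content if present, else the last number in the text."""
    m = re.search(r"\\boxed\{([^}]*)\}", text)
    if m:
        return m.group(1).strip()
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

def accuracy(predictions, golds):
    hits = sum(extract_answer(p) == g.strip() for p, g in zip(predictions, golds))
    return hits / len(golds)

print(accuracy(["The answer is \\boxed{42}."], ["42"]))  # 1.0
```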
If you have any problems, please contact Siye Wu or Jian Xie.
If our paper or related resources prove valuable to your research, we kindly ask for a citation.
```bibtex
@article{wu2025arm,
  title={ARM: Adaptive Reasoning Model},
  author={Wu, Siye and Xie, Jian and Zhang, Yikai and Chen, Aili and Zhang, Kai and Su, Yu and Xiao, Yanghua},
  journal={arXiv preprint arXiv:2505.20258},
  year={2025}
}
```

