[trainer] feat: ReMax support using reward model for baseline #3780

Merged
wuxibin89 merged 1 commit into verl-project:main from HollowMan6:remax-reward
Oct 17, 2025

Commits on Oct 16, 2025