feat(grpo_trainer.py): Variational Sequence-Level Soft Policy Optimization (VESPO) #5199

Merged
qgallouedec merged 12 commits into huggingface:main from casinca:VESPO
Mar 14, 2026

warning for `importance_sampling_level != "token"`

c5d0d50

Annotations

1 error

The logs for this run have expired and are no longer available.