feat(grpo_trainer.py): Variational Sequence-Level Soft Policy Optimization (VESPO)#5199

Merged: qgallouedec merged 12 commits into huggingface:main from casinca:VESPO on Mar 14, 2026

warning for `importance_sampling_level != "token"`

c5d0d50
Cursor / Cursor Bugbot succeeded Mar 14, 2026 in 4m 54s

Bugbot Review

Bugbot Analysis Progress (4m 57s elapsed)

✅ Gathered PR context (3s)
✅ Completed bug detection (4m 53s)
✅ Posted analysis results (1s)

Final Result: Bugbot completed review - no issues found! ✅

Request ID: serverGenReqId_980ecd63-c0ae-4162-85bc-f21e142b8da3
