Add SDPO (Self-Distillation Policy Optimization) trainer #4935

Merged
kashif merged 85 commits into huggingface:main from MengAiDev:4929 on Mar 23, 2026

remove ref_model reference

bf4cc67
Cursor / Cursor Bugbot completed Mar 22, 2026 in 12m 1s

Bugbot Review

Bugbot Analysis Progress (12m 4s elapsed)

✅ Gathered PR context (3s)
✅ Completed bug detection (11m 52s)
✅ Posted analysis results (9s)

Final Result: Bugbot completed review - no new issues found. 6 previously reported issues remain unresolved.

Request ID: serverGenReqId_e2fc9711-f674-4d8d-a03e-d24145c275c9