Refactor KTO coordinated with DPO [b/N]: Simplify truncation logic by albertvillanova · Pull Request #4808 · huggingface/trl

albertvillanova · 2026-01-12T14:04:44Z

Refactor KTO coordinated with DPO [b/N]: Simplify truncation logic.

This PR simplifies the truncation logic by removing max_prompt_length and truncation_mode parameters, keeping only max_length.

With max_length as the only length setting, the KTO trainer is easier to use and understand, while truncation behavior remains correct.

Follow-up to:

Refactor KTO coordinated with DPO [a/N]: Remove encoder-decoder support #4792

Part of:

KTO refactoring #4786 (comment)

Coordinated with the DPO refactoring, as discussed with @qgallouedec:

Refactor DPO #3906

Key Changes

KTOConfig
- Removed max_prompt_length parameter and documentation
- Removed truncation_mode parameter and documentation
- Now only uses max_length for sequence length control
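
For callers that still carry the removed arguments in their own configuration, a minimal migration sketch follows. The helper name `drop_removed_kto_args` and the example values are hypothetical and not part of this PR; only the two removed parameter names and `max_length` come from the change description.

```python
# Hypothetical helper (not part of TRL): strip the arguments removed by this PR
# from a dict of keyword arguments before building a KTO config from it.
REMOVED_KTO_ARGS = ("max_prompt_length", "truncation_mode")


def drop_removed_kto_args(config_kwargs):
    """Return (kept, dropped): a copy without the removed keys, and the keys that were dropped."""
    kept = {k: v for k, v in config_kwargs.items() if k not in REMOVED_KTO_ARGS}
    dropped = [k for k in config_kwargs if k in REMOVED_KTO_ARGS]
    return kept, dropped


# Example values are made up for illustration.
old_kwargs = {
    "output_dir": "kto-out",
    "max_length": 1024,
    "max_prompt_length": 512,
    "truncation_mode": "keep_end",
}
new_kwargs, dropped = drop_removed_kto_args(old_kwargs)
print(new_kwargs)
print(dropped)
```

The cleaned dict could then be passed on as `KTOConfig(**new_kwargs)`, assuming the remaining keys are still valid for the installed TRL version.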

KTOTrainer

Simplified _process_tokens() function:
- Removed complex two-stage truncation logic
- Old logic: First truncate prompt (with keep_start/keep_end modes), then truncate completion
- New logic: Simple single-stage truncation from the end (completion only)
- Updated docstring to reflect simpler behavior
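
The difference between the two strategies above can be sketched on plain lists of token ids. This is an illustration under stated assumptions, not the code in `_process_tokens()`: the function names `truncate_two_stage` and `truncate_single_stage` are hypothetical, and the choice to leave an over-long prompt untouched in the new variant is an assumption, since the PR description does not cover that case.

```python
def truncate_two_stage(prompt_ids, completion_ids, max_length,
                       max_prompt_length, truncation_mode="keep_end"):
    """Sketch of the removed logic: truncate the prompt first, then the completion."""
    if len(prompt_ids) + len(completion_ids) > max_length:
        if truncation_mode == "keep_start":
            prompt_ids = prompt_ids[:max_prompt_length]
        elif truncation_mode == "keep_end":
            prompt_ids = prompt_ids[-max_prompt_length:]
        else:
            raise ValueError(f"Unknown truncation mode: {truncation_mode}")
    # If the pair is still too long, cut the completion as well.
    if len(prompt_ids) + len(completion_ids) > max_length:
        completion_ids = completion_ids[: max_length - max_prompt_length]
    return prompt_ids, completion_ids


def truncate_single_stage(prompt_ids, completion_ids, max_length):
    """Sketch of the new logic: only the end of the completion is cut."""
    if max_length is None:
        return prompt_ids, completion_ids
    # Assumption: the prompt is never shortened; the completion gets what is left.
    budget = max(max_length - len(prompt_ids), 0)
    return prompt_ids, completion_ids[:budget]


prompt = list(range(6))               # 6 prompt tokens
completion = list(range(100, 106))    # 6 completion tokens

old = truncate_two_stage(prompt, completion, max_length=8, max_prompt_length=4)
new = truncate_single_stage(prompt, completion, max_length=8)
print(old)  # ([2, 3, 4, 5], [100, 101, 102, 103])
print(new)  # ([0, 1, 2, 3, 4, 5], [100, 101])
```

In the sketch, the single-stage variant needs one parameter instead of three, at the cost of keeping fewer completion tokens when the prompt is long.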

Trainer Initialization:
- Removed max_prompt_length setup logic
- Removed self.max_prompt_length attribute
- Removed self.truncation_mode attribute
- Removed both parameters from fn_kwargs passed to _process_tokens()

Tests
- Updated test to remove truncation_mode and max_prompt_length from fn_kwargs

HuggingFaceDocBuilderDev · 2026-01-12T14:07:36Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec

LGTM!

FYI, answer, response, and completion all refer to the same thing. We used to mix these terms, but now we try to use completion only.

albertvillanova added 2 commits January 12, 2026 14:52

Remove max_prompt_length and truncation_mode from KTO

928f96d

Update test

9289b34

qgallouedec approved these changes Jan 12, 2026


albertvillanova mentioned this pull request Jan 12, 2026

KTO refactoring #4786

Open

6 tasks

Unify naming to completion instead of answer

7f44f28

albertvillanova merged commit 6402e70 into huggingface:main Jan 13, 2026
2 of 3 checks passed

This was referenced Feb 4, 2026

Remove max_prompt_length from experimental PRM #4963

Merged

Remove max_prompt_length from experimental BCO #4964

Merged

Remove max_prompt_length from experimental CPO #4965

Merged

Remove max_prompt_length from experimental ORPO #4966

Merged

albertvillanova merged 3 commits into huggingface:main from albertvillanova:refactor-kto-coordinated-with-dpo-b