Remove post-collation truncation from DPO by albertvillanova · Pull Request #5350 · huggingface/trl

albertvillanova · 2026-03-23T10:17:18Z

Remove post-collation truncation from DPO.

This PR removes the internal truncation logic from the DPO trainer and requires custom data collators to handle truncation themselves. This simplifies the trainer code and clarifies the contract for custom collators. The PR also updates the documentation to reflect these changes.

Follow-up to:

Fix DPOTrainer collators to truncate sequences before padding #5305 (comment)

Motivation

Both built-in DPO collators (DataCollatorForPreference and DataCollatorForVisionPreference) already truncate sequences internally before padding. The only reason _truncate_inputs still existed in the trainer was as a silent safety net for custom collators, which is arguably worse than no safety net, because it hid the fact that the collator wasn't doing its job.

This PR makes the contract explicit and removes the silent fix-up.

Changes

Data Collation and Truncation Handling:

Removed the _truncate_inputs method and all related calls, shifting responsibility for truncation entirely to the data collator. Now, if a custom data collator is provided, it must handle truncation before padding.
Updated the documentation for the data_collator argument to clearly state that custom collators must truncate sequences before padding, as the trainer will not apply truncation after collation.
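
To illustrate the "truncate before padding" contract described above, here is a minimal sketch in plain Python. It is not the code of the built-in collators: the function name `truncate_then_pad`, its parameters, and the use of nested lists instead of tensors are assumptions made for illustration only. A real custom collator would typically return tensors and whatever additional keys the trainer expects.

```python
from typing import Optional


def truncate_then_pad(
    sequences: list[list[int]],
    pad_token_id: int,
    max_length: Optional[int] = None,
    keep_end: bool = False,
) -> dict[str, list[list[int]]]:
    """Hypothetical helper: truncate each sequence first, then right-pad the batch."""
    if max_length is not None:
        if keep_end:
            # Keep the last max_length tokens of each sequence.
            sequences = [seq[max(len(seq) - max_length, 0):] for seq in sequences]
        else:
            # Keep the first max_length tokens of each sequence.
            sequences = [seq[:max_length] for seq in sequences]

    # Padding is computed from the already truncated lengths, so no
    # sequence in the batch can exceed max_length.
    longest = max((len(seq) for seq in sequences), default=0)
    input_ids = [seq + [pad_token_id] * (longest - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = truncate_then_pad([[1, 2, 3, 4, 5], [6, 7]], pad_token_id=0, max_length=3)
```

Because truncation runs on the unpadded sequences, the padded width is bounded by `max_length`, and nothing downstream has to trim or realign the batch.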

Code Simplification:

Removed the unused flush_right import and related logic, further simplifying the codebase.

Model Input Handling:

Updated how model input arguments are constructed in compute_ref_log_probs and _compute_loss, now directly including all relevant keys from the input dictionary without truncation logic.
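
The pass-through pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the trainer's actual implementation: `build_model_kwargs` and `OPTIONAL_KEYS` are hypothetical names, and the exact set of optional keys forwarded by the trainer is not shown in this PR description.

```python
from typing import Any

# Hypothetical set of optional fields; the real trainer may forward others.
OPTIONAL_KEYS = ("token_type_ids", "pixel_values")


def build_model_kwargs(inputs: dict[str, Any]) -> dict[str, Any]:
    """Assemble model kwargs from a collated batch without any length fix-ups."""
    model_kwargs = {
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"],
    }
    # Forward optional fields exactly as the collator produced them.
    for key in OPTIONAL_KEYS:
        if key in inputs:
            model_kwargs[key] = inputs[key]
    return model_kwargs


example_inputs = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 1]],
    "completion_mask": [[0, 1, 1]],  # used for the loss, not passed to the model
}
model_kwargs = build_model_kwargs(example_inputs)
```

Since nothing is truncated at this stage, any overlong or misaligned field coming out of a custom collator reaches the model unchanged.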

Note

Medium Risk
Removes a silent safety net that truncated/padded batches inside DPOTrainer, so custom data_collators that relied on that behavior may now produce overlong or misaligned tensors and fail at runtime.

Overview
Removes post-collation truncation from DPOTrainer. The internal _truncate_inputs path (including keep_end flush/realign logic) is deleted, and both compute_ref_log_probs and loss computation now consume collator outputs as-is.

Updates the trainer/collator contract. The data_collator docstring now explicitly requires custom collators to truncate sequences before padding, and model kwargs assembly is simplified to pass through optional fields (e.g., token_type_ids, multimodal image inputs) directly without length fix-ups.

Written by Cursor Bugbot for commit 2983422.

HuggingFaceDocBuilderDev · 2026-03-23T10:20:04Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

qgallouedec · 2026-03-24T01:05:14Z

            Function to use to form a batch from a list of elements of the processed `train_dataset` or `eval_dataset`.
            Will default to [`~trainer.dpo_trainer.DataCollatorForPreference`] if the model is a language model and
-            [`~trainer.dpo_trainer.DataCollatorForVisionPreference`] if the model is a vision-language model.
+            [`~trainer.dpo_trainer.DataCollatorForVisionPreference`] if the model is a vision-language model. Custom


Can you please add the same comment in SFTTrainer.data_collator?

albertvillanova · 2026-03-24

I am planning to add the same comment to SFT in my subsequent PR, when I remove post-collation truncation from SFT as well.

qgallouedec · 2026-03-24T01:07:44Z

thanks! I think we might be able to remove flush_right from this repo in a next PR

qgallouedec · 2026-03-24T01:08:11Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2983422516

albertvillanova added 4 commits March 23, 2026 11:01

Do not call _truncate_inputs in DPOTrainer

4f14a3c

Remove DPOTrainer._truncate_inputs

c075032

Raise ValueError for custom collator and max_length

5dade1b

Update data_collator docstring to specify it must truncate

bbdc3ec

cursor Bot reviewed Mar 23, 2026

Comment thread trl/trainer/dpo_trainer.py Outdated

Revert "Raise ValueError for custom collator and max_length"

36c0651

This reverts commit 5dade1b.

cursor Bot reviewed Mar 23, 2026

Comment thread trl/trainer/dpo_trainer.py

Merge remote-tracking branch 'upstream/main' into fu-5305

2983422

qgallouedec reviewed Mar 24, 2026

qgallouedec approved these changes Mar 24, 2026

chatgpt-codex-connector Bot reviewed Mar 24, 2026

Comment thread trl/trainer/dpo_trainer.py

albertvillanova merged commit ec1802e into huggingface:main Mar 24, 2026
12 checks passed

This was referenced Mar 24, 2026

Remove unused flush_right #5358

Merged

Remove post-collation truncation from SFT #5359

Merged

albertvillanova merged 6 commits into huggingface:main from albertvillanova:fu-5305

3 participants
