Fix mm_token_type_ids silently dropped in DPO VLM training by albertvillanova · Pull Request #5279 · huggingface/trl

albertvillanova · 2026-03-12T11:54:19Z

Fix mm_token_type_ids silently dropped in DPO VLM training.

Note

Medium Risk
Touches the VLM batching/forward kwargs path for DPO, so mistakes could break multimodal training or cause subtle tensor-shape/runtime errors, but the change is narrow and covered by a targeted regression test.

Overview
Fixes DPO vision-language training for processors that return mm_token_type_ids (e.g. Qwen2.5-VL), ensuring the field is not dropped and its tensor shape stays aligned with input_ids.

DataCollatorForVisionPreference now pads/flushes mm_token_type_ids alongside other batch tensors and includes it in the batch output, and DPOTrainer forwards mm_token_type_ids into both reference-logprob computation and the main loss forward pass. Adds a regression test validating mm_token_type_ids presence and shape.
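To make the alignment requirement concrete, here is a minimal sketch of padding `mm_token_type_ids` together with the other per-token fields. It is not TRL's collator code: `pad_batch` is a hypothetical helper, plain Python lists stand in for tensors, and right-padding with 0 is an assumption made for illustration only.

```python
# Hypothetical helper for illustration; this is NOT DataCollatorForVisionPreference.
def pad_batch(examples, pad_token_id=0):
    """Right-pad every per-token field to the longest input_ids in the batch."""
    max_len = max(len(ex["input_ids"]) for ex in examples)
    # Assumed pad values: the pad token for ids, 0 for masks and type ids.
    pad_values = {
        "input_ids": pad_token_id,
        "attention_mask": 0,
        "mm_token_type_ids": 0,
    }
    batch = {}
    for key, pad_value in pad_values.items():
        if key not in examples[0]:
            continue  # e.g. a processor that does not return mm_token_type_ids
        batch[key] = [
            ex[key] + [pad_value] * (max_len - len(ex[key])) for ex in examples
        ]
    return batch


examples = [
    {"input_ids": [5, 9, 9, 7], "attention_mask": [1, 1, 1, 1], "mm_token_type_ids": [0, 1, 1, 0]},
    {"input_ids": [5, 7], "attention_mask": [1, 1], "mm_token_type_ids": [0, 0]},
]
batch = pad_batch(examples)

# Every row of mm_token_type_ids has the same length as the matching input_ids row.
for ids_row, mm_row in zip(batch["input_ids"], batch["mm_token_type_ids"]):
    assert len(ids_row) == len(mm_row)
```

If the field were left out of `pad_values`, it would simply be absent from the batch, which is the "silently dropped" behaviour this PR addresses.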

Written by Cursor Bugbot for commit 4842af6. This will update automatically on new commits.

HuggingFaceDocBuilderDev · 2026-03-12T11:57:25Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-03-12T12:13:01Z

+            "image_sizes",
+            "token_type_ids",
+            "mm_token_type_ids",
+        ):


mm_token_type_ids not truncated when max_length is set

Medium Severity

When max_length is set, _truncate_inputs truncates input_ids, attention_mask, and completion_mask but not mm_token_type_ids. Both compute_ref_log_probs and _compute_loss then pass the truncated input_ids alongside the original full-length mm_token_type_ids from inputs to the model, causing a shape mismatch that will crash during the forward pass.

Additional Locations (2)

trl/trainer/dpo_trainer.py#L1013-L1017

trl/trainer/dpo_trainer.py#L971-L993
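The mismatch described in this review comment can be sketched as follows. This is an illustration under assumptions, not the actual `_truncate_inputs` implementation: `truncate_sequence_fields` and `SEQUENCE_KEYS` are hypothetical names, and nested lists stand in for batched tensors. The point is only that every per-token field has to be cut to the same length as `input_ids`.

```python
# Hypothetical names for illustration; NOT the real DPOTrainer._truncate_inputs.
SEQUENCE_KEYS = ("input_ids", "attention_mask", "completion_mask", "mm_token_type_ids")


def truncate_sequence_fields(inputs, max_length):
    """Truncate all per-token fields to max_length; leave other entries untouched."""
    out = dict(inputs)
    if max_length is None:
        return out  # no truncation requested
    for key in SEQUENCE_KEYS:
        if key in out:
            out[key] = [row[:max_length] for row in out[key]]
    return out


inputs = {
    "input_ids": [[1, 2, 3, 4, 5]],
    "attention_mask": [[1, 1, 1, 1, 1]],
    "completion_mask": [[0, 0, 1, 1, 1]],
    "mm_token_type_ids": [[0, 1, 1, 0, 0]],
    "pixel_values": "not a per-token field",  # placeholder value
}
truncated = truncate_sequence_fields(inputs, max_length=3)

# After truncation, mm_token_type_ids still lines up with input_ids.
assert len(truncated["mm_token_type_ids"][0]) == len(truncated["input_ids"][0])
```

Omitting `mm_token_type_ids` from `SEQUENCE_KEYS` reproduces the reported situation: `input_ids` would have length 3 while `mm_token_type_ids` keeps length 5.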

qgallouedec · Mar 12, 2026

Yep, this sounds like legitimate feedback.

albertvillanova · Mar 12, 2026

@qgallouedec,

First, this is a pre-existing condition, not introduced by this PR: the same gap for token_type_ids has never caused a filed issue.

Second, as already explicitly documented in the tests, for VLMs, truncating can remove image tokens, leading to errors. Anyone doing VLM DPO uses max_length=None.

Therefore, in my opinion, if this is an issue we want to handle, it should be done in a separate PR. I am opening one! 🚀

qgallouedec · Mar 12, 2026

"The same gap for token_type_ids has never caused a filed issue"

Interesting, I'm curious to know why.

"Anyone doing VLM DPO uses max_length=None"

Yes, that's recommended, but a careful user might use max_length!=0 (technically it's supported, if you ensure that no image tokens are truncated).

OK, a separate PR sounds good!
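The caveat raised in this thread (a `max_length` is usable only if no image token is truncated) could be checked up front along these lines. This is a sketch under a loud assumption: that non-zero entries in `mm_token_type_ids` mark multimodal (e.g. image) positions. `truncation_drops_image_tokens` is a hypothetical helper, not part of TRL.

```python
# Hypothetical check for illustration; not part of TRL.
# ASSUMPTION: non-zero mm_token_type_ids entries mark multimodal (image) tokens.
def truncation_drops_image_tokens(mm_token_type_ids, max_length):
    """Return True if any token at or beyond max_length is marked as multimodal."""
    if max_length is None:
        return False  # nothing is truncated
    return any(t != 0 for t in mm_token_type_ids[max_length:])


# Image tokens sit at positions 1 and 2; text follows.
mm_ids = [0, 1, 1, 0, 0, 0]

safe = truncation_drops_image_tokens(mm_ids, max_length=4)    # cut falls after the image
unsafe = truncation_drops_image_tokens(mm_ids, max_length=2)  # cut falls inside the image
```

A check like this only flags the risky case; it does not decide how to recover from it.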

Forward mm_token_type_ids to _compute_loss and compute_ref_log_probs

c7e9f55

Handle mm_token_type_ids in DPO DataCollatorForVisionPreference

549a2ca

cursor Bot reviewed Mar 12, 2026

Add regression test

4842af6

cursor Bot reviewed Mar 12, 2026

qgallouedec approved these changes Mar 12, 2026

albertvillanova merged commit e9c1821 into huggingface:main Mar 13, 2026
12 checks passed

This was referenced Mar 13, 2026

DPOTrainer crashes when max_length is set with VLMs: IndexError #5283

Closed

Support max_length in DPO VLM training #5284

Merged

qgallouedec pushed a commit that referenced this pull request Mar 20, 2026

Fix mm_token_type_ids silently dropped in DPO VLM training (#5279)

f48bf87

songhappy pushed a commit to songhappy/trl that referenced this pull request Apr 20, 2026

Fix mm_token_type_ids silently dropped in DPO VLM training (huggingfa…

37e6a93

…ce#5279)

albertvillanova merged 3 commits into huggingface:main from albertvillanova:fix-5277

3 participants