Enable chunked NLL loss with VLM in SFT by qgallouedec · Pull Request #5684 · huggingface/trl

qgallouedec · 2026-04-29T17:50:07Z

Requires #5676

Note

Medium Risk
Expands the chunked_nll training path to VLM and MoE wrappers by patching model forward, which can subtly affect loss/gradient behavior across many model families and transformers versions.

Overview
Enables loss_type='chunked_nll' for vision-language models by extending _patch_chunked_ce_lm_head to handle VLM config (text_config), run the multimodal wrapper (base_model/model) so vision token injection occurs, and compute MoE auxiliary loss using the correct config fields.
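
To make the idea concrete, here is a framework-free sketch of a chunked NLL computation. It is not the TRL implementation, which patches the model's forward pass and operates on tensors. The function names, the -100 ignore index, and the assumption that labels are already shifted to align with the hidden states are illustrative choices. The sketch only shows that projecting hidden states to logits one chunk of positions at a time gives the same mean loss as materializing all logits at once.

```python
import math

IGNORE_INDEX = -100  # assumed convention: positions with this label do not count


def log_softmax(logits):
    # Numerically stable log-softmax over one row of logits.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def project(hidden_row, lm_head):
    # lm_head is laid out as [vocab][hidden]; returns one logit per vocab entry.
    return [sum(h * w for h, w in zip(hidden_row, w_row)) for w_row in lm_head]


def full_nll(hidden, lm_head, labels):
    # Reference path: build the logits for every position before reducing.
    all_logits = [project(h, lm_head) for h in hidden]
    total, count = 0.0, 0
    for logits, y in zip(all_logits, labels):
        if y == IGNORE_INDEX:
            continue
        total -= log_softmax(logits)[y]
        count += 1
    return total / max(count, 1)


def chunked_nll(hidden, lm_head, labels, chunk_size):
    # Chunked path: only `chunk_size` rows of logits exist at any time.
    total, count = 0.0, 0
    for start in range(0, len(hidden), chunk_size):
        chunk_hidden = hidden[start:start + chunk_size]
        chunk_labels = labels[start:start + chunk_size]
        chunk_logits = [project(h, lm_head) for h in chunk_hidden]
        for logits, y in zip(chunk_logits, chunk_labels):
            if y == IGNORE_INDEX:
                continue
            total -= log_softmax(logits)[y]
            count += 1
    # Normalize once at the end so the result is a mean over all valid tokens.
    return total / max(count, 1)
```

Since the logits have one row per position and one column per vocabulary entry, chunking over positions bounds the peak logits size to chunk_size rows instead of the full sequence length.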

Updates SFTTrainer to apply the patched chunked-loss forward for VLMs (removing the prior VLM restriction) and relaxes SFTConfig docs/help text to reflect that chunked_nll is now only incompatible with use_liger_kernel.
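
The relaxed compatibility rule can be summarized as a small check. The function below is hypothetical (its name, signature, and error message are not taken from TRL); it only encodes the rule stated above, that chunked_nll conflicts with use_liger_kernel and no longer with VLMs.

```python
def validate_chunked_nll(loss_type: str, use_liger_kernel: bool, is_vlm: bool) -> None:
    """Hypothetical validation mirroring the rule described in this PR."""
    # `is_vlm` is accepted only to show that it no longer affects the outcome.
    if loss_type == "chunked_nll" and use_liger_kernel:
        raise ValueError(
            "loss_type='chunked_nll' cannot be combined with use_liger_kernel=True."
        )
```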

Adds/expands tests to cover chunked NLL training on multiple VLM families, plus forward/backward equivalence tests for patched chunked CE on VLMs (including a VLM MoE aux-loss case), and tightens the PEFT chunked-NLL test to assert base weights stay frozen while adapter params update.
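
The tightened PEFT assertion follows a common pattern: snapshot the parameters before training, then check that base weights are unchanged and adapter weights moved. The sketch below is a plain-Python illustration of that pattern, not the test added in this PR. The helper name, the dictionary snapshots, and the use of the substring "lora" to identify adapter parameters are assumptions.

```python
def find_update_violations(before, after, adapter_marker="lora"):
    """Return messages for params that break the frozen-base / updated-adapter rule.

    `before` and `after` map parameter names to flat lists of floats
    (hypothetical snapshots taken before and after a training step).
    """
    violations = []
    for name, old_values in before.items():
        new_values = after[name]
        changed = any(a != b for a, b in zip(old_values, new_values))
        if adapter_marker in name:
            # Adapter parameters are trainable and are expected to move.
            if not changed:
                violations.append(f"adapter param did not update: {name}")
        else:
            # Base parameters are frozen and must stay identical.
            if changed:
                violations.append(f"base param changed: {name}")
    return violations
```

A test would then assert that the returned list is empty, which keeps the failure message specific about which parameter broke the rule.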

Reviewed by Cursor Bugbot for commit ec0cad7. Bugbot is set up for automated code reviews on this repo.

HuggingFaceDocBuilderDev · 2026-04-29T17:52:53Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2026-05-05T16:06:04Z

@codex review

chatgpt-codex-connector · 2026-05-05T16:13:03Z

Codex Review: Didn't find any major issues. Keep them coming!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 00cb84b.

AmineDiro · 2026-05-08T14:06:47Z

+            # the model itself. We should investigate this further, but for now we just skip these params.
+            # fmt: off
+            if (
+                model_id == "trl-internal-testing/tiny-Gemma3ForConditionalGeneration" and "model.vision_tower.vision_model.head" in n or


nit: can we refactor this a bit? Any reason these params didn't change?

qgallouedec · 2026-05-08

I'm not sure; it's been an open question for a long time, but it's never been urgent enough for me to set aside time to investigate. My hunch is that the gradients reaching the vision tower are too weak for the weights to be updated, either because of the structure of the tiny model or because of the initialization values.

qgallouedec · 2026-05-08

As for the refactor, I'd recommend keeping things as they are, mostly because it's consistent with TestDPOTrainer.test_train_vlm and TestSFTTrainer.test_train_vlm, and because it explicitly shows which layers are problematic.
I agree it's not pretty, though.

qgallouedec and others added 7 commits April 28, 2026 21:09

Enable chunked NLL loss with PEFT in SFT

0cf0afb

fix for prompt tuning

b3abe56

style

30d1d73

better

69c1b2f

Merge branch 'main' into chunked_nll_peft

77ba54c

raise

68f4f94

Enable chunked NLL loss with VLM in SFT

248dbeb

qgallouedec added 4 commits April 29, 2026 18:38

style

a4cad2e

Add VLM support to chunked cross-entropy loss tests

c81430d

more vlms

35903e2

rm docstring

eeb4a37

cursor Bot reviewed Apr 29, 2026

Comment thread tests/test_sft_trainer.py

qgallouedec added 4 commits April 30, 2026 13:29

Merge branch 'main' into chunked_nll_peft

7006f11

Merge branch 'chunked_nll_peft' into chunked-nll-vlm

c0e822f

Merge branch 'main' into chunked_nll_peft

f256f32

Merge branch 'chunked_nll_peft' into chunked-nll-vlm

5764e15

cursor Bot reviewed May 3, 2026

Comment thread tests/test_sft_trainer.py

qgallouedec and others added 3 commits May 4, 2026 11:24

Merge branch 'main' into chunked_nll_peft

e9cb282

Merge branch 'chunked_nll_peft' into chunked-nll-vlm

5bb1309

fix base model resolution for old transformers versions

412dacf

cursor Bot reviewed May 4, 2026

Comment thread trl/trainer/sft_trainer.py

qgallouedec and others added 4 commits May 4, 2026 20:52

update auxiliary loss calculation to use text_config parameters

a080fa3

allow old transformers versions

93ea587

Merge branch 'main' into chunked_nll_peft

c48bb10

Merge branch 'chunked_nll_peft' into chunked-nll-vlm

cd2cf7d

Base automatically changed from chunked_nll_peft to main May 5, 2026 17:07

Merge branch 'main' into chunked-nll-vlm

54da4eb

cursor Bot reviewed May 6, 2026

Comment thread trl/trainer/sft_trainer.py

Merge branch 'main' into chunked-nll-vlm

00cb84b

cursor Bot reviewed May 6, 2026

Comment thread tests/test_sft_trainer.py Outdated

qgallouedec and others added 6 commits May 6, 2026 19:51

remove duplicate + consistency

de46c96

concistency

7010136

Merge branch 'main' into chunked-nll-vlm

e46a5b0

align loss type doc

4205b3c

a bit better testing

9387319

Merge branch 'main' into chunked-nll-vlm

ec0cad7

AmineDiro reviewed May 8, 2026

AmineDiro approved these changes May 8, 2026

qgallouedec merged commit b05330a into main May 8, 2026
13 checks passed

qgallouedec deleted the chunked-nll-vlm branch May 8, 2026 14:33

qgallouedec merged 30 commits into main from chunked-nll-vlm

3 participants
