[WIP] Switch gradient checkpointing default to use_reentrant=False (PyTorch recommended)
#43203
base: main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
@unittest.skip
def test_training_gradient_checkpointing(self):
    pass

@unittest.skip(
    reason="This architecture seem to not compute gradients properly when using GC, check: https://github.com/huggingface/transformers/pull/27124"
)
def test_training_gradient_checkpointing_use_reentrant(self):
    pass

@unittest.skip(
    reason="This architecture seem to not compute gradients properly when using GC, check: https://github.com/huggingface/transformers/pull/27124"
)
def test_training_gradient_checkpointing_use_reentrant_false(self):
    pass
```
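For context, the tests being skipped here roughly assert that every trainable parameter receives a gradient when gradient checkpointing is enabled. A minimal sketch of that idea, using a hypothetical toy module rather than a real transformers architecture:

```python
import torch
from torch.utils.checkpoint import checkpoint

class ToyModel(torch.nn.Module):
    """Stand-in for a real architecture; names and shapes are made up."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(16, 16)

    def forward(self, x):
        # Recompute this layer's activations during backward instead of
        # storing them, mirroring what gradient checkpointing does.
        return checkpoint(self.layer, x, use_reentrant=False)

model = ToyModel()
loss = model(torch.randn(2, 16)).sum()
loss.backward()

# The real tests assert every trainable parameter received a gradient.
for name, param in model.named_parameters():
    assert param.grad is not None, f"{name} has no gradient"
```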
It seems a large number of these ignored tests actually pass. I checked them all.
[For maintainers] Suggested jobs to run (before merge): run-slow: align, altclip, aria, autoformer, aya_vision, beit, big_bird, blip, blip_2, canine, chinese_clip, clap, clip, clipseg, colpali, deit
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43203&sha=435655
Summary
This PR changes our gradient checkpointing default from `use_reentrant=True` to `use_reentrant=False`.

Two years ago we explicitly set `use_reentrant=True` in #28538 because PyTorch started warning that the default would change in the future, recommending that users choose a value explicitly. At the time, defaulting to `True` was the safest choice to preserve the behavior of earlier releases.
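For context, this is how users opt in to the non-reentrant implementation explicitly today; a minimal sketch (the checkpoint name `gpt2` is only an example):

```python
from transformers import AutoModelForCausalLM, TrainingArguments

# Any model that supports gradient checkpointing works the same way.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Opt in to the non-reentrant implementation explicitly:
model.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)

# The same knob is exposed through TrainingArguments:
args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```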
PyTorch now recommends the non-reentrant variant (`use_reentrant=False`, see https://docs.pytorch.org/docs/stable/checkpoint.html) and is moving toward making it the default. Aligning with this upstream recommendation gives us several benefits.

Note: training and checkpointing behavior remains functionally equivalent in typical use cases; the main difference is how activations are recomputed during backward (non-reentrant uses a safer mechanism).
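To illustrate what changes at the PyTorch level, here is a minimal sketch of the non-reentrant variant (the toy block and shapes are made up): activations inside the block are discarded after the forward pass and recomputed on demand during backward.

```python
import torch
from torch.utils.checkpoint import checkpoint

# Toy block and shapes, for illustration only.
block = torch.nn.Sequential(
    torch.nn.Linear(128, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 128),
)
x = torch.randn(4, 128, requires_grad=True)

# Non-reentrant variant (the new default proposed here): recomputation is
# driven by saved-tensor hooks rather than a nested backward call, which is
# what makes it compatible with torch.autograd.grad and with inputs that do
# not require gradients.
out = checkpoint(block, x, use_reentrant=False)
out.sum().backward()
assert x.grad is not None
```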