Merged

Changes from all commits (43 commits)
- 71f301b: draft (zucchini-nlp, Jan 16, 2026)
- d0f4762: first make it work, then make pretty (zucchini-nlp, Jan 23, 2026)
- 7ab4252: :eyes: (zucchini-nlp, Jan 23, 2026)
- 56d3d0c: unused attributes? (zucchini-nlp, Jan 23, 2026)
- ded5d7b: everythgin we need is in the config! (zucchini-nlp, Jan 23, 2026)
- cb302df: push (zucchini-nlp, Jan 26, 2026)
- a725f67: update (zucchini-nlp, Jan 26, 2026)
- 3eae234: update (zucchini-nlp, Jan 26, 2026)
- 0e7792b: fixes (zucchini-nlp, Jan 26, 2026)
- 2323559: forgot (zucchini-nlp, Jan 26, 2026)
- a69cbc6: push more changes (zucchini-nlp, Jan 27, 2026)
- 5459482: move it out from mxin (zucchini-nlp, Jan 27, 2026)
- 754b616: last test fixes i hope (zucchini-nlp, Jan 27, 2026)
- dc3d676: delete backbone utils from utils (zucchini-nlp, Jan 27, 2026)
- f7306fc: merge main (zucchini-nlp, Jan 27, 2026)
- 49eda76: docs (zucchini-nlp, Jan 27, 2026)
- a4f4b51: more docs updates (zucchini-nlp, Jan 27, 2026)
- 06eb80c: remove backbone from args (zucchini-nlp, Jan 27, 2026)
- a5d987d: docstring (zucchini-nlp, Jan 27, 2026)
- 5063b0b: get rid of circular import and revert utils/backbonutils for BC (zucchini-nlp, Jan 27, 2026)
- cd46e95: fix more tests (zucchini-nlp, Jan 27, 2026)
- 35061b2: modular (zucchini-nlp, Jan 28, 2026)
- 61acbc9: style run (zucchini-nlp, Jan 29, 2026)
- a2f5b3b: Update src/transformers/modeling_backbone_utils.py (zucchini-nlp, Jan 29, 2026)
- e68d5c1: Update src/transformers/modeling_backbone_utils.py (zucchini-nlp, Jan 29, 2026)
- 5e20535: set and align output features from single entrypoint (zucchini-nlp, Jan 29, 2026)
- 8c7864a: move init_backbone to '__init__' (zucchini-nlp, Jan 29, 2026)
- ab2c275: calling init with timm backbone (zucchini-nlp, Jan 29, 2026)
- 46c245c: maybe fix test (zucchini-nlp, Jan 29, 2026)
- e8b063d: Merge branch 'main' into backbone (zucchini-nlp, Jan 30, 2026)
- dea1f0a: fix tests (zucchini-nlp, Jan 30, 2026)
- 4b1142a: fix more and new models as well (zucchini-nlp, Jan 30, 2026)
- e8ef800: fix last test, maybe will move them to a common file (zucchini-nlp, Jan 30, 2026)
- 3770829: fix repo (zucchini-nlp, Jan 30, 2026)
- f020eff: fix repo (zucchini-nlp, Feb 2, 2026)
- f34bb0d: tests unified, cannot be in common because models are very different (zucchini-nlp, Feb 2, 2026)
- 72c223c: add in init (zucchini-nlp, Feb 2, 2026)
- eb88a4c: make comment more detailed for future us (zucchini-nlp, Feb 2, 2026)
- ac68691: Merge branch 'main' into backbone (zucchini-nlp, Feb 3, 2026)
- 1f3d29b: fix modular (zucchini-nlp, Feb 4, 2026)
- ea9ac12: update tests after DETR refactor (zucchini-nlp, Feb 4, 2026)
- 223591c: rename `backbone_utils` (zucchini-nlp, Feb 4, 2026)
- 509ffc4: delete bare `is_timm_available` (zucchini-nlp, Feb 4, 2026)
18 changes: 9 additions & 9 deletions docs/source/en/backbones.md
@@ -36,8 +36,8 @@ This guide describes the backbone class, backbones from the [timm](https://hf.co

There are two backbone classes.

- [`~transformers.utils.BackboneMixin`] allows you to load a backbone and includes functions for extracting the feature maps and indices.
- [`~transformers.utils.BackboneConfigMixin`] allows you to set the feature map and indices of a backbone configuration.
- [`~transformers.utils.BackboneMixin`] allows you to load a backbone and includes functions for extracting the feature maps and indices from the config.
- [`~transformers.utils.BackboneConfigMixin`] allows you to set, align, and verify the feature maps and indices of a backbone configuration.

Refer to the [Backbone](./main_classes/backbones) API documentation to check which models support a backbone.
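The set/align/verify behavior these mixins provide can be sketched in plain Python. This is an illustrative standalone helper, not the actual Transformers API; it assumes a backbone exposes an ordered list of stage names:

```python
def align_output_features_output_indices(out_features, out_indices, stage_names):
    """Sketch of how a backbone config aligns `out_features` (stage names)
    with `out_indices` (stage positions). Illustrative only; the real logic
    lives in the Transformers backbone config mixin."""
    if out_features is None and out_indices is None:
        # Default to the last stage, as Transformers backbones do
        out_features = [stage_names[-1]]
        out_indices = [len(stage_names) - 1]
    elif out_features is None:
        # Derive names from indices (negative indices wrap, e.g. -1 -> last)
        out_features = [stage_names[i] for i in out_indices]
        out_indices = [stage_names.index(name) for name in out_features]
    elif out_indices is None:
        # Derive indices from names
        out_indices = [stage_names.index(name) for name in out_features]
    # Verify the two views agree before returning them
    assert [stage_names[i] for i in out_indices] == list(out_features)
    return out_features, out_indices
```

For example, given stages `["stem", "stage1", "stage2", "stage3"]` and only `out_indices=[1, 2]`, the helper fills in `out_features=["stage1", "stage2"]`.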

@@ -69,12 +69,13 @@ When you know a model supports a backbone, you can load the backbone and neck di

The example below loads a [ResNet](./model_doc/resnet) backbone and neck for use in a [MaskFormer](./model_doc/maskformer) instance segmentation head.

Set `backbone` to a pretrained model and `use_pretrained_backbone=True` to use pretrained weights instead of randomly initialized weights.
Note that initializing from a config creates the model with random weights. If you want to load a pretrained model, use the `from_pretrained` API.

```py
from transformers import AutoConfig, MaskFormerConfig, MaskFormerForInstanceSegmentation

config = MaskFormerConfig(backbone="microsoft/resnet-50", use_pretrained_backbone=True)
backbone_config = AutoConfig.from_pretrained("microsoft/resnet-50")
config = MaskFormerConfig(backbone_config=backbone_config)
model = MaskFormerForInstanceSegmentation(config)
```

@@ -96,14 +97,13 @@ model = MaskFormerForInstanceSegmentation(config)

## timm backbones

[timm](https://hf.co/docs/timm/index) is a collection of vision models for training and inference. Transformers supports timm models as backbones with the [`TimmBackbone`] and [`TimmBackboneConfig`] classes.

Set `use_timm_backbone=True` to load pretrained timm weights, and `use_pretrained_backbone` to use pretrained or randomly initialized weights.
[timm](https://hf.co/docs/timm/index) is a collection of vision models for training and inference. Transformers supports timm models as backbones with the [`TimmBackbone`] and [`TimmBackboneConfig`] classes. Set the necessary backbone checkpoint in `backbone` to create a model with a timm backbone and randomly initialized weights.

```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation, TimmBackboneConfig

config = MaskFormerConfig(backbone="resnet50", use_timm_backbone=True, use_pretrained_backbone=True)
backbone_config = TimmBackboneConfig(backbone="resnet50", out_indices=[-1])
config = MaskFormerConfig(backbone_config=backbone_config)
model = MaskFormerForInstanceSegmentation(config)
```

Review comment from zucchini-nlp (Member, Author): `use_timm_backbone` is not really needed imo. We can infer if the requested checkpoint is from timm or HF by checking if the repo exists on the Hub with a valid config. Deleted it, as well as a redundant arg.
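The inference the review comment describes (timm name vs. Hub repo) can be sketched as follows. The function name and the injected `repo_has_valid_config` callable are hypothetical, not the PR's actual implementation; in practice the callable could be backed by something like `huggingface_hub.file_exists(repo_id, "config.json")`, injected here so the sketch stays offline-testable:

```python
def resolve_backbone_source(checkpoint, repo_has_valid_config):
    """Decide whether `checkpoint` names a Hub repo or a timm model id.

    repo_has_valid_config: callable(repo_id) -> bool that reports whether a
    repo with a valid config exists on the Hub for that id.
    """
    if repo_has_valid_config(checkpoint):
        return "hub"
    # No matching Hub repo with a config: treat the name as a timm model id
    return "timm"
```

Usage with a stub checker: `resolve_backbone_source("resnet50", lambda repo: False)` yields `"timm"`.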

@@ -112,7 +112,7 @@ You could also explicitly call the [`TimmBackboneConfig`] class to load and crea
```py
from transformers import TimmBackboneConfig

backbone_config = TimmBackboneConfig("resnet50", use_pretrained_backbone=True)
backbone_config = TimmBackboneConfig("resnet50")
```

Pass the backbone configuration to the model configuration and instantiate the model head, [`MaskFormerForInstanceSegmentation`], with the backbone.
8 changes: 4 additions & 4 deletions docs/source/en/main_classes/backbones.md
@@ -18,8 +18,8 @@ rendered properly in your Markdown viewer.

A backbone is a model used for feature extraction for higher level computer vision tasks such as object detection and image classification. Transformers provides an [`AutoBackbone`] class for initializing a Transformers backbone from pretrained model weights, and two utility classes:

* [`~utils.BackboneMixin`] enables initializing a backbone from Transformers or [timm](https://hf.co/docs/timm/index) and includes functions for returning the output features and indices.
* [`~utils.BackboneConfigMixin`] sets the output features and indices of the backbone configuration.
* [`~backbone_utils.BackboneMixin`] enables initializing a backbone from Transformers or [timm](https://hf.co/docs/timm/index) and includes functions for returning the output features and indices.
* [`~backbone_utils.BackboneConfigMixin`] sets the output features and indices of the backbone configuration.

[timm](https://hf.co/docs/timm/index) models are loaded with the [`TimmBackbone`] and [`TimmBackboneConfig`] classes.

@@ -45,11 +45,11 @@ Backbones are supported for the following models:

## BackboneMixin

[[autodoc]] utils.BackboneMixin
[[autodoc]] backbone_utils.BackboneMixin

## BackboneConfigMixin

[[autodoc]] utils.BackboneConfigMixin
[[autodoc]] backbone_utils.BackboneConfigMixin

## TimmBackbone

2 changes: 1 addition & 1 deletion docs/source/en/model_doc/dab-detr.md
@@ -110,7 +110,7 @@ Option 2: Instantiate DAB-DETR with randomly initialized weights for Transformer
Option 3: Instantiate DAB-DETR with randomly initialized weights for backbone + Transformer

```py
>>> config = DabDetrConfig(use_pretrained_backbone=False)
>>> config = DabDetrConfig()
>>> model = DabDetrForObjectDetection(config)
```

2 changes: 1 addition & 1 deletion docs/source/en/model_doc/detr.md
@@ -132,7 +132,7 @@ model = DetrForObjectDetection(config)
- Option 3: Instantiate DETR with randomly initialized weights for backbone + Transformer

```python
config = DetrConfig(use_pretrained_backbone=False)
config = DetrConfig()
model = DetrForObjectDetection(config)
```

3 changes: 1 addition & 2 deletions docs/source/en/model_doc/pvt_v2.md
@@ -64,7 +64,7 @@ processed = image_processor(image)
outputs = model(torch.tensor(processed["pixel_values"]))
```

To use the PVTv2 as a backbone for more complex architectures like DeformableDETR, you can use AutoBackbone (this model would need fine-tuning as you're replacing the backbone in the pretrained model):
To use the PVTv2 as a backbone for more complex architectures like DeformableDETR, you can use AutoBackbone (this model would need fine-tuning as you're replacing the backbone in the pretrained model and it is initialized with random weights):

```python
import requests
@@ -77,7 +77,6 @@ model = AutoModelForObjectDetection.from_config(
config=AutoConfig.from_pretrained(
"SenseTime/deformable-detr",
backbone_config=AutoConfig.from_pretrained("OpenGVLab/pvt_v2_b5"),
use_timm_backbone=False
),
)

13 changes: 9 additions & 4 deletions docs/source/en/tasks/training_vision_backbone.md
@@ -38,13 +38,18 @@ Initialize [`DetrConfig`] with the pre-trained DINOv3 ConvNext backbone. Use `nu
```py
from transformers import AutoBackbone, AutoConfig, AutoImageProcessor, DetrConfig, DetrForObjectDetection

config = DetrConfig(backbone="facebook/dinov3-convnext-large-pretrain-lvd1689m",
use_pretrained_backbone=True, use_timm_backbone=False,
# Create a model with randomly initialized weights
backbone_config = AutoConfig.from_pretrained("facebook/dinov3-convnext-large-pretrain-lvd1689m")
backbone = AutoBackbone.from_pretrained("facebook/dinov3-convnext-large-pretrain-lvd1689m")

config = DetrConfig(backbone_config=backbone_config,
num_labels=1, id2label={0: "license_plate"}, label2id={"license_plate": 0})
model = DetrForObjectDetection(config)

for param in model.model.backbone.parameters():
param.requires_grad = False
# Assign pretrained backbone checkpoint and freeze the weights
model.model.backbone = backbone
model.model.freeze_backbone()

image_processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50")
```

Review comment from zucchini-nlp (Member, Author) on lines +49 to +51: backbones are usually set under different module names, so I'm thinking we can add a `self.get_backbone` and `self.set_backbone` for easier manipulation. Maybe next PR, this one is already big.

2 changes: 1 addition & 1 deletion docs/source/ja/model_doc/detr.md
@@ -140,7 +140,7 @@ There are three ways to instantiate a DETR model:
Option 3: Instantiate DETR with randomly initialized weights for the backbone + Transformer.

```py
>>> config = DetrConfig(use_pretrained_backbone=False)
>>> config = DetrConfig()
>>> model = DetrForObjectDetection(config)
```

2 changes: 1 addition & 1 deletion examples/modular-transformers/modeling_test_detr.py
@@ -15,14 +15,14 @@

from ... import initialization as init
from ...activations import ACT2FN
from ...backbone_utils import load_backbone
from ...integrations import use_kernel_forward_from_hub
from ...modeling_attn_mask_utils import _prepare_4d_attention_mask
from ...modeling_layers import GradientCheckpointingLayer
from ...modeling_outputs import BaseModelOutput
from ...modeling_utils import PreTrainedModel
from ...pytorch_utils import meshgrid
from ...utils import ModelOutput, auto_docstring, is_timm_available, requires_backends, torch_compilable_check
from ...utils.backbone_utils import load_backbone
from .configuration_test_detr import TestDetrConfig


5 changes: 3 additions & 2 deletions src/transformers/__init__.py
@@ -439,6 +439,7 @@
_import_structure["modeling_flash_attention_utils"] = []
_import_structure["modeling_layers"] = ["GradientCheckpointingLayer"]
_import_structure["modeling_outputs"] = []
_import_structure["backbone_utils"] = ["BackboneConfigMixin", "BackboneMixin"]
_import_structure["modeling_rope_utils"] = ["ROPE_INIT_FUNCTIONS", "dynamic_rope_update", "RopeParameters"]
_import_structure["modeling_utils"] = ["PreTrainedModel", "AttentionInterface"]
_import_structure["masking_utils"] = ["AttentionMaskInterface"]
@@ -467,6 +468,8 @@
# Direct imports for type-checking
if TYPE_CHECKING:
# All modeling imports
# Models
from .backbone_utils import BackboneConfigMixin, BackboneMixin
from .cache_utils import Cache as Cache
from .cache_utils import DynamicCache as DynamicCache
from .cache_utils import DynamicLayer as DynamicLayer
@@ -609,8 +612,6 @@
from .integrations.executorch import convert_and_export_with_cache as convert_and_export_with_cache
from .masking_utils import AttentionMaskInterface as AttentionMaskInterface
from .model_debugging_utils import model_addition_debugger_context as model_addition_debugger_context

# Models
from .modeling_layers import GradientCheckpointingLayer as GradientCheckpointingLayer
from .modeling_rope_utils import ROPE_INIT_FUNCTIONS as ROPE_INIT_FUNCTIONS
from .modeling_rope_utils import RopeParameters as RopeParameters