@NielsRogge NielsRogge commented Nov 14, 2022

What does this PR do?

This PR adds support for more backbones than just Swin in the MaskFormer framework. The MaskFormer authors released checkpoints that leverage either ResNet or Swin as backbone; however, we currently only support Swin. To support various backbones, this PR introduces the AutoBackbone API.

It introduces the following improvements:

  • adds AutoBackbone and ResNetBackbone
  • moves MaskFormerSwin to its own modeling files and adds MaskFormerSwinBackbone
  • makes MaskFormer use the AutoBackbone API so it can leverage any backbone, including ResNet

AutoBackbone API

The API is implemented as follows. For a given model, one should implement an additional class, xxxBackbone, for instance ResNetBackbone, in addition to the regular classes like xxxModel and xxxForImageClassification. The backbone class turns the xxxModel into a generic backbone to be consumed by a framework, like DETR or MaskFormer.

The API is inspired by the one used in Detectron2. This means that any backbone should implement a forward and an output_shape method:

  • the forward method returns the hidden states for each of the desired stages
  • the output_shape method returns the channel dimension + strides for each of the desired stages.

There are additional methods like size_divisibility and padding_constraints which could be added in the future; for now, they don't seem necessary.
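To make the contract concrete, here is a minimal, framework-free sketch of the backbone interface described above. The class, its stage names, and its channel/stride numbers are illustrative stand-ins (loosely modeled on ResNet-50's nominal cumulative strides), not the actual transformers implementation:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ShapeSpec:
    # Detectron2-style spec: channel count and total stride per stage.
    channels: Optional[int] = None
    height: Optional[int] = None
    width: Optional[int] = None
    stride: Optional[int] = None


class ToyBackbone:
    """Toy stand-in for an xxxBackbone class (names and numbers are
    illustrative, not the real transformers code)."""

    # (channels, cumulative stride) per stage, loosely modeled on ResNet-50.
    STAGES = {
        "stem": (64, 4), "stage1": (256, 4), "stage2": (512, 8),
        "stage3": (1024, 16), "stage4": (2048, 32),
    }

    def __init__(self, out_features):
        self.out_features = out_features

    def forward(self, height: int, width: int) -> Dict[str, Tuple[int, int, int]]:
        # Returns the (channels, h, w) of each requested feature map instead
        # of real tensors, to keep the sketch framework-free.
        return {
            name: (self.STAGES[name][0],
                   height // self.STAGES[name][1],
                   width // self.STAGES[name][1])
            for name in self.out_features
        }

    def output_shape(self) -> Dict[str, ShapeSpec]:
        # Static description of each stage: channels + stride, known
        # without running an image through the model.
        return {
            name: ShapeSpec(channels=self.STAGES[name][0],
                            stride=self.STAGES[name][1])
            for name in self.out_features
        }


backbone = ToyBackbone(out_features=["stage2", "stage3", "stage4"])
print(backbone.forward(224, 224))
print(backbone.output_shape())
```

The key design point is that output_shape() is answerable at construction time, which is what lets a consuming framework size its layers before ever seeing an image.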

Usage

An example can be found below. Basically, the user can specify which layers/stages to get the feature maps from.

```python
from transformers import ResNetConfig, ResNetBackbone
import torch

config = ResNetConfig(out_features=["stem", "stage1", "stage2", "stage3", "stage4"])
model = ResNetBackbone(config)

pixel_values = torch.randn(1, 3, 224, 224)

outputs = model(pixel_values)
for key, value in outputs.items():
    print(key, value.shape)
```

which prints:

```
stem torch.Size([1, 64, 56, 56])
stage1 torch.Size([1, 256, 56, 56])
stage2 torch.Size([1, 512, 28, 28])
stage3 torch.Size([1, 1024, 14, 14])
stage4 torch.Size([1, 2048, 7, 7])
```

One can check the output specification as follows:

```python
print(model.output_shape())
```

which prints:

```python
{'stem': ShapeSpec(channels=64, height=None, width=None, stride=2),
 'stage1': ShapeSpec(channels=256, height=None, width=None, stride=4),
 'stage2': ShapeSpec(channels=512, height=None, width=None, stride=4),
 'stage3': ShapeSpec(channels=1024, height=None, width=None, stride=4),
 'stage4': ShapeSpec(channels=2048, height=None, width=None, stride=4)}
```

This is useful for frameworks, as they often need to know the channel dimensions and strides at initialization time.
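As an illustration of that initialization-time use, here is a hedged sketch of a framework sizing per-stage 1x1 projections from the backbone's reported channel dimensions. The channel counts are taken from the ResNet-50 printout above; the projection is represented only by its (in_channels, out_channels) pair rather than a real layer:

```python
# Hypothetical consumer: a decoder head reading the backbone's shape spec
# once at __init__ time. Channel counts match the ResNet-50 example above.
backbone_channels = {"stage1": 256, "stage2": 512, "stage3": 1024, "stage4": 2048}
hidden_dim = 256  # illustrative decoder width

# One 1x1 projection per stage, described by (in_channels, out_channels).
projections = {name: (in_ch, hidden_dim) for name, in_ch in backbone_channels.items()}
print(projections)
```

In a real framework each pair would become e.g. an nn.Conv2d(in_ch, hidden_dim, kernel_size=1), but the point is the same: the layer sizes come from output_shape(), not from a forward pass.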

The Backbone API has a corresponding Auto class, which means that the following also works:

```python
from transformers import ResNetConfig, AutoBackbone

config = ResNetConfig(out_features=["stem", "stage1", "stage2", "stage3", "stage4"])
model = AutoBackbone.from_config(config)
```
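The dispatch behind such an Auto class can be sketched in plain Python: a mapping from a config's model_type to the corresponding backbone class. The mapping contents and the function name here are illustrative, not the real modeling_auto.py code:

```python
from types import SimpleNamespace

# Illustrative registry: model_type -> backbone class name (stand-in for the
# real auto mapping in transformers).
BACKBONE_MAPPING = {
    "resnet": "ResNetBackbone",
    "maskformer-swin": "MaskFormerSwinBackbone",
}


def backbone_class_for(config):
    # Auto-class-style dispatch: look up the backbone by the config's
    # model_type and fail loudly for unregistered types.
    try:
        return BACKBONE_MAPPING[config.model_type]
    except KeyError:
        raise ValueError(f"No backbone registered for model type {config.model_type!r}")


config = SimpleNamespace(model_type="resnet")  # stand-in for ResNetConfig
print(backbone_class_for(config))
```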

The AutoBackbone class also supports loading pre-trained weights, like so:

```python
from transformers import AutoBackbone

backbone = AutoBackbone.from_pretrained("microsoft/resnet-50")
```

This works because the backbone uses the same base_model_prefix as the other head models, so pre-trained weights map onto it cleanly.
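To illustrate why a shared base_model_prefix matters for loading pre-trained weights, here is a toy sketch of prefix-based checkpoint filtering; the key names and values are made up for the example:

```python
# Made-up checkpoint keys, mimicking a state dict saved by a head model
# whose base model lives under the "resnet" prefix.
checkpoint = {
    "resnet.embedder.weight": 0,
    "resnet.stages.0.weight": 0,
    "classifier.weight": 0,  # head-only parameter, not needed by a backbone
}

prefix = "resnet"

# Any class sharing the prefix can pick up the base-model weights and
# ignore the head-specific ones.
loadable = {k: v for k, v in checkpoint.items() if k.split(".", 1)[0] == prefix}
print(sorted(loadable))
```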

To do's

  • Add tests for backbones. Backbone classes should not be tested with all tests defined in test_modeling_common.py, instead they should have separate tests. Here I'd like to discuss the best way to add these tests.
  • make fixup is currently complaining about the following:
```
Exception: There were 2 failures:
MaskFormerSwinBackbone is defined in
transformers.models.maskformer.modeling_maskformer_swin but is not present in
any of the auto mapping. If that is intended behavior, add its name to
`IGNORE_NON_AUTO_CONFIGURED` in the file `utils/check_repo.py`.
ResNetBackbone is defined in transformers.models.resnet.modeling_resnet but is
not present in any of the auto mapping. If that is intended behavior, add its
name to `IGNORE_NON_AUTO_CONFIGURED` in the file `utils/check_repo.py`
```

=> However, I added both MaskFormerSwinBackbone and ResNetBackbone to modeling_auto.py, so I'm not sure why this fails. cc @sgugger

MaskFormer specifics

MaskFormer supports both ResNet and Swin as backbones. It works with native ResNets, but it does not use the native Swin implementation as backbone, which is why the library has a separate MaskFormerSwinModel class, and why this PR adds a MaskFormerSwinBackbone class.

Happy to discuss the design!

1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Machine translation models trained using [OPUS](http://opus.nlpl.eu/) data by Jörg Tiedemann. The [Marian Framework](https://marian-nmt.github.io/) is being developed by the Microsoft Translator Team.
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (from Microsoft Research Asia) released with the paper [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding](https://arxiv.org/abs/2110.08518) by Junlong Li, Yiheng Xu, Lei Cui, Furu Wei.
1. **[MaskFormer](https://huggingface.co/docs/transformers/model_doc/maskformer)** (from Meta and UIUC) released with the paper [Per-Pixel Classification is Not All You Need for Semantic Segmentation](https://arxiv.org/abs/2107.06278) by Bowen Cheng, Alexander G. Schwing, Alexander Kirillov.
1. **[MaskformerSwin](https://huggingface.co/docs/transformers/main/model_doc/maskformer-swin)** (from <FILL INSTITUTION>) released with the paper [<FILL PAPER TITLE>](<FILL ARKIV LINK>) by <FILL AUTHORS>.
@NielsRogge NielsRogge Nov 14, 2022


cc @sgugger: this entry should not be added; MaskFormerSwin is an equivalent case to DonutSwin. I tried adding ("MaskFormerSwin": "MaskFormer") to utils/check_copies.py, but no luck.

Do you know how to remove this?

A Collaborator commented:

Since I'm reading MaskformerSwin above (and not MaskFormerSwin) I'm guessing fixing the typo should be enough.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@sgugger sgugger left a comment


Thanks for working on this, but this PR is not reviewable as it is, because it tries to add too many things together. Moving the code for MaskFormer Swin outside of the MaskFormer file is a PR of its own. Adding support for the ResNet backbone is a PR of its own. Finally, adding an AutoBackbone API is also a PR of its own.

We are extremely far from being able to have an abstract class for backbones since we are just starting to use them, so let's not add one (also, Transformers doesn't really do abstract classes anyway). We should just focus on having backbone models in a first step, with one forward and the minimal number of methods needed to make the rest work.


```
@@ -0,0 +1,71 @@
# Copyright (c) Facebook, Inc. and its affiliates.
```
A Collaborator commented:

This is not our copyright. I am not in favor of using a copy-pasted file for a base utility.

```python
stride: Optional[int] = None


class Backbone(nn.Module):
```
A Collaborator commented:

The Transformers library does not use abstract classes. Especially not for a new API we haven't quite figured out yet. So let's just do backbone models for now.

```python
# Model for Instance Segmentation mapping
("maskformer-swin", "MaskFormerSwinBackbone"),
("resnet", "ResNetBackbone"),
("swin", "MaskFormerSwinBackbone"),  # for backward compatibility
```
A Collaborator commented:

This does not make any sense since we could have a SwinBackbone one of these days.

NielsRogge commented Dec 7, 2022

Closing this PR as it has been added in smaller separate PRs.

@NielsRogge NielsRogge closed this Dec 7, 2022
@sgugger (Collaborator) commented Dec 7, 2022

Thanks again for splitting it, it was really better this way!
