[Refactoring] Unified parameters initialization #780

Merged (25 commits) on Feb 7, 2021

Conversation

@MeowZheng (Collaborator) commented Jan 7, 2021

I mainly revised 3 files:

  1. In weight_init.py, add Constant, Kaiming, Normal, Pretrained, Uniform, and Xavier classes, and register them in the INITIALIZERS registry; add an initialize function that initializes parameters according to init_cfg.
  2. In checkpoint.py, add the _load_checkpoint_with_prefix function (see the usage sketch after this list).
  3. Add base_module.py, which defines BaseModule and implements only the init_weights method for parameter initialization.
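For illustration, here is a hedged sketch of how _load_checkpoint_with_prefix could be used to load only a sub-module's weights from a full checkpoint; the prefix value, checkpoint path, and the already-built model are hypothetical:

from mmcv.runner.checkpoint import _load_checkpoint_with_prefix

# Load only the parameters whose keys start with 'backbone.' from a
# full checkpoint; the prefix is stripped from the returned keys.
state_dict = _load_checkpoint_with_prefix(
    prefix='backbone',
    filename='checkpoints/full_model.pth',  # hypothetical path
    map_location='cpu')
model.backbone.load_state_dict(state_dict)  # model built beforehand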

Design

Model initialization in OpenMMLab combines init_cfg, BaseModule.init_weights, the initialize function, and the INITIALIZERS registry. Users can initialize their models in the following two steps:

  1. Define init_cfg for a model or its components in model_cfg; the init_cfg of child components has higher priority and will override the init_cfg of parent modules.
  2. Build the model as usual, then call the model.init_weights() method explicitly; the model parameters will be initialized according to the configuration.

The high-level workflow of initialization in OpenMMLab is:
model_cfg(init_cfg) -> build_from_cfg -> model -> init_weights() -> initialize(self, self.init_cfg) -> children's init_weights()

APIs

init_cfg

init_cfg is a dict or list[dict] with the following keys (see the example after this list):

  • type - str, the name of an initializer in INITIALIZERS, followed by the arguments of that initializer.
  • layer - str or list[str], the names of basic layers in PyTorch or MMCV with learnable parameters that will be initialized, e.g. 'Conv2d', 'DeformConv2d'.
  • override - dict or list[dict], for sub-modules that do not inherit from BaseModule and whose initialization differs from that of the layers listed in layer. The initializer given in type applies to all layers listed in layer, so a sub-module that is not derived from BaseModule but can be initialized the same way as those layers does not need override. Each override entry contains:
    • type, followed by the arguments of the initializer;
    • name, indicating the sub-module to be initialized.
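For instance, here is a sketch of an init_cfg combining these keys; the layer names are real PyTorch layers, while the sub-module name 'fc_cls' is a hypothetical example:

init_cfg = dict(
    type='Normal',               # initializer name registered in INITIALIZERS
    layer=['Conv2d', 'Linear'],  # layer types whose parameters to initialize
    std=0.01,                    # argument forwarded to the Normal initializer
    override=dict(               # special-case a named sub-module
        type='Normal', name='fc_cls', std=0.001))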

BaseModule

BaseModule is the base module for all modules in OpenMMLab. The init_weights method of BaseModule initializes the module's own parameters using the initialize(module, init_cfg) function from mmcv, and then calls the init_weights() method of each sub-component.
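Conceptually, BaseModule behaves like the following minimal sketch (a simplification for illustration, not the actual mmcv implementation, which additionally tracks whether a module has already been initialized):

import torch.nn as nn
from mmcv.cnn import initialize

class BaseModule(nn.Module):
    def __init__(self, init_cfg=None):
        super().__init__()
        self.init_cfg = init_cfg

    def init_weights(self):
        # Initialize this module's own parameters from its init_cfg ...
        if self.init_cfg is not None:
            initialize(self, self.init_cfg)
        # ... then recurse into sub-components that define init_weights.
        for m in self.children():
            if hasattr(m, 'init_weights'):
                m.init_weights()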

initialize(module, init_cfg)

  • module - the module to be initialized.
  • init_cfg - initialization configuration dict.
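A standalone usage example; the model and the initializer arguments are chosen arbitrarily for illustration:

import torch.nn as nn
from mmcv.cnn import initialize

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.Conv2d(8, 2, 1))
# Set all Conv2d weights to 0.5 and all biases to 0.
initialize(model, dict(type='Constant', layer='Conv2d', val=0.5, bias=0))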

INITIALIZERS registry

OpenMMLab has implemented six initializers, namely Constant, Xavier, Normal, Uniform, Kaiming, and Pretrained, and registers them in INITIALIZERS.

Taking advantage of the "builder & registry" mechanism of OpenMMLab, INITIALIZERS can be easily extended by implementing new initializer classes and registering them in INITIALIZERS.
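For example, a custom initializer could be registered as follows. The ZeroInit class is hypothetical; the registration pattern is the standard OpenMMLab registry usage:

from mmcv.cnn import INITIALIZERS

@INITIALIZERS.register_module(name='Zero')
class ZeroInit:
    """Hypothetical initializer that zeroes all parameters of a module."""

    def __call__(self, module):
        for param in module.parameters():
            param.data.zero_()

# It can then be referenced in init_cfg as dict(type='Zero').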

Usages

To initialize OpenMMLab models, users just need two steps: 1. define init_cfg; 2. build the model and call model.init_weights().

define init_cfg for a model

FooModel, FooConv1d, FooConv2d, and FooLinear are all derived from BaseModule. If we would like FooModel to initialize all linear layers with weight 1 and bias 2, all conv1d layers with weight 3 and bias 4, and all conv2d layers with weight 5 and bias 6, we can define model_cfg and init_cfg as follows:

model_cfg = dict(
    type="FooModel",
    init_cfg=[
        dict(type='Constant', val=1, bias=2, layer='Linear'),
        dict(type='Constant', val=3, bias=4, layer='Conv1d'),
        dict(type='Constant', val=5, bias=6, layer='Conv2d')
    ],
    component1=dict(type='FooConv1d'),
    component2=dict(type='FooConv2d'),
    component3=dict(type='FooLinear'),
    component4=dict(
        type='FooLinearConv1d',
        linear=dict(type='FooLinear'),
        conv1d=dict(type='FooConv1d')))

After this, we build a FooModel instance and call init_weights:

model = build_from_cfg(model_cfg, FOOMODELS)
model.init_weights()

define nested init_cfg

The init_cfg of sub-modules will override that of their parents, for example:

model_cfg = dict(
    type="FooModel",
    init_cfg=[
        dict(type='Constant', val=1, bias=2, layer='Linear',
            override=dict(type='Constant', name='reg', val=13, bias=14)),
        dict(type='Constant', val=3, bias=4, layer='Conv1d'),
        dict(type='Constant', val=5, bias=6, layer='Conv2d'),
    ],
    component1=dict(
        type='FooConv1d', init_cfg=dict(type='Constant', val=7, bias=8)),
    component2=dict(
        type='FooConv2d', init_cfg=dict(type='Constant', val=9, bias=10)),
    component3=dict(type='FooLinear'),
    component4=dict(
        type='FooLinearConv1d',
        linear=dict(type='FooLinear'),
        conv1d=dict(type='FooConv1d')))

After model = build_from_cfg(model_cfg, FOOMODELS) and model.init_weights(), the parameters will be:

model (FooModel)

  • component1 (FooConv1d, weight=7, bias=8)
  • component2 (FooConv2d, weight=9, bias=10)
  • component3 (FooLinear, weight=1, bias=2)
  • component4 (FooLinearConv1d)
    • linear (FooLinear, weight=1, bias=2)
    • conv1d (FooConv1d, weight=3, bias=4)
  • reg (nn.Linear, weight=13, bias=14)

Migration

  1. Models that inherit from nn.Module must inherit from BaseModule instead.
  2. Add an init_cfg argument to the __init__ of derived classes, and set a default value for init_cfg:
    If init_weights in the current class only recursively calls the init_weights of children modules, such as
def init_weights(self, pretrained):
    """Initialize the weights in head.

    Args:
        pretrained (str, optional): Path to pre-trained weights.
            Defaults to None.
    """
    if self.with_shared_head:
        self.shared_head.init_weights(pretrained=pretrained)
    if self.with_bbox:
        self.bbox_roi_extractor.init_weights()
        self.bbox_head.init_weights()
    if self.with_mask:
        self.mask_head.init_weights()
        if not self.share_roi_extractor:
            self.mask_roi_extractor.init_weights()

just set init_cfg = None. Otherwise, set the init_cfg value according to the current code in init_weights, e.g.

# init_weights from retina_head
def init_weights(self):
    """Initialize weights of the head."""
    for m in self.cls_convs:
        normal_init(m.conv, std=0.01)
    for m in self.reg_convs:
        normal_init(m.conv, std=0.01)
    bias_cls = bias_init_with_prob(0.01)
    normal_init(self.retina_cls, std=0.01, bias=bias_cls)
    normal_init(self.retina_reg, std=0.01)

then the init_cfg must be

init_cfg = dict(
    type='Normal',
    layer='Conv2d',
    std=0.01,
    override=dict(type='Normal', name='retina_cls', std=0.01,
                  bias_prob=0.01))
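Putting step 2 together, the constructor of a derived class could accept and forward init_cfg like the following sketch (the class and arguments are illustrative, following the retina_head example above):

from mmcv.runner import BaseModule

class RetinaHead(BaseModule):
    def __init__(self,
                 num_classes,
                 in_channels,
                 init_cfg=dict(
                     type='Normal', layer='Conv2d', std=0.01,
                     override=dict(type='Normal', name='retina_cls',
                                   std=0.01, bias_prob=0.01)),
                 **kwargs):
        # Forward init_cfg so that BaseModule.init_weights can use it.
        super().__init__(init_cfg=init_cfg)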
  3. Backward compatibility for pretrained in previous config files: add the following code to the __init__ of classes derived from BaseModule:
if pretrained is not None:
    warnings.warn('DeprecationWarning: pretrained is a deprecated \
        key, please consider using init_cfg')
    self.init_cfg = dict(type='Pretrained', checkpoint=pretrained)
  4. There is no need to reimplement the init_weights method in derived classes.
  5. Call model.init_weights() after building models. Please pay attention to this, as it is an additional step for models in OpenMMLab.
  6. If users call init_weights on sub-components, or call init_weights on a model twice, there will be a warning: "This module has been initialized, please call initialize(module, init_cfg) to reinitialize it" (illustrated after this list).
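For example, under these rules the second init_weights call below would only emit the warning rather than re-initialize (FOOMODELS and model_cfg as in the Usages section):

model = build_from_cfg(model_cfg, FOOMODELS)
model.init_weights()
model.init_weights()  # warns: module has already been initialized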

BC-breaking

Please inform users in the tutorials to call model.init_weights() after building models.

@ZwwWayne (Collaborator) commented Jan 7, 2021

Please add the design, usages, migration, and BC-breaking notes to the PR message and documentation, both for discussion and as a reference for users.

@ZwwWayne ZwwWayne requested a review from xvjiarui January 7, 2021 14:30
@codecov bot commented Jan 15, 2021

Codecov Report

Merging #780 (45a8746) into master (6c57b88) will increase coverage by 0.69%.
The diff coverage is 86.56%.


@@            Coverage Diff             @@
##           master     #780      +/-   ##
==========================================
+ Coverage   62.23%   62.93%   +0.69%     
==========================================
  Files         144      145       +1     
  Lines        8506     8673     +167     
  Branches     1522     1569      +47     
==========================================
+ Hits         5294     5458     +164     
- Misses       2945     2950       +5     
+ Partials      267      265       -2     
Flag        Coverage Δ
unittests   62.93% <86.56%> (+0.69%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files                             Coverage Δ
mmcv/cnn/alexnet.py                        26.08% <0.00%> (-4.35%) ⬇️
mmcv/cnn/resnet.py                         12.19% <0.00%> (-0.61%) ⬇️
mmcv/cnn/vgg.py                            11.11% <0.00%> (-1.02%) ⬇️
mmcv/onnx/onnx_utils/symbolic_helper.py     0.00% <0.00%> (ø)
mmcv/ops/nms.py                            34.43% <8.33%> (ø)
mmcv/utils/parrots_jit.py                  78.94% <66.66%> (+2.47%) ⬆️
mmcv/runner/checkpoint.py                  68.05% <81.81%> (+1.82%) ⬆️
mmcv/runner/base_module.py                 85.71% <85.71%> (ø)
mmcv/cnn/utils/weight_init.py              98.80% <98.59%> (+1.75%) ⬆️
mmcv/cnn/__init__.py                      100.00% <100.00%> (ø)
... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ZwwWayne (Collaborator) commented

The contents of PR messages should also be put into the tutorial to serve as documentation.

@MeowZheng MeowZheng requested a review from ZwwWayne January 18, 2021 09:40
@ZwwWayne (Collaborator) commented Feb 2, 2021

Please resolve conflicts as the load_checkpoint has been refactored.

@ZwwWayne (Collaborator) commented Feb 4, 2021

LGTM now. See if @hellock has any comments.

@MeowZheng MeowZheng requested a review from hellock February 5, 2021 10:27
@apanand14 commented:

File "tools/train.py", line 163, in main
model.init_weights()
File "C:\Users\topseven\anaconda3\envs\mmcv\lib\site-packages\mmcv\runner\base_module.py", line 117, in init_weights
m.init_weights()
TypeError: init_weights() missing 1 required positional argument: 'pretrained'
