[Refactoring] Unified parameters initialization #780
Conversation
Maybe add the design, usages, migration, and BC-breaking notes to the PR message and documentation, both for discussion and as a reference for users.
Codecov Report
@@ Coverage Diff @@
## master #780 +/- ##
==========================================
+ Coverage 62.23% 62.93% +0.69%
==========================================
Files 144 145 +1
Lines 8506 8673 +167
Branches 1522 1569 +47
==========================================
+ Hits 5294 5458 +164
- Misses 2945 2950 +5
+ Partials 267 265 -2
The contents of the PR message should also be put into the tutorial to serve as documentation.
Please resolve conflicts, as load_checkpoint has been refactored.
LGTM now. See if @hellock has any comments.
File "tools/train.py", line 163, in main |
I mainly revised 3 files:
- Add Constant, Kaiming, Normal, Pretrained, Uniform and Xavier initializer classes and register them in the INITIALIZERS registry; add an initialize function to initialize parameters with init_cfg.
- Add a _load_checkpoint_with_prefix function.
- Add BaseModule, which only implements the init_weight function for parameters initialization.
Design
Model initialization in OpenMMLab uses init_cfg, BaseModule::init_weight, the initialize function, and the INITIALIZERS registry together. Users can initialize their models with the following two steps:
1. Define init_cfg for a model or its components in model_cfg. The init_cfg of child components has higher priority and will override the init_cfg of parent modules.
2. Call the model.init_weight() method explicitly, and model parameters will be initialized as configured.
The high-level workflow of initialization in OpenMMLab is:
model_cfg(init_cfg) -> build_from_cfg -> model -> init_weight() -> initialize(self, self.init_cfg) -> children's init_weight()
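As a minimal sketch of the two steps and this workflow (FooNet is a hypothetical module, the import path and the Constant initializer's argument names are assumptions based on this PR's description, and the model is constructed directly instead of via build_from_cfg to keep the sketch self-contained):

```python
import torch.nn as nn
from mmcv.runner import BaseModule  # base class introduced by this PR; import path assumed

class FooNet(BaseModule):
    def __init__(self, init_cfg=None):
        # step 1: init_cfg is passed to BaseModule and stored for later use
        super().__init__(init_cfg=init_cfg)
        self.conv = nn.Conv2d(3, 8, 3)

model = FooNet(init_cfg=dict(type='Constant', layer='Conv2d', val=1., bias=2.))
model.init_weight()  # step 2: weights of self.conv become 1, biases become 2
```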
APIs
init_cfg
It is a dict or list[dict], and contains:
- type - str containing the initializer name in INITIALIZERS, followed by the arguments of the initializer.
- layer - str or list[str] containing the names of basic layers in PyTorch or MMCV with learnable parameters that will be initialized, e.g. 'Conv2d', 'DeformConv2d'.
- override - dict or list[dict] containing the sub-modules that do not inherit from BaseModule and whose initialization configuration differs from that of the layers listed in the 'layer' key. The initializer defined in type works for all layers defined in layer, so if sub-modules are not derived classes of BaseModule but can be initialized in the same way as the layers in layer, override is not needed. override contains:
  - type, followed by the arguments of the initializer;
  - name, to indicate the sub-module that will be initialized.
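An illustrative init_cfg with these three keys might look like the following (the values and the fc_cls sub-module name are made up for illustration):

```python
init_cfg = dict(
    type='Normal',                # initializer name registered in INITIALIZERS
    std=0.01,                     # argument of the Normal initializer
    layer=['Conv2d', 'Linear'],   # basic layers whose parameters are initialized
    override=dict(
        type='Constant',          # a different initializer for one sub-module
        val=0.,
        name='fc_cls'))           # hypothetical sub-module that is not a BaseModule
```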
BaseModule
BaseModule is the base module for all modules in OpenMMLab. The init_weight method of BaseModule initializes the module's own parameters using the initialize(module, init_cfg) function in MMCV, and then calls the init_weight() method of its sub-components.
initialize(module, init_cfg)
- module - the module to be initialized.
- init_cfg - the initialization configuration dict.
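A short sketch of calling initialize directly on a plain nn.Module (the import location is an assumption; the argument names follow the description above):

```python
import torch.nn as nn
from mmcv.cnn import initialize  # function added by this PR; import path assumed

module = nn.Sequential(nn.Conv2d(3, 8, 3), nn.Linear(8, 4))
init_cfg = dict(type='Constant', layer=['Conv2d', 'Linear'], val=1., bias=2.)
initialize(module, init_cfg)  # all Conv2d/Linear weights -> 1, biases -> 2
```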
INITIALIZERS registry
OpenMMLab has implemented 7 initializers including Constant, Xavier, Normal, Uniform, Kaiming, and Pretrained, and registers them in INITIALIZERS. Taking advantage of the "builder & registry" mechanism of OpenMMLab, INITIALIZERS can be easily extended by implementing new initializer classes and registering them in INITIALIZERS.
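A rough sketch of such an extension (the import path and the initializer's call interface are assumptions; ZeroInit is a toy example, not part of MMCV):

```python
from mmcv.cnn import INITIALIZERS  # registry added by this PR; import path assumed

@INITIALIZERS.register_module(name='Zero')
class ZeroInit:
    """Toy initializer: set all parameters of the matched layers to zero."""

    def __init__(self, layer=None):
        self.layer = [layer] if isinstance(layer, str) else (layer or [])

    def __call__(self, module):
        for m in module.modules():
            if type(m).__name__ in self.layer:
                for param in m.parameters():
                    param.data.zero_()

# afterwards it can be used like the built-in ones:
# init_cfg = dict(type='Zero', layer='Conv2d')
```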
Usages
To initialize models in OpenMMLab, users just need two steps: 1. define init_cfg; 2. build the model and call model.init_weight().
Define init_cfg for a model
Suppose FooModel, FooConv1d, FooConv2d and FooLinear are derived from BaseModule. If we would like to initialize all weights of linear layers as 1 and biases as 2, all weights of conv1d layers as 3 and biases as 4, and all weights of conv2d layers as 5 and biases as 6 in FooModel, we can define model_cfg and init_cfg as in the sketch below, then build a FooModel instance and call init_weight.
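A hedged sketch of this example (all Foo* classes and the FOOMODELS registry are hypothetical; initializer and method names follow this PR's description):

```python
import torch.nn as nn
from mmcv.runner import BaseModule
from mmcv.utils import Registry, build_from_cfg

FOOMODELS = Registry('foo models')

class FooConv1d(BaseModule):
    def __init__(self, init_cfg=None):
        super().__init__(init_cfg=init_cfg)
        self.conv1d = nn.Conv1d(4, 1, 4)

class FooConv2d(BaseModule):
    def __init__(self, init_cfg=None):
        super().__init__(init_cfg=init_cfg)
        self.conv2d = nn.Conv2d(3, 1, 3)

class FooLinear(BaseModule):
    def __init__(self, init_cfg=None):
        super().__init__(init_cfg=init_cfg)
        self.linear = nn.Linear(3, 4)

@FOOMODELS.register_module()
class FooModel(BaseModule):
    def __init__(self, init_cfg=None):
        super().__init__(init_cfg=init_cfg)
        self.conv1d = FooConv1d()
        self.conv2d = FooConv2d()
        self.linear = FooLinear()

model_cfg = dict(
    type='FooModel',
    init_cfg=[
        dict(type='Constant', layer='Linear', val=1., bias=2.),
        dict(type='Constant', layer='Conv1d', val=3., bias=4.),
        dict(type='Constant', layer='Conv2d', val=5., bias=6.)])

model = build_from_cfg(model_cfg, FOOMODELS)
model.init_weight()  # Linear: w=1 b=2, Conv1d: w=3 b=4, Conv2d: w=5 b=6
```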
Define init_cfg nestedly
The init_cfg of sub-modules will override the parents', as in the sketch below.
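A sketch continuing the previous example (again, all Foo* classes are hypothetical): a child's own init_cfg takes priority over the configuration given at the model level.

```python
@FOOMODELS.register_module()
class FooModelNested(BaseModule):
    def __init__(self, init_cfg=None):
        super().__init__(init_cfg=init_cfg)
        # this child carries its own init_cfg, which overrides the parent's
        self.conv1d = FooConv1d(
            init_cfg=dict(type='Constant', layer='Conv1d', val=7., bias=8.))
        self.conv2d = FooConv2d()
        self.linear = FooLinear()

model_cfg = dict(
    type='FooModelNested',
    init_cfg=[
        dict(type='Constant', layer='Conv1d', val=3., bias=4.),
        dict(type='Constant', layer='Conv2d', val=5., bias=6.)])

model = build_from_cfg(model_cfg, FOOMODELS)
model.init_weight()
# Conv1d ends up with w=7, b=8 (child init_cfg wins); Conv2d keeps w=5, b=6.
```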
After model = build_from_cfg(model_cfg, FOOMODELS) and model.init_weight(), the parameters of the resulting model will be initialized accordingly, with each sub-module's own init_cfg taking precedence over its parent's.
Migration
1. Modules that currently inherit from nn.Module must inherit from BaseModule instead.
2. Add an init_cfg argument to __init__ of the derived classes, and set a default value for init_cfg. If init_weight in the current class just recursively calls init_weight of its children's modules, simply set init_cfg = None. Otherwise, set the init_cfg value according to the current code in init_weight; e.g. the init_cfg must correspond to the pretrained field in the previous config file. Add the corresponding code in __init__ of classes derived from BaseModule (see the sketch after this list).
3. Remove the init_weight method in derived classes.
4. Call model.init_weight() after building models. Please pay attention to this, as it is an additional action for models in OpenMMLab.
5. If users call init_weight of sub-components, or call init_weight of a model twice, there will be a warning "This module has been initialized, please call initialize(module, init_cfg) to reinitialize it".
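A hedged sketch of migration steps 2-4 (MyBackbone and the checkpoint path are hypothetical; the Pretrained initializer's checkpoint argument is an assumption based on this PR's description):

```python
import torch.nn as nn
from mmcv.runner import BaseModule

class MyBackbone(BaseModule):  # previously derived from nn.Module
    def __init__(self, init_cfg=None):
        # forward init_cfg to BaseModule; the old init_weight method is removed
        super().__init__(init_cfg=init_cfg)
        self.conv = nn.Conv2d(3, 16, 3)

# a previous `pretrained='path/to/ckpt.pth'` config entry becomes an init_cfg:
backbone = MyBackbone(init_cfg=dict(type='Pretrained', checkpoint='path/to/ckpt.pth'))
backbone.init_weight()  # called explicitly after building; loads the checkpoint
```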
BC-breaking
Please inform users to call model.init_weight() after building models in the tutorials.