
gluon.SymbolBlock cannot import resnet trained with dtype="float16" #11849

Closed
woozch opened this issue Jul 21, 2018 · 5 comments

Comments


woozch commented Jul 21, 2018

Description

gluon.SymbolBlock.imports cannot load a fine-tuned resnet101 (incubator-mxnet/example/image-classification/symbols/resnet.py) that was trained with dtype="float16".

Error Message:

AssertionError: Failed loading Parameter 'stage3_unit2_conv2_weight' from saved params: dtype incompatible expected <type 'numpy.float32'> vs saved <type 'numpy.float16'>

Minimum reproducible example

from mxnet import gluon

net = gluon.SymbolBlock.imports('resnet-101-symbol.json', ['data', 'softmax_label'], 'resnet-101-0007.params')  # raises the AssertionError above
net = gluon.SymbolBlock.imports('resnet-101-symbol.json', ['data', 'softmax_label'])  # loads, but all parameters default to float32
print(net.collect_params())

My Questions

In incubator-mxnet/example/image-classification/symbols/resnet.py,
there is mx.sym.Cast for type conversion.
I fine-tuned resnet101 with dtype="float16" and need to load the model as a HybridBlock. However, gluon.SymbolBlock.imports creates every parameter in the network as float32, so the trained float16 parameters cannot be loaded into it.

Here, resnet-101-0007.params was trained with the argument dtype='float16'.
The resnet-101-symbol.json file contains the Cast op:
{
  "op": "Cast",
  "name": "cast0",
  "attrs": {"dtype": "float16"},
  "inputs": [[7, 0, 0]]
},
It seems that gluon.SymbolBlock.imports does not take this type-conversion operator into account.

For now, I think I need to load all the parameters manually, change their types, and save them again.
Is there any other solution to this problem?
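
In the meantime, a workaround that avoids rewriting the params file is to build the SymbolBlock without a params file, cast the whole block to float16, and only then load the trained weights. A minimal sketch, assuming the MXNet 1.x Gluon API and that all parameters in the exported graph are float16 (not verified against this exact model):

import mxnet as mx
from mxnet import gluon

ctx = mx.cpu()

# Build the block without loading parameters; they are created as float32 by default.
net = gluon.SymbolBlock.imports('resnet-101-symbol.json', ['data', 'softmax_label'])

# Recast the block so the parameter dtypes match the saved float16 arrays,
# then load the trained weights.
net.cast('float16')
net.collect_params().load('resnet-101-0007.params', ctx=ctx)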

@apeforest (Contributor)

@sandeep-krishnamurthy Please help label this issue as Gluon.

nicklhy (Contributor) commented Aug 14, 2018

Got the same problem here. Any updates?

@ThomasDelteil (Contributor)

@rahul003 same problem here, can't load back saved float16 models.

sandeep-krishnamurthy (Contributor) commented Aug 29, 2018

Using the snippet below:

import mxnet as mx

ctx = mx.cpu(0)
# Note: despite the "fp16" file names below, this sample uses float64;
# the same issue applies to any non-float32 dtype.
data = mx.nd.zeros((1, 3, 224, 224), ctx=ctx, dtype='float64')
net_fp32 = mx.gluon.model_zoo.vision.resnet34_v2(pretrained=True, ctx=ctx)
net_fp32.cast('float64')
net_fp32.hybridize()
pred = net_fp32.forward(data)
net_fp32.export('resnet34_fp16', 0)
print('exported model')

sm = mx.sym.load('resnet34_fp16-symbol.json')
inputs = mx.sym.var('data', dtype='float64')
net_fp16 = mx.gluon.SymbolBlock(sm, inputs)
net_fp16.collect_params().load('resnet34_fp16-0000.params', ctx)  # dtype mismatch raised here
pred = net_fp16.forward(data)

Below are my findings:

  1. Casting worked fine.
  2. Saved parameters are in the correct format (fp64 in my sample code).
  3. sym.load worked fine. If I infer the symbol's types (sym.infer_type(data='float64')), I get the correct inferred type (float64) for all params (see the sketch below).
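
To make finding 3 concrete, this is roughly the check being described (a sketch; it assumes the files exported by the snippet above and the MXNet 1.x Symbol API):

import mxnet as mx

sm = mx.sym.load('resnet34_fp16-symbol.json')
# infer_type returns (arg_types, out_types, aux_types); every argument,
# including the weights, comes back as float64 here.
arg_types, out_types, aux_types = sm.infer_type(data='float64')
print(dict(zip(sm.list_arguments(), arg_types)))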

Below is the issue:

  1. When you create mx.gluon.SymbolBlock(sm, inputs), it creates the parameters in the Block, but no dtype is passed when creating them.
     See https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/block.py#L1058 — if there is no parameter to get, it creates one with the default dtype (fp32): https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/parameter.py#L688. The mismatch can be seen directly, as sketched below.
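
The dtype mismatch is easy to observe by inspecting the freshly created parameters (a sketch using the files from the snippet above):

import mxnet as mx

sm = mx.sym.load('resnet34_fp16-symbol.json')
inputs = mx.sym.var('data', dtype='float64')
net = mx.gluon.SymbolBlock(sm, inputs)
# Every parameter was created with the default dtype (float32), not the
# dtype saved in the params file (float64 in this sample):
print({name: p.dtype for name, p in net.collect_params().items()})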

I am working on the fix.

@apeforest @ThomasDelteil - FYI

@sandeep-krishnamurthy (Contributor)

Resolving, as the changes have been merged.
