This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Using "uniform" Xavier strategy to initialize the weight for VGG network (a trial solution to issue#9866) #9867

Merged
4 commits merged into apache:master on Feb 28, 2018

Conversation

juliusshufan
Contributor

@juliusshufan juliusshufan commented Feb 23, 2018

Description

This PR provides a potential solution for issue #9866.
For detailed information, please see the issue.

Checklist

Essentials

  • Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, a README.md is added explaining what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
  • To the best of my knowledge, examples are either not affected by this change or have been fixed to be compatible with it

Changes

example/image-classification/common/fit.py

Comments

This PR has been verified on an Nvidia P40 GPU and on a CPU machine.
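The Xavier "uniform" scheme named in the title can be sketched in NumPy. This is a minimal illustration of the standard Glorot-uniform formula with an averaged fan factor; the function name, default `magnitude`, and parameterization are assumptions for illustration, not necessarily MXNet's exact implementation:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, magnitude=3.0, rng=None):
    """Sample a (fan_in, fan_out) weight matrix from the Xavier
    "uniform" scheme: U(-limit, limit), where
    limit = sqrt(magnitude / ((fan_in + fan_out) / 2))."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(magnitude / ((fan_in + fan_out) / 2.0))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Example: initialize a VGG-style 4096 -> 4096 fully connected layer.
w = xavier_uniform(4096, 4096)
print(w.shape)  # (4096, 4096)
```

The intuition behind the issue: for a deep plain network like VGG, a poorly scaled initialization makes activations shrink or blow up layer by layer, so training fails to converge; bounding samples by the fan-dependent limit keeps activation variance roughly constant across layers.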

@juliusshufan juliusshufan changed the title Using "uniform" Xavier strategy to initialize the weight for VGG network Using "uniform" Xavier strategy to initialize the weight for VGG network (potential solution to issue#9866) Feb 23, 2018
@juliusshufan juliusshufan changed the title Using "uniform" Xavier strategy to initialize the weight for VGG network (potential solution to issue#9866) Using "uniform" Xavier strategy to initialize the weight for VGG network (a trial solution to issue#9866) Feb 23, 2018
@juliusshufan
Contributor Author

@szha may I ask for review comments from you or another domain owner? I understand that normally it is the user who decides the weight-initialization method. In this case, since the current implementation of the example already explicitly uses a different initialization method for AlexNet to avoid convergence issues, it may be reasonable to follow a similar approach for VGG. What do you think?
(For a description of the issue, please refer to #9867.)

Thanks for your time.

BR,
Shufan
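The per-network initializer selection described in the comment above could be sketched as follows. This is a hypothetical illustration of the approach (special-casing certain network families), not the actual fit.py code; the function name and parameter values are assumptions:

```python
def choose_init_params(network: str) -> dict:
    """Hypothetical sketch: pick Xavier initializer settings per
    network family, mirroring how the example special-cases some
    networks to avoid convergence issues. Values are illustrative."""
    if network.startswith('vgg'):
        # Xavier with "uniform" sampling, as this PR proposes for VGG.
        return {'rnd_type': 'uniform', 'factor_type': 'avg', 'magnitude': 3.0}
    # Default elsewhere: Xavier with Gaussian sampling.
    return {'rnd_type': 'gaussian', 'factor_type': 'in', 'magnitude': 2.0}

print(choose_init_params('vgg16')['rnd_type'])    # uniform
print(choose_init_params('resnet50')['rnd_type'])  # gaussian
```

The returned dict could then be passed as keyword arguments to the framework's Xavier initializer when building the training setup.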

@sxjscience sxjscience self-requested a review February 28, 2018 04:57
@sxjscience sxjscience merged commit 17a9c6a into apache:master Feb 28, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
…ork (a trial solution to issue#9866) (apache#9867)

* Enable the reporting of cross-entropy or nll loss value during training

* Set the default value of loss as a '' to avoid a Python runtime issue when loss argument is not set

* Applying the Xavier with "uniform" type to initialize weight when network is VGG
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
…ork (a trial solution to issue#9866) (apache#9867)

* Enable the reporting of cross-entropy or nll loss value during training

* Set the default value of loss as a '' to avoid a Python runtime issue when loss argument is not set

* Applying the Xavier with "uniform" type to initialize weight when network is VGG