This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
szhengac
requested review from
aaronmarkham,
eric-haibin-lin,
nswamy and
szha
as code owners
January 21, 2020 23:32
szhengac
requested review from
gigasquid,
sergeykolychev and
yzhliu
as code owners
January 22, 2020 22:41
gigasquid
approved these changes
Jan 22, 2020
The clojure code side looks good. Thanks for making the change 💯
We need to hold off on these changes until the 1.7.x or 2.x branch is cut.
sergeykolychev
approved these changes
Jan 27, 2020
4 tasks
MoisesHer
pushed a commit
to MoisesHer/incubator-mxnet
that referenced
this pull request
Apr 10, 2020
* refactor optimizer * refactor optimizer * fix svrg test * fix rmsprop param naming * fix signum test * fix pylint and perl test * fix perl test and signsgd test * fix * retrigger ci * reduce ci overheads
anirudh2290
pushed a commit
to anirudh2290/mxnet
that referenced
this pull request
May 29, 2020
* refactor optimizer * refactor optimizer * fix svrg test * fix rmsprop param naming * fix signum test * fix pylint and perl test * fix perl test and signsgd test * fix * retrigger ci * reduce ci overheads
Description
Refactor optimizers for MXNet 2.0.
Main changes:
- `Optimizer` and `Updater` are split into two files.
- Each optimizer has two step functions, `step` and `fused_step`. The pure ndarray implementation goes in `step`, while `fused_step` uses the optimized kernel. Whether `step` or `fused_step` is used is controlled by the flag `use_fused_step`. The main reason to have two step functions is that it is hard for an optimization researcher to implement a new optimizer by referring to the existing implementations such as SGD. PyTorch uses only pure Python code in `optim`. So I provide two update functions to maintain both readability and efficiency.
- `update`, `update_multi_precision`, `step`, and `fused_step` take `indices, weights, grads, states` as input, where the length of each list is determined by `aggregate_num`. When `aggregate_num = numpy.inf`, all the parameters are aggregated. This change is necessary if we want to implement complex optimizers such as LBFGS and Barzilai-Borwein step size, which require access to all the parameters in a single function.
- `ccSGD` and `LBSGD` are removed. `LBSGD` is simply `SGD` + `linear scaling` + `grad accumulation`, which shouldn't be an optimizer in the optimizer API.
Checklist
Essentials
Changes
Comments
@szha @eric-haibin-lin @sxjscience @leezu
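For reviewers who want a quick picture of the `step` / `fused_step` split and the aggregated inputs described above, here is a minimal sketch in plain Python. It is not the code in this PR: `ToySGD`, `update_all`, and `call_sizes` are hypothetical names made up for illustration, weights are plain lists of floats rather than ndarrays, `math.inf` stands in for `numpy.inf`, and the "fused" path simply reuses the reference path where the real one would call an optimized kernel.

```python
import math


class ToySGD:
    """Hypothetical stand-in for an optimizer; not MXNet's Optimizer class."""

    def __init__(self, learning_rate=0.1, use_fused_step=True, aggregate_num=1):
        self.learning_rate = learning_rate
        self.use_fused_step = use_fused_step
        self.aggregate_num = aggregate_num
        self.call_sizes = []  # how many parameters each update() call received

    def step(self, indices, weights, grads, states):
        # Readable reference path: plain in-place SGD on every parameter.
        for weight, grad in zip(weights, grads):
            for i, g in enumerate(grad):
                weight[i] -= self.learning_rate * g

    def fused_step(self, indices, weights, grads, states):
        # Assumption: a real fused path would call an optimized kernel here.
        # The sketch reuses step() so that it stays runnable.
        self.step(indices, weights, grads, states)

    def update(self, indices, weights, grads, states):
        # All four arguments are lists of equal length.
        self.call_sizes.append(len(indices))
        if self.use_fused_step:
            self.fused_step(indices, weights, grads, states)
        else:
            self.step(indices, weights, grads, states)


def update_all(optimizer, indices, weights, grads, states):
    """Feed parameters to the optimizer in groups of at most aggregate_num."""
    total = len(indices)
    if math.isinf(optimizer.aggregate_num):
        chunk = max(total, 1)  # one call that sees every parameter
    else:
        chunk = max(int(optimizer.aggregate_num), 1)
    for start in range(0, total, chunk):
        stop = start + chunk
        optimizer.update(indices[start:stop], weights[start:stop],
                         grads[start:stop], states[start:stop])


if __name__ == "__main__":
    opt = ToySGD(learning_rate=0.5, use_fused_step=False, aggregate_num=2)
    weights = [[1.0], [2.0], [3.0]]
    grads = [[1.0], [1.0], [1.0]]
    update_all(opt, [0, 1, 2], weights, grads, [None, None, None])
    print(weights, opt.call_sizes)
```

With an infinite `aggregate_num` the optimizer sees every parameter in one call, which is the property an optimizer like LBFGS would rely on.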