[MXNET-1450] Improve the backward mirroring implementation #18228
Conversation
Hey @ArmageddonKnight, thanks for submitting the PR.
CI supported jobs: [website, edge, unix-cpu, miscellaneous, unix-gpu, windows-gpu, sanity, windows-cpu, clang, centos-cpu, centos-gpu]
Force-pushed from 2509aaf to ade545d
Force-pushed from 6ab28ea to 0e94405
Force-pushed from 8c626df to 55cde60
Force-pushed from ce933ca to 8f5bdd7
@mxnet-bot run ci [centos-gpu]
Jenkins CI successfully triggered: [centos-gpu]
LGTM. Thanks
@ArmageddonKnight would you mind sharing some performance results with this feature enabled?
@eric-haibin-lin According to our evaluation on a single machine with an RTX 2080 Ti, the performance overhead of training ResNet-152 with a batch size of 152 is 6%.
Is there a way to use it for Gluon?
Hi @sxjscience, sorry for the late reply. It is possible in principle, but the current Gluon backend does not invoke the mirroring pass, so enabling backward mirroring currently has no effect on Gluon.
Description
This PR improves the backward mirroring implementation. Specifically, it considers, for each (group of) operator nodes, whether backward mirroring is truly beneficial to the total memory footprint (please refer to test cases #1 and #2 below). It also considers the data dependencies between a forward node and its corresponding gradient node, because it is possible for the feature maps of a layer to be recomputed without recomputing the layer itself (e.g., the Fully-Connected layer, test case #3). These improvements allow us to further reduce the memory consumption of DNN training.
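For readers who want to try the feature, below is a minimal, hypothetical usage sketch (not part of this PR). It assumes backward mirroring is toggled through the existing MXNET_BACKWARD_DO_MIRROR environment variable and exercised through the symbolic/Module API, since Gluon is currently unaffected (see the comment above); the network and shapes are made up purely for illustration.

```python
import os
# Assumption: backward mirroring is toggled via the MXNET_BACKWARD_DO_MIRROR
# environment variable; it must be set before the graph is bound.
os.environ['MXNET_BACKWARD_DO_MIRROR'] = '1'

import mxnet as mx

# A small, hypothetical symbolic network; the mirroring pass decides per
# (group of) nodes whether recomputing their outputs during the backward
# pass is worth the extra compute for the memory it saves.
data = mx.sym.Variable('data')
net = mx.sym.FullyConnected(data, num_hidden=128, name='fc1')
net = mx.sym.Activation(net, act_type='relu', name='relu1')
net = mx.sym.FullyConnected(net, num_hidden=10, name='fc2')
net = mx.sym.SoftmaxOutput(net, name='softmax')

# Bind through the Module API; the mirroring pass runs when the symbolic
# executor is constructed (Gluon is unaffected, as noted above).
mod = mx.mod.Module(net, context=mx.cpu())
mod.bind(data_shapes=[('data', (32, 256))],
         label_shapes=[('softmax_label', (32,))])
mod.init_params()
mod.init_optimizer(optimizer='sgd')
```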
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
FYI, @eric-haibin-lin @szha