[MXNET-563] Refactor R optimizers to fix memory leak #11374
Conversation
@jeremiedb Can you add tests for the optimizers? See #7196; none of the optimizers have tests.
Let me know if you want me to pitch in; we can collaborate on this. I can begin work on fixing some of the broken optimizers starting next month.
@hetong007 Please take a look at this.
@anirudhacharya Sure, I'll add tests.
@jeremiedb sure!
@hetong007 With Adadelta and Adagrad, the same functionalities as before are now supported (and non-centered rmsprop has been added within rmsprop). Tests to be added.
R-package/R/optimizer.R (Outdated)
count <- 0
num_update <- 0
please eliminate the trailing whitespace.
R-package/R/optimizer.R (Outdated)
#' Step size.
#' @param gamma1 float, default=0.95
#'
#' @param learning.rate float, default=1e-3
Is there a strong reason to change the default values? It may break other people's code.
I wanted to align with the Python package's defaults. I'll revert to the existing defaults if you think this change does more harm than good.
I suggest keeping the default values as they are in the R package. Default values don't need to match across interfaces, and changing them may break users' scripts, especially for optimizer parameters.
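For illustration, a user's existing script may rely on the current documented default without ever spelling it out, so a changed default silently changes training behaviour (a hypothetical example; mx.opt.create is the R package's optimizer factory):

library(mxnet)

# No learning.rate given: this call picks up whatever the package default is,
# so moving the rmsprop default from 0.002 to 1e-3 would alter results with
# no visible change in the user's code.
optimizer <- mx.opt.create("rmsprop", wd = 1e-5)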
R-package/R/optimizer.R
Outdated
mx.opt.get.updater <- function(optimizer, weights, ctx) { | ||
|
||
exec_list <- lapply(seq_along(weights), function(i) { | ||
if (is.null(weights[[i]])) return(NULL) else |
please format here and below as
if (condition) {
xxx
} else {
yyy
}
Your editor can help you eliminate trailing spaces. Also, please add tests for the new optimizers.
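Applied to the quoted line, the requested style would look roughly like this (a formatting sketch only, with a toy stand-in for the real weight list):

weights <- list(NULL, matrix(0, 2, 2))  # toy stand-in for the weight arrays

exec_list <- lapply(seq_along(weights), function(i) {
  if (is.null(weights[[i]])) {
    NULL
  } else {
    weights[[i]]  # placeholder for the update executor built from this weight
  }
})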
@hetong007 Tests added and trailing space fixed.
R-package/R/optimizer.R
Outdated
#' @param learning.rate float, default=0.002 | ||
#' Step size. | ||
#' The initial learning rate. | ||
#' @param gamma1 float, default=0.95 | ||
#' decay factor of moving average for gradient, gradient^2. | ||
#' @param gamm2 float, default=0.9 |
gamm2 -> gamma2
epsilon = 1e-8,
wd = 0,
rescale.grad = 1,
clip_gradient = -1,
What is the difference between setting it to 1 and -1?
When clip_gradient is < 0, no clipping is applied.
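In other words, the rule is roughly the following (a base-R sketch of the semantics; the actual clipping happens inside the update operator):

# Negative clip_gradient disables clipping; otherwise the gradient is
# clipped elementwise to [-clip_gradient, clip_gradient].
clip_grad <- function(grad, clip_gradient) {
  if (clip_gradient < 0) {
    grad
  } else {
    pmax(pmin(grad, clip_gradient), -clip_gradient)
  }
}

clip_grad(c(-5, 0.3, 5), 1)   # -1.0 0.3 1.0
clip_grad(c(-5, 0.3, 5), -1)  # unchanged: -5.0 0.3 5.0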
Please add that to the docstring, otherwise people may feel confused without looking at the code.
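Something along these lines in the roxygen block would cover it (suggested wording only, not necessarily the exact text committed):

#' @param clip_gradient float, default=-1
#'      Clip gradient to the range of [-clip_gradient, clip_gradient].
#'      If clip_gradient < 0, gradient clipping is turned off.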
Hi, I have the same problem with GPU memory (package mxnet, function mx.mlp, just started with that). But as I'm a traffic engineer, I have a little trouble finding my way around the provided solution (I've used R for a long time, but not at this level of detail). Could you please be so kind as to provide a "Refactor R optimizers for dummies" version of the solution? :)
@hetong007 Is there any blocking element remaining?
@aplikaplik Is your issue specifically about the mxnet MLP function, or is it also related to gpuR? As for mx.mlp, this PR should fix the memory issue encountered with both mx.mlp and mx.model.FeedForward.create, since they both rely on the same optimizer update routine.
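For a quick check of that shared path, a toy run along these lines should exercise the refactored optimizers through mx.mlp (a hedged example; argument names follow the usual mx.mlp interface and the values are purely illustrative):

library(mxnet)

# Toy binary classification problem; mx.mlp delegates to
# mx.model.FeedForward.create, which in turn uses the optimizer updaters
# that this PR refactors.
x <- matrix(runif(1000 * 10), nrow = 1000, ncol = 10)
y <- sample(0:1, 1000, replace = TRUE)

model <- mx.mlp(x, y, hidden_node = 16, out_node = 2,
                out_activation = "softmax",
                num.round = 5, array.batch.size = 50,
                learning.rate = 0.1, momentum = 0.9,
                eval.metric = mx.metric.accuracy,
                ctx = mx.cpu())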
@jeremiedb Hi, I referenced gpuR because it could be useful for @cdeterman. My problem is much simpler: I don't know how to apply this code (optimizer.R) to my R installation or to the mxnet package. I'm sorry for such an elementary question.
* refactor R optimizers to fix memory leak
* add Adadelta and Adagrad
* fix comments
* fix comments
* fix comments
* add tests
* fix whitespaces
* fix whitespaces
* fix typo
* fix typo
* add doc on clipping
@hetong007 @jeremiedb
Fix the R memory leak by refactoring the optimizers.
Since mutable NDArrays aren't supported in R, the optimizers have been ported to symbolic updates, with an executor created for each weight.
Note: only the optimizer update symbols have been used for now, so only SGD, rmsprop, and Adam are currently supported. The manual updates for Adagrad and Adadelta will need to be reimplemented.
Memory now stays low even for very large networks and embeddings.
Tested on CPU and a single GPU, not on multiple GPUs.
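For readers unfamiliar with the approach, the idea is roughly the following (a simplified sketch, not the exact PR code; names such as mx.symbol.sgd_update and mx.simple.bind follow the MXNet R API as I understand it, and the shapes are made up):

library(mxnet)

# Build the update graph once: weight and gradient come in as variables,
# and the sgd_update operator produces the updated weight.
weight <- mx.symbol.Variable("weight")
grad   <- mx.symbol.Variable("grad")
update <- mx.symbol.sgd_update(weight = weight, grad = grad,
                               lr = 0.01, wd = 0,
                               rescale_grad = 1, clip_gradient = -1)

# One executor is bound per weight array; re-running it each iteration
# performs the update without growing the computation graph, which is
# what keeps memory flat.
w_shape <- c(64, 128)  # hypothetical shape of one weight array
exec <- mx.simple.bind(update, ctx = mx.cpu(), grad.req = "null",
                       weight = w_shape, grad = w_shape)

At each step the current weight and gradient are written into the executor's argument arrays, the executor is run forward, and the output is copied back into the weight, so iterations reuse the same memory.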