SVRG optimization in python/contrib package, this version supports single machine single cpu, single gpu and multi-gpus
StephanieYuan committed Sep 4, 2018
1 parent 6fdfd89 commit 4a2b644
Showing 23 changed files with 792 additions and 579 deletions.
39 changes: 0 additions & 39 deletions contrib/svrg_optimization_python/README.md

This file was deleted.

Empty file.
21 changes: 0 additions & 21 deletions contrib/svrg_optimization_python/tests/__init__.py

This file was deleted.

116 changes: 0 additions & 116 deletions contrib/svrg_optimization_python/tests/test_svrg_module.py

This file was deleted.

96 changes: 0 additions & 96 deletions contrib/svrg_optimization_python/tests/test_svrg_optimizer.py

This file was deleted.

80 changes: 80 additions & 0 deletions docs/api/python/contrib/svrg_optimization.md
@@ -0,0 +1,80 @@
# SVRG Optimization in Python Module API

## Overview
SVRG, which stands for Stochastic Variance Reduced Gradient, is an optimization technique that complements SGD. It
employs explicit variance reduction and converges much faster than SGD for smooth and strongly convex functions.

SVRG optimization is implemented as an `SVRGModule` in `mxnet.contrib.svrg_optimization`, an extension of the existing
`mxnet.module.Module` APIs that encapsulates the SVRG optimization logic in several new functions. For end users, the
API changes relative to the Module API are minimal.
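
`SVRGModule` can be imported from the contrib package; a minimal import, assuming the module path shown in the API
reference below:

```python
>>> from mxnet.contrib.svrg_optimization.svrg_module import SVRGModule
```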

The current `SVRGModule` implements the standard SVRG optimization technique described in _Accelerating Stochastic
Gradient Descent using Predictive Variance Reduction_ by computing the full gradients over all data every
`update_freq` epochs during training. The `SVRGModule` update rule is: the gradient w.r.t. the current parameters,
minus the gradient w.r.t. the parameters snapshotted at the last full-gradient epoch, plus the average of the
gradients over all data.
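
Written as an equation (notation follows the paper and is introduced here only for illustration), where the snapshot
parameters and the full-gradient average are taken at the last full-gradient epoch:

```eval_rst
.. math::

    w_t = w_{t-1} - \eta \left( \nabla f_{i_t}(w_{t-1}) - \nabla f_{i_t}(\tilde{w}) + \tilde{\mu} \right),
    \qquad
    \tilde{\mu} = \frac{1}{n} \sum_{i=1}^{n} \nabla f_i(\tilde{w})
```

The term in parentheses is the variance-reduced gradient that is applied in place of the plain stochastic gradient.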

`SVRGOptimizer` wraps two optimizers: an `AssignmentOptimizer`, which is used to accumulate the full gradients in the
KVStore, and a regular optimizer, which is specified as a parameter to `mod.init_optimizer()`.
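
From the end user's perspective only the regular optimizer is named; a minimal sketch, assuming a local KVStore
(`SVRGOptimizer` and `AssignmentOptimizer` are constructed internally by `SVRGModule`):

```python
>>> mod.init_optimizer(kvstore='local', optimizer='sgd',
...                    optimizer_params=(('learning_rate', 0.025),))
```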

```eval_rst
.. warning:: This package contains experimental APIs and may change in the near future.
```

This document lists the svrg_optimization APIs in mxnet:

```eval_rst
.. autosummary::
    :nosignatures:

    mxnet.contrib.svrg_optimization.SVRGModule
    mxnet.contrib.svrg_optimization.SVRGOptimizer
```

### Intermediate Level API for SVRGModule

The only extra step in using an `SVRGModule`, compared to using a `Module`, is to check whether the current epoch
should update the full gradients over all data. The code snippet below demonstrates the suggested usage of
`SVRGModule` via the intermediate-level APIs.

```python
>>> # di is a data iterator (e.g. mx.io.NDArrayIter); model is a Symbol; both are defined elsewhere
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label'])
>>> mod.bind(data_shapes=di.provide_data, label_shapes=di.provide_label)
>>> mod.init_params()
>>> mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ))
>>> for epoch in range(num_epochs):
...     if epoch % mod.update_freq == 0:
...         mod.update_full_grads(di)
...     di.reset()
...     for batch in di:
...         mod.forward_backward(data_batch=batch)
...         mod.update()
```

### High Level API for SVRGModule

The high-level API usage of `SVRGModule` remains exactly the same as the Module API. The code snippet below gives an
example of the suggested usage.

```python
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label'])
>>> mod.fit(di, num_epochs=100, optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ))
```
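
Both snippets assume a data iterator `di` matching the data and label names above; a hypothetical toy setup (data,
shapes, and batch size chosen purely for illustration):

```python
>>> import numpy as np
>>> import mxnet as mx
>>> data = np.random.rand(1000, 10)
>>> label = np.random.rand(1000, 1)
>>> di = mx.io.NDArrayIter(data, label, batch_size=32, shuffle=True,
...                        label_name='lin_reg_label')
```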

## API reference

<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script>

```eval_rst
.. automodule:: mxnet.contrib.svrg_optimization.svrg_module
    :members: init_optimizer, _create_optimizer, bind, forward, backward, update, update_full_grads,
        _accumulate_kvstore, _allocate_gradients, _svrg_grads_update_rule, update_svrg_gradients, fit, prepare

.. automodule:: mxnet.contrib.svrg_optimization.svrg_optimizer.SVRGOptimizer
    :members: _check_params, update, create_state, _check_index

.. automodule:: mxnet.contrib.svrg_optimization.svrg_optimizer.AssignmentOptimizer
    :members: update
```
<script>auto_index("api-reference");</script>
20 changes: 20 additions & 0 deletions docs/api/python/module/module.md
@@ -58,6 +58,7 @@ The `module` package provides several modules:
BucketingModule
PythonModule
PythonLossModule
SVRGModule
```

We summarize the interface for each class in the following sections.
@@ -188,6 +189,23 @@ additional functionality. We summarize them in this section.
SequentialModule.add
```

### Class `SVRGModule`
SVRGModule is an extension of the Module API that implements the SVRG (Stochastic Variance Reduced Gradient)
optimization logic. A few extra functions are defined to assist the SVRG update; however, these functions are
encapsulated in Module's existing function calls and do not require explicit invocation by end users of the high-level
API. A conceptual sketch of the gradient these helpers compute follows the list below.

```eval_rst
.. autosummary::
    :nosignatures:

    SVRGModule.update_full_grads
    SVRGModule.update_svrg_gradients
    SVRGModule._svrg_grads_update_rule
    SVRGModule._accumulate_kvstore
    SVRGModule._allocate_gradients
    SVRGModule._create_optimizer
```
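
As noted above, a conceptual sketch of the variance-reduced gradient these helpers compute (illustrative only, not the
actual implementation; the function name is hypothetical):

```python
>>> def svrg_grad(grad_curr, grad_snapshot, full_grad_mean):
...     # gradient at the current parameters, minus the gradient at the
...     # snapshot parameters, plus the mean full gradient over all data
...     return grad_curr - grad_snapshot + full_grad_mean
```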

## API Reference

<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script>
@@ -205,6 +223,8 @@ additional functionality. We summarize them in this section.
:members:
.. autoclass:: mxnet.module.PythonLossModule
:members:
.. autoclass:: mxnet.contrib.svrg_optimization.SVRGModule
:members:
```

<script>auto_index("api-reference");</script>
