SVRG optimization in python/contrib package, this version supports single machine single cpu, single gpu and multi-gpus
StephanieYuan committed Sep 4, 2018
1 parent 6fdfd89 commit 4a2b644
Showing 23 changed files with 792 additions and 579 deletions.
39 changes: 0 additions & 39 deletions contrib/svrg_optimization_python/README.md

This file was deleted.

Empty file.
21 changes: 0 additions & 21 deletions contrib/svrg_optimization_python/tests/__init__.py

This file was deleted.

116 changes: 0 additions & 116 deletions contrib/svrg_optimization_python/tests/test_svrg_module.py

This file was deleted.

96 changes: 0 additions & 96 deletions contrib/svrg_optimization_python/tests/test_svrg_optimizer.py

This file was deleted.

80 changes: 80 additions & 0 deletions docs/api/python/contrib/svrg_optimization.md
@@ -0,0 +1,80 @@
# SVRG Optimization in Python Module API

## Overview
SVRG, which stands for Stochastic Variance Reduced Gradient, is an optimization technique that complements SGD. It
employs explicit variance reduction and converges much faster than SGD for smooth and strongly convex functions.

SVRG optimization is implemented as an `SVRGModule` in `mxnet.contrib.svrg_optimization`, an extension of the existing
`mxnet.module.Module` APIs that encapsulates the SVRG optimization logic in several new functions. For end users, the
API changes relative to the Module API are minimal.
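
`SVRGModule` can be imported from the contrib package; a minimal import, assuming the module path shown in the API
reference below:

```python
>>> from mxnet.contrib.svrg_optimization.svrg_module import SVRGModule
```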

The current `SVRGModule` implements the standard SVRG optimization technique described in _Accelerating Stochastic
Gradient Descent using Predictive Variance Reduction_ by computing the full gradients over all data every
`update_freq` epochs during training. The `SVRGModule` update rule is: the gradient w.r.t. the current parameters,
minus the gradient w.r.t. the parameters snapshotted at the last full-gradient epoch, plus the average of the
gradients over all data.
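
Written as an equation (notation follows the paper and is introduced here only for illustration), where the snapshot
parameters and the full-gradient average are taken at the last full-gradient epoch:

```eval_rst
.. math::

    w_t = w_{t-1} - \eta \left( \nabla f_{i_t}(w_{t-1}) - \nabla f_{i_t}(\tilde{w}) + \tilde{\mu} \right),
    \qquad
    \tilde{\mu} = \frac{1}{n} \sum_{i=1}^{n} \nabla f_i(\tilde{w})
```

The term in parentheses is the variance-reduced gradient that is applied in place of the plain stochastic gradient.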

`SVRGOptimizer` wraps two optimizers: an `AssignmentOptimizer`, which is used to accumulate the full gradients in the
KVStore, and a regular optimizer, which is specified as a parameter to `mod.init_optimizer()`.
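
From the end user's perspective only the regular optimizer is named; a minimal sketch, assuming a local KVStore
(`SVRGOptimizer` and `AssignmentOptimizer` are constructed internally by `SVRGModule`):

```python
>>> mod.init_optimizer(kvstore='local', optimizer='sgd',
...                    optimizer_params=(('learning_rate', 0.025),))
```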

```eval_rst
.. warning:: This package contains experimental APIs and may change in the near future.
```

This document lists the svrg_optimization APIs in mxnet:

```eval_rst
.. autosummary::
    :nosignatures:

    mxnet.contrib.svrg_optimization.SVRGModule
    mxnet.contrib.svrg_optimization.SVRGOptimizer
```

### Intermediate Level API for SVRGModule

The only extra step in using an `SVRGModule`, compared to using a `Module`, is to check whether the current epoch
should update the full gradients over all data. The code snippet below demonstrates the suggested usage of
`SVRGModule` via the intermediate-level APIs.

```python
>>> # di is a data iterator (e.g. mx.io.NDArrayIter); model is a Symbol; both are defined elsewhere
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label'])
>>> mod.bind(data_shapes=di.provide_data, label_shapes=di.provide_label)
>>> mod.init_params()
>>> mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ))
>>> for epoch in range(num_epochs):
...     if epoch % mod.update_freq == 0:
...         mod.update_full_grads(di)
...     di.reset()
...     for batch in di:
...         mod.forward_backward(data_batch=batch)
...         mod.update()
```

### High Level API for SVRGModule

The high-level API usage of `SVRGModule` remains exactly the same as the Module API. The code snippet below gives an
example of the suggested usage.

```python
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label'])
>>> mod.fit(di, num_epochs=100, optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ))
```
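
Both snippets assume a data iterator `di` matching the data and label names above; a hypothetical toy setup (data,
shapes, and batch size chosen purely for illustration):

```python
>>> import numpy as np
>>> import mxnet as mx
>>> data = np.random.rand(1000, 10)
>>> label = np.random.rand(1000, 1)
>>> di = mx.io.NDArrayIter(data, label, batch_size=32, shuffle=True,
...                        label_name='lin_reg_label')
```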

## API reference

<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script>

```eval_rst
.. automodule:: mxnet.contrib.svrg_optimization.svrg_module
    :members: init_optimizer, _create_optimizer, bind, forward, backward, update, update_full_grads,
        _accumulate_kvstore, _allocate_gradients, _svrg_grads_update_rule, update_svrg_gradients, fit, prepare

.. automodule:: mxnet.contrib.svrg_optimization.svrg_optimizer.SVRGOptimizer
    :members: _check_params, update, create_state, _check_index

.. automodule:: mxnet.contrib.svrg_optimization.svrg_optimizer.AssignmentOptimizer
    :members: update
```
<script>auto_index("api-reference");</script>
20 changes: 20 additions & 0 deletions docs/api/python/module/module.md
@@ -58,6 +58,7 @@ The `module` package provides several modules:
BucketingModule
PythonModule
PythonLossModule
SVRGModule
```

We summarize the interface for each class in the following sections.
@@ -188,6 +189,23 @@ additional functionality. We summarize them in this section.
SequentialModule.add
```

### Class `SVRGModule`
SVRGModule is an extension of the Module API that implements the SVRG (Stochastic Variance Reduced Gradient)
optimization logic. A few extra functions are defined to assist the SVRG update; however, these functions are
encapsulated in Module's existing function calls and do not require explicit invocation by end users of the high-level
API. A conceptual sketch of the gradient these helpers compute follows the list below.

```eval_rst
.. autosummary::
    :nosignatures:

    SVRGModule.update_full_grads
    SVRGModule.update_svrg_gradients
    SVRGModule._svrg_grads_update_rule
    SVRGModule._accumulate_kvstore
    SVRGModule._allocate_gradients
    SVRGModule._create_optimizer
```
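
As noted above, a conceptual sketch of the variance-reduced gradient these helpers compute (illustrative only, not the
actual implementation; the function name is hypothetical):

```python
>>> def svrg_grad(grad_curr, grad_snapshot, full_grad_mean):
...     # gradient at the current parameters, minus the gradient at the
...     # snapshot parameters, plus the mean full gradient over all data
...     return grad_curr - grad_snapshot + full_grad_mean
```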

## API Reference

<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script>
@@ -205,6 +223,8 @@ additional functionality. We summarize them in this section.
:members:
.. autoclass:: mxnet.module.PythonLossModule
:members:
.. autoclass:: mxnet.contrib.svrg_optimization.SVRGModule
:members:
```

<script>auto_index("api-reference");</script>
