[MXNET-1255] update hybridize documentation (apache#13597)
* update hybridize documentation

* address review comments

* improve doc

* address comments

* address comments
roywei authored and haohuw committed Jun 23, 2019
1 parent 3152c73 commit a2c5bdb
Showing 1 changed file with 107 additions and 2 deletions.
109 changes: 107 additions & 2 deletions docs/tutorials/gluon/hybrid.md
@@ -1,6 +1,9 @@
# Hybrid - Faster training and easy deployment

*Note: a newer version is available [here](http://gluon.mxnet.io/chapter07_distributed-learning/hybridize.html).*
*Related Content:*
* [Fast, portable neural networks with Gluon HybridBlocks](https://gluon.mxnet.io/chapter07_distributed-learning/hybridize.html)
* [A Hybrid of Imperative and Symbolic Programming](http://en.diveintodeeplearning.org/chapter_computational-performance/hybridize.html)

Deep learning frameworks can be roughly divided into two categories: declarative
and imperative. With declarative frameworks (including Tensorflow, Theano, etc)
@@ -137,4 +140,106 @@ to gluon with `SymbolBlock`:
net2 = gluon.SymbolBlock.imports('model-symbol.json', ['data'], 'model-0001.params')
```

<!-- INSERT SOURCE DOWNLOAD BUTTONS -->
## Operators that do not work with hybridize

If you want to hybridize your model, you must use `F.some_operator` in your `hybrid_forward` function.
`F` will be `mxnet.nd` before you hybridize and `mxnet.sym` after you hybridize. While most APIs are the same in NDArray and Symbol, there are some differences, so writing `F.some_operator` and calling `hybridize` may not always work.
Here we list some frequently used NDArray APIs that can't be hybridized and provide workarounds.

### Element-wise Operators

In the NDArray API, the following arithmetic and comparison operators automatically broadcast their inputs if the input NDArrays have different shapes.
The Symbol API does not broadcast automatically; for Symbols that are expected to have different shapes, you have to explicitly use a separate set of broadcast operators.


| NDArray APIs | Description |
|---|---|
| [*NDArray.\__add\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__add__) | x.\__add\__(y) <=> x+y <=> mx.nd.add(x, y) |
| [*NDArray.\__sub\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__sub__) | x.\__sub\__(y) <=> x-y <=> mx.nd.subtract(x, y) |
| [*NDArray.\__mul\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__mul__) | x.\__mul\__(y) <=> x*y <=> mx.nd.multiply(x, y) |
| [*NDArray.\__div\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__div__) | x.\__div\__(y) <=> x/y <=> mx.nd.divide(x, y) |
| [*NDArray.\__mod\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__mod__) | x.\__mod\__(y) <=> x%y <=> mx.nd.modulo(x, y) |
| [*NDArray.\__lt\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__lt__) | x.\__lt\__(y) <=> x<y <=> mx.nd.lesser(x, y) |
| [*NDArray.\__le\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__le__) | x.\__le\__(y) <=> x<=y <=> mx.nd.lesser_equal(x, y) |
| [*NDArray.\__gt\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__gt__) | x.\__gt\__(y) <=> x>y <=> mx.nd.greater(x, y) |
| [*NDArray.\__ge\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__ge__) | x.\__ge\__(y) <=> x>=y <=> mx.nd.greater_equal(x, y)|
| [*NDArray.\__eq\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__eq__) | x.\__eq\__(y) <=> x==y <=> mx.nd.equal(x, y) |
| [*NDArray.\__ne\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__ne__) | x.\__ne\__(y) <=> x!=y <=> mx.nd.not_equal(x, y) |

The current workaround is to use corresponding broadcast operators for arithmetic and comparison to avoid potential hybridization failure when input shapes are different.

| Symbol APIs | Description |
|---|---|
|[*broadcast_add*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_add) | Returns element-wise sum of the input arrays with broadcasting. |
|[*broadcast_sub*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_sub) | Returns element-wise difference of the input arrays with broadcasting. |
|[*broadcast_mul*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_mul) | Returns element-wise product of the input arrays with broadcasting. |
|[*broadcast_div*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_div) | Returns element-wise division of the input arrays with broadcasting. |
|[*broadcast_mod*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_mod) | Returns element-wise modulo of the input arrays with broadcasting. |
|[*broadcast_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_equal) | Returns the result of element-wise *equal to* (==) comparison operation with broadcasting. |
|[*broadcast_not_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_not_equal) | Returns the result of element-wise *not equal to* (!=) comparison operation with broadcasting. |
|[*broadcast_greater*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_greater) | Returns the result of element-wise *greater than* (>) comparison operation with broadcasting. |
|[*broadcast_greater_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_greater_equal) | Returns the result of element-wise *greater than or equal to* (>=) comparison operation with broadcasting. |
|[*broadcast_lesser*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_lesser) | Returns the result of element-wise *less than* (<) comparison operation with broadcasting. |
|[*broadcast_lesser_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_lesser_equal) | Returns the result of element-wise *less than or equal to* (<=) comparison operation with broadcasting. |

For example, if you want to add an NDArray to your input `x`, use `broadcast_add` instead of `+`:

```python
def hybrid_forward(self, F, x):
    # avoid writing: return x + F.ones((1, 1))
    return F.broadcast_add(x, F.ones((1, 1)))
```

If you used `+`, it would still work before hybridization, but it would throw a shape mismatch error after hybridization.

### Shape

Gluon's imperative interface is very flexible and allows you to print the shape of an NDArray. However, Symbol does not have a `shape` attribute. As a result, you need to avoid accessing or printing shapes in `hybrid_forward`.
Otherwise, you will get the following error:
```bash
AttributeError: 'Symbol' object has no attribute 'shape'
```

### Slice
`[]` on an NDArray returns a slice of the array. However, `[]` on a Symbol returns one output of a grouped symbol.
As a result, the following method gives different results before and after hybridization.

```python
def hybrid_forward(self, F, x):
    return x[0]
```

The current workaround is to explicitly call [`slice`](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.slice) or [`slice_axis`](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.slice_axis) operators in `hybrid_forward`.


### Not implemented operators

Some frequently used NDArray operators are not implemented in Symbol and will cause hybridization to fail.

#### NDArray.asnumpy
Symbol does not support the `asnumpy` function. You need to avoid calling `asnumpy` in `hybrid_forward`.

#### Array creation APIs

`mx.nd.array()` is used a lot, but Symbol does not have the `array` API. The current workaround is to use `F.ones`, `F.zeros`, or `F.full`, which exist in both the NDArray and Symbol APIs.

#### In-Place Arithmetic Operators

In-place arithmetic operators can be used in Gluon imperative mode. However, if you expect to hybridize, you should write these operations explicitly instead.
For example, avoid writing `x += y` and use `x = x + y`; otherwise you will get a `NotImplementedError`. This applies to all of the following operators:

| NDArray in-place arithmetic operators | Description |
|---|---|
|[*NDArray.\__iadd\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__iadd__) | x.\__iadd\__(y) <=> x+=y |
|[*NDArray.\__isub\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__isub__) | x.\__isub\__(y) <=> x-=y |
|[*NDArray.\__imul\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__imul__) | x.\__imul\__(y) <=> x*=y |
|[*NDArray.\__idiv\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__idiv__) | x.\__idiv\__(y) <=> x/=y |
|[*NDArray.\__imod\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__imod__) | x.\__imod\__(y) <=> x%=y |



## Summary

The recommended practice is to use the flexibility of the imperative NDArray API during experimentation. Once you have finalized your model, make the necessary changes described above so that you can call the `hybridize` function to improve performance.

<!-- INSERT SOURCE DOWNLOAD BUTTONS -->
