This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-1255] update hybridize documentation #13597

Merged
merged 5 commits into from
Jan 7, 2019
Changes from 4 commits
109 changes: 107 additions & 2 deletions docs/tutorials/gluon/hybrid.md
@@ -1,6 +1,9 @@
# Hybrid - Faster training and easy deployment

*Note: a newer version is available [here](http://gluon.mxnet.io/chapter07_distributed-learning/hybridize.html).*
*Related Contents:*
Contributor

nit: Content instead of Contents (but don't restart CI just for that change).

Member Author

I added a link to the new dive into deeplearning book so made it "contents"

Contributor

It's a semantic thing I guess. There's probably a rule here but... usually you can say Content is plural already (as it refers to a mass of stuff) and you don't usually say "Related Contents". It's more common to say "Related Content" even if there's more than one item in the list.
It is weird because you do say "Table of Contents", not "Table of Content".
🤷‍♂️ English is weird. It's not a showstopper... it's fine whichever way you want to do it. (but do it my way). Jk

Contributor

Suggested change
- *Related Contents:*
+ *Related Content:*

* [Fast, portable neural networks with Gluon HybridBlocks](https://gluon.mxnet.io/chapter07_distributed-learning/hybridize.html)
* [A Hybrid of Imperative and Symbolic Programming](http://en.diveintodeeplearning.org/chapter_computational-performance/hybridize.html)

Deep learning frameworks can be roughly divided into two categories: declarative
and imperative. With declarative frameworks (including TensorFlow, Theano, etc.)
@@ -137,4 +140,106 @@ to gluon with `SymbolBlock`:
net2 = gluon.SymbolBlock.imports('model-symbol.json', ['data'], 'model-0001.params')
```

## Operators that do not work with hybridize

If you want to hybridize your model, you must use `F.some_operator` in your `hybrid_forward` function.
`F` is `mxnet.nd` before you hybridize and `mxnet.sym` after you hybridize. While most APIs are the same in NDArray and Symbol, there are some differences, so writing `F.some_operator` and calling `hybridize` does not always work.
Here we list some frequently used NDArray APIs that can't be hybridized and provide workarounds for them.

### Element-wise Operators

In the NDArray API, the following arithmetic and comparison operators broadcast automatically if the input NDArrays have different shapes.
That is not the case in the Symbol API. Symbols are not broadcast automatically, so you have to explicitly use a separate set of broadcast operators for Symbols that are expected to have different shapes.


| NDArray APIs | Description |
|---|---|
| [*NDArray.\__add\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__add__) | x.\__add\__(y) <=> x+y <=> mx.nd.add(x, y) |
| [*NDArray.\__sub\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__sub__) | x.\__sub\__(y) <=> x-y <=> mx.nd.subtract(x, y) |
| [*NDArray.\__mul\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__mul__) | x.\__mul\__(y) <=> x*y <=> mx.nd.multiply(x, y) |
| [*NDArray.\__div\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__div__) | x.\__div\__(y) <=> x/y <=> mx.nd.divide(x, y) |
| [*NDArray.\__mod\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__mod__) | x.\__mod\__(y) <=> x%y <=> mx.nd.modulo(x, y) |
| [*NDArray.\__lt\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__lt__) | x.\__lt\__(y) <=> x<y <=> mx.nd.lesser(x, y) |
| [*NDArray.\__le\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__le__) | x.\__le\__(y) <=> x<=y <=> mx.nd.lesser_equal(x, y) |
| [*NDArray.\__gt\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__gt__) | x.\__gt\__(y) <=> x>y <=> mx.nd.greater(x, y) |
| [*NDArray.\__ge\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__ge__) | x.\__ge\__(y) <=> x>=y <=> mx.nd.greater_equal(x, y) |
| [*NDArray.\__eq\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__eq__) | x.\__eq\__(y) <=> x==y <=> mx.nd.equal(x, y) |
| [*NDArray.\__ne\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__ne__) | x.\__ne\__(y) <=> x!=y <=> mx.nd.not_equal(x, y) |

The current workaround is to use the corresponding broadcast operators for arithmetic and comparison, which avoids hybridization failures when the input shapes differ.

| Symbol APIs | Description |
|---|---|
|[*broadcast_add*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_add) | Returns element-wise sum of the input arrays with broadcasting. |
|[*broadcast_sub*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_sub) | Returns element-wise difference of the input arrays with broadcasting. |
|[*broadcast_mul*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_mul) | Returns element-wise product of the input arrays with broadcasting. |
|[*broadcast_div*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_div) | Returns element-wise division of the input arrays with broadcasting. |
|[*broadcast_mod*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_mod) | Returns element-wise modulo of the input arrays with broadcasting. |
|[*broadcast_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_equal) | Returns the result of element-wise *equal to* (==) comparison operation with broadcasting. |
|[*broadcast_not_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_not_equal) | Returns the result of element-wise *not equal to* (!=) comparison operation with broadcasting. |
|[*broadcast_greater*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_greater) | Returns the result of element-wise *greater than* (>) comparison operation with broadcasting. |
|[*broadcast_greater_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_greater_equal) | Returns the result of element-wise *greater than or equal to* (>=) comparison operation with broadcasting. |
|[*broadcast_lesser*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_lesser) | Returns the result of element-wise *less than* (<) comparison operation with broadcasting. |
|[*broadcast_lesser_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_lesser_equal) | Returns the result of element-wise *less than or equal to* (<=) comparison operation with broadcasting. |

For example, if you want to add an NDArray to your input `x`, use `broadcast_add` instead of `+`:

```python
def hybrid_forward(self, F, x):
    # avoid writing: return x + F.ones((1, 1))
    return F.broadcast_add(x, F.ones((1, 1)))
```

If you used `+`, it would still work before hybridization, but it would throw a shape mismatch error after hybridization.

### Shape

Gluon's imperative interface is very flexible and allows you to print the shape of an NDArray. However, a Symbol does not have a `shape` attribute. As a result, you need to avoid printing shapes in `hybrid_forward`.
Otherwise, you will get the following error:
```bash
AttributeError: 'Symbol' object has no attribute 'shape'
```

### Slice
`[]` in NDArray is used to get a slice from the array. However, `[]` in Symbol is used to get an output from a grouped symbol.
For example, you will get different results for the following method before and after hybridization.

```python
def hybrid_forward(self, F, x):
    return x[0]
```

The current workaround is to explicitly call [`slice`](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.slice) or [`slice_axis`](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.slice_axis) operators in `hybrid_forward`.


### Not implemented operators

Some of the often used operators in NDArray are not implemented in Symbol, and will cause hybridization failure
Contributor

Suggested change
- Some of the often used operators in NDArray are not implemented in Symbol, and will cause hybridization failure
+ Some of the often used operators in NDArray are not implemented in Symbol, and will cause hybridization failure.


#### NDArray.asnumpy
Symbol does not support `asnumpy` function, you need to avoid calling `asnumpy` in `hybrid_forward`.
Contributor

Suggested change
- Symbol does not support `asnumpy` function, you need to avoid calling `asnumpy` in `hybrid_forward`.
+ Symbol does not support the `asnumpy` function. You need to avoid calling `asnumpy` in `hybrid_forward`.


#### Array creation APIs

`mx.nd.array()` is used a lot, but Symbol does not have the `array` API. The current workaround is to use `F.ones`, `F.zeros`, or `F.full`, which exist in both the NDArray and Symbol APIs.

#### In-Place Arithmetic Operators

In-place arithmetic operators may be used in Gluon imperative mode. However, if you expect to hybridize, you should write these operations explicitly instead.
For example, write `x = x + y` rather than `x += y`; otherwise you will get a `NotImplementedError`. This applies to all of the following operators:

| NDArray in-place arithmetic operators | Description |
|---|---|
|[*NDArray.\__iadd\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__iadd__) | x.\__iadd\__(y) <=> x+=y |
|[*NDArray.\__isub\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__isub__) | x.\__isub\__(y) <=> x-=y |
|[*NDArray.\__imul\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__imul__) | x.\__imul\__(y) <=> x*=y |
|[*NDArray.\__idiv\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__idiv__) | x.\__idiv\__(y) <=> x/=y |
|[*NDArray.\__imod\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__imod__) | x.\__imod\__(y) <=> x%=y |



## Summary

The recommended practice is to use the flexibility of the imperative NDArray API during experimentation. Once you have finalized your model, make the necessary changes described above so that you can call `hybridize` to improve performance.

<!-- INSERT SOURCE DOWNLOAD BUTTONS -->