diff --git a/doc/conf.py b/doc/conf.py
index 9838b809bc31..ea2dbfbd3db4 100644
--- a/doc/conf.py
+++ b/doc/conf.py
@@ -126,7 +126,7 @@
 
 # The theme to use for HTML and HTML Help pages.  See the documentation for
 # a list of builtin themes.
-# html_theme = 'alabaster'
+html_theme = 'sphinx_rtd_theme'
 
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
diff --git a/doc/python/io.md b/doc/python/io.md
index 7bff6a83e354..933945563339 100644
--- a/doc/python/io.md
+++ b/doc/python/io.md
@@ -1,11 +1,11 @@
-Python IO API
-===================
+## Data Input and Output
+
 Mxnet handles IO for you by implementing data iterators.
 It is like an iterable class in python, you can traverse the data using a for loop.
 
-IO API Reference
-----------------------
+## IO API Reference
+
 ```eval_rst
 .. automodule:: mxnet.io
    :members:
 ```
diff --git a/doc/python/kvstore.md b/doc/python/kvstore.md
new file mode 100644
index 000000000000..ffd3835b2f0e
--- /dev/null
+++ b/doc/python/kvstore.md
@@ -0,0 +1,3 @@
+# Distributed Key-value Store
+
+TODO
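+
+Until this section is written, here is a minimal sketch of the intended usage.
+It is an assumption based on the one-line summary in the Python guide (data
+synchronization across multiple GPUs and machines); the names
+`mx.kvstore.create`, `init`, `push`, and `pull` are placeholders, not a
+confirmed API:
+
+```python
+>>> import mxnet as mx
+>>> kv = mx.kvstore.create()           # assumed factory for a key-value store
+>>> shape = (2, 3)
+>>> kv.init(3, mx.nd.ones(shape))      # register key 3 with an initial value
+>>> kv.push(3, mx.nd.ones(shape) * 8)  # push a new value for key 3
+>>> a = mx.nd.zeros(shape)
+>>> kv.pull(3, out=a)                  # pull the current value of key 3 into a
+>>> print a.asnumpy()
+[[ 8.  8.  8.]
+ [ 8.  8.  8.]]
+```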
diff --git a/doc/python/narray.md b/doc/python/narray.md
deleted file mode 100644
index e4befbb908ba..000000000000
--- a/doc/python/narray.md
+++ /dev/null
@@ -1,12 +0,0 @@
-Python NArray API
-=================
-NArray is the basic computation element in mxnet.
-It is like numpy.ndarray, but comes with two unique features *gpu execution* and *dependency scheduling*.
-
-
-NArray API Reference
---------------------
-```eval_rst
-.. automodule:: mxnet.narray
-   :members:
-```
diff --git a/doc/python/ndarray.md b/doc/python/ndarray.md
new file mode 100644
index 000000000000..0a342f60428b
--- /dev/null
+++ b/doc/python/ndarray.md
@@ -0,0 +1,209 @@
+# NDArray: Numpy style tensor computations on CPU/GPU
+
+`NDArray` is the basic operation unit in MXNet for matrix and tensor
+computations. It is similar to `numpy.ndarray`, but with two additional
+features:
+
+1. **multiple devices**: all operations can be run on various devices including
+CPU and GPU
+2. **automatic parallelization**: all operations are automatically executed in
+   parallel with each other
+
+## Create and Initialization
+
+We can create an `NDArray` on either CPU or GPU:
+
+```python
+>>> import mxnet as mx
+>>> a = mx.nd.empty((2, 3)) # create a 2-by-3 matrix on cpu
+>>> b = mx.nd.empty((2, 3), mx.gpu()) # create a 2-by-3 matrix on gpu 0
+>>> c = mx.nd.empty((2, 3), mx.gpu(2)) # create a 2-by-3 matrix on gpu 2
+>>> c.shape # get shape
+(2L, 3L)
+>>> c.context # get device info
+Context(device_type=gpu, device_id=2)
+```
+
+They can be initialized in various ways:
+
+```python
+>>> a = mx.nd.zeros((2, 3)) # create a 2-by-3 matrix filled with 0
+>>> b = mx.nd.ones((2, 3))  # create a 2-by-3 matrix filled with 1
+>>> b[:] = 2 # assign all elements of b the value 2
+```
+
+We can copy the value from one `NDArray` to another, even if they sit on
+different devices:
+
+```python
+>>> a = mx.nd.ones((2, 3))
+>>> b = mx.nd.zeros((2, 3), mx.gpu())
+>>> a.copyto(b) # copy data from cpu to gpu
+```
+
+We can also convert an `NDArray` to a `numpy.ndarray`:
+
+```python
+>>> a = mx.nd.ones((2, 3))
+>>> b = a.asnumpy()
+>>> type(b)
+<type 'numpy.ndarray'>
+>>> print b
+[[ 1.  1.  1.]
+ [ 1.  1.  1.]]
+```
+
+and vice versa:
+
+```python
+>>> import numpy as np
+>>> a = mx.nd.empty((2, 3))
+>>> a[:] = np.random.uniform(-0.1, 0.1, a.shape)
+>>> print a.asnumpy()
+[[-0.06821112 -0.03704893  0.06688045]
+ [ 0.09947646 -0.07700162  0.07681718]]
+```
+
+## Basic Operations
+
+### Element-wise operations
+
+By default, `NDArray` performs element-wise operations:
+
+```python
+>>> a = mx.nd.ones((2, 3)) * 2
+>>> b = mx.nd.ones((2, 3)) * 4
+>>> print b.asnumpy()
+[[ 4.  4.  4.]
+ [ 4.  4.  4.]]
+>>> c = a + b
+>>> print c.asnumpy()
+[[ 6.  6.  6.]
+ [ 6.  6.  6.]]
+>>> d = a * b
+>>> print d.asnumpy()
+[[ 8.  8.  8.]
+ [ 8.  8.  8.]]
+```
+
+If two `NDArray`s sit on different devices, we need to explicitly move them
+onto the same one. The following example performs computations on GPU 0:
+
+```python
+>>> a = mx.nd.ones((2, 3)) * 2
+>>> b = mx.nd.ones((2, 3), mx.gpu()) * 3
+>>> c = a.copyto(mx.gpu()) * b
+>>> print c.asnumpy()
+[[ 6.  6.  6.]
+ [ 6.  6.  6.]]
+```
+
+### Indexing
+
+TODO
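+
+Until this section is written, note that the forms already used above are a
+safe subset: full-array assignment with `a[:] = value` and reading values out
+through `asnumpy()`. A minimal sketch of element access by that route (direct
+element or fancy indexing on an `NDArray` itself is not assumed here):
+
+```python
+>>> a = mx.nd.empty((2, 3))
+>>> a[:] = 2                # full-array assignment, as shown above
+>>> b = a.asnumpy()         # read values out through numpy
+>>> b[0, 1]
+2.0
+>>> b[0, 1] = 5             # modify in numpy ...
+>>> a[:] = b                # ... and write the whole array back
+```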
+
+### Linear Algebra
+
+TODO
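+
+Until this section is written, a workable fallback is to round-trip through
+`numpy` using the conversion functions shown above, at the cost of losing
+MXNet's automatic parallelization for that step:
+
+```python
+>>> import numpy as np
+>>> a = mx.nd.ones((2, 3))
+>>> b = mx.nd.ones((3, 2))
+>>> c = mx.nd.empty((2, 2))
+>>> c[:] = np.dot(a.asnumpy(), b.asnumpy()) # matrix product computed in numpy
+>>> print c.asnumpy()
+[[ 3.  3.]
+ [ 3.  3.]]
+```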
+
+## Load and Save
+
+There are two easy ways to save data to (and load it from) disk. The first way
+uses `pickle`. `NDArray` is pickle compatible, which means you can simply
+pickle an `NDArray` just as you would a `numpy.ndarray`.
+
+```python
+>>> import mxnet as mx
+>>> import pickle as pkl
+
+>>> a = mx.nd.ones((2, 3)) * 2
+>>> data = pkl.dumps(a)
+>>> b = pkl.loads(data)
+>>> print b.asnumpy()
+[[ 2.  2.  2.]
+ [ 2.  2.  2.]]
+```
+
+The second way is to directly dump a list of `NDArray` to disk in binary
+format.
+
+```python
+>>> a = mx.nd.ones((2,3))*2
+>>> b = mx.nd.ones((2,3))*3
+>>> mx.nd.save('mydata.bin', [a, b])
+>>> c = mx.nd.load('mydata.bin')
+>>> print c[0].asnumpy()
+[[ 2.  2.  2.]
+ [ 2.  2.  2.]]
+>>> print c[1].asnumpy()
+[[ 3.  3.  3.]
+ [ 3.  3.  3.]]
+```
+
+We can also dump a dict:
+
+```python
+>>> mx.nd.save('mydata.bin', {'a':a, 'b':b})
+>>> c = mx.nd.load('mydata.bin')
+>>> print c['a'].asnumpy()
+[[ 2.  2.  2.]
+ [ 2.  2.  2.]]
+>>> print c['b'].asnumpy()
+[[ 3.  3.  3.]
+ [ 3.  3.  3.]]
+```
+
+In addition, if MXNet is built with support for distributed filesystems such as
+S3 and HDFS, we can directly save to and load from them. For example:
+
+```python
+>>> mx.nd.save('s3://mybucket/mydata.bin', [a,b])
+>>> mx.nd.save('hdfs:///users/myname/mydata.bin', [a,b])
+```
+
+## Parallelization
+
+The operations of `NDArray` are executed by third-party libraries such as
+`cblas`, `mkl`, and `cuda`. By default, each operation is executed with
+multiple threads. In addition, `NDArray` can execute operations in parallel
+with each other, which is desirable when we use multiple resources such as
+CPUs, GPU cards, and CPU-to-GPU memory bandwidth.
+
+For example, if we write `a += 1` followed by `b += 1`, and `a` is on CPU while
+`b` is on GPU, then we want to execute them in parallel to improve
+efficiency. Furthermore, data copies between CPU and GPU are also expensive, so
+we hope to run them in parallel with other computations as well.
+
+However, finding by eye which code can be executed in parallel is hard. In the
+following example, `a += 1` and `c *= 3` can be executed in parallel, but
+`a += 1` and `b *= 3` must run sequentially, because `b` refers to the same
+`NDArray` as `a`.
+
+```python
+a = mx.nd.ones((2,3))
+b = a
+c = a.copyto(mx.cpu())
+a += 1
+b *= 3
+c *= 3
+```
+
+Luckily, MXNet can automatically resolve the dependencies and execute
+operations in parallel with correctness guaranteed. In other words, we can
+write a program as if it were single-threaded, and MXNet will automatically
+dispatch it to multiple devices, such as multiple GPU cards or multiple
+machines.
+
+This is achieved by lazy evaluation. Each operation we write down is issued to
+an internal DAG engine, and the call returns immediately. For example, if we
+run `a += 1`, it returns immediately after pushing the plus operator to the
+engine. This asynchrony allows us to push more operators to the engine, so it
+can determine the read and write dependencies and find the best way to execute
+them in parallel.
+
+The actual computations are finished when we copy the results somewhere else,
+such as `print a.asnumpy()` or `mx.nd.save('a.bin', [a])`. Therefore, if we
+want to write highly parallelized code, we only need to postpone asking for
+the results.
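+
+The following sketch makes the laziness visible. The timing behavior is
+machine-dependent and described here only for illustration; as explained
+above, only `asnumpy()` is used to force synchronization:
+
+```python
+import time
+
+a = mx.nd.ones((1000, 1000))
+tic = time.time()
+for i in range(100):
+    a += 1                      # returns immediately; the work is only queued
+print 'issued in %f sec' % (time.time() - tic)   # small: nothing waited on yet
+b = a.asnumpy()                 # blocks until all queued operations finish
+print 'done in %f sec' % (time.time() - tic)     # includes the actual compute
+```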
+
+## NDArray API
+
+```eval_rst
+.. automodule:: mxnet.ndarray
+    :members:
+```
diff --git a/doc/python/python_api.md b/doc/python/python_api.md
deleted file mode 100644
index e79a3519b83e..000000000000
--- a/doc/python/python_api.md
+++ /dev/null
@@ -1,10 +0,0 @@
-Python API Reference
-====================
-This page gives the Python API reference of mxnet.
-
-Symbolic Interface
-------------------
-```eval_rst
-.. automodule:: mxnet.symbol
-   :members:
-```
\ No newline at end of file
diff --git a/doc/python/python_guide.md b/doc/python/python_guide.md
index 29ec50519c3a..440131eacb3c 100644
--- a/doc/python/python_guide.md
+++ b/doc/python/python_guide.md
@@ -1,349 +1,40 @@
-MXNet Python Guide
-==================
-This page gives a general overvie of MXNet python package.
-MXNet contains a mixed flavor of elements you might need to bake flexible and efficient applications.
-There are two major components in MXNet:
+# MXNet Python Guide
-* Numpy style [NArray API](#getting-started-with-narray) that
-offers matrix and tensor computations on both CPU and GPU, and atomatically parallelize the computation for you;
-* [Symbolic API](#symbolic-api-and-differentiation) that allows you define a computation graph(configure a neural network),
-  and automatically gradient for you.
+This page gives a general overview of the MXNet Python package. MXNet contains
+a mixed flavor of elements you might need to bake flexible and efficient
+applications. There are mainly three concepts in MXNet:
-We aim to cover a taste of each flavor in this page.
-You are welcomed to also take look at the API reference page Listed in below, or direct skip to next section.
+* Numpy style `NDArray` offers matrix and tensor computations on both CPU and
+GPU, with automatic parallelization.
-List of Python Documents
-------------------------
-* [NArray API](narray.md)
-* [Data Loading API](io.md)
-* [Symbolic API](symbol.md)
+* `Symbol` makes defining a neural network extremely easy, and it provides
+  automatic differentiation.
-Getting Started with NArray
----------------------------
-The basic operation unit in MXNet is ```NArray```.
-NArray is basically same as ```numpy.ndarray``` in python,
-with two additional features: ***multiple device computation*** and ***automatic parallelism***.
-
-### Create NArray and Basics
-You can create ```NArray``` in both GPU and GPU, and get the shape of NArray.
-```python
-import mxnet as mx
+
-cpu_array = mx.narray.create((10, 10))
-gpu_array = mx.narray.create((10, 10), mx.Context('gpu', 0))
-print(cpu_array.shape)
-```
+* `KVStore` allows easy data synchronization across multiple GPUs and
+  multiple machines.
-If the NArray sits on CPU, we can get a ```numpy.ndarray``` equivalence as follows
-```python
-numpy_array = cpu_array.numpy
-cpu_array.numpy[:] = 10
-print(cpu_array.numpy)
-```
-Of course, NArray itself support basic computations such as elementwise operations.
-The following example adds two narray together, and creates a new ```NArray```.
-
-```python
-a = mx.narray.create((10, 10))
-b = mx.narray.create((10, 10))
-a.numpy[:] = 10
-b.numpy[:] = 20
-c = a + b
-print(c.numpy)
-```
-
-Now we know how to create and manipulate NArrays. If we have some data on CPU,
-how can we make use of GPU and help us to speedup computations? You can use
-the copy function to copy NArray between devices, like the following example.
-```python
-cpu_array = mx.narray.create((10, 10))
-gpu_array = mx.narray.create((10, 10), mx.Context('gpu', 0))
-cpu_array.numpy[:] = 1
-
-# copy to an allocated GPU array
-cpu_array.copyto(gpu_array)
-
-# create a new copy of NArray on GPU 0
-gpu_array2 = cpu_array.copyto(mx.Context('gpu', 0))
-
-# do some operations on GPU, the result will be on same device.
-gpu_array3 = gpu_array2 + 1.0
-
-# copy back to CPU
-gpu_array3.copyto(cpu_array)
-
-# print the result
-print(cpu_array.numpy)
-```
-
-In common workflow, it is encouraged to copy the data into a GPU NArray,
-do as much as computation as you can, and copy it back to CPU.
-Besides the NArrays that are explicitly created, the computation will
-generate result NArray that are sit on the same device.
-
-It is important to note that mxnet do not support arthematic inputs
-from two different devices. You need to insert a copyto explicitly
-to do the computation, like showed in the following example.
-```python
-cpu_array = mx.narray.ones((10, 10))
-gpu_array = mx.narray.create((10, 10), mx.Context('gpu', 0))
-gpu_array2 = gpu_array + cpu_array.copyto(gpu_array.context)
-```
-
-We made this choice because the copy between devices creates additional overhead.
-The current API makes the copy cost transparent to the user.
-
-### Automatically Parallelizing Computation
-So far you have learnt the basics of NArray, hope you like the flavor so far.
-In machine learning scenarios, it is very common that we can have parallel
-computation path, where computation can run concurrently. For example, in the following code,
-```a = a + 1``` and ```b = b + 1``` can run in parallel.
-```python
-a = mx.narray.create((10, 10))
-b = mx.narray.create((10, 10))
-a.numpy[:] = 10
-b.numpy[:] = 20
-a = a + 1
-b = b + 1
-c = a + b
-```
-This might be a toy example, but real usecases exists, for example when we want to parallel run
-neural net computation on four GPUs. Sometimes we can do this by manually creating threads
-and have each of the thread drive the computation.
-
-However, it is really non-trivial task to synchronize between threads.
-Even in the toy example like the above case, we need to wait both operations on a and b to complete
-until we can execute ```c = a + b```.
-
-There are even more subtle cases, for example, in the following case, ```b.copyto(a)``` need to wait
-the ```c = a + 1``` to finish. Otherwise we might get different result for c.
-```python
-a = mx.narray.create((10, 10))
-b = mx.narray.create((10, 10))
-c = a + 1
-b.copyto(a)
-```
-
-As you can see, it is really hard to write parallel programs, and really hard to reason what can be parallelized.
-So normally people just give up and stay with single threaded programs.
-Luckily, mxnet does the parallelism ***automatically*** and ***correctly*** for you.
-
-So when you write the program, you can write them in normal way,
-and mxnet will try to run the computation as soon as the dependency get resolved in a parallel way.
-One thing that you need to know about though, is that that mxnet's computation is ***asynchronizely issued***.
-So the script will immediately return, but the result may not yet be ready.
-
-To wait the computation to finish, you can call ```wait``` function on the NArray.
-The ```wait``` function is called in ```NArray.numpy```, so the result is always synchronized
-and you do not need to worry about doing anything wrong.
-Due to the same ready, it is adviced to use NArray as much as possible to gain parallelism.
-
-```python
-a = mx.narray.create((10, 10))
-a.numpy[:] = 10
-a = a + 1
-a.wait()
-```
-So far the examples are on CPU. Of course same thing works for GPU and multiple GPUs,
-for example the following snippet copies the data into two GPUs, runs the computation
-and copy things back.
-
-```python
-a = mx.narray.create((10, 10))
-a.numpy[:] = 10
-a_gpu1 = a.copyto(mx.Context('gpu', 0))
-a_gpu1 = a_gpu1 + 1
-a_gpu2 = a.copyto(mx.Context('gpu', 1))
-a_gpu2 = a_gpu2 + 1
-
-print(a_gpu1.copyto(mx.Context('cpu')).numpy)
-print(a_gpu2.copyto(mx.Context('cpu')).numpy)
-```
-As usual, mxnet will automatically do all the parallelization for you, to give you maximum efficiency.
-
-### Save Load NArray
-It is important to save your work after some computations.
-We provide two ways to allow you to save and load the NArray objects.
-The first way is the naural pythonic way, using pickle. NArray is pickle compatible,
-which means you can simply pickle the NArray like what you did with numpy.ndarray.
-
-The following code gives example of pickling NArray.
-```python
-import numpy as np
-import mxnet as mx
-import pickle as pkl
+**Table of contents**
-
-a = mx.narray.create((10, 10))
-a.numpy[:] = 10
+```eval_rst
+.. toctree::
+   :maxdepth: 2
-
-data = pkl.dumps(a)
-a2 = pkl.loads(data)
-
-assert np.sum(a2.numpy != a.numpy) == 0
-```
-
-However, in some scenarios, you may also want to save the results and loads them in in other languages that
-are supported by mxnet. To achieve that, you can use ```narray.save``` and ```narray.load```.
-What is more, you can directly save and load from cloud such as S3, HDFS:) By simply building mxnet with S3 support.
-
-The following code is an example on how you can save list of narray into S3 storage and load them back.
-```python
-import numpy as np
-import mxnet as mx
-
-a = mx.narray.create((10, 10))
-a.numpy[:] = 10
-
-# save a list of narray
-data = mx.narray.save('s3://mybucket/mydata.bin', [a])
-a2 = mx.narray.load('s3://mybucket/mydata.bin')
-
-assert np.sum(a2[0].numpy != a.numpy) == 0
-
-# can also save a dict of narray
-data = mx.narray.save('s3://mybucket/mydata.bin', {'data1': a, 'data2': a})
-narray_dict = mx.narray.load('s3://mybucket/mydata.bin')
-```
-In this way, you can always store your experiment on the cloud:)
-As usually, we support both flavors for you, and you can choose which one you like to use.
-
-
-Symbolic API and Differentiation
---------------------------------
-Now you have seen the power of NArray of MXNet. It seems to be interesting and we are ready to build some real deep learning.
-Hmm, this seems to be really exciting, but wait, do we need to build things from scratch?
-It seems that we need to re-implement all the layers in deep learning toolkits such as [CXXNet](https://github.com/dmlc/cxxnet) in NArray?
-Well, you do not have to. There is a Symbolic API in MXNet that readily helps you to do all these.
-
-More importantly, the Symbolic API is designed to bring in the advantage of C++ static layers(operators) to ***maximumly optimizes the performance and memory*** that is even better than CXXNet. Sounds exciting? Let us get started on this.
-
-### Creating Symbols
-A common way to create a neural network is to create it via some way of configuration file or API.
-The following code creates a configuration two layer perceptrons.
-```python
-import mxnet.symbol as sym
-
-data = sym.Variable('data')
-net = sym.FullyConnected(data=data, name='fc1', num_hidden=128)
-net = sym.Activation(data=net, name='relu1', act_type="relu")
-net = sym.FullyConnected(data=net, name='fc2', num_hidden=10)
-net = sym.Softmax(data=net, name = 'sm')
-```
-If you are familiar with tools such as cxxnet or caffe, the ```Symbol``` object is like configuration files
-that configures the network structure. If you are more familiar with tools like theano, the ```Symbol```
-object something that defines the computation graph. Basically, it creates a computation graph
-that defines the forward pass of neural network.
-
-The Configuration API allows you to define the computation graph via compositions.
-If you have not used symbolic configuration tools like theano before, one thing to
-note is that the ```net``` can also be viewed as function that have input arguments.
-
-You can get the list of arguments by calling ```Symbol.list_arguments```.
-```python
->>> net.list_arguments()
-['data', 'fc1_weight', 'fc1_bias', 'fc2_weight', 'fc2_bias']
-```
-In our example, you can find that the arguments contains the parameters in each layer, as well as input data.
-One thing that worth noticing is that the argument names like ```fc1_weight``` are automatically generated because
-it was not specified in creation of fc1.
-You can also specify it explicitly, like the following code.
-```python
->>> import mxnet.symbol as sym
->>> data = sym.Variable('data')
->>> w = sym.Variable('myweight')
->>> net = sym.FullyConnected(data=data, weight=w,
-                             name='fc1', num_hidden=128)
->>> net.list_arguments()
-['data', 'myweight', 'fc1_bias']
-```
-
-Besides the coarse grained neuralnet operators such as FullyConnected, Convolution.
-MXNet also provides fine graned operations such as elementwise add, multiplications.
-The following example first performs an elementwise add between two symbols, then feed
-them to the FullyConnected operator.
-```
->>> import mxnet.symbol as sym
->>> lhs = sym.Variable('data1')
->>> rhs = sym.Variable('data2')
->>> net = sym.FullyConnected(data=lhs + rhs,
-                             name='fc1', num_hidden=128)
->>> net.list_arguments()
-['data1', 'data2', 'fc1_weight', 'fc1_bias']
+   ndarray
+   symbol
+   kvstore
+   io
+```
-```
-
-### More Complicated Composition
-In the previous example, Symbols are constructed in a forward compositional way.
-Besides doing things in a forward compistion way. You can also treat composed symbols as functions,
-and apply them to existing symbols.
-
-```python
->>> import mxnet.symbol as sym
->>> data = sym.Variable('data')
->>> net = sym.FullyConnected(data=data,
-                             name='fc1', num_hidden=128)
->>> net.list_arguments()
-['data', 'fc1_weight', 'fc1_bias']
->>> data2 = sym.Variable('data2')
->>> in_net = sym.FullyConnected(data=data,
-                                name='in', num_hidden=128)
->>> composed_net = net(data=in_net, name='compose')
->>> composed_net.list_arguments()
-['data2', 'in_weight', 'in_bias', 'compose_fc1_weight', 'compose_fc1_bias']
-```
-In the above example, net is used a function to apply to an existing symbol ```in_net```, the resulting
-composed_net will replace the original ```data``` by the the in_net instead. This is useful when you
-want to change the input of some neural-net to be other structure.
-
-### Shape Inference
-Now we have defined the computation graph. A common problem in the computation graph,
-is to figure out shapes of each parameters.
-Usually, we want to know the shape of all the weights, bias and outputs.
-
-You can use ```Symbol.infer_shape``` to do that. THe shape inference function
-allows you to pass in shapes of arguments that you know,
-and it will try to infer the shapes of all arguments and outputs.
-```python
->>> import mxnet.symbol as sym
->>> data = sym.Variable('data')
->>> net = sym.FullyConnected(data=data, name='fc1',
-                             num_hidden=10)
->>> arg_shape, out_shape = net.infer_shape(data=(100, 100))
->>> dict(zip(net.list_arguments(), arg_shape))
-{'data': (100, 100), 'fc1_weight': (10, 100), 'fc1_bias': (10,)}
->>> out_shape
-[(100, 10)]
-```
-In common practice, you only need to provide the shape of input data, and it
-will automatically infers the shape of all the parameters.
-You can always also provide more shape information, such as shape of weights.
-The ```infer_shape``` will detect if there is inconsitency in the shapes,
-and raise an Error if some of them are inconsistent.
-
-### Bind the Symbols
-Symbols are configuration objects that represents a computation graph (a configuration of neuralnet).
-So far we have introduced how to build up the computation graph (i.e. a configuration).
-The remaining question is, how we can do computation using the defined graph.
-
-TODO.
-
-### How Efficient is Symbolic API
-In short, they design to be very efficienct in both memory and runtime.
-
-The major reason for us to introduce Symbolic API, is to bring the efficient
-C++ operations in powerful toolkits such as cxxnet and caffe together with the flexible
-dynamic NArray operations. All the memory and computation resources are allocated statically during Bind,
-to maximize the runtime performance and memory utilization.
-The coarse grained operators are equivalent to cxxnet layers, which are extremely efficient.
-We also provide fine grained operators for more flexible composition. Because we are also doing more inplace
-memory allocation, mxnet can be ***more memory efficient*** than cxxnet, and gets to same runtime, with greater flexiblity.
+
+
+
-How to Choose between APIs
---------------------------
-You can mix them all as much as you like. Here are some guidelines
-* Use Symbolic API and coarse grained operator to create established structure.
-* Use fine-grained operator to extend parts of of more flexible symbolic graph.
-* Do some dynamic NArray tricks, which are even more flexible, between the calls of forward and backward of executors.
+
+
+
-We believe that different ways offers you different levels of flexibilty and efficiency. Normally you do not need to
-be flexible in all parts of the networks, so we allow you to use the fast optimized parts,
-and compose it flexibly with fine-grained operator or dynamic NArray. We believe such kind of mixture allows you to build
-the deep learning architecture both efficiently and flexibly as your choice. To mix is to maximize the peformance and flexiblity.
\ No newline at end of file
diff --git a/doc/python/symbol.md b/doc/python/symbol.md
index f9bb0585c4b2..b306ddc9b7c0 100644
--- a/doc/python/symbol.md
+++ b/doc/python/symbol.md
@@ -1,20 +1,168 @@
-Python Symbolic API
-===================
-Symbolic part of mxnet allows you to describe a computational graph in a declarative way.
-The Symbol object is a lightweight contains the head of the computation graph.
-The Symbol can be binded to Executor, where the computation resources are actually allocated and computation
+# Symbolic and Automatic Differentiation
+
+Now you have seen the power of MXNet's NDArray. It seems interesting, and we
+are ready to build some real deep learning. Hmm, this seems to be really
+exciting, but wait, do we need to build things from scratch? It seems that we
+need to re-implement all the layers in deep learning toolkits such as
+[CXXNet](https://github.com/dmlc/cxxnet) with NDArray? Well, you do not have
+to. There is a Symbolic API in MXNet that readily helps you to do all these.
+
+More importantly, the Symbolic API is designed to bring in the advantages of
+C++ static layers (operators) to ***maximally optimize performance and
+memory***, even beyond CXXNet. Sounds exciting? Let us get started on this.
+
+## Creating Symbols
+
+A common way to create a neural network is to create it via a configuration
+file or an API. The following code creates the configuration of a two-layer
+perceptron.
+
+```python
+import mxnet.symbol as sym
+data = sym.Variable('data')
+net = sym.FullyConnected(data=data, name='fc1', num_hidden=128)
+net = sym.Activation(data=net, name='relu1', act_type="relu")
+net = sym.FullyConnected(data=net, name='fc2', num_hidden=10)
+net = sym.Softmax(data=net, name='sm')
+```
+
+If you are familiar with tools such as cxxnet or caffe, the ```Symbol``` object
+is like a configuration file that configures the network structure. If you are
+more familiar with tools like theano, the ```Symbol``` object is something that
+defines the computation graph. Basically, it creates a computation graph that
+defines the forward pass of the neural network.
+
+The Configuration API allows you to define the computation graph via
+compositions. If you have not used symbolic configuration tools like theano
+before, one thing to note is that ```net``` can also be viewed as a function
+that has input arguments.
+
+You can get the list of arguments by calling ```Symbol.list_arguments```.
+
+```python
+>>> net.list_arguments()
+['data', 'fc1_weight', 'fc1_bias', 'fc2_weight', 'fc2_bias']
+```
+
+In our example, you can find that the arguments contain the parameters of each
+layer, as well as the input data. One thing worth noticing is that argument
+names like ```fc1_weight``` are automatically generated when they are not
+specified during the creation of fc1. You can also specify them explicitly, as
+in the following code.
+
+```python
+>>> import mxnet.symbol as sym
+>>> data = sym.Variable('data')
+>>> w = sym.Variable('myweight')
+>>> net = sym.FullyConnected(data=data, weight=w,
+                             name='fc1', num_hidden=128)
+>>> net.list_arguments()
+['data', 'myweight', 'fc1_bias']
+```
+
+Besides coarse-grained neural network operators such as FullyConnected and
+Convolution, MXNet also provides fine-grained operations such as element-wise
+add and multiplication. The following example first performs an element-wise
+add between two symbols, then feeds the result to the FullyConnected operator.
+
+```python
+>>> import mxnet.symbol as sym
+>>> lhs = sym.Variable('data1')
+>>> rhs = sym.Variable('data2')
+>>> net = sym.FullyConnected(data=lhs + rhs,
+                             name='fc1', num_hidden=128)
+>>> net.list_arguments()
+['data1', 'data2', 'fc1_weight', 'fc1_bias']
+```
+
+## More Complicated Composition
+
+In the previous example, symbols are constructed in a forward compositional
+way. Besides composing forward, you can also treat composed symbols as
+functions, and apply them to existing symbols.
+
+```python
+>>> import mxnet.symbol as sym
+>>> data = sym.Variable('data')
+>>> net = sym.FullyConnected(data=data,
+                             name='fc1', num_hidden=128)
+>>> net.list_arguments()
+['data', 'fc1_weight', 'fc1_bias']
+>>> data2 = sym.Variable('data2')
+>>> in_net = sym.FullyConnected(data=data2,
+                                name='in', num_hidden=128)
+>>> composed_net = net(data=in_net, name='compose')
+>>> composed_net.list_arguments()
+['data2', 'in_weight', 'in_bias', 'compose_fc1_weight', 'compose_fc1_bias']
+```
+
+In the above example, net is used as a function applied to an existing symbol
+```in_net```; in the resulting composed_net, the original ```data``` is
+replaced by in_net. This is useful when you want to change the input of a
+neural net to another structure.
+
+## Shape Inference
+
+Now we have defined the computation graph. A common problem with a computation
+graph is to figure out the shape of each parameter. Usually, we want to know
+the shapes of all the weights, biases, and outputs.
+
+You can use ```Symbol.infer_shape``` to do that. The shape inference function
+allows you to pass in the shapes of arguments that you know,
+and it will try to infer the shapes of all arguments and outputs.
+
+```python
+>>> import mxnet.symbol as sym
+>>> data = sym.Variable('data')
+>>> net = sym.FullyConnected(data=data, name='fc1',
+                             num_hidden=10)
+>>> arg_shape, out_shape = net.infer_shape(data=(100, 100))
+>>> dict(zip(net.list_arguments(), arg_shape))
+{'data': (100, 100), 'fc1_weight': (10, 100), 'fc1_bias': (10,)}
+>>> out_shape
+[(100, 10)]
+```
+
+In common practice, you only need to provide the shape of the input data, and
+it will automatically infer the shapes of all the parameters. You can also
+provide more shape information, such as the shapes of weights.
+```infer_shape``` will detect inconsistencies in the shapes and raise an error
+if any are found.
+
+## Bind the Symbols
+
+Symbols are configuration objects that represent a computation graph (a
+configuration of a neural net). So far we have introduced how to build up the
+computation graph (i.e. a configuration). The remaining question is, how can we
+do computation using the defined graph?
+
+TODO.
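+
+Until this section is written, here is a rough sketch of the intended flow,
+based on the `mxnet.executor` module documented below. The names
+`simple_bind`, `forward`, and `outputs` are assumptions, not a confirmed API:
+
+```python
+>>> import mxnet as mx
+>>> import mxnet.symbol as sym
+>>> data = sym.Variable('data')
+>>> net = sym.FullyConnected(data=data, name='fc1', num_hidden=10)
+>>> # bind the symbol to a device, letting MXNet allocate the resources
+>>> exe = net.simple_bind(mx.cpu(), data=(100, 100))
+>>> exe.forward()                  # run the forward pass
+>>> exe.outputs[0].shape           # one output NDArray per network output
+(100L, 10L)
+```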
+
+## How Efficient is Symbolic API
+
+In short, it is designed to be very efficient in both memory and runtime.
+
+The major reason for introducing the Symbolic API is to bring the efficient
+C++ operations of powerful toolkits such as cxxnet and caffe together with the
+flexible dynamic NDArray operations. All the memory and computation resources
+are allocated statically during Bind, to maximize the runtime performance and
+memory utilization.
+
+The coarse-grained operators are equivalent to cxxnet layers, which are
+extremely efficient. We also provide fine-grained operators for more flexible
+composition. Because we also do more in-place memory allocation, mxnet can be
+***more memory efficient*** than cxxnet, achieving the same runtime with
+greater flexibility.
+
+## Symbol API
+
-Symbolic API Reference
-----------------------
 ```eval_rst
 .. automodule:: mxnet.symbol
     :members:
 ```
 
-Executor API Reference
-----------------------
+## Executor API
+
 ```eval_rst
 .. automodule:: mxnet.executor
     :members: