
Commit

add two tutorials
hetong007 committed Oct 17, 2015
1 parent 158b602 commit b2ab9e8
Showing 6 changed files with 848 additions and 70 deletions.
113 changes: 113 additions & 0 deletions R-package/vignettes/mnistCompetition.Rmd
@@ -0,0 +1,113 @@
---
title: "Handwritten Digits Classification Competition"
author: "Tong He"
date: "October 17, 2015"
output: html_document
---

[MNIST](http://yann.lecun.com/exdb/mnist/) is a data set of handwritten digit images created by Yann LeCun. Each digit is represented by a 28x28 greyscale image. It has become a standard data set for testing classifiers on simple image input. Neural networks are without doubt strong models for image classification tasks. There is a [long-running competition](https://www.kaggle.com/c/digit-recognizer) hosted on Kaggle that uses this data set. We will present the basic usage of `mxnet` to compete in this challenge.

## Data Loading

First, let us download the data from [here](https://www.kaggle.com/c/digit-recognizer/data), and put them under the `data/` folder in your working directory.
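
If you are not sure whether the files ended up in the right place, a quick check from R can save a confusing error later (a small optional step, using only base R):

```{r, eval=FALSE}
file.exists('data/train.csv')  # should be TRUE
file.exists('data/test.csv')   # should be TRUE
```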

Then we can read them into R and convert them to matrices.

```{r, eval=FALSE}
train <- read.csv('data/train.csv', header=TRUE)
test <- read.csv('data/test.csv', header=TRUE)
train <- data.matrix(train)
test <- data.matrix(test)
train.x <- train[,-1]
train.y <- train[,1]
```

Here every image is represented as a single row in both train and test. The greyscale values fall in the range [0, 255], so we can linearly transform them into [0, 1] by

```{r, eval = FALSE}
train.x <- train.x/255
test <- test/255
```
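
A quick sanity check can confirm the layout and the rescaling. The row counts below are what the Kaggle files are expected to contain, so treat them as a guideline:

```{r, eval=FALSE}
dim(train.x)    # expected: 42000 images, each a row of 28 x 28 = 784 pixels
dim(test)       # expected: 28000 rows and 784 columns
range(train.x)  # should now lie within [0, 1]
```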

Looking at the labels, we can see that the counts of the ten digits are fairly even:

```{r, eval=FALSE}
table(train.y)
```
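
If you prefer proportions over raw counts, the same check can be written as:

```{r, eval=FALSE}
round(prop.table(table(train.y)), 3)  # every digit should be roughly 10% of the data
```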

## Network Configuration

Now we have the data. The next step is to configure the structure of our network.

```{r}
require(mxnet)
data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=128)
act1 <- mx.symbol.Activation(fc1, name="relu1", act_type="relu")
fc2 <- mx.symbol.FullyConnected(act1, name = "fc2", num_hidden = 64)
act2 <- mx.symbol.Activation(fc2, name="relu2", act_type="relu")
fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=10)
softmax <- mx.symbol.Softmax(fc3, name = "sm")
```

1. In `mxnet`, we use its own data type `symbol` to configure the network. `data <- mx.symbol.Variable("data")` uses `data` to represent the input data, i.e. the input layer.
2. Then we set the first hidden layer with `fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=128)`. This layer takes `data` as its input and is configured with a name and the number of hidden neurons.
3. The activation is set by `act1 <- mx.symbol.Activation(fc1, name="relu1", act_type="relu")`. The activation function takes the output from the first hidden layer `fc1`.
4. The second hidden layer takes the result from `act1` as its input, with name "fc2" and 64 hidden neurons.
5. The second activation is almost the same as `act1`, except that the input source and the name are different.
6. Here comes the output layer. Since there are only 10 digits, we set the number of neurons to 10.
7. Finally we apply a softmax activation to obtain a probabilistic prediction (a quick way to double-check the resulting network is sketched after this list).
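
To double-check the configuration, we can list the arguments the network expects, i.e. the input placeholder together with the weights and biases of each layer. This is only a quick sketch, reusing the `arguments` helper from the companion NDArray and Symbol tutorial:

```{r, eval=FALSE}
arguments(softmax)
```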

## Training

We are almost ready for the training process. Before we start the computation, let's decide which device to use.

```{r}
devices <- lapply(1:2, function(i) {
mx.cpu(i)
})
```

Here we assign two CPU threads to `mxnet`. After all this preparation, you can run the following command to train the neural network!

```{r}
set.seed(0)
model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
                                     ctx=devices, num.round=10, array.batch.size=100,
                                     learning.rate=0.07, momentum=0.9,
                                     initializer=mx.init.uniform(0.07),
                                     epoch.end.callback=mx.callback.log.train.metric(100))
```

## Prediction and Submission

To make predictions, we can simply write

```{r}
preds <- predict(model, test)
dim(preds)
```

It is a matrix with 28000 rows and 10 columns, containing the predicted class probabilities from the output layer. To extract the label with the highest probability in each row, we can use `max.col` in R:

```{r}
pred.label <- max.col(preds) - 1
table(pred.label)
```
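
Since the labels of the Kaggle test set are not available, one way to estimate accuracy before submitting is to hold out part of the training data, retrain on the remainder, and score the held-out images with the same calls as above. A minimal sketch (the 90/10 split and the helper names below are arbitrary choices, not part of the competition workflow):

```{r, eval=FALSE}
set.seed(1)
holdout <- sample(nrow(train.x), round(0.1 * nrow(train.x)))  # hold out 10% for validation
model.val <- mx.model.FeedForward.create(softmax,
    X=train.x[-holdout,], y=train.y[-holdout],
    ctx=devices, num.round=10, array.batch.size=100,
    learning.rate=0.07, momentum=0.9,
    initializer=mx.init.uniform(0.07),
    epoch.end.callback=mx.callback.log.train.metric(100))
val.preds <- predict(model.val, train.x[holdout,])
val.label <- max.col(val.preds) - 1
mean(val.label == train.y[holdout])  # rough estimate of the accuracy to expect
```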

With a little extra effort to format the result as a CSV file, we have our submission ready for the competition!

```{r}
submission <- data.frame(ImageId=1:nrow(test), Label=pred.label)
write.csv(submission, file='submission.csv', row.names=FALSE, quote=FALSE)
```
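
Before uploading, it does not hurt to confirm that the file has one row per test image and the two expected columns (again a small optional check):

```{r, eval=FALSE}
check <- read.csv('submission.csv')
dim(check)   # should be 28000 rows and 2 columns (ImageId, Label)
head(check)
```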

@@ -1,4 +1,4 @@
MXNet R Overview Tutorial
MXNet R Tutorial on NDArray and Symbol
============================

This vignette gives a general overview of MXNet's R package. MXNet contains a
@@ -27,27 +27,27 @@ CPU and GPU

Let's create `NDArray` on either GPU or CPU

```r
```{r}
require(mxnet)
a = mx.nd.zeros(c(2, 3)) # create a 2-by-3 matrix on cpu
b = mx.nd.zeros(c(2, 3), mx.gpu()) # create a 2-by-3 matrix on gpu 0
c = mx.nd.zeros(c(2, 3), mx.gpu(2)) # create a 2-by-3 matrix on gpu 2
a <- mx.nd.zeros(c(2, 3)) # create a 2-by-3 matrix on cpu
b <- mx.nd.zeros(c(2, 3), mx.gpu()) # create a 2-by-3 matrix on gpu 0
c <- mx.nd.zeros(c(2, 3), mx.gpu(2)) # create a 2-by-3 matrix on gpu 2
c$dim()
```

We can also initialize an `NDArray` object in various ways:

```r
a = mx.nd.ones(c(4, 4))
b = mx.rnorm(c(4, 5))
c = mx.nd.array(1:5)
```{r}
a <- mx.nd.ones(c(4, 4))
b <- mx.rnorm(c(4, 5))
c <- mx.nd.array(1:5)
```

To check the numbers in an `NDArray`, we can simply run

```r
a = mx.nd.ones(c(2, 3))
b = as.array(a)
```{r}
a <- mx.nd.ones(c(2, 3))
b <- as.array(a)
class(b)
b
```
@@ -58,47 +58,47 @@ b

You can perform element-wise operations on `NDArray` objects:

```r
a = mx.nd.ones(c(2, 3)) * 2
b = mx.nd.ones(c(2, 4)) / 8
```{r}
a <- mx.nd.ones(c(2, 3)) * 2
b <- mx.nd.ones(c(2, 4)) / 8
as.array(a)
as.array(b)
c = a + b
c <- a + b
as.array(c)
d = c / a - 5
d <- c / a - 5
as.array(d)
```

If two `NDArray`s sit on different devices, we need to explicitly move them
into the same one. For instance:

```r
a = mx.nd.ones(c(2, 3)) * 2
b = mx.nd.ones(c(2, 3), mx.gpu()) / 8
c = mx.nd.copyto(a, mx.gpu()) * b
```{r}
a <- mx.nd.ones(c(2, 3)) * 2
b <- mx.nd.ones(c(2, 3), mx.gpu()) / 8
c <- mx.nd.copyto(a, mx.gpu()) * b
as.array(c)
```

#### Load and Save

You can save an `NDArray` object to your disk with `mx.nd.save`:

```r
a = mx.nd.ones(c(2, 3))
```{r}
a <- mx.nd.ones(c(2, 3))
mx.nd.save(a, 'temp.ndarray')
```

You can also load it back easily:

```r
a = mx.nd.load('temp.ndarray')
```{r}
a <- mx.nd.load('temp.ndarray')
as.array(a[[1]])
```

In case you want to save data to a distributed file system such as S3 or HDFS,
you can directly save to and load from it. For example:

```r
```{r,eval=FALSE}
mx.nd.save(a, 's3://mybucket/mydata.bin')
mx.nd.save(a, 'hdfs///users/myname/mydata.bin')
```
@@ -108,22 +108,22 @@ mx.nd.save(a, 'hdfs///users/myname/mydata.bin')
`NDArray` can automatically execute operations in parallel. This is desirable when we
use multiple resources such as CPUs, GPU cards, and CPU-to-GPU memory bandwidth.

For example, if we write `a = a + 1` followed by `b = b + 1`, and `a` is on CPU while
For example, if we write `a <- a + 1` followed by `b <- b + 1`, and `a` is on CPU while
`b` is on GPU, then we want to execute them in parallel to improve the
efficiency. Furthermore, data copies between CPU and GPU are also expensive, so we
hope to run them in parallel with other computations as well.

However, finding by eye which code can be executed in parallel is hard. In the
following example, `a = a + 1` and `c = c * 3` can be executed in parallel, but `a = a + 1` and
`b = b * 3` should be in sequential.

```r
a = mx.nd.ones(c(2,3))
b = a
c = mx.nd.copyto(a, mx.cpu())
a = a + 1
b = b * 3
c = c * 3
following example, `a <- a + 1` and `c <- c * 3` can be executed in parallel, but `a <- a + 1` and
`b <- b * 3` should run sequentially.

```{r}
a <- mx.nd.ones(c(2,3))
b <- a
c <- mx.nd.copyto(a, mx.cpu())
a <- a + 1
b <- b * 3
c <- c * 3
```

Luckily, MXNet can automatically resolve the dependencies and
Expand All @@ -133,7 +133,7 @@ automatically dispatch it into multi-devices, such as multi GPU cards or multi
machines.

It is achieved by lazy evaluation. Any operation we write down is issued into an
internal engine, and then returned. For example, if we run `a = a + 1`, it
internal engine, and then returned. For example, if we run `a <- a + 1`, it
returns immediately after pushing the plus operator to the engine. This
asynchrony allows us to push more operators to the engine, so it can determine
the read and write dependencies and find the best way to execute them in
@@ -152,13 +152,13 @@ With the computational unit `NDArray`, we need a way to construct neural network

The following code creates a two-layer perceptron network:

```r
```{r}
require(mxnet)
net = mx.symbol.Variable('data')
net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
net = mx.symbol.Activation(data=net, name='relu1', act_type="relu")
net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
net = mx.symbol.Softmax(data=net, name='out')
net <- mx.symbol.Variable('data')
net <- mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
net <- mx.symbol.Activation(data=net, name='relu1', act_type="relu")
net <- mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
net <- mx.symbol.Softmax(data=net, name='out')
class(net)
```

@@ -170,7 +170,7 @@ or the activation type (*act_type*).
The symbol can be simply viewed as a function taking several arguments, whose
names are automatically generated and can be retrieved by

```r
```{r}
arguments(net)
```

@@ -183,10 +183,10 @@ As can be seen, these arguments are the parameters needed by each symbol:

We can also specify the automatically generated names explicitly:

```r
net = mx.symbol.Variable('data')
w = mx.symbol.Variable('myweight')
net = sym.FullyConnected(data=data, weight=w, name='fc1', num_hidden=128)
```{r}
net <- mx.symbol.Variable('data')
w <- mx.symbol.Variable('myweight')
net <- mx.symbol.FullyConnected(data=net, weight=w, name='fc1', num_hidden=128)
arguments(net)
```

@@ -198,22 +198,22 @@ commonly used layers in deep learning. We can also easily define new operators
in Python. The following example first performs an element-wise add between two
symbols, then feeds the result to the fully connected operator.

```r
lhs = mx.symbol.Variable('data1')
rhs = mx.symbol.Variable('data2')
net = mx.symbol.FullyConnected(data=lhs + rhs, name='fc1', num_hidden=128)
```{r}
lhs <- mx.symbol.Variable('data1')
rhs <- mx.symbol.Variable('data2')
net <- mx.symbol.FullyConnected(data=lhs + rhs, name='fc1', num_hidden=128)
arguments(net)
```

We can also construct a symbol in a more flexible way than the single
forward composition we used before.

```r
net = mx.symbol.Variable('data')
net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
net2 = mx.symbol.Variable('data2')
net2 = mx.symbol.FullyConnected(data=net2, name='net2', num_hidden=128)
composed_net = net(data=net2, name='compose')
```{r}
net <- mx.symbol.Variable('data')
net <- mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
net2 <- mx.symbol.Variable('data2')
net2 <- mx.symbol.FullyConnected(data=net2, name='net2', num_hidden=128)
composed_net <- net(data=net2, name='compose')
arguments(composed_net)
```

@@ -226,9 +226,9 @@ In the above example, *net* is used as a function to apply to an existing symbol
Now we know how to define a symbol. Next we can infer the shapes of
all the arguments it needs, given the input data shape.

```r
net = mx.symbol.Variable('data')
net = mx.symbol.FullyConnected(data=ent, name='fc1', num_hidden=10)
```{r}
net <- mx.symbol.Variable('data')
net <- mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=10)
```

The shape inference can be used as an earlier debugging mechanism to detect
Expand All @@ -243,19 +243,17 @@ For neural nets, a more commonly used pattern is ```simple_bind```, which will c
all the argument arrays for you. Then you can call forward, and backward (if the gradient is needed),
to get the gradient.

```r
# Todo: refine code
# define computation graphs
A = mx.symbol.Variable('A')
B = mx.symbol.Variable('B')
C = A * B
```{r, eval=FALSE}
A <- mx.symbol.Variable('A')
B <- mx.symbol.Variable('B')
C <- A * B
texec = mx.simple.bind(C)
texec <- mx.simple.bind(C)
texec.forward()
texec.backward()
```

The [model API](../../python/mxnet/model.py) is a thin wrapper around the symbolic executors to support neural net training.
The [model API](../../R-package/R/model.R) is a thin wrapper around the symbolic executors to support neural net training.

You are also highly encouraged to read [Symbolic Configuration and Execution in Pictures](symbol_in_pictures.md),
which provides a detailed explanation of concepts in pictures.
2 changes: 2 additions & 0 deletions doc/R-package/Makefile
@@ -3,6 +3,8 @@ PKGROOT=../../R-package

# ADD The Markdown to be built here
classifyRealImageWithPretrainedModel.md:
mnistCompetition.Rmd:
ndarrayAndSymbolTutorial.Rmd:

# General Rules for build rmarkdowns, need knitr
%.md: $(PKGROOT)/vignettes/%.Rmd
2 changes: 2 additions & 0 deletions doc/R-package/index.md
@@ -10,6 +10,8 @@ The MXNet R package brings flexible and efficient GPU computing and deep learning
Tutorials
---------
* [Classify Realworld Images with Pretrained Model](classifyRealImageWithPretrainedModel.md)
* [Handwritten Digits Classification Competition](mnistCompetition.md)
* [Tutorial on NDArray and Symbol](ndarrayAndSymbolTutorial.md)

Installation
------------