
Commit ab62f70: Merge pull request #6 from dmlc/master

merge back

hetong007 committed Oct 18, 2015
2 parents: 87cd7b8 + fd1525d
Showing 11 changed files with 320 additions and 292 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -95,7 +95,7 @@ target_link_libraries(mxnet ${mshadow_LINKER_LIBS})
target_link_libraries(mxnet dmlccore)
target_link_libraries(mxnet pslite)
target_link_libraries(mxnet ${pslite_LINKER_LIBS})

set_target_properties(mxnet PROPERTIES OUTPUT_NAME "libmxnet")

# ---[ Linter target
if(MSVC)
4 changes: 2 additions & 2 deletions R-package/DESCRIPTION
@@ -6,8 +6,8 @@ Date: 2015-10-02
Author: Tianqi Chen, Qiang Kou, Tong He
Maintainer: Qiang Kou <[email protected]>
Description: MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix the flavours of deep learning programs together to maximize the efficiency and your productivity.
-License: Apache-2.0
-URL: https://github.com/dmlc/mxnet
+License: BSD
+URL: https://github.com/dmlc/mxnet/R-package
BugReports: https://github.com/dmlc/mxnet/issues
Imports: methods, Rcpp (>= 0.11.1)
Suggests: testthat
28 changes: 28 additions & 0 deletions R-package/LICENSE
@@ -0,0 +1,28 @@
Copyright (c) 2015 by Contributors
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of rabit nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

11 changes: 7 additions & 4 deletions R-package/README.md
@@ -1,5 +1,7 @@
-MXNet R-Package
-===============
+<img src=https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/logo-m/mxnetR.png width=155/> Deep Learning for R
+==========================
+[![Build Status](https://travis-ci.org/dmlc/mxnet.svg?branch=master)](https://travis-ci.org/dmlc/mxnet)
+[![Documentation Status](https://readthedocs.org/projects/mxnet/badge/?version=latest)](http://mxnet.readthedocs.org/en/latest/R-package/index.html)

You have found the MXNet R Package! The MXNet R package brings flexible and efficient GPU
computing and state-of-the-art deep learning to R.
@@ -15,9 +17,10 @@ Resources
* [MXNet R Package Document](http://mxnet.readthedocs.org/en/latest/R-package/index.html)
- Check this out for detailed documents, examples, installation guides.


Installation
------------
Follow [Installation Guide](http://mxnet.readthedocs.org/en/latest/build.html)


License
-------
The MXNet R-package is licensed under the [BSD](https://github.com/dmlc/mxnet/blob/master/R-Package/LICENSE) license.
59 changes: 35 additions & 24 deletions R-package/vignettes/mnistCompetition.Rmd
@@ -1,15 +1,19 @@
Handwritten Digits Classification Competition
-======================================================
+=============================================

-[MNIST](http://yann.lecun.com/exdb/mnist/) is a handwritten digits image data set created by Yann LeCun. Every digit is represented by a 28x28 image. It has become a standard data set to test classifiers on simple image input. Neural network is no doubt a strong model for image classification tasks. There's a [long-term hosted competition](https://www.kaggle.com/c/digit-recognizer) on Kaggle using this data set. We will present the basic usage of `mxnet` to compete in this challenge.
+[MNIST](http://yann.lecun.com/exdb/mnist/) is a data set of handwritten digit images created by Yann LeCun. Every digit is represented by a 28x28 image. It has become a standard data set for testing classifiers on simple image input. Neural networks are no doubt strong models for image classification tasks. There's a [long-term hosted competition](https://www.kaggle.com/c/digit-recognizer) on Kaggle using this data set.
+We will present the basic usage of [mxnet](https://github.com/dmlc/mxnet/tree/master/R-package) to compete in this challenge.

This tutorial is written in Rmarkdown. You can download the source [here](https://github.com/dmlc/mxnet/blob/master/R-package/vignettes/mnistCompetition.Rmd) and view a hosted version of the tutorial [here](http://mxnet.readthedocs.org/en/latest/R-package/mnistCompetition.html).

## Data Loading

First, let us download the data from [here](https://www.kaggle.com/c/digit-recognizer/data), and put them under the `data/` folder in your working directory.

Then we can read them in R and convert to matrices.

-```{r, eval=FALSE}
+```{r}
require(mxnet)
train <- read.csv('data/train.csv', header=TRUE)
test <- read.csv('data/test.csv', header=TRUE)
@@ -22,22 +26,22 @@ train.y <- train[,1]
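
The collapsed hunk above hides the conversion code; for readers following along, here is a minimal sketch of that step (the exact lines are an assumption inferred from the `train.y <- train[,1]` context line, with the label in the first column of the Kaggle csv):

```{r}
# Convert the data frames to numeric matrices and split off the labels
train <- data.matrix(train)
test <- data.matrix(test)
train.x <- train[, -1]  # 784 pixel columns
train.y <- train[, 1]   # digit labels 0-9
```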

Here every image is represented as a single row in train/test. The greyscale of each image falls in the range [0, 255]; we can linearly transform it into [0, 1] by

-```{r, eval=FALSE}
+```{r}
train.x <- train.x/255
test <- test/255
```
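
A quick way to verify the transform (a trivial check, assuming the code above has run):

```{r}
# Both extremes should now lie within [0, 1]
range(train.x)
```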

In the label part, we see that the counts of the ten digits are fairly even:

-```{r, eval=FALSE}
+```{r}
table(train.y)
```
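
If you also want to eyeball the data, here is a minimal sketch for plotting a single digit (assuming `train.x` holds one 784-pixel image per row, stored row by row):

```{r}
# Reshape the first row into a 28x28 image
m <- matrix(as.numeric(train.x[1, ]), nrow = 28, ncol = 28, byrow = TRUE)
# Flip vertically so the digit appears upright, then draw in greyscale
image(t(m[28:1, ]), col = grey.colors(256), axes = FALSE)
```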

## Network Configuration

Now we have the data. The next step is to configure the structure of our network.

-```{r, eval=FALSE}
+```{r}
data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=128)
act1 <- mx.symbol.Activation(fc1, name="relu1", act_type="relu")
@@ -59,15 +63,13 @@ softmax <- mx.symbol.Softmax(fc3, name="sm")
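
The collapse hides the middle of the network definition; a sketch of those layers, reconstructed as an assumption from the `fc3` and `"sm"` names visible in the context line:

```{r}
# Two more fully-connected layers, narrowing down to the 10 digit classes
fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=64)
act2 <- mx.symbol.Activation(fc2, name="relu2", act_type="relu")
fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=10)
softmax <- mx.symbol.Softmax(fc3, name="sm")
```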

We are almost ready for the training process. Before we start the computation, let's decide which device we should use.

-```{r, eval=FALSE}
-devices <- lapply(1:2, function(i) {
-  mx.cpu(i)
-})
+```{r}
+devices <- mx.cpu()
```

-Here we assign two threads of our CPU to `mxnet`. After all these preparation, you can run the following command to train the neural network! Note that `mx.set.seed` is the correct function to control the random process in `mxnet`.
+Here we assign the CPU to `mxnet`. After all this preparation, you can run the following command to train the neural network! Note that `mx.set.seed` is the correct function to control the random process in `mxnet`.

-```{r, eval=FALSE}
+```{r}
mx.set.seed(0)
model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
ctx=devices, num.round=10, array.batch.size=100,
@@ -80,21 +82,21 @@ model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,

To make predictions, we can simply write

-```{r, eval=FALSE}
+```{r}
preds <- predict(model, test)
dim(preds)
```

It is a matrix with 28000 rows and 10 columns, containing the desired classification probabilities from the output layer. To extract the predicted label for each row, we can use `max.col` in R:

-```{r, eval=FALSE}
+```{r}
pred.label <- max.col(preds) - 1
table(pred.label)
```
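
Before submitting, a rough sanity check is possible on the training set, since we know its labels (a minimal sketch; predicting on all 42000 training rows takes a moment):

```{r}
# Fraction of training digits the model recovers
train.preds <- predict(model, train.x)
train.pred.label <- max.col(train.preds) - 1
mean(train.pred.label == train.y)
```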

With a little extra effort to write the results in the required csv format, we have our submission ready for the competition!

-```{r, eval=FALSE}
+```{r}
submission <- data.frame(ImageId=1:nrow(test), Label=pred.label)
write.csv(submission, file='submission.csv', row.names=FALSE, quote=FALSE)
```
@@ -105,7 +107,7 @@ Next we are going to introduce a new network structure: [LeNet](http://yann.lecu

First we construct the network:

-```{r, eval=FALSE}
+```{r}
# input
data <- mx.symbol.Variable('data')
# first conv
@@ -130,7 +132,7 @@ lenet <- mx.symbol.Softmax(data=fc2)
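
Again the collapse hides most of the definition; a sketch of the classic LeNet body it should contain (an assumption matching the `fc2` name in the visible context line):

```{r}
# first conv: 5x5 kernels, tanh activations, 2x2 max pooling
conv1 <- mx.symbol.Convolution(data=data, kernel=c(5,5), num_filter=20)
tanh1 <- mx.symbol.Activation(data=conv1, act_type="tanh")
pool1 <- mx.symbol.Pooling(data=tanh1, pool_type="max", kernel=c(2,2), stride=c(2,2))
# second conv
conv2 <- mx.symbol.Convolution(data=pool1, kernel=c(5,5), num_filter=50)
tanh2 <- mx.symbol.Activation(data=conv2, act_type="tanh")
pool2 <- mx.symbol.Pooling(data=tanh2, pool_type="max", kernel=c(2,2), stride=c(2,2))
# first fully-connected layer
flatten <- mx.symbol.Flatten(data=pool2)
fc1 <- mx.symbol.FullyConnected(data=flatten, num_hidden=500)
tanh3 <- mx.symbol.Activation(data=fc1, act_type="tanh")
# second fully-connected layer and loss
fc2 <- mx.symbol.FullyConnected(data=tanh3, num_hidden=10)
lenet <- mx.symbol.Softmax(data=fc2)
```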

Then let us reshape the matrices into arrays; the convolution layers take 4-dimensional input, so we arrange the data as (batch, channel, height, width):

-```{r, eval=FALSE}
+```{r}
train.array <- t(train.x)
dim(train.array) <- c(1,28,28,nrow(train.x))
train.array <- aperm(train.array, c(4,1,2,3))
@@ -141,38 +143,47 @@ test.array <- aperm(test.array, c(4,1,2,3))
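
The hidden tail of this block presumably mirrors the `train.array` code for the test set; a sketch under that assumption:

```{r}
# Same reshape for the test matrix: (batch, channel, height, width)
test.array <- t(test)
dim(test.array) <- c(1, 28, 28, nrow(test))
test.array <- aperm(test.array, c(4, 1, 2, 3))
```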

Next we are going to compare the training speed on different devices, so the definition of the devices goes first:

-```{r, eval=FALSE}
+```{r}
+n.gpu <- 1
device.cpu <- mx.cpu()
-device.gpu <- lapply(1:4, function(i) {
+device.gpu <- lapply(0:(n.gpu-1), function(i) {
mx.gpu(i)
})
```

-Training on CPU:
+As you can see, we can pass a list of devices to ask mxnet to train on multiple GPUs (you can do a similar thing for CPUs, but since the internal computation on a CPU is already multi-threaded, there is less gain than with GPUs).
+
+We start by training on the CPU first. Because it takes a bit of time, we will only run it for one iteration.

-```{r, eval=FALSE}
+```{r}
mx.set.seed(0)
tic <- proc.time()
model <- mx.model.FeedForward.create(lenet, X=train.array, y=train.y,
-ctx=device.cpu, num.round=5, array.batch.size=100,
+ctx=device.cpu, num.round=1, array.batch.size=100,
learning.rate=0.05, momentum=0.9, wd=0.00001,
eval.metric=mx.metric.accuracy,
epoch.end.callback=mx.callback.log.train.metric(100))
print(proc.time() - tic)
```

Training on GPU:

-```{r, eval=FALSE}
+```{r}
mx.set.seed(0)
tic <- proc.time()
model <- mx.model.FeedForward.create(lenet, X=train.array, y=train.y,
ctx=device.gpu, num.round=5, array.batch.size=100,
learning.rate=0.05, momentum=0.9, wd=0.00001,
eval.metric=mx.metric.accuracy,
epoch.end.callback=mx.callback.log.train.metric(100))
print(proc.time() - tic)
```

As you can see, by using a GPU we get a much faster training speed!
Finally, we can submit the results to Kaggle again to see the improvement in our ranking!

-```{r, eval=FALSE}
+```{r}
preds <- predict(model, test.array)
pred.label <- max.col(preds) - 1
submission <- data.frame(ImageId=1:nrow(test), Label=pred.label)