diff --git a/scala-package/README.md b/scala-package/README.md index 20fbee2469b0..06ce6930fa46 100644 --- a/scala-package/README.md +++ b/scala-package/README.md @@ -1,67 +1,136 @@ - Deep Learning for Scala/Java +MXNet Package for Scala/Java ===== -[![Build Status](http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/master/badge/icon)](http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/master/) -[![GitHub license](http://dmlc.github.io/img/apache2.svg)](./LICENSE) - -Here you find the MXNet Scala Package! -It brings flexible and efficient GPU/CPU computing and state-of-art deep learning to JVM. +The MXNet Scala/Java Package brings flexible and efficient GPU/CPU computing and state-of-art deep learning to JVM. - It enables you to write seamless tensor/matrix computation with multiple GPUs in Scala, Java and other languages built on JVM. - It also enables you to construct and customize the state-of-art deep learning models in JVM languages, and apply them to tasks such as image classification and data science challenges. +- The Scala/Java Inferece APIs provides an easy out of the box solution for loading pre-trained MXNet models and running inference on them. -Install +Pre-Built Maven Packages ------------ - -Technically, all you need is the `mxnet-full_2.11-{arch}-{xpu}-{version}.jar` in your classpath. -It will automatically extract the native library to a tempfile and load it. -You can find the pre-built jar file in [here](https://search.maven.org/search?q=g:org.apache.mxnet) - and also our nightly build package [here](https://repository.apache.org/#nexus-search;gav~org.apache.mxnet~) -Currently we provide `linux-x86_64-gpu`, `linux-x86_64-cpu` and `osx-x86_64-cpu`. Support for Windows will come soon. -Use the following dependency in maven, change the artifactId according to your own architecture, e.g., `mxnet-full_2.11-osx-x86_64-cpu` for OSX (and cpu-only). +### Stable ### + +The MXNet Scala/Java packages can be easily included in your Maven managed project. +The stable jar files for the packages are available on the [MXNet Maven Package Repository](https://search.maven.org/search?q=g:org.apache.mxnet) +Currently we provide packages for Linux (Ubuntu 16.04) (CPU and GPU) and macOS (CPU only). Stable packages for Windows and CentOS will come soon. For now, if you have a CentOS machine, follow the ```Build From Source``` section below. + +To add MXNet Scala/Java package to your project, add the dependency as shown below corresponding to your platform, under the ```dependencies``` tag in your project's ```pom.xml``` : + +**Linux GPU** + +maven badge + +```HTML + + org.apache.mxnet + mxnet-full_2.11-linux-x86_64-gpu + [1.3.1,) + +``` + +**Linux CPU** + +maven badge + +```HTML + + org.apache.mxnet + mxnet-full_2.11-linux-x86_64-cpu + [1.3.1,) + +``` + +**macOS CPU** + +maven badge + +```HTML + + org.apache.mxnet + mxnet-full_2.11-osx-x86_64-cpu + [1.3.1,) + +``` + +**Note:** ```[1.3.1,)<\version>``` indicates that we will fetch packages with version 1.3.1 or higher. This will always ensure that the pom.xml is able to fetch the latest and greatest jar files from Maven. + +### Nightly ### + +Apart from these, the nightly builds representing the bleeding edge development on Scala/Java packages are also available on the [MXNet Maven Nexus Package Repository](https://repository.apache.org/#nexus-search;gav~org.apache.mxnet~~~~). +Currently we provide nightly packages for Linux (CPU and GPU) and MacOS (CPU only). The Linux nightly jar files also work on CentOS. Nightly packages for Windows will come soon. + +Add the following ```repository``` to your project's ```pom.xml``` file : + +````html + + + Apache Snapshot + https://repository.apache.org/content/groups/snapshots + + +```` + +Also, add the dependency which corresponds to your platform to the ```dependencies``` tag : + +**Linux GPU** + +maven badge + +```HTML + + org.apache.mxnet + mxnet-full_2.11-linux-x86_64-gpu + [1.5.0,) + +``` + +**Linux CPU** + +maven badge ```HTML org.apache.mxnet - mxnet-full_2.10-linux-x86_64-gpu - 0.1.1 + mxnet-full_2.11-linux-x86_64-cpu + [1.5.0,) ``` -You can also use `mxnet-core_2.10-0.1.1.jar` and put the compiled native library somewhere in your load path. +**macOS CPU** +maven badge ```HTML org.apache.mxnet - mxnet-core_2.10 - 0.1.1 + mxnet-full_2.11-osx-x86_64-cpu + [1.5.0,) ``` -If you have some native libraries conflict with the ones in the provided 'full' jar (e.g., you use openblas instead of atlas), this is a recommended way. -Refer to the next section for how to build it from the very source. +**Note:** ```[1.5.0,)<\version>``` indicates that we will fetch packages with version 1.5.0 or higher. This will always ensure that the pom.xml is able to fetch the latest and greatest jar files from Maven Snapshot repository. -Build +Build From Source ------------ -Checkout the [Installation Guide](http://mxnet.incubator.apache.org/install/index.html) contains instructions to install mxnet. -Then you can compile the Scala Package by +Checkout the [Installation Guide](http://mxnet.incubator.apache.org/install/index.html) contains instructions to install mxnet package and build it from source. +If you have built MXNet from source and are looking to setup Scala from that point, you may simply run the following from the MXNet source root: ```bash make scalapkg ``` -(Optional) run unit/integration tests by +You can also run the unit tests and integration tests on the Scala Package by : ```bash make scalaunittest make scalaintegrationtest ``` -Or run a subset of unit tests by, e.g., +Or run a subset of unit tests, for e.g., ```bash make SCALA_TEST_ARGS=-Dsuites=org.apache.mxnet.NDArraySuite scalaunittest @@ -70,123 +139,38 @@ make SCALA_TEST_ARGS=-Dsuites=org.apache.mxnet.NDArraySuite scalaunittest If everything goes well, you will find jars for `assembly`, `core` and `example` modules. Also it produces the native library in `native/{your-architecture}/target`, which you can use to cooperate with the `core` module. -Once you've downloaded and unpacked MNIST dataset to `./data/`, run the training example by - -```bash -java -Xmx4G -cp \ - scala-package/assembly/{your-architecture}/target/*:scala-package/examples/target/*:scala-package/examples/target/classes/lib/* \ - org.apache.mxnet.examples.imclassification.TrainMnist \ - --data-dir=./data/ \ - --num-epochs=10 \ - --network=mlp \ - --cpus=0,1,2,3 -``` +Examples & Usage +------- +- To set up the Scala Project using IntelliJ IDE on macOS follow the instructions [here](https://mxnet.incubator.apache.org/tutorials/scala/mxnet_scala_on_intellij.html). +- Several examples on using the Scala APIs are provided in the [Scala Examples Folder](https://github.com/apache/incubator-mxnet/tree/master/scala-package/examples/) -If you've compiled with `USE_DIST_KVSTORE` enabled, the python tools in `mxnet/tracker` can be used to launch distributed training. -The following command runs the above example using 2 worker nodes (and 2 server nodes) in local. Refer to [Distributed Training](http://mxnet.incubator.apache.org/how_to/multi_devices.html) for more details. +Scala Training APIs +------- +- Module API : +[The Module API](https://mxnet.incubator.apache.org/api/scala/module.html) provides an intermediate and high-level interface for performing computation with neural networks in MXNet. Modules provide high-level APIs for training, predicting, and evaluating. -```bash -tracker/dmlc_local.py -n 2 -s 2 \ - java -Xmx4G -cp \ - scala-package/assembly/{your-architecture}/target/*:scala-package/examples/target/*:scala-package/examples/target/classes/lib/* \ - org.apache.mxnet.examples.imclassification.TrainMnist \ - --data-dir=./data/ \ - --num-epochs=10 \ - --network=mlp \ - --cpus=0 \ - --kv-store=dist_sync -``` +- KVStore API : +To run training over multiple GPUs and multiple hosts, one can use the [KVStore API](https://mxnet.incubator.apache.org/api/scala/kvstore.html). -Change the arguments and have fun! +- IO/Data Loading : +MXNet Scala provides APIs for preparing data to feed as an input to models. Check out [Data Loading API](https://mxnet.incubator.apache.org/api/scala/io.html) for more info. + +Other available Scala APIs for training can be found [here](https://mxnet.incubator.apache.org/api/scala/index.html). + -Usage +Scala Inference APIs ------- -Here is a Scala example of what training a simple 3-layer multilayer perceptron on MNIST looks like. You can download the MNIST dataset using [get_mnist_data script](https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/scripts/get_mnist_data.sh). - -```scala -import org.apache.mxnet._ -import org.apache.mxnet.optimizer.SGD - -// model definition -val data = Symbol.Variable("data") -val fc1 = Symbol.FullyConnected(name = "fc1")()(Map("data" -> data, "num_hidden" -> 128)) -val act1 = Symbol.Activation(name = "relu1")()(Map("data" -> fc1, "act_type" -> "relu")) -val fc2 = Symbol.FullyConnected(name = "fc2")()(Map("data" -> act1, "num_hidden" -> 64)) -val act2 = Symbol.Activation(name = "relu2")()(Map("data" -> fc2, "act_type" -> "relu")) -val fc3 = Symbol.FullyConnected(name = "fc3")()(Map("data" -> act2, "num_hidden" -> 10)) -val mlp = Symbol.SoftmaxOutput(name = "sm")()(Map("data" -> fc3)) - -// load MNIST dataset -val trainDataIter = IO.MNISTIter(Map( - "image" -> "data/train-images-idx3-ubyte", - "label" -> "data/train-labels-idx1-ubyte", - "data_shape" -> "(1, 28, 28)", - "label_name" -> "sm_label", - "batch_size" -> "50", - "shuffle" -> "1", - "flat" -> "0", - "silent" -> "0", - "seed" -> "10")) - -val valDataIter = IO.MNISTIter(Map( - "image" -> "data/t10k-images-idx3-ubyte", - "label" -> "data/t10k-labels-idx1-ubyte", - "data_shape" -> "(1, 28, 28)", - "label_name" -> "sm_label", - "batch_size" -> "50", - "shuffle" -> "1", - "flat" -> "0", "silent" -> "0")) - -// setup model and fit the training data -val model = FeedForward.newBuilder(mlp) - .setContext(Context.cpu()) - .setNumEpoch(10) - .setOptimizer(new SGD(learningRate = 0.1f, momentum = 0.9f, wd = 0.0001f)) - .setTrainData(trainDataIter) - .setEvalData(valDataIter) - .build() -``` +The [Scala Inference APIs](https://mxnet.incubator.apache.org/api/scala/infer.html) provide an easy, out of the box solution to load a pre-trained MXNet model and run inference on it. The Inference APIs are present in the [Infer Package](https://github.com/apache/incubator-mxnet/tree/master/scala-package/infer) under the MXNet Scala Package repository, while the documentation for the Infer API is available [here](https://mxnet.incubator.apache.org/api/scala/docs/index.html#org.apache.mxnet.infer.package). -Predict using the model in the following way: - -```scala -val probArrays = model.predict(valDataIter) -// in this case, we do not have multiple outputs -require(probArrays.length == 1) -val prob = probArrays(0) - -// get real labels -import scala.collection.mutable.ListBuffer -valDataIter.reset() -val labels = ListBuffer.empty[NDArray] -while (valDataIter.hasNext) { - val evalData = valDataIter.next() - labels += evalData.label(0).copy() -} -val y = NDArray.concatenate(labels) - -// get predicted labels -val py = NDArray.argmax_channel(prob) -require(y.shape == py.shape) - -// calculate accuracy -var numCorrect = 0 -var numInst = 0 -for ((labelElem, predElem) <- y.toArray zip py.toArray) { - if (labelElem == predElem) { - numCorrect += 1 - } - numInst += 1 -} -val acc = numCorrect.toFloat / numInst -println(s"Final accuracy = $acc") -``` +Java Inference APIs +------- +The [Java Inference APIs](http://mxnet.incubator.apache.org/api/java/index.html) also provide an easy, out of the box solution to load a pre-trained MXNet model and run inference on it. The Inference APIs are present in the [Infer Package](https://github.com/apache/incubator-mxnet/tree/master/scala-package/infer/src/main/scala/org/apache/mxnet/infer/javaapi) under the MXNet Scala Package repository, while the documentation for the Infer API is available [here](https://mxnet.incubator.apache.org/api/java/docs/index.html#org.apache.mxnet.infer.package). +More APIs will be added to the Java Inference APIs soon. -Release +JVM Memory Management ------- -- Version 0.1.1, March 24, 2016. - - Bug fix for MAE & MSE metrics. -- Version 0.1.0, March 22, 2016. +The Scala/Java APIs also provide an automated resource management system, thus making it easy to manage the native memory footprint without any degradation in performance. +More details about JVM Memory Management are available [here](https://github.com/apache/incubator-mxnet/blob/master/scala-package/memory-management.md). License -------