Documentation update related to sparse support (#12367)

* Update sparse.md * Update sparse.md * Update csr.md * Update row_sparse.md * Update train.md
apache · Aug 27, 2018 · d9ea96a · d9ea96a
1 parent cb7dc7f
commit d9ea96a
Show file tree

Hide file tree

Showing 5 changed files with 9 additions and 21 deletions.
diff --git a/docs/api/python/ndarray/sparse.md b/docs/api/python/ndarray/sparse.md
@@ -16,7 +16,7 @@ This document lists the routines of the *n*-dimensional sparse array package:
 ```
 
 The `CSRNDArray` and `RowSparseNDArray` API, defined in the `ndarray.sparse` package, provides
-imperative sparse tensor operations on **CPU**.
+imperative sparse tensor operations.
 
 An `CSRNDArray` inherits from `NDArray`, and represents a two-dimensional, fixed-size array in compressed sparse row format.
 
@@ -63,16 +63,13 @@ A detailed tutorial is available at
 
 ```eval_rst
 
-.. note:: ``mxnet.ndarray.sparse.RowSparseNDArray`` and ``mxnet.ndarray.sparse.CSRNDArray`` DO NOT support the ``mxnet.gluon`` high-level interface yet.
-
 .. note:: ``mxnet.ndarray.sparse`` is similar to ``mxnet.ndarray`` in some aspects. But the differences are not negligible. For instance:
 
-   - Only a subset of operators in ``mxnet.ndarray`` have specialized implementations in ``mxnet.ndarray.sparse``.
-     Operators such as Convolution and broadcasting do not have sparse implementations yet.
+   - Only a subset of operators in ``mxnet.ndarray`` have efficient sparse implementations in ``mxnet.ndarray.sparse``.
+   - If an operator do not occur in the ``mxnet.ndarray.sparse`` namespace, that means the operator does not have an efficient sparse implementation yet. If sparse inputs are passed to such an operator, it will convert inputs to the dense format and fallback to the already available dense implementation.
    - The storage types (``stype``) of sparse operators' outputs depend on the storage types of inputs.
      By default the operators not available in ``mxnet.ndarray.sparse`` infer "default" (dense) storage type for outputs.
      Please refer to the [API Reference](#api-reference) section for further details on specific operators.
-   - GPU support for ``mxnet.ndarray.sparse`` is experimental. Only a few sparse operators are supported on GPU such as ``sparse.dot``.
 
 .. note:: ``mxnet.ndarray.sparse.CSRNDArray`` is similar to ``scipy.sparse.csr_matrix`` in some aspects. But they differ in a few aspects:
 
@@ -559,7 +556,6 @@ We summarize the interface for each class in the following sections.
     sgd_update
     sgd_mom_update
     adam_update
-    ftrl_update
     adagrad_update
 ```
 

diff --git a/docs/api/python/symbol/sparse.md b/docs/api/python/symbol/sparse.md
@@ -16,7 +16,7 @@ This document lists the routines of the sparse symbolic expression package:
 ```
 
 The `Sparse Symbol` API, defined in the `symbol.sparse` package, provides
-sparse neural network graphs and auto-differentiation on CPU.
+sparse neural network graphs and auto-differentiation.
 
 The storage type of a variable is speficied by the `stype` attribute of the variable.
 The storage type of a symbolic expression is inferred based on the storage types of the variables and the operators.
@@ -43,12 +43,11 @@ array([ 1.,  1.],
 .. note:: most operators provided in ``mxnet.symbol.sparse`` are similar to those in
    ``mxnet.symbol`` although there are few differences:
 
-   - Only a subset of operators in ``mxnet.symbol`` have specialized implementations in ``mxnet.symbol.sparse``.
-     Operators such as reduction and broadcasting do not have sparse implementations yet.
+   - Only a subset of operators in ``mxnet.symbol`` have efficient sparse implementations in ``mxnet.symbol.sparse``.
+   - If an operator do not occur in the ``mxnet.symbol.sparse`` namespace, that means the operator does not have an efficient sparse implementation yet. If sparse inputs are passed to such an operator, it will convert inputs to the dense format and fallback to the already available dense implementation.
    - The storage types (``stype``) of sparse operators' outputs depend on the storage types of inputs.
      By default the operators not available in ``mxnet.symbol.sparse`` infer "default" (dense) storage type for outputs.
      Please refer to the API reference section for further details on specific operators.
-   - GPU support for ``mxnet.symbol.sparse`` is experimental.
 
 ```
 

diff --git a/docs/tutorials/sparse/csr.md b/docs/tutorials/sparse/csr.md
@@ -512,9 +512,7 @@ Note that in the file the column indices are expected to be sorted in ascending
 
 ### GPU Support
 
-By default, `CSRNDArray` operators are executed on CPU. In MXNet, GPU support for `CSRNDArray` is experimental with only a few sparse operators such as [dot](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html#mxnet.ndarray.sparse.dot).
-
-To create a `CSRNDArray` on a GPU, we need to explicitly specify the context:
+By default, `CSRNDArray` operators are executed on CPU. To create a `CSRNDArray` on a GPU, we need to explicitly specify the context:
 
 **Note** If a GPU is not available, an error will be reported in the following section. In order to execute it a cpu, set `gpu_device` to `mx.cpu()`.
 

diff --git a/docs/tutorials/sparse/row_sparse.md b/docs/tutorials/sparse/row_sparse.md
@@ -541,12 +541,7 @@ Note that only [mxnet.optimizer.SGD](https://mxnet.incubator.apache.org/api/pyth
 
 ### GPU Support
 
-By default, RowSparseNDArray operators are executed on CPU. In MXNet, GPU support for RowSparseNDArray is limited
-to a few sparse operators such as [sgd_update](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html#mxnet.ndarray.sparse.sgd_update),
-[dot](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html#mxnet.ndarray.sparse.dot) and
-[Embedding](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.Embedding).
-
-To create a RowSparseNDArray on gpu, we need to explicitly specify the context:
+By default, RowSparseNDArray operators are executed on CPU. To create a RowSparseNDArray on gpu, we need to explicitly specify the context:
 
 **Note** If a GPU is not available, an error will be reported in the following section. In order to execute it on a cpu, set gpu_device to mx.cpu().
 

diff --git a/docs/tutorials/sparse/train.md b/docs/tutorials/sparse/train.md
@@ -314,7 +314,7 @@ assert metric.get()[1] < 1, "Achieved MSE (%f) is larger than expected (1.0)" %
 
 ### Training the model with multiple machines or multiple devices
 
-To train a sparse model with multiple machines, you need to call `prepare` before `forward`, or `save_checkpoint`.
+Distributed training with `row_sparse` weights and gradients are supported in MXNet, which significantly reduces communication cost for large models. To train a sparse model with multiple machines, you need to call `prepare` before `forward`, or `save_checkpoint`.
 Please refer to the example in [mxnet/example/sparse/linear_classification](https://github.com/apache/incubator-mxnet/tree/master/example/sparse/linear_classification)
 for more details.