From 3b53981a4fe932e8ae80e4ea1ab5cd0260a12574 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 17:32:47 -0700
Subject: [PATCH 01/15] add tensor inspector tutorial

---
 3rdparty/mkldnn                       |   2 +-
 docs/faq/tensor_inspector_tutorial.md | 147 ++++++++++++++++++++++++++
 2 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 docs/faq/tensor_inspector_tutorial.md
diff --git a/3rdparty/mkldnn b/3rdparty/mkldnn
index d89bf4babd7c..41bee20d7eb4 160000
--- a/3rdparty/mkldnn
+++ b/3rdparty/mkldnn
@@ -1 +1 @@
-Subproject commit d89bf4babd7cce7efa6613387dca79c123164084
+Subproject commit 41bee20d7eb4a67feeeeb8d597b3598994eb1959
diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
new file mode 100644
index 000000000000..23d2fc0d08f1
--- /dev/null
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -0,0 +1,147 @@
+## Introduction
+
+When developing new operators, developers need to deal with tensor objects extensively. This new utility, Tensor Inspector, mainly aims to help developers debug by providing unified interfaces to print, check, and dump the tensor value. To developers' convenience, This utility works for all the three data types: Tensors, TBlobs, and NDArrays. Also, it supports both CPU and GPU tensors.
+
+
+## Usage 
+
+This utility locates in `src/common/tensor_inspector.h`. To use it in any operator code, just include `tensor_inspector`, construct an `TensorInspector` object, and call the APIs on that object. You can run any script that uses the operator you just modified then.
+
+The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`.
+
+![Screen Shot 2019-07-08 at 5 03 46 PM](https://user-images.githubusercontent.com/16669457/60850062-68690e00-a1a2-11e9-8268-033edde17aa4.png)
+
+
+## Functionalities/APIs
+
+### Create a TensorInspector Object from Tensor, TBlob, and NDArray Objects
+
+You can create a `TensorInspector` object by passing in two things: 1) an object of type `Tensor`, `Tbob`, or `NDArray`, and 2) an `RunContext` object.
+
+Essentially, `TensorInspector` can be understood as a wrapper class around `TBlob`. Internally, the `Tensor`, `Tbob`, or `NDArray` object that you passed in will all be converted to a `TBlob` object. The `RunContext` object is used when the the tensor is a GPU tensor; in such case, we need to use the context information to copy the data from GPU memory to CPU/main memory.
+
+Below are the three constructors:
+
+```c++
+// Construct from Tensor object
+template<typename Device, int dimension, typename DType MSHADOW_DEFAULT_DTYPE>
+TensorInspector(const  mshadow::Tensor<Device, dimension, DType>& ts, const RunContext& ctx);
+
+// Construct from TBlob object
+TensorInspector(const TBlob& tb, const RunContext& ctx);
+
+// Construct from NDArray object
+TensorInspector(const NDArray& arr, const RunContext& ctx):
+```
+
+### Print Tensor Value (Static) 
+
+To print out the tensor value in a nicely structured way,  you can use this API:
+
+```c++
+void print_string();
+```
+
+This API will print the entire tensor to `std::cout` and preserve the shape (it supports all dimensions from 1 and up). You can copy the output and interpret it with any `JSON` loader. Also, on the last line of the output you can find some useful information about the tensor. Refer to the case below, we are able to know that this is a float-typed tensor with shape 20x1x5x5.
+
+![Screen Shot 2019-07-08 at 4 07 16 PM](https://user-images.githubusercontent.com/16669457/60848554-d8c06100-a19b-11e9-9fe0-23e79a7a371a.png)
+
+If instead of printing the tensor to `std::cout`, you just need a `string`, you can use this API:
+```c++
+std::string void to_string();
+```
+
+### Interactively Print Tensor Value (Dynamic) 
+
+When debugging, situations might occur that at compilation time, you do not know which part of a tensor to inspect. Also, sometimes, it would be nice to pause the operator control flow to “zoom into” a specific, erroneous part of a tensor multiple times until you are satisfied. In this regard, you can use this API to interactively inspect the tensor:
+
+```c++
+void  interactive_print(std::string tag =  "") {
+```
+
+This API will set a "break point" in your code, so that you will enter a loop that will keep asking you for further command. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you for how many times have you stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
+
+![Screen Shot 2019-07-10 at 5 29 07 PM](https://user-images.githubusercontent.com/16669457/61013632-5325e800-a338-11e9-90e6-607f17d81495.png)
+
+Refer the screenshot above, there are many useful commands available: you can type "e" to print out the entire tensor, ''d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()`. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 20x1x5x5 tensor, you can type in "0, 0" and presss enter to check the sub-tensor with shape 5x5 at coordinate (0, 0). 
+
+### Check Tensor Value
+
+Sometimes, developers might want to check if the tensor contains unexpected values which could be negative values, NaNs, infinities or others. To facilitate that, you can use these APIs:
+
+```c++
+template<typename ValueChecker>
+std::vector<std::vector<int>> check_value(const ValueChecker& checker,
+		bool interactive = false, std::string tag = "");
+// OR
+std::vector<std::vector<int>> check_value(CheckerType ct,
+		bool interactive = false, std::string tag =  "");
+```
+
+In the first API, `ValueChecker checker` is a bool lambda function that takes in a single parameter which is of the same data type as the tensor.  For example:
+
+```c++
+// use the same DType as in the tensor object
+[] (DType x) {return x == 0};
+```
+
+This checker is called on every value within the tensor. The return of the API is a `vector` of all the coordinates where the checker evaluates to `true`. The coordinates are themselves represented by `vector<int>`. If you set `interactive` to true, you will set a "break point" and enter a loop that asks for commands. This is similar to `interactive_print()`. You can type "p" to print the coordinates, "b" to break from the loop, and "s" to skip all future "break points" in `interactive_print()`.  Just like `interactive_print()`, this this interactive screen is also properly locked.
+
+![Screen Shot 2019-07-10 at 5 34 20 PM](https://user-images.githubusercontent.com/16669457/61013773-fe36a180-a338-11e9-9a2b-5f11ccc7afa7.png)
+
+Also, there are a bunch of built-int value checkers. Refer to the Enum below:
+
+```c++
+enum  CheckerType {
+	NegativeChecker, // check if is negative
+	PositiveChecker, // check if is positive
+	ZeroChecker, // check if is zero
+	NaNChecker, // check if is NaN, will always return false if DType is not a float type
+	InfChecker, // check if is infinity, will always return false if DType is not a float type
+	PositiveInfChecker, // check if is positive infinity,
+						// will always return false if DType is not a float type
+	NegativeInfChecker, // check if is nagative infinity,
+						// will always return false if DType is not a float type
+	FiniteChecker, // check if is finite, will always return false if DType is not a float type
+	NormalChecker, // check if is neither infinity nor NaN
+	AbnormalChecker, // chekck if is infinity or nan
+};
+```
+
+Remember the second API?
+
+```c++
+std::vector<std::vector<int>> check_value(CheckerType ct,
+		bool interactive = false, std::string tag =  "");
+```
+
+You can simply pass in a value from `CheckerType` where you would have passed in your own lambda if you were using the first API.
+
+### Dump Tensor Value
+
+Sometimes, you might want to dump the tensor to a file in binary mode. Then, you might want to use a python script to further analyze the tensor value.  Or, you might do that simply because a binary dumps has better precision and is faster to load than if you copy-paste the output from `print_string()` and load it as a `JASON` string. Either way, you can use this API:
+
+```c++
+void dump_value(std::string tag);
+```
+
+This API will creat a file with name  "{tag}_{visit_count}.npy", where tag is the name that we give to the call, and visit is the visit count, should the operated be called more than once.
+
+The output format is `.npy`, version 1.0. This is the Numpy format and we can easily load it with the following code:
+
+```
+import numpy as np
+a = np.load('abc_1.npy')
+print(a)
+```
+
+Let's see the how it runs:
+
+![Screen Shot 2019-07-10 at 5 17 29 PM](https://user-images.githubusercontent.com/16669457/61013259-cc244000-a336-11e9-8564-a018041634f6.png)
+
+Notice: in `interactive_print()`, you could also do value dumping with command "d". You will be prompt to enter the `tag` value:
+
+![Screen Shot 2019-07-11 at 4 57 41 PM](https://user-images.githubusercontent.com/16669457/61092906-0f48e680-a3fd-11e9-8251-c4371cdd00ad.png)
+
+
+

From 160b8912e14746f2de1b660c6b24a24197ffcb46 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 17:40:21 -0700
Subject: [PATCH 02/15] link docs

---
 docs/faq/add_op_in_backend.md         | 3 +++
 docs/faq/develop_and_hack.md          | 1 +
 docs/faq/tensor_inspector_tutorial.md | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/docs/faq/add_op_in_backend.md b/docs/faq/add_op_in_backend.md
index 15f4ed9fbab4..1345b144778d 100644
--- a/docs/faq/add_op_in_backend.md
+++ b/docs/faq/add_op_in_backend.md
@@ -674,3 +674,6 @@ We welcome your contributions to MXNet.
 [quadratic_op.cu](https://github.com/apache/incubator-mxnet/blob/master/src/operator/contrib/quadratic_op.cu),
 and
 [test_operator.py](https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L6514).
+
+# Additional Debug Tool
+- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
diff --git a/docs/faq/develop_and_hack.md b/docs/faq/develop_and_hack.md
index 0e7d221f7dc3..74ac5ac58212 100644
--- a/docs/faq/develop_and_hack.md
+++ b/docs/faq/develop_and_hack.md
@@ -19,6 +19,7 @@
 - [Create new operators](new_op.md)
 - [Use Torch from MXNet](torch.md)
 - [Set environment variables of MXNet](env_var.md)
+- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
 
 # Other Resources
 - [MXNet System Architecture Overview](/architecture/overview.html)
diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index 23d2fc0d08f1..6b45852c7f86 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -1,3 +1,5 @@
+# Use TensorInspector to Help Debug Operators
+
 ## Introduction
 
 When developing new operators, developers need to deal with tensor objects extensively. This new utility, Tensor Inspector, mainly aims to help developers debug by providing unified interfaces to print, check, and dump the tensor value. To developers' convenience, This utility works for all the three data types: Tensors, TBlobs, and NDArrays. Also, it supports both CPU and GPU tensors.

From f93ae219262513e8ce52c0d68abd8eb3f40b2ed5 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 17:42:11 -0700
Subject: [PATCH 03/15] link docs

---
 docs/faq/add_op_in_backend.md | 4 ++--
 docs/faq/develop_and_hack.md  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/faq/add_op_in_backend.md b/docs/faq/add_op_in_backend.md
index 1345b144778d..248a0b058370 100644
--- a/docs/faq/add_op_in_backend.md
+++ b/docs/faq/add_op_in_backend.md
@@ -675,5 +675,5 @@ We welcome your contributions to MXNet.
 and
 [test_operator.py](https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L6514).
 
-# Additional Debug Tool
-- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
+## Additional Resources
+- [Use TensorInspector to Help Debug Operators](tensor_inspector_tutorial.md)
diff --git a/docs/faq/develop_and_hack.md b/docs/faq/develop_and_hack.md
index 74ac5ac58212..d2710e923be4 100644
--- a/docs/faq/develop_and_hack.md
+++ b/docs/faq/develop_and_hack.md
@@ -19,7 +19,7 @@
 - [Create new operators](new_op.md)
 - [Use Torch from MXNet](torch.md)
 - [Set environment variables of MXNet](env_var.md)
-- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
+- [Use TensorInspector to Help Debug Operators](tensor_inspector_tutorial.md)
 
 # Other Resources
 - [MXNet System Architecture Overview](/architecture/overview.html)

From 32881e5acff6a0dc833e52f9c82c4967df40f006 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 18:06:45 -0700
Subject: [PATCH 04/15] add license

---
 docs/faq/tensor_inspector_tutorial.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index 6b45852c7f86..e189e253bcd8 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -1,3 +1,18 @@
+<!--- Licensed to the Apache Software Foundation (ASF) under one -->
+<!--- or more contributor license agreements.  See the NOTICE file -->
+<!--- distributed with this work for additional information -->
+<!--- regarding copyright ownership.  The ASF licenses this file -->
+<!--- to you under the Apache License, Version 2.0 (the -->
+<!--- "License"); you may not use this file except in compliance -->
+<!--- with the License.  You may obtain a copy of the License at -->
+<!---   http://www.apache.org/licenses/LICENSE-2.0 -->
+<!--- Unless required by applicable law or agreed to in writing, -->
+<!--- software distributed under the License is distributed on an -->
+<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
+<!--- KIND, either express or implied.  See the License for the -->
+<!--- specific language governing permissions and limitations -->
+<!--- under the License. -->
+
 # Use TensorInspector to Help Debug Operators
 
 ## Introduction

From 74fde2e719e11ed0e2c852af46d4770155ac6e36 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 18:09:19 -0700
Subject: [PATCH 05/15] Revert "add license"

This reverts commit 32881e5acff6a0dc833e52f9c82c4967df40f006.
---
 docs/faq/tensor_inspector_tutorial.md | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index e189e253bcd8..6b45852c7f86 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -1,18 +1,3 @@
-<!--- Licensed to the Apache Software Foundation (ASF) under one -->
-<!--- or more contributor license agreements.  See the NOTICE file -->
-<!--- distributed with this work for additional information -->
-<!--- regarding copyright ownership.  The ASF licenses this file -->
-<!--- to you under the Apache License, Version 2.0 (the -->
-<!--- "License"); you may not use this file except in compliance -->
-<!--- with the License.  You may obtain a copy of the License at -->
-<!---   http://www.apache.org/licenses/LICENSE-2.0 -->
-<!--- Unless required by applicable law or agreed to in writing, -->
-<!--- software distributed under the License is distributed on an -->
-<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
-<!--- KIND, either express or implied.  See the License for the -->
-<!--- specific language governing permissions and limitations -->
-<!--- under the License. -->
-
 # Use TensorInspector to Help Debug Operators
 
 ## Introduction

From af8cad8103458ce97bd2f764abfcfba26ad233f2 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 18:09:30 -0700
Subject: [PATCH 06/15] Revert "link docs"

This reverts commit f93ae219262513e8ce52c0d68abd8eb3f40b2ed5.
---
 docs/faq/add_op_in_backend.md | 4 ++--
 docs/faq/develop_and_hack.md  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/faq/add_op_in_backend.md b/docs/faq/add_op_in_backend.md
index 248a0b058370..1345b144778d 100644
--- a/docs/faq/add_op_in_backend.md
+++ b/docs/faq/add_op_in_backend.md
@@ -675,5 +675,5 @@ We welcome your contributions to MXNet.
 and
 [test_operator.py](https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L6514).
 
-## Additional Resources
-- [Use TensorInspector to Help Debug Operators](tensor_inspector_tutorial.md)
+# Additional Debug Tool
+- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
diff --git a/docs/faq/develop_and_hack.md b/docs/faq/develop_and_hack.md
index d2710e923be4..74ac5ac58212 100644
--- a/docs/faq/develop_and_hack.md
+++ b/docs/faq/develop_and_hack.md
@@ -19,7 +19,7 @@
 - [Create new operators](new_op.md)
 - [Use Torch from MXNet](torch.md)
 - [Set environment variables of MXNet](env_var.md)
-- [Use TensorInspector to Help Debug Operators](tensor_inspector_tutorial.md)
+- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
 
 # Other Resources
 - [MXNet System Architecture Overview](/architecture/overview.html)

From ea18c63bb04a82b351de39098751e89e84d1b114 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 18:09:40 -0700
Subject: [PATCH 07/15] Revert "link docs"

This reverts commit 160b8912e14746f2de1b660c6b24a24197ffcb46.
---
 docs/faq/add_op_in_backend.md         | 3 ---
 docs/faq/develop_and_hack.md          | 1 -
 docs/faq/tensor_inspector_tutorial.md | 2 --
 3 files changed, 6 deletions(-)

diff --git a/docs/faq/add_op_in_backend.md b/docs/faq/add_op_in_backend.md
index 1345b144778d..15f4ed9fbab4 100644
--- a/docs/faq/add_op_in_backend.md
+++ b/docs/faq/add_op_in_backend.md
@@ -674,6 +674,3 @@ We welcome your contributions to MXNet.
 [quadratic_op.cu](https://github.com/apache/incubator-mxnet/blob/master/src/operator/contrib/quadratic_op.cu),
 and
 [test_operator.py](https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L6514).
-
-# Additional Debug Tool
-- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
diff --git a/docs/faq/develop_and_hack.md b/docs/faq/develop_and_hack.md
index 74ac5ac58212..0e7d221f7dc3 100644
--- a/docs/faq/develop_and_hack.md
+++ b/docs/faq/develop_and_hack.md
@@ -19,7 +19,6 @@
 - [Create new operators](new_op.md)
 - [Use Torch from MXNet](torch.md)
 - [Set environment variables of MXNet](env_var.md)
-- [Use TensorInspector to help debug](tensor_inspector_tutorial.md)
 
 # Other Resources
 - [MXNet System Architecture Overview](/architecture/overview.html)
diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index 6b45852c7f86..23d2fc0d08f1 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -1,5 +1,3 @@
-# Use TensorInspector to Help Debug Operators
-
 ## Introduction
 
 When developing new operators, developers need to deal with tensor objects extensively. This new utility, Tensor Inspector, mainly aims to help developers debug by providing unified interfaces to print, check, and dump the tensor value. To developers' convenience, This utility works for all the three data types: Tensors, TBlobs, and NDArrays. Also, it supports both CPU and GPU tensors.

From 9a75dba7048916859390bc5916d70d098ddabf5c Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 18:09:56 -0700
Subject: [PATCH 08/15] Revert "add tensor inspector tutorial"

This reverts commit 3b53981a4fe932e8ae80e4ea1ab5cd0260a12574.
---
 3rdparty/mkldnn                       |   2 +-
 docs/faq/tensor_inspector_tutorial.md | 147 --------------------------
 2 files changed, 1 insertion(+), 148 deletions(-)
 delete mode 100644 docs/faq/tensor_inspector_tutorial.md

diff --git a/3rdparty/mkldnn b/3rdparty/mkldnn
index 41bee20d7eb4..d89bf4babd7c 160000
--- a/3rdparty/mkldnn
+++ b/3rdparty/mkldnn
@@ -1 +1 @@
-Subproject commit 41bee20d7eb4a67feeeeb8d597b3598994eb1959
+Subproject commit d89bf4babd7cce7efa6613387dca79c123164084
diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
deleted file mode 100644
index 23d2fc0d08f1..000000000000
--- a/docs/faq/tensor_inspector_tutorial.md
+++ /dev/null
@@ -1,147 +0,0 @@
-## Introduction
-
-When developing new operators, developers need to deal with tensor objects extensively. This new utility, Tensor Inspector, mainly aims to help developers debug by providing unified interfaces to print, check, and dump the tensor value. To developers' convenience, This utility works for all the three data types: Tensors, TBlobs, and NDArrays. Also, it supports both CPU and GPU tensors.
-
-
-## Usage 
-
-This utility locates in `src/common/tensor_inspector.h`. To use it in any operator code, just include `tensor_inspector`, construct an `TensorInspector` object, and call the APIs on that object. You can run any script that uses the operator you just modified then.
-
-The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`.
-
-![Screen Shot 2019-07-08 at 5 03 46 PM](https://user-images.githubusercontent.com/16669457/60850062-68690e00-a1a2-11e9-8268-033edde17aa4.png)
-
-
-## Functionalities/APIs
-
-### Create a TensorInspector Object from Tensor, TBlob, and NDArray Objects
-
-You can create a `TensorInspector` object by passing in two things: 1) an object of type `Tensor`, `Tbob`, or `NDArray`, and 2) an `RunContext` object.
-
-Essentially, `TensorInspector` can be understood as a wrapper class around `TBlob`. Internally, the `Tensor`, `Tbob`, or `NDArray` object that you passed in will all be converted to a `TBlob` object. The `RunContext` object is used when the the tensor is a GPU tensor; in such case, we need to use the context information to copy the data from GPU memory to CPU/main memory.
-
-Below are the three constructors:
-
-```c++
-// Construct from Tensor object
-template<typename Device, int dimension, typename DType MSHADOW_DEFAULT_DTYPE>
-TensorInspector(const  mshadow::Tensor<Device, dimension, DType>& ts, const RunContext& ctx);
-
-// Construct from TBlob object
-TensorInspector(const TBlob& tb, const RunContext& ctx);
-
-// Construct from NDArray object
-TensorInspector(const NDArray& arr, const RunContext& ctx):
-```
-
-### Print Tensor Value (Static) 
-
-To print out the tensor value in a nicely structured way,  you can use this API:
-
-```c++
-void print_string();
-```
-
-This API will print the entire tensor to `std::cout` and preserve the shape (it supports all dimensions from 1 and up). You can copy the output and interpret it with any `JSON` loader. Also, on the last line of the output you can find some useful information about the tensor. Refer to the case below, we are able to know that this is a float-typed tensor with shape 20x1x5x5.
-
-![Screen Shot 2019-07-08 at 4 07 16 PM](https://user-images.githubusercontent.com/16669457/60848554-d8c06100-a19b-11e9-9fe0-23e79a7a371a.png)
-
-If instead of printing the tensor to `std::cout`, you just need a `string`, you can use this API:
-```c++
-std::string void to_string();
-```
-
-### Interactively Print Tensor Value (Dynamic) 
-
-When debugging, situations might occur that at compilation time, you do not know which part of a tensor to inspect. Also, sometimes, it would be nice to pause the operator control flow to “zoom into” a specific, erroneous part of a tensor multiple times until you are satisfied. In this regard, you can use this API to interactively inspect the tensor:
-
-```c++
-void  interactive_print(std::string tag =  "") {
-```
-
-This API will set a "break point" in your code, so that you will enter a loop that will keep asking you for further command. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you for how many times have you stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
-
-![Screen Shot 2019-07-10 at 5 29 07 PM](https://user-images.githubusercontent.com/16669457/61013632-5325e800-a338-11e9-90e6-607f17d81495.png)
-
-Refer the screenshot above, there are many useful commands available: you can type "e" to print out the entire tensor, ''d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()`. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 20x1x5x5 tensor, you can type in "0, 0" and presss enter to check the sub-tensor with shape 5x5 at coordinate (0, 0). 
-
-### Check Tensor Value
-
-Sometimes, developers might want to check if the tensor contains unexpected values which could be negative values, NaNs, infinities or others. To facilitate that, you can use these APIs:
-
-```c++
-template<typename ValueChecker>
-std::vector<std::vector<int>> check_value(const ValueChecker& checker,
-		bool interactive = false, std::string tag = "");
-// OR
-std::vector<std::vector<int>> check_value(CheckerType ct,
-		bool interactive = false, std::string tag =  "");
-```
-
-In the first API, `ValueChecker checker` is a bool lambda function that takes in a single parameter which is of the same data type as the tensor.  For example:
-
-```c++
-// use the same DType as in the tensor object
-[] (DType x) {return x == 0};
-```
-
-This checker is called on every value within the tensor. The return of the API is a `vector` of all the coordinates where the checker evaluates to `true`. The coordinates are themselves represented by `vector<int>`. If you set `interactive` to true, you will set a "break point" and enter a loop that asks for commands. This is similar to `interactive_print()`. You can type "p" to print the coordinates, "b" to break from the loop, and "s" to skip all future "break points" in `interactive_print()`.  Just like `interactive_print()`, this this interactive screen is also properly locked.
-
-![Screen Shot 2019-07-10 at 5 34 20 PM](https://user-images.githubusercontent.com/16669457/61013773-fe36a180-a338-11e9-9a2b-5f11ccc7afa7.png)
-
-Also, there are a bunch of built-int value checkers. Refer to the Enum below:
-
-```c++
-enum  CheckerType {
-	NegativeChecker, // check if is negative
-	PositiveChecker, // check if is positive
-	ZeroChecker, // check if is zero
-	NaNChecker, // check if is NaN, will always return false if DType is not a float type
-	InfChecker, // check if is infinity, will always return false if DType is not a float type
-	PositiveInfChecker, // check if is positive infinity,
-						// will always return false if DType is not a float type
-	NegativeInfChecker, // check if is nagative infinity,
-						// will always return false if DType is not a float type
-	FiniteChecker, // check if is finite, will always return false if DType is not a float type
-	NormalChecker, // check if is neither infinity nor NaN
-	AbnormalChecker, // chekck if is infinity or nan
-};
-```
-
-Remember the second API?
-
-```c++
-std::vector<std::vector<int>> check_value(CheckerType ct,
-		bool interactive = false, std::string tag =  "");
-```
-
-You can simply pass in a value from `CheckerType` where you would have passed in your own lambda if you were using the first API.
-
-### Dump Tensor Value
-
-Sometimes, you might want to dump the tensor to a file in binary mode. Then, you might want to use a python script to further analyze the tensor value.  Or, you might do that simply because a binary dumps has better precision and is faster to load than if you copy-paste the output from `print_string()` and load it as a `JASON` string. Either way, you can use this API:
-
-```c++
-void dump_value(std::string tag);
-```
-
-This API will creat a file with name  "{tag}_{visit_count}.npy", where tag is the name that we give to the call, and visit is the visit count, should the operated be called more than once.
-
-The output format is `.npy`, version 1.0. This is the Numpy format and we can easily load it with the following code:
-
-```
-import numpy as np
-a = np.load('abc_1.npy')
-print(a)
-```
-
-Let's see the how it runs:
-
-![Screen Shot 2019-07-10 at 5 17 29 PM](https://user-images.githubusercontent.com/16669457/61013259-cc244000-a336-11e9-8564-a018041634f6.png)
-
-Notice: in `interactive_print()`, you could also do value dumping with command "d". You will be prompt to enter the `tag` value:
-
-![Screen Shot 2019-07-11 at 4 57 41 PM](https://user-images.githubusercontent.com/16669457/61092906-0f48e680-a3fd-11e9-8251-c4371cdd00ad.png)
-
-
-

From 968f1f4cd5132d976cda1438da6a2abd7a44cc98 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Thu, 11 Jul 2019 18:10:56 -0700
Subject: [PATCH 09/15] add tensor inspector doc

---
 docs/faq/add_op_in_backend.md         |   3 +
 docs/faq/develop_and_hack.md          |   1 +
 docs/faq/tensor_inspector_tutorial.md | 164 ++++++++++++++++++++++++++
 3 files changed, 168 insertions(+)
 create mode 100644 docs/faq/tensor_inspector_tutorial.md

diff --git a/docs/faq/add_op_in_backend.md b/docs/faq/add_op_in_backend.md
index 15f4ed9fbab4..248a0b058370 100644
--- a/docs/faq/add_op_in_backend.md
+++ b/docs/faq/add_op_in_backend.md
@@ -674,3 +674,6 @@ We welcome your contributions to MXNet.
 [quadratic_op.cu](https://github.com/apache/incubator-mxnet/blob/master/src/operator/contrib/quadratic_op.cu),
 and
 [test_operator.py](https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L6514).
+
+## Additional Resources
+- [Use TensorInspector to Help Debug Operators](tensor_inspector_tutorial.md)
diff --git a/docs/faq/develop_and_hack.md b/docs/faq/develop_and_hack.md
index 0e7d221f7dc3..d2710e923be4 100644
--- a/docs/faq/develop_and_hack.md
+++ b/docs/faq/develop_and_hack.md
@@ -19,6 +19,7 @@
 - [Create new operators](new_op.md)
 - [Use Torch from MXNet](torch.md)
 - [Set environment variables of MXNet](env_var.md)
+- [Use TensorInspector to Help Debug Operators](tensor_inspector_tutorial.md)
 
 # Other Resources
 - [MXNet System Architecture Overview](/architecture/overview.html)
diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
new file mode 100644
index 000000000000..e189e253bcd8
--- /dev/null
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -0,0 +1,164 @@
+<!--- Licensed to the Apache Software Foundation (ASF) under one -->
+<!--- or more contributor license agreements.  See the NOTICE file -->
+<!--- distributed with this work for additional information -->
+<!--- regarding copyright ownership.  The ASF licenses this file -->
+<!--- to you under the Apache License, Version 2.0 (the -->
+<!--- "License"); you may not use this file except in compliance -->
+<!--- with the License.  You may obtain a copy of the License at -->
+<!---   http://www.apache.org/licenses/LICENSE-2.0 -->
+<!--- Unless required by applicable law or agreed to in writing, -->
+<!--- software distributed under the License is distributed on an -->
+<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
+<!--- KIND, either express or implied.  See the License for the -->
+<!--- specific language governing permissions and limitations -->
+<!--- under the License. -->
+
+# Use TensorInspector to Help Debug Operators
+
+## Introduction
+
+When developing new operators, developers need to deal with tensor objects extensively. This new utility, Tensor Inspector, mainly aims to help developers debug by providing unified interfaces to print, check, and dump the tensor value. For developers' convenience, this utility works for all three data types: Tensors, TBlobs, and NDArrays. It also supports both CPU and GPU tensors.
+
+
+## Usage 
+
+This utility locates in `src/common/tensor_inspector.h`. To use it in any operator code, just include `tensor_inspector`, construct an `TensorInspector` object, and call the APIs on that object. You can run any script that uses the operator you just modified then.
+
+The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`.
+
+![Screen Shot 2019-07-08 at 5 03 46 PM](https://user-images.githubusercontent.com/16669457/60850062-68690e00-a1a2-11e9-8268-033edde17aa4.png)
+
+
+## Functionalities/APIs
+
+### Create a TensorInspector Object from Tensor, TBlob, and NDArray Objects
+
+You can create a `TensorInspector` object by passing in two things: 1) an object of type `Tensor`, `TBlob`, or `NDArray`, and 2) a `RunContext` object.
+
+Essentially, `TensorInspector` can be understood as a wrapper class around `TBlob`. Internally, the `Tensor`, `Tbob`, or `NDArray` object that you passed in will all be converted to a `TBlob` object. The `RunContext` object is used when the the tensor is a GPU tensor; in such case, we need to use the context information to copy the data from GPU memory to CPU/main memory.
+
+Below are the three constructors:
+
+```c++
+// Construct from Tensor object
+template<typename Device, int dimension, typename DType MSHADOW_DEFAULT_DTYPE>
+TensorInspector(const  mshadow::Tensor<Device, dimension, DType>& ts, const RunContext& ctx);
+
+// Construct from TBlob object
+TensorInspector(const TBlob& tb, const RunContext& ctx);
+
+// Construct from NDArray object
+TensorInspector(const NDArray& arr, const RunContext& ctx);
+```
+
+### Print Tensor Value (Static) 
+
+To print out the tensor value in a nicely structured way,  you can use this API:
+
+```c++
+void print_string();
+```
+
+This API will print the entire tensor to `std::cout` and preserve the shape (it supports all dimensions from 1 and up). You can copy the output and interpret it with any `JSON` loader. Also, on the last line of the output you can find some useful information about the tensor. Refer to the case below, we are able to know that this is a float-typed tensor with shape 20x1x5x5.
+
+![Screen Shot 2019-07-08 at 4 07 16 PM](https://user-images.githubusercontent.com/16669457/60848554-d8c06100-a19b-11e9-9fe0-23e79a7a371a.png)
+
+If instead of printing the tensor to `std::cout`, you just need a `string`, you can use this API:
+```c++
+std::string to_string();
+```
+
+### Interactively Print Tensor Value (Dynamic) 
+
+When debugging, situations might occur that at compilation time, you do not know which part of a tensor to inspect. Also, sometimes, it would be nice to pause the operator control flow to “zoom into” a specific, erroneous part of a tensor multiple times until you are satisfied. In this regard, you can use this API to interactively inspect the tensor:
+
+```c++
+void interactive_print(std::string tag = "");
+```
+
+This API will set a "break point" in your code, so that you will enter a loop that will keep asking you for further command. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you for how many times have you stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
+
+![Screen Shot 2019-07-10 at 5 29 07 PM](https://user-images.githubusercontent.com/16669457/61013632-5325e800-a338-11e9-90e6-607f17d81495.png)
+
+Refer the screenshot above, there are many useful commands available: you can type "e" to print out the entire tensor, ''d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()`. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 20x1x5x5 tensor, you can type in "0, 0" and presss enter to check the sub-tensor with shape 5x5 at coordinate (0, 0). 
+
+### Check Tensor Value
+
+Sometimes, developers might want to check if the tensor contains unexpected values which could be negative values, NaNs, infinities or others. To facilitate that, you can use these APIs:
+
+```c++
+template<typename ValueChecker>
+std::vector<std::vector<int>> check_value(const ValueChecker& checker,
+		bool interactive = false, std::string tag = "");
+// OR
+std::vector<std::vector<int>> check_value(CheckerType ct,
+		bool interactive = false, std::string tag =  "");
+```
+
+In the first API, `ValueChecker checker` is a lambda function that returns `bool` and takes a single parameter of the same data type as the tensor. For example:
+
+```c++
+// use the same DType as in the tensor object
+[] (DType x) { return x == 0; };
+```
+
+This checker is called on every value within the tensor. The return of the API is a `vector` of all the coordinates where the checker evaluates to `true`. The coordinates are themselves represented by `vector<int>`. If you set `interactive` to true, you will set a "break point" and enter a loop that asks for commands. This is similar to `interactive_print()`. You can type "p" to print the coordinates, "b" to break from the loop, and "s" to skip all future "break points" in `interactive_print()`.  Just like `interactive_print()`, this this interactive screen is also properly locked.
+
+![Screen Shot 2019-07-10 at 5 34 20 PM](https://user-images.githubusercontent.com/16669457/61013773-fe36a180-a338-11e9-9a2b-5f11ccc7afa7.png)
+
+Also, there are several built-in value checkers. Refer to the enum below:
+
+```c++
+enum  CheckerType {
+	NegativeChecker, // check if is negative
+	PositiveChecker, // check if is positive
+	ZeroChecker, // check if is zero
+	NaNChecker, // check if is NaN, will always return false if DType is not a float type
+	InfChecker, // check if is infinity, will always return false if DType is not a float type
+	PositiveInfChecker, // check if is positive infinity,
+						// will always return false if DType is not a float type
+	NegativeInfChecker, // check if is nagative infinity,
+						// will always return false if DType is not a float type
+	FiniteChecker, // check if is finite, will always return false if DType is not a float type
+	NormalChecker, // check if is neither infinity nor NaN
+	AbnormalChecker, // chekck if is infinity or nan
+};
+```
+
+Remember the second API?
+
+```c++
+std::vector<std::vector<int>> check_value(CheckerType ct,
+		bool interactive = false, std::string tag =  "");
+```
+
+You can simply pass in a value from `CheckerType` where you would have passed in your own lambda if you were using the first API.
+
+### Dump Tensor Value
+
+Sometimes, you might want to dump the tensor to a file in binary mode. Then, you might want to use a python script to further analyze the tensor value.  Or, you might do that simply because a binary dumps has better precision and is faster to load than if you copy-paste the output from `print_string()` and load it as a `JASON` string. Either way, you can use this API:
+
+```c++
+void dump_value(std::string tag);
+```
+
+This API will creat a file with name  "{tag}_{visit_count}.npy", where tag is the name that we give to the call, and visit is the visit count, should the operated be called more than once.
+
+The output format is `.npy`, version 1.0. This is the NumPy format, and we can easily load it with the following code:
+
+```python
+import numpy as np
+a = np.load('abc_1.npy')
+print(a)
+```
+
+Let's see the how it runs:
+
+![Screen Shot 2019-07-10 at 5 17 29 PM](https://user-images.githubusercontent.com/16669457/61013259-cc244000-a336-11e9-8564-a018041634f6.png)
+
+Notice: in `interactive_print()`, you could also do value dumping with command "d". You will be prompt to enter the `tag` value:
+
+![Screen Shot 2019-07-11 at 4 57 41 PM](https://user-images.githubusercontent.com/16669457/61092906-0f48e680-a3fd-11e9-8251-c4371cdd00ad.png)
+
+
+

From 5f3e19eec0934e6bd7d3812e09362219c9e99674 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Fri, 12 Jul 2019 14:13:08 -0700
Subject: [PATCH 10/15] fix api name

---
 docs/faq/tensor_inspector_tutorial.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
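
Note on the renamed `dump_to_file()` API: the `{tag}_{visit_count}.npy` naming that the tutorial describes can be sketched roughly as below. This is only an illustration, not the code in `tensor_inspector.h`. `DumpFileNamer` and `NextFileName` are hypothetical names, and the sketch assumes that the visit count is tracked per tag and starts at 1 (which matches the `abc_1.npy` example in the tutorial).

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper, not part of tensor_inspector.h: builds the
// "{tag}_{visit_count}.npy" file name described in the tutorial.
// Assumption: the visit count is tracked per tag and starts at 1.
class DumpFileNamer {
 public:
  std::string NextFileName(const std::string& tag) {
    // operator[] value-initializes a missing counter to 0,
    // so the first visit for a tag yields 1.
    int visit_count = ++visit_counts_[tag];
    return tag + "_" + std::to_string(visit_count) + ".npy";
  }

 private:
  std::unordered_map<std::string, int> visit_counts_;
};
```

With this sketch, two calls with the tag `abc` would produce `abc_1.npy` and then `abc_2.npy`, so repeated invocations of the same operator do not overwrite each other's dumps.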

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index e189e253bcd8..5255f58e2f49 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -139,7 +139,7 @@ You can simply pass in a value from `CheckerType` where you would have passed in
 Sometimes, you might want to dump the tensor to a file in binary mode. Then, you might want to use a python script to further analyze the tensor value.  Or, you might do that simply because a binary dumps has better precision and is faster to load than if you copy-paste the output from `print_string()` and load it as a `JASON` string. Either way, you can use this API:
 
 ```c++
-void dump_value(std::string tag);
+void dump_to_file(std::string tag);
 ```
 
 This API will creat a file with name  "{tag}_{visit_count}.npy", where tag is the name that we give to the call, and visit is the visit count, should the operated be called more than once.

From 116590c17e4b471051cdb52dc3c8b1edafd1c54f Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Tue, 16 Jul 2019 14:25:41 -0700
Subject: [PATCH 11/15] add new test and limitations section

---
 docs/faq/tensor_inspector_tutorial.md | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)
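
Note on `check_value()` and the new limitations section: the behavior the tutorial describes (call a checker on every element, return the coordinates where it is true, and only handle non-empty tensors with a known shape) can be sketched on a plain buffer as below. This is a standalone illustration under stated assumptions, not the actual implementation. `FindCoordinates` is a hypothetical name, and the sketch assumes a dense, row-major buffer whose size equals the product of `shape`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for check_value(): scans a dense, row-major buffer
// and returns the coordinates of every element for which `checker` is true.
// Assumption: data.size() equals the product of the entries in `shape`.
template <typename DType, typename Checker>
std::vector<std::vector<int>> FindCoordinates(const std::vector<DType>& data,
                                              const std::vector<int>& shape,
                                              const Checker& checker) {
  std::vector<std::vector<int>> hits;
  // Mirror the documented limitation: the shape must be known and the
  // tensor must be non-empty; otherwise there is nothing to report.
  if (shape.empty() || data.empty()) return hits;
  for (std::size_t flat = 0; flat < data.size(); ++flat) {
    if (!checker(data[flat])) continue;
    // Unravel the flat index into one coordinate per dimension,
    // starting from the fastest-varying (last) dimension.
    std::vector<int> coord(shape.size());
    std::size_t rest = flat;
    for (int d = static_cast<int>(shape.size()) - 1; d >= 0; --d) {
      const std::size_t extent = static_cast<std::size_t>(shape[d]);
      coord[d] = static_cast<int>(rest % extent);
      rest /= extent;
    }
    hits.push_back(coord);
  }
  return hits;
}
```

For a 2x3 buffer holding `{1, -2, 3, -4, 5, 0}`, a checker such as `[](float x) { return x < 0; }` would report the coordinates (0, 1) and (1, 0).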

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index 5255f58e2f49..bf0871a94d29 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -17,12 +17,12 @@
 
 ## Introduction
 
-When developing new operators, developers need to deal with tensor objects extensively. This new utility, Tensor Inspector, mainly aims to help developers debug by providing unified interfaces to print, check, and dump the tensor value. To developers' convenience, This utility works for all the three data types: Tensors, TBlobs, and NDArrays. Also, it supports both CPU and GPU tensors.
+When developing new operators, developers need to deal with tensor objects extensively. This new utility, Tensor Inspector, mainly aims to help developers debug by providing unified interfaces to print, check, and dump tensor values. For developers' convenience, this utility works with all three data types: `Tensor`, `TBlob`, and `NDArray`. It also supports both CPU and GPU tensors.
 
 
 ## Usage 
 
-This utility locates in `src/common/tensor_inspector.h`. To use it in any operator code, just include `tensor_inspector`, construct an `TensorInspector` object, and call the APIs on that object. You can run any script that uses the operator you just modified then.
+This utility is located in `src/common/tensor_inspector.h`. To use it in any operator code, just include `tensor_inspector`, construct an `TensorInspector` object, and call the APIs on that object. You can run any script that uses the operator you just modified then.
 
 The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`.
 
@@ -35,7 +35,7 @@ The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`
 
 You can create a `TensorInspector` object by passing in two things: 1) an object of type `Tensor`, `Tbob`, or `NDArray`, and 2) an `RunContext` object.
 
-Essentially, `TensorInspector` can be understood as a wrapper class around `TBlob`. Internally, the `Tensor`, `Tbob`, or `NDArray` object that you passed in will all be converted to a `TBlob` object. The `RunContext` object is used when the the tensor is a GPU tensor; in such case, we need to use the context information to copy the data from GPU memory to CPU/main memory.
+Essentially, `TensorInspector` can be understood as a wrapper class around `TBlob`. Internally, the `Tensor`, `Tbob`, or `NDArray` object that you passed in will be converted to a `TBlob` object. The `RunContext` object is used when the the tensor is a GPU tensor; in such a case, we need to use the context information to copy the data from GPU memory to CPU/main memory.
 
 Below are the three constructors:
 
@@ -80,7 +80,7 @@ This API will set a "break point" in your code, so that you will enter a loop th
 
 ![Screen Shot 2019-07-10 at 5 29 07 PM](https://user-images.githubusercontent.com/16669457/61013632-5325e800-a338-11e9-90e6-607f17d81495.png)
 
-Refer the screenshot above, there are many useful commands available: you can type "e" to print out the entire tensor, ''d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()`. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 20x1x5x5 tensor, you can type in "0, 0" and presss enter to check the sub-tensor with shape 5x5 at coordinate (0, 0). 
+Refer the screenshot above, there are many useful commands available: you can type "e" to print out the entire tensor, "d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()`. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 20x1x5x5 tensor, you can type in "0, 0" and presss enter to check the sub-tensor with shape 5x5 at coordinate (0, 0). 
 
 ### Check Tensor Value
 
@@ -102,7 +102,7 @@ In the first API, `ValueChecker checker` is a bool lambda function that takes in
 [] (DType x) {return x == 0};
 ```
 
-This checker is called on every value within the tensor. The return of the API is a `vector` of all the coordinates where the checker evaluates to `true`. The coordinates are themselves represented by `vector<int>`. If you set `interactive` to true, you will set a "break point" and enter a loop that asks for commands. This is similar to `interactive_print()`. You can type "p" to print the coordinates, "b" to break from the loop, and "s" to skip all future "break points" in `interactive_print()`.  Just like `interactive_print()`, this this interactive screen is also properly locked.
+This checker is called on every value within the tensor. The API returns a `vector` of all the coordinates where the checker evaluates to `true`. Each coordinate is itself represented by a `vector<int>`. If you set `interactive` to true, you will set a "break point" and enter a loop that asks for commands. This is similar to `interactive_print()`. You can type "p" to print the coordinates, "b" to break from the loop, and "s" to skip all future "break points" in `interactive_print()`. You can also specify a coordinate to print only a part of the tensor or type "e" to print out the entire tensor. Just like `interactive_print()`, this interactive screen is also properly locked.
 
 ![Screen Shot 2019-07-10 at 5 34 20 PM](https://user-images.githubusercontent.com/16669457/61013773-fe36a180-a338-11e9-9a2b-5f11ccc7afa7.png)
 
@@ -132,7 +132,7 @@ std::vector<std::vector<int>> check_value(CheckerType ct,
 		bool interactive = false, std::string tag =  "");
 ```
 
-You can simply pass in a value from `CheckerType` where you would have passed in your own lambda if you were using the first API.
+You can simply pass in a value from `CheckerType` where you would have passed in your own lambda if you were using the first API. Note that it's the developer's responsibility to pass in a valid value checker.
 
 ### Dump Tensor Value
 
@@ -142,7 +142,7 @@ Sometimes, you might want to dump the tensor to a file in binary mode. Then, you
 void dump_to_file(std::string tag);
 ```
 
-This API will creat a file with name  "{tag}_{visit_count}.npy", where tag is the name that we give to the call, and visit is the visit count, should the operated be called more than once.
+This API will create a file named "{tag}_{visit_count}.npy", where `tag` is the name that we give to the call and `visit_count` is the visit count, should the operator be called more than once.
 
 The output format is `.npy`, version 1.0. This is the Numpy format and we can easily load it with the following code:
 
@@ -156,9 +156,13 @@ Let's see the how it runs:
 
 ![Screen Shot 2019-07-10 at 5 17 29 PM](https://user-images.githubusercontent.com/16669457/61013259-cc244000-a336-11e9-8564-a018041634f6.png)
 
-Notice: in `interactive_print()`, you could also do value dumping with command "d". You will be prompt to enter the `tag` value:
+Notice: in `interactive_print()`, you could also do value dumping with command "d". You will be prompted to enter the `tag` value:
 
 ![Screen Shot 2019-07-11 at 4 57 41 PM](https://user-images.githubusercontent.com/16669457/61092906-0f48e680-a3fd-11e9-8251-c4371cdd00ad.png)
 
+### Test Coverage and Limitations
 
+This Utility has been tested on Mac and Ubuntu with and without CUDNN and MKLDNN. Supports for `Tensor`, `TBlob`, and `NDArray` and for CPU and GPU have been manually tested and exhibited no issue. 
+
+Currently, this utility only supports non-empty tensors and tensors with known shapes, i.e. `tb_.ndim() > 0`. Also, this utility only supports dense `NDArray` objects, i.e. when the type is `kDefaultStorage`.
 

From 453ab1c4dc6140c28ddc259da5c9cb41b645149b Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Tue, 16 Jul 2019 14:27:38 -0700
Subject: [PATCH 12/15] fix

---
 docs/faq/tensor_inspector_tutorial.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index bf0871a94d29..cbd6898824a9 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -162,7 +162,7 @@ Notice: in `interactive_print()`, you could also do value dumping with command "
 
 ### Test Coverage and Limitations
 
-This Utility has been tested on Mac and Ubuntu with and without CUDNN and MKLDNN. Supports for `Tensor`, `TBlob`, and `NDArray` and for CPU and GPU have been manually tested and exhibited no issue. 
+This utility has been tested on Mac and Ubuntu with and without CUDNN and MKLDNN. Support for `Tensor`, `TBlob`, and `NDArray`, as well as for CPU and GPU, has been manually tested.
 
 Currently, this utility only supports non-empty tensors and tensors with known shapes i.e. `tb_.ndim() > 0`. Also, this utility only supports dense `NDArray` objects, i.e. when the type is `kDefaultStorage`. 
 

From f049fe6e3d716c2800020f67371e066d685bdb97 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Fri, 19 Jul 2019 15:01:59 -0700
Subject: [PATCH 13/15] update urls and other fixes

---
 docs/faq/tensor_inspector_tutorial.md | 52 +++++++++++++--------------
 1 file changed, 26 insertions(+), 26 deletions(-)
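
Note on the updated sub-tensor example (typing "0, 0" for a 64x20x24x24 tensor to get a 24x24 sub-tensor): the way a leading coordinate selects a sub-tensor can be sketched as below. This is an illustration only, not the code behind `interactive_print()`. `SubTensorView` and `SelectSubTensor` are hypothetical names, and the sketch assumes a dense, row-major layout, a coordinate no longer than the shape, and in-range indices.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of a sub-tensor inside a dense, row-major buffer.
struct SubTensorView {
  std::size_t offset;      // index of the first element in the flat buffer
  std::vector<int> shape;  // shape of the remaining (trailing) dimensions
};

// Hypothetical helper: resolves a leading coordinate such as (0, 0)
// against the full tensor shape.
// Assumptions: coord.size() <= shape.size() and every index is in range.
inline SubTensorView SelectSubTensor(const std::vector<int>& shape,
                                     const std::vector<int>& coord) {
  std::size_t offset = 0;
  for (std::size_t d = 0; d < shape.size(); ++d) {
    // Leading dimensions use the given index; trailing ones start at 0.
    const std::size_t index =
        d < coord.size() ? static_cast<std::size_t>(coord[d]) : 0;
    offset = offset * static_cast<std::size_t>(shape[d]) + index;
  }
  const auto skip = static_cast<std::ptrdiff_t>(coord.size());
  return SubTensorView{offset,
                       std::vector<int>(shape.begin() + skip, shape.end())};
}
```

Under these assumptions, the coordinate (0, 0) on a 64x20x24x24 tensor gives a 24x24 view starting at flat offset 0, and (1, 2) gives a 24x24 view starting at offset (1 * 20 + 2) * 24 * 24 = 12672.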

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index cbd6898824a9..933d13a0aaa8 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -22,11 +22,11 @@ When developing new operators, developers need to deal with tensor objects exten
 
 ## Usage 
 
-This utility is located in `src/common/tensor_inspector.h`. To use it in any operator code, just include `tensor_inspector`, construct an `TensorInspector` object, and call the APIs on that object. You can run any script that uses the operator you just modified then.
+This utility is located in `src/common/tensor_inspector.h`. To use it in any operator code, just include it using `#include "{path}/tensor_inspector.h"`, construct a `TensorInspector` object, and call the APIs on that object. You can then run any script that uses the operator you just modified.
 
 The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`.
 
-![Screen Shot 2019-07-08 at 5 03 46 PM](https://user-images.githubusercontent.com/16669457/60850062-68690e00-a1a2-11e9-8268-033edde17aa4.png)
+![tensor_inspector_example_usage](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_example_usage.png)
 
 
 ## Functionalities/APIs
@@ -35,9 +35,9 @@ The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`
 
 You can create a `TensorInspector` object by passing in two things: 1) an object of type `Tensor`, `Tbob`, or `NDArray`, and 2) an `RunContext` object.
 
-Essentially, `TensorInspector` can be understood as a wrapper class around `TBlob`. Internally, the `Tensor`, `Tbob`, or `NDArray` object that you passed in will be converted to a `TBlob` object. The `RunContext` object is used when the the tensor is a GPU tensor; in such a case, we need to use the context information to copy the data from GPU memory to CPU/main memory.
+Essentially, `TensorInspector` can be understood as a wrapper class around `TBlob`. Internally, the `Tensor`, `TBlob`, or `NDArray` object that you pass in will be converted to a `TBlob` object. The `RunContext` object is used when the tensor is a GPU tensor; in such a case, we need to use the context information to copy the data from GPU memory to CPU/main memory.
 
-Below are the three constructors:
+Following are the three constructors:
 
 ```c++
 // Construct from Tensor object
@@ -53,15 +53,15 @@ TensorInspector(const NDArray& arr, const RunContext& ctx):
 
 ### Print Tensor Value (Static) 
 
-To print out the tensor value in a nicely structured way,  you can use this API:
+To print out the tensor value in a nicely structured way, you can use this API:
 
 ```c++
 void print_string();
 ```
 
-This API will print the entire tensor to `std::cout` and preserve the shape (it supports all dimensions from 1 and up). You can copy the output and interpret it with any `JSON` loader. Also, on the last line of the output you can find some useful information about the tensor. Refer to the case below, we are able to know that this is a float-typed tensor with shape 20x1x5x5.
+This API will print the entire tensor to `std::cout` and preserve the shape (it supports all dimensions from 1 and up). You can copy the output and interpret it with any `JSON` loader. You can find some useful information about the tensor on the last line of the output. In the example below, the last line tells us that this is a float-typed tensor with shape 20x1x5x5.
 
-![Screen Shot 2019-07-08 at 4 07 16 PM](https://user-images.githubusercontent.com/16669457/60848554-d8c06100-a19b-11e9-9fe0-23e79a7a371a.png)
+![tensor_inspector_to_string](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_to_string.png)
 
 If instead of printing the tensor to `std::cout`, you just need a `string`, you can use this API:
 ```c++
@@ -70,17 +70,17 @@ std::string void to_string();
 
 ### Interactively Print Tensor Value (Dynamic) 
 
-When debugging, situations might occur that at compilation time, you do not know which part of a tensor to inspect. Also, sometimes, it would be nice to pause the operator control flow to “zoom into” a specific, erroneous part of a tensor multiple times until you are satisfied. In this regard, you can use this API to interactively inspect the tensor:
+Sometimes at compilation time, you may not know which part of a tensor to inspect. Also, it may be nice to pause the operator control flow to “zoom into” a specific, erroneous part of a tensor multiple times until you are satisfied. In this regard, you can use this API to interactively inspect the tensor:
 
 ```c++
 void  interactive_print(std::string tag =  "") {
 ```
 
-This API will set a "break point" in your code, so that you will enter a loop that will keep asking you for further command. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you for how many times have you stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
+This API will set a "break point" in your code. When that "break point" is reached, you will enter a loop that will keep asking you for further command input. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you how many times you have stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
 
-![Screen Shot 2019-07-10 at 5 29 07 PM](https://user-images.githubusercontent.com/16669457/61013632-5325e800-a338-11e9-90e6-607f17d81495.png)
+![tensor_inspector_interactive_print](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_interactive_print.png)
 
-Refer the screenshot above, there are many useful commands available: you can type "e" to print out the entire tensor, "d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()`. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 20x1x5x5 tensor, you can type in "0, 0" and presss enter to check the sub-tensor with shape 5x5 at coordinate (0, 0). 
+There are many useful commands available, as shown in the previous screenshot: you can type "e" to print out the entire tensor, "d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()` calls. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 64x20x24x24 tensor, you can type in "0, 0" and press enter to check the sub-tensor with shape 24x24 at coordinate (0, 0).
 
 ### Check Tensor Value
 
@@ -104,24 +104,24 @@ In the first API, `ValueChecker checker` is a bool lambda function that takes in
 
 This checker is called on every value within the tensor. The return of the API is a `vector` of all the coordinates where the checker evaluates to `true`. The coordinates are themselves represented by `vector<int>`. If you set `interactive` to true, you will set a "break point" and enter a loop that asks for commands. This is similar to `interactive_print()`. You can type "p" to print the coordinates, "b" to break from the loop, and "s" to skip all future "break points" in `interactive_print()`. You can also specify a coordinate to print only a part of the tensor or type "e" to print out the entire tensor.  Just like `interactive_print()`, this this interactive screen is also properly locked.
 
-![Screen Shot 2019-07-10 at 5 34 20 PM](https://user-images.githubusercontent.com/16669457/61013773-fe36a180-a338-11e9-9a2b-5f11ccc7afa7.png)
+![tensor_inspector_value_check](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_value_check.png)
 
 Also, there are a bunch of built-int value checkers. Refer to the Enum below:
 
 ```c++
 enum  CheckerType {
-	NegativeChecker, // check if is negative
-	PositiveChecker, // check if is positive
-	ZeroChecker, // check if is zero
-	NaNChecker, // check if is NaN, will always return false if DType is not a float type
-	InfChecker, // check if is infinity, will always return false if DType is not a float type
-	PositiveInfChecker, // check if is positive infinity,
+	NegativeChecker, // check if negative
+	PositiveChecker, // check if positive
+	ZeroChecker, // check for zero
+	NaNChecker, // check for NaN, will always return false if DType is not a float type
+	InfChecker, // check for infinity, will always return false if DType is not a float type
+	PositiveInfChecker, // check for positive infinity,
 						// will always return false if DType is not a float type
-	NegativeInfChecker, // check if is nagative infinity,
+	NegativeInfChecker, // check for negative infinity,
 						// will always return false if DType is not a float type
-	FiniteChecker, // check if is finite, will always return false if DType is not a float type
-	NormalChecker, // check if is neither infinity nor NaN
-	AbnormalChecker, // chekck if is infinity or nan
+	FiniteChecker, // check if finite, will always return false if DType is not a float type
+	NormalChecker, // check if it is neither infinity nor NaN
+	AbnormalChecker, // check if it is infinity or NaN
 };
 ```
 
@@ -136,7 +136,7 @@ You can simply pass in a value from `CheckerType` where you would have passed in
 
 ### Dump Tensor Value
 
-Sometimes, you might want to dump the tensor to a file in binary mode. Then, you might want to use a python script to further analyze the tensor value.  Or, you might do that simply because a binary dumps has better precision and is faster to load than if you copy-paste the output from `print_string()` and load it as a `JASON` string. Either way, you can use this API:
+Sometimes, you might want to dump the tensor to a file in binary mode. Then, you might want to use a Python script to further analyze the tensor value. Or, you might do that simply because a binary dump has better precision and is faster to load than the output copy-pasted from `print_string()` and loaded as a `JSON` string. Either way, you can use this API:
 
 ```c++
 void dump_to_file(std::string tag);
@@ -152,13 +152,13 @@ a = np.load('abc_1.npy')
 print(a)
 ```
 
-Let's see the how it runs:
+Let's see how it runs:
 
-![Screen Shot 2019-07-10 at 5 17 29 PM](https://user-images.githubusercontent.com/16669457/61013259-cc244000-a336-11e9-8564-a018041634f6.png)
+![tensor_inspector_dump_to_file](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_dump_to_file.png)
 
 Notice: in `interactive_print()`, you could also do value dumping with command "d". You will be prompted to enter the `tag` value:
 
-![Screen Shot 2019-07-11 at 4 57 41 PM](https://user-images.githubusercontent.com/16669457/61092906-0f48e680-a3fd-11e9-8251-c4371cdd00ad.png)
+![tensor_inspector_interactive_print](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_interactive_print.png)
 
 ### Test Coverage and Limitations
 

From 4a331bf33c472722ad4dcf9d2025e08ce5aaeeef Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Fri, 19 Jul 2019 15:03:54 -0700
Subject: [PATCH 14/15] fix urls

---
 docs/faq/tensor_inspector_tutorial.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index 933d13a0aaa8..bbe1cade080a 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -26,7 +26,7 @@ This utility is located in `src/common/tensor_inspector.h`. To use it in any ope
 
 The screenshot below shows a sample usage in `src/operator/nn/convolution-inl.h`.
 
-![tensor_inspector_example_usage](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_example_usage.png)
+![tensor_inspector_example_usage](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/faq/tensor_inspector_tutorial/tensor_inspector_example_usage.png)
 
 
 ## Functionalities/APIs
@@ -61,7 +61,7 @@ void print_string();
 
 This API will print the entire tensor to `std::cout` and preserve the shape (it supports all dimensions from 1 and up). You can copy the output and interpret it with any `JSON` loader. You can find some useful information about the tensor on the last line of the output. Refer to the case below, we are able to know that this is a float-typed tensor with shape 20x1x5x5.
 
-![tensor_inspector_to_string](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_to_string.png)
+![tensor_inspector_to_string](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/faq/tensor_inspector_tutorial/tensor_inspector_to_string.png)
 
 If instead of printing the tensor to `std::cout`, you just need a `string`, you can use this API:
 ```c++
@@ -78,7 +78,7 @@ void  interactive_print(std::string tag =  "") {
 
 This API will set a "break point" in your code. What that "break point" is reached, you will enter a loop that will keep asking you for further command input. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you how many times you stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
 
-![tensor_inspector_interactive_print](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_interactive_print.png)
+![tensor_inspector_interactive_print](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/faq/tensor_inspector_tutorial/tensor_inspector_interactive_print.png)
 
 There are many useful commands available, as described in the previous screenshot: you can type "e" to print out the entire tensor, "d" to dump the tensor to file (see below), "b" to break from this command loop, and "s" to skip all future `interactive_print()`. Most importantly, in this screen, you can specify a part of the tensor that you are particularly interested in and want to print out. For example, for this 64x20x24x24 tensor, you can type in "0, 0" and presss enter to check the sub-tensor with shape 24x24 at coordinate (0, 0). 
 
@@ -104,7 +104,7 @@ In the first API, `ValueChecker checker` is a bool lambda function that takes in
 
 This checker is called on every value within the tensor. The return of the API is a `vector` of all the coordinates where the checker evaluates to `true`. The coordinates are themselves represented by `vector<int>`. If you set `interactive` to true, you will set a "break point" and enter a loop that asks for commands. This is similar to `interactive_print()`. You can type "p" to print the coordinates, "b" to break from the loop, and "s" to skip all future "break points" in `interactive_print()`. You can also specify a coordinate to print only a part of the tensor or type "e" to print out the entire tensor.  Just like `interactive_print()`, this this interactive screen is also properly locked.
 
-![tensor_inspector_value_check](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_value_check.png)
+![tensor_inspector_value_check](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/faq/tensor_inspector_tutorial/tensor_inspector_value_check.png)
 
 Also, there are a bunch of built-int value checkers. Refer to the Enum below:
 
@@ -154,11 +154,11 @@ print(a)
 
 Let's see how it runs:
 
-![tensor_inspector_dump_to_file](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_dump_to_file.png)
+![tensor_inspector_dump_to_file](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/faq/tensor_inspector_tutorial/tensor_inspector_dump_to_file.png)
 
 Notice: in `interactive_print()`, you could also do value dumping with command "d". You will be prompted to enter the `tag` value:
 
-![tensor_inspector_interactive_print](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/docs/faq/tensor_inspector_tutorial/tensor_inspector_interactive_print.png)
+![tensor_inspector_interactive_print](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/faq/tensor_inspector_tutorial/tensor_inspector_interactive_print.png)
 
 ### Test Coverage and Limitations
 

From edc580c8506100480bfe9105ce98f7a720f71ae1 Mon Sep 17 00:00:00 2001
From: zha0q1 <zhaoqizh@usc.edu>
Date: Fri, 19 Jul 2019 17:21:21 -0700
Subject: [PATCH 15/15] fix typo and extra space in tensor inspector tutorial

---
 docs/faq/tensor_inspector_tutorial.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/faq/tensor_inspector_tutorial.md b/docs/faq/tensor_inspector_tutorial.md
index bbe1cade080a..e77c7447fd17 100644
--- a/docs/faq/tensor_inspector_tutorial.md
+++ b/docs/faq/tensor_inspector_tutorial.md
@@ -42,7 +42,7 @@ Following are the three constructors:
 ```c++
 // Construct from Tensor object
 template<typename Device, int dimension, typename DType MSHADOW_DEFAULT_DTYPE>
-TensorInspector(const  mshadow::Tensor<Device, dimension, DType>& ts, const RunContext& ctx);
+TensorInspector(const mshadow::Tensor<Device, dimension, DType>& ts, const RunContext& ctx);
 
 // Construct from TBlob object
 TensorInspector(const TBlob& tb, const RunContext& ctx);
@@ -76,7 +76,7 @@ Sometimes at compilation time, you may not know which part of a tensor to inspec
 void  interactive_print(std::string tag =  "") {
 ```
 
-This API will set a "break point" in your code. What that "break point" is reached, you will enter a loop that will keep asking you for further command input. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you how many times you stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
+This API will set a "break point" in your code. When that "break point" is reached, you will enter a loop that will keep asking you for further command input. In the API call, `tag` is an optional parameter to give the call a name, so that you can identify it when you have multiple `interactive_print()` calls in different parts of your code. A visit count will tell you how many times you stepped into this particular "break point", should this operator be called more than once. Note that all `interactive_print()` calls are properly locked, so you can use it in many different places without issues.
 
 ![tensor_inspector_interactive_print](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/faq/tensor_inspector_tutorial/tensor_inspector_interactive_print.png)