From 91ad2664e1c83c80ff318b29fcd32608bee90d04 Mon Sep 17 00:00:00 2001
From: Talia <31782251+TEChopra1000@users.noreply.github.com>
Date: Wed, 23 Oct 2019 09:32:18 -0700
Subject: [PATCH] fixed broken links across multiple files (#16581)

---
 .../getting-started/crash-course/5-predict.md        |  2 +-
 .../getting-started/crash-course/6-use_gpus.md       |  2 +-
 .../gluon_from_experiment_to_deployment.md           | 10 +++++-----
 .../tutorials/getting-started/to-mxnet/pytorch.md    | 12 ++++++------
 .../python/tutorials/packages/gluon/image/mnist.md   |  2 +-
 .../src/pages/api/architecture/note_data_loading.md  |  2 +-
 .../docs/tutorials/mxnet_cpp_inference_tutorial.md   | 10 +++++-----
 .../src/pages/api/faq/distributed_training.md        |  6 +++---
 docs/static_site/src/pages/api/faq/float16.md        |  8 ++++----
 .../src/pages/api/faq/gradient_compression.md        |  2 +-
 .../src/pages/api/faq/model_parallel_lstm.md         |  2 +-
 docs/static_site/src/pages/api/faq/recordio.md       |  1 -
 .../src/pages/get_started/build_from_source.md       |  2 +-
 julia/docs/src/tutorial/char-lstm.md                 |  6 +++---
 julia/docs/src/tutorial/mnist.md                     |  2 +-
 julia/docs/src/user-guide/overview.md                |  2 --
 julia/examples/char-lstm/README.md                   |  2 +-
 17 files changed, 35 insertions(+), 38 deletions(-)

diff --git a/docs/python_docs/python/tutorials/getting-started/crash-course/5-predict.md b/docs/python_docs/python/tutorials/getting-started/crash-course/5-predict.md
index 7a7738d8df1b..9afe95b58403 100644
--- a/docs/python_docs/python/tutorials/getting-started/crash-course/5-predict.md
+++ b/docs/python_docs/python/tutorials/getting-started/crash-course/5-predict.md
@@ -21,7 +21,7 @@ A saved model can be used in multiple places, such as to continue training, to f

## Prerequisites

-Please run the [previous tutorial](train.md) to train the network and save its parameters to file. You will need this file to run the following steps.
+Please run the [previous tutorial](4-train.html) to train the network and save its parameters to file. You will need this file to run the following steps.

```{.python .input n=1}
from mxnet import nd

diff --git a/docs/python_docs/python/tutorials/getting-started/crash-course/6-use_gpus.md b/docs/python_docs/python/tutorials/getting-started/crash-course/6-use_gpus.md
index b78c38ab7077..a0788ba7df2d 100644
--- a/docs/python_docs/python/tutorials/getting-started/crash-course/6-use_gpus.md
+++ b/docs/python_docs/python/tutorials/getting-started/crash-course/6-use_gpus.md
@@ -99,7 +99,7 @@ net(x)

Finally, we show how to use multiple GPUs to jointly train a neural network through data parallelism. Let's assume there are *n* GPUs. We split each data batch into *n* parts, and then each GPU will run the forward and backward passes using one part of the data.

-Let's first copy the data definitions and the transform function from the [previous tutorial](predict.md).
+Let's first copy the data definitions and the transform function from the [previous tutorial](5-predict.html).
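The per-GPU split described above is exactly what `gluon.utils.split_and_load` does. Here is a minimal sketch of that mechanism in isolation; the two GPU contexts are an assumption, so substitute `mx.cpu()` contexts to try it without GPUs:

```python
import mxnet as mx
from mxnet import gluon, nd

ctx = [mx.gpu(0), mx.gpu(1)]  # assumes two GPUs are present
x = nd.random.uniform(shape=(256, 1, 28, 28))

# Slice the batch along axis 0 and copy each slice to its device.
parts = gluon.utils.split_and_load(x, ctx)
print([p.context for p in parts])  # [gpu(0), gpu(1)], 128 examples each
```

With that mechanism in mind, the copied data definitions follow.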
```{.python .input}
batch_size = 256

diff --git a/docs/python_docs/python/tutorials/getting-started/gluon_from_experiment_to_deployment.md b/docs/python_docs/python/tutorials/getting-started/gluon_from_experiment_to_deployment.md
index 8d2c4e100c76..b1f65e682263 100644
--- a/docs/python_docs/python/tutorials/getting-started/gluon_from_experiment_to_deployment.md
+++ b/docs/python_docs/python/tutorials/getting-started/gluon_from_experiment_to_deployment.md
@@ -20,7 +20,7 @@

## Overview
MXNet Gluon API comes with a lot of great features, and it can provide you everything you need: from experimentation to deploying the model. In this tutorial, we will walk you through a common use case on how to build a model using gluon, train it on your data, and deploy it for inference.
-This tutorial covers training and inference in Python, please continue to [C++ inference part](https://mxnet.apache.org/versions/master/tutorials/c++/mxnet_cpp_inference_tutorial.html) after you finish.
+This tutorial covers training and inference in Python; please continue to the [C++ inference part](/api/cpp/docs/tutorials/cpp_inference) after you finish.

Let's say you need to build a service that provides flower species recognition. A common problem is that you don't have enough data to train a good model. In such cases, a technique called Transfer Learning can be used to make a more robust model. In Transfer Learning we make use of a pre-trained model that solves a related task, and was trained on a very large standard dataset, such as ImageNet. ImageNet is from a different domain, but we can utilize the knowledge in this pre-trained model to perform the new task at hand.

@@ -77,7 +77,7 @@ from mxnet.gluon.data.vision import transforms
from mxnet.gluon.model_zoo.vision import resnet50_v2
```

-Next, we define the hyper-parameters that we will use for fine-tuning. We will use the [MXNet learning rate scheduler](../packages/gluon/training/learning_rates/learning_rate_schedules.html) to adjust learning rates during training.
+Next, we define the hyper-parameters that we will use for fine-tuning. We will use the [MXNet learning rate scheduler](/api/python/docs/tutorials/packages/gluon/training/learning_rates/learning_rate_schedules.html) to adjust learning rates during training.
Here we set the `epochs` to 1 for quick demonstration, please change to 40 for actual training.

```python

@@ -161,7 +161,7 @@ test_data = gluon.data.DataLoader(

We will use pre-trained ResNet50_v2 model which was pre-trained on the [ImageNet Dataset](http://www.image-net.org/) with 1000 classes. To match the classes in the Flower dataset, we must redefine the last softmax (output) layer to be 102, then initialize the parameters.

-Before we go to training, one unique Gluon feature you should be aware of is hybridization. It allows you to convert your imperative code to a static symbolic graph, which is much more efficient to execute. There are two main benefits of hybridizing your model: better performance and easier serialization for deployment. The best part is that it's as simple as just calling `net.hybridize()`. To know more about Gluon hybridization, please follow the [hybridization tutorial](https://mxnet.apache.org/tutorials/gluon/hybrid.html).
+Before we go to training, one unique Gluon feature you should be aware of is hybridization. It allows you to convert your imperative code to a static symbolic graph, which is much more efficient to execute. There are two main benefits of hybridizing your model: better performance and easier serialization for deployment. The best part is that it's as simple as just calling `net.hybridize()`. To know more about Gluon hybridization, please follow the [hybridization tutorial](/api/python/docs/tutorials/packages/gluon/blocks/hybridize.html).
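As a rough sketch of what hybridization looks like in practice (a toy block for illustration, not the tutorial's ResNet):

```python
from mxnet import gluon, nd

net = gluon.nn.HybridSequential()   # HybridBlocks can be compiled to a graph
net.add(gluon.nn.Dense(10))
net.initialize()

net.hybridize()                     # request compilation to a static graph
out = net(nd.ones((1, 4)))          # the first call triggers graph construction
```

After the first forward pass, subsequent calls reuse the cached graph, which is where the speedup comes from.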
@@ -265,7 +265,7 @@ finetune_net.export("flower-recognition", epoch=epochs)

## Load the model and run inference using the MXNet Module API

MXNet provides various useful tools and interfaces for deploying your model for inference. For example, you can use [MXNet Model Server](https://github.com/awslabs/mxnet-model-server) to start a service and host your trained model easily.
-Besides that, you can also use MXNet's different language APIs to integrate your model with your existing service. We provide [Python](https://mxnet.apache.org/api/python/module/module.html), [Java](https://mxnet.apache.org/api/java/index.html), [Scala](https://mxnet.apache.org/api/scala/index.html), and [C++](https://mxnet.apache.org/api/c++/index.html) APIs.
+Besides that, you can also use MXNet's different language APIs to integrate your model with your existing service. We provide [Python](/api/python.html), [Java](/api/java.html), [Scala](/api/scala.html), and [C++](/api/cpp) APIs.

Here we will briefly introduce how to run inference using Module API in Python. There is more detailed explanation available in the [Predict Image Tutorial](https://mxnet.apache.org/tutorials/python/predict_image.html).
In general, prediction consists of the following steps:

@@ -315,7 +315,7 @@ You can continue to the [next tutorial](https://mxnet.apache.org/versions/master

You can also find more ways to run inference and deploy your models here:
1. [Java Inference examples](https://github.com/apache/incubator-mxnet/tree/master/scala-package/examples/src/main/java/org/apache/mxnetexamples/javaapi/infer)
-2. [Scala Inference examples](https://mxnet.apache.org/tutorials/scala/)
+2. [Scala Inference examples](/api/scala/docs/tutorials/infer)
4. [MXNet Model Server Examples](https://github.com/awslabs/mxnet-model-server/tree/master/examples)

## References

diff --git a/docs/python_docs/python/tutorials/getting-started/to-mxnet/pytorch.md b/docs/python_docs/python/tutorials/getting-started/to-mxnet/pytorch.md
index d7720bac4348..1ab490fbaa42 100644
--- a/docs/python_docs/python/tutorials/getting-started/to-mxnet/pytorch.md
+++ b/docs/python_docs/python/tutorials/getting-started/to-mxnet/pytorch.md
@@ -164,7 +164,7 @@ mx_trainer = gluon.Trainer(mx_net.collect_params(),
                           'sgd', {'learning_rate': 0.1})
```

-The code difference between frameworks is small. The main difference is that in Apache MXNet we use [Trainer](https://mxnet.apache.org/api/python/docs/api/gluon/mxnet.gluon.Trainer.html) class, which accepts optimization algorithm as an argument. We also use [.collect_params()](/api/python/docs/api/gluon/_autogen/mxnet.gluon.nn.Block.collect_params.html) method to get parameters of the network.
+The code difference between frameworks is small. The main difference is that in Apache MXNet we use the [Trainer](/api/python/docs/api/gluon/trainer.html) class, which accepts an optimization algorithm as an argument. We also use the [.collect_params()](/api/python/docs/api/gluon/block.html#mxnet.gluon.Block.collect_params) method to get the parameters of the network.
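To make the two pieces concrete, here is a small sketch of what `collect_params()` returns and how it feeds the `Trainer`, reusing the `mx_net` defined in the snippet above:

```python
from mxnet import gluon

params = mx_net.collect_params()  # a ParameterDict mapping names to Parameters
print(params)                     # lists every weight and bias with its shape

# Trainer takes that dict plus the optimizer name and its settings.
mx_trainer = gluon.Trainer(params, 'sgd', {'learning_rate': 0.1})
```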
### 4. Training

@@ -212,13 +212,13 @@ Some of the differences in Apache MXNet when compared to PyTorch are as follows:

* In Apache MXNet, you don't need to flatten the 4-D input into 2-D when feeding the data into forward pass.

-* In Apache MXNet, you need to perform the calculation within the [autograd.record()](/api/python/docs/api/gluon-related/_autogen/mxnet.autograd.record.html) scope so that it can be automatically differentiated in the backward pass.
+* In Apache MXNet, you need to perform the calculation within the [autograd.record()](/api/python/docs/api/autograd/index.html?autograd%20record#mxnet.autograd.record) scope so that it can be automatically differentiated in the backward pass.

* It is not necessary to clear the gradient every time as with PyTorch's `trainer.zero_grad()` because by default the new gradient is written in, not accumulated.

-* You need to specify the update step size (usually batch size) when performing [step()](/api/python/docs/api/gluon/_autogen/mxnet.gluon.Trainer.step.html) on the trainer.
+* You need to specify the update step size (usually batch size) when performing [step()](/api/python/docs/api/gluon/trainer.html?#mxnet.gluon.Trainer.step) on the trainer.

-* You need to call [.asscalar()](/api/python/docs/api/ndarray/_autogen/mxnet.ndarray.NDArray.asscalar.html) to turn a multidimensional array into a scalar.
+* You need to call [.asscalar()](/api/python/docs/api/ndarray/ndarray.html?#mxnet.ndarray.NDArray.asscalar) to turn a multidimensional array into a scalar.

* In this sample, Apache MXNet is twice as fast as PyTorch. Though you need to be cautious with such toy comparisons.

@@ -230,9 +230,9 @@ As we saw above, Apache MXNet Gluon API and PyTorch have many similarities. The

While Apache MXNet Gluon API is very similar to PyTorch, there are some extra functionality that can make your code even faster.

-* Check out [Hybridize tutorial](/api/python/docs/guide/packages/gluon/hybridize.html) to learn how to write imperative code which can be converted to symbolic one.
+* Check out the [Hybridize tutorial](/api/python/docs/tutorials/packages/gluon/blocks/hybridize.html) to learn how to write imperative code which can be converted to a symbolic one.

-* Also, check out how to extend Apache MXNet with your own [custom layers](/api/python/docs/guide/extend/custom_layer.html).
+* Also, check out how to extend Apache MXNet with your own [custom layers](/api/python/docs/tutorials/packages/gluon/blocks/custom-layer.html?custom_layers).

## Appendix

diff --git a/docs/python_docs/python/tutorials/packages/gluon/image/mnist.md b/docs/python_docs/python/tutorials/packages/gluon/image/mnist.md
index 8a3d8229413b..a6898278edf6 100644
--- a/docs/python_docs/python/tutorials/packages/gluon/image/mnist.md
+++ b/docs/python_docs/python/tutorials/packages/gluon/image/mnist.md
@@ -112,7 +112,7 @@ to train the MLP network we defined above.

For our training, we will make use of the stochastic gradient descent (SGD) optimizer. In particular, we'll be using mini-batch SGD. Standard SGD processes train data one example at a time. In practice, this is very slow and one can speed up the process by processing examples in small batches. In this case, our batch size will be 100, which is a reasonable choice. Another parameter we select here is the learning rate, which controls the step size the optimizer takes in search of a solution. We'll pick a learning rate of 0.02, again a reasonable choice. Settings such as batch size and learning rate are what are usually referred to as hyper-parameters. What values we give them can have a great impact on training performance.
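Wired together, one mini-batch update with these hyper-parameters might look like the following sketch, where `net`, `data`, and `label` stand in for the network and batch defined elsewhere in this tutorial:

```python
from mxnet import autograd, gluon

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.02})
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

with autograd.record():              # record the forward pass for autodiff
    output = net(data)               # data: one mini-batch of 100 examples
    loss = loss_fn(output, label)
loss.backward()                      # compute gradients
trainer.step(100)                    # update weights, scaled by the batch size
print(loss.mean().asscalar())        # asscalar() turns the result into a float
```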
-We will use [Trainer](https://mxnet.io/api/python/docs/api/gluon/mxnet.gluon.Trainer.html) class to apply the
+We will use the [Trainer](/api/python/docs/api/gluon/trainer.html) class to apply the
[SGD optimizer](https://mxnet.io/api/python/docs/api/gluon-related/_autogen/mxnet.optimizer.SGD.html) on the initialized parameters.

diff --git a/docs/static_site/src/pages/api/architecture/note_data_loading.md b/docs/static_site/src/pages/api/architecture/note_data_loading.md
index 1279d0361e5f..01bf1f23f600 100644
--- a/docs/static_site/src/pages/api/architecture/note_data_loading.md
+++ b/docs/static_site/src/pages/api/architecture/note_data_loading.md
@@ -125,7 +125,7 @@ then compress into JPEG format.

After that, we save a header that indicates the index and label for that image to be used when constructing the *Data* field for that record. We then pack several images together into a file.

-You may want to also review the [example using im2rec.py to create a RecordIO dataset](https://mxnet.apache.org/tutorials/basic/data.html#loading-data-using-image-iterators).
+You may also want to review the [example using im2rec.py to create a RecordIO dataset](https://mxnet.apache.org/api/faq/recordio).

### Access Arbitrary Parts Of Data

diff --git a/docs/static_site/src/pages/api/cpp/docs/tutorials/mxnet_cpp_inference_tutorial.md b/docs/static_site/src/pages/api/cpp/docs/tutorials/mxnet_cpp_inference_tutorial.md
index 9392eca2977f..0d96817560d0 100644
--- a/docs/static_site/src/pages/api/cpp/docs/tutorials/mxnet_cpp_inference_tutorial.md
+++ b/docs/static_site/src/pages/api/cpp/docs/tutorials/mxnet_cpp_inference_tutorial.md
@@ -29,7 +29,7 @@ tag: cpp

## Overview
MXNet provides various useful tools and interfaces for deploying your model for inference. For example, you can use [MXNet Model Server](https://github.com/awslabs/mxnet-model-server) to start a service and host your trained model easily. Besides that, you can also use MXNet's different language APIs to integrate your model with your existing service. We provide [Python]({{'/api/python/docs/api/symbol-related/mxnet.module'|relative_url}}), [Java]({{'/api/java/docs/api'|relative_url}}), [Scala]({{'/api/scala/docs/api'|relative_url}}), and [C++]({{'/api/cpp/docs/api'|relative_url}}) APIs.
-We will focus on the MXNet C++ API. We have slightly modified the code in [C++ Inference Example](https://github.com/apache/incubator-mxnet/tree/master/example/inference) for our use case.
+We will focus on the MXNet C++ API. We have slightly modified the code in [C++ Inference Example](https://github.com/apache/incubator-mxnet/tree/master/cpp-package/example/inference) for our use case.

## Prerequisites

@@ -105,7 +105,7 @@ class Predictor {

### Load the model, synset file, and normalization values

-In the Predictor constructor, you need to provide paths to saved json and param files. After that, add the following methods `LoadModel` and `LoadParameters` to load the network and its parameters. This part is the same as [the example](https://github.com/apache/incubator-mxnet/blob/master/cpp-package/example/inference/inception_inference.cpp).
+In the Predictor constructor, you need to provide paths to the saved json and param files. After that, add the following methods `LoadModel` and `LoadParameters` to load the network and its parameters. This part is the same as [the example](https://github.com/apache/incubator-mxnet/blob/master/cpp-package/example/inference/imagenet_inference.cpp).
Next, we need to load synset file, and normalization values. We have made the following change since our synset file contains flower names and we used both mean and standard deviation for image normalization.

@@ -280,12 +280,12 @@ Then it will predict your image:

Now you can explore more ways to run inference and deploy your models:
1. [Java Inference examples](https://github.com/apache/incubator-mxnet/tree/master/scala-package/examples/src/main/java/org/apache/mxnetexamples/javaapi/infer)
-2. [Scala Inference examples]({{'/api/scala/docs/tutorials'|relative_url}})
-3. [ONNX model inference examples]({{'/api/python/docs/tutorials/deploy/index.html'|relative_url}})
+2. [Scala Inference examples](/api/scala/docs/tutorials)
+3. [ONNX model inference examples](/api/python/docs/tutorials/deploy/index.html)
4. [MXNet Model Server Examples](https://github.com/awslabs/mxnet-model-server/tree/master/examples)

## References

-1. [Gluon end to end tutorial]({{'/api/python/docs/tutorials/packages/gluon/gluon_from_experiment_to_deployment.html'|relative_url}})
+1. [Gluon end to end tutorial](/api/python/docs/tutorials/getting-started/gluon_from_experiment_to_deployment.html)
2. [Gluon C++ inference example](https://github.com/apache/incubator-mxnet/blob/master/cpp-package/example/inference/)
3. [Gluon C++ package](https://github.com/apache/incubator-mxnet/tree/master/cpp-package)

diff --git a/docs/static_site/src/pages/api/faq/distributed_training.md b/docs/static_site/src/pages/api/faq/distributed_training.md
index caf0123b7aea..622ace60f780 100644
--- a/docs/static_site/src/pages/api/faq/distributed_training.md
+++ b/docs/static_site/src/pages/api/faq/distributed_training.md
@@ -91,7 +91,7 @@ In the case of distributed training though, we would need to divide the dataset

Typically, this split of data for each worker happens through the data iterator, on passing the number of parts and the index of parts to iterate over.
-Some iterators in MXNet that support this feature are [mxnet.io.MNISTIterator]({{'//api/mxnet/io/index.html#mxnet.io.MNISTIter'|relative_url}}) and [mxnet.io.ImageRecordIter]({{'/api/mxnet/io/index.html#mxnet.io.ImageRecordIter'|relative_url}}).
+Some iterators in MXNet that support this feature are [mxnet.io.MNISTIterator](/api/python/docs/api/mxnet/io/index.html?MNISTIter#mxnet.io.MNISTIter) and [mxnet.io.ImageRecordIter](/api/python/docs/api/mxnet/io/index.html?imagerecorditer#mxnet.io.ImageRecordIter).
If you are using a different iterator, you can look at how the above iterators implement this.
We can use the kvstore object to get the number of workers (`kv.num_workers`) and rank of the current worker (`kv.rank`). These can be passed as arguments to the iterator.

@@ -101,7 +101,7 @@ to see an example usage.

### Updating weights
KVStore server supports two modes, one which aggregates the gradients and updates the weights using those gradients, and second where the server only aggregates gradients. In the latter case, when a worker process pulls from kvstore, it gets the aggregated gradients. The worker then uses these gradients and applies the weights locally.
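Putting the sharding and the kvstore together, the worker-side setup might look like this sketch. It assumes the process was started through MXNet's distributed launcher so that a `dist_sync` kvstore can actually connect, and that the MNIST files sit in the working directory:

```python
import mxnet as mx

kv = mx.kv.create('dist_sync')      # one kvstore handle per worker process

# Each worker reads only its own shard of the dataset.
train_iter = mx.io.MNISTIter(
    image='train-images-idx3-ubyte',
    label='train-labels-idx1-ubyte',
    batch_size=100,
    num_parts=kv.num_workers,       # total number of shards
    part_index=kv.rank)             # this worker's shard
```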
-When using Gluon there is an option to choose between these modes by passing `update_on_kvstore` variable when you create the [Trainer]({{'/api/python/docs/api/gluon/mxnet.gluon.Trainer.html'|relative_url}}) object like this:
+When using Gluon there is an option to choose between these modes by passing `update_on_kvstore` variable when you create the [Trainer](/api/python/docs/api/gluon/trainer.html) object like this:

```
trainer = gluon.Trainer(net.collect_params(), optimizer='sgd',

@@ -190,7 +190,7 @@ git clone --recursive https://github.com/apache/incubator-mxnet
```

#### Example
-Let us consider training a VGG11 model on the CIFAR10 dataset using [example/gluon/image_classification.py](https://github.com/apache/incubator-mxnet/blob/master/example/gluon/image_classification.py).
+Let us consider training a VGG11 model on the CIFAR10 dataset using [example/gluon/image_classification.py](https://github.com/apache/incubator-mxnet/blob/master/tools/launch.py).
```
cd example/gluon/
```

diff --git a/docs/static_site/src/pages/api/faq/float16.md b/docs/static_site/src/pages/api/faq/float16.md
index d824acb3ce6d..e63bf87ac68f 100644
--- a/docs/static_site/src/pages/api/faq/float16.md
+++ b/docs/static_site/src/pages/api/faq/float16.md
@@ -39,7 +39,7 @@ The float16 data type is a 16 bit floating point representation according to the

- CUDA 9 or higher
- cuDNN v7 or higher

-This tutorial also assumes understanding of how to train a network with float32 (the default). Please refer to [logistic regression tutorial](https://mxnet.apache.org/versions/master/tutorials/gluon/logistic_regression_explained.html) to get started with Apache MXNet and Gluon API. This tutorial focuses on the changes needed to switch from float32 to mixed precision and tips on achieving the best performance with mixed precision.
+This tutorial also assumes understanding of how to train a network with float32 (the default). Please refer to the [logistic regression tutorial](/api/python/docs/tutorials/getting-started/logistic_regression_explained.html) to get started with Apache MXNet and Gluon API. This tutorial focuses on the changes needed to switch from float32 to mixed precision and tips on achieving the best performance with mixed precision.

## Using the Gluon API

With Gluon API, you need to take care of three things to convert a model to support computation with float16.

-1. Cast Gluon `Block`'s parameters and expected input type to float16 by calling the [cast]({{'/api/python/docs/api/gluon/mxnet.gluon.nn.Block.html#mxnet.gluon.nn.Block.cast'|relative_url}}) method of the `Block` representing the network.
+1. Cast Gluon `Block`'s parameters and expected input type to float16 by calling the [cast](/api/python/docs/api/gluon/block.html?cast#mxnet.gluon.Block.cast) method of the `Block` representing the network.

```python
net.cast('float16')
```

-2. Ensure the data input to the network is of float16 type. If your `DataLoader` or `Iterator` produces output in another datatype, then you would have to cast your data. There are different ways you can do this. The easiest would be to use the [astype]({{'/api/python/docs/api/ndarray/_autogen/mxnet.ndarray.NDArray.astype.html#mxnet.ndarray.NDArray.astype'|relative_url}}) method of NDArrays.
+2. Ensure the data input to the network is of float16 type. If your `DataLoader` or `Iterator` produces output in another datatype, then you would have to cast your data. There are different ways you can do this. The easiest would be to use the [astype](/api/python/docs/api/ndarray/ndarray.html?astype#mxnet.ndarray.NDArray.astype) method of NDArrays.

```python
data = data.astype('float16', copy=False)
```
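The two steps combined, on a throwaway layer rather than a real model (the GPU context is an assumption, per the prerequisites above):

```python
import mxnet as mx
from mxnet import gluon, nd

ctx = mx.gpu(0)                     # float16 math generally wants a GPU
net = gluon.nn.Dense(10)
net.initialize(ctx=ctx)
net.cast('float16')                 # step 1: cast the parameters

x = nd.random.uniform(shape=(2, 8), ctx=ctx).astype('float16', copy=False)
print(net(x).dtype)                 # step 2: the output comes back as float16
```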
@@ -98,7 +98,7 @@ net.features = pretrained_net.features
net.cast('float16')
```

-You can check the parameters of the model by calling [summary]({{'/api/python/docs/api/gluon/mxnet.gluon.nn.Block.html#mxnet.gluon.nn.Block.summary'|relative_url}}) with some fake data. Notice the provided `dtype=np.float16` in the line below. As it was mentioned earlier, we have to provide data as float16 as well.
+You can check the parameters of the model by calling [summary](/api/python/docs/api/gluon/block.html?block%20summary#mxnet.gluon.Block.summary) with some fake data. Notice the provided `dtype=np.float16` in the line below. As it was mentioned earlier, we have to provide data as float16 as well.

```python
net.summary(mx.nd.uniform(shape=(1, 3, 224, 224), dtype=np.float16))
```

diff --git a/docs/static_site/src/pages/api/faq/gradient_compression.md b/docs/static_site/src/pages/api/faq/gradient_compression.md
index 1f4c5fb21903..e2b47c646ada 100644
--- a/docs/static_site/src/pages/api/faq/gradient_compression.md
+++ b/docs/static_site/src/pages/api/faq/gradient_compression.md
@@ -110,7 +110,7 @@ A reference `gluon` implementation with a gradient compression option can be fou

mod = mx.mod.Module(..., compression_params={'type':'2bit', 'threshold':0.5})
```

-A `module` example is provided with [this guide for setting up MXNet with distributed training](https://mxnet.apache.org/versions/master/faq/multi_devices.html#distributed-training-with-multiple-machines). It comes with the option of turning on gradient compression as an argument to the [train_mnist.py script](https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/train_mnist.py).
+A `module` example is provided with [this guide for setting up MXNet with distributed training](/api/faq/distributed_training). It comes with the option of turning on gradient compression as an argument to the [train_mnist.py script](https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/train_mnist.py).

### Configuration Details

diff --git a/docs/static_site/src/pages/api/faq/model_parallel_lstm.md b/docs/static_site/src/pages/api/faq/model_parallel_lstm.md
index 60df280b38fe..08cf6be76a90 100644
--- a/docs/static_site/src/pages/api/faq/model_parallel_lstm.md
+++ b/docs/static_site/src/pages/api/faq/model_parallel_lstm.md
@@ -37,7 +37,7 @@ One key strength of _MXNet_ is its ability to leverage powerful heterogeneous

hardware environments to achieve significant speedups.
There are two primary ways that we can spread a workload across multiple devices.
-In a previous document, [we addressed data parallelism](multi_devices),
+In a previous document, [we addressed data parallelism](/api/faq/distributed_training),
an approach in which samples within a batch are divided among the available devices.
With data parallelism, each device stores a complete copy of the model.
Here, we explore _model parallelism_, a different approach.
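As a minimal illustration of the contrast (a hypothetical two-GPU placement, not code from the LSTM example): with model parallelism, different parts of the network live on different devices, and activations are copied between them.

```python
import mxnet as mx
from mxnet import gluon, nd

dev0, dev1 = mx.gpu(0), mx.gpu(1)           # assumes two devices

layer0 = gluon.nn.Dense(256, activation='relu')
layer1 = gluon.nn.Dense(10)
layer0.initialize(ctx=dev0)                 # first half lives on GPU 0
layer1.initialize(ctx=dev1)                 # second half lives on GPU 1

x = nd.random.uniform(shape=(32, 128), ctx=dev0)
h = layer0(x)                               # computed on GPU 0
y = layer1(h.as_in_context(dev1))           # copy activations, finish on GPU 1
```

Each device holds only its own slice of the parameters, which is the point of the technique when a model is too large for a single card's memory.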
diff --git a/docs/static_site/src/pages/api/faq/recordio.md b/docs/static_site/src/pages/api/faq/recordio.md
index 75407cb3da5f..2e8fcdd647f3 100644
--- a/docs/static_site/src/pages/api/faq/recordio.md
+++ b/docs/static_site/src/pages/api/faq/recordio.md
@@ -38,7 +38,6 @@ We provide two tools for creating a RecordIO dataset.

* [im2rec.py](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) - implements the tool using the Python API.
Both provide the same output: a RecordIO dataset.
-You may want to also review the [example using real-world data with im2rec.py.](https://mxnet.apache.org/tutorials/basic/data.html#loading-data-using-image-iterators)

### Prerequisites

diff --git a/docs/static_site/src/pages/get_started/build_from_source.md b/docs/static_site/src/pages/get_started/build_from_source.md
index e8f7d468b399..20a4542461c4 100644
--- a/docs/static_site/src/pages/get_started/build_from_source.md
+++ b/docs/static_site/src/pages/get_started/build_from_source.md
@@ -98,7 +98,7 @@ Those can be extended with [LAPACK (Linear Algebra Package)](https://github.com/

MXNet supports multiple mathematical backends for computations on the CPU:

* [Apple Accelerate](https://developer.apple.com/documentation/accelerate)
-* [ATLAS](https://math-atlas.sourceforge.net/)
+* [ATLAS](http://math-atlas.sourceforge.net/)
* [MKL](https://software.intel.com/en-us/intel-mkl) (MKL, MKLML)
* [MKL-DNN](https://github.com/intel/mkl-dnn)
* [OpenBLAS](https://www.openblas.net/)

diff --git a/julia/docs/src/tutorial/char-lstm.md b/julia/docs/src/tutorial/char-lstm.md
index bc7f7b471d94..ab7e9352b5ab 100644
--- a/julia/docs/src/tutorial/char-lstm.md
+++ b/julia/docs/src/tutorial/char-lstm.md
@@ -31,7 +31,7 @@ networks yet, the example shown here is an implementation of LSTM by

using the default FeedForward model via explicitly unfolding over time.
We will be using fixed-length input sequence for training. The code is
adapted from the [char-rnn example for MXNet's Python
-binding](https://github.com/dmlc/mxnet/blob/master/example/rnn/char_lstm.ipynb),
+binding](https://github.com/dmlc/mxnet-notebooks/blob/master/python/tutorials/char_lstm.ipynb),
which demonstrates how to use low-level [Symbolic API](@ref) to build customized neural network models directly.

@@ -165,7 +165,7 @@ char-lstm.

To train the model, we just follow the standard high-level API. Firstly, we construct a LSTM symbolic architecture: Note all the parameters are defined in
-[examples/char-lstm/config.jl](https://github.com/dmlc/MXNet.jl/blob/master/examples/char-lstm/config.jl).
+[examples/char-lstm/config.jl](https://github.com/apache/incubator-mxnet/blob/master/julia/examples/char-lstm/config.jl).

Now we load the text file and define the data provider. The data `input.txt` we used in this example is [a tiny Shakespeare dataset](https://github.com/dmlc/web-data/tree/master/mxnet/tinyshakespeare).

@@ -318,6 +318,6 @@ illustrations](http://colah.github.io/posts/2015-08-Understanding-LSTMs/),

but could otherwise be very useful for debugging. As we can see, the LSTM unfolded over time is just a (very) deep neural network. The complete code for producing this visualization can be found in
-[examples/char-lstm/visualize.jl](https://github.com/apache/incubator-mxnet/tree/master/julia/examples/char-lstmvisualize.jl).
+[examples/char-lstm/visualize.jl](https://github.com/apache/incubator-mxnet/blob/master/julia/examples/char-lstm/visualize.jl).

![image](images/char-lstm-vis.svg)

diff --git a/julia/docs/src/tutorial/mnist.md b/julia/docs/src/tutorial/mnist.md
index cc5267071f11..edc1a67d2485 100644
--- a/julia/docs/src/tutorial/mnist.md
+++ b/julia/docs/src/tutorial/mnist.md
@@ -23,7 +23,7 @@ multi-layer perceptron and then a convolutional neural network (the

LeNet architecture) on the [MNIST handwritten digit dataset](http://yann.lecun.com/exdb/mnist/).
The code for this tutorial could be found in
-[examples/mnist](https://github.com/dmlc/MXNet.jl/tree/master/examples/mnist). There are also two Jupyter notebooks that expand a little more on the [MLP](https://github.com/ultradian/julia_notebooks/blob/master/mxnet/mnistMLP.ipynb) and the [LeNet](https://github.com/ultradian/julia_notebooks/blob/master/mxnet/mnistLenet.ipynb), using the more general `ArrayDataProvider`.
+[examples/mnist](https://github.com/apache/incubator-mxnet/blob/master/julia/docs/src/tutorial/mnist.md). There are also two Jupyter notebooks that expand a little more on the [MLP](https://github.com/ultradian/julia_notebooks/blob/master/mxnet/mnistMLP.ipynb) and the [LeNet](https://github.com/ultradian/julia_notebooks/blob/master/mxnet/mnistLenet.ipynb), using the more general `ArrayDataProvider`.

Simple 3-layer MLP
------------------

diff --git a/julia/docs/src/user-guide/overview.md b/julia/docs/src/user-guide/overview.md
index 974cc7dee974..342448a15bed 100644
--- a/julia/docs/src/user-guide/overview.md
+++ b/julia/docs/src/user-guide/overview.md
@@ -269,8 +269,6 @@ symbolic composition system. It is like

[Theano](http://deeplearning.net/software/theano/), except that we
avoided long expression compilation time by providing *larger* neural
network related building blocks to guarantee computation performance.
-See also [this note](https://mxnet.readthedocs.org/en/latest/program_model.html)
-for the design and trade-off of the MXNet symbolic composition system.

The basic type is `mx.SymbolicNode`. The following is a trivial example of composing two symbols with the `+` operation.

diff --git a/julia/examples/char-lstm/README.md b/julia/examples/char-lstm/README.md
index ac745dd4cc41..155f29603623 100644
--- a/julia/examples/char-lstm/README.md
+++ b/julia/examples/char-lstm/README.md
@@ -29,7 +29,7 @@ and `StatsBase.jl`.

## Training

This example is adapted from the
-[example in Python binding](https://github.com/dmlc/mxnet/blob/master/example/rnn/char_lstm.ipynb) of
+[example in Python binding](https://github.com/dmlc/mxnet-notebooks/blob/master/python/tutorials/char_lstm.ipynb) of
MXNet. The data `input.txt` can be downloaded
[here](https://github.com/dmlc/web-data/tree/master/mxnet/tinyshakespeare).

Modify parameters in [config.jl](config.jl) and then run [train.jl](train.jl). An example output