diff --git a/docs/static_site/src/pages/get_started/jetson_setup.md b/docs/static_site/src/pages/get_started/jetson_setup.md
index 2e58ad214795..75ca197a63fd 100644
--- a/docs/static_site/src/pages/get_started/jetson_setup.md
+++ b/docs/static_site/src/pages/get_started/jetson_setup.md
@@ -25,18 +25,17 @@ permalink: /get_started/jetson_setup
 
 # Install MXNet on a Jetson
 
-MXNet supports the Ubuntu Arch64 based operating system so you can run MXNet on NVIDIA Jetson Devices, such as the [TX2](http://www.nvidia.com/object/embedded-systems-dev-kits-modules.html) or [Nano](https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit).
+MXNet supports Ubuntu AArch64 based operating systems, so you can run MXNet on all [NVIDIA Jetson modules](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/), such as the Jetson Nano, TX1, TX2, Xavier NX and AGX Xavier.
 
 These instructions will walk through how to build MXNet and install MXNet's Python language binding.
 
-For the purposes of this install guide we will assume that CUDA is already installed on your Jetson device. The disk image provided by NVIDIA's getting started guides will have the Jetson toolkit preinstalled, and this also includes CUDA. You should double check what versions are installed and which version you plan to use.
+For the purposes of this install guide we will assume that CUDA is already installed on your Jetson device. NVIDIA JetPack provides the latest OS image for your Jetson module, along with developer tools for both the host computer and the developer kit, and it also includes CUDA. You should double check what versions are installed and which version you plan to use.
 
 After installing the prerequisites, you have several options for installing MXNet:
-1. Use a Jetson MXNet pip wheel for Python development and use a precompiled Jetson MXNet binary.
-3. Build MXNet from source
+1. Build MXNet from source
     * On a faster Linux computer using cross-compilation
     * On the Jetson itself (very slow and not recommended)
-
+2. Use a Jetson MXNet pip wheel for Python development and use a precompiled Jetson MXNet binary (not provided on this page, as CUDA-enabled wheels are not in accordance with ASF policy; users can download them from other third-party sources)
 
 ## Prerequisites
 To build from source or to use the Python wheel, you must install the following dependencies on your Jetson.
@@ -47,27 +46,22 @@ Cross-compiling will require dependencies installed on that machine as well.
 To use the Python API you need the following dependencies:
 
 ```bash
-sudo apt update
-sudo apt -y install \
+sudo apt-get update
+sudo apt-get install -y \
    build-essential \
    git \
-   graphviz \
-   libatlas-base-dev \
+   libopenblas-dev \
    libopencv-dev \
-   python-pip
+   python3-pip \
+   python3-numpy
 
-sudo pip install --upgrade \
+sudo pip3 install --upgrade \
    pip \
-   setuptools
-
-sudo pip install \
-   graphviz==0.8.4 \
-   jupyter \
-   numpy==1.15.2
+   setuptools \
+   numpy
```

 If you plan to cross-compile you will need to install these dependencies on that computer as well.
-If you get an error about something being busy, you can restart the Nano and this error will go away. You can then continue installation of the prerequisites.
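+
+Optionally, you can sanity-check that the Python 3 toolchain sees the installed packages (a quick check; it is not required for the build):
+
+```bash
+python3 -c "import numpy; print(numpy.__version__)"
+```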
 
 ### Download the source & setup some environment variables:
 
@@ -79,6 +73,11 @@ Clone the MXNet source code repository using the following `git` command in your
 git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet
 ```
 
+You can also check out a particular branch of MXNet. For example, to install MXNet v1.6:
+```bash
+git clone --recursive -b v1.6.x https://github.com/apache/incubator-mxnet.git mxnet
+```
+
 Setup your environment variables for MXNet and CUDA in your `.profile` file in your home directory.
 Add the following to the file.
 
@@ -103,49 +102,17 @@ You can check to see what version of CUDA is running with `nvcc`.
 nvcc --version
 ```
 
-To switch CUDA versions on a device or computer that has more than one version installed, use the following and replace the symbolic link to the version you want. This one uses CUDA 10.0, which is preinstalled on the Nano.
+To switch CUDA versions on a device or computer that has more than one version installed, use the following and replace the symbolic link so it points to the version you want. This one uses CUDA 10.2, which comes with JetPack 4.4.
 
 ```bash
 sudo rm /usr/local/cuda
-sudo ln -s /usr/local/cuda-10.0 /usr/local/cuda
+sudo ln -s /usr/local/cuda-10.2 /usr/local/cuda
```

 
 **Note:** When cross-compiling, change the CUDA version on the host computer you're using to match the version you're running on your Jetson device.
 
-**Note:** CUDA 10.1 is recommended but doesn't ship with the Nano's SD card image. You may want to go through CUDA upgrade steps first.
-
-## Option 1. Install MXNet for Python
-
-To use a prepared Python wheel, download it to your Jetson, and run it.
-* [MXNet 1.4.0 - Python 3](https://s3.us-east-2.amazonaws.com/mxnet-public/install/jetson/1.4.0/mxnet-1.4.0-cp36-cp36m-linux_aarch64.whl)
-* [MXNet 1.4.0 - Python 2](https://s3.us-east-2.amazonaws.com/mxnet-public/install/jetson/1.4.0/mxnet-1.4.0-cp27-cp27mu-linux_aarch64.whl)
-
-
-It should download the required dependencies, but if you have issues,
-install the dependencies in the prerequisites section, then run the pip wheel.
-
-```bash
-sudo pip install mxnet-1.4.0-cp27-cp27mu-linux_aarch64.whl
-```
-Now use a pre-compiled binary you can download it from S3 which is a patch v1.4.1:
-* https://s3.us-east-2.amazonaws.com/mxnet-public/install/jetson/1.4.1/libmxnet.so
-
-Place this file in `$MXNET_HOME/lib`.
-
-To use this with the MXNet Python binding, you must match the source directory's checked out version with the binary's source version, then install it with pip.
-
-```bash
-cd $MXNET_HOME
-git checkout v1.4.x
-git submodule update --init --recursive
-cd python
-sudo pip install -e .
-```
-
-Refer to the following Conclusion and Next Steps section to test your installation.
-
-## Option 2. Build MXNet from Source
+## Build MXNet from Source
 
 Installing MXNet from source is a two-step process:
 
@@ -164,30 +131,21 @@ Then run the following to execute cross-compilation via Docker.
 $MXNET_HOME/ci/build.py -p jetson
 ```
 
-### Manual
+### Manually on the Jetson module (Slow)
 
 **Step 1** Build the Shared Library
 
-(Skip this sub-step for compiling on the Jetson device directly.)
-Edit the Makefile to install the MXNet with CUDA bindings to leverage the GPU on the Jetson:
+Use the `config_jetson.mk` Makefile to build MXNet with CUDA bindings and leverage the GPU on the Jetson module.
 
 ```bash
-cp $MXNET_HOME/make/crosscompile.jetson.mk config.mk
+cp $MXNET_HOME/make/config_jetson.mk config.mk
```

 
-Now edit `config.mk` to make some additional changes for the Nano. Update the following settings:
-
-1. Update the CUDA path. `USE_CUDA_PATH = /usr/local/cuda`
-2. Add `-gencode arch=compute-63, code=sm_62` to the `CUDA_ARCH` setting.
-3. Update the NVCC settings. `NVCCFLAGS := -m64`
-4. (optional, but recommended) Turn on OpenCV. `USE_OPENCV = 1`
+The pre-existing Makefile builds for all Jetson architectures. Edit `config.mk` if you want to build for a particular architecture only, or if you want to build without CUDA bindings (CPU only). You can make the following changes:
 
-Now edit the Mshadow Makefile to ensure MXNet builds with Pascal's hardware level low precision acceleration by editing `3rdparty/mshadow/make/mshadow.mk`.
-The last line has `MSHADOW_USE_PASCAL` set to `0`. Change this to `1` to enable it.
+1. Modify `CUDA_ARCH` to build for specific architectures, as sketched after this list. Currently we have `CUDA_ARCH = -gencode arch=compute_53,code=sm_53 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_72,code=sm_72`. Keep `-gencode arch=compute_53,code=sm_53` for the Nano and TX1, `-gencode arch=compute_62,code=sm_62` for the TX2, and `-gencode arch=compute_72,code=sm_72` for the Xavier NX and AGX Xavier.
 
-```bash
-MSHADOW_CFLAGS += -DMSHADOW_USE_PASCAL=1
-```
+2. For CPU-only builds, remove the `USE_CUDA_PATH`, `CUDA_ARCH`, and `USE_CUDNN` flags.
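+
+For example, to build only for the Jetson Nano (a minimal sketch; keep the default `CUDA_ARCH` if you are unsure which modules you need to support), you would set:
+
+```makefile
+CUDA_ARCH = -gencode arch=compute_53,code=sm_53
+```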
 
 Now you can build the complete MXNet library with the following command:
 
@@ -204,7 +162,7 @@ To install Python bindings run the following commands in the MXNet directory:
 
 ```bash
 cd $MXNET_HOME/python
-sudo pip install -e .
+sudo pip3 install -e .
```

 
 Note that the `-e` flag is optional. It is equivalent to `--editable` and means that if you edit the source files, these changes will be reflected in the package installed.
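+
+As a quick smoke test of the Python binding (optional; this assumes the library build above completed successfully):
+
+```bash
+python3 -c "import mxnet; print(mxnet.__version__)"
+```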
@@ -222,7 +180,7 @@ This creates the required `.jar` file to use in your Java or Scala projects.
 
 ## Conclusion and Next Steps
 
-You are now ready to run MXNet on your NVIDIA Jetson TX2 or Nano device.
+You are now ready to run MXNet on your NVIDIA Jetson module.
 You can verify your MXNet Python installation with the following:
 
 ```python
diff --git a/make/config_jetson.mk b/make/config_jetson.mk
new file mode 100644
index 000000000000..7de6eff7b6b5
--- /dev/null
+++ b/make/config_jetson.mk
@@ -0,0 +1,219 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#-------------------------------------------------------------------------------
+# Template configuration for compiling mxnet
+#
+# If you want to change the configuration, please use the following
+# steps. Assume you are on the root directory of mxnet. First copy this
+# file so that any local changes will be ignored by git
+#
+# $ cp make/config_jetson.mk config.mk
+#
+# Next modify the relevant entries, and then compile by
+#
+# $ make
+#
+# or build in parallel with 8 threads
+#
+# $ make -j8
+#-------------------------------------------------------------------------------
+
+#---------------------
+# For cross compilation we only explicitly set a compiler when one is not already present.
+#--------------------
+
+ifndef CC
+export CC = gcc
+endif
+ifndef CXX
+export CXX = g++
+endif
+ifndef NVCC
+export NVCC = nvcc
+endif
+
+# whether to compile with options for MXNet developers
+DEV = 0
+
+# whether to compile with debug symbols
+DEBUG = 0
+
+# whether to turn on the segfault signal handler to log the stack trace
+USE_SIGNAL_HANDLER = 1
+
+# the additional link flags you want to add
+ADD_LDFLAGS = -L${CROSS_ROOT}/lib -L/usr/lib/aarch64-linux-gnu/
+
+# the additional compile flags you want to add
+ADD_CFLAGS = -I${CROSS_ROOT}/include -I/usr/include/aarch64-linux-gnu/
+
+#---------------------------------------------
+# matrix computation libraries for CPU/GPU
+#---------------------------------------------
+
+# whether to use CUDA during compilation
+USE_CUDA = 1
+
+# add the path to the CUDA library to the link and compile flags;
+# if you have already added them to your environment variables, leave it as NONE
+# USE_CUDA_PATH = /usr/local/cuda
+USE_CUDA_PATH = /usr/local/cuda
+
+# CUDA_ARCH setting
+CUDA_ARCH = -gencode arch=compute_53,code=sm_53 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_72,code=sm_72
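+# Note: per the install docs, the -gencode pairs above map to Jetson modules as
+# follows: sm_53 = Nano/TX1, sm_62 = TX2, sm_72 = Xavier NX/AGX Xavier.
+# To target a single module, keep only the matching pair.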
+
+# whether to enable CUDA runtime compilation
+ENABLE_CUDA_RTC = 0
+
+# whether to use the cuDNN library
+USE_CUDNN = 1
+
+# whether to use the NCCL library
+USE_NCCL = 0
+# add the path to the NCCL library
+USE_NCCL_PATH = NONE
+
+# whether to use OpenCV during compilation
+# you can disable it, however, you will not be able to use the
+# imbin iterator
+USE_OPENCV = 1
+# Add OpenCV include path, in which the directory `opencv2` exists
+USE_OPENCV_INC_PATH = NONE
+# Add OpenCV shared library path, in which the shared library exists
+USE_OPENCV_LIB_PATH = NONE
+
+# whether to use libjpeg-turbo for image decoding without the OpenCV wrapper
+USE_LIBJPEG_TURBO = 0
+# add the path to the libjpeg-turbo library
+USE_LIBJPEG_TURBO_PATH = NONE
+
+# use OpenMP for parallelization
+USE_OPENMP = 1
+
+# whether to use the MKL-DNN library
+USE_MKLDNN = 0
+
+# whether to use the NNPACK library
+USE_NNPACK = 0
+
+# choose the version of BLAS you want to use
+# can be: mkl, blas, atlas, openblas
+# by default atlas is used on Linux and apple on OSX
+UNAME_S := $(shell uname -s)
+USE_BLAS = openblas
+
+# whether to use LAPACK during compilation
+# only effective when compiled with the blas versions openblas/apple/atlas/mkl
+USE_LAPACK = 1
+
+# path to the LAPACK library in case of a non-standard installation
+USE_LAPACK_PATH =
+
+# add the path to the Intel library; you may need it for MKL if you did not add
+# the path to your environment variables
+USE_INTEL_PATH = NONE
+
+# If MKL is used only for BLAS, choose static linking automatically to allow the python wrapper
+ifeq ($(USE_BLAS), mkl)
+USE_STATIC_MKL = 1
+else
+USE_STATIC_MKL = NONE
+endif
+
+#----------------------------
+# Settings for power and arm arch
+#----------------------------
+USE_SSE=0
+
+# Turn off F16C instruction set support
+USE_F16C=0
+
+#----------------------------
+# distributed computing
+#----------------------------
+
+# whether or not to enable multi-machine support
+USE_DIST_KVSTORE = 0
+
+# whether or not to allow reading and writing HDFS directly. If yes, hadoop is
+# required
+USE_HDFS = 0
+
+# path to libjvm.so; required if USE_HDFS = 1
+LIBJVM=$(JAVA_HOME)/jre/lib/amd64/server
+
+# whether or not to allow reading and writing AWS S3 directly. If yes, then
+# libcurl4-openssl-dev is required; it can be installed on Ubuntu by
+# sudo apt-get install -y libcurl4-openssl-dev
+USE_S3 = 0
+
+#----------------------------
+# performance settings
+#----------------------------
+# Use operator tuning
+USE_OPERATOR_TUNING = 1
+
+# Use gperftools if found
+# Disabled because of #8968
+USE_GPERFTOOLS = 0
+
+# path to the gperftools (tcmalloc) library in case of a non-standard installation
+USE_GPERFTOOLS_PATH =
+
+# Use JEMalloc if found, and not using gperftools
+USE_JEMALLOC = 1
+
+# path to the jemalloc library in case of a non-standard installation
+USE_JEMALLOC_PATH =
+
+#----------------------------
+# additional operators
+#----------------------------
+
+# path to folders containing project-specific operators that you don't want to put in src/operators
+EXTRA_OPERATORS =
+
+#----------------------------
+# other features
+#----------------------------
+
+# Create the C++ interface package
+USE_CPP_PACKAGE = 0
+
+# Use the int64_t type to represent the total number of elements in a tensor
+# This will cause performance degradation, as reported in issue #14496
+# Set to 1 for large tensors with a tensor size greater than INT32_MAX (i.e. 2147483647)
+# Note: the size of each dimension is still bounded by INT32_MAX
+USE_INT64_TENSOR_SIZE = 0
+
+#----------------------------
+# plugins
+#----------------------------
+
+# whether to use Caffe integration. This requires installing Caffe.
+# You also need to add CAFFE_PATH/build/lib to your LD_LIBRARY_PATH
+# CAFFE_PATH = $(HOME)/caffe
+# MXNET_PLUGINS += plugin/caffe/caffe.mk
+
+# WARPCTC_PATH = $(HOME)/warp-ctc
+# MXNET_PLUGINS += plugin/warpctc/warpctc.mk
+
+# whether to use SFrame integration. This requires building SFrame from
+# git@github.com:dato-code/SFrame.git
+# SFRAME_PATH = $(HOME)/SFrame
+# MXNET_PLUGINS += plugin/sframe/plugin.mk
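
For reference, the end-to-end flow described by the two files above, as it might be run on the Jetson itself (a sketch: it assumes `MXNET_HOME` is set as in the docs, and the `make -j` invocation follows the template header):

```bash
cp $MXNET_HOME/make/config_jetson.mk $MXNET_HOME/config.mk
cd $MXNET_HOME
make -j $(nproc)
cd python
sudo pip3 install -e .
```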