Merge pull request PaddlePaddle#28 from PaddlePaddle/add-inference
add inference doc
reyoung authored Jun 30, 2018
2 parents 75e032b + f2f7fbb commit d6709c7
Showing 4 changed files with 214 additions and 2 deletions.
2 changes: 1 addition & 1 deletion paddle
Submodule paddle updated 26 files
+20 −0 doc/fluid/design/concepts/lod_tensor.md
+4 −4 doc/fluid/howto/optimization/host_memory_profiling_cn.md
+26 −0 doc/fluid/howto/optimization/timeline_cn.md
+0 −0 doc/fluid/howto/optimization/timeline_en.md
+1 −1 paddle/fluid/framework/details/multi_devices_graph_builder.h
+1 −1 paddle/fluid/framework/parallel_executor.cc
+1 −0 python/paddle/fluid/framework.py
+9 −9 python/paddle/fluid/layers/nn.py
+31 −26 python/paddle/fluid/lod_tensor.py
+5 −12 python/paddle/fluid/optimizer.py
+14 −14 python/paddle/fluid/tests/book/high-level-api/label_semantic_roles/test_label_semantic_roles_newapi.py
+7 −5 python/paddle/fluid/tests/book/high-level-api/machine_translation/test_machine_translation.py
+7 −5 python/paddle/fluid/tests/book/high-level-api/recommender_system/test_recommender_system_newapi.py
+5 −5 python/paddle/fluid/tests/book/high-level-api/understand_sentiment/test_understand_sentiment_conv.py
+5 −5 python/paddle/fluid/tests/book/high-level-api/understand_sentiment/test_understand_sentiment_dynamic_rnn.py
+5 −5 python/paddle/fluid/tests/book/high-level-api/understand_sentiment/test_understand_sentiment_stacked_lstm.py
+10 −9 python/paddle/fluid/tests/book/high-level-api/word2vec/test_word2vec_new_api.py
+10 −6 python/paddle/fluid/tests/book/notest_understand_sentiment.py
+47 −15 python/paddle/fluid/tests/book/test_label_semantic_roles.py
+8 −6 python/paddle/fluid/tests/book/test_machine_translation.py
+7 −5 python/paddle/fluid/tests/book/test_recommender_system.py
+7 −7 python/paddle/fluid/tests/book/test_rnn_encoder_decoder.py
+11 −10 python/paddle/fluid/tests/book/test_word2vec.py
+42 −31 python/paddle/fluid/tests/test_lod_tensor.py
+10 −1 python/paddle/fluid/tests/unittests/test_layers.py
+2 −2 python/setup.py.in
7 changes: 6 additions & 1 deletion source/advanced_usage/deploy/index.rst
Expand Up @@ -5,6 +5,11 @@
Server Side
###########

.. toctree::
   :maxdepth: 2

   native_inference_engine.rst


Mobile Side
###########
108 changes: 108 additions & 0 deletions source/advanced_usage/deploy/native_inference_engine.rst
@@ -0,0 +1,108 @@
Paddle Inference API
====================

To make inference deployment simpler and more convenient, Fluid provides a set
of high-level APIs that hide the different optimized implementations underneath.

The `inference library code <https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/contrib/inference>`__
consists of

- the header file ``paddle_inference_api.h``, which defines all the interfaces
- the library file ``libpaddle_fluid.so`` or ``libpaddle_fluid.a``
- the library file ``libpaddle_inference_api.so`` or
  ``libpaddle_inference_api.a``

For building and dependencies, see :ref:`install_or_build_cpp_inference_lib`.

The following introduces some of the API concepts.

PaddleTensor
------------

PaddleTensor defines the most basic input/output data format for inference. It is defined as

.. code:: cpp

   struct PaddleTensor {
     std::string name;  // variable name.
     std::vector<int> shape;
     PaddleBuf data;  // blob of data.
     PaddleDType dtype;
   };

- ``name`` specifies the name of the variable in the model that the input
  corresponds to (currently unused, but it will be needed once arbitrary
  targets are supported)
- ``shape`` is the shape of the tensor
- ``data`` holds the data as contiguous memory in a ``PaddleBuf``; a
  ``PaddleBuf`` can either wrap external data or allocate its own memory with
  ``malloc`` (see the definitions in the header file for details)
- ``dtype`` is the data type of the tensor

engine
------

The high-level API is backed by several optimized implementations, which we
call engines. There are currently three:

- the native engine, composed of paddle's native forward operators, which
  naturally supports every model trained with paddle
- the Anakin engine, which wraps
  `Anakin <https://github.com/PaddlePaddle/Anakin>`__; it performs well on
  some models, but only accepts Anakin's own model format and therefore cannot
  support all paddle models
- the TensorRT mixed engine, which supports all paddle models and uses
  subgraphs to automatically offload parts of the computation graph to
  `TensorRT <https://developer.nvidia.com/tensorrt>`__ for acceleration (WIP)

It is implemented as

.. code:: cpp

   enum class PaddleEngineKind {
     kNative = 0,        // Use the native Fluid facility.
     kAnakin,            // Use Anakin for inference.
     kAutoMixedTensorRT  // Automatically mixing TensorRT with the Fluid ops.
   };

Inference Deployment Process
----------------------------

Overall, deployment consists of the following steps:

1. Create a ``PaddlePredictor`` with an appropriate config
2. Create the input ``PaddleTensor``\ s and pass them to the ``PaddlePredictor``
3. Fetch the output ``PaddleTensor``\ s and read out the results

Below is a complete walkthrough for a simple model, with some details omitted

.. code:: cpp

   #include "paddle_inference_api.h"

   // Create a config and adjust the relevant settings.
   paddle::NativeConfig config;
   config.model_dir = "xxx";
   config.use_gpu = false;
   // Create a native PaddlePredictor.
   auto predictor =
       paddle::CreatePaddlePredictor<NativeConfig, PaddleEngineKind::kNative>(config);
   // Create the input tensor.
   int64_t data[4] = {1, 2, 3, 4};
   paddle::PaddleTensor tensor{.name = "",
                               .shape = std::vector<int>({4, 1}),
                               .data = PaddleBuf(data, sizeof(data)),
                               .dtype = PaddleDType::INT64};
   // Gather the inputs into the slots vector expected by Run().
   std::vector<paddle::PaddleTensor> slots({tensor});
   // Create the output tensors; their memory can be reused across runs.
   std::vector<paddle::PaddleTensor> outputs;
   // Run inference.
   CHECK(predictor->Run(slots, &outputs));
   // Fetch the outputs ...

At build time, simply link against ``libpaddle_fluid.a/.so`` and
``libpaddle_inference_api.a/.so``.

Detailed Code References
------------------------

- `inference
  demos <https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/contrib/inference/demo>`__
- `a more involved single-threaded/multi-threaded example <https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/contrib/inference/test_paddle_inference_api_impl.cc>`__
99 changes: 99 additions & 0 deletions source/beginners_guide/install/build_and_install_lib_cn.rst
@@ -0,0 +1,99 @@
.. _install_or_build_cpp_inference_lib:

Installing and Building the C++ Inference Library
=================================================

Direct Download and Installation
--------------------------------

====================== ========================================
Version                C++ inference library
====================== ========================================
cpu_avx_mkl            `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxCp27cp27mu/.lastSuccessful/fluid.tgz>`_
cpu_avx_openblas       `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxOpenblas/.lastSuccessful/fluid.tgz>`_
cpu_noavx_openblas     `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuNoavxOpenblas/.lastSuccessful/fluid.tgz>`_
cuda7.5_cudnn5_avx_mkl `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda75cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz>`_
cuda8.0_cudnn5_avx_mkl `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda80cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz>`_
cuda8.0_cudnn7_avx_mkl `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda8cudnn7cp27cp27mu/.lastSuccessful/fluid.tgz>`_
cuda9.0_cudnn7_avx_mkl `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda90cudnn7avxMkl/.lastSuccessful/fluid.tgz>`_
====================== ========================================

Building from Source
--------------------

Users can also build the C++ inference library from the PaddlePaddle core code
by setting the following build options at configure time:

================= =================
Option            Value
================= =================
CMAKE_BUILD_TYPE  Release
FLUID_INSTALL_DIR installation path
WITH_FLUID_ONLY   ON (recommended)
WITH_SWIG_PY      OFF (recommended)
WITH_PYTHON       OFF (recommended)
WITH_GPU          ON/OFF
WITH_MKL          ON/OFF
================= =================

Using the recommended values avoids linking against unnecessary libraries. Set the other optional build options as needed.

The snippet below pulls the latest code from GitHub and configures the build options (replace PADDLE_ROOT with the installation path for the PaddlePaddle inference library):

.. code-block:: bash

   PADDLE_ROOT=/path/of/capi
   git clone https://github.com/PaddlePaddle/Paddle.git
   cd Paddle
   mkdir build
   cd build
   cmake -DFLUID_INSTALL_DIR=$PADDLE_ROOT \
         -DCMAKE_BUILD_TYPE=Release \
         -DWITH_FLUID_ONLY=ON \
         -DWITH_SWIG_PY=OFF \
         -DWITH_PYTHON=OFF \
         -DWITH_MKL=OFF \
         -DWITH_GPU=OFF \
         ..
   make
   make inference_lib_dist

After a successful build, all the dependencies needed to use the C++ inference
library, namely (1) the built PaddlePaddle inference library and header files,
(2) the third-party libraries and header files, and (3) the version and
build-option information, are placed under the PADDLE_ROOT directory. The
directory structure is as follows:

.. code-block:: text

   PaddleRoot/
   ├── CMakeCache.txt
   ├── paddle
   │   └── fluid
   │       ├── framework
   │       ├── inference
   │       ├── memory
   │       ├── platform
   │       ├── pybind
   │       └── string
   ├── third_party
   │   ├── boost
   │   │   └── boost
   │   ├── eigen3
   │   │   ├── Eigen
   │   │   └── unsupported
   │   └── install
   │       ├── gflags
   │       ├── glog
   │       ├── mklml
   │       ├── protobuf
   │       ├── snappy
   │       ├── snappystream
   │       └── zlib
   └── version.txt

version.txt records the version information of the library, including the Git commit ID, whether the OpenBlas or MKL math library was used, and the CUDA/CUDNN version numbers, for example:

.. code-block:: text

   GIT COMMIT ID: c95cd4742f02bb009e651a00b07b21c979637dc8
   WITH_MKL: ON
   WITH_GPU: ON
   CUDA version: 8.0
   CUDNN version: v5
