# NNVM Overview

NNVM is a C++ library that helps developers build deep learning systems.
It provides ways to construct, represent, and transform computation graphs
independent of how they are executed.

To begin, let us tell a few stories that illustrate the design goals.
## Stories and Design Goals

X has built a new deep learning framework for image classification for fun.
With modular tools like cuDNN and CUDA, it is not hard to assemble a C++ API.
However, most users prefer Python, R, Scala, or other languages.
By registering the operators with NNVM, X can quickly get a graph composition
front-end in these languages without coding one up for each
language.

Y wants to build a deep learning serving system on embedded devices.
To do that, we need to cut things out rather than add new parts,
because code such as gradient calculation and multi-GPU scheduling is NOT relevant.
It is also hard to build things from scratch, because we want to
reuse components such as memory optimization and kernel execution.
This is hard in current frameworks because all this information
is tied to the operator interface. We want to be able to keep
the parts of the system we need and throw away the other parts
to get the minimum system that suffices.

Z wants to extend an existing deep learning system by adding a new feature,
say FPGA execution of some operators. To do so, Z needs to add an interface like ```FPGAKernel```
to the operators. E wants to add another feature that generates code for a
certain subset of operations, so an interface like ```GenLLVMCode``` needs to be added
to the operator as well. Eventually the system ends up with a fat operator interface
in order to support everything, while everyone only wants part of it.

We can think of more stories as the deep learning landscape shifts to more devices,
applications, and scenarios. It is desirable to have different specialized
learning systems, each of which solves some problems well.

Here is a list of things we want:
- Minimum dependency
- Being able to assemble some parts together while discarding others
- No centralized operator interface, but still allowing users to provide various information about operators.

## Minimum Registration for a Symbolic Front-End
To use NNVM to build a language front-end, developers only need to register
minimum information about each operator.
```c++
NNVM_REGISTER_OP(add)
.describe("add two data together")
.set_num_inputs(2);

NNVM_REGISTER_OP(conv2d)
.describe("take 2d convolution of input")
.set_num_inputs(2);

NNVM_REGISTER_OP(assign)
.describe("assign second input argument to the first one")
.set_num_inputs(2);
```

After compiling this code with the nnvm library, users can use the following interface
to compose computation graphs in Python:

```python
import nnvm.symbol as nn

# symbolic variable
x = nn.Variable('x')
y = nn.Variable('y')
w = nn.Variable('w')

z = nn.conv2d(nn.add(x, y), w, filter_size=(2,2), name='conv1')
```

The graph structure can be accessed from the backend. Currently a Python interface is supported,
but since NNVM follows the same C bridge API design as [MXNet](https://github.com/dmlc/mxnet),
which supports many languages such as R, Julia, Scala, and C++, support for more languages can
easily be added in the future.

## Operator Attribute for More Extensions

While the minimum information provided by the operators is enough to build a front-end,
we need more information from each operator in order to transform and execute the graph.
A typical difference between a neural net's computation graph and traditional LLVM IR is that
there are many more high-level operators, so we cannot fix the set of operators in the graph.

Instead, developers are allowed to register attributes of operators. The attributes can include the shape
inference function, whether the operator can be carried out in place, etc.
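
For instance, a shape inference function can be attached as just another attribute. The following is a minimal sketch, assuming the ```FInferShape``` function type from ```nnvm/op_attr_types.h``` (the exact signature may vary across versions):

```c++
#include <vector>
#include <nnvm/op.h>
#include <nnvm/op_attr_types.h>

using nnvm::NodeAttrs;
using nnvm::TShape;

// Shape inference for elementwise add: both inputs and the output
// must share a single shape, taken here from the first input.
NNVM_REGISTER_OP(add)
.attr<nnvm::FInferShape>("FInferShape",
  [](const NodeAttrs& attrs,
     std::vector<TShape>* in_shapes,
     std::vector<TShape>* out_shapes) {
    if ((*in_shapes)[0].ndim() == 0) return false;  // shape not known yet
    (*in_shapes)[1] = (*in_shapes)[0];
    (*out_shapes)[0] = (*in_shapes)[0];
    return true;  // inference succeeded
  });
```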

This design of having an operator attribute registry is not uncommon in deep learning systems.
For example, MXNet has an ```OpProperty``` class, TensorFlow has ```OpDef```, and Caffe2 has an ```OperatorSchema``` class.
However, the operator attribute interfaces in these frameworks only support a fixed set of attributes of interest to the system.
For example, MXNet supports in-place optimization decisions and shape and type inference functions.
If we want to extend a framework with a new type of per-operator attribute, we need to change the operator registry.
Eventually the operator interface grows big and has to evolve in the centralized repo.

In NNVM, we decided to change the design and support arbitrary types of operator attributes,
without the need to change the operator registry. This also echoes the need for a minimum interface,
so that the code can be shared more easily across multiple projects.

Users can register new attributes, such as an in-place property checking function, as follows.
```c++
using FInplaceOption = std::function<
  std::vector<std::pair<int, int> > (const NodeAttrs& attrs)>;

// attributes can be registered from multiple places.
NNVM_REGISTER_OP(add)
.set_num_inputs(2);

// register to tell that the first input can be computed in place as the first output
NNVM_REGISTER_OP(add)
.attr<FInplaceOption>("FInplaceOption", [](const NodeAttrs& attrs) {
  return std::vector<std::pair<int, int> >{{0, 0}};
});

NNVM_REGISTER_OP(exp)
.set_num_inputs(1)
.attr<FInplaceOption>("FInplaceOption", [](const NodeAttrs& attrs) {
  return std::vector<std::pair<int, int> >{{0, 0}};
});
```

These attributes can be queried from arbitrary parts of the code, as in the example below.
Under the hood, each attribute is stored in an any-typed columnar store
that can easily be retrieved, cast back to a typed table, and used for quick lookups.

```c++
void MyFunction() {
  const Op* add = Op::Get("add");
  // if we need quick queries, we can use a static variable;
  // the attribute map contains the attributes of all operators.
  static auto& finplace_option_map = Op::GetAttr<FInplaceOption>("FInplaceOption");

  // quick lookup of the attribute of add; O(1) time, vector index lookup internally.
  auto add_inplace = finplace_option_map[add];
}
```
Besides keeping the code minimal, this attribute store enables decentralization of projects.
Before, all the attributes of an operator had to sit in one centralized interface class.
Now, everyone can register their own attributes and take the attributes they need from other projects,
without any need to change the operator interface.
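
For instance, a hypothetical accelerator project could define and register its own kernel attribute entirely inside its own repository; ```FMyAcceleratorKernel``` below is an invented name used only for illustration:

```c++
#include <functional>
#include <nnvm/op.h>

// Hypothetical attribute type owned by a downstream accelerator project;
// nothing in the NNVM core needs to know about it.
using FMyAcceleratorKernel = std::function<void(const nnvm::NodeAttrs& attrs)>;

// Registered from the accelerator project's own source file;
// the core definition of `exp` stays untouched.
NNVM_REGISTER_OP(exp)
.attr<FMyAcceleratorKernel>("FMyAcceleratorKernel",
  [](const nnvm::NodeAttrs& attrs) {
    // ... launch the accelerator implementation of exp here ...
  });
```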

See the [example code](../example/src/operator.cc) for how operators can be registered.

## Graph and Pass

As we gather more information about the operators,
we can use it to perform optimizations and derive more information about the graph.
The Graph is the unit we manipulate in these steps. A Graph in NNVM contains
two parts:
- The computation graph structure
- An attribute map from string to any type: ```map<string, shared_ptr<any> >```

The second part, the attribute map, is quite important, as we may need different kinds
of information about the graph during the transformation process, be it the
shapes of each tensor, the types of each tensor, or the storage allocation plan.
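
For example, once shape inference has populated the attribute map, a later transformation can read the result back in typed form. This sketch assumes the ```GetAttr``` helper on ```Graph``` and the ```"shape"``` attribute key conventionally produced by the shape inference pass:

```c++
#include <nnvm/graph.h>
#include <nnvm/graph_attr_types.h>

// Read back the shapes computed by shape inference; ShapeVector holds
// one TShape per node entry in the graph.
void InspectShapes(const nnvm::Graph& graph) {
  const nnvm::ShapeVector& shapes =
      graph.GetAttr<nnvm::ShapeVector>("shape");
  // ... shapes[entry_id] is the inferred shape of that tensor ...
}
```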

A ```Pass``` can take a graph with existing attribute information
and transform it into the same graph with more attributes, or into another graph.

NNVM ships with a number of passes, including symbolic differentiation,
memory planning, and shape/type inference, and more can be supported.
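
As a rough sketch of how a pass is invoked, assuming the ```ApplyPass``` entry point from ```nnvm/pass.h``` and the attribute keys (```"shape_inputs"```, ```"shape"```) conventionally used by the shipped shape inference pass:

```c++
#include <nnvm/graph.h>
#include <nnvm/graph_attr_types.h>
#include <nnvm/pass.h>

// Run the "InferShape" pass: supply the known input shapes through the
// graph attribute map, then request the pass by name. The returned graph
// carries a new "shape" attribute with one shape per node entry.
nnvm::Graph RunShapeInference(nnvm::Graph graph,
                              nnvm::ShapeVector input_shapes) {
  graph.attrs["shape_inputs"] =
      std::make_shared<dmlc::any>(std::move(input_shapes));
  return nnvm::ApplyPass(std::move(graph), "InferShape");
}
```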

## Executing the Graph

Currently the library defines nothing about how the graph is executed.
Execution is intentionally excluded from this module because we believe
it can be another module, and there can be many ways to execute one graph.
We can target different runtime platforms, or users can even write their own.

More importantly, information such as the memory allocation plan and the
shape and type of each tensor can be used during the execution phase
to improve performance.

We can also register more runtime-related information in the operator registry,
and define pass functions that perform runtime-related optimizations of the graph.
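
As one illustration of what such a module might look like, an executor could register a hypothetical ```FCompute``` attribute for each operator and dispatch it while walking the graph in topological order. ```FCompute``` and ```ExecuteGraph``` are invented names for this sketch, not NNVM APIs:

```c++
#include <functional>
#include <nnvm/graph.h>
#include <nnvm/op.h>

// Hypothetical per-operator kernel attribute defined by an execution module.
using FCompute = std::function<void(const nnvm::NodeAttrs& attrs)>;

void ExecuteGraph(const nnvm::Graph& graph) {
  static auto& fcompute = nnvm::Op::GetAttr<FCompute>("FCompute");
  // indexed_graph() lists the nodes in topological order.
  const auto& idx = graph.indexed_graph();
  for (uint32_t nid = 0; nid < idx.num_nodes(); ++nid) {
    const nnvm::Node* node = idx[nid].source;
    if (node->is_variable()) continue;  // variables hold data, not compute
    fcompute[node->op()](node->attrs);  // dispatch the registered kernel
  }
}
```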

## Relation to LLVM

NNVM is inspired by LLVM. It sits at a higher level, in the sense that there are many
optimization opportunities we can exploit by knowing high-level information about the operators.
On the other hand, we do believe that code generation to LLVM can be a natural extension and can benefit some of the use cases.

## Unix Philosophy in Learning Systems

There are a few existing computation-graph-based deep learning frameworks (e.g. Theano, TensorFlow, Caffe2, MXNet).
NNVM does not intend to become another one. Instead, NNVM provides a module in which

- The graph representation is minimal, with no code dependency.
- Operator attributes allow arbitrary information to be registered in a unified way.
- The graph is invariant of the execution layer, so it can be re-targeted to multiple front-ends and backends.

We believe this is the right way to build learning systems.
By having more such modules, we can pick the ones we need and remove the ones we do not want for our use cases.
Hopefully these efforts can make deep learning system research and development easy, fun, and rewarding.