This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Dynamic subgraph compile support #17623

Merged
merged 56 commits into from
Mar 19, 2020
Changes from 52 commits
639db17
passed args down to acceptSubgraph
samskalicky Feb 17, 2020
5898d53
added example and set param names on inputs to subgraph to map
samskalicky Feb 18, 2020
2294584
increased lib_api version number
samskalicky Feb 19, 2020
fad8e74
fixed whitespace
samskalicky Feb 19, 2020
734f1c4
fixed spacing
samskalicky Feb 19, 2020
934ae8f
Merge branch 'master' of https://github.com/apache/incubator-mxnet in…
samskalicky Feb 20, 2020
ceed9be
added info about lib_api.h to README
samskalicky Feb 20, 2020
098db85
updated readme for new args argument to reviewSubgraph
samskalicky Feb 20, 2020
cfcc0a6
added more tests
samskalicky Feb 20, 2020
1fa7f1d
added example for partitioning HybridBlock in-place without forward pass
samskalicky Feb 21, 2020
8f37c48
added example for partitioning
samskalicky Feb 21, 2020
729173f
fixed whitespace
samskalicky Feb 21, 2020
bb90d70
fixed sanity
samskalicky Feb 21, 2020
06c3841
fixed lint
samskalicky Feb 21, 2020
f8f6191
added support for passing aux
samskalicky Feb 22, 2020
dc17e3f
fixed lint
samskalicky Feb 22, 2020
56bbb01
sanity
samskalicky Feb 22, 2020
a12517d
perl changes
samskalicky Feb 22, 2020
c5d322e
replaced code with hybridize call
samskalicky Feb 24, 2020
4333260
added unittest for gluon optimize_for
samskalicky Feb 24, 2020
adde456
fixed whitespace
samskalicky Feb 24, 2020
8f58f33
fixed test
samskalicky Feb 24, 2020
4daefa7
addressed comments
samskalicky Feb 26, 2020
68f3de0
fixed grammar
samskalicky Feb 26, 2020
3e4b09a
Merge branch 'master' of https://github.com/apache/incubator-mxnet in…
samskalicky Feb 26, 2020
520edcc
fixed spelling
samskalicky Feb 26, 2020
55d575b
added aux argument to the reviewSubgraph API in README
samskalicky Feb 26, 2020
005b53c
updated infer shape to use aux for optimize_for
samskalicky Feb 27, 2020
a51486c
Merge branch 'subgraph_compile' of https://github.com/samskalicky/inc…
samskalicky Feb 27, 2020
668e315
fixed spacing
samskalicky Feb 27, 2020
bb7e52d
changed shape/dtype keys so they dont conflict with MXNet operator attrs
samskalicky Feb 27, 2020
20382ae
added error message to show missing arg/aux
samskalicky Feb 27, 2020
a2a9df1
added calls to setDLtensor for MXTensor constructors
samskalicky Feb 27, 2020
23958da
changed tests to pass aux in addition to args
samskalicky Feb 27, 2020
4f1d0d2
fixed bug passing attributes
samskalicky Feb 29, 2020
127bcbf
fixed memory leak where user attribute strings were not freed
samskalicky Feb 29, 2020
3f57b9b
added passing down shapes/dtypes to subgraph inputs
samskalicky Feb 29, 2020
c971fdc
fixed style
samskalicky Feb 29, 2020
2d9995a
fixed docstring
samskalicky Mar 5, 2020
ade7d48
removed space
samskalicky Mar 5, 2020
736cd8f
changed defines
samskalicky Mar 6, 2020
7fb9ea3
fixed bug in indexing into map with shapes/types when annotating the …
samskalicky Mar 13, 2020
b0a79e5
added support for MKLDNN tensor format conversion in case user does p…
samskalicky Mar 13, 2020
488740a
cleaned up code and added comments
samskalicky Mar 13, 2020
3b278be
fixed whitespace
samskalicky Mar 13, 2020
26734fe
added guards around MKLDNN checks for non-MKLDNN builds
samskalicky Mar 13, 2020
277288d
refactor to use pointers to reduce code duplication
samskalicky Mar 14, 2020
5940450
added MKLDNN guards for custom op
samskalicky Mar 14, 2020
c1d3f5e
fixed whitespace
samskalicky Mar 14, 2020
0b38e5c
added subgraph property API to let subg_prop initialize subgraph inputs
samskalicky Mar 16, 2020
d59a4dc
moved custom code to subgraph property API, cleaned up build_subgraph.cc
samskalicky Mar 16, 2020
5abc8c3
added support for ops with multiple outputs and InitSubgraphInputs
samskalicky Mar 16, 2020
90f6973
fixed sanity, removed prints
samskalicky Mar 16, 2020
28b6bef
fixed whitespace
samskalicky Mar 16, 2020
516d149
fixed shape/dtype parsing
samskalicky Mar 17, 2020
4e2efec
fixed lint
samskalicky Mar 17, 2020
69 changes: 47 additions & 22 deletions example/extensions/lib_subgraph/README.md
Expand Up @@ -53,9 +53,11 @@ You can start getting familiar with custom partitioners by running an example pr

* **lib_subgraph/test_subgraph.py**: This file calls `mx.library.load('libsubgraph_lib.so')` to load the library containing the custom components, partitions the model using the `optimize_for` API, and prints outputs of the forward passes. The outputs should be the same as the regular MXNet forward pass without partitioning.

* **include/mxnet/lib_api.h**: This file from the MXNet source code is the single header file needed to include all necessary data types and function prototypes for writing a custom operator library. You can either specify the include path in the `Makefile`, or copy the header file over to the `example/extensions/lib_subgraph` folder. Note that apart from this header, the custom operator library is independent of the MXNet source.

## Writing Custom Partitioner Library

For building a library containing your own custom partitioner, compose a C++ source file like `mypart_lib.cc`, include `lib_api.h` header file, and write your custom partitioner with these essential functions:
To build your own library containing a custom partitioner, compose a C++ source file like `mypart_lib.cc`, include the `lib_api.h` header file, and write your custom partitioner with these essential functions:
- `initialize` - Library Initialization Function
- `REGISTER_PARTITIONER` - Partitioner Registration Macro
- `mySupportedOps` - Operator Support
Expand All @@ -76,34 +78,60 @@ sym, _, _ = mx.model.load_checkpoint('mymodel', 0)
# Symbol/Module flow
sym2 = sym.optimize_for("myPart")

# Gluon flow
# Gluon flow 1
sym_block = nn.SymbolBlock(sym, inputs)
sym_block.hybridize(backend='myPart')

# Gluon flow 2
sym_block = nn.SymbolBlock(sym, inputs)
sym_block.optimize_for(x, backend='myPart')
```

In the Gluon hybridize flow, the model is actually hybridized during the first inference, rather than immediately when calling `hybridize`. This hybridize-based flow is useful if a user expects to run inference immediately after hybridizing. For users that just want to partition without running a whole forward pass, the `optimize_for` API combines the hybridize and forward steps but does not run a forward pass. After calling `optimize_for`, users can `export` their model immediately.

### Using a Custom Partitioner Library

Partitioning APIs in MXNet are available in both Symbol and Gluon APIs. For the Symbol API, the `optimize_for` API can be called on Symbol objects to return a partitioned Symbol.

```
optimize_for(backend, args=None, ctx=None, **kwargs)
optimize_for(backend, args=None, aux=None, ctx=None, **kwargs)
```

The `optimize_for` API takes at least 1 argument, `backend` which is a string that identifies which backend to partition the model for. The `args` argument is optional and takes a list of NDArray or dict of str to NDArray. It is used to infer shapes and types and before partitioning. The `ctx` argument is optional and takes a device context to infer storage types. It also take any other user-specified options that will be passed to the backend partitioning APIs.
The `optimize_for` API takes at least one argument, `backend`, a string that identifies which backend to partition the model for. The `args` and `aux` arguments are optional and each take a list of NDArray or a dict of str to NDArray. They are used to infer shapes and types before partitioning, and are passed to the backend for use during compilation. The `ctx` argument is optional and takes a device context used to infer storage types. Any other user-specified options are passed as keyword arguments to the backend partitioning APIs.

For the Gluon API, the `hybridize` API can be called on HybridBlocks to partition the internal CachedOp Symbol.

```
hybridize(backend=None, backend_opts=None)
hybridize(backend=None, backend_opts=None, **kwargs)
```

The `hybridize` function prepares the HybridBlock to be converted into a backend symbol. The `backend` argument is a string that identifies which backend will partition the model. The `backend_opts` argument takes other user-specified options that will be passed to the backend partitioning APIs. The actual partitioning takes place during the forward pass.

If you just want to partition the HybridBlock but not run a complete forward pass, you can use the `optimize_for` API that combines the work done in the `hybridize` API with part of the work done in the forward pass.

```
optimize_for(x, backend=None, backend_opts=None, **kwargs)
```

When the `optimize_for` API is called on a HybridBlock it partitions immediately. This lets users export the partitioned model without running a complete forward pass.

```
block.optimize_for(x, backend='myPart')
block.export('partitioned')
```

When the `hybridize` function is called, Gluon will convert the program’s execution into the style used in symbolic programming. The `backend` argument is a string that identifies which backend to partition the model for. The `backend_opts` takes other user-specified options that will be passed to the backend partitioning APIs.
You can also use `optimize_for` in place of `hybridize` and run inference immediately afterward.

```
block.optimize_for(x, backend='myPart')
block(x)
```

### Writing A Custom Partitioner

There are several essential building blocks for making a custom partitioner:

* [initialize](./subgraph_lib.cc#L242):
* [initialize](./subgraph_lib.cc#L261):
* This function is the library initialization function necessary for any dynamic library. It lets you check if the user is using a compatible version of MXNet. Note that this `version` parameter is passed from MXNet when the library is loaded.

MXReturnValue initialize(int version)
Expand All @@ -116,40 +144,37 @@ There are several essential building blocks for making a custom partitioner:
std::vector<bool>& ids,
std::unordered_map<std::string, std::string>& options)

* [REGISTER_PARTITIONER(my_part_name)](./subgraph_lib.cc#L238):
* [REGISTER_PARTITIONER(my_part_name)](./subgraph_lib.cc#L257):
* This macro registers the custom partitioner and its properties to MXNet by its name. Notice that a partitioner can have multiple partitioning strategies. This enables multiple *passes* to be run in a single partitioning call from the user. The first argument to `addStrategy` is a user-specified name. The second argument is the `supportedOps` function. The third argument is the name of the subgraph operator to create for each subgraph created during partitioning (see below for more info about subgraph operators). The `setReviewSubgraph` API registers a callback function that is called for each subgraph created during partitioning (more on this below). Notice that the first argument to this function is the strategy to associate with and the second argument is the `reviewSubgraph` function.

REGISTER_PARTITIONER(my_part_name)
.addStrategy("strategy1",
supportedOps,
"_custom_subgraph_op")
.setReviewSubgraph("strategy1",
reviewSubgraph);
.addStrategy("strategy1", supportedOps, "_custom_subgraph_op")
.setReviewSubgraph("strategy1", reviewSubgraph);


Also there are some optional functions you can specify:

* [reviewSubgraph](./subgraph_lib.cc#L220):
* [reviewSubgraph](./subgraph_lib.cc#L219):
* This function provides an opportunity to accept/reject a subgraph after MXNet partitions it. It also allows specifying custom attributes on the subgraph (e.g., user-generated IDs). If you do not register this function, subgraphs will be accepted by default.

MXReturnValue reviewSubgraph(
std::string json,
int subraph_id,
int subgraph_id,
bool* accept,
std::unordered_map<std::string,
std::string>& options,
std::unordered_map<std::string,
std::string>& attrs)
std::unordered_map<std::string, std::string>& options,
std::unordered_map<std::string, std::string>& attrs,
std::map<std::string, MXTensor>& args,
std::map<std::string, MXTensor>& aux)

Let’s take a closer look at those registry functions:

* **supportedOps**: This function takes four arguments. The 1st argument is a JSON string of the model architecture graph, where nodes are inputs/params/weights and edges are data dependencies. The graph is pre-sorted in topological order. The 2nd argument is an array of booleans, one for each operator in the model. When traversing the graph, operators to be partitioned into subgraphs are identified and an entry is set to `true` for the node ID in the `ids` array. The last argument is the map of options specified by the user. Users can pass custom options to the partitioner and they are passed to this function in the `options` map.
* **supportedOps**: This function takes three arguments. The 1st argument is a JSON string of the model architecture graph, where nodes are inputs/params/weights and edges are data dependencies. The graph is pre-sorted in topological order. The 2nd argument is an array of booleans, one for each operator in the model. When traversing the graph, operators to be partitioned into subgraphs are identified, and the entry in the `ids` array at the index corresponding to the node ID is set to `true`. The 3rd argument is the map of options specified by the user. Users can pass custom options to the partitioner and they are passed to this function in the `options` map.

* **reviewSubgraph**: This function takes five arguments. The 1st argument is a JSON string of the newly partitioned subgraph. The 2nd argument is the subgraph ID, this is just a number MXNet uses to identify this particular subgraph (it starts at zero and increments). The 3rd argument is an output to be set in this function to tell MXNet whether to accept (value: `true`) or reject (value: `false`) the subgraph. The 4th argument is the map of options specified by the user. The last argument is a map of attributes that should be set on the created subgraph. These attributes will be available later at runtime, and provides a mechanisn to pass info from partition-time to runtime. You might want to reject a subgraph if it doesnt include all the operators you want, for example. The `options` map is the same one passed to the `supportedOps` API.
* **reviewSubgraph**: This function takes seven arguments. The 1st argument is a JSON string of the newly partitioned subgraph. The 2nd argument is the subgraph ID, which is just a number MXNet uses to identify this particular subgraph (it starts at zero and increments, and is unique for each subgraph in the model). The 3rd argument is an output to be set in this function to tell MXNet whether to accept (value: `true`) or reject (value: `false`) the subgraph. You might want to reject a subgraph if it doesn't include all the operators you want, for example. The 4th argument is the map of options specified by the user; it is the same `options` map passed to the `supportedOps` API. The 5th argument is a map of attributes that should be set on the created subgraph. These attributes will be available later at runtime, and provide a mechanism to pass info from partition-time to runtime. The 6th argument is the map of params/weights/args to the model and the associated names. For inputs to the subgraph that come directly from the params/weights of the model, you can look up the name of the input in this map to get the actual tensor values. The 7th argument is the corresponding map for the `aux` tensors passed by the user.
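
To make the `ids` marking concrete, here is a minimal Python sketch of the same selection logic that `mySupportedOps` applies in the example library. It is only an illustration of the rule, not part of the MXNet or `lib_api.h` API: the function name `supported_ops`, the constant names, and the assumption that each node carries an `op` field and an optional `attrs` map with a `dtype` key are chosen here to mirror the C++ example.

```python
import json

# Illustrative whitelist, mirroring op_names in the example library.
OP_NAMES = {"exp", "log"}
# Numeric dtype code for float32 (kFloat32 is 0 in MXNet's type enum).
K_FLOAT32 = 0


def supported_ops(graph_json, options):
    """Return one boolean per node; True marks a node for a subgraph.

    Hypothetical helper: a Python mirror of the C++ example, not an MXNet call.
    """
    nodes = json.loads(graph_json)["nodes"]
    ids = [False] * len(nodes)
    # The float check only applies when the user passed the reqFloat option.
    require_float = "reqFloat" in options
    for node_id, node in enumerate(nodes):
        attrs = node.get("attrs", {})
        dtype = int(attrs["dtype"]) if "dtype" in attrs else None
        dtype_ok = (not require_float) or dtype == K_FLOAT32
        if dtype_ok and node["op"] in OP_NAMES:
            # The index into ids is the node ID in the topologically sorted graph.
            ids[node_id] = True
    return ids
```

With no options, every whitelisted operator is marked. With `reqFloat` set, only whitelisted operators whose `dtype` attribute is float32 are marked, which is what the combined condition in the C++ example reduces to.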

### Writing A Custom Subgraph Operator

A partitioning strategy specifies how to partition a model and isolate operators into subgraphs. In MXNet, subgraphs are just a [stateful operator](../lib_custom_op#writing-stateful-custom-operator). Subgraph operators have an extra attribute called `SUBGRAPH_SYM_JSON` that maps to a JSON string of the subgraph. The expectation is that when a subgraph operator executes a forward/backward call, it executes all of the operators in the subgraph.
A partitioning strategy specifies how to partition a model and isolate operators into subgraphs. In MXNet, subgraphs are just a [stateful operator](../lib_custom_op#writing-stateful-custom-operator). Subgraph operators have an extra attribute called `MX_STR_SUBGRAPH_SYM_JSON` that maps to a JSON string of the subgraph. The expectation is that when a subgraph operator executes a forward/backward call, it executes all of the operators in the subgraph.

When registering a custom subgraph operator, all that's needed is to register a `createOpState` function and to mark the operator as a subgraph operator by calling the `setIsSubgraphOp` API like:

41 changes: 30 additions & 11 deletions example/extensions/lib_subgraph/subgraph_lib.cc
Expand Up @@ -160,11 +160,11 @@ MXReturnValue createOpState(std::map<std::string, std::string> attrs,
std::string serialized_subgraph = "[empty]";
// MXNet subgraph is stored as Symbol in operator node attrs subgraphs field
// custom subgraph is stored as json string in custom operator attrs map entry
if (attrs.count(SUBGRAPH_SYM_JSON)) {
if (attrs.count(MX_STR_SUBGRAPH_SYM_JSON)) {
// user can now parse json and run other custom ops inside subgraph
serialized_subgraph = attrs[SUBGRAPH_SYM_JSON];
serialized_subgraph = attrs[MX_STR_SUBGRAPH_SYM_JSON];
}
attrs.erase(SUBGRAPH_SYM_JSON);
attrs.erase(MX_STR_SUBGRAPH_SYM_JSON);
*op_inst = new MyStatefulOp(serialized_subgraph, attrs);
std::cout << "Info: stateful operator created" << std::endl;
return MX_SUCCESS;
Expand All @@ -177,7 +177,7 @@ REGISTER_OP(_custom_subgraph_op)
const std::vector<std::string> op_names({"exp","log"});

MXReturnValue mySupportedOps(std::string json,
std::vector<bool> ids,
std::vector<bool>& ids,
std::unordered_map<std::string, std::string>& options) {
for (auto kv : options) {
std::cout << "option: " << kv.first << " ==> " << kv.second << std::endl;
Expand All @@ -204,8 +204,8 @@ MXReturnValue mySupportedOps(std::string json,
dtype = std::stoi(attrs.map[JsonVal("dtype")].str);
}

//check if op dtype is float
if(dtype == kFloat32) {
//check if op dtype is float, and if option was specified to require float types
if((dtype == kFloat32 && options.count("reqFloat") > 0) || options.count("reqFloat") == 0) {
//check if op is in whitelist
if(std::find(op_names.begin(),op_names.end(),op.str.c_str()) != op_names.end()) {
// found op in whitelist, set value to 1 to include op in subgraph
Expand All @@ -216,22 +216,41 @@ MXReturnValue mySupportedOps(std::string json,
return MX_SUCCESS;
}

MXReturnValue myReviewSubgraph(std::string json, int subraph_id, bool* accept,
MXReturnValue myReviewSubgraph(std::string json, int subgraph_id, bool* accept,
std::unordered_map<std::string, std::string>& options,
std::unordered_map<std::string, std::string>& attrs) {
std::unordered_map<std::string, std::string>& attrs,
std::map<std::string, MXTensor>& args,
std::map<std::string, MXTensor>& aux) {
for (auto kv : options) {
std::cout << "option: " << kv.first << " ==> " << kv.second << std::endl;
}
if(options.find("reject") != options.end() &&
options["reject"].compare("True") == 0) {
for (auto kv : args) {
std::cout << "arg: " << kv.first << " ==> (";
for (auto s : kv.second.shape)
std::cout << s << ",";
std::cout << ") [";
for (int i=0; i<kv.second.size(); i++)
std::cout << kv.second.data<float>()[i] << ", ";
std::cout << "]" << std::endl;
}

// check if option `reqArgs` was specified, and if so check if args were provided
if(options.count("reqArgs") > 0 && args.size() == 0) {
*accept = false;
std::cout << "rejecting subgraph since args were not provided" << std::endl;
return MX_SUCCESS;
}

// check if option `reject` was specified, and if so check if value is 'True'
if(options.count("reject") > 0 && options["reject"].compare("True") == 0) {
// if specified, reject the subgraph. this is only used for testing
*accept = false;
std::cout << "rejecting subgraph" << std::endl;
} else {
*accept = true;
std::cout << "accepting subgraph" << std::endl;
attrs["myKey"] = "myVal";
}
std::cout << json << std::endl;
return MX_SUCCESS;
}
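
The order of checks in the updated `myReviewSubgraph` can be summarized with a small Python sketch. This is a hedged restatement of the decision logic only, written for illustration: `review_subgraph` is a hypothetical name, the logging and tensor printing are omitted, and `args` is modeled as a plain dict rather than a map of `MXTensor`.

```python
def review_subgraph(options, args):
    """Return (accept, attrs), following the check order of the C++ example.

    Hypothetical helper for illustration; not part of lib_api.h.
    """
    attrs = {}
    # reqArgs: reject when the user asked for args but none were provided.
    if "reqArgs" in options and len(args) == 0:
        return False, attrs
    # reject: testing-only switch, active only when its value is 'True'.
    if options.get("reject") == "True":
        return False, attrs
    # Accepted subgraphs get a custom attribute that is available at runtime.
    attrs["myKey"] = "myVal"
    return True, attrs
```

Note that the `reqArgs` check runs first, so a subgraph can be rejected for missing args even when `reject` is not set.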

62 changes: 58 additions & 4 deletions example/extensions/lib_subgraph/test_subgraph.py
Expand Up @@ -23,8 +23,10 @@
# This test checks if dynamic loading of library into MXNet is successful
# and checks the end-to-end computation of the custom operator

import mxnet as mx
import os, ctypes
import mxnet as mx
from mxnet.gluon import nn
from mxnet import nd
from mxnet.base import _LIB, check_call, mx_uint, c_str, c_str_array, SymbolHandle

# load library
Expand All @@ -35,6 +37,10 @@
path = os.path.abspath('libsubgraph_lib.dll')
mx.library.load(path)

###############################################
# Test with subgraph not consuming params
###############################################
# example model, ops to be partitioned do not have args (use outputs from other ops as inputs)
a = mx.sym.var('a')
b = mx.sym.var('b')
c = a + b
Expand Down Expand Up @@ -75,9 +81,6 @@
out3 = exe3.forward()
print(out3)

from mxnet.gluon import nn
from mxnet import nd

# Gluon Hybridize partitioning with shapes/types
print('-------------------------------')
print('Testing Gluon Hybridize partitioning with shapes/types')
Expand All @@ -88,3 +91,54 @@
out4 = sym_block(mx.nd.ones((3,2)),mx.nd.ones((3,2)))
print(out4)

# Gluon Hybridize partitioning with shapes/types without inference
print('-------------------------------')
print('Testing Gluon Hybridize partitioning with shapes/types without inference')
inputs = [a,b]
sym_block2 = nn.SymbolBlock(sym, inputs)
sym_block2.initialize()
sym_block2.optimize_for(mx.nd.ones((3,2)), mx.nd.ones((3,2)), backend='myProp')
sym_block2.export('partitioned')


###############################################
# Test with subgraph directly consuming params
###############################################
# example model, ops to be partitioned have args
d2 = mx.sym.exp(a)
sym2 = mx.sym.log(d2)

#execute in MXNet
print('-------------------------------')
print('Testing regular MXNet execution')
exe5 = sym2.bind(ctx=mx.cpu(), args={'a':mx.nd.ones((3,2))})
out5 = exe5.forward()
print(out5)

# with propagating shapes/types
print('-------------------------------')
print('Testing partitioning with shapes/types')
arg_array = [mx.nd.ones((3,2),dtype='float32')]
mysym6 = sym2.optimize_for("myProp", arg_array, reqArgs=True)
print(mysym6.tojson())
exe6 = mysym6.bind(ctx=mx.cpu(), args={'a':mx.nd.ones((3,2))})
out6 = exe6.forward()
print(out6)

# without propagating shapes/types
print('-------------------------------')
print('Testing partitioning without shapes/types')
mysym7 = sym2.optimize_for("myProp", reqArgs=True)
exe7 = mysym7.bind(ctx=mx.cpu(), args={'a':mx.nd.ones((3,2))})
out7 = exe7.forward()
print(out7)

# Gluon Hybridize partitioning with shapes/types
print('-------------------------------')
print('Testing Gluon Hybridize partitioning with shapes/types')
inputs = [a]
sym2_block = nn.SymbolBlock(sym2, inputs)
sym2_block.initialize()
sym2_block.hybridize(backend='myProp')
out8 = sym2_block(mx.nd.ones((3,2)))
print(out8)
4 changes: 3 additions & 1 deletion include/mxnet/c_api.h
Expand Up @@ -2170,8 +2170,10 @@ MXNET_DLL int MXOptimizeForBackend(SymbolHandle sym_handle,
const char* backend_name,
const int dev_type,
SymbolHandle* ret_sym_handle,
const mx_uint len,
const mx_uint args_len,
NDArrayHandle* in_args_handle,
const mx_uint aux_len,
NDArrayHandle* in_aux_handle,
const mx_uint num_options,
const char** keys,
const char** vals);
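
The `num_options`, `keys`, and `vals` parameters form a count plus two parallel C string arrays. As a rough sketch of how a Python caller might marshal a keyword-argument dict into that shape, the snippet below uses only the standard `ctypes` module. The helper name `to_c_options` is an assumption made for illustration, and it is not claimed to be how the MXNet Python frontend builds these arguments.

```python
import ctypes


def to_c_options(options):
    """Convert a dict of options into (count, keys array, vals array).

    Hypothetical helper: keys and values are stringified and UTF-8 encoded,
    and both arrays keep the dict's insertion order so they stay aligned.
    """
    keys = [str(k).encode("utf-8") for k in options]
    vals = [str(v).encode("utf-8") for v in options.values()]
    n = len(keys)
    c_keys = (ctypes.c_char_p * n)(*keys)
    c_vals = (ctypes.c_char_p * n)(*vals)
    return ctypes.c_uint(n), c_keys, c_vals
```

Because every value is stringified, a Python `True` arrives on the C++ side as the string `"True"`, which is consistent with the example library comparing `options["reject"]` against `"True"`.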