Conversation
@mxnet-bot run ci [sanity]
Jenkins CI successfully triggered : [sanity]
@mxnet-bot run ci [sanity]
Jenkins CI successfully triggered : [sanity]
@mxnet-bot run ci [sanity]
Jenkins CI successfully triggered : [sanity]
```cmake
if(USE_CUDA)
add_library(customop_gpu_lib SHARED ${CMAKE_CURRENT_SOURCE_DIR}/example/extensions/lib_custom_op/relu_lib.cu)
target_include_directories(customop_gpu_lib PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/include/mxnet)
endif()
if(MSVC)
if(UNIX)
```
those things can be deleted
done
I mean you don't need to add -shared even for customop_gpu_lib, just change the lines inside MSVC block
```cpp
                      std::vector<MXTensor> outputs,
                      OpResource res) {
MXReturnValue backward(const std::unordered_map<std::string, std::string>& attrs,
                       std::vector<MXTensor>* inputs,
```
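For context, a minimal sketch of a backward function under the new pointer-based signature shown above; the trailing parameters and the `MXTensor` accessors are assumptions following lib_api.h conventions, not text from this PR:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

#include "lib_api.h"

// Sketch only: a pass-through gradient. The first two parameters match the
// diff above; the remaining parameters are assumed from lib_api.h style.
MXReturnValue backward(const std::unordered_map<std::string, std::string>& attrs,
                       std::vector<MXTensor>* inputs,
                       std::vector<MXTensor>* outputs,
                       const OpResource& res) {
  float* in_grad  = inputs->at(0).data<float>();   // incoming gradient
  float* out_grad = outputs->at(0).data<float>();  // gradient to fill in
  for (int64_t i = 0; i < inputs->at(0).size(); i++)
    out_grad[i] = in_grad[i];
  return MX_SUCCESS;
}
```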
don't forget to update this change to lib_custom_op README
done
also please update the lib_api version to 10
changed to version 7
LGTM
LGTM. Thanks for the hard work writing custom graph pass support and making MXNet extensions well organized! I have left a few minor suggestions. If you feel they are not worth a CI run, feel free to ignore them.
Custom Operator support was merged (#15921, #17270) and is not available in versions of MXNet prior to v1.7.0.
To access the feature now, please install MXNet by compiling from source using master or using the previously mentioned commits, downloading one of the nightly builds, or from a release of MXNet 1.7.0+.
For running the following example, it doesn’t matter if it is a CUDA, MKLDNN or plain MXNet build; the custom operator doesn’t interact with the execution of other native MXNet operators.
To run the following example, the build type of MXNet doesn’t matter since the custom operator doesn’t interact with the execution of other native MXNet operators.
it is better to add a prerequisite here, or where the example is run, like "This requires GCC > 5 or CUDA > 9 to run the examples"
* register custom ops for library authors
* register custom ops, partitioner, and passes
* for library authors
* See example/extension/lib_custom_op/README.md
maybe we can rephrase it to "APIs to write extension library, see ... for registering custom operators, ... for custom partitioners, ... for custom graph passes"
```diff
@@ -45,7 +49,7 @@
 #endif

 /* Make sure to update the version number everytime you make changes */
-#define MX_LIBRARY_VERSION 6
+#define MX_LIBRARY_VERSION 7
```
I still feel it is better to make it 10
also it is better to add to the version-checking message at c_api.cc line 339 something like "please update lib_api.h to match the version supported by the MXNet backend"
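A hedged sketch of the kind of check being suggested; `check_library_version` and the exception are illustrative stand-ins, not the actual c_api.cc code (which uses dmlc logging):

```cpp
#include <sstream>
#include <stdexcept>

// Sketch only: a standalone version check showing the improved error
// message proposed above.
void check_library_version(int lib_version, int backend_version) {
  if (lib_version != backend_version) {
    std::ostringstream msg;
    msg << "Library version " << lib_version << " does not match MXNet backend version "
        << backend_version << ". Please update lib_api.h to match the version "
        << "supported by the MXNet backend.";
    throw std::runtime_error(msg.str());
  }
}
```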
Synced offline with @samskalicky and @rondogency - they are ok with doing the last changes proposed by @rondogency in another PR targeting master specifically, so this PR is ready to merge.
@@ -200,6 +208,9 @@ If the number of input and output tensors are fixed, you can use hard-coded numb
* **inferType**: This function takes three arguments. The 1st argument is the attributes (same as above). The 2nd argument is a list of input data types corresponding to the input tensors. The 3rd argument is the placeholder for output tensor data types you need to assign.
For example, if this operator has one input and one output, and data type doesn’t change, then you can do `outtypes[0] = intypes[0]` to populate the data type.

* **inferSType**: This function takes three arguments. The 1st argument is the attributes (same as above). The 2nd argument is a list of input storage types corresponding to the input tensors. The 3rd argument is the placeholder for output storage types you need to assign.
For example, if this operator has one input and one output, and data type doesn’t change, then you can do `outtypes[0] = intypes[0]` to populate the data type.
data type doesn’t change -> data storage type doesn’t change
a list of input storage types corresponding to the input tensors -> a list of input storage types corresponding to the input tensors (dense, row_sparse, or CSR). For details, see https://cwiki.apache.org/confluence/display/MXNET/A+Guide+to+Implementing+Sparse+Operators+in+MXNet+Backend
It would be good to include the link above in case people wonder why/if inferSType is needed.
We need a whole overview for Sparse. Maybe you can help us add another section to the readme about that
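To make the inferSType discussion concrete, a minimal sketch of the pass-through case described in the README text above; the exact signature is an assumption in the pointer-based style of this PR and should be checked against lib_api.h:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

#include "lib_api.h"

// Sketch only: one input, one output, storage type unchanged
// (dense stays dense, row_sparse stays row_sparse, CSR stays CSR).
MXReturnValue inferSType(const std::unordered_map<std::string, std::string>& attrs,
                         std::vector<int>* instypes,
                         std::vector<int>* outstypes) {
  outstypes->at(0) = instypes->at(0);
  return MX_SUCCESS;
}
```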
```python
import mxnet as mx
mx.library.load(‘libmypass_lib.so’)
```
‘libmypass_lib.so’ -> 'libmypass_lib.so'
what am i missing? looks the same to me...
@ptrendx let's not merge commits named "[WIP]" to master. You can edit the name prior to merge
Oops, you are right, sorry, I missed that.
* add debug prints to debug error in CI
* add debug prints to debug error in CI
* remove prints
* initial commit
* enabled calling create for selector
* connected selector to call external class
* added code to remove temp graph attrs
* fixed build issues
* changed shape inference to use different attr names
* fixed selector class
* cleaned up APIs
* fixed sanity
* updated build for extensions
* sanity fix
* refactored MXLoadLib into separate functions
* undo rebase
* finished merge
* enabled verbose in library loading
* fixed example
* added passing args/aux down to graph pass
* added creating new args/aux for graph passes
* fixed return args/aux
* fixed sanity
* whitespace
* fixed lint
* updated perl API, README, added pass_lib to cmake build flow
* fixed mistake with relu example lib
* fixed perl syntax
* addressed comments
* addressed more comments
* fixed compile issues

Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
We didn't want to rerun the whole CI. Did we fix the problem where renaming the PR reruns CI @leezu ?

@samskalicky when merging the commit, GitHub allows editing the commit message, so the commit message can be different from the PR title. Indeed we can't edit the PR title without retriggering the CI as of now (cc @ChaiBapchya)
Description
Enhancements to MXNet Extensions (subgraph property, custom ops, custom graph passes). Addresses a few points from #17236. More description on the way!
Features
SupportedOps Enhancements for graph coloring

In the custom partitioner API, custom library writers could implement the `SupportedOps` API to specify which ops to include in a subgraph by setting `True`/`False` for each node ID in the graph. This PR adds support for writers to identify which specific subgraph to include a node in by specifying a subgraph ID integer for each node, or -1 to indicate that a node can go into any subgraph.
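To make the coloring change concrete, a hedged sketch of a `SupportedOps`-style function assigning subgraph IDs; the signature is an assumption in the pointer-based style of this PR, not quoted from it:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

#include "lib_api.h"

// Sketch only: color the first half of the nodes as subgraph 0 and the rest
// as subgraph 1; -1 would let a node fall into any subgraph.
MXReturnValue mySupportedOps(const std::string& json,
                             std::vector<int>* ids,
                             const std::unordered_map<std::string, std::string>& options) {
  for (size_t i = 0; i < ids->size(); i++)
    ids->at(i) = (i < ids->size() / 2) ? 0 : 1;
  return MX_SUCCESS;
}
```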
Selector class for subgraph creation

In the custom partitioner API, custom library writers could implement the `SupportedOps` API to specify which ops to include in a subgraph, but some custom partitioners may want more control. This PR adds support for implementing the equivalent of MXNet's internal subgraph property `SubgraphSelector` class in a custom library. Custom library writers can choose to implement either the `SupportedOps` API or the `CustomOpSelector` class for their partitioner.
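A hedged sketch of what a custom selector might look like; the method names follow the `CustomOpSelector` class named above, but the exact virtual interface should be checked against lib_api.h:

```cpp
#include "lib_api.h"

// Sketch only: accept every candidate node and grow the subgraph through
// both inputs and outputs. A real selector would inspect the graph.
class MySelector : public CustomOpSelector {
 public:
  bool Select(int nodeID) override { return true; }
  bool SelectInput(int nodeID, int input_nodeID) override { return true; }
  bool SelectOutput(int nodeID, int output_nodeID) override { return true; }
  void Reset() override {}
};
```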
Custom Graph Passes

This PR adds the ability to register a custom graph pass in a library. Working backwards from the custom library writer, the `Pass` API is implemented with the interface sketched below. The model graph is passed as a JSON string, and users provide a new JSON string for the modified graph. Options specified by the MXNet user at the Python level are passed in the `options` map. If the MXNet user provided the args/aux at the Python level, they are provided to the pass in the `args`/`aux` arguments. The PassResource class `res` exposes two functions, `alloc_arg` and `alloc_aux`, that allow the custom library writer to allocate NDArrays for new or replacement args/aux within the custom pass. Custom passes are registered in the library with a registration macro (see the sketch below); as custom passes are found in the library during loading, a lambda function is registered for each one.
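A hedged sketch of the interface and registration described above, modeled on the example libraries; the exact parameter types should be checked against lib_api.h:

```cpp
#include <string>
#include <unordered_map>

#include "lib_api.h"

// Sketch only: an identity pass that returns the input graph unchanged.
// A real pass would edit the JSON and could call res.alloc_arg / res.alloc_aux
// to create NDArrays for new or replacement args/aux.
MXReturnValue myPass(const std::string& in_graph, const std::string** out_graph,
                     const std::unordered_map<std::string, std::string>& options,
                     const std::unordered_map<std::string, MXTensor>& args,
                     const std::unordered_map<std::string, MXTensor>& aux,
                     const PassResource& res) {
  *out_graph = new std::string(in_graph);
  return MX_SUCCESS;
}

REGISTER_PASS(myPass)
.setBody(myPass);
```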
From the MXNet front-end, users call custom passes using the same `optimize_for` API. The backend name argument to the `optimize_for` API is first looked up in the registered subgraph backends and, if not found there, is then looked up in the registered graph passes.
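A simplified, hypothetical sketch of that lookup order (the real dispatch lives inside MXNet's backend; the registries and names here are stand-ins):

```cpp
#include <iostream>
#include <set>
#include <string>

// Stand-in registries; MXNet keeps these internally.
static const std::set<std::string> subgraph_backends = {"default"};
static const std::set<std::string> graph_passes = {"myPass"};

void optimize_for_dispatch(const std::string& name) {
  if (subgraph_backends.count(name)) {
    std::cout << name << ": partition the graph with the subgraph backend\n";
  } else if (graph_passes.count(name)) {
    std::cout << name << ": run the registered custom graph pass\n";
  } else {
    std::cout << name << ": unknown backend or pass name\n";
  }
}
```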
Compilation

lib_api.h was designed to be as generic as possible, to allow users to compile their custom library with any version of C++ (C++11 or higher) and GLIBC to fit the needs of their application. This does not have to correspond to the version of C++ or GLIBC used to compile MXNet. To test this, we will compile with C++11 to check for compiler errors, but will also compile and test with C++17 to match what MXNet is using.