Support wrapping a TVM-compiled function as an NDArray function. This enables using TVM as an RTC (runtime compilation) module for MXNet's async functions.
Example
Technical Details
The bridge is quite natural, as MXNet already uses the DLTensor representation that TVM uses. The hard part is that we need to use MXNet's engine to run the compiled function instead of running it directly.
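To make the distinction between calling a compiled function directly and running it through an engine concrete, here is a minimal toy sketch in Python. It is not MXNet's engine: `ToyEngine`, `wrap_async`, and `add_into` are hypothetical names, and a single FIFO worker thread stands in for the real dependency-tracking scheduler. The wrapped call returns immediately and the work happens later on the engine's thread.

```python
import queue
import threading


class ToyEngine:
    """Single-worker FIFO scheduler; a stand-in, not MXNet's real engine."""

    def __init__(self):
        self._tasks = queue.Queue()
        self.errors = []  # exceptions raised by scheduled tasks
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                task()
            except Exception as exc:  # keep the worker alive, record the failure
                self.errors.append(exc)
            finally:
                self._tasks.task_done()

    def push(self, task):
        """Schedule `task` and return immediately."""
        self._tasks.put(task)

    def wait_all(self):
        """Block until every scheduled task has finished."""
        self._tasks.join()


def wrap_async(engine, compiled_fn):
    """Return a function that schedules `compiled_fn` on `engine` instead of calling it."""
    def wrapped(*args):
        engine.push(lambda: compiled_fn(*args))
    return wrapped


def add_into(x, y, out):
    """Hypothetical 'compiled' kernel: writes x + y element-wise into out."""
    for i in range(len(out)):
        out[i] = x[i] + y[i]
```

Because the toy engine runs tasks in submission order, a second call that reads the output of the first sees the finished result once `wait_all()` returns. A real engine would get the same guarantee from tracking which arrays each call reads and writes.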
Since TVM relies on LLVM, it is a bit too early to introduce this dependency directly. This PR takes a different approach: the TVM bridge depends only on a header-only component of TVM and does not have to link against the TVM runtime.
When a user has TVM installed in their environment, TVM queries the MXTVMBridge function to get the wrapper logic and uses it to run MXNet's function asynchronously. When a user does not have TVM installed, the extra logic adds no link dependencies.
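The optional lookup described above can be sketched with Python's `ctypes`, which raises `AttributeError` when a shared library does not export a requested symbol. This is only an illustration of the pattern, not the code in this PR: `load_library`, `find_optional_symbol`, and `bridge_available` are hypothetical helpers, and only the symbol name `MXTVMBridge` comes from the description.

```python
import ctypes

BRIDGE_SYMBOL = "MXTVMBridge"  # symbol name taken from the description above


def load_library(path):
    """Open a shared library by path (the caller supplies the path)."""
    return ctypes.CDLL(path)


def find_optional_symbol(lib, name):
    """Return the exported symbol `name` from `lib`, or None if it is absent.

    ctypes raises AttributeError for a missing symbol, so an absent bridge
    is reported to the caller instead of aborting the program.
    """
    try:
        return getattr(lib, name)
    except AttributeError:
        return None


def bridge_available(lib):
    """True when `lib` exports the bridge entry point."""
    return find_optional_symbol(lib, BRIDGE_SYMBOL) is not None
```

A caller would pass the handle returned by `load_library` for its MXNet shared library and fall back to a non-bridged path when `bridge_available` returns `False`. Since the symbol is resolved at run time, neither side needs a link-time dependency on the other.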
Because of this optional linking logic, I did not include a test case for MXNet's CI, but I have verified locally that the code works in both the GPU and CPU cases here
Restriction
MXNet and TVM need to be built with the same C++ ABI, because we pass PackedFunc objects between them. This is somewhat of a restriction, but using the PackedFunc system makes code sharing easier. It can usually be satisfied by building both with the same C++ compiler. For example, g++ 4.8 and g++ 5.0 are not compatible, while the latest version of clang is usually compatible with the latest version of g++. Running with incompatible ABIs causes undefined behavior and possibly segfaults. This restriction could be removed by forcing a pure C ABI, but that requires additional work and may also affect the conciseness of the code.