Add unit tests for TensorRT integration and fix some bugs #15399
Conversation
@mxnet-label-bot add [pr-awaiting-review]
Looks like CI caught a few issues. For example http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-15399/1/pipeline seems like it should be relevant to this PR. I'll have a look to see if there's anything else that jumps out at me.
@@ -157,6 +157,12 @@ std::string ConvertNnvmGraphToOnnx(
  return serialized_onnx_graph;
}

void ConvertIdentity(NodeProto* node_proto, const NodeAttrs& attrs,
Any idea if TRT actually optimizes this out? I've seen this in a few prod services :-/
I believe this should be optimized by ONNX-TRT
  return (param.dim != 0);
}

if (op_name == "Dropout") {
Again, will TensorRT optimize this out? We don't want it at inference time right?
Dropout has always been treated as an identity function in the MXNet-TensorRT integration, so I don't see any change here. As for whether identity actually performs a copy or not, I'm not quite sure; here is the onnx-tensorrt conversion: https://github.com/onnx/onnx-tensorrt/blob/0ab159579551cabfa05fd66f338357f116e96835/trt_utils.hpp#L169-L180
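For intuition, pass-through handling can be sketched in pure Python. This is a hypothetical illustration of the idea (not the actual onnx-tensorrt code): an importer that, for inference-time no-ops such as Dropout and Identity, simply reuses the input tensor instead of emitting a layer, so the op never reaches the engine.

```python
# Hypothetical sketch, NOT the real onnx-tensorrt importer: ops that are
# pure pass-throughs at inference time forward their input tensor directly,
# so no layer (and no copy) is created for them.
PASS_THROUGH_OPS = {"Dropout", "Identity"}

def import_node(op_name, inputs, emit_layer):
    """Return the output tensor name for a node; skip layer creation for no-ops."""
    if op_name in PASS_THROUGH_OPS:
        # Reuse the input tensor directly -- the op vanishes from the graph.
        return inputs[0]
    return emit_layer(op_name, inputs)

# Example: a Dropout node between two real layers leaves no trace.
layers = []
def emit(op, ins):
    layers.append(op)
    return op + "_out"

x = import_node("Conv", ["data"], emit)
x = import_node("Dropout", [x], emit)   # folded away, no layer emitted
x = import_node("Relu", [x], emit)
print(layers)  # ['Conv', 'Relu']
```

Whether the engine ends up with a real copy depends on the importer; the linked trt_utils.hpp code is the authoritative behavior.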
Ok, non-blocking comment for this PR. I'm just thinking about adding a warning in the future if people use TRT with operations that don't make sense at inference time (Dropout, Identity, empty Concats or Copies, etc.)
A few small changes requested. Looks like CI caught a few issues as well.
@Caenorst Could you please address the review comments? Thanks!
@Caenorst Gentle ping...
Force-pushed from d086543 to 05070fe: …dCudaEngine, changer assert_allclose to assert_almost_equal
Force-pushed from 07d5f5f to 04a5764: …unit test, remove test_tensorrt_deconvolution.py
I don't understand the error on windows-gpu; it doesn't seem related to my modifications...
@KellenSunderland can we merge it? (I did a bunch of modifications since the last review that you may want to review too)
Description
The TensorRT integration lacked unit tests; instead, we relied on output comparison of a full network, which is not very pertinent, makes the tolerance hard to choose, and is not very helpful when it fails.
This PR has two purposes:
As we only partition subgraphs of at least 2 ops, we always append an identity to each output. We then compare against MXNet, using both TRT FP32 and FP16 computation.
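The FP32/FP16 comparison explains the switch to assert_almost_equal with per-precision tolerances. Here is an illustrative sketch (assumed tolerance values, not the PR's exact test code) showing why an FP16 result checked against an FP32 reference needs a looser tolerance than an FP32-vs-FP32 check; the FP16 path is simulated by a float16 round-trip rather than a real TensorRT engine.

```python
# Illustrative sketch with assumed tolerances; the real tests run MXNet and
# TensorRT, while here the FP16 path is merely simulated via a cast round-trip.
import numpy as np

def assert_almost_equal(a, b, rtol, atol):
    # Mimics the spirit of mxnet.test_utils.assert_almost_equal.
    np.testing.assert_allclose(a, b, rtol=rtol, atol=atol)

rng = np.random.default_rng(0)
ref_fp32 = rng.standard_normal((4, 16)).astype(np.float32)

# Simulate an FP16 execution path: values lose precision in the round-trip.
out_fp16 = ref_fp32.astype(np.float16).astype(np.float32)

assert_almost_equal(out_fp16, ref_fp32, rtol=1e-2, atol=1e-2)  # FP16: loose
assert_almost_equal(ref_fp32, ref_fp32, rtol=1e-5, atol=1e-7)  # FP32: tight
```

float16 carries about 11 bits of mantissa (relative error around 5e-4), so a 1e-2 tolerance comfortably absorbs the precision loss while still catching real numerical bugs.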
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Comments