[MetaSchedule][M4b] Testcases for TensorRT builder/runner #10055
CC @zxybazh would love you guys to review each other’s code :-)
junrushao left a comment
Just some minor nitpicks
| # from tvm import script | ||
| # from tvm._ffi import register_func | ||
| # from tvm.runtime import Module |
remove this?
| ): | ||
| if use_meta_sched: | ||
| # With meta_schedule | ||
| dev = "nvidia/geforce-rtx-2080" |
Suggested change:
| - dev = "nvidia/geforce-rtx-2080" |
| + dev = "cuda" |
| def relay_build_with_tensorrt( | ||
|     mod: Module, | ||
|     target: Target, | ||
|     params: dict, | ||
| ) -> List[BuilderResult]: | ||
|     from tvm.relay.op.contrib.tensorrt import partition_for_tensorrt | ||
| | ||
|     mod, config = partition_for_tensorrt(mod, params) | ||
|     with tvm.transform.PassContext( | ||
|         opt_level=3, config={"relay.ext.tensorrt.options": config} | ||
|     ): | ||
|         return tvm.relay.build_module._build_module_no_factory( | ||
|             mod, "cuda", "llvm", params | ||
|         ) |
Maybe we should refactor these functions and put them under python/tvm/meta_schedule/testing/byoc_trt.py, so that others can conveniently reuse this cool stuff.
| target: Target, | ||
| params: dict, | ||
| ) -> List[BuilderResult]: | ||
| # @Sung: Weird. Cannot pass keyword arg |
if you have time, you may add a proxy function to _build_module_no_factory to allow kwargs
@register_func("tvm.relay.build")
def _build_module_no_factory_impl(mod, target, target_host, params, mod_name):
    target, target_host = Target.check_and_update_host_consist(target, target_host)
    return build(mod, target, params=params, mod_name=mod_name).module


def _build_module_no_factory(mod, target=None, target_host=None, params=None, mod_name="default"):
    """A wrapper around build which discards the Python GraphFactoryRuntime.
    This wrapper is suitable to be used from other programming languages as
    the runtime::Module can be freely passed between language boundaries.
    """
    return _build_module_no_factory_impl(mod, target, target_host, params, mod_name)
| "model_name", | ||
| ["resnet-50", "mobilenet"], | ||
| ) | ||
| @pytest.mark.parametrize("batch_size", [1, 8, 16]) |
Suggested change:
| - @pytest.mark.parametrize("batch_size", [1, 8, 16]) |
| + @pytest.mark.parametrize("batch_size", [1]) |
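Stacked `pytest.mark.parametrize` decorators run the test once per combination of values, so trimming `batch_size` shrinks the whole test matrix. The rough count below uses `itertools.product` as a stand-in for pytest's expansion. It assumes that `use_meta_sched` and `use_trt` are each parametrized over `True`/`False`; their value lists do not appear in this excerpt, so the totals are illustrative only.

```python
from itertools import product

MODEL_NAMES = ["resnet-50", "mobilenet"]
# Assumption: both boolean flags are parametrized over True/False.
FLAG_VALUES = [True, False]


def count_cases(batch_sizes):
    """Count the combinations of model, batch size, and the two flags."""
    combos = product(MODEL_NAMES, batch_sizes, FLAG_VALUES, FLAG_VALUES)
    return len(list(combos))


before = count_cases([1, 8, 16])
after = count_cases([1])
print(before, after)  # prints "24 8"
```

Under these assumptions, the suggestion reduces the matrix from 24 test cases to 8.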
| ) | ||
| | ||
| | ||
| # @sunggg: memory verification error at test_relay_model("resnet-50", 1, use_meta_sched=False, use_trt=True) |
I cannot reproduce this, so let's double-check :-) If there is no problem, let's remove this line.
8f9b886 to eb21e55
…0010) Adds interfaces to the Pipeline Executor to "run", "stop", "set input", and "get input". This patch also implements the "BackendRuntime" structure, which wraps the graph runtime interface to support the pipeline executor interface, and a data copy method that transfers data between two backend runtimes.
…che#10036)
* introduce profile_all_alignments option
* add profile_all_alignment option to API
* wip
* fixed dynamic case
* black
* update gen_gemm too
* minor improvement
* fix
* all tests work
* add doc
* fixed for sm = 75 case
* fix typo
* remove unused import
* profile_all -> find_first_valid
* fix
…10049)
* Add ApplyHisotryBest.
* Retrigger CI.
* Update integration.py
Co-authored-by: Junru Shao
Co-authored-by: Bohan Hou
Co-authored-by: Ruihang Lai
Co-authored-by: Hongyi Jin
Co-authored-by: Wuwei Lin
Co-authored-by: Siyuan Feng
* mutate-unroll
* WIP
* WIP
* WIP
* test cases
* add examples
* lint
* Amend co-authors information
* WIP
* address comments and changed tensorized comparator
* update
* nit
* fix example
* lint
* lint
* lint
* remove unused
* trigger ci
* clang-format
* fix
* rebase
Co-authored-by: Siyuan Feng
Co-authored-by: Bohan Hou
Co-authored-by: Hongyi Jin
Co-authored-by: Ruihang Lai
Co-authored-by: Junru Shao
Co-authored-by: Xiyou Zhou
…ardware (apache#9993)
* Add env variable to micro tflite tutorial
* Address @gromero comments
* address @areusch comment
* fix scope
* trigger
* trigger
This commit introduces BaseAddress ObjectRef to determine base addresses in the codegen for microNPU. This is required when multiple memory pools become available. Thus, base addresses could not be statically determined in the source module.
…ets with (apache#9637) sse4.1 support
… than call node (apache#10069)
Co-authored-by: pranav jonnalagadda-SJ1 Eng_ML
…pache#9999)
Co-authored-by: wangjiuyang
* multi level tiling
* remove tensor core related code
* pylint
* fix
Co-authored-by: Junru Shao
…)" (apache#10072) Because of the failure of LSTM conversion from Pytorch
…ort tensoflow 2.6 (apache#9978) On TensorFlow 2.4 the test is expected to fail because the generated graph is not frozen. On TensorFlow 2.6 the generated graph is identified as frozen, so the test is not needed.
)
Co-authored-by: Junru Shao
Co-authored-by: Xiyou Zhou
Co-authored-by: Bohan Hou
Co-authored-by: Ruihang Lai
Co-authored-by: Hongyi Jin
Co-authored-by: Wuwei Lin
e38b24d to 68fd695
Thanks @sunggg! It's finally merged :-)
Co-authored-by: Siyuan Feng
Co-authored-by: Bohan Hou
Co-authored-by: Hongyi Jin
Co-authored-by: Ruihang Lai
Co-authored-by: Junru Shao
Co-authored-by: Xiyou Zhou
This PR includes the BYOC builder/runner infra and its test cases for TensorRT.
Thanks for taking the time to review this.
cc: @junrushao1994
Please note that the previous PR was closed because it overlapped with an earlier merge.