
[ONNX][CI] Parametrize ONNX Unit tests #8621

Merged: 1 commit merged into apache:main on Aug 3, 2021

Conversation

@mbrookhart (Contributor) commented Aug 2, 2021

#7438 accidentally disabled the ONNX unit tests on CPU, and since then a few regressions have crept in. We've been attempting to enable the ONNX tests in a CPU-only job at #8390, but we've hit issues with PyTorch in the CPU Docker image.

This PR instead refactors the ONNX test file to use the parametrize_targets flow. It:

  1. adds that decorator to every test
  2. inlines helper functions that are only used in one test as closures
  3. removes the for loops over enabled targets so pytest will run every test on each target individually
  4. removes -m gpu from the invocation of the pytest call in CI

This will re-enable a bunch of tests that have been disabled in CI and allow us to more easily skip tests that fail only on one target.
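For illustration, here is a minimal sketch (not code from this PR; the test name, operator, and helper are invented) of the pattern items 1-3 describe: the decorator parametrizes the test over the enabled targets, and the per-test helper becomes a closure instead of looping over targets itself.

import numpy as np
import tvm
import tvm.testing
from tvm import relay


@tvm.testing.parametrize_targets
def test_forward_add(target, dev):
    # Helper used only by this test, inlined as a closure so it sees the
    # parametrized (target, dev) pair directly; pytest now runs one test case
    # per enabled target instead of a single test looping over all targets.
    def verify(shape):
        x = relay.var("x", shape=shape, dtype="float32")
        mod = tvm.IRModule.from_expr(relay.Function([x], relay.add(x, x)))
        data = np.random.uniform(size=shape).astype("float32")
        result = relay.create_executor(
            "graph", mod=mod, device=dev, target=target
        ).evaluate()(data)
        tvm.testing.assert_allclose(result.numpy(), data + data)

    verify((1, 3, 8))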

Thank you!

cc @electriclilies @Lunderberg @tkonolige @jwfromm @masahi

P.S. This conflicts somewhat with #8574, but I really like the ONNX node file changes in there.

@electriclilies (Contributor) commented Aug 2, 2021

Thanks for doing this @mbrookhart! Given the difficulty surrounding the docker image update (we've been trying for around a month to get the update through), I think that getting this refactor in is probably the fastest way to prevent further regressions.

@tkonolige (Contributor):

This is a great change. Thanks for all the hard work. Is the plan to split CPU and GPU tests across the different nodes once we get ONNX building on the CPU node?

Also, have you had a chance to see how much longer tests take with this change?

@csullivan (Contributor) left a comment

👏, thank you @mbrookhart!

Comment on lines +1569 to +1572
@tvm.testing.parametrize_targets
def test_forward_arg_min_max(target, dev):
    if "cuda" in target:
        pytest.skip("Fails on CUDA")

Suggested change:
- @tvm.testing.parametrize_targets
- def test_forward_arg_min_max(target, dev):
-     if "cuda" in target:
-         pytest.skip("Fails on CUDA")
+ @tvm.testing.known_failing_targets("cuda")
+ @tvm.testing.parametrize_targets
+ def test_forward_arg_min_max(target, dev):

Perhaps consider using known_failing_targets here and elsewhere

@mbrookhart (Author) commented Aug 3, 2021

I tried that on multiple tests, and they still ran on CUDA and failed. This decorator doesn't seem to work here.

Contributor:

Can you try using @tvm.testing.exclude_targets('cuda') instead? The known_failing_targets applies pytest.mark.xfail, which still runs a test to see if it succeeds, but doesn't count a failure of that test as a failure of the test suite. This makes sense if it is an assertion failure, but not if the test results in a segfault. The exclude_targets decorator removes those tests entirely from the ones being run.
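A rough sketch (with invented test names) of the intended difference between the two decorators described above; note the follow-up below that, at the time, neither decorator was honored by the explicit parametrize_targets path:

import tvm.testing


# known_failing_targets: the cuda test case still runs, but a cuda failure is
# recorded as an expected failure (xfail) rather than failing the suite.
@tvm.testing.known_failing_targets("cuda")
@tvm.testing.parametrize_targets
def test_marked_xfail_on_cuda(target, dev):
    pass


# exclude_targets: no cuda test case is generated at all, so even a
# segfaulting kernel cannot take down the pytest process.
@tvm.testing.exclude_targets("cuda")
@tvm.testing.parametrize_targets
def test_not_generated_for_cuda(target, dev):
    pass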

@mbrookhart (Author):

@tvm.testing.exclude_targets('cuda') fails in the same way: the test runs, fails, and is reported as a failure.

Contributor:

Follow-up: it looks like the @tvm.testing.parametrize_targets decorator currently doesn't respect the settings in known_failing_targets or exclude_targets. There's a slightly different code path between the explicit parametrization and the auto-parametrization. This is resolved by #8542, which merges these two paths.
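To make the two paths concrete, a rough sketch (hypothetical test names), assuming tvm.testing's pytest integration parametrizes any test that accepts the target/dev fixtures:

# Explicit parametrization: the decorator itself generates the per-target
# test cases. This is the path the PR uses throughout the ONNX test file.
@tvm.testing.parametrize_targets
def test_explicitly_parametrized(target, dev):
    pass


# Auto-parametrization: a test that simply accepts the target/dev fixtures is
# parametrized over the enabled targets without the explicit decorator.
def test_auto_parametrized(target, dev):
    pass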

@mbrookhart (Author):

> This is a great change. Thanks for all the hard work. Is the plan to split CPU and GPU tests across the different nodes once we get ONNX building on the CPU node?

I'm not planning on it unless we start an effort to do it for all frameworks; PyTorch was segfaulting on CPU when I tried.

> Also, have you had a chance to see how much longer tests take with this change?

It's a minimal change vs. running the tests locally before (perhaps 1-2 minutes of extra reference kernel calls). Unfortunately, many of these tests had been disabled in CI, so re-enabling them will increase the CI time.

@mbrookhart merged commit 5140d90 into apache:main on Aug 3, 2021
@mbrookhart deleted the parameterize_onnx_tests branch on August 3, 2021 at 15:28
@mbrookhart (Author):

Thanks @tkonolige @Lunderberg @jroesch @electriclilies @csullivan!

I've synced with @Lunderberg; we've decided to merge this, and he will rebase #8542 and include @tvm.testing.known_failing_targets("cuda") in that PR, which fixes the inconsistencies in the current API.
