Fix #1942: Fix CI app module test flakiness/failures due to shared test processes #1943

BenHenning · 2020-10-07T22:03:57Z

Fix #1942

This started as #1939, but since the CI failure was also affecting that PR I decided to create a new PR that adds additional debugging information and fixes the issue.

The PR fixes CI tests by requiring each one run in its own process. Note that this was an existing issue, but it only recently started affecting CI tests (https://github.com/oppia/oppia-android/actions). This will result in tests running significantly slower than before because subsequent test suites cannot benefit from from previous tests' resources setup in Robolectric. This is exactly how Bazel works, so this is the correct long-term strategy for the tests. #1942 contains much more information, but the summary is that static state being shared and not properly synchronized across test suites was causing later suites to fail. Running tests in their own process fully mitigates the problem.

Note that parallelization is needed to run the tests locally, but it does not work in Android Studio's test runner UI which means that running whole modules of tests via the UI will take 3-4x longer than before. Running from the CLI is still quite fast since it does properly use Gradle's parallelization configuration introduced in this PR.

The debugging changes enable full stack traces for all Robolectric tests to ensure failures can actually be understood when they occur in CI. Note that full-stacktrace will specifically show Gradle task failures which is especially useful when running Espresso tests from the CLI (per #1831). The Gradle changes are needed for Robolectric tests to show full stack traces when failing. Note also that the Gradle configuration is resulting in all passed & skipped tests to print out which is resulting in much more output. That seems like a nice addition since it makes the results clearer, but please let me know if you disagree.

Sample of failures showing up with stack traces after this change: https://github.com/oppia/oppia-android/pull/1939/checks?check_run_id=1222719700.

Compared test times:

CI job	Runtime without process forking (per #1939)	Runtime with process forking
Robolectric Tests (Non-App Modules)	8m 14s	15m 38s
Robolectric Tests - App Module (Non-Flaky)	8m 52s (with failures), 9-12m (passing)	21m 55s

rt4914

LGTM. Thanks Ben

BenHenning · 2020-10-08T08:21:22Z

Thanks @rt4914.

@vinitamurthi & @anandwana001: given this is blocking everyone, merging it now. Please let me know if you have concerns & I will address them in a subsequent PR.

BenHenning and others added 3 commits October 7, 2020 09:07

Enable full stack traces for all Robolectric tests

0a6474b

Show stack traces for failing Robolectric tests.

25090f2

Introduce class-level forking & enable test parallelization.

a4cdd17

BenHenning requested review from vinitamurthi, rt4914 and anandwana001 October 7, 2020 22:29

BenHenning assigned BenHenning, rt4914, vinitamurthi and anandwana001 and unassigned BenHenning Oct 7, 2020

BenHenning mentioned this pull request Oct 7, 2020

Enable full stack traces for all Robolectric tests #1939

Closed

BenHenning changed the title ~~Fix #1942: Fix ci flakiness~~ Fix #1942: Fix CI app module test flakiness/failures due to shared test processes Oct 7, 2020

rt4914 approved these changes Oct 8, 2020

View reviewed changes

rt4914 assigned BenHenning and unassigned rt4914 Oct 8, 2020

BenHenning merged commit 62f7669 into develop Oct 8, 2020

BenHenning deleted the fix-ci-flakiness branch October 8, 2020 08:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #1942: Fix CI app module test flakiness/failures due to shared test processes #1943

Fix #1942: Fix CI app module test flakiness/failures due to shared test processes #1943

BenHenning commented Oct 7, 2020 •

edited

Loading

rt4914 left a comment

BenHenning commented Oct 8, 2020

Fix #1942: Fix CI app module test flakiness/failures due to shared test processes #1943

Fix #1942: Fix CI app module test flakiness/failures due to shared test processes #1943

Conversation

BenHenning commented Oct 7, 2020 • edited Loading

rt4914 left a comment

Choose a reason for hiding this comment

BenHenning commented Oct 8, 2020

BenHenning commented Oct 7, 2020 •

edited

Loading