travis-ci build fix using custom test runner #180

Closed

Stypox wants to merge 16 commits into TeamNewPipe:dev from Stypox:cl-build-fix

Conversation

@Stypox
Member

@Stypox Stypox commented Jul 31, 2019

Briefly

This pull request introduces a custom test runner for service tests in NewPipeExtractor. Service tests connect to and fetch data from servers, which can cause problems such as too-many-requests errors and ReCaptchaExceptions. See #177

Features

The NewPipeTestRunner works the same way as the default JUnit runner: @Test (with optional parameters timeout and expected), @Ignore (with optional reason), @Before, @After, @BeforeClass and @AfterClass are supported.
The only differences are these:

  • When a test function throws a ReCaptchaException, that function is marked as ignored.
  • When the failedTests/totalNotIgnoredTests ratio is above failRatioTriggeringIgnore (customizable), all failing test functions are marked as ignored. The servers probably returned something wrong due to overload, so all test methods relying on the server response cannot work.
  • The time each test takes is displayed as 0ms: results are first collected from every test function, then the failedTests/totalNotIgnoredTests ratio is calculated from those results, and only then is the RunNotifier notified (checking the results takes less than 1ms). This is a side effect of the implementation, but not a big deal.

In order to customize failRatioTriggeringIgnore I introduced the NewPipeTestRunnerOptions annotation, which is mandatory on tests run with NewPipeTestRunner. The annotation allows choosing a value for failRatioTriggeringIgnore. See this file for more information about it.

Every ignored/failed test generates a message explaining the reason and containing the stack trace.

Example

@RunWith(NewPipeTestRunner.class)
@NewPipeTestRunnerOptions(failRatioTriggeringIgnore = 0.6f)
public class MyServiceExtractorTest { /* ... */ }
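
For reference, here is a minimal sketch of what such an options annotation could look like (this is an assumption for illustration, not necessarily the exact annotation in this PR):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical sketch of the options annotation; the real NewPipeTestRunnerOptions may differ.
@Retention(RetentionPolicy.RUNTIME) // must be retained so the runner can read it via reflection
@Target(ElementType.TYPE)           // meant to be placed on test classes
public @interface NewPipeTestRunnerOptions {
    // Ratio of failed tests (over the non-ignored ones) above which all failures are ignored.
    float failRatioTriggeringIgnore();
}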

Feedback requested

travis-ci servers are very fast, so test requests hit the services in rapid succession. The custom runner ignores failed tests when the service being tested returned an invalid response, but this could lead to many ignored tests. There are three ways to mitigate this: add some delay between tests to reduce the load, retry a test if it failed, or a combination of both. Let me know your opinion.
Another thing I'd like to get feedback on is how the value for failRatioTriggeringIgnore should be chosen. Is 0.6 = 60% ok?

Other improvements to tests

  • Add a base class for YouTube stream extractor tests
  • Assert that the stream URLs' response code is not an error (e.g. not 404); see the sketch below
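
As an illustration of the second point, this is a minimal sketch of how such an assertion could look (the helper name and the use of a plain HttpURLConnection HEAD request are assumptions, not necessarily what the PR does):

import java.net.HttpURLConnection;
import java.net.URL;

import static org.junit.Assert.assertTrue;

// Hypothetical helper, just to demonstrate the idea of checking stream urls.
public final class UrlAssertions {
    public static void assertUrlIsNotAnError(String url) throws Exception {
        final HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("HEAD"); // avoid downloading the whole stream
        final int responseCode = connection.getResponseCode();
        connection.disconnect();
        // 4xx and 5xx response codes (e.g. 404) mean the stream url is not usable
        assertTrue("Got error response code " + responseCode + " for " + url,
                responseCode < 400);
    }
}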

TODO

  • Add NewPipeTestRunner to YouTube tests
  • Add NewPipeTestRunner to SoundCloud tests
  • Add NewPipeTestRunner to MediaCCC tests

@Stypox Stypox changed the title Cl build fix tarvis-ci build fix Aug 1, 2019
@Stypox Stypox changed the title tarvis-ci build fix tarvis-ci build fix using custom test runner Aug 1, 2019
@Stypox
Member Author

Stypox commented Aug 1, 2019

^ I added the description now, along with feedback requests :-)
@theScrabi @TobiGr @yausername

@Stypox Stypox changed the title tarvis-ci build fix using custom test runner travis-ci build fix using custom test runner Aug 1, 2019
@Stypox Stypox added the enhancement (New feature or request) label Aug 17, 2019
@mauriciocolli
Contributor

Had a quick overview; my main concern is that there is a chance that some bugs/errors will go unnoticed. I imagine checking the logs frequently would become part of the workflow if this is merged.

The time it takes to conduct any test is displayed as 0ms, since at first results are collected from every test function [...]

It's very inconvenient not to know which test is currently running. This is only noticeable when using an IDE, but it is annoying when debugging, for example.

It seems that this is indeed not fixable: in my tests, if fireTestIgnored is called after a fireTestStarted, it just duplicates the result in the IDE's dashboard (I guess it is considered a terminal event). Using fireTestAssumptionFailed is possible, but it doesn't matter, because there's another issue that messes with the results of tests that contain any error: they will have a fireTestStarted, but the fireTestFailure/fireTestFinished will have to wait, so the dashboard hangs and does not update the next results until all of them finish running.

[...] add some delay between tests to reduce overload, retry a test if it failed, or a combination of both.

A combination of them would be more appropriate here I think, maybe exponential backoff (when retrying each test) would be a good fit.
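
For illustration, a minimal sketch of such a combination: a retry loop whose delay grows exponentially after each failure (the names and the base delay are assumptions, just to demonstrate the idea):

// Hypothetical sketch: retry an action, waiting exponentially longer after each failure.
public final class RetryWithBackoff {
    public static void run(final int maxAttempts, final long baseDelayMs, final Runnable action)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                action.run();
                return; // success, no need to retry
            } catch (RuntimeException e) {
                if (attempt == maxAttempts - 1) {
                    throw e; // out of attempts, let the failure propagate
                }
                // wait baseDelayMs, then 2 * baseDelayMs, then 4 * baseDelayMs, ...
                Thread.sleep(baseDelayMs * (1L << attempt));
            }
        }
    }
}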

Another thing I'd like to get feedback on is how the value for failRatioTriggeringIgnore should be chosen. Is 0.6 = 60% ok?

Not sure how it should be chosen, but 60% seems like an OK value.


One thing about your implementation though: I think you overcomplicated things by subclassing Runner directly; it'd be better to subclass the (default) BlockJUnit4ClassRunner instead. It'd also fix the bug that you can't run individual tests anymore.

Look at this similar runner example using it (not exactly the same behavior, just to demonstrate).
import java.util.ArrayList;
import java.util.List;

import javax.annotation.Nullable;

import org.junit.AssumptionViolatedException;
import org.junit.Ignore;
import org.junit.runner.Description;
import org.junit.runner.notification.Failure;
import org.junit.runner.notification.RunNotifier;
import org.junit.runners.BlockJUnit4ClassRunner;
import org.junit.runners.model.FrameworkMethod;
import org.junit.runners.model.InitializationError;

// ReCaptchaException comes from NewPipeExtractor; the exact package path is assumed here.
import org.schabi.newpipe.extractor.exceptions.ReCaptchaException;

public class NewPipeTestRunner extends BlockJUnit4ClassRunner {

    public NewPipeTestRunner(Class<?> klass) throws InitializationError {
        super(klass);
    }

    private int testFailCount = 0;
    private int testIgnoredByUserCount = 0;
    private final float failRatioTriggeringIgnore = .6f; // Get from annotation

    private final List<MethodResultCollector> result = new ArrayList<>();

    private class MethodResultCollector {
        private final Description methodDescription;
        private final boolean ignoredByUser;
        private String ignoredByUserReason = "";
        private final Throwable thrownException;

        MethodResultCollector(FrameworkMethod frameworkMethod, Description description, Throwable thrownException) {
            this.ignoredByUser = frameworkMethod.getMethod().isAnnotationPresent(Ignore.class);
            this.thrownException = thrownException;
            this.methodDescription = description;

            if (ignoredByUser) {
                ignoredByUserReason = frameworkMethod.getMethod().getAnnotation(Ignore.class).value();
            }
        }

        boolean isIgnoredByUser() {
            return ignoredByUser;
        }

        boolean isSuccessful() {
            return thrownException == null;
        }

        void showResults(RunNotifier notifier, boolean ignoreAnyError) {
            if (ignoredByUser) {
                notifier.fireTestIgnored(methodDescription);
                System.out.println(methodDescription.getMethodName() + "() ignored because of @Ignore" +
                        (!ignoredByUserReason.isEmpty() ? (": " + ignoredByUserReason) : ""));
            } else if (thrownException != null) {
                if (thrownException instanceof AssumptionViolatedException) {
                    notifier.fireTestAssumptionFailed(new Failure(methodDescription, thrownException));
                    System.out.println(methodDescription.getMethodName() + "() ignored since it threw an AssumptionViolatedException, " +
                            "message = " + thrownException.getMessage());
                    thrownException.printStackTrace(System.out);

                } else if (thrownException instanceof ReCaptchaException) {
                    notifier.fireTestIgnored(methodDescription);
                    System.out.println(methodDescription.getMethodName() + "() ignored since it threw a ReCaptchaException");
                    thrownException.printStackTrace(System.out);

                } else if (ignoreAnyError) {
                    notifier.fireTestIgnored(methodDescription);
                    System.out.println(methodDescription.getMethodName() + "() ignored since the whole class " + getTestClass().getName() +
                            " has more than " + (int) (failRatioTriggeringIgnore * 100) + "% of failed tests");
                    thrownException.printStackTrace(System.out);

                } else {
                    notifier.fireTestStarted(methodDescription);
                    notifier.fireTestFailure(new Failure(methodDescription, thrownException));
                    notifier.fireTestFinished(methodDescription);
                }
            } else {
                notifier.fireTestStarted(methodDescription);
                notifier.fireTestFinished(methodDescription);
            }
        }

    }

    // First let the parent runner execute every test method (runChild below only records the
    // outcomes), then compute the fail ratio and fire all notifications in a second pass.
    @Override
    public void run(RunNotifier notifier) {
        super.run(notifier);

        for (MethodResultCollector resultCollector : result) {
            if (!resultCollector.isSuccessful()) testFailCount++;
            else if (resultCollector.isIgnoredByUser()) testIgnoredByUserCount++;
        }

        final int notIgnoredTestCount = testCount() - testIgnoredByUserCount;
        final boolean ignoreAll = notIgnoredTestCount > 1 && (float) testFailCount / notIgnoredTestCount >= failRatioTriggeringIgnore;

        for (MethodResultCollector resultCollector : result) {
            resultCollector.showResults(notifier, ignoreAll);
        }
    }


    // Runs each test method but only records its outcome; the notifications are fired later, in run() above.
    @Override
    protected void runChild(FrameworkMethod method, RunNotifier notifier) {
        final Description description = describeChild(method);

        @Nullable Throwable error = null;
        try {
            methodBlock(method).evaluate();
        } catch (Throwable e) {
            error = e;
        }

        result.add(new MethodResultCollector(method, description, error));
    }
}

So, to sum things up about this:

Pros:

  • Would let tests succeed in case of a service outage or instability.

Cons:

  • Sudden changes to the service would lead to ignoring the (real) failing tests.
  • False positives of the PR/tests passing when in reality they are not.
  • Need to check the log every time (questionable, but a con anyway).
  • Messed up timing in tests (matters when using an IDE).
  • If a test is ignored, we would have to either trigger a rebuild or test locally anyway; we couldn't trust that it is passing if it's ignored.

In the end, I ended up with more cons than pros about this; tell me your opinion about the list.

I believe a simple retry (with the aforementioned exponential backoff) in the downloader would be enough for this project. And if the ReCaptchas are the problem, I don't think just ignoring them is the solution (refer to the last con in the list).

Doing it like youtube-dl, that is, separating the core tests from the service tests, which are classified as "Allowed Failures" in the CI, also sounds good.

import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class NewPipeTestRunner extends Runner {
Contributor

Refer to my comment about the superclass. Some reasons are:

  • Individual tests can't be run anymore.
  • Hard-to-track bugs may arise from the custom implementation, as it isn't tested.

@Stypox
Member Author

Stypox commented Oct 28, 2019

Thank you for the feedback; I will answer and try to address the issues you pointed out in the next few days :-D

  • It works the same as the default JUnit runner except for these things:
    - when a test function throws a `ReCaptchaException`, the test is ignored with a message and a stack trace;
    - when the failedTests/totalNotIgnoredTests ratio is above `errorRatioTriggeringIgnore`, all failing tests for that class are ignored (probably YouTube returned something wrong);
    - the time it takes to conduct any test is displayed as 0ms.
  • Instead of the constant variable used before, now the user can define it. For more info see the documentation in NewPipeTestRunnerOptions.java.
  • Now exceptions coming from the implementation are treated correctly (causing the program to crash) instead of only making the current test fail.
  • It is >= 1.0f so that the runner never ignores any failed test, disabling the feature.
  • This removes much of the code that was used to get and invoke methods. Now the runner class is also able to run individual tests.
  • It is too risky to silently ignore tests that actually failed. The fact that many tests in a class fail indicates that there is a problem in the set-up methods (`@BeforeClass`). We can't assume that the problem in the set-up methods is a network error, since it could just as well be a programming error.
  • Also `ReCaptchaException`s are now considered a violated assumption (same as AssumptionViolatedException), since that's what they are. This allows calling `fireTestStarted` before invoking the tested method, thus correctly measuring the elapsed time.
  • This will prevent overloads on the services' servers.
  • Apparently age-restricted videos have no related videos (on Invidious it's the same, too).
  • Thanks to the base class YoutubeStreamExtractorBaseTest.
@Stypox
Member Author

Stypox commented Nov 4, 2019

I have now removed failRatioTriggeringIgnore: after thinking more about it, I agree with you.
This makes it possible to run tests normally (i.e. without caching results for later), so time is now measured properly.

I added three new options to the NewPipeTestRunnerOptions annotation (see the sketch after this list):

  • classDelayMs: how many milliseconds to wait before executing the annotated class.
  • methodDelayMs: how many milliseconds to wait before executing each test method in the annotated class.
  • retry: how many times a failing test should be retried before it is deemed failed.
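
A minimal sketch of how a test class could use these options (the concrete values below are just an example, not recommendations):

@RunWith(NewPipeTestRunner.class)
@NewPipeTestRunnerOptions(
        classDelayMs = 5000,  // wait 5 seconds before the first test of this class
        methodDelayMs = 1000, // wait 1 second before each test method
        retry = 3             // retry a failing test up to 3 times before reporting it as failed
)
public class MyServiceExtractorTest { /* ... */ }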

The one timing-related thing I didn't implement is exponential backoff, because I have no idea how it should work (which base should I use? Is it OK for the exponent to just be the current retry count?)

I switched to BlockJUnit4ClassRunner and the code shrunk by a lot. Thanks for the suggestion ;-)

@mauriciocolli

@Stypox
Member Author

Stypox commented Apr 10, 2020

Closing, as the only useful part of this PR is now in #309
