Detect and report rotten green tests #176

Open
cschreib opened this issue Jun 9, 2024 · 3 comments

cschreib commented Jun 9, 2024

I read an article the other day on "rotten green tests", i.e., tests that are reported "green" (passed) but that don't actually execute any checks, and hence don't test what they should. There is a report on the topic, and a PR for GoogleTest that attempts to catch and report this. Perhaps we should do something about it too.
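
A minimal sketch of the kind of test this is about (illustrative only, assuming snitch's snitch/snitch.hpp header; get_values() is a made-up data source):

#include <snitch/snitch.hpp>

#include <vector>

// Hypothetical data source; imagine a refactor accidentally left it empty.
std::vector<int> get_values() {
    return {};
}

TEST_CASE("values are positive", "[example]") {
    for (int value : get_values()) {
        // Never executed, because the loop body is never entered,
        // yet the test is still reported as green.
        CHECK(value > 0);
    }
}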

cschreib added the enhancement label Jun 9, 2024
cschreib added this to the v1.x milestone Jun 9, 2024

horenmar commented Dec 8, 2024

You can probably yoink this from Catch2 semi-directly

For context, the reason it is not the default is the use of sections-as-generators, like this:

TEST_CASE() {
    int i;
    SECTION("A") { i = 2; }
    SECTION("B") { i = 3; }
    REQUIRE(foo(i));
}


horenmar commented Dec 8, 2024

Another important piece of this puzzle is to error out on empty test runs. This catches cases where a link issue strips out the tests, or where automation and your terminal have an escaping fight and you end up with an invalid test spec.
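
For illustration, this kind of guard can be approximated today with a custom main; the sketch below uses GoogleTest rather than snitch and is only an illustration of the idea, not an existing feature:

#include <gtest/gtest.h>

#include <cstdio>

int main(int argc, char** argv) {
    ::testing::InitGoogleTest(&argc, argv);
    const int result = RUN_ALL_TESTS();

    // test_to_run_count() is the number of tests selected to run; zero
    // usually means a bad test filter or tests stripped out at link time.
    if (::testing::UnitTest::GetInstance()->test_to_run_count() == 0) {
        std::fprintf(stderr, "error: no tests were run\n");
        return 1;
    }

    return result;
}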

cschreib commented

Thanks for the tip! I like the idea of the --warn <thing> option; it seems very useful and probably easy to implement. I think having it opt-in is probably a good idea, indeed. Have you thought of introducing two levels, i.e., warn on empty test case, and warn on empty leaf section?
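
To make the two levels concrete, a sketch of the difference (test names made up, assuming snitch's snitch/snitch.hpp header):

#include <snitch/snitch.hpp>

#include <vector>

// Level 1: the whole test case executes no assertion at all.
TEST_CASE("empty test case", "[warnings]") {
    std::vector<int> values = {1, 2, 3}; // some setup...
    values.clear();                      // ...but no CHECK/REQUIRE anywhere
}

// Level 2: the test case does execute assertions, but one leaf section is
// empty, so that particular branch silently tests nothing.
TEST_CASE("empty leaf section", "[warnings]") {
    SECTION("checked") {
        REQUIRE(1 + 1 == 2);
    }
    SECTION("forgotten") {
        // no assertions here
    }
}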

Since I created this ticket, I ended up implementing a more intrusive approach at work, which also addresses the "empty test" / "rotten green test" problem. We use GoogleTest there, but the technique would work for any test framework. I wonder if we could offer it or something similar in snitch as well.

The idea is to declare at the start of the test the list of "concepts" / "features" / "requirements" you want to test, and explicitly mark each of these as "covered" at the end of the relevant checks (REQUIRE/CHECK/etc.). It looks sort of like this:

TEST_CASE() {
    const auto checker = requirement::expected("can meow properly", "can bark gently");

    for (const auto& animal : get_all_animals()) {
        if (animal.type() == "cat") {
            // Do some checks specific to cats.
            CHECK(animal.sound() == "meow");
            requirement::covered("can meow properly");
        } else if (animal.type() == "dog") {
            // Do some checks specific to dogs.
            CHECK(animal.sound() == "woof");
            requirement::covered("can bark gently");
        }
    }
}

The checker variable will automatically verify in its destructor that all "expected" features have been "covered", and report a failure if not. In the example above, the test might actually end up executing no checks if get_all_animals() returns an empty list, or returns a list of animals none of which are cats or dogs. Or, more subtly, get_all_animals() could return a list with only cats and no dogs, and you silently end up never testing your dogs. The empty-test warnings wouldn't catch that, since some CHECKs would still have run (for the cats only).
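
For reference, a rough sketch of what such a checker could look like (simplified and made up for illustration, not the exact implementation; a real version would report through the framework's failure macro, e.g. ADD_FAILURE() in GoogleTest, rather than printing to stderr):

#include <cstdio>
#include <set>
#include <string>

namespace requirement {

class checker;
inline const checker* current_checker = nullptr; // checker of the currently running test

class checker {
public:
    template <typename... Names>
    explicit checker(Names... names) : expected_{names...} {
        current_checker = this;
    }

    ~checker() {
        current_checker = nullptr;
        for (const auto& name : expected_) {
            if (covered_.count(name) == 0) {
                // A real implementation would fail the test here through the
                // framework, instead of just printing a message.
                std::fprintf(stderr, "requirement not covered: %s\n", name.c_str());
            }
        }
    }

    void mark_covered(const std::string& name) const {
        covered_.insert(name);
    }

private:
    std::set<std::string> expected_;
    // mutable so that covered() still works when the checker variable is
    // declared const, as in the example above.
    mutable std::set<std::string> covered_;
};

// Used at the start of a test: const auto checker = requirement::expected("a", "b");
template <typename... Names>
checker expected(Names... names) {
    return checker(names...);
}

// Used next to the checks: requirement::covered("a");
inline void covered(const std::string& name) {
    if (current_checker != nullptr) {
        current_checker->mark_covered(name);
    }
}

} // namespace requirement

This sketch only assumes the framework lets you report a failure from a destructor running at the end of the test body; everything else is framework-agnostic, which is why the technique should work with any test framework.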

There are ways to restructure the tests to avoid these pitfalls, or at least in such a way that the empty-test warnings would catch the problem (using sections, for example). But that might not always be practical, e.g., if get_all_animals() is costly.

This approach also catches more accidents, like an early return (or an exception, if exceptions are manually handled inside the test) bypassing some but not all of the checks. It also helps with a particular worry of mine: strengthening cases where the actual checks are outsourced to a "helper" function (usually to reduce code duplication). This tends to visually separate the checks from the test case, making it easy to refactor the checks and break the intent of the test case:

void check_cat(const auto& animal);
void check_dog(const auto& animal);

TEST_CASE() {
    for (const auto& animal : get_all_animals()) {
        if (animal.type() == "cat") {
            check_cat(animal);
        } else if (animal.type() == "dog") {
            check_dog(animal);
        }
    }
}

void check_cat(const auto& animal) {
    // Do some checks specific to cats...

    // Oops, somebody removed this because it was failing another test.
    // CHECK(animal.sound() == "meow");
}

With the requirement::covered() statement, there is a visual warning sign that something depends on that check:

void check_cat(const auto& animal) {
    // Do some checks specific to cats...

    CHECK(animal.sound() == "meow");           // these two lines...
    requirement::covered("can meow properly"); // ... would likely be removed together
}

And if someone is bold enough to remove the check anyway, taking both lines out, the checker will still complain and fail the test.
