-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Common Test for testing when Gradualizer should pass, fail, and its known problems #567
base: master
Are you sure you want to change the base?
Conversation
Sadly, this doesn't work, as init_per_suite/1 is called after all/0, so a suite with all/0 not returning any tests is not run :(
…nfo_pass.erl This is necessary to avoid a problem with the test module name being used as a generated test name, which is a function name. It lead to a name clash with preexisting module_info/1, which is defined for every module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not really familiar with Common Test and I also don't fully understand all the magic in gradualizer_dynamic_suite
. But under this disclaimer, this PR looks quite good to me (apart from a few comments I left). The output is nicer compared to EUnit and, as you said, it's probably easier to compare between different versions of Gradualizer. It's a bit slower than the EUnit tests (15 secs vs 10 secs on my machine) but that's not a problem I think. Overall, I consider it as an improvement over the EUnit tests. Thanks for implementing it!
Do you plan to replace the EUnit tests for should_*
with these CT tests, or do you want them to coexist (which would cause duplicated code)?
Thanks for extending the PR description and adding screenshots, it made it much easier to grasp the intended usage. If this gets merged, maybe we should also document it somewhere.
variable_binding_leaks]. | ||
|
||
should_pass_template(_@File) -> | ||
?assertEqual(ok, gradualizer:type_check_file(_@File, [{form_check_timeout_ms, 2000}])). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EUnit tests print a Gradualizer's error message describing the type error when a test fails. I find it useful for quickly glancing over what's going on without having to inspect all the failed test files one by one. Do you think it's possible with CT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's available in the CT HTML report, but that requires a little bit of clicking in the browser. It might be possible to get in the shell, too, but I'd have to think some more about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, CT supports plain text reports too. I don't know it well, but if it's too complicated, maybe it isn's worth it.
If we run it ourselves instead of relying on rebar3, we'll have better control over it.
@@ -149,6 +149,9 @@ eunit: compile-tests | |||
erl $(ERL_OPTS) -noinput -pa ebin -pa test -eval \ | |||
'$(erl_run_eunit), halt().' | |||
|
|||
ct: | |||
@rebar3 ct --label "git: $$(git describe --tags --always) $$(git diff --no-ext-diff --quiet --exit-code || echo '(modified)')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We made this whole makefile work without rebar3 because of some annoyances we had with it and to get better control of what's happening. If we mix rebar3 with non-rebar3 we'll have files built in various place and a mess in general. I don't like that.
Isn't it fairly straitforward to run ct
without rebar3? It's a command line tool.
Add ct to the tests
target and to .PHONY
.
We should be able to modify the logic we have in erl_cover_run
function to have coverage computed on eunit and commontest combined, or only commontest if we port all eunit tests to commontest.
To run a specific suite, we can do something similar to what erlang.mk does, e.g. make ct suite=known_problems_should_pass_SUITE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We made this whole makefile work without rebar3...
Generally, I think rebar3 is quite mature and I'd happily drop the Makefile, to be honest. though I admit I see some small benefits of using a tailored Makefile.
Isn't it fairly straitforward to run ct without rebar3? It's a command line tool.
It's possible, but the nice CLI output is provided by a custom Rebar3 CT hook, so if we run CT directly, we'll get uglier printouts.
Add ct to the
tests
target and to.PHONY
.
If we're happy with moving completely to CT, I can do that. For now, I considered this a nicer alternative for local development (especially comparing test results across builds), but did not include it in the CI, nor did I remove the original EUnit tests this is based on. In this light, it didn't make sense to run them twice, so the CT variants are not in tests
.
Module = ?config(dynamic_suite_module, Config), | ||
?assert(Module /= undefined), | ||
case erlang:function_exported(Module, generated_tests, 0) of | ||
true -> | ||
{ok, Module:generated_tests()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this test suite doing?
It looks like a lot of magic. I don't like magic. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a little bit of magic, but let's start from the standard CT conventions. CT treats files matching *_SUITE.erl
as test suites, so gradualizer_dynamic_suite.erl
is not a test suite, it's a helper file for use in test suites. The actual test suites match the previous tests defined in EUnit:
test/known_problems_should_fail_SUITE.erl
test/known_problems_should_pass_SUITE.erl
test/should_fail_SUITE.erl
test/should_pass_SUITE.erl
.
The dynamic suite helper exports only one function - gradualizer_dynamic_suite:reload/1
- which generates CT test cases based on passed in config and then dynamically reloads the module it's invoked from. It's idempotent, so calling it once, twice, or 100 times has the same effect. That's where the name comes from - "dynamic suite". The reason for this helper is that, unlike EUnit, CT has no standard way to generate tests dynamically.
The tests are generated for each of the test files defined under the respective should pass/fail/known problems directories.
The most enigmatic part is where to actually invoke gradualizer_dynamic_suite:reload/1
in a client test suite - this is something I had to dive in OTP CT code to figure out. Apparently, *_SUITE:groups/0
is the first function CT calls when running a test suite, so it's the best place to call reload/1
from. Before @xxdavid's remark that listing cases by hand is tedious we could call reload/1
from something more logical, like init_per_suite/1
, but if we want to avoid listing the to-be-generated tests manually, we have to play with how CT runs the test suites.
@@ -0,0 +1,63 @@ | |||
-module(known_problems_should_fail_SUITE). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you duplicate the eunit test suites as commontest suite?
I don't want duplicated logic. Delete the eunit test suites that have been ported to commontest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did, yes. If we're happy with moving completely to CT, I'll delete the EUnit tests.
variable_binding_leaks]. | ||
|
||
should_pass_template(_@File) -> | ||
?assertEqual(ok, gradualizer:type_check_file(_@File, [{form_check_timeout_ms, 2000}])). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, CT supports plain text reports too. I don't know it well, but if it's too complicated, maybe it isn's worth it.
If we run it ourselves instead of relying on rebar3, we'll have better control over it.
This PR adds ability to test Gradualizer functionality with Common Test (aka CT), which reports results in a way that enables much more convenient comparisons between builds. CT generates HTML test run reports. The Makefile rule,
make ct
, labels test runs with git commit info and a modified annotation, if a run tested a modified working directory. All in all, when introducing a change, this testing framework makes it easier to spot what broke and compare results between various builds.For now, it's not integrated with CI.
Usage:
or if we're only interested in a specific suite:
Example results
Command line:
Please note that line numbers are always the same, since the CT tests (test functions) are generated from a template. However, the suite and test name are sufficient to identify the typechecked file the CT test corresponds to.
HTML listing of all runs (just 2 in this case):
Two runs compared side by side: