Reduce runtime by running only module-related tests #498
This seems like a test runner feature, not a feature of a mutation tester. Or so I thought when I implemented this exact feature into hammett (my test runner) to improve mutation testing speed for iommi (a lib we're writing at work), with mutmut (my mutation tester) :) This feature should be usable via cosmic ray just as it is via mutmut.
Sure, it can be pushed down the stack. It seems to me that this feature would be beneficial regardless of the test runner and is therefore well suited for the mutation test driver...
Groetjes,
Louis
The problem with putting it at the mutation tester level is that there really is no common API for getting the required information out of test runners, nor are there APIs to tell test runners which tests to run. In fact, pytest has made it intentionally harder to use it as a library!
I don’t know all test tools, but the ones I know allow me to specify which file to run the tests from. That’s all that’s needed, isn’t it?
Groetjes,
Louis
This is a good idea, and it's a subset of larger principles we've discussed in other issues, e.g. #85 and #484. The best way to do it is with a filter. This gives you the ability to mark arbitrary mutations/tests as "skipped" prior to execution (or really, whenever you want). I suspect it would be very easy to write a filter that takes the name of the modified module and skips everything that isn't related to it. You can find a number of filters, as well as a base class to simplify their creation in common cases, in the source.
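As a rough sketch of the shape such a filter could take (this is not Cosmic Ray's actual filter API; the work-item dicts and `skipped` flag are stand-ins for the session database):

```python
from pathlib import Path

def skip_unrelated(work_items, changed_module):
    """Mark every mutation that does not target `changed_module` as skipped.

    `work_items` is assumed to be a list of dicts with a 'module_path'
    key and a mutable 'skipped' flag; a real filter would read and
    update the session database instead of an in-memory list.
    """
    changed = Path(changed_module)
    for item in work_items:
        if Path(item["module_path"]) != changed:
            item["skipped"] = True
    return work_items
```

The point is only that the filter runs before test execution and prunes the pool of mutations, leaving the test command itself untouched.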
Ah, yeah, you only mean at the whole-test-file level. Then yes, I believe it's fairly safe/easy to implement. We used to do this on iommi (for mutmut, but the principle is the same): https://github.com/TriOptima/iommi/blob/634bccf9e91bc5006ab2f95a368e1341a7bf5287/mutmut_config.py

Since then I've implemented this module-test coupling feature directly into my test runner hammett, and we now use that locally. The way we do it is we put the tests right next to the module, but with a naming convention tying them together.

Also, hammett has much lower overhead than pytest and is somewhat pytest compatible, so you can get some pretty big wins doing mutation testing with it. Again, this applies just as much to CR or mutpy as to mutmut. In mutmut I have an extra special code path for hammett to get even more speed: hammett is written to be used as an API and unloads modules, and mutmut runs 100 mutants in a single process before starting a new process. I couldn't get django to stop leaking memory like crazy, which is why it's not just one process for all mutants, but paying the process creation overhead only once every 100 mutants makes it pretty much disappear, as there are other things that are more expensive anyway :(
I have just now looked at the mutmut configuration feature. It seems to me that what you describe there should work if, in this pre-test callback, the driver also provides information about the mutation being tested (module file name and path...). For me, mutmut and cosmic ray are on the same level in the stack.

I also thought about using coverage information for this purpose, allowing even bigger reductions. But it seemed to me that there is a risk that a mutation changes the code path, which in turn makes another test relevant as well, so I discarded that option. Thinking a bit more about it, the mutmut way gives a lot of freedom to use whatever naming convention you like to indicate the relation between the module and the tests.

I would very much like to add support for such a feature to cosmic ray, because it already supports utilizing all cores of the available hardware...
So again, I would recommend a filter. You can define whatever relationship you want between modules and tests, implement that in a filter, and everything works. All the parts you need are there.
@lvdstam We don't have the same mutants though, so don't be fooled into thinking that's the only difference between the tools! Also, you can sometimes get 10x or 100x improvements in speed with mutmut plus the hammett special case, so just using more cores might not be faster at all; don't assume it is. Which reminds me: we should discuss naming and classifying mutants so we can compare the tools. We have discussed this before a bit but never got anywhere.
@abingham Looking at the filter suggestion, I don't understand the idea. My understanding of the filter is that it runs after the init and removes some of the mutations from the pool of mutations to test. I'm not clear on how to use the filter to limit the tests executed for each mutation. Can you please elaborate on this?
@boxed I have noticed that the set of mutations is indeed different, but that seems solvable with some analysis and additional plugins. Looking at hammett, I have difficulty understanding what it does and what its limitations are... Besides mutmut with hammett getting into a situation where no progress is made on our tests (the process seems to have died), this lack of understanding is my main reason for looking at other tools/ways to get the job done.
Ah, I see. I was reading this incorrectly. As you say, filters will not let you control which tests are run. The only way to do this is by changing the test command you use, assuming your test framework gives you the ability to do what you're looking for. If you have a true correspondence between tests and modules under test, it might be simplest to split your mutation testing up along these lines. You could have a single session per module (or test, or whatever works best for you). I'd be curious to know how this approach works out in practice.
Related to this (and what I was initially thinking about) is that filters can let you skip mutations to locations that don't need to be mutated. That is, if you've only changed module X, you can filter out mutations to all other modules. This can have a huge impact on runtime, and should be done independently of whether you find a way to limit the tests being run.
@abingham I agree with you on that last bit. To get a complete verdict, however, you need to have the results of the previous run available. I'm mainly thinking about running the mutation tests in the CI/CD pipeline, where a full run would make more sense. For a local developer experience, the feature you describe is very desirable.
So what are your thoughts on this feature and implementation? Does the feature make sense? And if so, would a mutmut-style interface with two callbacks make sense?
For the mutation_info I have proposed a dictionary, but I would actually prefer a more typed interface, like a class.
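A typed interface along those lines could be as small as a frozen dataclass. The field names below are purely illustrative, not a proposed final API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MutationInfo:
    """Information the driver would pass to the pre-test callback.

    All field names here are illustrative placeholders; the real set
    would need to be agreed on in the merge request.
    """
    module_path: str    # path of the module being mutated
    operator_name: str  # which mutation operator was applied
    line_number: int    # location of the mutation in the module
```

A frozen dataclass gives the callback a read-only, typed view of the mutation, which is harder to misuse than a plain dictionary.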
I'd be happier with a more extrinsic solution that didn't involve new APIs inside CR, something closer to what hammett sounds like. So instead of telling CR to invoke pytest directly, you'd have it invoke some new test runner (let's call it "tufnel" to keep the gag going). Tufnel would be given enough information to know how to drive pytest (or whatever test runner you really want to use). So the business of running tests would be outside of CR, and tufnel (and related tools) could be developed completely separately from CR. As things stand, this would still require at least a minor change to CR. We'd have to communicate information about the module being mutated to tufnel. The most direct way to do this would be to let test runner commands have replacement fields that CR can fill in with that information. So your test runner command might look like:
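For illustration only — the "tufnel" name comes from the comment above, and the `{module-path}` field name and config layout are assumptions, not an existing Cosmic Ray feature:

```toml
[cosmic-ray]
# Hypothetical: CR would substitute {module-path} with the path of the
# module currently being mutated before invoking the command.
test-command = "tufnel --module {module-path} -- pytest -x tests/"
```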
CR would fill in the replacement field with the path of the module being mutated before invoking the command.
With hammett there doesn't need to be any communication (you can modify the command like before if you want, but you shouldn't need to). Hammett checks the modification times of files to know which modules have changed, and then it knows which test modules to run.

I'm sorry to hear there's a hang. There was some fix in mutmut a while back, and I haven't released a new version since then. If it's not a big inconvenience, it would be great if you could try the master branch directly. But were you able to run the test suite with hammett itself?
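The modification-time idea can be sketched in a couple of lines (a simplification; hammett's real bookkeeping is presumably more involved):

```python
import os

def changed_since(paths, last_run_time):
    """Return the subset of `paths` modified after `last_run_time`
    (a POSIX timestamp) -- i.e. the modules whose associated test
    modules need re-running."""
    return [p for p in paths if os.path.getmtime(p) > last_run_time]
```

Combined with a module-to-test naming convention, this is enough to decide which test files to run without any explicit communication from the mutation driver.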
@abingham I like that idea. I'm not sure that, from a performance viewpoint, creating another process in between is a smart thing. I guess one would like to batch this, but given that the driver mutates the code...
@boxed With mutmut I could run the tests. In my current environment, which is a clean checkout of the repo, if I just `pip install hammett` and run `hammett -x`, no tests are executed... When time allows, I will install the latest master version of mutmut and redo.
You're going to get a subprocess for every test run as it is; that's how CR runs tests. If this new test runner runs pytest in-process (which is entirely possible, and maybe ideal for something like this), the only new runtime hit will be in calculating the tests to run.
Not to be rude, but could you move the mutmut support issues to mutmut?
@abingham Could you give me a pointer towards running pytest in-process? Seems a perfectly reasonable solution then...

Sure, I will not report on mutmut here.
I think this is what you need. CR used to run tests in-process, and I'm 99% sure this is how we did it for pytest.
But using the pytest API makes it pytest-only. Could be done, but I would prefer a more generic way.
I'm completely fine with a totally generic solution, but I don't know what that solution is, and I don't think it needs to be in CR. CR used to have a plugin-based system for test runners, and it ran everything in-process, so maybe you could do something like that.
Okay. Sounds like we need to take some time to think it over. Maybe I'll just fork and see what happens. Could you give me a global hint as to where to add the call to "tufnel" and where to get the module_path parameter? Making some changes to the code might just help me get a better understanding...
You would specify that in the test command in your configuration, i.e. the file you pass to the
I think the correct place to get this is in Edit: Actually,
Running in-process is pretty dangerous, though. I found that out the hard way; that's why I run just 100 runs before I cycle the process. At least pytest-django (probably django itself) isn't amenable to being reloaded, and pytest isn't really possible to unload cleanly either (another reason I gave up and wrote hammett). Just wanted to put my experience here so you don't waste all the time I wasted!
@lvdstam Could you open issues for hammett and/or mutmut so we can continue discussing that there? You don't have an email address listed, so I can't contact you directly. Sorry about this last off-topic message, Austin. (I wish there were private messages on GitHub!)
In-process is a non-starter if you expect to be able to run more than one test run. But in this case we'll be running only a single test run per process.
Well, I actually accomplish more than one test run per process. It was painful, I'm pretty sure I broke Windows support, and I had to run my own test runner and cycle the process once in a while, but it's possible. But really we should be doing mutation schemata, and then this wouldn't be such a big problem, and we'd get parallelization basically for free. But it's a huge rewrite... at some point I might get to it.
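The process-recycling scheme described here can be sketched with the stdlib. The batch size and the worker command are placeholders; a real driver would exec the actual test runner instead of the stand-in one-liner:

```python
import subprocess
import sys

def run_in_batches(mutant_ids, batch_size=100):
    """Run mutants in fresh worker processes, one process per batch,
    so leaked state and memory are reclaimed every `batch_size` runs.
    The worker command is a stand-in that just counts the mutant ids
    it was given; a real driver would invoke the test runner."""
    processed = 0
    for start in range(0, len(mutant_ids), batch_size):
        batch = mutant_ids[start:start + batch_size]
        out = subprocess.run(
            [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)"]
            + [str(m) for m in batch],
            capture_output=True, text=True, check=True,
        )
        processed += int(out.stdout)
    return processed
```

This keeps the per-mutant overhead low while still putting an upper bound on how long any leak can accumulate.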
I'm sure it's possible, but since a mutation testing run could cause arbitrary damage to a runtime environment, it seems impractical. CR takes the approach of using processes as firewalls against potential corruption.
@abingham Thanks for the hints. I'll go and run with it for a bit.
Assuming that there is some relation between the path and name of a code module and its test module, and assuming that only tests related to the code module being mutated need to be executed, a significant reduction in runtime could be achieved. This behavior should be optional and configurable, allowing for instance the association of multiple test modules with one code module or vice versa. Nothing too complex, though; simplicity should be key.
With some guidance, I would be willing to create a merge request for such a feature.