-
-
Notifications
You must be signed in to change notification settings - Fork 808
Test improvements #1001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test improvements #1001
Conversation
`out_order` is the global parametrization list, not the test fixture argument
|
This is really amazing, you're such a pro! I've really been struggling as the code is often hard to follow and I find that it could profit so much from the kind of Pythonic refactoring that you showcased in this PR: That would make the whole library much more accessible, easy to maintain and contribute to. Code that is written well in the sense that it makes things as understandable as they can be is the best documentation. A significant step in the right direction. I reviewed every commit and everything looks right to me. Of course, with the test suite in it's current state, it's a bit hard to be sure we didn't accidentally introduce some error. Anyways, you definitely also at fixed at least one bug and reducing the flakiness is an absolute must. Also, we have to take some calculated risks in improving the code quality in this repo and I'm fully on board with that. You also seem to know what you're doing (: But no worry if sth breaks, then we fix forward.. Please keep it coming, if you feel like it. Super happy to have you on board in this effort! |
|
P.S. Thanks for mentioning https://ast-grep.github.io/ Looks really promising and didn't hear of it yet. I've been looking for good tools to help with refactoring. Will look into it. Let me know if you can recommend other stuff :) |
|
I haven't checked this PR in detail, but just a tip for the future: When refactoring tests, it can be useful to run code coverage (pytest-cov) before and after the refactor to ensure that exactly the same lines of code are covered as before. That's not a proof that the tests do the same thing, but a strong indicator. |
|
@BenjaminBossan Absolutely – though in this case it wouldn't help as much, since the original code generated random tests :) But as soon as we have a test suite that's not as flaky, |
This PR improves the technical state of the tests some – it doesn't fix flakiness or coverage any.
parametrizein the way it's intended to be used; stacking multipleparametrizes results in a Cartesian product of all of the parameters. Where the original code did this by hand and precariously constructed test ID strings, we'll now let Pytest do it for us. It's about half shorter too. The exact same number of tests is generated as before (3,136) – the ids are different (more readable) in the new version though.--lf(run last failed) etc. not exactly work as they should. This is somewhat alleviated, but not perfectly, by using a seeded generator. The number of files visited (i.e. the number ofrandintcalls to that seeded generator) affects the result still.NO_CUBLASLTNotImplementedErrorimplemented in Don't crash Python interpreter viaassert(false)when missing cublaslt #998 is now translated into a test skip, not a test failure.-k "not (slow or benchmark)".I tested that tests run just as before on my hardware, and many if not all succeed (
310 failed, 1991 passed, 563 skipped, 18 deselected, 704 warnings in 488.88s (0:08:08)) – but that's probably the flakiness of the tests in general to blame.This PR stacks on #1000.
These transformations were done partly with https://ast-grep.github.io/ (h/t @HerringtonDarkholme!)