Test improvements #1001

akx · 2024-01-31T07:46:14Z

This PR improves the technical state of the tests some – it doesn't fix flakiness or coverage any.

The main thing is to use Pytest's parametrize in the way it's intended to be used; stacking multiple parametrizes results in a Cartesian product of all of the parameters. Where the original code did this by hand and precariously constructed test ID strings, we'll now let Pytest do it for us. It's about half shorter too. The exact same number of tests is generated as before (3,136) – the ids are different (more readable) in the new version though.
The original code would generate random dimension numbers for tests on every invocation, making things like --lf (run last failed) etc. not exactly work as they should. This is somewhat alleviated, but not perfectly, by using a seeded generator. The number of files visited (i.e. the number of randint calls to that seeded generator) affects the result still.
The NO_CUBLASLT NotImplementedError implemented in Don't crash Python interpreter via assert(false) when missing cublaslt #998 is now translated into a test skip, not a test failure.
Two Pytest markers were added, to allow e.g. -k "not (slow or benchmark)".
- There are still a number of non-benchmark tests that contain internal looping that could be replaced with e.g. pytest-repeat to make it possible to run a shorter or longer test suite. That's out of scope for this PR though.

I tested that tests run just as before on my hardware, and many if not all succeed (310 failed, 1991 passed, 563 skipped, 18 deselected, 704 warnings in 488.88s (0:08:08)) – but that's probably the flakiness of the tests in general to blame.

This PR stacks on #1000.

These transformations were done partly with https://ast-grep.github.io/ (h/t @HerringtonDarkholme!)

`out_order` is the global parametrization list, not the test fixture argument

Titus-von-Koeller · 2024-02-01T02:23:17Z

This is really amazing, you're such a pro!

I've really been struggling as the code is often hard to follow and I find that it could profit so much from the kind of Pythonic refactoring that you showcased in this PR: That would make the whole library much more accessible, easy to maintain and contribute to. Code that is written well in the sense that it makes things as understandable as they can be is the best documentation.

A significant step in the right direction. I reviewed every commit and everything looks right to me. Of course, with the test suite in it's current state, it's a bit hard to be sure we didn't accidentally introduce some error. Anyways, you definitely also at fixed at least one bug and reducing the flakiness is an absolute must. Also, we have to take some calculated risks in improving the code quality in this repo and I'm fully on board with that. You also seem to know what you're doing (: But no worry if sth breaks, then we fix forward..

Please keep it coming, if you feel like it. Super happy to have you on board in this effort!

Titus-von-Koeller · 2024-02-01T02:37:57Z

P.S. Thanks for mentioning https://ast-grep.github.io/

Looks really promising and didn't hear of it yet. I've been looking for good tools to help with refactoring. Will look into it. Let me know if you can recommend other stuff :)

BenjaminBossan · 2024-02-08T10:55:29Z

I haven't checked this PR in detail, but just a tip for the future: When refactoring tests, it can be useful to run code coverage (pytest-cov) before and after the refactor to ensure that exactly the same lines of code are covered as before. That's not a proof that the tests do the same thing, but a strong indicator.

akx · 2024-02-08T10:57:20Z

@BenjaminBossan Absolutely – though in this case it wouldn't help as much, since the original code generated random tests :)

But as soon as we have a test suite that's not as flaky, pytest-cov all the way!

akx added 5 commits January 31, 2024 08:39

test_nvidia_transform: fix variable reference

b6b078d

`out_order` is the global parametrization list, not the test fixture argument

Make parametrize use more idiomatic

28043bd

Use a more deterministic helper for dim* determination

5ff8baf

Convert NO_CUBLASLT errors into skips too

849bd93

Mark slow and benchmark tests as such (allows -k "not benchmark")

4d84b62

akx mentioned this pull request Jan 31, 2024

test_nvidia_transform: fix variable reference #1000

Merged

Titus-von-Koeller merged commit 2336a45 into bitsandbytes-foundation:main Feb 1, 2024

akx mentioned this pull request Feb 7, 2024

Drop parameterized dependency in favor of Pytest's parametrize huggingface/accelerate#2421

Closed

5 tasks

akx deleted the parametrize-ids branch February 8, 2024 10:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Test improvements #1001

Test improvements #1001

Uh oh!

akx commented Jan 31, 2024 •

edited

Loading

Uh oh!

Titus-von-Koeller commented Feb 1, 2024 •

edited

Loading

Uh oh!

Titus-von-Koeller commented Feb 1, 2024

Uh oh!

BenjaminBossan commented Feb 8, 2024 •

edited

Loading

Uh oh!

akx commented Feb 8, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Test improvements #1001

Test improvements #1001

Uh oh!

Conversation

akx commented Jan 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Titus-von-Koeller commented Feb 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Titus-von-Koeller commented Feb 1, 2024

Uh oh!

BenjaminBossan commented Feb 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

akx commented Feb 8, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

akx commented Jan 31, 2024 •

edited

Loading

Titus-von-Koeller commented Feb 1, 2024 •

edited

Loading

BenjaminBossan commented Feb 8, 2024 •

edited

Loading