Improving Hypothesis' index and shape strategies #46

Zac-HD · 2020-06-18T14:50:54Z

👋 I'm a Hypothesis core dev, and got very excited when I saw your intro blog post today.

I've been working to support better property-based testing for science code for some time now - see e.g. scipy-conference/scipy_proceedings#549 - and it's lovely to find another project and potential collaboration!

https://github.com/Quansight/ndindex/blob/0d9d131a5310915f3a51ebe95491e43afcd1dcad/ndindex/tests/helpers.py#L14-L18

In particular, both of these limitations are things that we can fix!

While I can total understand and support rolling your own rather than working upstream, it would be awesome if you could open an issue when you run into things like this - one of our limiting factors is actually user input and use-cases, so if you have suggestions they will find interested ears 😁

Before I go and upgrade our slices() and basic_indices() strategies, can I just check:

The "empty selection only if start == end" aspect of slices() is OK? If so I can just add some logic to maybe-invert index-from-start to index-from-end.
basic_indices() could similarly maybe-return t[0] if it would return a length-one tuple t, unless disabled by setting a new argument to False. Do you think this would match Numpy user's intuitions about what a basic index is?

The text was updated successfully, but these errors were encountered:

asmeurer · 2020-06-18T16:06:42Z

Hi. Thanks for taking notice of this project.

Regarding slices, I intentionally rolled my own, because for this library I care about a every possible corner case of slices to ensure correctness, whereas presumably for most use-cases you would want to avoid generating duplicate slices. For example, if a slice step=None, it is always equivalent to step=1. Depending on what level of abstraction you are working in, there is no need to check this. But I always want to make sure I handle every corner case here.

For shapes, I do think hypothesis can improve. See https://github.com/Quansight/ndindex/blob/0d9d131a5310915f3a51ebe95491e43afcd1dcad/ndindex/tests/helpers.py#L58-L61. I want to generate array shapes but only keep those that create arrays that are not too big (also note the NumPy bug that hypothesis found here). I think this is actually a pretty big source of filtering in my code. But I'm not sure if there's a better way to generate such shapes directly to avoid filtering.

There's also quite a bit of filtering in the tests themselves because I filter out invalid indices. Perhaps I should try to be smarter about this. I do want to generate these indices for the base tests, but generally want to avoid them for higher level tests. For example, tuple indices with more than one ellipsis should raise an error. This is tested in the constructor test_tuple_hypothesis but uses assume(False) in the other tests. I've been hesitant to get too clever here because I don't want to accidentally filter out too much, and because what is there has been working (although admittedly some bugs were not found until the tests were ran several times on CI). Or as a simpler example, I should probably avoid step=0 in all but the base Slice constructor test because it raises an exception there. But I haven't bothered, even for my manual exhaustive test.

Another question I had about hypothesis is if you had any kind of gitter or something like that where I could ask questions? On your website I only saw a link to an IRC channel, which didn't seem to have anyone in it when I checked.

Zac-HD · 2020-06-19T07:21:52Z

slices

I've opened an issue to improve our st.slices() strategy. Once that's done - no eta, since I think it's a good intro for new contributors - the only thing it won't generate is slices which have start != end but select no elements due to step. This is usually helpful (because around half the time nothing is selected otherwise), but in your case you might want to roll your own to keep generating such cases.

shapes

Capping the product of side lengths without introducing weird biases or shrinking problems is surprisingly difficult... our usual approach is still to filter them. Note that these problems only really matter if you're creating arrays of the given shape though.

Drawing the side lengths from ints() | just(1) | sampled_from([1, 2]) or similar could help, by increasing the chance that several dimensions are of size one (or maybe two) to boost dimensionality without increasing the product (much).

In the end filtering out a lot isn't actually a problem in itself, so if the performance is still OK you don't need to do anything. And congratulations on the bug - are you keeping a 'trophy case' list?

indices

I'd suggest using separate strategies for "usually valid indices" and "always valid indices". I'm also keen to upgrade our upstream basic_indices() strategy to provide the latter - is there anything other than the "always a tuple" thing that we should change?

admittedly some bugs were not found until the tests were ran several times on CI

For complicated new tests I often set them to run 10k or 100k examples and leave them running for a while - 100 examples just isn't much to hit edge cases in high-dimensional spaces. There are plans in the works for a proper long-running mode, and I'm hoping to announce something really cool later this year... but for now just turn up the max_examples setting 😅

Q&A

No, we don't have a live chat kind of thing... they can devolve into unpaid support and none of us want to go there. There's a stackoverflow tag which is pretty helpful though, and you're welcome to tag me in an issue.

asmeurer · 2020-06-24T19:05:52Z

I've opened an issue to improve our st.slices() strategy. Once that's done - no eta, since I think it's a good intro for new contributors - the only thing it won't generate is slices which have start != end but select no elements due to step. This is usually helpful (because around half the time nothing is selected otherwise), but in your case you might want to roll your own to keep generating such cases.

I will probably continue to use my own. I think another difference between my use-case and hypothesis is that I always want to generate indices independent of shape, because ndindex operations should always work independently of array shape when possible.

Capping the product of side lengths without introducing weird biases or shrinking problems is surprisingly difficult... our usual approach is still to filter them. Note that these problems only really matter if you're creating arrays of the given shape though.

Yeah, that's exactly what I'm doing. I don't know what else you would do with a shape. Actually I don't really know what a "typical" use-case for the numpy hypothesis stuff is.

In the end filtering out a lot isn't actually a problem in itself, so if the performance is still OK you don't need to do anything. And congratulations on the bug - are you keeping a 'trophy case' list?

Right, I'm mostly concerned because of the healthcheck errors. And I the filtering means that it isn't actually running as many tests as I would hope.

As an aside, I realized that I've been misusing assume(False). I've used it on error condition cases, but I actually want those to be tested to ensure that they do give an error at the right place. I was getting a lot of healthcheck errors before in some cases because of this.

I'd suggest using separate strategies for "usually valid indices" and "always valid indices". I'm also keen to upgrade our upstream basic_indices() strategy to provide the latter - is there anything other than the "always a tuple" thing that we should change?

That and slices. I can't completely tell what you are doing with ellipses but I would generate ellipses even when they are redundant (an ellipsis at the end of the tuple or an ellipsis when there are already ndim terms in the tuple).

No, we don't have a live chat kind of thing... they can devolve into unpaid support and none of us want to go there. There's a stackoverflow tag which is pretty helpful though, and you're welcome to tag me in an issue.

So where's the best place to ask questions about hypothesis?

Zac-HD · 2020-07-08T13:04:40Z

(back from an outback road trip now)

That and slices. I can't completely tell what you are doing with ellipses but I would generate ellipses even when they are redundant (an ellipsis at the end of the tuple or an ellipsis when there are already ndim terms in the tuple).

We already do this upstream 😁
I've opened a new issue for the SciPy sprints to generate non-tuple basic indices though, which I think was the last thing here.

So where's the best place to ask questions about hypothesis?

Probably https://stackoverflow.com/questions/tagged/python-hypothesis for general questions; for OSS-project-specific questions you can ping me in issues too.

Zac-HD mentioned this issue Jun 18, 2020

De-composite helper functions #47

Merged

Zac-HD mentioned this issue Jun 19, 2020

Generate negative indices from the slices() strategy HypothesisWorks/hypothesis#2468

Closed

scopatz added the skip-report label Jul 1, 2020

Zac-HD mentioned this issue Jul 11, 2020

Generate non-tuples from npst.basic_indices() HypothesisWorks/hypothesis#2486

Closed

Zac-HD closed this as completed Jul 11, 2020

asmeurer mentioned this issue Oct 25, 2020

Add parse_index function for parsing NDIndex from string #91

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improving Hypothesis' index and shape strategies #46

Improving Hypothesis' index and shape strategies #46

Zac-HD commented Jun 18, 2020

asmeurer commented Jun 18, 2020

Zac-HD commented Jun 19, 2020 •

edited

Loading

asmeurer commented Jun 24, 2020

Zac-HD commented Jul 8, 2020 •

edited

Loading

Improving Hypothesis' index and shape strategies #46

Improving Hypothesis' index and shape strategies #46

Comments

Zac-HD commented Jun 18, 2020

asmeurer commented Jun 18, 2020

Zac-HD commented Jun 19, 2020 • edited Loading

slices

shapes

indices

Q&A

asmeurer commented Jun 24, 2020

Zac-HD commented Jul 8, 2020 • edited Loading

Zac-HD commented Jun 19, 2020 •

edited

Loading

Zac-HD commented Jul 8, 2020 •

edited

Loading