Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type confusion with Tuple #17

Open
asmeurer opened this issue May 5, 2020 · 2 comments
Open

Type confusion with Tuple #17

asmeurer opened this issue May 5, 2020 · 2 comments

Comments

@asmeurer
Copy link
Member

asmeurer commented May 5, 2020

There's a lot of potential for type confusion with Tuple:

  • Tuple(tuple) is currently not valid. To create a Tuple from a tuple, you have to use Tuple(*tuple). But there are valid slices with nested tuples, so we can't make it work.
  • len(t) gives ValueError because len in ndindex means the number of elements indexed by the index for a single axis. For a Tuple, this isn't implemented, but we could make it assume there are as many axes as elements of the tuple. But people might expect it to mean len(Tuple.args).
  • t[0] doesn't work, but it eventually will mean composed indices (a[t][0]), but people might expect t.args[0].

I don't know if there are good solutions here, other than documentation, unless we want to give up some of the nice syntactic sugar. It's unfortunate that tuple is an index that itself can be indexed.

@asmeurer
Copy link
Member Author

I have added a document warning the user about type confusion to #71.

Regarding the issues above:

I think we can make Tuple(tuple) work. The current behavior of nested tuples indices in NumPy is to treat them as array indices, and is deprecated. At the very least we should either make it just work (give the tuple), or else give an error. Making it give an array index is too risky as people will want to use Tuple(tuple) instead of Tuple(*tuple).

For composition, I think we should just not use __getitem__. It would be convenient, but there's too much risk, both with Tuple and fancy index types. I already know from using ndindex myself that it's easy to accidentally want to treat Tuple as a tuple, and probably the same will be true for IntegerArray and BooleanArray. The only way to safeguard against this is to make any tuple operation give an error on Tuple and ditto for IntegerArray and BooleanArray. We could actually just make __getitem__ work on idx.raw for Tuple, IntegerArray, and BooleanArray, but I think it's best to just force the user to keep track of types for now.

I'm also thinking using __len__ was a mistake. I've definitely had to stop myself from using len(Tuple) instead of len(Tuple.raw), and if I am confused by it, then everyone else will be too. It is currently only implemented on Slice and Integer, but we might want it on the other index types at some point. Not only is there type confusion with len(Tuple) and len(IntegerArray), but we could do better to make it a method anyway, as we could then have an axis keyword argument that lets you specify if you want the length of just the first axis (or any axis), or total the number of indexed elements. I will leave it as it is for now, since I am not actually using len outside of internal isempty implementations, but in the future, we should clean up this API.

Some of the ideas of syntactic sugar from #14 may also be bad ideas. They should be considered carefully. Sugar is nice and can really make code more readable, but it should only be used if it is completely unambiguous, and there is no possibility for the user to try using it to do something it doesn't do.

asmeurer added a commit to asmeurer/ndindex that referenced this issue Aug 3, 2020
asmeurer added a commit to asmeurer/ndindex that referenced this issue Aug 7, 2020
…edError

These only work in NumPy as fancy indices. However, there is too much of a
risk of API confusion in thinking that Tuple() works like Tuple((1, 2, 3))
instead of Tuple(1, 2, 3). So we make this an unconditional error with helpful
error messages. Fancy indices should use either a list or array.

c.f. issue Quansight-Labs#17.
@asmeurer
Copy link
Member Author

asmeurer commented Aug 7, 2020

In #74 I have made Tuple(tuple) give ValueError instead of NotImplementedError, with a hint in the error message to either use Tuple(*args) or use a list for fancy indexing. tuples inside of tuple indices do fancy indexing, but there is too much risk for confusion of Tuple(tuple) rather than the correct Tuple(*tuple), that I think it is best to just disallow it inside of ndindex. We could potentially allow it in ndindex() but not the Tuple constructor if it comes up as something that people are using. I think generally, though, in NumPy you should use either a list or an array for an explicit fancy index. The issue is that even in NumPy, if a tuple of ints or bools is not inside another tuple, it will be treated as a tuple index, not a fancy index, like a[(0, 1, 2)]. So it's a bad practice in general.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants