Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename getindex and setindex! to operator form #42525

Closed
timholy opened this issue Oct 6, 2021 · 25 comments
Closed

Rename getindex and setindex! to operator form #42525

timholy opened this issue Oct 6, 2021 · 25 comments

Comments

@timholy
Copy link
Member

timholy commented Oct 6, 2021

A discussion on slack suggested that getindex is confusing and should perhaps be called getatindex (which is not quite as readable). But if we could write this in operator form

julia> Base.:(==)
== (generic function with 165 methods)

julia> Base.:([])
ERROR: syntax: invalid syntax "Base.[]" around REPL[2]:1
Stacktrace:
 [1] top-level scope
   @ REPL[2]:1

(it's good that it throws an error now, that means it's available) then we could add a custom array type and define

Base.:([])(A::MySpecialArray, i::Int) = ...

It would seem this doesn't even have to wait for Julia 2.0; that could be when we delete the older names.

@MasonProtter
Copy link
Contributor

But like... Why?

@mkitti
Copy link
Contributor

mkitti commented Oct 6, 2021

But like... Why?

The name getindex is counterintuitive. Coming from other language conventions, one might expect this to return an index rather than the value at the index given as an argument. In other circumstances, such as getindex(::RefValue), there is no index argument at all.

In the near term, it may improve the legibility of code if this was more clearly connected with the square bracket syntax using an alias.

Could we turn [] into an operator as in C++ without being enormously disruptive?

@MasonProtter
Copy link
Contributor

I think getindex(v, i) is maybe counterintuitive if you think too hard about it for like one or two encounters and then would be fine. On the other hand, code like

Base.:([])(v::MyType, i...) = #...

is at risk of being confusing and opaque every time you look at it. This idea is cute, but imo deprecating getindex in favour of it is a cure far worse than the disease.

@mkitti
Copy link
Contributor

mkitti commented Oct 6, 2021

In thinking about the potential implications, one that arises is that [] alone is currently interpreted as Base.vect() which results in Any[]. It may be best to contain this within abaremodule.

@mbauman
Copy link
Member

mbauman commented Oct 7, 2021

Yes, I fear [] is too overloaded for this to be sensible. One advantage to the "operator" name, though, is that it can be lexically overridden. For example, A' lowers to var"'"(A), and var"'" is simply bound to Base.adjoint. This allows you to lexically shadow the behavior of ' within a given scope without messing up Base.adjoint.

In contrast, A[] lowers directly to Base.getindex. This is what broke 2-based indexing.

@mkitti
Copy link
Contributor

mkitti commented Oct 7, 2021

The var approach is interesting. Essentially, we can create an alias Base.var"[]" for Base.getindex in the following way.

julia> @eval Base var"[]" = getindex
getindex (generic function with 207 methods)

julia> struct MyConstantArray{C} end

julia> Base.var"[]"(::MyConstantArray{C}, i::Int) where C = C

julia> c = MyConstantArray{5}()
MyConstantArray{5}()

julia> c[3]
5

julia> c[10]
5

@haampie
Copy link
Contributor

haampie commented Oct 7, 2021

.:([]) looks more like ascii art than a reference to a function to me 😬. even C++'s weird operator[] is more humanly decodable than this.

julia> macro operator_str(str)
         esc(Symbol(:operator, str))
       end
@operator_str (macro with 1 method)

julia> operator"[]"
ERROR: UndefVarError: operator[] not defined
Stacktrace:
 [1] top-level scope
   @ REPL[2]:1

julia> operator"[]" = getindex
getindex (generic function with 208 methods)

julia> operator"[]"(rand(3), 1)
0.10244386495710267

How do you distinguish between setindex! and getindex here though? In C++ that's through operator[](...) const, so I suppose operator"[] const" = getindex

@JeffBezanson
Copy link
Member

In thinking about the potential implications, one that arises is that [] alone is currently interpreted as Base.vect()

This is a key point to me. The syntax [] already means something else, so reusing it for something slightly different is hardly a model of clarity. The syntax a[b] is also not really "operator syntax", so calling it [] to me has the dual problems that (1) it's already a strange non-standard way to name something, and (2) it doesn't correspond directly to how it's used. If the indexing syntax were a [] b it would be ok I guess.

getatindex vs. getindex seems a bit pedantic to me. Sure, you can argue that it's more accurate, but I bet nobody would guess it either. But, there are some precedents for getindex getting an index instead of an element, e.g. in java. Python uses __getitem__. We have getkey, so maybe getvalue, to pair with that? I'm willing to consider a renaming like that for 2.0.

@MasonProtter
Copy link
Contributor

In contrast, A[] lowers directly to Base.getindex. This is what broke 2-based indexing.

Couldn’t we just do this lexically as well? Lower it to getindex instead of Base.getindex. I don’t see why the operator form would change that.

@mbauman
Copy link
Member

mbauman commented Oct 7, 2021

We used to do exactly that, but it changed in a grand reckoning on lowering pre-1.0 (breaking TwoBasedIndexing.jl). The informal rule is that we shouldn't lower to something you didn't write. The rationale here is well-motivated, IMO. You wrote A[]. There's no reason to think that writing getindex = 2 in the same scope would break that.

So using a name that resembled var"[]" or something like it could possibly allow following that rule — because it looks like it could actually have an effect.

@JeffBezanson
Copy link
Member

The not-yet-implemented plan there was to lower to __getindex__, so you can still override it lexically but user variables won't be captured.

@mkitti
Copy link
Contributor

mkitti commented Oct 8, 2021

I dug and found the inception of the name getindex. It came as the counterpart to setindex!.
#907 (comment)

Consider that we also have getproperty, and getfield, which take the property name and field name as the second argument. Being more verbose, these could be getbypropertyname or getbyfieldname just as getindex could be getbyindex. A method named getvalue could apply to an element that is a member of an indexed array, a property, or a field, thus getvalue seems ambiguous. In this sense, there seems to be a consistent habit of dropping prepositions and being brief for the method names getindex, getproperty, and getfield.

getkey seems out of place out of the getX names because its meaning is actually read literally. It's not getbykey, which is just get.

@mkitti
Copy link
Contributor

mkitti commented Oct 8, 2021

TL;DR An alias for getindex that uses square brackets increases the comprehension of code. Renaming is for another time.

While the title of this issue suggests renaming, the present question is really if there should be an alias. We are all agreed that this is not the time for a renaming and deprecation to occur. It either should have been done very early on (0.2) or later in a 2.0.

An alias that uses the characters [ and ] would help make the connection between the squares brackets and getindex discoverable and concrete. It would also be helpful in situations where getindex is non-sensical but [] is used.

For example, applied to Ref, getindex seems quite awkward since no index argument is present. Yes, one could argue that these are 0-dimensional arrays, but we actually do have an index for that, an instance of CartesianIndex{0}.

julia/base/refvalue.jl

Lines 56 to 57 in d633726

getindex(b::RefValue) = b.x
setindex!(b::RefValue, x) = (b.x = x; b)

Another example brought up in Slack, is its use in getindex(::Type{T}, vals...) where T for syntax such as Int8[-1, 3, -5]. Here we are exploiting the syntax and are not using an index at all.

function getindex(::Type{T}, vals...) where T

In each of the above examples, perhaps it would have been intuitive to use a method name that refers to square brackets rather than indices. If examples such as this already occur in Base, surely they will also be common place in user packages.

By adding an alias that has square brackets in the name, we could increase the human comprehension of code, especially when the method is being used in a manner that focuses on the syntax rather than the indexing operation.

@haampie
Copy link
Contributor

haampie commented Oct 8, 2021

What alias would you propose for setindex! then for symmetry?

@mkitti
Copy link
Contributor

mkitti commented Oct 8, 2021

What alias would you propose for setindex! then for symmetry?

Since we don't like calling this an operator, how about var"get[]" and var"set[]!" as aliases for getindex and setindex!. It also differentiates the method from cat, vcat, hcat, hvcat, etc.

@JeffBezanson
Copy link
Member

I really don't think throwing around var"get[]" is going to help with anything.

@mkitti
Copy link
Contributor

mkitti commented Oct 8, 2021

Abandoning any attempt to relate the name to the syntax or "operator" as per the original post and focusing on just clarifying what the operation does, how about getelement?

I chose "element" because we have precedent within Julia and ancestor languages.

How many elements does an array have? What is the element type of array?

julia> zeros(2)
2-element Vector{Float64}:
 0.0
 0.0

julia> eltype(zeros(2))
Float64

https://docs.julialang.org/en/v1/manual/arrays/#man-array-literals

Arrays can also be directly constructed with square braces; the syntax [A, B, C, ...] creates a one dimensional array (i.e., a vector) containing the comma-separated arguments as its elements

https://groups.csail.mit.edu/mac/ftpdir/scheme-7.4/doc-html/scheme_8.html#SEC72

The objects in the car fields of successive pairs of a list are the elements of the list. For example, a two-element list is a pair whose car is the first element and whose cdr is a pair whose car is the second elementand whose cdr is the empty list. The length of a list is the number of elements, which is the same as the number of pairs

https://groups.csail.mit.edu/mac/ftpdir/scheme-7.4/doc-html/scheme_9.html#SEC82

Vectors are heterogenous structures whose elements are indexed by exact non-negative integers

https://docs.oracle.com/javase/tutorial/java/nutsandbolts/arrays.html

Each item in an array is called an element, and each element is accessed by its numerical index

@quinnj
Copy link
Member

quinnj commented Oct 8, 2021

That seems like a pretty good name for it.

@BioTurboNick
Copy link
Contributor

BioTurboNick commented Oct 9, 2021

I dug and found the inception of the name getindex. It came as the counterpart to setindex!. #907 (comment)

Consider that we also have getproperty, and getfield, which take the property name and field name as the second argument. Being more verbose, these could be getbypropertyname or getbyfieldname just as getindex could be getbyindex. A method named getvalue could apply to an element that is a member of an indexed array, a property, or a field, thus getvalue seems ambiguous. In this sense, there seems to be a consistent habit of dropping prepositions and being brief for the method names getindex, getproperty, and getfield.

getkey seems out of place out of the getX names because its meaning is actually read literally. It's not getbykey, which is just get.

You could argue that the value returned by getproperty and getfield is the property/field.

That is, property name:property as field name:field and key:key... and as you note, index:element.

Seen that way, getelement fits in nicely.

@Moelf
Copy link
Contributor

Moelf commented Oct 9, 2021

while getelement is not bad, the mere fact it will be breaking change (at some point) is a huge cost. Besides, the function names (IMO) just need to be reasonable, people don't have to "get it perfectly" the first time they see it. In fact, getindex emphasizes the fact that it's indexable (not keyed by name like property or field), which is a nice association.

@mkitti
Copy link
Contributor

mkitti commented Oct 9, 2021

while getelement is not bad, the mere fact it will be breaking change (at some point) is a huge cost. Besides, the function names (IMO) just need to be reasonable, people don't have to "get it perfectly" the first time they see it. In fact, getindex emphasizes the fact that it's indexable (not keyed by name like property or field), which is a nice association.

I am not advocating for deprecation of getindex. I, however, do think an alias could be helpful and provide clarity, especially in circumstances where the element being retrieved is not indexable. I also do not follow the "not keyed" point since we use getindex to define methods to retrieve values from a dictionary using a key (e.g. getindex(h::Dict{K, V}, key) where {K, V})

@Moelf
Copy link
Contributor

Moelf commented Oct 9, 2021

an alias could be helpful

this is even more confusing than deprecation because now there appear to be 4 things. I think core devs time is far better spent on other things. This discussion (the "what names is slightly more accurate on first read" part, not the [] operator part), unfortunately, seems to be an example that falls under Law of triviality.

@mkitti
Copy link
Contributor

mkitti commented Oct 9, 2021

Could we add some kind of facility to deal with brackets in general? bracket"[]" for this issue or bracket"⟨⟩" as in #35288?

@bramtayl
Copy link
Contributor

How about atindex for getindex and toindex! for setindex!?

@mkitti
Copy link
Contributor

mkitti commented Oct 10, 2021

Let's just close this bike shed.

  • The issue is more complicated than defining a new operator.
  • A getvalue/setvalue! PR or getelement/setelement! PR can stand on its own and does not really address the original issue as stated by Tim.
  • Since brackets are not operators, we should decide how to address brackets before revisiting this.

@timholy timholy closed this as completed Oct 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants