-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Deprecate multi-value non-scalar indexed assignment #24368
Conversation
Assuming this is the only reasonable path forward to I changed the broadcast syntax since you started working on this. Do you need any help finishing this before feature-freeze? |
Yes, I am unable to work on this myself in time to hit the feature freeze. If you'd be able to resurrect and take this over I'd be very grateful. |
I'm a little skeptical that I'll have time either; #25050 has my own priority, given that I know we want that but there's some chance we'll look at this and decide we don't want it. But time permitting I'll see what I can do. |
Just curious about the status here, and if this is something we still want? |
It is. Unfortunately, @mbauman is on vacation until January – we may need to allow a few changes after the alpha, unless someone else gets this done before then. |
I started looking into this on |
I'll take a look at some point at the broadcasting, but FYI we may change the API yet again. If we merge some combination of #24992 and |
Thanks for the rebase, @andyferris! I'm back and this is my top priority right now. I'm getting back up to speed with all the changes in the past month and it's nice to be able to pick up on your branch. |
9d8c047
to
71d683e
Compare
(@mbauman Sorry I didn't finish it - Christmas happened - but at least it's a start). |
base/sparse/sparsematrix.jl
Outdated
setindex!(A::SparseMatrixCSC, x::Number, ::Colon) = setindex!(A, x, 1:length(A)) | ||
setindex!(A::SparseMatrixCSC, x::Number, ::Colon, ::Colon) = setindex!(A, x, 1:size(A, 1), 1:size(A,2)) | ||
setindex!(A::SparseMatrixCSC, x::Number, ::Colon, j::Union{Integer, AbstractVector}) = setindex!(A, x, 1:size(A, 1), j) | ||
setindex!(A::SparseMatrixCSC, x::Number, i::Union{Integer, AbstractVector}, ::Colon) = setindex!(A, x, i, 1:size(A, 2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add these ::Number
qualifications?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise we don't hit the deprecated method. Eventually I think I'll need to add a deprecated method specifically for sparse matrices, but this got me a bit closer to getting things passing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cheers, thanks :). Exciting motion!
Bump. Adding to milestone. |
4912bc2
to
b9e2e37
Compare
[ci skip]
…m with this new setup
b0fde3f
to
b729d65
Compare
Does this mean that
would become illegal, but that
would still be allowed? So also
would no longer be allowed? That strikes me a weird - since I also worry about performance implications - AFAIK, this kind of |
…luenonscalarindexedassignment * origin/master: (28 commits) fix an optimizer bug in `invoke` with non-constant functions (#26301) lower top-level statements in such a way that the front-end knows (#26304) Make sure Sockets page has h1 header (#26315) fix doctests, and make them less prone to errors (#26275) FIx intro to manual chapter on types (#26312) Add a missing "that" (#26313) fix docstring for code_llvm (#26266) Remove the examples/ folder (#26153) download cert.pem from deps-getall, fixes #24903 (#25344) Slight update to doc string for at-enum to refer to instances (#26208) performance tweak in reverse(::String) (#26300) remove references to `TCPSocket` in Base docstrings (#26298) Deprecate adding integers to CartesianIndex (#26284) Deprecate conj(::Any), add real(::Missing) and imag(::Missing) (#26288) fix #26267, regression in `names` with `imported=false` (#26293) fix #25857, `typeinfo` problem in showing arrays with abstract tuple types (#26289) Add adjoint(::Missing) method (#25502) Use lowered form in logging macro (#26291) deprecate bin, oct, dec, hex, and base in favor of `string` and keyword args (#25804) deprecate `spawn(cmd)` to `run(cmd, wait=false)` (#26130) ...
Yes, that's correct. Right now in the indexing expression This PR is paving the way towards a future where we can treat the RHS of the Now, I did also consider going the other way. That is, we could demand that the syntax
I've very carefully transformed all the performance specializations on
|
IMHO, that would be the sane solution. I think it's completely unnatural, and surprising to the user, if Think about this:
would be legal, even though it results in
but
would be illegal, even though it would result in
if still allowed. To interpret assignment and comparison so differently, that's quite weird, isn't it? I don't think this would make Julia appear elegant. But if we really, really have to deprecate
Ah, sorry, didn't see that - thanks, that's really great! One worry, though - did you benchmark this multi-threaded, on a modern CPU with a decent number of threads/cores (> 20/10)? In my recent tests, views are still (current state of v0.7) very often heap allocated, and frequent global heap locks kill multi-threading performance to an amazing degree. I know Julia is supposed to able to stack allocate views in the future (and maybe already now, sometimes?), but we don't seem to be there yet, and may not be for a while - though I'd be more than happy to learn otherwise. |
|
Sorry, yes, that's what I meant to say. So it seems completely unnatural to me that And, like i mentioned, it would make assignment and comparison semantically different. Except for exotic types with some special behavior, the natural expectation is that a successful |
It doesn't assign an object that is a single value, it assigns an object that is a collection of values, but a single object nevertheless. |
@oschulz I think you've made a pretty good case. Something else smells a bit odd here --- we're saying it's inconsistent for Repeating a scalar is inherently broadcasting and so should require |
Uh, sorry, now I'm confused - maybe I misunderstood: I thought this PR was about deprecating |
Right, I fully agree. We deprecated this kind of implicit broadcasting in so many other places (like deprecating |
Of course, the corner case is |
But 1 is not a range, so isn't this completely different from |
Here's my proposal - let the rules/semantics for implementations of
This is, IMHO, simple and elegant. It should also clearly define RHS shape requirements, etc. It would mean, of course, that |
That's tempting enough to try it the other way. I definitely don't think both of these behaviors can stand, but I think we really only need to deprecate one or the other. |
Thanks for considering this, @mbauman - I realize you put a lot of work into this already. I'd be perfectly happy with deprecating the one-to-many assignment (and keeping the many-to-many). Elegance of the language aside, maybe this would also break less user code? It's hard to judge, of course, but I'd guess things like Hm, here's an idea: Why don't we use |
Cool. I agree with @oschulz with the logic that the LHS should be A[i] # always scalar getindex
A.[Inds] # returns a collection with the same shape / keys as `Inds`
A[i] = x # always scalar setindex!
A.[Inds] = B # After this, `A.[Inds] == B`. `B` always must have the same shape / keys as `Inds`
A.[Inds] .= x # Broadcast the (possibly scalar, possibly not) RHS over the indices `Inds` To me the end goal is to replace (so we can apply this to dictionaries) non-scalar getindex and setindex with some syntactically distinct "getindices" and "setindices!" which I've written with a |
@andyferris, I like your long term plan, that would be elegant. I think it's important though that we do this at point where it can be done without any temporary (heap) allocation of views at all. But if we deprecate the one-to-many setindex right now, and keep the many-to-many for now, we have time, right? @mbauman, do you think One question: What would happen to other functions like |
Removing this "dual" behavior is most awkward and painful, of course, in places that depend upon it. Notably, this is I would absolutely love to propagate this change through to concatenation and deprecate For now, I'm trying to get by without the Regarding the other array tools that allow either an integer index or a collection of indices, I don't think they're nearly as problematic. This one is just particularly horrible because I also don't think that we'll ever deprecate the array's |
Thanks for the explanations, Matt! |
Thanks for catching the fact that I did this backwards! As I work my way through the opposite set of deprecations, I'm more and more convinced that this is the right direction. |
Thanks for doing this, Matt! |
It certainly can't, since |
Replaced by #26347 |
The current `setindex!` function (the `A[i] = b` syntax) does three very different operations: * Given an index `i` that refers to only one location (scalar indexing), `A[i] = b` modifies `A` in that location to hold `b`. * Given an index `I` that refers to more than one location (non-scalar indexing), `A[I] = b` can mean one of two things: * If `b` is an AbstractArray (multi-value), assign the values it contains to those locations `I` in `A`. That is, essentially, `[A[i] = bᵢ for (i, bᵢ) in zip(I, b)]`. * If `b` is not an AbstractArray (scalar-value), then broadcast its value to every location selected by `I` in `A`. These two different behaviors in the non-scalar indexing case basically make using this function in a generic way impossible. If you want to _always_ set the value `b` to many indices of `A`, you _cannot_ use this syntax because `b` might happen to be an array in some cases, radically changing the meaning of the expression. You need to manually iterate over the indices, using scalar setindex methods. But we now also have the new `broadcast!` syntax, `A[I] .= B`, which uses the usual broadcasting semantics to determine how `B` should fill into the values of `A`. This PR deprecates the implicit broadcasting of scalar values in non-scalar indexing in favor of an explicit `.=` broadcast syntax, leaving multi-value non-scalar indexing intact. This is the _exact opposite_ of PR #24368.
The current `setindex!` function (the `A[i] = b` syntax) does three very different operations: * Given an index `i` that refers to only one location (scalar indexing), `A[i] = b` modifies `A` in that location to hold `b`. * Given an index `I` that refers to more than one location (non-scalar indexing), `A[I] = b` can mean one of two things: * If `b` is an AbstractArray (multi-value), assign the values it contains to those locations `I` in `A`. That is, essentially, `[A[i] = bᵢ for (i, bᵢ) in zip(I, b)]`. * If `b` is not an AbstractArray (scalar-value), then broadcast its value to every location selected by `I` in `A`. These two different behaviors in the non-scalar indexing case basically make using this function in a generic way impossible. If you want to _always_ set the value `b` to many indices of `A`, you _cannot_ use this syntax because `b` might happen to be an array in some cases, radically changing the meaning of the expression. You need to manually iterate over the indices, using scalar setindex methods. But we now also have the new `broadcast!` syntax, `A[I] .= B`, which uses the usual broadcasting semantics to determine how `B` should fill into the values of `A`. This PR deprecates the implicit broadcasting of scalar values in non-scalar indexing in favor of an explicit `.=` broadcast syntax, leaving multi-value non-scalar indexing intact. This is the _exact opposite_ of PR #24368.
Updated Feb 9, 2018: This PR is now ready for review. Executive summary:
The current
setindex!
function (theA[i] = b
syntax) does three very different operations:i
that refers to only one location (scalar indexing),A[i] = b
modifiesA
in that location to holdb
.I
that refers to more than one location (non-scalar indexing),A[I] = b
can mean one of two things:b
is an AbstractArray (multi-value), assign the values it contains to those locationsI
inA
. That is, essentially,[A[i] = bᵢ for (i, bᵢ) in zip(I, b)]
.b
is not an AbstractArray (scalar-value), then broadcast its value to every location selected byI
inA
.These two different behaviors in the non-scalar indexing case basically make using this function in a generic way impossible. If you want to always set the value
b
to many indices ofA
, you cannot use this syntax becauseb
might happen to be an array in some cases, radically changing the meaning of the expression. You need to manually iterate over the indices, using scalar setindex methods. But we now also have the newbroadcast!
syntax,A[I] .= B
, which uses the usual broadcasting semantics to determine howB
should fill into the values ofA
.This PR deprecates the special "multi-value non-scalar indexing" behavior in favor of using
.=
broadcasted assignments. It's important to note, however, that these two behaviors are not equivalent, and in some cases the deprecation is quite convoluted. There are three reasons for this:setindex!
is more permissive in its shape checks than broadcasting is (when bothI
andb
contain the same number of elements).AbstractArray
s will expand out to multiple indices, but tuples and other broadcasting collections will, too.This was not a simple change to make; simply following the deprecation message won't always work. The places where I had the biggest difficulty adjusting to the new behavior was in cases where we still do this kind of "arrays are special" concatenating behavior:
cat
,mapslices
, etc. The other place was inOffsetArrays
— multi-value non-scalar setindex doesn't care about matching indices but broadcasting does.Old message: This is still a ways from completion, but I wanted to post this WIP so folks could start getting a sense of what this deprecation would mean. This would address the 1.0 portion of #24086.
One interesting aspect of this is that
x[I] .= …
is much more permissive than the currentsetindex!
method is… and some of our internals are relying on those tougher bounds. So we'll have to carefully vet the changes here. One example from this current branch: