
Deprecate ones? #24444

Closed
timholy opened this issue Nov 2, 2017 · 46 comments
Labels
domain:linear algebra, needs decision, stdlib

Comments

@timholy
Sponsor Member

timholy commented Nov 2, 2017

Now that we distinguish one and oneunit, ones(T, sz) seems like a misnomer. Deprecate in favor of fill(oneunit(T), sz)? If so, should we ditch zeros too?
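
A minimal sketch of the one/oneunit distinction, using the Dates standard library as an example of a type where the two differ (illustrative only; not part of the original report):

using Dates

one(Day(1))        # == 1      -- the multiplicative identity, a plain Int with no unit
oneunit(Day(1))    # == Day(1) -- "one unit" of the same type
fill(oneunit(Day(1)), 3)   # what ones(T, sz) effectively produces for such a type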

@timholy added the needs decision label Nov 2, 2017
@timholy added this to the 1.0 milestone Nov 2, 2017
@fredrikekre
Member

xref #11557 (comment) and below, and also @Sacha0's PR #24389

@Sacha0
Member

Sacha0 commented Nov 2, 2017

I have work in progress to this effect that I hope to post in the next day or two :). Best!

@GunnarFarneback
Contributor

To me fill(oneunit(T), sz) looks like a non-trivial loss in readability compared to ones(T, sz).

@Sacha0
Member

Sacha0 commented Nov 2, 2017

Note that you rarely need to write something as verbose as fill(oneunit(T), sz), as usually a literal or something similarly compact suffices in place of oneunit(T). Shorter incantations may also become possible with future changes to array constructors. Additionally, once you've transitioned to fill, you grow fond of it, I assure you :). Best!
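
For instance, the compact literal forms alluded to above (illustrative examples, not from the original comment):

fill(1.0, 4)          # instead of fill(oneunit(Float64), 4) or ones(4)
fill(1, 4)            # instead of ones(Int, 4)
fill(1.0 + 0im, 4)    # instead of ones(Complex{Float64}, 4)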

@JeffBezanson
Sponsor Member

We could just pick whether ones uses one or oneunit. ones and zeros should be considered convenience functions, which is ok as long as they have clear meanings in terms of more general functions.

@StefanKarpinski added the status:triage label Nov 2, 2017
@Sacha0
Member

Sacha0 commented Nov 2, 2017

Plan per triage: Revisit next week, specifically considering moving ones and zeros to the MATLAB compatibility layer, in favor of fill in base. Best!

@andyferris
Member

ones and oneunits would distinguish the two options.

Not sure how I feel about moving ones and zeros to the MATLAB compatibility layer. Just because MathWorks invented it doesn't mean it's not a good convenience API that we can be proud of. Just briefly: what were the reasons behind this thought? (Sorry, I have to say that triage seems much more efficient, but significantly less transparent when the reasoning isn't given.)

@timholy
Sponsor Member Author

timholy commented Nov 5, 2017

It's not a change I'm thrilled about either, but I raised it because it's a logical inconsistency. I think it's fair to say that ones(T, sz) implies fill(one(T), sz) but really what it does under the hood is fill(oneunit(T), sz).

One option would be to rename it to oneunits. An alternative would be to do a horrific swap: one -> algebraic_one and oneunit->one. This is breaking and not possible to do via deprecation, which is why I say "horrific."

@andyferris
Member

Yes, I would say adding oneunits and changing ones to give fill(one(T), ...) would be the obvious fix for that, no?
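
A sketch of that proposal (hypothetical definitions with placeholder names; this was never adopted):

ones_proposed(T::Type, dims...)     = fill(one(T), dims...)       # multiplicative identity
oneunits_proposed(T::Type, dims...) = fill(oneunit(T), dims...)   # "one unit" of T, i.e. what ones does today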

@timholy
Sponsor Member Author

timholy commented Nov 5, 2017

I'd be fine with that. Out of curiosity, what are the uses of fill(one(T), sz)? Do we even need ones at all?

@andyferris
Member

Haha :) I was going to ask - what do you use fill(oneunit(T), sz) for? (E.g. what would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)

I think fill(one(T), sz) is used for much the same reason as zeros(T, sz), which is to initialize an array for a custom reduction-type operation. An array z = zeros(T, sz) is ready to have its elements added to with z[i] += x, whereas o = fill(one(T), sz) is ready to have its elements multiplied with o[i] *= x. For example, I'm thinking of situations where the array elements might represent (relative) probabilities. In these two cases, I'm looking for the identity of the + or * operator respectively (and not additive generators).
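
A minimal sketch of those two initialization patterns (xs is a hypothetical input, added here for illustration):

T = Float64
xs = rand(T, 4, 10)

z = fill(zero(T), 4)   # additive identity: ready for z[i] += x
o = fill(one(T), 4)    # multiplicative identity: ready for o[i] *= x
for j in 1:10, i in 1:4
    z[i] += xs[i, j]
    o[i] *= xs[i, j]
end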

@fredrikekre
Member

See also #23544

@ChrisRackauckas
Member

Haha :) I was going to ask - what do you use fill(oneunit(T), sz) for? (E.g. what would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)

The dependent variables for an ODE. It would be weird if it just randomly chopped off the units when you asked for an array of type T?
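
A hedged sketch of that use case, assuming the Unitful.jl package (not mentioned in the thread) provides the units:

using Unitful

u0 = fill(oneunit(1.0u"m"), 3)   # three elements of 1.0 m -- the units are preserved
one(1.0u"m")                     # == 1.0, a unitless Float64: what a one-based ones would give instead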

@Sacha0
Member

Sacha0 commented Nov 5, 2017

The fundamental reason to prefer fill is that a tight set of powerful, orthogonal, composable tools serves better than a larger collection of ad hoc, limited, and overlapping tools such as ones/zeros/trues/falses. The discussion above highlights this point: While fill accommodates all use cases above unambiguously and (in most real use) concisely, the ones/zeros/trues/falses approach requires a new function for each use case.

A couple relevant examples from rewrites in base:

complex.(ones(size(Ac, 1)), ones(size(Ac, 1)))

becomes

fill(1.0 + 1.0im, size(Ac, 1))

and

2ones(Int, 2, 2, 2)

becomes

fill(2, (2, 2, 2)) # or fill(2, 2, 2, 2) if you prefer

Please note that these fill incantations are simpler, more readable, more compact, and more efficient than their non-fill counterparts, and base/ and test/ are rife with examples like these. As with all change, transitioning to fill requires some initial mental adjustment. But post-adjustment you find you have more power and elegance at your fingertips :). Best!

@carlobaldassi
Member

@Sacha0: trues/falses are not directly replaceable by fill, but need to use fill! with a BitArray initializer. They are also not included in the ambiguity between one and oneunit. Therefore I don't think that they fit in this discussion at all.

As for ones, I'm generally opposed to deprecating it anyway, I don't see the benefit. The argument that some expression can be written more effectively with fill is not very compelling in my opinion, as all these examples use ones as intermediate steps to do something else; but what about when you actually want an array of ones? Then having to use fill is longer, less obvious and just more annoying. I like @JeffBezanson's proposal best. If parts of base or test can be rewritten more effectively using fill instead of ones there's nothing preventing that right now.

@TotalVerb
Contributor

@carlobaldassi While that is true, a quick GitHub search reveals that almost all uses of ones should really be using fill to avoid intermediate allocations...

@stevengj
Member

stevengj commented Nov 6, 2017

Using fill for these cases makes sense to me. Note that we will want a new fill(x, A) = fill!(copymutable(A), x) method to replace the corresponding methods of ones etc.
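
A sketch of such a method (hypothetical name and definition, not in Base; Base.copymutable is an internal helper):

fill_like(x, A::AbstractArray) = fill!(Base.copymutable(A), x)

A = rand(3, 3)
fill_like(0.0, A)   # a new 3×3 array with A's shape, filled with 0.0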

@carlobaldassi
Member

@TotalVerb From a quick skim of the first 10 pages of that search, I wouldn't say that "almost all" is a fair assessment. A "significant fraction", at best. There are lots and lots of legitimate use cases for ones there, perhaps even the majority (and even if they were just, say, 20%, I think my argument still stands).

(I also actually have reservations about the readability argument, I think it's questionable to claim that e.g. fill(2, 2, 2, 2) is more readable than 2 * ones(Int, 2, 2, 2), or that, say, fill(k * 1.0, 3) would be more readable than k * ones(3); I'm not at all convinced that it's merely a matter of habit. This is a minor point though.)

@Sacha0
Member

Sacha0 commented Nov 6, 2017

trues/falses are not directly replaceable by fill, but need to use fill! with a BitArray initializer. They are also not included in the ambiguity between one and oneunit. Therefore I don't think that they fit in this discussion at all.

Indeed, trues and falses are not directly replaceable by fill :). Rather, trues and falses are additional instances of the general problem described above / related to #11557 (and that the recent direction of array constructors will hopefully tackle). Other examples include the existence of e.g. bones, bzeros, brand, brandn, and beye in BandedMatrices.jl and equivalents with a d prefix in DistributedArrays.jl.

As for ones, I'm generally opposed to deprecating it anyway, I don't see the benefit. The argument that some expression can be written more effectively with fill is not very compelling in my opinion, as all these examples use ones as intermediate steps to do something else

Having just rewritten a few hundred uses of ones in base, I can affirm @TotalVerb's statement

a quick GitHub search reveals that almost all uses of ones should really be using fill to avoid intermediate allocations...

(Edit: Though I would say roughly half rather than almost all, and appropriate rewrites may be something other than fill.) Moreover, that rewrite experience has taught me...

but what about when you actually want an array of ones? Then having to use fill is longer, less obvious and just more annoying.

... that on the other hand, fill is frequently shorter and simpler in that case: The requested ones are frequently not Float64 (instead e.g. ones(Int, n...) and ones(Complex{Float64}, n...)), in which case fill is shorter and simpler by admitting a literal (e.g. fill(1, n...) and fill(1.0 + 0im, n...)). In measurable terms, the branch in which I have been rewriting ones calls in base is ~5% shorter by character count from ones -> fill rewrites. Best!

@Sacha0
Member

Sacha0 commented Nov 9, 2017

To get an objective sense of how ones appears in the wild, I collected all ones calls appearing in the first ten pages of a GitHub search for ones in Julia code, rewrote each such call as appropriate and classified the corresponding change (see this gist), and then reduced the classification data to the following summary:

The analysis included 156 ones calls. Of those calls,

  • 84 calls (~54%) were ad hoc fills. (For example, ones(100)/sqrt(100)*7 simplifies to fill(7/sqrt(100), 100) or, better yet, fill(.7, 100). My favorite was kron(0.997, ones(1, J*J*s)) -> fill(0.997, 1, J*J*s).)

  • 3 calls (~2%) were ad hoc broadcasts. (For example, A - ones(n,n) simplifies to A .- 1.)

  • 5 calls (~3%) were ad hoc vector literals. (For example, ones(1) simplifies to [1.].)

  • 1 call (~0.5%) was semantically a junk matrix construction. (Though relatively uncommon in the wild, this pattern is quite common in test/ because we do not have a concise convenience constructor for uninitialized Arrays, as in e.g. @test_throws DimensionMismatch BLAS.trsv(...,Vector{elty}(n+1)) versus @test_throws DimensionMismatch BLAS.trsv(...,ones(elty,n+1)).)

The remaining calls were semantically reasonable as ones, though often ones sees use merely because it's short, rather than because ones specifically are necessary. Of those remaining calls,

  • 13 calls (~8%) were slightly shorter as fill. (For example, ones(Int, n, n) -> fill(1, n, n) or ones(Float64, n) -> fill(1., n).)

  • 50 calls (~32%) were slightly longer as fill. (For example, ones(n, n) -> fill(1., n, n).)

Overall, in the wild ~60% of ones calls are better written some other way, ~8% are reasonably semantically ones and slightly shorter as fill, and ~32% are reasonably semantically ones and slightly longer as fill.

An additional observation:

I encountered only one instance of a ones call accepting an array argument, and it wasn't clear whether the enclosing snippet was real code. So the ones methods accepting an array argument see little if any use in the wild.

@gabrielgellner
Contributor

Really interesting discussion ... went from being on the against side to the for side ... Also as another language precedent, R uses rep and matrix in a manner that is equivalent to the fill (just corresponding to the 1d and 2d cases) and you get used to it very quickly -- even though I came from a zeros/ones world.

@andyferris
Member

Wow, thank you @Sacha0 for putting in that effort!

The question naturally arises: "what about zeros?" I'm guessing there will be significantly more usage, and a few more categories of use (including things like "I simply don't trust uninitialised arrays" or "I don't know how to use the Array constructors").

For whatever reason (I guess it's the symmetry of one and zero), I'm somewhat attracted towards replacing either both ones and zeros with fill, or neither.

@StefanKarpinski
Sponsor Member

The thing with zeros is that you would seem to be in one of these situations:

  1. You need to overwrite most of the zeros – in this case you're better off using a comprehension;
  2. You don't need to replace most of the zeros – in this case you're better off using a sparse matrix;
  3. You actually need an all-zero matrix – in this case you're better off using 0I.

There's really no use case where allocating a dense matrix of zeros is actually a good idea.
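
Sketches of the three alternatives (written with the stdlib module names of later Julia versions; f is a hypothetical per-entry computation):

using SparseArrays, LinearAlgebra

n = 4
f(i, j) = i + j
A1 = [f(i, j) for i in 1:n, j in 1:n]   # 1. comprehension: compute the entries you need directly
A2 = spzeros(n, n)                      # 2. sparse matrix: most entries stay zero
A3 = 0I                                 # 3. UniformScaling: an "all-zero matrix" without allocating one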

@vtjnash
Sponsor Member

vtjnash commented Nov 10, 2017

That’s perhaps true of linear algebra. It’s not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it’s not worth the performance impact to represent them compactly.

@StefanKarpinski
Sponsor Member

Fair enough – sometimes you don't care about density and the simplicity is worth it.

@StefanKarpinski
Sponsor Member

Triage: resolved that we keep only the completely non-generic methods, i.e. zeros(dims...) and ones(dims...), and maybe zeros(dims) and ones(dims) as well.

@Sacha0
Member

Sacha0 commented Nov 18, 2017

#24595's OP detailed this issue's broader context, and now a followup in #24595 specifically addresses convenience constructors ones, zeros, and fill in depth. Reading the former is valuable in appreciating the latter. Best!

@carlobaldassi
Member

Well, saving at least the Float64 case is better than nothing.
I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.

It should also be noted that zeros does not have the "problem" of allowing the 3 * ones(n) anti-pattern. In fact, I don't really see why ones and zeros should go together, except in the broad sense of being convenience constructors. There is no real 'symmetry' between those two.

A couple of additional comments on the statistical analysis, since it seems to be the basis of the following discussions and the writeup of #24595. First, ten pages are not really enough for fine-grained conclusions about what's going on in the wild; they can give a rough idea at best. Some files came straight from MATLAB, for example, as is clear from their name/style. Secondly, as I suspected, even that analysis shows that roughly half of the uses of ones in there were "legitimate". Thirdly, looking at code like this doesn't tell you whether writing 3 * ones(...) is really an anti-pattern that creates performance issues, or whether it's a piece of code with no performance implications at all (where the writer may have decided that it's just more readable written that way -- which, in that case, I strongly feel is nobody else's business).

Related to the last point, and I think more importantly: what you see in a GitHub search will never account for what's going on in the REPL of people doing exploratory/preliminary work in Julia. That is exactly where convenience functions come in most handy, and taking them away for no discernible reason is correspondingly most annoying. My point being: having a consistent set of orthogonal primitives that allows writing generic and efficient code is a great goal, and the effort to get there is truly commendable; it's just that not all code is supposed to be beautiful, generic, composable library code. Just my two cents.

@Sacha0
Member

Sacha0 commented Nov 20, 2017

Regarding

I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.

which refers to

It’s not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it’s not worth the performance impact to represent them compactly.

Note that fill serves as well or better in that case: fill(0, shape...) versus zeros(Int, shape...).

Concerning your other points, #24595 may be worth reading :). Best!

@carlobaldassi
Member

I simply find that zeros(Int, 10, 10) is more readable/explicit than fill(0, 10, 10), and zeros(T, k) is better than fill(zero(T), k). Why can't we just have both? I don't buy the argument that zeros suffers the same ambiguity problem as ones.

Concerning your other points, #24595 may be worth reading

I had read it. (I even linked it.)

@Sacha0
Member

Sacha0 commented Nov 20, 2017

I had read it. (I even linked it.)

Having read #24595 in full and given it due consideration, you are then aware that #24595: (1) concerns a much broader issue of which convenience constructors are only a part; and (2) considers far more than just the statistical analysis posted above and the points you focus on here.

I appreciate that you feel strongly about ones and zeros; your sentiment has come through loud and clear :). As such, chances are our bandwidth would be better spent pushing other fronts forward than continuing this conversation in its present form. Best!

@StefanKarpinski added the domain:linear algebra and stdlib labels Nov 20, 2017
@ChrisRackauckas
Member

ChrisRackauckas commented Nov 26, 2017

Is there a generic fill incoming along with the zeros change? zeros had a very legitimate use for generic programming since it's much safer than similar; all of DiffEq, and more recently Optim and NLsolve, moved from allocating with similar to zeros, since knowing everything is allocated with zeros in it stops a lot of bugs. However, now it seems there's going to be no method for:

zeros(X)

anymore, other than:

fill!(similar(X), 0)

because the current fill only builds Arrays and thus doesn't type-match like similar or zeros does. I know that some people misused zeros to allocate when they shouldn't have, but allocating with zeros is a very reasonable thing to do in many cases. I hope that a fill(0,X) shorthand gets added to fill this void.
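
A sketch of the similar-plus-fill! workaround as a helper (hypothetical name; the requested fill(0, X) method did not exist):

zero_like(X::AbstractArray) = fill!(similar(X), 0)

X = rand(Float32, 5)
zero_like(X)   # a Vector{Float32} of zeros, matching X's container and element type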

@Sacha0
Member

Sacha0 commented Nov 26, 2017

Much thanks for the thoughtful post Chris! :) As an interim shorthand replacement, does zero(X) do the trick?

Note that such use cases are precisely where the ambiguity in zeros and ones can be problematic: Does zeros(X) yield an object with eltype(X) and filled with eltype(X)'s additive identity (i.e. fill!(similar(X), 0)), or instead an object filled with a multiplicative zero for eltype(X) (possibly not of eltype(X))? (For expansion, see #24595.)

The fill(0, X) concept sees a bit of discussion in #11557, and I agree that might be a useful generalization of fill. Thanks and best!

@timholy
Sponsor Member Author

timholy commented Nov 26, 2017

The other problem is that arrays with unconventional indices might want to be created with something like zeros(inds...) (because the index type determines the array type). But for a 1-d case, is X "the array you want to be similar to" or "the indices for the desired array"? (After all, AbstractUnitRange <: AbstractArray.) Concretely, does zeros(3:5) mean fill!(similar(3:5), 0) or fill!(OffsetArray(Vector{Float64}(3), 3:5), 0)?
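
A sketch of the two readings, assuming the OffsetArrays.jl package and current constructor syntax (Vector{Float64}(undef, 3) rather than the pre-0.7 Vector{Float64}(3)):

using OffsetArrays

a = fill!(similar(3:5), 0)                                  # "an array similar to the range 3:5"
b = fill!(OffsetArray(Vector{Float64}(undef, 3), 3:5), 0)   # "a Float64 array with indices 3:5"

first(axes(a, 1)), first(axes(b, 1))   # (1, 3): same length, different meaning of the indices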

@Sacha0
Member

Sacha0 commented Nov 26, 2017

Linking #24656, which contains additional discussion of {ones|zeros }(A::AbstractArray, ...) convenience constructors. Best!

@ChrisRackauckas
Member

I am surprised it's considered odd to use zeros to make cache variables, according to #24656. I would think that, if zeros were reduced to close to zero overhead, almost any case where people are using similar should instead use zeros, since that tends to fix quite a few bugs. I think we should encourage more people to do this, since similar can be quite unsafe, and not having a function for it -- instead pulling together fill! + similar -- makes it less obvious that that is what people should be doing. Here's a comment about this popping up in Optim.jl:

JuliaNLSolvers/NLsolve.jl#89 (comment)

However, I do agree with @timholy that it is non-obvious how it should be interpreted. Let me point you to a really non-simple example in DiffEq.

https://github.com/JuliaDiffEq/MultiScaleArrays.jl

MultiScaleArrays.jl creates abstract arrays that are recursive graph-like structures, which can be used in the diffeq solvers (and I think it may be compatible with Optim.jl and NLsolve.jl now?). It's a nice convenience for biological models, among other things. When we throw it into the ODE solver there's a question: what should we make the cache arrays? In some cases it matters that the user gets back the array type they wanted, since it will show up in their ODE function f(t,u,du) and they will want to treat it accordingly. However, in other cases it only shows up as something that is internally broadcast against. So there are two different types of cache variables.

To handle this, take a look at the cache for one of the algorithms:

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/caches/low_order_rk_caches.jl#L224-L234

Here, rate_prototype = similar(u, first(u)/t, indices(u)) is similar but with a potentially different eltype for units. But notice that there are two separate forms here: similar(u) vs similar(u, indices(u)). I have interpreted those to mean "match the type and the shape" vs "match the shape and the eltype, but it doesn't need to be the same type". So for an AbstractMultiScaleArray, the first will create another AbstractMultiScaleArray while the other, for speed since it isn't seen by the user, will just create an Array of the appropriate size. These are then extended to similar(u,T) and similar(u,T,indices(u)).

Maybe that's just punning off of what already exists, but I think this is an important distinction. When doing generic programming, you have two separate kinds of caches: user-facing caches, whose types you want to match the user's expectations, and internal caches which are only used by the algorithm, where you want as much speed as possible.

Notice that these are using similar because I was too lazy to come up with a zeros version of it. I actually have a separate place that can zero out some of these arrays, because if the user only sets du in their f(t,u,du) derivative calculation, they tend to implicitly think "what I don't set means zero", which is only true when the array was allocated with zeros, so I try to pre-allocate using zeros as much as possible (the same issue comes up in NLsolve.jl for this).

Hopefully that explanation wasn't too confusing to follow. In all of my cases I can just switch to similar followed by fill!, but I know some packages won't and that will be a source of bugs.
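
A minimal sketch of the two cache styles described above (u stands in for the user's state array; axes is the post-0.7 name for indices):

u = rand(4)

user_cache     = similar(u)            # match container type and shape (user-facing)
internal_cache = similar(u, axes(u))   # match shape/eltype; a custom array type may return a plain Array here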

@timholy
Sponsor Member Author

timholy commented Nov 27, 2017

Interesting. You're right that similar(A, inds) might create a different type than similar(A), but in general I've always thought of it as likely creating the same type but with different indices. For example, if you needed a 1-dimensional cache for a columnwise operation on a 2-d object, I would use similar(A, first(inds)). (Of course it is a different type because the dimensionality is a type parameter, but it might be the same abstract container type.) You could also use it to create a 5x5 cache of a small tile, etc.

Overall this seems to be a challenging problem. It's a little late in the game, but should we introduce same? It could have the same arguments as similar, but the contract would require it to return the same abstract container type.
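
A hypothetical sketch of the proposed same (never added to Base); a real version would have to enforce the stronger contract rather than merely forwarding to similar:

same(A::AbstractArray, args...) = similar(A, args...)   # placeholder: the contract would require the result
                                                        # to be the same abstract container type as A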

@andyferris
Member

I could support a one-argument form of same, but even this is tricky - note that same(a) couldn't return the same array type if a doesn't support setindex!, because same and similar are only useful if you are going to write to the array afterwards. We could make this an error for immutable a, but as an interface for AbstractArray this seems unnecessary (and maybe unhelpful) for making correct, generic code.

Similarly, we can't assume every AbstractArray can support different eltypes or indices. To me, having a two- or three-argument form of same would just introduce run-time errors in a bunch of places while giving people a false sense of security that their generic code will work well for any AbstractArray input, when it does not.

But for a 1-d case, is X "the array you want to be similar to" or "the indices for the desired array"?

This is another reason I am in favour of keys returning a container with identical indices and values, and then making this a requirement of similar (unless you provide one (some) integer(s) in which case Base.OneTo (CartesianRange) is assumed).

@Sacha0
Member

Sacha0 commented Nov 28, 2017

This discussion is veering towards #18161, and perhaps should continue there :).

@JeffBezanson added the status:triage label Dec 31, 2017
@JeffBezanson
Sponsor Member

Most recent triage was leaning towards just keeping ones.

@JeffBezanson removed the status:triage label Jan 4, 2018
@JeffBezanson removed this from the 1.0 milestone Jan 4, 2018
@damiendr
Contributor

Does it hurt to keep them? I think it helps people who come from numpy feel at home in Julia.

@ViralBShah
Member

Closing since we are keeping ones and zeros.
