convert can lead to unexpected aliasing #12441

IainNZ · 2015-08-03T15:06:18Z

Consider:

function foo(x)
  y = convert(Vector{Float64},x)
  y[1] = 10.0
  return y
end
julia> x = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> foo(x)
3-element Array{Float64,1}:
 10.0
  2.0
  3.0

julia> x
3-element Array{Int64,1}:
 1
 2
 3

julia> x = [1.,2.,3.]
3-element Array{Float64,1}:
 1.0
 2.0
 3.0

julia> foo(x)
3-element Array{Float64,1}:
 10.0
  2.0
  3.0

julia> x
3-element Array{Float64,1}:
 10.0
  2.0
  3.0

The input is modified in one case, and not the other, which is in the realms of "scary". One (The?) solution is to always do a copy, even if the type matches, for mutable types.

johnmyleswhite · 2015-08-03T15:09:04Z

Potentially relevant sub-issue: should a convert call imply making relevant recursive calls to convert for all fields? For example, is convert(Nullable{Vector{Int}}, Nullable{Vector{Float64}} equivalent to convert(Nullable{Vector{Int}}, Nullable{Vector{Int}}? This is relevant to the question of assuming that convert(Nullable{T}, x::Nullable{T}) = x that comes up in #12424.

simonster · 2015-08-03T15:27:21Z

We use convert everywhere (x::T = ... declarations, constructors, setfield!, setindex!) so we really don't want to make convert always return a copy. I think the solution is to extend copy to accept a type as the first argument. My proposal from #9575 (comment) was:

copy{T}(::Type{T}, x::T) = copy(x)
copy{T}(::Type{T}, x) = convert(T, x)

The caveat of the second method is that for some composite types, convert might only copy some of the fields, e.g. for a SparseMatrixCSC it copies nzval but not colptr and rowval.

simonster · 2015-08-03T15:44:45Z

On second thought, convert for SparseMatrixCSC should probably copy colptr and rowval, since otherwise you can probably make Julia segfault by modifying a converted SparseMatrixCSC while the original is still around...

JeffBezanson · 2015-08-03T20:40:15Z

See #8759 for a discussion of our policy on aliasing.

StefanKarpinski · 2015-08-03T20:49:05Z

The copy thing is necessary sometimes, so there is an actionable change here. Otherwise you can't express "convert this thing and make sure it's a copy" without sometimes making an extra copy.

nalimilan · 2016-02-29T13:19:14Z

Reopening per @StefanKarpinski's comment. I looked for this issue (again) after reading this code in GLM.jl:

    wts = T <: Float64 ? copy(wts) : convert(typeof(y), wts)
    off = T <: Float64 ? copy(offset) : convert(Vector{T}, offset)

I can see two solutions:

add copy(T, x) (see @simonster's suggestion above)
add a keyword argument convert(T, x; copy=FALSE)

Note that the former already exists in Base, it's called copy_oftype (#9575). I think I'd prefer the latter, one argument being that convert will not always copy all fields as copy would (so guarantees are weaker).

JeffBezanson · 2016-02-29T19:59:56Z

I think the right way to write this is copy!(similar(A, Float64), A). If that's disliked, then copy(Vector{Float64}, A) would be ok. To me, adding the functionality to convert really violates orthogonality, since copy(x) could also then be spelled convert(Any, x, copy=true).

Sacha0 · 2016-07-20T20:24:35Z

Whether to extend copy or settle on the copy!(similar(... idiom appears to remain an open question.

On the one hand, forming the copy-of-type operation via composition (and thereby avoiding interface expansion) is elegant.

On the other hand, relying on a mechanism still somewhat shrouded in ambiguity (similar) for such a common, clear operation is unsettling. Additionally, copy(Vector{Float64}, A) is about as clear as can be, and may enable optimizations that copy!(similar(A, Float64), A) might make difficult.

Thoughts? Best!

andyferris · 2017-01-05T04:26:19Z

I think the right way to write this is copy!(similar(A, Float64), A)

That doesn't work very well at all for immutable (container) types, like those in StaticArrays.

+1 for copy(T, x). Seems both clear and flexible.

JeffBezanson · 2017-01-05T04:35:25Z

Why would you need to copy an immutable container?

andyferris · 2017-01-05T06:05:37Z

Ideally, I would like methods written by other people for AbstractArray to also work with a StaticArray as much as possible. We already have e.g. copy(::Float64) defined, simply to support generic programming, even though the copy is a no-op, so this is the same idea.

(EDIT: e.g. a copy might be taken because you don't want the data to be changed by some other reference, rather than with the intention of mutating it yourself).

JeffBezanson · 2017-01-05T06:10:41Z

Ah yes, good old "defensive copies". I think it would be better to check for mutability in the relatively rare cases where that's wanted, instead of requiring all immutable collections to implement copy. It certainly needs to be possible to check for mutability, for example to pick in-place vs. out-of-place algorithms.

andyferris · 2017-01-05T06:12:39Z

Also, in static arrays it might make sense to use copy(similar_type(A,Float64), A) - this would convert the element type and also avoid any unexpected aliasing that could result from convert(similar_type(A,Float64), A) and would work for both immutable SArray and mutable MArray.

It certainly needs to be possible to check for mutability, for example to pick in-place vs. out-of-place algorithms.

A trait for array mutability would be quite helpful!

andyferris · 2017-01-05T06:50:53Z

In the interest of "generic programming" (whatever the precise definition of that is) I know of these possibilities in this space:

Convert something as fast as possible with convert(T,x) (sometimes aliasing)
Use copy(convert(T,x)) to avoid aliasing while converting (sometimes makes two copies)
Use copy!(similar(x,T),x) instead of 2. (only works for mutable objects)

copy(T, x) seems to be the only thing that would never make two copies, works for both mutables and immutables, and provides certain guarantees about aliasing.

andreasnoack · 2017-01-05T15:23:54Z

We use LinAlg.copy_oftype in many places to avoid copying twice. I guess it is somewhat similar to the copy(similar_type(A,Float64), A) you mentioned.

andyferris · 2017-01-05T22:50:57Z

I guess the idea might be to make something like copy_oftype work for arbitrary types (not just arrays with different element types).

nalimilan · 2017-03-19T18:53:57Z

Another approach to fixing this would be to introduce a @copy or @noalias annotation similar to how @inbounds works. That would allow changing the behavior from inside convert methods without having to duplicate them (once for copying, once for non-copying).

Often, most of the code needs to be common to both methods, except that to return a copy you need to call copy on a few objects which are used to fill the result's fields. It's not always as simple as copying the returned object: for example, in CategoricalArrays, depending on which type parameters are identical between the original object and the result, some parts will need an explicit call to copy and others will automatically be copied by the internal calls to convert; copying the result would sometimes generate an unnecessary intermediate copy for some fields (i.e. exactly the problem we're trying to address).

We could also default to copying, for safety, and consider @alias or @nocopy as an optimization, just like @inbounds.

EDIT: another advantage of this approach is that it could be used more generally by other functions, and in particular by constructors (for which we have no clear policy regarding aliasing AFAIK).

IainNZ · 2018-06-02T22:58:43Z

Tentative bump on this, with 1.0 in mind. I'm not sure if there is any more action possible than the documentation change that was done at the time, but would seem to be last call on any possible "fix" (or actually past it).
If there isn't anything to do, suggest we just close again.

Sacha0 · 2018-06-03T01:38:01Z

In 1.0, replacing y = convert(Vector{Float64}, x) with y = Vector{Float64}(x) should do the trick (Ref. #16029), as the constructor will return a fresh object. Perhaps close once Vector{Float64}(x) is implemented? Best!

StefanKarpinski · 2018-06-04T19:27:14Z

There's also now Base.unalias and Base.mightalias:

help?> Base.unalias
  Base.unalias(dest, A)

  Return either A or a copy of A in a rough effort to prevent modifications to dest from
  affecting the returned object. No guarantees are provided.

  Custom arrays that wrap or use fields containing arrays that might alias against other
  external objects should provide a Base.dataids implementation.

  This function must return an object of exactly the same type as A for performance and type
  stability. Mutable custom arrays for which copy(A) is not typeof(A) should provide a
  Base.unaliascopy implementation.

  See also Base.mightalias.

help?> Base.mightalias
  Base.mightalias(A::AbstractArray, B::AbstractArray)

  Perform a conservative test to check if arrays A and B might share the same memory.

  By default, this simply checks if either of the arrays reference the same memory regions, as
  identified by their Base.dataids.

JeffBezanson closed this as completed Aug 3, 2015

Sacha0 mentioned this issue Nov 28, 2015

Infinite recursion of \ on sparse matices JuliaLang/LinearAlgebra.jl#283

Closed

nalimilan reopened this Feb 29, 2016

nalimilan mentioned this issue Mar 21, 2016

implement copy(::Void) = nothing #15546

Closed

nalimilan mentioned this issue Jun 4, 2016

remove UTF-16 and UTF-32 stuff #16590

Merged

Sacha0 mentioned this issue Jul 20, 2016

Augment convert's documentation with a note on aliasing #17519

Merged

tkelman mentioned this issue Jan 2, 2017

Improvement of documentation of [A, B, C, ...] syntax #19807

Closed

nalimilan mentioned this issue Feb 27, 2017

convert: for arrays, copy or view? #20826

Closed

Sacha0 mentioned this issue Oct 15, 2017

similar and traits #18161

Open

yuyichao mentioned this issue Oct 20, 2017

Possible type related bug in Dict #24233

Closed

vtjnash closed this as completed Mar 30, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

convert can lead to unexpected aliasing #12441

convert can lead to unexpected aliasing #12441

IainNZ commented Aug 3, 2015

johnmyleswhite commented Aug 3, 2015

simonster commented Aug 3, 2015

simonster commented Aug 3, 2015

JeffBezanson commented Aug 3, 2015

StefanKarpinski commented Aug 3, 2015

nalimilan commented Feb 29, 2016

JeffBezanson commented Feb 29, 2016

Sacha0 commented Jul 20, 2016 •

edited

Loading

andyferris commented Jan 5, 2017 •

edited

Loading

JeffBezanson commented Jan 5, 2017

andyferris commented Jan 5, 2017 •

edited

Loading

JeffBezanson commented Jan 5, 2017

andyferris commented Jan 5, 2017

andyferris commented Jan 5, 2017

andreasnoack commented Jan 5, 2017

andyferris commented Jan 5, 2017

nalimilan commented Mar 19, 2017 •

edited

Loading

IainNZ commented Jun 2, 2018

Sacha0 commented Jun 3, 2018

StefanKarpinski commented Jun 4, 2018

convert can lead to unexpected aliasing #12441

convert can lead to unexpected aliasing #12441

Comments

IainNZ commented Aug 3, 2015

johnmyleswhite commented Aug 3, 2015

simonster commented Aug 3, 2015

simonster commented Aug 3, 2015

JeffBezanson commented Aug 3, 2015

StefanKarpinski commented Aug 3, 2015

nalimilan commented Feb 29, 2016

JeffBezanson commented Feb 29, 2016

Sacha0 commented Jul 20, 2016 • edited Loading

andyferris commented Jan 5, 2017 • edited Loading

JeffBezanson commented Jan 5, 2017

andyferris commented Jan 5, 2017 • edited Loading

JeffBezanson commented Jan 5, 2017

andyferris commented Jan 5, 2017

andyferris commented Jan 5, 2017

andreasnoack commented Jan 5, 2017

andyferris commented Jan 5, 2017

nalimilan commented Mar 19, 2017 • edited Loading

IainNZ commented Jun 2, 2018

Sacha0 commented Jun 3, 2018

StefanKarpinski commented Jun 4, 2018

Sacha0 commented Jul 20, 2016 •

edited

Loading

andyferris commented Jan 5, 2017 •

edited

Loading

andyferris commented Jan 5, 2017 •

edited

Loading

nalimilan commented Mar 19, 2017 •

edited

Loading