Adding an example of an array type wrapper without loose of performance. #33420

stakaz · 2019-09-30T09:07:34Z

I found it really handy to know that, in order to wrap an array and keep its performance, you have to use Base.@propagate_inbounds on the getindex and setindex! functions. Since I think this is a typical pattern when wrapping arrays for dispatch, it should be mentioned in the docs.
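A minimal sketch of the kind of wrapper meant here, assuming a type MyArray with a single field a (the names used in the diff) and a hypothetical helper mysum that is not part of the PR:

```julia
# Wrapper around any AbstractArray; `MyArray` and its field `a` follow the diff.
struct MyArray{T,N,P<:AbstractArray{T,N}} <: AbstractArray{T,N}
    a::P
end

Base.size(A::MyArray) = size(A.a)

# Forward indexing to the wrapped array and let a caller's @inbounds
# reach the inner access.
Base.@propagate_inbounds Base.getindex(A::MyArray, i::Int...) = getindex(A.a, i...)
Base.@propagate_inbounds Base.setindex!(A::MyArray, v, i::Int...) = setindex!(A.a, v, i...)

# Hypothetical caller that opts out of bounds checks in a hot loop.
function mysum(A)
    s = zero(eltype(A))
    @inbounds for i in eachindex(A)
        s += A[i]
    end
    return s
end

w = MyArray(collect(1.0:4.0))
w[2] = 10.0
mysum(w)
```

Whether this actually matches the speed of the bare array depends on inlining and the Julia version, so it is worth benchmarking rather than assuming.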

stakaz · 2019-09-30T09:40:55Z

Sorry, I have no idea which whitespace the buildbot/whitespace_linux32 test is complaining about. I have checked once more and cannot find anything wrong in the lines.

KristofferC · 2019-09-30T09:49:56Z

The whitespace CI in the logs is from the previous commit. Not sure why a new one didn't start.

mateuszbaran · 2019-09-30T10:41:13Z

It might also be worth mentioning Base.dataids when talking about wrapper performance.
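For illustration, a small sketch of what opting in to aliasing detection could look like. Wrapped is a hypothetical type; Base.dataids and Base.mightalias are unexported Base functions, so their behaviour may differ between Julia versions:

```julia
# Hypothetical wrapper type used only for this illustration.
struct Wrapped{T,N,P<:AbstractArray{T,N}} <: AbstractArray{T,N}
    a::P
end
Base.size(A::Wrapped) = size(A.a)
Base.getindex(A::Wrapped, i::Int...) = A.a[i...]

parent_data = zeros(3)
w = Wrapped(parent_data)

# Tell Base that the wrapper's memory is the memory of the wrapped array,
# so that aliasing between `w` and `parent_data` can be detected.
Base.dataids(A::Wrapped) = Base.dataids(A.a)

Base.mightalias(w, parent_data)
```

Without the dataids method the fallback identifies the wrapper by its own object id, so Base would normally not notice that w and parent_data share memory.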

mcabbott · 2019-09-30T21:56:45Z

doc/src/manual/interfaces.md

+
+julia> Base.size(A::MyArray) = size(A.a)
+
+julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...) = getindex(A.a, i...)


Perhaps this should encourage passing keywords along, to support named indexing of the A[i=3] kind?

Suggested change

-julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...) = getindex(A.a, i...)
+julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...; kw...) = getindex(A.a, i...; kw...)

Link: JuliaCollections/AxisArraysFuture#1 (comment)

Keyword arguments often cause various performance problems. Just look at this, for example: JuliaArrays/StaticArrays.jl#540. I don't think it should be encouraged in a section dedicated to fast wrappers.

Good point. But is this a concern with unused kw... being passed along, or only when they are actually used? I can’t detect any effect on things I tried.

(And, is there a good explanation of this issue somewhere?)

The problem occurs when you call a function with keyword arguments. Having default keyword arguments is fine, but explicitly passing them, even with their default values, slows things down. That's why in some places keyword arguments are forwarded as normal arguments (as named tuples). There might be some exceptions that the compiler can optimize, but in my experience it's a good rule of thumb.
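A small sketch of the two forwarding styles, with hypothetical function names (inner_kw, outer_kw, inner_nt, outer_nt) chosen only for illustration. It shows the pattern, not a measurement; any speed difference would have to be benchmarked on the Julia version in use:

```julia
# Variant 1: keywords are forwarded as keywords at every level.
inner_kw(x; factor=2, offset=0) = x * factor + offset
outer_kw(x; kw...) = inner_kw(x; kw...)

# Variant 2: the keywords are collected once at the public entry point and
# then passed down as an ordinary positional NamedTuple.
inner_nt(x, opts::NamedTuple) = x * opts.factor + opts.offset
outer_nt(x; factor=2, offset=0) = inner_nt(x, (factor=factor, offset=offset))

outer_kw(3; offset=1)
outer_nt(3; offset=1)
```

Both calls compute the same value; only the way the options travel through the call chain differs.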

(And, is there a good explanation of this issue somewhere?)

I don't know, it seems to generally be hard to find good explanations of such compiler details.

mbauman · 2019-10-01T20:27:03Z

Hm, I'm not the biggest fan of encouraging @propagate_inbounds because it misattributes the error to an internal implementation detail. See, for example, this SO answer: https://stackoverflow.com/questions/38901275/inbounds-propagation-rules-in-julia/38929159#38929159

stakaz · 2019-10-01T20:47:08Z

Note, however, that it leads to problems only after explicitly using @inbounds. I would say that people who use this macro are usually interested in performance and know about the risks.

It simply should not be the case that a wrapped array has a speed penalty.

mbauman · 2019-10-01T20:52:39Z

My point is that I want to encourage folks to use @boundscheck blocks themselves. In the example I cite above, it's the difference between:

julia> module M
           using Random
           struct ShuffledVector{A,T} <: AbstractVector{T}
               data::A
               shuffle::Vector{Int}
           end
           ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
           Base.size(A::ShuffledVector) = size(A.data)
           Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
               A.data[A.shuffle[i]]
           end
       end

julia> s = M.ShuffledVector(1:4);

julia> s[5]
ERROR: BoundsError: attempt to access 4-element Array{Int64,1} at index [5]
Stacktrace:
 [1] getindex at ./array.jl:728 [inlined]
 [2] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[4]:10
 [3] top-level scope at REPL[6]:1

vs.

julia> module M
           using Random
           struct ShuffledVector{A,T} <: AbstractVector{T}
               data::A
               shuffle::Vector{Int}
           end
           ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
           Base.size(A::ShuffledVector) = size(A.data)
           Base.@inline function Base.getindex(A::ShuffledVector, i::Int)
               @boundscheck checkbounds(A, i)
               A.data[A.shuffle[i]]
           end
       end

julia> s = M.ShuffledVector(1:4);

julia> s[5]
ERROR: BoundsError: attempt to access 4-element Main.M.ShuffledVector{UnitRange{Int64},Int64} at index [5]
Stacktrace:
 [1] throw_boundserror(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Tuple{Int64}) at ./abstractarray.jl:538
 [2] checkbounds at ./abstractarray.jl:503 [inlined]
 [3] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[7]:10
 [4] top-level scope at REPL[9]:1

Both implementations will have the same performance.

stakaz · 2019-10-02T09:51:12Z

Yeah, maybe the approach with @boundscheck checkbounds(A, i) is the better one then. As far as I understand, you can still use @inbounds and get the same performance with it, am I right?

mbauman · 2019-10-02T14:25:18Z

Yes, it'll perform exactly the same.

The reason-for-being for @propagate_inbounds is when you want to refactor an indexing implementation to use an inner function. For example, the internal abstract indexing functionality is essentially:

@propagate_inbounds getindex(A::AbstractArray, I...) = _getindex(IndexStyle(A), A, to_indices(A, I)...)
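A minimal sketch of that refactoring pattern outside of Base, assuming a hypothetical wrapper Reversed and a hypothetical inner helper _lookup:

```julia
# Hypothetical wrapper that presents a vector in reverse order.
struct Reversed{T,P<:AbstractVector{T}} <: AbstractVector{T}
    data::P
end
Base.size(A::Reversed) = size(A.data)

# Inner helper: performs the wrapper's own bounds check and the real access.
@inline function _lookup(A::Reversed, i::Int)
    @boundscheck checkbounds(A, i)
    return A.data[length(A.data) + 1 - i]
end

# Outer method only delegates. @propagate_inbounds lets a caller's @inbounds
# pass through this extra layer and reach the @boundscheck block in `_lookup`.
Base.@propagate_inbounds Base.getindex(A::Reversed, i::Int) = _lookup(A, i)

r = Reversed([10, 20, 30])
r[1]
```

Without the annotation on the outer method, a caller's @inbounds would stop at the delegating layer and the check inside _lookup would still run.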

stakaz · 2019-12-16T10:21:57Z

Just corrected the sentence to restart the checks (one of them was stuck for some reason). After thinking a bit about it, I would prefer @propagate_inbounds in this case for the sake of simplicity. Defining your own bounds checks is a good solution, but if you really just want a wrapper around a Base array with the same speed, then @propagate_inbounds looks much more natural to me (it just says that the wrapped array is a Base array).

stakaz · 2019-12-16T11:56:51Z

I don't see what makes the test fail here? Could someone else look into it? It is only the documentation, so I don't see how it can fail...

KristofferC · 2019-12-16T12:12:29Z

I restarted the failing workers.

vchuravy · 2019-12-16T15:57:19Z

My point is that I want to encourage folks to use @boundscheck blocks themselves. In the example I cite above, it's the difference between:

This is missing one facet. I either have to use @propagate_inbounds or I have to use @inbounds myself. Whether or not I use @boundscheck is an orthogonal matter. I would advocate for the variant below, since it checks the bounds for the wrapper implementation and also checks the bounds for the inner implementation, but both bounds checks are disabled when an outer caller sets @inbounds.

module M
    using Random
    struct ShuffledVector{A,T} <: AbstractVector{T}
        data::A
        shuffle::Vector{Int}
    end
    ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
    Base.size(A::ShuffledVector) = size(A.data)
    Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
        @boundscheck checkbounds(A, i)
        A.data[A.shuffle[i]]
    end
end

julia> s = M.ShuffledVector(1:4);

no_inbounds(A, i) = A[i]
inbounds(A, i) = @inbounds A[i]

julia> @code_typed no_inbounds(zeros(3), 1)
CodeInfo(
1 ─ %1 = Base.arrayref(true, A, i)::Float64
└──      return %1
) => Float64

julia> @code_typed inbounds(zeros(3), 1)
CodeInfo(
1 ─ %1 = Base.arrayref(false, A, i)::Float64
└──      return %1
) => Float64

Note that arrayref has a flag that indicates whether it needs to boundscheck.

Now for the wrapper without @propagate_inbounds (using Cthulhu to get the benefit of DCE).

julia> descend(inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─      goto #2
2 ─ %2 = Base.getfield(A, :data)::Array{Float64,1}
│   %3 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %4 = Base.arrayref(true, %3, i)::Int64
│   %5 = Base.arrayref(true, %2, %4)::Float64
└──      goto #3
3 ─      return %5
)

julia> descend(no_inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─ %1  = Core.tuple(i)::Tuple{Int64}
│   %2  = Base.getfield(A, :data)::Array{Float64,1}
│   %3  = Base.arraysize(%2, 1)::Int64
│   %4  = Base.slt_int(%3, 0)::Bool
│   %5  = Base.ifelse(%4, 0, %3)::Int64
│   %6  = Base.sle_int(1, i)::Bool
│   %7  = Base.sle_int(i, %5)::Bool
│   %8  = Base.and_int(%6, %7)::Bool
└──       goto #3 if not %8
2 ─       goto #4
3 ─       invoke Base.throw_boundserror(_2::Main.M.ShuffledVector{Array{Float64,1},Float64}, %1::Tuple{Int64})::Union{}
└──       $(Expr(:unreachable))::Union{}
4 ┄ %13 = Base.getfield(A, :data)::Array{Float64,1}
│   %14 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %15 = Base.arrayref(true, %14, i)::Int64
│   %16 = Base.arrayref(true, %13, %15)::Float64
└──       goto #5
5 ─       return %16
)

Note how both cases use Base.arrayref(true), i.e. the @inbounds was not propagated, and we only eliminated one level of bounds checking.

Now my variant with @propagate_inbounds

julia> descend(inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─      goto #2
2 ─ %2 = Base.getfield(A, :data)::Array{Float64,1}
│   %3 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %4 = Base.arrayref(false, %3, i)::Int64
│   %5 = Base.arrayref(false, %2, %4)::Float64
└──      goto #3
3 ─      return %5
)

julia> descend(no_inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─ %1  = Core.tuple(i)::Tuple{Int64}
│   %2  = Base.getfield(A, :data)::Array{Float64,1}
│   %3  = Base.arraysize(%2, 1)::Int64
│   %4  = Base.slt_int(%3, 0)::Bool
│   %5  = Base.ifelse(%4, 0, %3)::Int64
│   %6  = Base.sle_int(1, i)::Bool
│   %7  = Base.sle_int(i, %5)::Bool
│   %8  = Base.and_int(%6, %7)::Bool
└──       goto #3 if not %8
2 ─       goto #4
3 ─       invoke Base.throw_boundserror(_2::Main.M.ShuffledVector{Array{Float64,1},Float64}, %1::Tuple{Int64})::Union{}
└──       $(Expr(:unreachable))::Union{}
4 ┄ %13 = Base.getfield(A, :data)::Array{Float64,1}
│   %14 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %15 = Base.arrayref(true, %14, i)::Int64
│   %16 = Base.arrayref(true, %13, %15)::Float64
└──       goto #5
5 ─       return %16
)

This has led to confusion before; see JuliaArrays/StaticArrays.jl#564.
Either the getindex method needs to internally use @inbounds, which makes it harder to verify the implementation of the wrapper,
or the getindex method needs to use @propagate_inbounds so that @inbounds gets passed to the inner array access appropriately.
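The two options can be put side by side in a short sketch. The types InnerInbounds and Propagating are hypothetical stand-ins for ShuffledVector that take an explicit permutation instead of calling randperm, so that the result is deterministic:

```julia
# Two hypothetical copies of the wrapper, one per option.
struct InnerInbounds{T} <: AbstractVector{T}
    data::Vector{T}
    shuffle::Vector{Int}
end
struct Propagating{T} <: AbstractVector{T}
    data::Vector{T}
    shuffle::Vector{Int}
end
Base.size(A::Union{InnerInbounds,Propagating}) = size(A.data)

# Option 1: check the wrapper's bounds, then switch off the inner checks
# locally. Only safe if `shuffle` is a valid permutation of the indices of `data`.
@inline function Base.getindex(A::InnerInbounds, i::Int)
    @boundscheck checkbounds(A, i)
    return @inbounds A.data[A.shuffle[i]]
end

# Option 2: check the wrapper's bounds and let the caller's @inbounds decide
# whether the inner accesses are checked as well.
Base.@propagate_inbounds function Base.getindex(A::Propagating, i::Int)
    @boundscheck checkbounds(A, i)
    return A.data[A.shuffle[i]]
end

a = InnerInbounds([10, 20, 30], [3, 1, 2])
b = Propagating([10, 20, 30], [3, 1, 2])
collect(a) == collect(b)
```

In both variants an out-of-range index on a checked call is reported against the wrapper rather than against an internal field. They differ in who takes responsibility for the inner accesses: the wrapper's author in option 1, the caller who writes @inbounds in option 2.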

mbauman · 2019-12-16T19:06:35Z

Either the getindex method needs to internally use @inbounds, which makes it harder to verify the implementation of the wrapper,
or the getindex method needs to use @propagate_inbounds so that @inbounds gets passed to the inner array access appropriately.

This is exactly what it boils down to. I'm not sure there's a categorically "better" choice here — I think the arguments for and against both are highly opinion-based. Here are my opinions:

Personally, I disagree that option one "makes it harder to verify the implementation of the wrapper." It's @inbounds! It's exactly how @inbounds works! Just do your debugging without it (or with --check-bounds=yes). Yes, I forgot it in my comment above, but again, I say that's more of a feature than a bug. Let's lean conservative here.

Further, I think of @inbounds as a necessary evil. It's a spooky action at a distance. There is nothing else in Julia (well, besides Cassette 🙈) that allows you to reach into a library and muck around with its implementation. @propagate_inbounds allows a caller into your library to muck around in a second-order (or potentially nth-order) dependency. I'd rather limit this sort of behavior. That's why I don't want to encourage or export Base.@propagate_inbounds.

vtjnash · 2021-04-13T19:25:05Z

We already have a section on the Array interface, and I agree with mbauman that this doesn't seem beneficial to show

Stanislav Kazmin added 2 commits September 30, 2019 10:58

Adding an example of an array type wrapper without loose of perfomrance.

eee4e9a

Merge branch 'master' of github.com:stakaz/julia

cf91a70

stakaz changed the title ~~Adding an example of an array type wrapper without loose of perfomrance.~~ Adding an example of an array type wrapper without loose of performance. Sep 30, 2019

corrected the example

42e9afe

stakaz force-pushed the master branch from 89aa89e to 42e9afe Compare September 30, 2019 09:30

removed trailing whitespaces

86434bb

mcabbott reviewed Sep 30, 2019

View reviewed changes

StefanKarpinski assigned mbauman Oct 1, 2019

correctd the example text

6e6b1c0

vtjnash closed this Apr 13, 2021
