Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding an example of an array type wrapper without loose of performance. #33420

Closed
wants to merge 5 commits into from

Conversation

stakaz
Copy link

@stakaz stakaz commented Sep 30, 2019

I found it really handy to know that in order to wrap an array and keep the performance you have to use base.@propagate_inbounds on getindex and setindex! functions. As I think this is a typical option used for dispatching it should be mentioned in the docs.

@stakaz stakaz changed the title Adding an example of an array type wrapper without loose of perfomrance. Adding an example of an array type wrapper without loose of performance. Sep 30, 2019
@stakaz
Copy link
Author

stakaz commented Sep 30, 2019

Sorry, I have no idea what white spaces are meant by the test buildbot/whitespace_linux32. I have checked once more and cannot find anything bad in the lines.

@KristofferC
Copy link
Sponsor Member

The whitespace CI in the logs is from the previous commit. Not sure why a new one didn't start.

@mateuszbaran
Copy link
Contributor

It might also be worth mentioning Base.dataids when talking about wrapper performance.


julia> Base.size(A::MyArray) = size(A.a)

julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...) = getindex(A.a, i...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should encourage the passing on of keywords, for named A[i=3] type indexing?

Suggested change
julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...) = getindex(A.a, i...)
julia> Base.@propagate_inbounds Base.getindex(A::MyArray, i...; kw...) = getindex(A.a, i...; kw...)

Link: JuliaCollections/AxisArraysFuture#1 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keyword arguments often cause various performance problems. Just look at this for example: JuliaArrays/StaticArrays.jl#540 . I don't this it should be encouraged in a section dedicated to fast wrappers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. But is this a concern with unused kw... being passed along, or only when they are actually used? I can’t detect any effect on things I tried.

(And, is there a good explanation of this issue somewhere?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem occurs when you call a function with keyword arguments. It's OK to have default keyword arguments though but even passing these default kwargs slows things down. That's why in some places keyword arguments are forwarded as normal arguments (as named tuples). There might be some exceptions that can be optimized by the compiler but from my experience it's a good rule of thumb.

(And, is there a good explanation of this issue somewhere?)

I don't know, it seems to generally be hard to find good explanations of such compiler details.

@mbauman
Copy link
Sponsor Member

mbauman commented Oct 1, 2019

Hm, I'm not the biggest fan of encouraging @propagate_inbounds because it misattributes the error to an internal implementation detail. See, for example, this SO answer: https://stackoverflow.com/questions/38901275/inbounds-propagation-rules-in-julia/38929159#38929159

@stakaz
Copy link
Author

stakaz commented Oct 1, 2019

However note, that it leads to problems only after explicitly using @inbounds: I would say that people who use this macro usually are interested in performance and know about the risks.

It simple just not should be the case that a wrapped array has speed drawbacks.

@mbauman
Copy link
Sponsor Member

mbauman commented Oct 1, 2019

My point is that I want to encourage folks to use @boundscheck blocks themselves. In the example I cite above, it's the difference between:

julia> module M
           using Random
           struct ShuffledVector{A,T} <: AbstractVector{T}
               data::A
               shuffle::Vector{Int}
           end
           ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
           Base.size(A::ShuffledVector) = size(A.data)
           Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
               A.data[A.shuffle[i]]
           end
       end

julia> s = M.ShuffledVector(1:4);

julia> s[5]
ERROR: BoundsError: attempt to access 4-element Array{Int64,1} at index [5]
Stacktrace:
 [1] getindex at ./array.jl:728 [inlined]
 [2] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[4]:10
 [3] top-level scope at REPL[6]:1

vs.

julia> module M
           using Random
           struct ShuffledVector{A,T} <: AbstractVector{T}
               data::A
               shuffle::Vector{Int}
           end
           ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
           Base.size(A::ShuffledVector) = size(A.data)
           Base.@inline function Base.getindex(A::ShuffledVector, i::Int)
               @boundscheck checkbounds(A, i)
               A.data[A.shuffle[i]]
           end
       end

julia> s = M.ShuffledVector(1:4);

julia> s[5]
ERROR: BoundsError: attempt to access 4-element Main.M.ShuffledVector{UnitRange{Int64},Int64} at index [5]
Stacktrace:
 [1] throw_boundserror(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Tuple{Int64}) at ./abstractarray.jl:538
 [2] checkbounds at ./abstractarray.jl:503 [inlined]
 [3] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[7]:10
 [4] top-level scope at REPL[9]:1

Both implementations will have the same performance.

@stakaz
Copy link
Author

stakaz commented Oct 2, 2019

Yeah, maybe the approach with @boundscheck checkbounds(A, i) is a good point then. As far as I can understand you will be able to use @inbounds and gain the same performance with this, am I right?

@mbauman
Copy link
Sponsor Member

mbauman commented Oct 2, 2019

Yes, it'll perform exactly the same.

The reason-for-being for @propagate_inbounds is when you want to refactor an indexing implementation to use an inner function. For example, the internal abstract indexing functionality is essentially:

@propagate_inbounds getindex(A::AbstractArray, I...) = _getindex(IndexStyle(A), A, to_indices(A, I)...)

@stakaz
Copy link
Author

stakaz commented Dec 16, 2019

Just corrected the sentence to restart the checks (one of them was stuck for some reason). After thinking a bit about it, I would prefer the @propagate_inbounds in this case for the sake of simplicity. Defining own boundschecks is a good solution but if you really just want a wrapper around a Base array and the same speed than @propagate_inbounds looks much more natural to me (like just tell that the wrapped array is a Base array).

@stakaz
Copy link
Author

stakaz commented Dec 16, 2019

I don't see what makes the test fail here? Could someone else look into it? It is only the documentation, so I don't see how it can fail...

@KristofferC
Copy link
Sponsor Member

I restarted the failing workers.

@vchuravy
Copy link
Member

My point is that I want to encourage folks to use @BoundsCheck blocks themselves. In the example I cite above, it's the difference between:

This is missing one facet. I either have to use @propagate_inbounds or I have to @inbounds myself. Whether or not I use @boundscheck is an orthogonal matter. I would advocate for the variant below, since it checks the bounds for the wrapper implementation and it also checks the bounds
for the inner implementation, but both inbounds checks are disabled when an outer caller sets @inbounds.

module M
           using Random
           struct ShuffledVector{A,T} <: AbstractVector{T}
               data::A
               shuffle::Vector{Int}
           end
           ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
           Base.size(A::ShuffledVector) = size(A.data)
           Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
               @boundscheck checkbounds(A, i)
               A.data[A.shuffle[i]]
           end
       end

julia> s = M.ShuffledVector(1:4);
no_inbounds(A, i) = A[i]
inbounds(A, i) = @inbounds A[i]

julia> @code_typed no_inbounds(zeros(3), 1)
CodeInfo(
1 ─ %1 = Base.arrayref(true, A, i)::Float64
└──      return %1
) => Float64

julia> @code_typed inbounds(zeros(3), 1)
CodeInfo(
1 ─ %1 = Base.arrayref(false, A, i)::Float64
└──      return %1
) => Float64

Note that arrayref has a flag that indicates whether it needs to boundscheck.

Now for the wrapper without @propagate_inbounds (using Cthulhu to get the benefit of DCE).

julia> descend(inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─      goto #2
2 ─ %2 = Base.getfield(A, :data)::Array{Float64,1}
│   %3 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %4 = Base.arrayref(true, %3, i)::Int64
│   %5 = Base.arrayref(true, %2, %4)::Float64
└──      goto #3
3 ─      return %5
)

julia> descend(no_inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─ %1  = Core.tuple(i)::Tuple{Int64}
│   %2  = Base.getfield(A, :data)::Array{Float64,1}
│   %3  = Base.arraysize(%2, 1)::Int64
│   %4  = Base.slt_int(%3, 0)::Bool
│   %5  = Base.ifelse(%4, 0, %3)::Int64
│   %6  = Base.sle_int(1, i)::Bool
│   %7  = Base.sle_int(i, %5)::Bool
│   %8  = Base.and_int(%6, %7)::Bool
└──       goto #3 if not %8
2 ─       goto #4
3 ─       invoke Base.throw_boundserror(_2::Main.M.ShuffledVector{Array{Float64,1},Float64}, %1::Tuple{Int64})::Union{}
└──       $(Expr(:unreachable))::Union{}
4 ┄ %13 = Base.getfield(A, :data)::Array{Float64,1}
│   %14 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %15 = Base.arrayref(true, %14, i)::Int64
│   %16 = Base.arrayref(true, %13, %15)::Float64
└──       goto #5
5 ─       return %16
)

Note how both cases use Base.arrayref(true), e.g. the @inbounds was not propagated, and we only eliminated one level of boundschecking.

Now my variant with @propagate_inbounds

julia> descend(inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─      goto #2
2 ─ %2 = Base.getfield(A, :data)::Array{Float64,1}
│   %3 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %4 = Base.arrayref(false, %3, i)::Int64
│   %5 = Base.arrayref(false, %2, %4)::Float64
└──      goto #3
3 ─      return %5
)

julia> descend(no_inbounds, map(typeof, (M.ShuffledVector(zeros(3)), 1)), debuginfo=:none)
CodeInfo(
1 ─ %1  = Core.tuple(i)::Tuple{Int64}
│   %2  = Base.getfield(A, :data)::Array{Float64,1}
│   %3  = Base.arraysize(%2, 1)::Int64
│   %4  = Base.slt_int(%3, 0)::Bool
│   %5  = Base.ifelse(%4, 0, %3)::Int64
│   %6  = Base.sle_int(1, i)::Bool
│   %7  = Base.sle_int(i, %5)::Bool
│   %8  = Base.and_int(%6, %7)::Bool
└──       goto #3 if not %8
2 ─       goto #4
3 ─       invoke Base.throw_boundserror(_2::Main.M.ShuffledVector{Array{Float64,1},Float64}, %1::Tuple{Int64})::Union{}
└──       $(Expr(:unreachable))::Union{}
4 ┄ %13 = Base.getfield(A, :data)::Array{Float64,1}
│   %14 = Base.getfield(A, :shuffle)::Array{Int64,1}
│   %15 = Base.arrayref(true, %14, i)::Int64
│   %16 = Base.arrayref(true, %13, %15)::Float64
└──       goto #5
5 ─       return %16
)

This has lead to confusion before, JuliaArrays/StaticArrays.jl#564
Either the getindex method needs to internally use @inbounds, which makes it harder to verify the implementation of the wrapper,
or the getindex method needs to use @propagate_inbounds so that @inbounds get's passed to the inner array access appropriately.

@mbauman
Copy link
Sponsor Member

mbauman commented Dec 16, 2019

Either the getindex method needs to internally use @inbounds, which makes it harder to verify the implementation of the wrapper,
or the getindex method needs to use @propagate_inbounds so that @inbounds get's passed to the inner array access appropriately.

This is exactly what it boils down to. I'm not sure there's a categorically "better" choice here — I think the arguments for and against both are highly opinion-based. Here are my opinions:

Personally, I disagree that option one "makes it harder to verify the implementation of the wrapper." It's @inbounds! It's exactly how @inbounds works! Just do your debugging without it (or with --check-bounds=yes). Yes, I forgot it in my comment above, but again, I say that's more of a feature than a bug. Let's lean conservative here.

Further, I think of @inbounds as a necessary evil. It's a spooky action at a distance. There is nothing else in Julia (well, besides Cassette 🙈) that allows you to reach into a library and muck around with its implementation. @propagate_inbounds allows a caller into your library to muck around in a second-order (or potentially nth-order) dependency. I'd rather limit this sort of behavior. That's why I don't want to encourage or export Base.@propagate_inbounds.

@vtjnash vtjnash closed this Apr 13, 2021
@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 13, 2021

We already have a section on the Array interface, and I agree with mbauman that this doesn't seem beneficial to show

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants