Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Limit
@inbounds
to indexing in the dual-iterator branch in `copyto_…
…unaliased!` (#55919) This simplifies the `copyto_unalised!` implementation where the source and destination have different `IndexStyle`s, and limits the `@inbounds` to only the indexing operation. In particular, the iteration over `eachindex(dest)` is not marked as `@inbounds` anymore. This seems to help with performance when the destination uses Cartesian indexing. Reduced implementation of the branch: ```julia function copyto_proposed!(dest, src) axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes")) iterdest, itersrc = eachindex(dest), eachindex(src) for (destind, srcind) in zip(iterdest, itersrc) @inbounds dest[destind] = src[srcind] end dest end function copyto_current!(dest, src) axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes")) iterdest, itersrc = eachindex(dest), eachindex(src) ret = iterate(iterdest) @inbounds for a in src idx, state = ret::NTuple{2,Any} dest[idx] = a ret = iterate(iterdest, state) end dest end function copyto_current_limitinbounds!(dest, src) axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes")) iterdest, itersrc = eachindex(dest), eachindex(src) ret = iterate(iterdest) for isrc in itersrc idx, state = ret::NTuple{2,Any} @inbounds dest[idx] = src[isrc] ret = iterate(iterdest, state) end dest end ``` ```julia julia> a = zeros(40000,4000); b = rand(size(a)...); julia> av = view(a, UnitRange.(axes(a))...); julia> @Btime copyto_current!($av, $b); 617.704 ms (0 allocations: 0 bytes) julia> @Btime copyto_current_limitinbounds!($av, $b); 304.146 ms (0 allocations: 0 bytes) julia> @Btime copyto_proposed!($av, $b); 240.217 ms (0 allocations: 0 bytes) julia> versioninfo() Julia Version 1.12.0-DEV.1260 Commit 4a4ca9c (2024-09-28 01:49 UTC) Build Info: Official https://julialang.org release Platform Info: OS: Linux (x86_64-linux-gnu) CPU: 8 × Intel(R) Core(TM) i5-10310U CPU @ 1.70GHz WORD_SIZE: 64 LLVM: libLLVM-18.1.7 (ORCJIT, skylake) Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores) Environment: JULIA_EDITOR = subl ``` I'm not quite certain why the proposed implementation here (`copyto_proposed!`) is even faster than `copyto_current_limitinbounds!`. In any case, `copyto_proposed!` is easier to read, so I'm not complaining. This fixes #53158 (cherry picked from commit 06e7b9d)
- Loading branch information