
Detangle linear indexing from non-scalar indexing #13614

Merged (1 commit, Oct 16, 2015)
Conversation

mbauman (Member) commented Oct 14, 2015

This removes the LinearFast special cases from non-scalar indexing. Previously, we were manually hoisting the div/rem sub2ind calculation along the indexed strides, but LLVM seems to be just as capable of performing this optimization in the cases I have tested. Even better, though, this creates a clean separation between the array indexing fallbacks:

  • Scalar fallbacks use ind2sub and sub2ind to compute the required number of indices that the custom type must implement (as determined by Base.linearindexing).
  • Non-scalar fallbacks simply "unwrap" the elements from AbstractArrays (and Colon) and use scalar indexing with the indices that were provided.
  • (CartesianIndices are also expanded to individual integers, but that is a smaller detail.)
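As a rough sketch of what the scalar fallback's index conversions compute (hypothetical helper names and modern Julia syntax, not the 0.4-era `ind2sub`/`sub2ind` code itself): linear-to-subscript conversion is repeated div/rem along the dimensions, while the inverse is only multiplication and addition.

```julia
# Hypothetical sketch of the ind2sub direction: convert a 1-based linear
# index into N subscripts by repeated div/rem along the dimensions.
function lin_to_subs(dims::NTuple{N,Int}, i::Int) where {N}
    subs = Int[]
    i0 = i - 1                   # switch to 0-based arithmetic
    for d in dims
        push!(subs, i0 % d + 1)  # subscript along this dimension
        i0 = div(i0, d)
    end
    return (subs...,)
end

# The sub2ind direction: only multiplies and adds along the strides.
function subs_to_lin(dims::NTuple{N,Int}, subs::NTuple{N,Int}) where {N}
    i, stride = 1, 1
    for d in 1:N
        i += (subs[d] - 1) * stride
        stride *= dims[d]
    end
    return i
end
```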

In all cases that I've tried, I've been unable to measure a performance difference. Indeed, the LLVM IR looks identical in my spot-checks, too.
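The non-scalar fallback described above can be pictured roughly as follows (a simplified, hypothetical sketch, not the actual Base implementation): allocate the destination, then fill it element-by-element via scalar indexing with the indices exactly as provided, with no linear-index special case.

```julia
# Simplified sketch of a non-scalar getindex fallback: every element is
# fetched with scalar indexing, so any ind2sub/sub2ind work happens in
# the scalar methods, not here.
function getindex_fallback(A::AbstractArray, I::AbstractArray...)
    dest = similar(A, map(length, I))
    for (d, idxs) in zip(eachindex(dest), Iterators.product(I...))
        dest[d] = A[idxs...]   # scalar indexing with the given indices
    end
    return dest
end
```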

mbauman (Member, Author) commented Oct 15, 2015

> LLVM seems to be just as capable at performing this optimization

This result holds for both LLVM 3.3 and 3.7. The test failures were all unrelated.

mbauman added a commit that referenced this pull request Oct 16, 2015
Detangle linear indexing from non-scalar indexing
@mbauman mbauman merged commit 29e8d33 into master Oct 16, 2015
@mbauman mbauman deleted the mb/nolinear branch October 16, 2015 17:35
timholy (Member) commented Nov 6, 2015

Given MAX_TUPLETYPE_LEN, it's nice to get rid of one of the args.

But your comment in the code and above seems to imply that LLVM is figuring out the div/rem optimization. I don't think it is; this is the "forward direction" (sub2ind), which doesn't involve div/rem (it's just multiplication and addition). It's ind2sub where that's an issue.
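To illustrate the distinction with a small worked example (hypothetical helper names, not code from the PR): for a 3×4 column-major array, the forward direction is pure multiply/add, while recovering subscripts from a linear index needs div and rem.

```julia
dims = (3, 4)
# Forward direction (sub2ind): only multiplication and addition.
linear(s1, s2) = (s1 - 1) + (s2 - 1) * dims[1] + 1
# Inverse direction (ind2sub): requires rem and div, the costlier ops.
subs(i) = ((i - 1) % dims[1] + 1, div(i - 1, dims[1]) + 1)
```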

mbauman added a commit to mbauman/julia that referenced this pull request Jan 11, 2016
Back in JuliaLang#13614 I changed the signature of _unsafe_getindex!, but I failed to update the BitArray methods... so they were just using the generic definitions. This is a very simple fix to restore the indexing performance for BitArrays.