
Detangle linear indexing from non-scalar indexing #13614

Merged (1 commit, Oct 16, 2015)
Conversation

mbauman (Member) commented Oct 14, 2015

This removes the LinearFast special cases from non-scalar indexing. Previously, we were manually hoisting the div/rem sub2ind calculation along the indexed strides, but LLVM seems to be just as capable of performing this optimization in the cases I have tested. Even better, though, this creates a clean separation between the array indexing fallbacks:

  • Scalar fallbacks use ind2sub and sub2ind to compute the required number of indices that the custom type must implement (as determined by Base.linearindexing).
  • Non-scalar fallbacks simply "unwrap" the elements from AbstractArrays (and Colon) and use scalar indexing with the indices that were provided.
  • (CartesianIndices are also expanded to individual integers, but that is a smaller detail.)
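As a rough sketch of what the scalar fallback's index conversions compute (hypothetical helper names and modern Julia syntax, not the 0.4-era `ind2sub`/`sub2ind` code itself): linear-to-subscript conversion is repeated div/rem along the dimensions, while the inverse is only multiplication and addition.

```julia
# Hypothetical sketch of the ind2sub direction: convert a 1-based linear
# index into N subscripts by repeated div/rem along the dimensions.
function lin_to_subs(dims::NTuple{N,Int}, i::Int) where {N}
    subs = Int[]
    i0 = i - 1                   # switch to 0-based arithmetic
    for d in dims
        push!(subs, i0 % d + 1)  # subscript along this dimension
        i0 = div(i0, d)
    end
    return (subs...,)
end

# The sub2ind direction: only multiplies and adds along the strides.
function subs_to_lin(dims::NTuple{N,Int}, subs::NTuple{N,Int}) where {N}
    i, stride = 1, 1
    for d in 1:N
        i += (subs[d] - 1) * stride
        stride *= dims[d]
    end
    return i
end
```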

In all cases that I've tried, I've been unable to measure a performance difference. Indeed, the LLVM IR looks identical in my spot-checks, too.
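The non-scalar fallback described above can be pictured roughly as follows (a simplified, hypothetical sketch, not the actual Base implementation): allocate the destination, then fill it element-by-element via scalar indexing with the indices exactly as provided, with no linear-index special case.

```julia
# Simplified sketch of a non-scalar getindex fallback: every element is
# fetched with scalar indexing, so any ind2sub/sub2ind work happens in
# the scalar methods, not here.
function getindex_fallback(A::AbstractArray, I::AbstractArray...)
    dest = similar(A, map(length, I))
    for (d, idxs) in zip(eachindex(dest), Iterators.product(I...))
        dest[d] = A[idxs...]   # scalar indexing with the given indices
    end
    return dest
end
```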

mbauman (Member, Author) commented Oct 15, 2015

> LLVM seems to be just as capable at performing this optimization

This result holds for both LLVM 3.3 and 3.7. The test failures were all unrelated.

mbauman added a commit that referenced this pull request Oct 16, 2015
Detangle linear indexing from non-scalar indexing
@mbauman mbauman merged commit 29e8d33 into master Oct 16, 2015
@mbauman mbauman deleted the mb/nolinear branch October 16, 2015 17:35
timholy (Member) commented Nov 6, 2015

Given MAX_TUPLETYPE_LEN, it's nice to get rid of one of the args.

But your comment in the code and above seems to imply that LLVM is figuring out the div/rem optimization. I don't think it is; this is the "forward direction" (sub2ind), which doesn't involve div/rem (it's just multiplication and addition). It's ind2sub where that's an issue.
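To illustrate the distinction with a small worked example (hypothetical helper names, not code from the PR): for a 3×4 column-major array, the forward direction is pure multiply/add, while recovering subscripts from a linear index needs div and rem.

```julia
dims = (3, 4)
# Forward direction (sub2ind): only multiplication and addition.
linear(s1, s2) = (s1 - 1) + (s2 - 1) * dims[1] + 1
# Inverse direction (ind2sub): requires rem and div, the costlier ops.
subs(i) = ((i - 1) % dims[1] + 1, div(i - 1, dims[1]) + 1)
```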

mbauman added a commit to mbauman/julia that referenced this pull request Jan 11, 2016
Back in JuliaLang#13614 I changed the signature of _unsafe_getindex!, but I failed to update the BitArray methods... so they were just using the generic definitions. This is a very simple fix to restore the indexing performance for BitArrays.