-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-10810: [Rust] Improve comparison kernels performance #8900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #8900 +/- ##
===========================================
+ Coverage 53.96% 76.77% +22.81%
===========================================
Files 170 181 +11
Lines 30707 41009 +10302
===========================================
+ Hits 16571 31485 +14914
+ Misses 14136 9524 -4612
Continue to review full report at Codecov.
|
rust/arrow/src/array/array_binary.rs
Outdated
| /// Returns the element at index `i` as a byte slice. | ||
| pub fn value(&self, i: usize) -> &[u8] { | ||
| assert!( | ||
| debug_assert!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change mean that we should mark the function as unsafe? Should we provide safe and unsafe versions of these functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can think more about safe/unsafe and being more clear about it. I will move this change from this PR for now.
| for i in 0..$left.len() { | ||
| result.append($op($left.value(i), $right.value(i)))?; | ||
| if $op($left.value(i), $right.value(i)) { | ||
| unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functions that use this macro should now be declared unsafe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will revert the change with the asserts for now. I think the value functions should be marked unsafe that don't perform bound checks, and other functions that can trigger UB. The macro's themselves should be "relatively" safe I think, as long as the .len() value is correct.
I think a better solution for the future would be to have a safe iterator that doesn't do bound checking, so I think it's better to move the particular change out of this PR for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, @Dandandan . Note that all primitives and strings have iterators and FromIterator: https://github.com/apache/arrow/blob/master/rust/arrow/src/array/iterator.rs , but they are for Option<T>, not T.
I agree with you that we should mark that fn value as unsafe and offer an iterator over T (besides the one over Option<T>). That UB is really obvious and it is also a security vulnerability causing an escalation of privileges as it allows privileged access to the application's memory via out of bounds accesses.
I usually see the iterator over T when they can mask the result or OR / AND the null bitmaps, while Option<T> is used when that is not possible / useful.
|
Will revert the |
|
@andygrove @jorgecarleitao |
|
FWIW I tested this branch with a local micro benchmark I had and it shows significant improvement. I wrote up some details here: https://docs.google.com/document/d/15DRpIr1EUqo7zVR1psBhoX95ML1PO55kkHwt2EasM88/edit#heading=h.pfa677dojqtv Backstory was I was looking at the performance of these kernels (inspired by @rdettai ) as |
|
Cool to know / read @alamb to see that we are getting closer to "optimal" performance on a single thread. The PRs I create are also based on profiling DataFusion, showing that it has "real world" improvement on those changes as well besides getting good result on the benchmarks. I hope they are useful for influxdb as well. The docs / presentation are looking very interesting, it probably isn't recorded? As mentioned there is for some datatypes (like strings) there is some overhead still related to the |
|
Ah just saw the link to the YouTube video, thanks @alamb |
|
Took it a bit further and the commit 23c8ff2 commit 3b67f70 : |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andygrove and @jorgecarleitao -- I think this PR is looking ready to merge. Are you satisfied with the resolution of the unsafe discussion?
This PR shows that there is still about a ~2x performance (compared to ~8x earlier) difference between using a builder vs using a mutable buffer directly after #8842 .
This also accounts for a ~5% difference on some queries in DataFusion (when not using the simd feature, where the implementation doesn't use the builder). Also the bounds checks are a bit expensive. In some
valuefunctions they are explicitly not there whereas in other (like for string) they are there.I guess there will be always some overhead in the builder as it does need to do some bookkeeping, but I think it's a good idea to see how we can write kernels while not losing too much performance.
FYI @jorgecarleitao