ARROW-11332: [Rust] Use MutableBuffer in take_string instead of Vec #9279
Conversation
```
  let offsets = offsets_buffer.typed_data_mut();
- let mut values = Vec::with_capacity(bytes_offset);
+ let mut values = MutableBuffer::new(0);
```
The capacity used to be `bytes_offset` here, but that doesn't make much sense: the values buffer could be empty (e.g. for empty strings) or need a totally different capacity.
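A quick way to see why starting from capacity 0 is fine: like Arrow's `MutableBuffer`, `Vec<u8>` grows geometrically, so appending N bytes triggers only O(log N) reallocations. A minimal sketch (plain `Vec<u8>` standing in for `MutableBuffer`; the growth constants are illustrative, not guaranteed by either API):

```rust
fn main() {
    // analogous to MutableBuffer::new(0): no up-front capacity guess
    let mut buf: Vec<u8> = Vec::new();
    let mut caps = vec![buf.capacity()];
    for _ in 0..1000 {
        buf.extend_from_slice(b"abcd");
        // record every capacity change (i.e. every reallocation)
        if buf.capacity() != *caps.last().unwrap() {
            caps.push(buf.capacity());
        }
    }
    // geometric growth: only a handful of reallocations for 4000 bytes
    println!("{} reallocations, final capacity {}", caps.len() - 1, buf.capacity());
    assert!(caps.len() - 1 <= 16);
    assert_eq!(buf.len(), 4000);
}
```

So reserving a possibly wrong capacity up front buys little; appends stay amortized O(1) either way.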
I think a nice followup for
alamb left a comment:
Looks good to me -- thanks @Dandandan!
@jorgecarleitao what do you think?
nevi-me left a comment:
LGTM
This PR changes the string `take` kernel to use `MutableBuffer` to create the byte array directly instead of converting it from a `Vec<u8>`.
There used to be some overhead compared to building a `Vec` and converting it to a buffer afterwards, but that overhead seems to be gone now.
The change is performance-neutral according to benchmarks, with results within a few percent. If any overhead remains in `MutableBuffer`, I think we should fix that rather than keep workarounds and inconsistencies with other kernels.
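For context, the overall shape of a string `take` kernel can be sketched as follows. This is a simplified illustration, not the arrow-rs implementation: `take_str` is a hypothetical helper, `Vec<u8>` stands in for `MutableBuffer`, and null handling is omitted:

```rust
// Gather `values[indices[..]]` into Arrow-style (offsets, bytes) form.
// offsets[i]..offsets[i+1] delimits the i-th output string in `bytes`.
fn take_str(values: &[&str], indices: &[usize]) -> (Vec<i32>, Vec<u8>) {
    // the number of offsets is known up front, so reserve exactly
    let mut offsets: Vec<i32> = Vec::with_capacity(indices.len() + 1);
    // the total byte length is NOT known up front (taken strings may be
    // empty or arbitrarily long), so the values buffer starts empty
    let mut bytes: Vec<u8> = Vec::new();
    let mut length: i32 = 0;
    offsets.push(length);
    for &i in indices {
        let s = values[i];
        bytes.extend_from_slice(s.as_bytes());
        length += s.len() as i32;
        offsets.push(length);
    }
    (offsets, bytes)
}

fn main() {
    let (offsets, bytes) = take_str(&["a", "bc", "def"], &[2, 0]);
    assert_eq!(offsets, vec![0, 3, 4]);
    assert_eq!(bytes, b"defa".to_vec());
}
```

Building `bytes` in a growable buffer directly (rather than via a `Vec<u8>` that is converted afterwards) is the pattern this PR moves to.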
```
take str 512 time: [2.3304 us 2.3358 us 2.3419 us]
change: [-4.4130% -4.0693% -3.7241%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
take str 1024           time: [4.3583 us 4.3633 us 4.3694 us]
change: [-0.5853% +1.1186% +2.9951%] (p = 0.29 > 0.05)
No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) low severe
6 (6.00%) high mild
7 (7.00%) high severe
take str null indices 512
time: [2.4779 us 2.4813 us 2.4844 us]
change: [-2.4765% -2.2000% -1.9437%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) low mild
2 (2.00%) high mild
take str null indices 1024
time: [4.4823 us 4.4910 us 4.5053 us]
change: [-4.8482% -4.5426% -4.2894%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe
take str null values 1024
time: [4.4856 us 4.4889 us 4.4920 us]
change: [-2.2093% -2.0471% -1.8925%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low severe
3 (3.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
take str null values null indices 1024
time: [9.6438 us 9.6514 us 9.6592 us]
change: [-2.8600% -2.7478% -2.6338%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
```
Closes #9279 from Dandandan/take_string_opt
Authored-by: Heres, Daniel <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>