Pass through strings using arrow2::Buffers #2931

Merged: 6 commits, Aug 8, 2023


jleibs (Member) commented Aug 7, 2023

What

This avoids copying strings and lets us end up with &str references that point directly into the underlying Arrow buffer.

1.8x speedup on mono_strings and 8.7x speedup on batch strings:
(image: benchmark screenshot)
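
As an illustration of the zero-copy idea described above, here is a minimal std-only sketch. It does not use arrow2 and is not the code from this PR: `StrView` and `views_from_offsets` are hypothetical names, and an `Arc<[u8]>` stands in for the shared Arrow values buffer. Each view keeps a reference-counted handle to the buffer plus an offset and length, so producing a `&str` never copies bytes.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a window into a shared Arrow values buffer.
#[derive(Clone, Debug)]
pub struct StrView {
    buffer: Arc<[u8]>, // shared bytes; cloning the Arc does not copy them
    offset: usize,
    len: usize,
}

impl StrView {
    /// Borrows the string directly out of the shared buffer.
    pub fn as_str(&self) -> &str {
        let bytes = &self.buffer[self.offset..self.offset + self.len];
        // Assumption: the buffer holds valid UTF-8. This sketch checks on every access.
        std::str::from_utf8(bytes).expect("buffer slice must be valid UTF-8")
    }
}

/// Splits a values buffer into per-string views using Arrow-style offsets
/// (n + 1 non-decreasing offsets describe n strings).
pub fn views_from_offsets(buffer: Arc<[u8]>, offsets: &[usize]) -> Vec<StrView> {
    offsets
        .windows(2)
        .map(|w| StrView {
            buffer: Arc::clone(&buffer),
            offset: w[0],
            len: w[1] - w[0],
        })
        .collect()
}

fn main() {
    let buffer: Arc<[u8]> = Arc::from("helloarrowbuffers".as_bytes());
    let offsets = [0, 5, 10, 17];
    for view in views_from_offsets(Arc::clone(&buffer), &offsets) {
        println!("{}", view.as_str());
    }
}
```

The design choice sketched here is that only the reference count is touched per string; the string bytes stay where they were deserialized.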

Checklist

jleibs force-pushed the jleibs/primitive_buffers branch from 2547eff to 5e2ff49 August 7, 2023 17:38
jleibs changed the base branch from main to jleibs/strings_and_bench August 7, 2023 17:38
jleibs added the 🚀 performance and codegen/idl labels Aug 7, 2023
jleibs force-pushed the jleibs/primitive_buffers branch from 5e2ff49 to bd2c2ac August 7, 2023 17:42
jleibs marked this pull request as ready for review August 7, 2023 19:14
jleibs force-pushed the jleibs/primitive_buffers branch from 33e3c1a to 87d1b4b August 7, 2023 19:43
jleibs force-pushed the jleibs/strings_and_bench branch from 5eae3dd to 603778d August 7, 2023 19:45
jleibs force-pushed the jleibs/primitive_buffers branch from 87d1b4b to 304e01d August 7, 2023 19:47
teh-cmc self-requested a review August 8, 2023 07:14

teh-cmc (Member) left a comment

nice!

crates/re_types/src/arrow_adapter.rs (outdated, resolved)
crates/re_types/src/arrow_adapter.rs (outdated, resolved)
crates/re_types/src/lib.rs (outdated, resolved)
crates/re_types_builder/src/codegen/rust.rs (resolved)
crates/re_types_builder/src/codegen/rust.rs (resolved)
Comment on lines +2322 to +2325
arrow2::bitmap::utils::ZipValidity::new_with_validity(
    offsets.iter().zip(offsets.lengths()),
    downcast.validity(),
)

Nice, I probably could've used that in a few places 😅
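
For readers unfamiliar with the pattern in the snippet above, here is a rough std-only sketch of what zipping (offset, length) pairs with a validity bitmap achieves. It is an approximation for illustration, not arrow2's implementation: `zip_validity` and `decode` are hypothetical helpers, and a `&[bool]` stands in for the Arrow validity bitmap. Each slot becomes `Some((offset, length))` when valid and `None` when null, and valid slots are then sliced out of the values buffer without copying.

```rust
/// Hypothetical helper: pairs each value with its validity bit.
/// Null slots yield `None`; with no bitmap, every slot is treated as valid.
pub fn zip_validity<'a, T: 'a>(
    values: impl Iterator<Item = T> + 'a,
    validity: Option<&'a [bool]>,
) -> impl Iterator<Item = Option<T>> + 'a {
    values.enumerate().map(move |(i, v)| match validity {
        Some(bits) if !bits[i] => None,
        _ => Some(v),
    })
}

/// Decodes nullable strings as borrowed slices of `values`.
/// `offsets` holds n + 1 entries for n slots, as in Arrow's string layout.
pub fn decode<'a>(
    values: &'a str,
    offsets: &[usize],
    validity: Option<&[bool]>,
) -> Vec<Option<&'a str>> {
    let starts = offsets.iter().copied();
    let lengths = offsets.windows(2).map(|w| w[1] - w[0]);
    // `zip` stops at the shorter iterator, so the final offset is only used for lengths.
    zip_validity(starts.zip(lengths), validity)
        .map(|slot| slot.map(|(start, len)| &values[start..start + len]))
        .collect()
}

fn main() {
    let offsets = [0, 3, 3, 6];
    let validity = [true, false, true];
    let decoded = decode("foobaz", &offsets, Some(&validity[..]));
    println!("{:?}", decoded); // prints [Some("foo"), None, Some("baz")]
}
```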

emilk (Member) commented Aug 8, 2023

Does this close #1887 ?

jleibs (Member, Author) commented Aug 8, 2023

> Does this close #1887 ?

It's at least related. This only affects code-generated strings, so we need to migrate TextEntry to codegen. We also need to look at how the TextEntry query results are used, to make sure we aren't gratuitously converting them back to strings somewhere in the query path.

Base automatically changed from jleibs/strings_and_bench to main August 8, 2023 11:36
jleibs added a commit that referenced this pull request Aug 8, 2023
### What
Add a new benchmark of strings so we can verify the move to buffers in
#2931 is actually an improvement.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2926) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2926)
- [Docs
preview](https://rerun.io/preview/pr%3Ajleibs%2Fstrings_and_bench/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Ajleibs%2Fstrings_and_bench/examples)
jleibs force-pushed the jleibs/primitive_buffers branch from 304e01d to 3940e90 August 8, 2023 11:44
jleibs merged commit 83bfeb1 into main Aug 8, 2023
jleibs deleted the jleibs/primitive_buffers branch August 8, 2023 13:35
Labels: codegen/idl, 🚀 performance
3 participants