-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
batching 1: introduce DataCell
& retire ComponentBundle
#1634
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
teh-cmc
added
enhancement
New feature or request
🏹 arrow
concerning arrow
⛃ re_datastore
affects the datastore itself
labels
Mar 21, 2023
3 tasks
teh-cmc
force-pushed
the
cmc/datastore/new_types
branch
6 times, most recently
from
March 23, 2023 15:44
70e7a65
to
60912c1
Compare
teh-cmc
commented
Mar 23, 2023
teh-cmc
changed the title
batching 1: introduce
batching 1: introduce Mar 23, 2023
DataCell
DataCell
& retire ComponentBundle
teh-cmc
force-pushed
the
cmc/datastore/new_types
branch
from
March 23, 2023 18:01
60912c1
to
a5a42c9
Compare
teh-cmc
force-pushed
the
cmc/datastore/new_types
branch
3 times, most recently
from
March 23, 2023 19:15
acfe862
to
f7f55c9
Compare
teh-cmc
force-pushed
the
cmc/datastore/new_types
branch
from
March 23, 2023 19:51
f7f55c9
to
8512de9
Compare
Trouble ahead:
|
Aaaand we're good:
|
emilk
approved these changes
Mar 25, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful!
1 task
18 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
🏹 arrow
concerning arrow
enhancement
New feature or request
⛃ re_datastore
affects the datastore itself
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
First step of #1619: Introduce
DataCell
and retireComponentBundle
.DataCell
is the leaf type of our data model, a uniform array of components:[C, C, C, ...]
. No more, no less.Users should prefer using it over raw arrow arrays as it tends to make code a lot simpler (and faster).
Behind the scenes, a
DataCell
is backed by an erased arrow array living on the heap, which is likely to point into a larger batch of contiguous memory that it shares with its peers.This PR merely introduces the
DataCell
type but doesn't quite use it everywhere: the store in particular still shuffles raw arrow arrays around for the most part (for now).This is mostly grunt work (and a bunch of house cleaning I've wanted done for a while): aside from
data_cell.rs
there's no notable logic changes.Nice surprise: we get roughly 5% to 25% performance improvement across all our benchmarks, which I assume is the result of
DataCell
eliminating a bunch of unneeded list wraps/unwraps all over the place.Also discovered a major performance pitfall in
TryIntoArrow
.Spawned a bunch of issues:
Chunk
#1692Component
s/DataCell
s from non-reference types #1693Component
should be a single trait #1694Component
'sDataType
should embed its metadata #1696Part of #1619