batching 4: retire `MsgBundle` + batching support in transport layer #1679
/// The same component were put in the same log message multiple times.
/// E.g. `with_component()` was called multiple times for `Point3D`.
/// We don't support that yet.
#[error(
    "All component collections must have exactly one row (i.e. no batching), got {0:?} instead. Perhaps with_component() was called multiple times with the same component type?"
)]
MoreThanOneRow(Vec<(ComponentName, usize)>),

/// Some components had more or less instances than some other.
/// For example, there were `10` positions and `8` colors.
#[error(
    "All component collections must share the same number of instances (i.e. row length) \
    for a given row, got {0:?} instead"
)]
MismatchedRowLengths(Vec<(ComponentName, u32)>),
These things are now checked by Data{Cell,Row,Table}
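As a rough illustration of the kind of per-row checks the two removed error variants covered (duplicate components, mismatched instance counts), here is a minimal sketch. `Cell`, `RowError` and `validate_row` are hypothetical names invented for this example; this is not the actual `DataCell`/`DataRow`/`DataTable` code, and it deliberately ignores any special cases the real types may allow.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one component's data within a single row.
#[derive(Debug, Clone)]
pub struct Cell {
    pub component: String,
    pub num_instances: u32,
}

/// Hypothetical error type mirroring the two conditions discussed above.
#[derive(Debug, PartialEq)]
pub enum RowError {
    /// The same component appears more than once in the row.
    DuplicateComponent(String),
    /// The cells disagree on the number of instances.
    MismatchedInstances(Vec<(String, u32)>),
}

/// Checks a row's cells and returns the shared instance count (0 for an empty row).
pub fn validate_row(cells: &[Cell]) -> Result<u32, RowError> {
    // Each component may appear at most once per row.
    let mut seen = HashSet::new();
    for cell in cells {
        if !seen.insert(cell.component.as_str()) {
            return Err(RowError::DuplicateComponent(cell.component.clone()));
        }
    }

    // All cells must agree on the instance count (strict equality in this sketch).
    let expected = cells.first().map_or(0, |c| c.num_instances);
    if cells.iter().any(|c| c.num_instances != expected) {
        return Err(RowError::MismatchedInstances(
            cells
                .iter()
                .map(|c| (c.component.clone(), c.num_instances))
                .collect(),
        ));
    }

    Ok(expected)
}
```

With this sketch, a row holding `10` positions and `8` colors would be rejected with `MismatchedInstances`, which matches the example given in the removed doc comment.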
#[error("Trying to deserialize data that is missing a column present in the schema: {0:?}")]
MissingColumn(String),

#[error("Trying to deserialize data that cannot possibly be a column: {0:?}")]
I don't understand this error
updated to
#[error("Trying to deserialize column data that doesn't contain any ListArrays: {0:?}")]
NotAColumn(String),
/// let labels: &[_] = &[Label("hey".into())];
/// DataRow::from_cells2(MsgId::random(), "c", timepoint(2, 1), num_instances, (colors, labels))
/// };
/// let table = DataTable::from_rows(table_id, [row1, row2, row3]);
1-indexing? This is not Lua!
Eeeh I do think 1-indexing makes more sense when talking about tables: an Excel sheet starts at row 1 🙃 but yeah, let's go with what a Rust programmer would expect
This PR finally gets rid of `MsgBundle` entirely, and puts `DataTable` in charge of serialization/deserialization.

Since `DataTable` supports batches, the transport layer as a whole now supports batches too (which doesn't mean that the SDKs or the store do!).

A batch always contains the following control columns:
- `rerun.row_id`: a `RowId` uniquely identifying every row in the batch,
- `rerun.timepoint`: a `TimePoint` for each row,
- `rerun.entity_path`: the `EntityPath` that each row relates to,
- `rerun.num_instances`: the expected number of instances for each row.

We're not yet in a position to benefit from all the niceties that batching is supposed to offer us; for that we now need to implement the changes to the `DataStore` on one hand and to the SDKs on the other hand.

Regressions
The new batches carry more information than the old schema (`entity_path`, `num_instances`, `row_id`...) and so it is expected to see some regression in very-micro benchmarks such as `arrow_mono_points`:

| Benchmark | Current | Previous | Ratio |
| --- | --- | --- | --- |
| `mono_points_arrow/generate_message_bundles` | 44481992 ns/iter (± 1123759) | 47092378 ns/iter (± 853262) | 0.94 |
| `mono_points_arrow/generate_messages` | 178549277 ns/iter (± 1487091) | 123513169 ns/iter (± 1380848) | 1.45 |
| `mono_points_arrow/encode_log_msg` | 217471314 ns/iter (± 725662) | 157619842 ns/iter (± 1814858) | 1.38 |
| `mono_points_arrow/encode_total` | 444945548 ns/iter (± 2328275) | 329033469 ns/iter (± 1723866) | 1.35 |
| `mono_points_arrow/decode_log_msg` | 264431976 ns/iter (± 1105351) | 180134730 ns/iter (± 875308) | 1.47 |
| `mono_points_arrow/decode_message_bundles` | 92561122 ns/iter (± 1346869) | 53748777 ns/iter (± 782433) | 1.72 |
| `mono_points_arrow/decode_total` | 358486103 ns/iter (± 1978345) | 231988721 ns/iter (± 1345301) | 1.55 |
This gets dwarfed to oblivion when you enable batching on this very same benchmark:
On top of #1673
Part of #1619
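
To make the four control columns listed in the description more concrete, here is a small column-oriented sketch. It is only a mental model under stated assumptions: `ControlColumns` and its methods are hypothetical, plain `u64`/`String` values stand in for `RowId`, `TimePoint` and `EntityPath`, and nothing here reflects how `DataTable` actually stores or serializes these columns.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified model of a batch's control columns.
/// Each `Vec` is one column; index `i` in every column describes row `i` (0-indexed).
#[derive(Debug, Default)]
pub struct ControlColumns {
    /// Stand-in for `rerun.row_id` (a plain integer instead of a `RowId`).
    pub row_id: Vec<u64>,
    /// Stand-in for `rerun.timepoint`: (timeline name, time) pairs per row.
    pub timepoint: Vec<Vec<(String, i64)>>,
    /// Stand-in for `rerun.entity_path`.
    pub entity_path: Vec<String>,
    /// Stand-in for `rerun.num_instances`.
    pub num_instances: Vec<u32>,
}

impl ControlColumns {
    /// Appends one row by pushing exactly one value onto every control column.
    pub fn push_row(
        &mut self,
        row_id: u64,
        timepoint: &[(&str, i64)],
        entity_path: &str,
        num_instances: u32,
    ) {
        self.row_id.push(row_id);
        self.timepoint
            .push(timepoint.iter().map(|(name, t)| (name.to_string(), *t)).collect());
        self.entity_path.push(entity_path.to_owned());
        self.num_instances.push(num_instances);
    }

    /// Number of rows in the batch.
    pub fn num_rows(&self) -> usize {
        self.row_id.len()
    }

    /// True if every column has one entry per row and all row ids are unique.
    pub fn is_consistent(&self) -> bool {
        let n = self.num_rows();
        let same_len = self.timepoint.len() == n
            && self.entity_path.len() == n
            && self.num_instances.len() == n;
        let unique_ids = self.row_id.iter().collect::<HashSet<_>>().len() == n;
        same_len && unique_ids
    }
}
```

The point of the sketch is that every row of a batch carries its own id, timepoint, entity path and instance count, which is the extra per-row information behind the micro-benchmark regression discussed above.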