Introduce codegen optimizations for primitives and fixed-sized-arrays (#2970)

### What

This implements two optimizations:
- The first is the `ArrowBuffer` optimization: it returns an inner `Buffer`
directly when we know that the type itself is just an array of primitives.
This is useful for zero-copy returns of dense data such as Tensors.
- The second is the set of optimizations from #2954. For this, we identify
cases where we know the inner arrays are not nullable and, instead of
using validity-iterators, map directly to slices.
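
To illustrate the second optimization, here is a minimal std-only sketch. `Column`, `iter_nullable`, and `as_dense_slice` are hypothetical names invented for this example; they are not the arrow2 API and not the generated code. The sketch assumes a column is a dense value buffer plus an optional validity mask, and contrasts the per-element `Option` path with handing out the slice directly.

```rust
/// Hypothetical stand-in for an Arrow primitive array: a dense value buffer
/// plus an optional validity mask (`None` means no element is null).
pub struct Column {
    pub values: Vec<f32>,
    pub validity: Option<Vec<bool>>,
}

/// General path: consult the validity mask per element and yield `Option<f32>`.
pub fn iter_nullable(col: &Column) -> impl Iterator<Item = Option<f32>> + '_ {
    col.values
        .iter()
        .enumerate()
        .map(move |(i, v)| match &col.validity {
            Some(mask) if !mask[i] => None,
            _ => Some(*v),
        })
}

/// Fast path for data declared non-nullable: return the slice directly.
/// Assumption of this sketch: a mask that actually marks a null is an error.
pub fn as_dense_slice(col: &Column) -> Result<&[f32], String> {
    match &col.validity {
        Some(mask) if mask.iter().any(|valid| !valid) => {
            Err("unexpected null in non-nullable column".to_owned())
        }
        _ => Ok(col.values.as_slice()),
    }
}
```

The fast path does no per-element branching and no wrapping in `Option`, which is the kind of work the description says is skipped when the inner arrays are known to be non-nullable.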

Significant speedups for batch queries:

![image](https://github.com/rerun-io/rerun/assets/3312232/7ea1f3a2-a45a-4813-b82c-eaee55914c32)


TODO:
- [x] We should be able to check that the contents don't actually
contain a validity map with non-nulls and return a deserialization error
in that case.
 - [x] Add handling for other ArrowBuffer types.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2970) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2970)
- [Docs
preview](https://rerun.io/preview/pr%3Ajleibs%2Fcodegen_optimizations/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Ajleibs%2Fcodegen_optimizations/examples)
jleibs authored Aug 15, 2023
1 parent 1339fdd commit 0f60eb9
Showing 54 changed files with 1,268 additions and 323 deletions.
17 changes: 4 additions & 13 deletions crates/re_log_types/src/data_cell.rs
@@ -362,17 +362,8 @@ impl DataCell {
     pub fn try_to_native<'a, C: Component + Default + 'a>(
         &'a self,
     ) -> DataCellResult<impl Iterator<Item = C> + '_> {
-        Ok(C::try_iter_from_arrow(self.inner.values.as_ref())?
-            .map(C::convert_item_to_self)
-            .map(|v| {
-                // TODO(#2523): This unwrap and the `Default` bounds should go away once we move to fallible iterators
-                v.unwrap_or_else(|| {
-                    re_log::warn_once!(
-                        "Unexpected missing data when iterating non-optional data-cell. Falling back on Default value."
-                    );
-                    C::default()
-                })
-            }))
+        re_tracing::profile_function!(C::name().as_str());
+        Ok(C::try_iter_from_arrow(self.inner.values.as_ref())?.map(C::convert_item_to_self))
     }

     /// Returns the contents of an expected mono-component as an `Option<C>`.
@@ -381,7 +372,7 @@ impl DataCell {
     #[inline]
     pub fn try_to_native_mono<'a, C: Component + 'a>(&'a self) -> DataCellResult<Option<C>> {
         let mut iter =
-            C::try_iter_from_arrow(self.inner.values.as_ref())?.map(C::convert_item_to_self);
+            C::try_iter_from_arrow(self.inner.values.as_ref())?.map(C::convert_item_to_opt_self);

         let result = match iter.next() {
             // It's ok to have no result from the iteration: this is what we
@@ -420,7 +411,7 @@ impl DataCell {
     pub fn try_to_native_opt<'a, C: Component + 'a>(
         &'a self,
     ) -> DataCellResult<impl Iterator<Item = Option<C>> + '_> {
-        Ok(C::try_iter_from_arrow(self.inner.values.as_ref())?.map(C::convert_item_to_self))
+        Ok(C::try_iter_from_arrow(self.inner.values.as_ref())?.map(C::convert_item_to_opt_self))
     }

     /// Returns the contents of the cell as an iterator of native optional components.
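
The hunks above split one conversion hook into two: `convert_item_to_self` now feeds the non-optional iterator, while `convert_item_to_opt_self` feeds the `Option<C>` paths. The following is a simplified sketch of that shape only. The `Convert` trait, the `Label` type, and the two helper functions are hypothetical, and the fallback to `Default` in the dense hook is an assumption of this sketch, not something the diff confirms about the generated code.

```rust
/// Hypothetical, simplified stand-in for the two conversion hooks.
pub trait Convert: Sized {
    /// What the underlying iterator yields per element.
    type Item;

    /// Nullable path: a missing item stays `None`.
    fn convert_item_to_opt_self(item: Self::Item) -> Option<Self>;

    /// Dense path: always produces a value.
    fn convert_item_to_self(item: Self::Item) -> Self;
}

/// Hypothetical component wrapping a primitive.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct Label(pub u16);

impl Convert for Label {
    type Item = Option<u16>;

    fn convert_item_to_opt_self(item: Self::Item) -> Option<Self> {
        item.map(Label)
    }

    fn convert_item_to_self(item: Self::Item) -> Self {
        // Assumption of this sketch: fall back to the default when data is missing.
        item.map(Label).unwrap_or_default()
    }
}

/// Mirrors the shape of a non-optional accessor: yields `C` directly.
pub fn to_native<C: Convert, I: Iterator<Item = C::Item>>(items: I) -> impl Iterator<Item = C> {
    items.map(C::convert_item_to_self)
}

/// Mirrors the shape of an optional accessor: yields `Option<C>`.
pub fn to_native_opt<C: Convert, I: Iterator<Item = C::Item>>(
    items: I,
) -> impl Iterator<Item = Option<C>> {
    items.map(C::convert_item_to_opt_self)
}
```

Keeping two hooks lets callers that need plain values avoid a second `.map` that unwraps each element, which is what the removed closure in `try_to_native` used to do.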
2 changes: 1 addition & 1 deletion crates/re_log_types/src/lib.rs
@@ -485,7 +485,7 @@ macro_rules! component_legacy_shim {
             }

             #[inline]
-            fn convert_item_to_self(item: Self::Item<'_>) -> Option<Self> {
+            fn convert_item_to_opt_self(item: Self::Item<'_>) -> Option<Self> {
                 <Self as arrow2_convert::deserialize::ArrowDeserialize>::arrow_deserialize(item)
             }
         }
2 changes: 1 addition & 1 deletion crates/re_types/source_hash.txt

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

39 changes: 39 additions & 0 deletions crates/re_types/src/arrow_buffer.rs
@@ -0,0 +1,39 @@
+use arrow2::buffer::Buffer;
+
+/// Convenience-wrapper around an arrow [`Buffer`] that is known to contain
+/// a primitive type.
+///
+/// The arrow2 [`Buffer`] object is internally reference-counted and can be
+/// easily converted back to a `&[T]` referencing the underlying storage.
+/// This avoids some of the lifetime complexities that would otherwise
+/// arise from returning a `&[T]` directly, but is significantly more
+/// performant than doing the full allocation necessary to return a `Vec<T>`.
+#[derive(Clone, Debug, Default, PartialEq)]
+pub struct ArrowBuffer<T>(pub Buffer<T>);
+
+impl<T> ArrowBuffer<T> {
+    #[inline]
+    /// The number of instances of T stored in this buffer.
+    pub fn num_instances(&self) -> usize {
+        // WARNING: If you are touching this code, make sure you know what len() actually does.
+        //
+        // There is ambiguity in how arrow2 and arrow-rs talk about buffer lengths, including
+        // some incorrect documentation: https://github.com/jorgecarleitao/arrow2/issues/1430
+        //
+        // Arrow2 `Buffer<T>` is typed and `len()` is the number of units of `T`, but the documentation
+        // is currently incorrect.
+        // Arrow-rs `Buffer` is untyped and len() is in bytes, but `ScalarBuffer`s are in units of T.
+        self.0.len()
+    }
+
+    #[inline]
+    pub fn is_empty(&self) -> bool {
+        self.0.is_empty()
+    }
+}
+
+impl<T> From<Vec<T>> for ArrowBuffer<T> {
+    fn from(value: Vec<T>) -> Self {
+        Self(value.into())
+    }
+}
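
The doc comment on `ArrowBuffer` relies on two properties: the buffer is reference-counted, so clones share storage, and its length is counted in units of `T` rather than bytes. The sketch below shows both with the standard library only. `SharedBuffer` and its methods are hypothetical and use `Arc<[T]>` as a stand-in for arrow2's `Buffer<T>`; this is not the arrow2 API.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a reference-counted, typed buffer.
#[derive(Clone, Debug, PartialEq)]
pub struct SharedBuffer<T>(pub Arc<[T]>);

impl<T> SharedBuffer<T> {
    /// Length in units of `T`, analogous to `num_instances` above.
    pub fn num_instances(&self) -> usize {
        self.0.len()
    }

    /// Length in bytes, shown only to contrast the two conventions.
    pub fn num_bytes(&self) -> usize {
        self.0.len() * std::mem::size_of::<T>()
    }

    /// Borrow the shared storage as a plain slice.
    pub fn as_slice(&self) -> &[T] {
        &self.0
    }
}

impl<T> From<Vec<T>> for SharedBuffer<T> {
    fn from(value: Vec<T>) -> Self {
        // Note: `Arc<[T]>: From<Vec<T>>` moves the elements into a new allocation once.
        Self(value.into())
    }
}
```

Cloning a `SharedBuffer<f32>` holding four values only bumps a reference count: both handles point at the same storage, `num_instances()` is 4, and `num_bytes()` is 16.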
4 changes: 2 additions & 2 deletions crates/re_types/src/components/annotation_context.rs

45 changes: 41 additions & 4 deletions crates/re_types/src/components/class_id.rs

45 changes: 41 additions & 4 deletions crates/re_types/src/components/color.rs

4 changes: 2 additions & 2 deletions crates/re_types/src/components/disconnected_space.rs

44 changes: 40 additions & 4 deletions crates/re_types/src/components/draw_order.rs
