Optimize gathering of point cloud colors #3730

Merged
merged 16 commits into from
Oct 10, 2023
Changes from 11 commits
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions crates/re_arrow_store/benches/arrow2.rs
@@ -6,7 +6,7 @@ static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
use std::sync::Arc;

use arrow2::array::{Array, PrimitiveArray, StructArray, UnionArray};
use criterion::{criterion_group, Criterion};
use criterion::Criterion;
use itertools::Itertools;

use re_log_types::{DataCell, SizeBytes as _};
@@ -19,7 +19,7 @@ use re_types::{

// ---

criterion_group!(benches, erased_clone, estimated_size_bytes);
criterion::criterion_group!(benches, erased_clone, estimated_size_bytes);

#[cfg(not(feature = "core_benchmarks_only"))]
criterion::criterion_main!(benches);
10 changes: 9 additions & 1 deletion crates/re_log_types/src/data_cell.rs
@@ -1,7 +1,7 @@
use std::sync::Arc;

use arrow2::datatypes::DataType;
use re_types::{Component, ComponentName, DeserializationError};
use re_types::{Component, ComponentBatch, ComponentName, DeserializationError};

use crate::SizeBytes;

@@ -164,6 +164,14 @@ pub struct DataCellInner {
// TODO(#1696): Check that the array is indeed a leaf / component type when building a cell from an
// arrow payload.
impl DataCell {
    /// Builds a new `DataCell` from a component batch.
    #[inline]
    pub fn from_component_batch(batch: &dyn ComponentBatch) -> re_types::SerializationResult<Self> {
        batch
            .to_arrow()
            .map(|arrow| DataCell::from_arrow(batch.name(), arrow))
    }

    /// Builds a new `DataCell` from a uniform iterable of native component values.
    ///
    /// Fails if the given iterable cannot be serialized to arrow, which should never happen when
28 changes: 27 additions & 1 deletion crates/re_log_types/src/data_row.rs
@@ -1,6 +1,6 @@
use ahash::HashSetExt;
use nohash_hasher::IntSet;
use re_types::ComponentName;
use re_types::{AsComponents, ComponentName};
use smallvec::SmallVec;

use crate::{DataCell, DataCellError, DataTable, EntityPath, SizeBytes, TableId, TimePoint};
@@ -266,6 +266,32 @@ pub struct DataRow {
}

impl DataRow {
    /// Builds a new `DataRow` from anything implementing [`AsComponents`].
    pub fn from_component_batches(
        row_id: RowId,
        timepoint: TimePoint,
        entity_path: EntityPath,
        as_components: &dyn AsComponents,
    ) -> anyhow::Result<Self> {
Member

I would expect this to take an iterator of ComponentBatches: it's strictly more expressive, easier to build / come by and more consistent with the rest of our APIs.

Suggested change
-    pub fn from_component_batches(
-        row_id: RowId,
-        timepoint: TimePoint,
-        entity_path: EntityPath,
-        as_components: &dyn AsComponents,
-    ) -> anyhow::Result<Self> {
+    pub fn from_component_batches<'a>(
+        row_id: RowId,
+        timepoint: TimePoint,
+        entity_path: EntityPath,
+        comp_batches: impl IntoIterator<Item = &'a dyn ComponentBatch>,
+    ) -> anyhow::Result<Self> {

Member Author

How would we get the num_instances in that case? Take the max of all the batches? What if there are no batches, or if they are all splats?

Member

> Take the max of all the batches?

Yes, that matches the behavior of our log methods (in all languages, even!):

> What if there are no batches

Then there's nothing in the row and there are no instances

> or if they are all splats?

You cannot have "all splats", that would just result in a row with num_instances = 1.
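
The rule described in this reply (the instance count is the length of the longest batch, and a row without batches has no instances) can be sketched as follows. This is a minimal, self-contained illustration: the `Batch` trait and the `num_instances` helper are hypothetical stand-ins and not the actual `re_types::ComponentBatch` API.

```rust
/// Hypothetical stand-in for a component batch; only its length matters here.
trait Batch {
    fn len(&self) -> usize;
}

impl<T> Batch for Vec<T> {
    fn len(&self) -> usize {
        Vec::len(self)
    }
}

/// Derives the instance count of a row as the length of its longest batch.
/// A row with no batches has zero instances.
fn num_instances<'a>(batches: impl IntoIterator<Item = &'a dyn Batch>) -> usize {
    batches.into_iter().map(|b| b.len()).max().unwrap_or(0)
}

fn main() {
    let positions = vec![[0.0_f32; 3]; 4];
    let colors = vec![0xff0000ff_u32]; // a single color next to four positions

    // The longest batch (positions) decides the instance count.
    let row: [&dyn Batch; 2] = [&positions, &colors];
    assert_eq!(num_instances(row), 4);

    // No batches: nothing in the row, no instances.
    assert_eq!(num_instances(Vec::<&dyn Batch>::new()), 0);

    // Only one-element batches: the row is simply one instance wide.
    let only_color: [&dyn Batch; 1] = [&colors];
    assert_eq!(num_instances(only_color), 1);
}
```

Taking `impl IntoIterator` lets callers pass an array, a `Vec`, or any other iterator of batch references without building an intermediate archetype value.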

Member Author (emilk, Oct 10, 2023)

Oh, I thought we stored num_instances separately.

So what happens if I log a full point cloud first and then later want to update all the colors with a splat color? Would that be a new row with num_instances = 1, even though it affects several instances?

Member

That's where it becomes funky...

If you do this:

rr.log("random", rr.Points3D(positions, colors=colors, radii=radii))
rr.log_components("random", [rr.components.ColorBatch([255, 0, 0])])

Then you're going to end up with the original colors being discarded, a single red point and the rest of the points using the default color for this entity path (because that ColorBatch is not a splat).

Now, there is a trick at your disposal... you could do this:

rr.log("random", rr.Points3D(positions, colors=colors, radii=radii))
rr.log_components("random", [rr.components.ColorBatch([255, 0, 0])], num_instances=2)

And now you'll end up with only red points, because you explicitly said that the data was 2 instances wide, and so the log function considers the ColorBatch to be a splat...

Of course we could change things so that logging 1 thing is always considered a splat, but then you have the opposite problem, which might or might not be better depending on the situation 🤷.

And this is why I don't like that splats are a logging-time rather than a query-time concern: the view should get to decide what to do with the data that it has at its disposal, and that behavior should be configurable through blueprints and through the UI.
This instance key business is pretty similar to e.g. configurable texture clamping modes in gfx APIs after all.
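
To make the two scenarios in this comment concrete, here is a minimal sketch of the splat rule as described: a one-element batch is broadcast to every instance only when the row is declared wider than one instance. The `resolve_colors` function, its parameters, and the white default color are hypothetical and only model the behavior described in the comment; they are not the actual store or query code.

```rust
type Rgba = [u8; 4];

/// Expands a logged color batch to one color per point.
///
/// Assumption for illustration: a batch counts as a splat only if it holds
/// exactly one value while the row is declared wider than one instance.
fn resolve_colors(
    batch: &[Rgba],
    row_num_instances: usize,
    num_points: usize,
    default: Rgba,
) -> Vec<Rgba> {
    let is_splat = batch.len() == 1 && row_num_instances > 1;
    (0..num_points)
        .map(|i| {
            if is_splat {
                // Broadcast the single value to every instance.
                batch[0]
            } else {
                // Instance-wise lookup; missing instances fall back to the default.
                batch.get(i).copied().unwrap_or(default)
            }
        })
        .collect()
}

fn main() {
    let red = [255, 0, 0, 255];
    let default = [255, 255, 255, 255];

    // Row width inferred as 1: one red point, defaults for the rest.
    assert_eq!(resolve_colors(&[red], 1, 3, default), vec![red, default, default]);

    // Row explicitly declared 2 instances wide: the single color is a splat.
    assert_eq!(resolve_colors(&[red], 2, 3, default), vec![red, red, red]);
}
```

Moving the `is_splat` decision into the view at query time, as the comment argues for, would turn it into a configurable policy instead of something fixed when the data is logged.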

        re_tracing::profile_function!();

        let data_cells = as_components
            .as_component_batches()
            .into_iter()
            .map(|batch| DataCell::from_component_batch(batch.as_ref()))
            .collect::<Result<Vec<DataCell>, _>>()?;

        let mut row = DataRow::from_cells(
            row_id,
            timepoint,
            entity_path,
            as_components.num_instances() as _,
            data_cells,
        )?;
        row.compute_all_size_bytes();
        Ok(row)
    }

    /// Builds a new `DataRow` from an iterable of [`DataCell`]s.
    ///
    /// Fails if:
12 changes: 12 additions & 0 deletions crates/re_space_view_spatial/Cargo.toml
@@ -44,3 +44,15 @@ parking_lot.workspace = true
rayon.workspace = true
serde = "1"
smallvec = { workspace = true, features = ["serde"] }


[dev-dependencies]
criterion.workspace = true
mimalloc.workspace = true

[lib]
bench = false

[[bench]]
name = "bench_points"
harness = false
135 changes: 135 additions & 0 deletions crates/re_space_view_spatial/benches/bench_points.rs
@@ -0,0 +1,135 @@
//! Keeping track of performance issues/regressions in `arrow2` that directly affect us.

use re_arrow_store::{DataStore, LatestAtQuery};
use re_log_types::{DataRow, EntityPath, RowId, TimeInt, TimePoint, Timeline};
use re_space_view_spatial::LoadedPoints;
use re_types::{
    archetypes::Points3D,
    components::{Color, InstanceKey, Position3D},
    Loggable as _,
};
use re_viewer_context::Annotations;

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

criterion::criterion_main!(benches);
criterion::criterion_group!(benches, bench_points);

// ---

#[cfg(not(debug_assertions))]
const NUM_POINTS: usize = 1_000_000;

// `cargo test` also runs the benchmark setup code, so make sure they run quickly:
#[cfg(debug_assertions)]
const NUM_POINTS: usize = 10;

// ---

/// Mimics `examples/python/open_photogrammetry_format/main.py`
fn bench_points(c: &mut criterion::Criterion) {
    let timeline = Timeline::log_time();
    let ent_path = EntityPath::from("points");

    let store = {
        let mut store = DataStore::new(InstanceKey::name(), Default::default());

        let positions = vec![Position3D::new(0.1, 0.2, 0.3); NUM_POINTS];
        let colors = vec![Color::from(0xffffffff); NUM_POINTS];
        let points = Points3D::new(positions).with_colors(colors);
        let mut timepoint = TimePoint::default();
        timepoint.insert(timeline, TimeInt::from_seconds(0));
        let data_row =
            DataRow::from_component_batches(RowId::random(), timepoint, ent_path.clone(), &points)
                .unwrap();
        store.insert_row(&data_row).unwrap();
        store
    };

    let latest_at = LatestAtQuery::latest(timeline);
    let annotations = Annotations::missing();

    {
        let mut group = c.benchmark_group("Points3D");
        group.bench_function("query_archetype", |b| {
            b.iter(|| {
                let arch_view =
                    re_query::query_archetype::<Points3D>(&store, &latest_at, &ent_path).unwrap();
                assert_eq!(arch_view.num_instances(), NUM_POINTS);
                arch_view
            });
        });
    }

    let arch_view = re_query::query_archetype::<Points3D>(&store, &latest_at, &ent_path).unwrap();
    assert_eq!(arch_view.num_instances(), NUM_POINTS);

    {
        let mut group = c.benchmark_group("Points3D");
        group.throughput(criterion::Throughput::Elements(NUM_POINTS as _));
        group.bench_function("load_all", |b| {
            b.iter(|| {
                let points =
                    LoadedPoints::load(&arch_view, &ent_path, latest_at.at, &annotations).unwrap();
                assert_eq!(points.positions.len(), NUM_POINTS);
                assert_eq!(points.colors.len(), NUM_POINTS);
                assert_eq!(points.radii.len(), NUM_POINTS); // NOTE: we don't log radii, but we should get a list of defaults!
                points
            });
        });
    }

    {
        let mut group = c.benchmark_group("Points3D");
        group.throughput(criterion::Throughput::Elements(NUM_POINTS as _));
        group.bench_function("load_positions", |b| {
            b.iter(|| {
                let positions = LoadedPoints::load_positions(&arch_view).unwrap();
                assert_eq!(positions.len(), NUM_POINTS);
                positions
            });
        });
    }

    {
        let points = LoadedPoints::load(&arch_view, &ent_path, latest_at.at, &annotations).unwrap();

        let mut group = c.benchmark_group("Points3D");
        group.throughput(criterion::Throughput::Elements(NUM_POINTS as _));
        group.bench_function("load_colors", |b| {
            b.iter(|| {
                let colors =
                    LoadedPoints::load_colors(&arch_view, &ent_path, &points.annotation_infos)
                        .unwrap();
                assert_eq!(colors.len(), NUM_POINTS);
                colors
            });
        });
    }

    // NOTE: we don't log radii!
    {
        let mut group = c.benchmark_group("Points3D");
        group.throughput(criterion::Throughput::Elements(NUM_POINTS as _));
        group.bench_function("load_radii", |b| {
            b.iter(|| {
                let radii = LoadedPoints::load_radii(&arch_view, &ent_path).unwrap();
                assert_eq!(radii.len(), NUM_POINTS);
                radii
            });
        });
    }

    {
        let mut group = c.benchmark_group("Points3D");
        group.throughput(criterion::Throughput::Elements(NUM_POINTS as _));
        group.bench_function("load_picking_ids", |b| {
            b.iter(|| {
                let picking_ids = LoadedPoints::load_picking_ids(&arch_view);
                assert_eq!(picking_ids.len(), NUM_POINTS);
                picking_ids
            });
        });
    }
}
3 changes: 3 additions & 0 deletions crates/re_space_view_spatial/src/lib.rs
@@ -20,6 +20,9 @@ mod ui_3d;
pub use space_view_2d::SpatialSpaceView2D;
pub use space_view_3d::SpatialSpaceView3D;

#[doc(hidden)] // Public for benchmarks
pub use parts::LoadedPoints;

// ---

mod view_kind {
15 changes: 3 additions & 12 deletions crates/re_space_view_spatial/src/parts/images.rs
@@ -259,10 +259,7 @@ impl ImagesPart {
.annotations
.resolved_class_description(None)
.annotation_info()
.color(
color.map(|c| c.to_array()).as_ref(),
DefaultColor::OpaqueWhite,
);
.color(color.map(|c| c.to_array()), DefaultColor::OpaqueWhite);

if let Some(textured_rect) = to_textured_rect(
ctx,
@@ -379,10 +376,7 @@ impl ImagesPart {
.annotations
.resolved_class_description(None)
.annotation_info()
.color(
color.map(|c| c.to_array()).as_ref(),
DefaultColor::OpaqueWhite,
);
.color(color.map(|c| c.to_array()), DefaultColor::OpaqueWhite);

if let Some(textured_rect) = to_textured_rect(
ctx,
@@ -467,10 +461,7 @@ impl ImagesPart {
.annotations
.resolved_class_description(None)
.annotation_info()
.color(
color.map(|c| c.to_array()).as_ref(),
DefaultColor::OpaqueWhite,
);
.color(color.map(|c| c.to_array()), DefaultColor::OpaqueWhite);

if let Some(textured_rect) = to_textured_rect(
ctx,
20 changes: 10 additions & 10 deletions crates/re_space_view_spatial/src/parts/mod.rs
@@ -22,8 +22,10 @@ use re_types::components::Text;
pub use spatial_view_part::SpatialViewPartData;
pub use transform3d_arrows::{add_axis_arrows, Transform3DArrowsPart};

#[doc(hidden)] // Public for benchmarks
pub use points3d::LoadedPoints;

use ahash::HashMap;
use std::sync::Arc;

use re_data_store::{EntityPath, InstancePathHash};
use re_types::components::{Color, InstanceKey};
@@ -118,7 +120,7 @@ pub fn process_colors<'a, A: Archetype>(
arch_view.iter_optional_component::<Color>()?,
)
.map(move |(annotation_info, color)| {
annotation_info.color(color.map(move |c| c.to_array()).as_ref(), default_color)
annotation_info.color(color.map(|c| c.to_array()), default_color)
}))
}

@@ -134,9 +136,7 @@ pub fn process_labels<'a, A: Archetype>(
annotation_infos.iter(),
arch_view.iter_optional_component::<Text>()?,
)
.map(move |(annotation_info, text)| {
annotation_info.label(text.as_ref().map(move |t| t.as_str()))
}))
.map(move |(annotation_info, text)| annotation_info.label(text.as_ref().map(|t| t.as_str()))))
}

/// Process [`re_types::components::Radius`] components to [`re_renderer::Size`] using auto size
@@ -171,22 +171,22 @@ pub fn process_radii<'a, A: Archetype>(
fn process_annotations<Primary, A: Archetype>(
query: &ViewQuery<'_>,
arch_view: &re_query::ArchetypeView<A>,
annotations: &Arc<Annotations>,
annotations: &Annotations,
) -> Result<ResolvedAnnotationInfos, re_query::QueryError>
where
Primary: re_types::Component + Clone,
{
process_annotations_and_keypoints(query, arch_view, annotations, |_: &Primary| {
process_annotations_and_keypoints(query.latest_at, arch_view, annotations, |_: &Primary| {
glam::Vec3::ZERO
})
.map(|(a, _)| a)
}

/// Resolves all annotations and keypoints for the given entity view.
fn process_annotations_and_keypoints<Primary, A: Archetype>(
query: &ViewQuery<'_>,
latest_at: re_log_types::TimeInt,
arch_view: &re_query::ArchetypeView<A>,
annotations: &Arc<Annotations>,
annotations: &Annotations,
mut primary_into_position: impl FnMut(&Primary) -> glam::Vec3,
) -> Result<(ResolvedAnnotationInfos, Keypoints), re_query::QueryError>
where
@@ -220,7 +220,7 @@ where

if let (Some(keypoint_id), Some(class_id), primary) = (keypoint_id, class_id, primary) {
keypoints
.entry((class_id, query.latest_at.as_i64()))
.entry((class_id, latest_at.as_i64()))
.or_default()
.insert(keypoint_id.0, primary_into_position(&primary));
class_description.annotation_info_with_keypoint(keypoint_id.0)