Static data 0: revamped TimeInt #5534

Merged Apr 5, 2024 (13 commits)

@teh-cmc teh-cmc commented Mar 15, 2024

Commits make no sense, review the final changelog directly.

All the interesting bits happen in re_log_types/time_point & re_sdk -- everything else is just change propagation.

  • TimeInt now ranges from i64::MIN + 1 to i64::MAX.
  • TimeInt::STATIC, which takes the place of the now illegal TimeInt(i64::MIN), is now the only way of identifying static data.
  • It is impossible to create TimeInt::STATIC inadvertently -- users of the SDK cannot set the clock to that value.
  • Similarly, it is impossible to create a TimeRange, a TimePoint, a LatestAtQuery or a RangeQuery that includes TimeInt::STATIC.
    If static data exists, that's what will be returned, unconditionally -- there's no such thing as querying for it explicitly.
  • TimePoint::timeless is gone -- we already have TimePoint::default that we use all over the place, we don't need two ways of doing the same thing.

There still exists a logical mapping between an empty TimePoint and static data, as that is how one represents static data on the wire -- terminology wise: "a timeless timepoint results in static data".
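
The invariants listed above can be illustrated with a minimal sketch. It assumes a representation where `TimeInt` wraps an optional non-minimum integer; the `NonMinI64` stand-in below, the `new_temporal` constructor and its saturating behaviour are assumptions made for illustration, not necessarily what `re_log_types` actually does.

```rust
/// Hypothetical stand-in: an `i64` that can never hold `i64::MIN`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct NonMinI64(i64);

impl NonMinI64 {
    pub const MIN: Self = Self(i64::MIN + 1);
    pub const MAX: Self = Self(i64::MAX);

    /// Clamps `i64::MIN` up to `i64::MIN + 1` (assumed behaviour).
    pub fn saturating_from_i64(value: i64) -> Self {
        Self(value.max(i64::MIN + 1))
    }
}

/// Either the static marker (`None`) or a temporal value (`Some`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct TimeInt(Option<NonMinI64>);

impl TimeInt {
    /// The only way to identify static data.
    pub const STATIC: Self = Self(None);
    pub const MIN: Self = Self(Some(NonMinI64::MIN));
    pub const MAX: Self = Self(Some(NonMinI64::MAX));

    /// Always yields a temporal value: there is no input that produces `STATIC`.
    pub fn new_temporal(value: i64) -> Self {
        Self(Some(NonMinI64::saturating_from_i64(value)))
    }

    pub fn is_static(self) -> bool {
        self.0.is_none()
    }
}
```

With this layout, `TimeInt::new_temporal(i64::MIN)` yields `TimeInt::MIN` rather than the static marker, and the derived ordering places `TimeInt::STATIC` before every temporal value.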

Similar to the "ensure RowIds are unique" refactor from back when, this seemingly tiny change on the surface will vastly simplify downstream code that finally has some invariants to rely on.


Part of a PR series that removes the concept of timeless data in favor of the much simpler concept of static data:

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!

@teh-cmc teh-cmc force-pushed the cmc/static_0_timeint_shenanigans branch from 2bd5760 to f338850 Compare March 18, 2024 09:39
@teh-cmc teh-cmc marked this pull request as ready for review March 18, 2024 09:41
@teh-cmc teh-cmc force-pushed the cmc/static_0_timeint_shenanigans branch from 30eed52 to ca6dfca Compare March 19, 2024 16:18
@emilk emilk self-requested a review March 20, 2024 10:53
@emilk emilk left a comment

Whoa, that's a lot of tiny changes 😬
Must have been so much fun :P

Looks like the right choices all the way though

teh-cmc added a commit that referenced this pull request Apr 5, 2024
Introduces the concept of static data into the data APIs.

Static data is defined on a per-entity per-component basis. If it exists, it
unconditionally shadows any temporal data of the same type. It is never
garbage collected.
When static data is returned, it is indicated via `TimeInt::STATIC`.

The terminology has been normalized all over the place: data is either
static or temporal, and nothing else.

Static data cannot have more than one cell per-entity per-component.
Trying to write more than one cell will trigger last-write-wins
semantics, as defined by `RowId` ordering.
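
To make the shadowing and last-write-wins rules concrete, here is a minimal sketch for a single entity/component pair. `ComponentColumn`, its methods and the `u64` row id are all hypothetical stand-ins, not the datastore's real types.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a row identifier; a larger value means a later write.
type RowId = u64;

/// All the data for one entity and one component (illustrative only).
#[derive(Default)]
struct ComponentColumn {
    static_cell: Option<(RowId, String)>,
    temporal: BTreeMap<i64, String>,
}

impl ComponentColumn {
    /// At most one static cell is kept: the one with the greatest `RowId` wins.
    fn insert_static(&mut self, row_id: RowId, value: &str) {
        let is_newer = self
            .static_cell
            .as_ref()
            .map_or(true, |(existing, _)| row_id > *existing);
        if is_newer {
            self.static_cell = Some((row_id, value.to_owned()));
        }
    }

    fn insert_temporal(&mut self, time: i64, value: &str) {
        self.temporal.insert(time, value.to_owned());
    }

    /// Static data, if any, unconditionally shadows temporal data.
    fn latest_at(&self, time: i64) -> Option<&str> {
        if let Some((_, value)) = &self.static_cell {
            return Some(value.as_str());
        }
        self.temporal
            .range(..=time)
            .next_back()
            .map(|(_, value)| value.as_str())
    }
}
```

Once a static cell exists, `latest_at` ignores the query time entirely, which is the property that makes timeless fallbacks unnecessary.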

Timeless fallbacks just don't exist anymore, which simplifies out _a
lot_ of code in the datastore and query cache.

Note: static data is in many subtle ways incompatible with our legacy
InstanceKey-based model, which results in a couple hacks in this PR.
Those hacks will be gone as soon as the new data APIs land and instance
keys go away.

- Fixes #5264
- Fixes #2074
- Fixes #5447
- Fixes #1766


---

Part of a PR series that removes the concept of timeless data in favor
of the much simpler concept of static data:
- #5534
- #5535
- #5536
- #5537
- #5540
teh-cmc added a commit that referenced this pull request Apr 5, 2024
Just exposing all the new static stuff to the Python SDK, and trying to
kill the "timeless" terminology in the process.

---

Part of a PR series that removes the concept of timeless data in favor
of the much simpler concept of static data:
- #5534
- #5535
- #5536
- #5537
- #5540
teh-cmc added a commit that referenced this pull request Apr 5, 2024
Just exposing all the new static stuff to the C & C++ SDKs, and trying
to kill the "timeless" terminology in the process.

---

Part of a PR series that removes the concept of timeless data in favor
of the much simpler concept of static data:
- #5534
- #5535
- #5536
- #5537
- #5540
teh-cmc added a commit that referenced this pull request Apr 5, 2024
Just exposing all the new static stuff to the Rust SDK, and trying to
kill the "timeless" terminology in the process.

---

Part of a PR series that removes the concept of timeless data in favor
of the much simpler concept of static data:
- #5534
- #5535
- #5536
- #5537
- #5540
teh-cmc added a commit that referenced this pull request Apr 8, 2024
This introduces a new temporary `re_query2` crate, which won't ever be
published.
It will replace the existing `re_query` crate once all the necessary
features have been backported.

As of this PR, this crate only contains the `ClampedZip` iterator
machinery, which is code generated for all the different arities.

Since I'm very, _very tired_ of the awful DX of macros, I implemented a
very low-tech code generator in the crate itself
(`src/bin/clamped_zip.rs`) that just spews the generated code on stdout.
That seems like the right complexity-to-maintenance tradeoff,
considering that iterator combinators don't really ever change.
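
As a rough idea of what such a low-tech generator can look like, here is a sketch that builds only the function signature for a given arity as a plain string. `generate_signature` is a hypothetical name; the real `src/bin/clamped_zip.rs` is not shown in this thread and certainly does more.

```rust
/// Builds the signature of a clamped-zip constructor for the given arity.
/// Illustrative only: no macros, just string formatting.
fn generate_signature(num_required: usize, num_optional: usize) -> String {
    // Type parameters: required iterators, optional iterators, default functions.
    let mut type_params: Vec<String> = Vec::new();
    type_params.extend((0..num_required).map(|i| format!("R{i}")));
    type_params.extend((0..num_optional).map(|i| format!("O{i}")));
    type_params.extend((0..num_optional).map(|i| format!("D{i}")));

    // Value parameters: each optional iterator is followed by its default function.
    let mut params: Vec<String> = Vec::new();
    params.extend((0..num_required).map(|i| format!("r{i}: R{i}")));
    for i in 0..num_optional {
        params.push(format!("o{i}: O{i}"));
        params.push(format!("o{i}_default_fn: D{i}"));
    }

    format!(
        "pub fn clamped_zip_{num_required}x{num_optional}<{}>({})",
        type_params.join(", "),
        params.join(", ")
    )
}
```

A binary wrapping this would loop over the arities it cares about and `println!` each generated item, with the output redirected into a source file.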

`ClampedZip` naturally works with more than one required component,
finally!

- Fixes #4742
- Fixes #2750

Here's an example of one of these combinators:
```rust
/// Returns a new [`ClampedZip1x2`] iterator.
///
/// The number of elements in a clamped zip iterator corresponds to the number of elements in the
/// shortest of its required iterators (`r0`).
///
/// Optional iterators (`o0`, `o1`) will repeat their latest values if they happen to be too short
/// to be zipped with the shortest of the required iterators.
///
/// If an optional iterator is not only too short but actually empty, its associated default function
/// (`o0_default_fn`, `o1_default_fn`) will be executed and the resulting value repeated as necessary.
pub fn clamped_zip_1x2<R0, O0, O1, D0, D1>(
    r0: R0,
    o0: O0,
    o0_default_fn: D0,
    o1: O1,
    o1_default_fn: D1,
) -> ClampedZip1x2<R0::IntoIter, O0::IntoIter, O1::IntoIter, D0, D1>
where
    R0: IntoIterator,
    O0: IntoIterator,
    O0::Item: Clone,
    O1: IntoIterator,
    O1::Item: Clone,
    D0: Fn() -> O0::Item,
    D1: Fn() -> O1::Item,
{
    ClampedZip1x2 {
        r0: r0.into_iter(),
        o0: o0.into_iter(),
        o1: o1.into_iter(),
        o0_default_fn,
        o1_default_fn,
        o0_latest_value: None,
        o1_latest_value: None,
    }
}

/// Implements a clamped zip iterator combinator with 1 required iterator and 2 optional
/// iterators.
///
/// See [`clamped_zip_1x2`] for more information.
pub struct ClampedZip1x2<R0, O0, O1, D0, D1>
where
    R0: Iterator,
    O0: Iterator,
    O0::Item: Clone,
    O1: Iterator,
    O1::Item: Clone,
    D0: Fn() -> O0::Item,
    D1: Fn() -> O1::Item,
{
    r0: R0,
    o0: O0,
    o1: O1,
    o0_default_fn: D0,
    o1_default_fn: D1,

    o0_latest_value: Option<O0::Item>,
    o1_latest_value: Option<O1::Item>,
}

impl<R0, O0, O1, D0, D1> Iterator for ClampedZip1x2<R0, O0, O1, D0, D1>
where
    R0: Iterator,
    O0: Iterator,
    O0::Item: Clone,
    O1: Iterator,
    O1::Item: Clone,
    D0: Fn() -> O0::Item,
    D1: Fn() -> O1::Item,
{
    type Item = (R0::Item, O0::Item, O1::Item);

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        let r0_next = self.r0.next()?;
        let o0_next = self.o0.next().or(self.o0_latest_value.take());
        let o1_next = self.o1.next().or(self.o1_latest_value.take());

        self.o0_latest_value = o0_next.clone();
        self.o1_latest_value = o1_next.clone();

        Some((
            r0_next,
            o0_next.unwrap_or_else(|| (self.o0_default_fn)()),
            o1_next.unwrap_or_else(|| (self.o1_default_fn)()),
        ))
    }
}
```
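
For readers who just want the semantics, here is a much smaller, self-contained sketch with one required and one optional input. `clamped_zip_1x1` is a hypothetical simplification that works on `Vec`s and collects eagerly; it is not one of the generated combinators.

```rust
/// Simplified clamped zip: the output has exactly as many elements as `required`.
/// `optional` repeats its latest value when too short, and falls back to
/// `default_fn` when it is empty.
pub fn clamped_zip_1x1<R, O: Clone>(
    required: Vec<R>,
    optional: Vec<O>,
    default_fn: impl Fn() -> O,
) -> Vec<(R, O)> {
    let mut optional = optional.into_iter();
    let mut latest: Option<O> = None;

    required
        .into_iter()
        .map(|r| {
            // Advance the optional side if it still has data, otherwise keep the latest value.
            if let Some(o) = optional.next() {
                latest = Some(o);
            }
            let o = latest.clone().unwrap_or_else(|| default_fn());
            (r, o)
        })
        .collect()
}
```

Three cases are worth keeping in mind: a short optional input is padded with its last value, an empty one is filled with the default, and a long one is truncated to the length of the required input.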

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
This implements the new uncached latest-at APIs, and introduces some
basic types for promises.

Tests and benchmarks have been backported from `re_query`.

We already get a pretty decent improvement because the join process
(clamped-zip) is cheaper and we don't need to query for instance keys at
all:
```
group                         re_query                                re_query2
-----                         --------                                ---------
arrow_batch_points2/query     1.39      2.5±0.03µs 379.7 MElem/sec    1.00  1810.6±23.62ns 526.7  MElem/sec
arrow_mono_points2/query      1.44   1082.7±8.66µs 902.0 KElem/sec    1.00    753.6±9.28µs 1295.9 KElem/sec
```

- Fixes #3379
- Part of #1893  

Here's an example/guide of using the new API:
```rust
// First, get the raw results for this query.
//
// Raw here means that these results are neither deserialized, nor resolved/converted.
// I.e. this corresponds to the raw `DataCell`s, straight from our datastore.
let results: LatestAtResults = re_query2::latest_at(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().cloned(), // no generics!
);

// Then, grab the raw results for each individual component.
//
// This is still raw data, but now a choice has been made regarding the nullability of the
// _component batch_ itself (that says nothing about its _instances_!).
//
// * `get_required` returns an error if the component batch is missing
// * `get_optional` returns an empty set of results if the component is missing
// * `get` returns an option
let points: &LatestAtComponentResults = results.get_required::<MyPoint>()?;
let colors: &LatestAtComponentResults = results.get_optional::<MyColor>();
let labels: &LatestAtComponentResults = results.get_optional::<MyLabel>();

// Then comes the time to resolve/convert and deserialize the data.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
//
// A choice now has to be made regarding the nullability of the _component batch's instances_.
// Our IDL doesn't support nullable instances at the moment -- so for the foreseeable future you probably
// shouldn't be using anything but `iter_dense`.

let points = match points.iter_dense::<MyPoint>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let colors = match colors.iter_dense::<MyColor>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let labels = match labels.iter_sparse::<MyLabel>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

// With the data now fully resolved/converted and deserialized, the joining logic can be
// applied.
//
// In most cases this will be either a clamped zip, or no joining at all.

let color_default_fn = || MyColor::from(0xFF00FFFF);
let label_default_fn = || None;

let results =
    clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn).collect_vec();
```
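
The nested-result flattening mentioned in the comments above can be sketched as follows. This is a simplified stand-in that assumes a three-state enum with `String` errors; the actual `PromiseResult` type and its error type are not shown in this thread.

```rust
/// Simplified stand-in for a promise's state (error type is an assumption).
#[derive(Debug, PartialEq)]
pub enum PromiseResult<T> {
    Pending,
    Ready(T),
    Error(String),
}

impl<T> PromiseResult<Result<T, String>> {
    /// Merges the two failure layers (resolution, then deserialization) into one.
    pub fn flatten(self) -> PromiseResult<T> {
        match self {
            PromiseResult::Pending => PromiseResult::Pending,
            PromiseResult::Ready(Ok(data)) => PromiseResult::Ready(data),
            // Resolution succeeded but deserialization failed.
            PromiseResult::Ready(Err(err)) => PromiseResult::Error(err),
            // Resolution itself failed.
            PromiseResult::Error(err) => PromiseResult::Error(err),
        }
    }
}
```

After flattening, callers only have the three arms used in the `match` blocks above to deal with.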


---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
Static-aware, key-less, component-based, cached latest-at APIs.

The overall structure of this new cache is very similar to what we had
before. Effectively it is just an extremely simplified version of
`re_query_cache`.

This introduces a new temporary `re_query_cache2` crate, which won't
ever be published.
It will replace the existing `re_query_cache` crate once all the
necessary features have been backported.

- Fixes #3232
- Fixes #4733
- Fixes #4734
- Part of #3379
- Part of #1893  

Example:
```rust
let caches = re_query_cache2::Caches::new(&store);

// First, get the results for this query.
//
// They might or might not already be cached. We won't know for sure until we try to access
// each individual component's data below.
let results: CachedLatestAtResults = caches.latest_at(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().cloned(), // no generics!
);

// Then, grab the results for each individual component.
// * `get_required` returns an error if the component batch is missing
// * `get_optional` returns an empty set of results if the component is missing
// * `get` returns an option
//
// At this point we still don't know whether they are cached or not. That's the next step.
let points: &CachedLatestAtComponentResults = results.get_required::<MyPoint>()?;
let colors: &CachedLatestAtComponentResults = results.get_optional::<MyColor>();
let labels: &CachedLatestAtComponentResults = results.get_optional::<MyLabel>();

// Then comes the time to resolve/convert and deserialize the data.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
//
// A choice now has to be made regarding the nullability of the _component batch's instances_.
// Our IDL doesn't support nullable instances at the moment -- so for the foreseeable future you probably
// shouldn't be using anything but `iter_dense`.
//
// This is the step at which caching comes into play.
//
// If the data has already been accessed with the same nullability characteristics in the
// past, then this will just grab the pre-deserialized, pre-resolved/pre-converted result from
// the cache.
//
// Otherwise, this will trigger a deserialization and cache the result for next time.

let points = match points.iter_dense::<MyPoint>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let colors = match colors.iter_dense::<MyColor>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let labels = match labels.iter_sparse::<MyLabel>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

// With the data now fully resolved/converted and deserialized, the joining logic can be
// applied.
//
// In most cases this will be either a clamped zip, or no joining at all.

let color_default_fn = || {
    static DEFAULT: MyColor = MyColor(0xFF00FFFF);
    &DEFAULT
};
let label_default_fn = || None;

let results =
    clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn).collect_vec();
```
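
The "deserialize on first access, reuse afterwards" behaviour described in the comments can be sketched with a `OnceCell`. `CachedComponent`, its fields and the little-endian `u32` encoding are hypothetical; the real cache is keyed and shared, which this sketch ignores.

```rust
use std::cell::{Cell, OnceCell};

/// Raw bytes plus a lazily-filled, deserialized view of them (illustrative only).
struct CachedComponent {
    raw: Vec<u8>,
    dense: OnceCell<Vec<u32>>,
    /// Counts how often deserialization actually ran, for demonstration purposes.
    num_deserializations: Cell<usize>,
}

impl CachedComponent {
    fn new(raw: Vec<u8>) -> Self {
        Self {
            raw,
            dense: OnceCell::new(),
            num_deserializations: Cell::new(0),
        }
    }

    /// Deserializes on first access, then returns the cached result.
    fn to_dense(&self) -> &[u32] {
        self.dense.get_or_init(|| {
            self.num_deserializations
                .set(self.num_deserializations.get() + 1);
            self.raw
                .chunks_exact(4)
                .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                .collect()
        })
    }
}
```

Calling `to_dense` any number of times runs the deserialization closure once; every later call is a plain borrow of the cached vector.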


---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
A trivial PR that essentially just does this:
```diff
- pub trait Loggable: Clone + Sized + SizeBytes {
+ pub trait Loggable: 'static + Send + Sync + Clone + Sized + SizeBytes {
```

because I'm very tired of carrying these clauses around manually
everywhere.
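
A small sketch of why this helps: with the bounds on the trait itself, generic code that moves values across threads needs no extra `where` clauses. The trait below is a cut-down stand-in (it omits `SizeBytes`), and `MyColor` and `spawn_len` are hypothetical.

```rust
/// Cut-down stand-in for the trait, with the new supertraits.
pub trait Loggable: 'static + Send + Sync + Clone + Sized {}

#[derive(Clone)]
pub struct MyColor(pub u32);

impl Loggable for MyColor {}

/// Moves the values to another thread and returns how many there were.
/// `L: Loggable` is enough: `'static + Send` come from the supertraits.
pub fn spawn_len<L: Loggable>(values: Vec<L>) -> usize {
    std::thread::spawn(move || values.len())
        .join()
        .expect("the worker thread does not panic")
}
```

Without the supertraits, `spawn_len` would need to spell out `L: Loggable + Send + 'static` itself, and so would every caller up the chain.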

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
Now that we have a component-based latest-at cache, we can start
replacing legacy uncached helpers with new ones.
Commit-by-commit review should be trivial.

Because the new APIs are designed with promises in mind, this already
highlights a whole bunch of places where we need to think about what to
do in case the data is not ready yet.
As indicated in #5607, these places have been labeled `TODO(#5607)` in
the code.
For now, we simply treat a pending promise the same as missing data.
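
In code, that stopgap amounts to something like the sketch below. The enum is a simplified stand-in with `String` errors, and `ready_or_missing` is a hypothetical helper name, not something this PR is known to add.

```rust
/// Simplified stand-in for a promise's state (error type is an assumption).
#[derive(Debug, PartialEq)]
pub enum PromiseResult<T> {
    Pending,
    Ready(T),
    Error(String),
}

impl<T> PromiseResult<T> {
    /// Treats a pending promise exactly like missing data, but still surfaces errors.
    pub fn ready_or_missing(self) -> Result<Option<T>, String> {
        match self {
            PromiseResult::Pending => Ok(None),
            PromiseResult::Ready(data) => Ok(Some(data)),
            PromiseResult::Error(err) => Err(err),
        }
    }
}
```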

This PR also adds the new `Caches` and `PromiseResolver` to the
`EntityDb`.
To run a cached query, you now need a `DataStore`, a `Caches` and a
`PromiseResolver`, i.e. you need an `EntityDb`.

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
_Trivial commit by commit_

All data UIs are now using cached APIs as much as possible.

As a nice side-effect, `EntityDb` is now plumbed through everywhere,
giving access to everything you might possibly need in all the places
you might need 'em.

We're slowly but surely seeing the first signs of instance keys going
away, but it's still not the focus of this PR.

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
Introduce very high-level APIs that make it possible to query, resolve,
deserialize and cache an entire archetype all at once.

The core trait allowing this is the newly introduced `ToArchetype<A>`
trait:
```rust
pub trait ToArchetype<A: re_types_core::Archetype> {
    fn to_archetype(&self, resolver: &crate::PromiseResolver) -> crate::PromiseResult<A>;
}
```

This trait is implemented for all builtin archetypes thanks to a new
codegen pass.

Implementing such a trait is tricky: one needs to know about this trait,
archetypes, queries, caches, etc. all in one single place. This is a
recipe for very nasty circular dependency chains.
This PR makes it work for all archetypes except archetypes that are
generated in `re_viewport`. These will need some special care in a
follow-up PR (likely moving them out of there, and only reexporting them
in `re_viewport`?).
Update: related:
- #5421

Here's how it looks in practice:
```rust
let caches = re_query_cache2::Caches::new(&store);

// First, get the results for this query.
//
// They might or might not already be cached. We won't know for sure until we try to access
// each individual component's data below.
let results: CachedLatestAtResults = caches.latest_at(
    &store,
    &query,
    &entity_path.into(),
    Points2D::all_components().iter().cloned(), // no generics!
);

// Then make use of the `ToArchetype` helper trait in order to query, resolve, deserialize and
// cache an entire archetype all at once.
use re_query_cache2::ToArchetype as _;

let arch: Points2D = match results.to_archetype(&resolver) {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(arch) => arch,
    PromiseResult::Error(err) => return Err(err.into()),
};

// With the data now fully resolved/converted and deserialized, some joining logic can be
// applied if desired.
//
// In most cases this will be either a clamped zip, or no joining at all.

let color_default_fn = || None;
let label_default_fn = || None;

let results = clamped_zip_1x2(
    arch.positions.iter(),
    arch.colors
        .iter()
        .flat_map(|colors| colors.iter().map(Some)),
    color_default_fn,
    arch.labels
        .iter()
        .flat_map(|labels| labels.iter().map(Some)),
    label_default_fn,
)
.collect_vec();

eprintln!("results:\n{results:?}");
```
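
To give an idea of what a generated `ToArchetype` implementation has to do, here is a hand-written sketch for a toy archetype. Everything in it is a simplified assumption: the `Results` struct, the resolver-less `to_archetype` signature, the cut-down `Points2D` and the `String` errors do not match the real generated code.

```rust
/// Simplified stand-in for a promise's state (error type is an assumption).
#[derive(Clone, Debug, PartialEq)]
pub enum PromiseResult<T> {
    Pending,
    Ready(T),
    Error(String),
}

/// Toy archetype: one required component, one optional component.
#[derive(Debug, PartialEq)]
pub struct Points2D {
    pub positions: Vec<[f32; 2]>,
    pub colors: Option<Vec<u32>>,
}

/// Toy query results: `None` means the component batch is missing altogether.
pub struct Results {
    pub positions: Option<PromiseResult<Vec<[f32; 2]>>>,
    pub colors: Option<PromiseResult<Vec<u32>>>,
}

pub trait ToArchetype<A> {
    fn to_archetype(&self) -> PromiseResult<A>;
}

impl ToArchetype<Points2D> for Results {
    fn to_archetype(&self) -> PromiseResult<Points2D> {
        // Required component: a missing batch is an error.
        let positions = match &self.positions {
            None => return PromiseResult::Error("missing required component".to_owned()),
            Some(PromiseResult::Pending) => return PromiseResult::Pending,
            Some(PromiseResult::Error(err)) => return PromiseResult::Error(err.clone()),
            Some(PromiseResult::Ready(data)) => data.clone(),
        };
        // Optional component: a missing batch is fine.
        let colors = match &self.colors {
            None => None,
            Some(PromiseResult::Pending) => return PromiseResult::Pending,
            Some(PromiseResult::Error(err)) => return PromiseResult::Error(err.clone()),
            Some(PromiseResult::Ready(data)) => Some(data.clone()),
        };
        PromiseResult::Ready(Points2D { positions, colors })
    }
}
```

The archetype is only `Ready` once every component it uses is; a single pending component makes the whole archetype pending.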


---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
Code generator, and code generated, for the new `RangeZip` machinery.

Similar to #5573, the code generation is implemented as a very low-tech
binary in the crate itself (`src/bin/range_zip.rs`), that just spews the
generated code on stdout.
That seems like the right complexity-to-maintenance tradeoff,
considering that iterator combinators don't really ever change.

Here's an example of one of these combinators:
```rust
/// Returns a new [`RangeZip2x2`] iterator.
///
/// The number of elements in a range zip iterator corresponds to the number of elements in the
/// shortest of its required iterators (`r0`, `r1`).
///
/// Each call to `next` is guaranteed to yield the next value for each required iterator,
/// as well as the most recent index amongst all of them.
///
/// Optional iterators accumulate their state and yield their most recent value (if any),
/// each time the required iterators fire.
pub fn range_zip_2x2<Idx, IR0, R0, IR1, R1, IO0, O0, IO1, O1>(
    r0: IR0,
    r1: IR1,
    o0: IO0,
    o1: IO1,
) -> RangeZip2x2<Idx, IR0::IntoIter, R0, IR1::IntoIter, R1, IO0::IntoIter, O0, IO1::IntoIter, O1>
where
    Idx: std::cmp::Ord,
    IR0: IntoIterator<Item = (Idx, R0)>,
    IR1: IntoIterator<Item = (Idx, R1)>,
    IO0: IntoIterator<Item = (Idx, O0)>,
    IO1: IntoIterator<Item = (Idx, O1)>,
{
    RangeZip2x2 {
        r0: r0.into_iter(),
        r1: r1.into_iter(),
        o0: o0.into_iter().peekable(),
        o1: o1.into_iter().peekable(),

        o0_data_latest: None,
        o1_data_latest: None,
    }
}

/// Implements a range zip iterator combinator with 2 required iterators and 2 optional
/// iterators.
///
/// See [`range_zip_2x2`] for more information.
pub struct RangeZip2x2<Idx, IR0, R0, IR1, R1, IO0, O0, IO1, O1>
where
    Idx: std::cmp::Ord,
    IR0: Iterator<Item = (Idx, R0)>,
    IR1: Iterator<Item = (Idx, R1)>,
    IO0: Iterator<Item = (Idx, O0)>,
    IO1: Iterator<Item = (Idx, O1)>,
{
    r0: IR0,
    r1: IR1,
    o0: Peekable<IO0>,
    o1: Peekable<IO1>,

    o0_data_latest: Option<O0>,
    o1_data_latest: Option<O1>,
}

impl<Idx, IR0, R0, IR1, R1, IO0, O0, IO1, O1> Iterator
    for RangeZip2x2<Idx, IR0, R0, IR1, R1, IO0, O0, IO1, O1>
where
    Idx: std::cmp::Ord,
    IR0: Iterator<Item = (Idx, R0)>,
    IR1: Iterator<Item = (Idx, R1)>,
    IO0: Iterator<Item = (Idx, O0)>,
    IO1: Iterator<Item = (Idx, O1)>,
    O0: Clone,
    O1: Clone,
{
    type Item = (Idx, R0, R1, Option<O0>, Option<O1>);

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        let Self {
            r0,
            r1,
            o0,
            o1,
            o0_data_latest,
            o1_data_latest,
        } = self;

        let Some((r0_index, r0_data)) = r0.next() else {
            return None;
        };
        let Some((r1_index, r1_data)) = r1.next() else {
            return None;
        };

        let max_index = [r0_index, r1_index].into_iter().max().unwrap();

        let mut o0_data = None;
        while let Some((_, data)) = o0.next_if(|(index, _)| index <= &max_index) {
            o0_data = Some(data);
        }
        let o0_data = o0_data.or(o0_data_latest.take());
        *o0_data_latest = o0_data.clone();

        let mut o1_data = None;
        while let Some((_, data)) = o1.next_if(|(index, _)| index <= &max_index) {
            o1_data = Some(data);
        }
        let o1_data = o1_data.or(o1_data_latest.take());
        *o1_data_latest = o1_data.clone();

        Some((max_index, r0_data, r1_data, o0_data, o1_data))
    }
}
```
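
The stateful, index-based join is easier to see on a cut-down version with one required and one optional input. `range_zip_1x1` is a hypothetical simplification over `Vec`s, assuming both inputs are sorted by index; it is not one of the generated combinators.

```rust
/// Simplified range zip: yields one entry per required element, paired with the most
/// recent optional value whose index is less than or equal to the required index.
pub fn range_zip_1x1<Idx: Ord + Copy, R, O: Clone>(
    required: Vec<(Idx, R)>,
    optional: Vec<(Idx, O)>,
) -> Vec<(Idx, R, Option<O>)> {
    let mut optional = optional.into_iter().peekable();
    let mut latest: Option<O> = None;

    required
        .into_iter()
        .map(|(index, r)| {
            // Consume every optional value that is not in the future of `index`,
            // keeping only the most recent one.
            while let Some((_, o)) = optional.next_if(|(o_index, _)| *o_index <= index) {
                latest = Some(o);
            }
            (index, r, latest.clone())
        })
        .collect()
}
```

Optional values accumulate between required ticks, so an optional value logged at index 2 is skipped in favour of the one at index 3 when the next required element arrives at index 3.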

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 8, 2024
This implements the new uncached range APIs.

Latest-at & range queries are now much more similar than before and
share a lot of nice traits.

Tests have been backported from `re_query`.

Here's an example/guide of using the new API:
```rust
// First, get the raw results for this query.
//
// Raw here means that these results are neither deserialized, nor resolved/converted.
// I.e. this corresponds to the raw `DataCell`s, straight from our datastore.
let results: RangeResults = re_query2::range(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().cloned(), // no generics!
);

// Then, grab the raw results for each individual component.
//
// This is still raw data, but now a choice has been made regarding the nullability of the
// _component batch_ itself (that says nothing about its _instances_!).
//
// * `get_required` returns an error if the component batch is missing
// * `get_optional` returns an empty set of results if the component is missing
// * `get` returns an option
let all_points: &RangeComponentResults = results.get_required(MyPoint::name())?;
let all_colors: &RangeComponentResults = results.get_optional(MyColor::name());
let all_labels: &RangeComponentResults = results.get_optional(MyLabel::name());

let all_indexed_points = izip!(
    all_points.iter_indices(),
    all_points.iter_dense::<MyPoint>(&resolver)
);
let all_indexed_colors = izip!(
    all_colors.iter_indices(),
    all_colors.iter_sparse::<MyColor>(&resolver)
);
let all_indexed_labels = izip!(
    all_labels.iter_indices(),
    all_labels.iter_sparse::<MyLabel>(&resolver)
);

let all_frames = range_zip_1x2(all_indexed_points, all_indexed_colors, all_indexed_labels);

// Then comes the time to resolve/convert and deserialize the data, _for each timestamp_.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
//
// A choice now has to be made regarding the nullability of the _component batch's instances_.
// Our IDL doesn't support nullable instances at the moment -- so for the foreseeable future you probably
// shouldn't be using anything but `iter_dense`.
eprintln!("results:");
for ((data_time, row_id), points, colors, labels) in all_frames {
    let points = match points.flatten() {
        PromiseResult::Pending => {
            // Handle the fact that the data isn't ready appropriately.
            continue;
        }
        PromiseResult::Ready(data) => data,
        PromiseResult::Error(err) => return Err(err.into()),
    };

    let colors = if let Some(colors) = colors {
        match colors.flatten() {
            PromiseResult::Pending => {
                // Handle the fact that the data isn't ready appropriately.
                continue;
            }
            PromiseResult::Ready(data) => data,
            PromiseResult::Error(err) => return Err(err.into()),
        }
    } else {
        vec![]
    };
    let color_default_fn = || Some(MyColor::from(0xFF00FFFF));

    let labels = if let Some(labels) = labels {
        match labels.flatten() {
            PromiseResult::Pending => {
                // Handle the fact that the data isn't ready appropriately.
                continue;
            }
            PromiseResult::Ready(data) => data,
            PromiseResult::Error(err) => return Err(err.into()),
        }
    } else {
        vec![]
    };
    let label_default_fn = || None;

    // With the data now fully resolved/converted and deserialized, the joining logic can be
    // applied.
    //
    // In most cases this will be either a clamped zip, or no joining at all.

    let results = clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn)
        .collect_vec();
    eprintln!("{data_time:?} @ {row_id}:\n    {results:?}");
}
```

- Fixes #3379
- Part of #1893  

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 26, 2024
Static-aware, key-less, component-based, cached range APIs.

```rust
let caches = re_query_cache2::Caches::new(&store);

// First, get the raw results for this query.
//
// They might or might not already be cached. We won't know for sure until we try to access
// each individual component's data below.
let results: CachedRangeResults = caches.range(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().copied(), // no generics!
);

// Then, grab the results for each individual component.
// * `get_required` returns an error if the component batch is missing
// * `get_or_empty` returns an empty set of results if the component is missing
// * `get` returns an option
//
// At this point we still don't know whether they are cached or not. That's the next step.
let all_points: &CachedRangeComponentResults = results.get_required(MyPoint::name())?;
let all_colors: &CachedRangeComponentResults = results.get_or_empty(MyColor::name());
let all_labels: &CachedRangeComponentResults = results.get_or_empty(MyLabel::name());

// Then comes the time to resolve/convert and deserialize the data.
// These steps have to be done together for efficiency reasons.
//
// That's when caching comes into play.
// If the data has already been accessed in the past, then this will just grab the
// pre-deserialized, pre-resolved/pre-converted result from the cache.
// Otherwise, this will trigger a deserialization and cache the result for next time.
let all_points = all_points.to_dense::<MyPoint>(&resolver);
let all_colors = all_colors.to_dense::<MyColor>(&resolver);
let all_labels = all_labels.to_dense::<MyLabel>(&resolver);

// The cache might not have been able to resolve and deserialize the entire dataset across all
// available timestamps.
//
// We can use the following APIs to check the status of the front and back sides of the data range.
//
// E.g. it is possible that the front-side of the range is still waiting for pending data while
// the back-side has been fully loaded.
assert!(matches!(
    all_points.status(),
    (PromiseResult::Ready(()), PromiseResult::Ready(()))
));

// Zip the results together using a stateful time-based join.
let all_frames = range_zip_1x2(
    all_points.range_indexed(),
    all_colors.range_indexed(),
    all_labels.range_indexed(),
);

// Then comes the time to resolve/convert and deserialize the data, _for each timestamp_.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
eprintln!("results:");
for ((data_time, row_id), points, colors, labels) in all_frames {
    let colors = colors.unwrap_or(&[]);
    let color_default_fn = || {
        static DEFAULT: MyColor = MyColor(0xFF00FFFF);
        &DEFAULT
    };

    let labels = labels.unwrap_or(&[]).iter().cloned().map(Some);
    let label_default_fn = || None;

    // With the data now fully resolved/converted and deserialized, the joining logic can be
    // applied.
    //
    // In most cases this will be either a clamped zip, or no joining at all.

    let results = clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn)
        .collect_vec();
    eprintln!("{data_time:?} @ {row_id}:\n    {results:?}");
}
```
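
The "stateful time-based join" step above can be pictured with the following sketch. It is a simplification under stated assumptions rather than the real `range_zip_1x2`: the helper name `range_zip_sketch` is hypothetical, timestamps are plain `i64`s instead of `(data_time, row_id)` pairs, there is a single secondary stream, and both inputs are assumed to be sorted by time. For each primary entry, the join yields the most recent secondary entry at or before that time, if any.

```rust
/// Illustrative sketch only (hypothetical helper, not the `re_query` implementation).
///
/// Both inputs must be sorted by ascending time. For each primary entry, yields the
/// latest secondary value whose time is less than or equal to the primary's time.
fn range_zip_sketch<'a, P, S>(
    primary: &'a [(i64, P)],
    secondary: &'a [(i64, S)],
) -> Vec<(i64, &'a P, Option<&'a S>)> {
    let mut next = 0; // index of the first secondary entry not yet consumed
    let mut latest: Option<&'a S> = None; // state carried from one primary entry to the next

    let mut out = Vec::with_capacity(primary.len());
    for (time, p) in primary {
        // Advance the secondary cursor up to (and including) the primary's time.
        while next < secondary.len() && secondary[next].0 <= *time {
            latest = Some(&secondary[next].1);
            next += 1;
        }
        out.push((*time, p, latest));
    }
    out
}
```

The join is "stateful" in the sense that `latest` persists across iterations, so a color logged once keeps applying to later points until a newer color arrives.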

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- #5990
- #5992
- #5993 
- #5994
- #6035
- #6036
- #6037

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 26, 2024
Title.

The new cache being natively component-based makes things much smoother
than before.

teh-cmc added a commit that referenced this pull request Apr 26, 2024
Text logs, line plots and scatter plots.

A bit faster than `main`, with a bit less memory overhead.

teh-cmc added a commit that referenced this pull request Apr 26, 2024
Migrate all spatial views that were using the old cache APIs to the new
ones.
Instance keys are not queried at all anymore.

All views are now range-aware by default.
Also took the opportunity to somewhat streamline everything.

The 10min air-traffic example with full visible range is about 2-2.5x
faster than before.

I'm sure I broke a few things here and there, I'll run a full check
suite once everything's said and done.

teh-cmc added a commit that referenced this pull request Apr 26, 2024
`re_query_cache` is gone, `re_query_cache2` takes its place -- simple as
that.

teh-cmc added a commit that referenced this pull request Apr 26, 2024
Migrate every little thing that didn't use to go through the cached
APIs.

`Image` and `Mesh3D` are temporarily cached even though they shouldn't
be, that's taken care of in a follow-up PR.

Once again, I probably broke a million edge cases -- I want to get as
fast as possible to removing instance keys before doing an in-depth
quality pass.

teh-cmc added a commit that referenced this pull request Apr 26, 2024
There is now only one way to query data: `re_query` (well you can still
query the datastore directly if you're a monster, but that's for another
PR).

All queries go through both the query cache and the deserialization
cache.
There will be a follow-up PR to disable the deserialization cache for
specific components.

Most of this is just (re)moving stuff around except for the last two
commits which take care of porting the cached test suites since they
cannot depend on uncached APIs to do comparisons anymore.

- Closes #6018 
- Closes #3320

teh-cmc added a commit that referenced this pull request Apr 26, 2024
Make it possible to not cache some components, all while pretending
really hard that they've been cached.

- Related: #5974 
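
One way to picture "not caching while pretending to be cached" is the sketch below. Every name in it (`DeserializationCache`, `get_or_deserialize`, the component names used as string keys) is a hypothetical stand-in and not a Rerun API; it only illustrates the idea that callers use one code path, and a bypass list decides whether the result is retained.

```rust
use std::collections::{HashMap, HashSet};

/// Illustrative sketch only: all names here are hypothetical, not Rerun APIs.
///
/// Callers always go through `get_or_deserialize`; whether the result is stored
/// for next time depends on whether the component is on the bypass list.
struct DeserializationCache {
    uncached_components: HashSet<&'static str>,
    entries: HashMap<&'static str, Vec<u32>>,
}

impl DeserializationCache {
    fn new(uncached_components: impl IntoIterator<Item = &'static str>) -> Self {
        Self {
            uncached_components: uncached_components.into_iter().collect(),
            entries: HashMap::new(),
        }
    }

    fn get_or_deserialize(
        &mut self,
        component: &'static str,
        deserialize: impl FnOnce() -> Vec<u32>,
    ) -> Vec<u32> {
        if self.uncached_components.contains(component) {
            // Same return type as the cached path, but nothing is retained.
            return deserialize();
        }
        // Cached path: deserialize once, then serve the stored copy.
        self.entries
            .entry(component)
            .or_insert_with(deserialize)
            .clone()
    }

    fn num_cached(&self) -> usize {
        self.entries.len()
    }
}
```

In this sketch, large payloads would be listed in `uncached_components` so that they are deserialized on every access instead of being held in memory twice.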

@Wumpf Wumpf added 🦀 Rust API Rust logging API and removed 🪵 Log & send APIs Affects the user-facing API for all languages labels May 15, 2024
Development

Successfully merging this pull request may close these issues.

TimeInt::BEGINNING vs. TimeInt::MIN vs. Option<TimeInt>