New data APIs 13: sunset legacy cache crate #5994

Merged
teh-cmc merged 4 commits into main from cmc/data_apis_13_kill_old_cache
Apr 26, 2024

teh-cmc
Member

@teh-cmc teh-cmc commented Apr 16, 2024

re_query_cache is gone, re_query_cache2 takes its place -- simple as that.


Part of a PR series to completely revamp the data APIs in preparation for the removal of instance keys and the introduction of promises:

Builds on top of the static data PR series:

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!

@teh-cmc teh-cmc added the 🔍 re_query, do-not-merge, 🚜 refactor, and include in changelog labels Apr 16, 2024
@teh-cmc teh-cmc changed the base branch from main to cmc/data_apis_12_spatial April 16, 2024 13:31
@teh-cmc teh-cmc force-pushed the cmc/data_apis_12_spatial branch 2 times, most recently from ef52f75 to 0bd949c Compare April 18, 2024 15:00
@teh-cmc teh-cmc force-pushed the cmc/data_apis_13_kill_old_cache branch from 6634872 to e7d8fe1 Compare April 18, 2024 15:03
@teh-cmc teh-cmc force-pushed the cmc/data_apis_12_spatial branch from 0bd949c to 8a3356c Compare April 19, 2024 09:33
@teh-cmc teh-cmc force-pushed the cmc/data_apis_13_kill_old_cache branch from e7d8fe1 to 3299bb5 Compare April 19, 2024 09:38
@teh-cmc teh-cmc force-pushed the cmc/data_apis_12_spatial branch from 8a3356c to feefc95 Compare April 23, 2024 11:27
@teh-cmc teh-cmc force-pushed the cmc/data_apis_13_kill_old_cache branch from 3299bb5 to 71fb160 Compare April 24, 2024 09:17
@teh-cmc teh-cmc marked this pull request as ready for review April 24, 2024 09:40
@emilk emilk self-requested a review April 25, 2024 08:51
@teh-cmc teh-cmc force-pushed the cmc/data_apis_12_spatial branch from 50b32ff to 734c6cb Compare April 25, 2024 16:16
@teh-cmc teh-cmc force-pushed the cmc/data_apis_13_kill_old_cache branch from 7a61a31 to 8177a18 Compare April 25, 2024 16:27
teh-cmc added a commit that referenced this pull request Apr 26, 2024
Static-aware, key-less, component-based, cached range APIs.

```rust
let caches = re_query_cache2::Caches::new(&store);

// First, get the raw results for this query.
//
// They might or might not already be cached. We won't know for sure until we try to access
// each individual component's data below.
let results: CachedRangeResults = caches.range(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().copied(), // no generics!
);

// Then, grab the results for each individual component.
// * `get_required` returns an error if the component batch is missing
// * `get_or_empty` returns an empty set of results if the component is missing
// * `get` returns an option
//
// At this point we still don't know whether they are cached or not. That's the next step.
let all_points: &CachedRangeComponentResults = results.get_required(MyPoint::name())?;
let all_colors: &CachedRangeComponentResults = results.get_or_empty(MyColor::name());
let all_labels: &CachedRangeComponentResults = results.get_or_empty(MyLabel::name());

// Then comes the time to resolve/convert and deserialize the data.
// These steps have to be done together for efficiency reasons.
//
// That's when caching comes into play.
// If the data has already been accessed in the past, then this will just grab the
// pre-deserialized, pre-resolved/pre-converted result from the cache.
// Otherwise, this will trigger a deserialization and cache the result for next time.
let all_points = all_points.to_dense::<MyPoint>(&resolver);
let all_colors = all_colors.to_dense::<MyColor>(&resolver);
let all_labels = all_labels.to_dense::<MyLabel>(&resolver);

// The cache might not have been able to resolve and deserialize the entire dataset across all
// available timestamps.
//
// We can use the following APIs to check the status of the front and back sides of the data range.
//
// E.g. it is possible that the front-side of the range is still waiting for pending data while
// the back-side has been fully loaded.
assert!(matches!(
    all_points.status(),
    (PromiseResult::Ready(()), PromiseResult::Ready(()))
));

// Zip the results together using a stateful time-based join.
let all_frames = range_zip_1x2(
    all_points.range_indexed(),
    all_colors.range_indexed(),
    all_labels.range_indexed(),
);

// Then comes the time to resolve/convert and deserialize the data, _for each timestamp_.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
eprintln!("results:");
for ((data_time, row_id), points, colors, labels) in all_frames {
    let colors = colors.unwrap_or(&[]);
    let color_default_fn = || {
        static DEFAULT: MyColor = MyColor(0xFF00FFFF);
        &DEFAULT
    };

    let labels = labels.unwrap_or(&[]).iter().cloned().map(Some);
    let label_default_fn = || None;

    // With the data now fully resolved/converted and deserialized, the joining logic can be
    // applied.
    //
    // In most cases this will be either a clamped zip, or no joining at all.

    let results = clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn)
        .collect_vec();
    eprintln!("{data_time:?} @ {row_id}:\n    {results:?}");
}
```
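
For readers unfamiliar with the "clamped zip" mentioned in the comments above, here is a minimal sketch of the idea, reduced to one required and one optional component. `clamped_zip_1x1` is a hypothetical helper written for illustration; it is not the `re_query` implementation, and the real `clamped_zip_1x2` may differ in signature and edge cases. The assumption is that the required side drives the length of the output, and that the optional side is padded with its last value, or with a default if it is empty.

```rust
/// Hypothetical, simplified 1x1 clamped zip (illustration only, not the `re_query` code).
///
/// Yields exactly one pair per `required` item. The `optional` side is padded by
/// repeating its last value, or by calling `default` if it was empty to begin with.
/// Extra `optional` items beyond the length of `required` are ignored.
fn clamped_zip_1x1<'a, R, O, D>(
    required: &'a [R],
    optional: &'a [O],
    default: D,
) -> impl Iterator<Item = (&'a R, O)> + 'a
where
    O: Clone,
    D: Fn() -> O + 'a,
{
    // Last value seen on the optional side, used for padding.
    let mut last: Option<O> = None;
    let mut optional = optional.iter();

    required.iter().map(move |r| {
        // Advance the optional side while it still has data.
        if let Some(o) = optional.next() {
            last = Some(o.clone());
        }
        // Fall back to the default only if the optional side never had any data.
        let o = last.clone().unwrap_or_else(&default);
        (r, o)
    })
}
```

With three points and a single color, every point ends up paired with that color; with no colors at all, every point is paired with the default.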

---

Part of a PR series to completely revamp the data APIs in preparation
for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- #5990
- #5992
- #5993 
- #5994
- #6035
- #6036
- #6037

Builds on top of the static data PR series:
- #5534
teh-cmc added a commit that referenced this pull request Apr 26, 2024
Title.

The new cache being natively component-based makes things much smoother
than before.

teh-cmc added a commit that referenced this pull request Apr 26, 2024
Text logs, line plots and scatter plots.

A bit faster than `main`, with a bit less memory overhead.

@teh-cmc teh-cmc force-pushed the cmc/data_apis_12_spatial branch from b69decd to f3cd349 Compare April 26, 2024 10:41
Base automatically changed from cmc/data_apis_12_spatial to main April 26, 2024 10:41
teh-cmc added a commit that referenced this pull request Apr 26, 2024
Migrate all spatial views that were using the old cache APIs to the new
ones.
Instance keys are not queried at all anymore.

All views are now range-aware by default.
Also took the opportunity to somewhat streamline everything.

The 10min air-traffic example with full visible range is about 2-2.5x
faster than before.

I'm sure I broke a few things here and there; I'll run a full check
suite once everything's said and done.

@teh-cmc teh-cmc force-pushed the cmc/data_apis_13_kill_old_cache branch from 8177a18 to cb70b89 Compare April 26, 2024 10:43
@teh-cmc teh-cmc merged commit aa044bc into main Apr 26, 2024
11 of 18 checks passed
@teh-cmc teh-cmc deleted the cmc/data_apis_13_kill_old_cache branch April 26, 2024 10:43
teh-cmc added a commit that referenced this pull request Apr 26, 2024
Migrate every little thing that didn't use to go through the cached
APIs.

`Image` and `Mesh3D` are temporarily cached even though they shouldn't
be, that's taken care of in a follow-up PR.

Once again, I probably broke a million edge cases -- I want to get as
fast as possible to removing instance keys before doing an in-depth
quality pass.

teh-cmc added a commit that referenced this pull request Apr 26, 2024
There is now only one way to query data: `re_query` (well you can still
query the datastore directly if you're a monster, but that's for another
PR).

All queries go through both the query cache and the deserialization
cache.
There will be a follow-up PR to disable the deserialization cache for
specific components.

Most of this is just (re)moving stuff around except for the last two
commits which take care of porting the cached test suites since they
cannot depend on uncached APIs to do comparisons anymore.

- Closes #6018 
- Closes #3320

teh-cmc added a commit that referenced this pull request Apr 26, 2024
Make it possible to not cache some components, all while pretending
really hard that they've been cached.

- Related: #5974 
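
As a rough illustration of what "not caching while pretending to be cached" can look like, here is a minimal sketch. Everything in it is an assumption made for the example: `DeserCache`, its fields, the `u64` row id, and the byte-to-`u32` "deserialization" are hypothetical stand-ins, not the `re_query` types. The point is only that callers go through the same `get` call either way, while components on the uncached list are deserialized on every access and never stored.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical deserialization cache (illustration only, not the `re_query` types).
struct DeserCache {
    /// Components that must never be stored, e.g. very large payloads.
    uncached: HashSet<&'static str>,
    /// Deserialized data, keyed by (component name, row id).
    entries: HashMap<(&'static str, u64), Vec<u32>>,
    /// Counts how many times deserialization actually ran.
    deserializations: usize,
}

impl DeserCache {
    fn new(uncached: &[&'static str]) -> Self {
        Self {
            uncached: uncached.iter().copied().collect(),
            entries: HashMap::new(),
            deserializations: 0,
        }
    }

    /// Stand-in for real deserialization work: little-endian bytes to `u32`s.
    fn deserialize(raw: &[u8]) -> Vec<u32> {
        raw.chunks_exact(4)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect()
    }

    /// Same entry point for cached and uncached components.
    fn get(&mut self, component: &'static str, row_id: u64, raw: &[u8]) -> Vec<u32> {
        // Uncached components: deserialize every time, store nothing.
        if self.uncached.contains(component) {
            self.deserializations += 1;
            return Self::deserialize(raw);
        }
        // Cached components: return the stored result if there is one.
        if let Some(hit) = self.entries.get(&(component, row_id)) {
            return hit.clone();
        }
        // Cache miss: deserialize once and remember the result.
        self.deserializations += 1;
        let value = Self::deserialize(raw);
        self.entries.insert((component, row_id), value.clone());
        value
    }
}
```

In this sketch, two reads of a cached component cost one deserialization, while two reads of an uncached one cost two and leave no entry behind.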

Labels
do-not-merge, include in changelog, 🔍 re_query, 🚜 refactor

2 participants