Save rerun data to file #2

Merged
merged 6 commits into from
Apr 15, 2022

Conversation

emilk
Member

@emilk emilk commented Apr 14, 2022

.rrd for rerun data.

You cannot save files from the web viewer, but you can load them (by drag-dropping onto it).

@emilk emilk merged commit 1999acc into main Apr 15, 2022
@emilk emilk deleted the save-to-file branch April 15, 2022 05:39
teh-cmc added a commit that referenced this pull request Dec 5, 2022
* object path => entity path

* move utils from lib.rs to dedicated file

* color_rgba -> color_srgba_unmultiplied

* getting intimate with arrow's datamodel

* getting _even more_ intimate with arrow's datamodel

* split it

* building dem index keys

* disgustingly, incorrectly inserting components all the way down

* timelines need no list

* similarly clarifying the nested listing situation, on the components side this time

* make sure it looks like it should!

* actual integration tests

* bootstrapping text-based debugging

* bootstrapping indices

* introducing TypedTimeInt everywhere

* full index sorting

* auto-inserting empty component lists in starting buckets

* better datagen tools

* bidirectional merges for indices + properly showing NULLs in dataframes

* finally can show off some more advanced ingestion patterns!

* dealing with corrupt validity bitmaps, and the sheer size of my stupidity

* read path taking its first steps: latest_at for indices!

* look! it's a read path!

* it works!

* show the resulting dataframe duh

* clean up pass #1: task log

* clean up pass #2: moving everybody where they belong

* clean up pass #3: definitions

* a minimal solution for missing components

* some more cleanup

* porting relevant TODOs into issues

* appeasing the CI deities

* merge catastrophe

* they see me cleanin', they hatin'

* * Reorg of re_arrow_store
* Removed old ArrowDB code
* Connected app data ingest into new DataStore

* fix broken doc links

* store files prefixed with store_

* integration tests in integration folder + exposing datagen tools to everyone

* make integration tests scale to more complex scenarios

* adding currently failing scenario: query before any data present

* added failing test and scenarios for all emptiness-related edge cases

* better testing tools

* fixing broken edge cases on read path

* demonstrating faulty read behavior in roundtrip test

* fixing dem faulty swaps

* when the doc itself demonstrates bugs :x

* adding baseline bench somewhat mimicking the legacy ones, though it doesn't really make sense anymore

* exploding query results so you can actually do stuff with them

* properly testing all halfway frames (and, unsurprisingly, failing!)

* properly dealing with multi-row primary indices

* less verbose scenarios for end-to-end latest_at tests

* addressing misc PR comments

* TimeReal, TimeRange & TimeRangeF are now a property of re_log_types™

* retiring TypedTimeRange before Emil tries to hurt it

* mark unreachable as such

* replaced binary_search with a partition_point

* using entity path hashes directly in indexing datastructures

* re_viewer don't need those no more

Co-authored-by: John Hughes <[email protected]>
Co-authored-by: Emil Ernerfeldt <[email protected]>
@emilk emilk mentioned this pull request Jan 24, 2023
Wumpf added a commit that referenced this pull request Aug 10, 2023
# This is the 1st commit message:

new color/keypoint/classid/label datatypes

# This is the commit message #2:

fixups
emilk pushed a commit that referenced this pull request Aug 17, 2023
### What

The viewer has been limited to handling a single dropped file at a time
ever since #2 😮, with a bug in error handling to boot (the error was
shown only for 3+ files).

This PR removes that limitation, as it seems to... just work.

<img width="1747" alt="image"
src="https://github.com/rerun-io/rerun/assets/49431240/ef435608-d505-4dec-b713-8675df8927ce">


### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3030) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3030)
- [Docs
preview](https://rerun.io/preview/pr%3Aantoine%2Fmulti-dropped-files/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aantoine%2Fmulti-dropped-files/examples)
teh-cmc added a commit that referenced this pull request Feb 29, 2024
commit f15c79b
Author: Clement Rey <[email protected]>
Date:   Wed Feb 28 17:55:25 2024 +0100

    fmt

commit 7b50fa8
Author: Clement Rey <[email protected]>
Date:   Wed Feb 28 17:53:17 2024 +0100

    enable data_loaders feature by default in rerun_py

commit a35a9d0
Author: Clement Rey <[email protected]>
Date:   Wed Feb 28 12:28:53 2024 +0100

    add python example

commit 5dd9685
Author: Clement Rey <[email protected]>
Date:   Wed Feb 28 12:28:21 2024 +0100

    expose dataloaders to python SDK
teh-cmc added a commit that referenced this pull request Oct 2, 2024
A first implementation of the new dataframe APIs.
The name is now very misleading though: there isn't anything dataframe-y
left in here, it is a row-based iterator with Rerun semantics baked in,
driven by a sorted streaming join.

It is rather slow (related:
#7558 (comment)),
lacks many features and is full of edge cases, but it works.
It does support dedupe-latest semantics (slowly), view contents and
selections, chunk overlaps, and pagination (horribly, by virtue of
implementing `Iterator`).
It does _not_ support `Clear`s, nor `latest-at` sparse-filling, nor
PoVs, nor index sampling. Yet.

Upcoming PRs will be all about fixing these shortcomings one by one.

It should look somewhat familiar:
```rust
let query_cache = QueryCache::new(store);
let query_engine = QueryEngine {
    store,
    cache: &query_cache,
};

let mut query = QueryExpression2::new(timeline);
query.view_contents = Some(
    query_engine
        .iter_entity_paths(&entity_path_filter)
        .map(|entity_path| (entity_path, None))
        .collect(),
);
query.filtered_index_range = Some(ResolvedTimeRange::new(time_from, time_to));
eprintln!("{query:#?}:");

let query_handle = query_engine.query(query.clone());
// eprintln!("{:#?}", query_handle.selected_contents());
for batch in query_handle.into_batch_iter().skip(offset).take(len) {
    eprintln!("{batch}");
}
```
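
To make the "row-based iterator driven by a sorted streaming join" with dedupe-latest semantics more concrete, here is a minimal standalone sketch. It is not the code from this PR and uses none of Rerun's types: `Row`, `StreamingJoin`, `page`, and `example_rows` are hypothetical names, and each "column" is assumed to be a plain vector of `(time, value)` pairs already sorted by time.

```rust
use std::iter::Peekable;
use std::vec::IntoIter;

/// One output row: an index value plus one optional cell per input column.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct Row {
    time: i64,
    cells: Vec<Option<&'static str>>,
}

/// Hypothetical row iterator that joins several time-sorted columns on the fly.
struct StreamingJoin {
    columns: Vec<Peekable<IntoIter<(i64, &'static str)>>>,
}

impl StreamingJoin {
    /// Assumes every input column is already sorted by time.
    fn new(columns: Vec<Vec<(i64, &'static str)>>) -> Self {
        Self {
            columns: columns
                .into_iter()
                .map(|c| c.into_iter().peekable())
                .collect(),
        }
    }
}

impl Iterator for StreamingJoin {
    type Item = Row;

    fn next(&mut self) -> Option<Row> {
        // The next row's index is the smallest time at the head of any column.
        // If every column is exhausted, `min()` is `None` and iteration ends.
        let time = self
            .columns
            .iter_mut()
            .filter_map(|c| c.peek().map(|(t, _)| *t))
            .min()?;

        let mut cells = Vec::with_capacity(self.columns.len());
        for col in &mut self.columns {
            let mut cell = None;
            // Drain every entry at this time; the last one wins (dedupe-latest).
            while let Some((_, value)) = col.next_if(|(t, _)| *t == time) {
                cell = Some(value);
            }
            // Columns with no entry at this time stay `None` (no sparse-filling).
            cells.push(cell);
        }
        Some(Row { time, cells })
    }
}

/// Pagination "by virtue of implementing `Iterator`": skip, then take.
fn page(columns: Vec<Vec<(i64, &'static str)>>, offset: usize, len: usize) -> Vec<Row> {
    StreamingJoin::new(columns).skip(offset).take(len).collect()
}

/// Two columns; the first has a duplicate entry at time 2.
fn example_rows() -> Vec<Row> {
    let a = vec![(1, "a1"), (2, "a2-old"), (2, "a2-new")];
    let b = vec![(2, "b2"), (3, "b3")];
    page(vec![a, b], 0, usize::MAX)
}
```

Because pagination here is just `skip` followed by `take`, every skipped row is still computed, which is one plausible reading of why the description calls this approach to pagination horrible.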

No tests until we have the guarantee that these are the semantics we
will commit to.

* Part of #7495 
* Requires #7559