Logging lots of out-of-order data using the same timepoint on a timeline leads to extreme slowdown in datastore #4415

Closed
emilk opened this issue Dec 1, 2023 · 6 comments · Fixed by #4417 or #6570


emilk commented Dec 1, 2023

A user shared a puffin profile of very slow data ingestion

slow_ingestion.puffin.zip

[two screenshots of the puffin flamegraph]

What I see is this: the data ingestion (which is currently happening in the main rendering loop, see #4298) gets a message containing ~3800 rows of data. Processing that takes ~2 seconds, because each insertion takes a whopping ~9ms. The bulk of the insertion cost is re-sorting, which makes me suspect that the reason for this new slowdown is out-of-order insertions.

There are several layers of bad here:

  • As mentioned, we shouldn't be doing insertions on the main render loop (Parallelize data ingestion #4298)
  • If we do, we should make sure to not use more than a certain time budget
  • Each insertion should not be this slow, even with out-of-order insertion

Workarounds

  • Run the logging with RERUN_FLUSH_NUM_ROWS=1: this creates smaller chunks of work for the viewer, which should restore the viewer's frame-rate, though ingestion will still be slow (see the sketch after this list)

  • Stop logging data out-of-order. This could happen if you are using custom timelines that are not monotonically increasing. Rerun should support out-of-order logging, but it seems like we're hitting a bad corner case here.
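For illustration, a minimal sketch of the first workaround from a Rust logging program. The app id, entity path, and data are made up, and it assumes the SDK's batcher picks up `RERUN_FLUSH_NUM_ROWS` when the recording stream is created; setting the variable in the shell that launches the program works regardless.

```rust
// Minimal sketch of the RERUN_FLUSH_NUM_ROWS=1 workaround. Assumes the SDK's
// batcher reads the variable when the recording stream is created; setting it
// in the launching shell works regardless. App id, entity path, and data are
// placeholders.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Flush after every row: smaller chunks of work for the viewer.
    std::env::set_var("RERUN_FLUSH_NUM_ROWS", "1");

    let rec = rerun::RecordingStreamBuilder::new("slow_ingestion_repro").spawn()?;

    for i in 0..100 {
        rec.set_time_sequence("frame", i);
        rec.log("points", &rerun::Points3D::new([(i as f32, 0.0, 0.0)]))?;
    }
    Ok(())
}
```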

Other notes

It would be great if we could tell from a puffin profile whether or not the code that was running was a debug or release build.
If this was from a debug build, then this whole issue is a mistake.

@emilk emilk added 🪳 bug Something isn't working ⛃ re_datastore affects the datastore itself 🚀 performance Optimization, memory use, etc labels Dec 1, 2023

teh-cmc commented Dec 1, 2023

The first thing that immediately jumps out to me is: why are the sorting routines even running at all on the ingestion path?!

There is only one very specific case where we ever try to sort during ingestion: the bucket is full and has to be split.
Since I don't see any trace of IndexedBucket::split in this flamegraph, it means we're early exiting before the actual split, which means the bucket only contains a single timestamp.

```rust
fn split(&self) -> Option<(TimeInt, Self)> {
    let Self {
        timeline,
        cluster_key: _,
        inner,
    } = self;

    let mut inner1 = inner.write();
    inner1.sort();

    let IndexedBucketInner {
        is_sorted: _,
        time_range: time_range1,
        col_time: col_time1,
        col_insert_id: col_insert_id1,
        col_row_id: col_row_id1,
        col_num_instances: col_num_instances1,
        columns: columns1,
        size_bytes: _, // NOTE: recomputed below
    } = &mut *inner1;

    if col_time1.len() < 2 {
        return None; // early exit: can't split the unsplittable
    }

    if col_time1.first() == col_time1.last() {
        // The entire bucket contains only one timepoint, thus it's impossible to find
        // a split index to begin with.
        return None;
    }

    re_tracing::profile_function!();
```

But then why is the bucket sorted flag invalidated every frame?

This code looks correct:

```rust
*is_sorted &= (*last_time, *last_row_id) <= (time.as_i64(), row.row_id());
```

So the only explanation I can think of is that the user is somehow repeatedly ingesting data at a single timestamp and doing so with out-of-order RowIds (maybe multiple processes and/or threads are involved)?

In any case I can probably make a patch to move that sort after the early exit: the bucket's time-range bounds are up-to-date even when the bucket is unsorted.
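To make the proposed reordering concrete, here is a minimal, self-contained toy sketch (stand-in types, not the actual `re_arrow_store` code) where the cheap early-exit checks run before any sorting is paid for:

```rust
// Toy sketch of "move the sort after the early exits" (stand-in types, not the
// actual re_arrow_store code).
type TimeInt = i64;

struct Bucket {
    col_time: Vec<TimeInt>,
    is_sorted: bool,
}

impl Bucket {
    fn sort(&mut self) {
        self.col_time.sort_unstable();
        self.is_sorted = true;
    }

    /// Returns the time to split at, or `None` if the bucket is unsplittable.
    fn split_point(&mut self) -> Option<TimeInt> {
        if self.col_time.len() < 2 {
            return None; // can't split the unsplittable
        }
        if self.col_time.first() == self.col_time.last() {
            return None; // a single timepoint: there is no split index
        }
        // Only now do we actually need the rows to be in order:
        self.sort();
        Some(self.col_time[self.col_time.len() / 2])
    }
}

fn main() {
    // A bucket whose rows all share one timepoint early-exits without sorting:
    let mut bucket = Bucket { col_time: vec![3; 4], is_sorted: false };
    assert_eq!(bucket.split_point(), None);
}
```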


teh-cmc commented Dec 1, 2023

> It would be great if we could tell from a puffin profile whether or not the code that was running was a debug or release build.
> If this was from a debug build, then this whole issue is a mistake.

+1 -- ideally even the whole rerun version string

@teh-cmc teh-cmc self-assigned this Dec 1, 2023
teh-cmc added a commit that referenced this issue Dec 1, 2023
…ath (#4417)

This should fix #4415, although without knowing how exactly the user
ended up in that situation it's hard to say for certain.
@emilk emilk reopened this Dec 4, 2023

emilk commented Dec 4, 2023

#4417 only postponed the problem:

[screenshot, 2023-12-04 17:13]

The user was using rec.set_time_sequence temporarily to log a few things, then never called disable_timeline. The user then went on to log 800k+ other things, all of which ended up on the same timepoint of this "temporary" timeline.
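A hypothetical sketch of that pattern (Rust SDK; not the user's actual code, entity paths and values made up):

```rust
// Hypothetical sketch of the pattern described above (not the user's code).
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let rec = rerun::RecordingStreamBuilder::new("keyframes_example").spawn()?;

    // A "temporary" timeline, used for a handful of rows:
    rec.set_time_sequence("keyframes", 35);
    rec.log("keyframe/marker", &rerun::Points3D::new([(0.0, 0.0, 0.0)]))?;

    // The missing call: without it, the "keyframes"=35 timepoint sticks to
    // every subsequent row as well.
    // rec.disable_timeline("keyframes");

    // 800k+ further rows now all land on "keyframes"=35, i.e. in one huge bucket:
    for i in 0..800_000 {
        rec.set_time_sequence("frame", i);
        rec.log("points", &rerun::Points3D::new([(i as f32, 0.0, 0.0)]))?;
    }
    Ok(())
}
```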

EDIT: I have been given a reproduction in a private repository. It logs everything on the main thread, yet the RowIds reach the batcher out-of-order for some unknown reason.

EDIT2: The cause was this: #4658

@emilk emilk assigned emilk and unassigned teh-cmc Dec 5, 2023
@emilk emilk changed the title Extremely slow data ingestion Logging lots of out-of-order data using the same timepoint on a timeline leads to extreme slowdown in datastore Dec 5, 2023
@emilk emilk added this to the Triage milestone Dec 5, 2023

emilk commented Dec 5, 2023

We do log a warning:

```
[2023-12-05T13:00:56Z WARN re_arrow_store::store_write] Found over 512 rows with the same timepoint "keyframes"=#35 - perhaps you forgot to update or remove the timeline "keyframes"?
```

I think the only solution is to actually split these buckets so they don't grow.

@emilk emilk removed their assignment Dec 11, 2023
emilk added a commit that referenced this issue Jan 3, 2024
### What
Rerun is designed to be able to handle out-of-order ingestion, but with
a performance hit.

Usually that performance hit is very small, but when logging to the
exact same timestamp many times, we hit a corner-case of the data store
where we get a huge bucket (see
#4415). If the data arrives
out-of-order, this means an expensive resort of the huge bucket on each
ingestion.

Usually such out-of-order log rows should only happen in multi-threaded
applications, but due to a bug it happened almost always. This PR fixes
that bug and adds a regression test for it. It thus alleviates the
problem in #4415, but does not fix it for applications that genuinely
log out-of-order from multiple threads.

~I introduced the bug in #4502 after the 0.11 release.~ **EDIT:** no, it
was pre-existing!

<img
src="https://github.com/rerun-io/rerun/assets/1148717/fde1f973-deec-46be-b6cc-b671968bf633"
width="250">

@emilk emilk self-assigned this Jan 4, 2024
emilk added a commit that referenced this issue Jan 5, 2024
### What
In order to investigate #4415 I
wanted to reproduce it.

This PR adds a benchmark that inserts many rows using the same
timepoint, either in-order or shuffled.

I also fixed a "bug" in the existing insertion benchmarks, which were
accidentally _also_ measuring the time to call `DataTable::to_rows`
(and collect the result); that accounted for about 10% of the measured time.

---

5950X pinned to core #7 / Arch:
```
datastore/num_rows=1000/num_instances=1/shuffled=false/insert_same_time_point/insert      1.00  1641.2±28.89µs 595.0 KElem/sec
datastore/num_rows=1000/num_instances=1/shuffled=true/insert_same_time_point/insert       1.00      3.0±0.04ms 328.4 KElem/sec
datastore/num_rows=10000/num_instances=1/shuffled=false/insert_same_time_point/insert     1.00     27.1±1.59ms 361.0 KElem/sec
datastore/num_rows=10000/num_instances=1/shuffled=true/insert_same_time_point/insert      1.00    162.5±1.23ms 60.1 KElem/sec
```
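For intuition about why the shuffled case is so much slower, here is a self-contained toy model of the corner case this issue describes (it is not the actual datastore): an out-of-order row marks the single huge same-timepoint bucket as unsorted, and the whole, ever-growing bucket then gets re-sorted.

```rust
use std::time::Instant;

// Toy model of the corner case in this issue (not the actual datastore): a
// single bucket that re-sorts itself whenever an out-of-order insertion made
// it dirty.
struct ToyBucket {
    rows: Vec<u64>, // stand-in for (time, RowId) keys
    is_sorted: bool,
}

impl ToyBucket {
    fn insert(&mut self, key: u64) {
        if let Some(&last) = self.rows.last() {
            self.is_sorted &= last <= key;
        }
        self.rows.push(key);
        if !self.is_sorted {
            // The expensive path: re-sort the whole (growing) bucket.
            self.rows.sort_unstable();
            self.is_sorted = true;
        }
    }
}

fn time_insertions(keys: impl Iterator<Item = u64>) -> std::time::Duration {
    let mut bucket = ToyBucket { rows: Vec::new(), is_sorted: true };
    let start = Instant::now();
    for key in keys {
        bucket.insert(key);
    }
    start.elapsed()
}

fn main() {
    const NUM_ROWS: u64 = 10_000;
    // In-order keys never trigger a re-sort; reversed keys trigger one per insert.
    let in_order = time_insertions(0..NUM_ROWS);
    let out_of_order = time_insertions((0..NUM_ROWS).rev());
    println!("in-order: {in_order:?}, out-of-order: {out_of_order:?}");
}
```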


Co-authored-by: Clement Rey <[email protected]>

emilk commented Jan 5, 2024

Summary: when this issue was opened, we hit this extreme slowdown every time we logged a lot of data using the same time point. Since then, two PRs have landed:

  • The first defers the sorting so that it only happens at query time. The viewer can still get really slow, but it can ingest a lot more data between each slow frame, perhaps reaching the end of the recording (if any) fast enough.

  • The second ensures we only hit the corner case if data is logged out-of-order, meaning the log rows arrive at the batcher in a different order than they were produced (swapped RowIds). For that to happen there has to be a race: two threads logging to the same timeline, using the same time point.

In other words, the risk of this bug producing extreme slowdowns is a lot smaller now. It can still happen, but I think it is safe to lower its priority until we get more reports of it.


emilk commented Jan 5, 2024

For future adventurers:

The problem with splitting the buckets further even when they share the same timepoint is the definition of IndexTable::buckets:

```rust
pub buckets: BTreeMap<TimeInt, IndexedBucket>
```

Since BTreeMap doesn't allow duplicate keys, we would either need to change the key there, or add "sub-buckets" split on RowId inside each IndexedBucket. Either way, it is an annoying change.
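For future reference, a hypothetical sketch of the first option (stand-in types, not the actual store): widen the key to a `(time, first RowId)` pair, so several buckets can share a timepoint and a lookup for one time becomes a range scan.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch of the "change the key" option (stand-in types, not the
// actual store).
type TimeInt = i64;
type RowId = u64; // stand-in for rerun's RowId

struct IndexedBucket; // columns, time range, ... elided

struct IndexTable {
    // Key = (start time, first RowId in the bucket), so buckets that share a
    // timepoint can still be split on RowId.
    buckets: BTreeMap<(TimeInt, RowId), IndexedBucket>,
}

impl IndexTable {
    /// All buckets whose first row is at time `t`.
    fn buckets_starting_at<'a>(&'a self, t: TimeInt) -> impl Iterator<Item = &'a IndexedBucket> + 'a {
        self.buckets
            .range((t, RowId::MIN)..=(t, RowId::MAX))
            .map(|(_, bucket)| bucket)
    }
}

fn main() {
    let mut table = IndexTable { buckets: BTreeMap::new() };
    // Two buckets at the same timepoint, split on RowId:
    table.buckets.insert((3, 0), IndexedBucket);
    table.buckets.insert((3, 512), IndexedBucket);
    assert_eq!(table.buckets_starting_at(3).count(), 2);
}
```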

@emilk emilk removed their assignment Jan 5, 2024
@teh-cmc teh-cmc self-assigned this May 31, 2024
@teh-cmc teh-cmc closed this as completed in ed20cba Jul 4, 2024