Very slow data ingestion of light data #3086

Closed
emilk opened this issue Aug 23, 2023 · 2 comments · Fixed by #3088
Labels
🪳 bug Something isn't working 🚀 performance Optimization, memory use, etc

emilk commented Aug 23, 2023

A user reported extremely slow data ingestion of rather simple data (a few lines and points, 13 MB for ~30s worth of data).

When turning on RUST_LOG=debug the terminal was spammed with:

[2023-08-23T08:33:14Z DEBUG re_arrow_store::store_write] couldn't split indexed bucket, proceeding to ignore limits kind="insert" timeline=frame time="#0" entity=pitch/gt_ball/cam06/projections len_limit=512 len=2298 len_overflow=true

I suspect this is related.

@emilk emilk added 🪳 bug Something isn't working 🚀 performance Optimization, memory use, etc labels Aug 23, 2023
@emilk emilk added this to the 0.9 milestone Aug 23, 2023
@emilk emilk changed the title from "Slow ingestion" to "Very slow data ingestion of light data" Aug 23, 2023
@emilk emilk modified the milestones: removed 0.9, added 0.8.2 Aug 23, 2023

emilk commented Aug 23, 2023

All data has frame=0, and so the store cannot split the bucket. The problem is that it is re-sorting the bucket on each insert:

[Image: Screenshot 2023-08-23 at 13 16 57]
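
To illustrate why a bucket whose rows all share one time value cannot be split, here is a minimal sketch. It is not the code from re_arrow_store: the function name `find_split_index` and the rule "never put the same time value on both sides of a split" are assumptions made for this example.

```rust
/// Hypothetical helper (not the actual re_arrow_store API).
///
/// Given the time column of a bucket, sorted ascending, return the index
/// closest to the middle at which the bucket can be cut so that no time
/// value appears on both sides of the cut.
fn find_split_index(times: &[i64]) -> Option<usize> {
    if times.len() < 2 {
        return None;
    }
    let mid = times.len() / 2;

    // A valid cut at index `i` requires the value to change between i-1 and i.
    // Look for the nearest such boundary at or to the right of the middle...
    let right = (mid..times.len()).find(|&i| times[i] != times[i - 1]);
    // ...and the nearest one strictly to the left of the middle.
    let left = (1..mid).rev().find(|&i| times[i] != times[i - 1]);

    match (left, right) {
        // Both sides have a boundary: pick the one closest to the middle.
        (Some(l), Some(r)) => Some(if mid - l <= r - mid { l } else { r }),
        // At most one side has a boundary (or neither does).
        (l, r) => l.or(r),
    }
}
```

With 2298 rows that all have time 0, as in the log line above, no boundary exists and the function returns `None`, so under this assumed rule the bucket has to keep growing past its length limit.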

So the problem is:


emilk commented Aug 23, 2023

Let's fix #433, but also remove the log spam
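
One way to avoid re-sorting a bucket on every insert is to track whether each insert preserved the order, and to sort only when it did not. The sketch below shows that idea under stated assumptions: `Bucket`, `insert`, `sort_if_needed`, and the `sort_count` counter are hypothetical names for illustration, and this is not necessarily how the fix in #3088 is implemented.

```rust
/// Hypothetical bucket holding only a time column (not the re_arrow_store type).
#[derive(Debug)]
struct Bucket {
    times: Vec<i64>,
    /// True while `times` is known to be in ascending order.
    is_sorted: bool,
    /// Number of real sorts performed; only here to make the effect visible.
    sort_count: usize,
}

impl Bucket {
    fn new() -> Self {
        // An empty bucket is trivially sorted.
        Self { times: Vec::new(), is_sorted: true, sort_count: 0 }
    }

    /// Append a row. The bucket stays sorted as long as the new time is
    /// not smaller than the last one, which covers equal times too.
    fn insert(&mut self, time: i64) {
        if let Some(&last) = self.times.last() {
            if time < last {
                self.is_sorted = false;
            }
        }
        self.times.push(time);
    }

    /// Sort only if an out-of-order insert has happened since the last sort.
    fn sort_if_needed(&mut self) {
        if self.is_sorted {
            return;
        }
        self.times.sort(); // stable sort: rows with equal times keep insertion order
        self.is_sorted = true;
        self.sort_count += 1;
    }
}
```

In this sketch, inserting thousands of rows that all have the same time value never triggers a sort, while a single out-of-order insert triggers exactly one.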

emilk added a commit that referenced this issue Aug 23, 2023
### What
* Closes #3086
* Closes #433

This should also overall just speed up data insertion for the common
case of already-sorted data

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3088) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3088)
- [Docs
preview](https://rerun.io/preview/e5adb1aa580de2274b4eca9f6c5de38ae503b521/docs)
<!--DOCS-PREVIEW-->
- [Examples
preview](https://rerun.io/preview/e5adb1aa580de2274b4eca9f6c5de38ae503b521/examples)
<!--EXAMPLES-PREVIEW--><!--EXAMPLES-PREVIEW-->
- [Recent benchmark results](https://ref.rerun.io/dev/bench/)
- [Wasm size tracking](https://ref.rerun.io/dev/sizes/)
jleibs pushed a commit that referenced this issue Aug 31, 2023