Optimized cpu time for 3D point clouds (once again!) #5273
Merged
What
Embarrassingly simple optimization that got missed when cachifying the point cloud: removed two needless `to_vec` copy calls which were done on separate rayon jobs.
pixi run rerun-release ../rainwater.rrd --threads=1
19.6ms -> 17.7ms cpu time on the top bar counter in the rainwater scene (4.6 million colored points with homogeneous radius).
Without --threads=1, perf was too unstable to make good statements (also, perf traces can't be averaged nicely), and the single-threaded number is arguably much more interesting anyways. There's not much multithreading going on in this scene regardless.
Starts to get gpu bound on my mac, which is ofc extremely view & resolution dependent. Haven't tried, but I expect the gpu->cpu transfer cost to still be much more significant on machines without a unified memory architecture (and obviously we want to fix that regardless).
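For readers who want a picture of the kind of change described above, here is a minimal sketch of dropping `to_vec` copies in favor of borrowing. It is not the code from this PR: `sum_x` and `count_opaque` are hypothetical stand-ins for the per-point work, and `std::thread` is used in place of rayon so the example has no dependencies.

```rust
use std::thread;

/// Hypothetical stand-in for per-point position work: sums the x coordinates.
fn sum_x(positions: &[[f32; 3]]) -> f32 {
    positions.iter().map(|p| p[0]).sum()
}

/// Hypothetical stand-in for per-point color work: counts fully opaque colors.
fn count_opaque(colors: &[[u8; 4]]) -> usize {
    colors.iter().filter(|c| c[3] == 255).count()
}

/// Before: every job receives its own copy. Each `to_vec` allocates a new
/// buffer and copies all elements, which adds up for millions of points.
fn process_with_copies(positions: &[[f32; 3]], colors: &[[u8; 4]]) -> (f32, usize) {
    let positions_owned = positions.to_vec();
    let colors_owned = colors.to_vec();
    let a = thread::spawn(move || sum_x(&positions_owned));
    let b = thread::spawn(move || count_opaque(&colors_owned));
    (a.join().unwrap(), b.join().unwrap())
}

/// After: scoped jobs borrow the slices directly, so nothing is copied.
fn process_borrowed(positions: &[[f32; 3]], colors: &[[u8; 4]]) -> (f32, usize) {
    thread::scope(|s| {
        let a = s.spawn(|| sum_x(positions));
        let b = s.spawn(|| count_opaque(colors));
        (a.join().unwrap(), b.join().unwrap())
    })
}
```

Both functions return the same result; the borrowed variant just skips two allocations and two full copies. With rayon, scoped APIs such as `rayon::join` allow the same kind of borrowing.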
Before:
After:
As visible from the traces, we could certainly still do better fairly easily by having a fast path on color and radius processing.
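One possible shape for such a fast path, sketched for radii only: when all points share one radius (as in the rainwater scene), skip building a per-point buffer. This is an assumption about what a fast path could look like, not existing code; `Radii` and `resolve_radii` are hypothetical names, and padding with the last value is just one plausible policy.

```rust
/// Hypothetical result type: either one radius for all points or one per point.
#[derive(Debug, PartialEq)]
enum Radii {
    Uniform(f32),
    PerPoint(Vec<f32>),
}

/// Hypothetical helper: resolves the logged radii for `num_points` points.
fn resolve_radii(radii: &[f32], num_points: usize, default_radius: f32) -> Radii {
    match radii {
        // Fast path: nothing logged, a single default value is enough.
        [] => Radii::Uniform(default_radius),
        // Fast path: one radius for the whole cloud, no per-point allocation.
        [single] => Radii::Uniform(*single),
        // Slow path: build a per-point buffer, padding with the last value.
        [.., last] => {
            let mut out: Vec<f32> = radii.iter().copied().take(num_points).collect();
            out.resize(num_points, *last);
            Radii::PerPoint(out)
        }
    }
}
```

A renderer consuming `Radii::Uniform` could then pass the value as a single constant instead of iterating 4.6 million entries; the same idea would apply to colors.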