Implement recording/last-modified-at aware garbage collection #4183
Conversation
crates/re_viewer/src/store_hub.rs (outdated)
};

let store_dbs = &mut self.store_bundle.store_dbs;
if store_dbs.len() <= 1 {
Surprised to see this return early when there is only one store_db. Is this guaranteed to be some kind of special store that we don't want to GC? Please clarify with a comment if so.
Uh-oh. No, that's just me having shuffled things around one time too many 😬 Nice catch.
Hmmm, I went with the following... opinions welcome.
commit 8e6b8853b3751037989e80c26704fbf3d0438a64
Author: Clement Rey <[email protected]>
Date: Wed Nov 8 19:21:51 2023 +0100
always GC, but dont remove the last one
diff --git a/crates/re_viewer/src/store_hub.rs b/crates/re_viewer/src/store_hub.rs
index e17fd0d0dd..3457abc1b6 100644
--- a/crates/re_viewer/src/store_hub.rs
+++ b/crates/re_viewer/src/store_hub.rs
@@ -222,9 +222,6 @@ impl StoreHub {
};
let store_dbs = &mut self.store_bundle.store_dbs;
- if store_dbs.len() <= 1 {
- return;
- }
let Some(store_db) = store_dbs.get_mut(&store_id) else {
if cfg!(debug_assertions) {
@@ -239,9 +236,21 @@ impl StoreHub {
let store_size_after =
store_db.store().timeless_size_bytes() + store_db.store().temporal_size_bytes();
+ // No point keeping an empty recording around.
+ if store_db.is_empty() {
+ self.remove_recording_id(&store_id);
+ return;
+ }
+
// Running the GC didn't do anything.
- // That's because all that's left in that store is protected rows: it's time to remove it entirely.
- if store_size_before == store_size_after {
+ //
+ // That's because all that's left in that store is protected rows: it's time to remove it
+ // entirely, unless it's the last recording still standing, in which case we're better off
+ // keeping some data around to show the user rather than a blank screen.
+ //
+ // If the user needs the memory for something else, they will get it back as soon as they
+ // log new things anyhow.
+ if store_size_before == store_size_after && store_dbs.len() > 1 {
self.remove_recording_id(&store_id);
}
Mostly seems like a net improvement relative to today.
However, there's one edge case I'm a bit worried about: when you have several incoming recordings in parallel, you really do want to distribute your GCs across them as before. In that case, "last modified" is going to jump around somewhat unpredictably as new data comes into the system.
I'm trying to think if there's something we could do with an "overlap" metric: basically, if all recordings are considered overlapping, then we spread out our GC evenly; otherwise we GC the oldest recording, as implemented here.
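To make that idea concrete, here is a rough sketch of such a heuristic (the `TimeRange` type, the `overlaps` check, and the round-robin bookkeeping below are illustrative assumptions, not anything that exists in this PR or in re_viewer):

```rust
/// Hypothetical time extent of a recording (e.g. derived from its min/max log times).
#[derive(Clone, Copy)]
struct TimeRange {
    min: i64,
    max: i64,
}

impl TimeRange {
    fn overlaps(&self, other: &TimeRange) -> bool {
        self.min <= other.max && other.min <= self.max
    }
}

/// If every recording overlaps every other one (they are all receiving data at
/// once), spread the GC evenly via round-robin; otherwise GC the least recently
/// modified recording first, as this PR does.
fn pick_gc_target(
    ranges: &[TimeRange],
    last_modified: &[u64], // e.g. a monotonically increasing tick per recording
    round_robin_cursor: &mut usize,
) -> Option<usize> {
    if ranges.is_empty() {
        return None;
    }

    let all_overlap = ranges
        .iter()
        .enumerate()
        .all(|(i, a)| ranges.iter().skip(i + 1).all(|b| a.overlaps(b)));

    if all_overlap {
        // Every recording is still "live": distribute the GC evenly across them.
        let target = *round_robin_cursor % ranges.len();
        *round_robin_cursor += 1;
        Some(target)
    } else {
        // Otherwise fall back to oldest-modified-first.
        last_modified
            .iter()
            .enumerate()
            .min_by_key(|(_, t)| **t)
            .map(|(i, _)| i)
    }
}
```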
Commit by commit, there's renaming involved!
GC will now focus on the oldest-modified recording first.
Tried a lot of fancy things, but a lot of stress testing has shown that nothing worked as well as doing this the dumb way.
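For reference, the "dumb way" boils down to something like the sketch below: a minimal illustration with made-up types (the real store_hub.rs tracks modification times on the stores themselves and has a different API):

```rust
use std::time::{Duration, Instant};

/// Illustrative stand-in for a recording; the names here are hypothetical.
struct Recording {
    id: String,
    last_modified_at: Instant,
}

/// Pick the recording whose data was modified least recently: it gets GC'd first.
fn oldest_modified_first(recordings: &[Recording]) -> Option<&Recording> {
    recordings.iter().min_by_key(|rec| rec.last_modified_at)
}

fn main() {
    let earlier = Instant::now();
    let later = earlier + Duration::from_secs(60);
    let recordings = vec![
        Recording { id: "rec_a".into(), last_modified_at: later },
        Recording { id: "rec_b".into(), last_modified_at: earlier },
    ];
    // "rec_b" was modified before "rec_a", so it is the first GC candidate.
    assert_eq!(
        oldest_modified_first(&recordings).map(|r| r.id.as_str()),
        Some("rec_b")
    );
}
```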
Speaking of stress testing, the scripts I've used are now committed in the repository. Make sure to try them out when modifying the GC code 😬.
In general, the GC holds up under stress much better than I thought/hoped: many_medium_sized_single_row_recordings.py, many_medium_sized_many_rows_recordings.py, and many_large_many_rows_recordings.py all behave pretty nicely, something like this:
23-11-08_16.41.47.patched.mp4
many_large_single_row_recordings.py, on the other hand, is still a disaster (watch till the end; this slowly devolves into a black hole):
23-11-08_17.00.12.patched.mp4
This is not a new problem (not to me at least 😬): large recordings with very few rows have always been a nightmare for the GC (not specifically the DataStore GC, but the GC as a whole throughout the entire app).
I've never had time to investigate why, but now we have an issue for it at least: app_id/recording_id semantics #1904

Checklist