Lock mark sweep block list during release #1106
We can add the assertions I mentioned in #1103 (not the early

    #[cfg(feature = "ms_block_list_sanity")]
    {
        let mut sanity_list = self.sanity_list.lock().unwrap();
If this method is not thread-safe, we can use try_lock() instead of lock() and assert that try_lock() always succeeds. If the assertion fails, there is contention on the lock, and the only way that can happen is if two threads call thread-unsafe methods concurrently.
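A minimal sketch of that suggestion, assuming the sanity list sits behind a std::sync::Mutex as in the diff above. The helper name lock_uncontended is hypothetical and not part of the PR; it only illustrates treating contention as an assertion failure instead of waiting for the lock.

```rust
use std::sync::{Mutex, MutexGuard, TryLockError};

/// Acquire `mutex` without blocking.
/// Hypothetical helper for illustration; the name is not from the PR.
fn lock_uncontended<T>(mutex: &Mutex<T>) -> MutexGuard<'_, T> {
    match mutex.try_lock() {
        // Nobody else holds the lock: this is the only expected case.
        Ok(guard) => guard,
        // Another holder exists, so a thread-unsafe method is being
        // called concurrently. Fail immediately instead of waiting.
        Err(TryLockError::WouldBlock) => {
            panic!("lock contention: thread-unsafe methods were called concurrently")
        }
        // Mirror `lock().unwrap()`, which also panics on a poisoned mutex.
        Err(TryLockError::Poisoned(err)) => panic!("sanity list mutex poisoned: {}", err),
    }
}
```

With such a helper, `let mut sanity_list = lock_uncontended(&self.sanity_list);` would replace the `lock().unwrap()` call, so a race shows up as a panic at the point of contention rather than as a silent wait.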
Well, the method is_empty itself can be called from multiple threads concurrently as long as no other thread is calling methods that mutate this BlockList. In theory, Rust's ownership model and borrow checker can prevent that from happening because they disallow a & from coexisting with a &mut.
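To illustrate the borrow-checker point, here is a small sketch with a hypothetical, simplified BlockList (a plain Vec, not the real type from the PR). Several threads may call is_empty through a shared reference, while push needs &mut self and therefore cannot be called while those shared borrows are alive.

```rust
use std::thread;

/// Hypothetical, simplified stand-in for the real BlockList.
struct BlockList {
    blocks: Vec<usize>,
}

impl BlockList {
    /// Read-only: takes `&self`, so many threads may call it at once.
    fn is_empty(&self) -> bool {
        self.blocks.is_empty()
    }

    /// Mutating: takes `&mut self`, so it needs exclusive access.
    fn push(&mut self, block: usize) {
        self.blocks.push(block);
    }
}

/// Call `is_empty` from four threads that all share `&BlockList`.
/// Calling `list.push(..)` in here would not compile, because `list`
/// is only a shared reference.
fn concurrent_is_empty(list: &BlockList) -> Vec<bool> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..4).map(|_| s.spawn(|| list.is_empty())).collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

This guarantee only holds when all access goes through safe references; code that reaches the list through raw pointers or interior mutability has to enforce the same rule by other means.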
If this PR fixes the issue, I am OK with it because the main part is just adding a lock.
But I think there is something fundamentally wrong with our implementation. We can discuss it further on Zulip. My hypothesis is that SweepChunk doesn't really need to sweep the blocks; it only needs to count marked blocks. Once we remove the sweeping, there should be no data race anymore, which would fundamentally address the race this PR is addressing.
Using try_lock on the sanity check should make it fail faster if a race occurs.
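The hypothesis about SweepChunk could look roughly like the following read-only pass. This is only a sketch under stated assumptions: the Block type, its marked flag, and count_marked are hypothetical and do not reflect the actual data layout.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical block with an atomic mark flag (assumption for this sketch).
struct Block {
    marked: AtomicBool,
}

/// Count the marked blocks in a chunk without modifying any block.
/// Because the pass only reads, it cannot race with another reader.
fn count_marked(chunk: &[Block]) -> usize {
    chunk
        .iter()
        .filter(|block| block.marked.load(Ordering::SeqCst))
        .count()
}
```

Since the function takes a shared slice and never writes, it leaves block state untouched, which is the property the hypothesis relies on.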
    #[cfg(feature = "ms_block_list_sanity")]
    {
        let mut sanity_list = self.sanity_list.lock().unwrap();
But if there is contention on this lock, there must be a race, because no two threads should call BlockList::remove at the same time. We can use try_lock() here.
    #[cfg(feature = "ms_block_list_sanity")]
    {
        let mut sanity_list = self.sanity_list.lock().unwrap();
Ditto
    #[cfg(feature = "ms_block_list_sanity")]
    {
        let mut sanity_list = self.sanity_list.lock().unwrap();
This is the same.
This PR resolves the issue in #824 (comment). It closes #824.
- Lock AbandonedBlockLists when they are accessed. This resolves the data race between AbandonedBlockLists and SweepChunk: they both access blocks.
- Add a feature ms_block_list_sanity. This embeds a sanity block list Vec<Block> in BlockList, and checks whether the actual block list (side metadata plus linked list) matches the sanity list. ms_block_list_sanity is enabled for extreme_assertions.
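The sanity-list idea can be sketched as follows. This is an illustration under assumptions, not the PR's code: Block is reduced to a usize, a VecDeque stands in for the real side-metadata linked list, matches_sanity_list is a hypothetical name, and the feature gate is omitted so the sketch compiles on its own. In the PR, the sanity code sits behind #[cfg(feature = "ms_block_list_sanity")].

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical stand-in for the real Block type.
type Block = usize;

struct BlockList {
    /// Stand-in for the actual list (side metadata plus linked list).
    blocks: VecDeque<Block>,
    /// Shadow copy that every mutation updates alongside the real list.
    sanity_list: Mutex<Vec<Block>>,
}

impl BlockList {
    fn new() -> Self {
        BlockList {
            blocks: VecDeque::new(),
            sanity_list: Mutex::new(Vec::new()),
        }
    }

    fn push(&mut self, block: Block) {
        self.blocks.push_back(block);
        self.sanity_list.lock().unwrap().push(block);
        assert!(self.matches_sanity_list());
    }

    fn remove(&mut self, block: Block) {
        self.blocks.retain(|b| *b != block);
        self.sanity_list.lock().unwrap().retain(|b| *b != block);
        assert!(self.matches_sanity_list());
    }

    /// True if the actual list and the shadow copy hold the same blocks
    /// in the same order.
    fn matches_sanity_list(&self) -> bool {
        let sanity_list = self.sanity_list.lock().unwrap();
        self.blocks.iter().eq(sanity_list.iter())
    }
}
```

If some other code mutates the real list without going through these methods, the next check fails, which is how the sanity list surfaces corruption early.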