fast unlock in contention #461
base: master
Under contention, almost all threads are active on a CPU, so unlocking quickly lets those threads make progress sooner. This improves global throughput considerably under high contention. One shortcoming is that fair unlock now has to be invoked explicitly. This is an improvement to Amanieu#418. Signed-off-by: Jay <[email protected]>
Running with 9 threads
Running with 18 threads
Running with 27 threads
Running with 36 threads
Running parking_lot::RwLock (this pr) - [write] 1102.323 kHz [read] 2943.833 kHz
This reverts commit d43aee1. Signed-off-by: Jay <[email protected]>
Signed-off-by: Jay <[email protected]>
Reimplemented the PR by maintaining the parked bit on the waker side. The new implementation is less error-prone and works with CondVar directly. The benchmark shows even more positive results:
Running with 9 threads
Running with 18 threads
Running with 27 threads
Running with 36 threads
Running parking_lot::RwLock (this pr) - [write] 6121.347 kHz [read] 968.373 kHz
{
    let mut prev = self.state.load(Ordering::Relaxed);
    let new_state = prev & !LOCKED_BIT;
    prev = self.state.swap(new_state, Ordering::Release);
There's a bug here: you may "forget" a parked thread if another thread sets PARKED_BIT between the load and swap.
In that case the swap at L104 sets prev to PARKED_BIT | LOCKED_BIT, so it can't pass the check at L105.
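The interleaving discussed in this thread can be replayed single-threaded to see what each side is pointing at. The sketch below is not the PR's code: it assumes an `AtomicU8` state word with `LOCKED_BIT = 0b01` and `PARKED_BIT = 0b10`, and the function names are hypothetical. It shows that the swap does return the concurrently set `PARKED_BIT` in `prev`, while the value left in the state word no longer carries it; a `fetch_and` variant is included for comparison.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Assumed bit layout for illustration only.
const LOCKED_BIT: u8 = 0b01;
const PARKED_BIT: u8 = 0b10;

/// Hypothetical replay of the load-then-swap unlock. A "waiter" sets
/// PARKED_BIT between the load and the swap.
/// Returns (value returned by the swap, value left in `state`).
fn load_then_swap_unlock(state: &AtomicU8) -> (u8, u8) {
    let prev = state.load(Ordering::Relaxed);
    let new_state = prev & !LOCKED_BIT;
    // Simulated concurrent waiter announcing that it is about to park.
    state.fetch_or(PARKED_BIT, Ordering::Relaxed);
    // The swap observes the waiter's bit in its return value,
    // but overwrites the state with a value computed before the bit was set.
    let swapped_out = state.swap(new_state, Ordering::Release);
    (swapped_out, state.load(Ordering::Relaxed))
}

/// Comparison variant: clear only LOCKED_BIT in one atomic step, so any
/// bit set concurrently stays in the state word.
/// Returns (previous value, value left in `state`).
fn fetch_and_unlock(state: &AtomicU8) -> (u8, u8) {
    // Same simulated waiter as above.
    state.fetch_or(PARKED_BIT, Ordering::Relaxed);
    let prev = state.fetch_and(!LOCKED_BIT, Ordering::Release);
    (prev, state.load(Ordering::Relaxed))
}
```

Whether the dropped bit matters depends on whether the unlocker always acts on the value returned by the swap, which is the point being argued above.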
Bench with the command in #418
std::sync::Mutex avg 30.795793ms min 28.369313ms max 33.668656ms
std::sync::Mutex avg 30.52266ms min 28.69828ms max 34.945486ms
This is an alternative implementation of the idea in Amanieu#461. Compared to Amanieu#461, this PR maintains the parked bit on the waiter side, so that the waker doesn't have to perform two atomic operations. The waker also now resets the lock state back to 0 no matter what state it was in. This makes the fast lock path more likely to succeed under high contention. Signed-off-by: Jay <[email protected]>
This is an alternative, more aggressive implementation of the idea in Amanieu#461. Compared to Amanieu#461, this PR
- maintains the parked bit on the waiter side, so that the waker doesn't have to perform two atomic operations.
- resets the lock state back to 0 on unlock. This makes the fast lock path more likely to succeed under high contention.
- sets PARKED_BIT even when the waiter is prevented from sleeping, so that more threads can be woken up during contention to compete for progress.
Signed-off-by: Jay <[email protected]>
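The "reset to 0 on unlock" idea from the list above can be sketched roughly as follows. This is an illustration under assumptions, not the commit's code: the `AtomicU8` state word, the bit values, and the names `unlock_reset` and `try_lock_fast` are all hypothetical, and the actual parking and waking of threads is left out.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Assumed bit layout for illustration only.
const LOCKED_BIT: u8 = 0b01;
const PARKED_BIT: u8 = 0b10;

/// Hypothetical unlock: a single swap resets the whole state word to 0.
/// Returns true if the previous state had PARKED_BIT set, meaning the
/// caller would need to wake a waiter (waking is not shown here).
fn unlock_reset(state: &AtomicU8) -> bool {
    let prev = state.swap(0, Ordering::Release);
    prev & PARKED_BIT != 0
}

/// Hypothetical fast path: succeeds only when the state word is exactly 0,
/// which is the value `unlock_reset` always leaves behind.
fn try_lock_fast(state: &AtomicU8) -> bool {
    state
        .compare_exchange(0, LOCKED_BIT, Ordering::Acquire, Ordering::Relaxed)
        .is_ok()
}
```

Because the unlocker clears PARKED_BIT as well, a woken waiter is responsible for setting it again before it parks, which matches the "maintains parked bit on waiter side" item.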