Clarified docs in std::sync::RwLock + added test to ensure that max reader count is respected #153555
asder8215 wants to merge 1 commit into rust-lang:main
Conversation
rustbot has assigned @Mark-Simulacrum.
Can we add a test that e.g. at least 2^32 read locks can be acquired and then document that? I've certainly written code (though not particularly cross-platform deployed code) that would benefit from a stronger guarantee here, and it seems like at least on tier 1 targets (for which we test in CI) we should be able to give a stronger guarantee than 'at least 2' :)
I can do that for sure. Regarding the test, do you want me to add this in as a doctest within the docs?
I don't think it should be a doc test, seems too noisy for the docs.
@Mark-Simulacrum I was trying this out for example in a separate Rust file:

```rust
use std::mem::MaybeUninit;
use std::sync::RwLock;

fn main() {
    let mut rwlock_vec = Vec::with_capacity(u32::MAX as usize);
    let mut read_lock_ctr: u32 = 0;
    while read_lock_ctr < u32::MAX {
        rwlock_vec.push(MaybeUninit::uninit());
        read_lock_ctr += 1;
    }
    let rwlock: RwLock<i32> = RwLock::new(0);
    read_lock_ctr = 0;
    while read_lock_ctr < u32::MAX {
        rwlock_vec[read_lock_ctr as usize].write(rwlock.read().unwrap());
        read_lock_ctr += 1;
    }
    read_lock_ctr = 0;
    while read_lock_ctr < u32::MAX {
        unsafe { rwlock_vec[read_lock_ctr as usize].assume_init_drop(); }
        read_lock_ctr += 1;
    }
}
```

I use … Still, with the …
Can't you `mem::forget` the read guard?
Oh true, I forgot to consider doing that, and it works if I just do this:

```rust
use std::mem;
use std::sync::RwLock;

fn main() {
    let mut read_lock_ctr: u32 = 0;
    let rwlock: RwLock<i32> = RwLock::new(0);
    while read_lock_ctr < u32::MAX - 1 {
        let lock = rwlock.read();
        mem::forget(lock);
        read_lock_ctr += 1;
    }
}
```

(This code above panics though, see below.) On another note, looking deeper into `rwlock/futex.rs`:

```rust
pub struct RwLock {
    // The state consists of a 30-bit reader counter, a 'readers waiting' flag, and a 'writers waiting' flag.
    // Bits 0..30:
    //   0: Unlocked
    //   1..=0x3FFF_FFFE: Locked by N readers
    //   0x3FFF_FFFF: Write locked
    // Bit 30: Readers are waiting on this futex.
    // Bit 31: Writers are waiting on the writer_notify futex.
    state: Futex,
    // The 'condition variable' to notify writers through.
    // Incremented on every signal.
    writer_notify: Futex,
}

const READ_LOCKED: Primitive = 1;
const MASK: Primitive = (1 << 30) - 1;
const WRITE_LOCKED: Primitive = MASK;
const DOWNGRADE: Primitive = READ_LOCKED.wrapping_sub(WRITE_LOCKED); // READ_LOCKED - WRITE_LOCKED
const MAX_READERS: Primitive = MASK - 1;
const READERS_WAITING: Primitive = 1 << 30;
const WRITERS_WAITING: Primitive = 1 << 31;
```

Note: this is from `rwlock/futex.rs`. The maximum reader count from my understanding here is 2^30 - 1 - 1 (or 1073741822). Should I reflect this in the test and in the documentation?
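As a sanity check on that arithmetic, here is a standalone sketch (not part of the PR) that recomputes the quoted constants, assuming `Primitive` is `u32`:

```rust
// Recompute the futex-based RwLock constants quoted above.
// `Primitive` is assumed to be u32 here.
type Primitive = u32;

const MASK: Primitive = (1 << 30) - 1; // low 30 bits hold the reader count
const WRITE_LOCKED: Primitive = MASK; // all-ones counter value means write-locked
const MAX_READERS: Primitive = MASK - 1; // one counter value is reserved for WRITE_LOCKED

fn main() {
    // 2^30 - 2: the counter runs 1..=MAX_READERS, with MASK reserved for writers.
    assert_eq!(WRITE_LOCKED, 0x3FFF_FFFF);
    assert_eq!(MAX_READERS, 1_073_741_822);
    println!("MAX_READERS = {MAX_READERS}");
}
```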
Let's definitely reflect it in the test. It'll probably want to be `cfg(not(miri))` because it's likely too slow in Miri (I hope it only takes a few seconds outside Miri). I'll nominate for libs-api on how much we want to guarantee particular properties of the implementation... I suspect there are other platforms where maybe we have different limits (or will in the future).
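A test along those lines might be sketched as follows; the limit here is a placeholder shrunk to `1 << 16` (with a smaller value under Miri) purely so the sketch runs quickly, not the real `MAX_READERS`:

```rust
use std::mem;
use std::sync::RwLock;

// Placeholder limit: the real test would use the platform's MAX_READERS;
// this sketch uses a small count so it finishes in well under a second.
const READER_LIMIT: u32 = if cfg!(miri) { 100 } else { 1 << 16 };

fn main() {
    let rwlock = RwLock::new(0i32);
    for _ in 0..READER_LIMIT {
        // Intentionally leak each guard so the reader count keeps growing.
        mem::forget(rwlock.read().unwrap());
    }
    // Every acquisition above succeeded, so READER_LIMIT readers coexisted.
    println!("acquired {READER_LIMIT} read locks");
}
```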
@Mark-Simulacrum, I took a look at the four different current implementations of `RwLock`.
```rust
// - On SolidASP3, it leans on rwlock/solid.rs, which utilizes rwl_loc_rdl, so the max
//   number of readers depends on the internal implementation of rwl_loc_rdl.
// - Every other platform utilizes rwlock/no_threads.rs, which has a max reader count of
//   isize::MAX. An arithmetic overflow error occurs if it exceeds the max reader count.
```
What does this arithmetic overflow look like? A panic?
I think we want to know if there are new targets added that don't meet the target (MAX_READERS here). I'd rather see the cfg on the MAX_READERS constant and we can lower it / set it to different values depending on the cfg.
E.g.:
```rust
const MAX_READERS: if cfg!(...) { ... } else if cfg!(...) { ... } else { ... };
```
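For what it's worth, `cfg!` expands to a boolean literal, so it can drive a `const` initializer directly; a sketch with placeholder cfgs and values (not a proposal for the real per-target limits):

```rust
// Placeholder cfgs and values: the real limits would come from each
// platform's RwLock implementation, not from pointer width.
const MAX_READERS: u64 = if cfg!(target_pointer_width = "64") {
    (1 << 30) - 2 // e.g. the futex-based implementation's limit
} else if cfg!(target_pointer_width = "32") {
    (1 << 30) - 2
} else {
    2 // conservative fallback for unknown targets
};

fn main() {
    // Whatever the target, the documented floor of "at least 2" must hold.
    assert!(MAX_READERS >= 2);
    println!("MAX_READERS = {MAX_READERS}");
}
```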
What does this arithmetic overflow look like? A panic?
There's no explicit panic in the arithmetic overflow for the no_threads RwLock impl. If we take a look at the code inside `library/std/src/sys/sync/rwlock/no_threads.rs`:
```rust
use crate::cell::Cell;

pub struct RwLock {
    // This platform has no threads, so we can use a Cell here.
    mode: Cell<isize>,
}

unsafe impl Send for RwLock {}
unsafe impl Sync for RwLock {} // no threads on this platform

impl RwLock {
    #[inline]
    pub const fn new() -> RwLock {
        RwLock { mode: Cell::new(0) }
    }

    #[inline]
    pub fn read(&self) {
        let m = self.mode.get();
        if m >= 0 {
            self.mode.set(m + 1);
        } else {
            rtabort!("rwlock locked for writing");
        }
    }

    #[inline]
    pub fn try_read(&self) -> bool {
        let m = self.mode.get();
        if m >= 0 {
            self.mode.set(m + 1);
            true
        } else {
            false
        }
    }

    #[inline]
    pub fn write(&self) {
        if self.mode.replace(-1) != 0 {
            rtabort!("rwlock locked for reading")
        }
    }

    #[inline]
    pub fn try_write(&self) -> bool {
        if self.mode.get() == 0 {
            self.mode.set(-1);
            true
        } else {
            false
        }
    }

    #[inline]
    pub unsafe fn read_unlock(&self) {
        self.mode.set(self.mode.get() - 1);
    }

    #[inline]
    pub unsafe fn write_unlock(&self) {
        assert_eq!(self.mode.replace(0), -1);
    }

    #[inline]
    pub unsafe fn downgrade(&self) {
        assert_eq!(self.mode.replace(1), -1);
    }
}
```

There is overflow that could occur on `mode` when we reach `isize::MAX` readers and then make another `RwLock::read()` call (though this could end up being silent).
I also have a separate PR addressing what I find odd about `RwLock::write` here: it always replaces the `mode` value with `-1`, even when the lock is reader-locked.
Should there be an assertion here, similar to `rwlock/futex.rs`, that errors with the message "too many active read locks on RwLock" when we go beyond the max reader count? I'm not sure how to test this locally, or whether CI would be able to run the max reader test we have here against the no_threads impl to see if it works as intended.
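A checked variant of `read()` along those lines might look like the sketch below; this is a standalone mock, not the actual patch, and it borrows the "too many active read locks" message from the futex implementation (with `panic!` standing in for `rtabort!`):

```rust
use std::cell::Cell;
use std::panic::{catch_unwind, AssertUnwindSafe};

// Standalone mock of the single-threaded RwLock's reader counting.
struct RwLock {
    mode: Cell<isize>, // >= 0: reader count, -1: write-locked
}

impl RwLock {
    const fn new() -> Self {
        RwLock { mode: Cell::new(0) }
    }

    fn read(&self) {
        let m = self.mode.get();
        if m >= 0 {
            // checked_add makes the overflow an explicit panic instead of
            // depending on whether overflow-checks are compiled in.
            let next = m.checked_add(1).expect("too many active read locks on RwLock");
            self.mode.set(next);
        } else {
            panic!("rwlock locked for writing"); // rtabort! in the real code
        }
    }
}

fn main() {
    let lock = RwLock::new();
    lock.read();
    lock.read();
    assert_eq!(lock.mode.get(), 2);

    // Simulate a counter one step from overflow: the next read() panics.
    lock.mode.set(isize::MAX);
    assert!(catch_unwind(AssertUnwindSafe(|| lock.read())).is_err());
    println!("checked read-count overflow panics as expected");
}
```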
I think we want to know if there are new targets added that don't meet the target (MAX_READERS here). I'd rather see the cfg on the MAX_READERS constant and we can lower it / set it to different values depending on the cfg.
E.g.:
```rust
const MAX_READERS: if cfg!(...) { ... } else if cfg!(...) { ... } else { ... };
```
I can do this. I'm a bit concerned about the `rwlock/queue.rs` implementation, because technically the max reader count is `usize::MAX - (1 << 4)` from my understanding of this piece of code:

```rust
/// Marks the state as read-locked, if possible.
#[inline]
fn read_lock(state: State) -> Option<State> {
    if state.addr() & QUEUED == 0 && state.addr() != LOCKED {
        Some(without_provenance_mut(state.addr().checked_add(SINGLE)? | LOCKED))
    } else {
        None
    }
}
```

However, at least from my understanding of `RwLock::lock_contended()`:
```rust
#[cold]
fn lock_contended(&self, write: bool) {
    let mut node = Node::new(write);
    let mut state = self.state.load(Relaxed);
    let mut count = 0;
    let update_fn = if write { write_lock } else { read_lock };

    loop {
        // Optimistically update the state.
        if let Some(next) = update_fn(state) {
            // The lock is available, try locking it.
            match self.state.compare_exchange_weak(state, next, Acquire, Relaxed) {
                Ok(_) => return,
                Err(new) => state = new,
            }
            continue;
        } else if state.addr() & QUEUED == 0 && count < SPIN_COUNT {
            // If the lock is not available and no threads are queued, optimistically spin
            // for a while, using exponential backoff to decrease cache contention.
            for _ in 0..(1 << count) {
                spin_loop();
            }
            state = self.state.load(Relaxed);
            count += 1;
            continue;
        }

        // The optimistic paths did not succeed, so fall back to parking the thread.
        // First, prepare the node.
        node.prepare();

        // If there are threads queued, this will set the `next` field to be a pointer to
        // the first node in the queue.
        // If the state is read-locked, this will set `next` to the lock count.
        // If it is write-locked, it will set `next` to zero.
        node.next.0 = AtomicPtr::new(state.mask(NODE_MASK).cast());
        node.prev = AtomicLink::new(None);

        // Set the `QUEUED` bit and preserve the `LOCKED` and `DOWNGRADED` bit.
        let mut next = ptr::from_ref(&node)
            .map_addr(|addr| addr | QUEUED | (state.addr() & (DOWNGRADED | LOCKED)))
            as State;

        let mut is_queue_locked = false;
        if state.addr() & QUEUED == 0 {
            // If this is the first node in the queue, set the `tail` field to the node
            // itself to ensure there is a valid `tail` field in the queue (Invariants 1 & 2).
            // This needs to use `set` to avoid invalidating the new pointer.
            node.tail.set(Some(NonNull::from(&node)));
        } else {
            // Otherwise, the tail of the queue is not known.
            node.tail.set(None);

            // Try locking the queue to eagerly add backlinks.
            next = next.map_addr(|addr| addr | QUEUE_LOCKED);

            // Track if we changed the `QUEUE_LOCKED` bit from off to on.
            is_queue_locked = state.addr() & QUEUE_LOCKED == 0;
        }

        // Register the node, using release ordering to propagate our changes to the
        // waking thread.
        if let Err(new) = self.state.compare_exchange_weak(state, next, AcqRel, Relaxed) {
            // The state has changed, just try again.
            state = new;
            continue;
        }

        // The node has been registered, so the structure must not be mutably accessed or
        // destroyed while other threads may be accessing it.
        // Guard against unwinds using a `PanicGuard` that aborts when dropped.
        let guard = PanicGuard;

        // If the current thread locked the queue, unlock it to eagerly add backlinks.
        if is_queue_locked {
            // SAFETY: This thread set the `QUEUE_LOCKED` bit above.
            unsafe {
                self.unlock_queue(next);
            }
        }

        // Wait until the node is removed from the queue.
        // SAFETY: the node was created by the current thread.
        unsafe {
            node.wait();
        }

        // The node was removed from the queue, disarm the guard.
        mem::forget(guard);

        // Reload the state and try again.
        state = self.state.load(Relaxed);
        count = 0;
    }
}
```

If we receive `None` from `read_lock()` or reach that possible max reader count, the pending reader will try to acquire the lock again another time.
I guess my questions for this scenario are:
- Should I just set `MAX_READER` to `usize::MAX - (1 << 4)` for targets associated with `rwlock/queue.rs`? (This will make the test run slower on 64-bit machines though, wouldn't it?)
- If we set `MAX_READER` to `usize::MAX - (1 << 4)`, would it be possible for this to stack overflow, as all allocation for this `RwLock` impl is stack allocated?
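The overflow behavior behind the first question can be checked in isolation; `SINGLE` here is assumed to match the reader increment used by `read_lock()` in `rwlock/queue.rs`:

```rust
// Assumed from rwlock/queue.rs: each reader adds SINGLE to the state word,
// and read_lock() uses checked_add so overflow yields None instead of wrapping.
const SINGLE: usize = 1 << 4;

fn main() {
    // One increment before the word would overflow, checked_add still succeeds…
    let nearly_full = usize::MAX - 2 * SINGLE + 1;
    assert!(nearly_full.checked_add(SINGLE).is_some());

    // …but at the last step it returns None, which is what makes read_lock()
    // reject the reader rather than silently wrapping the count.
    let full = usize::MAX - SINGLE + 1;
    assert!(full.checked_add(SINGLE).is_none());

    println!("checked_add stops one increment short of overflow");
}
```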
The last question I have: I'm aware that for solid_asp3, `RwLock::read()` depends on how `rwl_loc_rdl` is internally implemented. That's my understanding at least from looking at `std/src/sys/pal/solid/abi/mod.rs`:
```rust
// ...

// `rwlock.h`
unsafe extern "C" {
    pub fn rwl_loc_rdl(id: ID) -> ER;
    pub fn rwl_loc_wrl(id: ID) -> ER;
    pub fn rwl_ploc_rdl(id: ID) -> ER;
    pub fn rwl_ploc_wrl(id: ID) -> ER;
    pub fn rwl_unl_rwl(id: ID) -> ER;
    pub fn rwl_acre_rwl() -> ER_ID;
    pub fn rwl_del_rwl(id: ID) -> ER;
}
```

I've been trying to find information about what solid_asp3's `rwl_loc_rdl` does, but I struggled to find sources for its implementation. Do you know where I could go to find solid_asp3's internal implementation of `rwl_loc_rdl` (or do you happen to know the max reader count for solid_asp3)?
Please don't copy/paste method bodies, just link to them if you want to reference something specific. Much easier to follow the conversation if there's not large walls of code.
Should there be an assertion here, similar to `rwlock/futex.rs`, that errors with the message "too many active read locks on RwLock" when we go beyond the max reader count? I'm not sure how to test this locally, or whether CI would be able to run the max reader test we have here against the no_threads impl to see if it works as intended.
Yes, I think it would be good to use `checked_*` arithmetic operations and panic if we overflow the counter, rather than depending on whether overflow-checks are enabled for soundness.
Should I just set MAX_READER to usize::MAX - (1 << 4) for targets associated with rwlock/queue.rs? (this will make the test run slower on 64 bit machines though, wouldn't it?)
I don't think we can afford to run up to ~2^64 iterations, that's going to be too slow. Asserting a smaller limit on those targets (e.g., 2^32 or so) should be fine though. In cfg!(miri) I'd recommend asserting we can acquire (say) 100 reader locks to keep the test fast enough.
If we set MAX_READER to usize::MAX - (1 << 4), would it be possible for this to stack overflow as all allocation for this RwLock impl is stack allocated?
I don't think so? The stack allocation for the reader locks is only if we're parked (i.e., waiting for some other thread), right?
solid_asp3
Per https://doc.rust-lang.org/nightly/rustc/platform-support/kmc-solid.html, @kawadakk is the target maintainer. Can you confirm what we should expect from the RwLock implementation on that target?
Let's assume 2^30 for now if we don't hear otherwise.
libs-api discussed and we agreed to move forward without a docs-guarantee to the limit, but keeping the test (see my comment above).
Sounds good on the no docs-guarantee to the limit. Are you okay with the current change of using "multiple" instead of "any number of" within the docs?
Yes, multiple is fine.
Rollup merge of #154185 - asder8215:no_threads_write_impl, r=joboet

Prevent no_threads RwLock's write() impl from setting mode to -1 when it is locked for reading

In my time updating the docs to `std::sync::RwLock` and adding a test verifying that max reader count is reachable in #153555, I noticed that the no_threads RwLock's `write()` implementation always sets the `mode` to `-1` (denoting writer locked) even though it could be reader locked. I feel like that's logically incorrect and that it should only be setting the `mode` to `-1` when we know that the mode is unlocked (`0`); `write()` should mirror the code that `read()` and `try_read()` share with `try_write()`. For reference, the current read/try_read and write/try_write implementations:

```rust
#[inline]
pub fn read(&self) {
    let m = self.mode.get();
    if m >= 0 {
        self.mode.set(m + 1);
    } else {
        rtabort!("rwlock locked for writing");
    }
}

#[inline]
pub fn try_read(&self) -> bool {
    let m = self.mode.get();
    if m >= 0 {
        self.mode.set(m + 1);
        true
    } else {
        false
    }
}

#[inline]
pub fn write(&self) {
    if self.mode.replace(-1) != 0 { // <-- This behavior logically does something different than what `try_write` does
        rtabort!("rwlock locked for reading")
    }
}

#[inline]
pub fn try_write(&self) -> bool {
    if self.mode.get() == 0 {
        self.mode.set(-1);
        true
    } else {
        false
    }
}
```

r? @jhpratt
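The behavioral difference that PR fixes can be demonstrated with a standalone mock of the no_threads lock (a sketch, with `panic!` standing in for `rtabort!`):

```rust
use std::cell::Cell;
use std::panic::{catch_unwind, AssertUnwindSafe};

// Standalone mock: mode is 0 when unlocked, >0 for readers, -1 when write-locked.
struct RwLock {
    mode: Cell<isize>,
}

impl RwLock {
    fn write(&self) {
        // Fixed behavior: only claim the lock from the unlocked state, mirroring
        // try_write(). The old replace(-1) clobbered a positive reader count
        // before aborting.
        if self.mode.get() == 0 {
            self.mode.set(-1);
        } else {
            panic!("rwlock locked for reading"); // rtabort! in the real code
        }
    }
}

fn main() {
    let lock = RwLock { mode: Cell::new(2) }; // pretend two readers hold it
    assert!(catch_unwind(AssertUnwindSafe(|| lock.write())).is_err());
    // The reader count is left intact instead of being overwritten with -1.
    assert_eq!(lock.mode.get(), 2);
    println!("write() no longer clobbers the reader count");
}
```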
…e can reach max reader
This addresses the issue with the `std::sync::RwLock` docs in #115338. It centers around the following lines: it's true that the `RwLock` in theory should allow any number of readers to acquire the lock when a writer is not holding it, but this may not be true in the implementation and could be OS dependent. I decided to replace "any number of readers" with "multiple", so that it implies that more than one reader can acquire the lock, but you can't necessarily take away that this value is unbounded.

@rustbot label +A-docs