Hello,

I have run into some strange behaviour where rayon seems to deadlock when interacting with an RwLock. I have seen other issues such as #592, but it doesn't quite seem to fit the example below.
In the example below we:

1. Let a task acquire a read lock.
2. Let another task block on acquiring the write lock.
3. Let several tasks block on acquiring a read lock.
However, the task that successfully acquired the read lock seems to deadlock in its rayon processing, even though there should be rayon threads available to process it. We only create `CORES / 2` read-lock tasks, so there should be plenty of rayon threads that are not blocked on acquiring the read lock.
```rust
use parking_lot::RwLock;
use rayon::iter::ParallelIterator;
use rayon::prelude::IntoParallelRefIterator;
use rayon::ThreadPoolBuilder;
use std::sync::Arc;
use std::thread::sleep;

pub const CORES: usize = 16;

#[derive(Debug)]
struct Engine {
    data: u64,
}

fn main() {
    ThreadPoolBuilder::new()
        .thread_name(|i| format!("rayon-thread-{}", i))
        .build_global()
        .unwrap();

    let engine = Arc::new(RwLock::new(Engine { data: 0 }));

    rayon::spawn({
        let engine = engine.clone();
        move || loop {
            {
                // Attempt to acquire the lock after the first read task below has acquired the read lock.
                sleep(std::time::Duration::from_millis(50));
                println!("Acquiring write lock");
                let mut lock = engine.write();
                println!("Writing data");
                lock.data += 1;
            }
            println!("Data written");
            sleep(std::time::Duration::from_secs(3));
        }
    });

    let tasks: Vec<_> = (0..CORES / 2).collect();
    for task_number in &tasks {
        // The first task won't sleep and will acquire the read lock before the above write task attempts to acquire the write lock.
        sleep(std::time::Duration::from_millis(*task_number as u64 * 100));
        my_task(*task_number, engine.clone());
    }

    sleep(std::time::Duration::from_secs(1_000));
}

fn my_task(task_number: usize, lock: Arc<RwLock<Engine>>) {
    rayon::spawn(move || {
        let thread = std::thread::current();
        println!("Attempting to acquire lock task={} on thread={}", task_number, thread.name().unwrap());
        let data = lock.read();
        println!("Successfully acquired lock task={}", task_number);
        sleep(std::time::Duration::from_millis(1_000));

        let mut list = Vec::new();
        list.extend(0..CORES);
        let _: Vec<_> = list
            .par_iter()
            .map(|idx| {
                let binding = std::thread::current();
                sleep(std::time::Duration::from_secs(1));
                println!("Task={} thread={} processing it={}", task_number, binding.name().unwrap(), idx);
                idx
            })
            .collect();

        let thread = std::thread::current();
        // This never prints.
        println!("Processing completed task={} thread={} data={:?}", task_number, thread.name().unwrap(), *data);
    });
}
```
I think it is the same fundamental issue as #592. Your task that's holding the read lock is splitting into many recursive joins via par_iter. When the first part of a join finishes, it will wait for its other half to finish as well, so it enters work-stealing. If that steals another read-lock job, it will block due to the waiting writer, and that's a deadlock since it prevents the current reader from finishing.
If you attach a debugger, it should be possible to see this in the thread backtraces.
Since you're using parking_lot, you could work around it in this particular example by calling read_recursive instead, since that ignores waiting writers -- "starving" them.
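For concreteness, here is a minimal sketch of that workaround applied to `my_task` from the reproduction above (reusing its imports, `Engine`, and `CORES`; logging trimmed). The only substantive change is `read()` -> `read_recursive()`:

```rust
fn my_task(task_number: usize, lock: Arc<RwLock<Engine>>) {
    rayon::spawn(move || {
        // `read_recursive` ignores queued writers, so a rayon thread that
        // steals another read-lock job while this guard is held will not
        // block behind the waiting write task -- at the cost of potentially
        // starving that writer.
        let data = lock.read_recursive();

        let list: Vec<usize> = (0..CORES).collect();
        let _: Vec<_> = list
            .par_iter()
            .map(|idx| {
                sleep(std::time::Duration::from_secs(1));
                idx
            })
            .collect();

        println!("Processing completed task={} data={:?}", task_number, *data);
    });
}
```

If starving the writer is a concern, another option (not specific to parking_lot) would be to clone whatever you need out of the lock and drop the read guard before starting the `par_iter`, so that no lock is held across rayon's work-stealing.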