Memory leak within rayon_core::registry::ThreadSpawn #870

Closed
Hywan opened this issue Jul 5, 2021 · 3 comments

Comments

@Hywan
Contributor

Hywan commented Jul 5, 2021

Hello,

I maintain the https://github.com/wasmerio/wasmer project. Two of our users have reported memory leaks that seem to come from Rayon. I'm quoting the example from @chenyukang at wasmerio/wasmer#2404 (comment). It doesn't involve Wasmer at all, only Rayon, and it illustrates the memory leak:

use rayon::prelude::*; // brings par_iter() and map_init() into scope

struct Demo {
    count: i32,
}

impl Demo {
    pub fn new() -> Self {
        Self { count: 0 }
    }

    pub fn add(&mut self, v: i32) -> i32 {
        self.count += v; // v is already an i32, so no cast is needed
        self.count
    }
}

fn run_rayon() {
    let input: Vec<i32> = (1..1100).collect();
    let res: i32 = input
        .par_iter()
        .map_init(Demo::new, |demo, &v| demo.add(v))
        .sum();
    println!("res: {}", res);
}

fn main() {
    run_rayon();
}

Here is the Valgrind report:

==28104== Memcheck, a memory error detector
==28104== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==28104== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==28104== Command: ./target/debug/rayon-debug
==28104==
res: 5477109
==28104==
==28104== HEAP SUMMARY:
==28104==     in use at exit: 53,128 bytes in 150 blocks
==28104==   total heap usage: 227 allocs, 77 frees, 80,701 bytes allocated
==28104==
==28104== 2,304 bytes in 8 blocks are possibly lost in loss record 20 of 24
==28104==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==28104==    by 0x40149CA: allocate_dtv (dl-tls.c:286)
==28104==    by 0x40149CA: _dl_allocate_tls (dl-tls.c:532)
==28104==    by 0x4879322: allocate_stack (allocatestack.c:622)
==28104==    by 0x4879322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660)
==28104==    by 0x1BD774: std::sys::unix::thread::Thread::new (thread.rs:50)
==28104==    by 0x179AB0: std::thread::Builder::spawn_unchecked (mod.rs:498)
==28104==    by 0x17B53C: std::thread::Builder::spawn (mod.rs:381)
==28104==    by 0x12A689: <rayon_core::registry::DefaultSpawn as rayon_core::registry::ThreadSpawn>::spawn (registry.rs:100)
==28104==    by 0x12B4D7: rayon_core::registry::Registry::new (registry.rs:256)
==28104==    by 0x12A976: rayon_core::registry::global_registry::{{closure}} (registry.rs:168)
==28104==    by 0x12AB87: rayon_core::registry::set_global_registry::{{closure}} (registry.rs:195)
==28104==    by 0x143FAC: std::sync::once::Once::call_once::{{closure}} (once.rs:261)
==28104==    by 0x11DC47: std::sync::once::Once::call_inner (once.rs:418)
==28104==
==28104== LEAK SUMMARY:
==28104==    definitely lost: 0 bytes in 0 blocks
==28104==    indirectly lost: 0 bytes in 0 blocks
==28104==      possibly lost: 2,304 bytes in 8 blocks
==28104==    still reachable: 50,824 bytes in 142 blocks
==28104==         suppressed: 0 bytes in 0 blocks
==28104== Reachable blocks (those to which a pointer was found) are not shown.
==28104== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==28104==
==28104== For lists of detected and suppressed errors, rerun with: -s
==28104== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Thoughts?

@cuviper
Member

cuviper commented Jul 7, 2021

I suspect this is simply due to the fact that we never shut down the global thread pool, by design. See also #688.

That "possibly lost" record is just an allocation for thread-local storage, and I suspect the rest will be related to the thread pool's work queues. It seems valgrind is already filtering out the thread stacks, because those would also be 2MB each by default.

All of that should be bounded though. If you find some memory use that increases over time, that may be evidence of a real leak.
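To make the "bounded" point concrete, here is a minimal std-only sketch of a global set of worker threads that is created once and never torn down. It is not rayon's implementation; `THREADS_SPAWNED`, `WORKERS`, `worker`, `global_workers`, and the count of 4 are hypothetical names and values chosen for illustration. The allocations made at initialization stay alive until the process exits, but repeated use does not add to them.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

// Counts every thread this sketch has ever spawned.
static THREADS_SPAWNED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a global pool: built once, never torn down.
static WORKERS: OnceLock<Vec<thread::Thread>> = OnceLock::new();

// Worker body: parks in a loop and never returns.
fn worker() {
    loop {
        thread::park();
    }
}

// Returns the global workers, spawning them on the first call only.
fn global_workers() -> &'static [thread::Thread] {
    WORKERS.get_or_init(|| {
        (0..4)
            .map(|_| {
                THREADS_SPAWNED.fetch_add(1, Ordering::SeqCst);
                // Dropping the JoinHandle detaches the thread; it is never joined.
                thread::spawn(worker).thread().clone()
            })
            .collect()
    })
}

fn main() {
    // Repeated use does not spawn additional threads.
    for _ in 0..1000 {
        let _ = global_workers();
    }
    println!("threads spawned: {}", THREADS_SPAWNED.load(Ordering::SeqCst));
}
```

The detached workers are still parked when `main` returns, so a tool like Valgrind would be expected to report their per-thread allocations at exit. A real leak would instead show the counter, or memory use, growing with the number of calls.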

@chenyukang

The leak size does not grow; the lost memory is always:
2,304 bytes in 8 blocks are possibly lost in loss record 20 of 2

I guess you are right. If we spawn a thread in Rust and don't join it, Valgrind reports the same kind of memory leak:

fn run_thread() {
    use std::thread;
    let builder = thread::Builder::new();

    let handler = builder
        .spawn(|| {
            // thread code
            println!("hello");
        })
        .unwrap();

    // Intentionally left commented out: the thread is never joined.
    //handler.join().unwrap();
}

fn main() {
    run_thread();
}

The result is:

==16491== HEAP SUMMARY:
==16491==     in use at exit: 1,472 bytes in 7 blocks
==16491==   total heap usage: 21 allocs, 14 frees, 3,616 bytes allocated
==16491==
==16491== 288 bytes in 1 blocks are possibly lost in loss record 6 of 7
==16491==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==16491==    by 0x40149CA: allocate_dtv (dl-tls.c:286)
==16491==    by 0x40149CA: _dl_allocate_tls (dl-tls.c:532)
==16491==    by 0x4879322: allocate_stack (allocatestack.c:622)
==16491==    by 0x4879322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660)
==16491==    by 0x12E0C4: std::sys::unix::thread::Thread::new (thread.rs:50)
==16491==    by 0x10FC11: std::thread::Builder::spawn_unchecked (mod.rs:498)
==16491==    by 0x11034B: std::thread::Builder::spawn (mod.rs:381)
==16491==    by 0x1106C0: rayon_demo::run_thread (main.rs:31)
==16491==    by 0x110735: rayon_demo::main (main.rs:42)
==16491==    by 0x11086A: core::ops::function::FnOnce::call_once (function.rs:227)
==16491==    by 0x110C5D: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:125)
==16491==    by 0x110B70: std::rt::lang_start::{{closure}} (rt.rs:66)
==16491==    by 0x12C489: call_once<(),Fn<()>> (function.rs:259)
==16491==    by 0x12C489: do_call<&Fn<()>,i32> (panicking.rs:379)
==16491==    by 0x12C489: try<i32,&Fn<()>> (panicking.rs:343)
==16491==    by 0x12C489: catch_unwind<&Fn<()>,i32> (panic.rs:431)
==16491==    by 0x12C489: std::rt::lang_start_internal (rt.rs:51)
==16491==
==16491== LEAK SUMMARY:
==16491==    definitely lost: 0 bytes in 0 blocks
==16491==    indirectly lost: 0 bytes in 0 blocks
==16491==      possibly lost: 288 bytes in 1 blocks
==16491==    still reachable: 1,184 bytes in 6 blocks
==16491==         suppressed: 0 bytes in 0 blocks
==16491== Reachable blocks (those to which a pointer was found) are not shown.
==16491== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==16491==
==16491== For lists of detected and suppressed errors, rerun with: -s
==16491== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
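For comparison, a minimal sketch of the same program with the thread joined before `main` returns. `run_thread_joined` and the `"done"` return value are hypothetical additions for illustration. Once the thread has been joined, glibc is free to reclaim its stack and thread-local storage, so I would expect the "possibly lost" record to go away, though I have not run this variant under Valgrind.

```rust
use std::thread;

// Hypothetical variant of run_thread(): spawns the thread and waits for it.
fn run_thread_joined() -> &'static str {
    let handle = thread::Builder::new()
        .spawn(|| {
            // thread code
            println!("hello");
            "done"
        })
        .expect("failed to spawn thread");

    // join() blocks until the thread has finished and returns its result.
    handle.join().expect("worker thread panicked")
}

fn main() {
    let result = run_thread_joined();
    println!("thread returned: {result}");
}
```

Rayon's global pool has no equivalent of this `join`, since its workers are meant to live for the whole process.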

@cuviper
Member

cuviper commented Jul 8, 2021

Ok -- you'd have to talk to the glibc and/or valgrind folks about recognizing that particular allocation, as it's not under Rust's control, let alone rayon's. Either way, I don't think there's anything to be concerned about here, so I'm going to close. Feel free to let us know if you find something else!
