-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pointer being freed was not allocated #48
Comments
Yes, it is definitely meant to be thread-safe, although admittedly we use lmdb-store with child processes (rather than worker threads), so it gets more heavily tested in that scenario, but certainly would like to ensure it is working well with threads. A few questions: |
Yes, I built locally from master. Currently using put. I haven't tried transactionAsync yet. The options I'm using look like this: lmdb.open(cacheDir, {
name: 'parcel-cache',
encoding: 'binary',
}); Thanks! |
Also, I have been using the benchmark tests as my main means of pressure testing threads (and I just updated them to make sure they are fully running multi-threaded tests), which is run with |
Well I've put in many hours into debugging this so far. A preliminary solution that seems to work is to revert this commit: LMDB/lmdb@cb256f4. That appears to have been meant to address a leak: https://bugs.openldap.org/show_bug.cgi?id=9155. One thing I have noticed is that there seems to be a significant amount of differences between mdb.c in this repo vs upstream. I'm not sure if that's just because it's outdated or whether more changes have been introduced here. But perhaps one of these differences may be related? Or it might actually be a bug in LMDB itself, but I find that more unlikely as I imagine many people are using it in multithreaded environments. I did try to replace mdb.c in lmdb-store with the upstream version, but that caused other issues. I'm not really very familiar with the code, and still sort of wrapping my head around it. But it seems to me that there is a single pre-allocated write transaction that is reused here. Sorry for the brain dump and if this was really a question for the upstream project. |
Thank you for the great research! I also have been trying to reproduce this with no luck, but this gives some valuable direction. My initial hypothesis of what may be going wrong is that the Anyway, this hypothesis might be completely wrong. I am going to try to exploit it with some intentional delays to see if I can reproduce this issue though and get back to you. Thanks again for the help! Also, maybe I can try to put together a document of the changes I have made to mdb.c. Some of them are minor bug fixes I intend to submit as bug fixes upstream (just haven't gotten around to it yet). Some of the other stuff was me just muddling through differences in C and C++ to experiment with some things. The most significant difference though is the on-fly-resizing during transactions that lmdb-store implements. And this does have threading implications since new memory maps can be allocated on the fly in any thread; basically this is handled by simply never un-mapping previous memory maps. I actually don't think this is related at all to this issue though, at least if my hypothesis is anywhere close to correct. |
If this hypothesis is correct, the solution may simply be to move the |
Ok, I believe I have successfully reproduced this issue by introducing some delay/sleeps in between the unlock and the free. This does seem to be a real race condition, and I can verify that a double free can and does occur under the right conditions. And moving the free call, as mentioned above, does indeed seem to fix it as well. I will get a version published with a fix soon. BTW, I am curious if there are any peculiarities to your system that makes this issue be manifested. As mentioned, with both threads running, it seems rather extraordinary for the second thread to win the race to changing the |
That's great! In my brief testing so far this appears to fix the issue for me. 🥳 I'm running on an M1 MacBook Air, which may be somewhat peculiar. I did notice that I am able to reproduce the issue much more easily if I start the process and then switch to another application and do something somewhat intensive, e.g. refresh a browser tab. This may support your theory of CPU throttling, or at least more aggressive time slicing. @wbinnssmith also saw this issue on an Intel MacBook Pro though, but it sounded like it might have happened after a significant amount of writes. Maybe macOS related in general? For context, we are looking to switch from a filesystem-based cache in Parcel to LMDB. We're seeing some pretty impressive performance wins from doing that (~20% faster builds). Thanks so much for your help with this issue! |
Fascinating. I was actually just wondering if it would be possible to find a computer with M1 silicon to test lmdb-store, and see if it has any issues. Thanks for doing that for me, looks like we may found an issue! (I do actually test with an old MBP that I have, and never saw this with the multi-threaded benchmark tests, but my MBP is probably from circa 2008 or so, so may have not be that close to modern macs with subtle thread timing). |
I just ran lmdb-store@c346d2c with Parcel on an Intel Mac and unfortunately the pointer is still being freed and crashing the program. It does take some time to reproduce, running across multiple threads and sometimes for many tens of seconds, but it does happen. @kriszyp would an automated reproduction of the issue help even if it's in the much larger context of Parcel? |
Sure, I would be happy to check out any reproduction of the issue, even in a larger context. |
Fix published in v1.5.2. |
Thanks again @kriszyp! |
When using lmdb-store from node worker_threads, I sometimes get crashes with a message like
I did a debug build of lmdb-store, and managed to get the process to stop in lldb. Here's the backtrace:
We verified that this only happens when running with worker threads, so I'm guessing that somehow a transaction is getting committed/ended twice on different threads? Is lmdb-store meant to be thread safe?
I'm still trying to come up with a minimal reproduction and debug this. But any pointers you have would be helpful. Thanks!
The text was updated successfully, but these errors were encountered: