LMDB store #192
Some very simple benchmarks suggest this is worth adding: performance is better than DirectoryStore or DBMStore with BerkeleyDB, and close to, if not better than, the in-memory store (dict).
Updated benchmarks with larger arrays and some simple dask workflows. These generally confirm that lmdb performs better than bdb.
I've updated the benchmark notebook again to include zip store and dask examples. It also demonstrates that zip store is thread safe.
Another benchmark update including ndbm and bdb btree versus hash.
cc @jeromekelleher: if you're already getting good performance with Berkeley DB then this probably won't make much difference, but FWIW LMDB looks quite a bit faster, almost as fast as memory, probably because the whole database is memory-mapped.
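For readers unfamiliar with memory mapping, here is a minimal sketch of the mechanism using only Python's standard `mmap` module. It is not LMDB's API and says nothing about how LMDB lays out its data; the file name, header size and payload are made up for illustration.

```python
import mmap
import os
import tempfile

HEADER = b"header--"          # hypothetical 8-byte header
PADDING = b"x" * 4096         # filler so the payload sits past the first page
PAYLOAD = b"chunk-payload"    # hypothetical stored value

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(HEADER + PADDING + PAYLOAD)

    with open(path, "rb") as f:
        # Map the whole file read-only; the OS pages data in on demand,
        # so reading a value is a slice of memory rather than a read() call.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = len(HEADER) + len(PADDING)
            payload = mm[start:start + len(PAYLOAD)]
```

Once the pages are in the OS cache, lookups like the slice above avoid explicit read system calls, which is one plausible reason a memory-mapped store can approach in-memory speed.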
FTR I've added some locks to DictStore and DBMStore as well. These don't seem to impact performance at all, even under parallel workloads. They may not be strictly necessary for DictStore because of the GIL, and may not be necessary for DBMStore because at least some of the DBM implementations (GDBM, Berkeley) claim to do their own locking, but they seemed worth the extra caution pending some deeper analysis.
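As a rough illustration of the locking idea described above, here is a minimal sketch of a dict-backed store that guards every operation with a lock. The class name `LockedDictStore` and the helper `write_chunks` are hypothetical and are not the code in this PR; the sketch only assumes that a store is a `MutableMapping` from string keys to bytes.

```python
import threading
from collections.abc import MutableMapping


class LockedDictStore(MutableMapping):
    """Hypothetical dict-backed store; every operation takes a lock."""

    def __init__(self):
        self._data = {}
        # A re-entrant lock, so a method could safely call another
        # locked method on the same store from the same thread.
        self._lock = threading.RLock()

    def __getitem__(self, key):
        with self._lock:
            return self._data[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = value

    def __delitem__(self, key):
        with self._lock:
            del self._data[key]

    def __iter__(self):
        with self._lock:
            # Iterate over a snapshot so concurrent writers cannot
            # invalidate the iterator.
            return iter(list(self._data))

    def __len__(self):
        with self._lock:
            return len(self._data)


def write_chunks(store, prefix, n):
    """Write n small fake chunks under the given key prefix."""
    for i in range(n):
        store[f"{prefix}/{i}"] = bytes([i % 256]) * 16


store = LockedDictStore()
threads = [
    threading.Thread(target=write_chunks, args=(store, f"arr{t}", 100))
    for t in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock is uncontended most of the time, which is consistent with the observation that it has no measurable cost in the benchmarks.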
I think this is ready to go.
Read/write performance really isn't important to me, as the vast majority of the time is spent in de/compression. LMDB sounds good though. Definitely nice to hear about the extra locking; I was a little worried about letting it all happen at the DB layer.
Have you played with the number of threads Zarr’s Blosc uses for decompression? Also there may be other tricks to play with like filtering before compression.
I've played around with it a bit all right, but it's not a bottleneck for me. I'm doing compression/decompression in worker threads while the rest of the program is doing other stuff, so I'm not particularly sensitive to the performance.
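A minimal sketch of the worker-thread pattern mentioned above, for anyone who wants to try it. It uses the standard library's `zlib` as a stand-in for Blosc (an assumption made so the example has no third-party dependencies), and the chunk contents and pool size are arbitrary.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_chunk(chunk: bytes) -> bytes:
    # zlib stands in for Blosc here; level 1 favours speed over ratio.
    return zlib.compress(chunk, 1)


# Eight fake 64 KiB chunks, each filled with a single repeated byte.
chunks = [bytes([i]) * 65536 for i in range(8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(compress_chunk, c) for c in chunks]
    # The main thread is free to do other work here while the
    # workers compress in the background.
    compressed = [f.result() for f in futures]
    # Decompress in the same pool; map() preserves input order.
    restored = list(pool.map(zlib.decompress, compressed))
```

CPython's `zlib` releases the GIL while compressing and decompressing, so the worker threads can overlap with other work in the program.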
This PR adds a new LMDBStore class. Also some other improvements to the storage module, including making ZipStore thread safe (resolves #194).