
Dynamic LMDB mapsize allocation [1.1.0] #2605

Merged: 16 commits, Feb 27, 2019

Conversation

yeastplume
Member

@yeastplume yeastplume commented Feb 19, 2019

Mostly to support #2525, but also to make the backend store a bit more flexible. This:

  • Allocates DB space in chunks of 128MB at a time
  • Checks whether more space is needed every time store.batch() is called; if so, increases the mapsize in 128MB chunks until the space used relative to the mapsize falls beneath a threshold (currently 65%, which can be tweaked later)

Does this need to be any more complicated than this?
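As an illustrative sketch of the allocation policy described above (the chunk size and threshold match the PR description; the function name and shape are hypothetical, not the actual grin-store code):

```rust
/// Number of bytes to grow the database by when needed (128 MB, as in the PR).
const ALLOC_CHUNK_SIZE: u64 = 134_217_728;
/// Resize when used space exceeds this fraction of the current mapsize.
const RESIZE_PERCENT: f64 = 0.65;

/// Returns the new mapsize if a resize is needed, growing in fixed
/// 128 MB chunks until usage falls below the threshold; None otherwise.
fn target_mapsize(size_used: u64, mapsize: u64) -> Option<u64> {
    if (size_used as f64) < RESIZE_PERCENT * mapsize as f64 {
        return None;
    }
    let mut new_mapsize = mapsize;
    while size_used as f64 >= RESIZE_PERCENT * new_mapsize as f64 {
        new_mapsize += ALLOC_CHUNK_SIZE;
    }
    Some(new_mapsize)
}
```

On each store.batch() call, a check along these lines would decide whether to bump the mapsize before the write txn is opened.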

phatness: u64,
}

impl PhatChunkStruct {
Member

😹

@yeastplume yeastplume changed the title Dynamic LMDB mapsize allocation Dynamic LMDB mapsize allocation [1.1.0] Feb 19, 2019
@yeastplume yeastplume added this to the 1.1.0 milestone Feb 19, 2019
@antiochp
Copy link
Member

Is it worth considering doing a "double the size each time" strategy?
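For comparison, a doubling strategy (hypothetical sketch only; the PR as described uses fixed chunks) might look like:

```rust
/// Hypothetical alternative growth policy: double the mapsize on each
/// resize rather than adding fixed chunks. Fewer resizes as the DB
/// grows, at the cost of over-allocating address space.
fn doubled_mapsize(mapsize: u64, size_used: u64) -> u64 {
    let mut new_mapsize = mapsize;
    // keep doubling until usage is below half the new mapsize
    while size_used >= new_mapsize / 2 {
        new_mapsize = new_mapsize.saturating_mul(2);
    }
    new_mapsize
}
```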

env_info.mapsize + ALLOC_CHUNK_SIZE
};
unsafe {
self.env.set_mapsize(new_mapsize)?;
Member Author

Yep 👍 I'll look into enforcing that. I was thinking that calling it only from the batch creation function would help here, but of course that doesn't take multiple threads with open txns into account.

Member Author

In the Monero code it seems to be implemented via a simple reference count on a global atomic. @antiochp @ignopeverell As far as I can see within the node, the Store struct is never wrapped in any mutexes; can you confirm whether multiple threads might be trying to access the ChainStore or PeerStore at any given time?

Also a bit of an issue here with multiple wallet invocations trying to access the store at the same time, which is possible under current architecture.

@antiochp (Member) Feb 20, 2019

Are we talking any txns here? Or just write txs? (All lmdb access is via a read txn or write txn).

If write txns then lmdb itself is the mutex - it only supports a single write txn at a time (across all threads).
If we successfully create a batch then we're good to go - we guarantee no other thread has a write txn active.
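The single-writer invariant described here can be modelled with a toy sketch (illustrative only; a std Mutex stands in for LMDB's internal writer lock, and Db/Batch are not the real grin-store types):

```rust
use std::sync::{Mutex, MutexGuard};

/// Toy model: LMDB permits a single write txn at a time, so holding a
/// "batch" implies no other thread is writing.
struct Db {
    writer: Mutex<()>,
}

/// While a Batch is alive it owns the writer lock, so writer-exclusive
/// work (such as a resize) is safe.
struct Batch<'a> {
    _write_lock: MutexGuard<'a, ()>,
}

impl Db {
    fn batch(&self) -> Batch<'_> {
        // blocks until we are the sole writer
        Batch {
            _write_lock: self.writer.lock().unwrap(),
        }
    }
}
```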

Contributor

My understanding is that setting the db mapsize rearranges the indices, affecting reads, too.

Member

We don't actually have any mechanism to expose long-lived read txns via our store impl, so I think we're fine there; every read is effectively in its own read txn currently.
So if we take a write lock via our batch and then resize the db we should be good - the next read will simply create a new read txn on the resized db.

@@ -24,6 +24,10 @@ use lmdb_zero::LmdbResultExt;

use crate::core::ser;

/// number of bytes to grow the database by when needed
pub const ALLOC_CHUNK_SIZE: usize = 134_217_728; //128 MB
Contributor

Seems a bit low. When Grin has full blocks, this could happen every hour or so, especially for archive peers. I'd be curious to see some metrics around how long this takes to resize. I'd also be interested to see how disruptive mdb_txn_safe::prevent_new_txns() and mdb_txn_safe::wait_no_active_txns() are (see comment on line 152). If either of those operations have a noticeable effect on performance, it'd be better to resize less often.

Member Author

Sure, can try and collect some metrics once that's implemented.

Member Author

Just another thought here: it might be debatable whether this is the right size for the chain, but it's already too large for the wallet and peer DBs. I might think about making this a parameter somehow, without adding too much cruft.

Contributor

Good point. Maybe we should consider @antiochp's doubling proposal? Or will that eventually be too excessive?

Contributor

Seems easy enough to tune once we have live data.

}

/// Increments the database size by one ALLOC_CHUNK_SIZE
pub fn do_resize(&self) -> Result<(), Error> {
Contributor

Should we check available space, and try to fail more gracefully? Or do you think that's more complexity than it's worth? It's trivial to do a check like that in C++, but not sure if Rust provides APIs for that.

Member Author

I'd thought about this as well, but if you try to allocate more than the available disk space you get:

Error: LmdbErr(Error::Code(12, 'Cannot allocate memory'))

Which I think is graceful enough without having to add more complexity here.

Contributor

Cool. I agree.

@yeastplume
Member Author

Right, unfortunately just performing the resize within calls to batch results in segfaulty behaviour somewhere in LMDB. It would seem, all things considered, that the safest thing to do here on resize is to close/drop the database entirely, perform the resize, then re-open.

I've tested this by setting the chunk size to something very small (2MB) and syncing a chain from scratch. With the close/reopen behaviour in place it expands the db as needed and fully syncs without issue. Without it, it inevitably segfaults somewhere.

The downside here is that the calls to open and close the db, and therefore batch, now require a mutable reference, which means that higher-up references in ChainStore and PeerStore now need to be wrapped in mutexes. It's more cumbersome, but you could argue it's more belt-and-suspendery, since we're now much more sure it's safe to reallocate the DB size at the point it's being done. Also, I believe Windows will need this in place anyhow in order to resize, due to its aggressive file locking.

@yeastplume
Member Author

Also, I refactored the store itself to be more encapsulated and to ensure callers don't need to explicitly import the lmdb crate.

@antiochp
Member

The downside here is that the calls to open and close the db, and therefore batch now requires a mutable reference, which means that higher-up references in ChainStore and PeerStore now need to be wrapped in mutexes.

That would mean we lose any ability to have multiple readers on the db simultaneously?
Right now that's a nice benefit of LMDB that may be hard to give up (it lets multiple peer threads read the db for some early existence checks etc.)

@yeastplume
Member Author

I've just tried implementing as an RwLock here. How much of a performance hit is this likely to be for the peer store?

@@ -142,7 +141,7 @@ impl OrphanBlockPool {
/// maintains locking for the pipeline to avoid conflicting processing.
pub struct Chain {
db_root: String,
store: Arc<store::ChainStore>,
store: Arc<RwLock<store::ChainStore>>,
Contributor

I'm fine with the use of a RwLock but can't this be pushed down into our LMDB store? All regular operations would be considered a read, except for the close/open of the DB which would take the write.

Member Author

Took a bit of doing and testing, but think I've managed it in the latest push.

@yeastplume
Member Author

Think this is ready for merging if anyone wants to give a final review and a little thumbs up somewhere. Since the last comments I've:

  • Moved the RwLock into store::Store itself, and moved all the locks to occur before each read/write transaction. I've tested by setting the increment chunk size to a small value (1MB at a time) and syncing from scratch, doing loads of DB resizes along the way; the current iteration works without issue.
  • I've changed the allocation size logic to keep allocating (currently 128MB) chunks until at least 45% of total mapsize is free. No reason for choosing this value, but it can be tweaked at any stage.
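The arrangement in the first bullet might be sketched roughly as follows (illustrative stand-ins only, not the actual store::Store; reads take the shared lock, while the close/resize/reopen path takes the exclusive one):

```rust
use std::sync::RwLock;

/// Simplified stand-in for the real LMDB environment handle.
struct Database {
    mapsize: u64,
}

/// The RwLock lives inside the store, so callers never deal with it.
struct Store {
    db: RwLock<Option<Database>>,
}

impl Store {
    /// All regular operations take the read lock before their txn.
    fn mapsize(&self) -> Option<u64> {
        self.db.read().unwrap().as_ref().map(|db| db.mapsize)
    }

    /// Only the resize path takes the write lock, blocking until all
    /// readers are done; it then closes, resizes, and reopens the db.
    fn resize(&self, new_mapsize: u64) {
        let mut guard = self.db.write().unwrap();
        *guard = None; // "close" the db
        *guard = Some(Database { mapsize: new_mapsize }); // "reopen" resized
    }
}
```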

@ignopeverell (Contributor) left a comment

Very nice looking, I like that it removes the lmdb dependency in a few crates!

@yeastplume yeastplume merged commit beaae28 into mimblewimble:milestone/1.1.0 Feb 27, 2019
@antiochp
Member

I've just tried implementing as an RwLock here. How much of a performance hit is this likely to be for the peer store?

I suspect effectively zero given everything else going on.

@yeastplume yeastplume deleted the lmdb_resize branch March 4, 2019 10:04
@antiochp antiochp added the release notes To be included in release notes (of relevant milestone). label Jun 5, 2019
Labels: enhancement, release notes