db: more cache budget for BODIES and EXTRA columns #11548
If I understand you correctly this PR stems from the observation that the column sizes do not match the cache memory allocated to them; I think you're saying "the three columns are roughly of the same size, they should have roughly the same amount of memory assigned"?
It is a really good question but I have several hard-to-answer questions:
- isn't it true that the read/write pattern to `COL_STATE` is much less regular than to the others? It is really tricky to make any kind of caching efficient when users control the kinds of queries by deploying solidity code and making token transfers from addresses that can be anywhere in the DB? Allocating as much memory as possible to speed up seeks in `COL_STATE` still seems like a smart move.
- why isn't `COL_BODIES` ever pruned? This might be the dumbest question ever, but when a node warp syncs it doesn't have all block bodies back to genesis, does it? We backfill ancient blocks, but what is the actual purpose of that: crucial for security or "nice to have"?
- I was under the illusion that `COL_EXTRA` was a column where we tossed random bits and pieces we didn't have a better place for, e.g. the best block etc. TIL that is not at all the case but why do we need to store all transaction receipts for ever? Can't we prune this? From a cursory glance at the code using `COL_EXTRA` it seems like we're mostly writing to it, but most reads seem to be for the "first", "best" and "ancient" keys; if it is indeed mostly appended to, spending cache on it is likely wasted? Maybe we need a `COL_RECEIPT`?
EDIT: There is no benchmarking data here – do you have any? What changes with the cache redistribution?
The problem is that it's not just the memory assigned, it's the size of levels L0 and L1 in rocksdb, which affects the overall db (column) layout.
I don't know what you mean, ELI5 pls?
@ordian FYI the stats of
@dvdplm this may help
Sort of, I've read that file many times but I still have a hard time getting an intuition for what config values are relevant to us. https://github.com/facebook/rocksdb/wiki/Leveled-Compaction is a good read too, but only somewhat related to this PR. Do you have similar stats for a DB without this patch? Do you expect the level distribution to be different? Why?
What I meant is that it's not just the memory assigned, if you look how we use memory budget,
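The point about L0 and L1 sizes can be illustrated with a rough sketch. It assumes the per-column budget is passed to RocksDB's `OptimizeLevelStyleCompaction` helper, and that the helper derives the write buffer size, target file size and L1 base size from the budget as shown; neither assumption is confirmed by this thread. `LevelSizing`, `level_sizing` and `level_target` are hypothetical names used only for illustration.

```rust
/// Sizes (in bytes) derived from a per-column memory budget.
/// Hypothetical struct, for illustration only.
#[derive(Debug, PartialEq)]
struct LevelSizing {
    write_buffer_size: u64,
    target_file_size_base: u64,
    max_bytes_for_level_base: u64,
}

/// Assumption: mirrors the ratios used by RocksDB's
/// `OptimizeLevelStyleCompaction(memtable_memory_budget)`:
/// memtable = budget / 4, SST file target = budget / 8, L1 base = budget.
fn level_sizing(budget: u64) -> LevelSizing {
    LevelSizing {
        write_buffer_size: budget / 4,
        target_file_size_base: budget / 8,
        max_bytes_for_level_base: budget,
    }
}

/// Target size of level `level` (1-based), assuming static leveled
/// compaction where each level is `multiplier` times the previous one.
fn level_target(level_base: u64, multiplier: u64, level: u32) -> u64 {
    assert!(level >= 1, "levels are counted from L1 here");
    level_base * multiplier.pow(level - 1)
}
```

Under these assumptions, raising a column's budget from 16 MiB to 64 MiB also quadruples the target size of every level (with RocksDB's default multiplier of 10, L2 would go from 160 MiB to 640 MiB), so the same amount of data ends up spread over fewer levels. That is one way a budget change can alter the level distribution, not just the amount of memory in use.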
@dvdplm Sorry that I only left a few words without any descriptive information. I just wanted to explain why it changes the level distribution. One problem, though: I don't have a non-patched state db right now.
It seems that our assumption about db sizes was not adequate to reality.
Changing the default memory distribution seems to help with #11494.
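
As a minimal sketch of what redistributing the cache budget means, the helper below splits a total budget across columns by percentage share. The column names come from the discussion above; the function name `split_budget` and the percentages in the usage note are hypothetical and are not the values used in this PR.

```rust
use std::collections::HashMap;

/// Split `total_mib` across columns according to integer percentage shares.
/// Hypothetical helper for illustration; shares must add up to 100.
fn split_budget(
    total_mib: usize,
    shares: &[(&'static str, usize)],
) -> HashMap<&'static str, usize> {
    let sum: usize = shares.iter().map(|(_, pct)| pct).sum();
    assert_eq!(sum, 100, "shares must add up to 100%");
    shares
        .iter()
        .map(|&(col, pct)| (col, total_mib * pct / 100))
        .collect()
}
```

For example, `split_budget(128, &[("COL_STATE", 50), ("COL_BODIES", 25), ("COL_EXTRA", 25)])` assigns 64 MiB to `COL_STATE` and 32 MiB to each of the other two columns. Whether such a split actually helps is the open benchmarking question raised in the review.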