-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Execution state] complete ledger - Separation of node types (leaf, intermediate) #1745
Comments
Hi @ramtinms! Regarding this taskMy proposal to implement this task is the following:
The potential drawback with this is that while it improves clarity and maintainability significantly, it comes at a potentially small but non-negligible cost in performance. I could run benchmarks to quantify the impact that this has if you think it is necessary. AFAIK, the overhead is very minimal, but I assume that every thousandth of a nanosecond counts, for such a critical component as the ledger trie. When it comes to the memory usage savings, we can already compute an approximation of what it would be. Since in the mainnet checkpoint there are currently ~150 million nodes, about 70 million of which are leaves and 80 are branches (if I recall correctly), that would result in about 1.2GB of savings for the leaves and 3.2GB of savings for branches, for a total of 4.4GB of savings over ~30GB of raw data, so an expected improvement of around 15%. Is that what you had in mind? Extension Node proposalExperimental results show that when using extension nodes with the mainnet-14 checkpoint, the size of the trie after rebuilding it from the checkpoint is reduced by around 30%, going from 14.5GB (without payloads) to 10.3GB. The process of trimming the trie however, which is done by essentially going through all of the leaves of the original trie and recreating a new one from them, turns out to be quite slow. On the machine I use for testing, it takes about 30mns to read the checkpoint and rebuild the nodes, but then it takes 5 to 6 hours to trim the resulting trie. Once the new trie is ready however, inserting and reading values should be slightly faster, since there are less nodes to traverse to reach leaves. I think it's a tradeoff that might be worth discussing, WDYT? |
Hi Brendan, I agree there might be a potential penalty and we probably should run benchmarks. At a glance, performance of interface function calls vs non-interface function calls appear to be similar if the function isn't inlined. The performance hit is larger when the function can be inlined. Since Go 1.17 can inline some functions that couldn't be inlined with 1.16, maybe both versions should be used in benchmarks. Very preliminary benchmarks using the example code from the Stack Overflow link showed: https://go.dev/play/p/E9H_4K2J9- // when function can be inlined, performance hit on interface is larger
BenchmarkIntmethod-4 428540371 2.859 ns/op 0 B/op 0 allocs/op
BenchmarkInterface-4 388212188 3.089 ns/op 0 B/op 0 allocs/op
BenchmarkTypeSwitch-4 512972532 2.336 ns/op 0 B/op 0 allocs/op
BenchmarkTypeAssertion-4 495534915 2.416 ns/op 0 B/op 0 allocs/op // when function cannot be inlined, performance hit on interface is smaller
BenchmarkIntmethod-4 385235356 3.088 ns/op 0 B/op 0 allocs/op
BenchmarkInterface-4 388532904 3.089 ns/op 0 B/op 0 allocs/op
BenchmarkTypeSwitch-4 387217621 3.088 ns/op 0 B/op 0 allocs/op
BenchmarkTypeAssertion-4 388183740 3.088 ns/op 0 B/op 0 allocs/op When benchmarks were run with count=20, the not-inlined calls averaged 3.09 ns/op for all 4 benchmarks. name old time/op new time/op delta
Intmethod-4 2.84ns ± 1% 3.09ns ± 0% +8.99% (p=0.000 n=20+19)
Interface-4 3.09ns ± 0% 3.09ns ± 0% +0.07% (p=0.045 n=18+20)
TypeSwitch-4 2.33ns ± 0% 3.09ns ± 0% +32.52% (p=0.000 n=20+20)
TypeAssertion-4 2.42ns ± 0% 3.09ns ± 0% +27.44% (p=0.000 n=19+20) Benchmarks were run on Haswell CPU and compiled with Go 1.16. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
The current implementation of the complete ledger re-uses the same node structure type for both leaf and intermediate nodes. Separating the node types into two structures should improve memory consumption because of the way go handles struct alignment (memory used). For leaf nodes, we don't need the pointers to right and left children and for intermediate nodes, we don't need a pointer for payloads and paths.
The current code has already support for different construction methods for node creation so that should make it easier to separate the node types, though checkpointer right now uses a default constructor and might need some workaround.
Potential challenges
The text was updated successfully, but these errors were encountered: