Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid repeated storage reads (reads amplification) #443

Closed
matklad opened this issue Feb 15, 2022 · 5 comments
Closed

Avoid repeated storage reads (reads amplification) #443

matklad opened this issue Feb 15, 2022 · 5 comments

Comments

@matklad
Copy link
Contributor

matklad commented Feb 15, 2022

Today aurora does a lot of IO, and it seems that a significant fraction of that can be avoidable. Here's a list of potential places to look at:

Sputnik's gas metering

In particular

These op-codes feel like they fetch account info repeatedly for gas counting purposes, and then once again when actually executing op-codes.

The best way to optimize this is probably to run the standalone-runner and check that no repeated calls are made. For example, if I add the following print:

diff --git a/engine-standalone-storage/src/engine_state.rs b/engine-standalone-storage/src/engine_state.rs
index 0f42b85..ee5ccb6 100644
--- a/engine-standalone-storage/src/engine_state.rs
+++ b/engine-standalone-storage/src/engine_state.rs
@@ -90,6 +90,8 @@ impl<'db, 'input: 'db, 'output: 'db> IO for EngineStateAccess<'db, 'input, 'outp
     }
 
     fn read_storage(&self, key: &[u8]) -> Option<Self::StorageValue> {
+        eprintln!("key = {:?}", key);
+
         if let Some(diff) = self.transaction_diff.borrow().get(key) {
             return diff
                 .value()

and run the uniswap benchmark, I get this bit of output which clearly shows needless fetches:

key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 3, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 7, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]

eprintln!("{:?}", ::backtrace::Backtrace::new()); seems to confirm that those come from gas counting

Basic Account Info

It seems that we often fetch three bits of account info at the same time: ballance, nonce, and "is the code deployed" bit. It seems worthwhile to pack all account info (except for actual code) into a single storage record. This can be implemented as a dynamic migration: we add new key for compact info and, while fetching the code, we first try to fetch it from the new key, and, if that fails, migrate old key to the new key.

Storage Batching

More generally, today the base read cost is 10_000 higher than the per-byte cost, so generally the code should be optimized for fetching kilobytes of data at a time.

@joshuajbouw
Copy link
Contributor

joshuajbouw commented Feb 15, 2022

Good to know! Thanks for finding this. Caching would be ideal here.

@matklad
Copy link
Contributor Author

matklad commented Feb 18, 2022

diff --git a/src/executor/stack/executor.rs b/src/executor/stack/executor.rs
index 3d584d9..1796018 100644
--- a/src/executor/stack/executor.rs
+++ b/src/executor/stack/executor.rs
@@ -974,7 +974,7 @@ impl<'config, 'precompiles, S: StackState<'config>, P: PrecompileSet> Handler
 		if self.config.empty_considered_exists {
 			self.state.exists(address)
 		} else {
-			self.state.exists(address) && !self.state.is_empty(address)
+			!self.state.is_empty(address)
 		}
 	}

I think this diff improves NEAR gas by 5% for the uniswap benchmark, but I don't fully understand why. Specifically, if I only the exists check and remove is_empty, the gas usage stays the same.

@matklad
Copy link
Contributor Author

matklad commented Feb 18, 2022

(that's a sputnik diff)

@joshuajbouw joshuajbouw changed the title Avoid repated storage reads (reads amplification) Avoid repeated storage reads (reads amplification) Feb 20, 2022
@mfornet
Copy link
Contributor

mfornet commented Jun 29, 2022

@joshuajbouw @birchmd is this issue still relevant. I remember several optimizations with regard to repeated reads, and I wonder if there is potentially more to do here?

@birchmd
Copy link
Member

birchmd commented Jun 29, 2022

No, this is issue is resolved. We have full storage caching in the engine now as of #488

@birchmd birchmd closed this as completed Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants