Avoid repeated storage reads (reads amplification) #443

matklad · 2022-02-15T10:10:57Z

Today Aurora does a lot of IO, and it seems that a significant fraction of it is avoidable. Here's a list of potential places to look at:

Sputnik's gas metering

In particular

SSTORE
CALL

These op-codes feel like they fetch account info repeatedly for gas counting purposes, and then once again when actually executing op-codes.

The best way to optimize this is probably to run the standalone-runner and check that no repeated calls are made. For example, if I add the following print:

diff --git a/engine-standalone-storage/src/engine_state.rs b/engine-standalone-storage/src/engine_state.rs
index 0f42b85..ee5ccb6 100644
--- a/engine-standalone-storage/src/engine_state.rs
+++ b/engine-standalone-storage/src/engine_state.rs
@@ -90,6 +90,8 @@ impl<'db, 'input: 'db, 'output: 'db> IO for EngineStateAccess<'db, 'input, 'outp
     }
 
     fn read_storage(&self, key: &[u8]) -> Option<Self::StorageValue> {
+        eprintln!("key = {:?}", key);
+
         if let Some(diff) = self.transaction_diff.borrow().get(key) {
             return diff
                 .value()

and run the uniswap benchmark, I get this bit of output which clearly shows needless fetches:

key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 3, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 1, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 2, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]
key = [7, 7, 253, 203, 97, 115, 194, 238, 228, 6, 212, 142, 5, 215, 129, 170, 42, 179, 151, 105, 255, 105]

Adding eprintln!("{:?}", ::backtrace::Backtrace::new()); to the same spot seems to confirm that those reads come from gas counting.
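For longer runs, a tally per key can be easier to read than raw prints. The following is a minimal sketch, not part of the engine: `ReadCounter` is a hypothetical helper that one could call from `read_storage` in place of the `eprintln!`, and it only assumes that keys are byte slices.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical helper (not an engine type): counts reads per storage key.
#[derive(Default)]
pub struct ReadCounter {
    counts: RefCell<HashMap<Vec<u8>, usize>>,
}

impl ReadCounter {
    /// Record one read of `key`. Takes `&self` so it can be called from a
    /// method with a `read_storage(&self, ...)`-style signature.
    pub fn record(&self, key: &[u8]) {
        *self.counts.borrow_mut().entry(key.to_vec()).or_insert(0) += 1;
    }

    /// Keys that were read more than once, most-read first
    /// (ties are broken by key so the order is deterministic).
    pub fn repeated(&self) -> Vec<(Vec<u8>, usize)> {
        let mut out: Vec<(Vec<u8>, usize)> = self
            .counts
            .borrow()
            .iter()
            .filter(|(_, n)| **n > 1)
            .map(|(k, n)| (k.clone(), *n))
            .collect();
        out.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        out
    }
}
```

Dumping `repeated()` at the end of a benchmark run would list exactly the keys worth caching or batching.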

Basic Account Info

It seems that we often fetch three bits of account info at the same time: balance, nonce, and the "is the code deployed" bit. It seems worthwhile to pack all account info (except for the actual code) into a single storage record. This can be implemented as a dynamic migration: we add a new key for the compact info and, while fetching the code, we first try to fetch it from the new key, and, if that fails, migrate the old key to the new key.
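A rough sketch of what such a lazy migration could look like, under loud assumptions: the `Store` type is a toy in-memory map standing in for contract storage, the key prefixes (`BALANCE`, `NONCE`, `CODE`, `PACKED`) and the 65-byte record layout are made up for illustration, and none of the names below are the engine's actual API.

```rust
use std::collections::HashMap;

/// Toy in-memory store standing in for contract storage (assumption).
pub type Store = HashMap<Vec<u8>, Vec<u8>>;

// Hypothetical key prefixes; the real engine's key scheme may differ.
pub const BALANCE: u8 = 1;
pub const NONCE: u8 = 2;
pub const CODE: u8 = 3;
pub const PACKED: u8 = 9;

pub fn key(prefix: u8, address: &[u8; 20]) -> Vec<u8> {
    let mut k = vec![prefix];
    k.extend_from_slice(address);
    k
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AccountInfo {
    pub balance: [u8; 32],
    pub nonce: [u8; 32],
    pub has_code: bool,
}

impl AccountInfo {
    /// Illustrative layout: balance (32 bytes) | nonce (32 bytes) | has_code (1 byte).
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(65);
        out.extend_from_slice(&self.balance);
        out.extend_from_slice(&self.nonce);
        out.push(self.has_code as u8);
        out
    }

    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 65 {
            return None;
        }
        let (mut balance, mut nonce) = ([0u8; 32], [0u8; 32]);
        balance.copy_from_slice(&bytes[..32]);
        nonce.copy_from_slice(&bytes[32..64]);
        Some(Self { balance, nonce, has_code: bytes[64] != 0 })
    }
}

fn read_word(store: &Store, k: &[u8]) -> Option<[u8; 32]> {
    let v = store.get(k)?;
    if v.len() != 32 {
        return None;
    }
    let mut w = [0u8; 32];
    w.copy_from_slice(v);
    Some(w)
}

/// Try the packed record first (one read); on a miss, fall back to the
/// legacy per-field keys and migrate them into the packed record.
pub fn read_account(store: &mut Store, address: &[u8; 20]) -> AccountInfo {
    let packed_key = key(PACKED, address);
    if let Some(info) = store.get(&packed_key).and_then(|b| AccountInfo::from_bytes(b)) {
        return info;
    }
    let (balance_key, nonce_key) = (key(BALANCE, address), key(NONCE, address));
    let balance = read_word(store, &balance_key);
    let nonce = read_word(store, &nonce_key);
    let has_code = store.contains_key(&key(CODE, address));
    let info = AccountInfo {
        balance: balance.unwrap_or([0u8; 32]),
        nonce: nonce.unwrap_or([0u8; 32]),
        has_code,
    };
    // Only migrate accounts that actually had legacy data.
    if balance.is_some() || nonce.is_some() || has_code {
        store.insert(packed_key, info.to_bytes());
        store.remove(&balance_key);
        store.remove(&nonce_key);
    }
    info
}
```

A real implementation would also have to route every balance, nonce, and code-deployment write through the packed record so the `has_code` bit cannot go stale; the sketch covers only the read path.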

Storage Batching

More generally, today the base read cost is about 10_000 times higher than the per-byte cost, so the code should generally be optimized for fetching kilobytes of data at a time.
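To make the ratio concrete, here is a small worked calculation. The constants are illustrative units chosen only to match the 10_000 ratio stated above; they are not the actual NEAR gas parameters, and `read_cost` is a hypothetical helper.

```rust
/// Illustrative cost units (assumption): only the 10_000 ratio comes from
/// the text above, the absolute numbers are not real gas prices.
pub const PER_BYTE_COST: u64 = 1;
pub const BASE_READ_COST: u64 = 10_000 * PER_BYTE_COST;

/// Total cost of `reads` storage reads that return `total_bytes` bytes overall.
pub fn read_cost(reads: u64, total_bytes: u64) -> u64 {
    reads * BASE_READ_COST + total_bytes * PER_BYTE_COST
}

/// Three separate 32-byte reads (e.g. balance, nonce, code flag as words).
pub fn three_small_reads() -> u64 {
    read_cost(3, 3 * 32) // 3 * 10_000 + 96 = 30_096
}

/// The same 96 bytes fetched as one record.
pub fn one_packed_read() -> u64 {
    read_cost(1, 96) // 10_000 + 96 = 10_096
}

/// A single read of a full kilobyte.
pub fn one_kib_read() -> u64 {
    read_cost(1, 1024) // 10_000 + 1_024 = 11_024
}
```

Under these units, one read of a full kilobyte still costs far less than two tiny reads, which is why the number of reads matters much more than their size.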


joshuajbouw · 2022-02-15T14:30:15Z

Good to know! Thanks for finding this. Caching would be ideal here.
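As a rough illustration of the caching idea (a sketch only, not the cache that was later merged into the engine): `CachedReader` is a hypothetical wrapper around any backend read function, and it remembers misses as well as hits so that a repeated lookup of an absent key does not reach storage again.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical read-through cache around a storage read function.
pub struct CachedReader<F>
where
    F: Fn(&[u8]) -> Option<Vec<u8>>,
{
    backend: F,
    // `None` entries record keys known to be absent.
    cache: RefCell<HashMap<Vec<u8>, Option<Vec<u8>>>>,
}

impl<F> CachedReader<F>
where
    F: Fn(&[u8]) -> Option<Vec<u8>>,
{
    pub fn new(backend: F) -> Self {
        Self { backend, cache: RefCell::new(HashMap::new()) }
    }

    /// Return the cached value if present; otherwise hit the backend once
    /// and remember the result.
    pub fn read(&self, key: &[u8]) -> Option<Vec<u8>> {
        let cached = self.cache.borrow().get(key).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        let value = (self.backend)(key);
        self.cache.borrow_mut().insert(key.to_vec(), value.clone());
        value
    }

    /// Must be called whenever `key` is written or removed, otherwise
    /// later reads would return stale data.
    pub fn invalidate(&self, key: &[u8]) {
        self.cache.borrow_mut().remove(key);
    }
}
```

The interior `RefCell` keeps `read` callable through `&self`, which matches the shape of the `read_storage(&self, ...)` method shown in the diff above.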

matklad · 2022-02-18T17:40:57Z

diff --git a/src/executor/stack/executor.rs b/src/executor/stack/executor.rs
index 3d584d9..1796018 100644
--- a/src/executor/stack/executor.rs
+++ b/src/executor/stack/executor.rs
@@ -974,7 +974,7 @@ impl<'config, 'precompiles, S: StackState<'config>, P: PrecompileSet> Handler
 		if self.config.empty_considered_exists {
 			self.state.exists(address)
 		} else {
-			self.state.exists(address) && !self.state.is_empty(address)
+			!self.state.is_empty(address)
 		}
 	}

I think this diff improves NEAR gas usage by 5% for the uniswap benchmark, but I don't fully understand why. Specifically, if I only keep the exists check and remove is_empty, the gas usage stays the same.

matklad · 2022-02-18T17:41:11Z

(that's a sputnik diff)

mfornet · 2022-06-29T10:50:17Z

@joshuajbouw @birchmd is this issue still relevant? I remember several optimizations with regard to repeated reads, and I wonder if there is potentially more to do here.

birchmd · 2022-06-29T10:53:35Z

No, this issue is resolved. We have full storage caching in the engine now, as of #488.

birchmd mentioned this issue Feb 15, 2022

Fix(engine): Simple cache to stop consecutive duplicate reads #446

Merged

joshuajbouw changed the title ~~Avoid repated storage reads (reads amplification)~~ Avoid repeated storage reads (reads amplification) Feb 20, 2022

birchmd closed this as completed Jun 29, 2022
