storage: implement substore support #3131
(If someone wants to try running with this idea for a bit, feel free, just be sure to push up WIP changes as you go.)
let self2 = self.clone();
let snapshot = self.clone();

let (prefix, config) = self.0.multistore.route_key(prefix);
For the prefix queries, I think we need to do another check: we need to ensure that the iteration doesn't cross a "mount point". I think we can do this by banning queries whose prefixes are themselves prefixes of mount points, returning an error instead. (I don't think this is a meaningful restriction. Maybe it means we can't iterate over every single key, but I think that's OK.)
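A minimal sketch of what such a check could look like. `check_prefix_query` and its signature are hypothetical and not part of this PR, and the sketch assumes that only *strict* prefixes of a mount point are rejected, so a query for exactly `ibc/` stays inside that substore and is allowed:

```rust
/// Hypothetical helper (not from the PR): reject a prefix query whose prefix
/// is a strict prefix of some mount point, since iterating over it would
/// cross from the main store into a substore.
fn check_prefix_query(query_prefix: &str, mount_points: &[&str]) -> Result<(), String> {
    for mount in mount_points {
        // Strictly shorter than the mount point and a prefix of it:
        // the iteration range would span the mount point boundary.
        if mount.len() > query_prefix.len() && mount.starts_with(query_prefix) {
            return Err(format!(
                "prefix query {query_prefix:?} would cross mount point {mount:?}"
            ));
        }
    }
    Ok(())
}
```

Under this assumption, the empty prefix is rejected whenever at least one substore is mounted, which matches the "can't iterate over every single key" caveat above.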
hdevalence left a comment
Thanks @erwanor for carrying this forward. There are a lot of changes here but they look good to me. There are a lot of new tests. It seems like the smoke test indicates that we can drop this in without problems.
Is there any reason we should not merge this now, and then attempt to remove the penumbra-specific apphash in favor of an IBC substore?
Thanks for the review, really appreciate it. I'll wait for CI to pass before merging. Next steps like adjusting the ICS23 proof API are scoped in this comment.
Towards #3129, this PR implements support for prefixed substores.
A substore is an independent section of storage parametrized over a prefix, and a set of column families. It embeds its own independent merkle tree, flat KV store (nonverifiable storage), and key preimage index.
A substore has three logical components:
- `penumbra_storage::store::SubstoreConfig`
- `penumbra_storage::store::SubstoreSnapshot`
- `penumbra_storage::store::SubstoreStorage`

The default store (a.k.a. the "main store") has an empty prefix and hosts both non-prefixed values and substore root hashes stored at their respective prefix. For example, consider this simplified layout:
In addition to non-prefixed keys, the main store hosts two substores: one with prefix `ibc/` and the other with prefix `misc/`. These prefixes both define a namespace and are actual keys that map to the root hash of their respective Merkle trees.

Rather than requiring consumers to track storage namespaces, we perform routing transparently. When querying a key from the JMT, we determine whether the key belongs to a substore, and if so, we fetch the latest known version of that store. This means that state access interfaces do not change.
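The routing step can be illustrated with a small standalone sketch. This is not the PR's `route_key` implementation: the `Route` type and the function signature are hypothetical, and the sketch assumes that the substore prefix is stripped from routed keys and that a key exactly equal to a prefix stays in the main store (where it maps to the substore's root hash):

```rust
/// Hypothetical result type: which store a key belongs to, and the key
/// to use inside that store. An empty prefix denotes the main store.
#[derive(Debug, PartialEq)]
struct Route<'a> {
    substore_prefix: &'a str,
    key: &'a str,
}

/// Illustrative routing: pick the longest substore prefix that the key
/// strictly extends, falling back to the main store otherwise.
fn route_key<'a>(key: &'a str, substore_prefixes: &[&'a str]) -> Route<'a> {
    let best = substore_prefixes
        .iter()
        .copied()
        // A key equal to the prefix itself is a main store key (assumption).
        .filter(|p| key.len() > p.len() && key.starts_with(*p))
        .max_by_key(|p| p.len());

    match best {
        // Assumption: the substore sees the key without its prefix.
        Some(prefix) => Route { substore_prefix: prefix, key: &key[prefix.len()..] },
        None => Route { substore_prefix: "", key },
    }
}
```

Because callers only ever pass the full key, they never need to know which substore serves it.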
Versioning is done on a per-substore basis, each batch of writes yielding an increase in version numbers.
Writes are performed atomically, using `rocksdb::WriteBatch`. Substore writes are performed first, so that we can then collect their root hashes and store them at their respective prefix key in the main store.
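The ordering described above can be sketched with an in-memory model. Everything here is an illustrative assumption: `Batch` is a plain vector standing in for `rocksdb::WriteBatch`, `toy_root` is an ordinary hash rather than a JMT root, and keys are namespaced by string concatenation instead of per-substore column families:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Stand-in for an atomic write batch: an ordered list of (key, value) puts.
type Batch = Vec<(String, Vec<u8>)>;

/// Stand-in for a Merkle root. NOT a JMT: just a hash of the written pairs.
fn toy_root(writes: &[(String, Vec<u8>)]) -> Vec<u8> {
    let mut hasher = DefaultHasher::new();
    writes.hash(&mut hasher);
    hasher.finish().to_be_bytes().to_vec()
}

/// Stage substore writes first, then record each substore's root at its
/// prefix key in the main store, and stage the main store writes last.
fn build_batch(
    substores: &BTreeMap<String, Batch>,
    main_writes: &[(String, Vec<u8>)],
) -> Batch {
    let mut batch = Batch::new();
    let mut main = main_writes.to_vec();

    for (prefix, writes) in substores {
        // 1. Substore writes go into the batch first.
        for (key, value) in writes {
            batch.push((format!("{prefix}{key}"), value.clone()));
        }
        // 2. The resulting root becomes a main store write at the prefix key.
        main.push((prefix.clone(), toy_root(writes)));
    }

    // 3. Main store writes (including the collected roots) are staged last.
    batch.extend(main);
    batch
}
```

The whole batch would then be committed in a single atomic write, so readers never observe a substore root that is out of sync with the substore's contents.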