
Linear storage design #188

Open
axic opened this issue Apr 25, 2019 · 4 comments

axic commented Apr 25, 2019

Memory-mapped linear storage in Ethereum was proposed IIRC by @AlexeyAkhunov. Here we give a potential implementation of that in ewasm.

The proposal is that instead of fixed-length/fixed-key storage values, storage is linear, extendable and accessed like memory. We can take a design cue from POSIX: mmap.

API: storageMap(memoryOffset: i32, storageOffset: i32, storageLength: i32)
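On the contract side this would surface as a plain host function import. A minimal sketch in Rust, assuming the function sits in an "ethereum" import namespace (the namespace and exact typing are assumptions, not part of the proposal):

```rust
// Hypothetical contract-side declaration mirroring the signature above.
#[allow(non_snake_case)]
#[link(wasm_import_module = "ethereum")]
extern "C" {
    fn storageMap(memory_offset: i32, storage_offset: i32, storage_length: i32);
}

fn use_mapped_storage() {
    unsafe {
        // Map 64 bytes of storage, starting at storage offset 0, into
        // contract memory at offset 1024.
        storageMap(1024, 0, 64);
    }
    // Reads and writes to memory[1024..1088] now operate on the mapped
    // region; changes are written back to storage when execution ends.
}
```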

In the naive implementation this would (see the sketch after the list):

  • keep track of each storageMap invocation
  • disallow overlapping and duplicate areas
  • load and copy storage into the memory area at the first invocation
  • write the tracked areas from memory back to storage at the termination of execution
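
A minimal host-side sketch of that bookkeeping, assuming a byte-addressed linear storage blob per contract (names like StorageMapper are hypothetical; bounds checks are elided for brevity):

```rust
/// One tracked mapping created by storageMap.
struct Mapping {
    memory_offset: u32,
    storage_offset: u32,
    length: u32,
}

/// Hypothetical host-side bookkeeping for the naive implementation.
struct StorageMapper {
    mappings: Vec<Mapping>,
}

impl StorageMapper {
    /// Naive storageMap: reject overlaps, then copy storage into memory.
    fn storage_map(
        &mut self,
        memory: &mut [u8],
        storage: &[u8], // the executing contract's linear storage
        memory_offset: u32,
        storage_offset: u32,
        length: u32,
    ) -> Result<(), &'static str> {
        let overlap = |a: u32, b: u32, len_a: u32, len_b: u32| a < b + len_b && b < a + len_a;
        for m in &self.mappings {
            // Disallow overlapping and duplicate areas, in memory and in storage.
            if overlap(memory_offset, m.memory_offset, length, m.length)
                || overlap(storage_offset, m.storage_offset, length, m.length)
            {
                return Err("overlapping or duplicate mapping");
            }
        }
        // Load and copy storage into the memory area at the first invocation.
        let (mo, so, len) = (memory_offset as usize, storage_offset as usize, length as usize);
        memory[mo..mo + len].copy_from_slice(&storage[so..so + len]);
        self.mappings.push(Mapping { memory_offset, storage_offset, length });
        Ok(())
    }

    /// At termination, write the tracked areas from memory back to storage.
    fn flush(&self, memory: &[u8], storage: &mut [u8]) {
        for m in &self.mappings {
            let (mo, so, len) =
                (m.memory_offset as usize, m.storage_offset as usize, m.length as usize);
            storage[so..so + len].copy_from_slice(&memory[mo..mo + len]);
        }
    }
}
```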

An optimised implementation with control over the VM could, while still requiring storageMap, load bytes / words / etc. on demand the first time an area is accessed, and keep track of changed areas.

In the future, when Wasm supports multiple memory segments, storage could be mapped 1-to-1 into a dedicated segment instead of into the "main" working memory, without the need for a host function.


cdetrio commented Apr 25, 2019

If the storage isn't written until execution terminates, what will happen on re-entry (the contract instance calls another contract, which calls back)?


axic commented Apr 25, 2019

I think that is the question where we'll find all the subtle details and complications.

I'd say the practical decision right now is that storage is synchronised at the point of a call, i.e. the storage is written out prior to executing the call and is reloaded after the call finishes.
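
In terms of the hypothetical StorageMapper sketch from the opening post, that could look roughly like:

```rust
// Hypothetical host-side handling of a nested call: flush mapped areas to
// storage before the call, reload them after it returns, so re-entrant
// executions observe consistent state.
fn handle_call(
    mapper: &StorageMapper,
    memory: &mut [u8],
    storage: &mut [u8],
    do_call: impl FnOnce(&mut [u8]),
) {
    mapper.flush(memory, storage); // write mapped areas out before the call
    do_call(storage);              // the callee may read or modify storage
    // Reload each mapped area so memory reflects any writes by the callee.
    for m in &mapper.mappings {
        let (mo, so, len) =
            (m.memory_offset as usize, m.storage_offset as usize, m.length as usize);
        memory[mo..mo + len].copy_from_slice(&storage[so..so + len]);
    }
}
```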

jakelang commented:

Proposal: Storage Modules

In order to handle the case of re-entrancy, we need the storage mappings (or at least info about revisions to storage) to persist across calls.

With storage modules, we have a scheme where storage is mapped into the memory of a different module than that of the executing contract.

The storage module exists from instantiation to the end of the transaction (top-level call).

Depending on the feasibility of dynamic Wasm module linking, storage modules are instantiated on first call to storageMap or on first instantiation of the corresponding contract.

A "storage module" consists of one or more linear memories, which are exported (depending on which option is used below). A running contract which maps storage to memory will import an exported memory from this module.

Options

Here are a few possible variants of the storage module scheme.

  • One storage module per contract - High instantiation overhead, option for lazy instantiation.
  • One module for all contracts, one memory segment per contract
    • Large upfront instantiation cost
    • No option for lazy memory initialization without implementation-level hacks
    • Need to know statically how many memory segments must be initialized for a transaction to minimize overhead from instantiating unneeded memories
  • One module and memory segment for all calls
    • Minimal runtime cost (one module and memory upfront, memory is grown as needed)
    • Potentially unsafe. Need to ensure isolation of different contracts' storage mappings. Could be done through checked I/O or memory segmentation
    • Because of the above note, access to mapped memory might have to be via imported functions (performance concern)

API

The API will remain roughly the same. However, with the last option, I/O methods which can trap on invalid memory access will be exposed to the contract.

Questions

  • If we minimize host function overhead, is this needed? The surrounding VM can keep track of persistent memory and storage mappings across calls and expose fast I/O methods to the contract.
  • How does this play into a potential contract linking scheme? Perhaps this is not necessary if transactions result in static linking of all involved contracts at instantiation time.

jakelang commented:

Proposal 2: Storage Paging (ROUGH(!!) DRAFT)

This is an alternative memory mapping proposal which avoids the complexities of persistent wasm modules as storage while retaining the benefits of memory-mapped access.

In this scheme, the VM around the executing module (called "host" here) maintains a data structure which persists for the lifetime of the transaction, and keeps track of mapped storage units. In operating systems parlance, this is known as the frame table.

Furthermore, each instance of a Wasm module (representing a contract call) is associated with its own table of mappings from memory offsets within the module to elements of the frame table. Such a structure is better known as a page table.

This scheme can be thought of as a rudimentary paging system intended solely for mmap-style operations, where data is mapped directly from storage into a virtual memory space. It is also adaptable to the capabilities model if we do not allow multiple virtual pages for the same frame.

The Frame Table

The frame table keeps track of mapped "frames", which are fixed-size contiguous regions of linear storage.

It is maintained by the host and persists until the end of the transaction.

Because each contract has its own linear storage, we also include the address of the contract in the frame.

The index of the frame in the frame table can then be referenced by elements of a module instance's associated page table.

FrameTable ::= vec(address: bytes20, pagenum: u64)
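
In Rust-like terms (a sketch; field names follow the definition above):

```rust
/// One frame: a fixed-size, contiguous region of a contract's linear storage.
struct Frame {
    address: [u8; 20], // contract whose linear storage this frame belongs to
    pagenum: u64,      // index of the fixed-size region within that storage
}

/// Maintained by the host; persists until the end of the transaction.
type FrameTable = Vec<Frame>;
```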

The Page Table

Every instance of a Wasm module has its own page table. Each page corresponds to an element of the frame table.

The page table is initialized on first call to storageMap and lives until the module returns.

PageTable ::= vec(offset: u32, frame: usize)
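
And the corresponding per-instance structure, as a sketch:

```rust
/// One page: maps a memory offset in a module instance onto a frame.
struct Page {
    offset: u32,  // offset into the instance's linear memory
    frame: usize, // index into the transaction-wide frame table
}

/// Per module instance; initialized on the first storageMap call.
type PageTable = Vec<Page>;
```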

Memory mappings

When a contract calls storageMap(storage_offset: u64, num: u32, dest: u32), the VM will interrupt and do the following (see the sketch after this list):

  • if storage_offset + len overflows (where len = num * PAGE_SIZE), then trap.
  • if dest + len is out of bounds or overflows, then trap.
  • if the memory to be mapped into overlaps with an already-mapped region of memory, then trap.
  • Determine which frames need to be mapped. This can be calculated as (storage_offset / PAGE_SIZE, storage_offset / PAGE_SIZE + 1, ..., storage_offset / PAGE_SIZE + num - 1).
  • For each frame, append a page to the page table. The memory offset is calculated as (dest + n * PAGE_SIZE). If the frame in question is already in the frame table, use its existing index; otherwise, append said frame to the frame table first.
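
A sketch of that handler, building on the Frame/Page structs above (PAGE_SIZE and the Trap variants are placeholders):

```rust
const PAGE_SIZE: u64 = 4096; // placeholder; see the page-size question below

enum Trap {
    Overflow,
    OutOfBounds,
    Overlap,
}

fn storage_map(
    frame_table: &mut Vec<Frame>,
    page_table: &mut Vec<Page>,
    self_address: [u8; 20], // linear storage is per-contract
    memory_len: u64,        // current size of the instance's linear memory
    storage_offset: u64,
    num: u32,
    dest: u32,
) -> Result<(), Trap> {
    let len = num as u64 * PAGE_SIZE;
    // Trap if the storage range overflows.
    storage_offset.checked_add(len).ok_or(Trap::Overflow)?;
    // Trap if the destination range overflows or is out of bounds.
    let mem_end = (dest as u64).checked_add(len).ok_or(Trap::Overflow)?;
    if mem_end > memory_len {
        return Err(Trap::OutOfBounds);
    }
    // Trap if the destination overlaps an already-mapped memory region.
    for p in page_table.iter() {
        let p_start = p.offset as u64;
        if (dest as u64) < p_start + PAGE_SIZE && p_start < mem_end {
            return Err(Trap::Overlap);
        }
    }
    // Determine which frames are touched and append one page per frame.
    let first = storage_offset / PAGE_SIZE;
    for n in 0..num as u64 {
        let pagenum = first + n;
        // Reuse the frame's index if it already exists; otherwise append it.
        let frame = match frame_table
            .iter()
            .position(|f| f.address == self_address && f.pagenum == pagenum)
        {
            Some(i) => i,
            None => {
                frame_table.push(Frame { address: self_address, pagenum });
                frame_table.len() - 1
            }
        };
        page_table.push(Page {
            offset: dest + (n * PAGE_SIZE) as u32,
            frame,
        });
    }
    Ok(())
}
```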

When a Wasm module returns or calls another Wasm contract, the mapped memory is checked for changes: any page that was mutated is written back to its corresponding frame. If a page fault is encountered, then trap.

Questions

  • The smaller the page size gets, the larger the bookkeeping footprint grows, and repeated O(n) lookups in the frame table get worse as well. OTOH, larger pages result in unnecessary I/O if the user only wanted to access a much smaller region of memory. What is the sweet spot for page size?
  • More questions later lol
