SIMD-0177: Program Runtime ABI v2#177
Conversation
ae4514d to
7148fdf
Compare
| - Readonly instruction accounts get no growth padding. | ||
| - For writable instruction accounts additional capacity is allocated and mapped | ||
| for potential account growth. The maximum capacity is the length of the account | ||
| payload at the beginning of the transaction plus 10 KiB. CPI can not grow |
There was a problem hiding this comment.
does this get affected by this other SIMD?
https://github.com/solana-foundation/solana-improvement-documents/pull/163/files
There was a problem hiding this comment.
They are independent. SIMD-0163 is about the program being called, that is not affected in this SIMD.
There was a problem hiding this comment.
@nickfrosty Fwiw, this means the realloc limit is unchanged as well.
There was a problem hiding this comment.
We could increase the account realloc / resize limit if there are no interactions of ABI v0/v1 and ABI v2 programs in the CPI call tree of a top level instruction. See the discussion with Sean below.
7148fdf to
375deef
Compare
375deef to
1057208
Compare
1057208 to
51f9f75
Compare
caa91dd to
6b641c8
Compare
| - Magic: `u32`: `0x76494241` ("ABIv" encoded in ASCII) | ||
| - ABI version `u32`: `0x00000002` | ||
| - Pointer to instruction data: `u64` | ||
| - Length of instruction data: `u32` |
There was a problem hiding this comment.
I was thinking about this and I thought that if data was presented in a way which makes sense to rust, e.g. regular slices with u64 ptr and u64 length, then rust programs do not have to do any entry processing at all, and can just cast 4GiB address to a type and be done.
0d9976c to
9084d79
Compare
9084d79 to
e3e3e19
Compare
| - Key: `[u8; 32]` | ||
| - Owner: `[u8; 32]` | ||
| - Lamports: `u64` | ||
| - Account payload: `&[u8]` which is composed of: | ||
| - Pointer to account payload: `u64` | ||
| - Account payload length: `u64` |
There was a problem hiding this comment.
Programs also have access to the booleans writable, signer and executable. Are we serializing these ones as well?
There was a problem hiding this comment.
These are per instruction not per transaction. See the "flags bitfield" in "Per Instruction Serialization".
| - For each transaction account: | ||
| - Key: `[u8; 32]` | ||
| - Owner: `[u8; 32]` | ||
| - Lamports: `u64` |
There was a problem hiding this comment.
After a couple of discussions with @Lichtso, we thought the feedback from developer relations would be important here.
Today programs can only see the accounts passed to them in the instruction being executed. This layout change entails that programs (and every CPIs program invoked from them) will now be able to access metadata from all the accounts passed in the transaction, regardless whether they were passed in the instruction or not. We still intend to maintain the account payload hidden, though, if it is not an instruction account.
Would this change have any unintended consequences on the developer side?
(cc. @joncinque and @jacobcreech )
| The `AccountInfo` parameter of the CPI syscalls (`sol_invoke_signed_c` and | ||
| `sol_invoke_signed_rust`) will be ignored if ABI v2 is in use. Instead the | ||
| changes to account metadata will be communicated explicitly through separate | ||
| syscalls `sol_set_account_owner`, `sol_set_account_lamports` and |
There was a problem hiding this comment.
Perhaps we need to mention the expected cost of sol_set_account_lamports to update lamports of an account – this is a quite common operation in programs.
Co-authored-by: Alex Kahn <43892045+alnoki@users.noreply.github.com>
| - Total number of instructions in transaction (including CPIs and top level | ||
| instructions): `u32` |
There was a problem hiding this comment.
I was thinking about this member. Would it be helpful if it were instead a VmSlice<InstructionFrame>?
|
In general the SIMD still needs to define the CU charging for the four syscalls and for the number of instruction accounts. |
|
|
||
| The runtime must only map the payload for accounts that belong in the current | ||
| executing instruction. The payload for accounts belonging to sibling instructions | ||
| must NOT be mapped. |
There was a problem hiding this comment.
It might be easier to always map in all accounts which are not referenced in an instruction as readonly. That way we wouldn't even have to hide / reveal them on every instruction, thus it is less work for the validator and more available data for the programs.
Also, we already load all sysvar accounts, might as well expose them here too. That would however either rise the maximum transaction account number beyond 255 or require a new range of transaction accounts, but that is harder to pull of because of possible aliasing with sysvars which were mentioned in the message.
Co-authored-by: Alex Kahn <43892045+alnoki@users.noreply.github.com>
|
A comment on the status of this proposal: We are developing the infrastructure on the validator to accommodate all the necessary changes for ABIv2, regardless of what data layout we choose. Once that a is ready, this is the plan:
|
should we kick this one back to a SIMD discussion for cataloging this discovery information and open up a fresh SIMD once we're ready to quibble over data layout and specifics? |
@bw-solana yes please! Plenty of folks like myself are deep in VM-level implementation mechanics and would like a chance to talk through things with a fresh context based on this current revision as discovery information |
We already have a prototype with the roughly the layout presented here.
This one has accumulated clutter and comments on historic designs, which are now outdated, that is true.
Open to that, but wanted to note that the discussion page has no central document to work on a spec in whole, which is important as all parts need to stay consistent and compatible with one another. |
| the CPI scratchpad. At the beginning of every instruction, these scratchpads | ||
| must be empty and their size must be zero. | ||
|
|
||
| Programs must set the desired length for them using the `set_buffer_length` |
There was a problem hiding this comment.
Since it is possible to statically determine the maximum length for the return-data scratchpad, one alternative is to always have MAX_RETURN_DATA bytes allocated. Programs then write as many bytes as they need, prefixing with the length if needed, without having to set the length. This might save a syscall call on the program side.
|
|
||
| ### VM initialization | ||
|
|
||
| During the initilization of the virtual machine, the runtime must load the |
There was a problem hiding this comment.
Couple of ideas for this:
- We could also add the pointer to the instruction data (R4) and length of it (R5).
- I wonder if having only R1 is enough – the other pointers and lengths can be computed with a static offset from R1, so we can "save" those register to other use.
Co-authored-by: febo <febo@anza.xyz>
Co-authored-by: Simonas Kazlauskas <github@kazlauskas.me>
| The `set_buffer_length` must charge a base cost (to be determined) plus the | ||
| same CU per byte ratio as the `memset` syscall. |
There was a problem hiding this comment.
While implementing this I found the specification to be insufficient here. The obvious interpretation would be to charge the per-byte fee for the additional bytes that are being allocated (and not at all if the buffer size is being reduced) as those are the only bytes that need memsetting to 0.
At the same time, any increase in buffer size may require a realloc and memcpy from previous buffer to the new one. So charging per_byte * new_length any time the buffer size would increase (new_length > current_length) would seem like a more correct option. I think we can get away with a single fee here, as the cost of memcpy and memset is in roughly the same ballpark.
Question is: are we okay with charging a significant fee when the buffer is being resized? This fee can be especially painful to resizes where the base buffer is large and is only increased in size by a little bit every time, but it would correctly reflect the computation load.
EDIT: Alternative to charging for reallocations would be to maintain a pre-allocated pool of buffers such that each region already gets a buffer that's at least as large as needed to contain the largest region requestable. I don't think that's feasible, though, especially not an outlook of having buffers larger than 10MiB.
There was a problem hiding this comment.
So charging per_byte * new_length any time the buffer size would increase (new_length > current_length) would seem like a more correct option. I think we can get away with a single fee here, as the cost of memcpy and memset is in roughly the same ballpark.
I believe charging for the entire new size during account growths is the right solution for now. If we can come up with another implementation on the validator side that allows us to decrease costs, we can do so later.
Decreasing costs is always easier than increasing them.
No description provided.