SIMD-0219: Stricter ABI and Runtime Constraints by Lichtso · Pull Request #219 · solana-foundation/solana-improvement-documents

Lichtso · 2025-01-06T20:47:40Z

No description provided.

buffalojoec

Nice! I think the fixes for the frame gaps and the memory accesses across regions make sense. However, I pushed back on the stack/heap pointers for CPI verification.

Also, I'm wondering if it makes sense to break this into three SIMDs? It seems like we could do each in isolation, which might help speed up the approval & implementation process.

buffalojoec · 2025-06-24T09:24:34Z

+- The following pointers must be on the stack or heap,
+meaning their virtual address is inside `0x200000000..0x400000000`,
+otherwise `SyscallError::InvalidPointer` must be thrown:
+  - The pointer in the array of `&[AccountInfo]` / `SolAccountInfo*`
+  - The `AccountInfo::data` field,
+  which is a `RefCell<&[u8]>` in `sol_invoke_signed_rust`
+  - The `AccountInfo::lamports` field,
+  which is a `RefCell<&u64>` in `sol_invoke_signed_rust`
+- The following pointers must point to what was originally serialized in the
+input regions by the program runtime,
+otherwise `SyscallError::InvalidPointer` must be thrown:
+  - `AccountInfo::key` / `SolAccountInfo::key`
+  - `AccountInfo::owner` / `SolAccountInfo::owner`
+  - `AccountInfo::lamports` / `SolAccountInfo::lamports`
+  - `AccountInfo::data::ptr` / `SolAccountInfo::data`


In my opinion, this constraint forces programs to waste limited stack/heap space just to hold these pointer structures, when we can probably validate these in a different way, and keep them in the input region.

On Agave, when we serialize all of the accounts into the input region, we actually hang onto a bunch of the pointers in SyscallContext/SerializedAccountMetadata. We can just store more pointer information in the syscall context, and perform a very quick pointer analysis:

- Before each CPI frame is created
- When the program execution finishes

We can catch the violations then, and throw the SyscallError::InvalidPointer. This way, instead of imposing location constraints, we simply validate that all AccountInfo pointers match their original pointers when the VM was created.

As mentioned in the proposal, when direct mapping is enabled, these kinds of pointer violations will throw immediately, which means pushing the pointers to the stack/heap would only be necessary as a temporary measure, until the DM feature is enabled. Afterwards, programs are stuck with this constraint for no reason.
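
The "quick pointer analysis" proposed above can be sketched roughly as follows. This is a minimal illustration, not Agave's actual code: `AccountPointers` and `check_account_pointers` are hypothetical names, and the four fields simply mirror the pointers listed in the diff hunk (`key`, `owner`, `lamports`, `data`). The real `SerializedAccountMetadata` has a different layout.

```rust
/// Hypothetical record of the virtual addresses associated with one account.
/// One copy is captured by the runtime at serialization time, the other is
/// read back from the `AccountInfo` the program passes to CPI.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AccountPointers {
    pub key: u64,
    pub owner: u64,
    pub lamports: u64,
    pub data: u64,
}

/// Error name taken from the diff hunk above.
#[derive(Debug, PartialEq, Eq)]
pub enum SyscallError {
    InvalidPointer,
}

/// Accepts the caller's pointers only if every one of them equals the
/// address recorded when the account was serialized (assumed semantics).
pub fn check_account_pointers(
    recorded: &AccountPointers,
    caller: &AccountPointers,
) -> Result<(), SyscallError> {
    if recorded == caller {
        Ok(())
    } else {
        Err(SyscallError::InvalidPointer)
    }
}
```

Under this sketch the check would run once per account, before each CPI frame is created and again when program execution finishes.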

I think you are misunderstanding. This SIMD says that the AccountInfo structs need to be on stack or heap, not that the data they point to has to be there. The data they point to remains in the account serialization. This is compatible with the SDK entrypoint as is, and has to be so that we don't break existing programs.

I think part of the confusion comes from the fact that AccountInfo::data and AccountInfo::lamports have two levels of indirection. Both the outer and the inner pointer must not be in the account serialization (the final dereference, however, must be). No program ever places them there anyway.
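
For illustration, the stack-or-heap constraint on the `AccountInfo` structures themselves could look like the sketch below. The address range `0x200000000..0x400000000` is quoted from the diff hunk. The function name `check_on_stack_or_heap` is hypothetical, and requiring the whole structure (not just its first byte) to lie inside the range is an assumption of this sketch.

```rust
use std::ops::Range;

/// Stack and heap address range as quoted in the diff hunk.
pub const STACK_AND_HEAP: Range<u64> = 0x2_0000_0000..0x4_0000_0000;

/// Error name taken from the diff hunk.
#[derive(Debug, PartialEq, Eq)]
pub enum SyscallError {
    InvalidPointer,
}

/// Checks that `len` bytes starting at `vm_addr` lie entirely inside the
/// stack/heap range. Address overflow is treated as an invalid pointer.
pub fn check_on_stack_or_heap(vm_addr: u64, len: u64) -> Result<(), SyscallError> {
    let end = vm_addr
        .checked_add(len)
        .ok_or(SyscallError::InvalidPointer)?;
    if vm_addr >= STACK_AND_HEAP.start && end <= STACK_AND_HEAP.end {
        Ok(())
    } else {
        Err(SyscallError::InvalidPointer)
    }
}
```

Only the location of the structure is checked here. The final dereference of `key`, `owner`, `lamports`, and `data` still has to land in the serialized input region, which is a separate check.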

> We can just store more pointer information in the syscall context, and perform a very quick pointer analysis

Yes, exactly. That is how direct mapping is implemented and what the SIMD ought to describe. It seems I should reformulate it.

> I think you are misunderstanding. This SIMD says that the AccountInfo structs need to be on stack or heap, not that the data they point to has to be there.

I never considered this SIMD to be implying we should move any serialized data to the stack or heap. I believe I'm understanding that part correctly.

What I'm maybe misunderstanding is what the goal is by limiting where AccountInfo pointers can be created. The sol_invoke_signed syscall accepts only pointers.

    fn sol_invoke_signed_c(
        instruction_addr: *const u8,
        account_infos_addr: *const u8, // <-- Pointer to slice of `SolAccountInfo`
        account_infos_len: u64,
        signers_seeds_addr: *const u8,
        signers_seeds_len: u64,
    ) -> u64

    struct SolAccountInfo {
        key_addr: u64,
        lamports_addr: u64,
        data_len: u64,
        data_addr: u64,
        owner_addr: u64,
        rent_epoch: u64,
        is_signer: bool,
        is_writable: bool,
        executable: bool,
    }

Since translation of accounts just deref's out of the RefCell trackers, we can consider the two translations (Rust and C) identical for serialized accounts.

Enforcing that each account pointer lives on stack or heap doesn't seem to actually solve the problem, which is the ability to pass a pointer to an invalid input-region SolAccountInfo into CPI. Furthermore, there are multiple perfectly valid reasons you'd pass a pointer to already-serialized legitimate accounts, such as avoiding copies.

My initial point of just evaluating all pointers against the SerializedAccountMetadata would solve this problem at the fundamental level. It can happen during translate_accounts or translate_account_infos, where you've already got the stack/heap check implemented.

> My initial point of just evaluating all pointers against the SerializedAccountMetadata would solve this problem at the fundamental level.

That is what the SIMD already proposes anyway: "The following pointers must point to what was originally serialized in the input regions by the program runtime" which refers to the SerializedAccountMetadata.

But that alone is insufficient: CPI also writes to AccountInfo as it returns. It should never be writing to an account while returning (returning takes multiple steps, after all), as that can result in iterate-while-modifying issues and violates Rust borrow checker rules.

Lichtso · 2025-06-24T17:43:15Z

> Also, I'm wondering if it makes sense to break this into three SIMDs? It seems like we could do each in isolation, which might help speed up the approval & implementation process.

This entire SIMD is describing the account data direct mapping feature, which is already implemented, just not feature gated. I know from a dApp perspective they look like a bunch thrown together, but in the program runtime they are all interconnected. So I don't think that splitting them would speed things up, but rather make it much more complex.

buffalojoec · 2025-07-08T06:08:27Z

> I know from a dApp perspective they look like a bunch thrown together, but in the program runtime they are all interconnected.

Actually I am talking about the program runtime!

Right now, the plan is to just switch on direct mapping all at once with a feature gate, which will enable all of these constraints - as well as the implementation itself - immediately. That's a lot of change area in one feature gate, and I think some contributors are getting a bit nervous about the "all at once" approach.

What I'm instead suggesting is that we introduce these constraints piecemeal. You can break VM constraints into three separate SIMDs, with three separate feature gates, and activate them one by one:

- Frame gaps
- Memory access violations
- AccountInfo pointer regions

This approach would allow us to more easily address any security issues that might arise from just one "phase" of direct mapping constraints.

Later, once all three are activated, you can make direct mapping a validator startup flag for a while, before we just remove the flag altogether and make it the de facto hot path.

agave validator <...> --direct-memory-mapping

Overall I think this seems like a much safer approach to getting this all in. What do you think?

Lichtso · 2025-07-08T08:51:43Z

I think the change to the stack frame gaps is risky and not necessary anymore for the current implementation of direct mapping so we could revert it.

The restrictions to the AccountInfo pointer regions could be done first in a separate SIMD, but we know that it affects next to no programs, and it seems to have little risk associated.

Finally, the tough nut is the memory access violations. These are interwoven with the implementation switching to direct mapping. It is hard to correctly emulate these restrictions without implementing direct mapping. Decoupling them into a feature gate of pure restrictions which still uses the copy based serialization path is complex.

LucasSte · 2025-08-01T21:26:00Z

+- The access is completely within the rest of the account growth budget of the
+transaction, otherwise `InstructionError::InvalidRealloc` must be thrown.
+- The access is completely within the current length of the account,
+otherwise extend the the account with zeros to the maximum allowed by the


Suggested change

otherwise extend the the account with zeros to the maximum allowed by the

otherwise extend the account with zeros to the maximum allowed by the

topointon-jump · 2025-08-05T21:52:49Z

+- Heap (`0x300000000..0x400000000`)
+- Instruction meta data
+- Account meta data
+- Account payload address space


Are each of the mapped metadata/data address ranges for each account considered separate regions, in the sense that you can't have a single access spanning multiple accounts? It would greatly simplify the implementation if this was the case.

Yes, we dropped support for multi / cross region accesses from this SIMD, as in: We don't allow that anymore.

And just to be clear - each account's data and metadata region will be considered a separate region?

Yes, metadata and account payload are separate, meaning there are two regions per account.
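
The "no access may span regions" rule confirmed above can be sketched as a simple containment test. `Region` and `containing_region` are hypothetical names for this illustration, and any concrete region addresses used with it are made up. The only thing taken from the thread is that each account contributes two separate regions (metadata and payload) and that an access must fall completely inside one of them.

```rust
/// Hypothetical description of one mapped region of the VM address space.
#[derive(Debug, Clone, Copy)]
pub struct Region {
    pub start: u64,
    pub len: u64,
}

/// Returns the index of the single region that fully contains the access,
/// or `None` if the access is unmapped or would straddle two regions.
pub fn containing_region(regions: &[Region], vm_addr: u64, access_len: u64) -> Option<usize> {
    // An access whose end address overflows cannot be inside any region.
    let end = vm_addr.checked_add(access_len)?;
    regions
        .iter()
        .position(|r| vm_addr >= r.start && end <= r.start.saturating_add(r.len))
}
```

With an account's metadata region directly followed by its payload region, an 8-byte access that starts in the last 4 bytes of the metadata is rejected even though every byte it touches is mapped.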

ripatel-fd · 2025-08-06T17:51:10Z

+  runtime never serialized
+  - `AccountInfo` structures can be overwritten by CPI during CPI, causing
+  complex side effects
+- VM write access


Suggested change

- VM write access

- VM memory access

says "write" but first sub bullet point says "read"

ripatel-fd · 2025-08-06T17:55:58Z

+
+## Security Considerations
+
+None.


There are surely some security considerations (additional validation logic risks introducing more places where clients can diverge), but most of it is already implied by above.

Most of the security risks come from the implementation of direct mapping, not so much from imposing these additional restrictions. We decided to split the behavior changes (constituting this SIMD) from the direct mapping implementation, see: #219 (comment)

Will the implementation of direct mapping itself have a separate feature gate?

Likely yes, though it is kind of an implementation detail, as in we can't make a SIMD for it as that would be empty.

mjain-jump · 2025-08-06T19:22:35Z

+- VM memory access
+  - Bad read accesses go unnoticed as long as they stay within the reserved
+  address space, even if they leave the actual account payload
+  - Bad write accesses go unnoticed as long as the original value is restored


Wouldn't bad read/write accesses throw a segfault and have the VM return an error code?

It is describing the current state: You can write to readonly accounts, as long as you write the existing value before the instruction ends. But, I can make it more clear that this only applies to account payload data.

mjain-jump · 2025-08-06T19:38:09Z

+
+- The account is flagged as writable,
+otherwise `InstructionError::ReadonlyDataModified` must be thrown
+- The account is owned by the currently executed program,


- The access is completely within the current length of the account, otherwise `InstructionError::AccountDataTooSmall` must be thrown.

shouldn't this also apply here?

No, because the way reallocations / growing of accounts currently works in ABIv1 is that a program first writes beyond the end of the account, and then at the next CPI or the end of the instruction communicates the change of the account length to the program runtime.

yufeng-jump · 2025-08-06T21:26:23Z

> The access is completely within the current length of the account, otherwise extend the account with zeros to the maximum allowed by the previous two checks.

Just to clarify, the way things are gonna be after this SIMD, assuming 10KB of realloc growth is allowed under the budget:

- Loads into the realloc growth region without a preceding Store into said region are disallowed, and will result in an AccountDataTooSmall.
- A Store into anywhere in the realloc growth region will result in logical zero-filling of the entire 10KB of realloc growth region. A subsequent Load into anywhere in the entire 10KB of realloc growth region, whether lower or higher than the preceding Store in VM address space, will be allowed, so long as the Load does not go out of bounds of the original account payload+10KB. In short, a single Store anywhere in the realloc region initializes the entire realloc region.

Is that correct?

Lichtso · 2025-08-06T21:40:40Z

Yes that is correct. It essentially switches from eager to lazy initialization of the realloc padding. Also, the realloc padding will not have its own region but be part of the account payload region.
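
The lazy initialization confirmed above can be modelled with a small sketch. `AccountPayload` and its methods are hypothetical names, and the model is deliberately simplified: it covers the writable check, the growth budget, and the zero-fill on first Store, but it omits the ownership check and treats the budget as a single per-account number. The error names are the ones quoted in this thread.

```rust
/// Error names as quoted in the review thread.
#[derive(Debug, PartialEq, Eq)]
pub enum InstructionError {
    AccountDataTooSmall,
    InvalidRealloc,
    ReadonlyDataModified,
}

/// Hypothetical model of one account payload region, including its
/// realloc padding (the padding is part of the same region).
pub struct AccountPayload {
    data: Vec<u8>,  // current contents; `data.len()` is the current length
    max_len: usize, // original length plus the growth still allowed
    writable: bool,
}

impl AccountPayload {
    pub fn new(data: Vec<u8>, growth_budget: usize, writable: bool) -> Self {
        let max_len = data.len() + growth_budget;
        Self { data, max_len, writable }
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    /// A Load must lie completely within the current length.
    pub fn load(&self, offset: usize, len: usize) -> Result<&[u8], InstructionError> {
        let end = offset
            .checked_add(len)
            .ok_or(InstructionError::AccountDataTooSmall)?;
        self.data
            .get(offset..end)
            .ok_or(InstructionError::AccountDataTooSmall)
    }

    /// A Store past the current length extends the account with zeros up to
    /// the maximum allowed, then writes the bytes.
    pub fn store(&mut self, offset: usize, bytes: &[u8]) -> Result<(), InstructionError> {
        if !self.writable {
            return Err(InstructionError::ReadonlyDataModified);
        }
        let end = offset
            .checked_add(bytes.len())
            .ok_or(InstructionError::InvalidRealloc)?;
        if end > self.max_len {
            return Err(InstructionError::InvalidRealloc);
        }
        if end > self.data.len() {
            // Lazy initialization: one Store zero-fills the whole padding.
            self.data.resize(self.max_len, 0);
        }
        self.data[offset..end].copy_from_slice(bytes);
        Ok(())
    }
}
```

In this model, a Load just past the end of a fresh account fails, while the same Load succeeds and returns zeros once any Store has touched the padding, regardless of whether that Store was at a lower or higher address.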

joncinque · 2025-08-22T16:51:25Z

It looks like there's wide approval for this. I will merge the SIMD next Friday to give time for any final objections.

Lichtso and others added 2 commits October 16, 2024 20:07

First draft

15260f0

Make SIMD match implementation

75fed55

github-actions Bot mentioned this pull request Jan 13, 2025

Upstream Updates - Mon Jan 13 00:15:05 UTC 2025 smartcontractkit/chainlink-solana#1010

Closed

seanyoung mentioned this pull request Jan 21, 2025

Check account boundaries on overlapping memmove syscall anza-xyz/agave#4563

Merged

mergify Bot mentioned this pull request Jan 23, 2025

v2.1: Check account boundaries on overlapping memmove syscall (backport of #4563) anza-xyz/agave#4598

Merged

Lichtso force-pushed the stricter-vm-verification-constraints branch 2 times, most recently from 6a9502d to 9c1578c Compare April 24, 2025 14:39

abrahem79 approved these changes May 29, 2025

Lichtso force-pushed the stricter-vm-verification-constraints branch from 9c1578c to eaf4fe6 Compare May 29, 2025 14:12

Lichtso force-pushed the stricter-vm-verification-constraints branch from eaf4fe6 to 1f3dda9 Compare June 6, 2025 15:37

Lichtso mentioned this pull request Jun 11, 2025

SIMD-0268: Raise CPI Nesting Limit #268

Merged

ttsides86 approved these changes Jun 17, 2025

buffalojoec reviewed Jun 24, 2025

Lichtso changed the title ~~SIMD-0219: Stricter VM verification constraints~~ SIMD-0219: Stricter VM constraints Jul 17, 2025

Lichtso force-pushed the stricter-vm-verification-constraints branch from 40f9989 to 56f2cc5 Compare July 18, 2025 19:33

Lichtso changed the title ~~SIMD-0219: Stricter VM constraints~~ SIMD-0219: Stricter ABI and Runtime Constraints Jul 18, 2025

Lichtso force-pushed the stricter-vm-verification-constraints branch from 56f2cc5 to 25933db Compare July 18, 2025 19:39

Lichtso mentioned this pull request Jul 23, 2025

Feature - Stricter ABI and runtime constraints anza-xyz/agave#7113

Merged

Orhann27 approved these changes Aug 1, 2025

LucasSte reviewed Aug 1, 2025

Lichtso force-pushed the stricter-vm-verification-constraints branch from 25933db to 2c6fc74 Compare August 1, 2025 21:30

topointon-jump reviewed Aug 5, 2025

ripatel-fd approved these changes Aug 6, 2025

LucasSte approved these changes Aug 6, 2025

Lichtso force-pushed the stricter-vm-verification-constraints branch from 2c6fc74 to ebc45da Compare August 6, 2025 18:12

mjain-jump reviewed Aug 6, 2025

Lichtso added 5 commits August 6, 2025 21:25

Cleanup for publishing.

1ddba58

Further changes to the way memory accesses are treated.

8677cf8

Adds account data direct mapping to the "Motivation" section.

9435f0c

Renames the proposal.

5cff568

Reverts the removal of stack frame gaps.

74e59a8

Lichtso force-pushed the stricter-vm-verification-constraints branch from ebc45da to 74e59a8 Compare August 6, 2025 19:26

mjain-jump reviewed Aug 6, 2025

topointon-jump approved these changes Aug 6, 2025

Lichtso mentioned this pull request Aug 19, 2025

solana-program-test new features necessary for code coverage functionality anza-xyz/agave#7569

Closed

hayesjohn147 approved these changes Aug 22, 2025

Lichtso mentioned this pull request Aug 25, 2025

SIMD-0339: Increase CPI Account Info Limit #339

Merged

Moves the definition of memory regions into the Terminology section.

361174e

Lichtso mentioned this pull request Sep 2, 2025

Fix - Restrict address space of sysvar syscalls in SIMD-0219 anza-xyz/agave#7832

Merged

jfjeifkvbk approved these changes Sep 8, 2025

Adds sysvar syscall restrictions.

53f1c2c

Lichtso force-pushed the stricter-vm-verification-constraints branch from 4a3c59b to 53f1c2c Compare September 8, 2025 13:40

mergify Bot mentioned this pull request Sep 9, 2025

v3.0: Fix - Restrict address space of sysvar syscalls in SIMD-0219 (backport of #7832) anza-xyz/agave#7959

Merged

joncinque merged commit 44d0610 into solana-foundation:main Sep 17, 2025
2 checks passed

github-actions Bot mentioned this pull request Sep 22, 2025

Upstream Updates - Mon Sep 22 00:17:06 UTC 2025 smartcontractkit/chainlink-solana#1345

Open

Skaybeili approved these changes Nov 5, 2025

This was referenced Nov 27, 2025

[CI] reallocation error in recent SVM (>=3.0) fragmetric-labs/fragmetric-contracts#539

Closed

[restaking] access violation in stack frame (SVM >= 3.0) fragmetric-labs/fragmetric-contracts#540

Closed

github-actions Bot locked as resolved and limited conversation to collaborators Jan 12, 2026


16 participants
