Skip to content

feat: skeleton for EVM v2 with long[] stack and --Xevm-v2 toggle#10105

Merged
siladu merged 9 commits intobesu-eth:mainfrom
siladu:evm-v2-skeleton
Mar 31, 2026
Merged

feat: skeleton for EVM v2 with long[] stack and --Xevm-v2 toggle#10105
siladu merged 9 commits intobesu-eth:mainfrom
siladu:evm-v2-skeleton

Conversation

@siladu
Copy link
Copy Markdown
Contributor

@siladu siladu commented Mar 25, 2026

First PR for #10131

Introduces the structural scaffolding for a new EVM execution path (disabled by default) that uses long[] as its stack operand type instead of Tuweni Bytes, eliminating per-operation object allocation on the hot path.

Changes:

  • Add enableEvmV2 field to EvmConfiguration (default false); existing 3-arg convenience constructor shields 80+ call sites from changes
  • Wire --Xevm-v2 hidden CLI flag in EvmOptions and evmtool
  • Add enableEvmV2 to TxValues, propagated to each MessageFrame
  • Add long[] stackV2 dual-stack to MessageFrame with push/pop/get/set long[] methods; v1 OperandStack is untouched
  • Add private runToHaltV2() in EVM with a v2 dispatch switch; runToHalt() branches to it when the flag is enabled
  • ConfigurationOverviewBuilder: prints EVM v2 notice when enabled; suppresses redundant opcode-optimizations line in that mode
  • evmtool: execution-strategy wrapper prints v2 notice before any subcommand, covering benchmark (which never calls getEvmConfiguration via Dagger)
  • Add unimplemented AddOperationV2 as a stub example of the v2 operation pattern in a v2 package.
  • Add AddOperationBenchmarkV2 as a working benchmark (Add opcode currently throws UnsupportedOperationExcetion though)

Once Add is implemented, should be able to benchmark like this:

 ./gradlew --no-daemon :ethereum:core:jmh -Pincludes=AddOperation --rerun-tasks
 
Benchmark                                    Mode  Cnt   Score   Error  Units
AddOperationBenchmark.executeOperation       avgt   15  43.299 ± 3.992  ns/op
v2.AddOperationBenchmarkV2.executeOperation  avgt   15   7.294 ± 0.329  ns/op

Logging when flag enabled...

$ evmtool --Xevm-v2 true block-test
EVM v2 (long[] stack) enabled
...

$ evmtool --Xevm-v2 true benchmark
EVM v2 (long[] stack) enabled
...

$ besu --Xevm-v2 true
2026-03-26 14:14:06.059+1000 | main | INFO  | Besu | Starting Besu
...
2026-03-26 14:14:06.994+1000 | main | INFO  | Besu |
####################################################################################################
#                                                                                                  #
# Besu version 26.3-develop-66b64f7                                                                #
...
# Experimental EVM v2 (long[] stack) enabled                                                       #
...

Testing

2x nodes with --Xevm-v2=false versus main showed no significant performance differences.
A node with --Xevm-v2=true fails with UnsupportedOperationException as expected.

See #10105 (comment)

Introduces the structural scaffolding for a new EVM execution path
(disabled by default) that uses long[] as its stack operand type
instead of Tuweni Bytes, eliminating per-operation object allocation
on the hot path.

Changes:
- Add `enableEvmV2` field to EvmConfiguration (default false); existing
  3-arg convenience constructor shields 80+ call sites from changes
- Wire `--Xevm-go-fast` hidden CLI flag in EvmOptions and evmtool
- Add `enableEvmV2` to TxValues, propagated to each MessageFrame
- Add `long[] stackV2` dual-stack to MessageFrame with push/pop/get/set
  long[] methods; v1 OperandStack is untouched
- Add private `runToHaltV2()` in EVM with a v2 dispatch switch;
  runToHalt() branches to it when the flag is enabled
- Add AddOperationV2 as a stub example of the v2 operation pattern
  (pure long[] carry arithmetic, zero object allocation)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
*/
public void pushStackLongs(final long v0, final long v1, final long v2, final long v3) {
final int nextTop = stackV2Top + 1;
if (nextTop >= txValues.maxStackSize()) {
Copy link
Copy Markdown
Contributor

@daniellehrner daniellehrner Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to be careful that txValues.maxStackSize() does not prevent inlining this method. As txValues.maxStackSize() is constant, we might can use a private variable to store the value.

Copy link
Copy Markdown
Contributor

@daniellehrner daniellehrner Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively we could make this method unsafe and the caller needs to check that enough space is available. Something like this:

messageFrame.hasCapacity(1);

messageFrame.pushStackLongsUnsafe(...);

Then we could be certain that both methods will be inlined

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vibe coded nonsense -> now more like StackMath

public static final String OPTIMIZED_OP_CODES = "--Xevm-optimized-opcodes";

/** The constant EVM_GO_FAST. */
public static final String EVM_GO_FAST = "--Xevm-go-fast";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--Xevm-v2 seems more clear about the intent of the flag and is matching the internal variable naming

Copy link
Copy Markdown
Contributor Author

@siladu siladu Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a little joke. I have made --Xevm-v2 the main flag and left this as an alias

private final OperandStack stack;
// EVM v2 stack: 4 longs per 256-bit word (index 0 = most significant, index 3 = least
// significant)
private final long[] stackV2;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a separate MessageFram might make more sense. By reusing the existing one we increase every MeesageFrame even for v1 by 32kb, which seems not necessary.

Copy link
Copy Markdown
Contributor Author

@siladu siladu Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed and decided not to do separate MessageFrame at least initially.
The class is large and the changes we are adding are small.
MessageFrameV2 will have a larger blast radius for things like tests. It would be nice to share test infra as much as possible, and ideally just set a flag to test v2.
However, we will explore this approach as well, just not for this initial skeleton PR.

Will also check this doesn't cause memory issues.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's not an extra 32KB for V1 because it gets initialized as null; it's only an extra 8 bytes per MessageFrame (maybe a little more depending on object layout byte alignment).
Think this is negligible and acceptable to avoid a duplicate MessageFrame class.

this.worldUpdater = worldUpdater;
this.gasRemaining = initialGas;
this.stack = new OperandStack(txValues.maxStackSize());
this.stackV2 = txValues.enableEvmV2() ? new long[txValues.maxStackSize() * 4] : null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a first version of the stack long array PR I also did allocate 32kb on the fly, but found that it had a huge performance impact on CALL op codes

Image

That's why I added the StackPool in that branch: https://github.com/daniellehrner/besu/blob/8273e34a43eb74c2f52fd32b9322c531fcd9c3f2/evm/src/main/java/org/hyperledger/besu/evm/internal/StackPool.java

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, StackPool or equivalent will come, this is just a placeholder really to keep PR small and to the point.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is adding StackPool increasing a lot the size of this PR ? I haven't looked into but it is part of the infrastructure part, it can make sens to have it part of this PR.

Copy link
Copy Markdown
Contributor Author

@siladu siladu Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add it it's not a big deal either way since nothing is using it.
Depends if we want to review StackPool impl as part of this PR.

* @return the operation result
*/
public static Operation.OperationResult staticOperation(final MessageFrame frame) {
final long[] a = new long[4];
Copy link
Copy Markdown
Contributor

@daniellehrner daniellehrner Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original PR did those operations inline and did not stack allocate new variables. We need to make sure the C2 gets rid of this overhead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh this is vibe coded nonsense, I made it more like StackMath

final long carry3 = Long.compareUnsigned(sum3, a[3]) < 0 ? 1L : 0L;

final long sum2 = a[2] + b[2] + carry3;
final long carry2 =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using branches for the carry computation had an 30% overhead in comparison to doing a branchless version in previous benchmarks.

Branchless version: https://github.com/daniellehrner/besu/blob/8273e34a43eb74c2f52fd32b9322c531fcd9c3f2/evm/src/main/java/org/hyperledger/besu/evm/internal/StackMath.java#L61

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More vibe-coded nonsense, I replaced with StackMath impl, but then we discussed removing impl to keep this a non-functional skeleton...I now just throw UnsupportedOperationException.

…ementation

Replaces the initial skeleton stub with a complete, benchmarked AddOperationV2.
Key changes:

- operation/v2/AbstractFixedCostOperationV2: shared base class handling gas
  checks and stack overflow/underflow, with pre-allocated OperationResult
  constants to avoid hot-path allocation
- operation/v2/AddOperationV2: full in-place 256-bit addition with single-limb
  fast path; moved from operation/ to operation/v2/ package
- MessageFrame: replaced verbose push/pop API with direct-access pattern
  (stackDataV2, stackTopV2, setTopV2, stackHasItems) enabling zero-overhead
  in-place arithmetic
- EVM.runToHaltV2: updated to pass stackDataV2() alongside frame to staticOperation
- ConfigurationOverviewBuilder: prints EVM v2 notice when enabled; suppresses
  redundant opcode-optimizations line in that mode
- evmtool: execution-strategy wrapper prints v2 notice before any subcommand,
  covering benchmark (which never calls getEvmConfiguration via Dagger)
- JMH benchmarks: AddOperationV2Benchmark, BinaryOperationV2Benchmark,
  BenchmarkHelperV2 helper with deterministic MessageFrame setup
- --Xevm-v2 added as CLI alias alongside --Xevm-go-fast

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
@siladu
Copy link
Copy Markdown
Contributor Author

siladu commented Mar 26, 2026

Moved AddOperationV2 to v2 package and made it benchmarkable.
Impl also now closer to #9881 - actually doing the arithmetic in place.

Benchmark                                    Mode  Cnt   Score   Error  Units
AddOperationBenchmark.executeOperation       avgt   15  43.299 ± 3.992  ns/op
v2.AddOperationBenchmarkV2.executeOperation  avgt   15   7.294 ± 0.329  ns/op

Logging when flag enabled...

$ evmtool --Xevm-v2 true block-test
EVM v2 (long[] stack) enabled
$ evmtool --Xevm-v2 true benchmark
EVM v2 (long[] stack) enabled
$ besu --Xevm-v2 true
2026-03-26 14:14:06.059+1000 | main | INFO  | Besu | Starting Besu
...
2026-03-26 14:14:06.994+1000 | main | INFO  | Besu |
####################################################################################################
#                                                                                                  #
# Besu version 26.3-develop-66b64f7                                                                #
...
# Experimental EVM v2 (long[] stack) enabled                                                       #
...

long a0 = s[a + 3], a1 = s[a + 2], a2 = s[a + 1], a3 = s[a];
long b0 = s[b + 3], b1 = s[b + 2], b2 = s[b + 1], b3 = s[b];
long z0 = a0 + b0;
long c = ((a0 & b0) | ((a0 | b0) & ~z0)) >>> 63;
Copy link
Copy Markdown
Contributor

@lu-pinto lu-pinto Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this way of computing carries is unreadable and has no benefit over Long.compareUnsigned. Just create a method in UInt256 that reuses the logic that's already there for add.
Unless we which to inline the code for UInt256::mod this way as well, it creates an inconsistent pattern and code organization issue.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is literally copied from #9881. This PR is not meant to be a perfect AddOperationV2, it's just proving out the v2 structure. I think the idea is to get StackMath code into the new structure, which resolves Glamsterdam, then incrementally improve.

Just create a method in UInt256 that reuses the logic that's already there for add.

We are trying to avoid new allocations though right? UInt256.add methods either take bytes[] or UInt256.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced AddOperationV2 contents with UnsupportedOperationException

@siladu siladu self-assigned this Mar 26, 2026
siladu added 2 commits March 27, 2026 15:02
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
* skeleton stub establishes the dispatch structure for incremental v2 operation rollout.
*/
// Note: like runToHalt, this is performance-critical code. Benchmark before refactoring.
private void runToHaltV2(final MessageFrame frame, final OperationTracer tracing) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually duplicate EVM.java as well, like in v2 package have EVMV2.java, that has the new implementation of runToHalt.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, not sure it's worth it. There's a lot more wiring to do with EVMV2, even abstracting just this method into it and calling it here isn't worth it IMO, at least not unless this class starts getting unwieldy, but I think everything will be limited to this method.

//
// Please benchmark before refactoring.
public void runToHalt(final MessageFrame frame, final OperationTracer tracing) {
if (evmConfiguration.enableEvmV2()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do this redirection upstream, in AbstractMessageProcessor class or even before.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method actually minimizes blast radiusm otherwise we'll need to wire in a load of EVMV2s in various places, including MainnetEVMs (for each fork).

this.worldUpdater = worldUpdater;
this.gasRemaining = initialGas;
this.stack = new OperandStack(txValues.maxStackSize());
this.stackDataV2 = txValues.enableEvmV2() ? new long[txValues.maxStackSize() * 4] : null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use enableEvmV2 ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, can't see a good reason for it, probably more Claude slop
Fixed f3fdeef

public record TxValues(
BlockHashLookup blockHashLookup,
int maxStackSize,
boolean enableEvmV2,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is needed, TxValues is referenced by MessageFrame, and it is a lot of redundancy and the value is the same for all MessageFrames.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, can't see a good reason for it, probably more Claude slop
Fixed f3fdeef

Copy link
Copy Markdown
Contributor

@ahamlat ahamlat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the small scope of the PR, i.e infrastructure to prepare EVM v2. I would to see small changes, like change when we redirect to v1 or v2.
Maybe also consider integrating SackPool as it is part of the infrastructure part.

@siladu siladu marked this pull request as ready for review March 30, 2026 11:13
Copilot AI review requested due to automatic review settings March 30, 2026 11:13
@siladu siladu changed the title feat: skeleton for EVM v2 with long[] stack and --Xevm-go-fast toggle feat: skeleton for EVM v2 with long[] stack and --Xevm-v2 toggle Mar 30, 2026
@siladu
Copy link
Copy Markdown
Contributor Author

siladu commented Mar 30, 2026

Mainnet testing with flag disabled versus control (main)
Screenshot 2026-03-30 at 9 21 18 pm
Screenshot 2026-03-30 at 9 20 25 pm

yield currentOperation.execute(frame, this);
}
};
} catch (final org.hyperledger.besu.evm.internal.OverflowException oe) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why having the whole package is needed here ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 5146f4a

};
} catch (final org.hyperledger.besu.evm.internal.OverflowException oe) {
result = OVERFLOW_RESPONSE;
} catch (final org.hyperledger.besu.evm.internal.UnderflowException ue) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 5146f4a

Copy link
Copy Markdown
Contributor

@ahamlat ahamlat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comments, nothing blocking.

Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
siladu added 2 commits March 31, 2026 08:22
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
@siladu siladu enabled auto-merge (squash) March 31, 2026 02:30
@siladu siladu merged commit eb25005 into besu-eth:main Mar 31, 2026
46 checks passed
@siladu siladu deleted the evm-v2-skeleton branch March 31, 2026 03:02
siladu added a commit to siladu/besu that referenced this pull request Mar 31, 2026
…u-eth#10105)

feat: skeleton for EVM v2 with long[] stack and --Xevm-v2 toggle

Introduces the structural scaffolding for a new EVM execution path (disabled by default) that uses long[] as its stack operand type instead of Tuweni Bytes, eliminating per-operation object allocation on the hot path.

Changes:

- Add enableEvmV2 field to EvmConfiguration (default false); existing 3-arg convenience constructor shields 80+ call sites from changes
- Wire --Xevm-v2 hidden CLI flag in EvmOptions and evmtool
- Add enableEvmV2 to MessageFrame
- Add long[] stackV2 dual-stack to MessageFrame with push/pop/get/set long[] methods; v1 OperandStack is untouched
- Add private runToHaltV2() in EVM with a v2 dispatch switch; runToHalt() branches to it when the flag is enabled
- ConfigurationOverviewBuilder: prints EVM v2 notice when enabled; suppresses redundant opcode-optimizations line in that mode
- evmtool: execution-strategy wrapper prints v2 notice before any subcommand, covering benchmark (which never calls getEvmConfiguration via Dagger)
- Add unimplemented AddOperationV2 as a stub example of the v2 operation pattern in a v2 package.
- Add AddOperationBenchmarkV2 as a working benchmark (Add opcode currently throws UnsupportedOperationException though)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
@daniellehrner daniellehrner mentioned this pull request Apr 2, 2026
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants