Move SAR, SHR and SHL to UInt256 by lu-pinto · Pull Request #10216 · besu-eth/besu

lu-pinto · 2026-04-10T09:22:53Z

PR description

I was honestly not convinced by the current design of the opcodes on EVM V2 from #10154, so I did some experimentation and I would like to challenge the existing design.
I managed to achieve the same performance level while separating the arithmetic/bitwise computations from the opcodes themselves. Opcodes should be the ones fetching from and updating the stack, not the code that does the computations; the two should be strictly decoupled from one another.

IMO the code looks much cleaner and is easier to read. It also benefits from code reuse with the arithmetic that already exists in UInt256. I will look at repurposing shl and shr for modular arithmetic in another PR as well, as I believe we might be able to reuse them.
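A minimal sketch of the separation described above, not the code in this PR. It assumes the stack is a flat `long[]` with four 64-bit limbs per item, most significant limb first, and that at least two items are present. `Word256` and `ShlOperationSketch` are hypothetical names, and the loop favours readability over the performance the real implementation targets.

```java
/** Hypothetical 256-bit value; u3 is the most significant limb. Not Besu's UInt256. */
final class Word256 {
  final long u3, u2, u1, u0;

  Word256(final long u3, final long u2, final long u1, final long u0) {
    this.u3 = u3; this.u2 = u2; this.u1 = u1; this.u0 = u0;
  }

  /** Pure computation: logical left shift; any amount of 256 or more yields zero. */
  Word256 shl(final int bits) {
    if (bits >= 256) return new Word256(0, 0, 0, 0);
    final long[] in = {u0, u1, u2, u3}; // least significant limb first
    final long[] out = new long[4];
    final int limbShift = bits >>> 6; // whole 64-bit limbs to move
    final int bitShift = bits & 63; // remaining bits within a limb
    for (int i = 3; i >= limbShift; i--) {
      final int src = i - limbShift;
      final long hi = in[src] << bitShift;
      // Java masks long shift counts to 6 bits, so the carry is skipped when bitShift is 0.
      final long lo = (bitShift == 0 || src == 0) ? 0 : in[src - 1] >>> (64 - bitShift);
      out[i] = hi | lo;
    }
    return new Word256(out[3], out[2], out[1], out[0]);
  }
}

/** Hypothetical opcode: touches the stack only and delegates the arithmetic. */
final class ShlOperationSketch {
  /** Consumes shift (top) and value (below it), writes the result, returns the new item count. */
  static int execute(final long[] stack, final int top) {
    final int s = (top - 1) << 2; // offset of the shift amount
    final int v = (top - 2) << 2; // offset of the value, reused for the result
    final boolean overflow =
        (stack[s] | stack[s + 1] | stack[s + 2]) != 0
            || Long.compareUnsigned(stack[s + 3], 256) >= 0;
    final int bits = overflow ? 256 : (int) stack[s + 3];
    final Word256 r = new Word256(stack[v], stack[v + 1], stack[v + 2], stack[v + 3]).shl(bits);
    stack[v] = r.u3; stack[v + 1] = r.u2; stack[v + 2] = r.u1; stack[v + 3] = r.u0;
    return top - 1;
  }
}
```

The value type knows nothing about the stack layout, so it can be unit-tested and reused on its own, while the operation contains no bit arithmetic beyond reading the shift amount.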

Performance stats:

Test Case	Latency (ns) main@2d4f077c27	Latency (ns) @065670ffe3
SarV2_SHIFT_0	6.049	6.334
SarV2_NEGATIVE_SHIFT_1	8.681	8.83
SarV2_POSITIVE_SHIFT_1	7.969	8.312
SarV2_ALL_BITS_SHIFT_1	8.518	8.804
SarV2_NEGATIVE_SHIFT_128	6.796	7.223
SarV2_NEGATIVE_SHIFT_255	6.996	7.546
SarV2_POSITIVE_SHIFT_128	6.756	7.101
SarV2_POSITIVE_SHIFT_255	6.765	6.959
SarV2_OVERFLOW_SHIFT_256	6.848	7.222
SarV2_OVERFLOW_LARGE_SHIFT	6.954	7.356
SarV2_FULL_RANDOM	15.349	15.379
ShlV2_SHIFT_0	5.785	6.362
ShlV2_SHIFT_1	8.492	8.778
ShlV2_SHIFT_128	7.149	7.105
ShlV2_SHIFT_255	6.871	7.277
ShlV2_OVERFLOW_SHIFT_256	6.647	7.698
ShlV2_OVERFLOW_LARGE_SHIFT	6.832	7.798
ShlV2_FULL_RANDOM	11.927	8.183
ShrV2_SHIFT_0	5.817	6.357
ShrV2_SHIFT_1	7.742	8.233
ShrV2_SHIFT_128	6.833	6.975
ShrV2_SHIFT_255	6.824	7.061
ShrV2_OVERFLOW_SHIFT_256	6.642	7.69
ShrV2_OVERFLOW_LARGE_SHIFT	6.805	7.846
ShrV2_FULL_RANDOM	11.381	8.628

Issue(s)

#10131

Thanks for sending a pull request! Have you done the following?

Checked out our contribution guidelines?
Considered documentation and added the doc-change-required label to this PR if updates are required.
Considered the changelog and included an update if required.
For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

spotless: ./gradlew spotlessApply
unit tests: ./gradlew build
acceptance tests: ./gradlew acceptanceTest
integration tests: ./gradlew integrationTest
reference tests: ./gradlew ethereum:referenceTests:referenceTests
hive tests: Engine or other RPCs modified?

ahamlat · 2026-04-10T09:56:52Z

+  public static OperationResult staticOperation(final MessageFrame frame) {
    if (!frame.stackHasItems(2)) return UNDERFLOW_RESPONSE;
-    frame.setTopV2(StackArithmetic.sar(stack, frame.stackTopV2()));
+    long[] _stack = frame.stackDataV2();


I would suggest just using stack instead of _stack, to be in line with the naming used in the project.

ahamlat · 2026-04-10T09:57:17Z

    if (!frame.stackHasItems(2)) return UNDERFLOW_RESPONSE;
-    frame.setTopV2(StackArithmetic.shl(stack, frame.stackTopV2()));
+    long[] _stack = frame.stackDataV2();


I would suggest just using stack instead of _stack, to be in line with the naming used in the project.

ahamlat · 2026-04-10T09:57:24Z

+  public static OperationResult staticOperation(final MessageFrame frame) {
    if (!frame.stackHasItems(2)) return UNDERFLOW_RESPONSE;
-    frame.setTopV2(StackArithmetic.shr(stack, frame.stackTopV2()));
+    long[] _stack = frame.stackDataV2();


I would suggest just using stack instead of _stack, to be in line with the naming used in the project.

Sure, it is a leftover from the previous method argument before I removed it

ahamlat · 2026-04-10T10:00:49Z

+        || shift.u2() != 0
+        || shift.u1() != 0
+        || Long.compareUnsigned(shift.u0(), 256) >= 0) {
+      bytesToShift = 256;


I guess this is bitsToShift ?

We use bitShift in private methods below

yes it is bits, well spotted
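For reference, a sketch of the clamping step under discussion, using the `bitsToShift` name suggested above. The standalone method and its four-limb signature are hypothetical, and the check on the most significant limb is assumed, since the quoted diff starts at `u2`.

```java
final class ShiftClampSketch {
  /**
   * Reduces a 256-bit shift amount (u3 most significant) to an int in [0, 256]. Any amount of
   * 256 or more shifts every bit out, so all such amounts are clamped to 256.
   */
  static int bitsToShift(final long u3, final long u2, final long u1, final long u0) {
    if (u3 != 0 || u2 != 0 || u1 != 0 || Long.compareUnsigned(u0, 256) >= 0) {
      return 256;
    }
    return (int) u0; // safe: u0 is in [0, 255] here
  }
}
```

The unsigned comparison matters because a low limb with its top bit set is negative as a signed `long` and would otherwise pass a plain `u0 >= 256` check as "small".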

ahamlat · 2026-04-10T10:04:20Z

+        || shift.u2() != 0
+        || shift.u1() != 0
+        || Long.compareUnsigned(shift.u0(), 256) >= 0) {
+      bytesToShift = 256;


The same as above.

ahamlat · 2026-04-10T10:05:09Z

+    return new UInt256(w3, w2, w1, w0);
+  }
+
+  private static long shiftLeftWord(final long value, final long nextValue, final int bitShift) {


Add javadoc.

ahamlat · 2026-04-10T10:05:13Z

+    return (value << bitShift) | (nextValue >>> (64 - bitShift));
+  }
+
+  private static long shiftRightWord(final long value, final long prevValue, final int bitShift) {


Add javadoc.
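A sketch of what the requested Javadoc could look like. The `shiftLeftWord` body is the one quoted in the diff above; the `shiftRightWord` body is an assumption made by symmetry, and the stated `[1, 63]` precondition is inferred from Java's shift semantics rather than taken from the PR.

```java
final class WordShiftSketch {
  /**
   * Shifts {@code value} left and fills the vacated low bits with the top bits of
   * {@code nextValue}, the next less significant limb.
   *
   * @param bitShift number of bits to shift; assumed to be in [1, 63], because Java masks long
   *     shift counts to 6 bits and {@code nextValue >>> 64} would leave nextValue unchanged
   * @return the shifted limb
   */
  static long shiftLeftWord(final long value, final long nextValue, final int bitShift) {
    return (value << bitShift) | (nextValue >>> (64 - bitShift));
  }

  /**
   * Shifts {@code value} right (unsigned) and fills the vacated high bits with the low bits of
   * {@code prevValue}, the next more significant limb.
   *
   * @param bitShift number of bits to shift; assumed to be in [1, 63]
   * @return the shifted limb
   */
  static long shiftRightWord(final long value, final long prevValue, final int bitShift) {
    return (value >>> bitShift) | (prevValue << (64 - bitShift));
  }
}
```

Documenting the range makes the zero-shift pitfall visible to callers: with `bitShift == 0` the helper would OR the neighbouring limb into the result instead of returning `value`.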

ahamlat · 2026-04-10T10:11:00Z

-    final long[] s = new long[8];
-    writeLimbs(s, 0, valueVal);
-    writeLimbs(s, 4, shiftVal);
+    final UInt256 result = executor.execute(valueVal, shiftVal);


👍 (this is a good argument that this design is better)

ahamlat

I like the proposed design, I find it better and the code much cleaner. There is a small performance regression; could you double-check whether it is real with multiple runs and investigate the origin?

lu-pinto · 2026-04-10T15:01:29Z

> I like the proposed design, I find it better and the code much cleaner. There is a small performance regression; could you double-check whether it is real with multiple runs and investigate the origin?

I looked into it and optimised a little more, but I'm going to park it here. The worst cases (FULL_RANDOM) are now much closer or have improved significantly. IMO these are probably the most realistic ones.
For the other cases it is very hard to get better numbers without hurting the worst case, because I primarily optimised for it.

ahamlat · 2026-04-10T15:23:56Z

        Arguments.of(
            "0x8000000000000000000000000000000000000000000000000000000000000000",
-            "0x100",
+            "0x0100",


Why did you make this change, and all the changes below, in the unit tests?

Is it related to no longer using fromHexStringLenient?

I changed fromHexStringLenient to fromHexString in the tests so that the hexadecimal values are exact, without having to guess whether a zero will be prepended. That is hard to tell unless you already know what lenient does. Since we are providing hardcoded values, does it make sense to "disguise" them? For instance, 0x0 is half a byte, so it seems lenient would prepend a zero to complete the byte.
I can revert it if you feel strongly about it.
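To illustrate the difference being discussed, here is a sketch of strict versus lenient parsing built on the JDK's `java.util.HexFormat`. The two helpers are hypothetical stand-ins, not the library methods used in the tests, and the lenient behaviour simply follows the description in the comment above (prepend a zero to complete the byte).

```java
final class HexParsingSketch {
  /** Strict: the digits must already form whole bytes, so "0x100" is rejected. */
  static byte[] parseStrict(final String hex) {
    final String digits = hex.startsWith("0x") ? hex.substring(2) : hex;
    // parseHex throws IllegalArgumentException when the number of digits is odd.
    return java.util.HexFormat.of().parseHex(digits);
  }

  /** Lenient (assumed behaviour): an odd number of digits gets a leading zero. */
  static byte[] parseLenient(final String hex) {
    final String digits = hex.startsWith("0x") ? hex.substring(2) : hex;
    final String padded = digits.length() % 2 == 0 ? digits : "0" + digits;
    return java.util.HexFormat.of().parseHex(padded);
  }
}
```

Under these assumptions `parseLenient("0x100")` and `parseStrict("0x0100")` produce the same two bytes, which is why the test literals had to gain a leading zero once the strict form was used.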

siladu

If we can maintain same performance despite UInt256 allocations then that's fine by me.

Like the reuse between sar and shl, just need to update name and javadoc.

Various minor comments.

siladu · 2026-04-13T04:52:02Z

  public Operation.OperationResult executeFixedCostOperation(
      final MessageFrame frame, final EVM evm) {
-    return staticOperation(frame, frame.stackDataV2());
+    return staticOperation(frame);


siladu · 2026-04-13T04:57:01Z

+            stack[shiftOffset + 1],
+            stack[shiftOffset + 2],
+            stack[shiftOffset + 3]);
+    final int valueOffset = (--top) << 2;


I initially took issue with the mutability of this versus (top - 2) << 2, but warmed to it once I realised
--top === pop
++top === push
which is pretty neat.
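A small sketch comparing the two styles mentioned in this comment. It assumes `top` is the number of items on the stack and that each item occupies four `long` limbs, hence the `<< 2`; the class and method names are hypothetical.

```java
final class TopCounterSketch {
  /** Offsets computed from an unchanged top: {shiftOffset, valueOffset, newTop}. */
  static int[] offsetsImmutable(final int top) {
    return new int[] {(top - 1) << 2, (top - 2) << 2, top - 1};
  }

  /** Same result, reading each decrement as a pop and the increment as a push. */
  static int[] offsetsMutable(int top) {
    final int shiftOffset = (--top) << 2; // pop the shift amount
    final int valueOffset = (--top) << 2; // pop the value
    ++top; // push the result, which is written into the value's slot
    return new int[] {shiftOffset, valueOffset, top};
  }
}
```

Both forms yield identical offsets; the mutable one mirrors the EVM's pop/push description of the opcode at the cost of a reassigned local.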

siladu · 2026-04-13T05:05:36Z

+   * @param fill value to prepend while shifting
+   * @return the result
+   */
+  // TODO: check perf - wiring shiftRight callers with this one


Since PR is approved, just pointing out there's still a TODO

I don't want to clutter this PR with that, but also want to track this for changing in a follow up.

siladu · 2026-04-13T05:05:50Z

+   * @param shift number of bits to shift (must be in [1, 255])
+   * @return the result
+   */
+  // TODO: check perf - wiring shiftLeft callers with this one


siladu · 2026-04-13T05:10:15Z

+    } else {
+      bitShift = (int) shift.u0();
+    }
+    return sar0(bitShift, 0);


Seems odd for shr to call a sar method

why? I think this is elegant. SAR is pretty much a SHR with a custom fill.

So couldn't you equally call it shr0 ?

sar and shr imply specific things, their shared code is not sar.

to be clear, I like the structure just not the name

I think I disagree there, because a shift with a custom fill is an arithmetic shift so it makes sense for SHR to call an internal SAR with a static fill of 0.

By your description, I think you are saying shr === sar0, but sar0 only makes sense to me when the parameter is 0, e.g.

UInt256 sar0(int bitShift) { return sar(bitShift, 0); }

The method declaration
private UInt256 sar0(final int shift, final long fill) {
allows custom fill param so if anything should be sar not sar0.
However, I still find this confusing since it isn't the same as SAR opcode. Both SHR and SAR are using it, so better to call it something different IMO.
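One way to picture the structure both reviewers agree on, with a neutral name for the shared helper. This is a sketch only: `shiftRightFill` is a hypothetical name, values are plain `long[4]` arrays with index 0 as the most significant limb, and the loop is written for clarity rather than for the performance the real code targets.

```java
final class FillShiftSketch {
  /** Right shift by bits in [0, 256]; vacated high bits take {@code fill} (0L or -1L). */
  static long[] shiftRightFill(final long[] v, final int bits, final long fill) {
    final long[] out = {fill, fill, fill, fill};
    if (bits >= 256) return out;
    final int limbShift = bits >>> 6; // whole limbs to move
    final int bitShift = bits & 63; // remaining bits within a limb
    for (int i = 3; i >= limbShift; i--) {
      final int src = i - limbShift;
      final long lo = v[src] >>> bitShift;
      final long above = src == 0 ? fill : v[src - 1]; // supplies the incoming high bits
      // Java masks long shift counts to 6 bits, so the carry is skipped when bitShift is 0.
      final long hi = bitShift == 0 ? 0 : above << (64 - bitShift);
      out[i] = lo | hi;
    }
    return out;
  }

  /** SHR: logical shift, always fills with zeros. */
  static long[] shr(final long[] v, final int bits) {
    return shiftRightFill(v, bits, 0L);
  }

  /** SAR: arithmetic shift, fills with copies of the sign bit. */
  static long[] sar(final long[] v, final int bits) {
    return shiftRightFill(v, bits, v[0] >> 63);
  }
}
```

With the helper named after what it does, neither opcode appears to call the other, and the only difference between SHR and SAR is the fill value they pass.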

siladu · 2026-04-13T05:12:02Z

+   * @return the result
+   */
+  // TODO: check perf - wiring shiftLeft callers with this one
+  private UInt256 shl0(final int shift) {


Suggested change

-  private UInt256 shl0(final int shift) {
+  private UInt256 shiftLeft(final int shift) {

I believe this clashes with the shiftLeft method already in place, hence I put the TODO to address this in a follow-up. I do want to move onto those names.

Why do we have two versions of public shiftLeft after this PR?
Is UInt256 now a mix of V1 and V2 variants?
shiftLeftV2 would make more sense if that's the case IMO, or maybe we need UInt256V2 class though would rather not.

I think handling the TODOs in this PR would make things clearer tbh, it's not a large PR.

siladu · 2026-04-13T05:28:05Z

+   * Arithmetic right-shifts a 256-bit value in place by 0..255 bits, sign-extending with {@code
+   * fill}.
+   *
+   * @param shift number of bits to shift (must be in [1, 255])


Think this comment is wrong now - it's handling 0 and 256 as well?

yeah you're right. That's why I don't like comments in private methods. They frequently change and easily get outdated and compiler does not catch them.

claude /loop 1 week "check comments still match code" 😆

siladu · 2026-04-13T05:32:18Z

+      bitShift = 256;
+    } else {
+      bitShift = (int) shift.u0();
+    }


Did you consider

Suggested change

-    }
+    }
+    if (bitShift == 0) return this;

?

I believe so - I tried multiple things. These methods are heavily optimized, I would recommend you try and see the results - sometimes it may seem like a fast path but it changes how JIT sees the method.

lu-pinto requested review from ahamlat and siladu and removed request for siladu April 10, 2026 09:23

lu-pinto force-pushed the shift-opcodes-alt-design branch from da22999 to 82337b0 Compare April 10, 2026 09:24

ahamlat reviewed Apr 10, 2026

lu-pinto added 6 commits April 10, 2026 14:21

Move SAR implementation to UInt256

65b3c23

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Move SHL implementation to UInt256

5a8a8f5

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Move SRL implementation to UInt256

833a2e7

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

spotless

2dfdae0

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

javadoc

5187470

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

eliminate wasteful branch

f4ed77d

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

lu-pinto force-pushed the shift-opcodes-alt-design branch from 82337b0 to f4ed77d Compare April 10, 2026 13:21

lu-pinto added 2 commits April 10, 2026 14:30

nit: var renaming

3fe50fa

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

add additional stack tests for shifts

065670f

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

ahamlat reviewed Apr 10, 2026

ahamlat approved these changes Apr 10, 2026

lu-pinto mentioned this pull request Apr 10, 2026

Add MULMOD to EVMv2 #10168

Merged

siladu reviewed Apr 13, 2026

siladu added the performance label Apr 13, 2026

trivial changes

68dcd2b

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

lu-pinto added 2 commits April 13, 2026 11:53

Merge remote-tracking branch 'upstream/main' into shift-opcodes-alt-d…

da3fa5b

…esign

unit test method rename

23bfc88

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

lu-pinto force-pushed the shift-opcodes-alt-design branch from c2b7347 to 23bfc88 Compare April 13, 2026 11:16

Merge branch 'main' into shift-opcodes-alt-design

809c201

lu-pinto changed the title ~~Alternative design for shift opcodes~~ Move SAR, SHR and SHL to UInt256 Apr 13, 2026

lu-pinto enabled auto-merge (squash) April 13, 2026 15:35

Merge branch 'main' into shift-opcodes-alt-design

33ca9fd

lu-pinto merged commit 8e4ef80 into besu-eth:main Apr 16, 2026
34 checks passed

siladu mentioned this pull request Apr 16, 2026

Minor Mulmod v2 refactor #10253

Merged

parthdagia05 mentioned this pull request May 4, 2026

Remove resolved perf-check TODOs from UInt256 shift helpers #10406

Open

2 tasks
