Improve program entrypoint by febo · Pull Request #176 · anza-xyz/pinocchio

febo · 2025-06-09T13:10:48Z

Problem

The current program entrypoint does not translate to very efficient bytecode, as can be seen by comparing it with an assembly implementation of the entrypoint, e.g., cavey's asmr.

Solution

Tweak the implementation to improve its efficiency, borrowing ideas from the assembly implementation (credits to @cavemanloverboy).

One key difference is that the entrypoint includes inlined code to parse accounts, which reduces the number of jumps required and therefore reduces CUs.

Results

cavemanloverboy · 2025-06-09T13:15:46Z

i just found two more 1 CU/account optimizations, so hold your horses here. One of them might be unusable unfortunately but we will see.

febo · 2025-06-09T13:50:45Z

> i just found two more 1 CU/account optimizations, so hold your horses here. One of them might be unusable unfortunately but we will see.

Nice – this one will go after #166 goes in.

joncinque

Looks good overall! Just some small comments

joncinque · 2025-06-11T09:31:52Z

+/// Align a pointer to the BPF alignment of `u128`.
+macro_rules! align_pointer {
+    ($ptr:ident) => {
+        (($ptr as usize + (BPF_ALIGN_OF_U128 - 1)) & !(BPF_ALIGN_OF_U128 - 1)) as *mut u8
+    };
+}


Why does this align to u128? Shouldn't it align to u64 per the serialization spec?

That is the name of the constant in the SDK. 😊 I think it is meant to represent the alignment of a u128 in BPF, which is 8. We could rename it if that makes it clearer; I agree that the name is a bit confusing.

I mainly want to make sure I'm not missing anything, but I guess you could use any usize whose value is equal to 8 😅
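To make the rounding in `align_pointer!` concrete, here is a minimal sketch of the same arithmetic on plain `usize` addresses. The names `ALIGN` and `align_up` are illustrative only and are not part of the SDK; the value 8 is the one given in the thread for `BPF_ALIGN_OF_U128`.

```rust
/// Alignment in bytes; the thread states that `BPF_ALIGN_OF_U128` is 8.
/// `ALIGN` is a hypothetical stand-in for that constant.
const ALIGN: usize = 8;

/// Rounds `addr` up to the next multiple of `ALIGN`.
///
/// Adding `ALIGN - 1` pushes any unaligned address past the next boundary,
/// and masking with `!(ALIGN - 1)` clears the low bits. This only works
/// because `ALIGN` is a power of two.
fn align_up(addr: usize) -> usize {
    (addr + (ALIGN - 1)) & !(ALIGN - 1)
}
```

An address that is already a multiple of 8 is returned unchanged, and any other address moves forward by at most 7 bytes, so `align_up(13)` yields 16.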

febo · 2025-06-12T09:42:36Z

@cavemanloverboy Found a few more savings with a small tweak in the code.

febo · 2025-06-19T15:18:18Z

@joncinque Put the PR back to draft to test a suggestion from @cavemanloverboy

febo · 2025-06-23T18:14:54Z

Current benchmark:

Name	CUs
Account (0)	9
Account (1)	13
Account (2)	22
Account (3)	36
Account (4)	45
Account (5)	52
Account (6)	72
Account (7)	75
Account (8)	80
Account (16)	154
Account (32)	280
Account (64)	541

nlgripto · 2025-06-26T17:47:46Z

let him cook

febo · 2025-06-30T19:28:28Z

@joncinque PR updated and ready for another review. 😊

illia-bobyr · 2025-06-26T18:17:56Z

+                // There might be remininag accounts to process.
+                if to_process > 3 {
+                    // 4..3 accounts left to process.
+                    if to_process > 4 {
+                        process_accounts!(4 => (input, accounts, accounts_slice));
+                    } else {
+                        process_accounts!(3 => (input, accounts, accounts_slice));
+                    }
+                } else {
+                    // 2..1 accounts left to process.
+                    if to_process > 2 {
+                        process_accounts!(2 => (input, accounts, accounts_slice));
+                    } else if to_process > 1 {
+                        process_accounts!(1 => (input, accounts, accounts_slice));
+                    }
+                }


minor

Have you considered a match for this?

Suggested change

-                // There might be remininag accounts to process.
-                if to_process > 3 {
-                    // 4..3 accounts left to process.
-                    if to_process > 4 {
-                        process_accounts!(4 => (input, accounts, accounts_slice));
-                    } else {
-                        process_accounts!(3 => (input, accounts, accounts_slice));
-                    }
-                } else {
-                    // 2..1 accounts left to process.
-                    if to_process > 2 {
-                        process_accounts!(2 => (input, accounts, accounts_slice));
-                    } else if to_process > 1 {
-                        process_accounts!(1 => (input, accounts, accounts_slice));
-                    }
-                }
+                // There might be remaining accounts to process.
+                match to_process {
+                    5 => process_accounts!(4 => (input, accounts, accounts_slice)),
+                    4 => process_accounts!(3 => (input, accounts, accounts_slice)),
+                    3 => process_accounts!(2 => (input, accounts, accounts_slice)),
+                    2 => process_accounts!(1 => (input, accounts, accounts_slice)),
+                    1 => (),
+                    _ => {
+                        // SAFETY: `while` loop above makes sure that `to_process` has 1 to 5
+                        // entries left.
+                        unsafe { core::hint::unreachable_unchecked() }
+                    }
+                };

the point of this was to manually unroll the binary search. did you verify that this match statement does not increase cus? if it doesn't increase it, i'm in favor of this so long as we add a comment that the compiler can figure out the binary search (in case someone changes in the future)

I must admit, I didn't realize it.
Maybe a short note explaining the optimization could help the future readers.

A match statement increases CUs: in the end it generates a "standard" jump table, which in the worst case will do 4 comparisons. The "manual" one uses 3 comparisons at most, and 2 for most of the values.

I will add a comment explaining the rationale of the nested if statements.
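The comparison counts quoted above can be checked with a small sketch. `nested_if` mirrors the branch structure of the nested `if` statements from the diff and returns the batch size together with the number of comparisons evaluated. `linear_chain` is only an assumed model of a lowering that tests one value after another; it is not taken from actual compiler output, and both function names are hypothetical.

```rust
/// Mirrors the nested `if` from the diff.
/// Returns `(accounts to process, comparisons evaluated)`.
fn nested_if(to_process: u64) -> (u64, u32) {
    if to_process > 3 {
        // First comparison was true; one more picks between 4 and 3.
        if to_process > 4 { (4, 2) } else { (3, 2) }
    } else if to_process > 2 {
        (2, 2)
    } else if to_process > 1 {
        (1, 3)
    } else {
        (0, 3)
    }
}

/// Assumed model: test the values 5, 4, 3, 2 one after another.
/// Returns `(accounts to process, comparisons evaluated)`.
fn linear_chain(to_process: u64) -> (u64, u32) {
    let mut comparisons = 0;
    for candidate in (2..=5).rev() {
        comparisons += 1;
        if to_process == candidate {
            return (candidate - 1, comparisons);
        }
    }
    (0, comparisons)
}
```

For inputs 1 through 5 both functions select the same batch size. The nested form never evaluates more than 3 comparisons and needs only 2 for the inputs 3, 4 and 5, while the linear model needs 4 for the inputs 1 and 2.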

This is probably out of scope for the PR by now.

I started wondering if there are some alternatives to the "binary search" that have similarly good properties.
In particular, I noticed that as written we produce 16 identical code blocks for the account processing, in total.
Not sure if any of them gets optimized away, but if you are saying that the binary search tree makes it into the final code, then all the blocks are probably there as well.

The very first code block is an optimization for the case when there is only one account.
But the remaining 15 are used to process accounts.
And the longest "uninterrupted" sequence of accounts we process is 5 accounts at a time.
I guess, the size of the program is less important, but it still costs something to deploy it.

On x86, jump table is a single jump: https://godbolt.org/z/f7E5P3YKs

This:

    match iterations {
        3 => {
            process();
            process();
            process();
        }
        2 => {
            process();
            process();
        }
        1 => {
            process();
        }
        0 => {}
        _ => unreachable!(),
    }

Turns into this:

    .LBB1_1:
            lea     rax, [rip + .LJTI1_0]
            movsxd  rcx, dword ptr [rax + 4*rbx]
            add     rcx, rax
            jmp     rcx
    .LBB1_3:
            call    example::process::h15029326abde9722
    .LBB1_4:
            call    example::process::h15029326abde9722
    .LBB1_5:
            call    example::process::h15029326abde9722
    .LBB1_6:
            add     rsp, 16
            pop     rbx
            ret
    .LJTI1_0:
            .long   .LBB1_6-.LJTI1_0
            .long   .LBB1_5-.LJTI1_0
            .long   .LBB1_4-.LJTI1_0
            .long   .LBB1_3-.LJTI1_0

There are no comparisons.
But maybe the SBF backend cannot produce computed jumps here?
The instruction set seems to have the necessary instruction.

If I pass the index into process(), then it gets a bit more confusing.
Though, process_n_accounts!(@process_account => (input, accounts, accounts_slice)) calls are identical. All the state change is a side effect of the call.
Though, there is a lot of code that is inlined, so the compiler might miss the fact that they are indeed identical.

What version of platform tools are you using? I merged the switch simplify pass recently anza-xyz/llvm-project#153. It is available on platform tools v1.49 onwards.

I'd encourage you to test with v1.50, because it enables a pass to simplify branches, so you'll see even different results.

Using platform-tools v1.50 we get even more improvements:

| Name         | CUs | Delta |
|--------------|-----|-------|
| Account (0)  | 9   | --    |
| Account (1)  | 13  | --    |
| Account (2)  | 21  | -1    |
| Account (3)  | 34  | --    |
| Account (4)  | 42  | -1    |
| Account (5)  | 49  | -1    |
| Account (6)  | 64  | -6    |
| Account (7)  | 68  | -5    |
| Account (8)  | 75  | -3    |
| Account (16) | 140 | -12   |
| Account (32) | 258 | -20   |
| Account (64) | 501 | -37   |

Yeah, the compiler is a bit unpredictable – sometimes you change a single line and things are significantly different. 😅

Fernando The Compiler Whisperer.

@cavemanloverboy example generates different code in v1.50:

This is for the case with 16 match arms. The compiler builds a lookup table:

    entrypoint:
            ldxdw r1, [r1 + 0]
            jgt r1, 16, LBB0_3
            mov64 r2, r1
            lsh64 r2, 32
            rsh64 r2, 32
            mov64 r3, 129023
            rsh64 r3, r2
            and64 r3, 1
            jeq r3, 0, LBB0_3
            lsh64 r1, 3
            lddw r2, .Lswitch.table.entrypoint
            add64 r2, r1
            ldxdw r1, [r2 + 0]
            mov64 r2, 1
            call sol_log
    LBB0_3:
            mov64 r0, 0
            exit
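Read as Rust, the listing above does roughly the following: a range check, a bitmask test for whether the case has a table entry, and an indexed load. The constant 129023 is 0x1F7FF, which has bits 0 through 16 set except bit 11. This is a rough sketch only: the contents of `TABLE` are placeholders (the real `.Lswitch.table.entrypoint` values are not shown in the thread), and `lookup` is a hypothetical name.

```rust
/// Bitmask from the listing: bit `n` is set when case `n` has a table entry.
const VALID_CASES: u64 = 129023; // 0x1F7FF

/// Placeholder table contents (assumption). The listing passes a length of 1
/// to `sol_log`, so each entry is modelled as a one-character string.
const TABLE: [&str; 17] = [
    "0", "1", "2", "3", "4", "5", "6", "7", "8",
    "9", "a", "b", "c", "d", "e", "f", "g",
];

/// Returns the table entry for `n`, or `None` when the listing would jump
/// straight to `LBB0_3` without logging anything.
fn lookup(n: u64) -> Option<&'static str> {
    // jgt r1, 16, LBB0_3
    if n > 16 {
        return None;
    }
    // mov64 r3, 129023; rsh64 r3, r2; and64 r3, 1; jeq r3, 0, LBB0_3
    if (VALID_CASES >> n) & 1 == 0 {
        return None;
    }
    // lsh64 r1, 3; add64 r2, r1; ldxdw r1, [r2 + 0]
    Some(TABLE[n as usize])
}
```

The whole dispatch costs two conditional jumps regardless of which case is selected, which is why no per-arm comparisons appear in the listing.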

illia-bobyr

Out of scope for this PR

I find it a bit hard to quickly spot all the differences between parse() and parse_into<MAX_TX_ACCOUNTS>().

I can see that the skip extra accounts logic is only present in the parse_into() version.
Is this the only difference?

I think if this is the case, there is probably a way to remove the skip logic from the parse_into<MAX_TX_ACCOUNTS>() case.
Specifically, only when MAX_ACCOUNTS equals MAX_TX_ACCOUNTS.
As MAX_ACCOUNTS is a generic constant, the compiler will create a new instance of this function for every specific value of MAX_ACCOUNTS.
Plus, it is marked as #[inline(always)] allowing further optimizations.

I think if

        while to_skip > 0 {

is augmented with a compile time check, maybe like this:

        if MAX_ACCOUNTS < MAX_TX_ACCOUNTS {
            while to_skip > 0 {

the compiler should remove this code completely when MAX_ACCOUNTS equals MAX_TX_ACCOUNTS.
And it should also remove the to_skip calculation, as it becomes dead code.

Also, would it make sense to add a compile time assertion, to make sure that MAX_ACCOUNTS is not above MAX_TX_ACCOUNTS?

I wonder if a change like that would be enough to end up with just a single version of the parse() function.
It looks like it has a considerable overlap with parse_into() and you are making changes in both functions in parallel.
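The two suggestions above (guarding the skip loop with a constant condition and adding a compile-time bound check) can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: `split_accounts` and `AssertBound` are hypothetical names, the value of `MAX_TX_ACCOUNTS` is chosen for the example, and the loop body only counts iterations where real code would advance the input pointer.

```rust
/// Illustrative value; stands in for the SDK constant of the same name.
const MAX_TX_ACCOUNTS: usize = 254;

/// Helper carrying a compile-time assertion on the const generic.
struct AssertBound<const MAX_ACCOUNTS: usize>;

impl<const MAX_ACCOUNTS: usize> AssertBound<MAX_ACCOUNTS> {
    /// Evaluated at compile time for each `MAX_ACCOUNTS` that is used;
    /// the build fails if the bound is violated.
    const OK: () = assert!(MAX_ACCOUNTS <= MAX_TX_ACCOUNTS);
}

/// Returns `(processed, skipped)` for an input with `total` accounts.
fn split_accounts<const MAX_ACCOUNTS: usize>(total: usize) -> (usize, usize) {
    // Referencing the constant forces the assertion to be evaluated.
    let () = AssertBound::<MAX_ACCOUNTS>::OK;

    let processed = if total < MAX_ACCOUNTS { total } else { MAX_ACCOUNTS };
    let mut skipped = 0;

    // The condition is a constant for each instantiation, so the optimizer
    // can drop the whole block when MAX_ACCOUNTS == MAX_TX_ACCOUNTS.
    if MAX_ACCOUNTS < MAX_TX_ACCOUNTS {
        let mut to_skip = total - processed;
        while to_skip > 0 {
            // Real code would step the input pointer over one account here.
            skipped += 1;
            to_skip -= 1;
        }
    }

    (processed, skipped)
}
```

Calling `split_accounts::<2>(5)` processes 2 accounts and skips 3, while an instantiation with `MAX_ACCOUNTS` equal to `MAX_TX_ACCOUNTS` never enters the skip block. Instantiating it with a larger value, such as `split_accounts::<300>`, is rejected at compile time.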

illia-bobyr

Did one more pass and found a few minor things.
But overall it is good :)

joncinque

Great work! This might be one of the few cases where a macro is uniquely suited to do the job, and it's really well factored.

febo force-pushed the febo/improve-entrypoint branch from c235845 to aa4201c on June 9, 2025 13:13

febo mentioned this pull request Jun 10, 2025

Simplify program entrypoint #166

Merged

joncinque reviewed Jun 11, 2025

febo force-pushed the febo/entrypoint-cleanup branch from 9361601 to fad2c71 on June 11, 2025 13:32

febo force-pushed the febo/entrypoint-cleanup branch from f6f4596 to e8b191a on June 13, 2025 11:06

Base automatically changed from febo/entrypoint-cleanup to main June 14, 2025 00:13

febo force-pushed the febo/improve-entrypoint branch 2 times, most recently from ad37c78 to a9285b4 on June 15, 2025 00:00

febo requested a review from joncinque June 15, 2025 09:50

febo marked this pull request as ready for review June 15, 2025 09:50

febo marked this pull request as draft June 19, 2025 15:17

febo added 14 commits June 26, 2025 10:59

Fix review comments

6e77c1c

Revert offset increment change

fe9e4c5

Add invoke instruction helper

a912bb5

Typos

d1f795d

Remove new helpers

19e49cd

Remove unused

9367935

Address review comments

434f182

Tweak inline attributes

e2c23f8

Use invoke signed unchecked

bd0b5f9

Refactor inline

05878cb

Renamed to with_bounds

1289020

Update docs

cddf283

Revert change

89e140b

Add constant length check

91f767a

febo added 7 commits June 26, 2025 10:59

Refactor deserialize

af0651a

Fix imports

81a6c13

Tweak docs

80ce1ee

[WIP]: Process accounts in batch

d0a38f8

Update doc comment

9087fdf

Tweak the case for accounts <= 2

1872115

Rename to parse

01a09f4

febo force-pushed the febo/improve-entrypoint branch from 0f21920 to 01a09f4 on June 26, 2025 14:17

febo marked this pull request as ready for review June 26, 2025 14:23

Fix comments

fa8a794

illia-bobyr reviewed Jun 30, 2025

febo added 2 commits July 2, 2025 11:09

Use match statement

3cebe14

Rename to_process_plus_one

227a9c3

illia-bobyr reviewed Jul 3, 2025

febo added 2 commits July 6, 2025 01:09

Add parse test

14eaaa8

Another rename to_process_plus_one

3b22204

febo requested a review from illia-bobyr July 6, 2025 17:01

febo added 2 commits July 8, 2025 11:14

Remove unnecessary parse method

d2b6dd7

Revert back to deserialize

cf98ba1

illia-bobyr previously approved these changes Jul 8, 2025

Revert to updating input pointer

3a1e247

febo dismissed illia-bobyr’s stale review via 3a1e247 July 10, 2025 09:32

febo requested a review from illia-bobyr July 10, 2025 09:32

illia-bobyr approved these changes Jul 10, 2025

joncinque approved these changes Jul 15, 2025

febo merged commit bd28a5f into main Jul 15, 2025
9 checks passed

febo deleted the febo/improve-entrypoint branch July 15, 2025 20:48

febo mentioned this pull request Jul 19, 2025

Fix duplicated account parsing #209

Merged
