Jump-and-link instruction #630

Dentosal · 2025-03-14T11:33:37Z

Closes #627. VM issue: FuelLabs/fuel-vm#857. VM PR: FuelLabs/fuel-vm#925.

The design is quite similar to the RISC-V instruction of the same name. JAL $ra $rb imm stores the address of the next instruction in $ra, so that register can be used as the return address from the subroutine. If $ra is $zero, the value is discarded instead, so the instruction can also be used as a plain jump without having to trash a register. After storing the return address, it jumps to the instruction at memory address $rb + imm * 4.
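As a rough illustration of the semantics described above, here is a minimal Python sketch of a single JAL step. The `jal` helper and the flat 64-entry register list are made up for this sketch and are not the VM's implementation. Reading `$rb` before writing `$ra` is an assumption (the description does not say what happens when both name the same register), and the indices 0x00 for `$zero` and 0x03 for `$pc` are assumed from the FuelVM register table.

```python
ZERO, PC = 0x00, 0x03  # assumed register indices for $zero and $pc


def jal(regs, ra, rb, imm):
    """Execute one JAL on a register file given as a list of 64 integers."""
    # Assumption: the target uses the old value of $rb, even if ra == rb.
    target = regs[rb] + imm * 4
    if ra != ZERO:
        regs[ra] = regs[PC] + 4  # address of the instruction after the JAL
    regs[PC] = target


regs = [0] * 64
regs[PC] = 1000
jal(regs, 0x11, PC, 2)    # call: link in register 0x11, jump two instructions ahead
assert regs[0x11] == 1004 and regs[PC] == 1008
jal(regs, ZERO, 0x11, 0)  # return: no link is stored, jump to the saved address
assert regs[PC] == 1004 and regs[ZERO] == 0
```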

The main purpose of this instruction is efficient subroutine calling and returning. JAL $ret_addr $subroutine_addr 0 is used to perform the call, and JAL $zero $ret_addr 0 returns from it. For nested function calls, the callee is responsible for saving $ret_addr.

The following snippet shows a minimal program using the functionality:

// main function
jal $ret_addr $pc 2 // call subroutine
ret $zero // end program

// subroutine
/* subroutine body comes here */
jal $zero $ret_addr 0 // Return from the subroutine

Fibonacci example

To show how compact this makes the code, I wrote a small Fibonacci function using it. The function here uses the following register-based ABI:

Function argument and return value $fnarg in 0x10
Function return address $return_addr in 0x11

Also the code uses the following locals: $local1: 0x12, $local2: 0x13, $local3: 0x14 (named for pshl/popl)

// Set argument
movi $fnarg 10 // <- this computes fibo(10), i.e. 10th fibonacci number, 55

// Main function
jal $return_addr $pc 3 // <- offset to the subroutine
log $fnarg $zero $zero $zero
ret $one

// Fibonacci subroutine
// fibo(0) = 0, fibo(1) = 1, fibo(n) = fibo(n-1) + fibo(n-2)
pshl 0b11110 // Save return_address and local{1,2,3}
// Compute fn pointer to the current function and place it in local3
subi $local3 $pc 4 // <- subtract 4 to get prev instruction start
// If n < 2 no computation needed
movi $local1 2
lt $local1 $fnarg $local1
jnzf $local1 $zero 8 // Skip over computation
// Else call self with n - 1 and n - 2 and sum those
subi $local2 $fnarg 2         // Save n - 2 to local2
subi $fnarg $fnarg 1          // n -= 1
jal $return_addr $local3 0    // Call self
move $local1 $fnarg           // Copy result to local1
move $fnarg $local2           // Restore n - 2 from local2
jal $return_addr $local3 0    // Call self
move $local2 $fnarg           // Copy result to local2
add $fnarg $local1 $local2 // result = local1 + local2
// Computation ends here; this is where jnzf jumps to
popl 0b11110 // Restore return_address and local{1,2,3}
jal $zero $return_addr 0 // Return from subroutine
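
To sanity-check the listing, the program above can be stepped through with a toy interpreter. The sketch below is in Python and models only the instructions the example uses. It assumes `$is` is 0, treats the stack as a Python list, ignores gas and overflow, and takes `jnzf` to jump to `$pc + ($rB + imm + 1) * 4`. `PROGRAM` and `run` are names made up for this sketch.

```python
ZERO, ONE, PC = 0x00, 0x01, 0x03  # assumed indices of $zero, $one, $pc
FNARG, RETURN_ADDR, LOCAL1, LOCAL2, LOCAL3 = 0x10, 0x11, 0x12, 0x13, 0x14

PROGRAM = [
    ("movi", FNARG, 10),
    ("jal", RETURN_ADDR, PC, 3),
    ("log", FNARG, ZERO, ZERO, ZERO),
    ("ret", ONE),
    ("pshl", 0b11110),
    ("subi", LOCAL3, PC, 4),
    ("movi", LOCAL1, 2),
    ("lt", LOCAL1, FNARG, LOCAL1),
    ("jnzf", LOCAL1, ZERO, 8),
    ("subi", LOCAL2, FNARG, 2),
    ("subi", FNARG, FNARG, 1),
    ("jal", RETURN_ADDR, LOCAL3, 0),
    ("move", LOCAL1, FNARG),
    ("move", FNARG, LOCAL2),
    ("jal", RETURN_ADDR, LOCAL3, 0),
    ("move", LOCAL2, FNARG),
    ("add", FNARG, LOCAL1, LOCAL2),
    ("popl", 0b11110),
    ("jal", ZERO, RETURN_ADDR, 0),
]


def run(program):
    """Run until `ret` and return the list of logged values."""
    regs = [0] * 64
    regs[ONE] = 1
    stack, logs = [], []
    while True:
        op, *args = program[regs[PC] // 4]  # each instruction is 4 bytes
        next_pc = regs[PC] + 4
        if op == "movi":
            regs[args[0]] = args[1]
        elif op == "move":
            regs[args[0]] = regs[args[1]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "subi":
            regs[args[0]] = regs[args[1]] - args[2]
        elif op == "lt":
            regs[args[0]] = int(regs[args[1]] < regs[args[2]])
        elif op == "jnzf":
            if regs[args[0]] != 0:
                next_pc = regs[PC] + (regs[args[1]] + args[2] + 1) * 4
        elif op == "jal":
            ra, rb, imm = args
            target = regs[rb] + imm * 4
            if ra != ZERO:
                regs[ra] = regs[PC] + 4
            next_pc = target
        elif op == "pshl":  # bit i of the mask selects register 16 + i
            stack.append([regs[16 + i] for i in range(24) if args[0] >> i & 1])
        elif op == "popl":
            saved = iter(stack.pop())
            for i in range(24):
                if args[0] >> i & 1:
                    regs[16 + i] = next(saved)
        elif op == "log":
            logs.append(regs[args[0]])
        elif op == "ret":
            return logs
        regs[PC] = next_pc
```

Under these assumptions `run(PROGRAM)` returns `[55]`, matching the comment on the `movi` line.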

Before requesting review

I have reviewed the changes myself

Voxelot · 2025-03-17T20:40:00Z

cc @vaivaswatha can you comment on the impact of this change? I.e., any concerns regarding register allocation for nested sub-routines?

xunilrj · 2025-03-17T22:01:42Z

Today this is how we compile the following fn.

fn main() -> u64 {
    1337
}

This is the function ASM (not super optimized to avoid inlining):

pshl i3                       ; save registers 16..40
pshh i524288                  ; save registers 40..64
move $$locbase $sp            ; save locals base register for function main_0
move $r0 $$reta               ; save return address
movi $r1 i1337                ; initialize constant into register
move $$retv $r1               ; set return value
move $$reta $r0               ; restore return address
poph i524288                  ; restore registers 40..64
popl i3                       ; restore registers 16..40
jmp $$reta                    ; return from call

This is the ASM calling the fn:

sub  $$reta $pc $is           ; get current instruction offset from instructions start ($is) 
srli $$reta $$reta i2         ; get current instruction offset in 32-bit words
addi $$reta $$reta i4         ; [call]: set new return address
jmpf $zero i76                ; [call]: call main_0
move $r0 $$retv               ; [call]: copy the return value

With this new instruction, we could call fns like:

jal $$reta $pc i76
move $r0 $$retv

and the fn would be

pshl i3                       ; save registers 16..40
pshh i524288                  ; save registers 40..64
move $$locbase $sp            ; save locals base register for function main_0
move $r0 $$reta               ; save return address
movi $r1 i1337                ; initialize constant into register
move $$retv $r1               ; set return value
poph i524288                  ; restore registers 40..64
popl i3                       ; restore registers 16..40
jal $zero $r0 0

Which means we can save 3 instructions when calling fns (huge gains!), and none in the function definition, given that jal $zero $ret_addr 0 seems to be identical to jmp $$reta.

We could save an extra 4 instructions per function definition by using JAL's last argument as a flag to do register pushing and popping.

vaivaswatha · 2025-03-19T04:35:40Z

> cc @vaivaswatha can you comment on the impact of this change? I.e., any concerns regarding register allocation for nested sub-routines?

There shouldn't be any problem. When we enter a function, we save all (used) registers and pop them all back at the end. So register allocation shouldn't be affected. I don't see any downsides, and the upside is as elaborated by @xunilrj .

Dentosal · 2025-03-21T11:10:41Z

> Which means we can save 3 instructions when calling fns (huge gains!), and none in the function definition, given that jal $zero $ret_addr 0 seems to be identical to jmp $$reta

jal $zero $ret_addr 0 isn't exactly identical to jmp $$reta, in the sense that jmp is $is-relative and jal is not.
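
The difference can be sketched in Python by following the return address through both call sequences quoted earlier. The helper names are invented for this sketch, and it assumes `jmp $rA` sets `$pc` to `$is + $rA * 4`, which is what the sub/srli/addi sequence appears to be built around.

```python
def old_call_return_target(pc, is_):
    """Address reached by `jmp $$reta` after the old four-instruction call.

    `pc` is the address of the `sub` instruction, `is_` the value of $is.
    """
    reta = pc - is_        # sub  $$reta $pc $is
    reta >>= 2             # srli $$reta $$reta i2  (bytes -> instructions)
    reta += 4              # addi $$reta $$reta i4  (skip the 4-instruction call)
    return is_ + reta * 4  # jmp  $$reta, relative to $is


def jal_return_target(pc):
    """Address reached by `jal $zero $$reta 0` after `jal $$reta $pc imm`."""
    link = pc + 4          # JAL stores an absolute byte address
    return link + 0 * 4    # the returning JAL adds imm * 4 with imm = 0


# With $is = 1000 and the call starting at address 1040:
assert old_call_return_target(1040, 1000) == 1056  # the move after jmpf
assert jal_return_target(1040) == 1044             # the move after jal
```

The register holds an instruction count from `$is` in the first case and an absolute byte address in the second, so a value produced by one convention cannot be consumed by the other instruction without conversion.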

> We could save an extra 4 instructions per function definition by using JAL's last argument as a flag to do register pushing and popping.

I'm not sure how that would work? The immediate part here is at most 12 bits long, and the VM has 48 user-writable registers. Unless we special-case some of these registers, of course, but that seems unwise.
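
To make the size argument concrete, here is a small Python sketch of how a set of registers maps onto the `pshl`/`pshh` bitmasks. It assumes bit i of the `pshl` mask selects register 16 + i and bit i of the `pshh` mask selects register 40 + i, which is consistent with the `pshl 0b11110` line in the Fibonacci example. `push_masks` is a hypothetical helper, not part of any toolchain.

```python
def push_masks(registers):
    """Return the (pshl, pshh) bitmasks that save the given user registers."""
    low = high = 0
    for reg in registers:
        if not 16 <= reg < 64:
            raise ValueError(f"register {reg} is not user-writable")
        if reg < 40:
            low |= 1 << (reg - 16)
        else:
            high |= 1 << (reg - 40)
    return low, high


# The Fibonacci example saves 0x11..0x14 with a single pshl.
assert push_masks(range(0x11, 0x15)) == (0b11110, 0)

# Saving every user-writable register needs 24 + 24 = 48 mask bits,
# four times what a 12-bit immediate can carry.
full_low, full_high = push_masks(range(16, 64))
assert full_low.bit_length() + full_high.bit_length() == 48
```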

I'm noticing that the function calls could be optimized a lot further with smarter register allocation. For instance:

- you could save two instructions by using only higher-half (pshh/poph) registers in the function body, so pshl/popl isn't required at all
- there's no actual need for move $r0 $$reta, just use jal $zero $$reta 0 directly
- and of course, the whole function should be inlined in this case
- after returning, the move $r0 $$retv could be optimized away by treating $$retv as the return value

xunilrj · 2025-03-22T14:45:34Z

I was imagining one bit per push/pop. So of the 12 bits not being used, 4 would allow jump and push, or jump and pop, of all registers.

Dentosal · 2025-03-24T10:22:50Z

> I was imagining one bit per push/pop. So of the 12 bits not being used, 4 would allow jump and push, or jump and pop, of all registers.

I don't think push/pop all registers are sensible operations. At least you'd like to keep the return value and address as-is.

Dentosal · 2025-03-27T20:06:31Z

Some benchmarks with a Sway compiler modified to use this instruction:

Build command: `forc build --release`.

| Project | `d821dcb` | `d821dcb` with `JAL` support | Reduction |
| --- | --- | --- | --- |
| mira-v1-core | 89.384 KB | 85.704 KB | 4.3% |
| sway-applications name-registry/registry-contract | 24.664 KB | 23.128 KB | 6.2% |
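
For readers who want to recompute the last column, the following Python sketch derives the reduction from the two reported sizes. It assumes the reduction is measured against the original binary size; `SIZES_KB` and `reduction_percent` are names invented here.

```python
# (before, after) sizes in KB, copied from the table above.
SIZES_KB = {
    "mira-v1-core": (89.384, 85.704),
    "name-registry/registry-contract": (24.664, 23.128),
}


def reduction_percent(before_kb, after_kb):
    """Size saved, as a percentage of the original binary size."""
    return (before_kb - after_kb) / before_kb * 100


for project, (before_kb, after_kb) in SIZES_KB.items():
    print(f"{project}: {reduction_percent(before_kb, after_kb):.1f}%")
```

Measured this way the second row reproduces the reported 6.2%, while the first comes out at about 4.1%; the reported 4.3% corresponds to dividing the saving by the new size instead.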

Voxelot · 2025-05-12T19:43:04Z

Should we align the naming more closely with the RISC-V instructions? I.e.:

- JAL -> jump and link with only an immediate value operand: jal ra, immediate_offset
- JALR -> jump and link with both register and immediate value operands: jalr ra, rb, immediate_offset
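
For reference, the two RISC-V instructions being compared can be sketched in Python as follows. The sketch assumes 32-bit (uncompressed) instructions and an already sign-extended offset; the function names are invented here. Each function returns the value written to the link register and the next program counter.

```python
def riscv_jal(pc, offset):
    """jal rd, offset: link to pc + 4, jump relative to the current pc."""
    return pc + 4, pc + offset


def riscv_jalr(pc, rs1_value, offset):
    """jalr rd, rs1, offset: link to pc + 4, jump to rs1 + offset.

    The lowest bit of the computed target is cleared.
    """
    return pc + 4, (rs1_value + offset) & ~1


assert riscv_jal(0x100, 16) == (0x104, 0x110)
assert riscv_jalr(0x100, 0x201, 4) == (0x104, 0x204)
```

Because the instruction proposed in this PR takes both a register and an immediate, its operand shape is closer to `jalr` than to RISC-V's `jal`.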

Voxelot · 2025-05-12T20:55:57Z

src/fuel-vm/instruction-set.md

+
+- `$rA` is a reserved register other than `$zero`
+- `$rB + imm * 4 >= VM_MAX_RAM`
+


Should we panic if $rB == $pc && imm == 0 to avoid jumping into the exact same spot?

Dentosal · 2025-05-13

Likely not. It's not as if you couldn't otherwise make an infinite loop if you wanted to, and then you'd just run out of gas anyway.
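
Reading the two quoted spec lines as panic conditions, a check along these lines would cover them. This is a Python sketch, not the VM code: `jal_panics` is a hypothetical helper, the reserved registers are assumed to be indices 0 to 15, and `VM_MAX_RAM` is assumed to be 2**26.

```python
VM_MAX_RAM = 2**26      # assumption: 64 MiB
RESERVED = range(0, 16)  # assumption: registers 0x00..0x0f are reserved
ZERO, PC = 0x00, 0x03


def jal_panics(ra, rb_value, imm):
    """Return True if JAL with these operands would hit a panic condition."""
    if ra in RESERVED and ra != ZERO:
        return True  # $rA is a reserved register other than $zero
    return rb_value + imm * 4 >= VM_MAX_RAM


assert jal_panics(PC, 0, 0)                   # cannot link into $pc
assert not jal_panics(ZERO, 100, 0)           # plain jump, link discarded
assert jal_panics(0x10, VM_MAX_RAM - 4, 1)    # target is out of memory
```

In line with the reply above, a jump whose target equals the current `$pc` passes this check; it simply loops until gas runs out.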

## Description

This PR contains an initial implementation of subroutine calls using the in-progress [jump-and-link instruction `JAL`](FuelLabs/fuel-specs#630). It substantially reduces the function call overhead: the old code used 4 instructions per call, while the new version uses 1-3 depending on the distance to the called function.

### Future optimizations

* Reorder functions, so those that call each other are adjacent
* Use absolute or IS-relative jumps where it makes sense, see #7267

## Checklist

- [x] I have linked to any relevant issues.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [ ] I have updated the documentation where relevant (API docs, the reference, and the Sway book).
- [x] If my change requires substantial documentation changes, I have [requested support from the DevRel team](https://github.com/FuelLabs/devrel-requests/issues/new/choose)
- [x] I have added tests that prove my fix is effective or that my feature works.
- [x] I have added (or requested a maintainer to add) the necessary `Breaking*` or `New Feature` labels where relevant.
- [x] I have done my best to ensure that my PR adheres to [the Fuel Labs Code Review Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md).
- [x] I have requested a review from the relevant team or maintainers.

Add JAL instruction

3e8c8af

Dentosal self-assigned this Mar 14, 2025

Dentosal mentioned this pull request Mar 14, 2025

Jump-and-link instruction FuelLabs/fuel-vm#925

Merged

6 tasks

Dentosal marked this pull request as ready for review March 14, 2025 11:42

Dentosal requested review from a team March 14, 2025 11:42

Dentosal added the comp:FVM Component: FuelVM label Mar 14, 2025

Dentosal mentioned this pull request Oct 31, 2024

New function call/return helper opcodes FuelLabs/fuel-vm#857

Closed

Dentosal and others added 2 commits March 24, 2025 12:22

Merge branch 'master' into dento/jal-instruction

023cd4b

Correctly use imm * 4 in all fields

8d726c8

Merge branch 'master' into dento/jal-instruction

b33db4c

Dentosal mentioned this pull request Apr 14, 2025

Subroutine calls using the new JAL instruction FuelLabs/sway#7085

Merged

8 tasks

Voxelot reviewed May 12, 2025


Dentosal added 2 commits May 13, 2025 16:03

Merge branch 'master' into dento/jal-instruction

17eb8f3

Merge branch 'master' into dento/jal-instruction

5682a25

Dentosal enabled auto-merge (squash) July 28, 2025 10:21

Dentosal requested a review from Voxelot July 28, 2025 10:22

xgreenx approved these changes Jul 28, 2025


Dentosal merged commit 2869996 into master Jul 28, 2025
6 checks passed

Dentosal deleted the dento/jal-instruction branch July 28, 2025 15:56
