Remove usage of reserved registers on arm64 #47
StephenButtolph wants to merge 7 commits into cloudflare:master
I tried to make the commits individually reviewable and have them build up to the solution, so (fwiw) I think going commit-by-commit is the easiest way to review this change.
I probably should have included benchmarks. Running locally on my laptop, the new code is, as expected, slightly slower (tbh probably within random perturbation).
I don't want to be a bother, but I just want to make sure you (@armfazh) saw this PR. If there is anything I can do to make the review easier, let me know.
#define RELOAD \
	MOVD ·p2+0(SB), R5

#define mul(c0,c1,c2,c3,c4,c5,c6,c7,t0,reset) \
This might make this a bit more clear.
- #define RELOAD \
- 	MOVD ·p2+0(SB), R5
- #define mul(c0,c1,c2,c3,c4,c5,c6,c7,t0,reset) \
+ #define ResetR5 \
+ 	MOVD ·p2+0(SB), R5
+ #define mul(c0,c1,c2,c3,c4,c5,c6,c7,maybeR5,maybeResetR5) \
	UMULH R2, R6, c6 \
	MUL R2, R7, R0 \
	ADCS R0, R27 \
	UMULH R2, R7, R29 \
We could continue to use R29 if we saved its value on the stack at the start of gfpMul and restored it just before RET. That would mean we wouldn't need the t0 and reset modifications to this macro (as we can replace R27 with R29 and only restore R6 in gfpReduce).
This would allocate 8 bytes on the stack, but would keep the same number of additional instructions (3). The current modification has 3 MOVD instructions; this would have a push + pop + MOVD.
If that seems preferable then I can switch to that approach.
lock.Lock()
// Make it more likely for goroutines to block on this mutex.
time.Sleep(time.Microsecond)

// If the frame pointer was corrupted, and another goroutine is
// blocked on this mutex, then this will segfault.
lock.Unlock()
We could make this an arm64-specific test and include some asm code that explicitly returns R27 and R29, so that we verify gfpMul doesn't modify them. That felt like a bit more work than I wanted to do, but the test is a bit contrived as it is right now (and it only verifies that R29 is not corrupted).
  loadBlock(0(R0), R5,R6,R7,R8)

- mul(R9,R10,R11,R12,R13,R14,R15,R16)
+ mul(R9,R10,R11,R12,R13,R14,R15,R16,R17,EMPTY)
Note: R17 is used later, but at this point it is uninitialized, so we can use it and not care about its value being modified.
Yes, thanks for this. Overall it looks OK, but I have to do a deeper review.
Closing since I found a way in #48 to remove the reserved registers without using the stack, based on your changes.
Interesting... It fails every time on my computer... What is your |
Resolves #46.
This removes all usage of R27 and R29 from gfpMul.
This fix introduces 3 additional MOVD instructions (1 for the removal of R29 and 2 for the removal of R27).
I couldn't figure out any less costly way of removing the usage of the registers (as all of the available registers are used).
I definitely don't love the style of the change (especially from the last commit, where the additional arguments to mul were added)... So any suggestions there would be especially appreciated.