Skip to content

x64: Sink constant loads into xmm instructions#5880

Merged
alexcrichton merged 1 commit intobytecodealliance:mainfrom
alexcrichton:sink-const-loads-in-sse-insts
Feb 27, 2023
Merged

x64: Sink constant loads into xmm instructions#5880
alexcrichton merged 1 commit intobytecodealliance:mainfrom
alexcrichton:sink-const-loads-in-sse-insts

Conversation

@alexcrichton
Copy link
Member

A number of places in the x64 backend make use of 128-bit constants for various wasm SIMD-related instructions although most of them currently use the x64_xmm_load_const helper to load the constant into a register. Almost all xmm instructions, however, enable using a memory operand which means that these loads can be folded into instructions to help reduce register pressure. Automatic conversions were added for a VCodeConstant into an XmmMem value and then explicit loads were all removed in favor of forwarding the XmmMem value directly to the underlying instruction. Note that some instances of x64_xmm_load_const remain since they're used in contexts where load sinking won't work (e.g. they're the first operand, not the second for non-commutative instructions).

A number of places in the x64 backend make use of 128-bit constants for
various wasm SIMD-related instructions although most of them currently
use the `x64_xmm_load_const` helper to load the constant into a
register. Almost all xmm instructions, however, enable using a memory
operand which means that these loads can be folded into instructions to
help reduce register pressure. Automatic conversions were added for a
`VCodeConstant` into an `XmmMem` value and then explicit loads were all
removed in favor of forwarding the `XmmMem` value directly to the
underlying instruction. Note that some instances of `x64_xmm_load_const`
remain since they're used in contexts where load sinking won't work
(e.g. they're the first operand, not the second for non-commutative
instructions).
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Feb 24, 2023
Copy link
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read through both the code and test changes here and it all looks correct to me. But I'd like @elliottt to take a look too because I'm not super confident about spotting issues with load-sinking.

It's worth noting that we don't yet have SIMD support in the Cranelift fuzzer, so we can't rely on that to catch issues with this PR.

@alexcrichton
Copy link
Member Author

True! Wasm-based fuzzing, however, should extensively test the simd proposal with differential execution against v8.

@cfallin
Copy link
Member

cfallin commented Feb 27, 2023

To the fuzzing point, one request (if you haven't already planned it this way): could we hold off merging this until Mar 5, when the next release branch is cut, so we maximize fuzz time at 6 weeks rather than 2? I think it's great we're getting this but given history with SIMD loads I too am a bit apprehensive :-)

@alexcrichton
Copy link
Member Author

I don't mind, but I'd also push back a bit in that #5849, #5841, and #5819 were sort of all much larger/riskier. I had good success uncovering semantic issues through fuzzing in not much time as well.

Would you prefer, though, to back out those PRs and re-land after the cutoff?

@cfallin
Copy link
Member

cfallin commented Feb 27, 2023

Ah, yeah, that's a fair point that my request isn't really consistent, and I hadn't connected the dots; given those are already in, I retract my request as I don't think this is any more risk in comparison.

Copy link
Member

@elliottt elliottt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to me!

The problems we had last time with aligned loads weren't related to constants, but arbitrary loads that we weren't guaranteed would be aligned. Given that and the requirements we put on function alignment, I'm pretty confident that this won't have the same problems that my changes last summer did :)

@alexcrichton alexcrichton added this pull request to the merge queue Feb 27, 2023
Merged via the queue into bytecodealliance:main with commit f2dce81 Feb 27, 2023
@alexcrichton alexcrichton deleted the sink-const-loads-in-sse-insts branch February 27, 2023 22:41
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Mar 1, 2023
This catches a case that wasn't handled previously by bytecodealliance#5880 to allow a
constant load to be folded into an instruction rather than forcing it to
be loaded into a temporary register.
alexcrichton added a commit that referenced this pull request Mar 2, 2023
This catches a case that wasn't handled previously by #5880 to allow a
constant load to be folded into an instruction rather than forcing it to
be loaded into a temporary register.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants