CL/aarch64: implement the wasm SIMD `v128.load{32,64}_zero` instructi… #2355

julian-seward1 · 2020-11-03T16:35:24Z

…ons.

This patch implements, for aarch64, the following wasm SIMD extensions.

* `v128.load32_zero` and `v128.load64_zero` instructions
  (WebAssembly/simd#237)

The changes are straightforward:

* No new CLIF instructions. They are translated into an existing CLIF scalar
  load followed by a CLIF `scalar_to_vector`.
* The comment/specification for CLIF `scalar_to_vector` has been changed to
  match the actual intended semantics, per consultation with Andrew Brown.
* Translation from `scalar_to_vector` to the obvious aarch64 insns.
* Special-case zero in `lower_constant_f128` in order to avoid a
  potentially slow call to `Inst::load_fp_constant128`.
* Once "Allow loads to merge into other operations during instruction
  selection in MachInst backends" (#2340) lands,
  we can use that functionality to pattern match the two-CLIF pair and
  emit a single AArch64 instruction.

There is no testcase in this commit, because the spec test suite lives in a
separate repo. The implementation has nevertheless been tested.
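As a rough illustration of the translation described in this PR description (a scalar load followed by `scalar_to_vector`), the following Rust sketch models the intended result on a plain byte array. The `V128` alias and all function names here are hypothetical and are not part of Cranelift; the sketch assumes only that the loaded scalar lands in the lowest lane and every other bit of the vector is zero.

```rust
/// Hypothetical model of a 128-bit vector: 16 bytes, lane 0 first (little-endian).
type V128 = [u8; 16];

/// Model of `scalar_to_vector` for a 32-bit scalar: the value occupies the
/// lowest lane and the remaining 96 bits are zero.
fn scalar_to_vector_u32(x: u32) -> V128 {
    let mut v = [0u8; 16];
    v[..4].copy_from_slice(&x.to_le_bytes());
    v
}

/// Model of `scalar_to_vector` for a 64-bit scalar: the value occupies the
/// lowest lane and the remaining 64 bits are zero.
fn scalar_to_vector_u64(x: u64) -> V128 {
    let mut v = [0u8; 16];
    v[..8].copy_from_slice(&x.to_le_bytes());
    v
}

/// Model of `v128.load32_zero`: a 32-bit scalar load, then `scalar_to_vector`.
/// Panics if `addr + 4` is out of bounds (a real wasm load would trap).
fn load32_zero(mem: &[u8], addr: usize) -> V128 {
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&mem[addr..addr + 4]);
    scalar_to_vector_u32(u32::from_le_bytes(bytes))
}

/// Model of `v128.load64_zero`: a 64-bit scalar load, then `scalar_to_vector`.
/// Panics if `addr + 8` is out of bounds (a real wasm load would trap).
fn load64_zero(mem: &[u8], addr: usize) -> V128 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&mem[addr..addr + 8]);
    scalar_to_vector_u64(u64::from_le_bytes(bytes))
}
```

For instance, with a memory holding the bytes 1 through 16, `load32_zero(&mem, 4)` would yield a vector whose first four bytes are 5, 6, 7, 8 and whose remaining twelve bytes are zero.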

abrown

The `scalar_to_vector` change looks good to me, as does the Wasm-to-CLIF translation; I'm unfamiliar with the aarch64 backend, though, so someone else should take a look there. Re: tests, I would hope that adding filetests for this type of thing (especially tests that benefit all backends) should not be too difficult. A `test compile` test (e.g. this one) would verify that the `scalar_to_vector` lowers to the VCode you expect in aarch64, but a `test run` test (e.g. this one) should be runnable on all backends if we remove the x64-specific parts (?).

julian-seward1 · 2020-11-03T17:10:02Z

@abrown Regarding tests, I expect that tests/spec_testsuite/proposals/simd will acquire a suitable test case in the fullness of time. That is a different repo with presumably a different schedule etc, though. I could even provide such a test since I have one -- a hacked version of the load-splat test case.

julian-seward1 · 2020-11-03T17:12:41Z

cc @yurydelendik

abrown · 2020-11-03T17:17:10Z

> Regarding tests, I expect that tests/spec_testsuite/proposals/simd will acquire a suitable test case in the fullness of time.

Agreed, I still think it is valuable to test this at the CLIF level as well, especially in the interim.

cfallin

LGTM for the AArch64 changes. Echoing that it would be nice to have a test case; an aarch64 vcode test should be pretty simple, I think.

cranelift/codegen/src/isa/aarch64/lower_inst.rs

julian-seward1 · 2020-11-04T10:35:04Z

I added a filetest for the lowering of scalar_to_vector.

…ons.

This patch implements, for aarch64, the following wasm SIMD extensions.

* `v128.load32_zero` and `v128.load64_zero` instructions (WebAssembly/simd#237)

The changes are straightforward:

* no new CLIF instructions. They are translated into an existing CLIF scalar load followed by a CLIF `scalar_to_vector`.
* the comment/specification for CLIF `scalar_to_vector` has been changed to match the actual intended semantics, per consultation with Andrew Brown.
* translation from `scalar_to_vector` to aarch64 `fmov` instruction. This has been generalised slightly so as to allow both 32- and 64-bit transfers.
* special-case zero in `lower_constant_f128` in order to avoid a potentially slow call to `Inst::load_fp_constant128`.
* Once "Allow loads to merge into other operations during instruction selection in MachInst backends" (bytecodealliance#2340) lands, we can use that functionality to pattern match the two-CLIF pair and emit a single AArch64 instruction.
* A simple filetest has been added.

There is no comprehensive testcase in this commit, because that is a separate repo. The implementation has been tested, nevertheless.
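The zero special case mentioned in the commit message can be pictured with a small Rust sketch. The `ConstPlan` enum and `plan_constant_f128` function are hypothetical names invented for this illustration; the sketch does not reproduce the code in `lower_constant_f128`, and which zeroing instruction the backend emits is not stated in this conversation.

```rust
/// Hypothetical description of how a 128-bit constant might be materialised.
#[derive(Debug, PartialEq)]
enum ConstPlan {
    /// All bits are zero, so a single cheap register-zeroing instruction suffices.
    ZeroIdiom,
    /// General case: materialise the 16 bytes through the slower path
    /// (the role `Inst::load_fp_constant128` plays in the commit message).
    LoadConstant(u128),
}

/// Pick the cheap path when the constant is zero, the general path otherwise.
fn plan_constant_f128(value: u128) -> ConstPlan {
    if value == 0 {
        ConstPlan::ZeroIdiom
    } else {
        ConstPlan::LoadConstant(value)
    }
}
```

The check costs one comparison at compile time, which is presumably why it is worthwhile whenever the general path is slow.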

akirilov-arm

LGTM.

akirilov-arm · 2020-11-04T18:47:21Z

cranelift/codegen/src/isa/aarch64/inst/mod.rs

@@ -877,10 +877,13 @@ pub enum Inst {
        rn: Reg,
    },

-    /// Move from a GPR to a scalar FP register.
+    /// Move from a GPR to a vector register.  The scalar value is parked in the lowest lane


BTW I don't mind the comment at all, but this operation is not special - virtually any instruction that operates on S or D registers (e.g. Inst::FpuRR) has exactly the same behaviour.
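The point made in this review comment, that any write to an S or D register clears the rest of the 128-bit vector register, can be illustrated with a toy Rust model. `VReg` and the three helper functions are hypothetical names for this sketch and do not correspond to Cranelift types; the model covers only the zeroing of the upper bits, not any other instruction semantics.

```rust
/// Toy model of one 128-bit AArch64 SIMD&FP register.
#[derive(Debug, Clone, Copy, PartialEq)]
struct VReg(u128);

impl VReg {
    /// Write through the 32-bit S view: the upper 96 bits become zero.
    fn write_s(&mut self, bits: u32) {
        self.0 = bits as u128;
    }

    /// Write through the 64-bit D view: the upper 64 bits become zero.
    fn write_d(&mut self, bits: u64) {
        self.0 = bits as u128;
    }
}

/// Model of moving a 32-bit GPR value into an S register.
fn fmov_s_from_w(dst: &mut VReg, w: u32) {
    dst.write_s(w);
}

/// Model of moving a 64-bit GPR value into a D register.
fn fmov_d_from_x(dst: &mut VReg, x: u64) {
    dst.write_d(x);
}

/// Model of a scalar single-precision add: it also writes through the S view,
/// so it clears the upper bits in exactly the same way as the move.
fn fadd_s(dst: &mut VReg, a: f32, b: f32) {
    dst.write_s((a + b).to_bits());
}
```

Starting from a register with all bits set, either the move or the add leaves only the low 32 bits populated, which is the sense in which the move is "not special".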

cfallin

Updated version looks good; thanks!

julian-seward1 force-pushed the arm64-simd-loadzero branch from 1efb50d to 1899aa7 on November 3, 2020 16:40

github-actions bot added the cranelift, cranelift:area:aarch64, cranelift:meta, and cranelift:wasm labels Nov 3, 2020

abrown approved these changes Nov 3, 2020

julian-seward1 requested a review from yurydelendik November 3, 2020 17:13

cfallin approved these changes Nov 3, 2020

akirilov-arm suggested changes Nov 3, 2020

julian-seward1 force-pushed the arm64-simd-loadzero branch from 1899aa7 to 3713dcc on November 4, 2020 18:16

akirilov-arm approved these changes Nov 4, 2020

akirilov-arm reviewed Nov 4, 2020

julian-seward1 requested a review from cfallin November 4, 2020 18:50

cfallin approved these changes Nov 4, 2020

julian-seward1 merged commit dd9bfce into bytecodealliance:main Nov 4, 2020
