Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CL/aarch64: implement the wasm SIMD v128.load{32,64}_zero instructi… #2355

Merged

Conversation

julian-seward1
Copy link
Contributor

…ons.

This patch implements, for aarch64, the following wasm SIMD extensions.

v128.load32_zero and v128.load64_zero instructions
WebAssembly/simd#237

The changes are straightforward:

  • no new CLIF instructions. They are translated into an existing CLIF scalar
    load followed by a CLIF scalar_to_vector.

  • the comment/specification for CLIF scalar_to_vector has been changed to
    match the actual intended semantics, per consulation with Andrew Brown.

  • translation from scalar_to_vector to the obvious aarch64 insns.

  • special-case zero in lower_constant_f128 in order to avoid a
    potentially slow call to Inst::load_fp_constant128.

  • Once "Allow loads to merge into other operations during instruction
    selection in MachInst backends"
    (Allow loads to merge into other operations during instruction selection in MachInst backends #2340) lands,
    we can use that functionality to pattern match the two-CLIF pair and
    emit a single AArch64 instruction.

There is no testcase in this commit, because that is a separate repo. The
implementation has been tested, nevertheless.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:meta Everything related to the meta-language. cranelift:wasm labels Nov 3, 2020
Copy link
Contributor

@abrown abrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scalar_to_vector change looks good to me, as does the Wasm-to-CLIF translation; I'm unfamiliar with the aarch64 backend, though, so someone else should take a look there. Re: tests, I would hope that adding filetests for this type of thing (especially tests that benefit all backends) should not be too difficult. A test compile test (e.g. this one) would verify that the scalar_to_vector lowers to the VCode you expect in aarch64 but a test run test (e.g. this one) should be runnable on all backends if we remove the x64-specific parts (?).

@julian-seward1
Copy link
Contributor Author

@abrown Regarding tests, I expect that tests/spec_testsuite/proposals/simd will acquire a suitable test case in the fullness of time. That is a different repo with presumably a different schedule etc, though. I could even provide such a test since I have one -- a hacked version of the load-splat test case.

@julian-seward1
Copy link
Contributor Author

cc @yurydelendik

@abrown
Copy link
Contributor

abrown commented Nov 3, 2020

Regarding tests, I expect that tests/spec_testsuite/proposals/simd will acquire a suitable test case in the fullness of time.

Agreed, I still think it is valuable to test this at the CLIF level as well, especially in the interim.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the AArch64 changes. Echoing that it would be nice to have a test case; an aarch64 vcode test should be pretty simple, I think.

@julian-seward1
Copy link
Contributor Author

I added a filetest for the lowering of scalar_to_vector.

…ons.

This patch implements, for aarch64, the following wasm SIMD extensions.

  v128.load32_zero and v128.load64_zero instructions
  WebAssembly/simd#237

The changes are straightforward:

* no new CLIF instructions.  They are translated into an existing CLIF scalar
  load followed by a CLIF `scalar_to_vector`.

* the comment/specification for CLIF `scalar_to_vector` has been changed to
  match the actual intended semantics, per consulation with Andrew Brown.

* translation from `scalar_to_vector` to aarch64 `fmov` instruction.  This
  has been generalised slightly so as to allow both 32- and 64-bit transfers.

* special-case zero in `lower_constant_f128` in order to avoid a
  potentially slow call to `Inst::load_fp_constant128`.

* Once "Allow loads to merge into other operations during instruction
  selection in MachInst backends"
  (bytecodealliance#2340) lands,
  we can use that functionality to pattern match the two-CLIF pair and
  emit a single AArch64 instruction.

* A simple filetest has been added.

There is no comprehensive testcase in this commit, because that is a separate
repo.  The implementation has been tested, nevertheless.
Copy link
Contributor

@akirilov-arm akirilov-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@@ -877,10 +877,13 @@ pub enum Inst {
rn: Reg,
},

/// Move from a GPR to a scalar FP register.
/// Move from a GPR to a vector register. The scalar value is parked in the lowest lane
Copy link
Contributor

@akirilov-arm akirilov-arm Nov 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW I don't mind the comment at all, but this operation is not special - virtually any instruction that operates on S or D registers (e.g. Inst::FpuRR) has exactly the same behaviour.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated version looks good; thanks!

@julian-seward1 julian-seward1 merged commit dd9bfce into bytecodealliance:main Nov 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:meta Everything related to the meta-language. cranelift:wasm cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants