Smaller encoding for v128 scalar constants #1476

Open
kg opened this issue May 19, 2023 · 4 comments

Comments

@kg
Contributor

kg commented May 19, 2023

From what I can see, the only way to create a v128 zero vector (for example, for unrolled memsets) is a full v128_const, which weighs in at around 18 bytes. That is painful since my JIT is limited to 4KB. i64_const + splat would be a smaller encoding, but it doesn't look like V8 optimizes that, and it's probably unreasonable to expect it to. It might be good to add a v128_const_zero or v128_const_splat equivalent in any future iteration of SIMD; I can imagine this being even worse when wider vector types are added. Maybe this only needs to happen in the binary format, but changing the existing opcode wouldn't be backwards compatible.
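
To make the size difference concrete, here is a rough Python sketch that builds the raw instruction bytes for both encodings. The opcode values come from the WebAssembly core and SIMD binary formats; the helper names (`sleb128`, `v128_const`, `i64_const_splat`) are made up for illustration and are not part of any toolchain.

```python
# Byte-level sketch of two ways to put a zero vector on the stack.
# Helper names are hypothetical; opcodes follow the Wasm binary format.

SIMD_PREFIX = 0xFD   # prefix byte shared by all SIMD instructions
V128_CONST = 0x0C    # v128.const, followed by 16 literal bytes
I64X2_SPLAT = 0x12   # i64x2.splat
I64_CONST = 0x42     # i64.const, followed by a signed LEB128 immediate


def sleb128(value: int) -> bytes:
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        sign_bit_set = (byte & 0x40) != 0
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def v128_const(lanes: bytes) -> bytes:
    """Prefix + opcode + the 16 literal bytes of the vector."""
    if len(lanes) != 16:
        raise ValueError("v128.const needs exactly 16 bytes")
    return bytes([SIMD_PREFIX, V128_CONST]) + lanes


def i64_const_splat(value: int) -> bytes:
    """i64.const <value> followed by i64x2.splat."""
    return bytes([I64_CONST]) + sleb128(value) + bytes([SIMD_PREFIX, I64X2_SPLAT])


if __name__ == "__main__":
    print("v128.const 0:           ", len(v128_const(bytes(16))), "bytes")
    print("i64.const 0 + i64x2.splat:", len(i64_const_splat(0)), "bytes")
```

The sketch only measures encoded size. Whether an engine turns the const + splat pair into a single zeroing instruction is a separate question.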

@tlively
Member

tlively commented May 19, 2023

Another trick would be to declare a v128 local that will be implicitly initialized to zero, then use a local.get to retrieve the value. I don't know how the V8 codegen for that compares to the other options, though.
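
As a rough illustration of what this trick costs in bytes, the Python sketch below encodes a v128 local declaration and a `local.get`. It assumes the local is never written (so it keeps its implicit zero value) and that the function has no other v128 locals to share a declaration group with. The helper names and the local index are hypothetical.

```python
# Sketch of the encoded cost of reading a zero-initialized v128 local.
# Helper names and the local index are illustrative assumptions.

LOCAL_GET = 0x20   # local.get, followed by an unsigned LEB128 local index
V128_TYPE = 0x7B   # value type byte for v128


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 needs a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value == 0:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def local_group(count: int, valtype: int) -> bytes:
    """One entry in a function body's locals vector: count, then type."""
    return uleb128(count) + bytes([valtype])


def local_get(index: int) -> bytes:
    return bytes([LOCAL_GET]) + uleb128(index)


def total_bytes(uses: int, index: int) -> int:
    """One-time declaration cost plus one local.get per use."""
    return len(local_group(1, V128_TYPE)) + uses * len(local_get(index))


if __name__ == "__main__":
    # Ten zero vectors: 10 * 18 bytes as v128.const vs. the local-based form.
    print("v128.const x10:", 10 * 18, "bytes")
    print("local.get  x10:", total_bytes(10, index=3), "bytes")
```

Each read costs 2 bytes while the local index stays below 128, plus a one-time declaration. How a given engine compiles the read is not captured by this size count.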

@kg
Contributor Author

kg commented May 19, 2023

Yeah, I was thinking of experimenting with a local, but from looking at the code I'm not sure V8, SpiderMonkey, etc. will realize that it's constant. Turning the consts into memory loads would probably be pretty bad. I got measurable speedups by switching my i64.const 0 + splat to v128.const 0; it just means I can't JIT as much code now due to the size bloat. I'll definitely test it at some point.

@eqrion

eqrion commented May 19, 2023

For Ion in SpiderMonkey, we'll generate equivalent IR for v128.const 0 and for a local.get when the local is zero-initialized once (by default or explicitly). Baseline will use a memory load from the stack for the local.get, though.

@dtig
Member

dtig commented May 19, 2023

I think we should be able to generate better code for i64.const 0 + splat in V8, because we should be able to constant-match the input to zero and generate a pxor. IIRC this is what we currently generate for V128Const 0. I've filed this tracking bug to optimize this better. Orthogonally, I'm not opposed to adding a V128Const for all zeros/ones.

webkit-commit-queue pushed a commit to kmiller68/WebKit that referenced this issue May 19, 2023
https://bugs.webkit.org/show_bug.cgi?id=257051

Reviewed by Yusuke Suzuki.

According to WebAssembly/design#1476 at least one person wants to encode vector 0 as Splat i32.const 0 since the former's encoding is 18 bytes whereas the latter is 5 bytes. As such, we should emit Splat i32.const 0/-1 as the optimal code in BBQJIT.

* Source/JavaScriptCore/wasm/WasmBBQJIT.cpp:
(JSC::Wasm::BBQJIT::addSIMDSplat):

Canonical link: https://commits.webkit.org/264280@main