SIMD #41
Initially, many of those things will be able to compile down to SIMD.js's 128-bit SIMD types and operations. This will work pretty well for the near term. Beyond that, there's obviously a desire to support longer vectors and fancy things like predication. We might think of SIMD.js as a "short vector" API that doesn't rule out adding a distinct and complementary "long vector" API; however, it's not yet clear what a "long vector" API should look like. The alternative to a proper "long vector" API is most likely explicit 256-bit and 512-bit types and operations, plus platform feature tests that put all the burden of using everything properly on applications. If no other compelling "long vector" API emerges, this is probably what we'll end up with, and it's what all of the C++ SIMD constructs mentioned above would compile to. Anyone want to propose a "long vector" API?
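To make the "explicit wider types" alternative concrete, here's a hedged sketch of how an 8-lane (256-bit) operation could be lowered onto two 4-lane (128-bit) halves, the way a compiler could map wider C++ vectors onto SIMD.js-style float32x4 operations. The type and function names (`f32x4`, `f32x8`, `add4`, `add8`) are invented for illustration, not from any proposal:

```cpp
#include <array>

// Invented 4-lane and 8-lane vector types for illustration.
struct f32x4 { std::array<float, 4> lane; };
struct f32x8 { f32x4 lo, hi; };

// Stands in for a native 128-bit SIMD add.
inline f32x4 add4(const f32x4& a, const f32x4& b) {
    f32x4 r{};
    for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

// The "long vector" op expressed purely via the short-vector primitive.
inline f32x8 add8(const f32x8& a, const f32x8& b) {
    return { add4(a.lo, b.lo), add4(a.hi, b.hi) };
}
```

On hardware with 256-bit registers an engine could fuse `add8` into one instruction; elsewhere the two-halves lowering is still correct.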
I don't think all the C++ constructs will map to SIMD.js :-) I think they'll be polyfill-able, though. I'm not 100% sure, so that's why we need to keep track. I happen to currently be in the room with the authors of these papers, so it shouldn't be too hard!
What constructs won't? Keep in mind that we're talking about compilers doing the mapping, so there will be significant lowering in many cases.
The current fixed-width SIMD proposal allows implementations to define algorithms and data structures for different vector widths, plus a generic vector size that maps onto one of these at compile time through a simple typedef, so ABIs keep working. We'd lose this capability and just expose the lowest common denominator, which is OK. The fixed-width proposal will stick to simple arithmetic for now (also OK), but shuffle and scatter/gather are possibilities that may or may not map well (though I think they'll polyfill). The auto-vectorization work exposes a wavefront model; that won't work for us right now unless we just go scalar, or we add some primitives. Intel will bring a new proposal to the next meeting, so it's still far from standard. There's also the Parallelism TS, which supports vectorization with STL-like algorithms; it relies heavily on individual runtimes, and can also just be scalar for us.
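The "generic vector size mapped at compile time via a simple typedef" mechanism mentioned above can be sketched as follows. `kNativeLanes` is a hypothetical constant a toolchain would set per target; here it is pinned to the 128-bit lowest common denominator, which is what losing the capability would look like:

```cpp
// Hypothetical per-target lane count; a real toolchain would choose
// this per ABI. Pinned to the 128-bit common denominator here.
constexpr int kNativeLanes = 4;

template <typename T, int N>
struct Vec { T lane[N]; };

// Portable code is written against this alias; the ABI follows from
// whatever the typedef resolves to on the chosen target.
using NativeF32 = Vec<float, kNativeLanes>;

static_assert(sizeof(NativeF32) == kNativeLanes * sizeof(float),
              "no padding expected for this lane count");
```
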
If we're dead-set on having macros, maybe the right way to polyfill SIMD is with macros that expand into scalar operations? |
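One way that macro idea could look, with `SIMD4_ADD` as a made-up name; a real polyfill would presumably be emitted by the toolchain rather than written by hand:

```cpp
#include <cstddef>

// Hypothetical macro polyfill: expands a 4-lane SIMD add into four
// scalar adds. SIMD4_ADD is an invented name for illustration only.
#define SIMD4_ADD(dst, a, b)                         \
    do {                                             \
        for (std::size_t i_ = 0; i_ < 4; ++i_)       \
            (dst)[i_] = (a)[i_] + (b)[i_];           \
    } while (0)
```

An engine with native SIMD could substitute a single instruction for the macro; everywhere else the expansion is still correct, just slower.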
That may solve only a subset of the issue: SIMD doesn't just affect the load/store and operation width; it may also affect the entire algorithm and the data structures used, which will be pretty tricky to expose portably while retaining performance. I don't think the main issue is in the encoding; it's between exposing many widths and deciding what we do otherwise (hard fail, or gracefully split with bad performance). There's also the question of how much ISA specificity we expose: e.g. SIMD.js exposes a function that gives the signbit from a 4xfloat vector, and I think that's just silly. Then there's the question of which model to expose. C++ is pursuing 2 or 3 of them depending on how you look at it, and they're all complementary. SIMD is a performance feature; if it doesn't perform, what's the point? :-)
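For context on the signbit operation mentioned here, a scalar sketch of what such an op computes: pack each lane's sign bit into a small integer mask (the semantics of x86's `movmskps`). The function name is invented:

```cpp
#include <cstdint>
#include <cstring>

// Scalar model of a "sign mask of a 4xfloat" operation: bit i of the
// result is the sign bit of lane i. signMask4 is a made-up name.
int signMask4(const float v[4]) {
    int mask = 0;
    for (int i = 0; i < 4; ++i) {
        std::uint32_t bits;
        std::memcpy(&bits, &v[i], sizeof bits);  // bit-exact view of the float
        mask |= int(bits >> 31) << i;
    }
    return mask;
}
```

The ISA-specificity complaint is that this maps to one instruction on x86 but has no single-instruction equivalent on some other ISAs.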
This also raises an issue that's been troubling me: how do we do feature detection and fallback reliably and efficiently? Here's a strawman: say this wasm executable more or less depends on 4-wide SIMD operations that are currently only available on x64, and say that only Chrome Canary implements that feature set right now, so the executable needs to fall back everywhere else. What steps do we follow to make sure it loads in a single pass on all supported runtimes? I.e. on a browser without wasm support, the polyfill kicks in and loads it immediately. Good. On a related note: if our fallback is to punt to a polyfill, do we do that at the whole-executable level? Or do we do it per function, ending up with mixed modules where some function calls cross the FFI and others don't? (This whole topic probably belongs in its own issue, but SIMD is the one place where I feel we're most likely to hit it.)
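The whole-executable flavor of fallback could look like the following load-time dispatch sketch. `runtime_has_simd4()` is a hypothetical feature-test hook, not a proposed API, and it is hard-wired to report "no SIMD" here so the scalar path runs:

```cpp
// Whole-module fallback sketch: probe once at load time, then commit
// to one kernel set for the life of the module, so no per-call checks
// or FFI crossings remain on the hot path.
inline bool runtime_has_simd4() { return false; }  // assumed probe result

inline void add_scalar(float* d, const float* a, const float* b, int n) {
    for (int i = 0; i < n; ++i) d[i] = a[i] + b[i];
}

// Placeholder: a real build would contain an actual SIMD kernel here.
inline void add_simd(float* d, const float* a, const float* b, int n) {
    add_scalar(d, a, b, n);
}

using AddFn = void (*)(float*, const float*, const float*, int);

inline AddFn select_add() {
    return runtime_has_simd4() ? &add_simd : &add_scalar;
}
```

The per-function alternative would instead mix both kinds of kernels inside one module, which is where the FFI-crossing question comes from.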
@jf: I guess what you call "polyfill" here is the same as what I'm calling "lowered by compilers/etc." :-). And as far as I'm aware, falling back to scalar is only needed when operations are missing, not due to fundamental programming-model differences. @jf: The function which exposes the signbit of a 4xfloat vector has recently been removed from SIMD.js. Is there anything else you think is silly? ;-) @jf and @kg: You may be interested in the SIMD.js Extended API Proposal, which has a decent amount of consensus as the way forward for adding new operations (though not types or programming models) to SIMD.js after the initial release. It isn't a "long vector" API proposal, of course.
I think there are two levels of polyfilling here:
1. the polyfill that runs wasm (including its SIMD ops) as JavaScript on browsers without native wasm support;
2. a native wasm engine executing a SIMD op it knows about but the hardware doesn't support.
The main question is, in (2), what happens if you try to run a SIMD op that is known to the browser but not hardware-optimized. The two obvious options are "throw" or "the engine implements the op as best it can". The latter seems better to me for the case where developers forget to test on a rare configuration that lacks common hardware support and so accidentally assume it. In that case, a user will still be able to run the app on the uncommon hardware (perhaps not even unduly impacted: SIMD is important, but often doesn't dominate the entire computation), which seems like what everyone would want in this case of oversight. Devtools can make it easy for developers to find and fix these issues.
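The two policies can be contrasted in a small sketch. The op, the enum, and the capability flag are all invented for illustration; the flag stands in for a startup hardware probe and is pinned to "unsupported" so both branches are exercised:

```cpp
#include <cmath>
#include <stdexcept>

// Invented names: Policy, hw_has_f32x4_sqrt, f32x4_sqrt.
enum class Policy { Throw, BestEffort };

static bool hw_has_f32x4_sqrt = false;  // assumed: probed at startup

inline void f32x4_sqrt(float v[4], Policy p) {
    if (!hw_has_f32x4_sqrt) {
        if (p == Policy::Throw)
            throw std::runtime_error("f32x4.sqrt not supported here");
        // Best effort: the engine lowers the op to scalar lanes.
        for (int i = 0; i < 4; ++i) v[i] = std::sqrt(v[i]);
        return;
    }
    // Native instruction would be emitted here.
}
```

Under the best-effort policy the app keeps running (just slower on that hardware), which is the behavior argued for above.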
This is meant to address the original concern in WebAssembly/spec#41
I created WebAssembly/spec@04daaa3 to attempt to address the original concern here.
Which is now WebAssembly/spec#57 |
Which is now merged. If anyone has any concerns not addressed in #57, feel free to file a new issue. Of course, if anyone wants to propose a new SIMD API, feel free to file a new issue for that too :). |
We currently suggest that we'll support SIMD.js (RFC).
The C++ standards committee is currently discussing adding explicit SIMD support and auto-vectorization hints to the language, as well as vector execution policies to executors. C++ isn't the only language we want wasm to support, but we should make sure that what we implement can support C++! There may need to be some reconciliation between SIMD.js and C++ for the sake of wasm and not JavaScript.
Here are the recent relevant papers (older ones may also be relevant):