Add SIMD Support #903
I think relying on the compiler vector type is a good solution. Syntax: you need a way to describe a vector type; an idea could be:

Also, vectors are used essentially for arithmetic, so regular arithmetic should work. Important things:
The standard library should also provide SIMD versions of cos, sin, exp, and so on.
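For illustration only (not an actual std API): such a helper could, at worst, fall back to a per-lane scalar loop. This sketch assumes the array <-> vector casts discussed later in this thread, and `vcos4` is a hypothetical name:

```zig
const std = @import("std");

// Hypothetical helper: apply scalar cos to every lane of a vector.
// A real std implementation would presumably lower to hardware
// instructions instead of a scalar loop.
fn vcos4(v: @Vector(4, f32)) @Vector(4, f32) {
    var arr: [4]f32 = v;
    for (arr) |x, i| arr[i] = std.math.cos(x);
    return arr;
}
```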
How about adding operators for arrays? Example:

```zig
const std = @import("std");
const mem = std.mem;

test "simd" {
    var a = [4]i32{ 1, 2, 3, 4 };
    var b = [4]i32{ 5, 6, 7, 8 };
    var c = a + b;
    const expected = [4]i32{ 6, 8, 10, 12 };
    std.debug.assert(mem.eql(i32, c[0..], expected[0..]));
}
```

This would codegen to using vectors in LLVM.
I believe you'll find that using arrays for "SIMD vectors" introduces more problems than solutions, which is why LLVM and GCC went a different way. First, they may have different alignment requirements. Second, those vectors are supposed to end up in a single register, so you might want to codegen differently for a vector versus an array. I also worked on a private DSL where we kept the distinction between vectors and arrays in the type system, and it worked fine as far as I can tell. The vector type also provides useful information during semantic analysis, and you see what you get. Otherwise you have some array magic, which is exactly the kind of thing people want to avoid when switching to a new language, right?
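To make the alignment point concrete, here is a small sketch using the `@Vector` builtin that was added later in this thread; the exact values are target-dependent:

```zig
const std = @import("std");

test "array vs vector alignment" {
    // On x86_64 these typically differ: 4 for the array, 16 for the vector.
    std.debug.print("[4]f32 align: {}\n", .{@alignOf([4]f32)});
    std.debug.print("@Vector(4, f32) align: {}\n", .{@alignOf(@Vector(4, f32))});
}
```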
I think you're right - the simplest thing for everyone is to introduce a new vector primitive type and have it map exactly to the LLVM type.
Also keep in mind the … We should aim to utilise this set of faster functions when we can.
I just stumbled on this. There is a blog post series by a (former?) Intel engineer who designed a compiler for a vectorized language: http://pharr.org/matt/blog/2018/04/18/ispc-origins.html
Dense and interesting articles!
One thing to keep in mind here is that even though you can vectorize scalar code, there are a lot of operations supported by SIMD instructions which you can't express in "normal" scalar code, such as creating bit masks from floating-point comparisons to later use in bitwise operations (often to avoid branches). There are also integer operations which widen to larger integers, and other special stuff. The series of articles linked by @lmb also shows well the difference between code and a compiler designed for SIMD and ones that aren't.
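A sketch of that branch-free pattern, written against the vector semantics discussed later in this thread (a per-lane comparison yielding a bool vector, then per-lane selection); the two-argument `@splat` form matches the syntax used elsewhere in this issue:

```zig
// Clamp negative lanes to zero without any branches: compare all lanes at
// once, then pick per lane from the original vector or from zero.
fn clampToZero(v: @Vector(4, f32)) @Vector(4, f32) {
    const zero: @Vector(4, f32) = @splat(4, @as(f32, 0));
    const keep = v > zero; // @Vector(4, bool)
    return @select(f32, keep, v, zero);
}
```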
See #903

* create with `@Vector(len, ElemType)`
* only wrapping addition is implemented

This feature is far from complete; this is only the beginning.
In the above commit I introduced the …

No mixing vector/scalar support. Instead you will use:

```zig
fn vecMulScalar(v: @Vector(10, i32), x: i32) @Vector(10, i32) {
    return v * @splat(10, x);
}
```
The syntax looks ugly, but if it works as well as the LLVM builtin vectors then it is fine! ;-) Thank you, and don't forget the shuffle vector!
What do you think of v10i32?
A few things:
Please do feel free to propose syntax for a vector type. What's been proposed so far:
This syntax hasn't been rejected; I'm simply avoiding the syntax question until the feature is done, since it's the easiest thing to change at the very end.
Why would you want a vector of pointers? Can you do a vector load from that? Would that even be efficient? Do you want people to do vectorized pointer arithmetic? 🐙 I'd go with the v4f32 style! People will really enjoy writing SIMD with that style. But of course it does not work with templates... :) So it might need a more verbose type declaration indeed.
Mainly, because LLVM IR supports it, and they're usually pretty good about representing what hardware generally supports. We don't automatically do everything LLVM does, but it's a good null hypothesis.
Yes you can, which yields a vector. So for example you could have a vector of 4 pointers to a struct, and then obtain a vector of 4 floats which are their fields:

```zig
const Point = struct { x: f32, y: f32 };

fn multiPointMagnitude(points: @Vector(4, *Point)) @Vector(4, f32) {
    return @sqrt(points.x * points.x + points.y * points.y);
}
```

It's planned for this code to work verbatim once this issue is closed. Not only can you do vector loads and vector stores from vectors of pointers, you can also do …
How are we supposed to initialize a vector? I couldn't find an example in the newest code. Or is this not implemented yet? For example, the following doesn't work:

```zig
test "initialize vector" {
    const V4i32 = @Vector(4, i32);
    var v: V4i32 = []i32{ 0, 1, 2, 3 };
}
```
Your example is planned to work. That's the checkbox above labeled "implicit array to vector cast".
also vectors and arrays now use the same ConstExprVal representation. See #903
@travisstaloch the array <-> vector casts work now. Here's the passing test case: zig/test/stage1/behavior/vector.zig, lines 4 to 19 at 8c6fa98.
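The embedded snippet isn't reproduced here; a minimal sketch of what such a round-trip test could look like (not the actual contents of that file):

```zig
const std = @import("std");

test "array <-> vector casts" {
    var arr = [4]i32{ 1, 2, 3, 4 };
    var vec: @Vector(4, i32) = arr; // implicit array -> vector cast
    var back: [4]i32 = vec;         // implicit vector -> array cast
    std.debug.assert(std.mem.eql(i32, back[0..], arr[0..]));
}
```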
also fix vector behavior tests; they weren't actually testing runtime vectors, but now they are. See #903
Oh cool! I never realized that @select could be used on vectors.

> http://llvm.org/docs/LangRef.html#select-instruction
> @Shuffle requires the mask to be comptime, which means the branch must be known at comptime. @select will allow this to be done at runtime.
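To illustrate the distinction, here is a rough sketch (using the array-to-vector initialization discussed earlier in this thread; not taken from the actual test suite):

```zig
test "shuffle vs select" {
    const a: @Vector(4, i32) = [4]i32{ 1, 2, 3, 4 };
    const b: @Vector(4, i32) = [4]i32{ 5, 6, 7, 8 };

    // @shuffle: the mask must be comptime-known. Index n picks lane n from
    // `a`; ~n picks lane n from `b`.
    const mask: @Vector(4, i32) = [4]i32{ 0, ~@as(i32, 0), 1, ~@as(i32, 1) };
    const shuffled = @shuffle(i32, a, b, mask);

    // @select: the per-lane predicate can be a runtime value.
    var pred: @Vector(4, bool) = [4]bool{ true, false, true, false };
    const selected = @select(i32, pred, a, b);

    _ = shuffled;
    _ = selected;
}
```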
https://zig.godbolt.org/z/9nYcn4

```zig
const std = @import("std");

pub fn main() void {
    const v: i32 = 1;
    const a: @Vector(4, i32) = @splat(4, v);
    // These fail due to unexpected types
    const b = @intCast(@Vector(4, i64), a);
    const c = @as(@Vector(4, i64), a);
}
```
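In the meantime, one possible workaround is an element-wise conversion through arrays (a sketch that assumes the array <-> vector casts shown above; `widen4` is illustrative, not a proposed API):

```zig
fn widen4(a: @Vector(4, i32)) @Vector(4, i64) {
    const arr: [4]i32 = a;
    var out: [4]i64 = undefined;
    // Each i32 lane coerces to i64 when assigned.
    for (arr) |x, i| out[i] = x;
    return out;
}
```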
Have there already been thoughts here about runtime switching between CPU SIMD feature sets? I.e. instead of compiling for a single instruction set (AVX2, AVX-512, SSE3, SSSE3, etc.), allowing compiling for multiple and, at runtime, choosing a branch that uses the latest and/or most efficient supported instruction set where reasonable?
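Nothing in this issue settles that; one pattern that could be layered on top in user code is runtime dispatch to separately compiled kernels. In this sketch, `cpuSupportsAvx2` is a hypothetical placeholder for a real cpuid/OS feature probe:

```zig
fn cpuSupportsAvx2() bool {
    // Placeholder: a real implementation would query cpuid (or an OS API).
    return false;
}

fn sumScalar(xs: []const f32) f32 {
    var total: f32 = 0;
    for (xs) |x| total += x;
    return total;
}

fn sumAvx2(xs: []const f32) f32 {
    // Stand-in: an AVX2 build of the same kernel, e.g. compiled in a
    // separate object with the avx2 feature enabled.
    return sumScalar(xs);
}

pub fn sum(xs: []const f32) f32 {
    // Choose the widest implementation the running CPU supports.
    return if (cpuSupportsAvx2()) sumAvx2(xs) else sumScalar(xs);
}
```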
@Vector(N, bool) doesn't have …
You can …
An interesting test for such an API would be whether one can implement useful artefacts beyond number crunching... like high-speed UTF-8 validation or base64 encoding/decoding.
A related issue is that instruction sets are evolving. For example, the latest AWS Graviton nodes support SVE/SVE2. The most powerful AWS nodes support a full range of AVX-512 instruction sets (up to VBMI2). If you build something that is unable to benefit from SVE2 or advanced AVX-512 instructions, then you might not be future-proof.
I agree emphatically with @lemire's comment above. Even for current fixed-pattern byte shuffling with … The "correct" output for this function would be more like this: …
It gets a lot better with …
Current Progress
SIMD is very useful for fast processing of data, and given Zig's goal of going fast, I think we need to look at exposing some way of using these instructions easily and reliably.
Status-Quo
Inline Assembly
It is possible to do SIMD in inline assembly as-is. This is a bit cumbersome though, and I think we should strive to be able to get this kind of performance in the Zig language itself.
Rely on the Optimizer
The optimizer is good, and comptime unrolling support helps a lot, but it doesn't provide guarantees that any specific code will be vectorized. You are at the mercy of LLVM, and you don't want to see your code take a huge performance hit simply due to a compiler upgrade/change.
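For instance, a loop like the following (a generic sketch, not from this issue) is usually auto-vectorized by LLVM in release modes, but nothing in the source guarantees it stays vectorized across compiler versions:

```zig
fn addArrays(dst: []f32, a: []const f32, b: []const f32) void {
    // LLVM may turn this into SIMD adds, or may not; the language makes no promise.
    for (a) |x, i| dst[i] = x + b[i];
}
```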
LLVM Vector Intrinsics
LLVM supports vector types as first-class objects in its IR. These correspond to SIMD instructions. This provides the bulk of the work; for us, we simply need to expose a way to construct these vector types. This would be analogous to the `__attribute__((vector_size(N)))` extension found in C compilers.

If anyone has any thoughts on the implementation and/or usage, that would be great, since I'm not very familiar with how these are exposed by LLVM. It would be great to get some discussion going in this area, since I'm sure people would like to be able to match the performance of C in all areas with Zig.