C like SIMD intrinsics for Rust #1639
Comments
Could you explain more about why the |
I am unsure what is being requested here, because Is the request that repr_simd be stabilized? Is the request that llvmint be included? (If so, why? It's a line away with Cargo.) Is the request for better documentation? Is the request to rename "simd" and "llvmint" to corresponding names in C so that they are more familiar? |
Whatever is needed to use llvmint using Rust Stable. |
I have not looked deeper, but many functionalities available in C are missing there. To name a few stuff like prefetching (
No; because of the given reasons up there.
Kinda; SIMD support shouldn't be considered eye candy when advertising Rust as a high-performance systems programming language. It's not cool at all to see such an essential feature being handled by a single person as a hobby.
Not really. It would be really appreciated if you guys used exactly the same naming convention as the C intrinsics or the assembly ones to reduce confusion. Besides, they are already documented very well in Intel's software manuals and all over the Internet, so there is no need to redo that. (for example: https://software.intel.com/sites/landingpage/IntrinsicsGuide/)
See above. |
I agree with this. Intel's names may not be the best, but using "better" names only makes things worse, IMO. I would like to have the exact-same names, and same better names for the generic abstraction on top. |
Thanks for the replies. The initial post was quite light on details about what "C like SIMD" entails, but the replies were illuminating. The following is my understanding; please correct me if I misunderstood.
First, SIMD should be stable. I actually think everyone agrees. The reason it is not yet stable is that people disagree on the rest.
Second, SIMD should use Intel naming conventions. Note that these are not C naming conventions. As I understand it, there are three different naming conventions: Intel, GCC, LLVM. Rust llvmint mechanically follows LLVM. (Rust did not invent a new naming scheme.) The suggestion is to mechanically follow Intel instead. Presumably, it is also suggested to follow ARM etc. for other platforms. The reason is that this allows reuse of existing documentation. I completely agree here and the reason is clear. Here is an example with movntdqa: Intel: _mm_stream_load_si128. GCC: __builtin_ia32_movntdqa. LLVM: llvm.x86.sse41.movntdqa. Rust llvmint: sse41_movntdqa. I note that Intel deviates most from the actual assembly name here.
Third, the parts of SIMD support currently maintained as an external library should be brought into the main Rust distribution. I am still unclear why, but it seems to me that it is hoped inclusion in the main distribution will bring more contributions. I agree that may be the case. Should it be part of std, or under rust-lang-nursery, eventually moving to rust-lang like regex?
I am still unclear why current SIMD support shouldn't be stabilized. RFC 1199 defined the feature gate repr_simd for types, and platform_intrinsics for operations. It is true some operations are missing, but operations can be added incrementally; they are not breaking changes. So missing operations can't be the reason not to stabilize. I don't think any types are missing. I note that movntdqa and lddqu are in llvmint.
For prefetchnta, llvmint only includes architecture generic form, not Intel architecture specific form, but architecture generic form is superset of Intel architecture specific form. Specifically, prefetchnta = _mm_prefetch(p, i) = prefetch(p, 0, i, 1). 0 is for read (1 is write), 1 is for data (0 is instruction). See http://llvm.org/docs/LangRef.html#llvm-prefetch-intrinsic for LLVM documentation. |
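The mapping described in this comment can be written down as a small function. This is only a sketch of the argument translation, not a real prefetch: `GenericPrefetch` and `mm_prefetch_args` are hypothetical names invented here, and the hint constants are assumed to carry the values from Intel's `xmmintrin.h` (NTA = 0, T0 = 3).

```rust
/// Arguments of the generic `prefetch(p, rw, locality, cache_type)` form
/// described above, minus the pointer. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct GenericPrefetch {
    rw: i32,         // 0 = read, 1 = write
    locality: i32,   // 0 (no temporal locality) ..= 3 (highest locality)
    cache_type: i32, // 0 = instruction, 1 = data
}

// Hint values assumed to match Intel's xmmintrin.h.
const MM_HINT_NTA: i32 = 0;
const MM_HINT_T0: i32 = 3;

/// Hypothetical helper: translate the hint `i` of `_mm_prefetch(p, i)`
/// into the arguments of the generic form, i.e. `prefetch(p, 0, i, 1)`.
fn mm_prefetch_args(hint: i32) -> GenericPrefetch {
    assert!((0..=3).contains(&hint), "locality hint must be in 0..=3");
    GenericPrefetch {
        rw: 0,          // `_mm_prefetch` is a read prefetch
        locality: hint, // the Intel hint is passed through unchanged
        cache_type: 1,  // and it targets the data cache
    }
}
```

With these assumptions, `mm_prefetch_args(MM_HINT_NTA)` yields the `(0, 0, 1)` triple that corresponds to prefetchnta.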
One reason I wouldn't stabilize is shuffles that don't use generic constant parameters. The main reason, however, is that there's no type-safety, no bounds on the operations. |
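To illustrate what a shuffle with constant parameters could look like, here is a minimal sketch. It assumes const generic integer parameters, which did not exist when this comment was written, and it uses a plain array wrapper (`F32x4`, a hypothetical name) instead of a real `#[repr(simd)]` type, so it shows the shape of the API rather than an actual SIMD implementation.

```rust
/// Stand-in for a 4-lane vector; a real one would be a `#[repr(simd)]` type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct F32x4([f32; 4]);

impl F32x4 {
    /// Indices 0..=3 select lanes of `self`, 4..=7 select lanes of `other`.
    /// The indices are part of the type-level signature, not runtime values.
    fn shuffle<const I0: usize, const I1: usize, const I2: usize, const I3: usize>(
        self,
        other: F32x4,
    ) -> F32x4 {
        // Concatenate both inputs so that one index space covers them.
        let mut lanes = [0.0f32; 8];
        lanes[..4].copy_from_slice(&self.0);
        lanes[4..].copy_from_slice(&other.0);
        // An index above 7 panics here; a production API would instead
        // reject it at compile time.
        F32x4([lanes[I0], lanes[I1], lanes[I2], lanes[I3]])
    }
}
```

A call such as `a.shuffle::<0, 4, 1, 5>(b)` interleaves the low halves of `a` and `b`.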
By the way, Rust follows LLVM naming because Rust ultimately compiles to LLVM. While I think rationale for Intel naming is strong, it does mean Intel naming to LLVM naming mapping has to be maintained by Rust. |
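The mapping that would have to be maintained can be pictured as a lookup table. The sketch below holds only the single movntdqa row quoted earlier in the thread; the `IntrinsicNames` struct and the `llvm_name_for` helper are hypothetical and are not part of llvmint or the compiler.

```rust
/// One intrinsic under the naming schemes mentioned in this thread.
#[allow(dead_code)]
struct IntrinsicNames {
    intel: &'static str,
    gcc: &'static str,
    llvm: &'static str,
    llvmint: &'static str,
}

/// Only the example from the thread; a real table would have one row
/// per intrinsic and per platform.
const NAME_TABLE: &[IntrinsicNames] = &[IntrinsicNames {
    intel: "_mm_stream_load_si128",
    gcc: "__builtin_ia32_movntdqa",
    llvm: "llvm.x86.sse41.movntdqa",
    llvmint: "sse41_movntdqa",
}];

/// Hypothetical helper: find the LLVM name for an Intel intrinsic name.
fn llvm_name_for(intel: &str) -> Option<&'static str> {
    NAME_TABLE
        .iter()
        .find(|names| names.intel == intel)
        .map(|names| names.llvm)
}
```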
@eddyb, I am aware of those problems. But OP said SIMD shouldn't be stabilized because prefetchnta, movntdqa, lddqu are missing, which is a non sequitur. First, they aren't missing; second, even if they were missing, that's no reason not to stabilize. |
@sanxiyn Oh, I see. Sorry for misunderstanding. |
Duplicate of #280 |
@sanxiyn Thanks for clearing things up! I do agree with the points you made; the current SIMD proposal (RFC #1199) + missing features from llvmint and following platform-specific (Intel/ARM/...) naming conventions sounds perfect to me. Is it possible to give SIMD support stabilization a little higher priority, so we can get our hands on it in maybe less than 6 months? Pleeaassee! :) |
@siavashserver Please be aware that we get a lot of requests for "higher priority". The best way to bring these features to a conclusion is to find people competent with SIMD and Rust and maybe help out with code submissions by implementing the RFC proposal. I'm aware that this isn't always possible. There are many things that are disappointing because they are missing, but engineering time is the biggest problem here and no one is served by a mediocre stable feature. |
@skade FWIW Work is underway for const generic parameters (MIR is central to supporting them), but it might be a while before we have anything tangible, the guts of the compiler have to change significantly. |
In other words, the language features that a better SIMD is blocked on are relatively high-priority; they are just quite non-trivial. The changes to intrinsics naming and missing intrinsics (if any) can be done in parallel, if an RFC is proposed to use some convention which predates LLVM. The implementation that governs those is actually a Python script and some JSON files, so the contribution barrier should be low. |
Sorry folks, I'm a bit confused. Do you mean that we can't see SIMD support getting sorted out till 2017 because of a lack of manpower and its complexity, and because there are more important things to deal with in the first place? |
@siavashserver Our ideal view of stable SIMD involves adding support for numeric type parameters, which itself demands massive changes and expansion to how we handle compile-time evaluation (see also https://github.com/solson/miri , a new Rust interpreter(!!!) that we'll likely be adding to the compiler), all of which is blocked on our middle-end's ongoing massive overhaul, a.k.a. MIR, whose last major blocker was just cleared hours ago (see rust-lang/rust#33622 ). We've got tons of people working on these things (though more help wouldn't hurt!), but these are quite enormous efforts with lots of design and iteration required, and meanwhile there are lots of other initiatives vying for priority (e.g. MIR landing will unblock not just CTFE, but also incremental recompilation and non-lexical borrows, both of which will likely be prioritized higher than CTFE by the official Rust devs (which doesn't mean that CTFE won't see attention, just that it might require community support to accelerate its development in the short-term)). There are too many factors to give an accurate time estimate, but given the blockers and the number of other things on our plate I wouldn't expect stable SIMD to land any sooner than 2017, yes. |
@bstrie Thank you very much for clarification! |
Porting clang's It would probably be best to start off with an independent "immintrin" crate based on llvmint and simd. As far as I know, there isn't any actual missing functionality from those crates; they just don't expose the same names as the Intel headers. The resulting crate could then be imported into the standard library with only minor changes after going through the RFC process. If anyone wants to pursue this, I can answer any questions; I have experience with the vector intrinsics in clang and LLVM. |
Good point. In fact, llvmint exposes lots of stuff that isn't SIMD that's useful.
Also a good point. In order for the Intel names to make sense, everything (including the types) would have to be named the Intel way. At least in the example above, the LLVM/Rust name is clear because it's the name of the instruction. I would be very happy to have llvmint working on Stable Rust as-is. |
We will never stabilize something LLVM-specific like that, not without an abstraction layer on top. |
FWIW, it doesn't seem necessary to block all of SIMD on an issue (immediate operands) which only affects a small fraction of intrinsics (though, for the record, more than a handful). Omitting those intrinsics from an initial stabilization would be a bit weird, but not the end of the world. A potential alternative is stabilizing inline assembly, so someone could create an unofficial SIMD crate that simulates intrinsics using that. They would then be free to use hacks like faking integer generics using array types or whatever. |
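One way to picture the "faking integer generics using array types" hack is the sketch below. It is an assumption about what such a crate might do, not something proposed in the thread: the `Imm` trait, the `impl_imm!` macro and `shift_left_imm` are hypothetical names, and plain arrays stand in for SIMD vectors.

```rust
/// Hypothetical trait: a type that stands for a compile-time integer.
trait Imm {
    const VALUE: u32;
}

/// The array length carries the integer; `()` elements keep the type zero-sized.
macro_rules! impl_imm {
    ($($n:literal),*) => {
        $(impl Imm for [(); $n] { const VALUE: u32 = $n; })*
    };
}

// Only a handful of immediates are covered in this sketch.
impl_imm!(0, 1, 2, 3, 4, 5, 6, 7);

/// Shift every lane left by an "immediate" chosen through the type parameter,
/// so the shift amount is known at compile time.
fn shift_left_imm<I: Imm>(v: [u32; 4]) -> [u32; 4] {
    v.map(|lane| lane << I::VALUE)
}
```

A caller would write `shift_left_imm::<[(); 3]>(v)`, with the array length playing the role of the immediate operand.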
There is a lot of focus here on stabilization -- but I am wondering how crucial it is that things be stable versus available in nightly builds? |
(In other words, I got the impression that none of the existing crates were exporting the full functionality that was desired, but I'm not totally sure about that.) |
@nikomatsakis I'm not sure what you're proposing. A highly-desired feature available only via an unstable interface will become de facto standardized if it's in nightly for long enough, especially since nightly is infectious (for want of an unstable feature, the lib was nightly; for want of a nightly lib, the app was nightly). And the similar situation we're in now with syntax extensions being so long unstable is seen as universally undesirable (but at least we can break syntax extensions without forcing downstream consumers of syntax extensions to have to rewrite their code, a de jure unstable/de facto stable SIMD interface could lock us in forever (or at least force us to support a deprecated interface forever)). |
Could you expand a bit more on this?
Do you have any suggestion for an alternative? |
We've been hashing this out a bit on IRC, and two things came out of it.
trait Vector {
    const ELEMS: usize;
    fn add(self, other: Self) -> Self;
    fn mul(self, other: Self) -> Self;
    fn shuffle<const I: [usize; Self::ELEMS]>(self, other: Self) -> Self;
    /* more general and per-platform functions */
}
// Everything is intentionally self-recursive below.
#[intrinsic="simd"]
impl<V: Vector> Vector for V {
    #[intrinsic="simd_elems"]
    const ELEMS: usize = V::ELEMS;
    #[intrinsic="simd_add"]
    fn add(self, other: Self) -> Self { self.add(other) }
    #[intrinsic="simd_mul"]
    fn mul(self, other: Self) -> Self { self.mul(other) }
    #[intrinsic="simd_shuffle"]
    fn shuffle<const I: [usize; V::ELEMS]>(self, other: V) -> V {
        self.shuffle::<I>(other)
    }
    ...
}
Such a scheme could preserve all of the genericity of the current "platform intrinsics"; it would just be a bounded interface that accepts only SIMD types and also allows other crates to build more interesting generic abstractions on top of the
Secondly, there is the question of whether being so generic is necessary, or good, especially with all of the platform-specific intrinsics, and @BurntSushi gave an example of
#[repr(simd)]
struct Simd128(...);
#[intrinsic="x86_pcmpestri128"]
fn _mm_cmpestri(a: Simd128, la: i32, b: Simd128, lb: i32, imm8: i32) -> i32 {
    _mm_cmpestri(a, la, b, lb, imm8)
}
Monomorphic platform intrinsics could still be adapted by a third-party crate to be used with arbitrary SIMD types by transmuting first, which would work especially well if that crate handled defining the SIMD types (such as with a macro), at which point the crate could expose only a relevant subset of the APIs.
I am torn, and I probably need to go back and re-read the discussion on the platform intrinsics RFC to find arguments against either of the fully-generics or the specific-types options, which both seem viable. EDIT: Remove usage of |
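The "transmute first" adaptation could look roughly like the following sketch. Everything in it is an assumption made for illustration: `Simd128` is a plain 16-byte struct rather than a `#[repr(simd)]` type, `raw_add_bytes` is an ordinary function standing in for a monomorphic platform intrinsic, and the `simd_type!` macro is a hypothetical example of a crate defining its own vector types.

```rust
/// Stand-in for the single 128-bit type a monomorphic intrinsic would accept.
#[derive(Clone, Copy)]
#[repr(C, align(16))]
struct Simd128([u8; 16]);

/// Stand-in for a monomorphic platform intrinsic: lane-wise wrapping byte add.
fn raw_add_bytes(a: Simd128, b: Simd128) -> Simd128 {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = a.0[i].wrapping_add(b.0[i]);
    }
    Simd128(out)
}

/// Defines a typed 128-bit vector and adapts the raw operation to it.
macro_rules! simd_type {
    ($name:ident, $elem:ty, $lanes:expr) => {
        #[derive(Clone, Copy, Debug, PartialEq)]
        #[repr(C, align(16))]
        struct $name([$elem; $lanes]);

        impl $name {
            fn to_raw(self) -> Simd128 {
                // Both types are 16 bytes of plain integers; `transmute`
                // refuses to compile if the sizes ever differ.
                unsafe { std::mem::transmute(self) }
            }
            fn from_raw(raw: Simd128) -> Self {
                unsafe { std::mem::transmute(raw) }
            }
            /// Typed wrapper: transmute in, call the raw operation, transmute out.
            fn add(self, other: Self) -> Self {
                Self::from_raw(raw_add_bytes(self.to_raw(), other.to_raw()))
            }
        }
    };
}

simd_type!(U8x16, u8, 16);
simd_type!(I8x16, i8, 16);
```

Because the wrapping crate owns the typed vectors, it decides which raw operations each type exposes, which is the "relevant subset" idea from the comment.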
@eddyb Just to be more explicit, I think your comment implies that we'd have to stabilize things like |
@BurntSushi Yes, I don't see how it could be used from Rust stable if we don't... stabilize it. Unless I'm missing something fundamental here. |
The immintrin crate now officially exists! https://crates.io/crates/immintrin / https://github.com/eefriedman/rust-immintrin . I'm not sure exactly what possessed me to spend a day working on this, but hopefully it's useful. |
On the Intel Intrinsics Guide (https://software.intel.com/sites/landingpage/IntrinsicsGuide/): they actually source the intrinsics data in XML from https://software.intel.com/sites/landingpage/IntrinsicsGuide/files/data-3.3.14.xml with records like
<intrinsic tech="SSE3" vexEq="TRUE" rettype="__m128d" name="_mm_addsub_pd">
<type>Floating Point</type>
<CPUID>SSE3</CPUID>
<category>Arithmetic</category>
<parameter varname="a" type="__m128d"/>
<parameter varname="b" type="__m128d"/>
<description>Alternatively add and subtract packed double-precision (64-bit) floating-point elements in "a" to/from packed elements in "b", and store the results in "dst".</description>
<operation>
FOR j := 0 to 1
i := j*64
IF (j is even)
dst[i+63:i] := a[i+63:i] - b[i+63:i]
ELSE
dst[i+63:i] := a[i+63:i] + b[i+63:i]
FI
ENDFOR
</operation>
<instruction name="addsubpd" form="xmm, xmm"/>
<perfdata arch="Haswell" lat="3" tpt="1"/>
<perfdata arch="Ivy Bridge" lat="3" tpt="1"/>
<perfdata arch="Sandy Bridge" lat="3" tpt="1"/>
<perfdata arch="Westmere" lat="3" tpt="1"/>
<perfdata arch="Nehalem" lat="3" tpt="1"/>
<header>pmmintrin.h</header>
</intrinsic> |
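The `<operation>` pseudocode in this record translates almost line for line into scalar Rust. The sketch below uses a plain `[f64; 2]` in place of `__m128d` and a hypothetical function name, so it models what the record describes rather than calling the actual intrinsic.

```rust
/// Scalar model of the `_mm_addsub_pd` record above:
/// even lanes compute a - b, odd lanes compute a + b.
fn addsub_pd(a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    let mut dst = [0.0f64; 2];
    for j in 0..2 {
        // Each lane j covers bits [j*64 + 63 : j*64] in the pseudocode.
        dst[j] = if j % 2 == 0 { a[j] - b[j] } else { a[j] + b[j] };
    }
    dst
}
```

For example, `addsub_pd([1.0, 2.0], [0.5, 0.25])` subtracts in lane 0 and adds in lane 1.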
Hi, I'm also making a similar one (https://crates.io/crates/x86intrin). When I started creating my library privately one month ago, I knew about this thread, but I didn't know someone else was making a similar one. So I was a bit surprised when I found the similar crate. Anyway, I currently depend on mine to try to port my app to Rust. I'm not sure it's good that multiple similar libraries exist, but it's good that more people are using and testing Rust's SIMD implementation. |
If useful features are de facto "forever" in nightly only, then to use the useful things, one has to use nightly and stop paying attention to stable. If enough people do, stable becomes pointless and the features that are theoretically subject-to-change can't really be changed without breaking too many consumers. More concretely, if a feature I want to use in Firefox is in nightly Rust only, the obvious options are advocating making the feature available in stable Rust or advocating Firefox moving to nightly Rust.
For most of the As for the concrete present, since the As for an abstraction layer beyond trivial name substitution, a SIMD API as part of the standard library, which is allowed to use compiler intrinsics internally, has failed to materialize. As long as Moreover, there are things that don't fit a safe SIMD crate. In the SIMD domain, manual choice between fast but aligned access vs. slower but unaligned access isn't a good fit for a safe API. Furthermore, there are intrinsics that are outside the SIMD domain, e.g. AES and GCM-related instructions. I think the practical way forward is to allow the Concretely, I'm writing a crate that is supposed to replace a C++ component in Gecko. The pre-existing C++ component accelerates a central operation using SSE2 intrinsics. This leaves the following options:
(As for the non-ISA-specific LLVM intrinsics, leaving them out but allowing the ISA-specific ones would make things easier for future non-LLVM back ends but would favor writing ISA-specific code where ISA-independent code would be possible, which is probably a sadder near-term outcome than making future non-LLVM back ends replicate LLVM's cross-ISA intrinsics.) |
For anyone following this issue, we have a (very) large ongoing thread about a path to resolving this specific issue: https://internals.rust-lang.org/t/getting-explicit-simd-on-stable-rust/4380 |
Closing in favor of #2325 |
One of the important features missing from Rust for me and others involved in game development, DSP, video encoding, etc. is proper SIMD support.
It's disappointing to see that such an important feature is still not a core feature of Rust, or at least receiving more love from developers, after waiting since Rust's early days.
The current means of writing SIMD code in Rust, like https://crates.io/crates/simd, which try to provide a unified interface for different archs and abstract things away, are way suboptimal. I like the idea of https://crates.io/crates/llvmint though.
Please give us m128, m128i, m128d, etc. data types and access to whatever is available inside xmmintrin.h, etc. Let us deal with platform abstractions and safety stuff ourselves. Thanks!