Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

linux-aarch64 doesn't seem to have -march=native either #95

Closed

Conversation

SvenDowideit
Copy link

@SvenDowideit SvenDowideit commented Mar 12, 2022

closes #94

and yep, it lets me build solana on my overdrive box.

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me

@nazar-pc nazar-pc requested a review from mvines March 12, 2022 22:37
@mvines
Copy link
Collaborator

mvines commented Mar 13, 2022

Hmm, my rpi4 builds fine with -march=native:

$ git rev-parse HEAD
2e134f6de13ca672cf185556c020ab997221b300
$ uname -a
Linux lychee 5.3.0-1042-raspi2 #44-Ubuntu SMP Thu Jul 15 11:27:19 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
$ cargo build --features simd-accel
   Compiling libc v0.2.119
   Compiling cfg-if v1.0.0
   Compiling parking_lot_core v0.8.5
   Compiling scopeguard v1.1.0
   Compiling libm v0.2.2
   Compiling smallvec v1.8.0
   Compiling cc v1.0.73
   Compiling spin v0.9.2
   Compiling instant v0.1.12
   Compiling lock_api v0.4.6
   Compiling reed-solomon-erasure v5.0.1 (/home/mvines/reed-solomon-erasure)
   Compiling parking_lot v0.11.2
    Finished dev [unoptimized + debuginfo] target(s) in 19.49s

This feels a little clunky to disable -mach=native for an entire arch due to a couple broken implementations. Maybe we can test for -mach=native support in build.rs?

@mvines
Copy link
Collaborator

mvines commented Mar 13, 2022

Using is_like_clang() looked promising to filter out -march=native, but clang on M1's report to be is_like_gnu() and are not is_like_clang(). 🙃

@mvines
Copy link
Collaborator

mvines commented Mar 13, 2022

How about something like this in build.rs?

diff --git a/build.rs b/build.rs
index 431f5fb..7bf7016 100644
--- a/build.rs
+++ b/build.rs
@@ -158,12 +158,7 @@ fn write_tables() {
     not(any(target_os = "android", target_os = "ios"))
 ))]
 fn compile_simd_c() {
-    let mut build = cc::Build::new();
-    build.opt_level(3);
-
-    // `-march=native` is not supported on Apple M1
-    #[cfg(not(all(target_os = "macos", target_arch = "aarch64")))]
-    {
+    let flag_arch = {
         fn guess_target_cpu() -> String {
             // Copied and adapted from https://github.com/alexcrichton/proc-macro2/blob/4173a21dc497c67326095e438ff989cc63cd9279/build.rs#L115
             // Licensed under Apache-2.0 + MIT (compatible because we're MIT)
@@ -181,7 +176,37 @@ fn compile_simd_c() {
 
             "native".to_string()
         }
-        build.flag(&format!("-march={}", guess_target_cpu()));
+        format!("-march={}", guess_target_cpu())
+    };
+
+    let build_arch_test_c =
+        std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap()).join("build-arch-test.c");
+
+    std::fs::OpenOptions::new()
+        .create(true)
+        .write(true)
+        .open(&build_arch_test_c)
+        .unwrap();
+
+    let mut build = cc::Build::new();
+    build.opt_level(3).flag("-std=c11").file(&build_arch_test_c);
+
+    let mut build_with_arch = build.clone();
+    build_with_arch.flag(&flag_arch);
+
+    // Test if the `-march` flag works for this compiler
+    let use_build_arch = if build_with_arch.try_compile("build-arch-test").is_ok() {
+        true
+    } else if build.try_compile("build-arch-test").is_ok() {
+        false
+    } else {
+        panic!("Unable to compile test file");
+    };
+
+    let mut build = cc::Build::new();
+    build.opt_level(3);
+    if use_build_arch {
+        build.flag(&flag_arch);
     }
     build
         .flag("-std=c11")

@nazar-pc
Copy link
Member

I'm wondering how much does it actually help in terms of performance. Maybe we can place it under feature flag or remove instead?

@SvenDowideit
Copy link
Author

thanks for the opportunity to learn more :)

I really wonder if there's a reason why native doesn't work - and if this really something that (long term) should be resolved further up the chain. (aka, why are there platforms without a working native?)

@mvines
Copy link
Collaborator

mvines commented Mar 14, 2022

I'm wondering how much does it actually help in terms of performance. Maybe we can place it under feature flag or remove instead?

I'm not sure how the codegen differs on aarch64 with/without -march=native. I'd definitely be nervous removing it generically, x86_64 specifically.

I'm not overly opposed to this PR as is, the simplicity of it is nice.

aarch64 for my usage is development only, so potentially less optimal code for this architecture wouldn't bother me, we only use x86_86 in production currently.

I really wonder if there's a reason why native doesn't work - and if this really something that (long term) should be resolved further up the chain. (aka, why are there platforms without a working native?)

Seems that this is ultimately just a clang limitation on some aarch64 platforms, perhaps newer clangs have resolved this already though.

@nazar-pc
Copy link
Member

I'd definitely be nervous removing it generically, x86_64 specifically.

Why? Did you benchmark that it is significantly slower without that?

@mvines
Copy link
Collaborator

mvines commented Mar 14, 2022

No, because I didn’t benchmark it so I don’t know!

@nazar-pc
Copy link
Member

OK, I did some quick testing. Removing native on x86-64 decreases performance on my Zen3-based mobile 5900HX from ~14GB/s to ~234MB/s, which is terrible.

Though using native means that the code is no longer as portable as typically expected. broadwell instead of native was sufficient to get the full performance back (didn't test many others though).

Since there is code that already checks RUSTFLAGS, I tend to think that the library shouldn't enforce native by default on any platform. It might be an optional feature, but that is about it.

Every app developer has to decide which flags they want and since performance is so different, if we were to remove this behavior, new major version would be needed and a section in readme about optimizations users might want to apply to get the best performance.

@nazar-pc
Copy link
Member

nazar-pc commented Apr 5, 2022

@mvines let's go with the test approach you suggested above for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

-march=native is not supported on aarch64?
3 participants