Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generates illegal instructions #500

Closed
baskerville opened this issue Jun 26, 2018 · 17 comments
Closed

Generates illegal instructions #500

baskerville opened this issue Jun 26, 2018 · 17 comments

Comments

@baskerville
Copy link

baskerville commented Jun 26, 2018

I get a systematic crash when I run cargo-edit 0.3.0 on an Intel Core 2 Duo E7600 CPU.

The lldb backtrace seems to indicate that SIMD is involved:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00000001002e8db6 cargo-add`core::coresimd::x86::xsave::_xgetbv::h7da7479f591ed348 at xsave.rs:95 [opt]
    frame #1: 0x00000001002eaaeb cargo-add`std::stdsimd::arch::detect::os::detect_features::hcd1961e6f91cf102 at x86.rs:157 [opt]
    frame #2: 0x00000001002ea94a cargo-add`std::stdsimd::arch::detect::os::check_for::h494500e4a19a5d3c [inlined] core::ops::function::FnOnce::call_once::h731b82db5225b805 at function.rs:223 [opt]
    frame #3: 0x00000001002ea945 cargo-add`std::stdsimd::arch::detect::os::check_for::h494500e4a19a5d3c [inlined] std::stdsimd::arch::detect::cache::test::h9f743f59f6a54bc9 at cache.rs:149 [opt]
    frame #4: 0x00000001002ea938 cargo-add`std::stdsimd::arch::detect::os::check_for::h494500e4a19a5d3c at x86.rs:16 [opt]
    frame #5: 0x000000010027b854 cargo-add`regex::literal::Matcher::new::h9088e543805253af + 324
    frame #6: 0x000000010027a931 cargo-add`regex::literal::LiteralSearcher::prefixes::h2920ca55b2b79fe1 + 641
    frame #7: 0x0000000100269e5e cargo-add`regex::exec::ExecBuilder::build::h72257c20c98ecf31 + 6510
    frame #8: 0x0000000100266762 cargo-add`regex::re_builder::unicode::RegexBuilder::build::h12d55fa7c52e206b + 210
    frame #9: 0x000000010029ad4b cargo-add`regex::re_unicode::Regex::new::h134009fc29889184 + 219
    frame #10: 0x00000001002470cc cargo-add`std::sync::once::Once::call_once::_$u7b$$u7b$closure$u7d$$u7d$::h17d5ef8a5bc6025f + 60
    frame #11: 0x00000001002f3a2e cargo-add`std::sync::once::Once::call_inner::h897ebf125b2dcf3e at once.rs:340 [opt]
    frame #12: 0x0000000100234f35 cargo-add`docopt::parse::Parser::new::h29e507fb9dde54d3 + 565
    frame #13: 0x000000010000561c cargo-add`cargo_add::main::h12221ffe12ffa004 + 44
    frame #14: 0x0000000100002dd6 cargo-add`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hc5b18510be8e533e + 6
    frame #15: 0x00000001002f6b28 cargo-add`std::panicking::try::do_call::h780530fa434ed2ed [inlined] std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::hcfde3e1c19e6fa13 at rt.rs:59 [opt]
    frame #16: 0x00000001002f6b1c cargo-add`std::panicking::try::do_call::h780530fa434ed2ed at panicking.rs:310 [opt]
    frame #17: 0x00000001003028ff cargo-add`__rust_maybe_catch_panic at lib.rs:105 [opt]
    frame #18: 0x00000001002ea7a2 cargo-add`std::rt::lang_start_internal::h46f9f692ece7505b [inlined] std::panicking::try::h0c5a2273ff0725da at panicking.rs:289 [opt]
    frame #19: 0x00000001002ea76f cargo-add`std::rt::lang_start_internal::h46f9f692ece7505b [inlined] std::panic::catch_unwind::h3474c6e3bb49278a at panic.rs:374 [opt]
    frame #20: 0x00000001002ea76f cargo-add`std::rt::lang_start_internal::h46f9f692ece7505b at rt.rs:58 [opt]
    frame #21: 0x00000001000062fc cargo-add`main + 44
    frame #22: 0x00007fff52586015 libdyld.dylib`start + 1

I'm also going to ping @BurntSushi as advised in the related cargo-edit issue.

I remember that I once got the same crash while playing with rustc's optimisation flags to build one of my projects.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jun 26, 2018

Thanks for reporting this!

Do you have instructions to reproduce? Ideally, starting with installing cargo-edit down to the error? (it's ok if you need to download a particular project to trigger this, just add git clone ... to the steps).


From a first look _xgetbv is causing this when called here. Note that _xgetbv requires xsave target feature, but it is only called after we have tested that the feature is available here.

To test for xsave support we need to query cpuid with eax=1 and read the 26-th bit of ecx, which is exactly what that code does so to me it appears that this part is at least correct.

The 27-th bit of ECX is osxsave which denotes whether the operating system has enabled xsave support and is tested right afterwards.

But... we unconditionally call _xgetbv independently of whether the OS has enabled osxsave or not... IIRC this was ok, but maybe it isn't? Worth triple-checking it.

cc @stevecheckoway @parched @petrochenkov

@gnzlbg
Copy link
Contributor

gnzlbg commented Jun 26, 2018

https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions states:

Verify that the operating system supports XGETBV using CPUID.1:ECX.OSXSAVE bit 27 = 1.

So it seems that we can't call xgetbv if osxsave is disabled. I'll prepare a fix.

@baskerville
Copy link
Author

Reproducing the crash is straightforward:

cargo install cargo-edit
cargo add --help

TheIronBorn pushed a commit to TheIronBorn/stdsimd that referenced this issue Jun 29, 2018
TheIronBorn pushed a commit to TheIronBorn/stdsimd that referenced this issue Jun 29, 2018
@boozook
Copy link

boozook commented Jul 1, 2018

Same problem on macOS 10.12.6 (16G1036) on the CPU 2.13 GHz Intel Core 2 Duo.

Any call of is_x86_feature_detected method crashes with illegal instruction: 4.

Affected rustc versions:

  • rustc 1.27.0 (3eda71b00 2018-06-19)
  • rustc 1.28.0-nightly (60efbdead 2018-06-23)
  • rustc 1.28.0-nightly (e3bf634e0 2018-06-28) - current

screencast.gif

isolated test in playground

@alexcrichton
Copy link
Member

@fzzr- I think the fix for this issue hasn't made its way into the nightly releases yet unfortunately

@boozook
Copy link

boozook commented Jul 3, 2018

And not today unfortunately. 😩

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 3, 2018

@alexcrichton do you think it is worth to expand the CI to cover this? We can choose which CPU qemu-user uses, so we could actually add two x86_64 build bots, one with a CPU that only supports SSE2, and one with a CPU that only supports SSE4.2, to catch this.

The ones on travis support AVX2 IIRC, so the normal build bots just catch that case.

@stevecheckoway
Copy link

I meant to look into this more closely but I’m on holiday. Isn’t there a cpuid bit to test for xgetbv separate from xsave?

From a quick google search, CPUID.(EAX=0DH,ECX=1):EAX.XG1[bit 2] = 1 vs CPUID.01H:ECX.XSAVE[bit 26] = 1. I guess xgetbv also requires xsave to be supported and enabled but it’s not clear that that is sufficient.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 3, 2018

@stevecheckoway this should be fixed on master, stdsimd on rustc has been updated, but the update just hasn't landed on nightly yet.

I meant to look into this more closely but I’m on holiday. Isn’t there a cpuid bit to test for xgetbv separate from xsave?

Before this issue we just checked for xsave support, and if the CPU supported it, we used xgetbv. IIRC this is what caused the SIGILL on systems where the CPU supported xsave, but the operating system did not set the osxsave bit.

We now check both xsave and osxsave before using xgetbv, that is, the CPU must support xsave, and the operating system must have enabled xsave support in user-space by setting the osxsave cpu-bit.

It is hard to tell whether this is sufficient. There are multiple sources of truth from Intel, none of them focusing on general run-time feature detection in user-space (e.g. for language run-times), but rather focused only on detecting avx/avx512 support, and IIRC I didn't found one explaining the procedure in a step-by-step fashion, but more like, "you need to check all these things in no particular order". FWIW we were checking for osxsave support for avx and avx512 later on in the run-time detection code, we just weren't doing it before using xgetbv, which was only guarded on xsave and not osxsave.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 3, 2018

When the patch lands on nightly I am going to ping you all here again so that you can please try it out and see if it really fixed the issue. Sadly it appears that we it only triggers with particular OS and particular CPUs that I don't have at hand to reproduce this =/

@alexcrichton
Copy link
Member

@gnzlbg if we could get qemu working for something like this I think that'd be awesome yeah, although I dunno how much testing infrastructure that'd take...

@stevecheckoway
Copy link

@gnzlbg I spent a little while reading the Intel manual about this. In order to execute xgetbv with ecx=0, all that's needed is XSAVE to be supported (CPUID.1:ECX.XSAVE=1) and enabled (in CR4) by the OS (CPUID.1:ECX.OSXSAVE=1). To execute it with ecx=1, this needs to be checked (CPUID.(EAX=0DH,ECX=1):EAX.XG1=1).

Here's some standalone Rust code to look at the supported XSAVE features.

#![feature(asm)]

#[derive(Debug, Clone)]
struct CpuidResult {
    pub eax: u32,
    pub ebx: u32,
    pub ecx: u32,
    pub edx: u32,
}

fn cpuid(mut eax: u32, mut ecx: u32) -> CpuidResult {
    let ebx: u32;
    let edx: u32;
    unsafe {
        asm!("cpuid"
             : "+{eax}"(eax),
               "={ebx}"(ebx),
               "+{ecx}"(ecx),
               "={edx}"(edx)
             ::: "intel");
    };
    CpuidResult { eax, ebx, ecx, edx }
}

unsafe fn xgetbv(ecx: u32) -> u64 {
    let eax: u32;
    let edx: u32;
    asm!("xgetbv"
         : "={eax}"(eax),
           "={edx}"(edx)
         : "{ecx}"(ecx)
         :: "intel");
    ((edx as u64) << 32) | (eax as u64)
}

// CPUID.1:ECX
const XSAVE: u32 = 1 << 26;
const OSXSAVE: u32 = 1 << 27;

// CPUID.(EAX=0DH,ECX=1):EAX
const XSAVEOPT: u32 = 1 << 0;
const XSAVEC: u32 = 1 << 1;
const XG1: u32 = 1 << 2;
const XSS: u32 = 1 << 3;

fn main() {
    // Check for support.
    let features = cpuid(1, 0).ecx;
    if features & XSAVE != 0 {
        println!("Processor supports XGETBV, XRSTOR, XSAVE, and XSETBV");
        assert!(cpuid(0, 0).eax >= 0xd);
    } else {
        println!("Processor does not support XGETBV, XRSTOR, XRSTORS, XSAVE, XSAVEC, XSAVEOPT, XSAVES, or XSETBV");
        return;
    }

    // Check if enabled.
    let enabled = features & OSXSAVE != 0;
    if enabled {
        println!("XSAVE features enabled");
    } else {
        println!("XSAVE featuers disabled");
        println!("Executing XGETBV, XRSTOR, XRSTORS, XSAVE, XSAVEC, XSAVEOPT, XSAVES, and XSETBV will cause #UD");
        return;
    }

    // Check which features are supported.
    let result = cpuid(0xd, 0);
    println!("Bitmap of user state components that can be managed by the XSAVE feature set: 0x{:08x}{:08x}",
             result.edx, result.eax);
    println!("Size of XSAVE area for all user state components supported by the processor: 0x{:x} bytes",
             result.ecx);
    println!("Size of XSAVE area for all user state components enabled by XCR0: 0x{:x} bytes",
             result.ebx);

    let result = cpuid(0xd, 1);
    if result.eax & XSAVEOPT != 0 {
        println!("XSAVEOPT is supported");
    } else {
        println!("Executing XSAVEOPT will cause #UD");
    }

    if result.eax & XSAVEC != 0 {
        println!("XSAVEC is supported");
    } else {
        println!("Executing XSAVEC will cause #UD");
    }

    let xgetbv_1 = result.eax & XG1 != 0;
    if xgetbv_1 {
        println!("XGETBV with ECX = 1 is supported");
    } else {
        println!("Executing XGETBV with ECX = 1 will #GP(0)");
    }

    if result.eax & XSS != 0 {
        println!("XSAVES, XRSTORS, and IA32_XSS MSR are supported");
        println!("Bitmap of all supervisor state components that can be managed by XSAVES and XRSTORS: 0x{:08x}{:08x}",
                 result.edx, result.ecx);
        println!("Size of XSAVE area containing all state components corresponding to bits set in XCR0 | IA32_XSS");
    } else {
        println!("Executing XSAVES or XRSTORS will #UD; accessing IA32_XSS MSR will #GP");
    }
    
    println!("XCR0 = 0x{:016x}", unsafe { xgetbv(0) });
    if xgetbv_1 {
        println!("XCR0 & XINUSE = 0x{:016x}", unsafe { xgetbv(1) });
    }
}

On my laptop, I get this output.

Processor supports XGETBV, XRSTOR, XSAVE, and XSETBV
XSAVE features enabled
Bitmap of user state components that can be managed by the XSAVE feature set: 0x0000000000000007
Size of XSAVE area for all user state components supported by the processor: 0x340 bytes
Size of XSAVE area for all user state components enabled by XCR0: 0x340 bytes
XSAVEOPT is supported
Executing XSAVEC will cause #UD
Executing XGETBV with ECX = 1 will #GP(0)
Executing XSAVES or XRSTORS will #UD; accessing IA32_XSS MSR will #GP
XCR0 = 0x0000000000000007

And indeed, executing xgetbv with ecx=1 does segfault.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 4, 2018

I spent a little while reading the Intel manual about this. In order to execute xgetbv with ecx=0, all that's needed is XSAVE to be supported (CPUID.1:ECX.XSAVE=1) and enabled (in CR4) by the OS (CPUID.1:ECX.OSXSAVE=1).

This would mean that the issue is fixed. We are only calling xgetbv with ecx=0, and when this issue was reported we were only checking CPUID.1:ECX.XSAVE=1, instead of also checking CPUID.1:ECX.OSXSAVE=1. stdsimd was updated on rust-lang/rust one day ago (rust-lang/rust#51695) so the fix should be in the next nightly.

I'll ping you all once the next nightly arrives so that you can try it. And @stevecheckoway thank you for looking into this.

@boozook
Copy link

boozook commented Jul 8, 2018

@gnzlbg When the patch lands on nightly I am going to ping you all here again so that you can please try it out and see if it really fixed the issue.

In the rustc 1.29.0-nightly (9fd3d7899 2018-07-07) all right on my CPU! 👍
This test all checks returns false without crashes. Thanks!

I think this patch should be in the rustc 1.27.0-stable because it broken now.

@baskerville
Copy link
Author

I confirm that the bug is fixed.

I can no longer reproduce on 1.29.0-nightly (9fd3d7899 2018-07-07).

alexcrichton added a commit to alexcrichton/rust that referenced this issue Jul 8, 2018
This commit updates the `stdsimd` module on the beta branch to pull in a fix for
rust-lang/stdarch#500, a bug which affects the stable 1.27.0 release
which causes feature detection on x86 to cause an illegal instruction on some
CPUs.
@alexcrichton
Copy link
Member

Thanks for confirming the fix @fzzr- and @baskerville! I've posted this for a backport to 1.28.0 at rust-lang/rust#52154.

We're unlikely to release a patch release for 1.27 for this, but we're only three weeks away from the 1.28 release!

bors added a commit to rust-lang/rust that referenced this issue Jul 8, 2018
[beta] Update stdsimd submodule

This commit updates the `stdsimd` module on the beta branch to pull in a fix for
rust-lang/stdarch#500, a bug which affects the stable 1.27.0 release
which causes feature detection on x86 to cause an illegal instruction on some
CPUs.
@boozook
Copy link

boozook commented Jul 9, 2018

@alexcrichton, Then I think we should mention this issue in the "release notes" for 1.27 as "known issues" because it's really important issue, imho.

lu-zero pushed a commit to lu-zero/stdarch that referenced this issue Jul 10, 2018
moshg pushed a commit to moshg/rust-std-ja that referenced this issue Apr 4, 2020
This commit updates the `stdsimd` module on the beta branch to pull in a fix for
rust-lang/stdarch#500, a bug which affects the stable 1.27.0 release
which causes feature detection on x86 to cause an illegal instruction on some
CPUs.
moshg pushed a commit to moshg/rust-std-ja that referenced this issue Apr 4, 2020
[beta] Update stdsimd submodule

This commit updates the `stdsimd` module on the beta branch to pull in a fix for
rust-lang/stdarch#500, a bug which affects the stable 1.27.0 release
which causes feature detection on x86 to cause an illegal instruction on some
CPUs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants