[stable regression] #[should_panic] tests segfault in certain OS X configurations with 1.44.0 #73030

Closed
froydnj opened this issue Jun 5, 2020 · 8 comments
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows
C-bug Category: This is a bug.
O-macos Operating system: macOS
P-high High priority
regression-from-stable-to-stable Performance or correctness regression from one stable version to another.
T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@froydnj
Contributor

froydnj commented Jun 5, 2020

We are seeing some sccache #[should_panic] tests segfault on TravisCI with Rust 1.44.0, whereas the same tests succeed with Rust 1.43.1 (e.g. here -- note that the OS X DEPLOY=1 build does not run tests). The tests in question are these:

https://github.com/mozilla/sccache/blob/e57788095488944d05df2ea92d12ade319b05822/src/compiler/args.rs#L1036-L1105
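For readers without the sccache source at hand, a `#[should_panic]` test has roughly the shape below. This is only a minimal sketch: `process_flag` is a hypothetical stand-in, not the sccache `ArgInfo::process` code linked above. The point is that such a test passes only when the body panics, so every run goes through the panic machinery.

```rust
/// Hypothetical helper (not from sccache): strips the leading '-' from a flag
/// and panics if the argument is not a flag at all.
fn process_flag(arg: &str) -> &str {
    assert!(arg.starts_with('-'), "argument {:?} is not a flag", arg);
    &arg[1..]
}

#[cfg(test)]
mod tests {
    use super::*;

    // The test harness treats a panic in this body as success.
    #[test]
    #[should_panic]
    fn rejects_non_flag() {
        process_flag("foo");
    }
}
```

With `RUST_BACKTRACE=1` set, the panic in such a test also makes the default panic hook capture and print a backtrace before the harness records the result.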

With 1.43.1, the tests run like so:

     Running `/Users/travis/build/mozilla/sccache/target/debug/deps/sccache-d6f73566e418fda1`

running 167 tests
test azure::blobstore::test::test_put_blob ... ignored
test azure::blobstore::test::test_canonicalize_resource ... ok
test azure::credentials::test::test_conn_str_with_endpoint_suffix_only ... ok
test azure::credentials::test::test_parse_connection_string ... ok
test azure::credentials::test::test_parse_connection_string_without_account_key ... ok
test azure::blobstore::test::test_signing ... ok
test cache::gcs::test_gcs_credential_provider ... ok
test compiler::args::tests::assert_tests::test_arginfo_process_flag ... ok
test compiler::args::tests::assert_tests::test_arginfo_process_take_arg ... ok
test compiler::args::tests::assert_tests::test_arginfo_process_take_concat_arg ... ok
test compiler::args::tests::assert_tests::test_arginfo_process_take_concat_arg_delim ... ok
test compiler::args::tests::assert_tests::test_arginfo_process_take_maybe_concat_arg ... ok
test compiler::args::tests::assert_tests::test_arginfo_process_take_maybe_concat_arg_delim ... ok
test compiler::args::tests::assert_tests::test_args_iter_no_conflict ... ok
test compiler::args::tests::assert_tests::test_args_iter_unsorted ... ok
test compiler::args::tests::assert_tests::test_args_iter_unsorted_2 ... ok
[...more tests follow...]

With 1.44.0, the tests segfault:

     Running `/Users/travis/build/mozilla/sccache/target/debug/deps/sccache-720cb5bbc0e667a6`

running 167 tests
test azure::blobstore::test::test_put_blob ... ignored
test azure::blobstore::test::test_canonicalize_resource ... ok
test azure::credentials::test::test_conn_str_with_endpoint_suffix_only ... ok
test azure::credentials::test::test_parse_connection_string ... ok
test azure::credentials::test::test_parse_connection_string_without_account_key ... ok
test azure::blobstore::test::test_signing ... ok
test cache::gcs::test_gcs_credential_provider ... ok
error: test failed, to rerun pass '-p sccache --lib'

Caused by:

  process didn't exit successfully: `/Users/travis/build/mozilla/sccache/target/debug/deps/sccache-720cb5bbc0e667a6` (signal: 11, SIGSEGV: invalid memory reference)
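As a side note on reading the message above: on Unix, a runner can tell "exited with a code" from "killed by a signal" by inspecting the child's exit status. The following is a rough sketch of that distinction using the standard library's `ExitStatusExt`; the `describe` function and its output strings are illustrative assumptions, not cargo's actual implementation.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;

/// Illustrative helper (an assumption, not cargo's code): summarizes how a
/// child process ended, distinguishing exit codes from terminating signals.
fn describe(status: ExitStatus) -> String {
    match (status.code(), status.signal()) {
        (Some(0), _) => "exited successfully".to_string(),
        (Some(code), _) => format!("exit code: {}", code),
        // `code()` is None on Unix when the process was killed by a signal.
        (None, Some(sig)) => format!("signal: {}", sig),
        (None, None) => "unknown termination".to_string(),
    }
}
```

A status obtained from `std::process::Command::status()` can be passed straight to `describe`; a test binary that dies from SIGSEGV would report signal 11, matching the log above.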

The same tests occasionally time out (e.g. in this run) rather than segfaulting.

I have retried jobs several different times in TravisCI with what appear to be different workers -- I get different hostnames and instances, which I assume are physically different machines -- so I think some sort of machine problem (e.g. bad memory) has been ruled out.

I think this failure is potentially specific to the OS X version that TravisCI is running; I cannot reproduce the failures on my Mac (OS X 10.12.3). TravisCI reports using:

Runtime kernel version: 17.7.0
...
ProductName:	Mac OS X
ProductVersion:	10.13.6
BuildVersion:	17G65

You may be able to reproduce by:

  1. git clone https://github.com/mozilla/sccache/
  2. cd sccache
  3. RUST_BACKTRACE=1 cargo +1.44.0 test --all --verbose --features="all"

I do not understand why, but cargo test --all --verbose --no-default-features features="" does seem to work.

@froydnj froydnj added the C-bug Category: This is a bug. label Jun 5, 2020
@jonas-schievink jonas-schievink added A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows I-prioritize Issue: Indicates that prioritization has been requested for this issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. O-macos Operating system: macOS labels Jun 5, 2020
froydnj added a commit to froydnj/sccache that referenced this issue Jun 5, 2020
Stable Rust and the OS X versions on TravisCI do not seem to get along
right now.  I filed a [Rust
bug](rust-lang/rust#73030) about this, but if
the bug is very OS X-configuration specific, I'm not sure there's much
that can be done about it on the Rust side.

We can at least change the CI configuration so we don't get a bunch of
spurious failures.
froydnj added a commit to mozilla/sccache that referenced this issue Jun 5, 2020
Stable Rust and the OS X versions on TravisCI do not seem to get along
right now.  I filed a Rust bug (rust-lang/rust#73030) about this, but if
the bug is very OS X-configuration specific, I'm not sure there's much
that can be done about it on the Rust side.

We can at least change the CI configuration so we don't get a bunch of
spurious failures.
@ehuss
Contributor

ehuss commented Jun 5, 2020

I'm able to repro. I'm also getting other random corruption (like malloc: Incorrect checksum for freed object or mpsc going into an infinite loop). The backtrace is:

backtrace
  * frame #0: 0x000000010694dc23 sccache-720cb5bbc0e667a6`__rdos_macho_add + 2163
    frame #1: 0x000000010694d2b6 sccache-720cb5bbc0e667a6`__rdos_backtrace_initialize + 278
    frame #2: 0x000000010694c642 sccache-720cb5bbc0e667a6`fileline_initialize + 450
    frame #3: 0x000000010694c708 sccache-720cb5bbc0e667a6`__rdos_backtrace_syminfo + 40
    frame #4: 0x000000010693e9a7 sccache-720cb5bbc0e667a6`backtrace::symbolize::libbacktrace::resolve::h39ea9ff39d2c28cb at libbacktrace.rs:469:9 [opt]
    frame #5: 0x0000000106931d3e sccache-720cb5bbc0e667a6`std::sys_common::backtrace::_print_fmt::_$u7b$$u7b$closure$u7d$$u7d$::h56579608189a9677 [inlined] backtrace::symbolize::resolve_frame_unsynchronized::h285ae689918219bf at mod.rs:178:5 [opt]
    frame #6: 0x0000000106931d29 sccache-720cb5bbc0e667a6`std::sys_common::backtrace::_print_fmt::_$u7b$$u7b$closure$u7d$$u7d$::h56579608189a9677 at backtrace.rs:85 [opt]
    frame #7: 0x000000010693e633 sccache-720cb5bbc0e667a6`backtrace::backtrace::libunwind::trace::trace_fn::h10f4b899671fce38 [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnMut$LT$A$GT$$u20$for$u20$$RF$mut$u20$F$GT$::call_mut::h4a73971cf4583a47 at function.rs:274:13 [opt]
    frame #8: 0x000000010693e625 sccache-720cb5bbc0e667a6`backtrace::backtrace::libunwind::trace::trace_fn::h10f4b899671fce38 at libunwind.rs:98 [opt]
    frame #9: 0x00007fff7140e13f libunwind.dylib`_Unwind_Backtrace + 78
    frame #10: 0x00000001069316af sccache-720cb5bbc0e667a6`_$LT$std..sys_common..backtrace.._print..DisplayBacktrace$u20$as$u20$core..fmt..Display$GT$::fmt::h83d53b696ac99295 [inlined] backtrace::backtrace::libunwind::trace::h56b31d02b9b762c6 at libunwind.rs:86:5 [opt]
    frame #11: 0x000000010693169c sccache-720cb5bbc0e667a6`_$LT$std..sys_common..backtrace.._print..DisplayBacktrace$u20$as$u20$core..fmt..Display$GT$::fmt::h83d53b696ac99295 [inlined] backtrace::backtrace::trace_unsynchronized::h6625ae095e7d53cf at mod.rs:66 [opt]
    frame #12: 0x000000010693169c sccache-720cb5bbc0e667a6`_$LT$std..sys_common..backtrace.._print..DisplayBacktrace$u20$as$u20$core..fmt..Display$GT$::fmt::h83d53b696ac99295 [inlined] std::sys_common::backtrace::_print_fmt::hce8652c3ed9d2e92 at backtrace.rs:78 [opt]
    frame #13: 0x0000000106931590 sccache-720cb5bbc0e667a6`_$LT$std..sys_common..backtrace.._print..DisplayBacktrace$u20$as$u20$core..fmt..Display$GT$::fmt::h83d53b696ac99295 at backtrace.rs:59 [opt]
    frame #14: 0x000000010695be1e sccache-720cb5bbc0e667a6`core::fmt::write::hf81c429634e1f3ed at mod.rs:1069:17 [opt]
    frame #15: 0x0000000105fe61f9 sccache-720cb5bbc0e667a6`std::io::Write::write_fmt::h53fe50e3fff0275d at mod.rs:1504:15 [opt]
    frame #16: 0x000000010692581c sccache-720cb5bbc0e667a6`std::io::impls::_$LT$impl$u20$std..io..Write$u20$for$u20$alloc..boxed..Box$LT$W$GT$$GT$::write_fmt::h352c9db3a02449c0 at impls.rs:156:9 [opt]
    frame #17: 0x0000000106933cfa sccache-720cb5bbc0e667a6`std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::ha991e4eca34b4afa [inlined] std::sys_common::backtrace::_print::h3e8409ed4be04623 at backtrace.rs:62:5 [opt]
    frame #18: 0x0000000106933ca9 sccache-720cb5bbc0e667a6`std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::ha991e4eca34b4afa [inlined] std::sys_common::backtrace::print::h4fcdfee7c48ce7b5 at backtrace.rs:49 [opt]
    frame #19: 0x0000000106933c9d sccache-720cb5bbc0e667a6`std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::ha991e4eca34b4afa at panicking.rs:198 [opt]
    frame #20: 0x00000001069339d8 sccache-720cb5bbc0e667a6`std::panicking::default_hook::h722aa3f5c1c31788 at panicking.rs:215:9 [opt]
    frame #21: 0x00000001069342c8 sccache-720cb5bbc0e667a6`std::panicking::rust_panic_with_hook::h2cd47f71d6d55501 at panicking.rs:511:17 [opt]
    frame #22: 0x0000000106933e92 sccache-720cb5bbc0e667a6`rust_begin_unwind at panicking.rs:419:5 [opt]
    frame #23: 0x00000001069712db sccache-720cb5bbc0e667a6`std::panicking::begin_panic_fmt::h769fb8929973777e at panicking.rs:373:5 [opt]
    frame #24: 0x0000000105f3a75d sccache-720cb5bbc0e667a6`sccache::compiler::args::ArgInfo$LT$T$GT$::process::haecd94743519c1e8(self=ArgInfo @ 0x000070000e2cf758, arg=(data_ptr = "-bar-quxfugahogeplopabefPlopHogeFuga-foo=bar-zorglub-nomatchsrc/test/utils.rsassertion failed: (*next).value.is_some()/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/macros.rsassertion failed: (*n).value.is_none()\x0ffailed to write whole buffer/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/io/mod.rsformatter errorthere is no such thing as a relaxed fence/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/src/libcore/macros/mod.rsunexpected task state/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/macros.rsinternal error: entered unreachable code: ", length = 4), get_next_arg=closure-0 @ 0x000070000e2cf600) at args.rs:413:17
    frame #25: 0x0000000105b8d839 sccache-720cb5bbc0e667a6`sccache::compiler::args::tests::assert_tests::test_arginfo_process_flag::hf24b1c5a0a08a369 at args.rs:1043:13
    frame #26: 0x0000000105ea7bf1 sccache-720cb5bbc0e667a6`sccache::compiler::args::tests::assert_tests::test_arginfo_process_flag::_$u7b$$u7b$closure$u7d$$u7d$::he27c6d68447c7a58((null)=0x000070000e2cf7e0) at args.rs:1042:9
    frame #27: 0x0000000105db4231 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once::hb1c249d25da0ef1a((null)=closure-0 @ 0x000070000e2cf7e0, (null)=) at function.rs:232:5
    frame #28: 0x000000010600b4d3 sccache-720cb5bbc0e667a6`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hf35455f67ec1e4ed [inlined] _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h0f570127a12e3feb at boxed.rs:1008:9 [opt]
    frame #29: 0x000000010600b4cb sccache-720cb5bbc0e667a6`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hf35455f67ec1e4ed [inlined] _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h3b71a5cd36d92517 at panic.rs:318 [opt]
    frame #30: 0x000000010600b4cb sccache-720cb5bbc0e667a6`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hf35455f67ec1e4ed [inlined] std::panicking::try::do_call::he5e6ceb73e8d0255 at panicking.rs:331 [opt]
    frame #31: 0x000000010600b4cb sccache-720cb5bbc0e667a6`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hf35455f67ec1e4ed [inlined] std::panicking::try::hcfbd52acab7cd3c7 at panicking.rs:274 [opt]
    frame #32: 0x000000010600b4cb sccache-720cb5bbc0e667a6`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hf35455f67ec1e4ed [inlined] std::panic::catch_unwind::hd1f06c2faa7c4b62 at panic.rs:394 [opt]
    frame #33: 0x000000010600b4cb sccache-720cb5bbc0e667a6`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hf35455f67ec1e4ed [inlined] test::run_test_in_process::hdfd2866d2a50d846 at lib.rs:541 [opt]
    frame #34: 0x000000010600b4b1 sccache-720cb5bbc0e667a6`test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::hf35455f67ec1e4ed at lib.rs:450 [opt]
    frame #35: 0x0000000105fe56fb sccache-720cb5bbc0e667a6`std::sys_common::backtrace::__rust_begin_short_backtrace::hffd4a983e423c33e at backtrace.rs:130:5 [opt]
    frame #36: 0x0000000105feabd5 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hd45267100ae6c7ce [inlined] std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::ha413f8a1edb7d5dc at mod.rs:475:17 [opt]
    frame #37: 0x0000000105feabb1 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hd45267100ae6c7ce [inlined] _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h54069f7298b202c0 at panic.rs:318 [opt]
    frame #38: 0x0000000105feabb1 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hd45267100ae6c7ce [inlined] std::panicking::try::do_call::h51596b37c74c1cc8 at panicking.rs:331 [opt]
    frame #39: 0x0000000105feabb1 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hd45267100ae6c7ce [inlined] std::panicking::try::h0056bd7901bf2cf7 at panicking.rs:274 [opt]
    frame #40: 0x0000000105feabb1 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hd45267100ae6c7ce [inlined] std::panic::catch_unwind::hb86f92be16884b30 at panic.rs:394 [opt]
    frame #41: 0x0000000105feabb1 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hd45267100ae6c7ce [inlined] std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h2a59c52401371313 at mod.rs:474 [opt]
    frame #42: 0x0000000105feab77 sccache-720cb5bbc0e667a6`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hd45267100ae6c7ce at function.rs:232 [opt]
    frame #43: 0x000000010693c93d sccache-720cb5bbc0e667a6`std::sys::unix::thread::Thread::new::thread_start::h2b28b74d30bce841 [inlined] _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h307bc2fd9d7d309b at boxed.rs:1008:9 [opt]
    frame #44: 0x000000010693c937 sccache-720cb5bbc0e667a6`std::sys::unix::thread::Thread::new::thread_start::h2b28b74d30bce841 [inlined] _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hf381279f6518a852 at boxed.rs:1008 [opt]
    frame #45: 0x000000010693c92e sccache-720cb5bbc0e667a6`std::sys::unix::thread::Thread::new::thread_start::h2b28b74d30bce841 at thread.rs:87 [opt]
    frame #46: 0x00007fff713dc109 libsystem_pthread.dylib`_pthread_start + 148
    frame #47: 0x00007fff713d7b8b libsystem_pthread.dylib`thread_start + 15

It doesn't seem to repro in 1.45 (currently beta). Without digging further, I would strongly suspect this is fixed by rust-lang/libbacktrace@5c88e09 (via #71577). rust-lang/backtrace-rs#310 has more details, which look exactly the same (crash in macho_add).
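The backtrace above shows the crash happening inside `std::panicking::default_hook` while it symbolizes frames. As a rough sketch only, and not something proposed in this thread, a test that expects a panic could avoid that code path by temporarily installing a silent panic hook. `panics_quietly` is a hypothetical helper name; `take_hook`, `set_hook`, and `catch_unwind` are the standard library APIs.

```rust
use std::panic;

/// Hypothetical helper: runs `f` with a no-op panic hook installed and
/// reports whether it panicked. The hook is process-global, so this is
/// not safe to rely on when other threads may panic at the same time.
fn panics_quietly<F: FnOnce() + panic::UnwindSafe>(f: F) -> bool {
    // Save the current hook so it can be restored afterwards.
    let previous = panic::take_hook();
    // A hook that prints nothing, so no backtrace is captured or symbolized.
    panic::set_hook(Box::new(|_info| {}));
    let result = panic::catch_unwind(f);
    panic::set_hook(previous);
    result.is_err()
}
```

Because the default test harness runs tests on several threads, swapping a global hook like this is at best a stopgap; pinning the toolchain, as the sccache CI change does, avoids the problem without touching test code.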

@ehuss
Contributor

ehuss commented Jun 5, 2020

Note to triage/release: backtraces are somewhat broken in 1.44 on macOS for other reasons. We just merged a fix at rust-lang/cargo#8329 (see #72550), and I plan to backport that to 1.45. If you decide this is worthy of a point release, that might be another element of consideration.

@Mark-Simulacrum
Member

I think backtraces being broken is sufficiently annoying that I'd be personally in favor of a point release. It probably makes sense to not rush it out the door immediately, but maybe aiming for say Tuesday-ish in ~2 weeks (i.e. not next) makes sense? -- cc @rust-lang/release

@jonas-schievink
Contributor

Yes, that sounds reasonable.

@pietroalbini
Member

Sounds good.

taiki-e added a commit to tokio-rs/tokio that referenced this issue Jun 7, 2020
* Fix clippy warnings
* Pin rustc version to 1.43.1 in macOS

Refs: rust-lang/rust#73030
@spastorino
Member

Assigning P-high as discussed as part of the Prioritization Working Group process and removing I-prioritize.

@spastorino spastorino added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jun 10, 2020
@m-ou-se
Member

m-ou-se commented Jul 7, 2021

It seems like this has been fixed. @froydnj Is this still an issue?

@froydnj
Contributor Author

froydnj commented Jul 7, 2021

> It seems like this has been fixed. @froydnj Is this still an issue?

AFAIK, this has been fixed.

@froydnj froydnj closed this as completed Jul 7, 2021