-
Notifications
You must be signed in to change notification settings - Fork 865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segfault while building the Rust toolchain with a toolchain that uses mimalloc #77
Comments
Very nice you are working on integrating this with Rust -- let me know of issues or performance.
Do you have a backtrace of the segfault? Was it using the debug version of mimalloc with I will try help out and try to look at the Rust thread tomorrow. edit: I looked a little and saw rust already works with jemalloc so the above issues should already be handled well. Don't use MI_OVERRIDE and statically link seems the best approach (which you are already doing). |
Just a note, ensure the aligment and size arguments are in the right order for the aligned allocation functions in:
since mimalloc lets alignment always come after the size but |
First step is to get it to work correctly, but that's the next step :)
So in the mimalloc-sys crate, I do two things:
The Rust compiler uses "plain" malloc/free (the C and POSIX apis), but right now jemalloc is statically linked into the binary, overriding these symbols. I know LD_PRELOAD works if the malloc library is linked dynamically, but I'm not sure if it works in this situation. By default, Rust binaries use the system allocator. I've tried
So I'm statically linking mimalloc into the binary, and hopefully correctly overriding all symbols. That's what we do for jemalloc and it works. So I would hope that this would override One thing we are doing is compiling LLVM and somehow linking it into rustc. We don't do anything special for jemalloc here, so I don't think we need to do anything special for mimalloc either.
Those are wrapped here: https://github.com/gnzlbg/mimallocator/blob/master/mimalloc-sys/src/lib.rs#L49 We just provide the C and POSIX API. I suppose the ones mimalloc defines have the arguments in the standard orders for these. From looking at the code here, they appear to match: https://github.com/microsoft/mimalloc/blob/master/src/alloc-override.c#L138 That is, we are only using the overriden functions from mimalloc, we don't use any of the
AFAICT it was using a release build, but I will enable full checks and try again. |
Mmm, I mean the
versus
Be very careful when binding to Rust that it matches what Rust defined. |
Ah ok. Yes, we don't use |
I understand you want to link statically with mimalloc and override the standard malloc/free. If so, that will still be tricky general. Static + override:
Static but not overriding is the safest option but you need to control the runtime system enough to ensure no pointers are mixed -- but that may well be the case for rust. For now, I would at least try your current approach on Linux first to ensure it all works before moving on to other platforms. |
I forgot to mention that, to be able to run Rust compiler benchmarks, we only need to be able to link mimalloc on Linux.
I don't think this is an option. We link against LLVM, which uses the malloc APIs to allocate memory. I can't really assume that there aren't any places in the code where memory from one is passed to the other, which afterwards deallocates it. I hope this isn't the case, but making sure that it isn't would be a lot of work at this point (EDIT: the code interfacing with LLVM is compartmentalized in a single place, so while doable, it is still a lot of work to do this).
That's the plan. Maybe I wasn't clear. Our current goal isn't to switch from jemalloc to mimalloc in the Rust toolchain, but rather to gather data about the performance of the Rust toolchain when mimalloc is its allocator. Performance is just one of the many data-points that would need to be considered in a discussion about doing a switch, but that discussion hasn't started yet. This is purely exploratory work. If you want to try my PR branch on Linux, the only thing you have to do is clone it, and execute |
Thanks for explaining :-) I thought that you were working on macOS somehow. Anyway, in that case there should not be segfaults so let's see if we can track down the cause. If can provide me with a test case, or a debug backtrace from the faulty build that will help. If I try your branch, will it cause the error for me investigate? |
Yes. Briefly, the Rust toolchain bootstraps itself. So the first "stage" is stage0, and here, an older compiler is used to compile the toolchain, producing a stage0 compiler. This stage0 compiler that gets produced here is linked statically with mimalloc in my PR branch. Then, in stage1, this compiler is used to compile the toolchain again, producing a stage1 compiler. This is where the compiler using mimalloc gets exercised, and where the error happens. The |
This is still on my todo list -- let me know of further issues but I will get to it soonish :-) |
@daanx let me know if you need help setting anything up or feel free to drop by Rust's discord or Zulip chat rooms. |
Building
Results in:
If I do |
The issue is fairly old and in the meantime various bugs were addressed. Perhaps it is just working now? :-) If so, please confirm as it would be nice if Rust is able to easily use mimalloc as an allocator. I read somewhere that Rust now uses no default allocator but instead let's the user opt-in to another allocator -- it would be great if mimalloc can be one of those options. |
Steps from #77 (comment) give you 2 months old Rust and Mimalloc. There are no fixes included so it shouldn't work meaning something went wrong.
Rust compiler uses jemalloc but users can choose which allocator they will use in their binaries (defaults to system allocator). |
@mati865 actually, the latest commit in that PR was never "try"ed (some other PR landed before that that made it unmergeable), so... maybe it worked ? I'll rebase and we should schedule a try-build. |
So I rebased on top of master, and re-run all tests, and we are still failing with a SIGSEV: https://dev.azure.com/rust-lang/e71b0ddf-dd27-435a-873c-e30f86eea377/_build/results?buildId=7960 |
@ArniDagur which platform are you testing on ? Those build jobs run on x86_64-linux. It might also be that there are some differences between how those jobs are build there and your machine. |
@gnzlbg: thanks for trying again!. I am still very interested to fix this and understand what is happening. Can you try with the latest |
Here's information about my system:
|
Builds ran through Docker containers have various config options tweaked: https://github.com/rust-lang/rust/blob/c43d03a19f326f4a323569328cc501e86eb6d22e/src/ci/run.sh |
The results are in here: https://dev.azure.com/rust-lang/rust/_build/results?buildId=7973&view=logs&jobId=d0d954b5-f111-5dc4-4d76-03b6c9d0cf7e&taskId=1094049d-8071-5d7a-acdc-b0909289cb44&lineStart=14107&lineEnd=14107&colStart=1&colEnd=75 An assertion triggers:
|
Ah very interesting -- that assertion should not trigger! Can you try if this happens with the latest |
@daanx that test was done with the latest dev branch that was released at the time of the test (~3 hours ago), but the last commit to the dev branch is from 17 hours ago, so those tests should have used the very latest dev branch. If you update it feel free to let me know and I'll re-run the tests. |
Ah I see -- but it says line 324 in the assertion which corresponds to the master branch, in dev it is line 343. Maybe a wrong checkout? |
Weird, I noticed that to. I'll investigate. |
You were right. I was being bitten by semver, the "0.1.6-dev" version was being resolved as "0.1.6". I had to specify that I really really wanted that one with "=0.1.6-dev". I've restarted the build. |
From the latest run...
|
@mati865 while the last commit does do that, in rust-lang/rust CI, the build still fails at some point after the warnings with a segfault, which @daanx is not able to reproduce. I don't understand why this happens. Maybe the resources of the VMs used there play a role here ? (e.g. maybe a higher memory pressure is required to trigger this?) |
Thanks everyone for helping to isolate the problem. I hope I can repro at some point and get a trace. |
@daanx have you applied all the options from #77 (comment) or only |
Maybe it has something to do with the ancient CentoOS 5 that is used by Rust dist builders?
It's complicated because it's external crate: https://github.com/gnzlbg/mimallocator/tree/master/mimalloc-sys |
Update: using
|
|
Mimalloc is pulled in via an external library, that includes the source code, a script for how to compile it, and Rust C FFI bindings so that it can be called from Rust. That code is imported from the |
If you have a lot of time to spend there is possibility to run exactly the same environment as Rust's CI (but it takes around twice as long). |
When using the secure build, I can repro. The command fails with:
|
Now, I am trying to run the last failed command in the debugger ( |
@daanx I don't know how to do that, but we should document how to do that: there is a tracking issue for documenting this here (rust-lang/rustc-dev-guide#443), and a solution for a different case which sadly I don't think will work here. |
@daanx what you can do is add the |
@daanx if |
Has there been any progress on this? |
I would like to try this again with the latest |
Try this patch: diff --git a/src/rustc/Cargo.toml b/src/rustc/Cargo.toml
index 825e3655ac..7a4f81ce59 100644
--- a/src/rustc/Cargo.toml
+++ b/src/rustc/Cargo.toml
@@ -22,7 +22,7 @@ rustc_codegen_ssa = { path = "../librustc_codegen_ssa" }
#features = ['unprefixed_malloc_on_supported_platforms']
[dependencies.mimalloc-sys]
-version = '=0.1.6-dev'
+path="../../../mimallocator/mimalloc-sys"
optional = true
[features] After something like:
Remember to enable jemalloc in |
For what it's worth, I got Rust to compile with only these messages from
|
Thank you @ArniDagur, @gnzlbg: with the above instructions I successfully built using the latest
(actually, the real error occurred later with a panic message a thread could not start due to resource exhaustion) Some careful reading shows errno 12 is
and that was it! After that it built in secure mode with full debug checking. :-) Pfff, glad this got sorted out -- would love to see some benchmarks with rust. |
I just opened a new test try with another crate: rust-lang/rust#81782. Let's see do we get benchmarks at some point. Edit: It looks good at the moment! |
And here are the results: https://perf.rust-lang.org/compare.html?start=f9435f4c92651d67d5dbaba13c5606c4c4fc1327&end=86f3a470dd7fa102be01dc6d84ecc6920f8af32a Pretty good and only a few regressions. Props for mimalloc! I think this issue can be closed now. |
I tried also the MacOS rustc-build and there the override did not work as expected, statistics show zeroes.
I did enable debugs and ran it with @daanx Anything more to try here? |
Seems that MacOS support for rustc is difficult to achieve in practice. And jemalloc does not support it either. So this is a known issue. |
Jemalloc for macOS was just broken and can be fixed via simple changes provided in rust-lang/rust#82423. |
@daanx said:
Is this still true today? To be able to static link and override the libc symbols is still preferable. |
So I'm trying to build a Rust toolchain with mimalloc that works here: rust-lang/rust#62340 , such that we can use the Rust toolchain compiler benchmark suite to test whether it would be worth it switching to mimalloc from jemalloc.
I think I've managed so far to link mimalloc properly, and built a toolchain that's statically linked with it and uses it. When that toolchain is used to build another toolchain, after building a chunk of it, it segfaults with: signal: 11, SIGSEGV: invalid memory reference. Target is x86_64-unknown-linux-gnu , and I'm using the master branch of mimalloc, via the mimalloc-sys bindings, which are here: https://github.com/gnzlbg/mimallocator/tree/master/mimalloc-sys
I'm building mimalloc following the steps from this script: https://github.com/gnzlbg/mimallocator/blob/master/mimalloc-sys/build.rs
The text was updated successfully, but these errors were encountered: