Use mimalloc as global memory allocator #749
Conversation
+@allada +@adam-singer +@MarcusSorealheis +@zbirenbaum
@allada and I talked at length about how to fix the fragmentation issues that @chrisstaite-menlo brought up. Initially we attempted to fix this with `jemalloc` in #738, but the additional build requirements associated with it (notably requiring `make` and `autotools`) made this infeasible.
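For reference, the Rust side of a swap like this is typically tiny. With the `mimalloc` crate, overriding the global allocator looks like the following minimal sketch (the exact wiring in this PR may differ):

```rust
// Minimal sketch: replace Rust's default global allocator with mimalloc.
// Assumes the `mimalloc` crate is listed as a dependency in Cargo.toml.
use mimalloc::MiMalloc;

#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;

fn main() {
    // All heap allocations below now go through mimalloc.
    let v: Vec<u64> = (0..1_000).collect();
    println!("allocated {} elements via mimalloc", v.len());
}
```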
As future improvements, there is an additional patch that we could add to potentially improve fragmentation behavior, and we could also go down the dragonflydb route of implementing active defragmentation directly against `mimalloc`'s API, similar to this: dragonflydb/dragonfly#523
Reviewable status: 0 of 4 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, publish-image, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04 (waiting on @adam-singer, @allada, @MarcusSorealheis, and @zbirenbaum)
Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: 1 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @adam-singer, @MarcusSorealheis, and @zbirenbaum)
flake.nix line 43 at r1 (raw file):
`... }: let stable-rust = pkgs.pkgsMusl.rust-bin.stable."1.76.0";`
that's it?!?!?! This is all it took!?!!
I'm so confused on nix 😜
Wow... Nice and easy swap. Want me to run this and see how my memory performs?
Reviewable status: 1 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @adam-singer, @MarcusSorealheis, and @zbirenbaum)
Reviewed all commit messages.
Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)
@chrisstaite-menlo that'd be great, but you'll need to build it with nix to see what this PR is really doing.
The binary we redistribute is built with nix, which uses `musl`. We found that `musl` was causing significant performance issues when we used its malloc, so we wanted to keep `musl` for everything else and let another allocator deal with memory management. We found that both `jemalloc` and `mimalloc` reduced fragmentation significantly, and we chose `mimalloc` because it can produce reproducible binaries without much (if any) work, unlike `jemalloc`.
In the end, yes, if you could test it, that'd be awesome!
To build it (@aaronmondal, correct me if I'm wrong), I believe you need to install nix and then run `nix build .#`.
If you just want to test `mimalloc`, that'd also be useful, and you can use your current way of building. However, you may need to play with the environment variables to get it to fragment less, found here: https://microsoft.github.io/mimalloc/environment.html
Once it lands we'll likely tune it in the code manually, but this is one step closer.
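If it helps with that testing, here is a small, purely illustrative Rust probe (not part of this PR) that makes mixed-size allocations and prints resident memory from `/proc/self/status`, so runs with different `MIMALLOC_*` settings can be compared on Linux:

```rust
use std::fs;

// Read VmRSS (resident set size) in kB from /proc/self/status (Linux only).
fn rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    status
        .lines()
        .find(|l| l.starts_with("VmRSS:"))?
        .split_whitespace()
        .nth(1)?
        .parse()
        .ok()
}

fn main() {
    // Mixed-size allocations to encourage fragmentation.
    let mut buffers: Vec<Vec<u8>> = (0..10_000)
        .map(|i| vec![0u8; 1024 * (1 + i % 16)])
        .collect();
    println!("after alloc:        {:?} kB", rss_kb());

    // Free roughly half the buffers, leaving holes that a fragmenting
    // allocator may not return to the OS.
    buffers.retain(|b| b.len() % 2048 == 0);
    println!("after partial free: {:?} kB", rss_kb());

    drop(buffers);
    println!("after full free:    {:?} kB", rss_kb());
}
```

Comparing the "after partial free" number across runs with different settings exported should show how eagerly each configuration returns freed pages to the OS.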
Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)
My current build uses the Docker container with Bazel: https://github.com/TraceMachina/nativelink/blob/main/deployment-examples/docker-compose/Dockerfile. Is there an easy way to adapt that to use the build as you suggest?
Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)
@chrisstaite-menlo The build via the Dockerfile should work as well.
To give a bit more insight:
- The Bazel build uses whatever libc it finds on the system. Since the Dockerfile is based on an Ubuntu base image, it'll default to linking `glibc` and its built-in `malloc`. This means you're currently using `glibc`'s implementation. After testing we found that this allocator doesn't free memory the way we want, which causes the issue you described. (A quick compile-time check for which libc a build targets is sketched at the end of this comment.)
- At the moment the same is true for the "regular" Cargo build. Some CI workflows are currently failing because I'm trying to completely switch that to a static musl build to bring it closer to what the nix build does.
- The nix build already runs Cargo with a special configuration to statically link musl. The container image built from that is a minimal image with an overall closure size of ~28MB. The images we publish from that only contain that closure, i.e. a static nativelink executable and some certificates. In comparison, the Dockerfile-based image is ~4x larger at ~120MB.
- The nix/`musl` build doesn't show the same fragmentation behavior that we see with `glibc`. That is, technically it runs more stably and stays within expected memory bounds. However, with `musl`'s default allocator this comes at the cost of a fairly significant slowdown (~2x IIRC).
In conclusion:
- You can test this with the Dockerfile as before. What you'll get is `nativelink` linked against Ubuntu's `glibc` but with `mimalloc` as the allocator rather than `glibc`'s default allocator.
- The nix build would be nativelink+musl+mimalloc. I'd expect it to now perform fairly similarly to the nativelink+glibc+mimalloc builds.
For future work I'll try to get the Bazel build up to parity with the nix/cargo/musl build so that we have less variation and better portability between the builds.
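As the aside above promised, here is a tiny illustrative Rust program (not from this PR) that reports which libc flavor it was compiled against; `cfg!(target_env = ...)` is resolved at compile time, so each build variant prints its own answer:

```rust
// Print the libc flavor selected at compile time; `target_env` is "gnu"
// for glibc-based targets and "musl" for musl-based ones.
fn main() {
    if cfg!(target_env = "musl") {
        println!("built against musl");
    } else if cfg!(target_env = "gnu") {
        println!("built against glibc");
    } else {
        println!("built against some other target_env");
    }
}
```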
Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)
Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)
flake.nix line 43 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
that's it?!?!?! This is all it took!?!!
I'm so confused on nix 😜
There is a fair bit of nontrivial magic going on here. What we're doing is:
- Take nixpkgs.
- Extend it with the rust-overlay.
- nixpkgs has cross-compilation toolchains built in. That is, you can use things like `pkgs.pkgsStatic` to get a fully static nixpkgs, `pkgsMusl` for musl-based stuff, and a large number of `pkgsCross.xxx` toolchains for cross-compilation: https://nix.dev/tutorials/cross-compilation.html
- Take that cross-transitioned variant of the added rust toolchain.
- Wrap it in a specialized builder (`crane`) which invokes cargo and executes the build.
- Re-export the cargo toolchain in the local devShell for development. This is still slightly broken in CI because even on musl-based systems some rust tools (e.g. rustdoc and clippy) are dynamically linked against `libgcc_s`. Shouldn't be too hard to work around though.
The original breakage with missing symbols came from the rust toolchains using a `glibc`-configured linker. The `pkgsMusl` change swaps that to a linker that uses musl instead.
Technically, we could use `pkgsStatic` everywhere and things would almost always be guaranteed to be perfectly portable. The reason we're not doing this (yet) is that the static package set isn't prebuilt by nixpkgs, so every user would have to wait ages for their local system to build static GCC and LLVM toolchains lol
Force-pushed from a1b460d to f683489:
This should make allocation behavior more consistent across different builds and reduce memory fragmentation issues. The allocation behavior of the `nativelink` executable is now configurable with `MIMALLOC_*` environment variables as described here: https://microsoft.github.io/mimalloc/environment.html
Force-pushed from f683489 to dd2a38a.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: 2 of 4 LGTMs obtained (waiting on @MarcusSorealheis and @zbirenbaum)
-@MarcusSorealheis -@zbirenbaum
Reviewable status: complete! 2 of 2 LGTMs obtained
I've had this running from the docker-compose in prod for 16 hours now and memory usage is at 5GB and holding. Hopefully that's sorted things out; I'll check again in a day or two. CPU usage also appears to be significantly reduced, which is odd, but that may just be my sampling.
Wow, that is fantastic news @chrisstaite-menlo. What operating system/version, so that we can check another?
That's Ubuntu 20.04 with a 4.15 kernel. I'm due to upgrade all of the build machines to 22.04 soon, so I will hopefully be able to test there also.
Thank you.
45 hours up and at 7GB RAM usage.
Reviewable status: 2 of 2 LGTMs obtained
a discussion (no related file):
@chrisstaite-menlo This is good, but not as good as I was hoping. When you have time, could you run nativelink with:
`MIMALLOC_PAGE_RESET=1`
I believe this is the flag that does most of the fragmentation reduction that we'll be setting, but we need to do more testing. As I mentioned before, we still need to tune mimalloc.
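Relatedly, for the in-code tuning mentioned earlier, mimalloc's options can in principle be set programmatically at startup instead of via the environment. A rough sketch, under the assumption that the `libmimalloc-sys` crate's `extended` feature exposes the C API's `mi_option_enable` and the `mi_option_page_reset` constant (binding names unverified, mirroring `mimalloc.h`):

```rust
// Assumption: `libmimalloc-sys` with the `extended` feature exposes these
// bindings to mimalloc's C option API; names mirror mimalloc.h.
use libmimalloc_sys::{mi_option_enable, mi_option_page_reset};

fn main() {
    // Enable page reset programmatically, equivalent in intent to running
    // the process with MIMALLOC_PAGE_RESET=1. Must happen early, before
    // the allocator has handed out many pages.
    unsafe { mi_option_enable(mi_option_page_reset) };
    // ... rest of startup ...
}
```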