rocmPackages.hipblaslt: massively reduce peak disk space usage #451188
JohnRTitor merged 4 commits into NixOS:master
without this zstd compression of msgpack .dats silently failed
let's avoid regressing compression in future, oops
Force-pushed from 1e9b995 to 0c86325.
Marking as draft while investigating ROCm/rocm-libraries#2073 (comment). Unsure whether it's a pre-existing issue I just discovered or something I broke with this patch.
Force-pushed from 539440c to 4374426.
Force-pushed from 4374426 to a436d1c.
Flakebi left a comment:
I didn’t look at the patch in detail, but it mostly seemed to make sense, so LGTM.
Thanks for reducing this even more!
Kitt3120 left a comment:
I'm not a maintainer, but this looks good to me, too. Well done!
Force-pushed from a436d1c to a19680a.
Force-pushed from a19680a to ff8c845.
I've been working on a much more significant rework in ROCm/rocm-libraries#2073. Some of it is not yet ready, but I've added the low-risk items to this patch set for further resource reduction. See the table in ROCm/rocm-libraries#2073 for commit-by-commit resource usage improvements. We should probably land this soon, since the previous attempt was extremely marginal as to whether it could succeed on Hydra.
…usage
Peak build dir usage is now 25GB
Partially applies "[hipblaslt] Refactor Parallel.py to drop joblib, decimate resource usage"
no longer needed now we unlink .s / .o files as soon as possible
Force-pushed from ff8c845 to bfd0d13.
@@ -86,6 +86,8 @@ stdenv.mkDerivation (finalAttrs: {
  env.ROCM_PATH = "${clr}";
  env.TENSILE_ROCM_ASSEMBLER_PATH = lib.getExe' clr "amdclang++";
  env.TENSILE_GEN_ASSEMBLY_TOOLCHAIN = lib.getExe' clr "amdclang++";
  env.LD_PRELOAD = "${jemalloc}/lib/libjemalloc.so";
Well, it's just for the duration of the build, AFAICT. For it to affect the existing tooling in the store, we'd either need to apply it always, build a second copy that has it patchelf-ed, or add a wrapper which sets LD_PRELOAD.
IMO a comment explaining the benefits and maybe using lib.getLib would be sufficient tweaks.
Suggested change:
- env.LD_PRELOAD = "${jemalloc}/lib/libjemalloc.so";
+ # has around x improvement (or y benefit) when running z
+ env.LD_PRELOAD = "${lib.getLib jemalloc}/lib/libjemalloc.so";
I'd say stdenv.hostPlatform.extensions.sharedLibrary is unnecessary since this is Linux-only, but it could also be included.
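For illustration, the two tweaks mentioned in this comment could be combined roughly as below. This is only a sketch under assumptions, not the change made in this PR: the package name `ld-preload-example`, the `dontUnpack` setting, and the trivial `installPhase` are hypothetical placeholders so that the expression is self-contained. Only `lib.getLib`, `stdenv.hostPlatform.extensions.sharedLibrary`, and `env.LD_PRELOAD` come from the discussion.

```nix
# Sketch only: a stand-in derivation, not the real hipblaslt package.
{
  lib,
  stdenv,
  jemalloc,
}:

stdenv.mkDerivation {
  pname = "ld-preload-example"; # hypothetical name
  version = "0";
  dontUnpack = true; # no sources needed for this illustration

  # Preload jemalloc for every process started during the build only.
  # lib.getLib picks the "lib" output if the package has one and falls
  # back to the default output otherwise; the extension resolves to
  # ".so" on Linux.
  env.LD_PRELOAD = "${lib.getLib jemalloc}/lib/libjemalloc${stdenv.hostPlatform.extensions.sharedLibrary}";

  installPhase = ''
    touch $out
  '';
}
```

Since the variable is set through `env`, it applies to the build sandbox and does not change how the installed outputs run, which matches the build-time-only intent described in the thread.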
Yeah, this is for the allocation-hungry build-time TensileCreateLibrary processes; it's not relevant at runtime.
Without jemalloc:
Peak memory usage (MB): 31,011.9
Current memory usage (MB): 28,199.3
With jemalloc:
Peak memory usage (MB): 30,893.7
Current memory usage (MB): 22,695.9
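Reading the figures above: peak usage barely moves (about 118 MB, under 0.5%), while current usage drops by about 5.5 GB (roughly 19.5%). The arithmetic can be reproduced with a small Nix expression; the attribute and function names here are made up for the example, and it can be evaluated with something like `nix-instantiate --eval --strict` on a file containing it.

```nix
# Sketch only: recomputes the savings from the numbers quoted above.
let
  withoutJemalloc = {
    peak = 31011.9;
    current = 28199.3;
  };
  withJemalloc = {
    peak = 30893.7;
    current = 22695.9;
  };

  # Absolute saving in MB for a given field ("peak" or "current").
  savedMB = field: withoutJemalloc.${field} - withJemalloc.${field};

  # Saving relative to the run without jemalloc, in percent.
  savedPercent = field: 100.0 * savedMB field / withoutJemalloc.${field};
in
{
  peak = {
    mb = savedMB "peak";
    percent = savedPercent "peak";
  };
  current = {
    mb = savedMB "current";
    percent = savedPercent "current";
  };
}
```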