cudaPackages.buildRedist: automatically remove runpath entries for stubs#459416
cudaPackages.buildRedist: automatically remove runpath entries for stubs#459416SomeoneSerge merged 11 commits intoNixOS:masterfrom
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
daniel-fahey
left a comment
There was a problem hiding this comment.
Code looks nice running on my Mk1 eyeballs, I did notice your directory and file names don't follow Nixpkgs convention to use kebab-case? (No idea where there that is, if at all, documented. Seems to be the convention inside pkgs/build-support.)
pkgs/build-support/setup-hooks/arrayUtilities/getRunpathEntries/getRunpathEntries.bash
Outdated
Show resolved
Hide resolved
1a9ea71 to
0763bfc
Compare
This comment was marked as outdated.
This comment was marked as outdated.
Why? Does vllm suffer from the hook ordering issue, ordering stubs left of driverLink in |
It doesn't. I was shooting from the hip. vLLM doesn't suffer from the hook ordering issue. The vLLM derivation uses |
Why? All I can think of is the error I (sometimes) get when loading the stub library that I’ve loaded a stub library. Other than that, having additional RUNPATH entries means more lookups while searching for other libraries, which I don’t enjoy.
I absolutely agree that hooks should have those properties wherever possible (so nearly all of them). But that’s a very heavy lift, which is why I looked into hook re-ordering. That’s the origin of my arrayUtilities PRs (#385960) -- I saw I needed this logic for array handling in a bunch of places and wanted to have it in a single place to reduce the likelihood I messed it up. Unfortunately, not only was that PR not accepted as it was, requiring me to remove most of the functionality I needed from it, I also ran into an old bug (I tried to resolve it in #388767) which meant that I couldn’t use any of those utility functions (which are provided by setup hooks) until after phases started running. As such, I had to re-order hooks in later phases within a very early phase, which feels incredibly brittle and also depends on things like hook name not changing. And as I was doing that, I encountered another problem: sometimes phases are just strings, not arrays! So I’d need to be very, very careful with how I parsed and interacted with them. I generally just don’t have the patience for that kind of surgery when we absolutely should be using structured attributes and arrays everywhere, full stop. While I agree hooks should have these properties, there’s so much in the way of a clean, principled implementation. Can we just remove the stub entries for now and make an issue to fix everything else later? |
|
To future readers: Connor notified us on
Yeah we just need to rewrite the message 😂 Well, to ask for permission to rewrite the message
I know, I've been there: https://github.com/NixOS/nixpkgs/pull/297590/files#diff-ec85191a2f968fa96c76a95ed2755151e530eebeb9852acb6a96e07e6eee9f81R10-R36
If we do, I'd probably go with modifying |
|
Turns out the error messages aren't even defined in the stubs, but in e.g. cudart instead, so all my previous arguments are irrelevant |
pkgs/development/cuda-modules/packages/removeStubsFromRunpathHook/package.nix
Outdated
Show resolved
Hide resolved
...development/cuda-modules/packages/removeStubsFromRunpathHook/removeStubsFromRunpathHook.bash
Outdated
Show resolved
Hide resolved
0763bfc to
0896087
Compare
0896087 to
3721b48
Compare
...development/cuda-modules/packages/removeStubsFromRunpathHook/removeStubsFromRunpathHook.bash
Outdated
Show resolved
Hide resolved
e5f409d to
9d38d18
Compare
...development/cuda-modules/packages/removeStubsFromRunpathHook/removeStubsFromRunpathHook.bash
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Not a blocker: In future we probably want to replace this with a more general patchelf-structuredAttrs adapter, although patchelf is far from the only tool that suffers from this issue
…fsets Signed-off-by: Connor Baker <ConnorBaker01@gmail.com>
…d-build-inputs Signed-off-by: Connor Baker <ConnorBaker01@gmail.com>
Signed-off-by: Connor Baker <ConnorBaker01@gmail.com>
SomeoneSerge
left a comment
There was a problem hiding this comment.
Aside from the out of date comment, ready?
pkgs/build-support/setup-hooks/arrayUtilities/getRunpathEntries/getRunpathEntries.bash
Show resolved
Hide resolved
| local runpath | ||
| # Files that are not dynamically linked cause patchelf to exit with a non-zero status and print to stderr. | ||
| # If patchelf fails to print the rpath, we assume the file is not dynamically linked. | ||
| runpath="$(patchelf --print-rpath "$path" 2>/dev/null)" || return 1 |
There was a problem hiding this comment.
/me wonders if stdenv sets -o pipefail
There was a problem hiding this comment.
setup.sh sets -euo pipefail when it begins and then at the end of being sourced restores e and u. (Interestingly, it does not unset pipefail!)
But no, unless someone sets it explicitly in a phase they're not set.
…athHook Signed-off-by: Connor Baker <ConnorBaker01@gmail.com>
Signed-off-by: Connor Baker <ConnorBaker01@gmail.com>
…thHook Signed-off-by: Connor Baker <ConnorBaker01@gmail.com>
|
Good job guys! |
| + optionalString finalAttrs.includeRemoveStubsFromRunpathHook '' | ||
| nixLog "installing stub removal runpath hook" | ||
| mkdir -p "''${!outputStubs:?}/nix-support" | ||
| printWords >>"''${!outputStubs:?}/nix-support/propagated-build-inputs" \ |
There was a problem hiding this comment.
Do we really need to propagate this to runtime? Why no dev output?
There was a problem hiding this comment.
Damn github email integration broke again, I had replied to this days ago but the message doesn't appear. TLDR: stubs is a dev output
|
I really want to get rid of patchelf from my final system as that is usually a sign that somewhere something gets propagated too much. |
|
It is a dev output (just not the "dev" output)
…On November 30, 2025 2:55:36 PM UTC, Sandro ***@***.***> wrote:
@SuperSandro2000 commented on this pull request.
> @@ -332,6 +341,16 @@ extendMkDerivation {
inherit doInstallCheck;
inherit allowFHSReferences;
+ inherit includeRemoveStubsFromRunpathHook;
+
+ postFixup =
+ postFixup
+ + optionalString finalAttrs.includeRemoveStubsFromRunpathHook ''
+ nixLog "installing stub removal runpath hook"
+ mkdir -p "''${!outputStubs:?}/nix-support"
+ printWords >>"''${!outputStubs:?}/nix-support/propagated-build-inputs" \
Do we really need to propagate this to runtime? Why no dev output?
|
|
@SuperSandro2000 there are a few ways to make that happen:
|
|
The second option sounds like the best choice here |
s/do not propagate by default/only propagate them in |
Sometimes we have to link against stubs because the actual libraries are only available at runtime (like those provided by NVIDIA's drivers). Leaving the stubs in the RUNPATH is a mistake because the loader will search them and load the stubs rather than the real libraries provided at runtime if the stubs are discovered first.
This PR introduces a hook which replaces stub entries with a reference to
/run/opengl-driver/lib. It is written such that at most one reference to the driver link lib directory is added if stub entries are found; the hook does not attempt to deduplicate existing entries; it may introduce a duplicate driver link lib directory reference if a stub entry occurs before an existing driver link lib directory reference.CUDA sample which links against the stub before:
and after:
This hook is automatically enabled for all redistributables with a
stubsoutput and can be manually enabled or disabled by providingbuildRedisttheincludeRemoveStubsFromRunpathHookboolean.With these changes, I'm also able to build UCX and UCC with
Things done
passthru.tests.nixpkgs-reviewon this PR. See nixpkgs-review usage../result/bin/.Add a 👍 reaction to pull requests you find important.