sanitiseHeaderPathsHook: init #418819

Merged
K900 merged 11 commits into NixOS:staging from emilazy:push-xxrksqpuxxyx
Jul 4, 2025

@emilazy
Member

@emilazy emilazy commented Jun 21, 2025

C++ headers often use __FILE__ in error messages, causing the development outputs of libraries to leak into the runtime closure of packages using them. This hook abstracts away a pattern used in a few places throughout the tree to have headers identify themselves by a sanitized path that does not cause runtime dependencies.

This does unfortunately mean that compiler error messages will reference the sanitized path. The only alternatives I can imagine are to patch compilers to handle __FILE__ specially, or to have libraries propagate a hook that removes references. The latter would potentially need to be propagated recursively due to #include semantics and would be less precise than this.

Adding the hook to GCC (for libstdc++ headers) reduces the WebKitGTK runtime closure size from 1.45 GiB to 1.22 GiB on aarch64-linux, as measured by nix-tree.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • Nixpkgs 25.11 Release Notes (or backporting 24.11 and 25.05 Nixpkgs Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
  • NixOS 25.11 Release Notes (or backporting 24.11 and 25.05 NixOS Release notes)
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other contributing documentation in corresponding paths.

@github-actions github-actions bot added the following labels on Jun 21, 2025:

  • 10.rebuild-darwin: 501+ (This PR causes many rebuilds on Darwin and should normally target the staging branches.)
  • 10.rebuild-darwin: 5001+ (This PR causes many rebuilds on Darwin and must target the staging branches.)
  • 10.rebuild-linux: 501+ (This PR causes many rebuilds on Linux and should normally target the staging branches.)
  • 10.rebuild-linux-stdenv (This PR causes stdenv to rebuild on Linux and must target a staging branch.)
  • 10.rebuild-linux: 5001+ (This PR causes many rebuilds on Linux and must target the staging branches.)

@emilazy emilazy requested a review from a team June 22, 2025 17:49
@emilazy
Member Author

emilazy commented Jun 22, 2025

I’m open to bikeshedding on the name here, by the way. (Or the approach, although this seems like the simplest thing we could do to address the problem for now, even if the compiler error messages thing is unfortunate.)

@techknowlogick
Member

LGTM, it's great that the duplicated sanitize hook in the Meta libs can be centralized. I don't have access to a machine to build these changes at the moment, so I won't add the official GH review, but I wouldn't be opposed to having it merged sooner.

@alyssais
Member

I must be misunderstanding something here, I think. How does __FILE__ end up being expanded in outputs? Are these preprocessed headers?

@emilazy
Member Author

emilazy commented Jun 23, 2025

I think you’re just forgetting what C++ is like – the headers include tons of code. e.g., in <optional>:

      // The _M_get operations have _M_engaged as a precondition.
      constexpr _Tp&
      _M_get() noexcept
      {
        __glibcxx_assert(this->_M_is_engaged());
        return static_cast<_Dp*>(this)->_M_payload._M_get();
      }

which calls through to:

// Assert.
#ifdef _GLIBCXX_VERBOSE_ASSERT
namespace std
{
#pragma GCC visibility push(default)
  // Don't use <cassert> because this should be unaffected by NDEBUG.
  extern "C++" _GLIBCXX_NORETURN
  void
  __glibcxx_assert_fail /* Called when a precondition violation is detected. */
    (const char* __file, int __line, const char* __function,
     const char* __condition)
  _GLIBCXX_NOEXCEPT;
#pragma GCC visibility pop
}
# define _GLIBCXX_ASSERT_FAIL(_Condition)                               \
  std::__glibcxx_assert_fail(__FILE__, __LINE__, __PRETTY_FUNCTION__,   \
                             #_Condition)
#else // ! VERBOSE_ASSERT
# define _GLIBCXX_ASSERT_FAIL(_Condition) __builtin_abort()
#endif

#if defined(_GLIBCXX_ASSERTIONS)
// When _GLIBCXX_ASSERTIONS is defined we enable runtime assertion checks.
// These checks will also be done during constant evaluation.
# define __glibcxx_assert(cond)                                         \
  do {                                                                  \
    if (__builtin_expect(!bool(cond), false))                           \
      _GLIBCXX_ASSERT_FAIL(cond);                                       \
  } while (false)

which means that programs using <optional> get the path to that header file included in their binary, and therefore depend on all of GCC at runtime. The existing examples of Boost and the Facebook libraries show that this is not just a libstdc++ quirk but instead a pervasive thing in C++ libraries.
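To illustrate the mechanism described above, here is a minimal sketch that is not taken from the PR. It shows that `__FILE__` expands to a string literal compiled into the program, and that the standard `#line` directive changes what `__FILE__` reports from that point on. The `example-1.0` package name, the store path, and the function names are hypothetical, and whether the hook uses `#line` specifically is an assumption; the thread only says that headers are made to identify themselves by a sanitized path.

```cpp
#include <cassert>
#include <cstring>

// Before any #line directive, __FILE__ names the file being compiled.
// In a real header this would be the header's own /nix/store/... path,
// and the literal ends up embedded in every binary that uses this code.
inline const char* original_path() { return __FILE__; }

// Hypothetical sanitized identity. Assumption: a directive like this sits
// at the top of the header, so later uses of __FILE__ (and compiler
// diagnostics) refer to this name instead of the real path.
#line 1 "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-example-1.0/include/example.h"
inline const char* sanitised_path() { return __FILE__; }

// Small self-check standing in for a program that includes the header.
inline void check_header_paths() {
    assert(std::strcmp(original_path(), sanitised_path()) != 0);
    assert(std::strstr(sanitised_path(), "/include/example.h") != nullptr);
}
```

Because the directive also affects diagnostics, this sketch reproduces the downside mentioned in the PR description: errors inside the header would be reported against the sanitized name.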

Contributor

@ConnorBaker ConnorBaker left a comment

Looks good! Thank you for working on this :)

Member

@alyssais alyssais left a comment

Makes sense to me now, thanks.

emilazy added 7 commits July 4, 2025 12:02
emilazy added 4 commits July 4, 2025 12:02
This reduces the WebKitGTK runtime closure size from 1.45 GiB to
1.22 GiB on `aarch64-linux`, as measured by `nix-tree`.
@emilazy emilazy force-pushed the push-xxrksqpuxxyx branch from 2e39fb4 to 1c2dfb8 Compare July 4, 2025 11:02
@emilazy
Member Author

emilazy commented Jul 4, 2025

I discovered that compilers support the -f{macro,debug}-prefix-map=… arguments, which would let us do this without affecting diagnostics. However, that would require propagating compiler flags from these packages, which would be kind of awkward for CMake builds. I can pursue that if it’s desired, but otherwise it might be best to land this simpler version to help reduce the ISO closure and so on quickly, and leave fancier things that avoid patching the headers for future work. If people complain about the useless paths in error messages, the flag-based approach is probably a good idea.

@K900 K900 merged commit fa27153 into NixOS:staging Jul 4, 2025
23 of 27 checks passed
@emilazy emilazy deleted the push-xxrksqpuxxyx branch July 4, 2025 13:37
@symphorien
Member

Hi, this breaks using gdb on code inlined from templates, as the source file is mangled to /nix/store/eeee....
Would you be open to modifying the logic to make the store path uppercase? Then nixseparatedebuginfod can unmangle it when serving the source. This logic has already been applied in #279455 to a GCC patch introduced by @trofi. @trofi has also experimented with the -fmacro-prefix-map approach; maybe they have some insight to share. I also wonder why the GCC patch is not sufficient for your purpose.
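
The uppercasing idea can be sketched as follows. This is only an illustration under stated assumptions, not the logic from #279455: it assumes that just the 32-character hash component after `/nix/store/` is uppercased, and that real hashes contain only digits and lowercase letters, so lowercasing recovers the original path. The function names and the example hash are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

namespace {
const std::string kStorePrefix = "/nix/store/";
const std::size_t kHashLen = 32;

// Apply `f` to each character of the hash component if `path` looks like
// a store path; otherwise return the path unchanged.
template <typename F>
std::string map_hash(std::string path, F f) {
    if (path.compare(0, kStorePrefix.size(), kStorePrefix) != 0 ||
        path.size() < kStorePrefix.size() + kHashLen) {
        return path;
    }
    for (std::size_t i = 0; i < kHashLen; ++i) {
        char& c = path[kStorePrefix.size() + i];
        c = f(c);
    }
    return path;
}
}  // namespace

// Mangle: the result is no longer a valid reference to the real store path.
std::string mangle_store_path(const std::string& path) {
    return map_hash(path, [](char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    });
}

// Unmangle: what a debuginfo server could do before looking up the source.
std::string unmangle_store_path(const std::string& path) {
    return map_hash(path, [](char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    });
}

// Round trip on a hypothetical store path.
inline void check_round_trip() {
    const std::string real =
        "/nix/store/0123456789abcdfghijklmnpqrsvwxyz-gcc-13.3.0/include/c++/13.3.0/optional";
    assert(mangle_store_path(real) != real);
    assert(unmangle_store_path(mangle_store_path(real)) == real);
}
```

Unlike replacing the hash with a run of `e` characters, this transformation is reversible, which is what makes unmangling possible.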

@symphorien
Member

steps to reproduce the issue:

nix-build -E 'with import ./. {}; ((ninja.override { buildDocs = false; }).overrideAttrs {separateDebugInfo = true; }).debug' && dwarfdump result-debug/lib/debug/ninja  | grep -o '/nix/store[^ "]*' | sort -u

prints /nix/store/eeeeee paths instead of real paths

@emilazy
Member Author

emilazy commented Jul 12, 2025

@trofi’s patch (which I hadn’t seen before) does seem like it should work. But it doesn’t help for Darwin, or apparently for WebKitGTK – both of which use LLVM. So I guess we’d need to patch Clang too, or else find a cleaner way of doing this.

@symphorien
Member

I am going to port the patch to clang.

@emilazy
Member Author

emilazy commented Jul 13, 2025

Could we potentially use the compiler flags instead? I think we could just inject them in stdenv at the same time we do the -I, right? It would be nice to avoid non‐upstreamable compiler patches where possible. I’m not quite sure why that would be much more likely to run into the referenced GCC bug; maybe @trofi was trying to pass them for every single build input rather than only ones we already inject compiler flags for?

It would break for #includes with absolute paths, though, which is unfortunate. But those are probably rare?

Edit: Hmm, the GCC bug report shows paths to individual header files being listed. Surely since it’s a prefix map we only need to list top‐level store paths? But then later discussion in the issue shows that it might just be slightly too much in general and push us over the limit, blah. In any case I feel like we ought to be able to use the flags for Clang at least, since it probably doesn’t have this limitation.
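
As a rough model of the point about prefixes, the sketch below applies `old=new` substitutions to paths. It is a simplification written for this discussion, not the compilers' implementation: it picks the first matching entry, and the store paths and names are hypothetical. It shows that a single entry for a top-level store path covers every header beneath it.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Each entry models one `old=new` argument of a prefix-map flag.
using PrefixMap = std::vector<std::pair<std::string, std::string>>;

// Simplified model: replace the first matching `old` prefix with `new`.
std::string apply_prefix_map(const PrefixMap& map, const std::string& path) {
    for (const auto& entry : map) {
        const std::string& from = entry.first;
        if (path.compare(0, from.size(), from) == 0) {
            return entry.second + path.substr(from.size());
        }
    }
    return path;  // No entry matched; the path is left as it was.
}

// One entry per store path is enough for all headers inside it.
inline void check_prefix_map() {
    const std::string real = "/nix/store/0123456789abcdfghijklmnpqrsvwxyz-gcc-13.3.0";
    const std::string fake = "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-13.3.0";
    const PrefixMap map = {{real, fake}};
    assert(apply_prefix_map(map, real + "/include/c++/13.3.0/optional") ==
           fake + "/include/c++/13.3.0/optional");
    assert(apply_prefix_map(map, real + "/include/c++/13.3.0/bits/c++config.h") ==
           fake + "/include/c++/13.3.0/bits/c++config.h");
}
```

The number of entries therefore grows with the number of store paths passed to the compiler rather than with the number of headers, which is the quantity that matters for any limit on argument length.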

@symphorien
Member

I'm not sure we lower the maintenance burden by maintaining a patch used only for gcc and cc-wrapper code used only for clang.

@emilazy
Member Author

emilazy commented Jul 13, 2025

Once the GCC bug is fixed, we can use the same code for both. It would just be a couple additional lines in ccWrapper_addCVars. I prefer that to patching Clang.

Edit: Though I’m only thinking about the eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee version here; if we want to do the uppercasing thing then that’s another matter. But is that important, for macros? I think it’s okay for debug information to pull in the dependencies, right, as long as we can scope this down to macros?

@symphorien
Member

Here is the patch #424844
Could you check whether it solves the original issue that prompted you to introduce sanitiseHeaderPathsHook?

@danieldk
Contributor

danieldk commented Jul 26, 2025

This change has been causing issues in our project which uses CMake/Ninja. For some reason it results in CMake + ninja/make thinking objects are not up-to-date anymore, causing them to be built again in the installPhase. Since we use our project to build kernels like flash-attention which can take 1-2 hours, with this change they take 2-4 hours.

I have found it quite hard to debug. Ninja logging does not give much information, but make --debug=b gives some strange output that eventually led me to this PR. The output is pretty long, so see the gist below. But it consists of lines like this for (seemingly) each C++ standard library header file:

   Must remake target '/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-13.3.0/include/c++/13.3.0/algorithm'.
   Successfully remade target file '/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-13.3.0/include/c++/13.3.0/algorithm'.
[...]
   Prerequisite '/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-13.3.0/include/c++/13.3.0/algorithm' of target 'CMakeFiles/_relu_acf2ddc_dirty.dir/relu_cuda/relu.cu.o' does not exist.

Full make output: https://gist.github.com/danieldk/8b445776745bf44721f3de8c28b6cbaf

If I undo this patch: danieldk@b8c1118 , the rebuilds in installPhase are gone.

I will try to see if I can make a more minimal example or if I can find a package in nixpkgs that has similar rebuilds in installPhase.

My current theory is that some CMake script uses the C preprocessor to find the header dependencies of a file and this change trips it up, making it use the fake paths rather than the real header paths.
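
The theory above can be modelled with a small sketch. It is a deliberately simplified stand-in for how a build tool decides whether to rebuild (real tools also compare timestamps), and the function names are hypothetical. The point is that a recorded prerequisite that does not exist on disk, such as a header listed under its sanitized path, makes the object look out of date on every run.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// True if the file can be opened for reading.
bool file_exists(const std::string& path) {
    std::ifstream file(path);
    return file.is_open();
}

// Simplified rule: an object must be rebuilt if any recorded prerequisite
// is missing. Timestamp comparison is omitted on purpose.
bool must_rebuild(const std::vector<std::string>& prerequisites) {
    for (const auto& prerequisite : prerequisites) {
        if (!file_exists(prerequisite)) {
            return true;
        }
    }
    return false;
}

// The sanitized header path from the make output never exists on disk,
// so the object that lists it is rebuilt every time.
inline void check_missing_prerequisite() {
    const std::vector<std::string> recorded = {
        "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-13.3.0/include/c++/13.3.0/algorithm"};
    assert(must_rebuild(recorded));
}
```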

@danieldk
Contributor

First example I found in nixpkgs:

$ nix-build -E "with import ./. { config = { cudaSupport = true; allowUnfree = true; }; }; magma-cuda-static"
[...]
Running phase: buildPhase
[...]
[347/3492] Building CUDA object CMakeFiles/magma.dir/magmablas/zgemv_conj.cu.o
[...]
buildPhase completed in 21 minutes 55 seconds
Running phase: installPhase
[...]
[1/1324] Building CUDA object CMakeFiles/magma.dir/magmablas/zgemv_conj.cu.o
[...]
installPhase completed in 22 minutes 44 seconds

Full log: https://gist.github.com/danieldk/97eceeb9568968b0af290c136d5b78e6

nccl, which is built as a dependency of magma, also does not seem to be happy about it (though it seems more harmless in the nccl case), outputting a lot of:

Compiling       src/device/common.cu
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/bits/memoryfwd.h: No such file or directory
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/x86_64-unknown-linux-gnu/bits/gthr-default.h: No such file or directory
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/iosfwd: No such file or directory
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/bits/postypes.h: No such file or directory
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/bits/stringfwd.h: No such file or directory
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/cwchar: No such file or directory
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/chrono: No such file or directory
realpath: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-14.3.0/include/c++/14.3.0/iterator: No such file or directory

@danieldk
Contributor

I created an issue: #428546 , since it's more discoverable than putting comments in a merged PR. Sorry for the noise here!

7 participants