Skip to content

Conversation

@ghost
Copy link

@ghost ghost commented Jul 17, 2022

Description of changes

This is a more-focused version of the same change in #168983, which I am unable to reopen. I closed that PR too hastily; unfortunately it is still needed on powerpc64le.

Stage3 of the stdenv bootstrap attempts to link libraries compiled for static linkage (i.e. *.a) into a dynamically-linked executable. On powerpc64le this is not supported if the library compiled for static linkage was built using -fstack-protector, because that option requires -lssp (edit: or equivalent) for dynamically-linked executables but uses a different library (-lssp_nonshared) for statically-linked executables.

As a workaround, we simply disable -fstack-protector in the statically-linked lib{gmp,mpc,mpfr,isl}-stage3 derivations. These are not visible from outside of stdenv.

Things done
  • Built on platform(s)
    • powerpc64le-linux
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@github-actions github-actions bot added the 6.topic: stdenv Standard environment label Jul 17, 2022
@ghost ghost marked this pull request as ready for review July 17, 2022 04:39
Stage3 of the stdenv bootstrap attempts to link libraries compiled for
static linkage (i.e. `*.a`) into a dynamically-linked executable.  On
powerpc64le this is not supported if the library compiled for static
linkage was built using `-fstack-protector`, because that option
requires `-lssp` for dynamically-linked executables but uses a
different library (`-lssp_nonshared`) for statically-linked
executables.

As a workaround, we simply disable `-fstack-protector` in the
statically-linked `lib{gmp,mpc,mpfr,isl}-stage3` derivations.  These
are not visible from outside of `stdenv`.
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. labels Jul 17, 2022
@trofi
Copy link
Contributor

trofi commented Jul 17, 2022

Do you know why -lssp is needed for your target? Normally -lssp is not needed as libc is supposed to provide __stack_check_* helpers: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/configure.ac;h=446747311a6aec3c810ad6aa4190f7bd383b94f7;hb=HEAD#l6734.

I think gcc's libssp is considered to be a problematic implementation. The only target I know of that have to use it is mingw.

@ghost
Copy link
Author

ghost commented Jul 17, 2022

Do you know why -lssp is needed for your target? Normally -lssp is not needed as libc is supposed to provide __stack_check_* helpers: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/configure.ac;h=446747311a6aec3c810ad6aa4190f7bd383b94f7;hb=HEAD#l6734.

It isn't libssp.so that is missing, it's libssp_nonshared.a. It's needed because stdenv's Frankensteined build of gcc compiles the four gcc dependencies using --disable-shared --enable-static and then links the resulting built-for-static-linkage .a libraries into a dynamically linked executable.

Nothing else in nixpkgs needs this workaround because nothing else in nixpkgs tries to do that. I'm not convinced that we should even expect it to work. I can currently build my entire workstation environment out of nixpkgs on powerpc64le except for the web browsers (qutebrowser, firefox, ungoogled-chromium) and Rust packages that use ring -- only stdenv's gcc experiences this problem. nix build -f . -L gcc10gcc11 works fine without having to disable any hardenings.

I'd love to embark on a de-Frankensteinification of the way we build stdenv's gcc, but getting nontrivial changes to the bootstrap stages merged takes multiple months. I've come to understand that I have a limited budget of attention from people who are confident enough to merge those kinds of things, and I have to be careful how I spend that budget. Disabling a hardening flag, on one platform only, is low-risk for reviewers.

Here's the link failure if you're curious: #168983 (comment)

@trofi
Copy link
Contributor

trofi commented Jul 17, 2022

-lssp_nonshared is a part of gcc's libssp as well. I would expect it not to be needed either.

@ghost
Copy link
Author

ghost commented Jul 17, 2022

-lssp_nonshared is a part of gcc's libssp as well. I would expect it not to be needed either.

Then what do you think is the cause of the link failure?

@trofi
Copy link
Contributor

trofi commented Jul 17, 2022

-lssp_nonshared is a part of gcc's libssp as well. I would expect it not to be needed either.

Then what do you think is the cause of the link failure?

/nix/store/3icagz57flqg7z2dpihj4gadb8ng2xdv-binutils-2.35.2/bin/ld: /nix/store/9irfh44zfaadfv77mn306wnkbglf914s-mpfr-4.1.0/lib/libmpfr.a(mpfr-gmp.o):(.toc+0x8): undefined reference to `__stack_chk_guard'

I think it's a sign of gcc used to build mpfr was incorrectly configured to rely on libssp instead of using gcc's codegen and libc's helper. Do you have a full build.log of that gcc by chance? I'm specifically interested in AC_CACHE_CHECK(__stack_chk_fail in target C library, check report of gcc's config.log.

I suspect it's only a matter of building gcc with gcc_cv_libc_provides_ssp=yes (explicit --disable-libssp flag). My guess is that gcc was configured without libc headers presence (explicit --disable-libssp flag is needed). Not sure if it's bootstrap-tools gcc or an early gcc.

Does current nixpkgs master fail as is on bootstrap? I'm currently building nix build -f. lv --arg localSystem '{ config = "powerpc64le-linux"; }' to have something to debug locally, but it will take a few hours if think don't break earlier.

@trofi
Copy link
Contributor

trofi commented Jul 17, 2022

If I'm looking at the nix-store --query --graph $(nix-instantiate -A stdenv --arg localSystem '{ config = "powerpc64le-linux"; }') output correctly. mpfr/mpc gets rebuild in stage3 by stage3-stdenv. stage3-stdenv uses stage2's glibc (built), stage1's binutils (built) and bootstra-tools' gcc (not yet built).

I think it means bootstrap-tools's gcc needs a fix. Do you know how it was built? Cross-compiled or built natively?

@trofi
Copy link
Contributor

trofi commented Jul 17, 2022

Meanwhile reproduced the same build failure:

nix build -f. lv --arg localSystem '{ config = "powerpc64le-linux"; }'
error: builder for '/nix/store/52m90fywbvk0cnxkzjn4h923cx18iwcl-gcc-10.4.0.drv' failed with exit code 2;
       last 10 log lines:
       > /nix/store/4ni1ykaajr6shpvsakkjlslw0aabsl9p-binutils-2.38/bin/ld: /nix/store/2xypdl9r3ily73ksl6dgqmg0ymb5z313-mpfr-stage3-4.1.0/lib/libmpfr.a(mul.o):(.toc+0x0): more undefined references to `__stack_chk_guard' follow
       > collect2: error: ld returned 1 exit status
       > make[3]: *** [../../gcc-10.4.0/gcc/c/Make-lang.in:85: cc1] Error 1
       > rm gcc.pod
       > make[3]: Leaving directory '/build/build/gcc'
       > make[2]: *** [Makefile:4794: all-stage2-gcc] Error 2
       > make[2]: Leaving directory '/build/build'
       > make[1]: *** [Makefile:22336: stage2-bubble] Error 2
       > make[1]: Leaving directory '/build/build'
       > make: *** [Makefile:22540: bootstrap] Error 2
       For full logs, run 'nix log /nix/store/52m90fywbvk0cnxkzjn4h923cx18iwcl-gcc-10.4.0.drv'.

And indeed it's a bootstrap-tools' gcc:

$ nix develop -i /nix/store/52m90fywbvk0cnxkzjn4h923cx18iwcl-gcc-10.4.0.drv
bash-5.1$ dev>gcc -v
Using built-in specs.
COLLECT_GCC=/nix/store/7wsy2r3f4ffd123gy320d20ny0zjdjjj-bootstrap-tools/bin/gcc
COLLECT_LTO_WRAPPER=/nix/store/7wsy2r3f4ffd123gy320d20ny0zjdjjj-bootstrap-tools/bin/../libexec/gcc/powerpc64le-unknown-linux-gnu/10.3.0/lto-wrapper
Target: powerpc64le-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.3.0 (GCC)

We can poke at it directly:

bash-5.1$ dev>printf "int main(){}" | gcc -S -x c - -o - -fstack-protector-all | fgrep stack
        .quad   __stack_chk_guard
        bl __stack_chk_fail
        .section        .note.GNU-stack,"",@progbits

The problem here is presence of __stack_chk_guard in generated code. That should be a good test for early toolchain test. Here is how libc/kernel based references look like (x86_64):

$ printf "int main(){}" | gcc -S -x c - -o - -fstack-protector-all | fgrep stack
        call    __stack_chk_fail
        .section        .note.GNU-stack,"",@progbits

I'll check what cross-build yields first: nix build -f ./pkgs/stdenv/linux/make-bootstrap-tools-cross.nix powerpc64le.bootstrapFiles

@trofi
Copy link
Contributor

trofi commented Jul 17, 2022

Confirmed it's an instance of #113977 :

$ nix log -f pkgs/stdenv/linux/make-bootstrap-tools-cross.nix powerpc64le.bootGCC | fgrep __stack_chk_fail
checking __stack_chk_fail in target C library... no
checking __stack_chk_fail in target C library... no

Looking at why that check fails.

@trofi
Copy link
Contributor

trofi commented Jul 17, 2022

A side-effect of missing glibc version detection:

`gcc-debug-powerpc64le-unknown-linux-gnu> checking for target glibc version... 0.0

It's supposed to be a simple features.h grep (header is present):
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/configure.ac;h=446747311a6aec3c810ad6aa4190f7bd383b94f7;hb=HEAD#l5773

But something fails. I'll keep digging.

@ghost
Copy link
Author

ghost commented Jul 17, 2022

Thanks for digging into this; please don't be put off by my short reply below -- I have to leave the office in a few minutes so I'm going to reply to the low-hanging fruit and write a more detailed response later.

I think it's a sign of gcc used to build mpfr was incorrectly configured to rely on libssp instead of using gcc's codegen and libc's helper.

That would be the bootstrap-tools compiler. Here is the Hydra build that created it. You can replicate that build with

git switch --detach 49a83445c28c4ffb8a1a90a1f68e6150ea48893b
nix build -f . -L pkgsCross.powernv.stdenvBootstrapTools

The special libmpfr (with pname="libmpfr-stage3" to distinguish it from "normal libmpfr" ever since 0263018) which supplies libmpfr.a is compiled by the gcc which is in the bootstrap-tools. That libmpfr.a then gets linked into the final stdenv.cc.

Does current nixpkgs master fail as is on bootstrap?

Correct. With current nixpkgs master on a powerpc64le machine, nix build -f . -L stdenv will fail with the link error mentioned earlier. It sounds like you have access to powerpc64le hardware; if not, let me know where I can find a plausibly-valid SSH key of yours and I'll give you a login on one.

I think it means bootstrap-tools's gcc needs a fix. Do you know how it was built? Cross-compiled or built natively?

The bootstrap-files gcc is definitely cross-compiled for all non-{x86_64,arm64} hostPlatforms.

The problem here is presence of __stack_chk_guard in generated code.

I've noticed that relocs for __stack_chk_guard are also present in the "special" libmpfr-stage3 package on aarch64 (but not on x86_64). Yet for some reason there is no problem there. As I understand it the exact mechanism used to implement this feature is highly platform-specific. But I don't really know a whole lot about the gory details of -fstack-protector.

Confirmed it's an instance of #113977 :

Hey, that's a great find! But FWIW the PR that closes that bug, #141448, does exactly the same thing that this PR does -- disable -fstack-protector on powerpc64le. :)

A side-effect of missing glibc version detection:
`gcc-debug-powerpc64le-unknown-linux-gnu> checking for target glibc version... 0.0

Interesting. I'll look into this, and #113977, in more detail either tonight or tomorrow. Nice find!

@trofi
Copy link
Contributor

trofi commented Jul 18, 2022

Adding a bit of debugging I think this is the path gcc assumes for target libc (we don't have that path):

GLIBC HEADER looking at: /nix/store/jn7l2accg01ms5inxyzwdgxn6vdp9gd1-gcc-debug-powerpc64le-unknown-linux-gnu-10.4.0/powerpc64le-unknown-linux-gnu/sys-include/features.h
GLIBC HEADER looking at: NO FILE

It comes from https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/configure.ac;h=446747311a6aec3c810ad6aa4190f7bd383b94f7;hb=HEAD#l2458

   elif test "x$with_sysroot" = x; then
     target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-include"

Looking at other cases:

   if test "x$with_build_sysroot" != "x"; then
     target_header_dir="${with_build_sysroot}${native_system_header_dir}"
   elif test "x$with_sysroot" = x; then
     target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-include"
   elif test "x$with_sysroot" = xyes; then
     target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-root${native_system_header_dir}"
   else
     target_header_dir="${with_sysroot}${native_system_header_dir}"
   fi

I think the best is to plug into last sysroot branch. Testing the following change locally:

--- a/pkgs/development/compilers/gcc/common/configure-flags.nix
+++ b/pkgs/development/compilers/gcc/common/configure-flags.nix
@@ -111,9 +111,15 @@ let
       "--with-mpc=${libmpc}"
     ]
     ++ lib.optional (libelf != null) "--with-libelf=${libelf}"
-    ++ lib.optional (!(crossMingw && crossStageStatic))
+    ++ lib.optionals (!(crossMingw && crossStageStatic)) [
+      # "--with-sysroot=/": trick gcc into looking up target system headers in
+      # --with-native-system-header-dir dir:
+      #     https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/configure.ac;h=446747311a6aec3c810ad6aa4190f7bd383b94f7;hb=HEAD#l2461
+      # Otherwise gcc tries to look it up in various prefixed
+      # targets relative to exec_prefix.
+      "--with-sysroot=/"
       "--with-native-system-header-dir=${lib.getDev stdenv.cc.libc}/include"
-
+    ]
     # Basic configuration
     ++ [
       # Force target prefix. The behavior if `--target` and `--host`

It also matches description of with-native-system-header-dir from gcc's install doc https://gcc.gnu.org/install/configure.html:

--with-native-system-header-dir=dirname

    Specifies that dirname is the directory that contains native system header files, rather than /usr/include.
    This option is most useful if you are creating a compiler that should be isolated from the system as much
    as possible. It is most commonly used with the --with-sysroot option and will cause GCC to search dirname
    inside the system root specified by that option.

@trofi
Copy link
Contributor

trofi commented Jul 18, 2022

Testing the following change locally:

Breaks. nixpkgs frequently passes host headers (instead of target's) there when building cross-compilers (which makes sense, gcc assumes both paths the same with a sysroot offset).

@trofi
Copy link
Contributor

trofi commented Jul 18, 2022

Looking at pkgs/development/compilers/gcc/builder.sh we even have a hack to patch other headers location. It's out of date:

    if test -f "$NIX_CC/nix-support/orig-libc"; then
        # Patch the configure script so it finds glibc headers.  It's
        # important for example in order not to get libssp built,
        # because its functionality is in glibc already.
        sed -i \
            -e "s,glibc_header_dir=/usr/include,glibc_header_dir=$libc_dev/include", \
            gcc/configure
    fi

glibc_header_dir was renamed to target_header_dir https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=6961669f48aa18168b2d7daa7e2235fbec7cb636

@trofi
Copy link
Contributor

trofi commented Jul 18, 2022

The following hack allows building working powerpc64le.bootstrapFiles:

--- a/pkgs/development/compilers/gcc/common/configure-flags.nix
+++ b/pkgs/development/compilers/gcc/common/configure-flags.nix
@@ -113,6 +113,8 @@ let
     ++ lib.optional (libelf != null) "--with-libelf=${libelf}"
     ++ lib.optional (!(crossMingw && crossStageStatic))
       "--with-native-system-header-dir=${lib.getDev stdenv.cc.libc}/include"
+    ++ lib.optional (!(crossMingw && crossStageStatic) && buildPlatform != hostPlatform)
+      "--with-sysroot=/"

     # Basic configuration
     ++ [

I suspect it might be a bit incorrect for some cases. I'll spend some time to understand the difference in nixpkgs's assumptions around use of "native" headers for 3 cases: noLibc (bare metal, pre-libc), build == host != target (cross-compiler), build != host == target (cross-build a compiler). And will check if plain cross-compilers work as expected on nixpkgs.

@trofi
Copy link
Contributor

trofi commented Jul 18, 2022

Proposed the possible fix as #181943

trofi added a commit to trofi/nixpkgs that referenced this pull request Jul 18, 2022
When reviewing NixOS#181802 (comment)
I noticed outdated code that attempted to override /usr/include.

    sed -i \
        -e "s,glibc_header_dir=/usr/include,glibc_header_dir=$libc_dev/include", \
        gcc/configure

`glibc_header_dir` was removed from `gcc-4.6` and later in
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=6961669f48aa18168b2d7daa7e2235fbec7cb636
(Dec 2010, "(gcc_cv_ld_eh_frame_hdr): Only check GNU ld for  --eh-frame-hdr.").

Since then gcc got `--with-native-system-header-dir=` which `nixpkgs` uses
for all packaged `gcc` versions.

The change should be a no-op.
@trofi
Copy link
Contributor

trofi commented Jul 18, 2022

On top of it proposed a cleanup to remove outdated sed: #181994

@trofi
Copy link
Contributor

trofi commented Jul 18, 2022

And proposed another drive-by cleanup around configureFlags: #181999

@ghost
Copy link
Author

ghost commented Jul 19, 2022

(Edit: I'm going to repost this comment in #181943)

@ghost ghost mentioned this pull request Jul 19, 2022
13 tasks
@ghost
Copy link
Author

ghost commented Jul 19, 2022

Closed in favor of #181943

@ghost ghost closed this Jul 19, 2022
Artturin pushed a commit to Artturin/nixpkgs that referenced this pull request Jul 27, 2022
When reviewing NixOS#181802 (comment)
I noticed outdated code that attempted to override /usr/include.

    sed -i \
        -e "s,glibc_header_dir=/usr/include,glibc_header_dir=$libc_dev/include", \
        gcc/configure

`glibc_header_dir` was removed from `gcc-4.6` and later in
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=6961669f48aa18168b2d7daa7e2235fbec7cb636
(Dec 2010, "(gcc_cv_ld_eh_frame_hdr): Only check GNU ld for  --eh-frame-hdr.").

Since then gcc got `--with-native-system-header-dir=` which `nixpkgs` uses
for all packaged `gcc` versions.

The change should be a no-op.
@ghost ghost mentioned this pull request Jul 28, 2022
5 tasks
@ghost ghost deleted the power64-no-stackprotector-in-stdenv-early-stages branch January 23, 2024 06:46
ConnorBaker pushed a commit to nixos-cuda/cuda-legacy that referenced this pull request Apr 14, 2025
When reviewing NixOS/nixpkgs#181802 (comment)
I noticed outdated code that attempted to override /usr/include.

    sed -i \
        -e "s,glibc_header_dir=/usr/include,glibc_header_dir=$libc_dev/include", \
        gcc/configure

`glibc_header_dir` was removed from `gcc-4.6` and later in
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=6961669f48aa18168b2d7daa7e2235fbec7cb636
(Dec 2010, "(gcc_cv_ld_eh_frame_hdr): Only check GNU ld for  --eh-frame-hdr.").

Since then gcc got `--with-native-system-header-dir=` which `nixpkgs` uses
for all packaged `gcc` versions.

The change should be a no-op.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: stdenv Standard environment 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant