Skip to content

fix: script and dockerfile for centos#154

Merged
phlax merged 11 commits intoenvoyproxy:mainfrom
bartsmykla:fix-centos-buids
Feb 16, 2022
Merged

fix: script and dockerfile for centos#154
phlax merged 11 commits intoenvoyproxy:mainfrom
bartsmykla:fix-centos-buids

Conversation

@bartsmykla
Copy link
Copy Markdown
Contributor

@bartsmykla bartsmykla commented Oct 5, 2021

ref: envoyproxy/envoy#18365

@bartsmykla bartsmykla requested a review from a team as a code owner October 5, 2021 11:28
- gn has to be compiled to satisfy centos glibc requirements and
  to mitigate problems with the version of envoy's wasm/v8
  dependency (ref: envoyproxy/envoy#15119)
- prependend PATH env var in centos' dockerfile by /usr/local/bin
  which was missing before and which is path where bazel is present
  (ref: envoyproxy/envoy/#12535)
- added installation of glibc-static and libstdc++-static

ref: envoyproxy/envoy#18365

Signed-off-by: Bart Smykla <bartek@smykla.com>
@bartsmykla
Copy link
Copy Markdown
Contributor Author

apologies for force-push, but I just removed the whitespace changes introduced by my IDE

as it expects to have llvm+clang installed to compile

Signed-off-by: Bart Smykla <bartek@smykla.com>
@bartsmykla
Copy link
Copy Markdown
Contributor Author

It looks like building ubuntu container failed but I haven't touched the code related this so I assume it's flake or it may be related with the expired let's encrypt ca root cert (It have bitten me yesterday)

Signed-off-by: Bart Smykla <bartek@smykla.com>
it's actually unnecessary as it's already included and the problem
was not with the order, but with PATH not being preserved, when
run `run_envoy_docker.sh` script from envoyproxy/envoy/ci

Signed-off-by: Bart Smykla <bartek@smykla.com>
As the PATH is not preserved from environment in some of the CI
scripts, we have to use it like in ubuntu scripts, which means
to just move the PATH wrom dockerfile to the script

Signed-off-by: Bart Smykla <bartek@smykla.com>

set -e

export PATH="/opt/rh/rh-git218/root/usr/bin:/opt/rh/devtoolset-7/root/usr/bin:/opt/llvm/bin:/usr/local/sbin:/usr/local/bin:${PATH}"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k, so i think i see the first problem - centos doesnt include /usr/local/... in its PATH

im wondering if its worth adding that in /etc/profile.d

Copy link
Copy Markdown
Member

@phlax phlax Oct 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure adding in build_container_centos.sh would resolve the problem as this script is only run when the image is created, not when the container is run

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's why there is /usr/local/sbin:/usr/local/bin in this line

we can add it to /etc/profile.d I think

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

give me a second

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the change and am testing it right now locally too

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, one moment as it's still building

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ive created a ~throwaway acc that we can use for further test runs

are you on envoy slack ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phlax yes also bartsmykla, sorry for the delay, but had problems with my docker and it's still building...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so the /etc/profile.d plan didnt quite work, im guessing its because the user is created with --no-create-home and /etc/profile doesnt get sourced - not sure exactly - will dig further tomorrow

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a similar issue on a build server, with RVM. The solution, because jenkins runs as a service, and not a login shell, was to add the call to ". /etc/profile.d/rvm.sh" ("." is an alias of "source") to the user's ".bash_profile". A similar fix might work here.

This reverts commit 46d091c.

Signed-off-by: Bart Smykla <bartek@smykla.com>
Signed-off-by: Bart Smykla <bartek@smykla.com>
As when building container we also need updated PATH, I'm sourcing
/etc/profile.d/profile_centos.sh at the beginning of the script

Signed-off-by: Bart Smykla <bartek@smykla.com>
Signed-off-by: Bart Smykla <bartek@smykla.com>
Signed-off-by: Bart Smykla <bartek@smykla.com>
@codefromthecrypt
Copy link
Copy Markdown

It might help to re-title/description this PR about CentOS 7 because I think the issue is about that (vs CentOS 8), right?

@agrahamlincoln
Copy link
Copy Markdown

FWIW I rebased this branch to main and was able to build binaries for envoy 1.18.4 with an image built with these changes.

Thanks for the fix!

@robejfarr
Copy link
Copy Markdown
Contributor

Many thanks for the work on this PR, we appreciate it - @acarlson0000 and myself have been able to take it, and have successfully built up to version 1.20.1 of envoy from source, with a few caveats, one being that we’re no way near experts in C and C++.

We have been able to build it with clang and libc++ - bazel build --config=libc++ -c fastbuild //source/exe:envoy-static (opt also builds successfully).

When attempting to build with clang and libstdc++, we encountered compilation problems from at least v1.19.0 onwards, but we believe that libc++ is the preferred compilation option, so this is not an actual issue for us.

However, when moving to v1.21.0 with libc++, we are unable to complete the build due to incompatible linux/tcp.h headers included in centos 7 (as this is missing some header functions)

For info: Centos 7 uses the following header libraries (which provides the tcp.h above):
glibc-headers-2.17-317.el7.x86_64.rpm
kernel-headers-3.10.0-1160.el7.x86_64.rpm

ERROR: /envoy/source/extensions/transport_sockets/tcp_stats/BUILD:12:17: Compiling source/extensions/transport_sockets/tcp_stats/tcp_stats.cc failed: (Exit 1): clang failed: error executing command /opt/llvm/bin/clang -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++0x' -MD -MF ... (remaining 119 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
source/extensions/transport_sockets/tcp_stats/tcp_stats.cc:119:18: error: no member named 'tcpi_data_segs_out' in 'tcp_info'; did you mean 'tcpi_segs_out'?
  if ((tcp_info->tcpi_data_segs_out > last_cx_tx_data_segments_) &&
                 ^~~~~~~~~~~~~~~~~~
                 tcpi_segs_out
/usr/include/linux/tcp.h:204:8: note: 'tcpi_segs_out' declared here
        __u32   tcpi_segs_out;       /* RFC4898 tcpEStatsPerfSegsOut */
                ^
source/extensions/transport_sockets/tcp_stats/tcp_stats.cc:126:51: error: no member named 'tcpi_data_segs_out' in 'tcp_info'; did you mean 'tcpi_segs_out'?
    const uint32_t data_segs_out_diff = tcp_info->tcpi_data_segs_out - last_cx_tx_data_segments_;
                                                  ^~~~~~~~~~~~~~~~~~
                                                  tcpi_segs_out
/usr/include/linux/tcp.h:204:8: note: 'tcpi_segs_out' declared here
        __u32   tcpi_segs_out;       /* RFC4898 tcpEStatsPerfSegsOut */
                ^
source/extensions/transport_sockets/tcp_stats/tcp_stats.cc:139:28: error: no member named 'tcpi_data_segs_out' in 'tcp_info'; did you mean 'tcpi_segs_out'?
                 tcp_info->tcpi_data_segs_out);
                           ^~~~~~~~~~~~~~~~~~
                           tcpi_segs_out
/usr/include/linux/tcp.h:204:8: note: 'tcpi_segs_out' declared here
        __u32   tcpi_segs_out;       /* RFC4898 tcpEStatsPerfSegsOut */
                ^
source/extensions/transport_sockets/tcp_stats/tcp_stats.cc:141:28: error: no member named 'tcpi_data_segs_in' in 'tcp_info'; did you mean 'tcpi_segs_in'?
                 tcp_info->tcpi_data_segs_in);
                           ^~~~~~~~~~~~~~~~~
                           tcpi_segs_in
/usr/include/linux/tcp.h:205:8: note: 'tcpi_segs_in' declared here
        __u32   tcpi_segs_in;        /* RFC4898 tcpEStatsPerfSegsIn */
                ^
source/extensions/transport_sockets/tcp_stats/tcp_stats.cc:146:26: error: no member named 'tcpi_notsent_bytes' in 'tcp_info'
               tcp_info->tcpi_notsent_bytes);
               ~~~~~~~~  ^

We have seen that there was a separate pull request in the main envoy project to use a relative import of the header, rather than explicitly from /usr/include/linux/tcp.h - see: envoyproxy/envoy#19771 - however, we are not sure how to proceed here (are we able to replace / tell the compiler to use a later version of these headers? What issues will we run into when compiling / attempting to run the binary on an OS which does not have these included?). Version 1.21 builds fine if we comment out and do not build the tcp_stats extension.

Can you let us know what the status of this PR is, whether it’s going to be merged and should we raise a separate issue for compiling v1.21 onwards?

@phlax
Copy link
Copy Markdown
Member

phlax commented Feb 10, 2022

hi @robejfarr using the centos build image is not really supported or maintained, and is not tested in CI

i had a look at this some time ago with @bartsmykla - it may be that this PR resolves some issues but afaict there is a lot that does not work with this build image, so im reluctant to put too much time into fixing it, or maintaining it moving forward.

Im also not convinced that there is a need for a redhat build image at all - i dont think its needed for building redhat binaries or docker images etc. I believe at least, that the main build image should be usable to build for any linux distro - lib versions aside.

@robejfarr
Copy link
Copy Markdown
Contributor

Hi @phlax, thanks for the quick response.

Think however this PR progresses and is then maintained it is much better to have the build information available with the relevant health warnings in the Envoy repositories so people can find the latest detail there, rather than searching the Internet. They could also see the plan for supported operating systems and understand how it is/is not being supported going forward. Perhaps you could then have core builds of Envoy and best endeavors from the community?

Unless there is a fundamental problem and Envoy cannot run on CentOS 7 (can someone explain the detail if it cannot) we’re keen to keep this working until the end of its maintenance window in 2024 (or a bit earlier) and enable Envoy to assist with any OS migrations, rather than it being a complicated blocker for companies that run CentOS 7 and would like to use Envoy.

Have also seen @howardjohn envoyproxy/envoy#19535, which indicates it might be able to resolve some of these build problems and standardise; was that what you were thinking about when you mentioned the above or something else? As you say post RHEL 7 variants hopefully this should all be simpler for a while... We’re happy to help out with our limited knowledge and update documentation etc. on where Envoy is with CentOS 7 and keep the docker build images available in some form.

The last thing is that there is a problem with version 1.21 of Envoy (as noted above) and it will not currently compile on CentOS 7. We’d like to find a solution to that if possible. Understanding what is the best way to progress forward with that would be useful - am thinking of opening a different issue for that and keeping it separate from this PR? As noted in the comment above there seem some related movements to assist.

@phlax
Copy link
Copy Markdown
Member

phlax commented Feb 11, 2022

i think my point is more this - we shouldnt need to maintain a build image for every distro we want to build for - we are currently putting together a build pipeline that uses bazel - so for the most part it is agnostic to the host platform (theoretically anyway)

overall my thought is that this image is not just not working properly now, but has been in that state for some time, needs quite a bit of work to be useful (and not a maintainer headache), isnt tested anywhere and represents a large maintenance cost going forward.

i also think it confuses users because they think - i want to use redhat so i need to use this image, so im not really clear what the benefits to keeping it are and only see the downsides

@howardjohn
Copy link
Copy Markdown

My 2c on this, without much context: You don't really need a build script for every OS, you should get the same artifacts regardless of the underlying host. Except glibc. The host's glibc impacts the version require. So if I build on Ubuntu 20.04, I definitely cannot run the result on CentOS 7 since the glibc version is way too new.

So there are two ways around this I know of:

  • Use a completely different compiler (Add build logic for zig cc envoy#19535) which can specify the glibc target version, and target an old version (likely 2.17 - centos 7's version, as I have never seen anyone trying to run older versions lately. That is already 10 years old)
  • Build on an old OS. This could be ubuntu or debian or whatever, but centos 7 seems to be the easiest to target 2.17 since its less EOL than ubuntu versions targetting 2.17

@robejfarr
Copy link
Copy Markdown
Contributor

Think everyone would be happier to be using a more modern and maintained build toolchain if we could.

We want the latest Envoy to be able to be run on CentOS 7 until the end of its maintenance window if we can (yes it is old). If the build is replaced with something more bleeding edge, we're fine with that as hopefully it's easier to get changes and updates to it.

The only problem we have (there may be others) with the current build is that it doesn't seem to work for Envoy version 1.21 onwards as noted above (the tcp.h issue).

@howardjohn I see some of the tests are still failing in the zig cc PR. Did you manage to compile Envoy version 1.21 upwards for glibc 2.17 and run it on CentOS 7? If not we're happy to have a run at it over the next couple of weeks as it seems to have the necessary header details here https://github.com/ziglang/zig/blob/master/lib/libc/include/any-linux-any/linux/tcp.h and put any findings on that PR.

@phlax
Copy link
Copy Markdown
Member

phlax commented Feb 15, 2022

i understand the (legacy) problem re glibc - not sure about the more recent issue, and not sure about the best way forward

if the consensus is to land this on the basis that it is better than what we have now im happy to land, just concerned about the difficulty of effectively testing changes

@robejfarr
Copy link
Copy Markdown
Contributor

@phlax understand about the testing difficulty. From our side we think it's worth landing as you can still use the build container with this change.

I'm happy to make a separate PR to add a readme around the centos container and the caveats that you can only use it up to specific Envoy builds currently.

For the recent glibc issue 1.21 onwards am thinking we'll have a look at zig cc to see if it works and then probably raise another issue to see if anyone has some ideas on what we could do to fix it and then we could have a look at those. Similarly am open to other suggestions on how to resolve.

Copy link
Copy Markdown
Member

@phlax phlax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, on the basis that the consensus pretty clearly is to land this, lgtm

thanks @bartsmykla - apologies for stalling on this

@robejfarr if you were able to provide more info on what does/not work that would defo be helpful

I would just encourage anyone commenting or lurking this thread to review any subsequent updates to the centos image, because i feel like we are climbing without ropes having no ci or end use cases to test against

@phlax phlax merged commit 3f6b1b0 into envoyproxy:main Feb 16, 2022
htuch pushed a commit that referenced this pull request Feb 16, 2022
  [skip ci]
  fix: script and dockerfile for centos (#154)

* fix: script and dockerfile for centos

- gn has to be compiled to satisfy centos glibc requirements and
  to mitigate problems with the version of envoy's wasm/v8
  dependency (ref: envoyproxy/envoy#15119)
- prependend PATH env var in centos' dockerfile by /usr/local/bin
  which was missing before and which is path where bazel is present
  (ref: envoyproxy/envoy/#12535)
- added installation of glibc-static and libstdc++-static

ref: envoyproxy/envoy#18365

Signed-off-by: Bart Smykla <bartek@smykla.com>

* boringssl/fips: move gn building down

as it expects to have llvm+clang installed to compile

Signed-off-by: Bart Smykla <bartek@smykla.com>

* remove unnecessary whitespace changes

Signed-off-by: Bart Smykla <bartek@smykla.com>

* centos: remove unnecessary PATH alteration

it's actually unnecessary as it's already included and the problem
was not with the order, but with PATH not being preserved, when
run `run_envoy_docker.sh` script from envoyproxy/envoy/ci

Signed-off-by: Bart Smykla <bartek@smykla.com>

* move PATH from container to build_container_centos

As the PATH is not preserved from environment in some of the CI
scripts, we have to use it like in ubuntu scripts, which means
to just move the PATH wrom dockerfile to the script

Signed-off-by: Bart Smykla <bartek@smykla.com>

* Revert "move PATH from container to build_container_centos"

This reverts commit 46d091c.

Signed-off-by: Bart Smykla <bartek@smykla.com>

* move centos PATH to /etc/profile.d file

Signed-off-by: Bart Smykla <bartek@smykla.com>

* source profile in build_container_centos.sh

As when building container we also need updated PATH, I'm sourcing
/etc/profile.d/profile_centos.sh at the beginning of the script

Signed-off-by: Bart Smykla <bartek@smykla.com>

* modify path in sudoers

Signed-off-by: Bart Smykla <bartek@smykla.com>

* remove duplication

Signed-off-by: Bart Smykla <bartek@smykla.com>
htuch pushed a commit that referenced this pull request Feb 16, 2022
  [skip ci]
  Regenerate linux toolchains from 3f6b1b0

  [skip ci]
  fix: script and dockerfile for centos (#154)

* fix: script and dockerfile for centos

- gn has to be compiled to satisfy centos glibc requirements and
  to mitigate problems with the version of envoy's wasm/v8
  dependency (ref: envoyproxy/envoy#15119)
- prependend PATH env var in centos' dockerfile by /usr/local/bin
  which was missing before and which is path where bazel is present
  (ref: envoyproxy/envoy/#12535)
- added installation of glibc-static and libstdc++-static

ref: envoyproxy/envoy#18365

Signed-off-by: Bart Smykla <bartek@smykla.com>

* boringssl/fips: move gn building down

as it expects to have llvm+clang installed to compile

Signed-off-by: Bart Smykla <bartek@smykla.com>

* remove unnecessary whitespace changes

Signed-off-by: Bart Smykla <bartek@smykla.com>

* centos: remove unnecessary PATH alteration

it's actually unnecessary as it's already included and the problem
was not with the order, but with PATH not being preserved, when
run `run_envoy_docker.sh` script from envoyproxy/envoy/ci

Signed-off-by: Bart Smykla <bartek@smykla.com>

* move PATH from container to build_container_centos

As the PATH is not preserved from environment in some of the CI
scripts, we have to use it like in ubuntu scripts, which means
to just move the PATH wrom dockerfile to the script

Signed-off-by: Bart Smykla <bartek@smykla.com>

* Revert "move PATH from container to build_container_centos"

This reverts commit 46d091c.

Signed-off-by: Bart Smykla <bartek@smykla.com>

* move centos PATH to /etc/profile.d file

Signed-off-by: Bart Smykla <bartek@smykla.com>

* source profile in build_container_centos.sh

As when building container we also need updated PATH, I'm sourcing
/etc/profile.d/profile_centos.sh at the beginning of the script

Signed-off-by: Bart Smykla <bartek@smykla.com>

* modify path in sudoers

Signed-off-by: Bart Smykla <bartek@smykla.com>

* remove duplication

Signed-off-by: Bart Smykla <bartek@smykla.com>
@tmiroslav
Copy link
Copy Markdown

Is it possible to run envoy in Centos 7 after updates above? Any specific envoy version I can try? When I try to run latest envoy version I am still getting GLIBC_2.18 issue.

@acarlson0000
Copy link
Copy Markdown

Is it possible to run envoy in Centos 7 after updates above? Any specific envoy version I can try? When I try to run latest envoy version I am still getting GLIBC_2.18 issue.

Hi @tmiroslav - as an unofficial answer, v1.20.x is the only version we've been able to reliably build and run on Centos 7. Anything higher runs into the incompatible kernel headers / glibc_2.18 issues as you've encountered / mentioned here

Note: I work in the same team as @robejfarr, we've been trying to get a later version running with different build formats but haven't got anywhere sadly (but we're novices, and working things out as we go).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants