Changes to container images. #4054

Closed
russjones opened this issue Jul 15, 2020 · 7 comments · Fixed by #4070

Comments

@russjones
Contributor

russjones commented Jul 15, 2020

In #3357, we introduced a new image tag, major.minor, in addition to the existing major.minor.patch images (for example, 4.3 and 4.3.0). The idea is that 4.3 would be rebuilt nightly, carry the latest patch version and be mutable, while 4.3.0 (and all other patch releases) would be immutable and might have a somewhat stale base OS.

One of the main drivers for this was to keep the container image patched against security vulnerabilities.

Along with this change, we also switched the base OS to alpine instead of Ubuntu. The reason to switch to alpine was to reduce the image size as well as reduce the attack surface of the image.

This has led to multiple customers complaining that they've now had to update their Dockerfiles because apt-get no longer works. This is a very reasonable complaint: to an outside observer who does not have the context of the discussion in #3357, it's not clear why the base OS would change between tags that Gravitational publishes.

@knisbet has suggested that instead of building a brand new image for each major.minor tag, we just update that tag to point to the latest major.minor.patch release.

My suggestion is we drop support for alpine (by switching to Ubuntu) for the major.minor releases, but keep everything the same (keep the nightly rebuild of images). My reasons are:

  1. Bandwidth is cheaper than the time to update your container image.
  2. While Ubuntu has a larger attack surface, our images contain a single Go binary which means most of the vulnerabilities shown in scanners will not be an issue for Teleport.
  3. If we point the major.minor tag to the latest patch release and that patch release gets a little old, CVEs will start to show up, which means we will not have solved the problem we set out to solve in #3357 (Keep Teleport base images updated).

We can revisit maintaining another set of images of either alpine or distroless in either 4.4 or 5.0.

@russjones
Contributor Author

Best: 1
Realistic: 3

@awly
Contributor

awly commented Jul 15, 2020

Any way we could find out what customers install on top of our image?

My (naive) thinking was: an image's API is its entrypoint, ports, env vars and volumes. The base image and other binaries available within are not, unless you specifically create an image intended to be a base for others.

But I agree that the current situation is confusing for anyone using teleport as a base image. We can keep the status quo (ubuntu base), but I'd like to better understand how we ended up maintaining a base image for others.

> My suggestion is we drop support for alpine (by switching to Ubuntu) for the major.minor releases, but keep everything the same (keep the nightly rebuild of images).

Which means that our latest patch release tag is mutable, right?

> our images contain a single Go binary which means most of the vulnerabilities shown in scanners will not be an issue for Teleport.

This is something we should be verifying by hand for every CVE and explaining to customers scanning their images.
This maintenance toil would be smaller with a minimal base image.

@knisbet
Contributor

knisbet commented Jul 15, 2020

Just to add some background.

The only way I've ever seen a major.minor tag used is to point to the same image as the latest major.minor.patch tag. If strictly following semver, it should always be possible to pull on major.minor without creating conflicts, and then, if necessary, lock into a specific patch. You can see this with golang, for example, on docker hub (look at the shared tags): https://hub.docker.com/_/golang. quay.io shows this as a link between tags: https://quay.io/repository/gravitational/wormhole?tag=latest&tab=tags (I messed up the publishing of 0.2, but it can be seen with 0.1 or 0.3).
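The alias convention described above can be sketched in a few lines of Python. This is only an illustration, assuming tags are plain `major.minor.patch` strings; the function names `parse_patch_tag` and `latest_patch_aliases` and the sample tag list are hypothetical and not part of any Gravitational tooling.

```python
from typing import Dict, Iterable, Tuple


def parse_patch_tag(tag: str) -> Tuple[int, int, int]:
    """Parse a 'major.minor.patch' tag into a tuple of ints."""
    major, minor, patch = tag.split(".")
    return int(major), int(minor), int(patch)


def latest_patch_aliases(tags: Iterable[str]) -> Dict[str, str]:
    """Map each 'major.minor' alias to the newest 'major.minor.patch' tag."""
    newest: Dict[Tuple[int, int], Tuple[int, int, int]] = {}
    for tag in tags:
        version = parse_patch_tag(tag)
        key = version[:2]
        # Compare as integer tuples so that 4.2.10 sorts after 4.2.9.
        if key not in newest or version > newest[key]:
            newest[key] = version
    return {
        f"{major}.{minor}": ".".join(str(part) for part in version)
        for (major, minor), version in newest.items()
    }


# Hypothetical published tags, for illustration only.
aliases = latest_patch_aliases(["4.2.9", "4.2.10", "4.3.0"])
print(aliases)  # {'4.2': '4.2.10', '4.3': '4.3.0'}
```

Under this convention the major.minor tag never carries content of its own: it only moves when a new patch release is published.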

Also, the historical context on ubuntu vs alpine, which led to previously recommending ubuntu in the teleport containers, is that vulnerability scanners historically weren't able to scan alpine images and would produce clean scan results even though they didn't really scan the image. While clair (the scanner used by quay.io) does now seem to support alpine images, I have no idea if the backend vulnerability databases are as accurate (could be less, the same, or more accurate, I have no data). Musl is also a consideration, while I realize there are lots of reasons to want to use musl, I've personally found it to be a frustrating experience, specifically users with issues with the DNS resolver, when running in kubernetes, etc. So supporting it might mean supporting the OS as well when someone deploys it and encounters issues.

On the gravity side, we use our custom debian minimal image a lot, but in most new cases we are standardizing on ubuntu or distroless, and we want to use more distroless where possible. I don't think distroless works in this case, since I believe the docker image ships tctl and expects that a user can get a shell in the container to run tctl commands (as opposed to just execing tctl individually for each command). In almost all cases, I've seen ubuntu on the latest .10 release produce the cleanest scan results, and it is a good / standard distribution for containers, with the Canonical security team backing it like they do with Ubuntu.

Edit: just to clarify, latest ubuntu .10 release being like 20.10 should scan pretty clean, with 18.10 for example beginning to accumulate medium/minor vulnerabilities. Although there is a single high vulnerability that seems like it's in a wontfix state that keeps popping up in our gravity scans which will likely never be executed.

@russjones
Contributor Author

> Any way we could find out what customers install on top of our image?
> But I agree that the current situation is confusing for anyone using teleport as base image. We can keep the status quo (ubuntu base), but I'd like to better understand how we ended up maintaining a base image for others.

I agree. @jon-can, can you share some more information with us?

> Which means that our latest patch release tag is mutable, right?

Our latest patch should still have been built with Ubuntu.

> This is something we should be verifying by hand for every CVE and explaining to customers scanning their images.
> This maintenance toil would be smaller with a minimal base image.

I'm not against this, I just don't think we should break existing workflows. I have no problem with 4.3-alpine (or 4.3-distroless) being maintained in parallel, or even with our documentation recommending them when appropriate.

@webvictim
Contributor

webvictim commented Jul 15, 2020

> Which means that our latest patch release tag is mutable, right?

@awly Our 4.3.0 tag is still immutable: it was built once on release day with an ubuntu base and that will stay as-is.

The 4.3 tag has always been intended to be fully mutable because we were rebuilding the image every night. If one of the very few things we include (like dumb-init or ca-certificates) gets rebuilt/changed then we'd be pulling in that version change earlier than a Teleport release.

> Musl is also a consideration, while I realize there are lots of reasons to want to use musl, I've personally found it to be a frustrating experience, specifically users with issues with the DNS resolver, when running in kubernetes, etc. So supporting it might mean supporting the OS as well when someone deploys it and encounters issues.

@knisbet Yeah - we essentially worked around this by using an alpine-glibc base image (https://github.com/gravitational/docker-alpine-glibc) which I forked from another Github repo and configured Drone to build. Teleport's conventional amd64 builds wouldn't run in an alpine-based container without glibc. I did manage to get Teleport to build a static binary with musl, including PAM and BPF support, but without extended testing there's no way I'd be confident recommending that for regular use. The alpine-glibc base seemed to be an acceptable compromise.

> Edit: just to clarify, latest ubuntu .10 release being like 20.10 should scan pretty clean, with 18.10 for example beginning to accumulate medium/minor vulnerabilities. Although there is a single high vulnerability that seems like it's in a wontfix state that keeps popping up in our gravity scans which will likely never be executed.

@knisbet For context, using ubuntu:18.04 as a base gets us 7 vulnerabilities on quay.io, whereas ubuntu:20.04 gets us 0. This led to #4004 where we changed the base image for images > 4.3.0 to ubuntu:20.04.

> I'm not against this, just don't think we should break existing workflows.

@russjones My argument would be that we haven't broken anyone's workflow at all. If someone was using quay.io/gravitational/teleport:4.2.10 as their base image previously and they're now using quay.io/gravitational/teleport:4.3.0, they still get the same Ubuntu base and any apt-get commands they've added will work.

The idea behind the mutable 4.3 tag was that people who always want to be running the latest 4.3.x version of Teleport and don't want to manage their own updates can use it. I do accept that perhaps we conflated two different things: 1) an auto-updating container and 2) the change from the ubuntu base to an alpine base.

Fix-wise, I think we should probably change the 4.3, 4.2 and 4.1 tags to use ubuntu:20.04 as the base image ASAP before anyone gets used to those being alpine-based.

I can also make a change to rename the existing images to 4.3-alpine, 4.2-alpine and 4.1-alpine. This doesn't break any assumptions about what we've been doing in the past, and the ubuntu:20.04 base is small enough that it shouldn't be a much bigger download for anyone, nor should the attack surface be much larger.

@awly
Contributor

awly commented Jul 15, 2020

@russjones @webvictim sorry, I left a confusing comment about mutability.
I meant: "If we go with russjones's suggestion, our latest patch release tag will become mutable", after reading (highlighted):

> My suggestion is we drop support for alpine (by switching to Ubuntu) for the major.minor releases, but keep everything the same (keep the nightly rebuild of images).

@webvictim
Contributor

webvictim commented Jul 15, 2020

I think that @russjones was just referring to changing the Dockerfile-cron base image to ubuntu rather than alpine and leaving every other part of our build the same. This would give us a mutable 4.3 tag that rebuilds every night and immutable 4.3.0, 4.3.1 etc tags which are built once and left.
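The tag scheme summarised here can be expressed as a small Python sketch. It is an illustration only: the regular expression and the function name `tag_kind` are hypothetical, and treating a suffixed tag such as `4.3-alpine` as mutable is an assumption based on the renaming proposed earlier in the thread, not a confirmed policy.

```python
import re

# major.minor, with an optional .patch and an optional -variant suffix
# (for example "-alpine"). The exact pattern is an assumption.
TAG_PATTERN = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-([a-z]+))?$")


def tag_kind(tag: str) -> str:
    """Return 'mutable' for major.minor tags and 'immutable' for patch tags."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognised tag: {tag!r}")
    patch = match.group(3)
    # A tag that pins a patch version is built once; anything else is
    # assumed to be rebuilt nightly.
    return "immutable" if patch is not None else "mutable"


for tag in ["4.3", "4.3.0", "4.3.1", "4.3-alpine"]:
    print(tag, tag_kind(tag))
```

Anyone who needs a reproducible base for their own Dockerfile would pin one of the immutable patch tags; the mutable tags suit deployments that want the latest rebuild without managing updates.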
