Add support for OS package manager like apt and apk in Dockerfiles #2129

Open
ferrarimarco opened this issue Jul 15, 2019 · 29 comments
Labels: Keep (exempt this from being marked by stalebot); T: new-ecosystem (requests for new ecosystems/languages)

Comments

@ferrarimarco

While your docker support is great to keep the FROM directive updated, it could be enhanced by including support for the OS package managers (like APT for Debian and derivatives, APK for Alpine...).

With this addition we could completely rely on dependabot to keep our Docker images updated, instead of having to keep that manually updated.

Example 1 (APT on Ubuntu):

apt-get install apache2=2.2.20-1ubuntu1 \
                     apache2.2-common=2.2.20-1ubuntu1 \
                     apache2.2-bin=2.2.20-1ubuntu1 \
                     apache2-mpm-worker=2.2.20-1ubuntu1

Example 2 (APK on Alpine):

apk add packagename=1.2.3-suffix

We could even get fancy and support version constraints, like >=.

@hmarr
Contributor

hmarr commented Jul 15, 2019

I'm really keen to support this too - we've got several cases internally that'd benefit from this so we've wanted it for a while, too!

It wouldn't be super hard to implement, but it's not a tiny project either. We'd need to parse shell expressions so we could handle things like RUN apt-get update && apt-get install -y foo=1.0.0 (far more complicated examples exist too...!), and we'd need to integrate with the various package registries, ideally detecting the distro and release by looking at the base image (recursively).

Unfortunately we won't have capacity to implement this in the near future. If you're really keen for support, we would accept a PR though :-)
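A minimal sketch of the shell-parsing step mentioned above, written in Python purely for illustration (Dependabot itself is Ruby). The function name `pinned_apt_packages` is hypothetical. It assumes commands are chained only with `&&`, `||`, `;` or `|`, that those operators never appear inside quotes, and that no `apt-get` option takes a separate value containing `=`.

```python
import re
import shlex
from typing import Dict


def pinned_apt_packages(run_command: str) -> Dict[str, str]:
    """Collect name=version pins from every `apt-get install` in a RUN command."""
    # Join backslash-newline continuations, then split on shell control operators.
    flat = run_command.replace("\\\n", " ")
    pins: Dict[str, str] = {}
    for part in re.split(r"&&|\|\||;|\|", flat):
        words = shlex.split(part)
        # Skip leading VAR=value assignments such as DEBIAN_FRONTEND=noninteractive.
        while words and re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*=.*", words[0]):
            words.pop(0)
        if not words or words[0] != "apt-get":
            continue
        # Drop option flags so `apt-get -y install` and `apt-get install -y` both work.
        args = [w for w in words[1:] if not w.startswith("-")]
        if args[:1] != ["install"]:
            continue
        for word in args[1:]:
            if "=" in word:  # unpinned packages have no "=" and are ignored
                name, version = word.split("=", 1)
                pins[name] = version
    return pins


pins = pinned_apt_packages("apt-get update && apt-get install -y foo=1.0.0")
```

Real Dockerfiles also use heredocs, variables and `xargs`, which this sketch does not attempt to handle.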

@ferrarimarco
Author

The repo is this one, right? https://github.com/dependabot/dependabot-core

@greysteil
Contributor

Yep!

@jeff-cook

This would be very helpful!

Would it be easier to code the update if we used dependency files the way pip and gem do? For example, an apk.txt file. Then use something like xargs to run apk add. That way they don't have to figure out how to parse the packages out of the Dockerfile.

@ferrarimarco
Author

I suppose the parser implementation complexity would be the same, but you'll have the overhead of having to load that file somehow when you build the docker image (since you have to install those packages, don't you?)

@mst-ableton

mst-ableton commented Sep 17, 2019

We ended up making a quick Python script to roll the pins: https://gist.github.com/mst-ableton/d0b80692571718fcb0a8f3984add9c03. As it uses Python it's not easily upstreamable, but the idea is to run apt-get update inside the container and parse the output of apt-get upgrade -s to see what it would have upgraded to. Because it's doing two docker builds, it may take a while to run. Hope this effort can jumpstart a Dependabot-native implementation in the future.
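The core of that approach, parsing the simulated upgrade output, might look roughly like the sketch below. It is not the linked gist. The function name `planned_upgrades` is hypothetical, and `SAMPLE` is hand-written to mimic the `Inst <package> [<installed>] (<candidate> <origin> ...)` lines that `apt-get upgrade -s` prints, not captured from a real run.

```python
import re
from typing import Dict, Tuple

# "Inst pkg [old] (new origin [arch])"; the bracketed old version is absent for new installs.
INST_LINE = re.compile(r"^Inst (\S+) (?:\[(\S+)\] )?\((\S+) ")


def planned_upgrades(simulation_output: str) -> Dict[str, Tuple[str, str]]:
    """Map package -> (installed_version, candidate_version) for simulated upgrades."""
    upgrades: Dict[str, Tuple[str, str]] = {}
    for line in simulation_output.splitlines():
        match = INST_LINE.match(line)
        if match and match.group(2):  # only lines that replace an installed version
            upgrades[match.group(1)] = (match.group(2), match.group(3))
    return upgrades


# Illustrative sample only; versions and origins are made up for the example.
SAMPLE = """\
Reading package lists...
Building dependency tree...
The following packages will be upgraded:
  openssl
1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Inst openssl [1.1.1f-1ubuntu2.4] (1.1.1f-1ubuntu2.8 Ubuntu:20.04/focal-updates [amd64])
Conf openssl (1.1.1f-1ubuntu2.8 Ubuntu:20.04/focal-updates [amd64])
"""

upgrades = planned_upgrades(SAMPLE)
```

Each entry in `upgrades` gives the old pin and the version it could be rolled to.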

@hazcod

hazcod commented Nov 12, 2019

Been bashing my head against the wall with this one for https://github.com/ironPeakServices/iron-redis
On the one hand you want to pin your package versions, but on the other you can't keep updating those versions manually whenever there is a security fix.

@CpuID

CpuID commented Dec 18, 2020

I've taken a look at what would be involved to make this a reality - I almost started a standalone project to do it, but having it part of Dependabot feels more appropriate, plus there's a better code structure already.

Questions/thoughts for any dependabot-core maintainers (@feelepxyz @jurre @greysteil ?):

  1. I can see value in reusing the Docker FileFetcher, but having a separate package_manager used for different base OS'es, a la:
  • docker_alpine
  • docker_ubuntu
  • docker_centos
  • etc etc
    how would you prefer the file hierarchy to look here? extra top level directories for each package_manager respectively? or subdirectories within top level docker? maybe docker/lib/dependabot/docker_(alpine|ubuntu|centos)?
  1. There might be some potential for shared/reusable logic in the various FileParser's and FileUpdater's, maybe even a single shared FileParser/FileUpdater, TBD.

  2. I think each UpdateChecker will likely be unique, to talk to the different package repositories respectively for each OS. Things like Ubuntu PPA's and the equivalents for other OS'es will be interesting to deal with also... as these cannot be 100% inferred from the contents of the dependency file (Dockerfile) only?

  3. I can see a potential need to actually "run" the Docker image with a command to trawl/read the likes of:

  • /etc/os-release
  • /etc/apt*
  • /etc/yum*
    to work out "what package repositories need to be poked" by the UpdateChecker. Is there a facility available to do that?
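As a rough illustration of the distro-detection idea in the last point, here is a minimal Python sketch that parses the contents of /etc/os-release. How the file is obtained is left open; one option would be `docker run --rm <image> cat /etc/os-release`. The function name `parse_os_release`, the `PACKAGE_MANAGER_BY_ID` table and the sample text are assumptions made for the example.

```python
import shlex
from typing import Dict


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, raw = line.split("=", 1)
        # Values follow shell quoting rules, so let shlex strip the quotes.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


# Hypothetical lookup table from the os-release ID to the package manager to query.
PACKAGE_MANAGER_BY_ID = {"alpine": "apk", "debian": "apt", "ubuntu": "apt", "centos": "yum"}

# Hand-written sample in os-release format.
SAMPLE = '''\
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.12.3
PRETTY_NAME="Alpine Linux v3.12"
'''

info = parse_os_release(SAMPLE)
manager = PACKAGE_MANAGER_BY_ID.get(info["ID"])
```

This only identifies the distro and release; extra repositories such as PPAs would still need /etc/apt* or /etc/yum* to be inspected.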

@CpuID

CpuID commented Dec 18, 2020

On the one hand you want to pin your package versions, but on the other you can't keep updating those versions manually whenever there is a security fix.

@hazcod I think the core principle here, from my standpoint, is for a Dockerfile to produce a deterministic image output. It's tricky, but hard versioning at the OS package level goes a long way towards that working (with the exception of a Linux distribution pulling the rug out from under you and 404'ing the repo URLs for a specific OS release).

@CpuID

CpuID commented Dec 19, 2020

3. I can see a potential need to actually "run" the Docker image with a command to trawl/read the likes of:

  • /etc/os-release
  • /etc/apt*
  • /etc/yum*
    to work out "what package repositories need to be poked" by the UpdateChecker. Is there a facility available to do that?

https://github.com/dependabot/dependabot-core#setup

To run all of Dependabot Core, you'll need Ruby, Python, PHP, Elixir, Node, Go, Elm, and Rust installed.

No current provision to have Docker installed or accessible as part of the list of helpers specified?

@greysteil
Contributor

I don't maintain Dependabot anymore, but you're in safe hands with @feelepxyz and @jurre. I know they've been swamped in the last few weeks, though, and may be taking some well deserved time off over Christmas.

@jurre
Member

jurre commented Dec 21, 2020

Appreciate you looking into this @CpuID. I just want to preface this with a note that I'm not sure we will be able to review, merge and support such a contribution in a timely manner right now.

We've paused accepting new ecosystems, and this patch might be of similar proportions.

Having said that, I'll try to answer some of your questions:

how would you prefer the file hierarchy to look here? extra top level directories for each package_manager respectively? or subdirectories within top level docker? maybe docker/lib/dependabot/docker_(alpine|ubuntu|centos)

I imagine that the implementations will be relatively similar, and it feels like it should be part of the docker package_manager.

What I imagine right now (without much context on this, so I may very well be wrong):

  • We make it all part of the docker package_manager
  • We have separate UpdateCheckers for each OS we support: docker/lib/dependabot/docker/update_checkers/alpine_update_checker.rb, docker/lib/dependabot/docker/update_checkers/ubuntu_update_checker.rb etc. This will have to be integrated in the main UpdateChecker
  • It seems like we might be able to reuse/extend the FileUpdater and subsequent steps, as long as it's aware of how to update an OS package and handle the shell parsing etc. It might make sense to pull this out into its own class that we call from the existing FileUpdater.

It's hard to say what it should look like exactly without doing some more investigation though, and I would definitely re-evaluate once we have a better idea of how many parts of the codebase we can reuse and how much we end up having to change.
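To make the "separate UpdateCheckers integrated in the main UpdateChecker" idea concrete, here is a small dispatch sketch in Python (dependabot-core itself is Ruby, so this is illustration only). Every name in it (`check_alpine`, `check_debian_family`, `CHECKERS`, `base_image_name`, `checker_for`) is hypothetical, and it assumes the distro can be read straight off the FROM line, which will not hold for custom base images.

```python
from typing import Callable, Dict, Optional


# Hypothetical stand-ins: real checkers would query each distro's package index.
def check_alpine(package: str, pinned: str) -> str:
    return f"apk index lookup for {package} (pinned at {pinned})"


def check_debian_family(package: str, pinned: str) -> str:
    return f"apt index lookup for {package} (pinned at {pinned})"


CHECKERS: Dict[str, Callable[[str, str], str]] = {
    "alpine": check_alpine,
    "debian": check_debian_family,
    "ubuntu": check_debian_family,
}


def base_image_name(from_line: str) -> str:
    """Return the image name from a FROM line, without registry, tag or digest."""
    # Ignore flags such as --platform=linux/amd64; the image is the next word.
    words = [w for w in from_line.split()[1:] if not w.startswith("--")]
    last_component = words[0].rsplit("/", 1)[-1]
    return last_component.split("@")[0].split(":")[0]


def checker_for(from_line: str) -> Optional[Callable[[str, str], str]]:
    """Pick the per-OS checker for a base image, or None if it is unsupported."""
    return CHECKERS.get(base_image_name(from_line))


checker = checker_for("FROM debian:11.4-slim as minifier")
```

An unsupported or custom base image yields `None`, which is where recursive base-image inspection or an os-release lookup would have to take over.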

@CpuID

CpuID commented Dec 21, 2020

@jurre thanks for the response :)

I think your suggestion for using docker/lib/dependabot/docker/update_checkers/alpine_update_checker.rb etc makes sense, I'm happy with that filename hierarchy (depending on which class is sharded out respectively, TBD based on findings during implementation). Eg. could be docker/lib/dependabot/docker/file_parsers/alpine_file_parser.rb.

I'll see if I get free cycles to put something together, and see how far I get.

We aim to provide the best user experience possible for each of these, but we have found we've lacked the capacity – and in some cases the in-house expertise – to support new ecosystems in the last year.

@jurre hiring? :)

@jurre
Member

jurre commented Dec 21, 2020

@jurre hiring? :)

We are! https://boards.greenhouse.io/github/jobs/2383025 https://boards.greenhouse.io/github/jobs/2384868

l0b0 added a commit to linz/geostore that referenced this issue Jan 7, 2021
Ignore rule to force setting specific package versions since

- Ubuntu should only be receiving non-breaking patches,
- we don't want the overhead of having to follow up on every package
  upgrade manually (see
  dependabot/dependabot-core#2129), and
- locking only the top level packages means we'd still get arbitrary
  versions of their dependencies.
@cecilemuller

cecilemuller commented May 24, 2021

If dependencies were stored in a JSON file similar to package.json, jq and xargs can be used to generate the install command and update the versions:

apt.json

{
  "nginx": "1.18.0-0ubuntu1",
  "openssl": "1.1.1f-1ubuntu2.4",
  "ca-certificates": "20210119~20.04.1"
}

Run in Dockerfile:

jq -r 'to_entries | .[] | .key + "=" + .value' apt.json | xargs apt-get install -y

An action can read the version to update the JSON:

apt-cache policy nginx | grep -oP '(?<=Candidate:\s)(.+)'
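For anyone who would rather not depend on jq and grep -P, the same two steps can be sketched in Python. The function names `install_args` and `candidate_version` are hypothetical, and `POLICY_SAMPLE` is hand-written in the shape of `apt-cache policy` output, not captured from a real system.

```python
import json
import re
from typing import List, Optional


def install_args(manifest_json: str) -> List[str]:
    """Turn {"pkg": "version"} into ["pkg=version", ...] for apt-get install."""
    return [f"{name}={version}" for name, version in json.loads(manifest_json).items()]


def candidate_version(policy_output: str) -> Optional[str]:
    """Extract the Candidate version from `apt-cache policy <pkg>` output."""
    match = re.search(r"^\s*Candidate:\s*(\S+)", policy_output, re.MULTILINE)
    if match is None or match.group(1) == "(none)":
        return None  # unknown package, or nothing installable
    return match.group(1)


# Illustrative sample only.
POLICY_SAMPLE = """\
nginx:
  Installed: (none)
  Candidate: 1.18.0-0ubuntu1
  Version table:
     1.18.0-0ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages
"""

args = install_args('{"nginx": "1.18.0-0ubuntu1", "openssl": "1.1.1f-1ubuntu2.4"}')
latest = candidate_version(POLICY_SAMPLE)
```

Returning `None` for a `(none)` candidate avoids writing a bogus version back into the JSON file.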

@cecilemuller

Here's a working example.

A script updates the latest version of packages in the JSON file:
https://github.com/wildpeaks/docker-nginx/blob/main/docker/update_dependencies.sh

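#!/bin/bash
# Refresh every pin in dependencies.json to the version apt would install now.

JSON=$( cat dependencies.json )

# Quote the expansions so the JSON and version strings are passed through intact.
for PACKAGE in $( echo "$JSON" | jq -r 'keys | .[]' ); do
	# "Candidate:" is the version apt-get would currently pick for this package
	VERSION=$( apt-cache policy "$PACKAGE" | grep -oP '(?<=Candidate:\s)(.+)' )
	JSON=$( echo "$JSON" | jq --arg package "$PACKAGE" --arg version "$VERSION" '.[$package] = $version' )
done

# Pretty-print the result back into the file
echo "$JSON" | python -m json.tool > dependencies.json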

A cron Action runs the update script and creates a matching pull request:
https://github.com/wildpeaks/docker-nginx/blob/main/.github/workflows/dependencies.yml

# ...
    - name: Update dependencies
      working-directory: docker
      run: |
        sudo apt-get update
        sh update_dependencies.sh

    - name: Create PR
      uses: peter-evans/create-pull-request@v3
      with:
        commit-message: "chore(deps): update dependencies.json"
        branch: features/update-dependencies
        title: Update APT packages
        body: Updated dependencies.json
        delete-branch: true

And the Dockerfile uses the JSON file to install pinned versions:
https://github.com/wildpeaks/docker-nginx/blob/main/docker/Dockerfile#L7

# ...
COPY dependencies.json /tmp/dependencies.json
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
 && apt-get install -y --no-install-recommends jq \
 && jq -r 'to_entries | .[] | .key + "=" + .value' /tmp/dependencies.json | xargs apt-get install -y --no-install-recommends \
 && rm /tmp/dependencies.json 
# ...

@billy1kaplan

Hello! I was wondering if there was any ongoing effort or plan to get this implemented. This feature would be a huge help!

@jsirianni

This would be useful for me. I had an issue where the Ubuntu repositories gave me a very old version of a package. I've started pinning my package versions, but now I have increased maintenance overhead.

modem7 added a commit to modem7/Dnscrypt-Proxy that referenced this issue Apr 5, 2022
dependabot does not yet support docker dependencies

dependabot/dependabot-core#2129
@danepowell

This sounds great in theory, but for the vast majority of use cases it's probably a false hope. Why? Debian repos only maintain the latest version of a given package. Unless you are hosting your own package repo, you aren't going to be able to install arbitrary package versions. So the idea of committing a "dependencies.json" file to version control is essentially impossible, at least in the context of building Docker images.

The only exceptions I see are if you host your own package repo or rely on very careful Docker caching to retain an old "pinned" version of a package.

Am I missing something?

@jsirianni

This sounds great in theory, but for the vast majority of use cases it's probably a false hope. Why? Debian repos only maintain the latest version of a given package. Unless you are hosting your own package repo, you aren't going to be able to install arbitrary package versions. So the idea of committing a "dependencies.json" file to version control is essentially impossible, at least in the context of building Docker images.

The only exceptions I see are if you host your own package repo or rely on very careful Docker caching to retain an old "pinned" version of a package.

Am I missing something?

The repos can contain a single version sometimes, but not always. You seem to be correct, at least for debian:latest and ubuntu:latest, but a quick check shows that this is not always the case.

debian:10 image

root@d71a1f0c3573:/# apt-cache madison systemd
   systemd | 241-7~deb10u8 | http://deb.debian.org/debian buster/main amd64 Packages
   systemd | 241-7~deb10u8 | http://security.debian.org/debian-security buster/updates/main amd64 Packages

ubuntu:20.04 image

root@edf9514d8882:/# apt-cache madison systemd
   systemd | 245.4-4ubuntu3.16 | http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
   systemd | 245.4-4ubuntu3.15 | http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
   systemd | 245.4-4ubuntu3 | http://archive.ubuntu.com/ubuntu focal/main amd64 Packages

This is just an example. I have observed the Ubuntu repos giving me a very old package version temporarily, causing one of my images to fail integration tests due to the incompatible package. This has happened only once. I solved this by pinning the version in my dockerfile, however, maintaining the dockerfile becomes difficult. The trade off is worth it for me, but maybe not for everyone.
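Whether a given pin can still be installed can be checked mechanically from that kind of output. A minimal Python sketch, with the hypothetical helpers `parse_madison` and `pin_available`, fed with the ubuntu:20.04 rows quoted above:

```python
from typing import List, Tuple


def parse_madison(output: str) -> List[Tuple[str, str, str]]:
    """Split `apt-cache madison` rows into (package, version, source) tuples."""
    rows: List[Tuple[str, str, str]] = []
    for line in output.splitlines():
        columns = [column.strip() for column in line.split("|")]
        if len(columns) == 3:  # skip blank or malformed lines
            rows.append((columns[0], columns[1], columns[2]))
    return rows


def pin_available(output: str, version: str) -> bool:
    """True if any configured repository still offers exactly this version."""
    return any(found == version for _, found, _ in parse_madison(output))


UBUNTU_2004 = """\
   systemd | 245.4-4ubuntu3.16 | http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
   systemd | 245.4-4ubuntu3.15 | http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
   systemd | 245.4-4ubuntu3 | http://archive.ubuntu.com/ubuntu focal/main amd64 Packages
"""

still_installable = pin_available(UBUNTU_2004, "245.4-4ubuntu3.15")
```

A pin that no longer appears in any row is one that a fresh build would fail on.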

Additionally, some Dockerfile linters will push you to pin package versions. If you allow the repos to decide which package version you are using, you do lose some control of your image's end state.

I agree that it is generally not an issue, but it can be for some builds.

I am not happy with my solution, but it does work. It is based on some of the feedback in this thread. Basically, I use a "base image" that has the pinned packages that I depend on. This way, I end up building the base image infrequently while building my final image frequently. Dependabot would be a great addition to this workflow, preventing my base image from going stale.

Lastly, it's possible that excellent integration testing of the final image would allow us to always use the latest base image with the latest packages, without relying on dependabot to handle things. It just depends on how folks wish to do things.

@danepowell

Thanks, that essentially confirms my intuition. Even in the examples you provided, the different package versions are due to them being in different repos, but each repo only contains a single version.

In an ideal world, I think we'd all pin our Apt package versions, but that seems incompatible with the Debian / apt ecosystem which focuses more on preserving backwards compatibility and thus (theoretically) makes pinning unnecessary.

I'm sure there are other use cases for this feature, this is just mine 😄

@ArwynFr

ArwynFr commented Aug 25, 2022

This sounds great in theory, but for the vast majority of use cases it's probably a false hope. Why? Debian repos only maintain the latest version of a given package. Unless you are hosting your own package repo, you aren't going to be able to install arbitrary package versions. So the idea of committing a "dependencies.json" file to version control is essentially impossible, at least in the context of building Docker images.

The only exceptions I see are if you host your own package repo or rely on very careful Docker caching to retain an old "pinned" version of a package.

Am I missing something?

I think you are approaching the problem from the wrong end.

Of course there are applications whose maintainers use version pinning to build software based on legacy dependencies. I don't think such people find much value in using dependabot, they know which version of each dependency they use and they know they can hardly upgrade without breaking everything. The target of Dependabot are software maintainers that want to efficiently keep their software up to date with the latest security fixes.

Take this Dockerfile as an example:

FROM debian:11.4-slim as minifier
RUN apt-get update && apt-get install --yes --no-install-recommends minify=2.7.2-1+b6

Imagine the minify developers make a security fix and distribute a newer 2.7.3 version that makes its way into the Debian repository. I don't get a Dependabot notification regarding the deprecation of 2.7.2. Either I handle the upgrade manually, which is insane when you consider the number of projects times the number of dependencies I have to monitor, or I use some latest-like constraint and rebuild the software periodically. The latter is very inefficient: most of the builds will result in no change compared to the previous build, and on average you can expect half a period between the new version being available and being deployed.

Thanks to Dependabot, whenever Debian publishes the next version of their base image, I'll get a notification prompting me to upgrade my base image to debian:11.5-slim. This allows me to immediately build a new image of my software, based on that new image, without wasting computing resources rebuilding my image daily or weekly for nothing.

I wish I had the same feature for my apt packages.
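The file-update half of such a feature is the easy part. As a minimal sketch (Python, hypothetical function name `bump_pin`, and `2.7.3-1` as a made-up new version string), rewriting a pin in a Dockerfile could look like this:

```python
import re


def bump_pin(dockerfile: str, package: str, new_version: str) -> str:
    """Replace the version in every `package=<old>` pin with new_version."""
    # The lookbehind keeps "minify" from matching inside names like "lib-minify".
    # The version runs until whitespace or a line-continuation backslash.
    pattern = re.compile(r"(?<![\w.+-])" + re.escape(package) + r"=[^\s\\]+")
    # A function replacement avoids backslash/group interpretation in new_version.
    return pattern.sub(lambda _match: f"{package}={new_version}", dockerfile)


DOCKERFILE = (
    "FROM debian:11.4-slim as minifier\n"
    "RUN apt-get update && apt-get install --yes --no-install-recommends minify=2.7.2-1+b6\n"
)

updated = bump_pin(DOCKERFILE, "minify", "2.7.3-1")
```

The harder part, discussed earlier in the thread, is knowing which repository to ask for the new version in the first place.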

@asbjornu

Thank you for so succinctly describing my precise use case, @ArwynFr. Yep, this is exactly how I want Dependabot to work.

@modem7

modem7 commented Aug 25, 2022

Thanks to Dependabot, whenever Debian publishes the next version of their base image, I'll get a notification prompting me to upgrade my base image to debian:11.5-slim. This allows me to immediately build a new image of my software, based on that new image, without wasting computing resources rebuilding my image daily or weekly for nothing.

I wish I had the same feature for my apt packages.

It also helps us keep track of when specific packages were updated, which makes troubleshooting far faster.

@jonjanego jonjanego added the Keep Exempt this from being marked by stalebot label May 2, 2024
Projects
Status: Planned
Development

No branches or pull requests