Add apt datasource #3722

Closed
rarkins opened this issue May 16, 2019 · 14 comments
Labels
help wanted Help is needed or welcomed on this issue priority-3-medium Default priority, "should be done" but isn't prioritised ahead of others type:feature Feature (new functionality)

Comments

@rarkins
Collaborator

rarkins commented May 16, 2019

What would you like Renovate to be able to do?

Retrieve versions of apt packages

Additional context

Would be used for things like #3720

@rarkins rarkins added type:feature Feature (new functionality) help wanted Help is needed or welcomed on this issue ready priority-3-medium Default priority, "should be done" but isn't prioritised ahead of others good first issue Suitable for new contributors labels May 16, 2019
@rarkins
Collaborator Author

rarkins commented May 16, 2019

Debian API documentation: https://sources.debian.org/doc/api/

I'm surprised/concerned though by how few versions are available for some common packages I've checked, e.g. https://sources.debian.org/api/src/curl/

@psyb0t
Contributor

psyb0t commented May 23, 2019

apt package versions can and will, at times, differ between Debian-based distributions (Debian, Ubuntu, Linux Mint, etc.) and between release versions (Debian wheezy/jessie/stretch, Ubuntu xenial/bionic/disco).
Any suggestions on how that could be handled cleanly without a generically templated file? /etc/apt/sources.list is an example of a nicely formatted file that could provide the source info, so package versions could be updated based on it.

% cat /etc/apt/sources.list
deb http://us.archive.ubuntu.com/ubuntu/ bionic main restricted

deb http://us.archive.ubuntu.com/ubuntu/ bionic-updates main restricted

deb http://us.archive.ubuntu.com/ubuntu/ bionic universe
deb http://us.archive.ubuntu.com/ubuntu/ bionic-updates universe

deb http://us.archive.ubuntu.com/ubuntu/ bionic multiverse
deb http://us.archive.ubuntu.com/ubuntu/ bionic-updates multiverse

It would be a pain to take all of the various distros into consideration, so for this, yeah, it could just use the debian.org packages, but I don't know how reliable or safe that would be. Using packages from sid would be best for getting the latest versions, but that's the unstable realm of deb packages, and as the name suggests, it's not quite safe for everything.

Also, I know from experience that updating packages from sid can lead to conflicts with kernel, libc and other base software versions, and forcing updates on those (if possible) can lead to dysfunctional systems.

@rarkins
Collaborator Author

rarkins commented May 23, 2019

Is there any safe “default” we can use to begin with? Eg default to latest Debian LTS?

@psyb0t
Contributor

psyb0t commented May 23, 2019

Defaulting to the latest release implies we're sure that's what the given software repo will be running on, but we can't really be sure of that. I think this apt support would work only in cases where the official release name is specified somewhere. That's just my thought on this; I hope I can be proven wrong.

@rarkins
Collaborator Author

rarkins commented May 23, 2019

Can you help me understand?

  1. Is there any REST-based or similar “API” that we can use to query all available apt versions without running “apt” commands in a shell?

  2. If we do know the platform, either through auto detection or user configuration, can we then fetch a list of versions applicable for the user?

@joerocklin

I'm not aware of any REST-based (or other) API to determine this. Most systems have multiple repositories defined for package management. On each 'update' command, the tooling downloads a big index file from each repository. It then looks for the 'newest' version of a particular package across all repos to determine what might get installed. It gets further complicated by internal repo mirrors set up at organizations, which may have 'approved' versions of packages even if there is a newer one upstream.

For my own use cases, I would be happy with allowing me to set what endpoint to check. Though, even in this scenario, there will be the potential for a lot of file downloading and parsing.
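
The index-based lookup described above could be sketched roughly as follows. This is only an illustration, assuming the index text has already been downloaded and decompressed: `parse_packages_index` and the sample stanzas are hypothetical, and picking the 'newest' version is left out because Debian version ordering is more involved than a plain string comparison.

```python
from collections import defaultdict


def parse_packages_index(text):
    """Collect every version listed for each package in a Packages-style index.

    The index is a series of stanzas separated by blank lines, each made of
    "Field: value" lines. Only the Package and Version fields are read here.
    """
    versions = defaultdict(list)
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; skip them in this sketch.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Package" in fields and "Version" in fields:
            versions[fields["Package"]].append(fields["Version"])
    return dict(versions)


# Hypothetical excerpts from two repositories' indexes, merged for illustration.
SAMPLE_INDEX = """\
Package: curl
Version: 7.64.0-4
Architecture: amd64

Package: curl
Version: 7.64.0-4+deb10u1
Architecture: amd64

Package: make
Version: 4.2.1-1.2
Architecture: amd64
"""

available = parse_packages_index(SAMPLE_INDEX)
print(available)
```

A real implementation would still need to fetch and cache one index per configured repository, which is where the downloading and parsing cost mentioned above comes from.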

@rarkins rarkins removed the good first issue Suitable for new contributors label Jun 22, 2019
@rarkins
Collaborator Author

rarkins commented Jun 22, 2019

Hi @joerocklin thanks for the info. I guess this corresponds with what I experience when I use apt, but I hadn't thought about what happens under the hood. It may be a bit of work and require quite a lot of downloading and parsing, but the end result would be great if we could monitor and update apt dependencies within projects like we do today with programming dependencies. Hopefully we can also follow the caching/optimization principles used by apt itself too.

@ndbroadbent

ndbroadbent commented Nov 14, 2019

I just came to the GitHub issues to ask about support for Debian package versions, so it was great to see that this is already being discussed! I'm using Debian stretch for my Docker images.

Sorry for the long comment, but I thought I would add some context about what I'm currently doing, and how this could be improved with RenovateBot!

Context:

I have a base Docker image with system packages, and this is used for both the CI builds on GitLab CI, and my app images when I deploy. I always run apt-get upgrade -y before my CI builds, so I'm always using the latest versions of packages when I run my tests. When I build a new app image, I also run apt-get upgrade -y as the final step. Debian packages aren't updated too often (usually once every few days, or maybe once a week), so the "race condition" between running tests and deploying a new version hasn't caused any problems yet (and I keep an eye on this part of this build script.) But this is definitely more dangerous than pinning specific versions and running CI builds for every package update.

I also run a few scheduled tasks. I have a daily CI build that updates all system packages and ensures that the build is still passing.

I also have a scheduled task that runs apt-get upgrade -s in the latest app image, and checks the output for any lines that match /^Inst.*Security/ (security updates). This task sends me a Slack notification if there are security updates available. Then I manually run a new CI build, wait for it to pass, and then deploy a new version with the security updates. So this would be much better if RenovateBot could push a new branch when package updates are available.
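
For reference, the security check described above might look something like this in Python. It is a sketch only: `security_updates` is a hypothetical helper, the sample text merely imitates the general shape of `apt-get upgrade -s` output, and the Slack notification step is left out.

```python
import re

# Same pattern as in the comment above: an "Inst" line mentioning "Security".
SECURITY_LINE = re.compile(r"^Inst.*Security")


def security_updates(simulated_output):
    """Return the package names on 'Inst' lines that mention a Security origin."""
    packages = []
    for line in simulated_output.splitlines():
        if SECURITY_LINE.search(line):
            # The package name is the second whitespace-separated token.
            packages.append(line.split()[1])
    return packages


# Hypothetical output in the general shape printed by `apt-get upgrade -s`.
SAMPLE_OUTPUT = """\
Inst libssl1.1 [1.1.0k-1~deb9u1] (1.1.0l-1~deb9u1 Debian-Security:9/oldstable [amd64])
Inst tzdata [2019b-0+deb9u1] (2019c-0+deb9u1 Debian:9.11/oldstable [all])
Conf libssl1.1 (1.1.0l-1~deb9u1 Debian-Security:9/oldstable [amd64])
"""

pending = security_updates(SAMPLE_OUTPUT)
print(pending)
```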

Another issue: My build started failing with the latest Google Chrome update. The latest version of google-chrome-stable started causing a crash in a selenium driver. My workaround is to add apt-mark hold google-chrome-stable to my GitLab CI config, so this just prevents the version from ever being updated. It would be incredibly awesome if I could set up a single package group that includes the google-chrome-stable apt package, and also the capybara and selenium-webdriver Ruby gems. Then it would start with a failing branch for the chrome update, re-run the build whenever capybara / selenium-webdriver is updated to fix the crash, and then merge everything in once it's all passing.

I'm not sure how the versions could be managed in my git repo. I wasn't sure if apt has some kind of "Packagefile", or a "lock file" for versions. But I found /etc/apt/preferences in the apt-get man page:

       /etc/apt/preferences
           Version preferences file. This is where you would specify "pinning", i.e. a preference
           to get certain packages from a separate source or from a different version of a
           distribution. Configuration Item: Dir::Etc::Preferences.

Maybe RenovateBot could manage an .apt-preferences file at the root of the repo? Docker builds could copy this to /etc/apt/preferences.

Or we could come up with a new Packagefile convention? There could be a shell script that compiles the Packagefile into apt configuration (or other package managers.) Or I might be reinventing the wheel if something already exists for Chef/Puppet/Ansible/etc.

P.S. Congrats on the acquisition by WhiteSource, that's awesome news!

@rarkins
Collaborator Author

rarkins commented Nov 15, 2019

Thanks for the kind words and extensive description, @ndbroadbent.

The way people have "non-reproducible builds" today by apt-get installing open-versioned packages seems risky to me. Everyone does it, including this project in its Dockerfile, but that doesn't mean we can't change it. If apt updates are generally regarded as more stable/reliable, then that might mean people are more likely to bundle all updates together to get fewer PRs, which is fine.

Re: managing apt-get versions in a repo, I actually had a more simple idea in mind initially, but perhaps you can tell me if it's too simple.

Renovate already has a concept of "pinning" dependencies. e.g. if your package.json said ^1.0.0 then Renovate would rewrite that to something like 1.2.0 if you opted into pinning, and then proceed to keep the pinned version up-to-date (1.2.1, 1.3.0, etc).

So my idea was to start by looking for cases of apt-get install x and rewriting those to be something like apt-get install x=1.2.0. Supporting an apt preferences file could certainly be on the table too. What do you think?
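
To make the pinning idea concrete, here is a rough Python sketch of the rewrite. It is not how Renovate does or would implement this: `pin_install` is a hypothetical helper, the versions are supplied by hand, and multi-line commands with backslash continuations are not handled.

```python
import re

# Matches "apt-get install " and captures everything after it on the line.
INSTALL_CMD = re.compile(r"(apt-get\s+install\s+)(.*)")


def pin_install(line, versions):
    """Append '=<version>' to unpinned package names in one 'apt-get install' line.

    `versions` maps package name -> version to pin. Unknown packages, option
    flags (tokens starting with '-') and already pinned packages are left alone.
    """
    match = INSTALL_CMD.search(line)
    if not match:
        return line
    tokens = []
    for token in match.group(2).split():
        if not token.startswith("-") and "=" not in token and token in versions:
            token = f"{token}={versions[token]}"
        tokens.append(token)
    return line[: match.start(2)] + " ".join(tokens)


# Hypothetical versions; in practice these would come from a datasource lookup.
pinned = pin_install(
    "RUN apt-get install -y curl make=4.2.1-1.2 git",
    {"curl": "7.64.0-4", "make": "9.9"},
)
print(pinned)
```

Once a line is pinned this way, keeping it up to date is the same problem as updating any other pinned version string.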

@ndbroadbent

ndbroadbent commented Nov 25, 2019

I found this StackOverflow Q&A that explains more about Debian's philosophy for removing previous versions of packages. So I think the problem with a pinned apt-get install x=1.2.0 is that it can start failing as soon as Debian pushes a new package version, because they don't keep the older versions of packages.

I also have some more complex requirements (e.g. a custom DSL for some Dockerfiles, and passing in versions as build args). But I could let RenovateBot manage a dummy Dockerfile with an apt-get install command, and I could parse the versions out of this Dockerfile whenever it's updated. So I think that could work in theory!

Another use case: I had to downgrade Chromium to version 76, because the latest version (78) was crashing and breaking my tests. Here's how I'm currently installing chromium version 76.0.3809.100 on Debian buster:

RUN apt-get update \
  && apt-get install -y -q --no-install-recommends \
    chromium \
  && chromium --version \
  && curl https://snapshot.debian.org/archive/debian/20190810T050130Z/pool/main/c/chromium/chromium_76.0.3809.100-1_amd64.deb -o /tmp/pkg.deb && (dpkg -i /tmp/pkg.deb || true) && rm /tmp/pkg.deb \
  && curl https://snapshot.debian.org/archive/debian/20190810T050130Z/pool/main/c/chromium/chromium-common_76.0.3809.100-1_amd64.deb -o /tmp/pkg.deb && (dpkg -i /tmp/pkg.deb || true) && rm /tmp/pkg.deb \
  && curl http://ftp.us.debian.org/debian/pool/main/libv/libvpx/libvpx6_1.8.1-2_amd64.deb -o /tmp/pkg.deb && (dpkg -i /tmp/pkg.deb || true) && rm /tmp/pkg.deb \
  && apt-get install -y \
  && rm -rf /var/lib/apt/lists/* \
  && apt-mark hold chromium chromium-common \
  && chromium --version

This installs the current version of chromium and all of its dependencies. It then downgrades to the older package version from snapshot.debian.org and installs the libvpx6_1.8 dependency (no longer available in Debian buster).

It would be awesome if RenovateBot could create a PR where it pushes new versions of chromium until the CI build is stable again. But I don't know if that would be possible, since I had to jump through a lot of hoops to install the older version.

Maybe the solution is to use a different OS or different package manager that has better support for older package versions. Maybe NixOS and the Nix package manager.

@rarkins
Collaborator Author

rarkins commented Jun 18, 2020

@ppmathis do you think your repology datasource now satisfies this?

@rarkins rarkins removed the ready label Jun 18, 2020
@ppmathis
Contributor

This specific issue could indeed be solved by using the new repology datasource. While I am mostly working with Alpine packages, Repology supports all kinds of OS repositories.

As Debian unfortunately does not keep old package versions, a Dockerfile with outdated package versions immediately breaks. As I was struggling with the same issue on Alpine, I've decided to configure Renovate to group all OS package upgrades together, as this will guarantee that status checks may still pass when multiple dependencies are outdated.

First of all, you have to configure a regex manager to parse version environment variables within your Dockerfile, e.g.:

{
  "regexManagers": [
    {
      "fileMatch": [
        "(^|/)Dockerfile$"
      ],
      "matchStrings": [
        "#\\s*renovate:\\s*datasource=(?<datasource>.*?) depName=(?<depName>.*?)( versioning=(?<versioning>.*?))?\\sENV .*?_VERSION=\"?(?<currentValue>.*?)\"?\\s"
      ],
      "versioningTemplate": "{{#if versioning}}{{versioning}}{{else}}semver{{/if}}"
    }
  ]
}

Then you may configure a package rule for grouping all OS package upgrades together to avoid CI failure when multiple packages are out of date, as this will allow Renovate to upgrade these dependencies in one shot. Please note that you may have to change/adjust the package patterns based on your needs, as the example would match all dependencies named debian_stable/<anything>:

{
  "packageRules": [
    {
      "datasources": [
        "repology"
      ],
      "packagePatterns": [
        "^debian_stable/"
      ],
      "separateMajorMinor": false,
      "groupName": "debian packages",
      "groupSlug": "debian"
    }
  ]
}

After that has been done, you may start adding environment variables to your Dockerfile which contain the package version and annotate them using comments, which the regex manager will end up parsing:

# renovate: datasource=repology depName=debian_stable/make-dfsg versioning=loose
ENV MAKE_VERSION="4.2.0"
# renovate: datasource=repology depName=debian_stable/libssl-dev versioning=loose
ENV LIBSSL_DEV_VERSION="1.1.1c-0+deb10u3"

The examples above will look for the newest version of make-dfsg and libssl-dev within the debian_stable repository on Repology, which can be found here. Repology is still working on adding version-specific repositories for Debian (e.g. Debian 9, Debian 10, ... instead of stable/unstable), but for now debian_stable just always refers to the latest stable version.
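
If you want to check what the match string above captures before committing it, a quick local test could look like the following sketch. Note the assumptions: Renovate evaluates the pattern as a JavaScript regex, so this Python translation (with `(?P<name>...)` instead of `(?<name>...)`) only approximates that behaviour, and `DOCKERFILE` is just the annotated snippet from above.

```python
import re

# Python translation of the matchStrings entry; group names are unchanged.
PATTERN = re.compile(
    r'#\s*renovate:\s*datasource=(?P<datasource>.*?) depName=(?P<depName>.*?)'
    r'( versioning=(?P<versioning>.*?))?\sENV .*?_VERSION="?(?P<currentValue>.*?)"?\s'
)

DOCKERFILE = """\
# renovate: datasource=repology depName=debian_stable/make-dfsg versioning=loose
ENV MAKE_VERSION="4.2.0"
# renovate: datasource=repology depName=debian_stable/libssl-dev versioning=loose
ENV LIBSSL_DEV_VERSION="1.1.1c-0+deb10u3"
"""

# Each match yields one dependency with the fields the regex manager needs.
deps = [match.groupdict() for match in PATTERN.finditer(DOCKERFILE)]
for dep in deps:
    print(dep["depName"], dep["currentValue"], dep["versioning"])
```

The pattern ends with `\s`, so the ENV line needs a trailing newline or space for the version to be captured.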

Please note that in case of Debian, it is usually recommended to specify the source package name as depName. You may find that name by looking for Source: in the output of apt info <package> or by looking at the package on the Debian website. The Debian package mappings are fairly complex in Repology and while the datasource has some heuristics for resolving ambiguous package results, the source package is usually more reliable. If any specific issues with some packages come up, please feel free to tell me, as I'm happy to analyze such cases in more detail.
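
As a small illustration of the Source: lookup, the following Python sketch reads the field from text shaped like `apt info` output. `source_package` and the sample texts are hypothetical; the fallback to the Package field assumes that the Source field is omitted when the source and binary package share a name.

```python
def source_package(apt_info_output):
    """Return the source package name from 'apt info'-style 'Field: value' text.

    Falls back to the Package field when no Source field is present.
    """
    fields = {}
    for line in apt_info_output.splitlines():
        # Skip continuation lines (leading whitespace) and lines without a field.
        if line[:1].isspace() or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields.setdefault(key.strip(), value.strip())
    source = fields.get("Source", fields.get("Package", ""))
    # A Source value may carry a version in parentheses; keep only the name.
    return source.split(" ")[0]


# Hypothetical excerpts in the shape of `apt info <package>` output.
WITH_SOURCE = "Package: libssl-dev\nVersion: 1.1.1c-0+deb10u3\nSource: openssl\n"
WITHOUT_SOURCE = "Package: curl\nVersion: 7.64.0-4\n"

print(source_package(WITH_SOURCE))
print(source_package(WITHOUT_SOURCE))
```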

Last but not least, you may find an example of the above instructions (it contains both Debian and CentOS, although only the former is relevant here) at snapserv/renovate-test. You may also take a look at this PR to see how these grouped package updates show up.

If you want to see a more advanced example, you may take a look at this repository. While it is using Alpine, you can see how it uses regex managers to keep all dependencies of the image up-to-date.

Given that these regex managers can be used in any kind of file, this should even support some complex scenarios as mentioned by @ndbroadbent with a custom DSL for Dockerfiles, as it does not really matter where those version numbers are stored after all. It will not help in super complicated cases where snapshotted version archives and custom repositories are required, but it should solve common use cases like keeping a Dockerfile with sanely pinned APT packages updated.

@gerbenoostra
Contributor

gerbenoostra commented Oct 9, 2020

I'm having a similar situation, but using an Ubuntu base docker image (20.04).
As a workaround, I'm trying @ppmathis's proposed solution. However, I'm having some difficulty finding the right versions in Repology.
First, I'd like to install python3:

# renovate: datasource=repology depName=ubuntu_20_04/???? versioning=loose
ENV PYTHON3_VERSION="3.8.2-0ubuntu2"
RUN apt-get update -y && \
    apt-get install --no-install-recommends -y python3=$PYTHON3_VERSION && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

But what package would I then put as depName?
On https://packages.ubuntu.com/focal/python3 I can find the python3 version to be 3.8.2-0ubuntu2.
But on repology, using https://repology.org/projects/o/?search=python3&inrepo=ubuntu_20_04 I can't find the matching version.

Are there alternative renovate datasources I can use for Ubuntu?

@gerbenoostra
Contributor

gerbenoostra commented Oct 9, 2020

Ah, never mind. Found a workaround.
Note the wildcard * in the apt-get command, and a different package name in depName (one that has the same upstream version):

# renovate: datasource=repology depName=ubuntu_20_04/python3-defaults versioning=loose
ENV PYTHON3_VERSION="3.8.2"
RUN apt-get update -y && \
    apt-get install --no-install-recommends -y python3=${PYTHON3_VERSION}* && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 15, 2020