Dockerfile cleanup: reduce image size 3x #1212
Merged
Background
The Docker image produced by the `master` branch is currently large: 5.91 GB. With a few Docker optimizations, it can be reduced to about one third of that size without losing any functionality.
Optimizations
1. Baseline
2. Combine install layers (-290 MB)
Each `RUN` instruction in a Dockerfile results in a new image layer, so if you add files and then delete them in separate `RUN` instructions, the deletion frees no space. Instead, add and delete the files in the same `RUN` instruction. The Docker docs have more information on this. The commands are split across multiple lines for readability.
3. Don't cache pip packages (-80 MB)
When `pip` installs Python packages, it caches install data locally to speed up future `pip install` calls. We can clear this cache.
4. Delete go module cache (-2 GB)
When `go install` calls are made, Go caches all the module dependencies in `/go/pkg/mod`. We can clear this cache and save significant space.
root@9c870efaeab3:/usr/src/app# du -sh /go/pkg/*
2.0G /go/pkg/mod
20K /go/pkg/sumdb
Similar to the optimization in section 2, we need to install the modules and clear the cache in the same `RUN` instruction. To do this, we pipe `printf` into `xargs` to call `go install` on each module.
5. Omit go debug symbols and remove build cache (-1.9 GB)
When `go install` is called, it runs a `go build` on the source code it pulls down. We can save some extra space by using build flags to omit debug symbols; the Go docs describe these flags.
Additionally, Go caches build data in `~/.cache/go-build`, and in this case that is a significant amount of data: almost 2 GB.
Other Changes
`apt update`
`-up` calls. We just installed them 3 lines earlier in the Dockerfile, so we know they are up to date.
`httpx` alias. It works just fine without the alias.
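
The optimizations in sections 2 through 5 might combine into something like the fragment below. It is a sketch, not the Dockerfile from this PR: the base image is omitted, and the apt package, `requirements.txt`, and the `example.com/...` module paths are placeholders. `--no-cache-dir` and `go clean -cache -modcache` are assumed ways of dropping the caches described above; deleting `/go/pkg/mod` and `~/.cache/go-build` directly in the same `RUN` would serve the same purpose.

```dockerfile
# Sketch only: package and module names are placeholders, not taken from this PR.

# Section 2: install and clean up in one RUN so the deleted files
# never end up in a committed layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Section 3: keep pip from writing its local cache
# (--no-cache-dir is an assumed choice; removing the cache afterwards also works).
RUN pip install --no-cache-dir -r requirements.txt

# Sections 4 and 5: run one `go install` per module, build without debug symbols,
# then drop the build cache and module cache inside the same layer.
RUN printf '%s\n' \
        example.com/tool-a/cmd/tool-a@latest \
        example.com/tool-b/cmd/tool-b@latest \
    | xargs -n 1 go install -ldflags="-s -w" \
    && go clean -cache -modcache
```

Here `xargs -n 1` runs a separate `go install` for each line that `printf` emits, and `-ldflags="-s -w"` tells the linker to omit the symbol table and DWARF debug information. Because the cleanup is chained with `&&` in the same `RUN`, the caches are gone before the layer is committed.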