Conversation

@vrothberg (Member)

No description provided.

@runcom (Member) commented Dec 19, 2018

This works in openshift/builder 👍

@giuseppe (Member)

LGTM

@vrothberg (Member Author)

Very odd errors in Travis: # time="2018-12-19T11:09:53Z" level=debug msg="error parsing image name \"localhost/alpine\" as given, trying with transport \"docker://\": Invalid image name \"localhost/alpine\", expected colon-separated transport:reference"
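For context, the message appears to come from c/image's name-parsing fallback: the bare name is first tried as a transport:reference pair, which fails without an explicit transport. The following minimal Go sketch (not code from this PR; import path shown without the later /v5 suffix, matching the vendoring at the time) shows the difference between a bare name and one with an explicit docker:// transport:

```go
// Hedged sketch: alltransports.ParseImageName expects "transport:reference",
// so a bare name such as "localhost/alpine" needs a transport prefix like "docker://".
package main

import (
	"fmt"

	"github.com/containers/image/transports/alltransports"
)

func main() {
	// Fails: no colon-separated transport in the name.
	if _, err := alltransports.ParseImageName("localhost/alpine"); err != nil {
		fmt.Println("bare name:", err)
	}

	// Succeeds: the docker:// transport is explicit.
	ref, err := alltransports.ParseImageName("docker://localhost/alpine")
	if err != nil {
		fmt.Println("unexpected:", err)
		return
	}
	fmt.Println("parsed transport:", ref.Transport().Name())
}
```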

@rhatdan (Member) commented Dec 19, 2018

Looks like a different change is not handling localhost correctly.

@vrothberg (Member Author)

It's a regression in c/image. We're on it :)

@vrothberg (Member Author)

> Looks like a different change is not handling localhost correctly.

I think you're right. I can reproduce the issue on a vanilla master locally.

@vrothberg vrothberg force-pushed the vendor-everything branch 2 times, most recently from 6eac228 to a0bf1db on December 19, 2018 16:54
@vrothberg (Member Author)

The localhost warnings/errors are red herrings. They actually come from the progress bars writing to stdout, which is then read by the test scripts and interpreted as image IDs. Will tackle this issue tomorrow.
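One way to keep stdout clean, if the progress output comes from c/image's copy code, is to point its report writer at stderr. A minimal sketch, assuming copy.Options and its ReportWriter field (the surrounding copy.Image call signature varies between c/image versions):

```go
// Hedged sketch (not code from this PR): route c/image's copy progress to stderr
// so scripts that parse stdout for image IDs are not confused by progress bars.
package example

import (
	"os"

	"github.com/containers/image/copy"
)

// quietStdoutOptions returns copy options whose progress output goes to stderr.
func quietStdoutOptions() *copy.Options {
	return &copy.Options{
		ReportWriter: os.Stderr, // progress bars land on stderr, stdout stays clean
	}
}
```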

@rhatdan (Member) commented Dec 19, 2018

Can we just tell the output to be quiet?

@vrothberg (Member Author) commented Dec 19, 2018 via email

@rh-atomic-bot (Collaborator)

☔ The latest upstream changes (presumably 4674656) made this pull request unmergeable. Please resolve the merge conflicts.

Update vendoring of containers/storage and containers/image to improve
pull and push compression speeds.

Signed-off-by: Daniel J Walsh <[email protected]>
@rhatdan rhatdan changed the title from "Vendor everything" to "[WIP] Vendor everything" on Dec 20, 2018
@vrothberg (Member Author)

@rhatdan @giuseppe ... CI should pass now. Please take a look at c46e256. The pgzip library changed the blobcache behaviour a bit, and I would like another pair of eyes on it.

@rhatdan rhatdan changed the title from "[WIP] Vendor everything" to "Vendor everything" on Dec 20, 2018
@rhatdan (Member) commented Dec 20, 2018

LGTM, but I think we need @mtrmac and @nalind to agree to your changes on the blob cache.

@vrothberg vrothberg changed the title from "Vendor everything" to "Vendor containers/{image,storage} for faster compression and parallel pulls" on Dec 20, 2018
@vrothberg (Member Author)

> LGTM, but I think we need @mtrmac and @nalind to agree to your changes on the blob cache.

Sure! Once this PR is merged, we can update Podman as well.

@TomSweeneyRedHat (Member) left a comment

LGTM, but would like a head nod from @mtrmac or @nalind

@giuseppe (Member)

Same from me, LGTM besides the change in the tests.

@vrothberg (Member Author)

Alright ... this is to be expected, and @vbatts actually warned us about this. The compressed objects on which we compute the checksums are now different. It still seems to work in some cases, but I need to dig deeper into the code.

@rhatdan (Member) commented Jan 4, 2019

@vrothberg @mtrmac @nalind What is the verdict on this one?

@vrothberg (Member Author) commented Jan 4, 2019

Alright. I've manually run the tests both before this change and with this change. To me, everything looks fine.

In a simplified way, the tests in question first pull-and-build (with cache) into cache, then push (without cache) to dir:no-cache, and then push (with cache) to dir:with-cache. After that, the test compares the contents of cache, no-cache, and with-cache and searches for matching files.

After the change, the tests were complaining because the base layer (i.e., busybox) did not match between cache and no-cache, which makes sense as the layer in cache was compressed with pgzip while the layer in no-cache is the one we pulled, which was compressed differently. In other words, the test was only working before this change because the layers we pulled were compressed with the very same library Buildah was using. In yet other words, we would hit this issue sooner or later once the gzip compression of the stdlib changes (which is happening). Or in the words of @vbatts: it was a mistake to compute digests on compressed objects.
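To make that point concrete, here is a standalone Go sketch (not part of the test suite) that compresses the same input with the stdlib gzip and with klauspost/pgzip and compares digests of the compressed output. The digests typically differ, so any checksum computed on compressed blobs is tied to the compressor that produced them:

```go
// Hedged illustration: digests computed over compressed data depend on the gzip
// implementation. Identical input compressed with compress/gzip and with
// github.com/klauspost/pgzip usually yields different bytes, hence different
// sha256 digests, even though both decompress to the same content.
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"

	"github.com/klauspost/pgzip"
)

func compressWithStdlib(data []byte) []byte {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	w.Write(data)
	w.Close()
	return buf.Bytes()
}

func compressWithPgzip(data []byte) []byte {
	var buf bytes.Buffer
	w := pgzip.NewWriter(&buf)
	w.Write(data)
	w.Close()
	return buf.Bytes()
}

func main() {
	layer := bytes.Repeat([]byte("some layer content\n"), 100000)

	std := compressWithStdlib(layer)
	par := compressWithPgzip(layer)

	fmt.Printf("stdlib gzip digest: %x\n", sha256.Sum256(std))
	fmt.Printf("pgzip digest:       %x\n", sha256.Sum256(par))
	fmt.Println("digests equal:", sha256.Sum256(std) == sha256.Sum256(par))
}
```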

I have a screenshot with 8 tmux tiles that I can use to explain this situation in case you want to verify my above claims.

What I suggest is to remove the no-cache comparisons from the tests, as those are prone to errors. It was really painful to reproduce 😿

Remove error-prone parts of the explicit push tests that push to a
directory without using the blob-cache.  The error-proneness relates
directly to the fact that the layers of pulled images (i.e., the busybox
base image) are compressed with a different gzip library than the blobs
in Buildah's cache, which now use pgzip.

Signed-off-by: Valentin Rothberg <[email protected]>
Parallel copying of layers is currently supported when pulling from
a registry to the storage.

Signed-off-by: Valentin Rothberg <[email protected]>
@vrothberg (Member Author)

I updated the tests (see d88807e).

@runcom (Member) commented Jan 4, 2019

> What I suggest is to remove the no-cache comparisons from the tests, as those are prone to errors. It was really painful to reproduce

@mtrmac could you ack 👼

@rhatdan (Member) commented Jan 4, 2019

@vrothberg SGTM

@vrothberg (Member Author)

Three Travis jobs hit the timeout (now restarted).

@mtrmac (Contributor) commented Jan 4, 2019

> In a simplified way, the tests in question first pull-and-build (with cache) into cache, then push (without cache) to dir:no-cache, and then push (with cache) to dir:with-cache. After that, the test compares the contents of cache, no-cache, and with-cache and searches for matching files.

Right

> After the change, the tests were complaining because the base layer (i.e., busybox) did not match between cache and no-cache, which makes sense as the layer in cache was compressed with pgzip while the layer in no-cache is the one we pulled, which was compressed differently.

The other way around AFAICS: The cache contains the original compressed version (and a decompressed version); whereas buildah push without using the cache compresses the data written to dir:no-cache with pgzip.

> In other words, the test was only working before this change because the layers we pulled were compressed with the very same library Buildah was using. In yet other words, we would hit this issue sooner or later once the gzip compression of the stdlib changes (which is happening).

Yes.

> What I suggest is to remove the no-cache comparisons from the tests, as those are prone to errors. It was really painful to reproduce 😿

Yes, for now.

So, LGTM.


Ultimately, though, the data flow in buildah would really benefit from a closer [or is that more high-level?] look, quite apart from the blobcache: AFAICS the simplest buildah {commit,bud}, by default:

  • In containerImageRef.NewImageSource:
    • Exports the top layer from c/storage as a tar file stream
    • Compresses it on the fly
    • Writes the compressed stream to a temporary file
  • In containerImageSource.GetBlob:
    • Opens the file with the compressed data
  • In storageImageDestination.PutBlob:
    • Decompresses that data on the fly
    • Writes the decompressed result into another temporary file
  • In storageImageDestination.Commit:
    • Opens the file with the decompressed data
    • Creates a c/storage layer from it
  • For good measure, a manifest that actually refers to the compressed blobs is created and recorded as well with the newly created image.

I don’t understand c/storage enough to know whether buildah commit can logically be just a “tag” operation that turns the modifiable top layer of a container into an immutable layer in a cheap operation, or whether ultimately the export of the container's writable layer as tar + import into a new immutable layer is necessary. But the compression+decompression seems clearly unnecessary, and may be very expensive (in my benchmarks, though admittedly on a laptop and not a cloud node, the CPU cost of compression was by far the slowest part of dealing with large layers).
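For illustration only (this is not buildah code), a sketch of the round trip described above, using a hypothetical stageLayer helper: the layer tar stream is compressed into one temporary file and then immediately decompressed into another before a layer can be created from it, so the gzip work buys nothing:

```go
// Hedged sketch of the redundant compress-then-decompress flow described above.
package example

import (
	"compress/gzip"
	"io"
	"os"
)

// stageLayer mimics the flow: tar stream -> gzip -> temp file -> gunzip -> temp file.
// A real fix would hand the tar stream to storage directly.
func stageLayer(layerTar io.Reader) (string, error) {
	// NewImageSource: compress the exported layer into a temporary file.
	compressed, err := os.CreateTemp("", "layer-*.tar.gz")
	if err != nil {
		return "", err
	}
	defer os.Remove(compressed.Name())
	defer compressed.Close()

	gz := gzip.NewWriter(compressed)
	if _, err := io.Copy(gz, layerTar); err != nil {
		return "", err
	}
	if err := gz.Close(); err != nil {
		return "", err
	}
	if _, err := compressed.Seek(0, io.SeekStart); err != nil {
		return "", err
	}

	// PutBlob: decompress the same data into another temporary file.
	decompressed, err := os.CreateTemp("", "layer-*.tar")
	if err != nil {
		return "", err
	}
	defer decompressed.Close()

	gr, err := gzip.NewReader(compressed)
	if err != nil {
		return "", err
	}
	if _, err := io.Copy(decompressed, gr); err != nil {
		return "", err
	}
	// Commit: a c/storage layer would now be created from decompressed.Name().
	return decompressed.Name(), nil
}
```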

@rhatdan (Member) commented Jan 4, 2019

Let's bring up the rest of the discussion with @nalind when he gets back.
@rh-atomic-bot r+

@rh-atomic-bot (Collaborator)

📌 Commit 759f40b has been approved by rhatdan

@rhatdan (Member) commented Jan 5, 2019

@rh-atomic-bot retry

@vrothberg (Member Author)

The bot seems to have a cold :^)

@rhatdan rhatdan merged commit bb710f3 into containers:master Jan 5, 2019
@mrunalp commented Jan 6, 2019

We want this vendored into openshift/builder next.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 17, 2023