storage: check the compressed digest of cached blobs #1051
Tests aren't happy @fgiudici
@TomSweeneyRedHat, thanks for checking so quickly!
mtrmac
left a comment
Thanks for the PR!
An initial very quick look: It does make sense that this check should be dropped. I’m not at all sure that setting srcInfo.Size = extantBlobSize unconditionally makes sense here, couldn’t that cause a call to ReapplyBlob with a mismatched (digest, size) pair? I’ll need to take a deeper look.
Yeah, you are right, we will have a mismatched pair. Seems to me btw that the
Just wanted to note that this will hopefully fix this BZ https://bugzilla.redhat.com/show_bug.cgi?id=1867463 which is getting a lot of attention.
Following the idea in the comment above, I added a commit that checks whether, when a cached layer is found by its uncompressed digest, the compressed digest matches too:
Note that at this point we can drop the first commit (we can leave the size check in copy.go), as we now return the correct size. But it honestly doesn't seem that useful.
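The check described in this comment can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: `cachedLayer` and `reuseCandidate` are hypothetical names, and the records are reduced to the fields quoted in this PR. A layer found by its uncompressed digest is accepted only if its compressed digest also matches the blob being requested.

```go
package main

import "fmt"

// Digests and sizes quoted in this PR for the layer shared by the two images.
const (
	diffDigest     = "sha256:767a7b7a8ec530ae9a7b1c4eeed530d54ca7ef9f224fe207be42dc74f565587a"
	nodeCompressed = "sha256:abb454610128b028301ee40af387d31111a1e699e4ea424fd53186ff77067402"
	testCompressed = "sha256:6a73b40493d9a3fd12c93f0d03ec90c78e447dc9f182d119fb7477b75997ba72"
)

// cachedLayer is a hypothetical, simplified stand-in for a layer record in
// local storage; it is not the c/storage type.
type cachedLayer struct {
	DiffDigest       string // digest of the uncompressed layer
	CompressedDigest string // digest of the compressed blob it was pulled from
	CompressedSize   int64
}

// reuseCandidate looks for a layer by its uncompressed digest and accepts it
// only when its compressed digest also matches the requested blob.
func reuseCandidate(cache []cachedLayer, diff, wantCompressed string) (cachedLayer, bool) {
	for _, l := range cache {
		if l.DiffDigest != diff {
			continue // different layer content
		}
		if l.CompressedDigest == wantCompressed {
			return l, true
		}
		// Same content, different compression: do not report this record
		// as the requested blob, its size would not match.
	}
	return cachedLayer{}, false
}

func main() {
	// Only the node:lts-slim variant of the layer is cached.
	cache := []cachedLayer{{
		DiffDigest:       diffDigest,
		CompressedDigest: nodeCompressed,
		CompressedSize:   22522274,
	}}

	_, ok := reuseCandidate(cache, diffDigest, testCompressed)
	fmt.Println("reuse for test:1.0 blob:", ok)

	l, ok := reuseCandidate(cache, diffDigest, nodeCompressed)
	fmt.Println("reuse for node:lts-slim blob:", ok, "size:", l.CompressedSize)
}
```

With this guard a (digest, size) pair is only ever returned from a record whose compressed digest equals the requested one, so the two values cannot disagree.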
That doesn’t work in general because the input
Actually, how does this fail precisely? Looking at storage commit b8e0174ae6b2dc083d9ada365b9a207371aa62a6 , (Overall, before roughly the introduction of
Well, I dug further, down to the root cause: the storage cache is not completely cleaned up on image removal.
What we want to note is that we will fill in the and in the
Let's move on:
Now, let's pull the other image:
Now, our And in the
Next:
IMHO we should fix the storage package: just opened a PR (containers/storage#730).
Ah, I have completely missed that case. Thanks for digging into it! The updated fix makes sense to me. I’m a bit torn on whether to actually include the workaround in c/image or whether to just have this PR as a pointer to the c/storage PR. What do others think?
When different images contain the same layer compressed
differently, pulling the images in cri-o may fail.
Reproducer:
# crictl pull docker.io/fgiudici/test:1.0
# crictl rmi docker.io/fgiudici/test:1.0
# crictl pull docker.io/node:lts-slim
# crictl pull docker.io/fgiudici/test:1.0
FATA[0003] pulling image failed: rpc error: code = Unknown desc = Error: blob sha256:6a73b40493d9a3fd12c93f0d03ec90c78e447dc9f182d119fb7477b75997ba72 is already present, but with size 22522274 instead of 23625492
Image docker.io/fgiudici/test:1.0 has some layers in common with docker.io/node:lts-slim, but the layers have been pushed with a different compressed archive (and so different compressed-diff-digest).
In particular:
node:lts-slim has
"id": "767a7b7a8ec530ae9a7b1c4eeed530d54ca7ef9f224fe207be42dc74f565587a",
"created": "2020-09-23T10:27:27.399649267Z",
"compressed-diff-digest": "sha256:abb454610128b028301ee40af387d31111a1e699e4ea424fd53186ff77067402",
"compressed-size": 22522274,
"diff-digest": "sha256:767a7b7a8ec530ae9a7b1c4eeed530d54ca7ef9f224fe207be42dc74f565587a",
"diff-size": 5848934,
"compression": 2
test:1.0 has
"id": "767a7b7a8ec530ae9a7b1c4eeed530d54ca7ef9f224fe207be42dc74f565587a",
"created": "2020-09-23T09:50:52.502675155Z",
"compressed-diff-digest": "sha256:6a73b40493d9a3fd12c93f0d03ec90c78e447dc9f182d119fb7477b75997ba72",
"compressed-size": 23625492,
"diff-digest": "sha256:767a7b7a8ec530ae9a7b1c4eeed530d54ca7ef9f224fe207be42dc74f565587a",
"diff-size": 5848934,
"compression": 2
This is due to an incomplete cleanup of the storage cache when the
images are deleted: the compressed-digest mappings of their layers are
not removed. So, if the same layer, but with a different compression, is
later pulled, the old undeleted reference will match, returning a layer
with a different compressed digest and compressed size.
Don't blindly trust the storage cache, check we got what we need.
Signed-off-by: Francesco Giudici <[email protected]>
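The failure mode described in this commit message can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not c/storage code: `layer`, `store`, `put`, `remove`, `lookup` and `replay` are hypothetical names. It replays the reproducer (pull test:1.0, remove it, pull node:lts-slim, look up the test:1.0 blob) against an index from compressed digest to layer ID, once leaving the index entry behind on removal and once cleaning it.

```go
package main

import "fmt"

// Values quoted in this commit message; both images share the same layer ID.
const (
	layerID        = "767a7b7a8ec530ae9a7b1c4eeed530d54ca7ef9f224fe207be42dc74f565587a"
	nodeCompressed = "sha256:abb454610128b028301ee40af387d31111a1e699e4ea424fd53186ff77067402"
	testCompressed = "sha256:6a73b40493d9a3fd12c93f0d03ec90c78e447dc9f182d119fb7477b75997ba72"
)

// layer and store are a hypothetical toy model, not the c/storage types.
type layer struct {
	ID               string
	CompressedDigest string
	CompressedSize   int64
}

type store struct {
	layers       map[string]layer    // layer ID -> record
	byCompressed map[string][]string // compressed digest -> layer IDs
}

func newStore() *store {
	return &store{layers: map[string]layer{}, byCompressed: map[string][]string{}}
}

// put records a layer and indexes it by its compressed digest.
func (s *store) put(l layer) {
	s.layers[l.ID] = l
	s.byCompressed[l.CompressedDigest] = append(s.byCompressed[l.CompressedDigest], l.ID)
}

// remove deletes a layer record. With cleanIndex == false the
// compressed-digest mapping is left behind, modelling the incomplete cleanup.
func (s *store) remove(id string, cleanIndex bool) {
	l, ok := s.layers[id]
	if !ok {
		return
	}
	delete(s.layers, id)
	if !cleanIndex {
		return
	}
	kept := s.byCompressed[l.CompressedDigest][:0]
	for _, other := range s.byCompressed[l.CompressedDigest] {
		if other != id {
			kept = append(kept, other)
		}
	}
	if len(kept) == 0 {
		delete(s.byCompressed, l.CompressedDigest)
	} else {
		s.byCompressed[l.CompressedDigest] = kept
	}
}

// lookup returns the layers the index lists for a compressed digest,
// without checking that the records still carry that digest.
func (s *store) lookup(compressed string) []layer {
	var out []layer
	for _, id := range s.byCompressed[compressed] {
		if l, ok := s.layers[id]; ok {
			out = append(out, l)
		}
	}
	return out
}

// replay runs the reproducer steps and returns what a lookup of the
// test:1.0 compressed digest yields at the final pull.
func replay(cleanIndex bool) []layer {
	s := newStore()
	s.put(layer{ID: layerID, CompressedDigest: testCompressed, CompressedSize: 23625492}) // pull test:1.0
	s.remove(layerID, cleanIndex)                                                          // rmi test:1.0
	s.put(layer{ID: layerID, CompressedDigest: nodeCompressed, CompressedSize: 22522274}) // pull node:lts-slim
	return s.lookup(testCompressed)                                                        // pull test:1.0 again
}

func main() {
	for _, l := range replay(false) {
		fmt.Printf("stale index: asked for %s, got a layer with %s (size %d)\n",
			testCompressed, l.CompressedDigest, l.CompressedSize)
	}
	fmt.Println("hits when the index is cleaned on removal:", len(replay(true)))
}
```

In the stale case the lookup returns the node:lts-slim record, whose compressed size is 22522274 rather than the expected 23625492, which is the mismatch reported in the error above.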
Once the c/storage PR (containers/storage#730) was applied, the issue was solved and was not reported again.
Found an issue when pulling images which have the same layer (diff-digest) compressed in different ways (so different compressed-diff-digest and compressed-size).
The size check has been dropped in more recent branches with commit:
223c722#diff-5f78baafa6d34438aba66ed64f4b16d5L517
so I am wondering whether it would be safe to do the same here (this is the code that was in place before the caching rework).