Skip to content

Conversation

@mtrmac
Copy link

@mtrmac mtrmac commented Sep 6, 2018

- What I did
Fixed a race when two coroutines independently try to add the same digest reference.

- How I did it
See the commit message.

- How to verify it
On a trivial level, using the modified unit tests. Beyond that, whatever is the usual standard, I guess.

It should be in principle possible to make an actual reproducer by concurrently issuing many pull and/or push API requests for the same image (same manifest digest); I haven’t even attempted that so far. The race window is a few lines of Go code in an operation which involves remote HTTP requests, so it might require quite a few concurrent attempts.

- Description for the changelog
Fixes a race possibly causing one of several concurrent push/pull operations with the same remote image fail with “Cannot overwrite digest …”.

reference.store.addReference fails when adding a digest reference
that already exists (regardless of the reference target).  Both
callers (via reference.store.AddDigest) do check in advance, using
reference.store.Get, whether the digest reference exists before
calling AddDigest, but the reference store lock is released between
the two calls, so if another thread sets the reference in the meantime,
AddDigest may fail with
> Cannot overwrite digest ...
.

Handle this by checking that the pre-existing reference points at the
same image, i.e. that there is nothing to do, and succeeding immediately
in that case.  This is even cheaper, avoids a reference.store.save() call.

(In principle, the same failure could have happened via
reference.store.AddTag, as
> Conflict: Tag %s is already set to image %s, if you want to replace it, please use -f option
but almost all callers (except for migrate/v1.Migrate, which is run
single-threaded anyway) set the "force" parameter of AddTag to true,
which makes the race invisible.  This commit does not change the behavior
of that case, except for speeding it up by avoiding the
reference.store.save() call.)

The existing reference.store.Get checks are now, in a sense, redundant
as such, but their existence allows the callers to provide nice
context-dependent error messages, so this commit leaves them unchanged.

Signed-off-by: Miloslav Trmač <[email protected]>
@mtrmac
Copy link
Author

mtrmac commented Sep 6, 2018

Upstream PR: moby#37781 .

@runcom
Copy link
Collaborator

runcom commented Sep 7, 2018

LGTM, let's just hear from upstream as well before letting this in

@runcom runcom merged commit b380b33 into projectatomic:docker-1.13.1 Sep 11, 2018
@mtrmac mtrmac deleted the reference-race branch September 11, 2018 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants