
Mutual exclusion for the /list and other shared endpoints for CDN/object storage drivers #761

Closed
arschles opened this issue Oct 8, 2018 · 5 comments
Labels: hosting (Work to do to improve/change how we host the services), storage (work to do on one or more of our storage systems)
Milestone: preview

Comments

arschles (Member) commented Oct 8, 2018

Is your feature request related to a problem? Please describe.

In the object storage systems we support (Azure CDN, GCS, Minio, etc.), a few stored entries represent the entire module across all of its versions. For example, the /list endpoint has to return every version of the requested module that Athens knows about.

When Athens fetches a new version of a module, it has to read the /list entry from the object storage system, update it, and write it back. So if someone runs go get m@v1 and go get m@v2 and Athens has to download both versions, there is a data race: Athens performs those fetches in parallel, so the two read-modify-write cycles can interleave. A sketch of the racy update is below.
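To make the race concrete, here is a minimal Go sketch of the unsynchronized read-modify-write. The Storage interface and appendVersion function are hypothetical stand-ins, not Athens' actual driver API.

```go
package storage

import (
	"context"
	"strings"
)

// Storage is a hypothetical blob-store interface, illustrative only.
type Storage interface {
	Read(ctx context.Context, key string) ([]byte, error)
	Write(ctx context.Context, key string, data []byte) error
}

// appendVersion does the unsynchronized read-modify-write described
// above. If two goroutines run it concurrently for m@v1 and m@v2, both
// read the same original list and the second Write clobbers the first:
// a classic lost update.
func appendVersion(ctx context.Context, s Storage, module, version string) error {
	key := module + "/@v/list"
	raw, err := s.Read(ctx, key)
	if err != nil {
		return err
	}
	list := strings.TrimSpace(string(raw)+"\n"+version) + "\n"
	return s.Write(ctx, key, []byte(list))
}
```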

Describe the solution you'd like

I think we should add locking to the affected storage implementations.

The rough design would be:

```
lock()
read /list
update /list
upload /list
unlock()
```
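A minimal sketch of that design in Go, assuming a hypothetical distributed Locker interface (whatever mechanism #760 settles on would provide the real one) and the illustrative Storage interface from the sketch above:

```go
package storage

import (
	"context"
	"strings"
)

// Storage is the same illustrative blob-store interface as in the
// earlier sketch.
type Storage interface {
	Read(ctx context.Context, key string) ([]byte, error)
	Write(ctx context.Context, key string, data []byte) error
}

// Locker is a hypothetical distributed lock; Lock blocks until the key
// is held and returns a function that releases it.
type Locker interface {
	Lock(ctx context.Context, key string) (unlock func(), err error)
}

// saveVersionLocked serializes concurrent updates to the /list blob by
// holding a per-module lock around the read-modify-write.
func saveVersionLocked(ctx context.Context, l Locker, s Storage, module, version string) error {
	unlock, err := l.Lock(ctx, module) // lock()
	if err != nil {
		return err
	}
	defer unlock() // unlock()

	key := module + "/@v/list"
	raw, err := s.Read(ctx, key) // read /list
	if err != nil {
		return err
	}
	list := strings.TrimSpace(string(raw)+"\n"+version) + "\n" // update /list
	return s.Write(ctx, key, []byte(list))                     // upload /list
}
```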

We could perhaps use the same underlying locking mechanism as #760.

Describe alternatives you've considered

We might want to consider using vendor-specific concurrency features instead. Many providers offer per-bucket or per-blob locking or versioning. I researched some of them for #50; from that OP (a sketch of the precondition approach follows the list):

  • Azure Blob Storage gives you versioned blobs and also lease/lock semantics for blobs
  • Google Cloud Storage gives you generational versioning and preconditions
  • Quick research on S3 indicates that objects have versions; more research is needed to determine whether the API includes preconditions on those versions
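As an example of the precondition approach, here is a minimal sketch using the cloud.google.com/go/storage client's GenerationMatch condition for optimistic concurrency. The function, bucket, and object layout are illustrative, not Athens code; it also assumes the list object already exists.

```go
package storage

import (
	"context"
	"fmt"
	"io"
	"strings"

	gcs "cloud.google.com/go/storage"
)

// appendVersionGCS reads the object's current generation, then makes
// the write conditional on that generation. If another writer updated
// the object in between, Close fails with a precondition error
// (HTTP 412) and the caller can re-read and retry.
func appendVersionGCS(ctx context.Context, client *gcs.Client, bucket, module, version string) error {
	obj := client.Bucket(bucket).Object(module + "/@v/list")

	attrs, err := obj.Attrs(ctx)
	if err != nil {
		return err
	}
	r, err := obj.NewReader(ctx)
	if err != nil {
		return err
	}
	raw, err := io.ReadAll(r)
	r.Close()
	if err != nil {
		return err
	}
	list := strings.TrimSpace(string(raw)+"\n"+version) + "\n"

	// The write only succeeds if the object is still at the generation
	// we read above.
	w := obj.If(gcs.Conditions{GenerationMatch: attrs.Generation}).NewWriter(ctx)
	if _, err := w.Write([]byte(list)); err != nil {
		w.Close()
		return err
	}
	if err := w.Close(); err != nil {
		return fmt.Errorf("conditional write failed (likely a concurrent update; retry): %w", err)
	}
	return nil
}
```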

Still, I strongly think we should prefer to do the locking with our own, unified mechanism.

Additional context

arschles added the storage and hosting labels on Oct 8, 2018
arschles added this to the preview milestone on Oct 8, 2018
marwan-at-work (Contributor) commented

Depending on the storage, I think there shouldn't be a data race if m@v1 and m@v2 were being stashed at the same time. For example, in blob storage a path is just an arbitrary key, and the fact that both versions' paths start with m/@v/ is irrelevant.

marpio (Member) commented Oct 11, 2018

@marwan-at-work @arschles please correct me if I'm wrong, but I think the idea here was that if we use a CDN on top of the blob storage, we have to write/append the new version to a list file on every Save so that go can use the CDN's /list endpoint.
I'm not sure this makes sense anymore, since we actually perform a merge on the /list endpoint (versions from the VCS + versions from storage) to get all current versions of a module; a rough sketch of that merge is below. If we only used the list file, we would lose that (just as if we only used what's in storage).
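A minimal sketch of that merge, assuming plain string version lists. The function name and the lexicographic sort are illustrative simplifications; real code would sort by semver.

```go
package storage

import "sort"

// mergeVersions unions the versions known to the VCS with the versions
// already in storage, deduplicated, as described above.
func mergeVersions(fromVCS, fromStorage []string) []string {
	seen := make(map[string]struct{})
	var merged []string
	for _, v := range append(fromVCS, fromStorage...) {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		merged = append(merged, v)
	}
	sort.Strings(merged)
	return merged
}
```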

arschles changed the title from "Protect the /list and other shared endpoints for CDN/object storage drivers" to "Mutual exclusion for the /list and other shared endpoints for CDN/object storage drivers" on Oct 11, 2018
arschles (Member, Author) commented Nov 7, 2018

@marwan-at-work @marpio I wrote this a while ago, when we planned to make the proxy redirect directly to a CDN. That would mean that on a cache miss the proxy would have to write directly to the CDN, and two processes could still race on the /list blob even if they only append.

We're not doing the redirect-to-CDN logic now, except to redirect to an upstream, right? If that's the case, I think we can close this.

marpio (Member) commented Nov 8, 2018

Yes, we redirect to upstream. I think we can close this as well.
/cc @michalpristas

arschles (Member, Author) commented Nov 8, 2018

Thanks everyone, I'll close this now then :)

arschles closed this as completed on Nov 8, 2018