
Implementation details of redirect-to-olympus #474

Closed
arschles opened this issue Aug 13, 2018 · 12 comments

@arschles
Member

arschles commented Aug 13, 2018

This is a follow-up of #456 (comment)

We've decided that, in our first release, the proxy will redirect to an upstream without caching. This issue is for discussing what we should do in a follow-up to that release (I have labeled this with the v2 milestone).

There are two related questions:

  • Should we make the proxy download modules from an upstream by default?
  • If we download, should we do it synchronously on a request (i.e. before we return to the client), or should we do it asynchronously?
@arschles arschles added the proposal A proposal for discussion and possibly a vote label Aug 13, 2018
@michalpristas michalpristas self-assigned this Aug 14, 2018
@michalpristas
Member

michalpristas commented Aug 14, 2018

My idea was to keep it fast and synchronous - the redirect -
with an async DB fill.
We can do this as additional middleware that checks the status code of next(c).
It can see three scenarios:

  • 2xx: we have the module registered and present (possibly even after a sync fill from the VCS)
  • 3xx: we are in fact redirecting (to Olympus): it should schedule a worker job
  • 5xx: something happened; ignore

This way we don't bloat the code with spinning and checking, but treat it in a cleaner way.

@michalpristas
Member

michalpristas commented Aug 14, 2018

To layer it up, we will have:

job scheduling middleware

  • checks the status code of next(c), where next(c) == redirect middleware
    • 2xx: we have the module registered and present (possibly even after a sync fill from the VCS)
    • 3xx: we are in fact redirecting (to Olympus): it should schedule a worker job
    • 5xx: something happened; ignore

redirect middleware

  • checks whether or not the module should be processed; next(c) == handler
    • Private: calls next(c); does not redirect to Olympus
    • Include: calls err := next(c); if err == NotFound { redirectToOlympus }
    • Exclude: renders StatusForbidden

handler

  • responsible for serving modules from the DB
  • in case it needs to pull from the VCS, it does so here, synchronously, invisible from the outside
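The three filter cases in the redirect middleware could be expressed as a small decision function. The `rule` type, the `redirect` function, and the returned strings here are illustrative names for discussion, not Athens' actual API:

```go
package main

import (
	"errors"
	"fmt"
)

// rule mirrors the Private/Include/Exclude cases above.
type rule int

const (
	Private rule = iota // serve locally, never redirect
	Include             // serve locally, redirect to Olympus on a miss
	Exclude             // refuse to serve at all
)

var errNotFound = errors.New("module not found")

// redirect decides what the redirect middleware does for a module.
// nextErr is the result of calling the inner handler, i.e. next(c).
func redirect(r rule, nextErr error) string {
	switch r {
	case Private:
		if nextErr != nil {
			return "error" // private modules are never redirected
		}
		return "served locally"
	case Include:
		if errors.Is(nextErr, errNotFound) {
			return "redirect to Olympus"
		}
		if nextErr != nil {
			return "error"
		}
		return "served locally"
	case Exclude:
		return "403 Forbidden"
	}
	return "error"
}

func main() {
	fmt.Println(redirect(Include, errNotFound)) // redirect to Olympus
	fmt.Println(redirect(Private, errNotFound)) // error, but no redirect
	fmt.Println(redirect(Exclude, nil))         // 403 Forbidden
}
```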

@marpio
Member

marpio commented Aug 14, 2018

3xx: we are in fact redirecting (to olympus): it should schedule a worker job

Should the worker fetch the module from Olympus or from the VCS? The latter could be slower (assuming Olympus has it stored in its DB), but we already have this functionality implemented, so maybe it's enough for the MVP?

@michalpristas
Member

michalpristas commented Aug 14, 2018

The worker will do an Olympus pull only, because we redirect to Olympus.
We can't redirect to the VCS, as it does not implement the download protocol, so the VCS pull is handled synchronously in the handler: the module is stored in the local DB and then served back.

Edit: check these charts: https://docs.gomods.io/design/communication/
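That synchronous handler path can be sketched with a map standing in for the proxy's local DB. `store`, `vcsFetch`, and `serve` are hypothetical names for illustration:

```go
package main

import "fmt"

// store is a stand-in for the proxy's local module DB.
var store = map[string][]byte{}

// vcsFetch stands in for a go-get-style pull from the VCS.
func vcsFetch(module string) []byte { return []byte("zip-of-" + module) }

// serve implements the synchronous path described above: if the module
// is not in the local DB, pull it from the VCS, store it, then serve it.
// From the outside this looks like a plain cache hit, just slower.
func serve(module string) []byte {
	if zip, ok := store[module]; ok {
		return zip
	}
	zip := vcsFetch(module)
	store[module] = zip
	return zip
}

func main() {
	fmt.Printf("%s\n", serve("github.com/foo/bar@v1.0.0"))
	fmt.Println(len(store)) // 1: stored after the first request
}
```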

@marpio
Member

marpio commented Aug 14, 2018

We are redirecting cmd/go to Olympus.
My question was about storing the module in proxy storage, which, I assumed, the worker job should do. Couldn't we fill the storage just like we do for private modules (using go get - https://github.com/gomods/athens/blob/master/pkg/download/goget/goget.go)?
I don't have a clear preference, just wanted to make this clear.

@michalpristas
Member

Yes, we might, but it would be slower: it would require a synchronous call to Olympus, which would synchronously pull the module from the VCS and then serve it to the proxy. The proxy would need to store it as well and then serve it to the client.

This might time out for larger packages. I see async as the safer of the two variants.
I added this point to the meeting agenda so we can dive deeper and discuss together.

@michalpristas
Member

After the proxy redirects the go command to Olympus, its worker should pull from Olympus as well.
We want to minimize collisions where the same version has different SHAs. If it's still not present, a few configured retries should do the trick.

@marpio
Member

marpio commented Aug 14, 2018

Also relevant: #416

@arschles
Member Author

We had a similar discussion about the proxy returning to the CLI with an indication that it should do a VCS fetch while the proxy fetched in the background. We decided then to keep it simple: do a go get in the background, store to the datastore, and then return to the client (we discussed possibly storing and returning concurrently as an optimization).

What do folks think about using the same strategy and having the proxy download from Olympus before returning to the client?

@michalpristas
Member

michalpristas commented Aug 15, 2018

One thing I like about a sync solution is that the Olympus server may be unreachable from your client (firewall or whatever) while the proxy can still reach Olympus (think of any weird rule and you will find at least one company implementing it). Having this sync would then be safer, causing less frustration, and that's what we want: happy users.
If perf is good, we should go with it.

But we need to think about scalability. Having it async gives us the advantage of handling the load in a preconfigured manner: we don't run out of memory because we know we will handle, say, 5 pulls at a time. With a pull per request it will be more challenging to scale (and it will be fun), and we need to think about distributed locks (mentioned in a different thread: ZooKeeper/Consul or something similar). I think with sync this becomes a must.

If perf turns out unsatisfying, we need to consider an async fill with a redirect.
We can also plan for v2 to make it switchable; it would just be a matter of injecting the right middleware.
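The "5 pulls at a time" idea is a bounded worker pool; in Go a buffered channel works as the semaphore. A sketch with illustrative names, not Athens code (the `pull` body only records concurrency so the bound is observable):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// pull stands in for an async module fill; it tracks how many pulls
// run concurrently so we can observe the peak.
func pull(module string, inFlight, peak *int64) {
	n := atomic.AddInt64(inFlight, 1)
	for {
		p := atomic.LoadInt64(peak)
		if n <= p || atomic.CompareAndSwapInt64(peak, p, n) {
			break // peak now reflects the highest concurrency seen
		}
	}
	time.Sleep(time.Millisecond) // simulate the actual module pull
	atomic.AddInt64(inFlight, -1)
}

// runBounded processes fill jobs with at most `limit` pulls in flight,
// the preconfigured-load idea from the comment above.
func runBounded(modules []string, limit int) int64 {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var inFlight, peak int64
	for _, m := range modules {
		wg.Add(1)
		sem <- struct{}{} // blocks once `limit` pulls are running
		go func(m string) {
			defer wg.Done()
			defer func() { <-sem }()
			pull(m, &inFlight, &peak)
		}(m)
	}
	wg.Wait()
	return peak
}

func main() {
	mods := make([]string, 50)
	for i := range mods {
		mods[i] = fmt.Sprintf("mod%d", i)
	}
	fmt.Println(runBounded(mods, 5) <= 5) // true: never more than 5 concurrent pulls
}
```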

@arschles arschles added this to the v2 milestone Aug 16, 2018
@michalpristas
Member

What we talked about at the meeting:

1 proxy instance : multiple DBs

Not for the MVP.

Multiple replicas of the proxy, sharing a DB

For the MVP we focus on 1:1.
Post-MVP: we will eventually support this.

Deduplication and sync

MVP: 1 replica, in-memory synchronization/locks.
Post-MVP: with multiple replicas, a distributed lock; multiple vendors/strategies will be supported, such as etcd, Consul, and ZooKeeper.

@arschles
Member Author

As of #772, we're not going to try to build a registry for the time being, so I'm closing this issue.
