Retry in DRS #238

Closed
briandoconnor opened this issue Mar 13, 2019 · 5 comments
Comments

@briandoconnor
Contributor

briandoconnor commented Mar 13, 2019

'/objects/{object_id}/access/{access_id}':

HCA asked whether we can support a "301 - Handle asynchronously, downstream users are expected to retry later." response, with a Retry-After header giving a "delay in seconds, downstream users are expected to retry after the delay." See their API for examples: https://dss.data.humancellatlas.org/.

It might be worth looking at other endpoints, but the access endpoint in particular might involve a delay and the need for a retry if, for example, a given DRS server is pulling data off of cold storage.

@dglazer
Member

dglazer commented Mar 13, 2019

I think there are two separate requests in the HCA documentation:
a) ask clients to be well-behaved on retry, including honoring Retry-After headers. That seems reasonable, and is fine to document, but is mostly just web best practice -- nothing we say in the API spec will force good behavior, esp. since it's not up to the server implementors and can't be measured by compliance tests. But it doesn't hurt to ask clients to be gentle (probably with a short paragraph in the doc, as opposed to actually in the protocol definition).
b) add support for 301 (Moved Permanently). This one seems odd to me, since I think it only makes sense if we move the DRS server itself, which is presumably a rare event that would be handled out of band, and would be hard for a client to build in. Or is the intent to say something about resources moving to new physical locations? If so, that also seems unnecessary, since the whole point of DRS is to keep clients from having to remember physical locations.
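
As an illustration of point (a), here is a minimal sketch of how a client might interpret a Retry-After value. HTTP allows the header to carry either a number of seconds or an HTTP date, so both forms are handled. The function name `retry_after_seconds` and its `now` parameter are hypothetical, chosen for this example only; nothing here is part of the DRS specification.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the delay in seconds encoded by a Retry-After header value.

    Hypothetical helper: returns None when the value cannot be parsed,
    leaving the fallback policy to the caller.
    """
    value = value.strip()
    # Form 1: a non-negative integer number of seconds, e.g. "120".
    if value.isdigit():
        return float(value)
    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())
```

A gentle client would sleep for the returned number of seconds before retrying, and apply its own default backoff when the result is None.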

@mikebaumann
Contributor

mikebaumann commented Apr 30, 2019

@dglazer makes good points regarding Retry-After, a standard and common HTTP practice, and one that is worth documenting as behavior that (at least some) DRS servers will perform. Yet, explicitly including it in the DRS specification would be redundant and unnecessary, as it is already well defined in the HTTP specification on which the DRS specification is based.

The same could be said for HTTP 3xx redirects, which are also part of the HTTP specification, and which (at least some) DRS servers will perform as part of their internal implementation/operation. Yet, I think 3xx redirects are less commonly used and their semantics are not as clear as Retry-After, so more explicit guidance, at least in the documentation, seems appropriate. I think DRS server redirection of the original DRS request to a different URI (e.g. a different path on the same host) is a legitimate and useful behavior that will be used by some DRS server implementations. As stated in 301, "Moved Permanently" pertains to the URI, and does not necessarily imply anything about movement of the resource itself.

@mikebaumann
Contributor

mikebaumann commented May 6, 2019

The first key question here is:

Should the DRS specification be amenable to data storage systems supporting nearline/cold storage? That is, storage systems which may necessarily encounter a delay when accessing data, such as AWS Glacier.

Given DRS is a GA4GH specification, genomic data files are large, and there is a large and ever increasing number of them, hopefully the answer to this question is a resounding "Yes!"

The next key question is:

How to incorporate support for data access delays into the DRS specification?

The redirection/retry-after approach currently in the HCA data-store API and implementation is one option. Are there other approaches/options that others would like to propose for this purpose?

@mikebaumann
Contributor

For more information on this issue, please see: #274

dglazer pushed a commit that referenced this issue Jun 17, 2019
* Enable handling of data access delays using HTTP 301/Retry-After

Enable DRS schema support for data repository services that
may incur delays, such as retrieval of data from cold storage
with substantial latency.

When an operation is delayed, a response is provided with
HTTP code 301 and a Retry-After header indicating the duration
the client should wait before following the redirect.

Resolves #238

* Changed delay response status code to 202

Changed the response in the case of a delay from
status code 301 (Moved Permanently) to 202 (Accepted).
This is a better choice as it is more consistent with
the IETF specifications for HTTP and the DRS API overall.
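
To illustrate the behavior described in this commit message, here is a minimal client-side sketch: a 202 response means the data is not ready yet, and the client waits for the Retry-After delay before asking again. The function `fetch_with_retry`, its parameters, and the `(status, headers, body)` response shape are assumptions made for this example and are not defined by DRS. Only the seconds form of Retry-After is handled, and header lookup assumes the exact key `Retry-After`.

```python
import time
from typing import Callable, Dict, Tuple

# Assumed response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Dict[str, str], bytes]


def fetch_with_retry(
    fetch: Callable[[], Response],
    max_attempts: int = 5,
    default_delay: float = 1.0,
    max_delay: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `fetch` until it returns something other than 202 (Accepted).

    `fetch` is a hypothetical callable that performs one request, for
    example against /objects/{object_id}/access/{access_id}.
    """
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status != 202:
            # Any non-202 response (success or error) goes back to the caller.
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # Out of attempts; do not sleep again.
        raw = headers.get("Retry-After", "")
        # Fall back to a default when the header is missing or not an integer.
        delay = float(raw) if raw.strip().isdigit() else default_delay
        # Cap the wait so a very large value cannot stall the client indefinitely.
        sleep(min(delay, max_delay))
    raise TimeoutError(f"still delayed after {max_attempts} attempts")
```

Passing `fetch` and `sleep` as parameters keeps the sketch independent of any particular HTTP library and makes the retry logic easy to exercise without real network calls or real waiting.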
@dglazer
Member

dglazer commented Jun 17, 2019

Resolved by #274

@dglazer dglazer closed this as completed Jun 17, 2019