
Pagination #2

Open

Afterster opened this issue Nov 2, 2014 · 8 comments

Comments

@Afterster
Collaborator

Paging should be supported from the beginning, as filtering is.
I think all that's missing is optional start and end parameters, don't you?

@sampsyo changed the title from Paging to Pagination on Nov 2, 2014
@sampsyo
Member

sampsyo commented Nov 2, 2014

Yep, let's add pagination.

There was some discussion about this in beetbox/beets#750 and beetbox/beets#718. Some people (e.g., @asutherland) have suggested that start/end or offset/limit can be an efficiency problem in large collections, so let's be careful about this one. Options include:

  1. Offset/limit: straightforward but potentially slow.
  2. Since the goal is to reduce bandwidth, just provide an "id-only" option. Clients are required to download all the ids at once and then can request details for batches of ids in smaller chunks.
  3. Supply the id for the first object to return (i.e., like offset/limit but use an id in place of the offset). This won't help in terms of server efficiency for complex searches, but it will help in the case of getting all the resources in a collection (e.g., GET /items), which seems like a common case because databases can usually efficiently look up by id.
  4. Stateful "cursor" pagination. This would likely be the most efficient from a database perspective but requires nontrivial server complexity.

Input from people with more database server experience here would be greatly appreciated.
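To make the trade-off between options 1 and 3 concrete, here's a rough sketch (not part of any spec), assuming a SQLite-backed library with an `items` table and an integer primary key `id`; the table and column names are hypothetical:

```python
# Rough sketch only: contrasts option 1 (offset/limit) with option 3
# (resume after an id), assuming a SQLite-backed library with an
# `items` table and integer primary key `id` (names are hypothetical).
import sqlite3

conn = sqlite3.connect("library.db")

def page_by_offset(limit, offset):
    # Option 1: the database must walk past `offset` rows before it can
    # return anything, so later pages get progressively slower.
    return conn.execute(
        "SELECT * FROM items ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()

def page_after_id(limit, last_id=0):
    # Option 3: seek straight to the first id after the previous page;
    # the primary-key index answers this without scanning earlier rows.
    return conn.execute(
        "SELECT * FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, limit),
    ).fetchall()
```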

@sampsyo
Member

sampsyo commented Nov 2, 2014

I should add that option 3 looks like a good 80% solution that would be easy to support, so I lean that way at the moment.

@Afterster
Collaborator Author

Interesting discussion on pagination (yeah, not paging... ^^) in the beets threads indeed. Option 3 looks like a good compromise.

@adamcik
Collaborator

adamcik commented Nov 4, 2014

Normal "trick" for doing this is having a opaque continuation token. The meaning of this token is internal to the server, and can in effect be a string that encodes a limit/offset, an identifier of the last item retrieved or any other data needed to continue paginating efficiently.

Downside of this variant is that you can never jump straight to page N but must go through 0-N to get there. On the positive side this leaves the servers free to store the information it needs to do efficient pagination and not have to work around whatever was chosen here.
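To illustrate (a sketch only, not a proposal for the actual encoding), an opaque token can be as simple as base64-encoded JSON that only the server knows how to interpret:

```python
# Sketch of an opaque continuation token, assuming the server chooses to
# encode "resume after this id"; the client never looks inside the token.
import base64
import json

def make_token(state):
    """Pack server-internal pagination state into an opaque string."""
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()

def read_token(token):
    """Recover the state; the format is free to change between versions."""
    return json.loads(base64.urlsafe_b64decode(token.encode()))

# e.g. the server might hand out make_token({"after_id": 1042, "limit": 100})
# and later swap the encoding for a real DB cursor id without breaking clients.
```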

@sampsyo
Member

sampsyo commented Nov 5, 2014

Very cool. I'm enthusiastic about this opaque-token pagination idea. To save any other readers a little googling, here's a discussion of the problems with limit/offset, and here's an implementation of the idea for Django. I like this plan because it accommodates everything from the naive approach (so it's easy to implement) up to a stateful DB-cursor design if you're a glutton for punishment.

One unresolved question is whether the server should be allowed to unilaterally truncate. On one hand, servers might want to guard against unintentional DoS by returning fewer items than the client requested; on the other, clients would then be required to implement pagination (a client could no longer be blissfully unaware of pagination and hope to get all the data). I think the former probably outweighs the latter. Note that the client could still request pagination and the server would have to honor it; it's just that the server could also paginate without being asked.
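For illustration, here's roughly what a client that tolerates unilateral truncation would look like; the "continue" parameter and the response field names are placeholders, not the spec:

```python
# Sketch of a client that follows continuation tokens until exhausted.
# The "continue" parameter and response fields are placeholders only.
import requests

def fetch_all(base_url):
    params = {}
    while True:
        resp = requests.get(base_url + "/items", params=params)
        resp.raise_for_status()
        body = resp.json()
        yield from body["items"]
        token = body.get("continue")
        if token is None:
            break  # the server has sent everything it intends to send
        params = {"continue": token}
```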

(BTW, @adamcik, I'm a Mopidy admirer—I'd be really interested in hearing about lessons learned from Mopidy development. It would be a great design goal to build something that feels "natural" as a Mopidy frontend.)

sampsyo added a commit that referenced this issue Nov 5, 2014
@sampsyo
Member

sampsyo commented Nov 5, 2014

I've put together a draft of the pagination idea: http://auraspec.readthedocs.org/en/latest/api.html#pagination

Comments and criticism welcome!

@asutherland
Collaborator

Looks good in general. For minutiae purposes, it's probably worth explicitly stating that use of the opaque token in a request is expected to invalidate the token. It's also probably desirable to state the status code that should be used if the token is expired (410 Gone? Just 400 Bad Request?).

If it's expected that some servers might use the token to maintain state with non-trivial overhead, it might be good to describe "limit=0" as a valid parameter for optimized invalidation of continue tokens that will never be used. So if a user closes their search window, etc., a well-behaved UI could invalidate that token, and a server that doesn't care could just treat limit=0 as a fast-path no-op.
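Here's a rough sketch of how a stateful server could combine these ideas (single-use tokens, limit=0 as fast-path invalidation, 410 for an expired token); it assumes Flask and an in-memory library, and the route, parameter names, and status-code choice are illustrative only, not settled by the spec:

```python
# Sketch of stateful token handling: tokens are single-use, limit=0 is a
# fast-path invalidation, and an unknown or expired token yields 410 Gone.
# The route, parameter names, and in-memory "library" are illustrative only.
import secrets
from flask import Flask, abort, jsonify, request

app = Flask(__name__)
LIBRARY = [{"id": i, "title": f"track {i}"} for i in range(1, 1001)]
cursors = {}  # opaque token -> position to resume from

@app.route("/items")
def items():
    token = request.args.get("continue")
    limit = int(request.args.get("limit", 100))

    if token is not None:
        start = cursors.pop(token, None)  # single-use: consume the token
        if start is None:
            abort(410)  # token already used or expired
        if limit == 0:
            return jsonify({"items": []})  # explicit invalidation, no query
    else:
        start = 0

    page = LIBRARY[start:start + limit]
    body = {"items": page}
    if start + limit < len(LIBRARY):
        new_token = secrets.token_urlsafe(16)
        cursors[new_token] = start + limit
        body["continue"] = new_token
    return jsonify(body)
```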

@sampsyo
Member

sampsyo commented Nov 6, 2014

Good point. Two important things were missing: the token is single-use, and the server needs to be able to expire tokens if it wants to (to allow reasonable stateful implementations).

I'll keep thinking about whether "well-behaved" clients should be able to explicitly expire tokens. That might be best left as an implementation-specific feature.
