AIP-158

Pagination

APIs often need to provide collections of data, most commonly in the List standard method. However, collections can often be arbitrarily sized, and also often grow over time, increasing lookup time as well as the size of the responses being sent over the wire. Therefore, it is important that collections be paginated.

Guidance

RPCs returning collections of data must provide pagination at the outset, as it is a backwards-incompatible change to add pagination to an existing method.

// The request structure for listing books.
message ListBooksRequest {
  // The parent, which owns this collection of books.
  // Format: publishers/{publisher}
  string parent = 1;

  // The maximum number of books to return. The service may return fewer than
  // this value.
  // If unspecified, at most 50 books will be returned.
  // The maximum value is 1000; values above 1000 will be coerced to 1000.
  int32 page_size = 2;

  // A page token, received from a previous `ListBooks` call.
  // Provide this to retrieve the subsequent page.
  //
  // When paginating, all other parameters provided to `ListBooks` must match
  // the call that provided the page token.
  string page_token = 3;
}

// The response structure from listing books.
message ListBooksResponse {
  // The books from the specified publisher.
  repeated Book books = 1;

  // A token that can be sent as `page_token` to retrieve the next page.
  // If this field is omitted, there are no subsequent pages.
  string next_page_token = 2;
}
  • Request messages for collections should define an int32 page_size field, allowing users to specify the maximum number of results to return.
    • If the user does not specify page_size (or specifies 0), the API chooses an appropriate default, which the API should document.
    • If the user specifies page_size greater than the maximum permitted by the API, the API should coerce down to the maximum permitted page size.
    • If the user specifies a negative value for page_size, the API must send an INVALID_ARGUMENT error.
    • The API may return fewer results than the number requested (including zero results), even if not at the end of the collection.
  • Request messages for collections should define a string page_token field, allowing users to advance to the next page in the collection.
    • If the user changes the page_size in a request for subsequent pages, the service must honor the new page size.
    • The user is expected to keep all other arguments to the RPC the same; if any arguments are different, the API should send an INVALID_ARGUMENT error.
  • Response messages for collections should define a string next_page_token field, providing the user with a page token that may be used to retrieve the next page.
    • If the end of the collection has been reached, the next_page_token field must be empty. This is the only way to communicate “end-of-collection” to users.
    • If the end of the collection has not been reached (or if the API can not determine in time), the API must provide a next_page_token.
  • Response messages for collections may provide an int32 total_size field, providing the user with the total number of items in the list.
    • This total may be an estimate (but the API should explicitly document that).

Opacity

Page tokens provided by APIs must be opaque (but URL-safe) strings, and must not be user-parseable. This is because if users are able to deconstruct these, they will do so. This effectively makes the implementation details of your API’s pagination become part of the API surface, and it becomes impossible to update those details without breaking users.

Warning: Base-64 encoding an otherwise-transparent page token is not a sufficient obfuscation mechanism.

For page tokens which do not need to be stored in a database, and which do not contain sensitive data, an API may obfuscate the page token by defining an internal protocol buffer message with any data needed, and send the serialized proto, base-64 encoded.

Expiring page tokens

Many APIs store page tokens in a database internally. In this situation, APIs may expire page tokens a reasonable time after they have been sent, in order not to needlessly store large amounts of data that is unlikely to be used. It is not necessary to document this behavior.

Note: While a reasonable time may vary between APIs, a good rule of thumb is three days.

Backwards compatibility

Adding pagination to an existing RPC is a backwards-incompatible change. This may seem strange; adding fields to proto messages is generally backwards compatible. However, this change is behaviorally incompatible.

Consider a user whose collection has 75 resources, and who has already written and deployed code. If the API later adds pagination fields, and sets the default to 50, then that user’s code breaks; it was getting all resources, and now is only getting the first 50 (and does not know to advance pagination). Even if the API set a higher default limit, such as 100, the user’s collection could grow, and then the code would break.

For this reason, it is important to always add pagination to RPCs returning collections up front; they are consistently important, and they can not be added later without causing problems for existing users.

Warning: This also entails that, in addition to presenting the pagination fields, they must be actually implemented with a non-infinite default value. Implementing an in-memory version (which might fetch everything then paginate) is reasonable for initially-small collections.

Changelog

  • 2019-08-01: Changed the examples from “shelves” to “publishers”, to present a better example of resource ownership.
  • 2019-07-19: Update the opacity requirement from “should” to “must”.