REST – By-the-book REST vs Too Many Requests

Tags: api, api-design, rest

From Roy Fielding's comment on his own article, decrying fake REST APIs:

A truly RESTful API looks like hypertext. Every addressable unit of
information carries an address, either explicitly (e.g., link and id
attributes) or implicitly (e.g., derived from the media type
definition and representation structure). Query results are
represented by a list of links with summary information, not by arrays
of object representations (query is not a substitute for
identification of resources).

This means that if you needed to, say, query for a list of the 100 most recently logged in users, and display their names and emails, you'd need to first do a GET query for the list of results, which would (essentially) be a list of link elements, with each link object containing the URI of a user resource. You would then need to make 100 more GET requests — one for each user resource — before you'd actually have the data you need to display your results.
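That request pattern can be sketched with a toy in-memory "server" (the endpoint names and data are hypothetical, purely to count round trips under a strictly link-based design):

```python
# Simulate a strictly link-based API with an in-memory "server".
# Endpoint names and data are illustrative, not from any real API.
SERVER = {
    "/users/recent": {"links": [f"/users/{i}" for i in range(100)]},
    **{f"/users/{i}": {"name": f"user{i}", "email": f"user{i}@example.com"}
       for i in range(100)},
}

request_count = 0

def get(uri):
    """Stand-in for an HTTP GET; counts round trips."""
    global request_count
    request_count += 1
    return SERVER[uri]

# 1 request for the list of links...
listing = get("/users/recent")
# ...then 1 request per linked user resource.
users = [get(link) for link in listing["links"]]

print(request_count)  # 101 round trips for 100 users
```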

That seems incredibly inefficient. Is there really no other truly RESTful way to get the data you need in 1 or 2 requests?

Best Answer

Is there really no other truly RESTful way to get the data you need in 1 or 2 requests?

  • Not really
  • But don't over think it

As usual, when thinking about REST, keep in mind that there is a reference implementation (the world wide web) that you can check against.

Consider the Amazon portal - when I open that bookmark with an empty cache, I see my browser make requests to 275 resources.

Would I get better latency if all of that state were fetched in a single payload? Yes.

Would it scale? Would it web-scale? Probably not. That's 4.5MB of data that can't be shared because it includes 1KB that is specific to my profile. If my colleague at the desk next to me also goes to Amazon, she pulls the same data across the network.

Decompose that payload into individually addressable resources, and suddenly things get a lot better -- we each still get our 1KB of personalization, and we each still have our locally cached copy of the 4.5MB, but we haven't needed to bang the network as hard, because most of our requests were served by a local shared cache, rather than needing to route across the internet.
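The caching arithmetic can be sketched with a toy shared cache; the resource names, the shared/private split, and the client names are invented for illustration:

```python
# Toy model: two clients behind one shared intermediary cache.
# Resource names and the shared/private split are illustrative.
SHARED = ["/css/site.css", "/js/app.js", "/img/logo.png"]  # cacheable for everyone
PRIVATE = "/me/profile"                                     # per-user, never shared

cache = set()        # the shared intermediary cache
origin_fetches = 0   # requests that had to cross the network to the origin

def fetch(uri, private=False):
    """Serve from the shared cache when possible; otherwise go to origin."""
    global origin_fetches
    if not private and uri in cache:
        return "cache hit"
    origin_fetches += 1
    if not private:
        cache.add(uri)
    return "origin"

for client in ("alice", "bob"):
    for uri in SHARED:
        fetch(uri)
    fetch(PRIVATE, private=True)

# 3 shared resources hit the origin once, plus 2 private fetches = 5,
# instead of 8 origin fetches if nothing were shareable.
print(origin_fetches)
```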

Also, keep in mind that you don't really have an issue with multiple resources; you have an issue with multiple requests. That can be mitigated using HTTP/2 push promises, with the server proactively pushing representations that can be cached. Maybe: a stateless server doesn't know what the client has cached, and TLS suggests that caching at intermediaries isn't a priority....

This means that if you needed to, say, query for a list of the 100 most recently logged in users, and display their names and emails, you'd need to first do a GET query for the list of results, which would (essentially) be a list of link elements, with each link object containing the URI of a user resource. You would then need to make 100 more GET requests -- one for each user resource -- before you'd actually have the data you need to display your results.

Of course, if you were doing this in HTML, your representation of the most recently logged-in users would probably be a document with a list or a table of names and email addresses and links to those resources. Ta-da.
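Concretely, such a representation might be generated like this: one document carrying both the summary data and the per-user links (the URI scheme and user data are hypothetical):

```python
# Build an HTML list of users where each entry carries summary data
# (name, email) AND a link to the user's own resource.
# The /users/{id} URI scheme is an assumption for illustration.
users = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Grace", "email": "grace@example.com"},
]

rows = "\n".join(
    f'  <li><a href="/users/{u["id"]}">{u["name"]}</a> &lt;{u["email"]}&gt;</li>'
    for u in users
)
html = f"<ul>\n{rows}\n</ul>"
print(html)
```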

Don't lose track of this observation by Fielding.

That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them.

EDIT

Can I make the same argument for a JSON representation? I.e., if the resource in question is "100 last logged-in users" rather than the results of a parameterized query, can I then return the data itself instead of resource links? If not, why not? Why is JSON essentially different from HTML in this regard?

How they are alike: packing more data into the "list of results" saves you the costs of the additional requests, while compromising scaling. The specific media type you are using for the representation doesn't matter -- at least, as far as I know.

How they are different: HTML is a hypermedia format, and JSON isn't. Any bog-standard client implementation that is familiar with the HTML spec will know how to find the links in an HTML document, which supports options like pre-fetching. JSON doesn't have that standardization: you need out-of-band information about the data structure to understand where the links are in a JSON representation. HAL would be a closer match to HTML in this regard; the primary difference between HAL and HTML is adoption. HTML has a 20-year head start.

For additional insights, you might also consider reviewing the Atom Syndication Format, which describes both Entries and Feeds (lists of Entries), especially the rules for atom:entry, which might be accessed via a standalone resource or via a feed resource.
