REST Design – Multiple Calls vs Returning All Data in One Call


I am trying to build a REST API for an Android app. Suppose I have a users table with (id, name, email), a songs table with (id, song_name, album), and a rich join association between them called streams with (user_id, song_id, listen_count). I want to fetch details about all the streams and show them in the app as a list. The list would show the song name, album name, user name and listen count. I see three plausible options:

1. GET to /streams and fetch a list of all the song_ids and user_ids. Then make a GET to /user/:id and /song/:id for each user and song id to get the user and song information.
2. GET to /streams and fetch a list of all the user_ids and song_ids. Then one GET to /user?ids=<comma_separated_ids> to fetch information about all the users and one GET to /song?ids=<comma_separated_ids> to fetch information about all the songs.
3. GET to /streams and fetch everything in one call. Something like:

[
  {
    "user_id" : 10,
    "song_id" : 14,
    "listen_count" : 5,
    "user" : {
      "id"     : 10,
      "name"   : "bla",
      "email"  : "bla"
    },
    "song" : {
      "id"     : 14,
      "name"   : "blu",
      "album"  : "blu"
    }
  },
...
]

I'm tempted to go with option 3 because it gives me everything in one call, but I don't think it's very RESTful and I fear that it won't scale. Option 2 is good, but it takes 3 calls, which would mean considerable time loading the list. And option 1 follows REST but will take numerous calls to show the list, and making so many calls from a mobile device isn't feasible.
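
To put rough numbers on the three options, here is a minimal sketch that counts the HTTP requests each one needs to render the list. It assumes the client deduplicates ids before fetching (otherwise option 1 is worse still) and that the list is not empty; the function name and the sample rows are made up for illustration.

```python
def request_counts(streams):
    """Count the HTTP requests each option needs for a non-empty list of streams."""
    user_ids = {s["user_id"] for s in streams}
    song_ids = {s["song_id"] for s in streams}
    return {
        # One call for /streams, then one call per distinct user and per distinct song.
        "option_1": 1 + len(user_ids) + len(song_ids),
        # One call for /streams, one batched /user call, one batched /song call.
        "option_2": 3,
        # Everything embedded in the /streams response.
        "option_3": 1,
    }


# Hypothetical sample data: three streams, two distinct users, two distinct songs.
streams = [
    {"user_id": 10, "song_id": 14, "listen_count": 5},
    {"user_id": 10, "song_id": 15, "listen_count": 2},
    {"user_id": 11, "song_id": 14, "listen_count": 7},
]
counts = request_counts(streams)
```

Option 1 grows with the number of distinct users and songs, while options 2 and 3 stay constant regardless of list size.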

What would be the recommended way to go about this?

Best Answer

When creating a REST interface, there is no requirement, or even expectation, that the responses on the REST interface correspond directly to tables or joins in the database.

Your /streams interface can just as easily be represented as

[
  {
    "listen_count" : 5,
    "user" : {
      "href"   : "/users/10",
      "name"   : "bla"
    },
    "song" : {
      "href"   : "/songs/14",
      "name"   : "blu",
      "album"  : "blu"
    }
  },
  ...
]

Here, the JSON objects contain the main details of users and songs that are (nearly) always relevant for consumers of a stream resource, plus a link to the relevant user/song resource if further details are needed.

This is essentially a variation of your third option, with a fallback to option 1 if more details are needed.
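
As a rough server-side sketch of that representation, assuming the three tables have already been loaded into plain dictionaries keyed by id (the function name and the in-memory data are hypothetical, not part of the question):

```python
def stream_representation(stream, users, songs):
    """Build one /streams entry: summary fields plus links, not a raw join row."""
    user = users[stream["user_id"]]
    song = songs[stream["song_id"]]
    return {
        "listen_count": stream["listen_count"],
        # Only what the list needs; the email stays behind the /users/:id link.
        "user": {"href": f"/users/{user['id']}", "name": user["name"]},
        "song": {
            "href": f"/songs/{song['id']}",
            "name": song["song_name"],
            "album": song["album"],
        },
    }


# In-memory stand-ins for the users, songs and streams tables.
users = {10: {"id": 10, "name": "bla", "email": "bla"}}
songs = {14: {"id": 14, "song_name": "blu", "album": "blu"}}
streams = [{"user_id": 10, "song_id": 14, "listen_count": 5}]

body = [stream_representation(s, users, songs) for s in streams]
```

The client can render the whole list from `body` in one call and only follows an `href` when it needs something that was left out, such as the email.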

Related Solutions

Nested REST URLs and Parent ID – Best Design Practices

Where we should put search action?

In GET /search/:text. This will return a JSON array containing the matches, with every match containing the album it belongs to. This makes sense, because the client may be interested not in the track itself but in the entire album (imagine that you are searching for a song which, you believe, was on the same album as the one whose name you remember).

it will be not that good to return parent ids with each so. Am I wrong?

Individual tracks can contain the album. This will ensure that the track representation is uniform if you can get a track either through an album or through search (no album here).

Which is better?

As previously stated, including the album makes sense. While the third point (with the relative URI) can be interesting in some cases (you don't have to think about how the URI should be formed), it has the drawback of not explicitly providing the album. The fourth point corrects this. If you see the benefit of having the relative URI in the response, you can combine points 3 and 4.

Or maybe I'm dumb?

Choosing good URIs is not an easy task, especially since there is no single right answer. If you develop the client at the same time as the API, it may help you to visualize better how the API could be used. This being said, other people may then prefer other usages you weren't thinking about when developing the API.

An aspect which may be problematic is how you organize data internally, i.e. the usage of a hierarchy. From your comment, you are wondering what a response to GET /artist/1/album/10/song/3/comment/23 should contain, which shows a very tree-oriented vision. This can lead to a few issues when extending the system later. For instance:

What if a song doesn't have an album?
What if an album has several artists?
What if you want to add a feature which makes it possible to comment albums?
What if there should be comments of comments?
etc.

This is essentially the problem I explained in my blog: a tree representation has too many limitations to be effectively used in many cases.

What happens if you destroy the hierarchy? Let's see.

GET /albums/:albumId returns a JSON containing the meta information about the album (such as the year when it was published or the URI of the JPEG showing the album cover) and an array of tracks. For example:
```
GET /albums/151
```
```
{
    "id": 151,
    "gid": "dbd3cec7-b927-423f-894b-742c4c7b54ce",
    "name": "Yellow Submarine",
    "year": 1969,
    "genre": "Psychedelic rock",
    "artists": ["John Lennon", "Paul McCartney", ...],
    "tracks": [
        {
            "id": 90224,
            "title": "Yellow Submarine",
            "length": "2:40"
        },
        {
            "id": 83192,
            "title": "Only a Northern Song",
            "length": "3:24"
        }
        ...
    ]
}
```
Why do I include, for instance, the length of each track? Because I imagine that a client showing an album may be interested in listing the tracks by title, but also in showing the length of each track (most clients do). On the other hand, I may not show the composer(s) or the artist(s) for every track, because I decide that this information is not necessary at this level. Obviously, your choices may be different.

GET /tracks/:trackId returns the information about a specific track. Since there is no hierarchy any longer, you don't need to guess the album or the artist: the only thing you really have to know is the identifier of the track itself.

Or maybe even not? What if you can specify it by name with GET /tracks/:trackName?

GET /tracks/Only%20a%20Northern%20Song

{
    "id": 83192,
    "gid": "8d9c4311-9d7b-40a4-8aeb-4fe96247fe2b",
    "title": "Only a Northern Song",
    "writers": ["George Harrison"],
    "artists": ["John Lennon", "Paul McCartney", "Ringo Starr"],
    "length": "3:24",
    "record-date": 1967,
    "albums": [151, 164],
    "soundtrack": {
        "uri": "http://audio.example.com/tracks/static/83192.mp3",
        "alias": "Beatles - Only a Northern Song.mp3",
        "length-bytes": 3524667,
        "allow-streaming": true,
        "allow-download": false
    }
}

Now look closer at albums; what do you see? Right, not one, but two albums. If you have a hierarchy, you can't do that (unless you duplicate the record).
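
A minimal sketch of that flat model, assuming tracks and albums live in two separate in-memory dictionaries and the track is the only place where the link is stored. The helper names are hypothetical, and the name of album 164 is a placeholder because the example above does not give one.

```python
# Albums and tracks are top-level collections; neither is nested in the other.
albums = {
    151: {"id": 151, "name": "Yellow Submarine"},
    164: {"id": 164, "name": "Album 164 (placeholder name)"},
}
tracks = {
    90224: {"id": 90224, "title": "Yellow Submarine", "albums": [151]},
    83192: {"id": 83192, "title": "Only a Northern Song", "albums": [151, 164]},
}


def album_names_for_track(track_id):
    """A track may appear on any number of albums, including none."""
    return [albums[album_id]["name"] for album_id in tracks[track_id]["albums"]]


def track_titles_for_album(album_id):
    """Derive an album's track list from the tracks, so nothing is duplicated."""
    return [t["title"] for t in tracks.values() if album_id in t["albums"]]
```

The same track record serves both albums, which is exactly what a strict /album/:id/song/:id tree cannot express without copying it.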

GET /comments/:objectGid. You may have spotted the ugly GUIDs in the responses. Those GUIDs make it possible to identify an entity across the database in order to perform tasks which can be applied to albums, artists, or tracks, such as commenting.
```
GET /comments/8d9c4311-9d7b-40a4-8aeb-4fe96247fe2b
```
```
[
    {
        "author": {
            "id": 509931,
            "display-name": "Arseni Mourzenko"
        },
        "text": "What a great song! (And I'm proud of the usefulness of my comment)",
        "concerned-object": "/tracks/83192"
    }
]
```
The comment references the concerned object, making it possible to go to it when accessing the comment outside its context (for instance when moderating the latest comments through GET /comments/latest).
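
Here is a small sketch of how such a comment store might look, assuming comments are kept in memory and keyed by the GID of whatever they are attached to. The function names, the "Example User" author and the comment texts are invented for illustration; the two GIDs are the ones from the album and track responses above.

```python
from collections import defaultdict

# gid -> list of comments; the gid may belong to an album, an artist or a track.
_comments = defaultdict(list)


def add_comment(gid, concerned_object, author_name, text):
    """Attach a comment to any entity, whatever its type."""
    _comments[gid].append({
        "author": {"display-name": author_name},
        "text": text,
        "concerned-object": concerned_object,
    })


def get_comments(gid):
    """What a handler for GET /comments/:objectGid could return."""
    return list(_comments.get(gid, []))


ALBUM_GID = "dbd3cec7-b927-423f-894b-742c4c7b54ce"
TRACK_GID = "8d9c4311-9d7b-40a4-8aeb-4fe96247fe2b"

add_comment(TRACK_GID, "/tracks/83192", "Example User", "Nice track.")
add_comment(ALBUM_GID, "/albums/151", "Example User", "Nice album.")
track_comments = get_comments(TRACK_GID)
```

The comment code never needs to know whether it is dealing with an album or a track, so adding comments on artists later requires no new endpoint.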

Note that this doesn't mean that you should avoid any form of hierarchy in your API. There are cases where it makes sense. As a rule of thumb:

If the resource makes no sense outside the context of its parent resource, use hierarchy.
If the resource can live (1) alone or (2) in a context of parent resources of different types or (3) have multiple parents, the hierarchy should not be used.

For instance, lines of a file make no sense outside the context of a file, so:

GET /file/:fileId

and:

GET /file/:fileId/line/:lineIndex

are fine.
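
The rule of thumb can be written down as a tiny predicate. This is only a restatement of the two rules above under my own naming; the parameters are hypothetical and the answers for tracks and comments reflect the examples discussed earlier.

```python
def should_nest(can_live_alone, parent_type_count, max_parents):
    """Return True when the resource only makes sense under a single parent.

    can_live_alone: the resource is meaningful without any parent.
    parent_type_count: how many different kinds of resource can be its parent.
    max_parents: how many parents a single instance can have at once.
    """
    return not can_live_alone and parent_type_count == 1 and max_parents == 1


# A line only exists inside exactly one file: nest it.
line_in_file = should_nest(can_live_alone=False, parent_type_count=1, max_parents=1)

# A track may have no album, or several: keep it flat.
track = should_nest(can_live_alone=True, parent_type_count=1, max_parents=2)

# A comment can hang off albums, artists or tracks: keep it flat.
comment = should_nest(can_live_alone=False, parent_type_count=3, max_parents=1)
```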

API Design – Specialized Endpoints vs Multiple Calls to Generic Resources

Why not both?

Which is to say, yes, there are trade-offs to consider, but if the marginal cost of implementing a second option is small, you can offer your clients the ability to select which representation they prefer, so that they can choose their own trade-offs (of course, there's some complexity penalty to be paid by offering a choice, rather than solving "the" problem for the clients).

The major con I see is that you've built an endpoint tightly coupled to this particular view within this particular application.

Not quite the right language, from a REST perspective. It's not the endpoint that is coupled to the application, but the media type of the representation.

Of course, worrying about media types tends to fall by the wayside when we are implementing both the client and the server, and their release cycles are coupled.

The pros of 2. are that we are using nothing but generic resource endpoints, which could be re-used by many different views and applications.

That thought is incomplete: you can not only reuse the endpoints, you can also reuse the representations themselves, i.e. caching. If the client can pull the data it needs out of its own cache, then it doesn't need to make a round trip at all. Failing that, an intermediate cache may already have a copy of the data, shortening the round trip. The "server" that the client is talking to might be a cache farm in front of your app, keeping the workload low while being able to scale out.

In REST, you want to make sure that your designs take advantage of the uniform interface.

So one of the things you should be thinking about is the cache lifetime of your resources; how long are representations valid? Are other views and applications going to be able to take advantage of that?
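
To make the cache-lifetime point concrete, here is a minimal client-side sketch. It assumes a fixed lifetime per representation, an injected fetch function standing in for the real HTTP call, and an injectable clock so the expiry can be shown without waiting. All names are hypothetical, and a real client would more likely honour the Cache-Control headers the server sends.

```python
import time


class TtlCache:
    """Keep representations by URI and refetch them once they are too old."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}  # uri -> (expires_at, representation)

    def get(self, uri):
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no round trip
        representation = self._fetch(uri)
        self._entries[uri] = (now + self._max_age, representation)
        return representation


calls = []  # records every simulated round trip


def fake_fetch(uri):
    calls.append(uri)
    return {"uri": uri}


now = [0.0]  # fake clock, advanced by hand
cache = TtlCache(fake_fetch, max_age_seconds=60, clock=lambda: now[0])

cache.get("/users/10")  # first request goes to the server
cache.get("/users/10")  # served from the cache
now[0] = 61.0
cache.get("/users/10")  # lifetime exceeded, fetched again
```

A generic /users/10 representation cached this way can be shared by every view that needs that user, which a response tailored to a single screen cannot offer.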

Should the fact that this API is an internal company API (and almost certain to remain so), rather than a public facing one, influence my decision?

That's likely to put limits on the volume of traffic you'll need to support. Also, if the clients are all going to be centrally located, then round trip time falls away as a concern as well.
