REST API – Responding with Nested Arrays

api json rest

I would like to know the best practice for designing a REST API for a resource that has a has_many relationship with another resource.

In my app, companies have many technologies, and technologies belong to many companies. In the frontend, I want to render companies and their technologies.

I want to implement a /companies endpoint that responds with companies and the technologies that belong to them. Then I can make a single API call to get all the data necessary to render the page:

[
  {
    company_name: '',
    technologies: [
      {
        tech_name: '',
        tech_icon_name: ''
      },
      {
        ...
      },
      ...
    ]
  },
  {
    ...
  }
]

Is this a good practice, and what are the alternatives?

Best Answer

There's absolutely nothing wrong with returning nested arrays in the JSON response from a REST API. The real question is whether you want to do that.

If you are creating the REST API for one client, there's a good chance you want to return the technologies along with the companies, because your client has probably asked you to do so. Maybe they need the data and do not want to query your REST API multiple times. In that case it's completely reasonable to return the technologies along with the companies.
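As a minimal sketch of that single-call approach, the nested payload can be assembled from the many-to-many relationship before it is serialized. The data, the join-table layout, and the function name below are hypothetical stand-ins for illustration; only the field names come from the question.

```python
import json

# Hypothetical in-memory data standing in for the database tables.
COMPANIES = {1: "Acme", 2: "Globex"}
TECHNOLOGIES = {
    10: {"tech_name": "PostgreSQL", "tech_icon_name": "postgresql"},
    11: {"tech_name": "Redis", "tech_icon_name": "redis"},
}
# Join table for the many-to-many relationship: (company_id, technology_id).
COMPANY_TECHNOLOGIES = [(1, 10), (1, 11), (2, 10)]


def list_companies_with_technologies():
    """Build the nested payload a GET /companies handler could return."""
    # Group technologies by company in one pass over the join table.
    techs_by_company = {company_id: [] for company_id in COMPANIES}
    for company_id, tech_id in COMPANY_TECHNOLOGIES:
        techs_by_company[company_id].append(TECHNOLOGIES[tech_id])
    return [
        {"company_name": name, "technologies": techs_by_company[company_id]}
        for company_id, name in COMPANIES.items()
    ]


payload = list_companies_with_technologies()
print(json.dumps(payload, indent=2))
```

The client gets everything it needs to render the page from one response, at the cost of always paying for the technology lookup.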

On the other hand, when you are creating a REST API that may be consumed by anyone, you usually want to analyze how the data you supply is actually used and design your endpoints accordingly.

After your analysis you could find out the following: your API is called by your clients 500 000 times a day. Each request takes on average 230 ms to complete, which adds up to 31.94 hours of server time per day. You have talked to your clients, and out of the half a million requests only 30 000 make use of the technologies of companies.

Because you are returning both (companies and technologies) at the same time, the other 470 000 requests run slower and put a heavier load on your server than they need to. The query returns data that those consumers do not need at all, yet the data still has to be processed and sent to them.

After thinking it through, you could make the technologies available on request instead of returning them automatically for all companies. The 30 000 consumers that actually need the technologies would then query your API a second time.
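A rough sketch of that split, written as plain functions rather than in any particular web framework. The route `/companies/{id}/technologies`, the function names, and the sample data are assumptions made for illustration, not something the answer prescribes.

```python
# Hypothetical in-memory data standing in for the database.
COMPANIES = {1: {"company_name": "Acme"}, 2: {"company_name": "Globex"}}
TECHNOLOGIES_BY_COMPANY = {
    1: [{"tech_name": "PostgreSQL", "tech_icon_name": "postgresql"}],
    2: [],
}


def get_companies():
    """GET /companies (assumed route): companies only, no technologies."""
    return [{"id": cid, **fields} for cid, fields in COMPANIES.items()]


def get_company_technologies(company_id):
    """GET /companies/{id}/technologies (assumed route): the optional second call."""
    if company_id not in COMPANIES:
        raise KeyError(f"unknown company {company_id}")
    return TECHNOLOGIES_BY_COMPANY.get(company_id, [])


# Most consumers stop after the first call; only those that need
# technologies make the second one.
companies = get_companies()
acme_technologies = get_company_technologies(companies[0]["id"])
```

The company objects carry an `id` here so that a consumer has something to pass to the second call.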

Thanks to the optimization, you can now query companies without the technologies in 195 ms on average, and the average time for requesting the technologies of a company separately is 170 ms. You still have to serve the companies endpoint half a million times, which now takes 27.08 hours of server time, and then process an additional 30 000 requests for the consumers that make use of technologies, which takes 1.42 hours of server time. That makes 28.5 hours of server time in total.
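The arithmetic behind these figures can be checked with a few lines of Python. The request counts and average latencies are the hypothetical numbers from this answer; the helper name is made up for the example.

```python
MS_PER_HOUR = 1000 * 60 * 60


def server_hours(requests, avg_ms):
    """Total server time in hours for a number of requests at an average latency."""
    return requests * avg_ms / MS_PER_HOUR


# One endpoint returning companies together with technologies.
combined = server_hours(500_000, 230)

# Split endpoints: everyone fetches companies, 30 000 also fetch technologies.
companies_only = server_hours(500_000, 195)
technologies_only = server_hours(30_000, 170)
split_total = companies_only + technologies_only

print(f"combined endpoint: {combined:.2f} h")
print(f"split endpoints:   {split_total:.2f} h")
print(f"saved:             {combined - split_total:.2f} h")
```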

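By separating the queries, your system not only handles the load better, dropping from 31.94 hours of server time to 28.5 hours, but most of your clients are also happier, because the 470 000 requests that never needed the technologies now get their information faster. The trade-off is that the 30 000 requests that do need technologies now require two calls.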