Optimization – Get All Data vs Get Partial Data Optimization

client-server http-request optimization rest

Let's say a client makes a GET call to the server to get all the followers of some user. The client then shows a list of all the followers, but the only data the list needs is:

{"username" : "user", "thumbUrl" : "http:/www.example.com/photo/1", "age" : 78}

The user can then click on one of the followers to see more data about that follower.

My question: should I fetch all the followers' data from the server in advance (the full User object), or fetch only partial data and then make another call on demand when the user clicks a follower? And more importantly, do I really need to care about such optimizations?

Assumptions:

  1. data is throttled (10 Objects per call to followers)
  2. size of each User Object is around 1kb, partial around 200 bytes
  3. User usually clicks on 5 followers for each 10 Objects.

Points of Interest:

  1. Size saved: about 10kb - 2kb - 5kb = 3kb per bucket of 10 users. Is that negligible in this age of the internet? Would it matter if the size difference was 30kb?
  2. Bucket size: I gave examples with a low bucket size, but let's say my bucket size can get up to 2Mb. Does it matter if my bucket is 2Mb with the full User data vs 400kb with the partial call? Is it slower? (Assuming the user will click enough followers to make the size difference negligible.)
  3. Will welcome any other points of interest
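For reference, the arithmetic in point 1 can be checked with a few lines of Python. This is only a sketch of the numbers listed under the assumptions; the constant and function names are made up for illustration, and 1kb is assumed to mean 1000 bytes.

```python
# Assumption: 1kb = 1000 bytes (the question does not say which convention it uses).
FULL_BYTES = 1000     # approximate size of a full User object
PARTIAL_BYTES = 200   # approximate size of a partial User object
BUCKET = 10           # objects returned per call
CLICKS = 5            # followers the user typically opens per bucket


def bytes_full_upfront():
    """Bytes transferred when every follower is sent as a full object."""
    return BUCKET * FULL_BYTES


def bytes_partial_on_demand(clicks=CLICKS):
    """Bytes for the partial list plus one full object per click."""
    return BUCKET * PARTIAL_BYTES + clicks * FULL_BYTES


saved = bytes_full_upfront() - bytes_partial_on_demand()
extra_requests = CLICKS  # each click costs one additional round trip
print(saved, extra_requests)
```

So the 3kb saved per bucket comes at the price of roughly five extra requests.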

Best Answer

It depends.

Basically, you have to look at what the expected latency of the connection is, what the bandwidth is, and what responsiveness you want.

As an example: suppose the round-trip latency from client to server is 100 msecs and the bandwidth is 8 Mb/s. If the "full" data is 2Mb and the "partial" data is 400kb, then it will take 350 msecs to send the "full" data (100 msecs of latency plus 250 msecs of transfer) and 150 msecs to send the "partial" data (100 msecs plus 50 msecs). If you send partial records, then each click requires about 110 msecs to retrieve results. Otherwise, each click is instant. So:

  • Full - First load: 350 msec, click: instant
  • Partial - First load: 150 msecs, click: 110 msecs
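The numbers above follow from a simple "latency plus size over bandwidth" model. Here is a minimal Python sketch of it, assuming sizes and bandwidth use the same unit (the numbers only work out that way) and taking the 110 msec per-click cost as given. The function names are hypothetical.

```python
def transfer_ms(size_kb, bandwidth_kb_per_s=8000, latency_ms=100):
    """Time for one request: round-trip latency plus time on the wire."""
    return latency_ms + size_kb * 1000 / bandwidth_kb_per_s


def total_ms(first_load_ms, per_click_ms, clicks):
    """Total time the user spends waiting after `clicks` follower clicks."""
    return first_load_ms + per_click_ms * clicks


full_load = transfer_ms(2000)    # "full" bucket of 2Mb
partial_load = transfer_ms(400)  # "partial" bucket of 400kb
CLICK_MS = 110                   # per-click cost, taken from the example above

for clicks in range(4):
    waited_full = total_ms(full_load, 0, clicks)
    waited_partial = total_ms(partial_load, CLICK_MS, clicks)
    print(clicks, waited_full, waited_partial)
```

Under these assumptions the partial approach has already cost more total waiting time by the second click (370 msecs vs 350 msecs).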

The key point is to understand that each call adds overhead. While it is very tempting to minimize the data transferred, this can actually make things slower, if it causes more round trips.

Of course, this is itself misleading, because network timings vary from call to call. But personally, with these numbers I'd be tempted to load everything up front.

But this is just a very high level analysis. Other things to consider:

  • This completely ignores the server-side cost. How long does it take to produce the "full" data vs. the "partial"? Can you pull the "full" data, cache it for the future, and return the partial?
  • Code that gets the data in one block is likely to be simpler on both client and server, and therefore less buggy.
  • If you send partial data, you have to worry about what happens if records change between the first pull and the second.
  • To users, a single slow call followed by instant responsiveness feels "faster" than if every single click takes noticeable time. Pay attention to how fast the system feels to users, even at the expense of concrete measures of how fast it actually is.
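The "pull the full data, cache it, return the partial" idea from the first bullet could look roughly like this on the server. It is a sketch only: `FollowerService` and the `load_full_user` callback are hypothetical names, the `bio` field is invented for the demo, and the in-memory dict has no eviction or invalidation, so the stale-record concern from the third bullet still applies.

```python
PARTIAL_FIELDS = ("username", "thumbUrl", "age")  # fields the list view needs


class FollowerService:
    def __init__(self, load_full_user):
        self._load_full_user = load_full_user  # hypothetical database accessor
        self._cache = {}                       # user_id -> full record

    def _full(self, user_id):
        # Hit the database only the first time a user is requested.
        if user_id not in self._cache:
            self._cache[user_id] = self._load_full_user(user_id)
        return self._cache[user_id]

    def list_followers(self, follower_ids):
        """Return only the partial view, but warm the cache with full records."""
        result = []
        for user_id in follower_ids:
            full = self._full(user_id)
            result.append({key: full[key] for key in PARTIAL_FIELDS})
        return result

    def follower_detail(self, user_id):
        """Serve the full record, from cache if the list call already loaded it."""
        return self._full(user_id)


calls = []  # records every simulated database hit


def fake_load_full_user(user_id):
    calls.append(user_id)
    return {
        "id": user_id,
        "username": f"user{user_id}",
        "thumbUrl": f"http://www.example.com/photo/{user_id}",
        "age": 30,
        "bio": "made-up extra field",
    }


service = FollowerService(fake_load_full_user)
listing = service.list_followers([1, 2, 3])
detail = service.follower_detail(2)
print(listing[0], len(calls))
```

The detail call still costs the client a round trip, but the server answers it without touching the database again.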

In general, you are better off minimizing the number of network calls rather than minimizing the amount of data transferred overall. But this cannot be a hard and fast rule because again, it really depends on both the expected bandwidth and the expected latency. I should note, though, that bandwidth is constantly improving while latency is unlikely to improve significantly over time.
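To see why latency ends up dominating, here is a rough sketch that reuses the example sizes (2Mb full, 400kb partial) and the 100 msec round trip, and compares the extra up-front cost of the full payload with the latency floor of each extra call. The 10x bandwidth figure and the helper name are assumptions chosen purely for illustration.

```python
FULL_KB = 2000
PARTIAL_KB = 400
LATENCY_MS = 100  # every extra call costs at least one round trip


def upfront_gap_ms(bandwidth_kb_per_s):
    """Extra transfer time paid for sending the full payload instead of the partial one."""
    return (FULL_KB - PARTIAL_KB) * 1000 / bandwidth_kb_per_s


for bandwidth in (8000, 80000):  # the example bandwidth, then a hypothetical 10x faster link
    gap = upfront_gap_ms(bandwidth)
    # Number of clicks' worth of latency that the up-front gap is equivalent to.
    print(bandwidth, gap, gap / LATENCY_MS)
```

With ten times the bandwidth, the up-front penalty for the full payload drops from 200 msecs to 20 msecs, while every on-demand click still pays the full 100 msec round trip.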