Database – Proper Response to HTTP Request When Too Much Data is Requested

ajax database http rest scalability

I'm building an API for an ad serving platform that will allow you to request tracker data for ad campaigns. Campaigns often exceed hundreds of millions of requests, which means there will be many terabytes' worth of data. We therefore need to prevent API consumers from requesting so much data at once that the request times out, but I'm not sure what the best practice for doing so is.

Options I've already identified are:

  1. Add an extra parameter to the request that indicates which section of the data is desired
  2. Truncate the data and somehow tell the client that they need to use more specific filters
  3. Respond with HTTP status code 413 (but this appears to be for large request bodies, not responses)
  4. Switch to a streaming API (like Twitter's streaming APIs)

But my question is, what is the standard practice / proper response for this kind of situation?

Note: DoS attacks aren't much of a concern, since this will not be a public API.

Best Answer

Return the harshest, unfriendliest result possible in the event of a malformed request (and a request that would return more data than your metering allows is malformed). I suggest returning a 4xx error code. Then also provide paging parameters, so that users may request pages; OData has this feature, for instance. Do not truncate the data silently, under any circumstances.
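As a rough sketch of this policy, the handler below validates paging parameters, rejects any request whose page size exceeds a hard cap with a 400 rather than silently truncating, and otherwise computes the offset for the requested page. The parameter names (`page`, `page_size`) and the cap value are illustrative assumptions, not part of any standard:

```python
# Hypothetical cap on rows per response; pick a value your backend can
# serve well within the request timeout.
MAX_PAGE_SIZE = 1000

def handle_tracker_request(params, total_rows):
    """Return (http_status, payload) for a paged tracker-data request.

    params is a dict of query-string values; total_rows is the size of
    the full result set for the requested campaign.
    """
    try:
        page = int(params.get("page", 1))
        page_size = int(params.get("page_size", MAX_PAGE_SIZE))
    except ValueError:
        return 400, {"error": "page and page_size must be integers"}

    if page < 1 or page_size < 1:
        return 400, {"error": "page and page_size must be positive"}

    if page_size > MAX_PAGE_SIZE:
        # Malformed by policy: refuse loudly instead of truncating silently.
        return 400, {"error": f"page_size may not exceed {MAX_PAGE_SIZE}"}

    offset = (page - 1) * page_size
    if total_rows > 0 and offset >= total_rows:
        return 404, {"error": "page out of range"}

    # In a real handler you would now fetch rows [offset, offset + page_size).
    return 200, {"page": page, "page_size": page_size, "offset": offset}
```

Keeping the default `page_size` at the cap means a parameter-free request still succeeds, but never returns more than the metered maximum.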

Consulting with customers is a bad idea: they will tell you to do whatever possible to minimize errors, which is a bad engineering approach. This is your decision; take it by the horns and do the right thing.

An example of a paginated API is OData:

http://www.odata.org/documentation/odata-version-2-0/uri-conventions/
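OData expresses paging with the `$top` (page size) and `$skip` (offset) query options from its URI conventions. A small helper for building such URLs might look like this; the base URL and entity set name are made up for illustration:

```python
def odata_page_url(base, page, page_size):
    """Build an OData-style paged URL using $top (page size) and $skip (offset)."""
    skip = (page - 1) * page_size
    return f"{base}?$top={page_size}&$skip={skip}"

# e.g. page 3 at 100 rows per page skips the first 200 rows:
# odata_page_url("https://example.com/Trackers", 3, 100)
#   -> "https://example.com/Trackers?$top=100&$skip=200"
```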