How to Safely Chain Multiple API Requests for a Single User

api-design web-development

I am writing a web application with Python and Flask. At a high level, the web service takes an ID, downloads a file from a third-party API based on that ID, reads and analyzes the data in the file, and finally returns the analysis to the client.

Since this process takes a fair bit of time, I'd like to notify the client when (1) the file is being downloaded, (2) the file is being analyzed, and (3) the analyzed value has been returned. Since HTTP is based on a single request-response cycle, I have decided to break each step into its own API endpoint. The client-side JavaScript will handle chaining the AJAX requests and updating the client on progress.

Here is the request-response workflow I have in mind:

  • 1a. The front-end makes a request with an ID to a "download" endpoint and then notifies the client that the file is being downloaded.
  • 1b. The server downloads the file and responds to the front-end with the downloaded filename. When this is successful…
  • 2a. The front-end makes a request with the filename from step 1 and some additional configuration values (set when the user initially clicks "submit") to an "analyze" endpoint and then notifies the client that the file is being analyzed.
  • 2b. The server analyzes the file based on the client's configuration and generates an output file with the analysis. The server then responds with the output filename.
  • 3a. The front-end then makes a request to the "output" endpoint with the output filename from step 2 and notifies the client that the output file is being downloaded locally.
  • 3b. The server simply returns the static file on request. The server then deletes the output file.
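The three-endpoint workflow above could be sketched in Flask roughly as follows. All names here (`fetch_from_third_party`, `run_analysis`, the route paths, `OUTPUT_DIR`) are placeholders, not part of the original design:

```python
import os
import uuid
from flask import Flask, jsonify, request, send_from_directory

app = Flask(__name__)
OUTPUT_DIR = "output"  # placeholder location for analysis results

@app.route("/download/<data_id>")
def download(data_id):
    # Step 1b: fetch the file from the third-party API (helper omitted)
    # and hand the client a server-generated filename.
    filename = f"{uuid.uuid4().hex}.dat"
    # fetch_from_third_party(data_id, filename)  # hypothetical helper
    return jsonify({"filename": filename})

@app.route("/analyze", methods=["POST"])
def analyze():
    # Step 2b: analyze the downloaded file using the client's config
    # (analysis itself omitted) and return the output filename.
    config = request.get_json()
    output_name = config["filename"] + ".out"
    # run_analysis(config["filename"], config)  # hypothetical helper
    return jsonify({"output": output_name})

@app.route("/output/<path:name>")
def output(name):
    # Step 3b: serve the result. send_from_directory guards against
    # path traversal; deleting the file here is racy (see below).
    return send_from_directory(OUTPUT_DIR, name)
```

Note that deleting the output file inside the `/output` handler is exactly where the client A vs. client B race described below can bite.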

Broadly, my question is: does this architecture work? More specifically:

  • Is there a better way to show progress to the client? This seems like a lot of overhead for that functionality; on the other hand, it's nice to have cleaner API endpoints.
  • What happens when client A and client B request the same data back-to-back? Client B's server-side process might respond that the output file is ready right before client A's process deletes the file. Is there a way to avoid this scenario?

I realize this is a bit messy, but I am pretty new to back-end development and am very open to any suggestions.

Thanks in advance.

Best Answer

There are several parts to your question; I will attempt to answer them in turn.

Architecture

As far as architecture goes, I would suggest either a simple endpoint that returns the current status of a process, or a WebSocket (if you only support modern browsers). The client can poll the status endpoint, or listen on the socket, for status updates; when the status changes, the client updates its display accordingly.

This way you only need one endpoint per task plus a status endpoint. The server can then perform the tasks in any order, or respond with an error if a request is made out of order.
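A minimal sketch of the status-endpoint idea, assuming an in-memory job table (a real application would use a database or cache, and a background worker to advance the status):

```python
import threading
import uuid
from flask import Flask, jsonify

app = Flask(__name__)

jobs = {}                     # job_id -> status string
jobs_lock = threading.Lock()  # in-memory table needs locking

def set_status(job_id, status):
    with jobs_lock:
        jobs[job_id] = status

@app.route("/process/<data_id>", methods=["POST"])
def start(data_id):
    # Kick off the whole pipeline and return a job ID immediately.
    job_id = uuid.uuid4().hex
    set_status(job_id, "downloading")
    # A background worker would then move the job through
    # "analyzing" and finally "done" (worker omitted here).
    return jsonify({"job_id": job_id}), 202

@app.route("/status/<job_id>")
def status(job_id):
    # The client polls this endpoint to drive its progress display.
    with jobs_lock:
        return jsonify({"status": jobs.get(job_id, "unknown")})
```

The client-side JavaScript then only needs one "start" request plus a polling loop, instead of chaining three requests.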

Client Tracking

Two things might help you solve the client A vs. client B problem.

It sounds to me like you need to track session variables. Keep each client's requests sandboxed from the others by salting the file names with a client-specific ID or session token. This will also help keep your site secure: you don't want arbitrary user content to be uploaded and shared with other clients unless you can sanitize it. For example, if client A were malicious and uploaded JavaScript, and client B were to download it into his browser, that JavaScript would run in the context of your domain and could end up doing nasty things.
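One way to do that salting, sketched with Flask's session support (the secret key and directory are placeholders): derive the on-disk name from a per-session random ID plus the requested name, so two clients can never reference each other's files.

```python
import hashlib
import os
import uuid
from flask import Flask, session

app = Flask(__name__)
app.secret_key = "replace-me"  # placeholder; use a real secret in production

def client_path(base_dir, filename):
    # Assign each session a random ID once, then salt the filename
    # with it. The hash also strips any path characters the client
    # may have put in `filename`, so it doubles as sanitization.
    sid = session.setdefault("sid", uuid.uuid4().hex)
    safe = hashlib.sha256(f"{sid}:{filename}".encode()).hexdigest()
    return os.path.join(base_dir, safe)
```

Because the stored name is a hash of the session ID and the logical name, client B cannot guess or request client A's output file even if both asked for the same ID.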

Also, it might help to not delete the file right away but to have a separate script clean up at set time intervals. See cron (or Task Scheduler on Windows).

I have used cron scripts that check a given file location for files older than a given time frame. That way, files are deleted well after they are no longer needed.
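Such a cleanup script is only a few lines; this sketch deletes files in a directory whose modification time is older than a cutoff, and would be run from cron rather than inside the request handler:

```python
import os
import time

def cleanup(directory, max_age_seconds):
    """Remove files whose last modification is older than the cutoff."""
    cutoff = time.time() - max_age_seconds
    removed = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed

if __name__ == "__main__":
    # e.g. run hourly from cron, deleting output files older than an hour
    cleanup("output", 3600)
```

Because files linger for a while after use, the "client A deletes the file out from under client B" race disappears in practice.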
