Nginx configuration for long-running tasks

django, gunicorn, nginx, python

I have a web-based application that performs linguistic analysis of user-submitted texts. This is a rather memory-intensive task that typically takes a long time (e.g., up to 3 minutes to process 30 files). I'm using Django's StreamingHttpResponse to stream results back, but I've noticed that nginx drops the user's request after about 7 files have been processed (less than 50 seconds). I tried adjusting both the nginx and Gunicorn keepalive settings, but that doesn't seem to help. Could anyone give me some pointers on this?
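For context, the timeouts most likely to cut off a slow streaming response are nginx's `proxy_read_timeout` (default 60s) and Gunicorn's worker timeout (default 30s), not the keepalive settings. A minimal sketch of raising both, assuming nginx proxies to Gunicorn on `127.0.0.1:8000` and the endpoint path `/analyze/` is illustrative:

```nginx
# Sketch: raise proxy timeouts for the slow streaming endpoint.
# The /analyze/ location and the 300s value are illustrative assumptions.
location /analyze/ {
    proxy_pass         http://127.0.0.1:8000;
    proxy_read_timeout 300s;   # nginx default is 60s
    proxy_send_timeout 300s;
    proxy_buffering    off;    # flush streamed chunks to the client immediately
}
```

On the Gunicorn side, the worker timeout would need a matching bump, e.g. `gunicorn --timeout 300 myproject.wsgi` (its default of 30s kills any request running longer than that).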

I'm also wondering what is the best approach to tackle a task that takes a long time to compute? Asynchronously?

Best Answer

I'm also wondering what is the best approach to tackle a task that takes a long time to compute? Asynchronously?

This is what worker queues are for. Consider separating file submission from processing: let the user submit the files, save them, and add a message to a worker queue so they are processed asynchronously. The user gets on with their business (perhaps watching a progress screen), but the work is no longer tied to that web session.

Meanwhile, a separate process picks new tasks off the worker queue and processes each one independently of whatever the user is doing. There are many such queueing systems, for example Amazon AWS SQS:

https://aws.amazon.com/sqs/
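The submit/consume split described above can be sketched in-process with Python's standard library; this is only an illustration of the pattern (in production you'd use a real broker such as SQS or RabbitMQ, typically via Celery), and the names `submit_files` and `analyze` are made up for the example:

```python
import queue
import threading

task_queue = queue.Queue()
results = {}

def analyze(filename):
    # Placeholder for the memory-intensive linguistic analysis.
    return f"analysis of {filename}"

def worker():
    # Separate consumer: pulls tasks independently of any web request.
    while True:
        filename = task_queue.get()
        if filename is None:       # sentinel value shuts the worker down
            break
        results[filename] = analyze(filename)
        task_queue.task_done()

def submit_files(filenames):
    # The web view would just enqueue the work and return immediately.
    for name in filenames:
        task_queue.put(name)

t = threading.Thread(target=worker, daemon=True)
t.start()
submit_files(["a.txt", "b.txt"])
task_queue.join()   # in real life the user polls a status endpoint instead
task_queue.put(None)
t.join()
```

After the queue drains, `results` holds one entry per submitted file; the key point is that `submit_files` returns as soon as the messages are enqueued, so the HTTP response never has to wait out the 3-minute analysis.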