Overly simplified: you need something that executes Python, but Python application servers aren't the best at handling every type of request.
[disclaimer: I'm a Gunicorn developer]
Less simplified: regardless of which app server you use (Gunicorn, mod_wsgi, mod_uwsgi, CherryPy), any non-trivial deployment will have something upstream that handles the requests your Django app should not be handling. Trivial examples of such requests are serving static assets (images/CSS/JS).
These form the first two tiers of the classic "three-tier architecture". That is, the webserver (Nginx in your case) will handle many requests for images and static resources, while requests that need to be dynamically generated are passed on to the application server (Gunicorn in your example). (As an aside, the third of the three tiers is the database.)
Historically, each of these tiers was hosted on separate machines (and there would most likely be multiple machines in the first two tiers, e.g., five web servers dispatching requests to two app servers, which in turn query a single database).
In the modern era we have applications of all shapes and sizes. Not every weekend project or small-business site actually needs the horsepower of multiple machines; many run quite happily on a single box. This has spawned new entries in the array of hosting solutions. Some solutions marry the app server to the web server (Apache httpd + mod_wsgi, Nginx + mod_uwsgi, etc.), and it's not at all uncommon to host the database on the same machine as one of these web/app server combinations.
Now in the case of Gunicorn, we made a specific decision (copying Ruby's Unicorn) to keep it separate from Nginx and rely on Nginx's proxying behavior. Specifically, as long as we can assume that Gunicorn will never read connections directly from the internet, we don't have to worry about slow clients. This keeps Gunicorn's processing model embarrassingly simple.
The separation also allows Gunicorn to be written in pure Python, which minimizes development cost without significantly impacting performance. It also lets users put other proxies in front (assuming they buffer correctly).
As to your second question about what actually handles the HTTP request, the simple answer is Gunicorn. The complete answer is that both Nginx and Gunicorn handle it: Nginx receives the request and, if it's a dynamic request (generally determined by URL patterns), passes it to Gunicorn, which processes it and returns a response to Nginx, which then forwards the response back to the original client.
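To make that split concrete, here is a sketch of an Nginx server block that serves static files itself and proxies everything else to Gunicorn. The domain, paths, and port are hypothetical placeholders; adjust them for your deployment.

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical domain

    # Static assets are served straight from disk; Gunicorn never sees these.
    location /static/ {
        alias /var/www/myproject/static/;  # hypothetical path
    }

    # Everything else is a dynamic request and goes to Gunicorn.
    location / {
        proxy_pass http://127.0.0.1:8000;  # Gunicorn bound to localhost
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # proxy_buffering is on by default, which is what shields
        # Gunicorn from slow clients as described above.
    }
}
```

The matching Gunicorn side would be started bound to the same local address, e.g. `gunicorn myproject.wsgi:application --bind 127.0.0.1:8000`.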
So in closing, yes. You need both Nginx and Gunicorn (or something similar) for a proper Django deployment. If you're specifically looking to host Django with Nginx, then I would investigate Gunicorn, mod_uwsgi, and maybe CherryPy as candidates for the Django side of things.
It sounds like the bottleneck is the app behind the socket rather than Nginx itself. We see this a lot with PHP when it's used over a Unix socket versus a TCP/IP connection. In our case, PHP bottlenecks much earlier than Nginx ever would.
Have you checked the sysctl.conf settings for the connection tracking limit and the socket backlog limit?
net.core.somaxconn
net.core.netdev_max_backlog
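For reference, these can be raised in /etc/sysctl.conf; the values below are purely illustrative, not tuned recommendations.

```
# /etc/sysctl.conf -- illustrative values only
net.core.somaxconn = 4096           # max queued connections per listening socket
net.core.netdev_max_backlog = 8192  # packets queued when the NIC receives faster
                                    # than the kernel can process them
```

Apply the changes with `sysctl -p`, and check whether the accept queue is actually overflowing before and after.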
Best Answer
It really depends on how many of uWSGI's features are part of your infrastructure. One of the purposes of WSGI is to allow an easy move from one adapter to another. If you use uWSGI only for the "WSGI" part, you can move to Gunicorn without problems.
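That portability comes from the WSGI contract itself: the app is just a callable that any compliant server can host. A minimal sketch (the file and callable names are hypothetical, though `application` is the conventional name both servers look for):

```python
# wsgi.py -- a minimal WSGI application; either server can run it, e.g.
#   gunicorn wsgi:application
#   uwsgi --http :8000 --wsgi-file wsgi.py
def application(environ, start_response):
    # environ is a dict describing the request; start_response begins
    # the HTTP response with a status string and a list of headers.
    body = b"Hello from WSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    # The return value is an iterable of bytes objects.
    return [body]
```

Because both servers speak to the app only through this interface, swapping uWSGI for Gunicorn doesn't require touching the application code.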
Having said that, you should take into account that uWSGI's gevent support is really powerful and highly integrated with the uWSGI API (once you load the gevent plugin, all of the blocking internals of the server are hooked with gevent primitives), so maybe you should consider it. In addition, uWSGI's offloading allows you to move requests from one instance to another without blocking the frontend worker, so your REST API can be used as a "proxy with more logic".