Apache2 Return 404 for Proxy Requests Before Reaching WSGI – Fix

apache-2.2 cpu-usage mod-wsgi

I have a Django app running under Apache2 and mod_wsgi and, unfortunately, lots of requests trying to use the server as a proxy. The server correctly responds with 404 errors, but those errors are generated by the Django (WSGI) app, which causes high CPU usage.

If I turn off the app and let Apache handle the response directly (send a 404), the CPU usage drops to almost 0 (mod_proxy is not enabled).

Is there a way to configure Apache to respond to these requests with an error directly, before they reach the WSGI app?

I have seen that maybe mod_security would be an option, but I'd like to know if I can do it without it.

EDIT. I'll explain it a bit more.

In the logs I have lots of connections trying to use the server as a web proxy (e.g. requests like GET http://zzz.zzz/ HTTP/1.1, where zzz.zzz is an external domain, not mine). These requests are passed on to mod_wsgi, which then returns a 404 (as per my Django app). If I disable the app, Apache returns the error directly, since mod_proxy is disabled. What I'd ultimately like is to prevent Apache from passing requests for invalid domains to the WSGI app: if the request is a proxy request, return the error directly and do not execute the WSGI app.

EDIT2. Here is the apache2 config, using VirtualHost files in sites-enabled (I have removed email addresses, changed IPs to X.X.X.X, and changed the server alias to sample.sample.xxx). What I'd like is for Apache to reject with an error any request that isn't for sample.sample.xxx, that is, to accept only relative requests to the server or fully qualified requests for the actual ServerAlias.

default:

<VirtualHost *:80>
        ServerAdmin alejandro.mezcua@xxxx.com
        ServerName X.X.X.X
        ServerAlias X.X.X.X

        DocumentRoot /var/www/default
        <Directory />
                Options FollowSymLinks
                AllowOverride None
        </Directory>
        <Directory /var/www/>
                Options FollowSymLinks
                AllowOverride None
                Order allow,deny
                allow from all
        </Directory>

        ErrorDocument 404 "404"
        ErrorDocument 403 "403"
        ErrorDocument 500 "500"
        ErrorLog ${APACHE_LOG_DIR}/error.log

        LogLevel warn

        CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

actual host:

<VirtualHost *:80>
 ErrorDocument 404 "404"
 ErrorDocument 403 "403"
 ErrorDocument 500 "500"

 WSGIScriptAlias / /var/www/sample.sample.xxx/django.wsgi

 ServerAdmin alejandro.mezcua@xxxx.xxx
 ServerAlias sample.sample.xxx
 ServerName sample.sample.xxx

 CustomLog /var/www/sample.sample.xxx/log/sample.sample.xxx-access.log combined

 Alias /robots.txt /var/www/sample.sample.xxx/static/robots.txt
 Alias /favicon.ico /var/www/sample.sample.xxx/static/favicon.ico

 AliasMatch ^/([^/]*\.css) /var/www/sample.sample.xxx/static/$1

 Alias /static/ /var/www/sample.sample.xxx/static/
 Alias /media/ /var/www/sample.sample.xxx/media/

 <Directory /var/www/sample.sample.xxx/static/>
  Order deny,allow
  Allow from all
 </Directory>

 <Directory /var/www/sample.sample.xxx/media/>
  Order deny,allow
  Allow from all
 </Directory>
</VirtualHost>

EDIT 3. Fixed. The problem was the order in which the virtual host files were loaded. The attack requests didn't really have the Host header set to my domain; the Apache status page only showed it that way because Apache was loading the default virtual host file after the WSGI app's virtual host file, so the app's vhost was acting as the default. The solution was to rename the default virtual host file to 00-default so that Apache loads it first. After that, all the tips you guys have mentioned worked to reject those requests. CPU is back under control!
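
To illustrate the ordering issue, here is a rough sketch of how the files end up being loaded. It assumes a Debian-style layout where apache2.conf includes the sites-enabled directory and the included files are read in alphabetical order; the file name app-site is hypothetical and stands in for the WSGI app's vhost file.

# apache2.conf (Debian-style layout, assumed)
Include sites-enabled/

# Before the rename (alphabetical order):
#   sites-enabled/app-site     <- hypothetical name; first *:80 vhost, so it acts as the default
#   sites-enabled/default
#
# After the rename:
#   sites-enabled/00-default   <- now the first *:80 vhost, so it catches unmatched hosts
#   sites-enabled/app-site     <- only used when ServerName/ServerAlias matches

Running apache2ctl -S prints the parsed virtual host list and marks which one is the default server, which is a quick way to confirm the order.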

Best Answer

The simplest course of action I can recommend is to keep mod_proxy disabled and use two different <VirtualHost *:80>...</VirtualHost> sections.

In the first one you can put any ServerName you like. Because it comes first, Apache uses it for every request whose Host: header does not match a ServerName or ServerAlias in another VirtualHost section (which is what genuine proxy requests should look like), and for requests with no Host: header at all. It might look like this:

<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    ServerName default
    RedirectMatch gone .*
</VirtualHost>

The second one will be your actual server configuration, more or less exactly like what you have posted, with the correct ServerName and ServerAlias directives.
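
Put together, the two sections might be laid out roughly as follows. This is only a sketch: it assumes name-based virtual hosting is already enabled for *:80 (on Apache 2.2, a NameVirtualHost *:80 line somewhere in the config), and it uses RedirectMatch 404 instead of gone because the question asks for a 404 rather than a 410. The second section is abbreviated; the remaining directives are the ones from the question.

# First *:80 vhost: catch-all for any host name not claimed below
<VirtualHost *:80>
    ServerName default
    # Answer every path with 404 without touching mod_wsgi
    RedirectMatch 404 .*
</VirtualHost>

# Second *:80 vhost: the real site, matched only by its own name
<VirtualHost *:80>
    ServerName sample.sample.xxx
    WSGIScriptAlias / /var/www/sample.sample.xxx/django.wsgi
    # Alias, Directory and CustomLog directives as in the question
</VirtualHost>

Because WSGIScriptAlias appears only in the second section, requests that land in the catch-all never start the Django app.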

EDIT: if, as you commented, the Host: header contains your domain, then you may simply want to add

RedirectMatch gone ^http:.*

to your existing vhost. That will do the trick.
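
If that pattern does not catch the requests in your setup, a different technique is to test the raw request line with mod_rewrite. The sketch below is an alternative, not the directive above, and it assumes mod_rewrite is enabled and that sample.sample.xxx is the only host name you want to accept in fully qualified requests.

# Inside the existing sample.sample.xxx vhost
RewriteEngine On
# THE_REQUEST holds the full request line, e.g. "GET http://zzz.zzz/ HTTP/1.1".
# Match absolute-URI requests whose host is not sample.sample.xxx.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s+https?://(?!sample\.sample\.xxx[:/\s]) [NC]
# Answer 410 Gone and stop processing
RewriteRule ^ - [G]

Relative requests such as GET /index.html HTTP/1.1 do not match the condition and continue to the WSGI app as usual.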
