High-traffic Drupal site apache errors

apache-2.2 drupal

I'm getting a bunch of apache errors that I'm having problems tracing down. They're on a RHEL system that runs a very high-volume Drupal website.

[Mon Sep 14 12:48:44 2009] [info] [client xx.xx.xxx.xx] (70007)The timeout specified has expired: core_output_filter: writing data to the network
[Mon Sep 14 12:50:19 2009] [info] [client xx.xxx.xx.xx] (104)Connection reset by peer: core_output_filter: writing data to the network
[Mon Sep 14 12:51:28 2009] [info] [client xx.xxx.xx.xx] (32)Broken pipe: core_output_filter: writing data to the network

Occasionally (every 24 to 36 hours) there will be a load spike and the site will become completely unresponsive. Load average climbs from a normal 1-1.5 to 200. Most of the running httpd processes show state 'D' (uninterruptible sleep, typically blocked on I/O), and the only way to get the server back down to "interactive" is to three-finger-salute it or wait until you get a prompt and killall -9 httpd.
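For reference, a minimal sketch of how the number of httpd processes stuck in state 'D' could be tracked during a spike. The helper name count_d_state is made up for illustration, and the ps invocation assumes the procps ps that ships with RHEL.

```shell
#!/bin/sh
# count_d_state COMMAND
# Reads "STATE COMMAND" lines on stdin (the shape `ps -eo state=,comm=` prints)
# and prints how many lines for COMMAND are in state D (uninterruptible sleep).
# The function name is hypothetical, not part of any tool.
count_d_state() {
    awk -v cmd="$1" '$1 ~ /^D/ && $2 == cmd { n++ } END { print n + 0 }'
}

# Usage on a live system (the trailing "=" suppresses the header line):
#   ps -eo state=,comm= | count_d_state httpd
```

Logging that number once a minute from cron would show whether the pile-up builds gradually or appears all at once.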

Obviously, the site can't be taken down for me to do a bunch of strace work. I've checked the apache configuration and, as far as I can tell, EnableMMAP and EnableSendFile are both disabled. The files are on an NFS v3 mount, but neither the NFS server, nor the mysql server, nor anything else is reporting errors. There's nothing relevant in the system log or dmesg. The site's traffic is also too high to match individual requests to the errors they produce.

At this point, I'm thinking network hardware error and I'd prefer to bring the site up on a second machine. Anyone have any thoughts before I do this?

Best Answer

This is a wild ass guess but have you checked how many on-disk temporary tables Drupal is creating?

I have seen this cause iowait (load) problems.

mysqladmin -u root -p ext -ri 30 | grep Created_tmp_disk

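Here ext is short for extended-status, -r reports the difference from the previous sample, and -i 30 repeats the command every 30 seconds. The first run will tell you how many on-disk temporary tables have been created since MySQL was last restarted. After that it will tell you how many were created in each 30-second window (until you Control-C out of it).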
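To put the raw counter in context, here is a rough sketch that reports what share of all temporary tables went to disk. It assumes the usual pipe-delimited table that mysqladmin prints; the helper name tmp_disk_ratio is made up for illustration.

```shell
#!/bin/sh
# tmp_disk_ratio
# Reads `mysqladmin extended-status` output on stdin and prints
# Created_tmp_disk_tables as a percentage of Created_tmp_tables.
# The function name is hypothetical, not part of MySQL.
tmp_disk_ratio() {
    awk -F'|' '
        { name = $2; value = $3; gsub(/ /, "", name); gsub(/ /, "", value) }
        name == "Created_tmp_disk_tables" { disk = value + 0 }
        name == "Created_tmp_tables"      { total = value + 0 }
        END {
            if (total > 0) printf "%.1f\n", 100 * disk / total
            else print "n/a"
        }'
}

# Usage (totals since the last restart, so no -r here):
#   mysqladmin -u root -p ext | tmp_disk_ratio
```

A high percentage is only a hint, not proof, that disk-backed temporary tables are contributing to the iowait.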

The (band-aid) solution is to put MySQL's tmpdir on a RAM based file system (e.g. tmpfs).
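A minimal sketch of what that could look like. The helper only prints the two config fragments and changes nothing on the system; the mount point and size in the usage line are placeholders, and the function name is made up for illustration.

```shell
#!/bin/sh
# mysql_tmpfs_config MOUNT_POINT SIZE
# Prints (does not apply) an /etc/fstab entry for a tmpfs mount and the
# matching my.cnf setting. Mount point and size are supplied by the caller;
# the values shown in the usage line below are assumptions.
mysql_tmpfs_config() {
    dir="$1"
    size="$2"
    echo "# /etc/fstab"
    echo "tmpfs $dir tmpfs rw,size=$size,mode=1777 0 0"
    echo "# my.cnf, [mysqld] section"
    echo "tmpdir = $dir"
}

# Usage:
#   mysql_tmpfs_config /var/lib/mysqltmp 1G
```

The directory has to exist and be mounted before MySQL is restarted with the new tmpdir, and whatever lands there counts against RAM, so the size needs some headroom.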

I guess what I'm suggesting is that this starts the cascade - and the messages you're seeing are just abandoned connections.

Cheers

Related Solutions

Linux – apache configuration for drupal multisite

The config you have posted is vastly more complicated than it needs to be. I'm not quite sure why it's set up that way, but I did notice a few things that might make a difference.

First and foremost, in drupal6.conf all you should need is the following:

<Directory /home/d/r/drupal/web/public_html/>
    Options +FollowSymLinks
    AllowOverride All
    order allow,deny
    allow from all
</Directory>

The rest of the stuff is secondary and may be complicating (or causing) the problem. I'd recommend - especially for the initial install - to simplify the configs as much as possible. I run a lot of Drupal sites and I never mess with anything beyond pointing the mysite.conf to the Drupal directory and creating the (Drupal root)/sites/my.site.com directory and settings files.

I'd recommend ripping all the other stuff out of the drupal6.conf and seeing if it works for the install. Then add back in the access blocks on the /admin, install.php, etc. I don't recommend messing with the .htaccess files in the config file. Leave out the line including the .htaccess in the apache configs and just let .htaccess be picked up as it's designed to by apache.

Linux – Apache server keeps crashing regularly

I'm not saying this is what's happening but based on my own experience as a CentOS admin, it's most likely runaway apache/php processes taking down the server. I've seen this numerous times on CentOS 5. It's frustrating because there's usually not a trace of what happened in the log files. The machine just grinds to a halt due to physical memory and swap being sucked up by apache/php processes. You would think linux memory management or some daemon would jump in and say "hey stop" but it doesn't. It'll let apache grind your system to a halt.

Having said that, to see what's happening you'll need something that can monitor and log resource usage. I like to use a program called atop. Atop is a lot like the top program but it also takes a snapshot of resource usage at defined intervals. It's pretty simple to install.

wget http://www.atcomputing.nl/Tools/atop/packages/atop-1.23.tar.gz 
tar -zxvf atop-1.23.tar.gz
cd atop-1.23 && make install

Open /etc/atop/atop.daily with a text editor and change INTERVAL=600 to INTERVAL=60

Run the command /etc/atop/atop.daily from a command prompt to start it. Wait a few minutes and run atop -r /var/log/atop/atop_20091118 with the correct date of course.

Hit the t key to go forward in time and T to go back. Next time your server crashes do this and check the MEM free and SWP free lines. If you have memory problems these will be in red. Also look for numerous httpd lines under CMD. If apache/php is your problem there'll be a bunch of them.

If this is the case, I recommend looking at your MaxClients setting in httpd.conf. If set too high, apache will gladly eat all of your memory, causing your machine to crash. Apache/php can easily eat 40-50 MB per process. If you multiply 40 MB x MaxClients you'll get a rough idea of how much memory apache can potentially use. MaxClients usually defaults to 150 on CentOS, so apache can potentially use 6 GB of memory by default. This doesn't include the memory your system needs for itself and other processes to run. Try setting it to a more realistic value based on the amount of memory you have, like 40 if you have 2 GB of memory, and see if that helps. Also, if you have KeepAlive On, set KeepAliveTimeout to a low number like 2 or 3.
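The arithmetic above can be sketched as a small shell helper. The 512 MB held back for the OS and other daemons is an assumed figure, and the function name max_clients is made up for illustration.

```shell
#!/bin/sh
# max_clients TOTAL_MB PER_PROCESS_MB RESERVED_MB
# Rough upper bound for MaxClients: memory left after the reserve,
# divided by the per-process footprint (integer division).
# The function name is hypothetical.
max_clients() {
    echo $(( ($1 - $3) / $2 ))
}

# Worst case with the default of 150 clients at 40 MB each, in MB:
echo $(( 150 * 40 ))        # prints 6000

# 2 GB box, 40 MB per process, 512 MB reserved (assumed):
max_clients 2048 40 512     # prints 38

# Measuring the real per-process size instead of guessing 40 MB
# (procps ps; RSS is reported in KB):
#   ps -C httpd -o rss= | awk '{ s += $1 } END { if (NR) print s / NR / 1024 }'
```

RSS counts shared pages in every process, so the measured average overstates the true cost somewhat, which errs on the safe side.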

In my opinion CentOS's apache/php compilation is a real pos that should never have seen the light of day. It's buggy and crash prone. If you run a serious site, I highly recommend compiling your own version of apache/php or even using one of the newer high-performance webservers like lighttpd or nginx with FastCGI PHP.
