VPS – How to reduce the memory usage of MediaWiki

dreamhost, mediawiki, memory usage, vps

I have a VPS at Dreamhost running Apache that keeps running out of memory. Dreamhost says that the problem is the WordPress configuration, but I think that's their stock answer. From what I can see, most of the load is coming from MediaWiki. I sometimes have 20-30 http processes running, all of them running PHP, and MediaWiki powers my top two sites.

So I am looking for suggestions on ways to reduce the memory footprint of mediawiki. I'm currently running version 1.16.4, which I see is significantly behind the current version. (Dreamhost is supposed to upgrade it for me, but apparently they haven't.)

  • Does version 17.2 have lower footprint than 16.2?
  • Is there a clever way that I could use caching to reduce the amount of memory?
  • Are there configuration options that will reduce the memory?
  • Why do I have both apache httpd running and php5.cgi running?
  • Is there an easy way to find out which mediawiki parts are using the most ram?
  • Is there a way to reduce the number of files that are fetched? My web logs are filled with fetches to user.gif, bullet.gif, external.png, document.png — how come mediawiki's themes don't use sprites?

Thanks!

Best Answer

My first suggestion would be to ensure you are fixing the right problem.

  • Track your memory usage over a reasonable time frame and see how high it goes (and if you can correlate this to something such as increased traffic).
    • If you already have some monitoring in place (e.g. Munin) you should be able to see memory trends
    • Otherwise, use sar (e.g. if you already have it set up, sar -r -f /var/log/sa/sa17 will give you the memory information recorded on the 17th of the month).
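
If neither Munin nor sar is available, a small script run from cron can record the same trend. The following is only a sketch, assuming a Linux host that exposes /proc/meminfo; the function names and the fallback for kernels without MemAvailable are my own choices, not something Dreamhost or sar provides.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def used_percent(info):
    """Percentage of RAM in use, not counting buffers and page cache."""
    total = info["MemTotal"]
    # Assumption: older kernels lack MemAvailable, so approximate it
    # as free + buffers + cached.
    available = info.get(
        "MemAvailable",
        info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0),
    )
    return 100.0 * (total - available) / total


def log_snapshot(path="/proc/meminfo"):
    """Print one timestamped line; append the output to a file from cron."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    print(time.strftime("%Y-%m-%d %H:%M:%S"), f"{used_percent(info):.1f}% used")
```

Calling log_snapshot() every few minutes and comparing the output with your access logs should show whether memory climbs with traffic or grows regardless of it.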

Determine what process(es) are actually using your memory.

  • Your problem might not be directly related to MediaWiki. While PHP might be consuming a lot of memory, MySQL and especially Apache are good candidates for significant memory usage.
    • Use top (or htop) or ps aux --sort -rss to see which processes are consuming the most memory.
    • If your problem is PHP, you may have some success reducing the memory_limit in php.ini
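
To see the totals per program rather than per process, the ps output can be grouped by command name. This is a rough sketch, assuming a procps-style ps that accepts -eo rss=,comm= (resident size in kB and command name, no header); the function names are hypothetical. Note that summing RSS counts shared pages once per process, so the totals overstate real usage somewhat.

```python
import subprocess
from collections import defaultdict


def rss_by_command(ps_output):
    """Group `ps -eo rss=,comm=` output by command name.

    Returns (command, process_count, total_rss_kb) tuples, largest first.
    """
    totals = defaultdict(lambda: [0, 0])  # command -> [count, total RSS in kB]
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        rss_kb, command = int(parts[0]), parts[1].strip()
        totals[command][0] += 1
        totals[command][1] += rss_kb
    rows = [(cmd, count, rss) for cmd, (count, rss) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def report():
    """Run ps and print one line per command, heaviest first."""
    result = subprocess.run(
        ["ps", "-eo", "rss=,comm="],
        capture_output=True, text=True, check=True,
    )
    for command, count, rss_kb in rss_by_command(result.stdout):
        print(f"{command:<20} {count:>4} procs {rss_kb / 1024:>8.1f} MB")
```

If php5.cgi dominates the list, look at PHP first; if httpd or mysqld does, tune those before touching MediaWiki.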

Reduce Apache's memory usage

  • 20 to 30 Apache processes will consume a lot of memory (likely over 500MB).
  • If you can, switch from Apache to a lightweight web server such as nginx or lighttpd. These should work with most CMSes, although some configurations (e.g. using .htaccess files) are not supported.
  • Eliminate the Apache modules that you don't need. Each Apache process that handles requests carries an almost complete copy of the server, including all loaded modules, in memory.
  • Reduce the number of server processes that Apache spawns. Apache processes usually start around 10MB each and with use can increase to 30+MB each.
    • If a moment of downtime is acceptable, consider the following approach (otherwise just estimate and do the math):
      • After a few hours of use, look at the average memory used by your Apache processes
      • Stop Apache and note your used memory - this should tell you how much your operating system, and all running services (MySQL, etc) need. Restart Apache
      • Take the difference between your total memory and that used by your base system, subtract a bit (10% at least) for safety, and divide by your average apache process size.
  • Set low values for StartServers, MinSpareServers, MaxSpareServers, and MaxClients. Keep MaxClients lower than the number you calculated above, and the other values lower still.
  • Set MaxRequestsPerChild to a non-zero value (100-300 should be good)
  • With fewer server processes, you don't want any of them tied up for too long, so ensure your KeepAliveTimeout is low (10s should be adequate, possibly lower, and no higher than 15s; the right value depends on how your site is used).
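
The MaxClients calculation described above can be written out as a short sketch. The function name is hypothetical and the numbers in the usage note are made-up examples, not measurements from your server.

```python
import math


def estimate_max_clients(total_mb, base_mb, avg_process_mb, safety_margin=0.10):
    """Estimate an upper bound for Apache's MaxClients.

    total_mb       -- total RAM available to the VPS
    base_mb        -- memory used with Apache stopped (OS, MySQL, etc.)
    avg_process_mb -- average size of an Apache process after some use
    safety_margin  -- fraction of the remaining memory to hold back
    """
    if avg_process_mb <= 0:
        raise ValueError("avg_process_mb must be positive")
    usable_mb = (total_mb - base_mb) * (1.0 - safety_margin)
    # Round down, but never suggest fewer than one worker.
    return max(1, math.floor(usable_mb / avg_process_mb))
```

For example, with an assumed 1024MB VPS, 300MB used by the base system and 30MB per Apache process, estimate_max_clients(1024, 300, 30) gives 21, so MaxClients should stay at or below that and the other values lower still.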

There are additional suggestions in this very good guide to optimizing Apache for low memory.

Does version 17.2 have lower footprint than 16.2?

  • There are actually only 6 versions (16.2-16.5, 17.0-17.2) between these. Moreover, the minor versions are usually security updates, so I wouldn't expect major changes, except perhaps for version 17.0 (and a quick look at the changelog doesn't suggest any major changes to memory management). If you really think the version is the problem, launch a virtual machine (e.g. with VirtualBox), install the two versions, run a load test (ab, siege, httperf, etc) against each, and monitor and compare their memory usage.

Is there a clever way that I could use caching to reduce the amount of memory?

  • This depends on what the source of your problem is:
    • If it is PHP, generate static copies of your pages when they change, and serve those.
    • If your problem is with Apache, though, serving static assets will still require a lot of memory (although caching is always a good idea).
      • You could use a CDN to reduce requests to static assets - which should help with memory usage on Apache.

There are some less than ideal options you may consider:

  • Use a lightweight server as a reverse proxy. It will help with the static requests, and if those make up a large portion of your requests, it should help with memory usage (after Apache is properly tuned). However, running an additional server uses some additional memory (and adds complexity to the system).
  • Use a caching layer such as Varnish. It is usually intended to run from memory, serving pages faster at the cost of using more memory, but you can set it up to use a file as the cache. Much like a reverse proxy, this will reduce the load on the backend but will itself require some memory. If you are up for experimenting, you can see if the gains offset the cost.
  • Verify that your opcode cache (e.g. APC) is working, and possibly use a file-backed store instead of memory for storing the cache.

Why do I have both apache httpd running and php5.cgi running?

  • Likely because you are using FastCGI. The requests for PHP files are not executed inside Apache (as would be the case with mod_php) but by separate PHP processes running PHP's CGI interface. You may find that PHP-FPM, a FastCGI process manager for PHP, offers better resource management (it can be used with mod_fastcgi).

Is there an easy way to find out which mediawiki parts are using the most ram?

  • I'd suggest that the best way to accomplish this is to disable whatever you can (extensions/plugins, etc) and run a load test. You may have a bit of success with some profilers (e.g. Xdebug), but I don't expect the results to be easy to act on (they usually report time spent rather than memory used). If your requests are taking a long time to execute, some process managers (e.g. PHP-FPM) offer a 'slowlog' functionality.

Is there a way to reduce the number of files that are fetched? My web logs are filled with fetches to user.gif, bullet.gif, external.png, document.png --- how come mediawiki's themes don't use sprites?

  • You could look into Google's mod_pagespeed. It will help you out with minification, optimizing images, etc, although it does take some effort to set up properly. Beyond that, you can modify themes to your liking or use another theme. Ensure that images, etc are cached by your users' browsers. Possibly reduce logging for certain types of assets (e.g. static objects).
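
Before changing themes or logging, it may help to measure which static files actually dominate the log. Here is a sketch, assuming the access log uses Apache's common or combined format; the regular expression, the extension list, and the function name are my own assumptions and may need adjusting for your log format.

```python
import re
from collections import Counter

# Matches the request part of a log line, e.g. "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')
# Assumed list of extensions that count as static assets
STATIC_EXTENSIONS = (".gif", ".png", ".jpg", ".jpeg", ".css", ".js", ".ico")


def count_static_requests(log_lines):
    """Count requests per static asset path, ignoring query strings."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path = match.group(1).split("?", 1)[0]
        if path.lower().endswith(STATIC_EXTENSIONS):
            counts[path] += 1
    return counts
```

Passing an open log file to count_static_requests() and calling most_common(10) on the result lists the ten most requested assets, which are the ones most worth caching in the browser or combining.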