Available Filters for Apache Server Status Page

apache-2.2

I enabled the Apache status page via the mod_status module. The process list is very long, and most of its entries are OPTIONS * HTTP/1.0, which I want to filter out.

Is there any tweak, option or flag to hide those OPTIONS processes?

Best Answer

Apart from recompiling mod_status to suit your needs (which may sound like overkill, but is still feasible), mod_status provides an option specifically designed for machine-readable processing. According to the official documentation:

A machine-readable version of the status file is available by accessing the page http://your.server.name/server-status?auto. This is useful when automatically run [....]

So capturing the output of mod_status is as simple as calling wget, curl or any other HTTP client library that can be launched from, or included in, your application.
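As a minimal sketch of that idea, assuming the "?auto" report consists of plain "Key: Value" lines (such as "BusyWorkers: 1" or "Scoreboard: _W__...."), the following Perl fragment fetches the page and turns it into a hash. The helper name parse_auto_status and the script name used below are made up for illustration, and HTTP::Tiny is assumed to be available (it ships with Perl 5.14 and later).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: turn the "Key: Value" lines of the ?auto report
# into a hash reference. Lines without a colon are ignored.
sub parse_auto_status {
    my ($text) = @_;
    my %status;
    foreach my $line (split /\r?\n/, $text) {
        # Split on the first colon only, in case a value contains one
        next unless $line =~ /^([^:]+):\s*(.*)$/;
        $status{$1} = $2;
    }
    return \%status;
}

# Fetch and print only when a URL is given on the command line
if (@ARGV) {
    my $url      = shift @ARGV;
    my $response = HTTP::Tiny->new( timeout => 10 )->get($url);
    die "Cannot fetch $url: $response->{status} $response->{reason}\n"
        unless $response->{success};

    my $status = parse_auto_status( $response->{content} );
    print "$_ = $status->{$_}\n" foreach sort keys %$status;
}
```

It could be run as: perl auto_status.pl 'http://localhost/server-status?auto'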

Unfortunately, I just discovered that with the "?auto" format, most of the additional information provided by the ExtendedStatus directive is not displayed. This means that with the "?auto" option you don't have access to the process list.

As that sounded a little strange, I checked the source code of the mod_status module. Besides revealing an additional, undocumented "?notable" option, the source code in "apache2-2.2.22/modules/generators/mod_status.c" (on my Ubuntu 12.04 LTS notebook) includes:

 * /server-status - Returns page using tables
 * /server-status?notable - Returns page for browsers without table support
 * /server-status?refresh - Returns page with 1 second refresh
 * /server-status?refresh=6 - Returns page with refresh every 6 seconds
 * /server-status?auto - Returns page with data for automatic parsing

(BTW: I found it both interesting and curious to read "?notable - Returns page for browsers without table support", as I'm old enough to remember the early days of the web, when table support was a new browser feature!)

I've also checked that the missing process list in the "?auto" format is by design:

#define STAT_OPT_AUTO     2
[...]
static const struct stat_opt status_options[] =
{
    {STAT_OPT_REFRESH, "refresh", "Refresh"},
    {STAT_OPT_NOTABLE, "notable", NULL},
    {STAT_OPT_AUTO, "auto", NULL},
    {STAT_OPT_END, NULL, NULL}
};
[...]
if (r->args) {
[...]
     case STAT_OPT_AUTO:
        ap_set_content_type(r, "text/plain; charset=ISO-8859-1");
        short_report = 1;
        break;
[...] 
if (short_report)
    ap_rputs("\n", r);
else {
    ap_rputs("</pre>\n", r);
    ap_rputs("<p>Scoreboard Key:<br />\n", r);
    [...lots of other things, including "processlist"...]
}
[...]

As you can see, what you need is in the "else" part of the last "if". Hence it's not included in the "?auto" format, since in that case we fall into the "short_report" branch.

So, after all of the above and getting back to your question: "Is there any tweak, option or flag to hide those OPTIONS processes?", my answer is that your only option is to write a small application that:

  1. acts as an HTTP client against the standard /server-status URL;
  2. parses the result to extract data from the process list HTML table;
  3. skips the table rows related to OPTIONS requests;
  4. does whatever you need with the other rows.

As I'm comfortable with Perl and have had some luck with the HTML::TableExtract module, a good basis you could start from is the following:

#!/usr/bin/perl

use strict;

use HTML::TableExtract;

# PATH to "curl" utility
my $CURL = "/usr/bin/curl";

# URL of the server-status we want to process
my $STATUS_URL = "http://localhost/server-status";

# those are the headers in the first row of the table we want to extract
# Used by HTML::TableExtract to search for our table, within the whole HTML output
my $headers =['Srv','PID','Acc','M','CPU','SS','Req','Conn','Child','Slot','Client','VHost','Request'];


# Let's fetch the status page...
my $output = `$CURL -s $STATUS_URL`;

# Let's search for our table within the HTML...
my $tables = HTML::TableExtract->new( headers => $headers );

# We found it (hopefully), so let's parse it...
$tables->parse($output);

# ...and let's stick to the first one
my $status_table = $tables->first_table_found;

# Now let's loop over all the rows...
foreach my $row_ref ($status_table->rows) {
      # Let's de-reference the ARRAY reference, to better manage
      # the various elements...
      my @row = @$row_ref;

      # Let's check for an OPTIONS row...
      if ($row[12]=~/OPTIONS/) {
         # simply skip to next row in the loop
         next;
      }

      # Let's choose whatever columns we want (first column has index "0")
      # So here we have Srv, PID, Client and Request
      foreach my $column (0,1,10,12) {
        print $row[$column]."|";
      }
      print "\n";
}

In my case, the above script produces the following output:

verzulli@tablet-damiano:~$ perl elab.pl 
0-1|9183|127.0.0.1|GET /server-status HTTP/1.1|
1-1|9184|127.0.0.1|GET /server-status HTTP/1.1|
2-1|9185|127.0.0.1|GET /server-status HTTP/1.1|
3-1|9186|127.0.0.1|GET /server-status HTTP/1.1|
4-1|9187|127.0.0.1|GET /server-status HTTP/1.1|
5-1|9188|127.0.0.1|GET /server-status HTTP/1.1|

As you can see, the OPTIONS rows are skipped.

Please note that the above application lacks basic error handling, so... don't blame me if something goes wrong :-)
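For what it's worth, the missing checks could be sketched roughly as follows. This is only an illustration under a few assumptions: the helper name non_options_rows is made up, curl is assumed to live in /usr/bin/curl as in the script above, the status URL is passed as the first command-line argument, and the OPTIONS match is anchored to the start of the Request column (slightly stricter than the plain /OPTIONS/ match used above).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Same column headers as in the script above
my @HEADERS = qw(Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request);

# Hypothetical helper: return the process list rows that are not OPTIONS
# requests. Dies with a readable message instead of failing later on an
# undefined table object.
sub non_options_rows {
    my ($html) = @_;
    die "Empty response from server-status\n"
        unless defined $html && length $html;

    my $te = HTML::TableExtract->new( headers => \@HEADERS );
    $te->parse($html);

    my $table = $te->first_table_found
        or die "Process list not found (is ExtendedStatus On?)\n";

    # Request is the last of the 13 columns (index 12); a cell may be empty
    return grep { ( $_->[12] // '' ) !~ /^\s*OPTIONS\b/ } $table->rows;
}

# Fetch and print only when a URL is given on the command line
if (@ARGV) {
    my $url = shift @ARGV;

    # List form of open: the URL is never interpreted by a shell.
    # "-f" makes curl exit with an error on HTTP failures (e.g. 403).
    open( my $curl, '-|', '/usr/bin/curl', '-s', '-f', $url )
        or die "Cannot run curl: $!\n";
    my $html = do { local $/; <$curl> };
    close($curl)
        or die "curl failed (exit status " . ( $? >> 8 ) . ")\n";

    # Print Srv, PID, Client and Request, as in the script above
    foreach my $row ( non_options_rows($html) ) {
        print join( '|', map { $_ // '' } @{$row}[ 0, 1, 10, 12 ] ), "\n";
    }
}
```

It could be run as: perl elab_checked.pl http://localhost/server-status (the script name is, again, just an example).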