Logstash filtering syslog by host group

elasticsearch, kibana, logstash

I've got an Elasticsearch/Logstash/Kibana instance running, which I'm merrily stuffing with syslogs from a variety of hosts.

Having built it to scale – with multiple logstash syslogd listeners, and multiple ES nodes – it's doing quite nicely for collating logging across a large portfolio of servers.

There's just one problem I'm having at the moment: grouping hosts. I can get datasets for host groupings based on a variety of criteria from my config database: physical location, 'service', 'customer' etc.

I'd really like to add these as filter criteria in my Elasticsearch data, ideally in a way that lets me use them in Kibana without much modification.

Currently I'm thinking in terms of either:

  • A custom Logstash filter that looks up the hostname in a data dump and adds tags (service/customer/location is all I really need).
  • A parent/child relationship for a 'host' document.
  • Using 'percolator' to cross-reference (somehow?).
  • A 'script' field?
  • Some sort of dirty hack involving a cron job to update records with metadata post-ingest.

But I'm wondering if anyone's already tackled this, and is able to suggest a sensible approach?

Best Answer

Having done a bit of digging, the solution I finally decided upon was the Logstash 'translate' filter plugin ('filter-translate').

This takes a YAML file of key-value pairs and lets you rewrite your incoming log entry based on it.

So:

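translate {
    # Event field holding the sending host's name
    field => "logsource"
    # New field that receives the group looked up for that host
    destination => "host_group"
    # YAML dictionary mapping hostname to group
    dictionary_path => [ "/logstash/host_groups.dict" ]
}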

The dictionary file is a rather simple list of hostname-to-group pairs:

hostname : group
hostname2 : group
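
To make the lookup concrete, here is a rough Python sketch of what the filter does with each event. It is an illustration of the idea, not the plugin's actual code: the function name `translate` and the event-as-dict shape are assumptions, and it only models an exact-match lookup that leaves the event alone when the host is unknown.

```python
def translate(event, dictionary, field="logsource", destination="host_group"):
    """Return a copy of `event` enriched with the group for its host.

    Illustrative only: models an exact-match dictionary lookup, and does
    nothing when the host is unknown or the destination is already set.
    """
    enriched = dict(event)
    key = event.get(field)
    if key in dictionary and destination not in enriched:
        enriched[destination] = dictionary[key]
    return enriched


# Same content as the dictionary file above, loaded into memory.
host_groups = {"hostname": "group", "hostname2": "group"}

event = {"logsource": "hostname", "message": "example syslog line"}
enriched = translate(event, host_groups)
```

Here `enriched` carries an extra `host_group` field, which is what ends up indexed in Elasticsearch and can be filtered on in Kibana.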

At the moment, it's static-ish: rebuilt and fetched via cron. I'm intending to move towards etcd and confd for a more adaptive solution.
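
For the cron-driven rebuild, something along these lines would do. This is a minimal sketch under stated assumptions: the mapping passed in stands in for whatever your config database returns, `render_dictionary` and `write_dictionary` are hypothetical helper names, and the file is written to a temporary name and then swapped into place so Logstash never sees a half-written dictionary.

```python
import json
import os
import tempfile


def render_dictionary(host_groups):
    """Render a hostname -> group mapping as YAML 'key: value' lines."""
    # json.dumps produces double-quoted strings, which are also valid YAML
    # scalars, so odd characters in host or group names stay safe.
    lines = [
        f"{json.dumps(host)}: {json.dumps(group)}"
        for host, group in sorted(host_groups.items())
    ]
    return "\n".join(lines) + "\n"


def write_dictionary(host_groups, path):
    """Write the dictionary to `path` via a temp file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".host_groups.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(render_dictionary(host_groups))
        # mkstemp creates the file owner-only; open it up so the Logstash
        # user can read it (assumption: cron runs as a different user).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

A cron job would call `write_dictionary(mapping, "/logstash/host_groups.dict")` with the mapping pulled from the config database, then distribute the file to each Logstash node.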

This means that events are already 'tagged' as they enter Elasticsearch. Because my Logstash engines are distributed and autonomous, running off a 'cached' list is desirable anyway, and my host lists don't change fast enough for that to be a problem.
