How to parse a human-readable byte count in Logstash

Tags: kibana, logstash

I'm dealing with log files containing parts such as:

538,486K of 1,048,576K

These represent memory use (Java heap space) rendered in a human-readable format. I would like to track those numbers in charts in Kibana. To do this I would like to somehow use Logstash's grok filter to parse these numbers, but I don't know how to handle (i.e. ignore) the thousands separator.

Ideally I would have something that can also handle the "K" and multiply by one thousand. At this point in time I am not aware that any system logs in a unit other than kilobyte, but I'd prefer not to make that assumption.

Best Answer

The mutate filter allows text replacement with the gsub option.

gsub takes an array, where every triplet of values indicates:

  • Target field name
  • Search pattern
  • Replace pattern

The search pattern is actually a regular expression, but we don't need that power in this case.

First, we strip commas. Simple enough.

Second, we multiply. Should K multiply by 1000? If so, it seems to me that we can simply replace K with 000.

Putting those together:

filter {
    mutate {
        gsub => [
            "some_field", ",", "",
            "some_field", "K", "000"
        ]
    }
}

You can add other replacement options as needed.
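One caveat: gsub only rewrites the text, so the field is still a string afterwards, and Kibana won't chart it as a number. The mutate filter's convert option can cast it to an integer. A minimal sketch (some_field is a placeholder name, as above):

filter {
    mutate {
        # cast the cleaned-up string to an integer so Kibana can aggregate it
        convert => { "some_field" => "integer" }
    }
}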

Depending on your circumstances, K might instead mean 1024, which is a bit more complicated: gsub can only do text substitution, not arithmetic. There is no out-of-the-box option for that, but you can use the ruby filter to do the math yourself.
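For example, the following sketch strips the commas and multiplies by 1024 in a ruby filter. It assumes the Logstash 5+ event API (event.get / event.set), and some_field is again a placeholder:

filter {
    ruby {
        code => '
            raw = event.get("some_field")
            if raw.is_a?(String) && raw.end_with?("K")
              # drop thousands separators and the trailing K, then scale to bytes
              event.set("some_field", raw.delete(",").chomp("K").to_i * 1024)
            end
        '
    }
}

Because the ruby filter stores an integer directly, no separate convert step is needed afterwards.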