Logstash tcp input not passed to elasticsearch

logstash

After successfully setting up ELK with file inputs, logstash-forwarder and seeing logs in Kibana flow from a few servers, I have attempted to set up a TCP input:

tcp {
    codec => "json"
    host => "localhost"
    port => 9250
    tags => ["sensu"]
}

The sender is sensu, and the messages are indeed JSON; I verified this with tcpdump.
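
For reference, a capture along these lines shows the raw payload (interface and port per the setup above):

sudo tcpdump -i lo -A 'tcp port 9250'

The -A flag prints each packet's payload as ASCII, which makes it easy to eyeball the JSON on the wire.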

The Logstash log indicates that the connections are accepted:

{:timestamp=>"2015-06-15T14:03:39.832000+1000", :message=>"Accepted connection", :client=>"127.0.0.1:38065", :server=>"localhost:9250", :level=>:debug, :file=>"logstash/inputs/tcp.rb", :line=>"146", :method=>"client_thread"}
{:timestamp=>"2015-06-15T14:03:39.962000+1000", :message=>"config LogStash::Codecs::JSONLines/@charset = \"UTF-8\"", :level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
{:timestamp=>"2015-06-15T14:03:39.963000+1000", :message=>"config LogStash::Codecs::Line/@charset = \"UTF-8\"", :level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}

However, the data appears to go no further, and can't be found in Kibana.
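
To take sensu out of the equation entirely, a hand-rolled event can be pushed straight at the input (a quick sketch, assuming nc is available; the field name is arbitrary):

echo '{"message": "manual test event"}' | nc localhost 9250

If an event like that doesn't surface either, the problem is in Logstash rather than in the sender.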

I went as far as to disable the other inputs and then observed the shards in elasticsearch (curl 'localhost:9200/_cat/shards'), which did not increase in size.
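
Per-index document counts are another quick check (assuming the default logstash-* daily indices): if events were being indexed, docs.count would climb.

curl 'localhost:9200/_cat/indices?v'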

According to this link, I'm on the right track, but I'm probably just doing something silly somewhere…
Thanks in advance.

logstash.conf:

input {
  file {
    path => ["/var/log/messages", "/var/log/secure", "/var/log/iptables"]
    type => "syslog"
    start_position => "end"
  }

  lumberjack {
    port => 5043
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }

  tcp {
    codec => "json"
    host => "localhost"
    port => 9250
    tags => ["sensu"]
  }

}

output {
  elasticsearch {
    host => "localhost"
    cluster => "webCluster"
  }
}

elasticsearch.yml:

cluster.name: webCluster
node.name: "bossNode"
node.master: true
node.data: true
index.number_of_shards: 1
index.number_of_replicas: 0
network.host: localhost

Best Answer

After a few more frustrating days, I have concluded that the json/json_lines codec is broken, possibly only when used with tcp inputs.
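
For anyone debugging something similar: a temporary stdout output is the easiest way to see whether events survive the pipeline at all (a debugging sketch only, not part of the final config):

output {
  # Pretty-print every event to the console while testing
  stdout { codec => rubydebug }
}

If events show up on stdout but never reach elasticsearch, the problem is in the output; if they never appear at all, suspect the input or its codec.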

However, I found a workaround using a filter:

filter {
  if ("sensu" in [tags]) {
    json {
      source => "message"
    }
  }
}

This, together with a few mutations, produces the effect I was originally trying to achieve. For posterity, here's my working logstash.conf, which combines logs and CPU/memory metrics data from sensu:

input {
  file {
    path => [
      "/var/log/messages",
      "/var/log/secure"
    ]
    type => "syslog"
    start_position => "end"
  }

  file {
    path => "/var/log/iptables"
    type => "iptables"
    start_position => "end"
  }

  file {
    path => ["/var/log/httpd/access_log"
        ,"/var/log/httpd/ssl_access_log"
    ]
    type => "apache_access"
    start_position => "end"
  }

  file {
    path => [
      "/var/log/httpd/error_log",
      "/var/log/httpd/ssl_error_log"
    ]
    type => "apache_error"
    start_position => "end"
  }

  lumberjack {
    port => 5043
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }

  tcp {
    host => "localhost"
    port => 9250
    mode => "server"
    tags => ["sensu"]
  }

}

filter {
  if ("sensu" in [tags]) {
    json {
      "source" => "message"
    }
    mutate {
      rename => { "[check][name]" => "type" }
      replace => { "host" => "%{[client][address]}" }
      split => { "[check][output]" => " " }
      add_field => { "output" => "%{[check][output][1]}" }
      remove_field => [ "[client]", "[check]", "occurrences" ]
    }
  } else if [type] == "apache_access" {
    grok {
      match => { "message" => "%{IP:client}" }
    }
  }
}

filter {
  mutate {
    convert => { "output" => "float" }
  }
}

output {
  elasticsearch {
    host => "localhost"
    cluster => "webCluser"
  }
}
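
Before restarting Logstash it's worth sanity-checking the config first (a sketch; the binary and config paths below assume a 1.5-era install, adjust to yours):

logstash agent -f /etc/logstash/logstash.conf --configtest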

Unrelated to the issue: the "output" field arrives as multiple values separated by spaces, hence the "split" operation. The second element is used and then converted to a float so that Kibana graphs it nicely (something I learned the hard way).
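
To illustrate with a made-up metric (real sensu check output will vary), suppose [check][output] arrives as a graphite-style string:

cpu.total.user 23.4 1434340219

The split turns it into ["cpu.total.user", "23.4", "1434340219"], so %{[check][output][1]} is "23.4", which the final convert filter then coerces to the float 23.4.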