Tuesday, July 9, 2013

Log analysis using Logstash, ElasticSearch and Kibana

Logstash is a free tool for managing events and logs [http://logstash.net/docs/1.1.13/]. It has three primary components: an input module for collecting logs from various sources, a filter module for tweaking and parsing the data, and finally an output module for saving or passing along the parsed data to other systems.
ElasticSearch is this awesome distributed, RESTful, free, Lucene-powered search engine/server. Unlike SOLR, ES is very simple to use and maintain, and similar to SOLR, indexing is near realtime.
Kibana is a presentation layer that sits on top of ElasticSearch to analyze and make sense of the logs that Logstash throws into ElasticSearch; Kibana is a highly scalable interface for Logstash and ElasticSearch that allows you to efficiently search, graph, analyze and otherwise make sense of a mountain of logs.
The Logstash + ElasticSearch + Kibana combination can be compared to an open-sourced Splunk, but on a smaller scale.

Getting started with Logstash is as easy as downloading the JAR file [http://logstash.net/], setting up the input and output sources, and running the java command. In this example, I will be monitoring a log file and writing it into an ElasticSearch server for users to analyze the data using Kibana.
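A minimal configuration for this kind of setup might look like the following sketch (the log file path, type name and host are assumptions for illustration; adjust them for your environment):

```
# logstash.conf -- hypothetical example configuration
input {
  file {
    type => "dataLog"
    path => "/var/log/dataLog.log"   # assumption: point this at your own log file
  }
}
output {
  elasticsearch_http {
    host => "localhost"              # your ElasticSearch server
    port => 9200
  }
}
```

Then start it with something like `java -jar logstash-1.1.13-flatjar.jar agent -f logstash.conf`.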

Elasticsearch: download the zip package from the site [http://www.elasticsearch.org/download/] and run the elasticsearch.bat file.
Note: make sure JAVA_HOME is set up correctly for Logstash and ElasticSearch to work.

Kibana: download the Kibana files from GitHub [https://github.com/elasticsearch/kibana] and either run it as a standalone app or make it part of the ElasticSearch plugins. You can do this by copying the Kibana files to the ElasticSearch plugins/sites directory.

* Open config.js in your favorite editor
* Set elasticsearch: 'http://localhost:9200', to point to your ElasticSearch server
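For example, the relevant line in config.js might look like this (localhost is an assumption; replace it with wherever your ElasticSearch server runs):

```js
/* config.js -- point Kibana at your ElasticSearch server */
elasticsearch: 'http://localhost:9200',
```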

Use case:
In general, most of these log analyzers talk about analyzing website traffic, similar to the videos that Kibana has on its website [http://kibana.org/about.html]. This is great, but in the real world logs and events are more than just website traffic: information flow checkpoints, performance data, etc.
In our case, let's assume we have some data that is being passed from one system to another and we are logging it to a file. A simple representation of this information flow is as follows:

So basically there are four systems or states that the data is passed through: Ingest, Digest, Process and Exit. At each of these systems, an event is logged to track the data flow; these are basically checkpoints. The events are logged to the dataLog.log file mentioned in the Logstash configuration above.
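As an illustration, the checkpoint events in dataLog.log might look something like this (the exact format and field names here are made up for the example, not the actual log format):

```
2013-07-09 10:15:01 INFO  id=12345 system=Ingest  status=OK
2013-07-09 10:15:04 INFO  id=12345 system=Digest  status=OK
2013-07-09 10:15:09 ERROR id=12345 system=Process status=ERROR msg="schema validation failed"
```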

Once Logstash is up and running, it basically tails the file and copies the logged events to ElasticSearch as JSON objects. ElasticSearch indexes all the fields, and Kibana is now ready to access the data. Following are some of the cases that can be analyzed using Kibana:

Show all data flowing thru the system

Filter by Id

Get All Error'd

Advanced Filter using Lucene Syntax
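As a sketch, the Kibana search box accepts Lucene query syntax, so the cases above might translate into queries like these (the field names match the hypothetical log format above, not necessarily your own):

```
*                                 # show all data flowing thru the system
id:12345                          # filter by Id
status:ERROR                      # get all error'd events
system:Digest AND status:ERROR    # advanced filter using Lucene syntax
```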

The above reports/analyses are just a few examples of what can be achieved using Kibana + ElasticSearch. With Kibana you can design your own custom dashboards with configurable panels that can be grouped by role. Charts and panels are fully interactive, with features like drill-down, range selection and customization. And with ElasticSearch, handling rapid data growth is as easy as adding more ES servers to the cluster.
