WYSIWYG

http://kufli.blogspot.com
http://github.com/karthik20522

Sunday, November 23, 2014

Elasticsearch - Zen, AWS Cluster Setup

In a cluster environment, multiple Elasticsearch nodes/servers join to form a cluster in which the shards are distributed and replicated among the servers, while to the outside world the cluster is presented as a single system. For Elasticsearch nodes to find and connect to each other, ES provides two kinds of discovery: Zen discovery, and cloud-based discovery via plugins for Azure, AWS and Google Compute Engine.

Zen Discovery

A basic Zen setup is pretty straightforward: discovery.type is set to "zen", the minimum number of master-eligible nodes required to form a cluster is set to "2", "unicast" is used to find the other hosts, and fault detection provides a recovery mechanism if a server goes offline or there are network problems. This is probably all that Zen discovery has to offer, simple and easy! More info at Zen discovery
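As a rough sketch, the Zen settings described above might look like this in elasticsearch.yml on an Elasticsearch 1.x node. The host names are hypothetical, and the fault-detection values are only illustrative, not taken from the original setup.

```yaml
# elasticsearch.yml (Elasticsearch 1.x style settings; a sketch, not the original config)
discovery.type: zen

# Require at least 2 master-eligible nodes before a master can be elected
discovery.zen.minimum_master_nodes: 2

# Disable multicast and list the hosts to ping explicitly (unicast)
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-node-1:9300", "es-node-2:9300"]  # hypothetical hosts

# Fault detection: how often to ping, how long to wait, how many retries
discovery.zen.fd.ping_interval: 1s
discovery.zen.fd.ping_timeout: 30s
discovery.zen.fd.ping_retries: 3
```

With more than two master-eligible nodes, minimum_master_nodes is commonly set to a majority (N/2 + 1) to reduce the risk of split brain.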

AWS/EC2 Discovery

For EC2 discovery, we first need to install the cloud-aws plugin if it is not already installed. In the configuration, the discovery type is set to ec2, and a region and a security group can optionally be given to help the plugin discover other nodes. If there is no IAM role associated with the server, then the AWS access_key and secret_key need to be provided in order for the plugin to query AWS for node information.
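A minimal sketch of such an EC2 discovery configuration, assuming the cloud-aws plugin on Elasticsearch 1.x. The region, security group name and credentials are placeholders, not values from the original post.

```yaml
# elasticsearch.yml (sketch; all values below are placeholders)
cloud:
  aws:
    # Only needed when the instance has no IAM role attached
    access_key: YOUR_ACCESS_KEY
    secret_key: YOUR_SECRET_KEY
    # Optional: region the plugin should query for instances
    region: us-east-1

discovery:
  type: ec2
  ec2:
    # Optional: only consider instances in this security group (hypothetical name)
    groups: my-es-security-group
```

When an IAM role is attached to the instance, the access_key and secret_key lines can be left out entirely.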

Setting cloud.node.auto_attributes to true adds aws_availability_zone to the node attributes, which helps with shard allocation awareness. What this means is that, given an index with a replication factor of 1, ES uses this attribute to determine which availability zone a particular shard is sitting in and makes sure the replica shard is placed on a box in a different zone. More info at Shard Awareness

We can make Elasticsearch node discovery a little bit faster by filtering the servers it needs to ping during the discovery process. This filter can be achieved by using ec2.tag, provided tags are assigned to the EC2 servers. In an enterprise environment where hundreds of EC2 servers are deployed on AWS, pinging every single one of them would take a very long time, so this should help speed things up.

More information on the EC2 discovery plugin at cloud-aws, and on the various discovery modules at elasticsearch-discovery
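For reference, the awareness and tag-filter settings discussed in this section could be sketched as follows, again assuming the cloud-aws plugin on Elasticsearch 1.x. The tag name and value (stage: prod) are hypothetical, and the awareness line is an assumption about how the attribute would be put to use rather than part of the original config.

```yaml
# elasticsearch.yml (sketch)

# Ask the plugin to add aws_availability_zone to the node attributes
cloud.node.auto_attributes: true

# Assumption: tell the allocator to spread a shard and its replica across zones
cluster.routing.allocation.awareness.attributes: aws_availability_zone

# Only ping EC2 instances carrying this tag (hypothetical tag key and value)
discovery.ec2.tag.stage: prod
```

Only instances whose tags match every discovery.ec2.tag entry are considered during discovery, which keeps the candidate list small in accounts with many unrelated servers.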
