(Last Updated On: September 1, 2018)

I previously wrote an article on How to Install Elasticsearch 5.x on Ubuntu 18.04 LTS (Bionic Beaver). That guide covers only a single-node Elasticsearch setup, which is a single point of failure. In this guide, we will cover the installation of a three-node Elasticsearch cluster on Ubuntu 18.04 to provide high availability and scalability under heavy load. One of the nodes will be the master, and the other two nodes will serve as data nodes.

Elasticsearch is a highly scalable open-source full-text search and analytics engine. With Elasticsearch, you can store, search, and analyze large volumes of data quickly and in near real time. It is generally used as the underlying engine that powers applications with complex search features and requirements.

How to Install a three-node Elasticsearch Cluster on Ubuntu 18.04

My setup has three Elasticsearch nodes with the following hostnames:

10.10.5.10 elk-node-1
10.10.5.11 elk-node-2
10.10.5.12 elk-node-3

Add these lines to the /etc/hosts file on each server.
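
A minimal sketch of adding the entries without creating duplicates, assuming a POSIX shell. The add_host_entry function and the HOSTS_FILE variable are illustrative names, not part of any tool. The demo writes to a temporary file so it is safe to run as-is; for real use, set HOSTS_FILE=/etc/hosts and run it with root privileges.

```shell
#!/bin/sh
# Hypothetical helper: append "IP hostname" to a hosts file only if missing.
# HOSTS_FILE defaults to a scratch file so the sketch is safe to run as-is.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    # $1 = IP address, $2 = hostname
    if ! grep -qE "^${1}[[:space:]]+${2}\$" "$HOSTS_FILE" 2>/dev/null; then
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

add_host_entry 10.10.5.10 elk-node-1
add_host_entry 10.10.5.11 elk-node-2
add_host_entry 10.10.5.12 elk-node-3
# Running it again adds nothing, because the entry already exists.
add_host_entry 10.10.5.10 elk-node-1

cat "$HOSTS_FILE"
```

Because the function checks before appending, the same script can be rerun on every node without piling up repeated lines.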

For reference, the lab is based on the following Vagrantfile:

# -*- mode: ruby -*-
# vim: set ft=ruby :

ENV['VAGRANT_DEFAULT_PROVIDER'] = 'libvirt'

# Check required plugins
REQUIRED_PLUGINS_LIBVIRT = %w(vagrant-libvirt)
exit unless REQUIRED_PLUGINS_LIBVIRT.all? do |plugin|
  Vagrant.has_plugin?(plugin) || (
    puts "The #{plugin} plugin is required. Please install it with:"
    puts "$ vagrant plugin install #{plugin}"
    false
  )
end

Vagrant.configure("2") do |config|
  config.vm.define "elk-01" do |node|
    node.vm.hostname = "elk-01"
    node.vm.box = "generic/ubuntu1804"
    node.vm.box_check_update = false
    #node.vm.synced_folder '.', '/vagrant', :disabled => true
    node.vm.network "private_network", ip: "10.10.5.10"
    node.vm.provider :libvirt do |domain|
      domain.memory = 1024
      domain.storage :file, :size => '10G'
    end
  end
  config.vm.define "elk-02" do |node|
    node.vm.hostname = "elk-02"
    node.vm.box = "generic/ubuntu1804"
    node.vm.box_check_update = false
    #node.vm.synced_folder '.', '/vagrant', :disabled => true
    node.vm.network "private_network", ip: "10.10.5.11"
    node.vm.provider :libvirt do |domain|
      domain.memory = 1024
      domain.storage :file, :size => '10G'
    end
  end
  config.vm.define "elk-03" do |node|
    node.vm.hostname = "elk-03"
    node.vm.box = "generic/ubuntu1804"
    node.vm.box_check_update = false
    #node.vm.synced_folder '.', '/vagrant', :disabled => true
    node.vm.network "private_network", ip: "10.10.5.12"
    node.vm.provider :libvirt do |domain|
      domain.memory = 1024
      domain.storage :file, :size => '10G'
    end
  end
end

Note that I have added a 10 GB secondary disk that will be used to store Elasticsearch data. An ideal Elasticsearch setup will have a larger disk, depending on your storage requirements. Start all nodes and configure the /etc/hosts file with the correct hostnames and IP addresses.

Step 1: Create a partition for Elasticsearch data (Optional)

It is not recommended to store Elasticsearch data on the root partition (/). Create a dedicated partition and mount it. We will later change the data path to this partition's mount point.

# lsblk  | grep vda
vda            252:0   0  10G  0 disk

Create a GPT partition table on the secondary disk. You can repeat this for more than one disk:

sudo parted -s -a optimal -- /dev/vda mklabel gpt
sudo parted -s -a optimal -- /dev/vda mkpart primary 0% 100%
sudo parted -s -- /dev/vda align-check optimal 1

Then create an LVM volume, which will make it easy to extend the partition later:

$ sudo pvcreate  /dev/vda1
  Physical volume "/dev/vda1" successfully created.

$ sudo vgcreate vg10 /dev/vda1
  Volume group "vg10" successfully created

$ sudo lvcreate -n elasticsearch -l 100%FREE vg10
Logical volume "elasticsearch" created

Create an ext4 filesystem on the new logical volume:

$ sudo mkfs.ext4 /dev/mapper/vg10-elasticsearch 
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 2620416 4k blocks and 655360 inodes
Filesystem UUID: 5d769fb5-68fc-4ee3-a5fa-0f6b5cda5758
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

Step 2: Add the Elasticsearch APT repository on Ubuntu 18.04

Import GPG Key:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Add Elasticsearch APT repository

Once the GPG key has been imported, add the APT repository that provides the Elasticsearch package:

For Elasticsearch 5.x, add repo below:

echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list

For Elasticsearch 6.x, add repo below:

echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list

Step 3: Install OpenJDK

Elasticsearch depends on Java, so you need to install OpenJDK before you can continue.

sudo apt update
sudo apt install apt-transport-https openjdk-8-jre-headless

Step 4:  Install Elasticsearch

Now run apt update, then install the elasticsearch package:

sudo apt update
sudo apt install elasticsearch

Once Elasticsearch is installed, add the mount point to the /etc/fstab file and create the mount directory. Skip this if you didn’t configure a separate partition for Elasticsearch.

echo "/dev/mapper/vg10-elasticsearch /var/lib/elasticsearch/data ext4 defaults 0 0" | sudo tee -a /etc/fstab
sudo mkdir -p /var/lib/elasticsearch/data

Mount the partition and set ownership and permissions:

sudo mount -a 
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch/data
sudo chmod -R 775 /var/lib/elasticsearch/data

Note that the default JVM heap size is 2 GB. If your server has less memory, change this value:

$ sudo vim /etc/elasticsearch/jvm.options

Change:

-Xms2g
-Xmx2g

Then set your own values for the minimum and maximum heap size. For example, to allocate 512 MB of RAM, use:

-Xms512m
-Xmx512m
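
The same change can be scripted instead of edited by hand. This is a sketch under assumptions: set_jvm_heap is a hypothetical helper name, and the demo edits a scratch file rather than /etc/elasticsearch/jvm.options, so pass the real path (with root privileges) when applying it to a node.

```shell
#!/bin/sh
# Hypothetical helper: set both heap flags in a jvm.options-style file.
# $1 = file path, $2 = heap size such as 512m or 1g
set_jvm_heap() {
    # Write to a temporary copy first, then overwrite the original in place
    # so its ownership and permissions are preserved.
    tmp="$(mktemp)"
    sed -e "s/^-Xms.*/-Xms$2/" -e "s/^-Xmx.*/-Xmx$2/" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demo on a scratch file that mimics the relevant lines.
JVM_OPTIONS="$(mktemp)"
printf '%s\n' '# heap settings' '-Xms2g' '-Xmx2g' '-XX:+UseConcMarkSweepGC' > "$JVM_OPTIONS"

set_jvm_heap "$JVM_OPTIONS" 512m
cat "$JVM_OPTIONS"
```

Keeping both flags equal matters: once a node binds to a non-loopback address, Elasticsearch enforces its bootstrap checks, and one of them requires the initial and maximum heap sizes to match.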

Step 5: Configure Elasticsearch Cluster

After adjusting the JVM options, configure the Elasticsearch cluster:

sudo vim /etc/elasticsearch/elasticsearch.yml

Start by setting the cluster name (same on all nodes):

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: graylog-cluster
#

On each node, set a descriptive name for the node

On Node 1:

# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: elk-node-1
#

On Node 2:

# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: elk-node-2
#

On Node 3:

# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: elk-node-3
#

Modify the path.data setting to /var/lib/elasticsearch/data/

# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch/data
#

Set the bind address to a specific IP

On Node 1:

# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.10.5.10
# Change accordingly for other nodes

On Node 2:

# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.10.5.11
# Change accordingly for other nodes

On Node 3:

# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.10.5.12
# Change accordingly for other nodes

Configure discovery by listing the IP addresses of all nodes (set on all nodes):

# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["10.10.5.10", "10.10.5.11", "10.10.5.12"]
#

Set the minimum number of master-eligible nodes that must be visible to form a cluster (set on all nodes):

discovery.zen.minimum_master_nodes: 2

This number is calculated as (number of master-eligible nodes / 2) + 1, rounded down. With three master-eligible nodes, that gives 3 / 2 + 1 = 2.
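
As a quick check of that arithmetic, here is a small sketch in POSIX shell. The min_master_nodes function is a made-up name for illustration; it relies on shell integer division to do the rounding down.

```shell
#!/bin/sh
# Hypothetical helper: quorum for N master-eligible nodes is floor(N / 2) + 1.
# Shell arithmetic uses integer division, so the floor is implicit.
min_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}

for n in 1 2 3 5; do
    echo "master-eligible nodes: $n -> discovery.zen.minimum_master_nodes: $(min_master_nodes "$n")"
done
```

For the three-node cluster in this guide the result is 2, which matches the value set above.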

Define node 1 as master-eligible:

node.master: true

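Define nodes 2 and 3 as data nodes. Both node.master and node.data default to true, so keep node.master enabled on at least two nodes; otherwise the minimum_master_nodes value of 2 can never be satisfied and no master will be elected: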

node.data: true
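
To keep the three files consistent, the per-node settings from this step can be generated rather than typed on each server. The sketch below only prints the settings; render_node_config is a hypothetical helper, and the values are the ones used in this guide. Review the output before merging it into /etc/elasticsearch/elasticsearch.yml.

```shell
#!/bin/sh
# Hypothetical helper: print the settings from this step for one node.
# $1 = node name, $2 = node IP address
render_node_config() {
    cat <<EOF
cluster.name: graylog-cluster
node.name: $1
path.data: /var/lib/elasticsearch/data
network.host: $2
discovery.zen.ping.unicast.hosts: ["10.10.5.10", "10.10.5.11", "10.10.5.12"]
discovery.zen.minimum_master_nodes: 2
EOF
}

# Example: settings for the second node.
render_node_config elk-node-2 10.10.5.11
```

Only node.name and network.host differ between nodes; everything else must be identical across the cluster.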

If you have an active firewall, open ports 9200 and 9300:

sudo ufw allow 9200
sudo ufw allow 9300
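
Port 9200 serves the HTTP API, while 9300 carries node-to-node transport traffic, so a tighter policy would allow 9300 only from the cluster peers. The sketch below prints such rules without running them; print_ufw_rules is a hypothetical helper, and the peer addresses are the ones from this lab.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) ufw rules that limit the transport
# port 9300 to the given peers and open the HTTP port 9200.
print_ufw_rules() {
    for peer in "$@"; do
        echo "sudo ufw allow from $peer to any port 9300 proto tcp"
    done
    echo "sudo ufw allow 9200/tcp"
}

print_ufw_rules 10.10.5.10 10.10.5.11 10.10.5.12
```

Printing the commands first makes it easy to review them before applying anything to a live firewall.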

Start Elasticsearch and enable the service to start on boot:

sudo systemctl daemon-reload
sudo systemctl enable elasticsearch.service
sudo systemctl restart elasticsearch.service

Step 6: Testing Elasticsearch cluster

Make sure the cluster status is green, which means the cluster is healthy.

$ curl "http://10.10.5.10:9200/_cluster/health?pretty"
{
  "cluster_name" : "graylog-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
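
For scripted checks, the status field can be pulled out of that response. This is a rough sketch using sed rather than a proper JSON parser; cluster_status is a hypothetical helper, and the demo feeds it a trimmed copy of the response instead of calling a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: extract the "status" value from _cluster/health JSON on stdin.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Demo with a trimmed copy of the response; on a live cluster, pipe
#   curl -s "http://10.10.5.10:9200/_cluster/health?pretty"
# into cluster_status instead.
SAMPLE='{
  "cluster_name" : "graylog-cluster",
  "status" : "green",
  "number_of_nodes" : 3
}'

status="$(printf '%s\n' "$SAMPLE" | cluster_status)"
if [ "$status" = "green" ]; then
    echo "cluster is healthy"
else
    echo "cluster status is '$status'; check the node logs"
fi
```

A yellow status means some replica shards are unassigned, and red means at least one primary shard is unassigned.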

Check the current master node:

$ curl http://10.10.5.10:9200/_cat/master
84GGSOjaRtCBuXrBFKXMcw 10.10.5.10 10.10.5.10 elk-node-1

List all nodes in the cluster:

$ curl "http://10.10.5.10:9200/_cat/nodes?h=ip,port,heapPercent,name"
10.10.5.10 9300 10 elk-node-1
10.10.5.11 9300 11 elk-node-2
10.10.5.12 9300 12 elk-node-3
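
To confirm that every expected node has joined, the listing can be compared against a list of names. A minimal sketch, assuming the node name is the last column as in the output above; missing_nodes is a hypothetical helper, and the sample input deliberately omits the third node.

```shell
#!/bin/sh
# Hypothetical helper: read _cat/nodes output (name in the last column) on stdin
# and print every expected node name that does not appear in it.
missing_nodes() {
    # $@ = expected node names
    seen="$(awk '{ print $NF }')"
    for name in "$@"; do
        printf '%s\n' "$seen" | grep -qx "$name" || echo "$name"
    done
}

# Sample listing with only two of the three nodes present.
SAMPLE='10.10.5.10 9300 10 elk-node-1
10.10.5.11 9300 11 elk-node-2'

printf '%s\n' "$SAMPLE" | missing_nodes elk-node-1 elk-node-2 elk-node-3
```

With the sample input this prints elk-node-3; empty output means all expected nodes are in the cluster.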

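You should now have a working Elasticsearch cluster on Ubuntu 18.04. To increase the cluster size, configure each extra node using the steps provided here; it should join the cluster without issues. If the new node is master-eligible, recalculate discovery.zen.minimum_master_nodes on all nodes.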