In a previous post we've shown how to create an Elasticsearch cluster with manually defined nodes. In this post we use Docker Compose to scale things up!
Elasticsearch has many solutions to play with (e.g. Logstash and Kibana, aka ELK, HQ, etc.), but I have tried to use a simpler example and comment on various issues that I have come across.
Creating the compose file
First, create a docker-compose.yml file and add two services to it:
$docker>mkdir es-cluster
$docker>cd es-cluster
$es-cluster>touch docker-compose.yml
Open docker-compose.yml with a text editor and add the following:
version: '2'
services:
  master:
    image: library/elasticsearch
    command: elasticsearch -network.host=0.0.0.0 -node.master=true -cluster.name=cluster-01 -node.name="Master of Disaster"
    volumes:
      - ./elasticsearch/config/:/usr/share/elasticsearch/config/
      - ./elasticsearch/logs/:/usr/share/elasticsearch/logs/
    ports:
      - "9200:9200"
    restart: always
    container_name: es_master
  node:
    image: library/elasticsearch
    command: elasticsearch -network.host=0.0.0.0 -cluster.name=cluster-01 -discovery.zen.ping.unicast.hosts=es_master
    restart: always
    volumes:
      - ./elasticsearch/config/:/usr/share/elasticsearch/config/
      - ./elasticsearch/logs/:/usr/share/elasticsearch/logs/
    depends_on:
      - master
    links:
      - master:es_master
master: The first service, named master, will be the master of the cluster.
node: The second service, named node, will represent all the data nodes that participate in the cluster.
image: library/elasticsearch pulls the official Elasticsearch image (if it does not already exist locally).
-node.master=true we set this in the master service to declare it the master of the cluster.
-node.name="Master of Disaster" optionally we give a name to the master node.
container_name: es_master we set the name es_master on the container so that the master will be discoverable by the rest of the data nodes.
-discovery.zen.ping.unicast.hosts=es_master each node will look up the es_master container to connect to the cluster.
- "9200:9200" in the ports section we expose the HTTP port to the host.
-network.host=0.0.0.0 defines where the service will be published. If this is not set, the service will be published on the localhost (127.0.0.1) of the container and it won't be available outside of it (at least not with Docker 1.12 Beta that I am using). By defining 0.0.0.0 as the network host, the service is exposed on the container's network IP (e.g. publish_address {172.22.0.2:9200}).
-cluster.name=cluster-01 defines the name of the cluster. It should be unique and the same in both the master and node services in order for all nodes to participate in the same cluster.
In the volumes section we map the local directories to those in the containers:
- ./elasticsearch/config/:/usr/share/elasticsearch/config/
- ./elasticsearch/logs/:/usr/share/elasticsearch/logs/
restart: always so that the master restarts automatically whenever it stops or the system reboots.
- master in the depends_on section we say that node depends on the master.
- master:es_master finally, in the links section we declare the link towards the master service, aliased as es_master.
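Before moving on, it is worth sanity-checking the file. Recent versions of Compose ship a config command that validates the YAML and prints the resolved configuration:
$es-cluster>docker-compose config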
Preparing for the first run
Before running this configuration for the first time, it is a good idea to create a logging configuration file and place it in the config directory, so that the log4j instance of Elasticsearch can export logs in a favourable format. So, create the directories needed:
$es-cluster>mkdir elasticsearch
$es-cluster>cd elasticsearch
$elasticsearch>mkdir config
Then create the logging.yml file in the config directory:
$elasticsearch>cd config
$config>touch logging.yml
To create two appenders (one for the console and one that writes to a file), open logging.yml with a text editor and add the following:
es.logger.level: INFO
rootLogger: ${es.logger.level}, console, file
logger:
  action: TRACE

appender:
  console:
    type: console
    layout:
      type: consolePattern
      conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"
  file:
    type: dailyRollingFile
    file: ${path.logs}/${cluster.name}.log
    datePattern: "yyyy-MM-dd"
    layout:
      type: pattern
      conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"
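Note that the file appender writes to ${path.logs}/${cluster.name}.log, and the logs directory is volume-mapped to the host in our compose file. So once the containers are up (next section), you should be able to follow the master's log from the host with something like:
$es-cluster>tail -f elasticsearch/logs/cluster-01.log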
Run for the first time
Run it first to test without scaling:
$es-cluster>docker-compose up
This will create a default network for the containers, es-cluster_default, named after the directory in which the compose file is placed. The services will be attached to that network in order to communicate.
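If you are curious, you can inspect that network and see both containers attached to it (the exact network name may differ slightly between Compose versions):
$es-cluster>docker network inspect es-cluster_default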
To see the cluster's health, enter the following URL in a browser:
localhost:9200/_cluster/health?pretty=true
Something like this should be shown:
{ "cluster_name" : "cluster-01", "status" : "green", "timed_out" : false, "number_of_nodes" : 2, "number_of_data_nodes" : 2, "active_primary_shards" : 0, "active_shards" : 0, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 0, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }
Scale up the cluster
Now that we are positive that everything runs smoothly, we can stop the running processes (either with Ctrl-C or with docker-compose stop). Beware of one thing, though: if you terminate the processes by running docker-compose down, it will stop and remove not only the running containers but also the network es-cluster_default. To avoid this and not depend on that network, it is better to create an external network and use it in the compose file:
$es-cluster>docker network create es-network
Edit the docker-compose.yml file and add the following lines at the end of the file:
networks:
  default:
    external:
      name: es-network
This section defines that the default network for the containers will be es-network instead of es-cluster_default. And because it is the default network, there is no need to add a networks section to each service. So, now we are ready to fire up our cluster with a master node and, let's say, 5 more nodes:
$es-cluster>docker-compose scale master=1 node=5
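To verify that all six containers are up, list them with Compose:
$es-cluster>docker-compose ps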
Once again, to see our cluster's health, open the following address in the browser:
localhost:9200/_cluster/health?pretty=true
Something like this should be shown:
{ "cluster_name" : "cluster-01", "status" : "green", "timed_out" : false, "number_of_nodes" : 6, "number_of_data_nodes" : 6, "active_primary_shards" : 0, "active_shards" : 0, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 0, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }
Loading data
At this point you need to load some data into Elasticsearch. From Loading the Sample Dataset you can download the sample dataset (accounts.json), extract it into your current directory, and load it into your cluster as follows:
$es-cluster>curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary "@accounts.json"
After loading it, run:
$es-cluster>curl 'localhost:9200/_cat/indices?v'
and it will show:
health status index pri rep docs.count docs.deleted store.size pri.store.size
green  open   bank     5   1        889            0    344.3kb        189.7kb
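As a quick sanity check on the loaded data, a match-all query against the bank index should return the first few accounts:
$es-cluster>curl 'localhost:9200/bank/_search?q=*&pretty'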
If we refresh the health page in the browser, it will now count the 5 primary shards of the bank index plus their 5 replicas as active:
{ "cluster_name" : "cluster-01", "status" : "green", "timed_out" : false, "number_of_nodes" : 6, "number_of_data_nodes" : 6, "active_primary_shards" : 5, "active_shards" : 10, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 0, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }
Adding the HQ plugin
A nice plugin for visualising the cluster and querying the data within it is HQ. The easiest way to install it is to run the plugin install command on the master:
$es-cluster>docker exec es_master plugin install royrusso/elasticsearch-HQ
This will download and install HQ. Then, in the browser, go to localhost:9200/_plugin/hq and connect to http://localhost:9200 at the top of the screen.
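You can also confirm the installation from the command line via the _cat API, which lists the plugins installed on each node:
$es-cluster>curl 'localhost:9200/_cat/plugins?v'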
Summary
This was yet another simple example of creating clusters with Docker Compose. I hope, though, that I shed some light on various parts of Docker Compose and networking…
Enjoy coding!