Localstack and two other containers!

Steve Wade
5 min read · Apr 27, 2020

Feature statement

Provide the ability to snapshot Elasticsearch clusters to S3 on a regular cadence. For reference, we are using the managed Elasticsearch service in AWS and are currently on Elasticsearch version 6.4.2. We decided to write a Golang command-line tool to perform the snapshotting.

Our proposed solution / problem

However, we wanted to validate our tool locally before deploying it to our Kubernetes cluster in AWS.

After doing some searching online, my colleague, Will Varney came across localstack (https://github.com/localstack/localstack).

LocalStack — A fully functional local AWS cloud stack

LocalStack provides an easy-to-use test/mocking framework for developing Cloud applications. Currently, the focus is primarily on supporting the AWS cloud stack.

I would also recommend leveraging awscli-local (https://github.com/localstack/awscli-local) alongside localstack, as it saves you from having to reconfigure your awscli (endpoints, credentials, and so on) just to interact with it.

This package provides the awslocal command, which is a thin wrapper around the aws command-line interface for use with LocalStack.
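For example, once LocalStack is up, creating and inspecting a bucket looks just like the regular aws CLI, minus the endpoint plumbing (the bucket name here is purely illustrative):

# create a bucket in the local S3 (name is illustrative)
awslocal s3 mb s3://my-test-bucket

# list buckets to confirm it exists
awslocal s3 ls

# the equivalent without awslocal would be:
# aws --endpoint-url=http://localhost:4572 s3 ls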

Leveraging docker-compose

Will and I then set about constructing a docker-compose file that would allow us to simulate having both an Elasticsearch cluster and an S3 bucket running locally.

Initial attempt

The docker-compose.yml file below shows both the elasticsearch and localstack configuration.

version: '3'
services:
  elasticsearch:
    build: .
    container_name: elasticsearch
    volumes:
      - ./es.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - 9200:9200
    environment:
      discovery.type: single-node
    links:
      - "localstack:localstack"
    restart: always

  localstack:
    image: localstack/localstack:0.11.0
    container_name: localstack
    ports:
      - '4563-4599:4563-4599'
      - '8055:8080'
    environment:
      - SERVICES=s3
      - DEFAULT_REGION=eu-west-2
      - DATA_DIR=/tmp/localstack/data
      - AWS_ACCESS_KEY_ID=1234
      - AWS_SECRET_ACCESS_KEY=1234
    volumes:
      - './.localstack:/tmp/localstack'

You will notice above that elasticsearch is built from a Dockerfile rather than just specifying an image. The Dockerfile consists of the open-source image for Elasticsearch 6.4.2 plus the installation of the repository-s3 plugin, which allows us to snapshot to S3 (see below).

FROM docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.2

RUN /usr/share/elasticsearch/bin/elasticsearch-plugin install --batch repository-s3

The last piece of configuration is Elasticsearch itself, via the elasticsearch.yml file; its contents can be seen below:

cluster.name: "docker-cluster"
network.host: 0.0.0.0

# Configuration to allow for local S3 (for localstack)
s3.client.default.endpoint: localstack:4572
s3.client.default.protocol: http

Note: Port 4572 is the port localstack exposes for access to S3; this can be found in the overview section of their README file.
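With all three files in place, the rig can be brought up and given a quick smoke test:

# build the custom elasticsearch image and start both containers
docker-compose up --build -d

# elasticsearch should answer on 9200...
curl localhost:9200

# ...and localstack's S3 should answer via awslocal
awslocal s3 ls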

Everything looked good until we actually came to interact with our S3 bucket from within Elasticsearch itself. We were getting a 401 Unauthorized error.

The journey to a 200

The remainder of this post talks about the changes required to get a successful connection between Elasticsearch and S3.

The need to leverage the elasticsearch keystore

The client that you use to connect to S3 has a number of settings available. The settings have the form s3.client.CLIENT_NAME.SETTING_NAME. By default, s3 repositories use a client named default, but this can be modified using the repository setting client.

Most client settings can be added to the elasticsearch.yml configuration file with the exception of the secure settings, which you add to the Elasticsearch keystore. For more information about creating and updating the Elasticsearch keystore, see Secure settings.

For example, if you want to use specific credentials to access S3 then run the following commands to add these credentials to the keystore:

bin/elasticsearch-keystore add s3.client.default.access_key
bin/elasticsearch-keystore add s3.client.default.secret_key
bin/elasticsearch-keystore add s3.client.default.session_token

Therefore we added these commands to our Elasticsearch Dockerfile, but to no avail: the keystore was wiped when the elasticsearch binary in the container started up.

Will and I then decided to exec into the running container, run the keystore add commands there, and then use the docker cp command to copy the keystore file from the container to our local machine.

The values used for the keystore commands need to match the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY specified in the localstack configuration within the docker-compose.yml file.
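As a rough sketch of that workflow (the container name and keystore path match the compose file above; the 1234 values mirror the localstack credentials):

# run the keystore commands inside the running elasticsearch container
docker exec -it elasticsearch bin/elasticsearch-keystore add s3.client.default.access_key   # enter 1234 when prompted
docker exec -it elasticsearch bin/elasticsearch-keystore add s3.client.default.secret_key   # enter 1234 when prompted

# copy the resulting keystore out to the host
docker cp elasticsearch:/usr/share/elasticsearch/config/elasticsearch.keystore ./es.keystore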

We could then mount the keystore inside the container at runtime, which meant it wouldn't be overridden. The elasticsearch section of the docker-compose.yml file now looked like this:

elasticsearch:
  build: .
  container_name: elasticsearch
  volumes:
    - ./es.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    - ./es.keystore:/usr/share/elasticsearch/config/elasticsearch.keystore
  ports:
    - 9200:9200
  environment:
    discovery.type: single-node
  links:
    - "localstack:localstack"
  restart: always

Good old Docker networking!

As you saw earlier in this post we were using localstack:4572 as the default endpoint for Elasticsearch to use when talking to S3 (see below):

# Configuration to allow for local S3 (for localstack)
s3.client.default.endpoint: localstack:4572
s3.client.default.protocol: http

After reading further into the documentation, we came across the following:

There are a number of storage systems that provide an S3-compatible API, and the repository-s3 plugin allows you to use these systems in place of AWS S3. To do so, you should set the s3.client.CLIENT_NAME.endpoint setting to the system’s endpoint. This setting accepts IP addresses and hostnames and may include a port. For example, the endpoint may be 172.17.0.2 or 172.17.0.2:9000.

We therefore decided to give both Elasticsearch and localstack dedicated IP addresses to communicate over.

We did this because we needed path-style access to the S3 bucket, similar to what's discussed here. With a hostname as the endpoint, the AWS Java SDK underlying the plugin defaults to virtual-hosted-style addressing, where the bucket name becomes part of the hostname; the resulting DNS lookup fails because we're not actually using S3. Specifying an IP address instead makes the SDK fall back to path-style access.
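To illustrate the difference (the bucket and object names here are made up):

# virtual-hosted-style: the bucket becomes part of the hostname - fails locally
http://my-bucket.localstack:4572/my-object

# path-style: the bucket is part of the path - works against localstack
http://172.28.1.2:4572/my-bucket/my-object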

To configure this, we added a networks configuration option within docker-compose. We added the section below to the bottom of our docker-compose.yml file.

networks:
  testing_net:
    ipam:
      driver: default
      config:
        - subnet: 172.28.0.0/16

For more information on docker-compose networking see https://docs.docker.com/compose/networking/#specify-custom-networks.

We then had to hard-code the IP addresses for each of the containers (see below):

elasticsearch:
  build: .
  container_name: elasticsearch
  ...
  networks:
    testing_net:
      ipv4_address: 172.28.1.1

localstack:
  image: localstack/localstack:0.11.0
  container_name: localstack
  ...
  networks:
    testing_net:
      ipv4_address: 172.28.1.2

The final piece of the puzzle was to hard-code the default s3 endpoint in the elasticsearch configuration:

s3.client.default.endpoint: 172.28.1.2:4572

After making these changes, elasticsearch was finally able to communicate successfully with our S3 bucket running on localstack, meaning we at last had a local test rig to code and prototype against!
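As a quick end-to-end check, you can create the bucket and register a snapshot repository by hand (the bucket and repository names below are just examples):

# create the target bucket in localstack
awslocal s3 mb s3://es-snapshots

# register an s3 snapshot repository with the local cluster
curl -X PUT "localhost:9200/_snapshot/local_backup" \
  -H 'Content-Type: application/json' \
  -d '{"type": "s3", "settings": {"bucket": "es-snapshots"}}'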

Summary

Having worked through these problems, we now have a local setup of Elasticsearch communicating with S3 that we can prototype against without needing to deploy our Go code into AWS for testing.

This reduced our feedback loop significantly and allowed us to ship the snapshotter in just over two days (around half a day of which went on figuring out the answer to the 401 Unauthorized errors).

If you need to write code that interacts with AWS, I can highly recommend leveraging localstack for local development.
