Docker and Swarm Mode – Part 3
In part 1 we have created a swarm of 5 nodes of which we defined 3 to be master nodes and the remaining ones worker nodes. Then we deployed the open source version of the Docker Registry v2 in our swarm. On
node1 of our swarm we cloned the GitHub repository of Jerome Petazzo containing the
dockercoins application that mines Docker coins and consists of 4 services
webui. We then created images for the 4 services and pushed them to our local registry listening at port 5000. Normally the Docker Registry wants us to communicate via TLS but to make it simple we use it on
localhost:5000. when using the registry on localhost the communication is in plain text and no TLS encryption is needed. By defining the registry service to publish
port 5000 each node in the swarm can now use
localhost:5000 to access the registry, even if the registry itself is running on a different node. In this case the swarm will automatically forward the call to the correct node.
If on any node we execute the following command
we should see something similar to this
In part 2 we then learned about
software defined networks and how they are related.
Now it is time to use all what we have learned so far and get our mining application up and running.
Running the Mining Application
When we want to run an application in a swarm we first want to define a network. The services will then be running on this network. The type of network has to be overlay so that our application can span all the nodes of the swarm. Let’s do that. We call our network
docker network create dockercoins --driver overlay
We can double check that it has been created by using this command
docker network ls
which lists all networks visible to the node on which I am (node1 in this case). In my case it looks like this and we can see the newly created network in the list
Next we are going to run the
Redis service which is used as the storage backend for our mining application. We should already be familiar on how to do that after reading part 2.
docker service create --name redis --network dockercoins redis
Please note how we place the service onto the
dockercoins network by using the
After this we run all the other services. To simplify things and avoid repetitive typing we can use a for loop
After running this and waiting for a short moment we should see the following when listing all services with
docker service ls
replicas in the above image shows
1/1 for each service which indicates that all is good. If there was a problem with any of the services we would see something like
0/1, which indicates the desired number of instances of the service is 1 but the number of running instances is zero.
If we want to see the details of each service we could now use the
docker service ps command for each service. This is kind of tedious and thus a better solution is to use some combined command
docker service ls -q | xargs -n1 docker service ps
The output of this for me looks like this
Agreed, it looks a bit messy, but at least I have all the necessary information at one place with a simple command. I expect that Docker will extend the
docker servic command with some more global capabilities but for now we have to hack our own commands together.
In the above output we can see that each service runs in a single container and the containers are distributed accross all the nodes of the swarm, e.g. redis runs on node3 and the worker service on node5.
If we wanted to watch our application to start up we could just put the above command as an argument into a
watch "docker service ls -q | xargs -n1 docker service ps"
which is useful for situations where the individual services need a bit more time to initialize than the simple mining services.
We have one little problem left. As is, the
webui service is not accessible from the outside since it has no published port. We can change that by using the
update command for a Docker service. If we want to publish the internal port 80 to the host
port 8080 we have to do this
docker service update --publish-add 8080:80 webui
After this our service is reachable from the outside. We could also have chosen a more radical way and re-created the service by destroying and creating it again with a
--publish 8080:80 statement.
By choosing the
update command we instructed the scheduler (Docker Swarm) to terminate the old version of the service and run the updated one instead
If our service would have been scaled out to more than one instance then the swarm would have done a rolling update.
Now we can open a browser and connect to ANY of the nodes of our swarm on port
8080 and we should see the Web UI. Let’s do this. In my case
webui is running on
node1 with IP address
192.168.99.100 and thus I’ll try to connect to say
node2 with IP address
And indeed I see this
Now in a production system we would not want anyone from the internet hit the
webui service directly but we would want to place the service behind a load balancer, e.g. an
ELB if running in AWS. The load balancer would then forward the request to any of the nodes of the swarm which in turn would reroute it to the node on which
webui is running. An image probably helps to clarify the situation
What can we do if one of our service instances shows a problem? How can we find out what is the root cause of the problem? We could technically
ssh into the swarm node on which the problematic container is running and then use the
docker logs [container ID] command to get the details. But this of course is not a scalable solution. There must be a better way of getting insight into our application. The answer is log aggregation. We want to collect the log output of each container and redirect it to a central location e.g. in the cloud.
Let’s take Logentries as a sample. The company provides a Docker image that we can use to create a container running on each node of the swarm. This container hooks into the event stream of Docker Engine and forwards all event messages to a pre-defined endpoint in the cloud. We can then use the Web client of Logentries to slice and dice the aggregated information and easily find what we’re looking for.
If you do not yet have an account with Logentries you can easily create a 30-days trial account as I did. Once you have created the account you can define a new Log Set by clicking on + Add New
In the following dialog when asked to Select How To Send Your Logs select Docker and then in step 2 define the name of the new log set. I called mine my-log-set. In this step you will also generate a token that you will be using when running the log container.A token has this form
Once we’re done with the configuration we can execute the following command to start an instance of the Logentries container
docker run -d -v /var/run/docker.sock:/var/run/docker.sock logentries/docker-logentries -t [your-token] -j
If we do this then the container will run on the current node of the swarm and collect and forward all its information. That’s not exactly what we want though! We want to run an instance of the container on each and every node. Thus we use the feature of a
docker service create --name log --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock --mode global logentries/docker-logentries -t [your-token] -j
After a short period of time we should have an instance of the Logentries container running on each node and collecting log information. To verify this just
ssh into any node of the swarm and run an instance of busybox, e.g. something like
docker run --rm -it busybox echo "Hello world"
while you have Logentries running in Live tail mode. You should see something similar to this
In the above image we can see an entry in the log for each event generated by Docker during the life-cycle of the
Logging with an ELK Stack
If we want to run our own log aggregator then we can use the so called ELK stack (ELK = Elastic Search, Logstash and Kibana). We only really need to configure
Logstash, the other two services run with defaults.
First we create a network just for logging
docker network create --driver overlay logging
now we can create the service for Elasticsearch
docker service create --network logging --name elasticsearch elasticsearch
Then we will define a service for
Kibana needs to know where
Elasticsearch is found thus we need a tiny bit more configure information
docker service create --network logging --name kibana --publish 5601:5601 -e ELASTICSEARCH_URL=http://elasticsearch:9200 kibana
Note how we use the integrated DNS service to locate the
Elasticsearch service via its name in
Finally we need a service for
docker service create --network logging --name logstash -p 12201:12201/udp logstash -e "$(cat ~/orchestration-workshop/elk/logstash.conf)"
As you can see
Logstash needs a configuration which we get from the
logstash.conf file that is part of our repository. Also we use the Gelf protocol for logging which uses
To see what
Logstash is reporting we can localize the
Logstash container with
docker service ps logstash and then can
ssh into the corresponding node and use
docker logs --follow [container id]
where [container id] corresponds to the ID of the
Logstash container (the ID we can get via
docker ps on the node).
To generate/send a (sample) log message we can e.g. use the following command
docker run --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 --rm busybox echo hello
Now we can update all our services to use the ELK stack with this command
Finally we can open the Browser at the IP of one of our nodes and
port 5601 (e.g. http://192.168.99.101:5601) to see
Kibana. Click on the top level menu “Discover” to see the incoming logs. You might want to change the time window and the refresh interval in the top right of the screen to say last 1 hour and every 5 sec.
In this post I have shown how we can deploy and run an application consisting of multiple services. Once an application runs in production it needs to be monitored. This requires, among other things, that we collect all the log output of all our containers to be aggregated in a central location. I have shown how we can use one of the commercial SaaS offerings to do exactly that and also how we can run our own ELK stack instead. In part 4 I will be showing how we can further automate the deployment of services and the subsequent upgrade to new versions without incurring any downtime.