Up into the Swarm
Last Thursday evening I had the opportunity to give a presentation at the Docker Meetup in Austin TX about how to containerize a Node JS application and deploy it into a Docker Swarm. I also demonstrated techniques that can be used to reduce friction in the development process when using containers.
The meeting was recorded but unfortunately sound only is available after approximately 16 minutes. You might want to just scroll forward to this point.
Video: https://youtu.be/g786WiS5O8A
Slides and code: https://github.com/gnschenker/pets-node
Containers – Cleanup your house revisited
In version 1.13 Docker has added some useful commands to the CLI that make it easier to keep our environment clean. As you might have experienced yourself, over time our development environment gets really cluttered with unused containers, dangling Docker images, abandoned volumes and forgotten networks. All these obsolete items take away precious resources and ultimately lead to an unusable environment. In a previous post I have shown how we can keep our house clean by using various commands like
docker rm -f $(docker ps -aq)
to forcibly remove all running, stopped and terminated containers. Similarly we learned commands that allowed us to remove dangling images, networks and volumes.
Although the commands I described solved the problem they were proprietary, verbose or difficult to remember. The new commands introduced are straightforward and easy to use. Let's try them out.
If you like this article then you can find more posts about containers in general and Docker in specific in this table of content.
Management Commands
To un-clutter the CLI a bit Docker 1.13 introduces new management commands. The list of those is
- system
- container
- image
- plugin
- secret
Older versions of Docker already had network, node, service, swarm and volume.
These new commands group subcommands that were previously directly implemented as root commands. Let me give an example
docker exec -it [container-name] [some-command]
The exec command is now a subcommand under container. Thus the equivalent of the above command is
docker container exec -it [container-name] [some-command]
I would assume that for reasons of backwards compatibility the old syntax will stick around with us for the time being.
Docker System
There is a new management command system. It has 4 possible subcommands: df, events, info and prune. The command
docker system df
gives us an overview of the overall disk usage of Docker. This includes images, containers and (local) volumes. So we can now at any time stay informed about how many resources Docker consumes.
If the previous command shows us that we're using too much space we might as well start to clean up. Our next command does exactly that. It is a do-it-all type of command
docker system prune
This command removes everything that is currently not used, and it does it in the correct sequence so that a maximum outcome is achieved. First unused containers are removed, then volumes and networks and finally dangling images. We have to confirm the operation though by answering with y. If we want to use this command in a script we can use the parameter --force or -f to instruct Docker not to ask for confirmation.
Docker Container
We already know many of the subcommands of docker container. They were previously (and still are) direct subcommands of docker. We can get the full list of subcommands like this
docker container --help
In the list we find again a prune command. If we use it we only remove unused containers. Thus the command is much more limited than the docker system prune command that we introduced in the previous section. Using the --force or -f flag we can again instruct the CLI not to ask us for confirmation
docker container prune --force
Docker Network
As you might expect, we now also have a prune command here.
docker network prune
removes all orphaned networks
Docker Volume
And once again we find a new prune command for volumes too.
docker volume prune
removes all (local) volumes that are not used by at least one container.
Docker Image
Finally we have the new image command which of course gets a prune subcommand too. We have the flag --force that does the same job as in the other samples, and we have a flag --all that not just removes dangling images but all unused ones. Thus
docker image prune --force --all
removes all images that are unused and does not ask us for confirmation.
Summary
Not only has Docker v1.13 brought some needed order into the zoo of Docker commands by introducing the so-called management commands, but we also find some very helpful commands to clean up our environment from orphaned items. My favorite command will most probably be docker system prune as I always like an uncluttered environment.
Docker and Swarmkit – Part 6 – New Features of v1.13
In a few days version 1.13 of Docker will be released and among other things it contains a lot of new features for the Docker Swarmkit. In this post I want to explore some of these new capabilities.
In the last few parts of this series of posts about the Docker Swarmkit we have used version 1.12.x of Docker. You can find those posts here
Part 1, Part 2, Part 3, Part 4 and Part 5
For a full index of all Docker related posts please refer to this post
Preparing for Version 1.13
First we need to prepare our system to use Docker 1.13. I will be using VirtualBox and the Boot2Docker ISO to demonstrate the new features. This is what I have done to get going. Note that at the time of this writing Docker just released Docker v1.13 rc2.
First I am going to install the newest version of docker-machine on my Mac. The binaries can be downloaded from here. In my case the package I download is docker-machine-Darwin-x86_64 v0.9.0-rc1.
From the download folder move the binaries to the target folder
mv ~/Downloads/docker-machine-Darwin-x86_64 /usr/local/bin/docker-machine
and then make it executable
chmod +x /usr/local/bin/docker-machine
finally we can double check that we have the expected version
docker-machine -v
and in my case I see this
docker-machine version 0.9.0-rc1, build ed849a7
Now let's download the newest boot2docker.iso image. At the time of this writing it is v1.13rc2. We can get it from here. Once downloaded move the image to the correct location
mv ~/Downloads/boot2docker.iso ~/.docker/machine/cache/
And we’re ready to go…
Creating a Docker Swarm
Preparing the Nodes
Now we can create a new swarm with Docker at version 1.13. We use the very same approach as described in part x of this series. Please read that post for more details.
Let's clean up any pre-existing nodes called node1, node2, …, nodeX with e.g. the following command
for n in $(seq 1 5); do
docker-machine rm node$n
done;
and then we create 5 new nodes with Docker version 1.13rc2
for n in $(seq 1 5); do
docker-machine create --driver virtualbox node$n
done;
Once this is done (takes about 2 minutes or so) we can double check the result
docker-machine ls
which in my case shows this
Now we can SSH into node1
docker-machine ssh node1
and we should see this
and indeed, we now have a Docker host running at version 1.13.0-rc2.
Creating the Swarm
Now let's first initialize a swarm. node1 will be the leader, node2 and node3 will be additional master nodes, whilst node4 and node5 will be worker nodes (make sure you are in a terminal on your Mac).
First let’s get the IP address of the future swarm leader
export leader_ip=`docker-machine ip node1`
Then we can initialize the swarm
docker-machine ssh node1 docker swarm init --advertise-addr $leader_ip
Now let’s get the swarm join token for a worker node
export token=`docker-machine ssh node1 docker swarm join-token worker -q`
We can now use this token to have the other 4 nodes join as worker nodes
for n in $(seq 2 5); do
docker-machine ssh node$n docker swarm join --token $token $leader_ip:2377
done;
what we should see is this
Let’s promote nodes 2 and 3 to masters
docker-machine ssh node1 docker node promote node2 node3
And to make sure everything is as expected we can list all nodes on the leader
docker-machine ssh node1 docker node ls
In my case I see this
Adding everything to one script
We can now aggregate all snippets into one single script which makes it really easy in the future to create a swarm from scratch
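The aggregated script was shown as an embedded gist in the original post; the following is a minimal sketch assembled from the snippets above (it assumes docker-machine with the VirtualBox driver and the node names node1 to node5 used throughout this post)
#!/usr/bin/env bash
# Recreate a 5-node swarm (3 masters, 2 workers) from scratch
for n in $(seq 1 5); do
  docker-machine rm -f node$n
done
for n in $(seq 1 5); do
  docker-machine create --driver virtualbox node$n
done
# initialize the swarm on node1
export leader_ip=$(docker-machine ip node1)
docker-machine ssh node1 docker swarm init --advertise-addr $leader_ip
# join the remaining nodes as workers
export token=$(docker-machine ssh node1 docker swarm join-token worker -q)
for n in $(seq 2 5); do
  docker-machine ssh node$n docker swarm join --token $token $leader_ip:2377
done
# promote node2 and node3 to masters
docker-machine ssh node1 docker node promote node2 node3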
Analyzing the new Features
Secrets
One of the probably most requested features is support for secrets managed by the swarm. Docker supports a new command secret for this. We can create, remove, inspect and list secrets in the swarm. Let's try to create a new secret
echo '1admin2' | docker secret create 'MYSQL_PASSWORD'
The value/content of a secret is provided via stdin. In this case we pipe it into the command.
When we run a service we can map secrets into the container using the --secret flag. Each secret is mapped as a file into the container at /run/secrets. Thus, if we run a service like this
docker service create --name mysql --secret MYSQL_PASSWORD \
mysql:latest ls /run/secrets
and then observe the logs of the service (details on how to use logs see below)
docker service logs mysql
we should see this
The content of each file corresponds to the value of the secret.
Publish a Port
When creating an new service and want to publish a port we can now instead of only using the somewhat condensed --publish
flag use the new --port
flag which uses a more descriptive syntax (also called ‘csv’ syntax)
docker service create --name nginx --port mode=ingress,target=80,published=8080,protocol=tcp nginx
In my opinion, although the syntax is more verbose it makes things less confusing. With the old syntax people often forgot in which order the target and the published port have to be declared. Now it is evident without having to consult the documentation each time.
Attachable Network support
Previously it was not possible for containers that were run in classical mode (via docker run ...) to run on the same network as a service. With version 1.13 Docker has introduced the flag --attachable for the network create command. This will allow us to run services and individual containers on the same network. Let's try that and create such a network called web
docker network create --attachable --driver overlay web
and let's run Nginx as a service on this network
docker service create --name nginx --network web nginx:latest
and then we run a conventional container on this network that tries to access the Nginx service. First we run it without attaching it to the web network
docker run --rm -it appropriate/curl nginx
and the result is as expected, a failure
And now let's try the same again but this time we attach the container to the web network
docker run --rm -it --network web appropriate/curl nginx
Run Docker Daemon in experimental mode
In version 1.13 the experimental features are now part of the standard binaries and can be enabled by running the daemon with the --experimental flag. Let's do just that. First we need to change the dockerd profile and add the flag
docker-machine ssh node1 -t sudo vi /var/lib/boot2docker/profile
add the --experimental flag to the EXTRA_ARGS variable. In my case the file looks like this after the modification
EXTRA_ARGS='
--label provider=virtualbox
--experimental
'
CACERT=/var/lib/boot2docker/ca.pem
DOCKER_HOST='-H tcp://0.0.0.0:2376'
DOCKER_STORAGE=aufs
DOCKER_TLS=auto
SERVERKEY=/var/lib/boot2docker/server-key.pem
SERVERCERT=/var/lib/boot2docker/server.pem
Save the changes and reboot the leader node
docker-machine stop node1
docker-machine start node1
After the node is ready SSH into it
docker-machine ssh node1
Aggregated logs of a service (experimental!)
In this release we can now easily get the aggregated logs of all tasks of a given service in a swarm. That is neat. Let's quickly try that. First we need to run Docker in experimental mode on the node where we execute all commands. Just follow the steps in the previous section.
Now let's create a sample service and run 3 instances (tasks) of it. We will be using Redis in this particular case, but any other service should work.
docker service create --name redis --replicas 3 redis:latest
after giving the service some time to initialize and run the tasks we can now output the aggregated log
docker service logs redis
and we should see something like this (I am just showing the first few lines)
We can clearly see how the output is aggregated from the 3 tasks running on nodes 3, 4 and 5. This is a huge improvement IMHO and I can’t wait until it is part of the stable release.
Summary
In this post we have created a Docker swarm on VirtualBox using the new version 1.13.0-rc2 of Docker. This new release offers many new and exciting features. In this post I have concentrated on some of the features concerning the Swarmkit. My post is getting too long and I have still so many interesting new features to explore. I will do that in my next post. Stay tuned.
Docker and SwarmKit – Part 5 – going deep
In this post we will work with the SwarmKit directly and not use the Docker CLI to access it. For that we have to first build the necessary components from source which we can find on GitHub.
You can find the links to the previous 4 parts of this series here. There you will also find links to my other container related posts.
Build the infrastructure
Once again we will use VirtualBox to create a few virtual machines that will be the members of our cluster. First make sure that you have no existing VMs called nodeX where X is a number between 1 and 5. Otherwise use docker-machine rm nodeX to remove the corresponding nodes. Once we're ready to go let's build 5 VMs with this command
for n in $(seq 1 5); do
docker-machine create --driver virtualbox node$n
done;
As always, building the infrastructure is the most time consuming task by far. On my laptop the above command takes a couple of minutes. The equivalent on say AWS or Azure would also take a few minutes.
Luckily we don’t have to do that very often. On the other hand, what I just said sounds a bit silly if you’re an oldie like me. I still remember the days when we had to wait weeks to get a new VM or even worse months to get a new physical server. So, we are totally spoiled. (Rant)
Once the VMs are built use
docker-machine ls
to verify that all machines are up and running as expected
Build SwarmKit Binaries
To build the binaries of the SwarmKit we can either use an existing Go environment on our laptop and follow the instructions here, or use the golang Docker container to build the binaries inside a container without the need to have Go natively installed.
We can SSH into node1 which later should become the leader of the swarm.
docker-machine ssh node1
On our leader we first create a new directory, e.g.
mkdir swarmkit
now cd into the swarmkit folder
cd swarmkit
we then clone the source from GitHub using Go
docker run --rm -t -v $(pwd):/go golang:1.7 go get -d github.com/docker/swarmkit
this will put the source under the directory /go/src/github.com/docker/swarmkit inside the container (which corresponds to ~/swarmkit/src/github.com/docker/swarmkit on the host). Finally we can build the binaries, again using the Go container
docker run --rm -t \
-v $(pwd):/go \
-w /go/src/github.com/docker/swarmkit \
golang:1.7 bash -c "make binaries"
We should see something like this
and voilà, you should find the binaries in the subfolder bin of the swarmkit folder.
Using the SwarmCtl Utility
To make swarmd and swarmctl available everywhere we can create symlinks to these two binaries in the /usr/bin folder
sudo ln -s ~/swarmkit/src/github.com/docker/swarmkit/bin/swarmd /usr/bin/swarmd
sudo ln -s ~/swarmkit/src/github.com/docker/swarmkit/bin/swarmctl /usr/bin/swarmctl
now we can test the tool by entering
swarmctl version
and we should see something along the lines of
swarmctl github.com/docker/swarmkit v1.12.0-714-gefd44df
Create a Swarm
Initializing the Swarm
Similar to what we were doing in part 1 we need to first initialize a swarm. Still logged in to node1 we can execute this command to do so
swarmd -d /tmp/node1 --listen-control-api /tmp/node1/swarm.sock --hostname node1
Let's open a new ssh session to node1 and assign the socket of the swarm to the environment variable SWARM_SOCKET
export SWARM_SOCKET=/tmp/node1/swarm.sock
Now we can use swarmctl to inspect the swarm
swarmctl cluster inspect default
and we should see something along the lines of
Please note the two swarm tokens that we see at the end of the output above. We will be using those tokens to join the other VMs (we call them nodes) to the swarm either as master or as worker nodes. We have a token for each role.
Copy Swarmkit Binaries
To copy the swarm binaries (swarmctl and swarmd) to all the other nodes we can use this command
for n in $(seq 2 5); do
docker-machine scp node1:swarmkit/src/github.com/docker/swarmkit/bin/swarmd node$n:/home/docker/
docker-machine scp node1:swarmkit/src/github.com/docker/swarmkit/bin/swarmctl node$n:/home/docker/
done;
Joining Worker Nodes
Now let's ssh into e.g. node2 and join it to the cluster as a worker node
./swarmd -d /tmp/node2 --hostname node2 --join-addr 192.168.99.100:4242 --join-token <Worker Token>
In my case the <Worker Token> is SWMTKN-1-4jz8msqzu2nwz7c0gtmw7xvfl80wmg2gfei3bzpzg7edlljeh3-285metdzg17jztsflhg0umde8. The join-addr is the IP address of node1 of your setup. You can get it via
docker-machine ip node1
in my case it is 192.168.99.100.
Repeat the same for node3. Make sure to replace node2 with node3 in the join command.
On node1 we can now execute the command
swarmctl node ls
and should see something like this
As you can see, we now have a cluster of 3 nodes with one master (node1) and two workers (node2 and node3). Please join the remaining two nodes 4 and 5 with the same approach as above.
Creating Services
Having a swarm we can now create services and update them using the swarmctl binary. Let's create a service using the nginx image
swarmctl service create --name nginx --image nginx:latest
This will create the service and run one container instance on a node of our cluster. We can use
swarmctl service ls
to list all our services that are defined for this cluster. We should see something like this
If we want to see more specific information about a particular service we can use the inspect command
swarmctl service inspect nginx
and should get a much more detailed output.
We can see a lot of details in the above output. I want to specifically point out the column Node which tells us on which node the nginx container is running. In my case it is node2.
Now if we want to scale this service we can use the update command
swarmctl service update nginx --replicas 2
after a short moment (needed to download the image on the remaining node) we should see this when executing the inspect command again
As expected nginx is now running on two nodes of our cluster.
Summary
In this part we have used the Docker SwarmKit directly to create a swarm and define and run services on this cluster. In the previous posts of this series we have used the Docker CLI to execute the same tasks. But under the hood the CLI just calls or uses the swarmd and swarmctl binaries.
If you are interested in more articles about containers in general and Docker in specific please refer to this index post
How To Bootstrap Angular with Server Side Data
Today I needed to bootstrap our Angular 1.x Single Page Application (SPA) with some server side data. The data that I’m talking of is the set of feature toggles that are defined in our system. We need the value of those feature toggles to configure some of our Angular providers and services. This is the solution I came up with. Note, it has been heavily inspired by a sample found here.
The code to this post can be found here
If you would like to read more of my posts about Angular please refer to this index
The Solution
The main idea is to not let Angular bootstrap itself auto-magically using the ng-app directive but rather to bootstrap the framework explicitly. This looks similar to this: first we define an element in our main page (e.g. index.html) with the id of the application, e.g. app. I would use the element that usually would contain the ng-app directive. Thus it would look along the lines of this
<div id='app'>
...
</div>
And now we can add code like this to our main JavaScript file, e.g. app.js
angular.element(document).ready(function () {
var $injector = angular.injector(['ng','Flags']);
var FeatureFlags = $injector.get('FeatureFlags');
FeatureFlags.init().then(function () {
var el = document.getElementById('app');
angular.bootstrap(el, ['app']);
});
});
That is, as soon as the browser has loaded the DOM and is ready we execute the code above. First we get an Angular injector which contains all the modules that we need to be able to create the service that will load our server side data. In our case the service is called FeatureFlags and is implemented in an Angular module called Flags. We then use the $injector to retrieve/construct an instance of our service. Then we call the init function of the service which asynchronously loads the server data. Since the init method is asynchronous it returns a promise. We now define the success function of the promise which gets called when the data has successfully been loaded from the server. Inside this success function we identify the element with id=='app' in the DOM and use it to bootstrap Angular. Note that we also declare in the bootstrap function that our main Angular module is app.
It is important to notice that the $injector that we create in the snippet above is a different instance of the Angular injector than the one Angular itself will use once it is bootstrapped!
The FeatureFlags Service
In a new file, let's call it flags.js, we add the following code
angular.module('Flags',[])
.factory('FeatureFlags', FeatureFlagsService);
This creates a new Angular module Flags with no external dependencies. We then add a factory called FeatureFlags to the module. The implementation of this factory is represented by the function FeatureFlagsService.
Now let's turn to the implementation of the service. In this example we are simulating the server backend by using the $timeout service of Angular and not the $http service that we would typically use to make remote and asynchronous server calls. The $timeout service helps us to make everything asynchronous. Here is the skeleton of the service
function FeatureFlagsService($q, $timeout) {
var service = {
timeoutService: $timeout,
qService: $q
};
service.init = function(){
return this.loadFeatureFlags();
};
service.getFeatureFlags = function(){
return this.features = this.features || window.__otxp_featureFlags__;
};
service.getFeatureFlag = getFeatureFlag;
service.loadFeatureFlags = loadFeatureFlags;
return service;
}
So, the init function uses the loadFeatureFlags function which returns a promise to load the feature flags from the server. Let's look at the implementation of this beauty
function loadFeatureFlags() {
var features = [{
"name": "sso",
"active": 1
},
{
"name": "abc",
"active": 0
}]
return this.timeoutService(function(){
// Avoid clash with other global properties
// by using a "fancy" name
window.__otxp_featureFlags__ = features;
}, 2000);
};
First I'm defining some sample feature toggles (each consisting of a name and an active property). Then I use the $timeout service to asynchronously return those features with a delay of 2000 ms and assign them to a global variable on the window object. I have chosen a "fancy" name to avoid a clash with any other potential global variables.
In the real service we would use the $http service instead of the $timeout service like this
var url = '[url to the server API]';
$http.get(url).then(function(response){
window.__otxp_featureFlags__ = response.data;
});
Assuming that the server returns the feature flags as a JSON formatted object in the response body.
Finally the implementation of the getFeatureFlag function looks like this
function getFeatureFlag(feature) {
var result = this.getFeatureFlags().filter(function(x){ return x.name == feature; })[0];
var featureIsOn = (result === undefined) ? false : result.active != 0;
return featureIsOn;
}
With this we have completely defined our service that is used to asynchronously load the server side data and make it available to us in the Angular application.
The App Module
Now it is time to define the main Angular module. We called it app. Here is the skeleton of it. I have added this code to the file app.js where we also have the Angular bootstrap code
angular.module('app', ['Flags'])
.run(function ($rootScope, FeatureFlags) {
$rootScope.features = FeatureFlags.getFeatureFlags();
$rootScope.done = FeatureFlags.getFeatureFlags() ? 'Booted!' : 'Failed';
})
.provider('Auth', AuthProvider)
.directive('ngLoading', LoadingDirective)
.controller('appCtrl', appCtrl)
Our app module is dependent on the Flags module where we have implemented the FeatureFlags service. In the run function of the module we use this service to retrieve the feature flags and assign them to the features property of the $rootScope.
We also add a provider Auth, a directive ngLoading and a controller appCtrl to the module. As we will see, we will need the feature flags in the definition of the Auth provider. Thus let's start with the implementation of that provider
The Auth Provider
The implementation of the Auth provider, as said above, depends on a feature flag. We have a legacy implementation if the feature flag is OFF and a new implementation if the flag is ON. I have organized the code for this in a file called auth.js.
function AuthProvider(){
return ({
$get: function(FeatureFlags){
var service;
var isOn = FeatureFlags.getFeatureFlag('sso');
if(isOn){
service = AuthFunc();
} else{
service = LegacyAuthFunc();
}
return service;
}
});
}
The provider implements the $get function and in it uses the FeatureFlags service to evaluate whether or not the Single Sign On (sso) feature is enabled. Depending on the setting the provider returns a different implementation of the authentication service. In this simple demo app those implementations look like this
function AuthFunc(){
var service = {};
service.getMessage = function(){
return "I'm the Auth service";
}
return service;
}
function LegacyAuthFunc(){
var service = {};
service.getMessage = function(){
return "I'm the *legacy* Auth service";
}
return service;
}
Finally we come to the reason for all this effort. We want to inject the authentication provider into the controller appCtrl and of course expect to get the correct implementation there. Here is the code for my sample controller
function appCtrl($scope, Auth){
$scope.message = 'Hello: ' + Auth.getMessage();
}
And as we can see when running the application in a browser we get the expected message back from the Auth service depending on the setting of the sso feature flag. The full sample can be found here
Summary
In this post I have shown you how we can use custom bootstrapping for Angular to allow us to use server side data during the bootstrap of the application. I have tried many other options but this seems to be the only reliable and reasonable way I have come up with. Hope this helps. Stay tuned.
If you would like to read more of my posts about Angular please refer to this index
Docker and Swarmkit – Part 4
So far we have experimented with Docker Swarmkit on our local development machine using VirtualBox as our playground. Now it is time to extend what we have learned so far and create a swarm in the cloud and run our sample application on it. No worries if you don’t have a cloud account with resources to do so, you can receive a free 1 year long test account on AWS which will provide you with the necessary resources.
You can find links to the previous 3 parts of this series here. There you will also find links to all my other Docker related posts.
Creating the Swarm
Technically I could build a Docker Swarm from scratch but to make things simple I will be using the new Docker for AWS tool that currently is in private beta. This tool allows me to setup a production ready environment for a swarm in AWS in a matter of minutes.
Docker for AWS is a tool to quickly and easily generate a Docker swarm, that is specifically tailored to AWS. So far it has a sister product Docker for Azure which has the same goal, to create a swarm in the cloud but this time is tailored to Microsoft Azure.
Creating a swarm in the cloud that is ready for production requires a bit more than just creating a bunch of VMs and installing Docker on them. In the case of AWS the tool creates the whole environment for the swarm, comprised of things like a VPC, security groups, IAM policies and roles, auto scaling groups, load balancers and VMs (EC2 instances), to just name the most important elements. If we had to do that ourselves from scratch it could be intimidating, error prone and labor intensive. But no worries, Docker for AWS manages all that for us.
When using Docker for AWS we first have to select the CloudFormation template used to build our swarm
on the next page we have to answer a few questions about our stack and swarm properties. It is all very straightforward, like
- what is the name of the CloudFormation stack to build
- what is the type or size of the VM to use for the nodes of the cluster
- how many master nodes and how many worker nodes shall the swarm consist of
- etc.
The answers to these questions become parameters in a CloudFormation template that Docker has created for us and that will be used to create what's called a Stack.
Note that we are also asked what SSH key to use. We can either use an existing one that we might have created previously or create a new one here. If we create a new SSH key we can download the corresponding *.pem file to a safe place on our computer. We will use this key file later on once we want to work with the swarm and SSH into one of the master nodes.
On the next page we can specify some additional options
Once we have answered all questions we can lean back for a few minutes (exactly 9 minutes in my case) and let AWS create the stack. We can observe the progress of the task on the events tab of the CloudFormation service
If we switch to the EC2 instances page we can see the list of nodes that were created
as expected we have 3 master and 5 worker nodes. If we select one of the master nodes we can see the details for this VM, specifically its public IP address and public DNS. We will use either of them to SSH into this master node later on.
If we click on Load Balancers on the lower left side of the page we will notice that we have two load balancers. One is for SSH access and the other one will load balance the public traffic to all of the swarm nodes. Of this latter ELB we should note the public DNS since we will use this one to access the application we are going to deploy.
Deploying the Dockercoins application
Once the CloudFormation stack has been successfully created it is time to deploy our Dockercoins application to it. For this we need to SSH into one of the 3 master nodes. Let's take one of them. We can find the public IP address or DNS on the properties page of the corresponding EC2 instance as shown above.
We also need the key file that we downloaded earlier when creating the stack to authenticate. With the following command we can now SSH to the leader node
ssh -i [path-to-key-file] docker@[public ip or DNS]
assuming we had the right key file and the correct IP address or DNS we should see this
we can use uname -a to discover what type of OS we're running on and we should see something similar to this
OK, evidently we are running on Moby Linux which is a heavily customized and stripped down version of Alpine Linux optimized to serve as a container host. That unfortunately also means that we'll not find any useful tools installed on the node other than the Docker engine and CLI. So, there is no cURL, no bash, no git, etc. It is even impossible to use apk to install those tools.
Did I just say that this is "unfortunate"? Shame on me… This is intentional since the nodes of a swarm are not meant to do anything other than reliably host Docker containers. OK, what am I going to do now? I need to execute some commands on the leader, like cloning J. Petazzo's repo with the application, and I need to run a local repository and test it.
Ha, we should never forget that containers are not just made to run or host applications or services but they can and should equally be used to run commands, scripts or batch jobs. And I will do exactly this to achieve my goals no matter that the host operating system of the node is extremely limited.
First let us have a look and see how our swarm is built
docker node ls
And we can see the list of a total of 8 nodes of which 3 are of type master. The 3rd last in the list is our swarm leader. Next let's clone the repo. For this we'll run a container that already has git installed. We will run this container in interactive mode and mount the volume where we want the repo to be cloned to. Execute this command
docker run --rm -it -v $(pwd):/src -w /src python:2.7 \
git clone https://github.com/jpetazzo/orchestration-workshop.git
After this command has executed we should find a folder orchestration-workshop in the current directory on our swarm leader which contains the content of the cloned repository.
Next let’s run the Docker repository on the swarm similar as we did in our local swarm.
docker service create --name registry --publish 5000:5000 registry:2
We can use cURL to test whether the registry is running and accessible
curl localhost:5000/v2/_catalog
but wait a second, cURL is not installed on the node, so what are we going to do now? No worries, we can run an Alpine container, install curl in it and execute the above command. Hold on a second, how will that work? We are using localhost in the above command, but if we're executing curl inside a container, localhost there means local to the container and not local to the host. Hmmm…
Luckily Docker provides us an option to overcome this obstacle. We can run our container and attach it to the so called host network. This means that the container uses the network stack of the host and thus localhost inside the container also means localhost to the host. Great! So execute this
docker run --rm -it --net host alpine /bin/sh
now inside the container execute
apk update && apk add curl
and finally execute
curl localhost:5000/v2/_catalog
Oh no, what's this? We don't get any result.
Turns out that localhost is not mapped to the loopback address 127.0.0.1. So let's just try to use the loopback address directly
curl 127.0.0.1:5000/v2/_catalog
and this indeed works. Great, we have learned a great deal. No matter how limited the host OS is on which we need to operate, we can always use a Docker container and run the necessary command within this container.
So, now we have the repository running and we can build and push the images for all four services webui, worker, hasher and rng. We can use the same code we used in part 3 of this series. We just need to use the loopback address instead of localhost.
cd orchestration-workshop/dockercoins
REGISTRY=127.0.0.1:5000
TAG=v0.1
for SERVICE in rng hasher worker webui; do
docker build -t $SERVICE $SERVICE
docker tag $SERVICE $REGISTRY/$SERVICE:$TAG
docker push $REGISTRY/$SERVICE:$TAG
done;
After this we can again use the technique described above to curl our repository. Now we should see the 4 services that we just built
We have run the above build command directly on the host. Let’s assume we couldn’t do that for some reason. We could then run it inside a container again. Since we’re dealing with Docker commands we can use the official Docker image and use this command
docker run --rm -it --net host \
-v /var/run/docker.sock:/var/run/docker.sock \
docker /bin/sh
note how we run the container again on the host network to use the network stack of the host inside the container, and how we mount the Docker socket to have access to Docker running on the host.
Now we can run the above script inside the container; neat.
It’s time to create an overlay network on which we will run the application
docker network create dockercoins --driver overlay
and we then have
now we run redis as our data store
docker service create --name redis --network dockercoins redis
and finally we run all 4 services
REGISTRY=127.0.0.1:5000
TAG=v0.1
for SERVICE in webui worker hasher rng; do
docker service create --name $SERVICE --network dockercoins $REGISTRY/$SERVICE:$TAG
done
once again we need to update the webui service and publish a port
docker service update --publish-add 8080:80 webui
Let’s see whether our application is working correctly and mining Docker coins. For this we need to determine the DNS (or public IP address) of the load balancer in front of our swarm (ELB). We have described how to do this earlier in this post. So let’s open a browser and use this public DNS. We should see our mining dashboard
Scaling a Service
Now that we have seen that the application runs just fine we can scale our services to a) make them highly available and b) increase the throughput of the application.
for SERVICE in webui worker hasher rng; do
docker service update --replicas=3 $SERVICE
done
The scaling up takes a minute or so and during this time we might see the following when listing all services
and also in the UI we’ll see the effect of scaling up. We get a 3-fold throughput.
Updating a Service
To witness a rolling update (with zero downtime) of a service let's make a minor code change in the rng service. Let's decrease the sleep time in the rng.py file from 100 ms to 50 ms. How exactly to do this modification I leave up to you, dear reader, as an exercise. Just a little hint: use a container…
Once done with the modification let's build and push the new version of the rng service
REGISTRY=127.0.0.1:5000
TAG=v0.2
SERVICE=rng
docker build -t $SERVICE $SERVICE
docker tag $SERVICE $REGISTRY/$SERVICE:$TAG
docker push $REGISTRY/$SERVICE:$TAG
and then trigger the rolling update with
docker service update --image $REGISTRY/$SERVICE:$TAG $SERVICE
confirm that the service has been updated by executing
docker service ps rng
and you should see something similar to this
We can clearly see how the rolling update is happening to avoid any downtime. In the image above we see that rng.1 has been updated and the new version is running, while rng.3 is currently starting the new version and rng.2 has not yet been updated.
Chaos in the Swarm
Let's see how the swarm reacts when bad things happen. Let's try to kill one of the nodes that has running containers on it. In my case I take the node ip-192-168-33-226.us-west-2.compute.internal since it has at least rng-1 running on it as we know from the above image.
After stopping the corresponding EC2 instance it takes only a second for the swarm to re-deploy the service instances that had been running on this node to another node as we can see from the following picture.
Note how rng-1 and rng-2 have been redeployed to nodes ip-192-168-33-224.us-west-2.compute.internal and ip-192-168-33-225.us-west-2.compute.internal respectively.
And what about the swarm as a whole. Does it auto-heal? Let’s have a look
Note how node ip-192-168-33-226.us-west-2.compute.internal is marked as down and how we have a new node ip-192-168-33-135.us-west-2.compute.internal in the swarm. Neat.
Summary
In this part of my series about the Docker Swarmkit we have created a swarm in the cloud, more precisely in AWS, using the tool Docker for AWS which is currently in private beta. We then cloned the repository with the sample application to the leader of the swarm masters, built all images there and pushed them to a repository we ran in the swarm. After this we created a service for each of the modules of the application and made it highly available by scaling each service to 3 instances. We also saw how a service can be upgraded with a new image without incurring any downtime. Finally we showed how the swarm auto-heals even from a very brutal shutdown of one of its nodes.
Although the Docker Swarmkit is pretty new and Docker for AWS is only in private beta we can attest that running a containerized application in the cloud has never been easier.
Docker and Swarm Mode – Part 3
Refresher
In part 1 we have created a swarm of 5 nodes of which we defined 3 to be master nodes and the remaining ones worker nodes. Then we deployed the open source version of the Docker Registry v2 in our swarm. On node1 of our swarm we cloned the GitHub repository of Jerome Petazzo containing the dockercoins application that mines Docker coins and consists of 4 services rng, hasher, worker and webui. We then created images for the 4 services and pushed them to our local registry listening at port 5000. Normally the Docker Registry wants us to communicate via TLS, but to make it simple we use it on localhost:5000. When using the registry on localhost the communication is in plain text and no TLS encryption is needed. By defining the registry service to publish port 5000, each node in the swarm can now use localhost:5000 to access the registry, even if the registry itself is running on a different node. In this case the swarm will automatically forward the call to the correct node.
If on any node we execute the following command
curl localhost:5000/v2/_catalog
we should see something similar to this
In part 2 we then learned about services, tasks and software defined networks and how they are related.
Now it is time to use all what we have learned so far and get our mining application up and running.
Running the Mining Application
When we want to run an application in a swarm we first want to define a network. The services will then be running on this network. The type of network has to be overlay so that our application can span all the nodes of the swarm. Let’s do that. We call our network dockercoins
docker network create dockercoins --driver overlay
We can double check that it has been created by using this command
docker network ls
which lists all networks visible to the node on which I am (node1 in this case). In my case it looks like this and we can see the newly created network in the list
Next we are going to run the Redis service which is used as the storage backend for our mining application. We should already be familiar with how to do that after reading part 2.
docker service create --name redis --network dockercoins redis
Please note how we place the service onto the dockercoins network by using the --network parameter.
After this we run all the other services. To simplify things and avoid repetitive typing we can use a for loop
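The loop was embedded as a gist in the original post; a minimal sketch, assuming the local registry at localhost:5000 and the image tag v0.1 used when building the images, could look like this
REGISTRY=localhost:5000
TAG=v0.1
# create one service per image, all attached to the dockercoins network
for SERVICE in rng hasher worker webui; do
  docker service create --name $SERVICE --network dockercoins $REGISTRY/$SERVICE:$TAG
done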
After running this and waiting for a short moment we should see the following when listing all services with docker service ls
The column replicas in the above image shows 1/1 for each service which indicates that all is good. If there was a problem with any of the services we would see something like 0/1, which indicates that the desired number of instances of the service is 1 but the number of running instances is zero.
If we want to see the details of each service we could now use the docker service ps command for each service. This is kind of tedious and thus a better solution is to use some combined command
The output of this for me looks like this
Agreed, it looks a bit messy, but at least I have all the necessary information in one place with a simple command. I expect that Docker will extend the docker service command with some more global capabilities, but for now we have to hack our own commands together.
In the above output we can see that each service runs in a single container and the containers are distributed across all the nodes of the swarm, e.g. redis runs on node3 and the worker service on node5.
If we wanted to watch our application start up we could just put the above command as an argument into a watch statement
watch "docker service ls -q | xargs -n1 docker service ps"
which is useful for situations where the individual services need a bit more time to initialize than the simple mining services.
We have one little problem left. As is, the webui service is not accessible from the outside since it has no published port. We can change that by using the update command for a Docker service. If we want to publish the internal port 80 to the host port 8080 we have to do this
docker service update --publish-add 8080:80 webui
After this our service is reachable from the outside. We could also have chosen a more radical way and re-created the service by destroying and creating it again with a --publish 8080:80 statement.
By choosing the update command we instructed the scheduler (Docker Swarm) to terminate the old version of the service and run the updated one instead.
If our service would have been scaled out to more than one instance then the swarm would have done a rolling update.
Now we can open a browser and connect to ANY of the nodes of our swarm on port 8080 and we should see the Web UI. Let's do this. In my case webui is running on node1 with IP address 192.168.99.100 and thus I'll try to connect to say node2 with IP address 192.168.99.101.
And indeed I see this
Load Balancer
Now in a production system we would not want anyone from the internet to hit the webui service directly but would want to place the service behind a load balancer, e.g. an ELB if running in AWS. The load balancer would then forward the request to any of the nodes of the swarm which in turn would reroute it to the node on which webui is running. An image probably helps to clarify the situation
Logging
What can we do if one of our service instances shows a problem? How can we find out what is the root cause of the problem? We could technically ssh into the swarm node on which the problematic container is running and then use the docker logs [container ID] command to get the details. But this of course is not a scalable solution. There must be a better way of getting insight into our application. The answer is log aggregation. We want to collect the log output of each container and redirect it to a central location, e.g. in the cloud.
Commercial Offerings
There are many services that offer just that, some of them being Logentries, SumoLogic, Splunk, Loggly, to just name a few.
Let’s take Logentries as a sample. The company provides a Docker image that we can use to create a container running on each node of the swarm. This container hooks into the event stream of Docker Engine and forwards all event messages to a pre-defined endpoint in the cloud. We can then use the Web client of Logentries to slice and dice the aggregated information and easily find what we’re looking for.
If you do not yet have an account with Logentries you can easily create a 30-days trial account as I did. Once you have created the account you can define a new Log Set by clicking on + Add New
In the following dialog when asked to Select How To Send Your Logs select Docker and then in step 2 define the name of the new log set. I called mine my-log-set. In this step you will also generate a token that you will be using when running the log container. A token has this form
a62dc88a-xxxx-xxxx-xxxx-a1fee4df9557
Once we’re done with the configuration we can execute the following command to start an instance of the Logentries container
docker run -d -v /var/run/docker.sock:/var/run/docker.sock logentries/docker-logentries -t [your-token] -j
If we do this then the container will run on the current node of the swarm and collect and forward all its information. That's not exactly what we want though! We want to run an instance of the container on each and every node. Thus we use the feature of a global service
docker service create --name log --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock --mode global logentries/docker-logentries -t [your-token] -j
After a short period of time we should have an instance of the Logentries container running on each node and collecting log information. To verify this just ssh into any node of the swarm and run an instance of busybox, e.g. something like
docker run --rm -it busybox echo "Hello world"
while you have Logentries running in Live tail mode. You should see something similar to this
In the above image we can see an entry in the log for each event generated by Docker during the life-cycle of the busybox container.
Logging with an ELK Stack
If we want to run our own log aggregator then we can use the so called ELK stack (ELK = Elasticsearch, Logstash and Kibana). We only really need to configure Logstash; the other two services run with defaults.
First we create a network just for logging
docker network create --driver overlay logging
now we can create the service for Elasticsearch
docker service create --network logging --name elasticsearch elasticsearch
Then we will define a service for Kibana. Kibana needs to know where Elasticsearch is found, thus we need a tiny bit more configuration information
docker service create --network logging --name kibana --publish 5601:5601 -e ELASTICSEARCH_URL=http://elasticsearch:9200 kibana
Note how we use the integrated DNS service to locate the Elasticsearch service via its name in http://elasticsearch:9200.
Finally we need a service for Logstash
docker service create --network logging --name logstash -p 12201:12201/udp logstash -e "$(cat ~/orchestration-workshop/elk/logstash.conf)"
As you can see Logstash needs a configuration which we get from the logstash.conf file that is part of our repository. Also we use the GELF protocol for logging which uses port 12201/udp.
To see what Logstash is reporting we can locate the Logstash container with docker service ps logstash and then ssh into the corresponding node and use
docker logs --follow [container id]
where [container id] corresponds to the ID of the Logstash container (the ID we can get via docker ps on the node).
To generate/send a (sample) log message we can e.g. use the following command
docker run --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 --rm busybox echo hello
Now we can update all our services to use the ELK stack with this command
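The update command was embedded in the original post; a sketch, assuming the four service names used above and a Docker version whose docker service update supports the --log-driver and --log-opt flags, could be
for SERVICE in webui worker hasher rng; do
  # switch the service's logging driver to GELF, pointing at the Logstash endpoint
  docker service update --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 $SERVICE
done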
Finally we can open the browser at the IP of one of our nodes and port 5601 (e.g. http://192.168.99.101:5601) to see Kibana. Click on the top level menu "Discover" to see the incoming logs. You might want to change the time window and the refresh interval in the top right of the screen to say last 1 hour and every 5 sec.
Summary
In this post I have shown how we can deploy and run an application consisting of multiple services. Once an application runs in production it needs to be monitored. This requires, among other things, that we collect all the log output of all our containers to be aggregated in a central location. I have shown how we can use one of the commercial SaaS offerings to do exactly that and also how we can run our own ELK stack instead. In part 4 I will be showing how we can further automate the deployment of services and the subsequent upgrade to new versions without incurring any downtime.
Use Docker to build, test and push your Artifacts
Sue is a software engineer at BetterSoft. She is in charge of starting a new project which includes building up the CI/CD pipeline for the new application her team will create. The company has established some standards and she knows she has to comply with those. The build server the company is using for all the products is a cloud based SaaS offering. The company's DevOps team is responsible for managing the build server, specifically its build agents. The other teams in the company are using Maven to build artifacts. Preferably the build artifacts are Docker images, and Sue's plan is to use Docker too, to package and run the new application. While mostly leveraging the techniques used by other teams in their respective CI/CD pipelines, Sue soon runs into some problems. Her build needs some special configuration of the build agent. Although the DevOps team is very friendly and helpful, she has to file a ticket to get the task done, since DevOps is totally overbooked with work for other projects that have a higher priority in the company. After two days the ticket is finally addressed and closed and Sue can continue with her task. But while testing her build process, Sue stumbles across some other hurdles which require a DevOps engineer to SSH into the build agent and manually solve the issue. Some files have been mysteriously locked and the CI server cannot remove them. Thus any further build is failing.
Does this sound somewhat familiar to you? If yes, then I hope this post is going to give you some tools and patterns on how to get your independence back and fully own the whole CI/CD process end to end. I want to show you how the use of containers can reduce the friction and bring CI (and CD) to the next level.
To read more of my posts about Docker please refer to this index.
What's wrong with normal CI?
- we have to specially configure our CI server and its build agents
- we are dependent on some assistance from Ops or DevOps
- we cannot easily scale our build agents
- builds can change the state of the build agent and negatively impact subsequent builds
- building locally on the developer machine is not identical to building on a build agent of the CI server
- etc.
Containerize the build
These days our build artifacts most probably will be Docker images. So, if we want to run our build inside a container we need to have the Docker CLI available in the container. If this is the only major requirement and we’re only using some scripts for other build related tasks we can use the official Docker image. To get this image and cache it locally do
docker pull docker
To demonstrate how this is working we can now run an instance of the Docker image like this
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock docker /bin/sh
and we’ll find ourselves in a bash session inside the container. Note how we mount the Docker socket from the host into the container to get access to the Docker engine. We can then execute any Docker command the same way as we are doing it directly on the host, e.g.
docker images
Doing this we should see the list of all the Docker images in the cache of the host. Similarly we can build and run images from within our container that will live and be executed on the host. A sample run command could be
docker run --rm -it busybox echo "Hello World"
which will execute an instance of the busybox container image in the context of the host.
Cool! That was easy enough, we might say … so what? Well it is actually quite important because it really opens us the door to do some more advanced stuff.
Build the Artifact
Let's say we have a Python-Flask project that we want to build. You can find the code here. The Dockerfile looks like this
From the root of the project execute this command
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v $(pwd):/app -w /app docker docker build -t myapi .
This will build a Docker image called myapi. After the build is done the container is terminated and removed, but our image is still around, sitting in the cache of the Docker host.
Now, building alone does not do the job. There are a few more tasks that we want to execute. Thus instead of running one command in a container at a time it is much better to run a whole script. Let's do this and create a new file called builder.sh in the root of the project. To this file we will add all our CI code. We also want to make this file executable
chmod +x ./builder.sh
So, what's inside our builder.sh file? To start with we just add the docker build command
docker build -t myapi .
And then we can modify the above Docker command to look like this
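The modified command was shown as a gist in the original post; a sketch, reusing the mounts from above and assuming builder.sh sits in the project root, might be
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v $(pwd):/app -w /app docker ./builder.sh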
This will give us the very same result as before, but now we have the possibility to extend the builder.sh file without changing anything in the docker run command.
Test the Artifact
The first thing we normally want to do is to run some tests against our built artifact. This normally means to run an instance of the built container image and execute a special (test-) script in the container instead of the actual starting command.
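The test snippet was embedded as a gist in the original post; a sketch, assuming the image built above and a hypothetical test.sh script baked into the image, could look like this
IMAGE=myapi
# run the freshly built image, overriding the start command with the (hypothetical) test script
docker run --rm $IMAGE ./test.sh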
You might have noticed that I started to use variables in my script. This makes the whole thing more flexible as we will see further down.
Tag and Push image
Once we have successfully built and tested our artifact we are ready to push it to the repository. In this case I will use Docker Hub. But before we can push an image we have to tag it. Let's add the following snippet to our builder.sh script
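The tagging snippet was embedded as a gist; a sketch using the ACCOUNT, IMAGE and TAG variables that appear later in this post (the values here are placeholders) could be
ACCOUNT=gnschenker   # replace with your Docker Hub account
IMAGE=myapi
TAG=1.0
docker tag $IMAGE $ACCOUNT/$IMAGE:$TAG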
Before we can push the images to Docker Hub we need to authenticate/login. We can do that directly on our host using docker login and providing username and password. Docker will then store our credentials in $HOME/.docker/config.json. To use these credentials in the container we can map the folder $HOME/.docker to /root/.docker since inside the container we're executing as root. Thus our modified docker run command will look like this
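Again the exact command was a gist; a sketch including the credentials mount would be
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v $HOME/.docker:/root/.docker -v $(pwd):/app -w /app docker ./builder.sh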
Finally, after having taken care of the credentials, we can push the images to the repository by adding this snippet to the builder.sh script
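Sketched with the same variables as before, the push step is simply
docker push $ACCOUNT/$IMAGE:$TAG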
and we’re done.
Generalizing for re-use
Wouldn't it be nice if we could reuse this pattern for all our projects? Sure, we can do that. First we build our own builder image that already contains the necessary builder script and add environment variables to the container that can be modified when running the container. The Dockerfile for our builder looks like this
and the builder.sh looks like this
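Both files were embedded as gists in the original post. The Dockerfile essentially starts from the official docker image, copies builder.sh into it and declares it as the command; the generalized builder.sh, driven by the ACCOUNT, IMAGE and TAG environment variables, might look roughly like this sketch (not the author's exact script)
#!/bin/sh
set -e
# build the image for the current project
docker build -t $IMAGE .
# run the tests (hypothetical test.sh inside the image)
docker run --rm $IMAGE ./test.sh
# tag and push the artifact
docker tag $IMAGE $ACCOUNT/$IMAGE:$TAG
docker push $ACCOUNT/$IMAGE:$TAG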
We can now build this image
docker build -t builder .
To be able to not only use this image locally but also on the CI server we can tag and push the builder image to Docker Hub. In my case this would be achieved with the following commands
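The exact commands are not reproduced here; assuming the author's Docker Hub account gnschenker (used elsewhere in this post), they would look roughly like
docker tag builder gnschenker/builder:latest
docker push gnschenker/builder:latest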
Once this is done we can add a file run.sh to our Python project which contains the overly long docker run command to build, test and push our artifact
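The content of run.sh was embedded as a gist; a sketch, with placeholder values for the environment variables, could be
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v $HOME/.docker:/root/.docker \
  -v $(pwd):/app -w /app \
  -e ACCOUNT=gnschenker -e IMAGE=myapi -e TAG=1.0 \
  gnschenker/builder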
Note how I pass values for the 3 environment variables ACCOUNT, IMAGE and TAG to the container. They will be used by the builder.sh script.
Once we have done this we can now use the exact same method to build, test and push the artifact on our CI server as we do on our developer machine. In your build process on the CI server just define a task which executes the above Docker run command. The only little change I would suggest is to use the variables of your CI server, e.g. the build number to define the tag for the image. For e.g. Bamboo this could look like this
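The Bamboo task was shown as a screenshot in the original post; a sketch, assuming Bamboo's build variables are exposed to scripts as bamboo_* environment variables, might be
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v $HOME/.docker:/root/.docker \
  -v $(pwd):/app -w /app \
  -e ACCOUNT=gnschenker -e IMAGE=myapi -e TAG=$bamboo_buildNumber \
  gnschenker/builder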
Summary
In this post I have shown how we can use a Docker container to build, test and push an artifact of a project. I really only have scratched the surface of what is possible. We can extend our builder.sh script in many ways to account for much more complex and sophisticated CI processes. As a good sample we can examine the Docker Cloud builder.
Using Docker containers to build, test and push artifacts makes our CI process more robust, repeatable and totally side-effect free. It also gives us more autonomy.
Bulk Delete Queues in AWS
This is a post to myself. Due to a faulty application we have a lot of dead queues in AWS SQS. To get rid of them I wrote the following script that I executed in a container that has the AWS CLI installed
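The script itself is not reproduced here; a minimal sketch (this version flattens the JSON array with the CLI's --query option, which the author may have handled differently, and assumes the dead queues share a common, hypothetical name prefix) could look like this
PREFIX=dead-letter   # hypothetical prefix of the queues to delete
for url in $(aws sqs list-queues --queue-name-prefix $PREFIX --query 'QueueUrls[]' --output text); do
  echo "deleting $url"
  aws sqs delete-queue --queue-url $url
done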
The script is quick and dirty and deals with the fact that the AWS CLI returns the list of queues as a JSON array.
Easing the use of the AWS CLI
This post talks about a little welcome time-saver and how we achieved it by using Docker.
In our company we work a lot with AWS and since we automate everything we use the AWS CLI. To make the usage of the CLI as easy and frictionless as possible we use Docker. Here is the Dockerfile to create a container having the AWS CLI installed
Note that we need to provide the three environment variables AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY set in the container such that the CLI can automatically authenticate with AWS.
Update: a few people rightfully pointed out that one should never ever disclose secrets in public, ever! And I agree 100% with this. In this regard my post was a bit misleading and my "Note:" further down not explicit enough. My fault, I agree. Thus let me say it loudly here: "Do not push any image that contains secrets to a public registry like Docker Hub!" Leave the Dockerfile from above as is without modifications and pass the real values of the secrets when running a container, as command line parameters as shown further down.
Let's build and push this container to Docker Hub
docker build -t gnschenker/awscli .
to push to Docker Hub I of course need to be logged in. I can use docker login
to do so. Now pushing is straight forward
docker push gnschenker/awscli:latest
Note: I do not recommend hard-coding the values of the secret keys into the Dockerfile but passing them as parameters when running the container. Do this
docker run -it --rm -e AWS_DEFAULT_REGION='[your region]' -e AWS_ACCESS_KEY_ID='[your access ID]' -e AWS_SECRET_ACCESS_KEY='[your access key]' gnschenker/awscli:latest
Running the above command you find yourself running in a bash shell inside your container and can use the AWS CLI. Try to type something like this
aws ecs list-clusters
to get a list of all ECS clusters in your account.
To simplify my life I define an alias in my bash profile (file ~/.bash_profile) for the above command. Let's call it awscli.
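The alias definition itself is not shown in the post; a sketch, reusing the docker run command from above (with your own values substituted for the three variables), could be
alias awscli="docker run -it --rm -e AWS_DEFAULT_REGION='[your region]' -e AWS_ACCESS_KEY_ID='[your access ID]' -e AWS_SECRET_ACCESS_KEY='[your access key]' gnschenker/awscli:latest"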
Once I have done that and sourced the profile I can now use the CLI e.g. like this
awscli s3 ls
and I get the list of all S3 buckets defined in my account.
Thanks to the fact that Docker containers are ephemeral by design they are really fast to start up (once you have the Docker image in your local cache), and thus using a container is similar in experience to natively installing the AWS CLI on your machine and using it.