Building a Slack bot with Botkit, Node, and Docker


    Recently I have needed to dig deeper into node and docker.  I decided to make a slack bot (easy to do) so that the problem being solved didn’t require additional learning (making bots is fun).  I ended up finding botkit (a node based slack integration) and set off building something simple with the intent of hosting it in Docker.  This took me down an interesting path as I tried to fit all these things together to make something small but useful.

    I ended up creating a FitBot that gives you a random exercise to do every hour.  My bot got way more complex (code wise) than is needed for this discussion.  I figured we might instead build a simple weather bot using the weather underground.  But first we have to get the plumbing working.

    What will you learn from this post?

    1. How to install node on a mac (will point to a windows how-to)
    2. How to use the basics of NPM for node package management
    3. How to write some simple node (javascript) and test it locally
    4. How to create a Slack bot
    5. How to push all of this into a docker container
    6. How to push your docker container over to a bot host
    7. Build a weather bot
    8. What is chatops

    This post is “just enough” to scratch the surface.  It is intended to get you touching all of these concepts just enough that they are no longer scary unknown topics.  There are many pluralsight videos that take you deep into each of these technologies.  Let’s get started.

    Node stuff

    I was most interested to get my hands dirty with node which ultimately led me down this path.  This then led me to finding a slack SDK for node.  Node is great.  Many web devs naturally know javascript.  Taking your client side javascript skills to the server makes a lot of sense.  Let’s start by getting node up and running.

    Node pre-requisites for mac

    As I started my journey for getting node up and running I bumped into a few articles.  I ultimately found that I needed to install Xcode and HomeBrew, which would then allow me to get node up and running quickly.

    Xcode allows you to build mac and iOS apps.  It also provides you with all the tools you need to compile software for use on a mac.  You can find Xcode in the Mac App Store.

    HomeBrew is the “package manager that mac forgot”.  It makes installing various other bits of software on your mac very easy.  For installing node you can simply type out…

    brew install node

    …and you have node running!  But first we need to get HomeBrew installed and running.  To get HomeBrew installed just copy and paste the line below in a terminal window:

    ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

    Installing node

    With HomeBrew installed you can now type out the following in a terminal window:

    brew install node

    To verify that node is installed appropriately type these commands to get the version of the installed software:

    node -v
    npm -v

    For Windows, go grab an installer from nodejs.org and run through it.

    Testing node on your box

    With node installed on your machine you can now run a node server.  Doing this is also pretty easy.  Open up your preferred text editor and type in the following code and save it as program.js.
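    A minimal version (the classic node hello world server) looks like this:

    var http = require('http');

    // Respond to every request with a plain text "Hello World"
    http.createServer(function (req, res) {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('Hello World\n');
    }).listen(8124, '127.0.0.1');

    console.log('Server running at http://127.0.0.1:8124/');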

    Then open up a terminal window and cd into the directory with your program.js file.  Then type the following on a command line.

    node program.js

    You should see:

    Server running at http://127.0.0.1:8124/

    Now you can open a browser window at that url and port number.  The page should load with “Hello World”.

    Install botkit

    Now that we know that we have node installed and we have set up our first node server we can move on to building up a Slack bot using node.  We will use botkit provided by howdyai.

    According to their instructions on their github page we can use npm to install botkit.

    npm install --save botkit
    

    Slack stuff

    Now we are ready to install a bot into Slack.  To get started you need to have a Slack team you want to attach your bot to.  Inside your slack team go manage your apps and add a Bots integration: https://{your team}.slack.com/apps/manage/custom-integrations.  Then search for Bots and add a Slack Real Time Messaging bot.  Give your bot a name and then copy and paste the API Token (a gobbledygook, GUID-ish looking string).

    Bot stuff

    With your API key in hand you can now create a new file in your preferred text editor called robot.js and enter the following code:
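    A minimal Botkit bot that matches the behavior described below looks roughly like this (treat it as a sketch):

    var Botkit = require('botkit');

    // Create a Botkit controller and connect a bot to Slack's Real Time Messaging API
    var controller = Botkit.slackbot({ debug: false });

    controller.spawn({
      token: '<my_slack_bot_token>'
    }).startRTM();

    // Reply whenever someone direct messages or mentions the bot with "hello"
    controller.hears('hello', ['direct_message', 'direct_mention', 'mention'], function (bot, message) {
      bot.reply(message, 'Hello yourself.');
    });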

    Make sure you replace the <my_slack_bot_token> with your API token.

    Then in a terminal window you can cd your way to the directory that has your robot.js file.  Then run the following command.

    node robot.js

    You should now have a bot user (named whatever you named it when you created it) showing as an active user in your team.  You can now direct message that bot user or mention the user saying “hello”.  The bot should respond with “Hello yourself.”.

    If anything goes wrong you will see some output in your terminal window that is running your robot.

    Another node tidbit

    Now that we have a node app that can be a slack bot, let’s add another tidbit that will make it operate more like a proper node app.  We need this primarily so that we can shove our app into docker later and run it there.

    Specifically we need to add a package.json file which is basically a configuration and meta data file.  This is a json file that houses the name of the application, its version, a description, etc.  But it also defines dependencies, how scripts wire up, and so on.  Here is a link that describes all the various pieces you can stuff into this file.

    http://browsenpm.org/package.json

    Here is the contents of the file I am currently using for my bot.
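    A minimal version looks something like this (name, description, and version are whatever you like):

    {
      "name": "slack-robot",
      "version": "1.0.0",
      "description": "A simple Slack bot built with botkit",
      "main": "robot.js",
      "scripts": {
        "start": "node robot.js"
      },
      "dependencies": {
        "botkit": "*"
      }
    }

    In practice the botkit entry will contain whatever version npm install --save recorded for you.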

    With this file named package.json and living next to your robot.js file you can now use the following command to start your node app.

    npm start

    The start command points at the start script in your package file and executes the appropriate commands.  This means whoever is running your app doesn’t really have to understand how to get it running.

    Ultimately, we want this so that we can issue simple commands in our docker bits to get our node app running.

    Moving right along…

    Docker stuff

    If you haven’t taken a deep dive into docker yet…oh my!  This technology recently turned three and has turned our world on its head as far as how we develop, test, deploy, and manage our applications.  The statement “works on my machine” sort of disappears.  Vendor lock-in with a virtualization layer, or even a cloud provider, can largely be removed as well.  You can push docker instances just about anywhere these days.  And when it comes to not treating your servers as pets, docker is king.  Spin them up, tear them down.  Spin up many of them to handle load.  Docker is great.

    So let’s host our bot in a docker container and see how all this works.

    Install docker tools

    The local story for docker has changed now and then.  The current iteration is pretty easy actually.  But there is an even better story coming soon.  Let’s see how docker tools works.

    First off – I am not going to detail all the steps needed for getting Docker running on your mac or windows machine.  The Docker docs do a great job of this already.  Grab the Docker Toolbox installer for your OS from the Docker site and run through it.

    As long as you have the Docker Quickstart Terminal running you should be good to go.

    Dockerize your node app

    Once you have installed docker and have touched their simple tutorials (understanding the basics) you will need to understand the Dockerfile (capital D is important for some reason).  Start by creating a file named Dockerfile.  This is just an empty text file with no file extension.  This is the configuration for your docker image.  Here is the one I am using for my bot.
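    It looks roughly like this (each line is explained below):

    FROM node:argon

    # Create a directory inside the image for the app and make it the working directory
    RUN mkdir -p /usr/src/app
    WORKDIR /usr/src/app

    # Install the bot's dependency inside the image
    RUN npm install botkit --save

    # Copy the bot script and package manifest into the image
    COPY robot.js /usr/src/app/
    COPY package.json /usr/src/app/

    # Start the bot when the container starts
    CMD [ "npm", "start" ]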

    Here is the formal docs for Dockerfile if you really want to dig deep on this.

    But let me give you the quick overview.  All docker containers start from a base image using the keyword FROM {image name}.  I chose to start from the node:argon image which is a lightweight node base.  You can start from many of the different node-ready containers.  Search through the docker hub to find one that suits you.

    Then we need to create a location to host our node app.  I chose to put my app in /usr/src/app using RUN mkdir -p /usr/src/app.  Then I set that as the working directory with WORKDIR /usr/src/app.

    Next we need to run some commands inside our image to get it ready to host our node app.  RUN npm install botkit --save is the same command we ran when readying our development environment up above.  We have to repeat it inside our docker image so that botkit gets installed in the container too, ready for our app to take a dependency on it.

    And then we can copy in our script and package.json file.  This is done with COPY robot.js /usr/src/app/ and COPY package.json /usr/src/app/.

    Finally we set the command to be issued when the container starts up with CMD [ "npm", "start" ].

    Building your image

    Now that you have a Dockerfile in your app we can build a docker image.  We do this by opening up the Docker Quickstart Terminal.  Then cd your way into the folder that has your robot application (and Dockerfile) in it.  Then issue this command:

    $ docker build -t <your username>/<image name> .

    The -t flag allows you to tag your image so that you can easily find it in a long list of images.

    • <your username>: this can be anything.  I use first initial plus last name: ‘asiemer’.
    • <image name>: this can also be anything.  I use the app that I am putting in the image, so your bot’s purpose is a good choice.

    To follow along with what we have done so far you could use something like this:

    $ docker build -t asiemer/robot .

    When you hit enter you will see a long list of information stream by as your image is created for you.  When that is all done you can list out your images.  You should see your freshly built image in the list.

    $ docker images

    Running your image

    Next we can run the image.  We do this with the following entry.

    $ docker run -p 49160:8080 -d asiemer/robot

    -p is a port mapping (which isn’t entirely needed by your bot unless you intend to have some management screen as part of your node app).  In this case we mapped port 8080 inside the container to port 49160 on your physical machine.

    -d runs the container in detached mode, which leaves the container running in the background (so that your bot keeps running while you continue to work in docker).

    Now that we have the robot running in the background we need to be able to see what it is doing.  We can do this by locating our running container and then tapping into its logs.

    $ docker ps

    You should see your running container asiemer/robot.  Next we can print the logs to the screen by getting the logs from the running container by its container id.  Thankfully you don’t have to type out the entire container id – just the first few characters are enough to make it unique.

    $ docker logs <container id>

    This should show any output from your robot.

    To stop the docker container that is running we can use the same container id and issue this command.

    $ docker stop <container id>

    Hosting your bot stuff

    Now that we have done all the leg work of setting up node, getting a slack bot running, and hosting it locally in docker – we are ready to push this bot somewhere it can live for longer than just while your laptop is running.  There are many great platforms for hosting a docker application.  But I also found a specific bot hosting platform that gives you hosting for cheap and also gives you a complete CI/CD setup in seconds.

    BeepBoopHQ

    BeepBoop is a slack bot hosting company.  They run your bot on the Google cloud.  They integrate the CI/CD story through GitHub with docker.  Perfect for us!

    To get started you need to create a GitHub repository and push all your bot bits into that new repository.  Reminder: robot.js, package.json, Dockerfile.  You can sync this repository to your local computer or you can just upload those files directly to your new repository.  You will want to sync this to your machine at some point…but it is not needed for now.

    Now you can sign into BeepBoopHQ using your GitHub credentials.  This will prompt you for some additional information.  And you should receive a beepboophq.slack.com invite.  Add this to your slack app, as all the magic of their CI/CD story happens directly in slack (pure magic).

    Then create a new project (My Projects at the top).  Clicking on create a new project will list all your code repositories.  Select the repository that has your bot in it.  There are various details you can set about your bot but those aren’t important just now.

    As soon as you wire a GitHub repository into BeepBoop you should see some slack activity in their team:

    • New Project Created
    • Github Webhook Registered
    • Build Requested

    If you have a Dockerfile (and your other files) in the GitHub repository and all your code was working locally, you should see other messages too about the image being built and deployed.  If there are any errors you will see very detailed messaging in the slack channel around what you are missing or what broke.  Once everything is running you will have a bot in your slack team that you can do all sorts of things with.

    Build a weather bot

    Now that all of our plumbing is configured let’s build a quick weather bot.  To use this bot you will have to create a free account with Weather Underground (https://www.wunderground.com/weather/api/d/docs?d=index&MR=1) so that you can get an API key.  With the API key in hand you can then create a simple bot that queries the forecast for a given city and writes the weather back to slack.  You can find the source for this weather bot here: https://github.com/asiemer/weather-bot

    Chat ops

    Now you are ready to start tackling the concept of ChatOps.  Hopefully your team (dev team, business team, marketing team, family – slack is great for everybody) is using Slack or similar already.  Now you just need to figure out what could make working together better, and how to weave your bot above into the conversation.

    According to StackStorm, ChatOps is:

    “ChatOps is a new operational paradigm where work that is already happening in the background today is brought into a common chatroom. By doing this, you are unifying the communication about what work should get done with actual history of the work being done. Things like deploying code from chat, viewing graphs from a TSDB or logging tool, or creating new Jira tickets… all of these are examples of tasks that can be done via ChatOps.

    Not only does ChatOps reduce the feedback loop of work output, it also empowers others to accomplish complex self-service tasks that they otherwise would not be able to do. Combining ChatOps and StackStorm is an ideal combination, where from Chat users will be able to execute actions and workflows to accelerate the IT delivery pipeline.”

    There are all sorts of presentations on ChatOps.  There is a good getting started presentation on ChatOps and why it matters, and one of the first ChatOps videos is about HuBot at GitHub.

    Are you learning Docker yet


    I went to a docker meetup last week – well, many weeks ago now (just now hitting publish).  It was sort of a town hall panel discussion where four guys with loads of docker experience were fielding questions and responding from their experience.  As they were chatting I was jotting down all the different terms I was hearing for the first time, or that I had heard before but that came up often enough that I felt I should get to know that tech a little more deeply.  Figured I would share my list from that night.  This is just a brain dump…

    This post is a collection of “what is it”, “why do I need it”, copied from the sites mentioned.  Consider this a collection of tools you should know about when thinking of entering into the world of docker.

    Docker Compose

    “Compose is a tool for defining and running multi-container Docker applications.”

    Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a Compose file to configure your application’s services. Then, using a single command, you create and start all the services from your configuration.

    Compose is great for development, testing, and staging environments, as well as CI workflows. You can learn more about each case in Common Use Cases.

    Using Compose is basically a three-step process.

    1. Define your app’s environment with a Dockerfile so it can be reproduced anywhere.
    2. Define the services that make up your app in docker-compose.yml so they can be run together in an isolated environment.
    3. Lastly, run docker-compose up and Compose will start and run your entire app.
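
    To make that concrete for the bot we built above, a minimal docker-compose.yml might look like this (a sketch – the service name and port mapping are just the ones used earlier in this post):

    version: '2'
    services:
      robot:
        build: .              # use the Dockerfile in this directory
        ports:
          - "49160:8080"      # same mapping we used with docker run

    With that file in place, docker-compose up -d builds and starts the container, and docker-compose logs robot streams its output.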

    For more information about the Compose file, see the Compose file reference

    Compose has commands for managing the whole lifecycle of your application:

    • Start, stop and rebuild services
    • View the status of running services
    • Stream the log output of running services
    • Run a one-off command on a service

    Apache Mesos

    “Program against your datacenter like it’s a single pool of resources”

    Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

    Mesos is built using the same principles as the Linux kernel, only at a different level of abstraction. The Mesos kernel runs on every machine and provides applications (e.g., Hadoop, Spark, Kafka, Elastic Search) with API’s for resource management and scheduling across entire datacenter and cloud environments.

    • Scalability to 10,000s of nodes
    • Fault-tolerant replicated master and slaves using ZooKeeper
    • Support for Docker containers
    • Native isolation between tasks with Linux Containers
    • Multi-resource scheduling (memory, CPU, disk, and ports)
    • Java, Python and C++ APIs for developing new parallel applications
    • Web UI for viewing cluster state

    Set up Mesos with Compose

    As it turns out you can easily set up Mesos using Compose.

    Docker Swarm

    “Docker Swarm is native clustering for Docker.”

    Docker Swarm is native clustering for Docker. It turns a pool of Docker hosts into a single, virtual Docker host. Because Docker Swarm serves the standard Docker API, any tool that already communicates with a Docker daemon can use Swarm to transparently scale to multiple hosts. Supported tools include, but are not limited to, the following:

    • Dokku
    • Docker Compose
    • Krane
    • Jenkins

    And of course, the Docker client itself is also supported.

    Like other Docker projects, Docker Swarm follows the “swap, plug, and play” principle. As initial development settles, an API will develop to enable pluggable backends. This means you can swap out the scheduling backend Docker Swarm uses out-of-the-box with a backend you prefer. Swarm’s swappable design provides a smooth out-of-box experience for most use cases, and allows large-scale production deployments to swap for more powerful backends, like Mesos.

    Kubernetes by Google

    “Kubernetes is an open-source platform for automating deployment, scaling, and operations of application containers across clusters of hosts.”

    With Kubernetes, you are able to quickly and efficiently respond to customer demand:

    • Scale your applications on the fly.
    • Seamlessly roll out new features.
    • Optimize use of your hardware by using only the resources you need.

    Our goal is to foster an ecosystem of components and tools that relieve the burden of running applications in public and private clouds.

    Ansible

    “Ansible is a radically simple IT automation engine that automates cloud provisioning, configuration management, application deployment, intra-service orchestration, and many other IT needs.”

    Being designed for multi-tier deployments since day one, Ansible models your IT infrastructure by describing how all of your systems inter-relate, rather than just managing one system at a time.

    It uses no agents and no additional custom security infrastructure, so it’s easy to deploy – and most importantly, it uses a very simple language (YAML, in the form of Ansible Playbooks) that allow you to describe your automation jobs in a way that approaches plain English.

    etcd

    “etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines.”

    etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines. It’s open-source and available on GitHub. etcd gracefully handles master elections during network partitions and will tolerate machine failure, including the master.

    Your applications can read and write data into etcd. A simple use-case is to store database connection details or feature flags in etcd as key value pairs. These values can be watched, allowing your app to reconfigure itself when they change.

    Advanced uses take advantage of the consistency guarantees to implement database master elections or do distributed locking across a cluster of workers.
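
    To get a feel for it, here is the kind of thing the etcd v2 HTTP API lets you do from the command line (the key name and value are just illustrative):

    # write a key/value pair
    curl -L -X PUT http://127.0.0.1:2379/v2/keys/config/feature-x -d value="on"

    # read it back
    curl -L http://127.0.0.1:2379/v2/keys/config/feature-x

    # block until the value changes (a watch)
    curl -L "http://127.0.0.1:2379/v2/keys/config/feature-x?wait=true"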

    Consul

    “Consul has multiple components, but as a whole, it is a tool for discovering and configuring services in your infrastructure.”

    Consul has multiple components, but as a whole, it is a tool for discovering and configuring services in your infrastructure. It provides several key features:

    • Service Discovery: Clients of Consul can provide a service, such as api or mysql, and other clients can use Consul to discover providers of a given service. Using either DNS or HTTP, applications can easily find the services they depend upon.
    • Health Checking: Consul clients can provide any number of health checks, either associated with a given service (“is the webserver returning 200 OK”), or with the local node (“is memory utilization below 90%”). This information can be used by an operator to monitor cluster health, and it is used by the service discovery components to route traffic away from unhealthy hosts.
    • Key/Value Store: Applications can make use of Consul’s hierarchical key/value store for any number of purposes, including dynamic configuration, feature flagging, coordination, leader election, and more. The simple HTTP API makes it easy to use.
    • Multi Datacenter: Consul supports multiple datacenters out of the box. This means users of Consul do not have to worry about building additional layers of abstraction to grow to multiple regions.

    Consul is designed to be friendly to both the DevOps community and application developers, making it perfect for modern, elastic infrastructures.

    Terraform

    Orchestrating Docker with Consul and Terraform

    Poking around Terraform and Docker I found this slideshare.

    Testing microservices


    This morning I was asked about a super high level talk I gave a while back, Grokking Microservices in 5 Minutes – specifically about the testing of microservices, which is one of the more difficult parts of building tiny things that collectively form big things.

    The question was around the “shared pool” of tests and how to do that.  There are many ways to implement the shared pool concept.  But first let’s discuss why having a shared pool of tests might be important.

    If you have gone down the microservices path, or are currently responsible for the appropriate care and feeding of a monolith and are starting to think about carving it up into smaller service pieces – you want to be sure to understand how to test what you are building.  A good way to picture why this topic is important: the simple act of going from a behemoth app to many small apps can result in a yard littered with smaller piles of behemoth code!

    To safeguard you through this transition, understanding how to maintain services so that they remain non-brittle is important.  Aside from ensuring that your system is built to fail in a manageable and expected way (another discussion), you also need to invest in the flexibility to move fast.  This means testing your boundaries to ensure that breaking changes are caught early.  Microservices allow you to move fast with product iteration in small batches.  But as your service becomes a dependency for an upstream service, that service will inevitably take a contract dependency on you that needs to stay strong.

    There are many ways to ensure that you don’t break contracts – versioning is one.  And that should be in place from day one.  But subtle changes in your underlying systems could potentially have impact on how things are being consumed externally.  The more past versions of your software you need to keep around, the more likely these subtle changes could impact your consumers in ways you haven’t thought of (perhaps you have a bunch of apps using your API for example, and they can’t always keep up with your release schedule, meaning you need to keep older versions around for a while).

    It becomes important for you to test your assumptions of your services to ensure that your contract does what it says it will do.  This is the first layer of keeping your service reliable for teams that take a dependency on your service.  But you can only guess at how your customers are using your service.  The next level of testing is to allow your consumers to publish tests to you (internal, not your customers or third party developers).  They will write the tests based on how they use your service – not how you envision them to use your service.  These tests can be part of your CI and will alert you to any place (way early) that your internal tweaks have broken an upstream relationship with your consumers.

    How do we support shared testing?

    But how?  Unfortunately this is where it heavily depends on the language/framework you have chosen to utilize for your testing and your application.  There are lots of ways to do this, but in my .net world this might look like an integration test project that is versioned along with the service I am building that consumers can commit tests into (via pull request).  The team that builds the service owns the code repository.  The team that is using my service can contribute tests to this particular test project in the code repo of my service.  You might have an IntegrationTests project sitting next to a ConsumerTests project.
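
    The same idea is framework agnostic.  As a rough sketch in node terms (the endpoint and fields are hypothetical), a consumer-contributed test simply asserts the parts of the contract that consumer actually relies on:

    // consumer-tests/orders-contract.js (hypothetical consumer-contributed test)
    var http = require('http');
    var assert = require('assert');

    // The consuming team asserts only the fields it actually depends on.
    http.get('http://localhost:8080/api/orders/42', function (res) {
      var body = '';
      res.on('data', function (chunk) { body += chunk; });
      res.on('end', function () {
        assert.strictEqual(res.statusCode, 200);
        var order = JSON.parse(body);
        assert.ok(order.id, 'expected an id field');
        assert.ok(order.status, 'expected a status field');
        console.log('orders contract still holds');
      });
    }).on('error', function (err) {
      console.error('contract test failed: ' + err.message);
      process.exit(1);
    });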

    This model works great for small teams or single teams in that they become the publisher of all tests.  But it also scales well to multiple teams building and owning their own services, as all tests for a given service live locally on the dev machine and seamlessly fit into the CI story.  Everyone understands the moving pieces.  And the tests live and are versioned with the service.

    As your enterprise grows, or you are already not in a .net team, you may pick another framework and test runner to centralize around.  But the concept is the same.  You need a mechanism that watches for commits to your code base that will know how to run your consumer tests.  And a way to allow your consumers to give you tests that represent their usage of your contracts.  This can be done in many clever ways!

    But what about supporting multiple versions of a service?

    But what if you need to keep several versions of a given service and its API alive for longer periods of time?  This is an even greyer area.  I will speculate from here: if you are living in an infrastructure as code world, part of standing up an environment for a version of your software that is intended to live in prod should have an integration path.  Keeping the shared tests alive for each running prod version is the best way to do this, as it allows you to iterate on the prod version (so that you can do hot fixes along the way) and still run the appropriate tests.  Again, if the tests live with the code, and you are using a branching model that supports hot fixes separate from new features/versions, this should be easy enough to manage.

    Don’t forget to test the system as a whole

    Once you tackle integration testing for the microservice your team is responsible for, and you provide a mechanism for consumers of your service to contribute tests around how they use your service, make sure that the application as your customer sees it is also tested! (functional tests from the UI back)

    While it is important to ensure that first level boundaries are tested, it is equally important that you don’t forget about the user and their experience.  A user has an expected experience.  If you are a commerce app this means that a user needs to browse, search, read about, and select products.  Add products to a cart, manipulate the details of a product in the cart, and eventually purchase the product.  Once they have placed an order there may be back office processes that need to occur.  Eventually a customer will get notifications about their order, when items ship, etc.  And there may be support for a customer to complain.

    All of these things touch the customer and need a story to validate that the experience works as they would expect it to.  And if you have a mobile app and a web app then you likely have a few things to test: the mobile app UI, the web app UI, and the API that drives each of them.  I covered some strategies for building and maintaining an API.  I have not really covered the UI bits as much (though many other folks have written about that).

    There are lots of ways to piece together a UI from a collection of microservices.  You might treat the UI as something to apply this microservice concept to, where each of your applications is self-contained.  Or you might build a microservice as a feature, where the back end and front end code live together, which would then become a mash up as a SPA or similar.  Lots of options.  Testing for each of those options might be subtly different.

    The key here is that you don’t forget that one type of testing likely doesn’t remove the need for another type of testing.

    Some reading/experiences from others wading into the microservices trend

    Most of these articles talk about testing in some form.  However, a lot of people struggle when tackling testing.  In which case, read through the comments.  There is always loads of gold in the comments!

    https://blog.yourkarma.com/building-microservices-at-karma

    https://www.opencredo.com/2014/11/19/7-deadly-sins-of-microservices/

    http://www.infoq.com/articles/microservices-practical-tips

    Great article on how to think about Microservices (part 4 of a 4 part series).

    https://www.nginx.com/blog/microservices-at-netflix-architectural-best-practices/

    Value Stream Map


    I read an article yesterday about how SoundCloud migrated their product towards microservices.  It was the business reasoning for making that decision and the steps they went through to finally get to their goal.  It wasn’t so much about the technology.  I love this type of article as the thing we should all focus on more is generating business value – not shiny new technology toys (that nobody cares about) – out of our efforts each day.

    In that article I was introduced to a new concept – Value Stream Mapping.  This post won’t do that topic enough justice but should get your mind moving in the right direction.  If you think you have inefficiencies in your product/development process – pause what you are doing, read this post, do this exercise with your team.

    I was so excited about this way of visualizing process and data that I did it with the team a few minutes ago.  At the end of the session we had a very messy whiteboard that was almost like a brainstorming bubble diagram with a little bit of workflow and a little bit of state machine and a little bit of timeline.  What was ugly to anyone not on the team was full of insight to the team.  Everyone learned something.  And what’s best is that we all see some low hanging fruit ripe for the picking for immediate process improvements.

    Why do Value Stream Mapping?

    A Value Stream Map (VSM) allows you to visualize a process as it is – rather than how you think it is.  This allows you to see how long things generally take, and see where there might be an unneeded time suck in your day to day processing of a given type of work.  Specifically, how long on average does it take to implement one new feature and get it in front of your customers?  Who is doing the work?  Is the work being done being done by the right type of person?  Are there areas where efficiency can be gained quickly?  Do we have any churn in the process where the process is quick but since we never get it quite right we have to do the same steps over and over again?

    Examples

    An example that might apply to you is receiving a new story in your product backlog that isn’t quite fleshed out.  This story might just be five words to summarize at a high level the problem that someone is having.  It doesn’t give enough detail however to understand the problem or define what an appropriate solution might be.  Additionally, as the work might get prioritized to be done now, the architect may not have a chance to see if the fix the engineer is putting into place is in alignment with the year’s technical road map for that problem area.  You may also find that the research is taking long periods of time, being done by an expensive development team member, rather than by someone in the business who knows the problem area or by a BA.  Quality might be another area where if we paid just 5 more minutes of attention to something prior to calling it done, we could save countless hours of deployment, testing, and validation work – let alone all the loop backs for rework.  The less back and forth the better.

    How?

    This activity should be done by all members of the team for a given process over a given business area or application.  It should be informal.  And all people should be able to share their insights.  “The expert” should keep the conversation on track but should not be filling in the gaps along the way.  People that work in the process know the reality of the processes.

    The goal is not to produce a sugar coated version of reality.  But instead to produce reality as it is today in all its ugly glory.  Expose the gaps and time sucks.

    Here are the steps for creating a VSM:

    1. Select the process to be analyzed
    2. Use consistent symbols in the map to reduce complexity and gain rapid understanding
    3. Define the boundaries of the conversation
      1. Perhaps the business does some product road mapping that doesn’t involve the dev teams – don’t include that
    4. Outline the known steps in the process within the boundary
    5. Add information flows to the VSM
      1. This can be communication between parties
      2. How a process is initiated
      3. The frequency (back and forth) of a given step
    6. Collect process data
      1. Cycle time (time taken to get through each step)
      2. Change over time
      3. Up-time
      4. Number of people involved
      5. Available working time
      6. Batch size
    7. Timeline
      1. This gives us total process times and lead times
      2. Calculates total lead time
    8. Interpret the VSM
      1. Excessive time spent by the wrong resource type
      2. Poor quality causing repeat cycle times
      3. Areas that can be automated
      4. Processes that can be removed or made more efficient

    A quick google search for “Value Stream Map” will show you a lot of manufacturing maps.  Add “Software Development” and you will come up with some example drawings that are more applicable to your needs.

    Ideal State

    Once you have a VSM created that reflects your current state you can quickly identify inefficiencies in how your team is working today.  There are probably low value but quick wins (low hanging fruit) that can easily be solved for some immediate return.  Do those first.  Then there are going to be some obvious big wins but that are more complex to achieve.  Identify all the steps that you want to do to make improvements.  And as you make those improvements go back to the “collect process data” step and reevaluate the reality of your new process.  Ensure that the changes you make do add value and reduce time to market with your new features.

    Usually some quick wins in the development process are some of the following:

    • Reduce friction going to production.  Manual processes take time and are error prone.  Add some form of continuous delivery or continuous deployment.
    • Determine how to improve quality.  This may be adding unit tests, code reviews, or developer isolation.  Or it may be automated functional testing.  Or it may be expanding your QA presence.  The answer here depends on how your shop is set up.  But stopping the back and forth caused by bugs that go to production is important, as that churn is usually expensive across a large swath of your product delivery team (product management, engineers, QA, devops, etc.).
    • Focus on finishing work over starting new work.  Having 50 things in progress doesn’t add as much value as shipping one new thing to your customers.  Set WIP limits in your process and continuously realign your team to focus on getting things done – regardless of where the blockage is.

    Have you done this before?  Let us know how it went in the comments below.

    Developer personal time management


    I find that most developers want to help out.  They like to get involved.  They enjoy rolling up their sleeves to put out fires.  They relish being seen as the person that can do magic to solve the unsolvable problems for their business counterparts.

    And because we enjoy these things we tend to allow people to interrupt us at their convenience.  We horribly underestimate how long things will take to get done because we feel bad admitting that some problems are hard to solve.  And we mismanage our time over and over again as we attempt to catch up on all the things we need to get done, because none of us want to appear like we aren’t that magical after all.

    In this post I am going to discuss things that have all already been discussed around time management somewhere else (and they likely did a better job).  But these seem to be new concepts to many younger developers.  Please ignore this post if you already manage your time well and estimate your work expertly!

    And if you are interested in these concepts please do a little more research to understand the importance of each and how they will help your day be more efficient.

    Developer Office Hours

    Developer “office hours” means that you have a time slot scheduled on your calendar for when you are available for unplanned one on one walk ups.  Most developers like to get work done as soon as they get into the office (as they were likely thinking about a hard problem all night and now have a solution for it).  So scheduling work time (not office hours) first thing in the morning makes sense.  Having their first office hour(s) scheduled at 10 or 11 works better as they have solved their burning issue and are ready to have some face time with their peers.  But if you prefer to have all your interruptions and unplanned work out of the way first thing, earlier office hours might make more sense for you.

    I would be careful calling those walk ups or unplanned work “interruptions” (even though you may see some interactions that way).  Since you have scheduled time for one on one interactions with your juniors, business partners, etc. these conversations should specifically not be seen as interruptions but instead conversations that are needed by your team to be successful.  The more senior you are, or the more managerial you are, the more likely your team needs your input.  You may need more office hour time than “personal sprint” time.

    Once you have this time for unplanned conversation on your calendar you will be ready to defend all of your other time as your time.  But, now instead of saying “Please go away, I am busy” – you can instead say “Please come back during my scheduled office hours from 10am-11am or 1pm-2pm. I am available for this discussion then.” And as you train the folks you work with most they will come during your office hours more and more thereby allowing your personal time to be that much more productive.

    In order for office hours to be effective for those walk ups, use the time between meetings for business related tasks that don’t require a lot of thought or are heavily fragmented.  Don’t try to squeeze in important tasks that require deep thought as a way to sneak more real work into your system.  You will only end up pissed that people are stopping your flow.  Instead stick with other busy work items that are generally more fragmented in nature.

    Some things that fall into this category:

    • checking email
    • getting caught up on your group chat
    • Twitter
    • other forms of communication
    • managing your calendar
    • returning phone calls
    • etc.
    And consciously decide to not do any of this type of work during your “deep in thought” work times.

    Now let’s optimize your focus during your non-office hours.

    There is a great answer on StackExchange about office hours

    What most people fail to understand: Every time our concentration is broken, we create at least one bug and/or delay the deadline for another half-hour. Private offices is not a “nice to have” for developers but a must. This is not about status, this is about brain physics.

    And in a world where open floor plans (and noise and interruptions) are the norm, scheduled office hours is as close as you can get to a dedicated office space.

    Personal Sprints

    A lot of development teams these days have heard of Agile and likely “sprint” with their teams.  So for most people this concept should be well known.  How do we apply this to our personal work time?

    Easy!  Plan it.  Start at a certain time with specific work in mind.  And just work on that specific task.  A personal sprint might be the time between your office hours.  Or it might be in 45 minute blocks several times leading up to the next meeting or break in your day.  But schedule your day to work on specific tasks during specific times and then commit to that activity and minimize allowed interruptions in your sprint.

    Communicate to your team that you are in the midst of a “personal sprint” so that people learn not to interrupt you while you are trying to achieve a state of flow.

    Mihaly Csikszentmihalyi: Flow, the secret to happiness

    If you have your own office, guarding your time is a bit easier as there is a physical barrier to interruptions and you can hang a sign on that barrier “I am busy in the midst of a personal sprint!  Please check my calendar for my next available office hours.”

    If you are in an open floor plan (as is so very popular these days) it is a little bit more difficult to guard your time and attention.  SO – fly a flag.  Put up a lamp with a red bulb.  Do something crazy that shows you are busy…and teach frequent interrupters that there is a cost associated with being interrupted.  Sadly – wearing your head phones no longer seems to work.   …tap tap tap…I am standing here please answer me!

    What is that cost of being interrupted?  I found a great post that covers this topic well.  Here is a snippet.

     

    Based on an analysis of 10,000 programming sessions recorded from 86 programmers using Eclipse and Visual Studio and a survey of 414 programmers (Parnin:10), we found:

    • A programmer takes between 10-15 minutes to start editing code after resuming work from an interruption.
    • When interrupted during an edit of a method, only 10% of times did a programmer resume work in less than a minute.
    • A programmer is likely to get just one uninterrupted 2-hour session in a day

    We also looked at some of the ways programmers coped with interruption:

    • Most sessions programmers navigated to several locations to rebuild context before resuming an edit.
    • Programmers insert intentional compile errors to force a “roadblock” reminder.
    • A source diff is seen as a last resort way to recover state but can be cumbersome to review

    Pomodoro Technique

    Once you have office hours and you have started working with the concept of personal sprints, you might want to optimize even further.  Matt Hinze introduced me to the concept of the Pomodoro Technique for helping you with this.  This is a very simple way to improve your time management.

    • Use a timer for planned activities
    • Plan your tasks around Pomodoro time increments
    • Review your time estimates to get better at estimating
    • Ignore interruptions during your Pomodoro sessions
    • Record your progress across timings and across activities to improve your process

    Devin Rose turned me on to a Trello like board for doing the Pomodoro Technique and work item management called KanbanFlow.  This allows you to plan your sessions, keep a timer going, and track your sessions over time to see how long certain types of activity generally take you.

    Doing Pomodoro before you have trained your peers to speak with you during office hours will be very painful and will likely lead you to not like Pomodoro.  Office hours.  Then personal sprints working on deep thought items.  And then Pomodoro to track your progress.

    Have a better way?

    I am sure there are many other great ways to be productive with your time.  I can think of some off the top of my head that I use every day: Inbox Zero, Getting Things Done.  Scott Hanselman has been doing a productivity talk that summarizes some of his concepts: “Don’t worry, just drop the ball”, “Scale yourself”, “Look for danger signs”, and many more.

    There is a lot of room for improvement out there to be had by most developers.  Please share your ideas below.  Let’s discuss it!

    Easy way to gain high level understanding


    When I hit the ground anywhere new I go into listening mode and try to absorb as much information as possible.  I am always interested in the process to transfer information about existing systems and processes to new people.  But when I land somewhere that passes me from person to person to gain the understanding of this information I am eager to write down or draw my understanding so that the next new guy doesn’t have to be passed from place to place.  Onboarding should always get better!

    I have bounced from company to company for a lot of years now.  I used to keep 3 active jobs at any given time.  This has put me in this fun discovery mode many times.  As such I have looked at many ways to track information in a quick easy to understand way.  Of course there is/was UML which was a HUGE thing not so many years ago.  But I have found that less formal higher level diagrams are 1) usually good enough and 2) much easier to maintain.  Boxes and lines and no more if possible.

    I have used Visio for a lot of years so it is the second tool installed on my box after Visual Studio.  But recently, I made a migration from the land of PC to MBP.  So – I tried to ditch Visio for at least a short while.

    One of the slack teams I am involved in has many smart folks in it.  Asking them what they used turned up https://www.draw.io which has all the tools you need for this purpose.  And the images you draw can be saved to Google Docs directly then shared to customers/peers directly.

    But when there is a significant lack in documentation I feel the need to involve others.  Once we cross the bridge of collaborating over documentation we need to be able to work together to create a cohesive output.  Which shapes do we use?  How much detail is needed?  This led me to research something less complex than UML but with at least enough of a formal approach that anyone can be involved.

    I stumbled upon the C4 concepts along the way.  C4 is much much more than a way to document code (which I won’t get into here).  But C4 (given my military background) is an easy term to remember.  And they provide a great one page cheat sheet for passing around to summarize how to appropriately and consistently capture just enough information for various audiences.

    1. System context diagram: Shows all the user types that interact with your system and the system dependencies used by your system.  This is for non-technical people to understand the high level system.
    2. Container diagram: This diagram illustrates technology choices like web applications, database servers, etc.  Containers can also be buckets of data like file systems, databases, email servers, etc.  Basically anything that can host code or contain data.  The audience for this set of diagrams is software developers and support engineers.
    3. Component diagram: The component diagram deconstructs each container into their logical concepts.  Something like a data layer, a widget factory, service, etc. can all be drawn as a box to show its interaction with other areas in the container.
    4. Class diagram: Beyond the component diagram we start to get into more UML style, detail oriented, designs.  I don’t generally go this low for depicting my understanding of a given system.

    Here is the image I frequently reference:

    http://www.codingthearchitecture.com/images/2014/20140824-c4.png

    What tools and approach do you take to getting an understanding of existing applications?

    Cloud Academy oh my


    I have had some time off lately and have been filling that time immersing myself in all things Amazon Web Services related.  I was thinking that I might go for their cloud architect certifications…why not!

    I looked at some of the courses offered on Pluralsight – and of course all of their content is good.  But I really wanted a specific path through all things AWS.  And I wanted to be sure I was seeing the latest and greatest AWS had to offer – not 1 year old coverage.

    A friend of mine had suggested I take a look at CloudAcademy.com.  He had been using it to learn some bits now and then.  And as it turns out they had very specific paths through their course catalog that amounted to just what I needed.

    And there was a huge upside – the first week was free.  I figured I would just learn all day every day for 7 days and use up all their content and resources.  No need to pay!

    But what actually happened was totally different.  I started taking my courses.  Going through some of the quizzes.  And then back through some courses.  I was hooked.  What a great site.  What great current content.  And how nice it was to have a path through the content that took me from course to course with an end goal in mind.

    I ended up spending some money and have continued my subscription.  Even running the videos passively in the background while I work on and focus on other things has proved to uncover many new AWS nuggets.  Lots of detail packed away in each course.

    This is not a paid advertisement in any way.  I just truly love what they have to offer!  Check it out.

    https://cloudacademy.com/

    long time listener first time caller


    Man oh man!  I have been using windows for as long as I can remember.  And I have been professionally writing software in a windows world since I got out of the military.  I have heard several times that Linux is better. I have heard several times that Apple is better.

    And I have tried to hear this advice.  I have gone so far as to pause my windows use and go head long into linux.  I have also gone so far as to attempt to use an MBP at my first Dell job.  They bought it for me – so why not?

    I hated it!  I was spoiled by what I knew and usually just didn’t have it in me (time or otherwise) to take on a learning curve while taking on many other learning curves. And I always went back to the PC in some form or another.

    Now I have 6 people in my house in addition to me that are all on a computer of some form all day long.  And being a PC support guy is a full on part time job.  And I hate it.

    Additionally, while I am a windows developer I am not a golden hammer developer.  I pay attention to the open source non-windows world every day.  I see all the shiny toys out there.  And I try to use them…on windows.  And in the open source world getting these tools to run on a PC isn’t usually very easy.  I first have to make windows look like linux.  Then I can somehow get a hobbled version of something running.  This experience is off-putting when using these otherwise great tools.

    NO MORE!

    At the last MVP Summit (2014) I saw Microsoft saying “we love Linux” and “we are trying to be more open”.  That was great progress but I didn’t have faith it would stick.  While I was at the MVP Summit last week I have to say I was very impressed.  I saw just as many macs as I did PC’s.  I saw just as many Microsoft updates on tools and features as I did on non-Microsoft support of oss tools and features.  And almost everything was open for contribution. I was amazed.

    I was also using an older Dell laptop while at the summit that I had recently migrated to windows 10.  It wasn’t running so well (the hardware).  It still had a spinner in it.  And it was semi sluggish. I love Windows 10!  Best OS they have ever released in my view.  I won’t stop using it.  But I was done with Windows as my default running on PC’s that just didn’t perform well.

    I was with my friend who had recently migrated from PC to the entire Apple platform.  And he loved it.  Towards the end of the first day I told him I had been thinking about MBP for a few weeks.  And that I had actually shopped for a machine an hour ago.  I was seriously thinking about getting a MBP.

    He jumped on the opportunity.  We had 30 minutes before going to our first party of the evening.  He assured me not to worry.  It would be a great experience.  In and out in no time.

    For the most part this was true.  We went in.  We were nicely greeted by a non-schmoozy metro/lumbersexual type (I forget which…there were many skinny jeans there).  I applied for credit and was immediately approved.  We collected the bits I needed to work from this machine.  Then we checked out.

    Uh oh!  I was from Texas buying a not so cheap item from Washington.  This caused a hiccup in the application process.  But that was ironed out quickly.  And we were off with a new toy in hand.  Great experience overall.

    The next day I was able to quickly get node.js running.  And I have spent a fair amount of time getting to know my MBP and the terminal.  Everything just works great.

    Don’t get me wrong!  There is a learning curve migrating from PC to mac.  Scroll is opposite the experience in windows.  The close button on a window is in a different place (I used to say “in the wrong place”).  Maximizing a window seems to have some weird ramifications to it (like I can’t drag a chrome tab into its own window if the current window is maximized…why not?). But I am learning.  And I am enjoying the fact that everything just works.

    Now it is time to install a VM manager and get my windows 10 back so that I can use some of the tools that I am used to.  But I took stock of what apps I use regularly now in Windows…and the number is small.  Visual Studio is a big one (have you played with Visual Studio Code yet?  It runs on a mac.).  Office is generally in google drive these days.  Notes are in workflowy and evernote.  Visio is important to me.  I have a photoshop license for windows still.

    More to come.  Exciting.  Give it a try!

    Huge Scale Deployments


    What are the best practices for supporting huge-scale deployments? How do you manage fidelity of environments and processes, monitoring, blue/green deployments and more across thousands of servers?

    Earlier this month, I participated in an online panel on the subject of Huge Scale Deployments, as part of Continuous Discussions (#c9d9), a series of community panels about Agile, Continuous Delivery and DevOps. Watch a recording of the panel:

    Continuous Discussions is a community initiative by Electric Cloud, which powers Continuous Delivery at businesses like SpaceX, Cisco, GE and E*TRADE by automating their build, test and deployment processes.

    Below are a few snippets from my contribution to the panel:

    The practice of huge-scale deployments

    “In the previous company I worked at we had a situation where the huge-scale deployment was a multi-team, highly coupled system made of multiple interwoven systems. You’d be on a call bridge for days, literally, people roll off go to sleep for 2 hours and come right back. Yes, you shouldn’t change configurations on the fly, you shouldn’t do anything manual, everything is automated, whether it’s virtual or physical hardware.

    “Having these huge coupled systems – you have to stay away from that, whether you use microservices or autonomous components, whatever – you should have a small enough surface area of tests, parts that can go down their own deployment path. If it’s side by side or whatever your deployment strategy is, you slowly drain traffic over to the new component, test it out, and you can ensure it actually works. It’s a small change, not an entire system change all at once, with a queue of work that people are sitting in.”
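
    To make the traffic-draining idea concrete, here is a minimal sketch of a canary router in node.  It assumes the http-proxy package from npm, and the ports and 5% weight are made up for illustration; treat it as a sketch of the idea, not the exact setup from that project.

    var http = require('http');
    var httpProxy = require('http-proxy');

    var proxy = httpProxy.createProxyServer({});

    // Ports, hosts, and the canary weight below are illustrative assumptions.
    var OLD_TARGET = 'http://127.0.0.1:8081'; // current version
    var NEW_TARGET = 'http://127.0.0.1:8082'; // new component (canary)
    var canaryWeight = 0.05; // start by draining 5% of traffic

    // If an upstream fails, answer with a 502 instead of hanging the request.
    proxy.on('error', function (err, req, res) {
      res.writeHead(502);
      res.end('upstream error');
    });

    http.createServer(function (req, res) {
      // Send a small, adjustable share of requests to the new component.
      var target = Math.random() < canaryWeight ? NEW_TARGET : OLD_TARGET;
      proxy.web(req, res, { target: target });
    }).listen(8080);

    You would bump canaryWeight up as confidence grows and retire the old target once the new component has taken all the traffic.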

    Fidelity of environments

    “I think it’s easy to come up with data to look at all the inefficiencies. We all enjoy analyzing inefficiencies to make things more performant. I find it easy to convince people who live in the trenches, they’re on these crazy deployment calls, they own the deployments. It’s usually middle management, the people who own siloes or own a budget, they’re usually somewhat in the way, if I can put it that way, but ultimately you need to justify fidelity of environment to the people at the top that control the purse strings, or above that.”

    “There’s a YouTube video with a conference talk by ING that talks about the justification of Continuous Delivery and the automation to make stuff go live quickly. They had to put in place the culture change, what needs to happen, and that is – if you’re in the way you’re not here anymore. We need to understand what we do as a company, we deliver product and features, that’s what allows us to innovate, if you’re in the way of that you don’t work here anymore. That’s huge.

    “But in a bigger company you’ve got the middle management who really don’t want their cheese to move. They’ve got to move their cheese, or get on the bus and then everything’s easy. We can justify it.  We can make it happen, it’s not hard, it’s just changing the culture.

    “We had a client who had a bad experience in the cloud. They’re trying to justify their virtual environment, and they own a lot of hardware but they don’t have the staff to support going to the cloud. So okay, that’s easy to solve, we can find people who can get you in the cloud the right way, but then it’s a discussion of OPEX vs. CAPEX, how do we justify the infrastructure we’ve already invested in, even if it’s not supporting our needs.”

    Fidelity of process

    “I find that process has to be fully transparent, from up in the dev process all the way to production. We frequently see a run book that has every single detail for going to production. The reason they need the transparency of going to production, is they need to have step-by-step from all the errors of the past, ‘oh, if this happens you need to do step 552’. It speaks to the lack of automation up to that point and through that point.

    “Generally we have engineers of the team who have experienced the pains of doing things different in each environment. It’s usually not a fight once the team is enabled to invest time in automation. We try to get most of our clients to a virtual environment, we preach heavily that environment is code, it’s all checked into source control. There are no manual changes, you need to make the change in code base or configuration, and then the change flows out appropriately.
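
    As a tiny illustration of the “environment is code” point, here is what checked-in configuration can look like in node.  The file layout (config/production.json and friends) is an assumption made for the example; the point is that a change means a commit and a redeploy, not an SSH session.

    var fs = require('fs');
    var path = require('path');

    // Pick the config file for this environment.  The files live in source
    // control next to the code; nobody edits them by hand on a server.
    var env = process.env.NODE_ENV || 'development';
    var configPath = path.join(__dirname, 'config', env + '.json');

    var config = JSON.parse(fs.readFileSync(configPath, 'utf8'));

    console.log('Running with ' + env + ' settings:', config);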

    “In slightly more risk-averse organizations, maybe they need Bob the executive to approve it and they queue up waiting for the approval, but then it continues to flow in the automatic process. Or – continuous deployment where everyone wants to get to in my opinion, as soon as I commit code it automatically goes through all those steps and then it makes things go live.

    “As for the Continuous Delivery idea of making deployment a business decision – it’s an educational topic. What if I tell you we can get the binaries delivered to production, where we see how that component reacts to traffic? Whether through feature toggling or whatever the configuration strategy is, get the bits out there and even have it turned off, but have it sit around code that is running in production and it’s much easier to solve things earlier on. It’s kind of like a branching story, the longer you wait to merge it into production environment the bigger of a story it’s going to be.”
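
    And here is a minimal feature toggle sketch in node showing how code can ship to production turned off.  The flag name and the environment-variable source are assumptions for the example.

    // The new code path is deployed but stays dark until the flag flips.
    // FEATURE_FLAGS and the newCheckout flag are illustrative assumptions.
    var flags = JSON.parse(process.env.FEATURE_FLAGS || '{"newCheckout": false}');

    function legacyCheckoutFlow(cart) { /* current production behavior */ }
    function newCheckoutFlow(cart) { /* new component under test */ }

    function checkout(cart) {
      if (flags.newCheckout) {
        return newCheckoutFlow(cart);
      }
      return legacyCheckoutFlow(cart);
    }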

    Monitoring, feature flags, blue/green deployments, canaries

    “We’re working with large customers, big systems, but small staff. We try to preach for a single light that tells you if your system is up or down. If it’s red, we have systems that dig in to see what the problem is. Metrics, tracking of things that are important – X number of items in the cart, people checking out, and business value. We don’t only check that the system is working, we also check the business value, see that this particular activity that is important to the business owner is happening with a certain frequency.

    “We use StatsD and hosted Graphite for the graphs. We create dashboards that show lines of what’s happening in the system and you can look throughout the day and see if that’s expected behavior. And then watch it when there’s a deployment or a configuration change. Normally you have this humpback throughout the day, and then you make a deployment and it flat-lines.
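
    For the curious, here is roughly what feeding StatsD from node can look like, using nothing but a UDP socket and the StatsD counter format.  The host, port, and metric names are assumptions for the example.

    var dgram = require('dgram');
    var socket = dgram.createSocket('udp4');

    var STATSD_HOST = '127.0.0.1'; // assumed local StatsD agent
    var STATSD_PORT = 8125;        // default StatsD port

    // Send a counter increment in the StatsD wire format: "name:1|c".
    function increment(metric) {
      var payload = new Buffer(metric + ':1|c');
      socket.send(payload, 0, payload.length, STATSD_PORT, STATSD_HOST);
    }

    // Track business value, not just system health.
    increment('shop.cart.item_added');
    increment('shop.checkout.completed');

    Graphite then turns those counters into the daily humpback you watch during a deployment.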

    “Something like Newrelic helps you monitor disk, CPU, that the tools you’re running on are also working as expected. We’re also monitoring message-based stuff, queues, which shows my system behavior when going from one service to another making sure that’s not clogged up. Going back to that light, so we have “everything’s good”, but when things are bad, we’ve thought about that, and the systems we have in place will tell us that things are broken.

    “Very often it’s a new project or a client who brings a monolithic system they just can’t add features to. The notions we talk about are “definition of done” at different levels in the process, and “iteration zero”, these are the things we expect in the project. Here is the runtime, the CI, we now talk about CI/CD, the CI build scripts that run locally, as time goes by we introduce infrastructure as code, metrics, logging and monitoring, we build up the iteration zero story. They know how to write code but don’t know how to support things in production, we try to build it up so it becomes part of their DNA.”

    Speaking in Austin September 26th


    I am very proud to say that I have been organizing a local tech conference – MeasureUP – that is on September 26th.  Among other things I will be there speaking on building production-ready systems and on how to support public-facing APIs.

    In the production-ready systems talk I will discuss the things most folks generally don’t think about when building more complex, multi-faceted systems.  A few of the areas to discuss are the steps to cover when communicating with third-party systems so that their downtime doesn’t cause you downtime.  How to build microservices or autonomous, component-based systems that play well together.  How to visualize which services in your system are up vs. down and the throughput they are providing to your overall system.  Metrics, reporting, monitoring, etc.  All sorts of good bits that I have discovered while tinkering on systems.

    In the API talk I will discuss building public-facing APIs and the topics that you need a good story for when managing other folks accessing your systems.  This will cover authorization and authentication, either as an internal system or via a mashup with your cloud provider.  It will go into the versioning strategy of your API.  We will also cover how to decouple your deployment story, backwards compatibility, achieving continued uptime, etc.

    These two talks should complement one another.

    Mostly I am just very eager for all the hallway conversations that happen at events like these.  Hopefully you will come see me and we can chat about code, systems theory, raising pigs, guns – whatever you are interested in!

    Check out all the other speakers and talks we have signed up already.  Gabriel Schenker has a few talks too!
