Automated, Heroku-like Deployments with Docker and Drone
By Joel Weirauch

tl;dr

We use Drone to pull our application source from Bitbucket, run unit tests, build a Docker image, push the image to the Docker Hub and execute a rolling deployment to our servers. If you want to see exactly how all of that works, skip down to the code.

SaaS Deployments for Normal People

The cornerstone of any good SaaS app is reliability. Given this, we knew that we needed an automated, reliable and repeatable way to deploy AlertBee to the application servers.

We love the idea of Heroku's git push heroku master workflow and we are in love with Docker. With this in mind we started to look at platforms like Deis and Flynn to run our application servers. Both of these projects look great and provide the git push style deployment that we were looking for, but a few things kept nagging at us as we evaluated them.

First, in both cases, there are quite a few moving pieces under the hood. This is understandable, but also worrisome because it was an additional ops burden to learn and maintain. Not only would we have the underlying servers to worry about, but we would also have the PaaS application and our own SaaS application to manage.

Second, they require a fairly beefy cluster of machines just to have a minimum, stable cluster. This means that we would either have a lot of idle computing power or we would need to be worried about quickly scaling up the cluster to increase capacity.

The current production Docker deployment landscape appears to be geared towards large, complicated deployments operating at very large scale; not much seems to be available for the "little guy" who needs production stability on a smaller scale. With this in mind, we took a step back to compile a list of what we actually needed from a deployment and hosting perspective.

MVPaaS - Docker and Drone

What does our Minimum Viable Platform as a Service look like?

  • Easy, automated deployments
  • Easy to scale
  • Reliable, minimal moving pieces
  • Utilizes Docker
  • No wasted/idle compute resources

With our minimum needs in mind we looked at solutions to each component. Luckily, Docker makes it super easy to provision and utilize compute resources because all you need are bare servers with the Docker engine installed. Creating a base image or an automatic provisioning script is all that's required to have a Docker host up and running. This checks off 3 of the items on our list (easy to scale, uses Docker, and no wasted resources).
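A provisioning script can be as small as a cloud-init user-data file that installs the Docker engine on first boot. A minimal sketch (the get.docker.com convenience script is Docker's own installer; treat the rest as an example to adapt):

```yaml
#cloud-config
# Minimal Docker host: install the engine on first boot using Docker's
# convenience script. Once this has run, the machine is ready to join the fleet.
package_update: true
runcmd:
  - curl -fsSL https://get.docker.com | sh
```

Point your cloud provider at this as the instance's user data and a fresh Docker host comes up with no manual steps.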

That leaves reliable and automated deployments to be addressed. For reliability we went with the tried and true method: run at least two instances of every component. User-facing components sit behind a load balancer (HAProxy), and background workers are scaled as required.
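Concretely, the dashboard tier behind HAProxy might look roughly like this (hypothetical server names, IPs and health-check path; the check keyword enables health checks, which is what lets a rolling deploy take one server out of rotation while its container restarts):

```
frontend dashboard_front
    bind *:80
    default_backend dashboard

backend dashboard
    balance roundrobin
    option httpchk GET /
    server dash-01 10.0.0.11:80 check
    server dash-02 10.0.0.12:80 check
```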

Automated deployment is the last requirement. Because we are using Docker, we decided to use private images on the Docker Hub for image storage. We also have unit tests, so the first step of any deployment needs to be running them.

After looking at a number of CI solutions we had narrowed it down to Drone and Jenkins. We have run Jenkins in the past and once you get everything set up it's a great platform. It's just the initial setup that takes time.

Drone on the other hand can be run with a single Docker container and all of the CI configuration is in a YAML file with your project, making it much easier to get up and running than Jenkins. Additionally, they have a hosted option that's very reasonably priced, so we only need to host it ourselves if we want to.

Deploying AlertBee to production with Drone is as easy as running git push and waiting ~10 minutes for our unit tests to run, our Docker image to be built and pushed to the Docker Hub, and a rolling deployment to update the application on our servers.

That sounds great, but how does it work?

Our workflow is:

  1. Push master to Bitbucket
  2. Bitbucket notifies Drone
  3. Drone pulls master, runs unit tests, builds a Docker image, pushes the image to the Docker Hub and then uses SSH to do a rolling deployment to our servers.
  4. Drone notifies our Slack #deployments channel of the success/failure of each deployment.

All told, it takes ~10 minutes from the time we run git push until the new code is running on our servers. We are working on getting that time down, but in the grand scheme of things, ~10 minutes isn't bad.

Drone does 90% of the work, which is contained in the .drone.yml file for the project. We use Python 3.5, Django, Flask and PostgreSQL for the bulk of AlertBee. Here is an example of our main application's .drone.yml file:

compose:  
  database:
    image: postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=testing
build:  
  image: python:3.5
  environment:
    - PG_USER=postgres
    - PG_PASSWORD=testing
  commands:
    - pip install -r requirements.txt
    - python manage.py test
publish:  
  docker:
    username: $$DOCKER_USERNAME
    password: $$DOCKER_PASSWORD
    email: $$DOCKER_EMAIL
    repo: alertbee/dashboard
    tag:
      - production
      - "$$BUILD_NUMBER"
    file: Dockerfile
    insecure: false
    when:
      branch: master
deploy:  
  ssh:
    host:
      - $$DOCKER_HOST_1
      - $$DOCKER_HOST_2
    user: $$SSH_USER
    port: 22
    sleep: 5
    commands:
      - sh deploy-dashboard.sh
    when:
      branch: master
  ssh:
    host:
      - $$DOCKER_HOST_3
      - $$DOCKER_HOST_4
    user: $$SSH_USER
    port: 22
    sleep: 5
    commands:
      - sh deploy-worker.sh
    when:
      branch: master
notify:  
  slack:
    webhook_url: https://$$SLACK_WEBHOOK_URL
    channel: deployments
    template: >
      {{ repo.name }} {{ build.branch }} build #{{ build.number }} finished with a {{ build.status }} status

Let's break that down a little... First, we have a compose section:

compose:  
  database:
    image: postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=testing

This is where we put any dependencies that our build has. In our case, that is just a PostgreSQL database for the Django test suite to use. This could include Redis or anything else that your app needs.

Drone pulls down the containers specified here in compose so that they can be linked to the build container in the next step.
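Adding another dependency is just another entry in the same compose section. If, say, the test suite also needed Redis, it might look like this (service names are arbitrary):

```yaml
compose:
  database:
    image: postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=testing
  cache:
    image: redis
```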

Next up we have the build section:

build:  
  image: python:3.5
  environment:
    - PG_USER=postgres
    - PG_PASSWORD=testing
  commands:
    - pip install -r requirements.txt
    - python manage.py test

This is the step where we run our unit tests. We are using Python 3.5, so we tell Drone to pull down the Python 3.5 image from the Docker Hub. We set environment variables for the username and password to use to connect to the linked PostgreSQL database container from the above compose section.

The commands are anything required to set up your app and run your unit tests, which in our case is simply installing the requirements with pip and telling Django to run the tests.

Once all of our unit tests have passed we move on to the publish section:

publish:  
  docker:
    username: $$DOCKER_USERNAME
    password: $$DOCKER_PASSWORD
    email: $$DOCKER_EMAIL
    repo: alertbee/dashboard
    tag:
      - production
      - "$$BUILD_NUMBER"
    file: Dockerfile
    insecure: false
    when:
      branch: master

This section builds our Docker image and tags it with both production and the build number ("$$BUILD_NUMBER"). We tag with both so that we can easily roll back to a prior build if necessary.
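Rolling back is then just a matter of running the same docker commands against a numbered tag instead of production. A sketch of a hypothetical helper that prints those commands for a given build number (dry-run by design, mirroring the deploy script shown later; pipe the output to sh to actually execute it):

```shell
#!/bin/sh
# Hypothetical rollback helper: print the docker commands needed to put a
# previous Drone build number back on this host. Dry-run -- pipe to sh to run.
rollback_cmds() {
    build="$1"
    echo "docker pull alertbee/dashboard:${build}"
    echo "docker stop dashboard-01 && docker rm dashboard-01"
    echo "docker run --restart=always -d -p 80:80 --name=dashboard-01 -e CONSUL_IP=172.17.0.1 alertbee/dashboard:${build}"
}

rollback_cmds 42
```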

We have our credentials for the Docker Hub stored in our encrypted .drone.sec file so that we can reference them using variables.
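For reference, the plaintext side of that file is just YAML listing the variables to inject (the names match the $$ references above; the exact drone secure invocation varies by CLI version, so this reflects the 0.4-era workflow and is worth double-checking against drone secure --help):

```yaml
# .drone.sec.yml -- plaintext secrets, never committed; the drone CLI
# encrypts this into the .drone.sec file that does get committed.
environment:
  - DOCKER_USERNAME=our-hub-user
  - DOCKER_PASSWORD=our-hub-password
  - DOCKER_EMAIL=ops@example.com
```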

After our image is tagged and pushed to the Docker Hub we move on to the deploy section:

deploy:  
  ssh:
    host:
      - $$DOCKER_HOST_1
      - $$DOCKER_HOST_2
    user: $$SSH_USER
    port: 22
    sleep: 5
    commands:
      - sh deploy-dashboard.sh
    when:
      branch: master
  ssh:
    host:
      - $$DOCKER_HOST_3
      - $$DOCKER_HOST_4
    user: $$SSH_USER
    port: 22
    sleep: 5
    commands:
      - sh deploy-worker.sh
    when:
      branch: master

This is a somewhat simplified version of our deployment but it gets the point across. We are telling Drone to SSH into our servers one at a time and execute a shell script, waiting 5 seconds between servers so the load balancer has time to put each one back into rotation before the next is taken down.

In the example above, we have two servers that run the dashboard and two that run the worker. Both the dashboard and the worker use the same Docker image in our case, but we run the container with different commands.

The one slightly odd aspect of our deployments is the shell script that gets executed. We decided that, for now at least, the easiest way to handle pulling the new image onto each server and replacing the running container would be to put a shell script with the specific instructions on each host.

This results in a single shell script that not only runs a dashboard container but also executes any database migrations first. We are using Consul to store our secrets and service info (such as where the database is, where Redis is, etc), so the shell script is able to inject the appropriate Consul address into the container.

Here is an example of one of our bash scripts which runs database migrations before replacing the running container:

#!/bin/sh
# deploy-dashboard.sh -- pull the freshly published image, run database
# migrations, then replace the running container with the new one.
echo "Pulling Image..."
docker pull alertbee/dashboard:production
echo "Running Migrations..."
docker run --rm -e CONSUL_IP=172.17.0.1 alertbee/dashboard:production python manage.py migrate --noinput
echo "Stopping and removing dashboard container..."
docker stop dashboard-01 && docker rm dashboard-01
echo "Launching new dashboard..."
docker run --restart=always -d -p 80:80 --name=dashboard-01 -e CONSUL_IP=172.17.0.1 alertbee/dashboard:production

You might have noticed the various echo statements above. While Drone is executing the build, the output of the commands run over SSH is streamed back into the build console, so we are able to see exactly what is happening at each step of the deployment.

Conclusion

As you can see, we were able to create a pretty simple, automated and reliable PaaS environment using Drone and Docker. There are very few moving parts, so maintaining and scaling the server infrastructure is simple. The main aspect that limits scalability is having a custom deployment shell script on each host, but this is very minor.

Speaking of the deployment shell script, we are working on an enhancement to our deployment process that replaces both the per-host shell scripts and the Drone SSH step with a set of Ansible playbooks. The Ansible solution is more robust and allows servers to be updated in parallel. We will have another post about this soon.

Do you have microservices, cron jobs, queue processors or other background tasks? Why not head on over and sign up for free!