How I do blue-green deployment

This is an article outlining how I deploy my webapp. I run my servers on DigitalOcean, but this is incidental: the approach should work equally well with other providers.

Let's get 2 caveats out of the way first:

My CRUD application runs on one application server. If you're using a more complex setup, then what I describe here will need to be ... adjusted.
Like many others out there, I'm learning this stuff as I go along. Please be gentle.

Why I chose blue-green deployment

Before I get into the details, let's quickly look at my situation before I switched to a blue-green deployment:

I had one application server running on DigitalOcean, plus a hosted Postgres database.
To deploy, I used a script that SSHed into that server and did a git pull.

This was fine to begin with. However, there were several issues:

My setup (using Python Flask) compiles and minifies CSS and JavaScript on the server. This can take up to 10 seconds. As a result, if I changed my CSS or JavaScript, the server would take time to respond after deployment and some users ran into Bad Gateway errors 💥
If there was a bug in production, it could be fixed by checking out the previous commit. However, this invariably took too long and always involved frenzied googling of the correct git commands.
There was no way of testing the production setup, other than in production.

Switching to blue-green deployments fixed all of these issues.

What is blue-green deployment?

Here's my definition of a blue-green environment:

There are two identical and independent servers hosting the application. One is called green, the other blue.
There is a shared production database that both servers can access.
There is a quick and painless way of routing traffic to the green or the blue server.

One of the 2 servers is always serving production traffic, the other is idle. Let's say green is serving production traffic, and blue is idle. When a new release is ready, it gets deployed to the idle blue server. Here it can be tested and issues fixed. Remember, the blue server is accessing the production database, so the application can be tested with real data.

Once you're satisfied that you're ready to go, you switch traffic from the green (live) server to the blue server. If any problems occur, you can simply switch back to the green server within seconds, effectively doing a roll-back.

Simple, eh?

Basic components of my setup

For my blue-green setup I did the following things:

I cloned the application server. On DigitalOcean this is super simple: you can create a snapshot (even of a running machine) and create a new machine from that snapshot. An even more elegant way to do this would be to use Docker... but I haven't watched enough YouTube tutorials to do that yet.
I set up a way to switch traffic from one server to the other. I use a floating IP from DigitalOcean. Floating IPs are publicly accessible static IP addresses that you can assign to a server and instantly remap to another server in the same datacenter. My domain (keepthescore.co) resolves to this static IP address.
I set up a way to determine whether the blue or the green server is currently live.
I created a deployment script that always deploys to the idle server.

Let's dive in a little more:

Setting up the servers

Once I'd cloned the application server, I gave the two servers different hostnames: blue-production and green-production. To do this on Ubuntu you have to do 2 things on each server (the examples here are for the green server):

Run this command: sudo hostnamectl set-hostname green-production
Edit the hosts file with sudo vim /etc/hosts and add an entry for green-production

Then I ensured that my app can expose the hostname of the server it's currently running on. On Flask you can create a route like this:

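import socket

from flask import Flask

app = Flask(__name__)  # or reuse your existing app object

@app.route('/hostname')
def server_info():
    # Report the hostname of the machine serving this request
    host_name = socket.gethostname()
    return host_name + '\n'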

Now it's possible for a human or a machine (using curl) to discover which server is currently live. I simply call https://keepthescore.co/hostname. Give it a try by clicking on the link!
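
The same check can be scripted. Here is a minimal Python sketch, assuming the /hostname route returns exactly one of the two hostnames. The function names idle_server and fetch_live_hostname are made up for illustration.

```python
from urllib.request import urlopen

# The two hostnames used in this article
SERVERS = ("blue-production", "green-production")


def idle_server(live):
    """Return the server that is NOT live; raise if the name is unknown."""
    if live not in SERVERS:
        raise ValueError(f"unexpected hostname: {live!r}")
    return SERVERS[1] if live == SERVERS[0] else SERVERS[0]


def fetch_live_hostname(url="https://keepthescore.co/hostname"):
    """Ask the live site which server answered (this makes a network call)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode().strip()


# Example usage (hits the network, so it is left commented out):
# live = fetch_live_hostname()
# print("live:", live, "idle:", idle_server(live))
```

Keeping the blue/green lookup separate from the network call means the lookup can be tested without touching production.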

One final thing I needed to do was to add the public IP addresses for blue-production and green-production to the local hosts file of my development machine(s).
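
A quick way to confirm the hosts file entries work is to check that both names resolve from the development machine. This is only a sketch: the helper name resolve_all is hypothetical, and the resolver is passed in as a parameter so the function can be tried out without real hosts entries.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its resolved IP address, or None if lookup fails."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


# Example usage on the development machine:
# print(resolve_all(["blue-production", "green-production"]))
```

A None value for either name would suggest the matching hosts file entry is missing or misspelled.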

Deployment 🚀

The deployment script can now use this information to deploy the new version of the software to the idle server. Here's my deployment script:

#!/usr/bin/env bash

# Get the current production server and 
# set TARGET to the other server 
CURRENT=$(curl -s https://keepthescore.co/hostname)
if [ "$CURRENT" = "blue-production" ]; then
  TARGET="green-production"
elif [ "$CURRENT" = "green-production" ]; then
  TARGET="blue-production"
else
  echo "Something is not right! 😬"
  exit 1
fi

echo "Current deployment is $CURRENT"
echo "Deploying to $TARGET"

# Do deployment; only report success if the remote commands succeeded
ssh -q "root@$TARGET" "cd keepthescore && git pull" || exit 1
echo "Deploy to $TARGET complete"

I am now repeating myself, but the beauty of this script is that it always deploys to the idle server and never to the live production server. I can test the deployment from my development machine by simply entering blue-production or green-production into my browser, because I've added these hostnames and their IP addresses to my local hosts file.
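
The browser check could be complemented by a small automated smoke test. The sketch below assumes the idle server answers plain HTTP on its hostname and exposes the /hostname route; the scheme, the 30-second timeout and the function names are all assumptions. The generous timeout is there because the first request after a deploy may trigger the slow CSS and JavaScript build.

```python
from urllib.request import urlopen


def fetch_text(url):
    """Fetch a URL and return its body as stripped text (network call)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode().strip()


def smoke_check(target, fetch=fetch_text):
    """Return True if the server at `target` reports `target` as its own hostname."""
    try:
        return fetch(f"http://{target}/hostname") == target
    except OSError:  # urllib's URLError and HTTPError are subclasses of OSError
        return False


# Example usage after a deploy to blue-production:
# print("OK" if smoke_check("blue-production") else "FAILED")
```

A passing check only shows that the app starts and answers on the right machine; it does not replace clicking through the site.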

Once I'm sure that everything's working, I route traffic to the newly deployed idle server using DigitalOcean's web interface for the floating IP addresses.
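
The switch could probably be scripted too. To the best of my knowledge, DigitalOcean's API lets you reassign a floating IP by posting an "assign" action, but treat the endpoint path and payload below as assumptions to verify against the current API documentation. The function name, the droplet ID and the token are placeholders. The sketch only builds the request; sending it is left commented out.

```python
import json
from urllib.request import Request

API_BASE = "https://api.digitalocean.com/v2"  # assumed API base URL


def build_assign_request(floating_ip, droplet_id, token):
    """Build (but do not send) a request that assigns floating_ip to droplet_id."""
    body = json.dumps({"type": "assign", "droplet_id": droplet_id}).encode()
    return Request(
        f"{API_BASE}/floating_ips/{floating_ip}/actions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# To actually perform the switch (placeholder values):
# from urllib.request import urlopen
# urlopen(build_assign_request("203.0.113.5", 12345678, "my-api-token"))
```

Rolling back would be the same call with the other server's droplet ID.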

My users get routed to the newly deployed software without noticing (hopefully).

Voilà! ✨

What about the database?

The database is a sore point, because I don't have 2 instances of the database. Martin Fowler, who wrote a great article about blue-green deployments, wrote the following:

"Databases can often be a challenge with this technique, particularly when you need to change the schema to support a new version of the software. The trick is to separate the deployment of schema changes from application upgrades. So first apply a database refactoring to change the schema to support both the new and old version of the application, deploy that, check everything is working fine so you have a rollback point, then deploy the new version of the application. (And when the upgrade has bedded down remove the database support for the old version.)"

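To make the first step of that advice concrete, here is a small sketch of a schema change that both the old and the new application version can live with. It uses an in-memory SQLite database as a stand-in for the hosted Postgres database, and the boards table, the theme column and the two insert functions are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the shared production database
conn.execute("CREATE TABLE boards (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def old_app_insert(conn, name):
    """What the old release does: it knows nothing about the theme column."""
    conn.execute("INSERT INTO boards (name) VALUES (?)", (name,))


def new_app_insert(conn, name, theme):
    """What the new release does: it writes the theme column as well."""
    conn.execute("INSERT INTO boards (name, theme) VALUES (?, ?)", (name, theme))


# Step 1: deploy the schema change on its own. The new column has a default,
# so inserts from the old release keep working.
conn.execute("ALTER TABLE boards ADD COLUMN theme TEXT DEFAULT 'classic'")

# Step 2: old and new releases can now run against the same database.
old_app_insert(conn, "Friday quiz")
new_app_insert(conn, "Darts league", "dark")

rows = conn.execute("SELECT name, theme FROM boards ORDER BY id").fetchall()
print(rows)
```

Because the old release still works after the schema change, switching back to the other server remains a safe roll-back.
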
That's all

I'd love to get some feedback on my deployment strategy. Do you have questions? Am I over-engineering? Should I learn Docker? Let me know in the comments below.