Increasing Software Delivery by 500%

Posted By: Nick Floyd

TL;DR (Too long; didn’t read)

We are deploying software faster, with the simple push of a button. This year alone we have deployed code to production 10 times versus 2 last year, which is five times as many deployments (500% of last year’s total).

Introduction

Here at Fellowship Technologies (part of Active Network) we have been working on ways to release our software faster and with higher quality than ever before. Many of us have read Continuous Delivery by Jez Humble and David Farley, and we used it as a guide for building a completely automated system for moving our code into production. I will discuss Continuous Delivery and what it means for the development team here at FT in another blog post; for now, the book provided the guidance and the frameworks that we used to fix our deployment problem.

This means that with the push of a button we can deliver software to customers. Since the button we push is in the browser and stupid-looking, think of it as a big red button on the wall that says “Deliver”. Push it and our code is on its way out the door.

Big Red Deliver Button

The problem

At Fellowship Technologies we had a problem. We could not get the code that we worked on in our sprints out fast enough. There were reasons for this:

     
  • Releases were expensive
    We ran multiple mock deployments, each of which required several people to get in a room and deploy code to our staging environment. We would miss configuration keys and find newly introduced bugs, and after fixing those we would run the whole exercise again to make sure we had not missed anything else.
  • We had a linchpin
    One person deployed all of our code manually. If he was unavailable, we could not release. This single point of failure came back to bite us more often than not.
  • Humans were too involved in the process
    This is delivery anti-pattern #1 in Jez and David’s book. We copied compiled code manually and made configuration changes manually, and we made many mistakes because it was easy to forget one of the 50 steps that a deployment required.

Stopping the bleeding

We needed to do something about all the time we were wasting on deployments, so we started automating our builds and deployments using technologies that are standard to our stack (WebDeploy, PowerShell, and MSBuild). We also needed something that would execute the scripts, let us control who can run them, and run under an admin account that has the rights to deploy our software. We chose Jenkins for the job because we were already using it for our Continuous Integration needs.

Configuration management

Once we had an automated system in place to build and deploy our apps, we needed a way to manage their configuration. We decided to use YAML files as the source from which we generate the configuration files that we deploy. This way we can see all of our configuration values in one file, update them accordingly, and run static analysis on them (for example, making sure certain configuration values are present when we deploy to production). Here is an example of how our YAML looks:

Config.Key:
  local: "local value"
  dev: "dev value"
  qa: "qa value"
  staging: "staging value"
  prod: "PRODUCTION value"

Which, based on environment, gets translated to the following XML (for dev):

<appSettings>
  <add key="Config.Key" value="dev value" />
</appSettings>
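
The translation and the static check described above can be sketched roughly as follows. This is a minimal illustration in Python, not our actual tooling (which is built on PowerShell and MSBuild). The `CONFIG` dictionary stands in for the parsed YAML, and the function names `missing_keys` and `render_app_settings` are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory form of the YAML above, i.e. what a YAML parser
# would hand back: one entry per config key, one value per environment.
CONFIG = {
    "Config.Key": {
        "local": "local value",
        "dev": "dev value",
        "qa": "qa value",
        "staging": "staging value",
        "prod": "PRODUCTION value",
    },
}


def missing_keys(config, environment):
    """Return the config keys that have no value for the given environment."""
    return sorted(key for key, values in config.items()
                  if not values.get(environment))


def render_app_settings(config, environment):
    """Build the <appSettings> XML for one environment.

    Refuses to render if any key lacks a value for that environment,
    which is the kind of static check we want before a production deploy.
    """
    missing = missing_keys(config, environment)
    if missing:
        raise ValueError(f"no {environment} value for: {', '.join(missing)}")
    root = ET.Element("appSettings")
    for key in sorted(config):
        ET.SubElement(root, "add", key=key, value=config[key][environment])
    ET.indent(root, space="  ")  # pretty-printing; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_app_settings(CONFIG, "dev"))
```

Failing the render when a value is missing means a forgotten production key stops the deployment instead of surfacing later as a runtime error.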

Push it to the limit

Once we had facilities to deploy we just needed to install the dependencies on our servers and configure the firewalls to deploy to them. After this initial configuration we were able to deploy software at the push of a button.

Starting at zero

One glaring issue with push-button deployment is that we might change a file while a user is accessing it, so we needed a way to deploy with zero downtime. The way this works is that we signal the load balancer that a node is down without actually taking the whole site down. Once the load balancer sees the node as down, it stops sending traffic to it, and we can safely load new code because nobody is using that node. When we finish deploying the new code we flag the node as normal again, which re-introduces it into the server farm.
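
The node-by-node sequence can be sketched as a small loop. This is an illustrative Python sketch under stated assumptions, not our real deployment scripts: the `LoadBalancer` class is a hypothetical in-memory stand-in for whatever API a real load balancer exposes, and waiting for open connections to drain before deploying is an assumption added for the example.

```python
import time


class LoadBalancer:
    """Hypothetical stand-in for a real load balancer's control API."""

    def __init__(self, nodes):
        self.in_rotation = {node: True for node in nodes}
        self.connections = {node: 0 for node in nodes}

    def mark_down(self, node):
        # Tell the balancer to stop sending new traffic to this node.
        self.in_rotation[node] = False

    def mark_normal(self, node):
        # Put the node back into the server farm.
        self.in_rotation[node] = True

    def active_connections(self, node):
        return self.connections[node]


def rolling_deploy(balancer, nodes, deploy, poll_seconds=1.0, sleep=time.sleep):
    """Deploy to one node at a time so the rest of the farm keeps serving."""
    for node in nodes:
        balancer.mark_down(node)
        # Assumption: wait until requests already in flight have finished.
        while balancer.active_connections(node) > 0:
            sleep(poll_seconds)
        # If deploy() raises, the node is never marked normal again,
        # so a half-deployed node stays out of rotation.
        deploy(node)
        balancer.mark_normal(node)


if __name__ == "__main__":
    farm = ["web1", "web2"]
    lb = LoadBalancer(farm)
    rolling_deploy(lb, farm, deploy=lambda node: print(f"deploying to {node}"))
```

Because only one node is out of rotation at any moment, the remaining nodes continue to serve traffic for the whole deployment.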

Conclusion

Now that we can deploy at the push of a button, we can get new features and fixes out the door at a much faster rate and with less impact, because we are deploying smaller changes more often. It is worth noting that we could not have achieved automated deployments without the help of our Technology Operations team. Without their knowledge we could not have gotten this off the ground. And without their faith that we would not totally mess up production, and their work alongside us to make sure that we didn’t, we could not have built an automated build and deployment system.

“If it hurts: do it more often.” - Martin Fowler

 

Posted In: News

Comments:
david said: on May 31, 2011 at 03:40 AM

sounds too good to be true, when it comes to writing software, the time needed to not only write the code but to test, test and test again can really slow down the whole process. Thank you for sharing your learning.
