I was asked yesterday what measurements a team could start to take to track their progress towards continuous delivery. Here are some initial thoughts.
Lead time per work item to production
Lead time starts the moment we have enough information that we could start the work (ie it’s “ready”). Most teams that measure lead time will stop the clock when that item reaches the teams definition of “done” which may or may not mean that the work is in production. In this case, we want to explicitly keep tracking the time until it really is in production.
Note that when we’re talking about continuous delivery, we make the distinction between deploy and release. Deploy is when we’ve pushed it to the production environment and release is when we turn it on. This measurement stops at the end of deploy.
Cycle time to “done”
If the lead time above is excessively long then we might want to track just cycle time. Cycle time starts when we begin working on the item and stops when we reach “done”.
When teams are first starting their journey to continuous delivery, lead times to production are often measured in months and it can be hard to get sufficient feedback with cycles that long. Measuring cycle time to “done” can be a good intermediate measurement while we work on reducing lead time to production.
If a bug is discovered after the team said the work was done then we want to track that. Prior to hitting “done”, it’s not really a bug – it’s just unfinished work.
Shipping buggy code is bad and this should be obvious. Continuously delivering buggy code is worse. Let’s get the code in good shape before we start pushing deploys out regularly.
Defect fix times
How old is the oldest reported bug? I’ve seen teams that had bug lists that went on for pages and where the oldest were measured in years. Really successful teams fix bugs as fast as they appear.
Total regression test time
Track the total time it takes to do a full regression test. This includes both manual and automated tests. Teams that have primarily manual tests will measure this in weeks or months. Teams that have primarily automated tests will measure this in minutes or hours.
This is important because we would like to do a full regression test prior to any production deploy. Not doing that regression test introduces risk to the deployment. We can’t turn on continuous delivery if the risk is too high.
Time the build can be broken
How long can your continuous integration build be broken before it’s fixed? We all make mistakes. Sometimes something gets checked in that breaks the build. The question is how important is it to the team to get that build fixed? Does the team drop everything else to get it fixed or do they let it stay broken for days at a time?
Continuous delivery isn’t possible with a broken build.
Number of branches in version control
By the time you’ll be ready to turn on continuous delivery, you’ll only have one branch. Measuring how many you have now and tracking that over time will give you some indication of where you stand.
If your code isn’t in version control at all then stop taking measurements and just fix that one right now. I’m aware of teams in 2015 that still aren’t using version control and you’ll never get to continuous delivery that way.
Production outages during deployment
If your production deployments require taking the system offline then measure how much time it’s offline. If you achieve zero-downtime deploys then stop measuring this one. Some applications such as batch processes may never require zero-downtime deploys. Interactive applications like webapps absolutely do.
I don’t suggest starting with everything at once. Pick one or two measurements and start there.