In the seminal book Accelerate, Forsgren and her co-authors provide empirical evidence that Continuous Delivery has a positive impact on the performance of software development organisations. If organisations neglect some of the principles and practices of Continuous Delivery, their performance will suffer. They will reach the point where simple changes will take ages to implement. Not so with Continuous Delivery.
Continuous Delivery in a Nutshell
Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
Agile Manifesto, 2001
The Agile Manifesto makes Continuous Delivery the highest priority for agile practitioners.
Continuous Delivery encompasses all aspects of software development and is a holistic approach, which I define as: “going from idea to valuable, working software in the hands of users.”
[Farley21, p. 4]
In his definition, Farley emphasises that every step of the software development process is part of Continuous Delivery. This implies that all the people performing one of these steps are part of Continuous Delivery as well. The whole socio-technical system for releasing a product – that is, the whole organisation – is involved in Continuous Delivery: Everyone is responsible.
Doing What Works to Build Better Software Faster
[Farley22, cover page]
The subtitle of Farley’s book [Farley22] is the most-to-the-point definition of Continuous Delivery. “Doing what works” reminds us to be pragmatic, to try out new practices and to improve old practices. It’s the agile and lean mantra of inspect-and-adapt: Improve continuously.
If we optimise software development for speed (“faster”), quality will suffer. If we optimise for quality (“better”), speed will suffer. Hence, Continuous Delivery requires us to optimise for both speed and quality and to find the right balance between them. “Better” and “faster” rely most heavily on these three principles: Work in small steps, Build quality in, and Automate repetitive tasks.
Continuous Delivery is achieved by working so that our software is always in a releasable state.
[Farley21, p. 5]
We can release our software to our customer with very high confidence of not breaking anything at any time. We don’t have to wait until the end of the Sprint or for the big-bang integration from colleagues. We can only achieve this state by working in many small steps and integrating our changes multiple times per day. Our new definition of “done” reflects this [Farley21, p. 6]: “The change is complete when it is delivered into the hands of its users.”
Continuous Delivery is all about getting fast and frequent feedback for all our activities. That’s why it works. The earlier we get feedback, the quicker we can fix problems and the less rework we will have. We get feedback from unit, acceptance and integration tests, from builds, from static code analysis, from customers, from users, from architecture models, from specifications, and from other sources.
Key Principles of Continuous Delivery
I took the five key principles of Continuous Delivery from [Forsgren18, p. 42-43] and added my own explanations and examples. [Forsgren18] provides empirical evidence that the application of the Continuous Delivery principles and practices has a positive impact on the performance of software development organisations. This is less than a causal relationship but it’s a lot more than correlation or anecdotal evidence. Forsgren explains the scientific research behind Continuous Delivery in Part Two of her book [Forsgren18].
Work in Small Steps
We need to reduce the scope of each change and make change in smaller steps; in general, the smaller the better. This allows us to try out our techniques, ideas, and technology more frequently.
If we work iteratively in small steps, the cost of any single step going wrong is inevitably lower; therefore, the level of this risk is reduced.
[Farley22, p. 54 and p. 51]
Test-Driven Development (TDD) epitomises this approach. We have a very clear goal: making the red test green. We proceed in micro-steps to make the test pass. If our code change fails the test, we throw away 1-3 lines of code and 5-10 minutes of work. That doesn’t hurt. Quite the contrary: We know what doesn’t work and can try out something different. We inspect and adapt the code in extremely short and rapid feedback cycles.
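As a minimal sketch of one such micro-step, assume a hypothetical function `clamp_speed` that limits a requested speed to a maximum value. Neither the function nor the limit comes from a real project; they only illustrate how small a red-green step can be. We write the tests first, watch them fail, and then add the one line that makes them pass.

```python
import unittest

MAX_SPEED_KMH = 25  # hypothetical limit, chosen only for illustration


def clamp_speed(requested_kmh):
    """Return the requested speed, limited to the range 0..MAX_SPEED_KMH."""
    # Red: the tests below were written first and failed without this function.
    # Green: this single line is the smallest change that makes them pass.
    return max(0, min(requested_kmh, MAX_SPEED_KMH))


class ClampSpeedTest(unittest.TestCase):
    def test_speed_within_range_is_unchanged(self):
        self.assertEqual(clamp_speed(10), 10)

    def test_speed_above_maximum_is_limited(self):
        self.assertEqual(clamp_speed(40), MAX_SPEED_KMH)

    def test_negative_speed_is_raised_to_zero(self):
        self.assertEqual(clamp_speed(-5), 0)
```

Saved as a file, the tests can be run with `python -m unittest` followed by the file name. If the one-line implementation turns out to be wrong, we throw away that line and try again.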
Once the many small code changes provide value to the customer, we request their integration into the main product branch. The request triggers the execution of the Continuous Delivery (CD) pipeline. The CD pipeline runs all the unit tests and builds the product on a build server. If the pipeline succeeds, it automatically integrates the code changes into the main branch. Otherwise, it rejects the integration request. [Farley21, p. 49ff] calls this the Commit Stage of the CD pipeline.
It is important that the pipeline runs in less than 10 minutes. The faster the better. If we learn within 3 minutes that we broke something, we will fix it right away. If we learn about the problem only a couple of days or weeks later, the fix will take significantly longer. Typically, we integrate multiple times per day.
The CD pipeline has a second stage, the Acceptance Stage [Farley21, p. 63ff], which runs in less than 1 hour. This stage runs all the acceptance tests, most integration tests and some system tests. If it can run all integration and system tests, even better. If not, the remaining tests run in the System Stage with a cycle time of 2-4 hours. Both stages install the binaries from the Commit Stage on the embedded device and run the tests on the device. The CD pipeline is geared towards getting feedback many times per day at the code, user-story, component and system levels.
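The staged structure described above can be sketched in a few lines. This is not how any particular CI server works. The `Stage` class, the `run_pipeline` function and the stub checks are hypothetical names for illustration. Only the stage names and the time budgets (10 minutes, 1 hour, and the upper bound of 4 hours) come from the text.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    budget_minutes: int      # time budget for the stage, taken from the text
    run: Callable[[], bool]  # returns True when all checks of the stage pass


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first failing stage."""
    report = []
    for stage in stages:
        started = time.monotonic()
        passed = stage.run()
        elapsed_minutes = (time.monotonic() - started) / 60
        if not passed:
            report.append(f"{stage.name}: failed, later stages skipped")
            return False, report
        if elapsed_minutes > stage.budget_minutes:
            report.append(f"{stage.name}: passed, but exceeded its time budget")
        else:
            report.append(f"{stage.name}: passed")
    return True, report


# Stub checks stand in for the real test runs (assumption for illustration).
stages = [
    Stage("Commit Stage", 10, lambda: True),
    Stage("Acceptance Stage", 60, lambda: False),  # simulate a failing test
    Stage("System Stage", 240, lambda: True),
]
succeeded, report = run_pipeline(stages)
for line in report:
    print(line)
```

Stopping at the first failure keeps the feedback fast: a change that fails the Acceptance Stage never occupies the device for the much longer System Stage.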
Avoiding overly long user stories helps us to work in small steps. On one project, I noticed a strange accumulation of 5-point stories. Although 5-point stories should not take longer than a week, they took 2-3 weeks on average. On another project, the team could not finish nearly half of the stories in a 3-week Sprint. The stories were far too big to come up with a reliable estimate. The remedy is to break up the long stories into smaller stories that we can finish in less than 2 days with near certainty. We then finish user stories, and can get feedback from internal and external users, multiple times per week.
The next longer timebox is the Sprint with a typical duration of 1-3 weeks. Sprints are pretty big steps with diminishing feedback value. We could gradually reduce the Sprint duration from 3 weeks to 1 week and finally abandon Sprints altogether. As our code is always releasable (not just at the end of a Sprint) and finishing user stories doesn’t take longer than 2 days, we can release a constant flow of user stories to our customers.
We can correct small steps considerably faster than long steps. Small steps motivate us to try out new things (solutions, technologies, processes, etc.), as our inevitable mistakes are cheap and give us a hint about what to try next to reach our goal. Working in small steps cuts through complexity by reducing the number of things we must juggle in our heads in parallel. It drastically reduces the risk of failing.
[Working in small steps] reaps enormous rewards by allowing us to avoid work that delivers zero or negative value for our organizations. A key goal of continuous delivery is changing the economics of the software delivery process so the cost of pushing out individual changes is very low.
[Forsgren18, p. 42]
Build Quality In
Early in my career, I learnt an important lesson from a colleague: “You can’t test quality into software.” W. Edwards Deming, the Father of Quality, confirms this.
Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place.
Why? Inspection does not improve quality.
Why not? The quality is already there. Inspection raises the cost. Necessity for inspection only indicates that somebody doesn’t know how to do the job.
[Deming13, p. 113-114] Originally from Deming’s book “Out of the Crisis” (1986)
Building quality in starts with applying TDD relentlessly. TDD guarantees that code is as simple as possible, always tested and easily testable. Writing acceptance, integration and system tests becomes so much simpler, as we can control the execution paths through the code and observe the results.
We run the unit tests for the component we are changing every couple of minutes. Once we integrate our change into the mainline, the CD pipeline kicks in. It automatically runs all unit tests for all components in the Commit Stage and tells us within 10 minutes if we broke anything. Within an hour, the Acceptance Stage of the CD pipeline checks all the acceptance tests on the embedded device. This stage ensures that we have understood the user stories correctly. Within 2-4 hours, the System Stage verifies that all the integration and system tests pass.
The developers write the unit tests and a good share of the acceptance and integration tests. The QA team writes the system tests and the rest of the acceptance and integration tests – of course, with help from the developers. Taking care of quality is a joint effort by developers, QA people (including internal users), product owners and management. This is the principle Everyone is Responsible in action.
The CD pipeline gives us automated feedback dozens of times per day. We get very early and frequent warnings about quality issues. We can locate and fix these issues quickly, while the code changes are still fresh in our mind and restricted to the last 1-3 commits.
Compare that to methods like code reviews, test-after development and debug-later programming. These methods find problems only days, weeks or months later – if they find them at all. Locating and fixing the bugs takes a long time, because hundreds if not thousands of changes have happened in the meantime.
When we increase the quality of our software, we will soon move faster. Continuous Delivery helps us to find the right balance between quality and speed with its fast and frequent feedback.
Automate Repetitive Tasks
Here are three examples of repetitive tasks that beg for automation.
- Building an application, deploying and running it on an embedded device takes 40+ manual steps. A better solution: When a developer hits Run in the IDE, the IDE performs these 40+ steps automatically. Then, developers will run the application many times per day instead of once a month.
- Developers are responsible for running all unit tests before they push their changes to the main branch of the repository. Some developers “forget” to do this more often than not. Other developers run into failing tests when they update their code and must spend time fixing the problems. A better solution: We set up a CD pipeline that runs its Commit Stage whenever developers integrate their changes. The pipeline rejects changes when tests fail.
- Code reviewers spend most of their time pointing out formatting problems, violations of coding guidelines and insufficient test coverage. A better solution: The CD pipeline runs analysis tools to point out the issues automatically.
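To illustrate the third example, here is a minimal sketch of an automated formatting check. The three rules and the function name `check_formatting` are assumptions made up for this example; a real pipeline would more likely call established formatting and static-analysis tools than a hand-written script.

```python
from typing import List

MAX_LINE_LENGTH = 100  # hypothetical team guideline


def check_formatting(source: str, filename: str = "<input>") -> List[str]:
    """Return one finding per violated rule and line; an empty list means OK."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            findings.append(
                f"{filename}:{number}: line longer than {MAX_LINE_LENGTH} characters"
            )
        if "\t" in line:
            findings.append(f"{filename}:{number}: tab character")
        if line != line.rstrip():
            findings.append(f"{filename}:{number}: trailing whitespace")
    return findings


# Line 2 ends with a space, line 3 starts with a tab.
example = "int a = 1;\nint b = 2; \n\tint c = 3;\n"
for finding in check_formatting(example, "demo.c"):
    print(finding)
```

When the pipeline reports such findings before a human looks at the change, reviewers can concentrate on design and behaviour.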
Automating tedious tasks “frees up people for higher-value problem-solving work, such as improving the design of our systems and processes in response to feedback” [Forsgren18, p. 43]. By improving the efficiency of our processes, we gain more time to do the right things, to solve the hard problems and to innovate our processes and products. Automation will also increase quality, because computers are much faster and much less error-prone than humans when it comes to executing repetitive tasks.
Improve Continuously
Relentlessly pursue continuous improvement. The most important characteristic of high-performing teams is that they are never satisfied: they always strive to get better. High performers make improvement part of everybody’s daily work.
[Forsgren18, p. 43]
Moving in many small steps and getting fast and frequent feedback is the cornerstone for continuous improvement. Every small step is an experiment. We inspect the result of the experiment and adapt our next step accordingly. This is an application of the scientific method, which is “an iterative, cyclical process through which information is continually revised”. Sounds familiar, doesn’t it?
We improve continuously in everything we do. For example:
- We reduce the build time of our applications from 50 minutes to 20 minutes and then to 5 minutes.
- We turn system tests into acceptance and unit tests and speed them up from hours to seconds and milliseconds.
- We find out what new things customers can do or how much time they save because of new features.
- We cut the bureaucratic overhead of Scrum and SAFe.
- We align the team structure with the loosely coupled architecture of our software.
- We experiment with new technologies and new programming language features.
- We learn how to give constructive feedback in a more engaging way (e.g., in code reviews).
- And of course, we apply Continuous Delivery diligently in our daily work.
Continuous improvement requires us to fail often. As our experiments take only a little time, the failures have few consequences. Still, failing fast and often requires psychological safety. If colleagues and managers make a big fuss about every little mistake, people will rarely improve. Neither will people who lack curiosity.
Everyone is Responsible
Good operations are essential, yet they do not ensure quality. Quality is made in the boardroom.
[Deming13, p. 42] Originally from Deming’s essay “The Need to Change” (1989)
Everyone in a company is responsible for quality. If executives and managers want features in the customers’ hands as quickly as possible, quality will suffer. If developers write tests reluctantly as an afterthought, quality will suffer. If testers don’t have the I-want-to-break-this-product mindset, quality will suffer.
The same goes for Continuous Delivery as a whole. If some of the executives, managers or developers are not on board, the organisation will not reap the full benefits. Delivering better software faster into the hands of customers only works when everyone in an organisation collaborates. It requires a holistic approach.
Here is a counterexample for this principle. A company builds two products, which share 90% of the code. One product is developed by 35 developers in Germany, the other by 10 developers in France. The German teams have written most of the common code. The French teams lack the expertise for some crucial parts of the code. They depend heavily on their German colleagues.
The developers’ bonuses depend on the performance of their teams. Hence, the German teams can get their full bonuses, even if the French teams perform badly. The company’s bonus system puts local optimisation over global optimisation. If the company measured the performance globally, the Germans would have an incentive to help the French. Everyone would be responsible for a better company result.
The bonus system is not the only problem. Another problem is the team structure. Both the German and French feature teams need the work of the database developers. However, these developers are assigned to one of the German feature teams. That’s still OK for the other two German teams, but it’s a blocking dependency for the French teams. We could put the database developers into a platform team that serves all the German and French feature teams. The platform team would be responsible for the database problems of all feature teams.
Benefits of Continuous Delivery
The red curve shows the cost of change over time for traditional projects. The teams make quick progress in the beginning, as their code changes depend very little on other code and as the consequences of changes are easy to assess. After a couple of months, the teams notice a slowdown. They spend more time fixing bugs not just in their own code but also in distant code that mysteriously depends on their change.
Unfortunately, many teams just continue or choose the wrong remedies like more coordination, more planning and long-lived feature branches. The costs of change explode. Exponential growth kicks in. Simple changes that should take hours now take days or weeks. The teams spend most of their time fixing bugs instead of implementing new features. The deadlines for product launches slip by months. Management gets furious, teams get frustrated. The best choices seem to be to reimplement large parts of the software, to write a huge number of tests, or both. Both choices are extremely costly.
Most of us have been in this situation – unfortunately. But we don’t have to. We could also experience the green cost curve by applying the principles and practices of Continuous Delivery diligently. At the beginning of a project, we would move a bit slower than traditional teams, because TDD, continuous integration and automation of repetitive tasks come with a little overhead. Our initial investment will soon pay off. The green curve undercuts the red curve, typically when traditional teams slow down noticeably. Thanks to Continuous Delivery, the cost will stay nearly constant. The cost gap to traditional development widens very quickly.
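The shape of the two curves can be reproduced with a toy model. All numbers here are invented for illustration and are not measurements: the traditional cost per change is assumed to grow by 10% per month, and Continuous Delivery is assumed to add a fixed 30% overhead to an otherwise constant cost. The function names are hypothetical as well.

```python
def traditional_cost(month, base_cost=1.0, monthly_growth=1.10):
    """Cost of one change in a traditional project: exponential growth (assumed)."""
    return base_cost * monthly_growth ** month


def continuous_delivery_cost(month, base_cost=1.0, overhead=0.30):
    """Cost of one change with Continuous Delivery: constant plus overhead (assumed)."""
    return base_cost * (1.0 + overhead)


def first_month_cd_is_cheaper(max_months=60):
    """Return the first month in which the constant curve undercuts the exponential one."""
    for month in range(max_months + 1):
        if continuous_delivery_cost(month) < traditional_cost(month):
            return month
    return None


for month in (0, 3, 12, 24):
    print(month, round(traditional_cost(month), 2), continuous_delivery_cost(month))
```

With these made-up parameters, the constant curve starts above the exponential one and undercuts it after a few months; from then on, the gap widens every month. Different parameters shift the crossover point but not the overall shape.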
The main drivers for keeping the costs nearly flat are working in many small steps and building quality in. They allow us to move faster and to increase the quality at the same time. We don’t have to trade one for the other as in traditional projects. Still, we should not underestimate the other three principles: automate repetitive tasks, improve continuously and everyone is responsible. Continuous Delivery gains its power from the positive feedback loops between its five principles. Continuous Delivery will allow us to “build better software faster” [Farley22] only if we improve on all five principles at the same time.
[5:14] A system is not the sum of the behaviour of its parts, but it’s the product of its interactions.
[5:27] If we have a system of improvement that’s directed at improving the parts taken separately, you can be absolutely sure that the performance of the whole will not be improved.
Russell L. Ackoff, If Russ Ackoff had given a TED Talk (video), 1994.
Resources
[Deming13] W. Edwards Deming (edited by Joyce Nilsson Orsini). The Essential Deming: Leadership Principles from the Father of Quality. McGraw Hill, 2013.
[Farley21] Dave Farley. Continuous Delivery Pipelines: How to Build Better Software Faster. 2021.
[Farley22] Dave Farley. Modern Software Engineering: Doing What Works to Build Better Software Faster. Pearson Education, 2022.
[Forsgren18] Nicole Forsgren, Jez Humble and Gene Kim. Accelerate – The Science behind DevOps: Building and Scaling High Performing Technology Organizations. IT Revolution, 2018.
Excellent article, thank you.
If you don’t mind, I will use it to advocate for TDD in my company for the new projects. A lot of people come from traditional SW development projects and need some convincing.
One note: we had one project set up with full TDD. CI was set up so that it ran the full build and all the tests on any branch that a developer pushed to the main repository. It would not reject the change on the branch if the run failed. However, the CI was integrated with the change review system. It would display the results of the CI runs and would reject integration into the main branch if a run failed. In this case, the results are visible to the developer and the reviewers, and all the code is nicely backed up in the cloud.
Hi Iliya,
Please feel free to use this article and my other articles about TDD (https://embeddeduse.com/category/tdd/) to advocate for TDD.
The process you describe looks as follows to me:
This is a pretty common process. I see some levers for getting feedback even faster and more often. I’d make sure that the feature branches are very short-lived: less than 1 day is ideal, less than 2 days is OK. Only one developer or one pair of developers (in case of pair programming) should work on each feature branch. There shouldn’t be any branches from the feature branches. The reason for these guidelines is to minimise the changes for integration. The bigger the changes, the more difficult and time-consuming integration is.
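If you want to keep an eye on branch lifetimes, a small script could classify branches by age. The following is only a sketch: the branch names, the timestamps and the function `classify_branch_age` are made up, and in practice the creation times would have to come from your version control system. Only the two thresholds are taken from the guideline above.

```python
from datetime import datetime, timedelta

IDEAL_MAX_AGE = timedelta(days=1)  # "less than 1 day is ideal"
OK_MAX_AGE = timedelta(days=2)     # "less than 2 days is OK"


def classify_branch_age(created_at, now):
    """Classify a feature branch by how long it has existed."""
    age = now - created_at
    if age < IDEAL_MAX_AGE:
        return "ideal"
    if age < OK_MAX_AGE:
        return "ok"
    return "too long-lived"


# Hypothetical branches with hypothetical creation times.
now = datetime(2024, 1, 10, 12, 0)
branches = {
    "feature/a": datetime(2024, 1, 10, 8, 0),  # 4 hours old
    "feature/b": datetime(2024, 1, 9, 9, 0),   # 27 hours old
    "feature/c": datetime(2024, 1, 5, 9, 0),   # more than 5 days old
}
branch_report = {
    name: classify_branch_age(created_at, now)
    for name, created_at in branches.items()
}
for name, verdict in branch_report.items():
    print(name, verdict)
```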
Code reviews as part of pull requests delay integration significantly. The reviewers often have no time, as they develop themselves. Reviews tend to waste time on the less important things (e.g., code formatting, coding guidelines). The result is feedback that comes later and less often. You can counter this by pair programming, which is a real-time review while developing. I wouldn’t pair-program all the time, but pair up when requirements are unclear or I get stuck.
In general, using short-lived feature branches and running unit tests on them before integrating them automatically is perfectly OK. Just make sure that you don’t delay feedback.
Cheers,
Burkhard