“Complexity is anything related to the structure of a software system that makes it hard to understand and modify the system.”
John Ousterhout, The Philosophy of Software Design, Chapter 2
Complexity is also known as technical debt, which quickly grows into technical inflation. Strategic programmers understand that they must do everything to make programs easy to understand and easy to modify. They don’t regard working code as good enough – in contrast to tactical programmers.
John introduces deep modules as the main concept to rein in complexity. Deep modules “maximize the amount of complexity that is concealed”. As a deep module is typically used by many clients, it quickly pays off to make the lives of the client developers easy instead of the module developers. In John’s words: “It is more important for a module to have a simple interface than a simple implementation.”
The first nine chapters of John Ousterhout’s book A Philosophy of Software Design give strategic advice on how to tame or even weed out complexity. The remaining twelve chapters give tactical advice, such as how to write good comments, write comments first and choose good names. You’ll find my review of the first nine chapters below. I plan to add the other twelve chapters one by one over time.
Chapter 1 – Introduction (It’s All About Complexity)
“Writing computer software is one of the purest creative activities in the history of the human race. Programmers […] can create virtual worlds with behaviors that could never exist in the real world. […] All programming requires is a creative mind and the ability to organize your thoughts.”
Creativity and organisation are the yin and yang of software design. They are opposites that complement each other.
Good design keeps the complexity of software at a level such that we can extend the software with minimum effort. The book has two goals:
- It defines complexity, how to recognise it and what its consequences are.
- It presents techniques and design principles to minimise complexity.
Chapter 2 – The Nature of Complexity
“Complexity is anything related to the structure of a software system that makes it hard to understand and modify the system.”
The author lists three symptoms of complexity:
- Change amplification: A change leads to many other changes at different places in the code.
- Cognitive load: the amount of information we must keep in our heads to implement a user story.
- Unknown unknowns: It is unclear which piece of code we must change or extend to implement a user story.
Unknown unknowns are the worst form of complexity. We only find out about them when bugs start to appear (often with a delay) after a code change.
Complexity doesn’t appear in a single big bang, but it accrues in small chunks over time. The more common term for this kind of complexity is technical debt or even technical inflation.
Chapter 3 – Working Code Isn’t Enough (Strategic vs. Tactical Programming)
Tactical programmers try to get a feature or a bug fix working as fast as possible. They take shortcuts, don’t write tests, don’t refactor code to improve the design and hardly think about design. Every shortcut introduces a little more complexity. Short-term, tactical programmers go fast, especially when they work on a greenfield project. Mid-term, they will go slower and slower. Pretty soon they will hardly make any progress. At some point, other developers will have to reimplement their parts of the software.
Strategic programmers understand that “working code isn’t enough”. They focus on getting the design right such that future extensions become easy. Their designs also happen to work. They avoid technical debt by all means. Initially, they may be slower than tactical programmers. They will be considerably faster mid-term. Long-term, they avoid the huge cost of a reimplementation.
Chapter 4 – Modules Should Be Deep
Modular design decomposes software “into a collection of modules that are relatively independent”. Modules can be classes, subsystems or services. The goal of modular design is a loosely coupled collection of modules with minimal dependencies on each other. Separating the interface and the implementation of a module is the key to modular design. “The best modules are those whose interfaces are much simpler than their implementations.”
An interface provides “a simplified view [of a module], which omits unimportant details”. This simplified view is called an abstraction. “The key to designing abstractions is to understand what is important, and to look for designs that minimize the amount of [important] information.” A well-designed module is a deep module, which is the core concept of the book.
The author visualises modules as rectangles. The area of the rectangle represents the functionality of the module, the top edge the interface and the height the level of abstraction. A deep module has a small width and a large height. A shallow module has a large width and a small height. A good software design finds the right balance between cost (interface) and benefit (functionality).
“Module depth is a way of thinking about cost versus benefit. The benefit provided by a module is its functionality. The cost of a module (in terms of system complexity) is its interface. A module’s interface represents the complexity that the module imposes on the rest of the system: the smaller and simpler the interface, the less complexity […] it introduces. The best modules are those with the greatest benefit and the least cost. Interfaces are good, but more, or larger, interfaces are not necessarily better!”
Linux file I/O is a good example of a deep module. With the five basic functions open, close, read, write and lseek, we can perform almost all file I/O on Linux. This tiny interface hides a huge amount of implementation details: file representation on hard disks, SSDs, CDs and DVDs, mapping of hierarchical file paths to directories, access permissions, concurrent access, caching, and many more. Five functions hide tens of thousands of lines of implementation. Deep modules “maximize the amount of complexity that is concealed”.
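As a small illustration of how far this tiny interface goes, here is a sketch that round-trips a string through a file using only the basic calls. It assumes a Linux/POSIX environment; the `roundTrip` helper and the file path are made up for the example.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>

// Round-trips a string through a file using only the basic Linux
// file I/O calls: open, write, lseek, read and close.
std::string roundTrip(const std::string& text,
                      const char* path = "/tmp/deep_module_demo.txt") {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return "";
    (void)write(fd, text.data(), text.size());  // persist the bytes
    lseek(fd, 0, SEEK_SET);                     // rewind to the start
    std::string buffer(text.size(), '\0');
    (void)read(fd, &buffer[0], buffer.size());  // read them back
    close(fd);
    return buffer;
}
```

Whether the bytes end up on a hard disk, an SSD or in a cache is entirely hidden behind these five calls.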
A linked list is a good example of a shallow module. The interface is nearly as complex as the implementation. It conceals very little complexity.
Chapter 5 – Information Hiding (and Leakage)
As we saw in the previous chapter, deep modules like Linux file I/O do an excellent job of hiding information. Information hiding reduces complexity by simplifying the interface of a module and by making the evolution of the system easier. Interfaces must not expose any dependencies on hidden information.
Information leakage is the opposite of information hiding. It occurs when changing an interface (that is, when changing a design decision) implies changes in clients of this interface. The author gives three common reasons for information leakage.
The first reason is temporal decomposition. For example, a program first reads information from an XML file, then manipulates the information and finally writes the modified information back to the XML file. If we put the three tasks in three separate classes or in three public functions of the interface, the module leaks information to its clients. If we decide later that we store the information in a JSON file or a SQL database, or request it from a cloud server, clients are forced to change accordingly. We would also find out that the classes for reading and writing the XML files duplicate certain information.
A deep class would have a single public function to read information, manipulate it and write it. The constructor of the class would get all the information (e.g., a URL string) required to figure out from where to read the input information and where to write the output information. Clients of this class wouldn’t have any idea how and where the information is stored.
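A minimal sketch of such a deep class might look as follows. The class name `DocumentProcessor`, the file-based storage and the uppercase transformation are stand-ins invented for the example; the point is the single public entry point.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <algorithm>
#include <cctype>

// Hypothetical deep class: one public function hides where the data
// lives and in which format it is stored.
class DocumentProcessor {
public:
    // 'source' could name a file, a URL or a database connection --
    // clients never need to know which.
    explicit DocumentProcessor(std::string source) : source_(std::move(source)) {}

    // Reads the information, manipulates it and writes it back in one call.
    void process() {
        std::string data = readAll();
        std::transform(data.begin(), data.end(), data.begin(),
                       [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
        writeAll(data);
    }

private:
    // Only the implementation knows the storage is currently a file.
    // Switching to JSON, SQL or a cloud server touches no client code.
    std::string readAll() const {
        std::ifstream in(source_);
        std::ostringstream out;
        out << in.rdbuf();
        return out.str();
    }
    void writeAll(const std::string& data) const {
        std::ofstream out(source_, std::ios::trunc);
        out << data;
    }
    std::string source_;
};
```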
The second reason for information leakage is the provision of getters and setters, which make modules shallower. A pretty bad example would be a function
QMap<QString, qreal> getParameters() const
that returns the map from parameter names to values as it is stored in a member variable of the class. If we decide to use a data structure different to QMap, we force all clients to adapt to the change. We also allow clients to manipulate data outside the class owning the data (the smell of Feature Envy). This leads to tighter coupling and more complexity.
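A sketch of a less leaky alternative: expose individual values instead of the container. The class name `ParameterStore` is hypothetical, and `std::map`/`std::string` stand in for the Qt types (QMap/QString) used above.

```cpp
#include <map>
#include <string>

// Hypothetical parameter store that hides its internal data structure.
class ParameterStore {
public:
    void setParameter(const std::string& name, double value) {
        parameters_[name] = value;
    }

    // Exposes one value, not the container: clients cannot depend on
    // the internal data structure, so it can change freely later.
    double parameter(const std::string& name, double fallback = 0.0) const {
        auto it = parameters_.find(name);
        return it != parameters_.end() ? it->second : fallback;
    }

private:
    std::map<std::string, double> parameters_;  // free to become any structure
};
```

Replacing the internal map with, say, a sorted vector would now be invisible to every client.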
The third reason for information leakage is the omission of default values for function parameters. Missing defaults force clients to provide information that is not needed most of the time. They make common cases more difficult than necessary.
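A tiny sketch of the idea, with an invented `connectionString` function: defaults cover the common case, so most call sites pass only what actually varies.

```cpp
#include <string>

// Hypothetical helper: sensible defaults make the common case trivial,
// while unusual cases can still override every parameter.
std::string connectionString(const std::string& host,
                             int port = 443,
                             bool useTls = true) {
    return (useTls ? "https://" : "http://") + host + ":" + std::to_string(port);
}

// Common case:  connectionString("example.org")
// Special case: connectionString("example.org", 8080, false)
```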
Chapter 6 – General-Purpose Modules Are Deeper
When we design modules, we must often decide whether to make the module general-purpose or special-purpose. General-purpose modules tend to be deeper, but they also tend to provide functionality that is never needed in the future (smell of Needless Generality). The author suggests an excellent compromise: Make the interface “somewhat general-purpose” and the implemented functionality special-purpose.
The author explains the compromise with the example of deleting text in a text editor. The text editor is implemented with the model-view design pattern. One important goal is that the model and view classes can be developed mostly independently. The user can delete text by pressing the Backspace key, the Delete key or Ctrl+X (Cut) on a selected text area. The view class calls functions of the model class to perform the respective actions on the text document.
A bad design would provide three functions for backspace, delete and cut in the model class. Whenever we added a new way of deleting text to the view class, we would have to add a matching function to the model class. This solution is special-purpose, because every function is used only once in the view class.
A good design would provide a single function delete(Position start, Position end) in the model class. For each of the three delete actions, the view class calls this somewhat general-purpose delete function. The interface is deeper and will not change when we add another way of deleting text (e.g., deleting a line) to the view class.
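Here is a minimal sketch of the two classes. Position is simplified to a character index, and the model function is named `remove` because `delete` is a C++ keyword; both simplifications are mine, not the book’s.

```cpp
#include <string>

// The model offers one somewhat general-purpose deletion function.
class TextModel {
public:
    explicit TextModel(std::string text) : text_(std::move(text)) {}

    // Deletes the characters in [start, end).
    void remove(std::size_t start, std::size_t end) {
        text_.erase(start, end - start);
    }

    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// The view maps each special-purpose user action onto the same call,
// so new ways of deleting text never change the model's interface.
class TextView {
public:
    explicit TextView(TextModel& model) : model_(model) {}

    void backspace(std::size_t cursor)                  { model_.remove(cursor - 1, cursor); }
    void deleteKey(std::size_t cursor)                  { model_.remove(cursor, cursor + 1); }
    void cutSelection(std::size_t from, std::size_t to) { model_.remove(from, to); }

private:
    TextModel& model_;
};
```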
Chapter 7 – Different Layer, Different Abstraction
Programs are typically organised in layers, where higher layers depend on lower layers. Well-designed layers are deep modules. Higher layers are more abstract than lower layers.
Pass-through methods do little more than call a similar method in a lower layer, passing the result back to the higher layer with little or no modification. Wrapper functions are often pass-through methods. Pass-through methods are tell-tale signs that layers are shallow and that the responsibilities of two layers overlap.
Pass-through variables are handed from the top to the bottom layer through a long chain of function calls – or in the other direction. If we change, delete or add such a variable, we’ll have to change the signatures of all the functions using it in all the layers.
The author suggests moving all the pass-through variables into a context object. The context object is passed to the constructors of the objects that use these variables in their functions. This way, the context object can be passed up or down the layer chain. It’s not an ideal solution, but it is better than the alternatives.
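A minimal sketch of the idea, with invented names (`RequestContext`, `Logger`) and fields: the variables travel together in one object that is handed to constructors, so individual function signatures stay untouched.

```cpp
#include <string>

// Hypothetical context object: instead of threading each variable
// through every function signature in every layer, they travel together.
struct RequestContext {
    std::string userId;
    std::string locale;
    std::string traceId;
};

// A lower-layer component receives the whole context once, at
// construction time, rather than via every call.
class Logger {
public:
    explicit Logger(const RequestContext& ctx) : ctx_(ctx) {}

    std::string format(const std::string& message) const {
        // Adding a new context field later changes no function
        // signatures along the call chain.
        return "[" + ctx_.traceId + "] " + message;
    }

private:
    const RequestContext& ctx_;  // shared, not copied
};
```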
Chapter 8 – Pull Complexity Downwards
We make modules deeper by pulling complexity down into the module, that is, by moving code from outside the module into the module. Pulling down complexity is best applied when
- the functionality to be moved has a strong affinity with the module’s existing functionality,
- the program is simplified in many places, and
- the module’s interface is simplified.
The author’s reason for applying this technique is timeless advice:
“Most modules have more users than developers, so it is better for the developers to suffer than the users. As a module developer, you should strive to make life as easy as possible for the users of your module, even if that means extra work for you. […] it is more important for a module to have a simple interface than a simple implementation.”
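One way to picture this advice is a module that absorbs input validation instead of pushing it onto every caller. The `Server` class and its port-parsing policy are an invented sketch, not from the book.

```cpp
#include <string>
#include <cstdlib>

// Sketch of pulling complexity down: the module accepts a raw port
// string and handles the messy cases internally, so no caller ever
// parses or validates ports.
class Server {
public:
    explicit Server(const std::string& portText, int defaultPort = 8080) {
        char* end = nullptr;
        long p = std::strtol(portText.c_str(), &end, 10);
        bool valid = end != portText.c_str() && *end == '\0' && p > 0 && p <= 65535;
        port_ = valid ? static_cast<int>(p) : defaultPort;  // fall back silently
    }

    int port() const { return port_; }

private:
    int port_;
};
```

The implementation is slightly more work for the module developer, but every user of the module gets a simpler life.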
Chapter 9 – Better Together Or Better Apart?
Although small modules (i.e., small classes or small functions) are often portrayed as a magic bullet to reduce system complexity, they are not. Many small modules imply more dependencies and more interfaces, leading to higher complexity.
Big modules have gained a bad reputation, although they are not bad per se. As software designers, we should base “the decision to split or join modules […] on complexity. Pick the structure that results in the best information hiding, the fewest dependencies, and the deepest interfaces.”
The author gives some guidelines on when to split or join modules.
- Bring together if information is shared. A typical example is the encoding and decoding of network packets, which shares information about the structure of packets.
- Bring together if it will simplify the interface. Temporal decomposition from Chapter 5 provides a good example.
- Bring together to eliminate duplication. Sure!
- Separate general-purpose and special-purpose code. Chapter 6 gives the example of the text editor split into the text view and text model. The text view has three special-purpose functions for deleting text, which the text model implements in a single general-purpose function. The author concludes: “In general, the lower layers of a system tend to be more general-purpose and the upper layers more special-purpose.”
- Splitting and joining methods. Methods should be deep, with interfaces that are much simpler than their implementations. Methods with long parameter lists are typically shallow and should be joined with the methods calling them. If we must jump around between many small functions to understand the function that calls them (so-called “conjoined methods”), we should consider joining the methods or introducing better names for the small functions. If a function performs multiple, barely related tasks, we should split it up. The golden rule is: “Each method should do one thing and do it completely.”
In general, we join modules if they have a strong affinity. We split modules if they have a weak affinity. Affinity is strong, if modules change for the same reasons. It is weak, if they change for different reasons.
Also Interesting
If you prefer videos over reading or just want the one-hour version, you can watch John Ousterhout’s talk at Google.
Here are two other reviews of John Ousterhout’s book.
I would add that it shines most at providing the rationales behind certain “good practices” which are more used as dogma than as heuristics to produce strong systems. Some of these rationales are not trivial (at least not for me!) and understanding them will shape your intuition for future work. Nicely done.