4. Modularity: Keeping It Clean, Keeping It Simple

There are two ways of constructing a software design. One is to make it so simple that there are obviously no deficiencies; the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.

The Emperor’s Old Clothes, CACM February 1981
—C. A. R. Hoare

There is a natural hierarchy of code-partitioning methods that has evolved as programmers have had to manage ever-increasing levels of complexity. In the beginning, everything was one big lump of machine code. The earliest procedural languages brought in the notion of partition by subroutine. Then we invented service libraries to share common utility functions among multiple programs. Next, we invented separated address spaces and communicating processes. Today we routinely distribute program systems across multiple hosts separated by thousands of miles of network cable.

The early developers of Unix were among the pioneers in software modularity. Before them, the Rule of Modularity was computer-science theory but not engineering practice. In Design Rules [Baldwin-Clark], a path-breaking study of the economics of modularity in engineering design, the authors use the development of the computer industry as a case study and argue that the Unix community was in fact the first to systematically apply modular decomposition to production software, as opposed to hardware. Modularity of hardware has of course been one of the foundations of engineering since the adoption of standard screw threads in the late 1800s.

The Rule of Modularity bears amplification here: The only way to write complex software that won’t fall on its face is to build it out of simple modules connected by well-defined interfaces, so that most problems are local and you can have some hope of fixing or optimizing a part without breaking the whole.

The tradition of being careful about modularity and of paying close attention to issues like orthogonality and compactness are still much deeper in the bone among Unix programmers than elsewhere.

Early Unix programmers became good at modularity because they had to be. An OS is one of the most complicated pieces of code around. If it is not well structured, it will fall apart. There were a couple of early failures at building Unix that were scrapped. One can blame the early (structureless) C for this, but basically it was because the OS was too complicated to write. We needed both refinements in tools (like C structures) and good practice in using them (like Rob Pike’s rules for programming) before we could tame that complexity.

—Ken Thompson

Early Unix hackers struggled with this in many ways. In the languages of 1970 function calls were expensive, either because call semantics were complicated (PL/1. Algol) or because the compiler was optimizing for other things like fast inner loops at the expense of call time. Thus, code tended to be written in big lumps. Ken and several of the other early Unix developers knew modularity was a good idea, but they remembered PL/1 and were reluctant to write small functions lest performance go to hell.

Dennis Ritchie encouraged modularity by telling all and sundry that function calls were really, really cheap in C. Everybody started writing small functions and modularizing. Years later we found out that function calls were still expensive on the PDP-11, and VAX code was often spending 50% of its time in the CALLS instruction. Dennis had lied to us! But it was too late; we were all hooked...

—Steve Johnson

All programmers today, Unix natives or not, are taught to modularize at the subroutine level within programs. Some learn the art of doing this at the module or abstract-data-type level and call that ’good design’. The design-patterns movement is making a noble effort to push up a level from there and discover successful design abstractions that can be applied to organize the large-scale structure of programs.

Getting better at all these kinds of problem partitioning is a worthy goal, and many excellent treatments of them are available elsewhere. We shall not attempt to cover all the issues relating to modularity within programs in too much detail: first, because that is a subject for an entire volume (or several volumes) in itself; and second, because this is a book about the art of Unix programming.

What we will do here is examine more specifically what the Unix tradition teaches us about how to follow the Rule of Modularity. In this chapter, our examples will live within process units. Later, in Chapter 7, we’ll examine the circumstances under which partitioning programs into multiple cooperating processes is a good idea, and discuss more specific techniques for doing that partitioning.

4.1 Encapsulation and Optimal Module Size

The first and most important quality of modular code is encapsulation. Well-encapsulated modules don’t expose their internals to each other. They don’t call into the middle of each others’ implementations, and they don’t promiscuously share global data. They communicate using application programming interfaces (APIs)—narrow, well-defined sets of procedure calls and data structures. This is what the Rule of Modularity is about.

The APIs between modules have a dual role. On the implementation level, they function as choke points between the modules, preventing the internals of each from leaking into its neighbors. On the design level, it is the APIs (not the bits of implementation between them) that really define your architecture.

One good test for whether an API is well designed is this one: if you try to write a description of it in purely human language (with no source-code extracts allowed), does it make sense? It is a very good idea to get into the habit of writing informal descriptions of your APIs before you code them. Indeed, some of the most able developers start by defining their interfaces, writing brief comments to describe them, and then writing the code—since the process of writing the comment clarifies what the code must do. Such descriptions help you organize your thoughts, they make useful module comments, and eventually you might want to turn them into a roadmap document for future readers of the code.

As you push module decomposition harder, the pieces get smaller and the definition of the APIs gets more important. Global complexity, and consequent vulnerability to bugs, decreases. It has been received wisdom in computer science since the 1970s (exemplified in papers such as [Parnas]) that you ought to design your software systems as hierarchies of nested modules, with the grain size of the modules at each level held to a minimum.

It is possible, however, to push this kind of decomposition too hard and make your modules too small. There is evidence [Hatton97] that when one plots defect density versus module size, the curve is U-shaped and concave upwards (see Figure 4.1). Very small and very large modules are associated with more bugs than those of intermediate size. A different way of viewing the same data is to plot lines of code per module versus total bugs. The curve looks roughly logarithmic up to a ’sweet spot’ where it flattens (corresponding to the minimum in the defect density curve), after which it goes up as the square of the number of the lines of code (which is what one might intuitively expect for the whole curve, following Brooks’s Law1).

1 Brooks’s Law predicts that adding programmers to a late project makes it later. More generally, it predicts that costs and error rates rise as the square of the number of programmers on a project.

Figure 4.1. Qualitative plot of defect count and density vs. module size.

image

This unexpectedly increasing incidence of bugs at small module sizes holds across a wide variety of systems implemented in different languages. Hatton has proposed a model relating this nonlinearity to the chunk size of human short-term memory.2 Another way to interpret the nonlinearity is that at small module grain sizes, the increasing complexity of the interfaces becomes the dominating term; it’s difficult to read the code because you have to understand everything before you can understand anything. In Chapter 7 we’ll examine more advanced forms of program partitioning; there, too, the complexity of interface protocols comes to dominate the total complexity of the system as the component processes get smaller.

2 In Hatton’s model, small differences in the maximum chunk size a programmer can hold in short-term memory have a large multiplicative effect on the programmer’s efficiency. This might be a major contributor to the order-of-magnitude (or larger) variations in effectiveness observed by Fred Brooks and others.

In nonmathematical terms, Hatton’s empirical results imply a sweet spot between 200 and 400 logical lines of code that minimizes probable defect density, all other factors (such as programmer skill) being equal. This size is independent of the language being used—an observation which strongly reinforces the advice given elsewhere in this book to program with the most powerful languages and tools you can. Beware of taking these numbers too literally however. Methods for counting lines of code vary considerably according to what the analyst considers a logical line, and other biases (such as whether comments are stripped). Hatton himself suggests as a rule of thumb a 2x conversion between logical and physical lines, suggesting an optimal range of 400–800 physical lines.

4.2 Compactness and Orthogonality

Code is not the only sort of thing with an optimal chunk size. Languages and APIs (such as sets of library or system calls) run up against the same sorts of human cognitive constraints that produce Hatton’s U-curve.

Accordingly, Unix programmers have learned to think very hard about two other properties when designing APIs, command sets, protocols, and other ways to make computers do tricks: compactness and orthogonality.

4.2.1 Compactness

Compactness is the property that a design can fit inside a human being’s head. A good practical test for compactness is this: Does an experienced user normally need a manual? If not, then the design (or at least the subset of it that covers normal use) is compact.

Compact software tools have all the virtues of physical tools that fit well in the hand. They feel pleasant to use, they don’t obtrude themselves between your mind and your work, they make you more productive—and they are much less likely than unwieldy tools to turn in your hand and injure you.

Compact is not equivalent to ’weak’. A design can have a great deal of power and flexibility and still be compact if it is built on abstractions that are easy to think about and fit together well. Nor is compact equivalent to ’easily learned’; some compact designs are quite difficult to understand until you have mastered an underlying conceptual model that is tricky, at which point your view of the world changes and compact becomes simple. For a lot of people, the Lisp language is a classic example of this.

Nor does compact mean ’small’. If a well-designed system is predictable and ’obvious’ to the experienced user, it might have quite a few pieces.

—Ken Arnold

Very few software designs are compact in an absolute sense, but many are compact in a slightly looser sense of the term. They have a compact working set, a subset of capabilities that suffices for 80% or more of what expert users normally do with them. Practically speaking, such designs normally need a reference card or cheat sheet but not a manual. We’ll call such designs semi-compact, as opposed to strictly compact.

The concept is perhaps best illustrated by examples. The Unix system call API is semi-compact, but the standard C library is not compact in any sense. While Unix programmers easily keep a subset of the system calls sufficient for most applications programming (file system operations, signals, and process control) in their heads, the C library on modern Unixes includes many hundreds of entry points, e.g., mathematical functions, that won’t all fit inside a single programmer’s cranium.

The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information [Miller] is one of the foundation papers in cognitive psychology (and, incidentally, the specific reason that U.S. local telephone numbers have seven digits). It showed that the number of discrete items of information human beings can hold in short-term memory is seven, plus or minus two. This gives us a good rule of thumb for evaluating the compactness of APIs: Does a programmer have to remember more than seven entry points? Anything larger than this is unlikely to be strictly compact.

Among Unix tools, make(1) is compact; autoconf(1) and automake(1) are not. Among markup languages, HTML is semi-compact, but DocBook (a documentation markup language we shall discuss in Chapter 18) is not. The man(7) macros are compact, but troff(1) markup is not.

Among general-purpose programming languages, C and Python are semi-compact; Perl, Java, Emacs Lisp, and shell are not (especially since serious shell programming requires you to know half-a-dozen other tools like sed(1) and awk(1)). C++ is anti-compact—the language’s designer has admitted that he doesn’t expect any one programmer to ever understand it all.

Some designs that are not compact have enough internal redundancy of features that individual programmers end up carving out compact dialects sufficient for that 80% of common tasks by choosing a working subset of the language. Perl has this kind of pseudo-compactness, for example. Such designs have a built-in trap; when two programmers try to communicate about a project, they may find that differences in their working subsets are a significant barrier to understanding and modifying the code.

Noncompact designs are not automatically doomed or bad, however. Some problem domains are simply too complex for a compact design to span them. Sometimes it’s necessary to trade away compactness for some other virtue, like raw power and range. Troff markup is a good example of this. So is the BSD sockets API. The purpose of emphasizing compactness as a virtue is not to condition you to treat compactness as an absolute requirement, but to teach you to do what Unix programmers do: value compactness properly, design for it whenever possible, and not throw it away casually.

4.2.2 Orthogonality

Orthogonality is one of the most important properties that can help make even complex designs compact. In a purely orthogonal design, operations do not have side effects; each action (whether it’s an API call, a macro invocation, or a language operation) changes just one thing without affecting others. There is one and only one way to change each property of whatever system you are controlling.

Your monitor has orthogonal controls. You can change the brightness independently of the contrast level, and (if the monitor has one) the color balance control will be independent of both. Imagine how much more difficult it would be to adjust a monitor on which the brightness knob affected the color balance: you’d have to compensate by tweaking the color balance every time after you changed the brightness. Worse, imagine if the contrast control also affected the color balance; then, you’d have to adjust both knobs simultaneously in exactly the right way to change either contrast or color balance alone while holding the other constant.

Far too many software designs are non-orthogonal. One common class of design mistake, for example, occurs in code that reads and parses data from one (source) format to another (target) format. A designer who thinks of the source format as always being stored in a disk file may write the conversion function to open and read from a named file. Usually the input could just as well have been any file handle. If the conversion routine were designed orthogonally, e.g., without the side effect of opening a file, it could save work later when the conversion has to be done on a data stream supplied from standard input, a network socket, or any other source.

Doug McIlroy’s advice to “Do one thing well” is usually interpreted as being about simplicity. But it’s also, implicitly and at least as importantly, about orthogonality.

It’s not a problem for a program to do one thing well and other things as side effects, provided supporting those other things doesn’t raise the complexity of the program and its vulnerability to bugs. In Chapter 9 we’ll examine a program called ascii that prints synonyms for the names of ASCII characters, including hex, octal, and binary values; as a side effect, it can serve as a quick base converter for numbers in the range 0–255. This second use is not an orthogonality violation because the features that support it are all necessary to the primary function; they do not make the program more difficult to document or maintain.

The problems with non-orthogonality arise when side effects complicate a programmer’s or user’s mental model, and beg to be forgotten, with results ranging from inconvenient to dire. Even when you do not forget the side effects, you’re often forced to do extra work to suppress them or work around them.

There is an excellent discussion of orthogonality and how to achieve it in The Pragmatic Programmer [Hunt-Thomas]. As they point out, orthogonality reduces test and development time, because it’s easier to verify code that neither causes side effects nor depends on side effects from other code—there are fewer combinations to test. If it breaks, orthogonal code is more easily replaced without disturbance to the rest of the system. Finally, orthogonal code is easier to document and reuse.

The concept of refactoring, which first emerged as an explicit idea from the ’Extreme Programming’ school, is closely related to orthogonality. To refactor code is to change its structure and organization without changing its observable behavior. Software engineers have been doing this since the birth of the field, of course, but naming the practice and identifying a stock set of refactoring techniques has helped concentrate peoples’ thinking in useful ways. Because these fit so well with the central concerns of the Unix design tradition, Unix developers have quickly coopted the terminology and ideas of refactoring.3

3 In the foundation text on this topic, Refactoring [Fowler], the author comes very close to stating that the principal goal of refactoring is to improve orthogonality. But lacking the concept, he can only approximate this idea from several different directions: eliminating code duplication and various other “bad smells” many of which are some sort of orthogonality violation.

The basic Unix APIs were designed for orthogonality with imperfect but considerable success. We take for granted being able to open a file for write access without exclusive-locking it for write, for example; not all operating systems are so graceful. Old-style (System III) signals were non-orthogonal, because signal receipt had the side-effect of resetting the signal handler to the default die-on-receipt. There are large non-orthogonal patches like the BSD sockets API and very large ones like the X windowing system’s drawing libraries.

But on the whole the Unix API is a good example: Otherwise it not only would not but could not be so widely imitated by C libraries on other operating systems. This is also a reason that the Unix API repays study even if you are not a Unix programmer; it has lessons about orthogonality to teach.

4.2.3 The SPOT Rule

The Pragmatic Programmer articulates a rule for one particular kind of orthogonality that is especially important. Their “Don’t Repeat Yourself” rule is: every piece of knowledge must have a single, unambiguous, authoritative representation within a system. In this book we prefer, following a suggestion by Brian Kernighan, to call this the Single Point Of Truth or SPOT rule.

Repetition leads to inconsistency and code that is subtly broken, because you changed only some repetitions when you needed to change all of them. Often, it also means that you haven’t properly thought through the organization of your code.

Constants, tables, and metadata should be declared and initialized once and imported elsewhere. Any time you see duplicate code, that’s a danger sign. Complexity is a cost; don’t pay it twice.

Often it’s possible to remove code duplication by refactoring; that is, changing the organization of your code without changing the core algorithms. Data duplication sometimes appears to be forced on you. But when you see it, here are some valuable questions to ask:

• If you have duplicated data in your code because it has to have two different representations in two different places, can you write a function, tool or code generator to make one representation from the other, or both from a common source?

• If your documentation duplicates knowledge in your code, can you generate parts of the documentation from parts of the code, or vice-versa, or both from a common higher-level representation?

• If your header files and interface declarations duplicate knowledge in your implementation code, is there a way you can generate the header files and interface declarations from the code?

There is an analog of the SPOT rule for data structures: “No junk, no confusion”. “No junk” says that the data structure (the model) should be minimal, e.g., not made so general that it can represent situations which cannot exist. “No confusion” says that states which must be kept distinct in the real-world problem must be kept distinct in the model. In short, the SPOT rule advocates seeking a data structure whose states have a one-to-one correspondence with the states of the real-world system to be modeled.

From deeper within the Unix tradition, we can add some of our own corollaries of the SPOT rule:

• Are you duplicating data because you’re caching intermediate results of some computation or lookup? Consider carefully whether this is premature optimization; stale caches (and the layers of code needed to keep caches synchronized) are a fertile source of bugs,4 and can even slow down overall performance if (as often happens) the cache-management overhead is higher than you expected.

4 An archetypal example of bad caching is the rehash directive in csh(1); type man 1 csh for details. See Section 12.4.3 for another example.

• If you see lots of duplicative boilerplate code, can you generate all of it from a single higher-level representation, twiddling a few knobs to generate the different cases?

The reader should begin to see a pattern emerging here.

In the Unix world, the SPOT Rule as a unifying idea has seldom been explicit—but heavy use of code generators to implement particular kinds of SPOT are very much part of the tradition. We’ll survey these techniques in Chapter 9.

4.2.4 Compactness and the Strong Single Center

One subtle but powerful way to promote compactness in a design is to organize it around a strong core algorithm addressing a clear formal definition of the problem, avoiding heuristics and fudging.

Formalization often clarifies a task spectacularly. It is not enough for a programmer to recognize that bits of his task fall within standard computer-science categories—a little depth-first search here and a quicksort there. The best results occur when the nub of the task can be formalized, and a clear model of the job at hand can be constructed. It is not necessary that ultimate users comprehend the model. The very existence of a unifying core will provide a comfortable feel, unencumbered with the why-in-hell-did-they-do-that moments that are so prevalent in using Swiss-army-knife programs.

—Doug McIlroy

This is an often-overlooked strength of the Unix tradition. Many of its most effective tools are thin wrappers around a direct translation of some single powerful algorithm.

Perhaps the clearest example of this is diff(1), the Unix tool for reporting differences between related files. This tool and its dual, patch(1), have become central to the network-distributed development style of modern Unix. A valuable property of diff is that it seldom surprises anyone. It doesn’t have special cases or painful edge conditions, because it uses a simple, mathematically sound method of sequence comparison. This has consequences:

By virtue of a mathematical model and a solid algorithm, Unix diff contrasts markedly with its imitators. First, the central engine is solid, small, and has never needed one line of maintenance. Second, the results are clear and consistent, unmarred by surprises where heuristics fail.

—Doug McIlroy

Thus, people who use diff can develop an intuitive feel for what it will do in any given situation without necessarily understanding the central algorithm perfectly. Other well-known examples of this special kind of clarity achieved through a strong central algorithm abound in Unix:

• The grep(1) utility for selecting lines out of files by pattern matching is a simple wrapper around a formal algebra of regular-expression patterns (see Section 8.2.2 for discussion). If it had lacked this consistent mathematical model, it would probably look like the design of the original glob(1) facility in the oldest Unixes, a handful of ad-hoc wildcards that can’t be combined.

• The yacc(1) utility for generating language parsers is a thin wrapper around the formal theory of LR(1) grammars. Its partner, the lexical analyzer generator lex(1), is a similarly thin wrapper around the theory of nondeterministic finite-state automata.

All three of these programs are so bug-free that their correct functioning is taken utterly for granted, and compact enough to fit easily in a programmer’s hand. Only a part of these good qualities are due to the polishing that comes with a long service life and frequent use; most of it is that, having been constructed around a strong and provably correct algorithmic core, they never needed much polishing in the first place.

The opposite of a formal approach is using heuristics—rules of thumb leading toward a solution that is probabilistically, but not certainly, correct. Sometimes we use heuristics because a deterministically correct solution is impossible. Think of spam filtering, for example; an algorithmically perfect spam filter would need a full solution to the problem of understanding natural language as a module. Other times, we use heuristics because known formally correct methods are impossibly expensive. Virtual-memory management is an example of this; there are near-perfect solutions, but they require so much runtime instrumentation that their overhead would swamp any theoretical gain over heuristics.

The trouble with heuristics is that they proliferate special cases and edge cases. If nothing else, you usually have to backstop a heuristic with some sort of recovery mechanism when it fails. All the usual problems with escalating complexity follow. To manage the resulting tradeoffs, you have to start by being aware of them. Always ask if a heuristic actually pays off in performance what it costs in code complexity—and don’t guess at the performance difference, actually measure it before making a decision.

4.2.5 The Value of Detachment

We began this book with a reference to Zen: “a special transmission, outside the scriptures”. This was not mere exoticism for stylistic effect; the core concepts of Unix have always had a spare, Zen-like simplicity that continues to shine through the layers of historical accidents that have accreted around them. This quality is reflected in the cornerstone documents of Unix, like The C Programming Language [Kernighan-Ritchie] and the 1974 CACM paper that introduced Unix to the world; one of the famous quotes from that paper observes “...constraint has encouraged not only economy, but also a certain elegance of design”. That simplicity came from trying to think not about how much a language or operating system could do, but of how little it could do—not by carrying assumptions but by starting from zero (what in Zen is called “beginner’s mind” or “empty mind”).

To design for compactness and orthogonality, start from zero. Zen teaches that attachment leads to suffering; experience with software design teaches that attachment to unnoticed assumptions leads to non-orthogonality, noncompact designs, and projects that fail or become maintenance nightmares.

To achieve enlightenment and surcease from suffering, Zen teaches detachment. The Unix tradition teaches the value of detachment from the particular, accidental conditions under which a design problem was posed. Abstract. Simplify. Generalize. Because we write software to solve problems, we cannot completely detach from the problems—but it is well worth the mental effort to see how many preconceptions you can throw away, and whether the design becomes more compact and orthogonal as you do that. Possibilities for code reuse often result.

Jokes about the relationship between Unix and Zen are a live part of the Unix tradition as well.5 This is not an accident.

5 For a recent example of Unix/Zen crossover, see Appendix D.

4.3 Software Is a Many-Layered Thing

Broadly speaking, there are two directions one can go in designing a hierarchy of functions or objects. Which direction you choose, and when, has a profound effect on the layering of your code.

4.3.1 Top-Down versus Bottom-Up

One direction is bottom-up, from concrete to abstract—working up from the specific operations in the problem domain that you know you will need to perform. For example, if one is designing firmware for a disk drive, some of the bottom-level primitives might be ’seek head to physical block’, ’read physical block’, ’write physical block’, ’toggle drive LED’, etc.

The other direction is top-down, abstract to concrete—from the highest-level specification describing the project as a whole, or the application logic, downwards to individual operations. Thus, if one is designing software for a mass-storage controller that might drive several different sorts of media, one might start with abstract operations like ’seek logical block’, ’read logical block’, ’write logical block’, ’toggle activity indication’. These would differ from the similarly named hardware-level operations above in that they’re intended to be generic across different kinds of physical devices.

These two examples could be two ways of approaching design for the same collection of hardware. Your choice, in cases like this, is one of these: either abstract the hardware (so the objects encapsulate the real things out there and the program is merely a list of manipulations on those things), or organize around some behavioral model (and then embed the actual hardware manipulations that carry it out in the flow of the behavioral logic).

An analogous choice shows up in a lot of different contexts. Suppose you’re writing MIDI sequencer software. You could organize that code around its top level (sequencing tracks) or around its bottom level (switching patches or samples and driving wave generators).

A very concrete way to think about this difference is to ask whether the design is organized around its main event loop (which tends to have the high-level application logic close to it) or around a service library of all the operations that the main loop can invoke. A designer working from the top down will start by thinking about the program’s main event loop, and plug in specific events later. A designer working from the bottom up will start by thinking about encapsulating specific tasks and glue them together into some kind of coherent order later on.

For a larger example, consider the design of a Web browser. The top-level design of a Web browser is a specification of the expected behavior of the browser: what types of URL (like http: or ftp: or file:) it interprets, what kinds of images it is expected to be able to render, whether and with what limitations it will accept Java or JavaScript, etc. The layer of the implementation that corresponds to this top-level view is its main event loop; each time around, the loop waits for, collects, and dispatches on a user action (such as clicking a Web link or typing a character into a field).

But the Web browser has to call a large set of domain primitives to do its job. One group of these is concerned with establishing network connections, sending data over them, and receiving responses. Another set is the operations of the GUI toolkit the browser will use. Yet a third set might be concerned with the mechanics of parsing retrieved HTML from text into a document object tree.

Which end of the stack you start with matters a lot, because the layer at the other end is quite likely to be constrained by your initial choices. In particular, if you program purely from the top down, you may find yourself in the uncomfortable position that the domain primitives your application logic wants don’t match the ones you can actually implement. On the other hand, if you program purely from the bottom up, you may find yourself doing a lot of work that is irrelevant to the application logic—or merely designing a pile of bricks when you were trying to build a house.

Ever since the structured-programming controversies of the 1960s, novice programmers have generally been taught that the correct approach is the top-down one: stepwise refinement, where you specify what your program is to do at an abstract level and gradually fill in the blanks of implementation until you have concrete working code. Top-down tends to be good practice when three preconditions are true: (a) you can specify in advance precisely what the program is to do, (b) the specification is unlikely to change significantly during implementation, and (c) you have a lot of freedom in choosing, at a low level, how the program is to get that job done.

These conditions tend to be fulfilled most often in programs relatively close to the user and high in the software stack—applications programming. But even there those preconditions often fail. You can’t count on knowing what the ’right’ way for a word processor or a drawing program to behave is until the user interface has had end-user testing. Purely top-down programming often has the effect of overinvesting effort in code that has to be scrapped and rebuilt because the interface doesn’t pass a reality check.

In self-defense against this, programmers try to do both things—express the abstract specification as top-down application logic, and capture a lot of low-level domain primitives in functions or libraries, so they can be reused when the high-level design changes.

Unix programmers inherit a tradition that is centered in systems programming, where the low-level primitives are hardware-level operations that are fixed in character and extremely important. They therefore lean, by learned instinct, more toward bottom-up programming.

Whether you’re a systems programmer or not, bottom-up can also look more attractive when you are programming in an exploratory way, trying to get a grasp on hardware or software or real-world phenomena you don’t yet completely understand. Bottom-up programming gives you time and room to refine a vague specification. Bottom-up also appeals to programmers’ natural human laziness—when you have to scrap and rebuild code, you tend to have to throw away larger pieces if you’re working top-down than you do if you’re working bottom-up.

Real code, therefore tends to be programmed both top-down and bottom-up. Often, top-down and bottom-up code will be part of the same project. That’s where ’glue’ enters the picture.

4.3.2 Glue Layers

When the top-down and bottom-up drives collide, the result is often a mess. The top layer of application logic and the bottom layer of domain primitives have to be impedance-matched by a layer of glue logic.

One of the lessons Unix programmers have learned over decades is that glue is nasty stuff and that it is vitally important to keep glue layers as thin as possible. Glue should stick things together, but should not be used to hide cracks and unevenness in the layers.

In the Web-browser example, the glue would include the rendering code that maps a document object parsed from incoming HTML into a flattened visual representation as a bitmap in a display buffer, using GUI domain primitives to do the painting. This rendering code is notoriously the most bug-prone part of a browser. It attracts into itself kluges to address problems that originate both in the HTML parsing (because there is a lot of ill-formed markup out there) and the GUI toolkit (which may not have quite the primitives that are really needed).

A Web browser’s glue layer has to mediate not merely between specification and domain primitives, but between several different external specifications: the network behavior standardized in HTTP, HTML document structure, and various graphics and multimedia formats as well as the users’ behavioral expectations from the GUI.

And one single bug-prone glue layer is not the worst fate that can befall a design. A designer who is aware that the glue layer exists, and tries to organize it into a middle layer around its own set of data structures or objects, can end up with two layers of glue—one above the midlayer and one below. Programmers who are bright but unseasoned are particularly apt to fall into this trap; they’ll get each fundamental set of classes (application logic, midlayer, and domain primitives) right and make them look like the textbook examples, only to flounder as the multiple layers of glue needed to integrate all that pretty code get thicker and thicker.

The thin-glue principle can be viewed as a refinement of the Rule of Separation. Policy (the application logic) should be cleanly separated from mechanism (the domain primitives), but if there is a lot of code that is neither policy nor mechanism, chances are that it is accomplishing very little besides adding global complexity to the system.

4.3.3 Case Study: C Considered as Thin Glue

The C language itself is a good example of the effectiveness of thin glue.

In the late 1990s, Gerrit Blaauw and Fred Brooks observed in Computer Architecture: Concepts and Evolution [BlaauwBrooks] that the architectures in every generation of computers, from early mainframes through minicomputers through workstations through PCs, had tended to converge. The later a design was in its technology generation, the more closely it approximated what Blaauw & Brooks called the “classical architecture”: binary representation, flat address space, a distinction between memory and working store (registers), general-purpose registers, address resolution to fixed-length bytes, two-address instructions, big-endianness,6 and data types a consistent set with sizes a multiple of either 4 or 6 bits (the 6-bit families are now extinct).

6 The terms big-endian and little-endian refer to architectural choices about the order in which bits are interpreted within a machine word. Though it has no canonical location, a Web search for On Holy Wars and a Plea for Peace should turn up a classic and entertaining paper on this subject.

Thompson and Ritchie designed C to be a sort of structured assembler for an idealized processor and memory architecture that they expected could be efficiently modeled on most conventional computers. By happy accident, their model for the idealized processor was the PDP-11, a particularly mature and elegant minicomputer design that closely approximated Blaauw & Brooks’s classical architecture. By good judgment, Thompson and Ritchie declined to wire into their language most of the few traits (such as little-endian byte order) where the PDP-11 didn’t match it.7

7 The widespread belief that the autoincrement and autodecrement features entered C because they represented PDP-11 machine instructions is a myth. According to Dennis Ritchie, these operations were present in the ancestral B language before the PDP-11 existed.

The PDP-11 became an important model for the following generations of microprocessor architectures. The basic abstractions of C turned out to capture the classical architecture rather neatly. Thus, C started out as a good fit for microprocessors and, rather than becoming irrelevant as its assumptions fell out of date, actually became a better fit as hardware converged more closely on the classical architecture. One notable example of this convergence was when Intel’s 386, with its large flat memory-address space, replaced the 286’s awkward segmented-memory addressing after 1985; pure C was actually a better fit for the 386 than it had been for the 286.

It is not a coincidence that the experimental era in computer architectures ended in the mid-1980s at the same time that C (and its close descendant C++) were sweeping all before them as general-purpose programming languages. C, designed as a thin but flexible layer over the classical architecture, looks with two decades’ additional perspective like almost the best possible design for the structured-assembler niche it was intended to fill. In addition to compactness, orthogonality, and detachment (from the machine architecture on which it was originally designed), it also has the important quality of transparency that we will discuss in Chapter 6. The few language designs since that are arguably better have needed to make large changes (like introducing garbage collection) in order to get enough functional distance from C not to be swamped by it.

This history is worth recalling and understanding because C shows us how powerful a clean, minimalist design can be. If Thompson and Ritchie had been less wise, they would have designed a language that did much more, relied on stronger assumptions, never ported satisfactorily off its original hardware platform, and withered away as the world changed out from under it. Instead, C has flourished—and the example Thompson and Ritchie set has influenced the style of Unix development ever since. As the writer, adventurer, artist, and aeronautical engineer Antoine de Saint-Exupéry once put it, writing about the design of airplanes: “La perfection est atteinte non quand il ne reste rien à ajouter, mais quand il ne reste rien à enlever”. (“Perfection is attained not when there is nothing more to add, but when there is nothing more to remove”.)

Ritchie and Thompson lived by this maxim. Long after the resource constraints on early Unix software had eased, they worked at keeping C as thin a layer over the hardware as possible.

Dennis used to say to me, when I would ask for some particularly extravagant feature in C, “If you want PL/1, you know where to get it”. He didn’t have to deal with some marketer saying “But we need a check in the box on the sales viewgraph!”

—Mike Lesk

The history of C is also a lesson in the value of having a working reference implementation before you standardize. We’ll return to this point in Chapter 17 when we discuss the evolution of C and Unix standards.

4.4 Libraries

One consequence of the emphasis that the Unix programming style put on modularity and well-defined APIs is a strong tendency to factor programs into bits of glue connecting collections of libraries, especially shared libraries (the equivalents of what are called dynamically-linked libraries or DLLs under Windows and other operating systems).

If you are careful and clever about design, it is often possible to partition a program so that it consists of a user-interface-handling main section (policy) and a collection of service routines (mechanism) with effectively no glue at all. This approach is especially appropriate when the program has to do a lot of very specific manipulations of data structures like graphic images, network-protocol packets, or control blocks for a hardware interface. Some good general architectural advice from within the Unix tradition, particularly applicable to the resource-management challenges of this sort of library is collected in The Discipline and Method Architecture for Reusable Libraries [Vo].

Under Unix, it is normal practice to make this layering explicit, with the service routines collected in a library that is separately documented. In such programs, the front end gets to specialize in user-interface considerations and high-level protocol. With a little more care in design, it may be possible to detach the original front end and replace it with others adapted for different purposes. Some other advantages should become evident from our case study.

There is a flip side to this. In the Unix world, libraries which are delivered as libraries should come with exerciser programs.

APIs should come with programs, and vice versa. An API that you must write C code to use, which cannot be invoked easily from the command line, is harder to learn and use. And contrariwise, it’s a royal pain to have interfaces whose only open, documented form is a program, so you cannot invoke them easily from a C program—for example, route(1) in older Linuxes.

—Henry Spencer

Besides easing the learning curve, library exercisers often make excellent test frameworks. Experienced Unix programmers therefore see them not just as a form of thoughtfulness to the library’s users but as an indication that the code has probably been well tested.

An important form of library layering is the plugin, a library with a set of known entry points that is dynamically loaded after startup time to perform a specialized task. For plugins to work, the calling program has to be organized largely as a documented service library that the plugin can call back into.

4.4.1 Case Study: GIMP Plugins

The GIMP (GNU Image Manipulation program) is a graphics editor designed to be driven through an interactive GUI. But GIMP is built as a library of image-manipulation and housekeeping routines called by a relatively thin layer of control code. The driver code knows about the GUI, but not directly about image formats; the library routines reverse this by knowing about image formats and operations but not about the GUI.

The library layer is documented (and, in fact shipped as “libgimp” for use by other programs). This means that C programs called “plugins” can be dynamically loaded by GIMP and call the library to do image manipulation, effectively taking over control at the same level as the GUI (see Figure 4.2).

Figure 4.2. Caller/callee relationships in GIMP with a plugin loaded.

image

Plugins are used to perform lots of special-purpose transformations such as colormap hacking, blurring and despeckling; also for reading and writing file formats not native to the GIMP core; for extensions like editing animations and window manager themes; and for lots of other sorts of image-hacking that can be automated by scripting the image-hacking logic in the GIMP core. A registry of GIMP plugins is available on the World Wide Web.

Though most GIMP plugins are small, simple C programs, it is also possible to write a plugin that exposes the library API to a scripting language; we’ll discuss this possibility in Chapter 11 when we examine the ’polyvalent program’ pattern.

4.5 Unix and Object-Oriented Languages

Since the mid-1980s most new language designs have included native support for object-oriented programming (OO). Recall that in object-oriented programming, the functions that act on a particular data structure are encapsulated with the data in an object that can be treated as a unit. By contrast, modules in non-OO languages make the association between data and the functions that act on it rather accidental, and modules frequently leak data or bits of their internals into each other.

The OO design concept initially proved valuable in the design of graphics systems, graphical user interfaces, and certain kinds of simulation. To the surprise and gradual disillusionment of many, it has proven difficult to demonstrate significant benefits of OO outside those areas. It’s worth trying to understand why.

There is some tension and conflict between the Unix tradition of modularity and the usage patterns that have developed around OO languages. Unix programmers have always tended to be a bit more skeptical about OO than their counterparts elsewhere. Part of this is because of the Rule of Diversity; OO has far too often been promoted as the One True Solution to the software-complexity problem. But there is something else behind it as well, an issue which is worth exploring as background before we evaluate specific OO (object-oriented) languages in Chapter 14. It will also help throw some characteristics of the Unix style of non-OO programming into sharper relief.

We observed above that the Unix tradition of modularity is one of thin glue, a minimalist approach with few layers of abstraction between the hardware and the top-level objects of a program. Part of this is the influence of C. It takes serious effort to simulate true objects in C. Because that’s so, piling up abstraction layers is an exhausting thing to do. Thus, object hierarchies in C tend to be relatively flat and transparent. Even when Unix programmers use other languages, they tend to want to carry over the thin-glue/shallow-layering style that Unix models have taught them.

OO languages make abstraction easy—perhaps too easy. They encourage architectures with thick glue and elaborate layers. This can be good when the problem domain is truly complex and demands a lot of abstraction, but it can backfire badly if coders end up doing simple things in complex ways just because they can.

All OO languages show some tendency to suck programmers into the trap of excessive layering. Object frameworks and object browsers are not a substitute for good design or documentation, but they often get treated as one. Too many layers destroy transparency: It becomes too difficult to see down through them and mentally model what the code is actually doing. The Rules of Simplicity, Clarity, and Transparency get violated wholesale, and the result is code full of obscure bugs and continuing maintenance problems.

This tendency is probably exacerbated because a lot of programming courses teach thick layering as a way to satisfy the Rule of Representation. In this view, having lots of classes is equated with embedding knowledge in your data. The problem with this is that too often, the ’smart data’ in the glue layers is not actually about any natural entity in whatever the program is manipulating—it’s just about being glue. (One sure sign of this is a proliferation of abstract subclasses or ’mixins’.)

Another side effect of OO abstraction is that opportunities for optimization tend to disappear. For example, a + a + a + a can become a * 4 and even a << 2 if a is an integer. But if one creates a class with operators, there is nothing to indicate if they are commutative, distributive, or associative. Since one isn’t supposed to look inside the object, it’s not possible to know which of two equivalent expressions is more efficient. This isn’t in itself a good reason to avoid using OO techniques on new projects; that would be premature optimization. But it is reason to think twice before transforming non-OO code into a class hierarchy.

Unix programmers tend to share an instinctive sense of these problems. This tendency appears to be one of the reasons that, under Unix, OO languages have failed to displace non-OO workhorses like C, Perl (which actually has OO facilities, but they’re not heavily used), and shell. There is more vocal criticism of OO in the Unix world than orthodoxy permits elsewhere; Unix programmers know when not to use OO; and when they do use OO languages, they spend more effort on trying to keep their object designs uncluttered. As the author of The Elements of Networking Style once observed in a slightly different context [Padlipsky]: “If you know what you’re doing, three layers is enough; if you don’t, even seventeen levels won’t help”.

One reason that OO has succeeded most where it has (GUIs, simulation, graphics) may be because it’s relatively difficult to get the ontology of types wrong in those domains. In GUIs and graphics, for example, there is generally a rather natural mapping between manipulable visual objects and classes. If you find yourself proliferating classes that have no obvious mapping to what goes on in the display, it is correspondingly easy to notice that the glue has gotten too thick.

One of the central challenges of design in the Unix style is how to combine the virtue of detachment (simplifying and generalizing problems from their original context) with the virtue of thin glue and shallow, flat, transparent hierarchies of code and design.

We’ll return to some of these points and apply them when we discuss object-oriented languages in Chapter 14.

4.6 Coding for Modularity

Modularity is expressed in good code, but it primarily comes from good design. Here are some questions to ask about any code you work on that might help you improve its modularity:

• How many global variables does it have? Global variables are modularity poison, an easy way for components to leak information to each other in careless and promiscuous ways.8

8 Globals also mean your code cannot be reentrant; that is, multiple instances in the same process are likely to step on each other.

• Is the size of your individual modules in Hatton’s sweet spot? If your answer is “No, many are larger”, you may have a long-term maintenance problem. Do you know what your own sweet spot is? Do you know what it is for other programmers you are cooperating with? If not, best be conservative and stick to sizes near the low end of Hatton’s range.

• Are the individual functions in your modules too large? This is not so much a matter of line count as it is of internal complexity. If you can’t informally describe a function’s contract with its callers in one line, the function is probably too large.9

9 Many years ago, I learned from Kernighan & Plauger’s The Elements of Programming Style a useful rule. Write that one-line comment immediately after the prototype of your function. For every function, without exception.

Personally I tend to break up a subprogram when there are too many local variables. Another clue is [too many] levels of indentation. I rarely look at length.

—Ken Thompson

• Does your code have internal APIs—that is, collections of function calls and data structures that you can describe to others as units, each sealing off some layer of function from the rest of the code? A good API makes sense and is understandable without looking at the implementation behind it. The classic test is this: Try to describe it to another programmer over the phone. If you fail, it is very probably too complex, and poorly designed.

• Do any of your APIs have more than seven entry points? Do any of your classes have more than seven methods each? Do your data structures have more than seven members?

• What is the distribution of the number of entry points per module across the project?10 Does it seem uneven? Do the modules with lots of entry points really need that many? Module complexity tends to rise as the square of the number of entry points, too—yet another reason simple APIs are better than complicated ones.

10 A cheap way to collect this information is to analyze the tags files generated by a utility like etags(1) or ctags(1).

You might find it instructive to compare these with our checklist of questions about transparency, and discoverability in Chapter 6.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset