Recursive Make: A Critical Analysis
By Peter Miller
Abstract
For large UNIX projects, the traditional method of building the project is to use recursive make. On some projects, this results in build times which are unacceptably large, when all you want to do is change one file. In examining the source of the overly long build times, it became evident that a number of apparently unrelated problems combine to produce the delay, but on analysis all have the same root cause.
This paper explores a number of problems regarding the use of recursive make, and shows that they are all symptoms of the same problem. Symptoms that the UNIX community have long accepted as a fact of life, but which need not be endured any longer. These problems include recursive makes which take “forever” to work out that they need to do nothing, recursive makes which do too much, or too little, and recursive makes which are overly sensitive to changes in the source code and require constant Makefile intervention to keep them working.
The resolution of these problems can be found by looking at what make does, from first principles, and then analyzing the effects of introducing recursive make to this activity. The analysis shows that the problem stems from the artificial partitioning of the build into separate subsets. This, in turn, leads to the symptoms described. To avoid the symptoms, it is only necessary to avoid the separation: use a single make session to build the whole project, which is not quite the same as a single Makefile.
This conclusion runs counter to much accumulated folk wisdom about building large projects on UNIX. Some of the main objections raised by this folk wisdom are examined and shown to be unfounded. The results of actual use are far more encouraging, with routine development performance improvements significantly better than intuition may indicate, and without the intuitively expected compromise of modularity. The use of a whole-project make is not as difficult to put into practice as it may at first appear.
Introduction
For large UNIX software development projects, the traditional methods of building the project use what has come to be known as “recursive make.” This refers to the use of a hierarchy of directories containing source files for the modules which make up the project, where each of the sub-directories contains a Makefile which describes the rules and instructions for the make program. The complete project build is done by arranging for the top-level Makefile to change directory into each of the sub-directories and recursively invoke make.
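A minimal sketch of such a top-level Makefile (the module names are illustrative, not taken from any particular project) might be:

```make
# Hypothetical recursive top-level Makefile: visit each module
# directory and invoke a subordinate make there.
MODULES = ant bee

all:
	for dir in $(MODULES); do \
		(cd $$dir && $(MAKE) all); \
	done
```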
This paper explores some significant problems encountered when developing software projects using the recursive make technique. A simple solution is offered, and some of the implications of that solution are explored.
1 of 22 8/7/21, 4:26 AM
Recursive Make Considered Harmful [Link]
Recursive make results in a directory tree which looks something like figure 1. This hierarchy of modules can be nested arbitrarily deep. Real-world projects often use two- and three-level structures.
Figure 1
Assumed Knowledge
This paper assumes that the reader is familiar with developing software on UNIX, with the make program, and with the issues of C programming and include file dependencies.
This paper assumes that you have installed GNU Make on your system and are moderately familiar with its features.
Some features of make described below may not be available if you are using the limited version supplied by your
vendor.
The Problem
There are numerous problems with recursive make , and they are usually observed daily in practice. Some of these
problems include:
It is very hard to get the order of the recursion into the subdirectories correct. This order is very unstable and frequently needs to be manually “tweaked.” Increasing the number of directories, or increasing the depth in the directory tree, causes this order to be increasingly unstable.
It is often necessary to do more than one pass over the subdirectories to build the whole system. This, naturally,
leads to extended build times.
Because the builds take so long, some dependency information is omitted, otherwise development builds take
unreasonable lengths of time, and the developers are unproductive. This usually leads to things not being updated
when they need to be, requiring frequent “clean” builds from scratch, to ensure everything has actually been built.
Because inter-directory dependencies are either omitted or too hard to express, the Makefiles are often written
to build too much to ensure that nothing is left out.
The inaccuracy of the dependencies, or the simple lack of dependencies, can result in a product which is
incapable of building cleanly, requiring the build process to be carefully watched by a human.
Related to the above, some projects are incapable of taking advantage of various “parallel make” implementations, because the build does patently silly things.
Not all projects experience all of these problems. Those that do experience the problems may do so intermittently, and
dismiss the problems as unexplained “one off” quirks. This paper attempts to bring together a range of symptoms
observed over long practice, and presents a systematic analysis and solution.
It must be emphasized that this paper does not suggest that make itself is the problem. This paper is working from the premise that make does not have a bug, that make does not have a design flaw. The problem is not in make at all, but rather in the input given to make – the way make is being used.
Analysis
Before it is possible to address these seemingly unrelated problems, it is first necessary to understand what make does and how it does it. It is then possible to look at the effects recursive make has on how make behaves.
make determines how to build the target by constructing a directed acyclic graph, the DAG familiar to many Computer Science students. The vertices of this graph are the files in the system, and the edges of this graph are the inter-file dependencies. The edges of the graph are directed because the pair-wise dependencies are ordered, resulting in an acyclic graph – things which look like loops are resolved by the direction of the edges.
This paper will use a small example project for its analysis. While the number of files in this example is small, there is sufficient complexity to demonstrate all of the above recursive make problems. First, however, the project is presented in a non-recursive form (figure 2).
Figure 2
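The figure itself did not survive extraction. Based on the modules used later in the paper (ant and bee), the whole-project Makefile was along these lines – a reconstructed sketch, with the compile and link rules written out explicitly:

```make
# Reconstructed sketch of the example project's Makefile; the
# implicit rules for compiling and linking are spelled out.
OBJ = main.o parse.o

prog: $(OBJ)
	$(CC) -o $@ $(OBJ)

main.o: main.c parse.h
	$(CC) -c main.c

parse.o: parse.c parse.h
	$(CC) -c parse.c
```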
Some of the implicit rules of make are presented here explicitly, to assist the reader in converting the Makefile into its equivalent DAG.
The above Makefile can be drawn as a DAG in the form shown in figure 3.
Figure 3
This is an acyclic graph because of the arrows which express the ordering of the relationship between the files. If there were a circular dependency according to the arrows, it would be an error.
Note that the object files (.o) are dependent on the include files (.h) even though it is the source files (.c) which do the including. This is because if an include file changes, it is the object files which are out-of-date, not the source files.
The second part of what make does is to perform a postorder traversal of the DAG. That is, the dependencies are visited first. The actual order of traversal is undefined, but most make implementations work down the graph from left to right for edges below the same vertex, and most projects implicitly rely on this behaviour. The last-modified time of each file is examined, and higher files are determined to be out-of-date if any of the lower files on which they depend are younger. Where a file is determined to be out-of-date, the action associated with the relevant graph edge is performed (in the above example, a compile or a link).
The use of recursive make affects both phases of the operation of make : it causes make to construct an inaccurate
DAG, and it forces make to traverse the DAG in an inappropriate order.
Recursive Make
To examine the effects of recursive makes, the above example will be artificially segmented into two modules, each with its own Makefile, and a top-level Makefile to invoke them both.
This example is intentionally artificial, and thoroughly so. However, all “modularity” of all projects is artificial, to some extent. Consider: for many projects, the linker flattens it all out again, right at the end. The directory structure is as shown in figure 4.
Figure 4
all: main.o
main.o: main.c ../bee/parse.h
$(CC) -I../bee -c main.c
Figure 5
Figure 6
Take a close look at the DAGs. Notice how neither is complete – there are vertices and edges (files and dependencies) missing from both DAGs. When the entire build is done from the top level, everything will work.
But what happens when small changes occur? For example, what would happen if the parse.c and parse.h files were generated from a parse.y yacc grammar? This would add the following lines to the bee/Makefile:
Figure 7
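The lines themselves are missing from this copy; they would have been the standard yacc rules, roughly as follows (a reconstruction, assuming the usual y.tab.c and y.tab.h output names):

```make
# Generate parse.c and parse.h from the yacc grammar; yacc -d
# writes y.tab.c and y.tab.h, which are renamed into place.
parse.c parse.h: parse.y
	$(YACC) -d parse.y
	mv y.tab.c parse.c
	mv y.tab.h parse.h
```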
This change has a simple effect: if parse.y is edited, main.o will not be constructed correctly. This is because the DAG for ant knows about only some of the dependencies of main.o, and the DAG for bee knows none of them.
To understand why this happens, it is necessary to look at the actions make will take from the top level. Assume that the project is in a self-consistent state. Now edit parse.y in such a way that the generated parse.h file will have non-trivial differences. When the top-level make is invoked, first ant and then bee is visited. But ant/main.o is not recompiled, because bee/parse.h has not yet been regenerated and thus does not yet indicate that main.o is out-of-date. It is not until bee is visited by the recursive make that parse.c and parse.h are reconstructed, followed by parse.o. When the program is linked, main.o and parse.o are non-trivially incompatible. That is, the program is wrong.
Traditional Solutions
There are three traditional fixes for the above “glitch.”
Reshuffle
The first is to manually tweak the order of the modules in the top-level Makefile. But why is this tweak required at all? Isn’t make supposed to be an expert system? Is make somehow flawed, or did something else go wrong?
To answer this question, it is necessary to look, not at the graphs, but at the order of traversal of the graphs. In order to operate correctly, make needs to perform a postorder traversal, but in separating the DAG into two pieces, make has not been allowed to traverse the graph in the necessary order – instead the project has dictated an order of traversal. An order which, when you consider the original graph, is plain wrong. Tweaking the top-level Makefile corrects the order to one similar to that which make could have used. Until the next dependency is added...
Note that make -j (parallel build) invalidates many of the ordering assumptions implicit in the reshuffle solution, making it useless. And then there are all of the sub-makes all doing their builds in parallel, too.
Repetition
The second traditional solution is to make more than one pass in the top-level Makefile , something like this:
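The example did not survive extraction; a sketch of such a two-pass top-level Makefile (module names illustrative) is:

```make
# Hypothetical two-pass recursive build: every module is visited
# twice, in the hope that the second pass picks up whatever the
# first pass built too early to see.
MODULES = ant bee

all:
	for dir in $(MODULES); do (cd $$dir && $(MAKE) all); done
	for dir in $(MODULES); do (cd $$dir && $(MAKE) all); done
```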
This doubles the length of time it takes to perform the build. But that is not all: there is no guarantee that two passes are
enough! The upper bound of the number of passes is not even proportional to the number of modules, it is instead
proportional to the number of graph edges which cross module boundaries.
Overkill
We have already seen an example of how recursive make can build too little, but another common problem is to build
too much. The third traditional solution to the above glitch is to add even more lines to ant/Makefile :
.PHONY: ../bee/parse.h
../bee/parse.h:
cd ../bee; \
make clean; \
make all
This means that whenever main.o is made, parse.h will always be considered to be out-of-date. All of bee will always be rebuilt, including parse.h, and so main.o will always be rebuilt, even if everything was self-consistent.
Note that make -j (parallel build) invalidates many of the ordering assumptions implicit in the overkill solution,
making it useless, because all of the sub-makes are all doing their builds (“clean” then “all”) in parallel, constantly
interfering with each other in non-deterministic ways.
Prevention
The above analysis is based on one simple action: the DAG was artificially separated into incomplete pieces. This separation resulted in all of the problems familiar to recursive make builds.
Did make get it wrong? No. This is a case of the ancient GIGO principle: Garbage In, Garbage Out. Incomplete Makefiles are wrong Makefiles.
To avoid these problems, don’t break the DAG into pieces; instead, use one Makefile for the entire project. It is not the recursion itself which is harmful, it is the crippled Makefiles which are used in the recursion which are wrong. It is not a deficiency of make itself that recursive make is broken; it does the best it can with the flawed input it is given.
“But, but, but... You can’t do that!” I hear you cry. “A single Makefile is too big, it’s unmaintainable, it’s too hard to write the rules, you’ll run out of memory, I only want to build my little bit, the build will take too long. It’s just not practical.”
These are valid concerns, and they frequently lead make users to the conclusion that re-working their build process
does not have any short- or long-term bene�ts. This conclusion is based on ancient, enduring, false assumptions.
Recursive Makefiles have a great deal of repetition. Many projects solve this by using include files. By using a single Makefile for the project, the need for the “common” include files disappears – the single Makefile is the common part.
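One common shape for this, sketched here under the assumption of per-module fragments named module.mk (the names are illustrative), is a short top-level Makefile which gathers a fragment from each module:

```make
# Hypothetical whole-project Makefile: one make session, with each
# module contributing its sources via a small include fragment.
MODULES = ant bee
include $(patsubst %,%/module.mk,$(MODULES))

OBJ = $(patsubst %.c,%.o,$(filter %.c,$(SRC)))

prog: $(OBJ)
	$(CC) -o $@ $(OBJ) $(LIBS)
```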
GCC allows a -o option in conjunction with the -c option, and GNU Make knows this. This results in the implicit compilation rule placing the output in the correct place. Older and dumber C compilers, however, may not allow the -o option with the -c option, and will leave the object file in the top-level directory (i.e. the wrong directory). There are three ways for you to fix this: get GNU Make and GCC, override the built-in rule with one which does the right thing, or complain to your vendor.
Also, K&R C compilers will start the double-quote include path ( #include "filename.h" ) from the current directory. This will not do what you want. ANSI C compliant C compilers, however, start the double-quote include path from the directory in which the source file appears; thus, no source changes are required. If you don’t have an ANSI C compliant C compiler, you should consider installing GCC on your system as soon as possible.
Developers always have the option of giving make a specific target. This is always the case; it’s just that we usually rely on the default target in the Makefile in the current directory to shorten the command line for us. Building “my little bit” can still be done with a whole-project Makefile, simply by using a specific target, and an alias if the command line is too long.
Is doing a full project build every time so absurd? If a change made in a module has repercussions in other modules, because there is a dependency the developer is unaware of (but the Makefile is aware of), isn’t it better that the developer find out as early as possible? Dependencies like this will be found, because the DAG is more complete than in the recursive case.
The developer is rarely a seasoned old salt who knows every one of the million lines of code in the product. More likely the developer is a short-term contractor or a junior. You don’t want implications like these to blow up after the changes are integrated with the master source; you want them to blow up on the developer in some nice safe sand-box far away from the master source.
If you want to make “just your little bit” because you are concerned that performing a full project build will corrupt the project master source, due to the directory structure used in your project, see the “Projects versus Sand-Boxes” section below.
Project Builds
Consider a hypothetical project with 1000 source (.c) files, each of which has its calling interface defined in a corresponding include (.h) file with defines, type declarations and function prototypes. These 1000 source files include their own interface definition, plus the interface definitions of any other module they may call. These 1000 source files are compiled into 1000 object files which are then linked into an executable program. This system has some 3000 files which make must be told about, and be told about the include dependencies, and make must also explore the possibility that implicit rules (.y -> .c, for example) may be necessary.
In order to build the DAG, make must “stat” 3000 files, plus an additional 2000 files or so, depending on which implicit rules your make knows about and your Makefile has left enabled. On the author’s humble 66MHz i486 this takes about 10 seconds; on native disk on faster platforms it goes even faster. With NFS over 10MB Ethernet it takes about 10 seconds, no matter what the platform.
This is an astonishing statistic! Imagine being able to do a single file compile, out of 1000 source files, in only 10 seconds, plus the time for the compilation itself.
Breaking the set of files up into 100 modules, and running it as a recursive make, takes about 25 seconds. The repeated process creation for the subordinate make invocations takes quite a long time.
Hang on a minute! On real-world projects with less than 1000 files, it takes an awful lot longer than 25 seconds for make to work out that it has nothing to do. For some projects, doing it in only 25 minutes would be an improvement!
The above result tells us that it is not the number of files which is slowing us down (that only takes 10 seconds), and it is not the repeated process creation for the subordinate make invocations (that only takes another 15 seconds). So just what is taking so long?
The traditional solutions to the problems introduced by recursive make often increase the number of subordinate
make invocations beyond the minimum described here; e.g. to perform multiple repetitions (see ‘Repetition’, above), or
to overkill cross-module dependencies (see ‘Overkill’, above). These can take a long time, particularly when combined,
but do not account for some of the more spectacular build times; what else is taking so long?
Complexity of the Makefile is what is taking so long. This is covered, below, in the ‘Efficient Makefiles’ section.
Development Builds
If, as in the 1000 file example, it only takes 10 seconds to figure out which one of the files needs to be recompiled, there is no serious threat to the productivity of developers if they do a whole-project make as opposed to a module-specific make. The advantage for the project is that the module-centric developer is reminded at relevant times (and only relevant times) that their work has wider ramifications.
Consistently using C include files which contain accurate interface definitions (including function prototypes) will produce compilation errors in many of the cases which would otherwise result in a defective product. By doing whole-project builds, developers discover such errors very early in the development process, and can fix the problems when they are least expensive.
On an ancient machine with a small address space, such as a PDP11, the above project with its 3000 files detailed in the whole-project Makefile would probably not allow the DAG and rule actions to fit in memory.
But we are not using PDP11s any more. The physical memory of modern computers exceeds 10MB for small computers, and virtual memory often exceeds 100MB. It is going to take a project with hundreds of thousands of source files to exhaust virtual memory on a small modern computer. As the 1000 source file example takes less than 100KB of memory (try it, I did) it is unlikely that any project manageable in a single directory tree on a single disk will exhaust your computer’s memory.
An alternative often suggested is to add the missing cross-module dependencies to each module’s Makefile by hand. This has problems of its own:
The developer needs to remember to do this. The problems will not affect the developer of the module; they will affect the developers of other modules. There is no trigger to remind the developer to do this, other than the ire of fellow developers.
It is difficult to work out where the changes need to be made. Potentially every Makefile in the entire project needs to be examined for possible modifications. Of course, you can wait for your fellow developers to find them for you.
The include dependencies will be recomputed unnecessarily, or will be interpreted incorrectly. This is because make is string based, and thus “.” and “../ant” are two different places, even when you are in the ant directory. This is of concern when include dependencies are automatically generated – as they are for all large projects.
By making sure that each Makefile is complete, you arrive at the point where the Makefile for at least one module
contains the equivalent of a whole-project Makefile (recall that these modules form a single project and are thus
inter-connected), and there is no need for the recursion anymore.
Efficient Makefiles
The central theme of this paper is the semantic side-effects of artificially separating a Makefile into the pieces necessary to perform a recursive make. However, once you have a large number of Makefiles, the speed at which make can interpret this multitude of files also becomes an issue.
Builds can take “forever” for both these reasons: the traditional fixes for the separated DAG may be building too much, and your Makefile may be inefficient.
Deferred Evaluation
The text in a Makefile must somehow be read from a text file and understood by make so that the DAG can be constructed, and the specified actions attached to the edges. This is all kept in memory.
The input language for Makefiles is deceptively simple. A crucial distinction that often escapes novices and experts alike is that make’s input language is text based, as opposed to token based, as is the case for C or AWK. make does the very least possible to process input lines and stash them away in memory.
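Consider the simplest kind of assignment (the original example is missing from this copy; this is the obvious reconstruction):

```make
OBJ = main.o parse.o
```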
Humans read this as the variable OBJ being assigned two filenames, main.o and parse.o. But make does not see it that way. Instead, OBJ is assigned the string “main.o parse.o”. It gets worse:
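A reconstructed sketch of the next example, using a substitution reference:

```make
SRC = main.c parse.c
OBJ = $(SRC:.c=.o)
```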
In this case humans expect make to assign two filenames to OBJ, but make actually assigns the string “$(SRC:.c=.o)”. This is because make is a macro language with deferred evaluation, as opposed to one with variables and immediate evaluation.
If this does not seem too problematic, consider the following Makefile.
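The Makefile in question is missing from this copy; a reconstruction consistent with the counts given below (the $(shell) command appears once, but SRC is named twice inside OBJ) is:

```make
# Every time OBJ is expanded, SRC is expanded twice, so the shell
# command runs twice; 'Ouch!' on stderr makes each run visible.
SRC = $(shell echo 'Ouch!' 1>&2 ; echo *.[cy])
OBJ = \
	$(patsubst %.c,%.o,$(filter %.c,$(SRC))) \
	$(patsubst %.y,%.o,$(filter %.y,$(SRC)))

test: $(OBJ)
	$(CC) -o $@ $(OBJ)
```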
How many times will the shell command be executed? Ouch! It will be executed twice just to construct the DAG, and a further two times if the rule needs to be executed.
If this shell command does anything complex or time consuming (and it usually does) it will take four times longer than you thought.
But it is worth looking at the other portions of that OBJ macro. Each time it is named, a huge amount of processing is
performed:
The argument to shell is a single string (all built-in functions take a single string argument). The string is executed in a sub-shell, and the standard output of this command is read back in, translating newlines into spaces. The result is a single string.
The argument to filter is a single string. This argument is broken into two strings at the first comma. These two strings are then each broken into sub-strings separated by spaces. The first set are the patterns, the second set are the filenames. Then, for each of the pattern sub-strings, if a filename sub-string matches it, that filename is included in the output. Once all of the output has been found, it is re-assembled into a single space-separated string.
The argument to patsubst is a single string. This argument is broken into three strings at the first and second commas. The third string is then broken into sub-strings separated by spaces; these are the filenames. Then, each of the filenames which matches the first string is substituted according to the second string. If a filename does not match, it is passed through unchanged. Once all of the output has been generated, it is reassembled into a single space-separated string.
Notice how many times those strings are disassembled and reassembled. Notice how many ways that happens. This is slow. The example here names just two files, but consider how inefficient this would be for 1000 files. Doing it four times becomes decidedly inefficient.
If you are using a dumb make that has no substitutions and no built-in functions, this cannot bite you. But a modern make has lots of built-in functions and can even invoke shell commands on-the-fly. The semantics of make’s text manipulation is such that string manipulation in make is very CPU intensive, compared to performing the same string manipulations in C or AWK.
Immediate Evaluation
Modern make implementations have an immediate evaluation := assignment operator. The above example can be re-
written as
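Assuming the same $(shell)-based example as in the previous section, the immediate-evaluation version is:

```make
# With :=, the shell command and the substitutions are performed
# exactly once, when the Makefile is read.
SRC := $(shell echo 'Ouch!' 1>&2 ; echo *.[cy])
OBJ := \
	$(patsubst %.c,%.o,$(filter %.c,$(SRC))) \
	$(patsubst %.y,%.o,$(filter %.y,$(SRC)))

test: $(OBJ)
	$(CC) -o $@ $(OBJ)
```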
Note that both assignments are immediate evaluation assignments. If the first were not, the shell command would always be executed twice. If the second were not, the expensive substitutions would be performed at least twice and possibly four times.
As a rule of thumb: always use immediate evaluation assignment unless you knowingly want deferred evaluation.
Include Files
Many Makefiles perform the same text processing (the filters above, for example) for every single make run, but the results of the processing rarely change. Wherever practical, it is more efficient to record the results of the text processing into a file, and have the Makefile include this file.
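As a minimal sketch of the idea (the file name filelist.mk is an invented example):

```make
# Cache the result of expensive text processing in a file, and
# include the file; regenerate it only when its inputs change.
include filelist.mk

filelist.mk: Makefile
	echo "SRC := `echo *.c`" > $@
```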
Dependencies
Don’t be miserly with include files. They are relatively inexpensive to read, compared to $(shell), so more rather than less doesn’t greatly affect efficiency.
As an example of this, it is first necessary to describe a useful feature of GNU Make: once a Makefile has been read in, if any of its included files were out-of-date (or do not yet exist), they are re-built, and then make starts again, which has the result that make is now working with up-to-date include files. This feature can be exploited to obtain automatic include file dependency tracking for C sources. The obvious way to implement it, however, has a subtle flaw.
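The obvious implementation the text refers to is missing here; it was along these lines (a reconstruction): a single dependencies file for the directory, generated by the compiler and included by the Makefile.

```make
# Naive automatic dependencies: one file for the whole directory.
# GNU Make rebuilds it if out-of-date, then re-reads the Makefile.
depend: Makefile $(SRC)
	gcc -MM -MG $(SRC) > $@

include depend
```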
file.o: file.c include.h ...
The most simple implementation of this is to use GCC, but you will need an equivalent awk script or C program if you
have a different compiler:
#!/bin/sh
gcc -MM -MG "$@"
This implementation of tracking C include dependencies has several serious flaws, but the one most commonly discovered is that the dependencies file does not, itself, depend on the C include files. That is, it is not re-built if one of the include files changes. There is no edge in the DAG joining the dependencies vertex to any of the include file vertices. If an include file changes to include another file (a nested include), the dependencies will not be recalculated, and potentially the C file will not be recompiled, and thus the program will not be re-built correctly.
A classic build-too-little problem, caused by giving make inadequate information, and thus causing it to build an inadequate DAG and reach the wrong conclusion.
The obvious counter is to rebuild the dependencies on every run of make. But now, even if the project is completely up-to-date, the dependencies will be re-built. For a large project, this is very wasteful, and can be a major contributor to make taking “forever” to work out that nothing needs to be done.
There is a second problem: if any one of the C files changes, all of the C files will be re-scanned for include dependencies. This is as inefficient as having a Makefile which reads
prog: $(SRC)
$(CC) -o $@ $(SRC)
What is needed, in exact analogy to the C case, is to have an intermediate form. This is usually given a .d suffix. By exploiting the fact that more than one file may be named in an include line, there is no need to “link” all of the .d files together:
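The fragment itself is missing from this copy; a reconstruction (assuming the wrapper script shown earlier is installed as depend.sh) is:

```make
# One .d file per source file; the include line names them all,
# so only the .d files for edited sources are rebuilt.
%.d: %.c
	depend.sh $(CFLAGS) $*.c > $@

include $(SRC:.c=.d)
```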
This has one more thing to fix: just as the object (.o) files depend on the source files and the include files, so do the dependency (.d) files.
#!/bin/sh
gcc -MM -MG "$@" |
sed -e 's@^\(.*\)\.o:@\1.d \1.o:@'
This method of determining include file dependencies results in the Makefile including more files than the original method, but opening files is less expensive than rebuilding all of the dependencies every time. Typically a developer will edit one or two files before re-building; this method will rebuild the exact dependency file affected (or more than one, if you edited an include file). On balance, this will use less CPU, and less time.
In the case of a build where nothing needs to be done, make will actually do nothing, and will work this out very quickly.
However, the above technique assumes your project fits entirely within the one directory. For large projects, this usually isn’t the case.
#!/bin/sh
DIR="$1"
shift 1
case "$DIR" in
"" | ".")
gcc -MM -MG "$@" |
sed -e 's@^\(.*\)\.o:@\1.d \1.o:@'
;;
*)
gcc -MM -MG "$@" |
sed -e "s@^\(.*\)\.o:@$DIR/\1.d $DIR/\1.o:@"
;;
esac
And the rule needs to change, too, to pass the directory as the �rst argument, as the script expects.
%.d: %.c
	depend.sh `dirname $*.c` $(CFLAGS) $*.c > $@
Note that the .d files will be relative to the top-level directory. Writing them so that they can be used from any level is possible, but beyond the scope of this paper.
Multiplier
All of the inefficiencies described in this section compound together. If you do 100 Makefile interpretations, once for each module, checking 1000 source files can take a very long time – if the interpretation requires complex processing or performs unnecessary work, or both. A whole-project make, on the other hand, only needs to interpret a single Makefile.
Projects versus Sand-Boxes
It is possible to see the whole-project make proposed here as impractical, because it does not match the evolved methods of your development process.
The whole-project make proposed here does have an effect on development methods: it can give you cleaner and simpler build environments for your developers. By using make’s VPATH feature, it is possible to copy only those files you need to edit into your private work area, often called a sand-box.
The simplest explanation of what VPATH does is to make an analogy with the include file search path specified using -Ipath options to the C compiler. This set of options describes where to look for files, just as VPATH tells make where to look for files.
By using VPATH, it is possible to “stack” the sand-box on top of the project master source, so that files in the sand-box take precedence, but it is the union of all the files which make uses to perform the build (see figure 8).
Figure 8
In this environment, the sand-box has the same tree structure as the project master source. This allows developers to
safely change things across separate modules, e.g. if they are changing a module interface. It also allows the sand-box
to be physically separate – perhaps on a different disk, or under their home directory. It also allows the project master
source to be read-only, if you have (or would like) a rigorous check-in procedure.
Note: in addition to adding a VPATH line to your development Makefile , you will also need to add -I options to
the CFLAGS macro, so that the C compiler uses the same path as make does. This is simply done with a 3-line
Makefile in your work area – set a macro, set the VPATH, and then include the Makefile from the project master
source.
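Such a work-area Makefile might look like the following sketch; the path /project/master is a hypothetical location for the project master source:

```make
# Work-area Makefile: set a macro, set the VPATH, then include the
# Makefile from the project master source.
MASTER := /project/master      # hypothetical master-source location
VPATH := $(MASTER)             # make searches here for files not found locally
CFLAGS += -I. -I$(MASTER)      # the C compiler searches the same path as make
include $(MASTER)/Makefile
```

Files copied into the sand-box shadow their master copies, because make finds them in the current directory before consulting VPATH.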
VPATH Semantics
For the above discussion to apply, you need to use GNU Make 3.76 or later. For versions of GNU Make earlier than 3.76,
you will need Paul Smith’s VPATH+ patch. This may be obtained from [Link]
The POSIX semantics of VPATH are slightly brain-dead, so many other make implementations are too limited. You may
want to consider installing GNU Make.
Figure 9
prog: $(OBJ)
	$(CC) -o $@ $(OBJ) $(LIBS)

# calculate C include dependencies
%.d: %.c
	depend.sh `dirname $*.c` $(CFLAGS) $*.c > $@
This looks absurdly large, but it has all of the common elements in the one place, so that each of the modules’ make
includes may be small.
SRC += ant/main.c

SRC += bee/parse.y
LIBS += -ly
%.c %.h: %.y
	$(YACC) -d $*.y
	mv y.tab.c $*.c
	mv y.tab.h $*.h
Notice that the built-in rules are used for the C files, but we need special yacc processing to get the generated .h file.
The savings in this example look irrelevant, because the top-level Makefile is so large. But consider if there were 100
modules, each with only a few non-comment lines, and those specifically relevant to the module. The savings soon add
up to a total size often less than the recursive case, without loss of modularity.
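The mechanics that keep the module includes this small can be sketched as follows; the include file name module.mk and the module list are assumptions for illustration:

```make
# Top-level glue (sketch): each module's include file contributes only
# additions to SRC and LIBS; the top level derives everything else once.
MODULES := ant bee
SRC :=
LIBS :=

# pull in ant/module.mk, bee/module.mk, ...
include $(patsubst %,%/module.mk,$(MODULES))

# derive the object list from the accumulated sources
OBJ := $(patsubst %.c,%.o,$(filter %.c,$(SRC))) \
       $(patsubst %.y,%.o,$(filter %.y,$(SRC)))
```

Adding a new module then requires only a one-word change to MODULES and a new module.mk in the module's directory.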
The equivalent DAG of the Makefile after all of the includes looks like Figure 10.
Figure 10
The vertices and edges for the include file dependency files are also present, as these are important for make to
function correctly.
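The way those vertices enter the graph is worth making explicit. GNU Make treats every included file as a target in its own right: a stale .d file is first rebuilt (via the %.d: %.c rule), then re-read, and only then is the completed DAG walked. A sketch, with an illustrative object list:

```make
OBJ := ant/main.o bee/parse.o   # illustrative object list

# Including the .d files adds their vertices and edges to the DAG.
# Each included file is itself a target: if ant/main.d is out of date
# with respect to ant/main.c, make regenerates it, re-reads it, and
# only then walks the now-complete graph.
include $(OBJ:.o=.d)
```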
Side Effects
There are a couple of desirable side-effects of using a single Makefile .
The GNU Make -j option, for parallel builds, works even better than before. It can �nd even more unrelated
things to do at once, and no longer has some subtle problems.
The general make -k option, to continue as far as possible even in the face of errors, works even better than
before. It can �nd even more things to continue with.
Literature Survey
How can it be possible that we have been mis-using make for 20 years? How can it be possible that behaviour
previously ascribed to make’s limitations is in fact a result of mis-using it?
The author only started thinking about the ideas presented in this paper when faced with a number of ugly build
problems on utterly different projects, but with common symptoms. By stepping back from the individual projects, and
closely examining the thing they had in common, make, it became possible to see the larger pattern. Most of us are
so caught up in the minutiae of just getting the rotten build to work that we don't have time to spare for the big picture.
Especially when the item in question “obviously” works, and has done so continuously for the last 20 years.
It is interesting that the problems of recursive make are rarely mentioned in the very books Unix programmers rely on
for accurate, practical advice.
The Original Paper
It is hardly surprising that the original paper [1] did not discuss recursive make; Unix projects at the time usually fit into
a single directory.
It may be this which set the “one Makefile in every directory” concept so firmly in the collective Unix development
mind-set.
GNU Make
The GNU Make manual [2] contains several pages of material concerning recursive make; however, its discussion of
the merits or otherwise of the technique is limited to the brief statement that
This technique is useful when you want to separate makefiles for various subsystems that compose a larger
system.
Managing Projects with Make
The Nutshell book on make [3] recommends that
The cleanest way to build is to put a separate description file in each directory, and tie them together through a
master description file that invokes make recursively. While cumbersome, the technique is easier to maintain
than a single, enormous file that covers multiple directories. (p. 65)
This is despite the book’s advice only two paragraphs earlier that
make is happiest when you keep all your files in a single directory. (p. 64)
Yet the book fails to discuss the contradiction in these two statements, and goes on to describe one of the traditional
ways of treating the symptoms of incomplete DAGs caused by recursive make .
The book may give us a clue as to why recursive make has been used in this way for so many years. Notice how the
above quotes confuse the concept of a directory with the concept of a Makefile .
This paper suggests a simple change to the mind-set: directory trees, however deep, are places to store files;
Makefiles are places to describe the relationships between those files, however many.
BSD Make
The tutorial for BSD Make [4] says nothing at all about recursive make, but it is one of the few which actually
describes, however briefly, the relationship between a Makefile and a DAG (p. 30). There is also a wonderful quote:
If make doesn’t do what you expect it to, it’s a good chance the make�le is wrong. (p. 10)
Summary
This paper presents a number of related problems, and demonstrates that they are not inherent limitations of make , as
is commonly believed, but are the result of presenting incorrect information to make . This is the ancient Garbage In,
Garbage Out principle at work. Because make can only operate correctly with a complete DAG, the error is in
segmenting the Makefile into incomplete pieces.
This requires a shift in thinking: directory trees are simply a place to hold files; Makefiles are a place to remember
relationships between files. Do not confuse the two, because it is as important to accurately represent the relationships
between files in different directories as it is to represent the relationships between files in the same directory. This has
the implication that there should be exactly one Makefile for a project, but the magnitude of the description can be
managed by using a make include file in each directory to describe the subset of the project files in that directory. This
is just as modular as having a Makefile in each directory.
This paper has shown how a project build and a development build can be equally brief for a whole-project make .
Given this parity of time, the gains provided by accurate dependencies mean that this process will, in fact, be faster than
the recursive make case, and more accurate.
Inter-dependent Projects
In organizations with a strong culture of re-use, implementing whole-project make can present challenges. Rising to
these challenges, however, may require looking at the bigger picture.
A module may be shared between two programs because the programs are closely related. Clearly, the two
programs plus the shared module belong to the same project (the module may be self-contained, but the
programs are not). The dependencies must be explicitly stated, and changes to the module must result in both
programs being recompiled and re-linked as appropriate. Combining them all into a single project means that
whole-project make can accomplish this.
A module may be shared between two projects because they must inter-operate. Possibly your project is bigger
than your current directory structure implies. The dependencies must be explicitly stated, and changes to the
module must result in both projects being recompiled and re-linked as appropriate. Combining them all into a
single project means that whole-project make can accomplish this.
It is the normal case to omit the edges between your project and the operating system or other installed third
party tools. So normal that they are ignored in the Makefile s in this paper, and they are ignored in the built-in
rules of make programs.
Modules shared between your projects may fall into a similar category: if they change, you will deliberately re-
build to include their changes, or quietly include their changes whenever the next build may happen. In either case,
you do not explicitly state the dependencies, and whole-project make does not apply.
Re-use may be better served if the module were used as a template, and divergence between two projects is seen
as normal. Duplicating the module in each project allows the dependencies to be explicitly stated, but requires
additional effort if maintenance is required to the common portion.
How to structure dependencies in a strong re-use environment thus becomes an exercise in risk management . What is
the danger that omitting chunks of the DAG will harm your projects? How vital is it to rebuild if a module changes? What
are the consequences of not rebuilding automatically? How can you tell when a rebuild is necessary if the dependencies
are not explicitly stated? What are the consequences of forgetting to rebuild?
Return On Investment
Some of the techniques presented in this paper will improve the speed of your builds, even if you continue to use
recursive make . These are not the focus of this paper, merely a useful detour.
The focus of this paper is that you will get more accurate builds of your project if you use whole-project make rather
than recursive make .
The time for make to work out that nothing needs to be done will not be more, and will often be less.
The size and complexity of the total Makefile input will not be more, and will often be less.
The total Makefile input is no less modular than in the recursive case.
The difficulty of maintaining the total Makefile input will not be more, and will often be less.
The disadvantages of using whole-project make over recursive make are often un-measured. How much time is spent
figuring out why make did something unexpected? How much time is spent figuring out that make did something
unexpected? How much time is spent tinkering with the build process? These activities are often thought of as “normal”
development overheads.
Building your project is a fundamental activity. If it is performing poorly, so are development, debugging and testing.
Building your project needs to be so simple the newest recruit can do it immediately with only a single page of
instructions. Building your project needs to be so simple that it rarely needs any development effort at all. Is your build
process this simple?
References
1 Feldman, Stuart I. (1978). Make - A Program for Maintaining Computer Programs. Bell Laboratories Computing
Science Technical Report 57
2 Stallman, Richard M. and Roland McGrath (1993). GNU Make: A Program for Directing Recompilation. Free
Software Foundation, Inc.
3 Talbott, Steve (1991). Managing Projects with Make, 2nd Ed. O'Reilly & Associates, Inc.
4 de Boor, Adam (1988). PMake - A Tutorial. University of California, Berkeley
Please visit [Link] if you would like to look at some of the author's free software.