A Look at Build Systems (1): Make

This is not a tutorial! It is a collection of Make’s strengths and weaknesses.

Overview

You cannot talk about build systems without mentioning Make. Developed in the 1970s, it is available on virtually every Unix-like system, including Linux and macOS.

Make is centered around text files containing commands (makefiles) and can be used universally. A simple makefile for a C program may look like this:

main.o: main.c
	gcc -c main.c
foo.o: foo.c
	gcc -c foo.c
program: main.o foo.o
	gcc -o program main.o foo.o

Makefiles are sensitive to whitespace formatting! Tabs and spaces are syntactically different. Make sure your text editor does not convert one into the other when editing makefiles!
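If tabs are a recurring source of trouble, GNU Make 3.82 and later offer a way around them: the special variable .RECIPEPREFIX replaces the tab with a character of your choice. The following is a minimal sketch that assumes a sufficiently recent GNU Make (nmake and BSD Make do not know this variable, and the GNU Make 3.81 shipped with macOS is too old). The target name hello is made up for illustration.

```makefile
# GNU Make 3.82+ only: recipe lines start with '>' instead of a tab
.RECIPEPREFIX = >

# 'hello' is a hypothetical target that produces no file
.PHONY: hello
hello:
> @echo "this recipe line is not indented with a tab"
```

All other examples in this article use regular tab-indented recipes, including the three-rule makefile shown earlier.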

Running make program in the directory that contains the three-rule makefile above uses GCC to compile the C source files to object files and then links them into program. You could do the same with a simple shell script, but Make has two advantages:

It does only the least work necessary. For example, it doesn’t rebuild foo.o if foo.c hasn’t changed since the last build.
Due to its dependency analysis, Make can parallelize the build process by compiling foo.c and main.c concurrently (see the section on performance below).

Make is a general-purpose build system: It can be used to build any language and any kind of target. You can even use the result of one build step as a compiler for a later build step (if you build your own tools).
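To illustrate the last point, here is a minimal sketch in which one build step produces a tool that a later step runs. All file names are hypothetical: gen.c is assumed to be a small generator that prints C code to standard output, and table.c is the file it generates.

```makefile
# Final program: needs the hand-written main.o and the generated table.o
program: main.o table.o
	gcc -o program main.o table.o

main.o: main.c
	gcc -c main.c

# table.c is not written by hand; it is the output of the 'gen' tool
table.o: table.c
	gcc -c table.c

# Running the tool depends on the tool itself, so Make builds 'gen' first
table.c: gen
	./gen > table.c

# The tool is an ordinary build target
gen: gen.c
	gcc -o gen gen.c
```

Make derives the order from the dependencies alone: gen, then table.c, then table.o, and finally program. Changing gen.c re-runs the whole chain.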

Getting Make

If you use a Unix-based system, Make is usually preinstalled or available from the system’s package manager. If you use Windows with Visual Studio, you get nmake, a slightly different flavor, in the Visual Studio Command Prompt. If you prefer GNU Make on Windows, you can download it from SourceForge (don’t forget to download its dependencies as well).

Make, including its dependencies, typically takes up only about one megabyte, which makes it pretty compact by 2021 standards.

The latest Windows build is from 2006, meaning that the program has – after 30 years of development – reached absolute perfection.

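Make comes in different flavors! GNU Make and nmake are not fully compatible, for example in their handling of spaces in file paths. BSD Make and GNU Make share the basic rule syntax, but their extensions (conditionals, include directives, functions) differ so much that non-trivial makefiles are rarely portable between them.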

C/C++ Headers

The initial example will break once you add a header file foo.h and #include it in either of the C files: Changing foo.h will not trigger a re-compile of main.c or foo.c. The reason is Make’s very simple implementation: It doesn’t know anything about the C language or its implementation; it just treats files as black boxes!

The standard solution is to let the compiler generate an additional file per C/C++ source file, listing all #include files. This list is then included in the C/C++ file’s dependency list in the makefile.

GCC writes out the header dependencies of a C file when passed the -MMD option (system headers are left out; -MD lists them as well). To get the dependencies of foo.c, call:

gcc -MMD -c foo.c

Aside from compiling foo.c into foo.o, this will also generate a foo.d (dependencies) file containing a list of all headers included by foo.c (recursively):

foo.o : foo.c foo.h foo-internal.h

Note that this is valid makefile syntax. Therefore, the foo.d file can be fed back into Make directly:

-include foo.d

(The dash ignores the include directive if the file does not exist, thus suppressing an error during the first build.)

To recapitulate:

During the first build,

foo.d does not exist and is therefore ignored by the makefile.
foo.o is required to build program, but it does not exist either.
Make finds that the command gcc -MMD -c foo.c generates foo.o and runs it.
Upon success, it proceeds to build program.

During a subsequent build, after changing the header foo.h,

foo.d exists and becomes part of the makefile. It lists foo.c, foo.h, and foo-internal.h as dependencies of foo.o.
Make checks the dependencies of foo.o and finds foo.c, foo.h and foo-internal.h.
foo.h is newer than foo.o, so foo.o is considered outdated and must be rebuilt.
Make finds that the command gcc -MMD -c foo.c generates foo.o and runs it.
Upon success, it proceeds to build program.
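Put together, the feedback loop for the initial example could look like the following sketch. It assumes GCC and a Make flavor that understands -include (GNU Make does); the file names are the ones used above.

```makefile
program: main.o foo.o
	gcc -o program main.o foo.o

# -MMD compiles as usual and writes main.d / foo.d as a side effect
main.o: main.c
	gcc -MMD -c main.c

foo.o: foo.c
	gcc -MMD -c foo.c

# Pull in the generated dependency lists; the dash suppresses the error
# when they do not exist yet (first build)
-include main.d foo.d
```

Because the rules in the .d files have no commands of their own, Make merges their header lists into the rules for main.o and foo.o. If deleted or renamed headers cause "no rule to make target" errors, GCC’s -MP option is worth a look: it adds an empty rule per header to the .d file.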

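The -MM option may look identical at first glance. However, it only runs the preprocessor (nothing is compiled), prints the rule to standard output, and strips the directory part from the target name, so the rule no longer matches an object file that lives in a build directory. This causes problems as soon as your files are spread across directories!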

This complexity is inherent to the C/C++ header system and is not a weakness of Make in particular. Every general-purpose build system has to implement this feedback loop.

Growing Complexity

You wouldn’t duplicate the above code for each and every source file. Make offers a powerful macro language to handle multiple files in one go (sample taken from this StackOverflow answer):

# Assumption: all build output goes below the directory 'build'
BUILD_DIR = build

# Store a list of all files with the extension '.c' from
# the directory 'source' in the variable 'SRCS'
SRCS = $(wildcard source/*.c)

# Get a list of all object files by replacing the '.c' extension with '.o';
# add the path stored in the 'BUILD_DIR' variable as a prefix to each path;
# store the list in the variable 'OBJS'
OBJS = $(SRCS:%.c=$(BUILD_DIR)/%.o)

# The same for the dependency files
DEPS = $(SRCS:%.c=$(BUILD_DIR)/%.d)

# The program depends on all object files.
# First make sure the target's directory exists, then build the program
# by linking all object files.
# CXX is a variable with the compiler of your choice;
# CXX_FLAGS is a variable with its parameters;
# $^ means 'all dependencies' (the object files);
# $@ means 'target' (the program)
program : $(OBJS)
	mkdir -p $(@D)
	$(CXX) $(CXX_FLAGS) $^ -o $@

# Include all dependency lists right here in the makefile (should they exist)
-include $(DEPS)

# For the initial build:
# Every '.o' file depends on the corresponding '.c' file!
# Is silently merged with the above include in subsequent builds!
# Since the source files may be located in sub-directories, we have to
# make sure every directory of an object file target exists before
# actually compiling it! '-MMD' creates the dependency list.
$(BUILD_DIR)/%.o : %.c
	mkdir -p $(@D)
	$(CXX) $(CXX_FLAGS) -MMD -c $< -o $@

Without the comments, would you have been able to understand this code? Do you understand why the last line addresses the source file via $<?
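As a hint for the second question: once a dependency list has been merged in, the headers are dependencies of the object file too, so "all dependencies" would hand them to the compiler as well. The following sketch only prints what the automatic variables expand to. It reuses the file names from the header example, which are assumed to exist, and it does not actually compile anything.

```makefile
# The rule as Make sees it after foo.d has been merged in.
# Expected expansions:
#   $@    -> foo.o                        (the target)
#   $<    -> foo.c                        (the first dependency only)
#   $^    -> foo.c foo.h foo-internal.h   (all dependencies)
#   $(@D) -> .                            (directory part of the target)
foo.o: foo.c foo.h foo-internal.h
	@echo 'target:           $@'
	@echo 'first dependency: $<'
	@echo 'all dependencies: $^'
	@echo 'target directory: $(@D)'
```

Only $< is guaranteed to be the source file; $^ would pass foo.h and foo-internal.h to the compiler together with foo.c.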

I won’t say that makefiles are impossible to maintain. But just like with C, their enormous power and dense syntax come at a cost: it is really easy to shoot yourself in the foot. Just like with C, you can probably be very productive once you’ve mastered the syntax. But unlike C, makefiles are probably nothing you’ll poke around in every day, and there is a high probability that you’ll have long forgotten what you’ve written by the time it comes crashing down on you.

The syntax and macro machinery look dated and complicated, especially in direct comparison with newer tools like Ninja. Listing the same target in several rules silently merges their dependency lists instead of reporting an error. Defining build commands for the same target twice merely produces a warning in GNU Make, and the later definition wins. Assigning a macro twice silently replaces the earlier value. All of this makes errors hard to find.
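A minimal sketch of this kind of pitfall, assuming GNU Make and three hypothetical text files a.txt, b.txt and out.txt. Both rules name the same target:

```makefile
out.txt: a.txt
	cat a.txt > out.txt

# Somewhere further down, possibly in an included file:
out.txt: b.txt
	cat b.txt > out.txt
```

GNU Make accepts this file. It prints a warning about overriding the commands for out.txt, runs only the second recipe, and treats both a.txt and b.txt as dependencies. The build carries on with a result you may not have intended.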

The resemblance of Make’s syntax to Lex’s and Yacc’s is no coincidence: They all come from the same team at Bell Labs.

Paths with Spaces

Makefiles support spaces in paths, but this support is very limited and will likely blow up when you try to use it. This is a clear flaw that neither Microsoft’s MSBuild nor Ninja has.

You cannot use single quotes or double quotes with paths in makefiles – Make will interpret those as being part of the path. You need to escape the spaces by prefixing them with a backslash:

SRCS = first\ name\ with\ spaces.c another\ one.c

SRCS breaks once you use macros: Replacing the extension .c with .o via OBJS = $(SRCS:%.c=$(BUILD_DIR)/%.o) treats every space-separated fragment as a file name of its own and assigns the following list to OBJS:

first name with BUILD_DIR/spaces.o
another BUILD_DIR/one.o

This bug ticket indicates that there is no way around this limitation, and that there are more problems.
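You can watch the splitting happen with a small experiment. This is a sketch that assumes GNU Make (for the info and words functions); the directory name build is a hypothetical stand-in for BUILD_DIR.

```makefile
SRCS = first\ name\ with\ spaces.c another\ one.c

# Make's text functions split on whitespace and ignore the backslashes
$(info SRCS contains $(words $(SRCS)) words)
$(info OBJS = $(SRCS:%.c=build/%.o))

# Dummy target so that Make has something to do
.PHONY: all
all: ;
```

Running make on this file should report six words rather than two file names, and only the fragments ending in .c receive the build/ prefix.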

Make Is Fast, but Ninja Is Faster

Make’s build performance is pretty good: there are no obvious problems apart from Make running only one command at a time by default.

To run several commands in parallel, use the -jN option, where N is the number of jobs; the number of cores in your system is a good starting point.

-j (without a number) executes as many commands in parallel as possible. With hundreds of build targets, your system may come to a grinding halt!
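If you don’t want to type the job count every time, a small wrapper target can pick it for you. This is a sketch under a few assumptions: GNU Make, a target called program as in the examples above, and the nproc tool from GNU coreutils (on macOS, sysctl -n hw.ncpu should provide the same number). The names JOBS and fast are made up.

```makefile
# Number of parallel jobs; fall back to a single job if 'nproc' is missing
JOBS := $(shell nproc 2>/dev/null || echo 1)

# Hypothetical convenience target: 'make fast' builds 'program' in parallel
.PHONY: fast
fast:
	$(MAKE) -j$(JOBS) program
```

Calling make fast then starts a second Make that builds program with one job per core.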

If you feel like your machine is too slow at building your project, you can use derivatives like dmake (Distributed Make) to distribute the load over machines in your local network.

Before building, Make needs to access every file in the dependency graph (to get its last modified date). If there are many C/C++ source files, Make needs to read every dependency file of theirs as well. This causes lots of file system traffic for build updates, especially on Windows, whose file system tends to be pretty slow. Ninja, on the other hand, uses a database to process header dependencies, thus saving a large fraction of these file system accesses (stat calls).

Chrome is built from several tens of thousands of source files. Rebuilds after changing a single file were famously clocked at 10–20 seconds with Make, and at less than a second with Ninja.