August 8, 2017
Aspects of Convex Geometry
Polyhedra, Linear Programming,
Shellings, Voronoi Diagrams,
Delaunay Triangulations
Jean Gallier and Jocelyn Quaintance
Abstract: Some basic mathematical tools such as convex sets, polytopes and combinatorial
topology, are used quite heavily in applied fields such as geometric modeling, meshing, com-
puter vision, medical imaging and robotics. This report may be viewed as a tutorial and a
set of notes on convex sets, polytopes, polyhedra, combinatorial topology, Voronoi Diagrams
and Delaunay Triangulations. It is intended for a broad audience of mathematically inclined
readers.
One of my (selfish!) motivations in writing these notes was to understand the concept
of shelling and how it is used to prove the famous Euler-Poincaré formula (Poincaré, 1899)
and the more recent Upper Bound Theorem (McMullen, 1970) for polytopes. Another of my
motivations was to give a “correct” account of Delaunay triangulations and Voronoi diagrams
in terms of (direct and inverse) stereographic projections onto a sphere and prove rigorously
that the projective map that sends the (projective) sphere to the (projective) paraboloid
works correctly, that is, maps the Delaunay triangulation and Voronoi diagram w.r.t. the
lifting onto the sphere to the Delaunay diagram and Voronoi diagrams w.r.t. the traditional
lifting onto the paraboloid. Here, the problem is that this map is only well defined (total) in
projective space and we are forced to define the notion of convex polyhedron in projective
space.
It turns out that in order to achieve (even partially) the above goals, I found that it was
necessary to include quite a bit of background material on convex sets, polytopes, polyhedra
and projective spaces. I have included a rather thorough treatment of the equivalence of
V-polytopes and H-polytopes and also of the equivalence of V-polyhedra and H-polyhedra,
which is a bit harder. In particular, the Fourier-Motzkin elimination method (a version of
Gaussian elimination for inequalities) is discussed in some detail. I also had to include some
material on projective spaces, projective maps and polar duality w.r.t. a nondegenerate
quadric in order to define a suitable notion of “projective polyhedron” based on cones. To
the best of our knowledge, this notion of projective polyhedron is new. We also believe that
some of our proofs establishing the equivalence of V-polyhedra and H-polyhedra are new.
Since Chapters 2, 3, 4, and 5 contain all the background (and more) needed to discuss
linear programming (including the simplex algorithm and duality), we have included some
chapters on linear programming.
Contents

1 Introduction
    1.1 Motivations and Goals
Bibliography
Chapter 1
Introduction
1.1 Motivations and Goals
I also had to include some material on projective spaces, projective maps, and polar duality w.r.t. a nondegenerate quadric, in order to define a suitable notion of “projective polyhedron” based on cones. This notion turned out to be indispensable to give a correct treatment of the Delaunay and Voronoi
complexes using inverse stereographic projection onto a sphere and to prove rigorously that
the well known projective map between the sphere and the paraboloid maps the Delaunay
triangulation and the Voronoi diagram w.r.t. the sphere to the more traditional Delaunay
triangulation and Voronoi diagram w.r.t. the paraboloid. To the best of our knowledge, this
notion of projective polyhedron is new. We also believe that some of our proofs establishing
the equivalence of V-polyhedra and H-polyhedra are new.
Chapter 9 on combinatorial topology is hardly original. However, most texts covering
this material are either old-fashioned or too advanced. Yet, this material is used extensively in
meshing and geometric modeling. We tried to give a rather intuitive yet rigorous exposition.
We decided to introduce the terminology combinatorial manifold, a notion usually referred
to as triangulated manifold.
A recurring theme in these notes is the process of “conification” (algebraically, “homoge-
nization”), that is, forming a cone from some geometric object. Indeed, “conification” turns
an object into a set of lines, and since lines play the role of points in projective geome-
try, “conification” (“homogenization”) is the way to “projectivize” geometric affine objects.
Then, these (affine) objects appear as “conic sections” of cones by hyperplanes, just the way
the classical conics (ellipse, hyperbola, parabola) appear as conic sections.
It is worth warning our readers that convexity and polytope theory is deceptively simple.
This is a subject where most intuitive propositions fail as soon as the dimension of the space
is greater than 3 (definitely 4), because our human intuition is not very good in dimension
greater than 3. Furthermore, rigorous proofs of seemingly very simple facts are often quite
complicated and may require sophisticated tools (for example, shellings, for a correct proof
of the Euler-Poincaré formula). Nevertheless, readers are urged to strengthen their geometric
intuition; they should just be very vigilant! This is another case where Tate’s famous saying
is more than pertinent: “Reason geometrically, prove algebraically.”
At first, these notes were meant as a complement to Chapter 3 (Properties of Convex
Sets: A Glimpse) of my book (Geometric Methods and Applications, [30]). However, they
turn out to cover much more material. For the reader’s convenience, I have included Chapter
2 on affine geometry, and Chapter 3 (both from my book [30]) as part of Chapter 3 of these
notes.
Since Chapters 2, 3, 4, and 5 contain all the background (and more) needed to discuss
linear programming (including the simplex algorithm and duality), we have included some
chapters on linear programming.
Most of the material on convex sets is taken from Berger [8] (Geometry II ). Other rel-
evant sources include Ziegler [67], Grünbaum [35], Barvinok [4], Valentine [63], Rockafellar
[49], Bourbaki (Topological Vector Spaces) [13], and Lax [39], the last four dealing with
affine spaces of infinite dimension. As to polytopes and polyhedra, “the” classic reference is
Grünbaum [35]. Other good references include Ziegler [67], Ewald [26], Cromwell [22], and
Thomas [60].
The recent book by Thomas contains an excellent and easy going presentation of poly-
tope theory. This book also gives an introduction to the theory of triangulations of point
configurations, including the definition of secondary polytopes and state polytopes, which
happen to play a role in certain areas of biology. For this, a quick but very efficient presen-
tation of Gröbner bases is provided. We highly recommend Thomas’s book [60] as further
reading. It is also an excellent preparation for the more advanced book by Sturmfels [59].
However, in our opinion, the “bible” on polytope theory is without any contest, Ziegler [67],
a masterly and beautiful piece of mathematics. In fact, our Chapter 10 is heavily inspired
by Chapter 8 of Ziegler. However, the pace of Ziegler’s book is quite brisk and we hope that
our more pedestrian account will inspire readers to go back and read the masters.
In the not too distant future, I would like to write about constrained Delaunay triangulations, a formidable topic; please be patient!
I wish to thank Marcelo Siqueira for catching many typos and mistakes and for his
many helpful suggestions regarding the presentation. At least a third of this manuscript was
written while I was on sabbatical at INRIA, Sophia Antipolis, in the Asclepios Project. My
deepest thanks to Nicholas Ayache and his colleagues (especially Xavier Pennec and Hervé
Delingette) for inviting me to spend a wonderful and very productive year and for making
me feel perfectly at home within the Asclepios Project.
Chapter 2

Basics of Affine Geometry
L’algèbre n’est qu’une géométrie écrite; la géométrie n’est qu’une algèbre figurée.
(Algebra is only written geometry; geometry is only figured algebra.)
—Sophie Germain
This chapter proceeds as follows. We take advantage of the fact that almost every affine
concept is the counterpart of some concept in linear algebra. We begin by defining affine
spaces, stressing the physical interpretation of the definition in terms of points (particles)
and vectors (forces). Corresponding to linear combinations of vectors, we define affine com-
binations of points (barycenters), realizing that we are forced to restrict our attention to
families of scalars adding up to 1. Corresponding to linear subspaces, we introduce affine
subspaces as subsets closed under affine combinations. Then, we characterize affine sub-
spaces in terms of certain vector spaces called their directions. This allows us to define a
clean notion of parallelism. Next, corresponding to linear independence and bases, we define
affine independence and affine frames. We also define convexity. Corresponding to linear
maps, we define affine maps as maps preserving affine combinations. We show that every
affine map is completely defined by the image of one point and a linear map. Then, we
investigate briefly some simple affine maps, the translations and the central dilatations. At
this point, we give a glimpse of affine geometry. We prove the theorems of Thales, Pappus,
and Desargues. After this, the definition of affine hyperplanes in terms of affine forms is
reviewed. The chapter ends with a closer look at the intersection of affine subspaces.
Our presentation of affine geometry is far from being comprehensive, and it is biased
toward the algorithmic geometry of curves and surfaces. For more details, the reader is
referred to Pedoe [46], Snapper and Troyer [53], Berger [7, 8], Coxeter [21], Samuel [50],
Tisseron [62], Fresnel [27], Vienne [65], and Hilbert and Cohn-Vossen [36].
Suppose we have a particle moving in 3D space and that we want to describe the trajectory
of this particle. If one looks up a good textbook on dynamics, such as Greenwood [34], one
finds out that the particle is modeled as a point, and that the position of this point x is
determined with respect to a “frame” in R3 by a vector. Curiously, the notion of a frame is
rarely defined precisely, but it is easy to infer that a frame is a pair (O, (e1 , e2 , e3 )) consisting
of an origin O (which is a point) together with a basis of three vectors (e1 , e2 , e3 ). For
example, the standard frame in R3 has origin O = (0, 0, 0) and the basis of three vectors
e1 = (1, 0, 0), e2 = (0, 1, 0), and e3 = (0, 0, 1). The position of a point x is then defined by
the “unique vector” from O to x.
But wait a minute, this definition seems to be defining frames and the position of a point
without defining what a point is! Well, let us identify points with elements of R3 . If so, given
any two points a = (a1, a2, a3) and b = (b1, b2, b3), there is a unique free vector, denoted by
$\overrightarrow{ab}$, from a to b, the vector $\overrightarrow{ab} = (b_1 - a_1,\ b_2 - a_2,\ b_3 - a_3)$. Note that
$$b = a + \overrightarrow{ab},$$
[Figure 2.1: Points and free vectors: the free vector $\overrightarrow{ab}$ from a to b.]
addition being understood as addition in R3 . Then, in the standard frame, given a point
x = (x1, x2, x3), the position of x is the vector $\overrightarrow{Ox} = (x_1, x_2, x_3)$, which coincides with the
point itself. In the standard frame, points and vectors are identified. Points and free vectors
are illustrated in Figure 2.1.
What if we pick a frame with a different origin, say Ω = (ω1 , ω2 , ω3 ), but the same basis
vectors (e1 , e2 , e3 )? This time, the point x = (x1 , x2 , x3 ) is defined by two position vectors:
$$\overrightarrow{Ox} = (x_1, x_2, x_3)$$
in the frame (O, (e1, e2, e3)) and
$$\overrightarrow{\Omega x} = (x_1 - \omega_1,\ x_2 - \omega_2,\ x_3 - \omega_3)$$
in the frame (Ω, (e1 , e2 , e3 )). See Figure 2.2.
This is because
$$\overrightarrow{Ox} = \overrightarrow{O\Omega} + \overrightarrow{\Omega x} \quad\text{and}\quad \overrightarrow{O\Omega} = (\omega_1, \omega_2, \omega_3).$$
We note that in the second frame (Ω, (e1 , e2 , e3 )), points and position vectors are no longer
identified. This gives us evidence that points are not vectors. It may be computationally
convenient to deal with points using position vectors, but such a treatment is not frame
invariant, which has undesirable effects.
Inspired by physics, we deem it important to define points and properties of points that
are frame invariant. An undesirable side effect of the present approach shows up if we attempt
to define linear combinations of points. First, let us review the notion of linear combination
of vectors. Given two vectors u and v of coordinates (u1 , u2 , u3 ) and (v1 , v2 , v3 ) with respect
[Figure 2.2: The point x and its two position vectors $\overrightarrow{Ox}$ and $\overrightarrow{\Omega x}$ in the two frames.]
to the basis (e1 , e2 , e3 ), for any two scalars λ, µ, we can define the linear combination λu + µv
as the vector of coordinates
$$(\lambda u_1 + \mu v_1,\ \lambda u_2 + \mu v_2,\ \lambda u_3 + \mu v_3).$$
If we choose a different basis $(e'_1, e'_2, e'_3)$ and if the matrix P expressing the vectors $(e'_1, e'_2, e'_3)$
over the basis (e1, e2, e3) is
$$P = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix},$$
which means that the columns of P are the coordinates of the $e'_j$ over the basis (e1, e2, e3),
since
$$u_1 e_1 + u_2 e_2 + u_3 e_3 = u'_1 e'_1 + u'_2 e'_2 + u'_3 e'_3$$
and
$$v_1 e_1 + v_2 e_2 + v_3 e_3 = v'_1 e'_1 + v'_2 e'_2 + v'_3 e'_3,$$
it is easy to see that the coordinates (u1, u2, u3) and (v1, v2, v3) of u and v with respect to
the basis (e1, e2, e3) are given in terms of the coordinates $(u'_1, u'_2, u'_3)$ and $(v'_1, v'_2, v'_3)$ of u and
v with respect to the basis $(e'_1, e'_2, e'_3)$ by the matrix equations
$$\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = P \begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = P \begin{pmatrix} v'_1 \\ v'_2 \\ v'_3 \end{pmatrix}.$$
From the above, we get
$$\begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix} = P^{-1} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} v'_1 \\ v'_2 \\ v'_3 \end{pmatrix} = P^{-1} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix},$$
and by linearity, the coordinates
$$(\lambda u_1 + \mu v_1,\ \lambda u_2 + \mu v_2,\ \lambda u_3 + \mu v_3)$$
of λu + µv with respect to the basis (e1, e2, e3) are given by
$$\begin{pmatrix} \lambda u_1 + \mu v_1 \\ \lambda u_2 + \mu v_2 \\ \lambda u_3 + \mu v_3 \end{pmatrix} = \lambda P \begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix} + \mu P \begin{pmatrix} v'_1 \\ v'_2 \\ v'_3 \end{pmatrix} = P \begin{pmatrix} \lambda u'_1 + \mu v'_1 \\ \lambda u'_2 + \mu v'_2 \\ \lambda u'_3 + \mu v'_3 \end{pmatrix}.$$
Everything worked out because the change of basis does not involve a change of origin. On the
other hand, if we consider the change of frame from the frame (O, (e1, e2, e3)) to the frame
(Ω, (e1, e2, e3)), where $\overrightarrow{O\Omega} = (\omega_1, \omega_2, \omega_3)$, given two points a, b of coordinates (a1, a2, a3)
and (b1, b2, b3) with respect to the frame (O, (e1, e2, e3)) and of coordinates $(a'_1, a'_2, a'_3)$ and
$(b'_1, b'_2, b'_3)$ with respect to the frame (Ω, (e1, e2, e3)), since
$$(a'_1, a'_2, a'_3) = (a_1 - \omega_1,\ a_2 - \omega_2,\ a_3 - \omega_3)$$
and
$$(b'_1, b'_2, b'_3) = (b_1 - \omega_1,\ b_2 - \omega_2,\ b_3 - \omega_3),$$
the coordinates of λa + µb with respect to the frame (O, (e1, e2, e3)) are
$$(\lambda a_1 + \mu b_1,\ \lambda a_2 + \mu b_2,\ \lambda a_3 + \mu b_3),$$
but the coordinates
$$(\lambda a'_1 + \mu b'_1,\ \lambda a'_2 + \mu b'_2,\ \lambda a'_3 + \mu b'_3)$$
of λa + µb with respect to the frame (Ω, (e1, e2, e3)) are
$$(\lambda a_1 + \mu b_1 - (\lambda + \mu)\omega_1,\ \lambda a_2 + \mu b_2 - (\lambda + \mu)\omega_2,\ \lambda a_3 + \mu b_3 - (\lambda + \mu)\omega_3),$$
which are different from
$$(\lambda a_1 + \mu b_1 - \omega_1,\ \lambda a_2 + \mu b_2 - \omega_2,\ \lambda a_3 + \mu b_3 - \omega_3),$$
unless λ + µ = 1.
Figure 2.3: The top figure shows the location of the “point” sum a + b with respect to the
frame (O, (e1 , e2 , e3 )), while the bottom figure shows the location of the “point” sum a + b
with respect to the frame (Ω, (e1 , e2 , e3 )).
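Since the failure of frame invariance is purely computational, it can be checked numerically. The following Python sketch (ours, not part of the original text) uses the data of Figure 2.3 to verify that the “sum” a + b changes when the origin moves from O to Ω = (3, 4, 5), while the barycenter $\frac{1}{2}a + \frac{1}{2}b$ does not; the helper to_point is our own naming.

\begin{verbatim}
import numpy as np

# Check (not from the text): the "point" a + b depends on the frame,
# while the barycenter (1/2)a + (1/2)b does not.
a = np.array([-2.0, -3.0, -4.0])   # coordinates of a in the frame (O, (e1,e2,e3))
b = np.array([-1.0, -1.0, -4.0])
omega = np.array([3.0, 4.0, 5.0])  # origin of the second frame (same basis)
O = np.zeros(3)

def to_point(coords, origin):
    # recover the actual point from its coordinates w.r.t. a frame
    return origin + coords

# "sum" computed coordinate-wise in each frame, mapped back to actual points
print(to_point(a + b, O))                          # [-3. -4. -8.]
print(to_point((a - omega) + (b - omega), omega))  # [-6. -8. -13.]: different!

# the affine combination (1/2)a + (1/2)b is frame invariant
m1 = to_point(0.5 * a + 0.5 * b, O)
m2 = to_point(0.5 * (a - omega) + 0.5 * (b - omega), omega)
print(np.allclose(m1, m2))                         # True
\end{verbatim}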
A clean way to handle the problem of frame invariance and to deal with points in a more
intrinsic manner is to make a clearer distinction between points and vectors. We duplicate
R3 into two copies, the first copy corresponding to points, where we forget the vector space
structure, and the second copy corresponding to free vectors, where the vector space structure
is important. Furthermore, we make explicit the important fact that the vector space R3 acts
on the set of points R3 : Given any point a = (a1 , a2 , a3 ) and any vector v = (v1 , v2 , v3 ),
we obtain the point
a + v = (a1 + v1 , a2 + v2 , a3 + v3 ),
which can be thought of as the result of translating a to b using the vector v. We can imagine
that v is placed such that its origin coincides with a and that its tip coincides with b. This
action + : R3 × R3 → R3 satisfies some crucial properties. For example,
a + 0 = a,
(a + u) + v = a + (u + v),
and for any two points a, b, there is a unique free vector $\overrightarrow{ab}$ such that
$$b = a + \overrightarrow{ab}.$$
It turns out that the above properties, although trivial in the case of R3, are all that is
needed to define the abstract notion of affine space (or affine structure). The basic idea is
to consider two (distinct) sets E and $\overrightarrow{E}$, where E is a set of points (with no structure) and
$\overrightarrow{E}$ is a vector space (of free vectors) acting on the set E.
Did you say “A fine space”?
Intuitively, we can think of the elements of $\overrightarrow{E}$ as forces moving the points in E, considered
as physical particles. The effect of applying a force (free vector) $u \in \overrightarrow{E}$ to a point a ∈ E is
a translation. By this, we mean that for every force $u \in \overrightarrow{E}$, the action of the force u is to
“move” every point a ∈ E to the point a + u ∈ E obtained by the translation corresponding
to u viewed as a vector. Since translations can be composed, it is natural that $\overrightarrow{E}$ is a vector
space.
For simplicity, it is assumed that all vector spaces under consideration are defined over
the field R of real numbers. Most of the definitions and results also hold for an arbitrary
field K, although some care is needed when dealing with fields of characteristic different
from zero. It is also assumed that all families (λi )i∈I of scalars have finite support. Recall
that a family (λi )i∈I of scalars has finite support if λi = 0 for all i ∈ I − J, where J is a finite
subset of I. Obviously, finite families of scalars have finite support, and for simplicity, the
reader may assume that all families of scalars are finite. The formal definition of an affine
space is as follows.
Definition 2.1. An affine space is either the degenerate space reduced to the empty set, or a
triple $\langle E, \overrightarrow{E}, + \rangle$ consisting of a nonempty set E (of points), a vector space $\overrightarrow{E}$ (of translations,
or free vectors), and an action $+ : E \times \overrightarrow{E} \to E$, satisfying the following conditions.

(A1) a + 0 = a, for every a ∈ E.

(A2) (a + u) + v = a + (u + v), for every a ∈ E, and every $u, v \in \overrightarrow{E}$.

(A3) For any two points a, b ∈ E, there is a unique $u \in \overrightarrow{E}$ such that a + u = b.

The unique vector $u \in \overrightarrow{E}$ such that a + u = b is denoted by $\overrightarrow{ab}$, or sometimes by ab, or
even by b − a. Thus, we also write
$$b = a + \overrightarrow{ab}$$
(or b = a + ab, or even b = a + (b − a)).

The dimension of the affine space $\langle E, \overrightarrow{E}, + \rangle$ is the dimension $\dim(\overrightarrow{E})$ of the vector space
$\overrightarrow{E}$. For simplicity, it is denoted by dim(E).
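To make Definition 2.1 concrete, here is a small Python model (our illustration, not from the text) in which points carry no algebraic structure of their own and all arithmetic happens through the action; the class and method names are ours.

\begin{verbatim}
import numpy as np

class Point:
    # a point of E: a wrapper with no vector space structure of its own
    def __init__(self, coords):
        self.coords = np.asarray(coords, dtype=float)
    def __add__(self, v):          # the action + : E x vec(E) -> E
        return Point(self.coords + v)
    def vector_to(self, other):    # the unique u of (A3) with self + u = other
        return other.coords - self.coords

a = Point([1.0, 2.0, 3.0])
b = Point([4.0, 4.0, 4.0])
u, v = np.array([0.5, 0.0, -1.0]), np.array([2.0, 1.0, 1.0])

assert np.allclose((a + np.zeros(3)).coords, a.coords)          # (A1)
assert np.allclose(((a + u) + v).coords, (a + (u + v)).coords)  # (A2)
assert np.allclose((a + a.vector_to(b)).coords, b.coords)       # (A3)
\end{verbatim}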
[Figure 2.4: An affine space: from the point a, the vectors u and w yield the points b = a + u and c = a + w.]
Conditions (A1) and (A2) say that the (abelian) group $\overrightarrow{E}$ acts on E, and Condition (A3)
says that $\overrightarrow{E}$ acts transitively and faithfully on E. Note that
$$\overrightarrow{a(a + v)} = v$$
for all a ∈ E and all $v \in \overrightarrow{E}$, since $\overrightarrow{a(a + v)}$ is the unique vector such that $a + v = a + \overrightarrow{a(a + v)}$.
Thus, b = a + v is equivalent to $\overrightarrow{ab} = v$. Figure 2.4 gives an intuitive picture of an affine
space. It is natural to think of all vectors as having the same origin, the null vector.
The axioms defining an affine space $\langle E, \overrightarrow{E}, + \rangle$ can be interpreted intuitively as saying
that E and $\overrightarrow{E}$ are two different ways of looking at the same object, but wearing different
sets of glasses, the second set of glasses depending on the choice of an “origin” in E. Indeed,
we can choose to look at the points in E, forgetting that every pair (a, b) of points defines a
unique vector $\overrightarrow{ab}$ in $\overrightarrow{E}$, or we can choose to look at the vectors u in $\overrightarrow{E}$, forgetting the points
in E. Furthermore, if we also pick any point a in E, a point that can be viewed as an origin
in E, then we can recover all the points in E as the translated points a + u for all $u \in \overrightarrow{E}$.
This can be formalized by defining two maps between E and $\overrightarrow{E}$.
For every a ∈ E, consider the mapping from $\overrightarrow{E}$ to E given by
$$u \mapsto a + u,$$
where $u \in \overrightarrow{E}$, and consider the mapping from E to $\overrightarrow{E}$ given by
$$b \mapsto \overrightarrow{ab},$$
where b ∈ E. The composition of the first mapping with the second is
$$u \mapsto a + u \mapsto \overrightarrow{a(a + u)},$$
which, in view of (A3), yields u. The composition of the second with the first mapping is
$$b \mapsto \overrightarrow{ab} \mapsto a + \overrightarrow{ab},$$
which, in view of (A3), yields b. Thus, these compositions are the identity from $\overrightarrow{E}$ to $\overrightarrow{E}$
and the identity from E to E, and the mappings are both bijections.
When we identify E with $\overrightarrow{E}$ via the mapping $b \mapsto \overrightarrow{ab}$, we say that we consider E as the
vector space obtained by taking a as the origin in E, and we denote it by $E_a$. Because $E_a$ is
a vector space, to be consistent with our notational conventions we should use the notation
$\overrightarrow{E_a}$ (using an arrow), instead of $E_a$. However, for simplicity, we stick to the notation $E_a$.
Thus, an affine space $\langle E, \overrightarrow{E}, + \rangle$ is a way of defining a vector space structure on a set of
points E, without making a commitment to a fixed origin in E. Nevertheless, as soon as
we commit to an origin a in E, we can view E as the vector space $E_a$. However, we urge
the reader to think of E as a physical set of points and of $\overrightarrow{E}$ as a set of forces acting on E,
rather than reducing E to some isomorphic copy of Rn. After all, points are points, and not
vectors! For notational simplicity, we will often denote an affine space $\langle E, \overrightarrow{E}, + \rangle$ by $(E, \overrightarrow{E})$,
or even by E. The vector space $\overrightarrow{E}$ is called the vector space associated with E.
One should be careful about the overloading of the addition symbol +. Addition
is well-defined on vectors, as in u + v; the translate a + u of a point a ∈ E by a
vector $u \in \overrightarrow{E}$ is also well-defined, but addition of points a + b does not make sense. In
this respect, the notation b − a for the unique vector u such that b = a + u is somewhat
confusing, since it suggests that points can be subtracted (but not added!).
Any vector space $\overrightarrow{E}$ has an affine space structure specified by choosing $E = \overrightarrow{E}$, and
letting + be addition in the vector space $\overrightarrow{E}$. We will refer to the affine structure $\langle \overrightarrow{E}, \overrightarrow{E}, + \rangle$
on a vector space $\overrightarrow{E}$ as the canonical (or natural) affine structure on $\overrightarrow{E}$. In particular, the
vector space Rn can be viewed as the affine space $\langle \mathbb{R}^n, \mathbb{R}^n, + \rangle$, denoted by $\mathbb{A}^n$. In general,
if K is any field, the affine space $\langle K^n, K^n, + \rangle$ is denoted by $\mathbb{A}^n_K$. In order to distinguish
between the double role played by members of Rn, points and vectors, we will denote points
by row vectors, and vectors by column vectors. Thus, the action of the vector space Rn over
the set Rn simply viewed as a set of points is given by
$$(a_1, \ldots, a_n) + \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} = (a_1 + u_1, \ldots, a_n + u_n).$$
We will also use the convention that if $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, then the column vector
associated with x is denoted by $\mathbf{x}$ (in boldface notation). Abusing the notation slightly, if
$a \in \mathbb{R}^n$ is a point, we also write $a \in \mathbb{A}^n$. The affine space $\mathbb{A}^n$ is called the real affine space of
dimension n. In most cases, we will consider n = 1, 2, 3.
For example, consider the subset L of $\mathbb{A}^2$ consisting of all points (x, y) satisfying the equation
$$x + y - 1 = 0.$$
The set L is the line of slope −1 passing through the points (1, 0) and (0, 1) shown in Figure
2.5.
The line L can be made into an official affine space by defining the action + : L × R → L
of R on L defined such that for every point (x, 1 − x) on L and any u ∈ R,
(x, 1 − x) + u = (x + u, 1 − x − u).
It is immediately verified that this action makes L into an affine space. For example, for any
two points a = (a1 , 1 − a1 ) and b = (b1 , 1 − b1 ) on L, the unique (vector) u ∈ R such that
b = a + u is u = b1 − a1 . Note that the vector space R is isomorphic to the line of equation
x + y = 0 passing through the origin.
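A two-line computation (ours, not from the text) confirms this structure on L:

\begin{verbatim}
# The action of R on the line L : x + y = 1 (our sketch).
def act(p, u):
    x, y = p
    assert abs(x + y - 1.0) < 1e-12        # p must lie on L
    return (x + u, y - u)                  # = (x + u, 1 - x - u)

a, b = (0.25, 0.75), (-2.0, 3.0)
u = b[0] - a[0]                            # unique u with b = a + u
assert act(a, u) == b
print(u)                                   # -2.25
\end{verbatim}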
Similarly, consider the subset H of A3 consisting of all points (x, y, z) satisfying the
equation
x + y + z − 1 = 0.
The set H is the plane passing through the points (1, 0, 0), (0, 1, 0), and (0, 0, 1). The plane
H can be made into an official affine space by defining the action + : H × R2 → H of R2 on
H defined such that for every point (x, y, 1 − x − y) on H and any $\begin{pmatrix} u \\ v \end{pmatrix} \in \mathbb{R}^2$,
$$(x,\ y,\ 1 - x - y) + \begin{pmatrix} u \\ v \end{pmatrix} = (x + u,\ y + v,\ 1 - x - u - y - v).$$
For a slightly wilder example, consider the subset P of A3 consisting of all points (x, y, z)
satisfying the equation
$$x^2 + y^2 - z = 0.$$
The set P is a paraboloid of revolution, with axis Oz. The surface P can be made into an
official affine space by defining the action $+ : P \times \mathbb{R}^2 \to P$ of $\mathbb{R}^2$ on P defined such that for
every point $(x, y, x^2 + y^2)$ on P and any $\begin{pmatrix} u \\ v \end{pmatrix} \in \mathbb{R}^2$,
$$(x,\ y,\ x^2 + y^2) + \begin{pmatrix} u \\ v \end{pmatrix} = (x + u,\ y + v,\ (x + u)^2 + (y + v)^2).$$
This should dispel any idea that affine spaces are dull. Affine spaces not already equipped
with an obvious vector space structure arise in projective geometry.
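What keeps this action well defined is that the third coordinate is recomputed from the first two rather than added componentwise. A quick check (our code, not from the text) that acting by u and then by v is the same as acting by u + v:

\begin{verbatim}
# The action of R^2 on the paraboloid P : z = x^2 + y^2 (our sketch).
def act(p, w):
    x, y, z = p
    assert abs(z - (x * x + y * y)) < 1e-12    # p must lie on P
    u, v = w
    return (x + u, y + v, (x + u) ** 2 + (y + v) ** 2)

p = (1.0, 2.0, 5.0)
lhs = act(act(p, (0.5, -1.0)), (1.5, 2.0))     # act by u, then by v
rhs = act(p, (2.0, 1.0))                       # act by u + v
assert lhs == rhs                              # (A2) holds
print(lhs)                                     # (3.0, 3.0, 18.0), still on P
\end{verbatim}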
[Figure: The action of $\mathbb{R}^2$ on the paraboloid P: the point $(x, y, x^2 + y^2)$ is moved to $(x + u, y + v, (x + u)^2 + (y + v)^2)$.]
[Figure: The points a, b, c with the vectors $\overrightarrow{ab}$, $\overrightarrow{bc}$, and $\overrightarrow{ac}$, illustrating Chasles's identity $\overrightarrow{ac} = \overrightarrow{ab} + \overrightarrow{bc}$.]
At the risk of boring certain readers, we give another example showing what goes wrong if we are not
careful in defining linear combinations of points.
Consider R2 as an affine space, under its natural coordinate system with origin O = (0, 0)
and basis vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Given any two points a = (a1, a2) and b = (b1, b2), it is
natural to define the affine combination λa + µb as the point of coordinates
$$(\lambda a_1 + \mu b_1,\ \lambda a_2 + \mu b_2).$$
Thus, when a = (−1, −1) and b = (2, 2), the point a + b is the point c = (1, 1).
Let us now consider the new coordinate system with respect to the origin c = (1, 1) (and
the same basis vectors). This time, the coordinates of a are (−2, −2), the coordinates of b
are (1, 1), and the point a + b is the point d of coordinates (−1, −1). However, it is clear that
the point d is identical to the origin O = (0, 0) of the first coordinate system. This situation
is illustrated in Figure 2.9.
Thus, a + b corresponds to two different points depending on which coordinate system is
used for its computation!
This shows that some extra condition is needed in order for affine combinations to make
sense. It turns out that if the scalars sum up to 1, the definition is intrinsic, as the following
proposition shows.
Proposition 2.1. Given an affine space E, let $(a_i)_{i \in I}$ be a family of points in E, and let
$(\lambda_i)_{i \in I}$ be a family of scalars. For any two points a, b ∈ E, the following properties hold:

(1) If $\sum_{i \in I} \lambda_i = 1$, then
$$a + \sum_{i \in I} \lambda_i\, \overrightarrow{aa_i} = b + \sum_{i \in I} \lambda_i\, \overrightarrow{ba_i}.$$

(2) If $\sum_{i \in I} \lambda_i = 0$, then
$$\sum_{i \in I} \lambda_i\, \overrightarrow{aa_i} = \sum_{i \in I} \lambda_i\, \overrightarrow{ba_i}.$$
[Figure 2.9: The “sum” a + b computed in two coordinate systems: c = (1, 1) w.r.t. the origin O = (0, 0), and d = (−1, −1) w.r.t. the origin c.]
Thus, by Proposition 2.1, for any family of points $(a_i)_{i \in I}$ in E, for any family $(\lambda_i)_{i \in I}$ of
scalars such that $\sum_{i \in I} \lambda_i = 1$, the point
$$x = a + \sum_{i \in I} \lambda_i\, \overrightarrow{aa_i}$$
is independent of the choice of the origin a ∈ E. This property motivates the following
definition.
Definition 2.2. For any family of points $(a_i)_{i \in I}$ in E, for any family $(\lambda_i)_{i \in I}$ of scalars such
that $\sum_{i \in I} \lambda_i = 1$, and for any a ∈ E, the point
$$a + \sum_{i \in I} \lambda_i\, \overrightarrow{aa_i}$$
(which is independent of a ∈ E, by Proposition 2.1) is called the barycenter (or barycentric
combination, or affine combination) of the points $a_i$ assigned the weights $\lambda_i$, and it is
denoted by $\sum_{i \in I} \lambda_i a_i$.

In dealing with barycenters, it is convenient to introduce the notion of a weighted point,
which is a pair (a, λ), where a ∈ E is a point and λ ∈ R is a scalar. Given a family of
weighted points $((a_i, \lambda_i))_{i \in I}$ with $\sum_{i \in I} \lambda_i = 1$, we also say that the point $\sum_{i \in I} \lambda_i a_i$ is the
barycenter of the family of weighted points $((a_i, \lambda_i))_{i \in I}$. Setting a = x shows that the
barycenter x is the unique point such that
$$\sum_{i \in I} \lambda_i\, \overrightarrow{xa_i} = 0.$$
In physical terms, the barycenter is the center of mass of the family of weighted points
$((a_i, \lambda_i))_{i \in I}$ (where the masses have been normalized, so that $\sum_{i \in I} \lambda_i = 1$, and negative
masses are allowed).
Remarks:
(2) This result still holds, provided that the field K has at least three distinct elements,
but the proof is trickier!
(3) When $\sum_{i \in I} \lambda_i = 0$, the vector $\sum_{i \in I} \lambda_i\, \overrightarrow{aa_i}$ does not depend on the point a, and we may
denote it by $\sum_{i \in I} \lambda_i a_i$. This observation will be used to define a vector space in which
linear combinations of both points and vectors make sense, regardless of the value of
$\sum_{i \in I} \lambda_i$.
The point $g_1$ can be constructed geometrically as the middle of the segment joining c to
the middle $\frac{1}{2}a + \frac{1}{2}b$ of the segment (a, b), since
$$g_1 = \frac{1}{2}\left(\frac{1}{2}a + \frac{1}{2}b\right) + \frac{1}{2}c.$$
The point $g_2$ can be constructed geometrically as the point such that the middle $\frac{1}{2}b + \frac{1}{2}c$ of
the segment (b, c) is the middle of the segment $(a, g_2)$, since
$$g_2 = -a + 2\left(\frac{1}{2}b + \frac{1}{2}c\right).$$
[Figure: The geometric constructions of the barycenters $g_1$ and $g_2$ from the triangle (a, b, c).]
Later on, we will see that a polynomial curve can be defined as a set of barycenters of a
fixed number of points. For example, let (a, b, c, d) be a sequence of points in A2 . Observe
that
$$(1 - t)^3 + 3t(1 - t)^2 + 3t^2(1 - t) + t^3 = 1,$$
since the sum on the left-hand side is obtained by expanding $(t + (1 - t))^3 = 1$ using the
binomial formula. Thus,
$$(1 - t)^3\, a + 3t(1 - t)^2\, b + 3t^2(1 - t)\, c + t^3\, d$$
is a well-defined affine combination. Then, we can define the curve $F : \mathbb{A} \to \mathbb{A}^2$ such that
$$F(t) = (1 - t)^3\, a + 3t(1 - t)^2\, b + 3t^2(1 - t)\, c + t^3\, d.$$
Such a curve is called a Bézier curve, and (a, b, c, d) are called its control points. Note that
the curve passes through a and d, but generally not through b and c. It can be shown
that any point F(t) on the curve can be constructed using an algorithm performing affine
interpolation steps (the de Casteljau algorithm).
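The de Casteljau algorithm mentioned above is easy to express in code. The sketch below (ours; the text only names the algorithm) computes F(t) by repeated affine interpolations, each step forming only combinations (1 − t)p + tq whose weights sum to 1, and checks the result against the Bernstein form of F(t).

\begin{verbatim}
import numpy as np

def de_casteljau(control_points, t):
    # repeatedly replace adjacent pairs by their affine interpolation
    pts = [np.asarray(p, dtype=float) for p in control_points]
    while len(pts) > 1:
        pts = [(1 - t) * p + t * q for p, q in zip(pts, pts[1:])]
    return pts[0]

a, b, c, d = (0, 0), (1, 2), (3, 2), (4, 0)    # sample control points (ours)
t = 0.5
p = de_casteljau([a, b, c, d], t)

F = ((1 - t) ** 3 * np.array(a) + 3 * t * (1 - t) ** 2 * np.array(b)
     + 3 * t ** 2 * (1 - t) * np.array(c) + t ** 3 * np.array(d))
assert np.allclose(p, F)
print(p)                                       # [2.  1.5]
\end{verbatim}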
Definition 2.3. Given an affine space $\langle E, \overrightarrow{E}, + \rangle$, a subset V of E is an affine subspace (of
$\langle E, \overrightarrow{E}, + \rangle$) if for every family of weighted points $((a_i, \lambda_i))_{i \in I}$ in V such that $\sum_{i \in I} \lambda_i = 1$,
the barycenter $\sum_{i \in I} \lambda_i a_i$ belongs to V.
An affine subspace is also called a flat by some authors. According to Definition 2.3,
the empty set is trivially an affine subspace, and every intersection of affine subspaces is an
affine subspace.
As an example, consider the subset U of R2 defined by
$$U = \{(x, y) \in \mathbb{R}^2 \mid ax + by = c\},$$
namely the set of solutions of the equation
$$ax + by = c,$$
where it is assumed that a ≠ 0 or b ≠ 0. Given any m points $(x_i, y_i) \in U$ and any m scalars
λi such that λ1 + · · · + λm = 1, we claim that
$$\sum_{i=1}^m \lambda_i (x_i, y_i) \in U.$$
Indeed, $\sum_{i=1}^m \lambda_i (x_i, y_i)$ is the point $\left(\sum_{i=1}^m \lambda_i x_i,\ \sum_{i=1}^m \lambda_i y_i\right)$, and since $ax_i + by_i = c$ for
$i = 1, \ldots, m$, we have
$$a\left(\sum_{i=1}^m \lambda_i x_i\right) + b\left(\sum_{i=1}^m \lambda_i y_i\right) = \sum_{i=1}^m \lambda_i (a x_i + b y_i) = \sum_{i=1}^m \lambda_i\, c = c,$$
because $\sum_{i=1}^m \lambda_i = 1$. Thus, U is an affine subspace of $\mathbb{A}^2$. In fact, it is just a usual line
in $\mathbb{A}^2$.

It turns out that the line U is closely related to the subset $\overrightarrow{U}$ of $\mathbb{R}^2$ defined by the
linear equation
$$ax + by = 0,$$
obtained by setting the right-hand side of ax + by = c to zero. Indeed, for any m scalars $\lambda_i$,
the same calculation as above yields that
$$\sum_{i=1}^m \lambda_i (x_i, y_i) \in \overrightarrow{U},$$
this time without any restriction on the $\lambda_i$, since the right-hand side of the equation is
null. Thus, $\overrightarrow{U}$ is a subspace of $\mathbb{R}^2$. In fact, $\overrightarrow{U}$ is one-dimensional, and it is just a usual line
in $\mathbb{R}^2$. This line can be identified with a line passing through the origin of $\mathbb{A}^2$, a line that is
parallel to the line U of equation ax + by = c, as illustrated in Figure 2.12.
Now, if $(x_0, y_0)$ is any point in U, we claim that
$$U = (x_0, y_0) + \overrightarrow{U},$$
where
$$(x_0, y_0) + \overrightarrow{U} = \left\{(x_0 + u_1,\ y_0 + u_2) \mid (u_1, u_2) \in \overrightarrow{U}\right\}.$$
First, $(x_0, y_0) + \overrightarrow{U} \subseteq U$, since $ax_0 + by_0 = c$ and $au_1 + bu_2 = 0$ for all $(u_1, u_2) \in \overrightarrow{U}$. Second,
if (x, y) ∈ U, then ax + by = c, and since we also have $ax_0 + by_0 = c$, by subtraction, we get
$$a(x - x_0) + b(y - y_0) = 0,$$
which shows that $(x - x_0, y - y_0) \in \overrightarrow{U}$, and thus $(x, y) \in (x_0, y_0) + \overrightarrow{U}$. Hence, we also have
$U \subseteq (x_0, y_0) + \overrightarrow{U}$, and $U = (x_0, y_0) + \overrightarrow{U}$.
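The two computations above are easy to replicate numerically. In the sketch below (ours, not from the text), weights summing to 1 keep us on the affine line U, while weights summing to 0 produce vectors in the direction $\overrightarrow{U}$; the particular line x + 2y = 4 is our own choice.

\begin{verbatim}
import numpy as np

a_coef, b_coef, c_coef = 1.0, 2.0, 4.0                  # the line x + 2y = 4
on_U = lambda p: abs(a_coef * p[0] + b_coef * p[1] - c_coef) < 1e-12
in_dir = lambda v: abs(a_coef * v[0] + b_coef * v[1]) < 1e-12

pts = np.array([[4.0, 0.0], [0.0, 2.0], [2.0, 1.0]])    # three points on U
lam1 = np.array([0.5, 0.25, 0.25])                      # weights summing to 1
lam2 = np.array([1.0, -2.0, 1.0])                       # weights summing to 0

assert on_U(lam1 @ pts)      # barycenter stays on U
assert in_dir(lam2 @ pts)    # zero-sum combination lands in the direction of U
\end{verbatim}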
The above example shows that the affine line U defined by the equation
$$ax + by = c$$
is obtained by “translating” the parallel line $\overrightarrow{U}$ of equation
$$ax + by = 0$$
passing through the origin.
More generally, it is easy to prove the following fact. Given any m × n matrix A and any
vector b ∈ Rm , the subset U of Rn defined by
U = {x ∈ Rn | Ax = b}
is an affine subspace of An .
Actually, observe that Ax = b should really be written as $Ax^\top = b$, to be consistent with
our convention that points are represented by row vectors. We can also use the boldface
notation for column vectors, in which case the equation is written as $A\mathbf{x} = \mathbf{b}$. For the sake of
minimizing the amount of notation, we stick to the simpler (yet incorrect) notation Ax = b.
If we consider the corresponding homogeneous equation Ax = 0, the set
$$\overrightarrow{U} = \{x \in \mathbb{R}^n \mid Ax = 0\}$$
is a subspace of $\mathbb{R}^n$, and for any $x_0 \in U$, we have $U = x_0 + \overrightarrow{U}$; the combinations involved in
checking this are affine, since
$$\sum_{i=1}^n \lambda_i + \left(1 - \sum_{i=1}^n \lambda_i\right) = 1.$$
Given any point a ∈ E and any subset $\overrightarrow{V}$ of $\overrightarrow{E}$, let $a + \overrightarrow{V}$ denote the following subset of E:
$$a + \overrightarrow{V} = \left\{a + v \mid v \in \overrightarrow{V}\right\}.$$
[Figure 2.13: An affine subspace V and its direction $\overrightarrow{V}$.]
Proposition 2.2. Let $\langle E, \overrightarrow{E}, + \rangle$ be an affine space.

(1) A nonempty subset V of E is an affine subspace iff for every point a ∈ V, the set
$$\overrightarrow{V_a} = \{\overrightarrow{ax} \mid x \in V\}$$
is a subspace of $\overrightarrow{E}$. Consequently, $V = a + \overrightarrow{V_a}$. Furthermore,
$$\overrightarrow{V} = \{\overrightarrow{xy} \mid x, y \in V\}$$
is a subspace of $\overrightarrow{E}$ and $\overrightarrow{V_a} = \overrightarrow{V}$ for all a ∈ V. Thus, $V = a + \overrightarrow{V}$.

(2) For any subspace $\overrightarrow{V}$ of $\overrightarrow{E}$ and for any a ∈ E, the set $V = a + \overrightarrow{V}$ is an affine subspace.

Proof. The proof is straightforward, and is omitted. It is also given in Gallier [29].
In particular, when E is the natural affine space associated with a vector space $\overrightarrow{E}$,
Proposition 2.2 shows that every affine subspace of E is of the form $u + \overrightarrow{U}$, for a subspace
$\overrightarrow{U}$ of $\overrightarrow{E}$. The subspaces of $\overrightarrow{E}$ are the affine subspaces of E that contain 0.

The subspace $\overrightarrow{V}$ associated with an affine subspace V is called the direction of V. It is
also clear that the map $+ : V \times \overrightarrow{V} \to V$ induced by $+ : E \times \overrightarrow{E} \to E$ confers to $\langle V, \overrightarrow{V}, + \rangle$ an
affine structure. Figure 2.13 illustrates the notion of affine subspace.
By the dimension of the subspace V, we mean the dimension of $\overrightarrow{V}$.
An affine subspace of dimension 1 is called a line, and an affine subspace of dimension 2
is called a plane.
Remarks:
(1) Since it can be shown that the barycenter of n weighted points can be obtained by
repeated computations of barycenters of two weighted points, a nonempty subset V
of E is an affine subspace iff for every two points a, b ∈ V , the set V contains all
barycentric combinations of a and b. If V contains at least two points, then V is an
affine subspace iff for any two distinct points a, b ∈ V , the set V contains the line
determined by a and b, that is, the set of all points (1 − λ)a + λb, λ ∈ R.
(2) This result still holds if the field K has at least three distinct elements, but the proof
is trickier!
$$\sum_{j \in (I - \{k\})} \lambda_j\, \overrightarrow{a_k a_j} = 0.$$
Since
$$\overrightarrow{a_k a_j} = \overrightarrow{a_k a_i} + \overrightarrow{a_i a_j},$$
we have
$$\begin{aligned}
\sum_{j \in (I - \{k\})} \lambda_j\, \overrightarrow{a_k a_j} &= \sum_{j \in (I - \{k\})} \lambda_j\, \overrightarrow{a_k a_i} + \sum_{j \in (I - \{k\})} \lambda_j\, \overrightarrow{a_i a_j} \\
&= \sum_{j \in (I - \{k\})} \lambda_j\, \overrightarrow{a_k a_i} + \sum_{j \in (I - \{i,k\})} \lambda_j\, \overrightarrow{a_i a_j} \\
&= \sum_{j \in (I - \{i,k\})} \lambda_j\, \overrightarrow{a_i a_j} - \Big(\sum_{j \in (I - \{k\})} \lambda_j\Big)\, \overrightarrow{a_i a_k},
\end{aligned}$$
and thus
$$\sum_{j \in (I - \{i,k\})} \lambda_j\, \overrightarrow{a_i a_j} - \Big(\sum_{j \in (I - \{k\})} \lambda_j\Big)\, \overrightarrow{a_i a_k} = 0.$$
[Figure 2.14: Affine independence: the points $a_0$, $a_1$, $a_2$ and the vectors $\overrightarrow{a_0 a_1}$, $\overrightarrow{a_0 a_2}$.]
Definition 2.4 is reasonable, because by Proposition 2.4, the independence of the family
$(\overrightarrow{a_i a_j})_{j \in (I - \{i\})}$ does not depend on the choice of $a_i$. A crucial property of linearly independent
vectors $(u_1, \ldots, u_m)$ is that if a vector v is a linear combination
$$v = \sum_{i=1}^m \lambda_i u_i$$
of the $u_i$, then the $\lambda_i$ are unique. A similar result holds for affinely independent points.
Proposition 2.5. Given an affine space $\langle E, \overrightarrow{E}, + \rangle$, let $(a_0, \ldots, a_m)$ be a family of m + 1
points in E. Let x ∈ E, and assume that $x = \sum_{i=0}^m \lambda_i a_i$, where $\sum_{i=0}^m \lambda_i = 1$. Then,
the family $(\lambda_0, \ldots, \lambda_m)$ such that $x = \sum_{i=0}^m \lambda_i a_i$ is unique iff the family $(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m})$ is
linearly independent.

Proof. The proof is straightforward and is omitted. It is also given in Gallier [29].
Proposition 2.5 suggests the notion of affine frame. Affine frames are the affine analogues
of bases in vector spaces. Let $\langle E, \overrightarrow{E}, + \rangle$ be a nonempty affine space, and let $(a_0, \ldots, a_m)$
be a family of m + 1 points in E. The family $(a_0, \ldots, a_m)$ determines the family of m
vectors $(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m})$ in $\overrightarrow{E}$. Conversely, given a point $a_0$ in E and a family of m vectors
$(u_1, \ldots, u_m)$ in $\overrightarrow{E}$, we obtain the family of m + 1 points $(a_0, \ldots, a_m)$ in E, where $a_i = a_0 + u_i$,
$1 \le i \le m$.

Thus, for any m ≥ 1, it is equivalent to consider a family of m + 1 points $(a_0, \ldots, a_m)$ in
E, and a pair $(a_0, (u_1, \ldots, u_m))$, where the $u_i$ are vectors in $\overrightarrow{E}$. Figure 2.14 illustrates the
notion of affine independence.
Remark: The above observation also applies to infinite families $(a_i)_{i \in I}$ of points in E and
families $(u_i)_{i \in I - \{0\}}$ of vectors in $\overrightarrow{E}$, provided that the index set I contains 0.
When $(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m})$ is a basis of $\overrightarrow{E}$ then, for every x ∈ E, since $x = a_0 + \overrightarrow{a_0 x}$, there
is a unique family $(x_1, \ldots, x_m)$ of scalars such that
$$x = a_0 + x_1 \overrightarrow{a_0 a_1} + \cdots + x_m \overrightarrow{a_0 a_m}.$$
Since
$$x = a_0 + \sum_{i=1}^m x_i\, \overrightarrow{a_0 a_i} \quad\text{iff}\quad x = \left(1 - \sum_{i=1}^m x_i\right) a_0 + \sum_{i=1}^m x_i a_i,$$
x ∈ E can also be expressed uniquely as an affine combination of the points $a_0, \ldots, a_m$. In
this case, the pair $(a_0, (\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m}))$ is called an affine frame with origin $a_0$. Then, every
x ∈ E can be written as
$$x = a_0 + x_1 \overrightarrow{a_0 a_1} + \cdots + x_m \overrightarrow{a_0 a_m}$$
for a unique family $(x_1, \ldots, x_m)$ of scalars, called the coordinates of x w.r.t. the affine frame
$(a_0, (\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m}))$. Furthermore, every x ∈ E can be written as
$$x = \lambda_0 a_0 + \cdots + \lambda_m a_m$$
for some unique family $(\lambda_0, \ldots, \lambda_m)$ of scalars such that $\lambda_0 + \cdots + \lambda_m = 1$, called the barycentric
coordinates of x with respect to the affine frame $(a_0, \ldots, a_m)$. See Figure 2.15.
Figure 2.15: The affine frame $(a_0, a_1, a_2, a_3)$ for $\mathbb{A}^3$, with $a_0 = (1, 2, 1)$, $a_1 = (2, 3, 1)$,
$a_2 = (-1, 3, 1)$, $a_3 = (1, 3, 2)$. The coordinates of x = (−1, 0, 2) are $x_1 = -8/3$, $x_2 = -1/3$,
$x_3 = 1$, while the barycentric coordinates of x are $\lambda_0 = 3$, $\lambda_1 = -8/3$, $\lambda_2 = -1/3$, $\lambda_3 = 1$.
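The numbers in Figure 2.15 can be recomputed directly. The sketch below (ours, not from the text) solves for the frame coordinates of x and derives the barycentric coordinates from them:

\begin{verbatim}
import numpy as np

a0 = np.array([1.0, 2.0, 1.0]); a1 = np.array([2.0, 3.0, 1.0])
a2 = np.array([-1.0, 3.0, 1.0]); a3 = np.array([1.0, 3.0, 2.0])
x = np.array([-1.0, 0.0, 2.0])

# solve x = a0 + x1*a0a1 + x2*a0a2 + x3*a0a3 for (x1, x2, x3)
B = np.column_stack([a1 - a0, a2 - a0, a3 - a0])
coords = np.linalg.solve(B, x - a0)
print(coords)                                  # [-8/3, -1/3, 1]

# barycentric coordinates: lambda0 = 1 - (x1 + x2 + x3), lambda_i = x_i
lam = np.concatenate([[1.0 - coords.sum()], coords])
print(lam)                                     # [3, -8/3, -1/3, 1]
assert np.allclose(lam[0]*a0 + lam[1]*a1 + lam[2]*a2 + lam[3]*a3, x)
\end{verbatim}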
Proposition 2.6. Given an affine space $\langle E, \overrightarrow{E}, + \rangle$, let $(a_i)_{i \in I}$ be a family of points in E.
The family $(a_i)_{i \in I}$ is affinely dependent iff there is a family $(\lambda_i)_{i \in I}$ such that $\lambda_j \ne 0$ for some
$j \in I$, $\sum_{i \in I} \lambda_i = 0$, and $\sum_{i \in I} \lambda_i\, \overrightarrow{xa_i} = 0$ for every x ∈ E.

Proof. By Proposition 2.5, the family $(a_i)_{i \in I}$ is affinely dependent iff the family of vectors
$(\overrightarrow{a_i a_j})_{j \in (I - \{i\})}$ is linearly dependent for some i ∈ I. For any i ∈ I, the family $(\overrightarrow{a_i a_j})_{j \in (I - \{i\})}$
is linearly dependent iff there is a family $(\lambda_j)_{j \in (I - \{i\})}$ such that $\lambda_j \ne 0$ for some j, and such
that
$$\sum_{j \in (I - \{i\})} \lambda_j\, \overrightarrow{a_i a_j} = 0.$$
Then, for any x ∈ E, since $\overrightarrow{a_i a_j} = \overrightarrow{xa_j} - \overrightarrow{xa_i}$, we have
$$\sum_{j \in (I - \{i\})} \lambda_j\, \overrightarrow{a_i a_j} = \sum_{j \in (I - \{i\})} \lambda_j\, \overrightarrow{xa_j} - \Big(\sum_{j \in (I - \{i\})} \lambda_j\Big)\, \overrightarrow{xa_i},$$
and letting $\lambda_i = -\sum_{j \in (I - \{i\})} \lambda_j$, we get $\sum_{i \in I} \lambda_i\, \overrightarrow{xa_i} = 0$, with $\sum_{i \in I} \lambda_i = 0$ and $\lambda_j \ne 0$ for
some j ∈ I. The converse is obvious by setting $x = a_i$ for some i such that $\lambda_i \ne 0$, since
$\sum_{i \in I} \lambda_i = 0$ implies that $\lambda_j \ne 0$, for some j ≠ i.
[Figure 2.16: Affine frames and their convex hulls: a point, a segment, a triangle, and a tetrahedron.]
Even though Proposition 2.6 is rather dull, it is one of the key ingredients in the proof
of beautiful and deep theorems about convex sets, such as Carathéodory’s theorem, Radon’s
theorem, and Helly’s theorem.
A family of two points (a, b) in E is affinely independent iff $\overrightarrow{ab} \ne 0$, iff a ≠ b. If a ≠ b, the
affine subspace generated by a and b is the set of all points (1 − λ)a + λb, which is the unique
line passing through a and b. A family of three points (a, b, c) in E is affinely independent
iff $\overrightarrow{ab}$ and $\overrightarrow{ac}$ are linearly independent, which means that a, b, and c are not on the same line
(they are not collinear). In this case, the affine subspace generated by (a, b, c) is the set of all
points (1 − λ − µ)a + λb + µc, which is the unique plane containing a, b, and c. A family of
four points (a, b, c, d) in E is affinely independent iff $\overrightarrow{ab}$, $\overrightarrow{ac}$, and $\overrightarrow{ad}$ are linearly independent,
which means that a, b, c, and d are not in the same plane (they are not coplanar). In this
case, a, b, c, and d are the vertices of a tetrahedron. Figure 2.16 shows affine frames and
their convex hulls for |I| = 0, 1, 2, 3.
Given n+1 affinely independent points (a0 , . . . , an ) in E, we can consider the set of points
λ0 a0 + · · · + λn an , where λ0 + · · · + λn = 1 and λi ≥ 0 (λi ∈ R). Such affine combinations are
called convex combinations. This set is called the convex hull of (a0 , . . . , an ) (or n-simplex
spanned by (a0 , . . . , an )). When n = 1, we get the segment between a0 and a1 , including
a0 and a1 . When n = 2, we get the interior of the triangle whose vertices are a0 , a1 , a2 ,
including boundary points (the edges). When n = 3, we get the interior of the tetrahedron
whose vertices are a0 , a1 , a2 , a3 , including boundary points (faces and edges). The set
$$\left\{a_0 + \lambda_1 \overrightarrow{a_0 a_1} + \cdots + \lambda_n \overrightarrow{a_0 a_n} \ \middle|\ 0 \le \lambda_i \le 1\ (\lambda_i \in \mathbb{R})\right\}$$
is called the parallelotope spanned by $(a_0, \ldots, a_n)$.
Figure 2.17: Examples of affine frames, convex hulls, and their associated parallelotopes.
More generally, we say that a subset V of E is convex if for any two points a, b ∈ V , we
have c ∈ V for every point c = (1 − λ)a + λb, with 0 ≤ λ ≤ 1 (λ ∈ R).
Points are not vectors! The following example illustrates why treating points as
vectors may cause problems. Let a, b, c be three affinely independent points in A3 .
Any point x in the plane (a, b, c) can be expressed as
$$x = \lambda_0 a + \lambda_1 b + \lambda_2 c,$$
where $\lambda_0 + \lambda_1 + \lambda_2 = 1$. How can we compute $\lambda_0, \lambda_1, \lambda_2$? Letting $a = (a_1, a_2, a_3)$,
$b = (b_1, b_2, b_3)$, $c = (c_1, c_2, c_3)$, and $x = (x_1, x_2, x_3)$ be the coordinates of a, b, c, x in the
standard frame of $\mathbb{A}^3$, it is tempting to solve the system of equations
$$\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix} \begin{pmatrix} \lambda_0 \\ \lambda_1 \\ \lambda_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
However, there is a problem when the origin of the coordinate system belongs to the plane
(a, b, c), since in this case, the matrix is not invertible! What we should really be doing is to
solve the system
$$\lambda_0 \overrightarrow{Oa} + \lambda_1 \overrightarrow{Ob} + \lambda_2 \overrightarrow{Oc} = \overrightarrow{Ox},$$
where O is any point not in the plane (a, b, c). An alternative is to use certain well-chosen
cross products.
It can be shown that barycentric coordinates correspond to various ratios of areas and
volumes; see the problems.
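Both the pitfall and the remedy are easy to demonstrate numerically. In our sketch below (the specific points are our own choices), the plane (a, b, c) contains the coordinate origin, so the naive 3 × 3 system is singular, while the system based on a point O off the plane recovers the barycentric coordinates.

\begin{verbatim}
import numpy as np

# Points spanning a plane that happens to contain the coordinate origin
a = np.array([1.0, 0.0, -1.0])
b = np.array([0.0, 1.0, -1.0])
c = np.array([1.0, 1.0, -2.0])             # plane x + y + z = 0
x = 0.2 * a + 0.3 * b + 0.5 * c            # a point of the plane (a, b, c)

print(np.linalg.det(np.column_stack([a, b, c])))   # 0: the naive system fails

# Remedy: use vectors from a point O not in the plane (a, b, c)
O = np.array([0.0, 0.0, 1.0])
M = np.column_stack([a - O, b - O, c - O])
lam = np.linalg.solve(M, x - O)
print(lam)                                 # [0.2 0.3 0.5]
assert np.allclose(lam.sum(), 1.0)         # barycentric weights sum to 1
\end{verbatim}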
Affine maps can be obtained from linear maps as follows. For simplicity of notation, the
same symbol + is used for both affine spaces (instead of using both + and +′).

Given any point a ∈ E, any point b ∈ E′, and any linear map $h : \overrightarrow{E} \to \overrightarrow{E'}$, we claim that
the map f : E → E′ defined such that
$$f(a + v) = b + h(v)$$
is an affine map. Indeed, for any family $(\lambda_i)_{i \in I}$ of scalars with $\sum_{i \in I} \lambda_i = 1$ and any family
$(v_i)_{i \in I}$, since
$$\sum_{i \in I} \lambda_i (a + v_i) = a + \sum_{i \in I} \lambda_i\, \overrightarrow{a(a + v_i)} = a + \sum_{i \in I} \lambda_i v_i$$
and
$$\sum_{i \in I} \lambda_i (b + h(v_i)) = b + \sum_{i \in I} \lambda_i\, \overrightarrow{b(b + h(v_i))} = b + \sum_{i \in I} \lambda_i h(v_i),$$
we have
$$\begin{aligned}
f\left(\sum_{i \in I} \lambda_i (a + v_i)\right) &= f\left(a + \sum_{i \in I} \lambda_i v_i\right) \\
&= b + h\left(\sum_{i \in I} \lambda_i v_i\right) \\
&= b + \sum_{i \in I} \lambda_i h(v_i) \\
&= \sum_{i \in I} \lambda_i (b + h(v_i)) \\
&= \sum_{i \in I} \lambda_i f(a + v_i).
\end{aligned}$$
Note that the condition $\sum_{i \in I} \lambda_i = 1$ was implicitly used (in a hidden call to Proposition
2.1) in deriving that
$$\sum_{i \in I} \lambda_i (a + v_i) = a + \sum_{i \in I} \lambda_i v_i$$
and
$$\sum_{i \in I} \lambda_i (b + h(v_i)) = b + \sum_{i \in I} \lambda_i h(v_i).$$
For example, the map
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 3 \\ 1 \end{pmatrix}$$
defines an affine map in $\mathbb{A}^2$. It is a “shear” followed by a translation. The effect of this shear
on the square (a, b, c, d) is shown in Figure 2.18. The image of the square (a, b, c, d) is the
parallelogram (a′, b′, c′, d′).
[Figure 2.18: The image of the square (a, b, c, d) under the shear is the parallelogram (a′, b′, c′, d′), with a = (0, 0), b = (1, 0), c = (1, 1), d = (0, 1) mapped to a′ = (3, 1), b′ = (4, 1), c′ = (6, 2), d′ = (5, 2).]
Proposition 2.7. Given an affine map f : E → E′, there is a unique linear map $\overrightarrow{f} : \overrightarrow{E} \to \overrightarrow{E'}$
such that
$$f(a + v) = f(a) + \overrightarrow{f}(v),$$
for every a ∈ E and every $v \in \overrightarrow{E}$.

Proof. Let a ∈ E be any point in E. We claim that the map defined such that
$$\overrightarrow{f}(v) = \overrightarrow{f(a)f(a + v)}$$
for every $v \in \overrightarrow{E}$ is a linear map $\overrightarrow{f} : \overrightarrow{E} \to \overrightarrow{E'}$. Indeed, we can write
$$a + \lambda v = \lambda(a + v) + (1 - \lambda)a,$$
since $a + \lambda v = a + \lambda\,\overrightarrow{a(a + v)} + (1 - \lambda)\,\overrightarrow{aa}$, and also
$$a + u + v = (a + u) + (a + v) - a,$$
since $a + u + v = a + \overrightarrow{a(a + u)} + \overrightarrow{a(a + v)} - \overrightarrow{aa}$. Since f preserves barycenters, we get
$$f(a + \lambda v) = \lambda f(a + v) + (1 - \lambda) f(a),$$
and thus
$$\overrightarrow{f(a)f(a + \lambda v)} = \lambda\, \overrightarrow{f(a)f(a + v)} + (1 - \lambda)\, \overrightarrow{f(a)f(a)} = \lambda\, \overrightarrow{f(a)f(a + v)},$$
showing that $\overrightarrow{f}(\lambda v) = \lambda \overrightarrow{f}(v)$. We also have
$$f(a + u + v) = f(a + u) + f(a + v) - f(a),$$
from which it follows that
$$\overrightarrow{f(a)f(a + u + v)} = \overrightarrow{f(a)f(a + u)} + \overrightarrow{f(a)f(a + v)},$$
showing that $\overrightarrow{f}(u + v) = \overrightarrow{f}(u) + \overrightarrow{f}(v)$. Consequently, $\overrightarrow{f}$ is a linear map. For any other
point b ∈ E, since b + v = (a + v) − a + b (a barycentric combination with weights 1, −1, 1)
and f preserves barycenters, we get
$$f(b + v) = f(a + v) - f(a) + f(b),$$
which shows that $\overrightarrow{f(b)f(b + v)} = \overrightarrow{f(a)f(a + v)}$, so $\overrightarrow{f}(v)$ does not depend on the choice of
the point a. The uniqueness of $\overrightarrow{f}$ is clear: we must have $\overrightarrow{f}(v) = \overrightarrow{f(a)f(a + v)}$.
The unique linear map f : E → E 0 given by Proposition 2.7 is called the linear map
associated with the affine map f .
Note that the condition
→
−
f (a + v) = f (a) + f (v),
→
−
for every a ∈ E and every v ∈ E , can be stated equivalently as
→
− → −−−−−→ → − →
f (x) = f (a) + f (−
ax), or f (a)f (x) = f (−
ax),
for all a, x ∈ E. Proposition 2.7 shows that for any affine map f : E → E 0 , there are points
→
− → − −
→
a ∈ E, b ∈ E 0 , and a unique linear map f : E → E 0 , such that
→
−
f (a + v) = b + f (v),
→
− →
−
for all v ∈ E (just let b = f (a), for any a ∈ E). Affine maps for which f is the identity
→
−
map are called translations. Indeed, if f = id,
→
− →
f (x) = f (a) + f (−
ax) = f (a) + −
→=x+−
ax →+−
xa
−−→ →
af (a) + −
ax
−−−→ −−−→
= x+− → + af (a) − −
xa → = x + af (a),
xa
and so
−−−→ −−−→
xf (x) = af (a),
−−−→
which shows that f is the translation induced by the vector af (a) (which does not depend
on a).
Since an affine map preserves barycenters, and since an affine subspace V is closed under
barycentric combinations, the image f(V) of V is an affine subspace in E′. So, for example,
the image of a line is a point or a line, and the image of a plane is either a point, a line, or
a plane.

It is easily verified that the composition of two affine maps is an affine map. Also, given
affine maps f : E → E′ and g : E′ → E″, we have
$$g(f(a + v)) = g\left(f(a) + \overrightarrow{f}(v)\right) = g(f(a)) + \overrightarrow{g}\left(\overrightarrow{f}(v)\right),$$
which shows that $\overrightarrow{g \circ f} = \overrightarrow{g} \circ \overrightarrow{f}$. It is easy to show that an affine map f : E → E′ is injective
iff $\overrightarrow{f} : \overrightarrow{E} \to \overrightarrow{E'}$ is injective, and that f : E → E′ is surjective iff $\overrightarrow{f} : \overrightarrow{E} \to \overrightarrow{E'}$ is surjective.
An affine map f : E → E′ is constant iff $\overrightarrow{f} : \overrightarrow{E} \to \overrightarrow{E'}$ is the null (constant) linear map equal
to 0 for all $v \in \overrightarrow{E}$.
If E is an affine space of dimension m and $(a_0, a_1, \ldots, a_m)$ is an affine frame for E, then
for any other affine space F and for any sequence $(b_0, b_1, \ldots, b_m)$ of m + 1 points in F, there
is a unique affine map f : E → F such that $f(a_i) = b_i$, for $0 \le i \le m$.

Using affine frames, affine maps can be represented in terms of matrices. Consider an
affine map f : E → E, where E has dimension n and affine frame $(a_0, (\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_n}))$.
Writing
$$x = x_1 \overrightarrow{a_0 a_1} + \cdots + x_n \overrightarrow{a_0 a_n},$$
$$\overrightarrow{a_0 f(a_0)} = b_1 \overrightarrow{a_0 a_1} + \cdots + b_n \overrightarrow{a_0 a_n},$$
$$\overrightarrow{a_0 f(a_0 + x)} = y_1 \overrightarrow{a_0 a_1} + \cdots + y_n \overrightarrow{a_0 a_n},$$
if $A = (a_{ij})$ is the n × n matrix of the linear map $\overrightarrow{f}$ over the basis $(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_n})$, letting x,
y, and b denote the column vectors of components $(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$, and $(b_1, \ldots, b_n)$,
the identity
$$\overrightarrow{a_0 f(a_0 + x)} = \overrightarrow{a_0 f(a_0)} + \overrightarrow{f}(x)$$
is equivalent to
$$y = Ax + b.$$
Note that b ≠ 0 unless $f(a_0) = a_0$. Thus, f is generally not a linear transformation, unless it
has a fixed point, i.e., there is a point a0 such that f (a0 ) = a0 . The vector b is the “translation
part” of the affine map. Affine maps do not always have a fixed point. Obviously, nonnull
translations have no fixed point. A less trivial example is given by the affine map
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$
This map is a reflection about the x-axis followed by a translation along the x-axis. The
affine map
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 1 & -\sqrt{3} \\ \sqrt{3}/4 & 1/4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
satisfies
$$\begin{pmatrix} 1 & -\sqrt{3} \\ \sqrt{3}/4 & 1/4 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix} \begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix},$$
which shows that it is the composition of a rotation of angle π/3, followed by a stretch (by a
factor of 2 along the x-axis, and by a factor of 1/2 along the y-axis), followed by a translation.
It is easy to show that this affine map has a unique fixed point. On the other hand, the
affine map
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 8/5 & -6/5 \\ 3/10 & 2/5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
has no fixed point, even though
$$\begin{pmatrix} 8/5 & -6/5 \\ 3/10 & 2/5 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix} \begin{pmatrix} 4/5 & -3/5 \\ 3/5 & 4/5 \end{pmatrix},$$
and the second matrix is a rotation of angle θ such that cos θ = 4/5 and sin θ = 3/5.
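This claim is easy to check by a short computation (ours, not from the text): f(x) = Ax + b has a fixed point iff the linear system (I − A)x = b is solvable, and here it is not.

\begin{verbatim}
import numpy as np

A = np.array([[8/5, -6/5],
              [3/10, 2/5]])
b = np.array([1.0, 1.0])

M = np.eye(2) - A                       # fixed points solve (I - A) x = b
print(np.linalg.det(M))                 # 0 (up to rounding): M is singular
print(np.linalg.matrix_rank(M))                          # 1
print(np.linalg.matrix_rank(np.column_stack([M, b])))    # 2: inconsistent
\end{verbatim}

Since the augmented matrix has larger rank than M, the system (I − A)x = b has no solution, so f has no fixed point.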
There is a useful trick to convert the equation y = Ax + b into what looks like a linear
equation. The trick is to consider an (n + 1) × (n + 1) matrix. We add 1 as the (n + 1)th
component to the vectors x, y, and b, and form the (n + 1) × (n + 1) matrix
$$\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix}$$
so that y = Ax + b is equivalent to
$$\begin{pmatrix} y \\ 1 \end{pmatrix} = \begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix}.$$
This trick is very useful in kinematics and dynamics, where A is a rotation matrix. Such
affine maps are called rigid motions.
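A small NumPy sketch (ours; the construction is the one just described, while the example maps are our own choices) shows the payoff of the trick: composing affine maps becomes plain matrix multiplication of their (n + 1) × (n + 1) matrices.

\begin{verbatim}
import numpy as np

def homogeneous(A, b):
    n = A.shape[0]
    H = np.eye(n + 1)
    H[:n, :n] = A                          # linear part
    H[:n, n] = b                           # translation part
    return H

A1, b1 = np.array([[0.0, -1.0], [1.0, 0.0]]), np.array([1.0, 0.0])
A2, b2 = np.array([[2.0, 0.0], [0.0, 2.0]]), np.array([0.0, 3.0])

x = np.array([1.0, 1.0])
xh = np.append(x, 1.0)                     # add 1 as the (n+1)th component

H = homogeneous(A2, b2) @ homogeneous(A1, b1)   # apply map 1, then map 2
y = A2 @ (A1 @ x + b1) + b2                     # same thing, done directly
assert np.allclose(H @ xh, np.append(y, 1.0))
print((H @ xh)[:2])                             # [0. 5.]
\end{verbatim}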
If f : E → E′ is a bijective affine map, given any three collinear points a, b, c in E,
with a ≠ b, where, say, c = (1 − λ)a + λb, since f preserves barycenters, we have f(c) =
(1 − λ)f(a) + λf(b), which shows that f(a), f(b), f(c) are collinear in E′. There is a converse
to this property, which is simpler to state when the ground field is K = R. The converse
states that given any bijective function f : E → E′ between two real affine spaces of the
same dimension n ≥ 2, if f maps any three collinear points to collinear points, then f is
affine. The proof is rather long (see Berger [7] or Samuel [50]).

Given three collinear points a, b, c, where a ≠ c, we have b = (1 − β)a + βc for some
unique β, and we define the ratio of the sequence a, b, c, as
$$\mathrm{ratio}(a, b, c) = \frac{\beta}{1 - \beta} = \frac{\overrightarrow{ab}}{\overrightarrow{bc}},$$
provided that β ≠ 1, that is, provided that b ≠ c.

Given any point a and any scalar λ ∈ R, we define the map $H_{a,\lambda} : E \to E$, called the
dilatation (or central dilatation, or homothety) of center a and ratio λ, such that
$$H_{a,\lambda}(x) = a + \lambda\,\overrightarrow{ax},$$
for every x ∈ E.
Remark: The terminology does not seem to be universally agreed upon. The terms affine
dilatation and central dilatation are used by Pedoe [46]. Snapper and Troyer use the term
dilation for an affine dilatation and magnification for a central dilatation [53]. Samuel uses
homothety for a central dilatation, a direct translation of the French “homothétie” [50]. Since
dilation is shorter than dilatation and somewhat easier to pronounce, perhaps we should use
that!
Observe that $H_{a,\lambda}(a) = a$, and when λ ≠ 0 and x ≠ a, $H_{a,\lambda}(x)$ is on the line defined by
a and x, and is obtained by “scaling” $\overrightarrow{ax}$ by λ.
Figure 2.20 shows the effect of a central dilatation of center d. The triangle (a, b, c) is
magnified to the triangle (a′, b′, c′). Note how every line is mapped to a parallel line.
When λ = 1, $H_{a,1}$ is the identity. Note that $\overrightarrow{H_{a,\lambda}} = \lambda\, \mathrm{id}_{\overrightarrow{E}}$. When λ ≠ 0, it is clear that
$H_{a,\lambda}$ is an affine bijection. It is immediately verified that
$$H_{a,\lambda} \circ H_{a,\mu} = H_{a,\lambda\mu}.$$
Proposition 2.8. Given any affine space E, for any affine bijection f ∈ GA(E), if $\overrightarrow{f} = \lambda\, \mathrm{id}_{\overrightarrow{E}}$,
for some λ ∈ R* with λ ≠ 1, then there is a unique point c ∈ E such that $f = H_{c,\lambda}$.

Proof. The proof is straightforward, and is omitted. It is also given in Gallier [29].
Proof. The proof is straightforward, and is omitted. It is also given in Gallier [29].
Clearly, if $\overrightarrow{f} = \mathrm{id}_{\overrightarrow{E}}$, the affine map f is a translation. Thus, the group of affine
dilatations DIL(E) is the disjoint union of the translations and of the dilatations of ratio
λ ≠ 0, 1. Affine dilatations can be given a purely geometric characterization.
Another point worth mentioning is that affine bijections preserve the ratio of volumes of
parallelotopes. Indeed, given any basis $B = (u_1, \ldots, u_m)$ of the vector space $\overrightarrow{E}$ associated
with the affine space E, given any m + 1 affinely independent points $(a_0, \ldots, a_m)$, we can
compute the determinant $\det_B(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m})$ w.r.t. the basis B. For any bijective affine
map f : E → E, since
$$\det{}_B\!\left(\overrightarrow{f}(\overrightarrow{a_0 a_1}), \ldots, \overrightarrow{f}(\overrightarrow{a_0 a_m})\right) = \det\!\left(\overrightarrow{f}\right) \det{}_B\!\left(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m}\right)$$
and the determinant of a linear map is intrinsic (i.e., depends only on $\overrightarrow{f}$, and not on the
particular basis B), we conclude that the ratio
$$\frac{\det_B\!\left(\overrightarrow{f}(\overrightarrow{a_0 a_1}), \ldots, \overrightarrow{f}(\overrightarrow{a_0 a_m})\right)}{\det_B\!\left(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m}\right)} = \det\!\left(\overrightarrow{f}\right)$$
is independent of the basis B. Since $\det_B(\overrightarrow{a_0 a_1}, \ldots, \overrightarrow{a_0 a_m})$ is the volume of the parallelotope
spanned by $(a_0, \ldots, a_m)$, where the parallelotope defined by any point a and the vectors
$(u_1, \ldots, u_m)$ has unit volume (see Berger [7], Section 9.12), we see that affine bijections
preserve the ratio of volumes of parallelotopes. In fact, this ratio is independent of the
choice of the parallelotopes of unit volume. In particular, the affine bijections f ∈ GA(E)
such that $\det(\overrightarrow{f}) = 1$ preserve volumes. These affine maps form a subgroup SA(E) of
GA(E) called the special affine group of E. We now take a glimpse at affine geometry.

[Figure 2.21: The theorem of Thales: three parallel hyperplanes $H_1, H_2, H_3$ cut the lines A and B in the points $a_1, a_2, a_3$ and $b_1, b_2, b_3$.]
Proposition 2.9 (Thales). Given any affine space E, if $H_1, H_2, H_3$ are any three distinct
parallel hyperplanes, and A and B are any two lines not parallel to $H_i$, letting $a_i = H_i \cap A$
and $b_i = H_i \cap B$, then
$$\frac{\overrightarrow{a_1 a_3}}{\overrightarrow{a_1 a_2}} = \frac{\overrightarrow{b_1 b_3}}{\overrightarrow{b_1 b_2}}.$$

Proof. Figure 2.21 illustrates the theorem of Thales. We sketch a proof, leaving the details
as an exercise. Since $H_1, H_2, H_3$ are parallel, they have the same direction $\overrightarrow{H}$, a hyperplane
in $\overrightarrow{E}$. Let $u \in \overrightarrow{E} - \overrightarrow{H}$ be any nonnull vector such that $A = a_1 + \mathbb{R}u$. Since A is not parallel to
H, we have $\overrightarrow{E} = \overrightarrow{H} \oplus \mathbb{R}u$, and thus we can define the linear map $p : \overrightarrow{E} \to \mathbb{R}u$, the projection
on $\mathbb{R}u$ parallel to $\overrightarrow{H}$. This linear map induces an affine map f : E → A, by defining f such
that
$$f(b_1 + w) = a_1 + p(w),$$
for all $w \in \overrightarrow{E}$. Clearly, $f(b_1) = a_1$, and since $H_1, H_2, H_3$ all have direction $\overrightarrow{H}$, we also have
$f(b_2) = a_2$ and $f(b_3) = a_3$. Since f is affine, it preserves ratios, and thus
$$\frac{\overrightarrow{a_1 a_3}}{\overrightarrow{a_1 a_2}} = \frac{\overrightarrow{b_1 b_3}}{\overrightarrow{b_1 b_2}}.$$
We also have the following simple proposition, whose proof is left as an easy exercise.

Proposition 2.10. Given any affine space E, given any two distinct points a, b ∈ E, and
for any affine dilatation f different from the identity, if a′ = f(a), D = ⟨a, b⟩ is the line
passing through a and b, and D′ is the line parallel to D and passing through a′, the following
are equivalent:

(i) b′ = f(b);

(ii) If f is a translation, then b′ is the intersection of D′ with the line parallel to ⟨a, a′⟩
passing through b; if f is a dilatation of center c, then b′ is the intersection of D′ with
the line passing through c and b.

The first case is the parallelogram law, and the second case follows easily from Thales'
theorem. For an illustration, see Figure 2.22.
We are now ready to prove two classical results of affine geometry, Pappus’s theorem and
Desargues’s theorem. Actually, these results are theorems of projective geometry, and we
are stating affine versions of these important results. There are stronger versions that are
best proved using projective geometry.
Proposition 2.11 (Pappus). Given any affine plane E, any two distinct lines D and D′, then for
any distinct points a, b, c on D and a′, b′, c′ on D′, if a, b, c, a′, b′, c′ are distinct from the
intersection of D and D′ (if D and D′ intersect) and if the lines ⟨a, b′⟩ and ⟨a′, b⟩ are parallel,
and the lines ⟨b, c′⟩ and ⟨b′, c⟩ are parallel, then the lines ⟨a, c′⟩ and ⟨a′, c⟩ are parallel.
Figure 2.22: An illustration of Proposition 2.10. The bottom left diagram illustrates a
translation, while the bottom right illustrates a central dilation through c.
Proof. Pappus's theorem is illustrated in Figure 2.23. If D and D′ are not parallel, let d be
their intersection. Let f be the dilatation of center d such that f(a) = b, and let g be the
dilatation of center d such that g(b) = c. Since the lines ⟨a, b′⟩ and ⟨a′, b⟩ are parallel, and
the lines ⟨b, c′⟩ and ⟨b′, c⟩ are parallel, by Proposition 2.10 we have a′ = f(b′) and b′ = g(c′).
However, we observed that dilatations with the same center commute, and thus f ∘ g = g ∘ f,
and thus, letting h = g ∘ f, we get c = h(a) and a′ = h(c′). Again, by Proposition 2.10, the
lines ⟨a, c′⟩ and ⟨a′, c⟩ are parallel. If D and D′ are parallel, we use translations instead of
dilatations.
Proposition 2.12 (Desargues). Given any affine space E, and given any two triangles (a, b, c) and
(a′, b′, c′), where a, b, c, a′, b′, c′ are all distinct, if ⟨a, b⟩ and ⟨a′, b′⟩ are parallel and ⟨b, c⟩ and
⟨b′, c′⟩ are parallel, then ⟨a, c⟩ and ⟨a′, c′⟩ are parallel iff the lines ⟨a, a′⟩, ⟨b, b′⟩, and ⟨c, c′⟩
are either parallel or concurrent (i.e., intersect in a common point).
Proof. We prove half of the proposition, the direction in which it is assumed that ⟨a, c⟩ and
⟨a′, c′⟩ are parallel, leaving the converse as an exercise.

[Figure 2.23: Pappus's theorem.]

Since the lines ⟨a, b⟩ and ⟨a′, b′⟩ are
parallel, the points a, b, a′, b′ are coplanar. Thus, either ⟨a, a′⟩ and ⟨b, b′⟩ are parallel, or
they have some intersection d. We consider the second case where they intersect, leaving
the other case as an easy exercise. Let f be the dilatation of center d such that f(a) = a′.
By Proposition 2.10, we get f(b) = b′. If f(c) = c″, again by Proposition 2.10 twice, the
lines ⟨b, c⟩ and ⟨b′, c″⟩ are parallel, and the lines ⟨a, c⟩ and ⟨a′, c″⟩ are parallel. From this it
follows that c″ = c′. Indeed, recall that ⟨b, c⟩ and ⟨b′, c′⟩ are parallel, and similarly ⟨a, c⟩ and
⟨a′, c′⟩ are parallel. Thus, the lines ⟨b′, c″⟩ and ⟨b′, c′⟩ are identical, and similarly the lines
⟨a′, c″⟩ and ⟨a′, c′⟩ are identical. Since $\overrightarrow{a'c'}$ and $\overrightarrow{b'c'}$ are linearly independent, these lines have
a unique intersection, which must be c″ = c′.
The direction where it is assumed that the lines ⟨a, a′⟩, ⟨b, b′⟩ and ⟨c, c′⟩ are either parallel
or concurrent is left as an exercise (in fact, the proof is quite similar).
is the null set, or kernel, of the affine map $f : \mathbb{A}^m \to \mathbb{R}$, in the sense that
$$H = f^{-1}(0) = \{x \in \mathbb{A}^m \mid f(x) = 0\},$$
where $x = (x_1, \ldots, x_m)$.
Thus, it is interesting to consider affine forms, which are just affine maps f : E → R
from an affine space to R. Unlike linear forms f ∗ , for which Ker f ∗ is never empty (since it
always contains the vector 0), it is possible that f −1 (0) = ∅ for an affine form f . Given an
affine map f : E → R, we also denote f −1 (0) by Ker f , and we call it the kernel of f . Recall
that an (affine) hyperplane is an affine subspace of codimension 1. The relationship between
affine hyperplanes and affine forms is given by the following proposition.
Proposition 2.13. Let E be an affine space.

(a) Given any nonconstant affine form f : E → R, its kernel H = Ker f is a hyperplane.

(b) For any hyperplane H in E, there is a nonconstant affine form f : E → R such that
H = Ker f. For any other affine form g : E → R such that H = Ker g, there is some
λ ∈ R such that g = λf (with λ ≠ 0).

(c) Given any hyperplane H in E and any (nonconstant) affine form f : E → R such that
H = Ker f, every hyperplane H′ parallel to H is defined by a nonconstant affine form
g such that g(a) = f(a) − λ, for all a ∈ E and some λ ∈ R.
Proof. The proof is straightforward, and is omitted. It is also given in Gallier [29].
When E is an affine space of dimension n equipped with an affine frame, an affine form
f : E → R is given in coordinates by an expression of the form
$$f(x_1, \ldots, x_n) = \lambda_1 x_1 + \cdots + \lambda_n x_n + \mu,$$
and a hyperplane is the set of points whose coordinates $(x_1, \ldots, x_n)$ satisfy the linear equation
$$\lambda_1 x_1 + \cdots + \lambda_n x_n + \mu = 0,$$
where $\lambda_1, \ldots, \lambda_n$ are not all null.
Proposition 2.14. Given a vector space E and any two subspaces M and N, with the
definitions above,
$$0 \longrightarrow M \cap N \xrightarrow{\ f+g\ } M \oplus N \xrightarrow{\ i-j\ } M + N \longrightarrow 0$$
is a short exact sequence, which means that f + g is injective, i − j is surjective, and that
Im(f + g) = Ker(i − j). As a consequence, we have the Grassmann relation
$$\dim(M) + \dim(N) = \dim(M + N) + \dim(M \cap N).$$
Proposition 2.15. Given any affine space E, for any two nonempty affine subspaces M
and N, the following facts hold:

(1) M ∩ N ≠ ∅ iff $\overrightarrow{ab} \in \overrightarrow{M} + \overrightarrow{N}$ for some a ∈ M and some b ∈ N.

(2) M ∩ N consists of a single point iff $\overrightarrow{ab} \in \overrightarrow{M} + \overrightarrow{N}$ for some a ∈ M and some b ∈ N,
and $\overrightarrow{M} \cap \overrightarrow{N} = \{0\}$.

(3) If S is the least affine subspace containing M and N, then $\overrightarrow{S} = \overrightarrow{M} + \overrightarrow{N} + K\overrightarrow{ab}$ (the
vector space $\overrightarrow{E}$ is defined over the field K).
Proof. (1) Pick any a ∈ M and any b ∈ N, which is possible, since M and N are nonempty.
Since $\overrightarrow{M} = \{\overrightarrow{ax} \mid x \in M\}$ and $\overrightarrow{N} = \{\overrightarrow{by} \mid y \in N\}$, if M ∩ N ≠ ∅, for any c ∈ M ∩ N we have
$\overrightarrow{ab} = \overrightarrow{ac} - \overrightarrow{bc}$, with $\overrightarrow{ac} \in \overrightarrow{M}$ and $\overrightarrow{bc} \in \overrightarrow{N}$, and thus, $\overrightarrow{ab} \in \overrightarrow{M} + \overrightarrow{N}$. Conversely, assume that
$\overrightarrow{ab} \in \overrightarrow{M} + \overrightarrow{N}$ for some a ∈ M and some b ∈ N. Then $\overrightarrow{ab} = \overrightarrow{ax} + \overrightarrow{by}$, for some x ∈ M and
some y ∈ N. But we also have
$$\overrightarrow{ab} = \overrightarrow{ax} + \overrightarrow{xy} + \overrightarrow{yb},$$
and thus we get $0 = \overrightarrow{xy} + \overrightarrow{yb} - \overrightarrow{by}$, that is, $\overrightarrow{xy} = 2\overrightarrow{by}$. Thus, b is the middle of the segment
[x, y], and since $\overrightarrow{yx} = 2\overrightarrow{yb}$, x = 2b − y is the barycenter of the weighted points (b, 2) and
(y, −1). Thus x also belongs to N, since N being an affine subspace, it is closed under
barycenters. Thus, x ∈ M ∩ N, and M ∩ N ≠ ∅.

(2) Note that in general, if M ∩ N ≠ ∅, then
$$\overrightarrow{M \cap N} = \overrightarrow{M} \cap \overrightarrow{N},$$
because
$$\overrightarrow{M \cap N} = \{\overrightarrow{ab} \mid a, b \in M \cap N\} = \{\overrightarrow{ab} \mid a, b \in M\} \cap \{\overrightarrow{ab} \mid a, b \in N\} = \overrightarrow{M} \cap \overrightarrow{N}.$$
Since $M \cap N = c + \overrightarrow{M \cap N}$ for any c ∈ M ∩ N, we have
$$M \cap N = c + \overrightarrow{M} \cap \overrightarrow{N} \quad\text{for any } c \in M \cap N.$$
From this it follows that if M ∩ N ≠ ∅, then M ∩ N consists of a single point iff $\overrightarrow{M} \cap \overrightarrow{N} = \{0\}$.
This fact together with what we proved in (1) proves (2).

(3) This is left as an easy exercise.
Remarks:
(1) The proof of Proposition 2.15 shows that if M ∩ N ≠ ∅, then $\overrightarrow{ab} \in \vec{M} + \vec{N}$ for all a ∈ M and all b ∈ N.

(2) Proposition 2.15 implies that for any two nonempty affine subspaces M and N, if $\vec{E} = \vec{M} \oplus \vec{N}$, then M ∩ N consists of a single point. Indeed, if $\vec{E} = \vec{M} \oplus \vec{N}$, then $\overrightarrow{ab} \in \vec{E}$ for all a ∈ M and all b ∈ N, and since $\vec{M} \cap \vec{N} = \{0\}$, the result follows from part (2) of the proposition.
Proposition 2.16. Given an affine space E and any two nonempty affine subspaces M and
N , if S is the least affine subspace containing M and N , then the following properties hold:
(1) If M ∩ N = ∅, then

dim(M) + dim(N) < dim(E) + dim($\vec{M} + \vec{N}$)

and

dim(S) = dim(M) + dim(N) + 1 − dim($\vec{M} \cap \vec{N}$).

(2) If M ∩ N ≠ ∅, then

dim(S) = dim(M) + dim(N) − dim(M ∩ N).
Proof. The proof is not difficult, using Proposition 2.15 and Proposition 2.14, but we leave
it as an exercise.
Chapter 3

Basic Properties of Convex Sets
Convex sets play a very important role in geometry. In this chapter we state and prove some
of the “classics” of convex affine geometry: Carathéodory’s theorem, Radon’s theorem, and
Helly’s theorem. These theorems share the property that they are easy to state, but they
are deep, and their proof, although rather short, requires a lot of creativity.
Recall that the distance d(a, b) between two points a, b ∈ A^d is $d(a,b) = \|\overrightarrow{ab}\|$, which is also the norm of the vector $\overrightarrow{ab}$, and that for any ε > 0, the open ball of center a and radius ε, B(a, ε), is given by

B(a, ε) = {b ∈ A^d | d(a, b) < ε}.

A subset U ⊆ A^d is open (in the norm topology) if either U is empty or for every point a ∈ U, there is some (small) open ball B(a, ε) contained in U.

A subset C ⊆ A^d is closed iff A^d − C is open. For example, the closed balls B(a, ε), where

B(a, ε) = {b ∈ A^d | d(a, b) ≤ ε},

are closed.
A subset W ⊆ Ad is bounded iff there is some ball (open or closed) B so that W ⊆ B.
A subset W ⊆ A^d is compact iff every family {U_i}_{i∈I} that is an open cover of W (which means that $W = \bigcup_{i \in I} (W \cap U_i)$, with each U_i an open set) possesses a finite subcover (which means that there is a finite subset F ⊆ I so that $W = \bigcup_{i \in F} (W \cap U_i)$). In A^d, it can be shown that a subset W is compact iff W is closed and bounded.
Given a function f : Am → An , we say that f is continuous if f −1 (V ) is open in Am
whenever V is open in An . If f : Am → An is a continuous function, although it is generally
false that f (U ) is open if U ⊆ Am is open, it is easily checked that f (K) is compact if
K ⊆ Am is compact.
An affine space X of dimension d becomes a topological space if we give it the topology
for which the open subsets are of the form f −1 (U ), where U is any open subset of Ad and
f : X → Ad is an affine bijection.
Given any subset A of a topological space X, the smallest closed set containing A is denoted by $\overline{A}$, and is called the closure or adherence of A. A subset A of X is dense in X if $\overline{A} = X$. The largest open set contained in A is denoted by $\overset{\circ}{A}$, and is called the interior of A. The set Fr A = $\overline{A} \cap \overline{X - A}$ is called the boundary (or frontier) of A. We also denote the boundary of A by ∂A.
A good understanding of what conv(S) is, and good methods for computing it, are
essential. First we have the following simple but crucial lemma:
Lemma 3.1. Given an affine space $\langle E, \vec{E}, + \rangle$, for any family (a_i)_{i∈I} of points in E, the set V of convex combinations ∑_{i∈I} λ_i a_i (where ∑_{i∈I} λ_i = 1 and λ_i ≥ 0) is the convex hull of (a_i)_{i∈I}.
Proof. If (a_i)_{i∈I} is empty, then V = ∅, because of the condition ∑_{i∈I} λ_i = 1. As in the case of affine combinations, it is easily shown by induction that any convex combination can be obtained by computing convex combinations of two points at a time. As a consequence, if (a_i)_{i∈I} is nonempty, then the smallest convex subspace containing (a_i)_{i∈I} must contain the set V of all convex combinations ∑_{i∈I} λ_i a_i. Thus, it is enough to show that V is closed under convex combinations, which is immediately verified.
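In practice, convex hulls of finite point sets can be computed with standard software. The following minimal sketch (not part of the text's development, and assuming the scipy library is available) illustrates Lemma 3.1 on an arbitrary small example:

    import numpy as np
    from scipy.spatial import ConvexHull

    # Five points in the affine plane A^2; by Lemma 3.1, conv(S) is the set
    # of all convex combinations of these points.
    S = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0], [1.0, 1.0]])
    hull = ConvexHull(S)
    print(hull.vertices)  # indices of the points of S that are vertices of conv(S)
    # The point (1, 1) lies inside the square spanned by the other four points,
    # so it is not reported as a vertex.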
{H₊(f), H₋(f)}

depends only on the hyperplane H, and the choice of a specific f defining H amounts
to the choice of one of the two half-spaces. For this reason, we will also say that H₊(f)
and H₋(f) are the closed half-spaces determined by H.
(1) Is it possible to have a fixed bound on the number of points involved in the convex
combinations ∑_{j∈J} λ_j a_j (that is, on the size of the index sets J)?

(2) Is it possible to recover a convex set from some small, distinguished subset of its points?

The answer is yes in both cases. In Case (1), assuming that the affine space E has dimension m, Carathéodory's theorem asserts that it is enough to consider convex combinations
of m + 1 points. For example, in the plane A2 , the convex hull of a set S of points is the
union of all triangles (interior points included) with vertices in S. In Case (2), the theorem
of Krein and Milman asserts that a convex set that is also compact is the convex hull of its
extremal points (given a convex set S, a point a ∈ S is extremal if S − {a} is also convex,
see Berger [8] or Lang [38]). Next, we prove Carathéodory’s theorem.
Theorem 3.2. (Carathéodory, 1907) Given any affine space E of dimension m, for any
(nonvoid) family S = (ai )i∈L in E, the convex hull conv(S) of S is equal to the set of convex
combinations of families of m + 1 points of S.
Proof. We proceed by contradiction. If the theorem is false, there is some point b ∈ conv(S) such that b can be expressed as a convex combination b = ∑_{i∈I} λ_i a_i, where I ⊆ L is a finite set of cardinality |I| = q with q ≥ m + 2, and b cannot be expressed as any convex combination b = ∑_{j∈J} µ_j a_j of strictly fewer than q points in S, that is, where |J| < q. Such a point b ∈ conv(S) is a convex combination

b = λ_1 a_1 + ··· + λ_q a_q,

where λ_1 + ··· + λ_q = 1 and λ_i > 0 for 1 ≤ i ≤ q. Since q ≥ m + 2, the points a_1, ..., a_q are affinely dependent, so picking any origin O, there are scalars µ_1, ..., µ_q, not all null, such that

∑_{i=1}^q µ_i = 0 and $\sum_{i=1}^q \mu_i \overrightarrow{Oa_i} = 0$.

Consider the set
T = {t ∈ R | λ_i + tµ_i ≥ 0, µ_i ≠ 0, 1 ≤ i ≤ q}.

The set T is nonempty, since it contains 0. Since ∑_{i=1}^q µ_i = 0 and the µ_i are not all null, there are some µ_h, µ_k such that µ_h < 0 and µ_k > 0, which implies that T = [α, β], where

α = max_{1≤i≤q} {−λ_i/µ_i | µ_i > 0} and β = min_{1≤i≤q} {−λ_i/µ_i | µ_i < 0}.

We claim that there is some index j such that

λ_j + αµ_j = 0.
Indeed, since

α = max_{1≤i≤q} {−λ_i/µ_i | µ_i > 0},

as the set on the right-hand side is finite, the maximum is achieved, and there is some index
j so that α = −λ_j/µ_j. If j is some index such that λ_j + αµ_j = 0, since $\sum_{i=1}^q \mu_i \overrightarrow{Oa_i} = 0$, we
have

b = ∑_{i=1}^q λ_i a_i = O + $\sum_{i=1}^q \lambda_i \overrightarrow{Oa_i}$ + 0
  = O + $\sum_{i=1}^q \lambda_i \overrightarrow{Oa_i}$ + α($\sum_{i=1}^q \mu_i \overrightarrow{Oa_i}$)
  = O + $\sum_{i=1}^q (\lambda_i + \alpha\mu_i)\overrightarrow{Oa_i}$
  = ∑_{i=1}^q (λ_i + αµ_i) a_i
  = ∑_{i=1, i≠j}^q (λ_i + αµ_i) a_i,
since λ_j + αµ_j = 0. Since ∑_{i=1}^q µ_i = 0, ∑_{i=1}^q λ_i = 1, and λ_j + αµ_j = 0, we have

∑_{i=1, i≠j}^q (λ_i + αµ_i) = 1,
and since λi + αµi ≥ 0 for i = 1, . . . , q, the above shows that b can be expressed as a convex
combination of q − 1 points from S. However, this contradicts the assumption that b cannot
be expressed as a convex combination of strictly fewer than q points from S, and the theorem
is proved.
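The proof of Theorem 3.2 is effective: each step removes one point from a convex combination that is too long. The following sketch is our own illustration (assuming numpy), implementing one reduction step exactly as in the proof, under the assumption that the given weights are strictly positive:

    import numpy as np

    def caratheodory_step(points, lam):
        """One reduction step from the proof of Theorem 3.2.
        points: (q, m) array of q points in A^m, with q >= m + 2;
        lam: (q,) strictly positive weights summing to 1, with b = lam @ points.
        Returns a representation of b as a convex combination of fewer points."""
        q, m = points.shape
        assert q >= m + 2
        # Affine dependence: mu with sum(mu) = 0 and sum(mu_i * a_i) = 0, i.e.
        # a null vector of the (m+1) x q matrix of homogenized points.
        M = np.vstack([points.T, np.ones(q)])
        mu = np.linalg.svd(M)[2][-1]
        if not (mu > 1e-12).any():
            mu = -mu                          # ensure some mu_i is positive
        pos = mu > 1e-12
        alpha = np.max(-lam[pos] / mu[pos])   # alpha = max{-lam_i/mu_i | mu_i > 0}
        new_lam = lam + alpha * mu            # still >= 0, still sums to 1, same b
        keep = new_lam > 1e-12                # the maximizing index j drops out
        return points[keep], new_lam[keep] / new_lam[keep].sum()

Iterating until at most m + 1 points remain expresses b as a convex combination of at most m + 1 of the original points, as Theorem 3.2 guarantees.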
If S is a finite (or infinite) set of points in the affine plane A², Theorem 3.2 confirms
our intuition that conv(S) is the union of triangles (including interior points) whose vertices
belong to S. Similarly, the convex hull of a set S of points in A3 is the union of tetrahedra
(including interior points) whose vertices belong to S. We get the feeling that triangulations
play a crucial role, which is of course true!
An interesting consequence of Carathéodory’s theorem is the following result:
Proposition 3.3. If K is any compact subset of Am , then the convex hull, conv(K), of K
is also compact. In particular, the convex hull conv(a1 , . . . , ap ) of a finite set of points is
compact, and thus closed (and bounded).
Proof. Consider the set

C = {(λ_0, ..., λ_m) ∈ R^{m+1} | λ_i ≥ 0, λ_0 + ··· + λ_m = 1},

which is compact, and the continuous map f : C × K^{m+1} → A^m defined by

f(λ_0, ..., λ_m, a_0, ..., a_m) = λ_0 a_0 + ··· + λ_m a_m.

By Theorem 3.2,

f(C × K^{m+1}) = conv(K),

and since C × K^{m+1} is compact and the image of a compact set by a continuous function is compact, we conclude that conv(K) is compact.
A closer examination of the proof of Theorem 3.2 reveals that the fact that the µi ’s add
up to zero ensures that T is a closed interval, but all we need is that T be bounded from
below, and this only requires that some µj be strictly positive. As a consequence, we can
prove a version of Theorem 3.2 for convex cones. This is a useful result since cones play such
an important role in convex optimization. Let us recall some basic definitions about cones.
Definition 3.3. Given any vector space E, a subset C ⊆ E is a convex cone iff C is closed
under positive linear combinations, that is, linear combinations of the form

∑_{i∈I} λ_i v_i, with v_i ∈ C and λ_i ≥ 0 for all i ∈ I,

where I has finite support (all λ_i = 0 except for finitely many i ∈ I). Given any set of
vectors S, the positive hull of S, or cone spanned by S, denoted cone(S), is the set of all
positive linear combinations of vectors in S,

cone(S) = {∑_{i∈I} λ_i v_i | v_i ∈ S, λ_i ≥ 0}.
Note that a cone always contains 0. When S consists of a finite number of vectors, the convex cone cone(S) is called a polyhedral cone. We have the following version of Carathéodory's theorem for convex cones:
Theorem 3.4. Given any vector space E of dimension m, for any (nonvoid) family S =
(vi )i∈L of vectors in E, the cone cone(S) spanned by S is equal to the set of positive combi-
nations of families of m vectors in S.
The proof of Theorem 3.4 can be easily adapted from the proof of Theorem 3.2 and is
left as an exercise.
There is an interesting generalization of Carathéodory's theorem known as the Colorful
Carathéodory theorem. This theorem, due to Bárány and proved in 1982, can be used to give
a fairly short proof of a generalization of Helly's theorem known as Tverberg's theorem (see
Section 3.5).
Theorem 3.5. (Colorful Carathéodory theorem) Let E be any affine space of dimension m.
For any point b ∈ E, for any sequence of m + 1 nonempty subsets (S1 , . . . , Sm+1 ) of E, if
b ∈ conv(Si ) for i = 1, . . . , m + 1, then there exists a sequence of m + 1 points (a1 , . . . , am+1 ),
with ai ∈ Si , so that b ∈ conv(a1 , . . . , am+1 ), that is, b is a convex combination of the ai ’s.
Although Theorem 3.5 is not hard to prove, we will not prove it here. Instead, we refer the
reader to Matousek [40], Chapter 8, Section 8.2. There is also a stronger version of Theorem
3.5, in which it is enough to assume that b ∈ conv(Si ∪ Sj ) for all i, j with 1 ≤ i < j ≤ m + 1.
Now that we have given an answer to the first question posed at the end of Section 3.2,
we give an answer to the second question.
Figure 3.3: (a) A separating hyperplane H. (b) Strictly separating hyperplanes H and H′.
Definition 3.4. Given an affine space E and two nonempty subsets A and B of E, we say
that a hyperplane H separates (resp. strictly separates) A and B if A is in one and B is in
the other of the two half–spaces (resp. open half–spaces) determined by H.
In Figure 3.3 (a), the two closed convex sets A and B are unbounded and B has the
hyperplane H for its boundary, while A is asymptotic to H. The hyperplane H is a separating
hyperplane for A and B, but A and B can't be strictly separated. In Figure 3.3 (b), both A
and B are convex and closed, B is unbounded and asymptotic to the hyperplane H′, but A
is bounded. Both hyperplanes H and H′ strictly separate A and B.
The special case of separation where A is convex and B = {a}, for some point, a, in A,
is of particular importance.
Definition 3.5. Let E be an affine space and let A be any nonempty subset of E. A
supporting hyperplane of A is any hyperplane H containing some point a of A, and separating
{a} and A. We say that H is a supporting hyperplane of A at a.
A boundary point a ∈ ∂A has order k(a) if the intersection of all the supporting hyperplanes of A at a is an affine subspace of dimension k(a). A vertex is a boundary point a such that k(a) = 0, i.e., such that there are d independent supporting hyperplanes at a. A d-simplex has boundary points of order 0, 1, ..., d − 1. This phenomenon is illustrated in Figure 3.5. The following proposition is shown in Berger [8] (Proposition 11.6.2):
Proposition 3.6. The set of vertices of a closed and convex subset is countable.
It is fairly obvious that a point a ∈ ∂A is extremal if it does not belong to the interior of
any closed nontrivial line segment [x, y] ⊆ A (x 6= y, a 6= x and a 6= y).
Observe that a vertex is extremal, but the converse is false. For example, in Figure 3.6,
all the points on the arc of parabola, including v1 and v2 , are extreme points. However, only
v1 and v2 are vertices. Also, if dim E ≥ 3, the set of extremal points of a compact convex set
may not be closed. See Berger [8], Chapter 11, Figure 11.6.5.3, which we reproduce in Figure
3.7.
Actually, it is not at all obvious that a nonempty compact convex set possesses extremal
points. In fact, a stronger result holds (Krein and Milman's theorem). In preparation for
the proof of this important theorem, observe that any compact (nontrivial) interval of A1
has two extremal points, its two endpoints. We need the following lemma:
Lemma 3.7. Let E be an affine space of dimension n, and let A be a nonempty compact
and convex set. Then, A = conv(∂A), i.e., A is equal to the convex hull of its boundary.
Proof. Pick any a in A, and consider any line D through a. Then, D∩A is closed and convex.
However, since A is compact, it follows that D ∩ A is a closed interval [u, v] containing a,
and u, v ∈ ∂A. Therefore, a ∈ conv(∂A), as desired.
Figure 3.5: The various types of boundary points for a solid tetrahedron. If the point is in
the interior of a triangular face, it has order 2. If the point is in the interior of an edge, it
has order 1. If the point is a vertex, it has order 0.
Figure 3.7: Let A be the convex set formed by taking a planar unit circle through the origin
and forming the double cone with apexes (0, 0, 1) and (0, 0, −1). The extremal points of A are
the points on the pink circular boundary, minus the origin.
The following important theorem shows that only the extremal points matter when it comes to determining a compact and convex subset from its boundary. The proof of Theorem 3.8 makes
use of a proposition due to Minkowski (Proposition 4.19) which will be proved in Section
4.2.
Theorem 3.8. (Krein and Milman, 1940) Let E be an affine space of dimension n. Every
compact and convex nonempty subset A is equal to the convex hull of its set of extremal
points.
Proof. Denote the set of extremal points of A by Extrem(A). We proceed by induction on
d = dim E. When d = 1, the convex and compact subset A must be a closed interval [u, v], or
a single point. In either case, the theorem holds trivially. Now, assume d ≥ 2, and assume
that the theorem holds for d − 1. Pick any a ∈ ∂A. By Minkowski's proposition (Proposition 4.19), there is some supporting hyperplane H of A at a. The set A ∩ H is a nonempty compact and convex subset of H, an affine space of dimension d − 1, so by the induction hypothesis, a ∈ conv(Extrem(A ∩ H)). It is easily verified that

Extrem(A ∩ H) = (Extrem(A)) ∩ H,

so a ∈ conv(Extrem(A)), and since a ∈ ∂A was arbitrary,

∂A ⊆ conv(Extrem(A)).

By Lemma 3.7, A = conv(∂A), and therefore

A = conv(∂A) ⊆ conv(conv(Extrem(A))) = conv(Extrem(A)) ⊆ A,

that is, A = conv(Extrem(A)).
Remark: Observe that Krein and Milman’s theorem implies that any nonempty compact
and convex set has a nonempty subset of extremal points. This is intuitively obvious, but
hard to prove! Krein and Milman’s theorem also applies to infinite dimensional affine spaces,
provided that they are locally convex, see Valentine [63], Chapter 11, Bourbaki [13], Chapter
II, Barvinok [4], Chapter 3, or Lax [39], Chapter 13.
An important consequence of Krein and Milman's theorem is that every convex function
on a convex and compact set achieves its maximum at some extremal point.
Definition 3.8. Let A be a nonempty convex subset of An . A function f : A → R is convex
if
f ((1 − λ)a + λb) ≤ (1 − λ)f (a) + λf (b)
for all a, b ∈ A and for all λ ∈ [0, 1]. The function f : A → R is strictly convex if

f((1 − λ)a + λb) < (1 − λ)f(a) + λf(b)

for all a, b ∈ A with a ≠ b and for all λ with 0 < λ < 1. A function f : A → R is concave
(resp. strictly concave) iff −f is convex (resp. −f is strictly convex). See Figure 3.8.
Figure 3.8: Figures (a) and (b) are the graphs of real-valued functions. Figure (a) is the
graph of a convex function, since the blue chord l = (1 − λ)f(u) + λf(v) lies above the graph
of f. Figure (b) shows the graph of a function that is not convex.
Proposition 3.9. Let A be a nonempty compact and convex subset of A^n and let f : A → R be a convex and continuous function. Then f achieves its maximum at some extreme point of A.

Proof. Since A is compact and f is continuous, f(A) is a closed interval [m, M] in R, and so
f achieves its minimum m and its maximum M. Say f(c) = M for some c ∈ A. By Krein
and Milman's theorem, c is some convex combination of extreme points of A,
c = ∑_{i=1}^k λ_i a_i,

with ∑_{i=1}^k λ_i = 1, λ_i ≥ 0, and each a_i an extreme point of A. But then, as f is convex,

M = f(c) = f(∑_{i=1}^k λ_i a_i) ≤ ∑_{i=1}^k λ_i f(a_i),

and if we let

f(a_{i_0}) = max_{1≤i≤k} {f(a_i)},

then

M = f(c) ≤ ∑_{i=1}^k λ_i f(a_i) ≤ ∑_{i=1}^k λ_i f(a_{i_0}) = f(a_{i_0}),
since ∑_{i=1}^k λ_i = 1. Since M is the maximum value of the function f over A, we have f(a_{i_0}) ≤ M, and so

M = f(a_{i_0}),

and f achieves its maximum at the extreme point a_{i_0}, as claimed.
Proposition 3.9 plays an important role in convex optimization: It guarantees that the
maximum value of a convex objective function on a compact and convex set is achieved at
some extreme point. Thus, it is enough to look for a maximum at some extreme point of
the domain.
Proposition 3.9 fails for minimal values of a convex function. For example, the function
x ↦ f(x) = x², defined on the compact interval [−1, 1], achieves its minimum at x = 0, which
is not an extreme point of [−1, 1]. However, if f is concave, then f achieves its minimum
value at some extreme point of A. In particular, if f is affine, it achieves its minimum and
its maximum at some extreme points of A.
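As a small numeric illustration of these remarks (our own example, assuming numpy), maximizing the convex function f(x, y) = x² + y² over the square [−1, 1]² only requires scanning the four corners, while its minimum sits at an interior point:

    import numpy as np

    f = lambda p: p @ p  # f(x, y) = x^2 + y^2 is convex
    corners = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]], dtype=float)
    print(max(f(p) for p in corners))  # 2.0: the maximum over [-1,1]^2 (Prop. 3.9)
    # The minimum over the square is f(0, 0) = 0, attained at the interior point
    # (0, 0), which is not an extreme point: the corner scan is valid only for maxima.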
We conclude this chapter with three other classics of convex geometry, beginning with Radon's theorem.

Theorem 3.10. (Radon, 1921) Given any affine space E of dimension m, for every subset X of E, if X has at least m + 2 elements, then there is a partition of X into two nonempty disjoint subsets X₁ and X₂ such that the convex hulls of X₁ and X₂ have a nonempty intersection.

Proof. Pick some origin O in E. Write X = (x_i)_{i∈L} for some index set L (we can let L = X).
Since by assumption |X| ≥ m + 2 where m = dim(E), X is affinely dependent, and by
Lemma 2.6.5 from Gallier [30], there is a family (µk )k∈L (of finite support) of scalars, not all
null, such that

∑_{k∈L} µ_k = 0 and $\sum_{k \in L} \mu_k \overrightarrow{Ox_k} = 0$.
Since ∑_{k∈L} µ_k = 0, the µ_k are not all null, and (µ_k)_{k∈L} has finite support, the sets

I = {i ∈ L | µ_i > 0} and J = {j ∈ L | µ_j < 0}

are nonempty, finite, and obviously disjoint. Let

X₁ = {x_i ∈ X | µ_i > 0} and X₂ = {x_i ∈ X | µ_i ≤ 0}.

Again, since the µ_k are not all null and ∑_{k∈L} µ_k = 0, the sets X₁ and X₂ are nonempty, and
obviously

X₁ ∩ X₂ = ∅ and X₁ ∪ X₂ = X.
Furthermore, the definition of I and J implies that (x_i)_{i∈I} ⊆ X₁ and (x_j)_{j∈J} ⊆ X₂. It
remains to prove that conv(X₁) ∩ conv(X₂) ≠ ∅. The definition of I and J implies that

$\sum_{k \in L} \mu_k \overrightarrow{Ox_k} = 0$

can be written as

$\sum_{i \in I} \mu_i \overrightarrow{Ox_i} + \sum_{j \in J} \mu_j \overrightarrow{Ox_j} = 0,$

that is, as

$\sum_{i \in I} \mu_i \overrightarrow{Ox_i} = \sum_{j \in J} -\mu_j \overrightarrow{Ox_j},$

where

∑_{i∈I} µ_i = ∑_{j∈J} −µ_j = µ,

with

∑_{i∈I} µ_i/µ = ∑_{j∈J} −µ_j/µ = 1,

proving that ∑_{i∈I} (µ_i/µ) x_i ∈ conv(X₁) and ∑_{j∈J} −(µ_j/µ) x_j ∈ conv(X₂) are identical, and
thus that conv(X₁) ∩ conv(X₂) ≠ ∅.
It can be shown that a finite set, X ⊆ E, has a unique Radon partition iff it has m + 2
elements and any m + 1 points of X are affinely independent. For example, there are exactly
two possible cases in the plane as shown in Figure 3.10.
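As a concrete illustration of the two planar cases, take X = {(0,0), (2,2), (2,0), (0,2)}: the partition X₁ = {(0,0), (2,2)}, X₂ = {(2,0), (0,2)} into the two diagonals of a square gives conv(X₁) ∩ conv(X₂) = {(1,1)}. For a point inside a triangle, such as X = {(0,0), (3,0), (0,3), (1,1)}, the partition X₁ = {(1,1)}, X₂ = {(0,0), (3,0), (0,3)} works.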
There is also a version of Radon’s theorem for the class of cones with an apex. Say that
a convex cone, C ⊆ E, has an apex (or is a pointed cone) iff there is some hyperplane, H,
such that C ⊆ H+ and H ∩ C = {0}. For example, the cone obtained as the intersection of
two half spaces in R3 is not pointed since it is a wedge with a line as part of its boundary.
Here is the version of Radon’s theorem for convex cones:
Theorem 3.11. Given any vector space E of dimension m, for every subset X of E, if
cone(X) is a pointed cone such that X has at least m + 1 nonzero vectors, then there is a
partition of X into two nonempty disjoint subsets, X1 and X2 , such that the cones, cone(X1 )
and cone(X2 ), have a nonempty intersection not reduced to {0}.
Tverberg's theorem generalizes Radon's theorem to partitions into r ≥ 2 subsets.

Theorem 3.12. (Tverberg, 1966) Let E be any affine space of dimension m. For every natural number r ≥ 2, for every subset X of E, if X has at least (m + 1)(r − 1) + 1 elements, then there is a partition of X into r nonempty pairwise disjoint subsets X₁, ..., X_r so that $\bigcap_{i=1}^{r} \mathrm{conv}(X_i) \ne \emptyset$.

A partition as in Theorem 3.12 is called a Tverberg partition, and a point in $\bigcap_{i=1}^{r} \mathrm{conv}(X_i)$
is called a Tverberg point. Theorem 3.12 was conjectured by Birch and proved by Tverberg
in 1966. Tverberg's original proof was technically quite complicated. Tverberg then gave a
simpler proof in 1981, and other simpler proofs were later given, notably by Sarkaria (1992)
and Onn (1997), using the Colorful Carathéodory theorem. A proof along those lines can be
found in Matousek [40], Chapter 8, Section 8.3. A colored Tverberg theorem and more can
also be found in Matousek [40] (Section 8.3).
Next, we prove a version of Helly’s theorem.
Theorem 3.13. (Helly, 1913) Given any affine space E of dimension m, for every family
{K₁, ..., Kₙ} of n convex subsets of E, if n ≥ m + 2 and the intersection $\bigcap_{i \in I} K_i$ of any
m + 1 of the K_i is nonempty (where I ⊆ {1, ..., n}, |I| = m + 1), then $\bigcap_{i=1}^{n} K_i$ is nonempty.
Proof. The proof is by induction on n ≥ m + 1 and uses Radon's theorem in the induction
step. For n = m + 1, the assumption of the theorem is that the intersection of any family of
m + 1 of the K_i's is nonempty, and the theorem holds trivially. Next, let L = {1, 2, ..., n+1},
where n + 1 ≥ m + 2. By the induction hypothesis, $C_i = \bigcap_{j \in (L - \{i\})} K_j$ is nonempty for every
i ∈ L.

We claim that C_i ∩ C_j ≠ ∅ for some i ≠ j. If so, as $C_i \cap C_j = \bigcap_{k=1}^{n+1} K_k$, we are done. So,
let us assume that the C_i's are pairwise disjoint. Then, we can pick a set X = {a₁, ..., a_{n+1}}
such that a_i ∈ C_i for every i ∈ L. By Radon's theorem, there are two nonempty disjoint
sets X₁, X₂ ⊆ X such that X = X₁ ∪ X₂ and conv(X₁) ∩ conv(X₂) ≠ ∅. However, X₁ ⊆ K_j
for every j with a_j ∉ X₁. This is because a_j ∉ K_j for every j (if a_j ∈ K_j, then a_j would belong to every C_i, contradicting the pairwise disjointness of the C_i's), and so, we get

$X_1 \subseteq \bigcap_{a_j \notin X_1} K_j.$

Symmetrically, $X_2 \subseteq \bigcap_{a_j \notin X_2} K_j$. Since the K_j are convex, the same inclusions hold for conv(X₁) and conv(X₂), and since every index j satisfies a_j ∉ X₁ or a_j ∉ X₂ (as X₁ and X₂ are disjoint),
it follows that conv(X₁) ∩ conv(X₂) ⊆ $\bigcap_{i=1}^{n+1} K_i$, so that $\bigcap_{i=1}^{n+1} K_i$ is nonempty, contradicting
the fact that C_i ∩ C_j = ∅ for all i ≠ j.
Here is a typical application. Consider a family of parallel (say vertical) line segments S₁, ..., Sₙ in the affine plane (n ≥ 3), and for each segment S, let C_S = {(α, β) ∈ R² | the line y = αx + β meets S}. It is not hard to see that C_S is convex. Then, the hypothesis that any three line
segments S_i, S_j, S_k meet a common line means that C_{S_i} ∩ C_{S_j} ∩ C_{S_k} ≠ ∅, and Helly's theorem (applied in the parameter plane of the pairs (α, β)) implies
that the family of all the convex sets C_{S_i} has a nonempty intersection, which means that
there is a line meeting all the line segments S_i. This situation for four segments is illustrated in
Figure 3.11.
We conclude this chapter with a nice application of Helly’s Theorem to the existence
of centerpoints. Centerpoints generalize the notion of median to higher dimensions. Recall
Figure 3.11: The four pink line segments in the affine plane all intersect the horizontal red
line.
that if we have a set of n data points, S = {a₁, ..., aₙ}, on the real line, a median for S is
a point x such that both intervals [x, ∞) and (−∞, x] contain at least n/2 of the points in
S (by n/2, we mean the smallest integer greater than or equal to n/2, i.e., the ceiling ⌈n/2⌉).
Given any hyperplane H, recall that the closed half-spaces determined by H are denoted
H₊ and H₋ and that H ⊆ H₊ and H ⊆ H₋. We let $\overset{\circ}{H}_+ = H_+ - H$ and $\overset{\circ}{H}_- = H_- - H$ be
the open half-spaces determined by H.
Definition 3.9. Let S = {a₁, ..., aₙ} be a set of n points in A^d. A point c ∈ A^d is a
centerpoint of S iff for every hyperplane H, whenever the closed half-space H₊ (resp. H₋)
contains c, then H₊ (resp. H₋) contains at least n/(d+1) points from S (by n/(d+1), we mean the
smallest integer greater than or equal to n/(d+1), namely the ceiling ⌈n/(d+1)⌉).

So, for d = 2, for each line D, if the closed half-plane D₊ (resp. D₋) contains c, then
D₊ (resp. D₋) contains at least a third of the points from S. For d = 3, for each plane H,
if the closed half-space H₊ (resp. H₋) contains c, then H₊ (resp. H₋) contains at least a
fourth of the points from S, etc. Figure 3.12 shows nine points in the plane and one of
their centerpoints (in red). This example shows that the bound 1/3 is tight.
Observe that a point c ∈ A^d is a centerpoint of S iff c belongs to every open half-space
$\overset{\circ}{H}_+$ (resp. $\overset{\circ}{H}_-$) containing at least dn/(d+1) + 1 points from S (again, we mean ⌈dn/(d+1)⌉ + 1).

Indeed, if c is a centerpoint of S and H is any hyperplane such that $\overset{\circ}{H}_+$ (resp. $\overset{\circ}{H}_-$)
contains at least dn/(d+1) + 1 points from S, then $\overset{\circ}{H}_+$ (resp. $\overset{\circ}{H}_-$) must contain c, as otherwise
the closed half-space H₋ (resp. H₊) would contain c and at most n − dn/(d+1) − 1 = n/(d+1) − 1
points from S, a contradiction. Conversely, assume that c belongs to every open half-space
$\overset{\circ}{H}_+$ (resp. $\overset{\circ}{H}_-$) containing at least dn/(d+1) + 1 points from S. Then, for any hyperplane H,
if c ∈ H₊ (resp. c ∈ H₋) but H₊ contains at most n/(d+1) − 1 points from S, then the open
half-space $\overset{\circ}{H}_-$ (resp. $\overset{\circ}{H}_+$) would contain at least n − n/(d+1) + 1 = dn/(d+1) + 1 points from S but
not c, a contradiction.
We are now ready to prove the existence of centerpoints.
Theorem 3.14. (Existence of Centerpoints) Every finite set, S = {a1 , . . . , an }, of n points
in Ad has some centerpoint.
Proof. We will use the second characterization of centerpoints, involving open half-spaces
containing at least dn/(d+1) + 1 points.

Consider the family of sets

C = {conv(S ∩ $\overset{\circ}{H}_+$) | (∃H) |S ∩ $\overset{\circ}{H}_+$| > dn/(d+1)} ∪ {conv(S ∩ $\overset{\circ}{H}_-$) | (∃H) |S ∩ $\overset{\circ}{H}_-$| > dn/(d+1)},

where H is a hyperplane.

As S is finite, C consists of a finite number of convex sets, say {C₁, ..., C_m}. If we prove
that $\bigcap_{i=1}^{m} C_i \ne \emptyset$ we are done, because $\bigcap_{i=1}^{m} C_i$ is the set of centerpoints of S.

First, we prove by induction on k (with 1 ≤ k ≤ d + 1) that any intersection of k of the
C_i's has at least ((d+1−k)n)/(d+1) + k elements from S. For k = 1, this holds by definition of the C_i's.
Next, consider the intersection of k + 1 ≤ d + 1 of the C_i's, say C_{i₁} ∩ ··· ∩ C_{i_k} ∩ C_{i_{k+1}}. Let

A = S ∩ (C_{i₁} ∩ ··· ∩ C_{i_k} ∩ C_{i_{k+1}}), B = S ∩ (C_{i₁} ∩ ··· ∩ C_{i_k}), C = S ∩ C_{i_{k+1}}.

Note that A = B ∩ C. By the induction hypothesis, B contains at least ((d+1−k)n)/(d+1) + k elements
from S. As C contains at least dn/(d+1) + 1 points from S, and as |B ∪ C| ≤ n, we have
|A| = |B| + |C| − |B ∪ C| ≥ |B| + |C| − n. It follows that

|A| ≥ ((d+1−k)n)/(d+1) + k + dn/(d+1) + 1 − n,

that is,

|A| ≥ ((d+1−(k+1))n)/(d+1) + (k + 1),

establishing the induction step. Applying this with k = d + 1, any intersection of d + 1 of the C_i's contains at least d + 1 ≥ 1 points of S, so any d + 1 of the C_i's have a nonempty intersection; by Helly's theorem (Theorem 3.13), or directly if m ≤ d + 1, the intersection $\bigcap_{i=1}^{m} C_i$ is nonempty, which concludes the proof.
Remark: The above proof actually shows that the set of centerpoints of S is a convex set.
In fact, it is a finite intersection of convex hulls of finitely many points, so it is the convex hull
of finitely many points, in other words, a polytope. It should also be noted that Theorem
3.14 can be proved easily using Tverberg’s theorem (Theorem 3.12). Indeed, for a judicious
choice of r, any Tverberg point is a centerpoint!
Jadhav and Mukhopadhyay have given a linear-time algorithm for computing a centerpoint of a finite set of points in the plane. For d ≥ 3, it appears that the best that can
be done (using linear programming) is O(n^d). However, there are good approximation algorithms (Clarkson, Eppstein, Miller, Sturtivant and Teng), and in E³ there is a near-quadratic
algorithm (Agarwal, Sharir and Welzl). Recently, Miller and Sheehy (2009) have given an
algorithm for finding an approximate centerpoint in sub-exponential time, together with a
polynomially-checkable proof of the approximation guarantee.
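To make the definition concrete, here is a small brute-force sketch of our own (assuming numpy, and approximating by sampling directions) that estimates the halfspace (Tukey) depth of a candidate point in the plane; c is a centerpoint exactly when its depth is at least ⌈n/3⌉:

    import numpy as np

    def tukey_depth(c, S, n_dirs=3600):
        """Approximate halfspace depth of c w.r.t. the finite set S in the
        plane: the minimum, over the sampled directions u, of the number of
        points of S in the closed half-plane {x | u . (x - c) >= 0}."""
        thetas = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
        dirs = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
        dots = (S - c) @ dirs.T            # shape (len(S), n_dirs)
        return int((dots >= -1e-12).sum(axis=0).min())

    S = np.random.default_rng(0).standard_normal((30, 2))
    c = np.median(S, axis=0)               # a natural candidate point
    print(tukey_depth(c, S), int(np.ceil(len(S) / 3)))

The coordinate-wise median is only a heuristic candidate; Theorem 3.14 guarantees that some point of depth at least ⌈n/3⌉ exists.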
Chapter 4

Two Main Tools: Separation and Polar Duality
Our first lemma (Lemma 4.1) is intuitively quite obvious so the reader might be puzzled by
the length of its proof. However, after proposing several wrong proofs, we realized that its
proof is more subtle than it might appear. The proof below is due to Valentine [63]. See if
you can find a shorter (and correct) proof!
Lemma 4.1. Let S be a nonempty convex set and let x ∈ $\overset{\circ}{S}$ and y ∈ $\overline{S}$. Then, we have
]x, y[ ⊆ $\overset{\circ}{S}$.
Proof. Let z ∈ ]x, y[, that is, z = (1 − λ)x + λy, with 0 < λ < 1. Since x ∈ $\overset{\circ}{S}$, we can find
some open subset U contained in S so that x ∈ U. It is easy to check that the central
magnification of center z and ratio (λ − 1)/λ, $H_{z,(\lambda-1)/\lambda}$, maps x to y. Then $V = H_{z,(\lambda-1)/\lambda}(U)$ is an open subset containing y.
Corollary 4.2. If S is convex, then $\overset{\circ}{S}$ is also convex, and we have $\overset{\circ}{\overline{S}} = \overset{\circ}{S}$. Furthermore, if
$\overset{\circ}{S} \ne \emptyset$, then $\overline{\overset{\circ}{S}} = \overline{S}$.
Beware that if S is a closed set, then the convex hull conv(S) of S is not necessarily
closed!
For example, consider the subset S of A2 consisting of the points belonging to the right
branch of the hyperbola of equation x² − y² = 1, that is,

S = {(x, y) ∈ R² | x² − y² ≥ 1, x ≥ 0}.

Then S is convex, but the convex hull of the set S ∪ {(0, 0)} is not closed.
However, if S is compact, then conv(S) is also compact, and thus closed (see Proposition
3.3).
There is a simple criterion to test whether a convex set has an empty interior, based on
the notion of dimension of a convex set (recall that the dimension of a nonempty convex
subset is the dimension of its affine hull).
Proposition 4.3. A nonempty convex set S has a nonempty interior iff dim S = dim X.
Proof. Let d = dim X. First, assume that $\overset{\circ}{S} \ne \emptyset$. Then, S contains some open ball of center
a₀, and in it, we can find a frame (a₀, a₁, ..., a_d) for X. Thus, dim S = dim X. Conversely,
let (a₀, a₁, ..., a_d) be a frame of X, with a_i ∈ S, for i = 0, ..., d. Then, we have

(a₀ + ··· + a_d)/(d + 1) ∈ $\overset{\circ}{S}$,

and $\overset{\circ}{S}$ is nonempty.
Proposition 4.3 is false in infinite dimension.
One can also easily prove that convexity is preserved under direct image and inverse
image by an affine map.
The next lemma, which seems intuitively obvious, is the core of the proof of the Hahn-
Banach theorem. This is the case where the affine space has dimension two. First, we need
to define what a convex cone with vertex x is.
Definition 4.1. A convex set, C, is a convex cone with vertex x if C is invariant under all
central magnifications, Hx,λ , of center x and ratio λ, with λ > 0 (i.e., Hx,λ (C) = C). See
Figure 4.2.
Figure 4.2: For the dark pink disk C, Hx,λ (C) is the triangular section, excluding O, between
the two pink lines.
Lemma 4.5. Let B be a nonempty open and convex subset of A2 , and let O be a point of
A² so that O ∉ B. Then, there is some line L through O so that L ∩ B = ∅.
Proof. Define the convex cone C = coneO (B). As B is open, it is easy to check that each
H_{O,λ}(B) is open, and since C is the union of the H_{O,λ}(B) (for λ > 0), which are open, C
itself is open. Also, O ∉ C. We claim that at least one point x of the boundary ∂C of C
is distinct from O. Otherwise, ∂C = {O} and we claim that C = A² − {O}, which is not
convex, a contradiction. Indeed, as C is convex it is connected, A² − {O} itself is connected
and C ⊆ A2 − {O}. If C 6= A2 − {O}, pick some point a 6= O in A2 − C and some point
c ∈ C. Now, a basic property of connectivity asserts that every continuous path from a (in
the exterior of C) to c (in the interior of C) must intersect the boundary of C, namely, {O}.
However, there are plenty of paths from a to c that avoid O, a contradiction. Therefore,
C = A2 − {O}.
Since C is open and x ∈ ∂C, we have x ∉ C. Furthermore, we claim that y = 2O − x (the
symmetric of x w.r.t. O) does not belong to C either. Otherwise, we would have y ∈ $\overset{\circ}{C}$ = C
and x ∈ $\overline{C}$, and by Lemma 4.1, we would get O ∈ C, a contradiction. Therefore, the line L
through O and x misses C entirely (since C is a cone), and since B ⊆ C, we get L ∩ B = ∅.
Theorem 4.6. (Hahn-Banach theorem, geometric form) Let X be a (finite-dimensional) affine space, A be a nonempty open and convex subset of X, and L be an affine subspace of X so that A ∩ L = ∅. Then, there is some hyperplane H containing L such that H ∩ A = ∅.
Proof. The case where dim X = 1 is trivial. Thus, we may assume that dim X ≥ 2. We
reduce the proof to the case where dim X = 2. Let V be an affine subspace of X of maximal
dimension containing L and so that V ∩ A = ∅. Pick an origin O ∈ L in X, and consider the
vector space X_O. We would like to prove that V is a hyperplane, i.e., dim V = dim X − 1. We
proceed by contradiction. Thus, assume that dim V ≤ dim X − 2. In this case, the quotient
space X/V has dimension at least 2. We also know that X/V is isomorphic to the orthogonal
complement, V ⊥ , of V so we may identify X/V and V ⊥ . The (orthogonal) projection map,
π : X → V ⊥ , is linear, continuous, and we can show that π maps the open subset A to an
open subset π(A), which is also convex (one way to prove that π(A) is open is to observe that
for any point, a ∈ A, a small open ball of center a contained in A is projected by π to an open
ball contained in π(A) and as π is surjective, π(A) is open). Furthermore, O ∈ / π(A). Since
⊥
V has dimension at least 2, there is some plane P (a subspace of dimension 2) intersecting
π(A), and thus we obtain a nonempty open and convex subset B = π(A) ∩ P in the plane
P ≅ A². So, we can apply Lemma 4.5 to B and the point O = 0 in P ≅ A² to find a line
l (in P) through O with l ∩ B = ∅. But then, l ∩ π(A) = ∅ and W = π⁻¹(l) is an affine
subspace such that W ∩ A = ∅ and W properly contains V , contradicting the maximality of
V . See Figure 4.5.
Remark: The geometric form of the Hahn-Banach theorem also holds when the dimension
of X is infinite but a slightly more sophisticated proof is required. Actually, all that is needed
is to prove that a maximal affine subspace containing L and disjoint from A exists. This can
be done using Zorn’s lemma. For other proofs, see Bourbaki [13], Chapter 2, Valentine [63],
Chapter 2, Barvinok [4], Chapter 2, or Lax [39], Chapter 3.
Theorem 4.6 is false if we omit the assumption that A is open.
For a counter-example, let A ⊆ A2 be the union of the half space y < 0 with the closed
Figure 4.5: An illustration of the proof of Theorem 4.6. Let X = A³, let A be the open spherical ball,
and let L be the vertical purple line. The blue hyperplane, which strictly separates A from L, is
constructed using V⊥ and l.
segment [0, 1] on the x-axis and let L be the point (2, 0) on the boundary of A. It is also
false if A is closed as shown by the following counter-example.
In E3 , consider the closed convex set (cone) A defined by the inequalities
x ≥ 0, y ≥ 0, z ≥ 0, z 2 ≤ xy,
and let D be the line given by x = 0, z = 1. Then D ∩ A = ∅, both A and D are convex and
closed, yet every plane containing D meets A.
Theorem 4.6 has many important corollaries. For example, we will eventually prove that
for any two nonempty disjoint convex sets, A and B, there is a hyperplane separating A and
B, but this will take some work (recall the definition of a separating hyperplane given in
Definition 3.4). We begin with the following version of the Hahn-Banach theorem:
Theorem 4.7. (Hahn-Banach, second version) Let X be a (finite-dimensional) affine space,
A be a nonempty convex subset of X with nonempty interior and L be an affine subspace of
X so that A ∩ L = ∅. Then, there is some hyperplane, H, containing L and separating L
and A.
Proof. Since A is convex, by Corollary 4.2, $\overset{\circ}{A}$ is also convex. By hypothesis, $\overset{\circ}{A}$ is nonempty.
So, we can apply Theorem 4.6 to the nonempty open and convex set $\overset{\circ}{A}$ and to the affine subspace
L. We get a hyperplane H containing L such that $\overset{\circ}{A} \cap H = \emptyset$. However, $A \subseteq \overline{A} = \overline{\overset{\circ}{A}}$, and $\overline{\overset{\circ}{A}}$
is contained in the closed half-space (H₊ or H₋) containing $\overset{\circ}{A}$, so H separates A and L.
Corollary 4.8. Given an affine space X, let A and B be two nonempty disjoint convex
subsets, and assume that A has nonempty interior ($\overset{\circ}{A} \ne \emptyset$). Then, there is a hyperplane
separating A and B.
Proof. Pick some origin O and consider the vector space XO . Define C = A − B (a special
case of the Minkowski sum) as follows:
$A - B = \{a - b \mid a \in A,\ b \in B\} = \bigcup_{b \in B} (A - b).$

It is easily verified that C = A − B is convex and has nonempty interior (as a union of subsets
having a nonempty interior). Furthermore O ∉ C, since A ∩ B = ∅.¹ (Note that the definition
depends on the choice of O, but this has no effect on the proof.) Since $\overset{\circ}{C}$ is nonempty, we
can apply Theorem 4.7 to C and to the affine subspace {O}, and we get a hyperplane, H,
¹ Readers who prefer a purely affine argument may define C = A − B as the affine subset

A − B = {O + a − b | a ∈ A, b ∈ B}.

Again, O ∉ C and C is convex. We can pick the affine form f defining a separating hyperplane H of C
and {O} so that f(O + a − b) ≤ f(O) for all a ∈ A and all b ∈ B, i.e., f(a) ≤ f(b).
separating C and {O}. Let f be any linear form defining the hyperplane H. We may assume
that f (a − b) ≤ 0, for all a ∈ A and all b ∈ B, i.e., f (a) ≤ f (b). Consequently, if we let
α = sup{f (a) | a ∈ A} (which makes sense, since the set {f (a) | a ∈ A} is bounded), we have
f (a) ≤ α for all a ∈ A and f (b) ≥ α for all b ∈ B, which shows that the affine hyperplane
defined by f − α separates A and B.
Remark: Theorem 4.7 and Corollary 4.8 also hold in the infinite dimensional case, see Lax
[39], Chapter 3, or Barvinok [4], Chapter 3.
Since a hyperplane, H, separating A and B as in Corollary 4.8 is the boundary of each
of the two half–spaces that it determines, we also obtain the following corollary:
Corollary 4.9. Given an affine space, X, let A and B be two nonempty disjoint open and
convex subsets. Then, there is a hyperplane strictly separating A and B.
Beware that Corollary 4.9 fails for closed convex sets.
However, Corollary 4.9 holds if we also assume that A (or B) is compact, as shown in
Corollary 4.10.
We need to review the notion of distance from a point to a subset. Let X be a metric
space with distance function, d. Given any point, a ∈ X, and any nonempty subset, B, of
X, we let
d(a, B) = $\inf_{b \in B}$ d(a, b).

In what follows, X will be Eⁿ with the distance induced by its Euclidean structure. We have the following important property: for any nonempty closed
subset S ⊆ X (not necessarily convex) and any point a ∈ X, there is some point s ∈ S
"achieving the distance from a to S," i.e., so that

d(a, S) = d(a, s).
The proof uses the fact that the distance function is continuous and that a continuous
function attains its minimum on a compact set, and is left as an exercise.
Corollary 4.10. Given an affine space, X, let A and B be two nonempty disjoint closed
and convex subsets, with A compact. Then, there is a hyperplane strictly separating A and
B.
Proof. For any ε > 0, consider the sets A + B(O, ε) and B + B(O, ε), where B(a, ε) denotes the open ball B(a, ε) = {x ∈ X | d(a, x) < ε} of center a and radius
ε > 0. Note that

$A + B(O, \varepsilon) = \bigcup_{a \in A} B(a, \varepsilon),$

which shows that A + B(O, ε) is open; furthermore, it is easy to see that if A is convex, then
A + B(O, ε) is also convex. Now, the function a ↦ d(a, B) (where a ∈ A) is continuous, and
since A is compact, it achieves its minimum, d(A, B) = min_{a∈A} d(a, B), at some point a of A.
Say d(A, B) = δ. Since B is closed, there is some b ∈ B so that d(A, B) = d(a, B) = d(a, b),
and since A ∩ B = ∅, we must have δ > 0. Thus, if we pick ε < δ/2, we see that

(A + B(O, ε)) ∩ (B + B(O, ε)) = ∅.

Now, A + B(O, ε) and B + B(O, ε) are open, convex, and disjoint, and we conclude by applying
Corollary 4.9.
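For two finite point sets (whose convex hulls are compact polytopes), a strictly separating hyperplane as in Corollary 4.10 can be computed by linear programming. The following sketch is our own illustration, assuming scipy; any strict separator can be rescaled to satisfy the unit margins used below, so the program is feasible exactly when the two hulls are disjoint:

    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # vertices of the first hull
    B = np.array([[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])  # vertices of the second hull

    # Find (w, b) with w.a <= b - 1 on A and w.y >= b + 1 on B, written in the
    # form  M @ [w; b] <= rhs  expected by linprog (zero objective: feasibility).
    M = np.vstack([np.hstack([A, -np.ones((len(A), 1))]),
                   np.hstack([-B, np.ones((len(B), 1))])])
    rhs = -np.ones(len(A) + len(B))
    res = linprog(c=np.zeros(3), A_ub=M, b_ub=rhs, bounds=[(None, None)] * 3)
    w, b = res.x[:2], res.x[2]
    print(res.success, w, b)  # the hyperplane w . x = b strictly separates A and B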
Finally, we have the separation theorem announced earlier for arbitrary nonempty convex
subsets.
Theorem 4.11. (Separation of disjoint convex sets) Given an affine space, X, let A and B
be two nonempty disjoint convex subsets. Then, there is a hyperplane separating A and B.
Proof. The proof is by descending induction on dim A. If dim A = dim X, we know from
Proposition 4.3 that A has nonempty interior and we conclude using Corollary 4.8. Next,
assume that the induction hypothesis holds if dim A ≥ n and assume dim A = n − 1. Pick
an origin O ∈ A and let H be a hyperplane containing A. Pick x ∈ X outside H and define
C = conv(A ∪ {A + x}) where A + x = {a + x | a ∈ A} and D = conv(A ∪ {A − x}) where
A − x = {a − x | a ∈ A}. Note that C ∪ D is convex. If B ∩ C ≠ ∅ and B ∩ D ≠ ∅, then
the convexity of B and C ∪ D implies that A ∩ B ≠ ∅, a contradiction. Without loss of
generality, assume that B ∩ C = ∅. Since x is outside H, we have dim C = n and by the
induction hypothesis, there is a hyperplane, H1 separating C and B. As A ⊆ C, we see that
H1 also separates A and B.
Remarks:
(1) The reader should compare this proof (from Valentine [63], Chapter II) with Berger’s
proof using compactness of the projective space Pd , see Berger [8] (Corollary 11.4.7).
(2) Rather than using the Hahn-Banach theorem to deduce separation results, one may
proceed differently and use the following intuitively obvious lemma, as in Valentine
[63] (Theorem 2.4):
Lemma 4.12. If A and B are two nonempty convex sets such that A ∪ B = X and
A ∩ B = ∅, then V = $\overline{A} \cap \overline{B}$ is a hyperplane.
One can then deduce Corollary 4.8 and Theorem 4.11. Yet another approach is
followed in Barvinok [4].
(3) How can some of the above results be generalized to infinite dimensional affine spaces,
especially Theorem 4.6 and Corollary 4.8? One approach is to simultaneously relax
the notion of interior and tighten a little the notion of closure, in a more “linear and
less topological” fashion, as in Valentine [63].
(4) Yet another approach is to define the notion of an algebraically open convex set, as
in Barvinok [4]. A convex set, A, is algebraically open iff the intersection of A with
every line, L, is an open interval, possibly empty or infinite at either end (or all of
L). An open convex set is algebraically open. Then, the Hahn-Banach theorem holds
provided that A is an algebraically open convex set and similarly, Corollary 4.8 also
holds provided A is algebraically open. For details, see Barvinok [4], Chapter 2 and 3.
We do not know how the notion “algebraically open” relates to the concept of core.
(5) Theorems 4.6, 4.7 and Corollary 4.8 are proved in Lax [39] using the notion of gauge
function in the more general case where A has some core point (but beware that Lax
uses the terminology interior point instead of core point!).
An important special case of separation is the case where A is convex and B = {a}, for
some point, a, in A.
A “cute” application of Corollary 4.10 is one of the many versions of “Farkas Lemma”
(1893-1894, 1902), a basic result in the theory of linear programming. For any vector,
x = (x1 , . . . , xn ) ∈ Rn , and any real, α ∈ R, write x ≥ α iff xi ≥ α, for i = 1, . . . , n.
The proof of Farkas Lemma Version I (Proposition 4.14) relies on the fact that a poly-
hedral cone cone(a1 , . . . , am ) is closed. Although it seems obvious that a polyhedral cone
should be closed, a rigorous proof is not entirely trivial.
Indeed, the fact that a polyhedral cone is closed relies crucially on the fact that C is
spanned by a finite number of vectors, because the cone generated by an infinite set may
not be closed. For example, consider the closed disk D ⊆ R2 of center (0, 1) and radius 1,
which is tangent to the x-axis at the origin. Then cone(D) consists of the open upper
half-plane plus the origin (0, 0), but this set is not closed.
The proof relies on two facts:

1. A primitive cone is closed, where a primitive cone is a polyhedral cone spanned by linearly independent vectors.

2. A polyhedral cone C is the union of finitely many primitive cones.
Assume that (a₁, ..., a_m) are linearly independent vectors in Rⁿ, and consider any sequence (x^{(k)})_{k≥0},

$x^{(k)} = \sum_{i=1}^{m} \lambda_i^{(k)} a_i,$

of vectors in the primitive cone cone({a₁, ..., a_m}), which means that $\lambda_i^{(k)} \ge 0$ for i =
1, ..., m and all k ≥ 0. The vectors x^{(k)} belong to the subspace U spanned by (a₁, ..., a_m),
and U is closed. Assume that the sequence (x^{(k)})_{k≥0} converges to a limit x ∈ Rⁿ. Since U
is closed and x^{(k)} ∈ U for all k ≥ 0, we have x ∈ U. If we write x = x₁a₁ + ··· + x_m a_m, we
would like to prove that x_i ≥ 0 for i = 1, ..., m. The sequence (x^{(k)})_{k≥0} converges to x
iff

$\lim_{k \to \infty} \|x^{(k)} - x\| = 0,$

iff

$\lim_{k \to \infty} \Big(\sum_{i=1}^{m} |\lambda_i^{(k)} - x_i|^2\Big)^{1/2} = 0,$

iff

$\lim_{k \to \infty} \lambda_i^{(k)} = x_i, \quad i = 1, \ldots, m.$

Since $\lambda_i^{(k)} \ge 0$ for i = 1, ..., m and all k ≥ 0, we have x_i ≥ 0 for i = 1, ..., m, so
x ∈ cone({a₁, ..., a_m}).
Next, assume that x belongs to the polyhedral cone C. Consider a positive combination

x = λ₁a₁ + ··· + λ_k a_k, (∗₁)

with λ_i ≥ 0.

In Farkas Lemma, Version II, we look for a nonnegative solution x of the system Ax = z, that is, for an expression of z as a positive combination of the columns of A; compared with Version I, we are dropping the condition λ₁ + ··· + λₙ = 1. For this version of Farkas Lemma
we need the following separation lemma:
Proposition 4.15. Let C ⊆ E^d be any closed convex cone with vertex O. Then, for every
point a not in C, there is a hyperplane H passing through O separating a and C, with a ∉ H.
Proof. Since C is closed and convex and {a} is compact and convex, by Corollary 4.10, there
is a hyperplane H′ strictly separating a and C. Let H be the hyperplane through O parallel
to H′. Since C and a lie in the two disjoint open half-spaces determined by H′, the point a
cannot belong to H. Suppose that some point b ∈ C lies in the open half-space determined
by H and containing a. Then, the line L through O and b intersects H′ in some point c, and as C
is a cone, the half-line determined by O and b is contained in C. So, c ∈ C would belong
to H′, a contradiction. Therefore, C is contained in the closed half-space determined by H
that does not contain a, as claimed.
Lemma 4.16. (Farkas Lemma, Version II) Given any d × n real matrix A and any vector
z ∈ R^d, exactly one of the following alternatives occurs:

(a) The linear system Ax = z has a solution x = (x₁, ..., xₙ) with x ≥ 0, or

(b) there is some c ∈ R^d such that c^⊤z < 0 and c^⊤A ≥ 0.

Proof. The proof is analogous to the proof of Proposition 4.14, except that it uses Proposition
4.15 instead of Corollary 4.10, in the two cases z ∈ cone(A¹, ..., Aⁿ) and z ∉ cone(A¹, ..., Aⁿ).
One can show that Farkas II implies Farkas I. Here is another version of Farkas Lemma,
having to do with a system of inequalities Ax ≤ z. Although this version may seem weaker
than Farkas II, it is actually equivalent to it!
Lemma 4.17. (Farkas Lemma, Version III) Given any d × n real matrix A and any vector
z ∈ R^d, exactly one of the following alternatives occurs:

(a) The system of inequalities Ax ≤ z has a solution x, or

(b) there is some c ∈ R^d such that c ≥ 0, c^⊤z < 0 and c^⊤A = 0.
These versions of Farkas Lemma are statements of the form (P ∨ Q) ∧ ¬(P ∧ Q), which
is easily seen to be equivalent to ¬P ≡ Q, namely, the logical equivalence of ¬P and
Q. Therefore, Farkas-type lemmas can be interpreted as criteria for the unsolvability of
various kinds of systems of linear equations or systems of linear inequalities, in the form of
a separation property.
For example, Farkas II (Lemma 4.16) says that a system of linear equations, Ax = z,
does not have any solution x ≥ 0 iff there is some c ∈ R^d such that c^⊤z < 0 and c^⊤A ≥ 0.
This means that there is a hyperplane H of equation c^⊤y = 0 such that the column
vectors A^j forming the matrix A all lie in the positive closed half-space H₊, but z lies in
the interior of the other half-space H₋ determined by H. Therefore, z can't be in the cone
spanned by the A^j's.
Farkas III says that a system of linear inequalities Ax ≤ z does not have any solution
(at all) iff there is some c ∈ R^d such that c ≥ 0, c^⊤z < 0 and c^⊤A = 0. This time, there
is also a hyperplane of equation c^⊤y = 0, with c ≥ 0, such that the column vectors A^j
forming the matrix A all lie in H, but z lies in the interior of the half-space H₋ determined
by H. In the "easy" direction, if there is such a vector c and some x satisfying Ax ≤ z, since
c ≥ 0, we get c^⊤Ax ≤ c^⊤z; but c^⊤Ax = 0 and c^⊤z < 0, a contradiction.
What is the criterion for the unsolvability of a system of inequalities Ax ≤ z with x ≥ 0?
This problem is equivalent to the unsolvability of the set of inequalities

$\begin{pmatrix} A \\ -I \end{pmatrix} x \le \begin{pmatrix} z \\ 0 \end{pmatrix},$

and by Farkas III, this system has no solution iff there is some vector (c₁, c₂), with (c₁, c₂) ≥ 0,
such that

$(c_1^\top, c_2^\top) \begin{pmatrix} A \\ -I \end{pmatrix} = 0$ and $(c_1^\top, c_2^\top) \begin{pmatrix} z \\ 0 \end{pmatrix} < 0.$

The above conditions are equivalent to c₁ ≥ 0, c₂ ≥ 0, $c_1^\top A - c_2^\top = 0$ and $c_1^\top z < 0$, which
reduce to c₁ ≥ 0, $c_1^\top A \ge 0$ and $c_1^\top z < 0$.
We can put all these versions together to prove the following version of Farkas lemma:
Lemma 4.18. (Farkas Lemma, Version IIIb) For any d × n real matrix A and any vector
z ∈ R^d, the following statements are equivalent:

(1) The system Ax = z has no solution x ≥ 0 iff there is some c ∈ R^d such that c^⊤A ≥ 0
and c^⊤z < 0.

(2) The system Ax ≤ z has no solution iff there is some c ∈ R^d such that c ≥ 0, c^⊤A = 0
and c^⊤z < 0.

(3) The system Ax ≤ z has no solution x ≥ 0 iff there is some c ∈ R^d such that c ≥ 0,
c^⊤A ≥ 0 and c^⊤z < 0.
Proof. We already proved that (1) implies (2) and that (2) implies (3). The proof that (3)
implies (1) is left as an easy exercise.
The reader might wonder what the criterion is for the unsolvability of a system Ax = z,
without any condition on x. However, since the unsolvability of the system Ax = z is
equivalent to the unsolvability of the system

$\begin{pmatrix} A \\ -A \end{pmatrix} x \le \begin{pmatrix} z \\ -z \end{pmatrix},$

using (2), the above system is unsolvable iff there is some (c₁, c₂) ≥ (0, 0) such that

$(c_1^\top, c_2^\top) \begin{pmatrix} A \\ -A \end{pmatrix} = 0$ and $(c_1^\top, c_2^\top) \begin{pmatrix} z \\ -z \end{pmatrix} < 0,$
and these are equivalent to $c_1^\top A - c_2^\top A = 0$ and $c_1^\top z - c_2^\top z < 0$, namely, c^⊤A = 0 and c^⊤z < 0,
where c = c₁ − c₂ ∈ R^d. However, this simply says that c is orthogonal to the columns
A¹, ..., Aⁿ of A and that z is not orthogonal to c, so z cannot belong to the column space
of A, a criterion that we already knew from linear algebra.
As in Matousek and Gartner [41], we can summarize these various criteria in the following
table:

    The system Ax ≤ z has no solution x ≥ 0 iff:   ∃c ∈ R^d such that c ≥ 0, c^⊤A ≥ 0 and c^⊤z < 0.
    The system Ax = z has no solution x ≥ 0 iff:   ∃c ∈ R^d such that c^⊤A ≥ 0 and c^⊤z < 0.
    The system Ax ≤ z has no solution x ∈ R^n iff: ∃c ∈ R^d such that c ≥ 0, c^⊤A = 0 and c^⊤z < 0.
    The system Ax = z has no solution x ∈ R^n iff: ∃c ∈ R^d such that c^⊤A = 0 and c^⊤z < 0.
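As a quick numerical sanity check of the "Ax = z, x ≥ 0" entry (our own sketch, assuming numpy and scipy), the system below has no solution x ≥ 0, and an explicit certificate c with c^⊤A ≥ 0 and c^⊤z < 0 witnesses this:

    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1.0, 0.0], [0.0, 1.0]])
    z = np.array([-1.0, 1.0])

    # Alternative (a): does Ax = z have a solution x >= 0?  (status 2: infeasible)
    res = linprog(c=np.zeros(2), A_eq=A, b_eq=z, bounds=[(0, None)] * 2)
    print(res.status)

    # Alternative (b): a Farkas certificate, found here by inspection.
    c_cert = np.array([1.0, 0.0])
    print((c_cert @ A >= 0).all(), c_cert @ z)  # True, -1.0 < 0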
Remark: The strong duality theorem in linear programming can be proved using Lemma
4.18(3).
Remark: The assumption that A is closed is convenient but unnecessary. Indeed, the proof
of Proposition 4.19 shows that the proposition holds for every boundary point, a ∈ ∂A
(assuming ∂A 6= ∅).
Beware that Proposition 4.19 is false when the dimension of X is infinite and when $\overset{\circ}{A} = \emptyset$.
The proposition below gives a sufficient condition for a closed subset to be convex.
Proposition 4.20. Let A be a closed subset with nonempty interior. If there is a supporting
hyperplane for every point a ∈ ∂A, then A is convex.
The proposition below characterizes closed convex sets in terms of (closed) half–spaces.
It is another intuitive fact whose rigorous proof is nontrivial.
Proposition 4.21. Let A be a nonempty closed and convex subset. Then, A is the intersec-
tion of all the closed half–spaces containing it.
Proof. Let A′ be the intersection of all the closed half-spaces containing A. It is immediately
checked that A′ is closed and convex and that A ⊆ A′. Assume that A′ ≠ A, and pick
a ∈ A′ − A. Then, we can apply Corollary 4.10 to {a} and A and we find a hyperplane
H strictly separating A and {a}; this shows that A belongs to one of the two half-spaces
determined by H, yet a does not belong to the same half-space, contradicting the definition
of A′.
We now consider polarity with respect to the unit sphere S^{n−1} of Eⁿ of center O. First, observe that every hyperplane H in Eⁿ not passing through O is of the form

H = {a ∈ Eⁿ | Oh · Oa = 1}

for a unique point h ≠ O. Indeed, any hyperplane H in Eⁿ is the null set of some equation of the form

α₁x₁ + ··· + αₙxₙ = β,

and if O ∉ H, then β ≠ 0. Thus, any hyperplane H not passing through O is defined by an
equation of the form

h₁x₁ + ··· + hₙxₙ = 1,

and if h denotes the point of coordinates (h₁, ..., hₙ), then

H = {a ∈ Eⁿ | Oh · Oa = 1},

which proves the existence of h. For uniqueness, suppose that the points h₁ and h₂ both define H. The functions a ↦ Oh₁ · Oa − 1 and a ↦ Oh₂ · Oa − 1 are two affine forms defining the
same hyperplane, so there is a nonzero scalar λ so that

Oh₁ · Oa − 1 = λ(Oh₂ · Oa − 1) for all a ∈ Eⁿ

(see Gallier [30], Chapter 2, Section 2.10). In particular, for a = O, we find that λ = 1, and
so,

Oh₁ · Oa = Oh₂ · Oa for all a,

which implies h₁ = h₂. This proves the uniqueness of h.
Using the above, we make the following definition:

Definition 4.2. Given any point a ≠ O, the polar hyperplane of a (w.r.t. S^{n−1}) or dual of
a is the hyperplane a† given by

a† = {b ∈ Eⁿ | Oa · Ob = 1}.

Given a hyperplane H not containing O, the pole of H (w.r.t. S^{n−1}) or dual of H is the
(unique) point H† so that

H = {a ∈ Eⁿ | OH† · Oa = 1}.
Observe that if

H = a† = {b ∈ Eⁿ | Oa · Ob = 1}

and ‖Oa‖ > 1, then the hyperplane H intersects S^{n−1} (along an (n−2)-dimensional sphere),
and if b is any point on H ∩ S^{n−1}, we claim that Ob and ba are orthogonal. This means
that H ∩ S^{n−1} is the set of points on S^{n−1} where the lines through a and tangent to S^{n−1}
touch S^{n−1} (they form a cone tangent to S^{n−1} with apex a). Indeed, as Oa = Ob + ba and
b ∈ H ∩ S^{n−1}, i.e., Oa · Ob = 1 and ‖Ob‖² = 1, we get

1 = Oa · Ob = (Ob + ba) · Ob = ‖Ob‖² + ba · Ob = 1 + ba · Ob,

so ba · Ob = 0, as claimed. In terms of coordinates, if a = (a₁, ..., aₙ), the polar hyperplane a† is the hyperplane of equation

a₁X₁ + ··· + aₙXₙ = 1.
Remark: As we noted, polarity in a Euclidean space suffers from the minor defect that the
polar of the origin is undefined and, similarly, the pole of a hyperplane through the origin
does not make sense. If we embed En into the projective space, Pn , by adding a “hyperplane
at infinity” (a copy of Pn−1 ), thereby viewing Pn as the disjoint union Pn = En ∪ Pn−1 , then
the polarity correspondence can be defined everywhere. Indeed, the polar of the origin is the
hyperplane at infinity (Pn−1 ) and since Pn−1 can be viewed as the set of hyperplanes through
the origin in En , the pole of a hyperplane through the origin is the corresponding “point at
infinity” in Pn−1 .
Now, we would like to extend this correspondence to subsets of En , in particular, to
convex sets. Given a hyperplane, H, not containing O, we denote by H− the closed half-
space containing O.
Definition 4.3. Given any subset A of Eⁿ, the polar dual (or reciprocal) of A is the set

A* = {b ∈ Eⁿ | Oa · Ob ≤ 1 for all a ∈ A} = $\bigcap_{a \in A - \{O\}}$ (a†)₋.

For simplicity of notation, we write a†₋ for (a†)₋. Observe that {O}* = Eⁿ, so it is
convenient to set O†₋ = Eⁿ, even though O† is undefined. By definition, A* is convex even if
A is not. Furthermore, note that

(1) A ⊆ A**.

(2) If A ⊆ B, then B* ⊆ A*.

It follows immediately from (1) and (2) that A*** = A*. Also, if Bⁿ(r) is the (closed)
ball of radius r > 0 and center O, it is obvious by definition that Bⁿ(r)* = Bⁿ(1/r).
In Figure 4.11, the polar dual of the polygon (v1 , v2 , v3 , v4 , v5 ) is the polygon shown in
green. This polygon is cut out by the half-planes determined by the polars of the vertices
(v1 , v2 , v3 , v4 , v5 ) and containing the center of the circle. These polar lines are all easy to
determine by drawing for each vertex, vi , the tangent lines to the circle and joining the
contact points. The construction of the polar of v3 is shown in detail.
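Computationally, the polar dual of a polytope conv(V) with O in its interior is cut out by the finitely many half-spaces v_i · x ≤ 1, one per vertex, since a linear form attains its maximum over conv(V) at a vertex (Proposition 3.9). The following sketch is our own illustration, assuming scipy:

    import numpy as np
    from scipy.spatial import HalfspaceIntersection

    # Vertices v_i of a polygon whose interior contains the center O = (0, 0).
    V = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0]])
    # scipy encodes the half-space a . x + b <= 0 as the row [a, b]; the polar
    # dual V* is the intersection of the half-spaces v_i . x - 1 <= 0.
    halfspaces = np.hstack([V, -np.ones((len(V), 1))])
    dual = HalfspaceIntersection(halfspaces, np.zeros(2))
    print(dual.intersections)  # vertices of the dual polygon: (+-1/2, +-1/2)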
Remark: We chose a different notation for polar hyperplanes and polars (a† and H † ) and
polar duals (A∗ ), to avoid the potential confusion between H † and H ∗ , where H is a hy-
perplane (or a† and {a}∗ , where a is a point). Indeed, they are completely different! For
example, the polar dual of a hyperplane is either a line orthogonal to H through O, if O ∈ H,
or a semi-infinite line through O and orthogonal to H whose endpoint is the pole, H † , of H,
whereas H† is a single point! Ziegler ([67], Chapter 2) uses the notation A^△ instead of A*
for the polar dual of A.
We would like to investigate the duality induced by the operation A 7→ A∗ . Unfortunately,
it is not always the case that A∗∗ = A, but this is true when A is closed and convex, as
shown in the following proposition:
Proposition 4.22. For any subset A ⊆ Eⁿ, the following properties hold:

(i) If A is bounded, then O belongs to the interior of A*; if O belongs to the interior of A, then A* is bounded.

(ii) If, in addition, A is closed and convex and O ∈ A, then A** = A.
Proof. (i) If A is bounded, then A ⊆ Bⁿ(r) for some r > 0 large enough. Then,
Bⁿ(r)* = Bⁿ(1/r) ⊆ A*, so that O ∈ $\overset{\circ}{A^*}$. If O ∈ $\overset{\circ}{A}$, then Bⁿ(r) ⊆ A for some r > 0 small enough,
so A* ⊆ Bⁿ(r)* = Bⁿ(1/r) and A* is bounded.

(ii) We always have A ⊆ A**. We prove that if b ∉ A, then b ∉ A**; this shows that
A** ⊆ A and thus A = A**. Since A is closed and convex and {b} is compact (and convex!),
by Corollary 4.10, there is a hyperplane H strictly separating A and b and, in particular,
O ∉ H, as O ∈ A. If h = H† is the pole of H, we have

Oh · Ob > 1 and Oh · Oa < 1 for all a ∈ A,

since H₋ = {a ∈ Eⁿ | Oh · Oa ≤ 1} and O and A lie on the same side of H. This shows that b ∉ A**, since
h ∈ A* (as Oa · Oh ≤ 1 for all a ∈ A) and yet Oh · Ob > 1.
Remark: For an arbitrary subset A ⊆ Eⁿ, it can be shown that A** = $\overline{\mathrm{conv}(A \cup \{O\})}$, the
topological closure of the convex hull of A ∪ {O}.
Proposition 4.22 will play a key role in studying polytopes, but before doing this, we
need one more proposition.
Proposition 4.23. Let A be any closed convex subset of Eⁿ such that O ∈ $\overset{\circ}{A}$. The polar
hyperplanes of the points of the boundary of A constitute the set of supporting hyperplanes of
A*. Furthermore, for any a ∈ ∂A, the points of A* where H = a† is a supporting hyperplane
of A* are the poles of supporting hyperplanes of A at a.

Proof. Since O ∈ $\overset{\circ}{A}$, we have O ∉ ∂A, and so, for every a ∈ ∂A, the polar hyperplane a†
is well defined. Pick any a ∈ ∂A and let H = a† be its polar hyperplane. By definition,
A∗ ⊆ H− , the closed half-space determined by H and containing O. If T is any supporting
hyperplane to A at a, as a ∈ T , we have t = T † ∈ a† = H. Furthermore, it is a simple
exercise to prove that t ∈ (T− )∗ (in fact, (T− )∗ is the interval with endpoints O and t). Since
A ⊆ T− (because T is a supporting hyperplane to A at a), we deduce that t ∈ A∗ , and thus,
H is a supporting hyperplane to A∗ at t. By Proposition 4.22, as A is closed and convex,
A∗∗ = A; it follows that all supporting hyperplanes to A∗ are indeed obtained this way.
Chapter 5

Polyhedra and Polytopes

There are two natural ways to define a convex polyhedron, A:

(1) As the convex hull of a finite set of points.

(2) As a subset of En cut out by a finite number of hyperplanes, more precisely, as the intersection of a finite number of (closed) half-spaces.
As stated, these two definitions are not equivalent because (1) implies that a polyhedron
is bounded, whereas (2) allows unbounded subsets. Now, if we require in (2) that the convex
set A is bounded, it is quite clear for n = 2 that the two definitions (1) and (2) are equivalent;
for n = 3, it is intuitively clear that definitions (1) and (2) are still equivalent, but proving
this equivalence rigorously does not appear to be that easy. What about the equivalence
when n ≥ 4?
It turns out that definitions (1) and (2) are equivalent for all n, but this is a nontrivial
theorem and a rigorous proof does not come by so cheaply. Fortunately, since we have
Krein and Milman’s theorem at our disposal and polar duality, we can give a rather short
proof. The hard direction of the equivalence consists in proving that Definition (1) implies
Definition (2). This is where the duality induced by polarity becomes handy, especially, the
fact that A∗∗ = A! (under the right hypotheses). First, we give precise definitions (following
Ziegler [67]).
Definition 5.1. Let E be any affine Euclidean space¹ of finite dimension n. An H-polyhedron in E, for short, a polyhedron, is any subset, P = ⋂_{i=1}^p Ci, of E defined as the intersection of a finite number, p ≥ 1, of closed half-spaces, Ci; an H-polytope in E is a bounded H-polyhedron, and a V-polytope is the convex hull, P = conv(S), of a finite set of points, S ⊆ E.

¹This means that the vector space E⃗ associated with E is a Euclidean space.
Figure 5.1: (a) an H-polyhedron; (b) a V-polytope.
Obviously, H-polyhedra are convex and closed as intersections of convex and closed half-
spaces (in E). By Proposition 3.3, V-polytopes are also convex and closed (in E). Since the
notions of H-polytope and V-polytope are equivalent (see Theorem 5.7), we often use the
simpler locution polytope. Examples of an H-polyhedron and of a V-polytope are shown in
Figure 5.1.
Note that Definition 5.1 allows H-polytopes and V-polytopes to have an empty interior, which is somewhat of an inconvenience. This is not a problem, since we may always restrict ourselves to the affine hull of P (some affine space, E, of dimension d ≤ n, where d = dim(P), as in Definition 3.2), as we now show.

Proposition 5.1. Let A ⊆ En be a V-polytope or an H-polyhedron and let E = aff(A) be its affine hull, of dimension d. Then:

(1) A is a V-polytope in E iff A is a V-polytope in En.

(2) A is an H-polyhedron in E iff A is an H-polyhedron in En.
Proof. (1) This follows immediately because E is an affine subspace of En and every affine subspace of En is closed under affine combinations and so, a fortiori, under convex combinations. We leave the details as an easy exercise.
(2) Assume A is an H-polyhedron in En and that d < n. By definition, A = ⋂_{i=1}^p Ci, where the Ci are closed half-spaces determined by some hyperplanes, H1, . . . , Hp, in En. (Observe that the hyperplanes, Hi, associated with the closed half-spaces, Ci, may not be distinct. For example, we may have Ci = (Hi)+ and Cj = (Hi)−, for the two closed half-spaces determined by Hi.) As A ⊆ E, we have

A = A ∩ E = ⋂_{i=1}^p (Ci ∩ E),
Consequently, we get

A = A ∩ E = ⋂_{i=1}^p ((Ei)+ ∩ (Ei)−) ∩ ⋂_{j=1}^q C′j,
Proposition 5.2. Given any two affine Euclidean spaces, E and F, if h : E → F is any affine map, then:

(1) If A is any V-polytope in E, then h(A) is a V-polytope in F.

(2) If h is bijective and A is any H-polyhedron in E, then h(A) is an H-polyhedron in F.
Proof. (1) As any affine map preserves affine combinations, it also preserves convex combinations. Thus, h(conv(S)) = conv(h(S)), for any S ⊆ E.

(2) Say A = ⋂_{i=1}^p Ci in E. Consider any half-space, C, in E and assume that

C = {x ∈ E | ϕ(x) ≤ 0},
for some affine form, ϕ, defining the hyperplane, H = {x ∈ E | ϕ(x) = 0}. Then, as h is bijective, we get

h(C) = {y ∈ F | (ϕ ◦ h⁻¹)(y) ≤ 0}.
This shows that h(C) is one of the closed half-spaces in F determined by the hyperplane,
H 0 = {y ∈ F | (ϕ ◦ h−1 )(y) = 0}. Furthermore, as h is bijective, it preserves intersections so
h(A) = h(⋂_{i=1}^p Ci) = ⋂_{i=1}^p h(Ci),

which shows that h(A) is an H-polyhedron in F.
By Proposition 5.2 we may assume that E = Ed, and by Proposition 5.1 we may assume that dim(A) = d. These propositions justify the type of argument beginning with: “We may assume that A ⊆ Ed has dimension d, that is, that A has nonempty interior.” This kind of reasoning will occur many times.
Since the boundary of a closed half-space, Ci , is a hyperplane, Hi , and since hyperplanes
are defined by affine forms, a closed half-space is defined by the locus of points satisfying a
“linear” inequality of the form ai · x ≤ bi or ai · x ≥ bi , for some vector ai ∈ Rn and some
bi ∈ R. Since ai · x ≥ bi is equivalent to (−ai ) · x ≤ −bi , we may restrict our attention
to inequalities with a ≤ sign. Thus, if A is the p × n matrix whose i-th row is ai, we see that the H-polyhedron, P, is defined by the system of linear inequalities, Ax ≤ b, where b = (b1, . . . , bp) ∈ Rp. We write

P = P(A, b), with P(A, b) = {x ∈ Rn | Ax ≤ b}.
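Computationally, the H-representation P(A, b) is just a matrix-vector pair, and membership is a single matrix inequality. A minimal sketch (Python with NumPy; the unit square below is our own toy example, not one from the text):

import numpy as np

def in_H_polyhedron(A, b, x, tol=1e-9):
    # Test whether x lies in P(A, b) = {x | A x <= b}.
    return bool(np.all(A @ x <= b + tol))

# The unit square [0,1]^2 as an H-polyhedron: -x <= 0, x <= 1, -y <= 0, y <= 1.
A = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]])
b = np.array([0.0, 1.0, 0.0, 1.0])
print(in_H_polyhedron(A, b, np.array([0.5, 0.5])))   # True
print(in_H_polyhedron(A, b, np.array([2.0, 0.5])))   # False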
Remark: Some authors call “convex” polyhedra and “convex” polytopes what we have
simply called polyhedra and polytopes. Since Definition 5.1 implies that these objects are
5.1. POLYHEDRA, H-POLYTOPES AND V-POLYTOPES 107
convex and since we are not going to consider non-convex polyhedra in this chapter, we stick
to the simpler terminology.
One should consult Ziegler [67], Berger [8], Grunbaum [35] and especially Cromwell [22], for pictures of polyhedra and polytopes. Figure 5.2 shows the picture of a polytope whose faces are all pentagons. This polytope is called a dodecahedron. The dodecahedron has 12 faces, 30 edges and 20 vertices.
Even better and a lot more entertaining, take a look at the spectacular web sites of
George Hart,
Virtual Polyhedra: http://www.georgehart.com/virtual-polyhedra/vp.html,

George Hart's web site: http://www.georgehart.com/

and also

Zvi Har'El's web site: http://www.math.technion.ac.il/~rl/

The Uniform Polyhedra web site: http://www.mathconsult.ch/showroom/unipoly/

Paper Models of Polyhedra: http://www.korthalsaltes.com/

Bulatov's Polyhedra Collection: http://www.physics.orst.edu/~bulatov/polyhedra/

Paul Getty's Polyhedral Solids: http://home.teleport.com/~tpgettys/poly.shtml

Jill Britton's Polyhedra Pastimes: http://ccins.camosun.bc.ca/~jbritton/jbpolyhedra.htm
and many other web sites dealing with polyhedra in one way or another by searching for
“polyhedra” on Google!
The standard cube is a V-polytope. The standard n-cross-polytope (or n-co-cube) is the set
{(x1, . . . , xn) ∈ En | Σ_{i=1}^n |xi| ≤ 1}.
It is also a V-polytope.
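The two descriptions of the n-cross-polytope, the H-view Σ_{i=1}^n |xi| ≤ 1 and the V-view conv({±e1, . . . , ±en}), can be compared numerically. The sketch below (NumPy, illustrative only) checks that a random convex combination of the 2n vertices satisfies the H-description:

import numpy as np

def in_cross_polytope(x, tol=1e-12):
    # H-description of the n-cross-polytope: sum_i |x_i| <= 1.
    return float(np.sum(np.abs(x))) <= 1.0 + tol

n = 4
vertices = np.vstack([np.eye(n), -np.eye(n)])   # the 2n points +/- e_i
lam = np.random.rand(2 * n)
lam /= lam.sum()                                # convex coefficients
x = lam @ vertices                              # a point of conv({+/- e_i})
assert in_cross_polytope(x)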
Note that some of the hyperplanes cutting out a polyhedron may be redundant. If A = ⋂_{i=1}^t Ci is a polyhedron (where each closed half-space, Ci, is associated with a hyperplane, Hi, so that ∂Ci = Hi), we say that ⋂_{i=1}^t Ci is an irredundant decomposition of A if A cannot be expressed as A = ⋂_{i=1}^m C′i with m < t (for some closed half-spaces, C′i). The following proposition shows that the Ci in an irredundant decomposition of A are uniquely determined by A.
Proposition 5.3. Let A be a polyhedron with nonempty interior and assume that A = ⋂_{i=1}^t Ci is an irredundant decomposition of A. Then,

(i) up to permutation, the Ci (and the hyperplanes Hi) are uniquely determined by A;

(ii) each Faceti A = Hi ∩ A is a polyhedron of dimension d − 1 (where d = dim A) with nonempty interior in Hi, and aff(Faceti A) = Hi (these are the facets of A);

(iii) ∂A = ⋃_{i=1}^t Faceti A, where the union is irredundant, i.e., Faceti A is not a subset of Facetj A, for all i ≠ j.
Proof. (ii) Fix any i and consider Ai = ⋂_{j≠i} Cj. As A = ⋂_{i=1}^t Ci is an irredundant decomposition, there is some x ∈ Ai − Ci. Pick any a in the interior of A. By Lemma 4.1, we get b = [a, x] ∩ Hi in the interior of Ai, so b belongs to the interior of Hi ∩ Ai in Hi.
(iii) We have ∂A = A − int(A) = A ∩ (int(A))c (where int(A) denotes the interior of A and B c the complement of a subset B of En); one then checks that ∂A = ⋃_{i=1}^t (Hi ∩ A) = ⋃_{i=1}^t Faceti A.
If we had Faceti A ⊆ Facetj A, for some i 6= j, then, by (ii), as the affine hull of Faceti A is
Hi and the affine hull of Facetj A is Hj , we would have Hi ⊆ Hj , a contradiction.
(i) As the decomposition is irredundant, the Hi are pairwise distinct. Also, by (ii), each facet, Faceti A, has dimension d − 1 (where d = dim A). Then, in (iii), we can show that the decomposition of ∂A as a union of polyhedra of dimension d − 1 whose pairwise nonempty intersections have dimension at most d − 2 (since they are contained in pairwise distinct hyperplanes) is unique up to permutation. Indeed, assume that

∂A = F1 ∪ · · · ∪ Fm = G1 ∪ · · · ∪ Gn,

where the Fi's and Gj's are polyhedra of dimension d − 1 and each of the unions is irredundant.
Then, we claim that for each Fi , there is some Gϕ(i) such that Fi ⊆ Gϕ(i) . If not, Fi would
be expressed as a union
Fi = (Fi ∩ Gi1 ) ∪ · · · ∪ (Fi ∩ Gik )
where dim(Fi ∩ Gij ) ≤ d − 2, since the hyperplanes containing Fi and the Gj ’s are pairwise
distinct, which is absurd, since dim(Fi ) = d − 1. By symmetry, for each Gj , there is some
Fψ(j) such that Gj ⊆ Fψ(j) . But then, Fi ⊆ Fψ(ϕ(i)) for all i and Gj ⊆ Gϕ(ψ(j)) for all j which
implies ψ(ϕ(i)) = i for all i and ϕ(ψ(j)) = j for all j since the unions are irredundant. Thus,
ϕ and ψ are mutual inverses and the Gj's are just a permutation of the Fi's, as claimed.
Therefore, the facets, Faceti A, are uniquely determined by A and so are the hyperplanes,
Hi = aff(Faceti A), and the half-spaces, Ci , that they determine.
As a consequence, if A is a polyhedron, then so are its facets, and the same holds for H-polytopes. If A is an H-polytope and H is a hyperplane meeting the interior of A, then H ∩ A is an H-polytope whose facets are of the form H ∩ F, where F is a facet of A.
We can use induction and define k-faces, for 0 ≤ k ≤ n − 1.
Definition 5.2. Let A ⊆ En be a polyhedron with nonempty interior. We define a k-face
of A to be a facet of a (k + 1)-face of A, for k = 0, . . . , n − 2, where an (n − 1)-face is just
a facet of A. The 1-faces are called edges. Two k-faces are adjacent if their intersection is a
(k − 1)-face.
The polyhedron A itself is also called a face (of itself) or n-face, and the k-faces of A with k ≤ n − 1 are called proper faces of A. If A = ⋂_{i=1}^t Ci is an irredundant decomposition of A and Hi is the boundary of Ci, then the hyperplane, Hi, is called the supporting hyperplane of the facet Hi ∩ A. We suspect that the 0-faces of a polyhedron are vertices in the sense
of Definition 3.6. This is true and, in fact, the vertices of a polyhedron coincide with its
extreme points (see Definition 3.7).
Proposition 5.4. Let A ⊆ En be a polyhedron with nonempty interior.
(1) For any point, a ∈ ∂A, on the boundary of A, the intersection of all the supporting
hyperplanes to A at a coincides with the intersection of all the faces that contain a. In
particular, points of order k of A are those points in the relative interior of the k-faces of A²; thus, 0-faces coincide with the vertices of A.
(2) The vertices of A coincide with the extreme points of A.
Proof. (1) If H is a supporting hyperplane to A at a, then, one of the half-spaces, C,
determined by H, satisfies A = A ∩ C. It follows from Proposition 5.3 that if H 6= Hi (where
the hyperplanes Hi are the supporting hyperplanes of the facets of A), then C is redundant,
from which (1) follows.
(2) If a ∈ ∂A is not extreme, then a ∈ [y, z], where y, z ∈ ∂A. However, this implies that a has order k ≥ 1, i.e., a is not a vertex.
The proof that every H-polytope A is a V-polytope relies on the fact that the extreme points of an H-polytope coincide with its vertices, which form a finite nonempty set, and by Krein and Milman's Theorem (Theorem 3.8), A is the convex hull of its vertices.
The proof that every V-polytope A is an H-polytope relies on the crucial fact that
the polar dual A∗ of a V-polytope A is an H-polyhedron, and that the equations of the
²Given a convex set, S, in An, its relative interior is its interior in the affine hull of S (which might be of dimension strictly less than n).
hyperplanes cutting out A∗ are obtained in a very simple manner from the points ai specifying
A as A = conv(a1 , . . . , ap ); see Proposition 5.5.
The proof that every V-polytope A is an H-polytope consists of the following steps:

(1) Translating A if necessary, we may pick the center, O, of our polar duality in the interior of A.

(2) By Proposition 5.5, the polar dual, A∗, of A is an H-polyhedron; since O belongs to the interior of A, by Proposition 4.22, A∗ is bounded, so A∗ is an H-polytope.

(3) By Krein and Milman's Theorem, the H-polytope A∗ is the convex hull of its finitely many vertices, so A∗ is a V-polytope.

(4) A V-polytope is closed and convex, and since O belongs to A (in fact, to the interior of A), by Proposition 4.22, we have A = A∗∗, so A is indeed an H-polyhedron; in fact, A is an H-polytope since it is bounded.
Proposition 5.5. Let S = {ai }pi=1 be a finite set of points in En and let A = conv(S) be
its convex hull. If S 6= {O}, then, the dual A∗ of A w.r.t. the center O is the H-polyhedron
given by

A∗ = ⋂_{i=1}^p (ai†)−.

Furthermore, if O belongs to the interior of A, then A∗ is an H-polytope, i.e., the polar dual of a V-polytope with nonempty interior is an H-polytope. If A = S = {O}, then A∗ = En.
Proof. By definition of the polar dual, a point b belongs to A∗ iff Ob · Oa ≤ 1 for every convex combination a = Σ_{i=1}^p λi ai of the points of S, and this holds iff Ob · Oai ≤ 1 for i = 1, . . . , p. Thus,

A∗ = ⋂_{i=1}^p {b ∈ En | Ob · Oai ≤ 1} = ⋂_{i=1}^p (ai†)−,

which is a polyhedron. (Recall that (ai†)− = En if ai = O.) If O belongs to the interior of A, then A∗ is bounded (by Proposition 4.22) and so, A∗ is an H-polytope.
Thus, the dual of the convex hull of a finite set of points {a1 , . . . , ap } is the intersection
of the half-spaces containing O determined by the polar hyperplanes of the points ai .
It is convenient to restate Proposition 5.5 using matrices. First, observe that the proof
of Proposition 5.5 shows that
Therefore, we may assume that not all ai = O (1 ≤ i ≤ p). If we pick O as an origin, then
every point aj can be identified with a vector in En and O corresponds to the zero vector,
0. Observe that any set of p points aj ∈ En corresponds to the n × p matrix A whose j th
column is aj. Then the equation of the polar hyperplane, aj†, of any aj (≠ 0) is aj · x = 1, that is,

ajᵀ x = 1.
Consequently, the system of inequalities defining conv({a1, . . . , ap})∗ can be written in matrix form as

conv({a1, . . . , ap})∗ = {x ∈ Rn | Aᵀx ≤ 1},

where 1 denotes the vector of Rp with all coordinates equal to 1. We write P(Aᵀ, 1) = {x ∈ Rn | Aᵀx ≤ 1}. There is a useful converse of this property, as proved in the next proposition.
Proposition 5.6. Let {a1, . . . , ap} be any set of points in En and let A be the p × n matrix whose i-th row is ai. Then

conv({a1, . . . , ap} ∪ {0})∗ = P(A, 1),

and conversely,

P(A, 1)∗ = conv({a1, . . . , ap} ∪ {0}) = {Aᵀu | u ∈ Rp, u ≥ 0, Iu ≤ 1},

where I is the row vector of length p whose coordinates are all equal to 1.
Proof. Only the second part needs a proof. Let B = conv({a1, . . . , ap} ∪ {0}), where ai ∈ Rn is the i-th row of A. Then, by the first part,

B∗ = P(A, 1).

As B is closed, convex, and contains 0, Proposition 4.22 yields B = B∗∗ = P(A, 1)∗, as claimed.
Remark: Proposition 5.6 still holds if A is the zero matrix because then, the inequalities Aᵀx ≤ 1 (or Ax ≤ 1) are trivially satisfied. In the first case, P(Aᵀ, 1) = En, and in the second case, P(A, 1) = En.
Using the above, the reader should check that the dual of a simplex is a simplex and that
the dual of an n-cube is an n-cross polytope.
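The claim about the n-cube can be tested numerically: a point x lies in the polar dual of the cube [−1, 1]^n iff the maximum of y · x over the cube is at most 1, and this maximum equals Σ_{i=1}^n |xi|, the defining inequality of the n-cross-polytope. A brute-force sketch (NumPy):

import numpy as np
from itertools import product

n = 3
cube_vertices = np.array(list(product([-1.0, 1.0], repeat=n)))
for _ in range(1000):
    x = np.random.uniform(-1.5, 1.5, size=n)
    # max over the cube of y . x is attained at a vertex and equals sum |x_i|
    assert np.isclose(np.max(cube_vertices @ x), np.sum(np.abs(x)))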
It is not clear that every H-polyhedron is of the form P (A, 1). This is indeed the case
if we pick O in the interior of A, but this is nontrivial to prove. What we will need is to
find the corresponding “V-definition” of an H-polyhedron. For this we will need to add
positive combinations of vectors to convex combinations of points. Intuitively, these vectors
correspond to “points at infinity.”
In view of Theorem 5.7, we are justified in dropping the V or H in front of polytope, and
will do so from now on. Theorem 5.7 has some interesting corollaries regarding the dual of
a polytope.
Corollary 5.8. If A is any polytope in En such that the interior of A contains the origin
O, then the dual A∗ of A is also a polytope whose interior contains O, and A∗∗ = A.
Corollary 5.9. If A is any polytope in En whose interior contains the origin O, then the
k-faces of A are in bijection with the (n − k − 1)-faces of the dual polytope A∗ . This corre-
spondence is as follows: If Y = aff(F ) is the k-dimensional subspace determining the k-face
F of A then the subspace Y ∗ = aff(F ∗ ) determining the corresponding face F ∗ of A∗ is the
intersection of the polar hyperplanes of points in Y .
We also have the following proposition whose proof would not be that simple if we only had the notion of an H-polytope (as a matter of fact, there is a way of proving Theorem 5.7 using Proposition 5.10).

Proposition 5.10. For any affine map h : En → Em, the image, h(A), of any polytope, A ⊆ En, is a polytope.
Proof. Immediate, since an H-polytope is a V-polytope and since affine maps send convex
sets to convex sets.
The reader should check that the Minkowski sum of polytopes is a polytope.
We were able to give a short proof of Theorem 5.7 because we relied on a powerful
theorem, namely, Krein and Milman. A drawback of this approach is that it bypasses the
interesting and important problem of designing algorithms for finding the vertices of an
H-polyhedron from the sets of inequalities defining it. A method for doing this is Fourier–
Motzkin elimination, see Proposition 5.21, and also Ziegler [67] (Chapter 1) and Section 5.4.
This is also a special case of linear programming.
5.4 The Equivalence of H-Polyhedra and V-Polyhedra

It is also possible to generalize the notion of V-polytope to polyhedra using the notion of cone and to generalize the equivalence theorem to H-polyhedra and V-polyhedra.
Definition 5.3. Let E be any affine Euclidean space of finite dimension n (with associated vector space E⃗). A subset, C ⊆ E⃗, is a cone if C is closed under linear combinations involving only nonnegative scalars, called positive combinations. Given a subset, V ⊆ E⃗, the conical hull or positive hull of V is the set

cone(V) = { Σ_{i∈I} λi vi | {vi}_{i∈I} ⊆ V, λi ≥ 0 for all i ∈ I }.

A V-polyhedron is any convex subset, A ⊆ E, of the form A = conv(Y) + cone(V), where Y ⊆ E is a finite set of points and V ⊆ E⃗ is a finite set of vectors; a V-cone is any conical hull of a finite set of vectors, C = cone(V).
The positive hull cone(V ) of V is also denoted pos(V ). Observe that a V-cone can be
viewed as a polyhedral set for which Y = {O}, a single point. However, if we take the point
O as the origin, we may view a V-polyhedron A for which Y = {O} as a V-cone. We will
switch back and forth between these two views of cones as we find it convenient; this should not cause any confusion. In this section, we favor the view that V-cones are special kinds
of V-polyhedra. As a consequence, a (V or H)-cone always contains 0, sometimes called an
apex of the cone.
A set of the form {a + tu | t ≥ 0}, where a ∈ E is a point and u ∈ E⃗ is a nonzero vector,
is called a half-line or ray. Then, we see that a V-polyhedron, A = conv(Y ) + cone(V ), is
the convex hull of the union of a finite set of points with a finite set of rays. In the case of
a V-cone, all these rays meet in a common point, an apex of the cone.
Since an H-polyhedron is an intersection of half-spaces determined by hyperplanes, and
since half-spaces are closed, an H-polyhedron is closed. We know from Proposition 3.3 that
a V-polytope is closed and by Proposition 4.13 that a V-cone is closed. To apply Proposition
4.22 to an arbitrary V-polyhedron we need to know that a V-polyhedron is closed.
Given a V-polyhedron P = conv(Y ) + cone(V ) of dimension d, an easy way to prove that
P is closed is to “lift” P to the hyperplane Hd+1 of equation xd+1 = 1 in Ad+1 , obtaining a
polyhedron Pb contained in Hd+1 homeomorphic to P , and to consider a polyhedral cone (a
V-cone) C(P ) associated with P which has the property that
Pb = C(P ) ∩ Hd+1 .
The details of this construction are given in Section 5.5; see Proposition 5.20(2). Since by
Proposition 4.13 a V-cone is closed and since a hyperplane is closed, Pb = C(P ) ∩ Hd+1 is
closed, and thus P is closed. As a summary, the following proposition holds.

Proposition 5.11. Every V-polyhedron, A = conv(Y) + cone(V), is closed.

Proposition 5.12. Let A ⊆ En be a V-polyhedron or an H-polyhedron with affine hull aff(A) of dimension d. Then A is a V-polyhedron (resp. an H-polyhedron) in aff(A) iff A is a V-polyhedron (resp. an H-polyhedron) in En.

Proposition 5.13. Given any two affine Euclidean spaces, E and F, if h : E → F is any affine map, then:

(1) If A is any V-polyhedron in E, then h(A) is a V-polyhedron in F.

(2) If C is any V-cone in E⃗, then h⃗(C) is a V-cone in F⃗.

(3) If h is bijective and A is any H-polyhedron in E, then h(A) is an H-polyhedron in F.
Proof. We already proved (3) in Proposition 5.2. For (1), using the fact that h(a + u) = h(a) + h⃗(u) for any affine map, h, where h⃗ is the linear map associated with h, we get

h(conv(Y) + cone(V)) = conv(h(Y)) + cone(h⃗(V)).

The proof of (2) is analogous.
Propositions 5.12 and 5.13 allow us to assume that E = Ed and that our (V or H)-
polyhedra, A ⊆ Ed , have nonempty interior (i.e. dim(A) = d).
The generalization of Theorem 5.7 is that every V-polyhedron A is an H-polyhedron and
conversely.
At first glance, it may seem that there is a small problem when A = Ed . Indeed, Definition
5.3 allows the possibility that cone(V ) = Ed for some finite subset, V ⊆ Rd . This is because
it is possible to generate a basis of Rd using finitely many positive combinations. On the
other hand the definition of an H-polyhedron, A, (Definition 5.1) assumes that A ⊆ En is
cut out by at least one hyperplane. So, A is always contained in some half-space of En and
5.4. THE EQUIVALENCE OF H-POLYHEDRA AND V-POLYHEDRA 117
strictly speaking, En is not an H-polyhedron! The simplest way to circumvent this difficulty
is to decree that En itself is a polyhedron, which we do.
Yet another solution is to assume that, unless stated otherwise, every finite set of vectors
V that we consider when defining a polyhedron has the property that there is some hyper-
plane H through the origin so that all the vectors in V lie in only one of the two closed
half-spaces determined by H. But then, the polar dual of a polyhedron can’t be a single
point! Therefore, we stick to our decision that En itself is a polyhedron.
To prove the equivalence of H-polyhedra and V-polyhedra, Ziegler proceeds as follows:
First, he shows that the equivalence of V-polyhedra and H-polyhedra reduces to the equiva-
lence of V-cones and H-cones using an “old trick” of projective geometry, namely, “homog-
enizing” [67] (Chapter 1). Then, he uses two dual versions of Fourier–Motzkin elimination
to pass from V-cones to H-cones and conversely. Since the homogenization method is an
important technique we will describe it in some detail later.
However, it turns out that the double dualization technique used in the proof of Theorem
5.7 can be easily adapted to prove that every V-polyhedron is an H-polyhedron. This is
because if O belongs to the interior of the V-polyhedron A, then its polar dual A∗ is an
H-polytope; see Proposition 5.14. Then, just as in the proof of Theorem 5.7, we can use the
theorem of Krein and Milman to show that A∗ is a V-polytope. By taking the polar dual
of A∗ , we obtain the fact that A∗∗ = A is an H-polyhedron.
Moreover, the dual of an H-polyhedron is a V-polyhedron; see Proposition 5.15. This fact
can be used to prove that every H-polyhedron is a V-polyhedron by using the fact already
shown that every V-polyhedron is an H-polyhedron!
Consequently we will not describe the version of Fourier–Motzkin elimination used to go
from V-cones to H-cones. However, we will present the Fourier–Motzkin elimination method
used to go from H-cones to V-cones; see Proposition 5.21.
The generalization of Proposition 5.5 to polyhedral sets is shown below. As before, the center of our polar duality is denoted by O. It is taken as the origin of Ed. The new ingredient is that because a V-polyhedron is defined by points and vectors, its polar dual is still cut out by hyperplanes, but the hyperplanes corresponding to vectors pass through the origin. To show this we need to define the “polar hyperplane,” u†, of a vector u.
Definition 5.4. Given any nonzero vector u ∈ Rd , let u†− be the closed half-space
u†− = {x ∈ Rd | x · u ≤ 0}.
In other words, u†− is the closed half-space bounded by the hyperplane u† through O normal
to u and on the “opposite side” of u.

Proposition 5.14. Let A = conv(Y) + cone(V) ⊆ Ed be a V-polyhedron, where Y = {y1, . . . , yp} and V = {v1, . . . , vq}. Then

A∗ = ⋂_{i=1}^p (yi†)− ∩ ⋂_{j=1}^q (vj†)−,

so A∗ is an H-polyhedron; if, moreover, O belongs to the interior of A, then A∗ is an H-polytope.

Proof. For any x ∈ A∗ and any point y = Σ_{i=1}^p λi yi + Σ_{j=1}^q µj vj of A (with λi, µj ≥ 0 and Σ_{i=1}^p λi = 1), the condition Ox · Oy ≤ 1 reads α + Σ_{j=1}^q µj Ox · vj ≤ 1,
with α ≤ 1 (here, α = Σ_{i=1}^p λi Ox · Oyi). In particular, for every j ∈ {1, . . . , q}, if we set µk = 0 for k ∈ {1, . . . , q} − {j}, we should have µj Ox · vj ≤ 1 − α for every µj > 0, that is,

Ox · vj ≤ (1 − α)/µj for all µj > 0,
which is equivalent to
Ox · vj ≤ 0.
Consequently, if x ∈ A∗ , we must also have
x ∈ ⋂_{j=1}^q {x ∈ Ed | Ox · vj ≤ 0} = ⋂_{j=1}^q (vj†)−.

Therefore,

A∗ ⊆ ⋂_{i=1}^p (yi†)− ∩ ⋂_{j=1}^q (vj†)−.
However, the reverse inclusion is obvious and thus, we have equality. If O belongs to the
interior of A, we know from Proposition 4.22 that A∗ is bounded. Therefore, A∗ is indeed
an H-polytope of the above form.
It is fruitful to restate Proposition 5.14 in terms of matrices (as we did for Proposition 5.5). First, observe that the proof of Proposition 5.14 shows that

(conv(Y) + cone(V))∗ = (conv(Y ∪ {O}) + cone(V))∗.

If we pick O as the origin, then we can represent the points in Y as vectors, and O becomes the zero vector, 0.

If A = conv(Y) + cone(V) ≠ {0}, let Y be the d × p matrix whose i-th column is yi and let V be the d × q matrix whose j-th column is vj. Then Proposition 5.14 says that

A∗ = P(Yᵀ, 1; Vᵀ, 0), where P(Yᵀ, 1; Vᵀ, 0) = {x ∈ Rd | Yᵀx ≤ 1, Vᵀx ≤ 0}.
Proposition 5.15. Let {y1, . . . , yp} be any set of points in Ed and let {v1, . . . , vq} be any set of nonzero vectors in Rd. If Y is the d × p matrix whose i-th column is yi and V is the d × q matrix whose j-th column is vj, then

{Yu + Vt | u ∈ Rp, t ∈ Rq, u ≥ 0, Iu = 1, t ≥ 0}∗ = P(Yᵀ, 1; Vᵀ, 0),

and conversely,

P(Yᵀ, 1; Vᵀ, 0)∗ = conv({y1, . . . , yp} ∪ {0}) + cone({v1, . . . , vq}),

where I is the row vector of length p whose coordinates are all equal to 1.

Proof. Only the second part needs a proof. Let B = conv({y1, . . . , yp} ∪ {0}) + cone({v1, . . . , vq}). Then, by the first part,

B∗ = P(Yᵀ, 1; Vᵀ, 0).

As B is closed (by Proposition 5.11), convex, and contains 0, Proposition 4.22 yields B = B∗∗ = P(Yᵀ, 1; Vᵀ, 0)∗, as claimed.
We can now use Proposition 5.14, Proposition 4.22, and Krein and Milman’s Theorem
to prove that every V-polyhedron is an H-polyhedron.
Proposition 5.17. If A ≠ Ed is a V-polyhedron and if we choose the center of the polar duality, O, in the interior of A, then A is of the form A = P(Y, 1). Therefore, every V-polyhedron is an H-polyhedron.
Proof. Let A 6= Ed be a V-polyhedron of dimension d. Thus A ⊆ Ed has nonempty interior
so we can pick some point O in the interior of A. If d = 0, then A = {0} = E0 and we are
done. Otherwise, by Proposition 5.14, the polar dual A∗ of A w.r.t. O is an H-polytope.
Then, as in the proof of Theorem 5.7, using Krein and Milman’s Theorem we deduce that A∗
is a V-polytope. Now, as O belongs to A and A is closed (by Proposition 5.11) and convex,
by Proposition 4.22 (even if A is not bounded) we have A = A∗∗ , and by Proposition 5.6
(first part), we conclude that A = A∗∗ is an H-polyhedron of the form A = P (Y, 1).
Proposition 5.18. Every H-polyhedron is a V-polyhedron.

The proof distinguishes two cases; in the second one:

(2) The polyhedron A is cut out by hyperplanes, some of which contain 0, which means
A is of the form A = P (Y, 1; V, 0). By Proposition 5.15, the polar dual A∗ of A is a
V-polyhedron. As in the previous case, 0 lies on the boundary of A∗ . We translate 0
using some translation Ω so that the new origin Ω is now in the interior of A∗ , and
by Proposition 5.17, A∗ is an H-polyhedron. As in Case (1) we translate Ω back to 0
using the translation −Ω, so A∗ is of the form A∗ = P (Y, 1; V, 0). By Proposition 5.15,
we deduce that A∗∗ = A is a V-polyhedron (A = A∗∗ because 0 ∈ A and A is closed
and convex).
Putting together Propositions 5.17 and 5.18 we obtain our main theorem:

Theorem 5.19. A subset A ⊆ Ed is a V-polyhedron iff it is an H-polyhedron.
Both in Proposition 5.17 and in Proposition 5.18, the step that is not automatic is to find the vertices of an H-polytope from the inequalities defining this H-polytope. A method to do this algorithmically is Fourier–Motzkin elimination; see Proposition 5.21. This process can be expensive, in the sense that the number of vertices can be exponential in the number of inequalities. For example, the standard d-cube is cut out by 2d hyperplanes, but it has 2^d vertices.
Here are some examples illustrating Proposition 5.17.
Example 5.1. Let A be the V-polyhedron (a triangle) in A2 defined by the set Y = {(−1, −1/2), (1, −1/2), (0, 1/2)}. By Proposition 5.14, the polar dual A∗ is the H-polytope, a triangle,
cut out by the inequalities:
−x − (1/2)y ≤ 1
x − (1/2)y ≤ 1
(1/2)y ≤ 1.
This is also the V-polytope whose vertices are (−2, 2), (2, 2), (0, −2). By Proposition 5.6, A = A∗∗ is the H-polyhedron cut out by the inequalities
−2x + 2y ≤ 1
2x + 2y ≤ 1
−2y ≤ 1,
which are equivalent to
y ≤ x + 1/2
y ≤ −x + 1/2
−y ≤ 1/2;
see Figure 5.3.
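The dual vertices claimed in this example can be recomputed mechanically: each vertex of A∗ is the intersection of the polar lines yi · x = 1 of two consecutive vertices of A. A short check (NumPy):

import numpy as np

# Example 5.1: the triangle Y; its polar dual is cut out by y_i . x <= 1.
Y = np.array([[-1.0, -0.5], [1.0, -0.5], [0.0, 0.5]])

# Vertices of the dual triangle: intersect consecutive polar lines.
dual = np.array([np.linalg.solve(np.vstack([Y[i], Y[(i + 1) % 3]]), np.ones(2))
                 for i in range(3)])
print(dual)   # the rows are (0,-2), (2,2), (-2,2), as claimed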
Figure 5.3: The triangle of Example 5.1 written as both a V-polyhedron and an H-
polyhedron.
Example 5.2. Let A be the V-polyhedron (a square) in A2 defined by the set Y = {(−1/2, 0), (0, −1/2), (1/2, 0), (0, 1/2)}. By Proposition 5.14, the polar dual A∗ is the H-polytope, another square, cut out by the inequalities:
(1/2)y ≤ 1
−(1/2)x ≤ 1
−(1/2)y ≤ 1
(1/2)x ≤ 1.
This is also the V-polytope whose vertices are (−2, 2), (−2, −2), (2, −2), (2, 2). By Proposition 5.6, A = A∗∗ is the H-polyhedron cut out by the inequalities
−2x + 2y ≤ 1
−2x − 2y ≤ 1
2x − 2y ≤ 1
2x + 2y ≤ 1,

which are equivalent to
y ≤ x + 1/2
y ≥ −x − 1/2
y ≥ x − 1/2
y ≤ −x + 1/2;

see Figure 5.4.

Example 5.3. Let A be the V-polyhedron (a cone) in A2 defined by the set Y = {(0, −1)} and the set of vectors V = {(−1, 1), (1, 1)}. By Proposition 5.14, the polar dual A∗ is the H-polytope, a triangle, cut out by the inequalities:
−x + y ≤ 0
x+y ≤0
−y ≤ 1.
This is also the V-polytope whose vertices are (0, 0), (−1, −1), and (1, −1). By Proposition 5.6, A = A∗∗ is the H-polyhedron (a cone) cut out by the inequalities
−x − y ≤ 1
x − y ≤ 1;
see Figure 5.5. Note that there is no line associated with (0, 0) since this point belongs to
the boundary of the triangle.
Figure 5.4: The diamond of Example 5.2 written as both a V-polyhedron and an H-
polyhedron.
Example 5.4. Let A be the V-polyhedron in A2 defined by the set Y = {(−1, −1), (1, −1)} and the set of vectors V = {(−1, 1), (1, 1)}. By Proposition 5.14, the polar dual A∗ is the H-polytope, a convex polygon, cut out by the inequalities:
−x + y ≤0
x+y ≤0
−x − y ≤1
x−y ≤ 1.
This is also the V-polytope whose vertices are (0, 0), (−1/2, −1/2), (0, −1), (1/2, −1/2). By Proposition 5.6, A = A∗∗ is the H-polyhedron cut out by the inequalities
−(1/2)x − (1/2)y ≤ 1
−y ≤ 1
(1/2)x − (1/2)y ≤ 1,
Figure 5.5: The triangular cone of Example 5.3 written as both a V-polyhedron and an
H-polyhedron.
which are equivalent to

y ≥ −x − 2
y ≥ −1
y ≥ x − 2;

see Figure 5.6.
Example 5.5. Let A be the V-polyhedron in A2 defined by the sets Y = {(0, 1), (−1, 0), (1, 0)} and V = {(0, −1)}. By Proposition 5.14, the polar dual A∗ is the H-polytope, a square, cut out by the inequalities:
y≤1
−x ≤ 1
x≤1
−y ≤ 0.
Figure 5.6: The trough of Example 5.4 written as both a V-polyhedron and an H-polyhedron.

This is also the V-polytope (a square) whose vertices are (−1, 1), (−1, 0), (1, 0), (1, 1). By Proposition 5.6, A = A∗∗ is the H-polyhedron cut out by the inequalities
−x + y ≤ 1
−x ≤ 1
x≤1
x + y ≤ 1;

see Figure 5.7.
In all the previous examples, the step that is not automatic is to find the vertices of an H-polytope from the inequalities defining this H-polytope. For small examples in dimension 2 this is easy, but in general this is an expensive process. A method to do this algorithmically is Fourier–Motzkin elimination; see Proposition 5.21.

Here are now some examples illustrating Proposition 5.18.
Example 5.6. Let A be the H-polyhedron in A2 defined by the inequalities
−x − y ≤ 1
x − y ≤ 1.
Figure 5.7: The triangular peak of Example 5.5 written as both a V-polyhedron and an H-polyhedron.
This is the cone arising in Example 5.3. By Proposition 5.6, the polar dual A∗ is a V-polytope, a triangle: the convex hull of the points (0, 0), (−1, −1), and (1, −1). In Example 5.1, we computed the equations of the triangle (0, 1/2), (−1, −1/2), and (1, −1/2) obtained by translating the above triangle by (0, 1/2), namely
y ≤ x + 1/2
y ≤ −x + 1/2
−y ≤ 1/2,
so the triangle (0, 0), (−1, −1), and (1, −1) is also the H-polyhedron (triangle) defined by
the inequalities
−x + y ≤ 0
x+y ≤0
−y ≤ 1,
and by Proposition 5.15, A = A∗∗ is the V-polyhedron given by the set Y = {(0, −1)}
consisting of a single point and the set of vectors V = {(−1, 1), (1, 1)}; see Figure 5.8.
Figure 5.8: The triangular cone of Example 5.6 written as first an H-polyhedron and then a V-polyhedron.

Example 5.7. Let A be the H-polyhedron in A2 defined by the inequalities
−(1/2)x − (1/2)y ≤ 1
−y ≤ 1
(1/2)x − (1/2)y ≤ 1.
This is the H-polyhedron of Example 5.4. By Proposition 5.6, the polar dual A∗ is a V-polytope: the square (0, 0), (−1/2, −1/2), (0, −1), (1/2, −1/2). In Example 5.2, we computed
the equations of the square (−1/2, 0), (0, −1/2), (1/2, 0), (0, 1/2) obtained by translating the
above square by (0, 1/2), namely
y ≤ x + 1/2
y ≥ −x − 1/2
y ≥ x − 1/2
y ≤ −x + 1/2,
so the square (0, 0), (−1/2, −1/2), (0, −1), (1/2, −1/2) is also the H-polyhedron (a square) defined by the inequalities
−x + y ≤0
−x − y ≤1
x−y ≤1
x+y ≤ 0.
By Proposition 5.15, A = A∗∗ is the V-polyhedron given by the set Y = {(−1, −1), (1, −1)}
and the set of vectors V = {(−1, 1), (1, 1)}; see Figure 5.9.
Figure 5.9: The trough of Example 5.4 written as first an H-polyhedron and then a V-polyhedron.

Example 5.8. Let A be the H-polyhedron in A2 defined by the inequalities
−x − y ≤ 1
−y ≤ 0
x − y ≤ 1.
By Proposition 5.6, the polar dual A∗ is the V-polyhedron given by the set of points (0, 0), (−1, −1), and (1, −1) and the vector (0, −1). In Example 5.5, we computed the inequalities of the V-polyhedron given by Y = {(0, 1), (−1, 0), (1, 0)} and V = {(0, −1)}, obtained by translating the above V-polyhedron by (0, 1), namely
−x + y ≤ 1
−x ≤ 1
x≤1
x + y ≤ 1,
so the V-polyhedron given by the set of points (0, 0), (−1, −1), and (1, −1) and the vector
(0, −1) is defined by the inequalities
−x + y ≤ 0
−x ≤ 1
x≤1
x + y ≤ 0.
By Proposition 5.15, A = A∗∗ is the V-polyhedron given by the sets Y = {(−1, 0), (1, 0)}
and V = {(−1, 1), (1, 1)}; see Figure 5.10.
Even though we proved the main result of this section, it is instructive to consider a more computational proof making use of cones and an elimination method known as Fourier–Motzkin elimination.

5.5 Fourier–Motzkin Elimination and Cones

Given a system of inequalities of the form

Pi(x1, . . . , xd) ≤ bi,
Figure 5.10: The trough of Example 5.8 written as first an H-polyhedron and then a V-polyhedron.
where Pi(x1, . . . , xd) is a polynomial of total degree ni, this amounts to forming the new homogeneous inequalities

x_{d+1}^{ni} Pi(x1/x_{d+1}, . . . , xd/x_{d+1}) − bi x_{d+1}^{ni} ≤ 0
together with xd+1 ≥ 0. In particular, if the Pi ’s are linear forms (which means that ni = 1),
then our inequalities are of the form
ai · x ≤ b i ,
where ai ∈ Rd is some vector, and the homogenized inequalities are
ai · x − bi xd+1 ≤ 0.
If P ⊆ Ed is an H-polyhedron cut out by the hyperplanes

Hi = {x ∈ Ed | ai · x = bi},

then P is given by

P = ⋂_{i=1}^m {x ∈ Ed | ai · x ≤ bi}.
If A denotes the m × d matrix whose i-th row is ai and b is the vector b = (b1 , . . . , bm ), then
we can write
P = P (A, b) = {x ∈ Ed | Ax ≤ b}.
Thus, we see that C(P) is the H-cone given by the system of inequalities

( A   −b ) (    x   )   ( 0 )
( 0   −1 ) ( x_{d+1} ) ≤ ( 0 ),

and that

P̂ = C(P) ∩ H_{d+1}.
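Assembling the matrix of C(P) is mechanical. The sketch below (NumPy; the data is the cone of Example 5.6, used only as a test case) builds the block matrix above and checks that a point x of P lifts to the point (x, 1) of C(P):

import numpy as np

A = np.array([[-1.0, -1.0], [1.0, -1.0]])   # -x - y <= 1, x - y <= 1
b = np.array([1.0, 1.0])

# The homogenized system: (A  -b; 0  -1)(x, x_{d+1}) <= 0.
C = np.block([[A, -b.reshape(-1, 1)],
              [np.zeros((1, A.shape[1])), -np.ones((1, 1))]])

x = np.array([0.0, -1.0])                   # a point of P
assert np.all(C @ np.append(x, 1.0) <= 1e-12)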
Conversely, if Q is any H-cone in Ed+1 (in fact, any H-polyhedron), it is clear that
P = Q ∩ Hd+1 is an H-polyhedron in Hd+1 ≈ Ed .
Let us now assume that P ⊆ Ed is a V-polyhedron, P = conv(Y) + cone(V), where Y = {y1, . . . , yp} and V = {v1, . . . , vq}. Define Ŷ = {ŷ1, . . . , ŷp} ⊆ E^{d+1} and V̂ = {v̂1, . . . , v̂q} ⊆ E^{d+1} by

ŷi = (yi, 1),   v̂j = (vj, 0).
Then

C(P) = cone(Ŷ ∪ V̂)

is a V-cone in E^{d+1}, and P̂ = C(P) ∩ H_{d+1}, where H_{d+1} is the hyperplane of equation x_{d+1} = 1. Conversely, let C = cone({w1, . . . , ws}) be any V-cone in E^{d+1} whose generators satisfy wk,d+1 ≥ 0, let Ŷ be the set of vectors wi/wi,d+1 with wi,d+1 > 0, and let V̂ be the set of vectors wj with wj,d+1 = 0. We claim that

P = C ∩ H_{d+1} = conv(Ŷ) + cone(V̂),

and thus, P is a V-polyhedron.
Recall that any element z ∈ C can be written as

z = Σ_{k=1}^s µk wk,   µk ≥ 0.

Thus, we have

z = Σ_{k=1}^s µk wk,   µk ≥ 0
  = Σ_{wi,d+1 > 0} µi wi + Σ_{wj,d+1 = 0} µj wj,   µi, µj ≥ 0
  = Σ_{wi,d+1 > 0} wi,d+1 µi (wi / wi,d+1) + Σ_{wj,d+1 = 0} µj wj,   µi, µj ≥ 0
  = Σ_{wi,d+1 > 0} λi (wi / wi,d+1) + Σ_{wj,d+1 = 0} µj wj,   λi, µj ≥ 0.
Now, z ∈ C ∩ H_{d+1} iff z_{d+1} = 1 iff Σ_{i=1}^p λi = 1 (where p is the number of elements in Ŷ), since the (d + 1)-th coordinate of wi/wi,d+1 is equal to 1, and the above shows that z ∈ conv(Ŷ) + cone(V̂), as claimed.
Next, assume that C ⊆ Ed is an H-cone cut out by the hyperplanes

Hi = {x ∈ Ed | ui · x = 0},

so that C is given by

C = ⋂_{i=1}^m {x ∈ Ed | ui · x ≤ 0}.
If A is the m × d matrix whose i-th row is ui, we can write

C = P(A, 0) = {x ∈ Ed | Ax ≤ 0},

and then, lifting C to E^{d+m}, one considers the V-cone

C0(A) = cone({ ±(ei, Aei) | 1 ≤ i ≤ d } ∪ { (0, ej) | 1 ≤ j ≤ m }).
Proposition 5.21. Given any V-cone, C = cone(Y) ⊆ Rd, with Y = {y1, . . . , ys}, let

Y /k = {yi | yik = 0} ∪ {yik yj − yjk yi | yik > 0, yjk < 0},

the set of vectors obtained from Y by “eliminating the k-th coordinate.” Here, each yi is a vector in Rd. Then C ∩ Hk = cone(Y /k), where Hk is the hyperplane of equation xk = 0.

Proof. The only nontrivial direction is to prove that C ∩ Hk ⊆ cone(Y /k). For this, consider any v = Σ_i ti yi ∈ C ∩ Hk, with ti ≥ 0 and vk = 0. Such a v can be written
v = Σ_{i | yik = 0} ti yi + Σ_{i | yik > 0} ti yi + Σ_{j | yjk < 0} tj yj,

and as vk = 0, we have

Σ_{i | yik > 0} ti yik + Σ_{j | yjk < 0} tj yjk = 0.
Setting Λ = Σ_{i | yik > 0} ti yik = −Σ_{j | yjk < 0} tj yjk (if Λ = 0, then v is already a positive combination of the yi with yik = 0), we get

v = Σ_{i | yik = 0} ti yi + (1/Λ)(Σ_{j | yjk < 0} −tj yjk)(Σ_{i | yik > 0} ti yi) + (1/Λ)(Σ_{i | yik > 0} ti yik)(Σ_{j | yjk < 0} tj yj)
  = Σ_{i | yik = 0} ti yi + Σ_{i | yik > 0, j | yjk < 0} (ti tj / Λ)(yik yj − yjk yi).
Since the k th coordinate of yik yj − yjk yi is 0, the above shows that any v ∈ C ∩ Hk can be
written as a positive combination of vectors in Y /k .
Now, observe that P can be interpreted as the projection of the H-polyhedron P̃ ⊆ E^{d+p+q} given by

P̃ = {(x, u, t) ∈ R^{d+p+q} | x = Yu + Vt, u ≥ 0, Iu = 1, t ≥ 0}
onto Rd . Consequently, if we can prove that the projection of an H-polyhedron is also
an H-polyhedron, then we will have proved that every V-polyhedron is an H-polyhedron.
Here again, it is possible that P = Ed , but that’s fine since Ed has been declared to be an
H-polyhedron.
In view of Proposition 5.20 and the discussion that followed, it is enough to prove that the
projection of any H-cone is an H-cone. This can be done by using a type of Fourier–Motzkin
elimination dual to the method used in Proposition 5.21. We state the result without proof
and refer the interested reader to Ziegler [67], Section 1.2–1.3, for full details.
Proposition 5.23. If C = P(A, 0) ⊆ Ed is an H-cone, then the projection projk(C) onto the hyperplane Hk of equation yk = 0 is given by projk(C) = elimk(C) ∩ Hk, with

elimk(C) = {x ∈ Rd | (∃t ∈ R)(x + tek ∈ C)} = {z − tek | z ∈ C, t ∈ R} = P(A/k, 0),

where the rows of A/k are the rows ai of A with aik = 0, together with the vectors aik aj − ajk ai for all i, j such that aik > 0 and ajk < 0.
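One elimination step of Proposition 5.23 is straightforward to implement: keep the rows whose k-th coefficient is zero and add the combinations aik aj − ajk ai that cancel the k-th coordinate. A sketch (NumPy; no attempt is made to remove redundant rows, which is where practical implementations spend their effort):

import numpy as np

def fm_eliminate(A, k):
    # One Fourier-Motzkin step on the H-cone {x | A x <= 0}: return A/k.
    zero  = [a for a in A if a[k] == 0]
    plus  = [a for a in A if a[k] > 0]
    minus = [a for a in A if a[k] < 0]
    combined = [ai[k] * aj - aj[k] * ai for ai in plus for aj in minus]
    return np.array(zero + combined)

# Example: C = {(x, y) | -x - y <= 0, x - y <= 0}; eliminating x (k = 0)
# leaves the single constraint -2y <= 0, i.e. y >= 0, which is indeed
# the projection of C onto the y-axis.
A = np.array([[-1.0, -1.0], [1.0, -1.0]])
print(fm_eliminate(A, 0))   # [[ 0. -2.]]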
Proposition 5.24. Let P = ⋂_{i=1}^t Ci ⊆ Ed be an H-polyhedron of dimension d. If t = 1, then P is the closed half-space

P = {a + t1 u1 + · · · + t_{d−1} u_{d−1} + t_d u_d | t1, . . . , t_{d−1} ∈ R, t_d ≥ 0},

where a is any point in H1, the vectors u1, . . . , u_{d−1} form a basis of the direction of H1, and ud is normal to (the direction of) H1 and points into P. (When d = 1, P is the half-line, P = {a + tu1 | t ≥ 0}.)
If t ≥ 2, then every point a ∈ P can be written as a convex combination a = (1 − α)b + αc
(0 ≤ α ≤ 1), where b and c belong to two distinct facets F and G of P , and where
F = conv(YF ) + cone(VF ) and G = conv(YG ) + cone(VG ),
for some finite sets of points YF and YG and some finite sets of vectors VF and VG . (Note:
α = 0 or α = 1 is allowed.) Consequently, every H-polyhedron is a V-polyhedron.
Proof. We proceed by induction on the dimension d of P . If d = 1, then P is either a closed
interval [b, c], or a half-line {a + tu | t ≥ 0}, where u 6= 0. In either case, the proposition is
clear.
For the induction step, assume d > 1. Since every facet F of P has dimension d − 1, the
induction hypothesis holds for F , that is, there exist a finite set of points YF , and a finite
set of vectors VF , so that
F = conv(YF ) + cone(VF ).
Thus, every point on the boundary of P is of the desired form. Next, pick any point a in
the interior of P . Then, from our previous discussion, there is a line ` through a in general
position w.r.t. P . Consequently, the intersection points zi = ` ∩ Hi of the line ` with the
hyperplanes supporting the facets of P exist and are all distinct. If we give ` an orientation,
the zi ’s can be sorted. Since ` contains a which is in the interior of P , any point on ` to the
left of z1 must be outside P , and similarly any point of ` to the right of the rightmost zi ,
say zN , must be outside P , since otherwise there would be a hyperplane cutting ` before z1
or a hyperplane cutting ℓ after zN. Since P is closed and convex and ℓ is closed, P ∩ ℓ is a closed and convex subset of ℓ. But this subset is bounded since all points outside [z1, zN] are outside P. It follows that P ∩ ℓ is a closed interval [b, c] with b, c ∈ P, so there is a unique k such that a lies between b = zk and c = zk+1. But then, b ∈ Fk = F and c ∈ Fk+1 = G, where F and G are the facets of P supported by Hk and Hk+1, and a = (1 − α)b + αc, a convex
combination.
Consequently, every point in P is a mixed convex-plus-positive combination of finitely many points associated with the facets of P and finitely many vectors associated with the directions of the supporting hyperplanes of the facets of P. Conversely, it is easy to see that any such mixed combination must belong to P, and therefore, P is a V-polyhedron.
We conclude this section with a version of Farkas Lemma for polyhedral sets.
Lemma 5.25. (Farkas Lemma, Version IV) Let Y be any d × p matrix and V be any d × q
matrix. For every z ∈ Rd , exactly one of the following alternatives occurs:
(a) There exist u ∈ Rp and t ∈ Rq , with u ≥ 0, t ≥ 0, Iu = 1 and z = Y u + V t.
(b) There is some vector (α, c) ∈ Rd+1 such that c> yi ≥ α for all i with 1 ≤ i ≤ p, c> vj ≥ 0
for all j with 1 ≤ j ≤ q, and c> z < α.
Proof. We use Farkas Lemma, Version II (Lemma 4.16). Observe that (a) is equivalent to the problem: find (u, t) ∈ R^{p+q} so that

( u )   ( 0 )           ( I  O ) ( u )   ( 1 )
( t ) ≥ ( 0 )    and    ( Y  V ) ( t ) = ( z ),

which is exactly Farkas II(a). Now, the second alternative of Farkas II says that there is no solution as above if there is some (−α, c) ∈ R^{d+1} so that

(−α, cᵀ) ( 1 ) < 0    and    (−α, cᵀ) ( I  O ) ≥ (O, O).
         ( z )                        ( Y  V )

These are equivalent to

−α + cᵀz < 0,    −αI + cᵀY ≥ O,    cᵀV ≥ O,

namely, cᵀz < α, cᵀY ≥ αI and cᵀV ≥ O, which are indeed the conditions of Farkas IV(b), in matrix form.
Observe that Farkas IV can be viewed as a separation criterion for polyhedral sets. This
version subsumes Farkas I and Farkas II.
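Deciding which alternative of Farkas IV holds for a concrete z is itself a linear feasibility problem, so it can be delegated to an LP solver. A sketch using SciPy's linprog (the data is the trough of Example 5.4, and the function name is ours):

import numpy as np
from scipy.optimize import linprog

def farkas_iv_alternative_a(Y, V, z):
    # Test alternative (a): do u >= 0, t >= 0 exist with Iu = 1, Yu + Vt = z?
    d, p = Y.shape
    q = V.shape[1]
    A_eq = np.vstack([np.hstack([np.ones((1, p)), np.zeros((1, q))]),  # Iu = 1
                      np.hstack([Y, V])])                              # Yu + Vt = z
    b_eq = np.concatenate([[1.0], z])
    res = linprog(c=np.zeros(p + q), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (p + q))
    return res.status == 0   # status 0: a feasible point was found

Y = np.array([[-1.0, 1.0], [-1.0, -1.0]])   # columns: the points (-1,-1), (1,-1)
V = np.array([[-1.0, 1.0], [1.0, 1.0]])     # columns: the vectors (-1,1), (1,1)
print(farkas_iv_alternative_a(Y, V, np.array([0.0, 0.0])))    # True: alternative (a)
print(farkas_iv_alternative_a(Y, V, np.array([0.0, -5.0])))   # False: alternative (b)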
5.6 Lineality Space and Recession Cone

Definition 5.5. Given any convex subset, A ⊆ Rd, the lineality space of A is the set

lineal(A) = {u ∈ Rd | x + tu ∈ A for all x ∈ A and all t ∈ R},

and the recession cone of A is the set

rec(A) = {u ∈ Rd | x + tu ∈ A for all x ∈ A and all t ≥ 0}.

It is immediate from these definitions that lineal(A) is a linear subspace of Rd and that rec(A) is a convex cone in Rd containing 0.

If we pick a subspace U of Ed such that U and lineal(A) form a direct sum Ed = lineal(A) ⊕ U, for example, the orthogonal complement of lineal(A), we can decompose A as

A = lineal(A) + (U ∩ A),

with lineal(U ∩ A) = (0). A convex set A such that lineal(A) = (0) is said to be pointed.
If P is an H-polyhedron of the form P = P(A, b), then it is immediate by definition that

lineal(P) = Ker A = {x ∈ Rd | Ax = 0}.
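Numerically, lineal(P) is therefore a null-space computation. A one-line check (SciPy; the slab below is our own toy example):

import numpy as np
from scipy.linalg import null_space

# The slab {x in R^3 | 0 <= x_1 <= 1} as P(A, b) with b = (1, 0):
A = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
print(null_space(A))   # two orthonormal columns spanning lineal(P) = Ker A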
Regarding recession cones, we have the following proposition.
Proposition 5.26. (1) If P is an H-polyhedron of the form P = P(A, b), then the recession cone rec(P) is given by

rec(P) = P(A, 0).
(2) If P is a V-polyhedron of the form P = conv(Y ) + cone(V ), then the recession cone
rec(P ) is given by
rec(P ) = cone(V ).
Proof. The only part whose proof is nontrivial is the inclusion rec(P) ⊆ cone(V). We prove the contrapositive: if v ∉ cone(V), then v ∉ rec(P), using Farkas Lemma, Version IV.

By Farkas Lemma, Version IV (Lemma 5.25), with Y = ∅, if v ∉ cone(V), then there is some c ∈ Rd and some α ∈ R such that cᵀvj ≥ 0 for j = 1, . . . , q, 0 ≥ α, and cᵀv < α. This implies that −cᵀvj ≤ 0 for j = 1, . . . , q, and −cᵀv > 0. Let d = −c.
For any x ∈ P = conv(Y) + cone(V), we have

x = Σ_{i=1}^p λi yi + Σ_{j=1}^q µj vj,

with λi, µj ≥ 0 and Σ_{i=1}^p λi = 1. Since dᵀvj ≤ 0 for j = 1, . . . , q and 0 ≤ λi ≤ 1, we get

dᵀx = Σ_{i=1}^p λi dᵀyi + Σ_{j=1}^q µj dᵀvj ≤ Σ_{i=1}^p λi dᵀyi ≤ max_{1≤i≤p} dᵀyi = K,

and since dᵀv > 0, we see that dᵀ(x + tv) tends to +∞ when t tends to +∞, which implies that x + tv ∉ P, and which means that v ∉ rec(P).
Proposition 5.26 shows that if P is a V-polyhedron, then v ∈ rec(P) iff there is some x ∈ P such that x + tv ∈ P for all t ≥ 0, which is much cheaper to check than the condition of Definition 5.5, which requires checking that x + tv ∈ P for all t ≥ 0 and for all x ∈ P.
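Part (1) of Proposition 5.26 also gives an effective membership test for the recession cone of an H-polyhedron: simply check Av ≤ 0. A sketch (NumPy; the data is the trough of Example 5.4 in H-form):

import numpy as np

def in_recession_cone(A, v, tol=1e-9):
    # Proposition 5.26(1): v is in rec(P(A, b)) iff A v <= 0.
    return bool(np.all(A @ v <= tol))

# The trough y >= -x - 2, y >= -1, y >= x - 2 rewritten as A x <= b:
A = np.array([[-1.0, -1.0], [0.0, -1.0], [1.0, -1.0]])
print(in_recession_cone(A, np.array([1.0, 1.0])))   # True: a generator of cone(V)
print(in_recession_cone(A, np.array([1.0, 0.0])))   # False: eventually violates y >= x - 2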
Notes. The treatment of polytopes and polyhedra given in this chapter is based on the
following texts (in alphabetic order): Barvinok [4], Berger [8], Ewald [26], Grunbaum [35],
and Ziegler [67]. The terminology V-polyhedron, V-polytope, H-polyhedron, H-polytope, is
borrowed from Ziegler.
The proof of Theorem 5.7 (Weyl–Minkowski) using Krein and Milman's theorem and polar duality is taken from Berger [8] (Chapter 12, Proposition 12.1.5). Rather different proofs
of the fact that every V-polytope is an H-polytope are given in Grunbaum [35] (Chapter 3,
Theorem 3.1.1), and Ewald [26] (Chapter II, Theorems 1.4 and 1.5).
We believe that the proof of Proposition 5.17 showing that every V-polyhedron is an H-polyhedron using Proposition 5.14, Krein and Milman's theorem, and double dualization, is new. However, this proof is not that original in the sense that it uses the double dualization trick already found in Berger, and the crucial observation that the polar dual A∗ of a V-polyhedron A with respect to a center in the interior of A is a bounded H-polyhedron, that is, an H-polytope. We also believe that the proof of Proposition 5.18 showing that every H-polyhedron is a V-polyhedron is new (it uses a quadruple polar dualization!).

The equivalence of V-polyhedra and H-polyhedra is also treated (using different techniques) in some books on convex optimization, among which we recommend Bertsekas [9].
Except for Proposition 5.24, which we believe is new, the results of Section 5.5 on Fourier–
Motzkin elimination and the polyhedron-cone correspondence are taken from Ziegler [67].
Similarly, the material of Section 5.6 is taken from Ziegler [67].
Chapter 6
Linear Programs
for running the simplex algorithm and its variants. A particularly nice feature of the tableau
formalism is that the update of a tableau can be performed using elementary row operations
identical to the operations used during the reduction of a matrix to row reduced echelon
form (rref). What differs is the criterion for the choice of the pivot.
However, we do not discuss other methods such as the ellipsoid method or interior points
methods. For these more algorithmic issues, we refer the reader to standard texts on linear
programming. In our opinion, one of the clearest (and among the most concise!) is Matousek
and Gardner [41]; Chvatal [18] and Schrijver [51] are classics. Papadimitriou and Steiglitz
[45] offers a very crisp presentation in the broader context of combinatorial optimization,
and Bertsimas and Tsitsiklis [10] and Vanderbei [64] are very complete.
Linear programming has to do with maximizing a linear cost function c1 x1 + · · · + cn xn
with respect to m “linear” inequalities of the form
ai1 x1 + · · · + ain xn ≤ bi .
These constraints can be put together into an m × n matrix A = (aij ), and written more
concisely as
Ax ≤ b.
For technical reasons that will appear clearer later on, it is often preferable to add the
nonnegativity constraints xi ≥ 0 for i = 1, . . . , n. We write x ≥ 0. It is easy to show that
every linear program is equivalent to another one satisfying the constraints x ≥ 0, at the
expense of adding new variables that are also constrained to be nonnegative. Let P(A, b) be
the set of feasible solutions of our linear program given by
P(A, b) = {x ∈ Rn | Ax ≤ b, x ≥ 0}.

Two questions immediately arise:

(1) Is P(A, b) nonempty, that is, does our linear program have a chance to have a solution?
(2) Does the objective function c1 x1 + · · · + cn xn have a maximum value on P(A, b)?
The answer to both questions can be no. But if P(A, b) is nonempty and if the objective
function is bounded above (on P(A, b)), then it can be shown that the maximum of c1 x1 +
· · · + cn xn is achieved by some x ∈ P(A, b). Such a solution is called an optimal solution.
Perhaps surprisingly, this result is not so easy to prove (unless one has the simplex method at one's disposal). We will prove this result in full detail (see Proposition 6.1).
The reason why linear constraints are so important is that the domain of potential optimal
solutions P(A, b) is convex . In fact, P(A, b) is a convex polyhedron which is the intersection
of half-spaces cut out by affine hyperplanes. The objective function being linear is convex,
and this is also a crucial fact. Thus, we are led to study convex sets, in particular those that
arise from solutions of inequalities defined by affine forms, but also convex cones.
We give a brief introduction to these topics. As a reward, we provide several criteria for
testing whether a system of inequalities
Ax ≤ b, x ≥ 0
has a solution or not in terms of versions of the Farkas lemma (see Proposition 4.14 and
Proposition 8.4). Then we give a complete proof of the strong duality theorem for linear
programming (see Theorem 8.7). We also discuss the complementary slackness conditions
and show that they can be exploired to design an algorithm for solving a linear program
that uses both the primal problem and its dual. This algorithm known as the primal dual
algorithm, although not used much nowadays, has been the source of inspiration for a whole
class of approximation algorithms also known as primal dual algorithms.
We hope that these notes will be a motivation for learning more about linear program-
ming, convex optimization, but also convex geometry. The “bible” in convex optimization
is Boyd and Vandenberghe [14], and one of the best sources for convex geometry is Ziegler
[67].
If c ≠ 0, the affine form ϕ specified by (c, β) defines the affine hyperplane (for short, hyperplane) H(ϕ) given by

H(ϕ) = ϕ⁻¹(0) = {x ∈ Rn | cx + β = 0}.
Figure 6.1: Figure i. illustrates the hyperplane H(ϕ) for ϕ(x, y) = 2x + y + 3, while Figure
ii. illustrates the hyperplane H(ϕ) for ϕ(x, y, z) = x + y + z − 1.
with ϕ(x, y, z) = x + y + z − 1; this affine form defines the plane given by the equation
x + y + z = 1, which is the plane through the points (0, 0, 1), (0, 1, 0), and (1, 0, 0). Both of
these hyperplanes are illustrated in Figure 6.1.
For any two vectors x, y ∈ Rn with x = (x1, . . . , xn) and y = (y1, . . . , yn), we write x ≤ y iff xi ≤ yi for i = 1, . . . , n, and x ≥ y iff y ≤ x. In particular, x ≥ 0 iff xi ≥ 0 for i = 1, . . . , n.
Ax ≤ b.
Thus the linear program (P ) can also be stated as the linear program (P ):
maximize cx
subject to Ax ≤ b and x ≥ 0.
Example 6.1.
maximize x1 + x2
subject to
x 2 − x1 ≤ 1
x1 + 6x2 ≤ 15
4x1 − x2 ≤ 10
x1 ≥ 0, x2 ≥ 0,
Figure 6.2: The H-polyhedron associated with Example 6.1. The green point (3, 2) is the
unique optimal solution.
The boundary of this polyhedron is contained in the five lines of equations

x2 − x1 = 1
x1 + 6x2 = 15
4x1 − x2 = 10
x1 = 0
x2 = 0.
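Linear programs of this size are easily solved by off-the-shelf software. The sketch below feeds Example 6.1 to SciPy's linprog (which minimizes, so the objective is negated) and recovers the optimal solution (3, 2):

import numpy as np
from scipy.optimize import linprog

A = np.array([[-1.0, 1.0],    # x2 - x1 <= 1
              [1.0, 6.0],     # x1 + 6 x2 <= 15
              [4.0, -1.0]])   # 4 x1 - x2 <= 10
b = np.array([1.0, 15.0, 10.0])
res = linprog(c=[-1.0, -1.0], A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print(res.x)     # [3. 2.], the unique optimal solution
print(-res.fun)  # 5.0, the maximum of x1 + x2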
Definition 6.3. If P(A, b) = ∅, we say that the linear program (P ) has no feasible solution,
and otherwise any x ∈ P(A, b) is called a feasible solution of (P ).
The linear program shown in Example 6.2 obtained by reversing the direction of the
inequalities x2 − x1 ≤ 1 and 4x1 − x2 ≤ 10 in the linear program of Example 6.1 has no
feasible solution; see Figure 6.3.
Example 6.2.
maximize x1 + x2
subject to
x1 − x2 ≤ −1
x1 + 6x2 ≤ 15
x2 − 4x1 ≤ −10
x1 ≥ 0, x2 ≥ 0.
Figure 6.3: There is no H-polyhedron associated with Example 6.2 since the blue and purple
regions do not overlap.
Assume P(A, b) 6= ∅, so that the linear program (P ) has a feasible solution. In this case,
consider the image {cx ∈ R | x ∈ P(A, b)} of P(A, b) under the objective function x 7→ cx.
Definition 6.4. If the set {cx ∈ R | x ∈ P(A, b)} is unbounded above, then we say that the
linear program (P ) is unbounded .
The linear program shown in Example 6.3 obtained from the linear program of Example
6.1 by deleting the constraints 4x1 − x2 ≤ 10 and x1 + 6x2 ≤ 15 is unbounded.
Example 6.3.
maximize x1 + x2
subject to
x 2 − x1 ≤ 1
x1 ≥ 0, x2 ≥ 0.
Otherwise, we will prove shortly that if µ is the least upper bound of the set {cx ∈ R |
x ∈ P(A, b)}, then there is some p ∈ P(A, b) such that
cp = µ,
that is, the objective function x 7→ cx has a maximum value µ on P(A, b) which is achieved
by some p ∈ P(A, b).
Definition 6.5. If the set {cx ∈ R | x ∈ P(A, b)} is nonempty and bounded above, any
point p ∈ P(A, b) such that cp = max{cx ∈ R | x ∈ P(A, b)} is called an optimal solution
(or optimum) of (P ). Optimal solutions are often denoted by an upper ∗; for example, p∗ .
The linear program of Example 6.1 has a unique optimal solution (3, 2), but observe
that the linear program of Example 6.4, in which the objective function is (1/6)x1 + x2, has infinitely many optimal solutions; the maximum of the objective function is 15/6, which occurs along the points of the orange boundary line in Figure 6.2.
Example 6.4.
maximize (1/6)x1 + x2
subject to
x 2 − x1 ≤ 1
x1 + 6x2 ≤ 15
4x1 − x2 ≤ 10
x1 ≥ 0, x2 ≥ 0.
The proof that if the set {cx ∈ R | x ∈ P(A, b)} is nonempty and bounded above, then
there is an optimal solution p ∈ P(A, b), is not as trivial as it might seem. It relies on the
fact that a polyhedral cone is closed, a fact that was shown in Section 4.1.
We also use a trick that makes the proof simpler, which is that a linear program (P ) with
inequality constraints Ax ≤ b
maximize cx
subject to Ax ≤ b and x ≥ 0,
is equivalent to the linear program (P2) with equality constraints

maximize ĉx̂
subject to Âx̂ = b and x̂ ≥ 0,

where Â is the m × (n + m) matrix Â = (A I), ĉ is the linear form (c, 0m), and x̂ = (x, z), with x ∈ Rn and z ∈ Rm. Indeed, Âx̂ = b and x̂ ≥ 0 iff

Ax + z = b, x ≥ 0, z ≥ 0,

iff

Ax ≤ b, x ≥ 0,

and ĉx̂ = cx.
The variables z are called slack variables, and a linear program of the form (P2 ) is called
a linear program in standard form.
The result of converting the linear program of Example 6.4 to standard form is the
program shown in Example 6.5.
Example 6.5.
maximize (1/6)x1 + x2
subject to
x2 − x1 + z1 = 1
x1 + 6x2 + z2 = 15
4x1 − x2 + z3 = 10
x1 ≥ 0, x2 ≥ 0, z1 ≥ 0, z2 ≥ 0, z3 ≥ 0.
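The conversion to standard form is purely mechanical. A small helper (Python with NumPy, our own illustrative code) builds the matrix (A I) and pads the objective with zeros for the slack variables:

import numpy as np

def to_standard_form(A, b, c):
    # From: maximize c x s.t. A x <= b, x >= 0
    # To:   maximize c_hat x_hat s.t. A_hat x_hat = b, x_hat >= 0,
    # by adding one slack variable per inequality.
    m, n = A.shape
    A_hat = np.hstack([A, np.eye(m)])          # A_hat = (A I)
    c_hat = np.concatenate([c, np.zeros(m)])   # slacks do not enter the objective
    return A_hat, b, c_hat

# Example 6.4 converted to Example 6.5:
A = np.array([[-1.0, 1.0], [1.0, 6.0], [4.0, -1.0]])
b = np.array([1.0, 15.0, 10.0])
A_hat, b_hat, c_hat = to_standard_form(A, b, np.array([1.0 / 6.0, 1.0]))
print(A_hat)   # the 3 x 5 matrix whose last three columns are the slacks z1, z2, z3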
We can now prove that if a linear program has a feasible solution and is bounded, then
it has an optimal solution.
Proposition 6.1. Let (P2) be a linear program in standard form, with equality constraints Ax = b. If P(A, b) is nonempty, if the set {cx ∈ R | x ∈ P(A, b)} is bounded above, and if µ is its least upper bound, then there is some p ∈ P(A, b) such that
cp = µ,
that is, the objective function x 7→ cx has a maximum value µ on P(A, b) which is achieved
by some optimum solution p ∈ P(A, b).
Proof. Since µ = sup{cx ∈ R | x ∈ P(A, b)}, there is a sequence (x^(k))_{k≥0} of vectors x^(k) ∈ P(A, b) such that lim_{k→∞} cx^(k) = µ. In particular, if we write x^(k) = (x1^(k), . . . , xn^(k)), we have xj^(k) ≥ 0 for j = 1, . . . , n and for all k ≥ 0. Let Ã be the (m + 1) × n matrix

Ã = ( c )
    ( A ),
so that Ãx^(k) = (cx^(k), b), since by hypothesis x^(k) ∈ P(A, b), and the constraints are Ax = b and x ≥ 0. Since by hypothesis lim_{k→∞} cx^(k) = µ, the sequence (Ãx^(k))_{k≥0} converges to the vector (µ, b). Now, observe that each vector Ãx^(k) can be written as the positive combination

Ãx^(k) = Σ_{j=1}^n xj^(k) Ã^j

of the columns Ã^j of Ã, so each Ãx^(k) belongs to the polyhedral cone spanned by these columns. Since a polyhedral cone is closed, the limit (µ, b) belongs to this cone as well, so there is some p ∈ Rn with p ≥ 0 such that Ãp = (µ, b); that is, Ap = b and cp = µ, which shows that p ∈ P(A, b) is an optimal solution.
Definition 6.6. Given a linear program (P2) in standard form, where Ax = b and A is an m × n matrix of rank m, a feasible solution x is a basic feasible solution if there is some subset K ⊆ {1, . . . , n} of size m such that:

(1) the matrix AK, consisting of the columns of A indexed by K, is invertible (that is, the columns of AK are linearly independent);

(2) xj = 0 for all j ∉ K.

For example, the linear program

maximize x1 + x2
subject to x1 + x2 + x3 = 1 and x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, (∗)
has three basic feasible solutions; the basic feasible solution K = {1} corresponds to the
point (1, 0, 0); the basic feasible solution K = {2} corresponds to the point (0, 1, 0); the
basic feasible solution K = {3} corresponds to the point (0, 0, 1). Each of these points is a vertex of the slanted purple triangle illustrated in Figure 6.4. The vertices (1, 0, 0) and (0, 1, 0) optimize the objective function, with a value of 1.
Figure 6.4: The H-polytope associated with Linear Program (∗). The objective function (with x1 → x and x2 → y) is represented by vertical planes parallel to the purple plane x + y = 0.7, and reaches its maximal value when x + y = 1.
We now show that if the standard linear program (P2 ) as in Definition 6.6 has some
feasible solution and is bounded above, then some basic feasible solution is an optimal
solution. We follow Matousek and Gardner [41] (Chapter 4, Section 2, Theorem 4.2.3).
First we obtain a more convenient characterization of a basic feasible solution.
Proposition 6.2. Given any standard linear program (P2 ) where Ax = b and A is an m × n
matrix of rank m, for any feasible solution x, if J> = {j ∈ {1, . . . , n} | xj > 0}, then x is a
basic feasible solution iff the columns of the matrix AJ> are linearly independent.
Proof. If x is a basic feasible solution, then there is some subset K ⊆ {1, . . . , n} of size m such that the columns of AK are linearly independent and xj = 0 for all j ∉ K, so by definition J> ⊆ K, which implies that the columns of the matrix AJ> are linearly independent.
Conversely, assume that x is a feasible solution such that the columns of the matrix AJ>
are linearly independent. If |J> | = m, we are done since we can pick K = J> and then x
is a basic feasible solution. If |J> | < m, we can extend J> to an m-element subset K by
adding m − |J> | column indices so that the columns of AK are linearly independent, which
is possible since A has rank m.
Next we prove that if a linear program in standard form has any feasible solution x0 and is bounded above, then it has some basic feasible solution x̃ which is as good as x0, in the sense that cx̃ ≥ cx0.

Proposition 6.3. Let (P2) be any standard linear program with objective function cx, where Ax = b and A is an m × n matrix of rank m. If (P2) is bounded above and if x0 is some feasible solution of (P2), then there is some basic feasible solution x̃ such that cx̃ ≥ cx0.
Proof. Among the feasible solutions x such that cx ≥ cx0 (x0 is one of them), pick one with the maximum number of coordinates xj equal to 0, say x̃. Let

K = J> = {j ∈ {1, . . . , n} | x̃j > 0}.

If the columns of AK are linearly independent, then x̃ is a basic feasible solution by Proposition 6.2 and we are done. Otherwise, there is some nonzero vector v such that AK v = 0, and if w ∈ Rn is the vector obtained from v by inserting zeros in the positions outside of K, then

Aw = AK v = 0.
We will derive a contradiction by exhibiting a feasible solution x(t0) such that cx(t0) ≥ cx0 with more zero coordinates than x̃.

For this we claim that we may assume that w satisfies the following two conditions:

(1) cw ≥ 0;

(2) wj < 0 for some j ∈ K.

Since w ≠ 0 and wj = 0 for all j ∉ K, we may replace w by −w if necessary, so Condition (1) can always be achieved. If Condition (2) then fails, that is, wj ≥ 0 for all j ∈ K, there are two cases. If cw = 0, we can use −w instead, which satisfies both conditions (some coordinate of −w is negative since w ≠ 0). If cw > 0, then x(t) = x̃ + tw satisfies Ax(t) = b and x(t) ≥ 0 for all t ≥ 0, and

cx(t) = cx̃ + tcw.

Since cw > 0, as t > 0 goes to infinity the objective function cx(t) also tends to infinity, contradicting the fact that it is bounded above. Therefore, some w satisfying Conditions (1) and (2) above must exist.
We show that there is some t0 > 0 such that cx(t0) ≥ cx0 and x(t0) = x̃ + t0w is feasible, yet x(t0) has more zero coordinates than x̃, a contradiction.

Since x(t) = x̃ + tw, we have

x(t)i = x̃i + twi,

so if we let I = {i ∈ {1, . . . , n} | wi < 0} ⊆ K, which is nonempty since w satisfies Condition (2) above, and if we pick

t0 = min_{i∈I} (−x̃i/wi),

then t0 > 0, because wi < 0 for all i ∈ I, and by definition of K we have x̃i > 0 for all i ∈ K. By the definition of t0 and since x̃ ≥ 0, we have

x(t0)j = x̃j + t0wj ≥ 0 for all j ∈ K,

so x(t0) ≥ 0, and x(t0)i = 0 for some i ∈ I. Since Ax(t0) = b (for any t), x(t0) is a feasible solution. Since

cx(t0) = cx̃ + t0cw ≥ cx0 + t0cw ≥ cx0,

and x(t0)i = 0 for some i ∈ I, we see that x(t0) has more zero coordinates than x̃, a contradiction.
Theorem 6.4. Let (P2 ) be any standard linear program with objective function cx, where
Ax = b and A is an m × n matrix of rank m. If (P2 ) has some feasible solution and if it is
bounded above, then some basic feasible solution x
e is an optimal solution of (P2 ).
Proof. By Proposition 6.3, for any feasible solution x there is some basic feasible solution x̃ such that cx ≤ cx̃. But there are only finitely many basic feasible solutions, so one of them has to yield the maximum of the objective function.
Geometrically, basic feasible solutions are exactly the vertices of the polyhedron P(A, b), a notion that we now define.
The concept of a vertex is illustrated in Figure 6.5, while the concept of an edge is
illustrated in Figure 6.6.
Figure 6.5: The cube centered at the origin with diagonal through (−1, −1, −1) and (1, 1, 1)
has eight vertices. The vertex (1, 1, 1) is associated with the linear form x + y + z = 3.
Figure 6.6: The cube centered at the origin with diagonal through (−1, −1, −1) and (1, 1, 1) has twelve edges. The edge from (1, 1, −1) to (1, 1, 1) is associated with the linear form x + y = 2.
Given a subset K ⊆ {1, . . . , n} of size m with complement N = {1, . . . , n} − K, every x satisfies

Ax = AK xK + AN xN = b,

so when the nonbasic variables are zero (xN = 0), this reduces to

AK xK = b.

The content of Theorem 6.6 is that a point v ∈ P(A, b) is a vertex if and only if it is a basic feasible solution.
Proof. First, assume that v is a vertex of P(A, b), and let ϕ(x) = cx − µ be a linear form such that cy < µ for all y ∈ P(A, b) − {v} and cv = µ. This means that v is the unique point of P(A, b) for which the objective function x ↦ cx has the maximum value µ on P(A, b), so by Theorem 6.4, since this maximum is achieved by some basic feasible solution, by uniqueness v must be a basic feasible solution.
In theory, to find an optimal solution we try all (n choose m) possible m-element subsets K of {1, . . . , n} and solve for the corresponding unique solution xK of AK x = b. Then we check whether such a solution satisfies xK ≥ 0, compute cxK, and return some feasible xK for which the objective function is maximum. This is a totally impractical algorithm.
A practical algorithm is the simplex algorithm. Basically, the simplex algorithm tries to "climb" in the polyhedron P(A, b) from vertex to vertex along edges (using basic feasible
solutions), trying to maximize the objective function. We present the simplex algorithm in
the next chapter. The reader may also consult texts on linear programming. In particular,
we recommend Matousek and Gardner [41], Chvatal [18], Papadimitriou and Steiglitz [45],
Bertsimas and Tsitsiklis [10], Ciarlet [19], Schrijver [51], and Vanderbei [64].
Observe that Theorem 6.4 asserts that if a linear program (P ) in standard form (where
Ax = b and A is an m×n matrix of rank m) has some feasible solution and is bounded above,
then some basic feasible solution is an optimal solution. By Theorem 6.6, the polyhedron
P(A, b) must have some vertex.
But suppose we only know that P(A, b) is nonempty; that is, we don’t know that the
objective function cx is bounded above. Does P(A, b) have some vertex?
The answer to the above question is yes, and this is important because the simplex
algorithm needs an initial basic feasible solution to get started. Here we prove that if P(A, b)
is nonempty, then it must contain a vertex. This proof still doesn’t constructively yield a
vertex, but we will see in the next chapter that the simplex algorithm always finds a vertex
if there is one (provided that we use a pivot rule that prevents cycling).
Proof. The proof relies on a trick, which is to add slack variables xn+1 , . . . , xn+m and use the
new objective function −(xn+1 + · · · + xn+m ).
The definition of a basic feasible solution can be adapted to linear programs where the
constraints are of the form Ax ≤ b, x ≥ 0; see Matousek and Gardner [41] (Chapter 4,
Section 4, Definition 4.4.2).
The most general type of linear program allows constraints of the form ai x ≥ bi or
ai x = bi besides constraints of the form ai x ≤ bi . The variables xi may also take negative
values. It is always possible to convert such programs to the type considered in Definition
6.2. We proceed as follows.
Every constraint ai x ≥ bi is replaced by the constraint −ai x ≤ −bi . Every equality
constraint ai x = bi is replaced by the two constraints ai x ≤ bi and −ai x ≤ −bi .
If there are n variables xi , we create n new variables yi and n new variables zi and
replace every variable xi by yi − zi . We also add the 2n constraints yi ≥ 0 and zi ≥ 0. If the
constraints are given by the inequalities Ax ≤ b, we now have constraints given by

[A  −A] (y; z) = Ay − Az ≤ b,  y ≥ 0, z ≥ 0.
Remark: We also showed that we can replace the inequality constraints Ax ≤ b by equality constraints Ax = b, by adding slack variables constrained to be nonnegative.
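The whole conversion just described — flipping ≥ constraints, splitting equalities, and replacing free variables by differences yi − zi of nonnegative ones — is easy to automate. A hedged sketch (the helper name is ours):

import numpy as np

def to_inequality_form(A_le=None, b_le=None, A_ge=None, b_ge=None,
                       A_eq=None, b_eq=None):
    """Convert constraints a_i x <= b_i, a_i x >= b_i and a_i x = b_i, with
    x unrestricted in sign, into  A' (y; z) <= b', y >= 0, z >= 0, where
    x = y - z, exactly as in the discussion above."""
    rows, rhs = [], []
    if A_le is not None:
        rows.append(np.asarray(A_le, float)); rhs.append(np.asarray(b_le, float))
    if A_ge is not None:   # a_i x >= b_i  becomes  -a_i x <= -b_i
        rows.append(-np.asarray(A_ge, float)); rhs.append(-np.asarray(b_ge, float))
    if A_eq is not None:   # a_i x = b_i  becomes two opposite inequalities
        E, e = np.asarray(A_eq, float), np.asarray(b_eq, float)
        rows.extend([E, -E]); rhs.extend([e, -e])
    A = np.vstack(rows); b = np.concatenate(rhs)
    return np.hstack([A, -A]), b   # constraints on (y, z): A y - A z <= b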
Chapter 7

The Simplex Algorithm

Recall from Proposition 6.2 that for a feasible solution x we set J>(x) = {j ∈ {1, . . . , n} | xj > 0}, so for a basic feasible solution x associated with K, we have J>(x) ⊆ K. In fact, by Proposition 6.2, a feasible solution x is a basic feasible solution iff the columns of AJ>(x) are linearly independent.
If J> (x) had cardinality m for all basic feasible solutions x, then the simplex algorithm
would make progress at every step, in the sense that it would strictly increase the value of the
objective function. Unfortunately, it is possible that |J> (x)| < m for certain basic feasible
solutions, and in this case a step of the simplex algorithm may not increase the value of the
objective function. Worse, in rare cases, it is possible that the algorithm enters an infinite loop. This phenomenon, called cycling, can be detected, but in this case the algorithm fails to give a conclusive answer.
Fortunately, there are ways of preventing the simplex algorithm from cycling (for exam-
ple, Bland’s rule discussed later), although proving that these rules work correctly is quite
involved.
The potential “bad” behavior of a basic feasible solution is recorded in the following
definition.
Definition 7.1. Given a linear program (P ) in standard form where the constraints are
given by Ax = b and x ≥ 0, with A an m × n matrix of rank m, a basic feasible solution x
is degenerate if |J> (x)| < m, otherwise it is nondegenerate.
The origin 0n , if it is a basic feasible solution, is degenerate. For a less trivial example,
x = (0, 0, 0, 2) is a degenerate basic feasible solution of the following linear program in which
m = 2 and n = 4.
Example 7.1.
maximize x2
subject to
− x1 + x2 + x3 = 0
x1 + x4 = 2
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0.
The matrix A and the vector b are given by

A = [ −1  1  1  0 ]        b = [ 0 ]
    [  1  0  0  1 ],           [ 2 ],
and if x = (0, 0, 0, 2), then J> (x) = {4}. There are two ways of forming a set of two linearly
independent columns of A containing the fourth column.
Given a basic feasible solution x associated with a subset K of size m, since the columns
of the matrix AK are linearly independent, by abuse of language we call the columns of AK
a basis of x.
If u is a vertex of (P ), that is, a basic feasible solution of (P ) associated with a basis
K (of size m), in “normal mode,” the simplex algorithm tries to move along an edge from
the vertex u to an adjacent vertex v (with u, v ∈ P(A, b) ⊆ Rn ) corresponding to a basic
feasible solution whose basis is obtained by replacing one of the basic vectors Ak with k ∈ K
by another nonbasic vector Aj for some j ∈ / K, in such a way that the value of the objective
function is increased.
Let us demonstrate this process on an example.
Example 7.2. Let (P ) be the following linear program in standard form.
maximize x1 + x2
subject to
− x1 + x2 + x 3 = 1
x1 + x4 = 3
x2 + x5 = 2
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0, x5 ≥ 0.
Figure 7.1: The planar H-polyhedron associated with Example 7.2. The initial basic feasible solution is the origin. The simplex algorithm first moves along the horizontal orange line to the feasible solution at vertex u1. It then moves along the vertical red line to obtain the optimal feasible solution u2.
We start from the basic feasible solution u0 = (0, 0, 1, 3, 2), corresponding to the basis (A3, A4, A5). The vectors A1 and A2 are expressed in terms of this basis by

A1 = −A3 + A4
A2 = A3 + A5.

Since

1A3 + 3A4 + 2A5 = Au0 = b,

for any θ ∈ R, we have

b = 1A3 + 3A4 + 2A5 − θA1 + θA1
  = 1A3 + 3A4 + 2A5 − θ(−A3 + A4) + θA1
  = θA1 + (1 + θ)A3 + (3 − θ)A4 + 2A5,

and

b = 1A3 + 3A4 + 2A5 − θA2 + θA2
  = 1A3 + 3A4 + 2A5 − θ(A3 + A5) + θA2
  = θA2 + (1 − θ)A3 + 3A4 + (2 − θ)A5.

In the first case, the vector (θ, 0, 1 + θ, 3 − θ, 2) is a feasible solution iff 0 ≤ θ ≤ 3, and the new value of the objective function is θ.

In the second case, the vector (0, θ, 1 − θ, 3, 2 − θ) is a feasible solution iff 0 ≤ θ ≤ 1, and the new value of the objective function is also θ.
Consider the first case. It is natural to ask whether we can get another vertex and increase
the objective function by setting to zero one of the coordinates of (θ, 0, 1 + θ, 3 − θ, 2), in this case the fourth one, by picking θ = 3. This yields the feasible solution (3, 0, 4, 0, 2), which
corresponds to the basis (A1 , A3 , A5 ), and so is indeed a basic feasible solution, with an
improved value of the objective function equal to 3. Note that A4 left the basis (A3 , A4 , A5 )
and A1 entered the new basis (A1 , A3 , A5 ).
We can now express A2 and A4 in terms of the basis (A1, A3, A5), which is easy to do since we already have A1 and A2 in terms of (A3, A4, A5), and A1 and A4 are swapped. Such
a step is called a pivoting step. We obtain
A2 = A3 + A5
A4 = A1 + A3 .
Then we repeat the process with u1 = (3, 0, 4, 0, 2) and the basis (A1, A3, A5). We have

b = 3A1 + 4A3 + 2A5 − θA2 + θA2
  = 3A1 + 4A3 + 2A5 − θ(A3 + A5) + θA2
  = 3A1 + θA2 + (4 − θ)A3 + (2 − θ)A5,

and

b = 3A1 + 4A3 + 2A5 − θA4 + θA4
  = 3A1 + 4A3 + 2A5 − θ(A1 + A3) + θA4
  = (3 − θ)A1 + (4 − θ)A3 + θA4 + 2A5.
In the first case, the point (3, θ, 4 − θ, 0, 2 − θ) is a feasible solution iff 0 ≤ θ ≤ 2, and the
new value of the objective function is 3 + θ. In the second case, the point (3 − θ, 0, 4 − θ, θ, 2)
is a feasible solution iff 0 ≤ θ ≤ 3, and the new value of the objective function is 3 − θ. To
increase the objective function we must choose the first case and we pick θ = 2. Then, we
get the feasible solution u2 = (3, 2, 2, 0, 0), which corresponds to the basis (A1 , A2 , A3 ), and
thus is a basic feasible solution. The new value of the objective function is 5.
Next we express A4 and A5 in terms of the basis (A1 , A2 , A3 ). Again this is easy to do
since we just swapped A5 and A2 (a pivoting step), and we get
A5 = A2 − A3
A4 = A1 + A3 .
We repeat the process with u2 = (3, 2, 2, 0, 0) and the basis (A1 , A2 , A3 ). We have
b = 3A1 + 2A2 + 2A3 − θA4 + θA4
= 3A1 + 2A2 + 2A3 − θ(A1 + A3 ) + θA4
= (3 − θ)A1 + 2A2 + (2 − θ)A3 + θA4 ,
and
b = 3A1 + 2A2 + 2A3 − θA5 + θA5
= 3A1 + 2A2 + 2A3 − θ(A2 − A3 ) + θA5
= 3A1 + (2 − θ)A2 + (2 + θ)A3 + θA5 .
In the first case, the point (3 − θ, 2, 2 − θ, θ, 0) is a feasible solution iff 0 ≤ θ ≤ 2, and the
value of the objective function is 5 − θ. In the second case, the point (3, 2 − θ, 2 + θ, 0, θ) is
a feasible solution iff 0 ≤ θ ≤ 2, and the value of the objective function is also 5 − θ. Since
we must have θ ≥ 0 to have a feasible solution, there is no way to increase the objective
function. In this situation, it turns out that we have reached an optimal solution, in our
case u2 = (3, 2, 2, 0, 0), with the maximum of the objective function equal to 5.
We could also have applied the simplex algorithm to the vertex u0 = (0, 0, 1, 3, 2) and to the vector (0, θ, 1 − θ, 3, 2 − θ), which is a feasible solution iff 0 ≤ θ ≤ 1, with new value of the objective function θ. By picking θ = 1, we obtain the feasible solution (0, 1, 0, 3, 1), corresponding to the basis (A2, A4, A5), which is indeed a vertex. The new value of the objective function is 1. Then we express A1 and A3 in terms of the basis (A2, A4, A5), obtaining

A1 = A4 − A3
A3 = A2 − A5,
and repeat the process with (0, 1, 0, 3, 1) and the basis (A2 , A4 , A5 ). After three more steps
we will reach the optimal solution u2 = (3, 2, 2, 0, 0).
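As a sanity check (not part of the text), the standard-form data of Example 7.2 can be handed to an off-the-shelf solver; scipy's linprog minimizes, so we negate the objective:

import numpy as np
from scipy.optimize import linprog

# Example 7.2: maximize x1 + x2 subject to Ax = b, x >= 0.
A_eq = np.array([[-1, 1, 1, 0, 0],
                 [ 1, 0, 0, 1, 0],
                 [ 0, 1, 0, 0, 1]], dtype=float)
b_eq = np.array([1, 3, 2], dtype=float)
c = np.array([1, 1, 0, 0, 0], dtype=float)

res = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 5)
print(res.x)      # expected: (3, 2, 2, 0, 0), the vertex u2
print(-res.fun)   # expected: 5, the maximum found above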
Let us go back to the linear program of Example 7.1 with objective function x2 and where
the matrix A and the vector b are given by
A = [ −1  1  1  0 ]        b = [ 0 ]
    [  1  0  0  1 ],           [ 2 ].
Recall that u0 = (0, 0, 0, 2) is a degenerate basic feasible solution, and the objective function
has the value 0. See Figure 7.2 for a planar picture of the H-polyhedron associated with
Example 7.1.
Figure 7.2: The planar H-polyhedron associated with Example 7.1. The initial basic feasible
solution is the origin. The simplex algorithm moves along the slanted orange line to the apex
of the triangle.
Let us apply the simplex algorithm starting from u0 = (0, 0, 0, 2) with the basis (A3, A4). The vectors A1 and A2 are expressed in terms of this basis by

A1 = −A3 + A4
A2 = A3,

and we get

b = 2A4 − θA1 + θA1 = 2A4 − θ(−A3 + A4) + θA1 = θA1 + θA3 + (2 − θ)A4,

and

b = 2A4 − θA2 + θA2 = 2A4 − θA3 + θA2 = θA2 − θA3 + 2A4.
In the first case, the point (θ, 0, θ, 2 − θ) is a feasible solution iff 0 ≤ θ ≤ 2, and the value of
the objective function is 0, and in the second case the point (0, θ, −θ, 2) is a feasible solution
iff θ = 0, and the value of the objective function is θ. However, since we must have θ = 0 in
the second case, there is no way to increase the objective function either.
It turns out that in order to make the cases considered by the simplex algorithm as
mutually exclusive as possible, since in the second case the coefficient of θ in the value of
the objective function is nonzero, namely 1, we should choose the second case. We must
pick θ = 0, but we can swap the vectors A3 and A2 (because A2 is coming in and A3 has
the coefficient −θ, which is the reason why θ must be zero), and we obtain the basic feasible
solution u1 = (0, 0, 0, 2) with the new basis (A2 , A4 ). Note that this basic feasible solution
corresponds to the same vertex (0, 0, 0, 2) as before, but the basis has changed. The vectors
A1 and A3 can be expressed in terms of the basis (A2 , A4 ) as
A1 = −A2 + A4
A3 = A2 .
We now repeat the procedure with u1 = (0, 0, 0, 2) and the basis (A2, A4), and we get

b = 2A4 − θA1 + θA1 = 2A4 − θ(−A2 + A4) + θA1 = θA1 + θA2 + (2 − θ)A4,

and

b = 2A4 − θA3 + θA3 = 2A4 − θA2 + θA3 = −θA2 + θA3 + 2A4.
In the first case, the point (θ, θ, 0, 2−θ) is a feasible solution iff 0 ≤ θ ≤ 2 and the value of the
objective function is θ, and in the second case the point (0, −θ, θ, 2) is a feasible solution iff
θ = 0 and the value of the objective function is θ. In order to increase the objective function
we must choose the first case and pick θ = 2. We obtain the feasible solution u2 = (2, 2, 0, 0)
whose corresponding basis is (A1 , A2 ) and the value of the objective function is 2.
The vectors A3 and A4 are expressed in terms of the basis (A1 , A2 ) as
A3 = A2
A4 = A1 + A3 ,
and we repeat the procedure with u2 = (2, 2, 0, 0) and the basis (A1, A2). We get

b = 2A1 + 2A2 − θA3 + θA3 = 2A1 + 2A2 − θA2 + θA3 = 2A1 + (2 − θ)A2 + θA3,

and

b = 2A1 + 2A2 − θA4 + θA4 = 2A1 + 2A2 − θ(A1 + A3) + θA4 = (2 − θ)A1 + 2A2 − θA3 + θA4.

In the first case, the point (2, 2 − θ, θ, 0) is a feasible solution iff 0 ≤ θ ≤ 2 and the value of the objective function is 2 − θ, and in the second case, the point (2 − θ, 2, −θ, θ) is a feasible solution iff θ = 0 and the value of the objective function is 2. This time there is no way to improve the objective function, and we have reached an optimal solution u2 = (2, 2, 0, 0) with the maximum of the objective function equal to 2.
Let us now consider an example of an unbounded linear program.

Example 7.3. Consider the following linear program in standard form:

maximize x1
subject to
x1 − x2 + x3 = 1
− x1 + x2 + x4 = 2
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0.
Figure 7.3: The planar H-polyhedron associated with Example 7.3. The initial basic feasible solution is the origin. The simplex algorithm first moves along the horizontal indigo line to the basic feasible solution at vertex (1, 0). Any optimal feasible solution occurs by moving along the boundary line parameterized by the orange arrow θ(1, 1).
Starting from u0 = (0, 0, 1, 2) with the basis (A3, A4), the vectors A1 and A2 are given by

A1 = A3 − A4
A2 = −A3 + A4,

and we have

b = A3 + 2A4 − θA1 + θA1 = A3 + 2A4 − θ(A3 − A4) + θA1 = θA1 + (1 − θ)A3 + (2 + θ)A4,

and

b = A3 + 2A4 − θA2 + θA2 = A3 + 2A4 − θ(−A3 + A4) + θA2 = θA2 + (1 + θ)A3 + (2 − θ)A4.
In the first case, the point (θ, 0, 1 − θ, 2 + θ) is a feasible solution iff 0 ≤ θ ≤ 1 and the value
of the objective function is θ, and in the second case, the point (0, θ, 1 + θ, 2 − θ) is a feasible
solution iff 0 ≤ θ ≤ 2 and the value of the objective function is 0. In order to increase the
objective function we must choose the first case, and we pick θ = 1. We get the feasible
solution u1 = (1, 0, 0, 3) corresponding to the basis (A1 , A4 ), so it is a basic feasible solution,
and the value of the objective function is 1.
The vectors A2 and A3 are given in terms of the basis (A1, A4) by

A2 = −A1
A3 = A1 + A4,

and we have

b = A1 + 3A4 − θA2 + θA2 = A1 + 3A4 + θA1 + θA2 = (1 + θ)A1 + θA2 + 3A4,

and

b = A1 + 3A4 − θA3 + θA3 = A1 + 3A4 − θ(A1 + A4) + θA3 = (1 − θ)A1 + θA3 + (3 − θ)A4.

In the first case, the point (1 + θ, θ, 0, 3) is a feasible solution for all θ ≥ 0 and the value of the objective function is 1 + θ, and in the second case, the point (1 − θ, 0, θ, 3 − θ) is a feasible solution iff 0 ≤ θ ≤ 1 and the value of the objective function is 1 − θ. This time, we are in the situation where the points

(1 + θ, θ, 0, 3) = (1, 0, 0, 3) + θ(1, 1, 0, 0),  θ ≥ 0,

form an infinite ray in the set of feasible solutions, and the objective function 1 + θ is unbounded from above on this ray. This indicates that our linear program, although feasible, is unbounded.
Let u be a basic feasible solution with basis K. For every index j ∉ K, the column Aj can be expressed in terms of the basis columns as

Aj = ∑_{k∈K} γ_k^j Ak (∗)

for uniquely determined coefficients γ_k^j. We let γ_K^j ∈ Rm be the vector given by γ_K^j = (γ_k^j)_{k∈K}. Actually, since the vector γ_K^j depends on K, to be very precise we should denote its components by (γ_K^j)_k, but to simplify notation we usually write γ_k^j instead of (γ_K^j)_k (unless confusion arises). We will explain later how the coefficients γ_k^j can be computed efficiently.
Since u is a feasible solution we have u ≥ 0 and Au = b, that is,
∑_{k∈K} uk Ak = b. (∗∗)
For any j ∉ K and any θ ∈ R, we can add −θAj + θAj = 0 to the equation (∗∗) and then replace the occurrence of Aj in −θAj by the right-hand side of equation (∗) to obtain

b = ∑_{k∈K} uk Ak − θAj + θAj
  = ∑_{k∈K} uk Ak − θ ∑_{k∈K} γ_k^j Ak + θAj
  = ∑_{k∈K} (uk − θγ_k^j) Ak + θAj.
Consequently, the vector u(θ) appearing on the right-hand side of the above equation, given by

u(θ)i = ui − θγ_i^j   if i ∈ K
u(θ)i = θ             if i = j
u(θ)i = 0             if i ∉ K ∪ {j},

automatically satisfies the constraints Au(θ) = b, and this vector is a feasible solution iff θ ≥ 0 and uk − θγ_k^j ≥ 0 for all k ∈ K. If some γ_k^j > 0 and we let

θ^j = min{uk/γ_k^j | γ_k^j > 0, k ∈ K},

then we have a range of feasible solutions for 0 ≤ θ ≤ θ^j. The value of the objective function for u(θ) is

cu(θ) = ∑_{k∈K} ck(uk − θγ_k^j) + θcj = cu + θ(cj − ∑_{k∈K} γ_k^j ck).
and these equations imply that the subspaces spanned by the vectors (Ak)_{k∈K} and the vectors (Ak)_{k∈K+} are identical. However, K is a basis of dimension m, so this subspace has dimension m, and since K+ also has m elements, it must be a basis. Therefore, u+ = u(θ^j) is a basic feasible solution.
The above case is the most common one, but other situations may arise. In what follows,
we discuss all eventualities.
Case (A). We have cj − ∑_{k∈K} γ_k^j ck ≤ 0 for all j ∉ K. Then it turns out that u is an optimal solution. Otherwise, we are in Case (B).

Case (B). We have cj − ∑_{k∈K} γ_k^j ck > 0 for some j ∉ K (not necessarily unique). There are three subcases.
Case (B1). If for some j ∉ K as above we also have γ_k^j ≤ 0 for all k ∈ K, then, since uk ≥ 0 for all k ∈ K, this places no restriction on θ, and the objective function is unbounded above.
Case (B2). There is some index j+ ∉ K such that simultaneously

(1) c_{j+} − ∑_{k∈K} γ_k^{j+} ck > 0, which means that the objective function can potentially be increased;

(2) there is some k ∈ K such that γ_k^{j+} > 0, and for every k ∈ K, if γ_k^{j+} > 0 then uk > 0, which implies that θ^{j+} > 0.

If we pick θ = θ^{j+} where

θ^{j+} = min{uk/γ_k^{j+} | γ_k^{j+} > 0, k ∈ K} > 0,
and

B = B1 ∪ B2 ∪ B3 = {j ∉ K | cj − ∑_{k∈K} γ_k^j ck > 0}.
Furthermore, (A) and (B), (B1) and (B3), (B2) and (B3) are mutually exclusive, while (B1)
and (B2) are not.
If Case (B1) and Case (B2) arise simultaneously, we opt for Case (B1) which says that
the linear program (P ) is unbounded and terminate the algorithm.
Here are a few remarks about the method.
In Case (B2), which is the path followed by the algorithm most frequently, various choices have to be made for the index j+ ∉ K for which θ^{j+} > 0 (the new index in K+). Similarly, various choices have to be made for the index k− ∈ K leaving K, but such choices are typically less important.

Similarly in Case (B3), various choices have to be made for the new index j+ ∉ K going into K+. In Cases (B2) and (B3), criteria for making such choices are called pivot rules.
Case (B3) only arises when u is a degenerate vertex. But even if u is degenerate, Case
(B2) may arise if uk > 0 whenever γ_k^j > 0. It may also happen that u is nondegenerate but, as a result of Case (B2), the new vertex u+ is degenerate because at least two components u_{k1} − θ^{j+}γ_{k1}^{j+} and u_{k2} − θ^{j+}γ_{k2}^{j+} vanish for some distinct k1, k2 ∈ K.
Cases (A) and (B1) correspond to situations where the algorithm terminates, and Case
(B2) can only arise a finite number of times during execution of the simplex algorithm, since
the objective function is strictly increased from vertex to vertex and there are only finitely
many vertices. Therefore, if the simplex algorithm is started on any initial basic feasible
solution u0 , then one of three mutually exclusive situations may arise:
(1) There is a finite sequence of occurrences of Case (B2) and/or Case (B3) ending with an
occurrence of Case (A). Then the last vertex produced by the algorithm is an optimal
solution.
(2) There is a finite sequence of occurrences of Case (B2) and/or Case (B3) ending with
an occurrence of Case (B1). We conclude that the problem is unbounded, and thus
has no solution.
(3) There is a finite sequence of occurrences of Case (B2) and/or Case (B3), followed by an infinite sequence of Case (B3). If this occurs, the algorithm visits some basis twice. This is the phenomenon known as cycling. In this eventuality the algorithm fails to come to a conclusion.
There are examples for which cycling occurs, although this is rare in practice. Such an example is given in Chvatal [18]; see Chapter 3, pages 31–32, for an example with seven variables and three equations that cycles after six iterations under a certain pivot rule.
The third possibility can be avoided by the choice of a suitable pivot rule. Two of these
rules are Bland’s rule and the lexicographic rule; see Chvatal [18] (Chapter 3, pages 34-38).
Bland's rule says: choose the smallest of the eligible incoming indices j+ ∉ K, and similarly choose the smallest of the eligible outgoing indices k− ∈ K.
It can be proved that cycling cannot occur if Bland’s rule is chosen as the pivot rule. The
proof is very technical; see Chvatal [18] (Chapter 3, pages 37-38), Matousek and Gardner [41]
(Chapter 5, Theorem 5.8.1), and Papadimitriou and Steiglitz [45] (Section 2.7). Therefore,
assuming that some initial basic feasible solution is provided, and using a suitable pivot rule
(such as Bland’s rule), the simplex algorithm always terminates and either yields an optimal
solution or reports that the linear program is unbounded. Unfortunately, Bland's rule is one of the slowest pivot rules.
The choice of a pivot rule greatly affects the number of pivoting steps that the simplex algorithm goes through. It is not our intention here to explain the various pivot rules.
We simply mention the following rules, referring the reader to Matousek and Gardner [41]
(Chapter 5, Section 5.7) or to the texts cited in Section 6.1.
1. Largest coefficient.
2. Largest increase.
3. Steepest edge.
4. Bland’s Rule.
5. Random edge.
The steepest edge rule is one of the most popular. The idea is to maximize the ratio

c(u+ − u) / ‖u+ − u‖.
Given the linear program (P2)

maximize cx
subject to Ax = b and x ≥ 0,

where we may assume that b ≥ 0 (if necessary, multiply some of the equations by −1), Phase I runs the simplex algorithm on the auxiliary linear program (P̂)

maximize −(x_{n+1} + · · · + x_{n+m})
subject to Âx̂ = b and x̂ ≥ 0,

where Â and x̂ are given by

Â = [A  I_m],   x̂ = (x1, . . . , x_{n+m}).
Since we assumed that b ≥ 0, the vector x̂ = (0n, b) is a feasible solution of (P̂), in fact a basic feasible solution, since the matrix associated with the indices n + 1, . . . , n + m is the identity matrix Im. Furthermore, since xi ≥ 0 for all i, the objective function −(x_{n+1} + · · · + x_{n+m})
is bounded above by 0.
If we execute the simplex algorithm with a pivot rule that prevents cycling, starting with the basic feasible solution (0n, b), since the objective function is bounded above by 0, the simplex algorithm terminates with an optimal solution given by some basic feasible solution, say
(u∗ , w∗ ), with u∗ ∈ Rn and w∗ ∈ Rm .
As in the proof of Theorem 6.7, for every feasible solution u ∈ P(A, b), the vector (u, 0m) is an optimal solution of (P̂). Therefore, if w∗ ≠ 0, then P(A, b) = ∅, since otherwise for every feasible solution u ∈ P(A, b) the vector (u, 0m) would yield a value of the objective function −(x_{n+1} + · · · + x_{n+m}) equal to 0, but (u∗, w∗) yields a strictly negative value since w∗ ≠ 0.
Otherwise, w∗ = 0, and u∗ is a feasible solution of (P). Since (u∗, 0m) is a basic feasible solution of (P̂), the columns corresponding to nonzero components of u∗ are linearly independent. Some of the coordinates of u∗ could be equal to 0, but since A has rank m we can add columns of A to obtain a basis K∗ associated with u∗, and u∗ is indeed a basic feasible solution of (P).
Running the simplex algorithm on the linear program (P̂) to obtain an initial feasible solution (u0, K0) of the linear program (P2) is called Phase I of the simplex algorithm. Running the simplex algorithm on the linear program (P2) with some initial feasible solution (u0, K0) is called Phase II of the simplex algorithm. If a feasible solution of the linear program (P2) is readily available, then Phase I is skipped. Sometimes, at the end of Phase I, an optimal solution of (P2) is already obtained.
In summary, we proved the following fact worth recording.
Proposition 7.1. For any linear program (P2)

maximize cx
subject to Ax = b and x ≥ 0,

the simplex algorithm, with a pivot rule that prevents cycling, started on the basic feasible solution x̂ = (0n, b) of (P̂), terminates with an optimal solution (u∗, w∗).

(1) If w∗ ≠ 0, then P(A, b) = ∅; that is, the linear program (P2) has no feasible solution.

(2) If w∗ = 0, then P(A, b) ≠ ∅, and u∗ is a basic feasible solution of (P2) associated with some basis K.
Proposition 7.1 shows that determining whether the polyhedron P(A, b) defined by a
system of equations Ax = b and inequalities x ≥ 0 is nonempty is decidable. This decision
procedure uses a fail-safe version of the simplex algorithm (that prevents cycling), and the
proof that it always terminates and returns an answer is nontrivial.
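The Phase I construction is easy to set up in code. Here is a hedged sketch (the helper phase_one_data is our own naming) that builds the data of (P̂) together with the starting basic feasible solution (0n, b) and its basis:

import numpy as np

def phase_one_data(A, b):
    """Build the auxiliary program (P-hat) of Phase I:
       maximize -(x_{n+1} + ... + x_{n+m})
       subject to [A I_m] x_hat = b, x_hat >= 0,
    flipping the sign of any equation with b_i < 0 so that b >= 0."""
    A = np.asarray(A, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    neg = b < 0
    A[neg] *= -1; b[neg] *= -1               # arrange b >= 0
    m, n = A.shape
    A_hat = np.hstack([A, np.eye(m)])
    c_hat = np.concatenate([np.zeros(n), -np.ones(m)])
    x0 = np.concatenate([np.zeros(n), b])    # basic feasible start (0_n, b)
    K0 = list(range(n, n + m))               # the slack columns form the basis
    return A_hat, b, c_hat, x0, K0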
so the vector γ_K^j is given by γ_K^j = A_K^{−1} Aj, that is, by solving the system

AK γ_K^j = Aj. (∗γ)

To be very precise, since the vector γ_K^j depends on K, its components should be denoted by (γ_K^j)_{ki}, but as we said before, to simplify notation we write γ_{ki}^j instead of (γ_K^j)_{ki}.
In order to decide which case applies ((A), (B1), (B2), (B3)), we need to compute the numbers cj − ∑_{k∈K} γ_k^j ck for all j ∉ K. For this, observe that

cj − ∑_{k∈K} γ_k^j ck = cj − cK γ_K^j = cj − cK A_K^{−1} Aj.

If we write βK = cK A_K^{−1}, then

cj − ∑_{k∈K} γ_k^j ck = cj − βK Aj.
Remark: Observe that since u is a basic feasible solution of (P), we have uj = 0 for all j ∉ K, so uK is the solution of the equation AK uK = b. As a consequence, the value of the objective function for u is cu = cK uK = cK A_K^{−1} b. This fact will play a crucial role in Section
8.2 to show that when the simplex algorithm terminates with an optimal solution of the
linear program (P ), then it also produces an optimal solution of the dual linear program
(D).
Assume that we have a basic feasible solution u, a basis K for u, and that we also have the matrix AK, as well as its inverse A_K^{−1} (perhaps implicitly), and also the inverse (A_K^⊤)^{−1} of A_K^⊤ (perhaps implicitly). Here is a description of an iteration step of the simplex algorithm, following almost exactly Chvatal (Chvatal [18], Chapter 7, Box 7.1).
An Iteration Step of the (Revised) Simplex Method
Step 1. Compute the numbers cj − ∑_{k∈K} γ_k^j ck = cj − βK Aj for all j ∉ K, and for this, compute β_K^⊤ as the solution of the system

A_K^⊤ β_K^⊤ = c_K^⊤.
If cj − βK Aj ≤ 0 for all j ∉ K, stop and return the optimal solution u (Case (A)).

Step 2. If Case (B) arises, use a pivot rule to determine which index j+ ∉ K should enter the new basis K+ (the condition c_{j+} − βK A^{j+} > 0 should hold).
Step 3. Compute max_{k∈K} γ_k^{j+}. For this, solve the linear system

AK γ_K^{j+} = A^{j+}.
Step 4. If max_{k∈K} γ_k^{j+} ≤ 0, then stop and report that the linear program (P) is unbounded (Case (B1)).
Step 5. If max_{k∈K} γ_k^{j+} > 0, use the ratios uk/γ_k^{j+} for all k ∈ K such that γ_k^{j+} > 0 to compute θ^{j+}, and use a pivot rule to determine which index k− ∈ K with θ^{j+} = u_{k−}/γ_{k−}^{j+} should leave the basis, so that the new basis is K+ = (K − {k−}) ∪ {j+}.
Let γ = γ_K^{j+}, let ℓ be the position of k− in K, and let E(γ) be the m × m matrix obtained from the identity matrix Im by replacing its ℓth column by the vector γ. Since γℓ = γ_{k−}^{j+} > 0, the matrix E(γ) is invertible, and it is easy to check that its inverse E(γ)^{−1} is the matrix obtained from Im by replacing its ℓth column by the vector

(−γℓ^{−1}γ1, . . . , −γℓ^{−1}γ_{ℓ−1}, γℓ^{−1}, −γℓ^{−1}γ_{ℓ+1}, . . . , −γℓ^{−1}γm).
It follows that

A_{K+}^{−1} = E(γ)^{−1} A_K^{−1}

and

(A_{K+}^⊤)^{−1} = (A_K^⊤)^{−1} (E(γ)^{−1})^⊤,
Starting with a basis matrix equal to the identity (as in Phase I), after s pivoting steps the current basis matrix can be written as a product

A_{Ks} = E1 E2 · · · Es

of s eta matrices. Such a factorization is called an eta factorization. The eta factorization can be used either to invert A_{Ks} or to solve a system of the form A_{Ks} γ = A^{j+} iteratively. Which method is more efficient depends on the sparsity of the Ei.
In summary, there are cheap methods for finding the next basic feasible solution (u+ , K + )
from (u, K). We simply wanted to give the reader a flavor of these techniques. We refer the
reader to texts on linear programming for detailed presentations of methods for implementing
efficiently the simplex method. In particular, the revised simplex method is presented in
Chvatal [18], Papadimitriou and Steiglitz [45], Bertsimas and Tsitsiklis [10], and Vanderbei
[64].
Since the quantities cj − cK γ_K^j play a crucial role in determining which column Aj should come into the basis, the notation c̄j is used to denote cj − cK γ_K^j, which is called the reduced cost of the variable xj. The reduced costs actually depend on K, so to be very precise we should denote them by (c̄K)j, but to simplify notation we write c̄j instead of (c̄K)j. We will see shortly how (c̄_{K+})_i is computed in terms of (c̄K)_i.
Observe that the data needed to execute the next step of the simplex algorithm are:

(1) the current basic solution uK and its basis K = (k1, . . . , km);

(2) the reduced costs c̄j = cj − cK A_K^{−1} Aj = cj − cK γ_K^j, for all j ∉ K;

(3) the vectors γ_K^j = (γ_{ki}^j)_{i=1}^m for all j ∉ K, which allow us to express each Aj as AK γ_K^j.
All this information can be packed into an (m + 1) × (n + 1) matrix called a (full) tableau, organized as follows:

cK uK    c̄1    · · ·  c̄j    · · ·  c̄n
u_{k1}   γ_1^1  · · ·  γ_1^j  · · ·  γ_1^n
  ...     ...           ...           ...
u_{km}   γ_m^1  · · ·  γ_m^j  · · ·  γ_m^n
It is convenient to think of the first row as row 0, and of the first column as column 0. Row 0 contains the current value of the objective function and the reduced costs; column 0, except for its top entry, contains the components of the current basic solution uK; and the remaining columns, except for their top entry, contain the vectors γ_K^j. Observe that the γ_K^j corresponding to indices j in K constitute a permutation of the identity matrix Im. The entry γ_{k−}^{j+} is called the pivot element. A tableau together with the new basis K+ = (K − {k−}) ∪ {j+} contains all the data needed to compute the new uK+, the new γ_{K+}^j, and the new reduced costs (c̄_{K+})_j.
If we define the m × n matrix Γ as the matrix Γ = [γ_K^1 · · · γ_K^n] whose jth column is γ_K^j, and c̄ as the row vector c̄ = (c̄1 · · · c̄n), then the above tableau is denoted concisely by

cK uK   c̄
uK      Γ
We now show that the update of a tableau can be performed using elementary row
operations identical to the operations used during the reduction of a matrix to row reduced
echelon form (rref).
If K = (k1, . . . , km), j+ is the index of the incoming basis vector, k− = kℓ is the index of the column leaving the basis, and K+ = (k1, . . . , k_{ℓ−1}, j+, k_{ℓ+1}, . . . , km), then since A_{K+} = AK E(γ_K^{j+}), the new columns γ_{K+}^j are computed in terms of the old columns γ_K^j using the equations

γ_{K+}^j = A_{K+}^{−1} Aj = E(γ_K^{j+})^{−1} A_K^{−1} Aj = E(γ_K^{j+})^{−1} γ_K^j.
But E(γ_K^{j+})^{−1} is the matrix described in Section 7.3, with the column involving the γs in the ℓth column, and this matrix is the product of the following elementary row operations:

1. Multiply row ℓ by 1/γ_{k−}^{j+} (the inverse of the pivot) to make the entry on row ℓ and column j+ equal to 1.

2. Subtract γ_{ki}^{j+} × (the normalized) row ℓ from row i, for i = 1, . . . , ℓ − 1, ℓ + 1, . . . , m.

These are exactly the elementary row operations that reduce the column γ_K^{j+} of Γ to the ℓth column of the identity matrix Im. Thus, this step is identical to the sequence of steps that the procedure to convert a matrix to row reduced echelon form executes on the ℓth column of the matrix. The only difference is the criterion for the choice of the pivot.
Since the new basic solution uK+ is given by uK+ = A_{K+}^{−1} b, we have

uK+ = E(γ_K^{j+})^{−1} A_K^{−1} b = E(γ_K^{j+})^{−1} uK.

This means that uK+ is obtained from uK by applying exactly the same elementary row operations that were applied to Γ. Consequently, just as in the procedure for reducing a matrix to rref, we can apply elementary row operations to the matrix [uK Γ], which consists of rows 1, . . . , m of the tableau.
Once the new matrix Γ+ is obtained, the new reduced costs are given by the following proposition.

Proposition 7.2. Given any linear program (P2)

maximize cx
subject to Ax = b and x ≥ 0,

where A is an m × n matrix of rank m, if (u, K) is a basic feasible solution and a pivoting step replaces k− ∈ K by j+ ∉ K, so that K+ = (K − {k−}) ∪ {j+}, then for every i = 1, . . . , n we have

ci − cK+ γ_{K+}^i = ci − cK γ_K^i − (γ_{k−}^i / γ_{k−}^{j+}) (c_{j+} − cK γ_K^{j+}),

that is,

(c̄_{K+})_i = (c̄K)_i − (γ_{k−}^i / γ_{k−}^{j+}) (c̄K)_{j+}.
Proof. Without any loss of generality and to simplify notation, assume that K = (1, . . . , m) and write j for j+ and ℓ for k−. Since γ_K^i = A_K^{−1} Ai, γ_{K+}^i = A_{K+}^{−1} Ai, and A_{K+} = AK E(γ_K^j), we have

ci − cK+ γ_{K+}^i = ci − cK+ A_{K+}^{−1} Ai = ci − cK+ E(γ_K^j)^{−1} A_K^{−1} Ai = ci − cK+ E(γ_K^j)^{−1} γ_K^i,
where E(γ_K^j)^{−1} is the matrix given earlier, whose ℓth column contains the γs. Since cK+ = (c1, . . . , c_{ℓ−1}, cj, c_{ℓ+1}, . . . , cm), we have

cK+ E(γ_K^j)^{−1} = (c1, . . . , c_{ℓ−1}, cj/γ_ℓ^j − ∑_{k=1,k≠ℓ}^m ck γ_k^j/γ_ℓ^j, c_{ℓ+1}, . . . , cm),
and, multiplying by the column vector γ_K^i = (γ_1^i, . . . , γ_m^i),

cK+ E(γ_K^j)^{−1} γ_K^i = ∑_{k=1,k≠ℓ}^m ck γ_k^i + (γ_ℓ^i/γ_ℓ^j)(cj − ∑_{k=1,k≠ℓ}^m ck γ_k^j)
  = ∑_{k=1,k≠ℓ}^m ck γ_k^i + (γ_ℓ^i/γ_ℓ^j)(cj + cℓ γ_ℓ^j − ∑_{k=1}^m ck γ_k^j)
  = ∑_{k=1}^m ck γ_k^i + (γ_ℓ^i/γ_ℓ^j)(cj − ∑_{k=1}^m ck γ_k^j)
  = cK γ_K^i + (γ_ℓ^i/γ_ℓ^j)(cj − cK γ_K^j),

and thus

ci − cK+ γ_{K+}^i = ci − cK+ E(γ_K^j)^{−1} γ_K^i = ci − cK γ_K^i − (γ_ℓ^i/γ_ℓ^j)(cj − cK γ_K^j),

as claimed.
Since (γ_{k−}^1, . . . , γ_{k−}^n) is the ℓth row of Γ, Proposition 7.2 shows that

c̄_{K+} = c̄K − ((c̄K)_{j+}/γ_{k−}^{j+}) Γℓ, (†)

where Γℓ denotes the ℓth row of Γ and γ_{k−}^{j+} is the pivot. This means that c̄_{K+} is obtained by the elementary row operations which consist of first normalizing the ℓth row by dividing it by the pivot γ_{k−}^{j+}, and then subtracting (c̄K)_{j+} × the normalized row ℓ from c̄K. These are exactly the row operations that make the reduced cost (c̄K)_{j+} zero.
Equivalently, the new reduced costs are given by c̄_{K+} = c − cK+ Γ+.
We saw in Section 7.2 that the change in the objective function after a pivoting step during which column j+ comes in and column k− leaves is given by

θ^{j+} (c_{j+} − ∑_{k∈K} γ_k^{j+} ck) = θ^{j+} (c̄K)_{j+},

where

θ^{j+} = u_{k−}/γ_{k−}^{j+}.

If we denote the value of the objective function cK uK by zK, then we see that

zK+ = zK + ((c̄K)_{j+}/γ_{k−}^{j+}) u_{k−}.

This means that the new value zK+ of the objective function is obtained by first normalizing the ℓth row by dividing it by the pivot γ_{k−}^{j+}, and then adding (c̄K)_{j+} × the zeroth entry of the normalized ℓth line to the zeroth entry of line 0.
In updating the reduced costs, we subtract rather than add (c̄K)_{j+} × the normalized row ℓ from row 0. This suggests storing −zK as the zeroth entry on line 0 rather than zK, because then all the entries of row 0 are updated by the same elementary row operations. Therefore, from now on, we use a tableau of the form

 −zK     c̄1    · · ·  c̄j    · · ·  c̄n
u_{k1}   γ_1^1  · · ·  γ_1^j  · · ·  γ_1^n
  ...     ...           ...           ...
u_{km}   γ_m^1  · · ·  γ_m^j  · · ·  γ_m^n
The simplex algorithm first chooses the incoming column j+ by picking some column for which c̄j > 0, and then chooses the outgoing column k− by considering the ratios uk/γ_k^{j+} for which γ_k^{j+} > 0 (along column j+), and picking k− to achieve the minimum of these ratios.
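Before working through an example by hand, here is a compact sketch of the full-tableau method just described. This is our own illustrative code, not from the text: it uses the largest-coefficient entering rule, a smallest-ratio leaving rule, and no anti-cycling safeguard.

import numpy as np

def tableau_simplex(A, b, c, K):
    """Tableau simplex for  maximize cx s.t. Ax = b, x >= 0, started from
    a basis K (a list of m column indices) whose basic solution is feasible.
    Row 0 stores -z_K and the reduced costs, rows 1..m store [u_K  Gamma]."""
    A = np.asarray(A, float); b = np.asarray(b, float); c = np.asarray(c, float)
    m, n = A.shape
    K = list(K)
    AK_inv = np.linalg.inv(A[:, K])
    Gamma = AK_inv @ A
    u = AK_inv @ b
    T = np.zeros((m + 1, n + 1))
    T[0, 0] = -c[K] @ u; T[0, 1:] = c - c[K] @ Gamma
    T[1:, 0] = u;        T[1:, 1:] = Gamma
    while True:
        if np.all(T[0, 1:] <= 1e-9):              # Case (A): optimal
            x = np.zeros(n); x[K] = T[1:, 0]
            return x, -T[0, 0]
        j = int(np.argmax(T[0, 1:]))              # entering index j+
        col = T[1:, 1 + j]
        if np.all(col <= 1e-9):                   # Case (B1): unbounded
            return None, np.inf
        ratios = np.where(col > 1e-9,
                          T[1:, 0] / np.where(col > 1e-9, col, 1), np.inf)
        ell = int(np.argmin(ratios))              # pivot row (leaving index)
        T[ell + 1] /= T[ell + 1, 1 + j]           # normalize the pivot row
        for i in range(m + 1):                    # clear column j+ elsewhere,
            if i != ell + 1:                      # including row 0
                T[i] -= T[i, 1 + j] * T[ell + 1]
        K[ell] = j

# On Example 7.2 (basis (A3, A4, A5), i.e. indices [2, 3, 4]):
# tableau_simplex([[-1,1,1,0,0],[1,0,0,1,0],[0,1,0,0,1]],
#                 [1,3,2], [1,1,0,0,0], [2,3,4])
# returns ((3, 2, 2, 0, 0), 5.0), matching the hand computation.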
Here is an illustration of the simplex algorithm using elementary row operations on an
example from Papadimitriou and Steiglitz [45] (Section 2.9).
Example 7.4. Consider the linear program
maximize − 2x2 − x4 − 5x7
subject to
x1 + x2 + x3 + x4 = 4
x1 + x5 = 2
x3 + x6 = 3
3x2 + x3 + x7 = 6
x1 , x2 , x3 , x4 , x5 , x6 , x7 ≥ 0.
We have the basic feasible solution u = (0, 0, 0, 4, 2, 3, 6), with K = (4, 5, 6, 7). Since cK = (−1, 0, 0, −5) and c = (0, −2, 0, −1, 0, 0, −5), the first tableau is
34 1 14 6 0 0 0 0
u4 = 4 1 1 1 1 0 0 0
u5 = 2 1 0 0 0 1 0 0
u6 = 3 0 0 1 0 0 1 0
u7 = 6 0 3 1 0 0 0 1
Row 0 is obtained by subtracting −1× (row 1) and −5× (row 4) from c = (0, −2, 0, −1, 0,
0, −5). Let us pick column j + = 1 as the incoming column. We have the ratios (for positive
entries on column 1)
4/1, 2/1,
and since the minimum is 2, we pick the outgoing column to be column k − = 5. The pivot
1 is indicated in red. The new basis is K = (4, 1, 6, 7). Next we apply row operations to
reduce column 1 to the second vector of the identity matrix I4 . For this, we subtract row 2
from row 1. We get the tableau
34 1 14 6 0 0 0 0
u4 = 2 0 1 1 1 −1 0 0
u1 = 2 1 0 0 0 1 0 0
u6 = 3 0 0 1 0 0 1 0
u7 = 6 0 3 1 0 0 0 1
To compute the new reduced costs, we want to set c1 to 0 so we subtract row 2 from row
0, and we get the tableau
32 0 14 6 0 −1 0 0
u4 = 2 0 1 1 1 −1 0 0
u1 = 2 1 0 0 0 1 0 0
u6 = 3 0 0 1 0 0 1 0
u7 = 6 0 3 1 0 0 0 1
Next, pick column j + = 3 as the incoming column. We have the ratios (for positive
entries on column 3)
2/1, 3/1, 6/1,
and since the minimum is 2, we pick the outgoing column to be column k − = 4. The pivot
1 is indicated in red and the new basis is K = (3, 1, 6, 7). Next we apply row operations to
reduce column 3 to the first vector of the identity matrix I4 . For this, we subtract row 1
from row 3 and from row 4, to obtain the tableau:
32 0 14 6 0 −1 0 0
u3 = 2 0 1 1 1 −1 0 0
u1 = 2 1 0 0 0 1 0 0
u6 = 1 0 −1 0 −1 1 1 0
u7 = 4 0 2 0 −1 1 0 1
To compute the new reduced costs, we want to set c3 to 0 so we subtract 6× row 1 from
row 0, and we get the tableau
20 0 8 0 −6 5 0 0
u3 = 2 0 1 1 1 −1 0 0
u1 = 2 1 0 0 0 1 0 0
u6 = 1 0 −1 0 −1 1 1 0
u7 = 4 0 2 0 −1 1 0 1
Next we pick j + = 2 as the incoming column. We have the ratios (for positive entries on
column 2)
2/1, 4/2,
and since the minimum is 2, we pick the outgoing column to be column k − = 3. The pivot
1 is indicated in red and the new basis is K = (2, 1, 6, 7). Next we apply row operations to
reduce column 2 to the first vector of the identity matrix I4 . For this, we add row 1 to row
3 and subtract 2× row 1 from row 4 to obtain the tableau:
20 0 8 0 −6 5 0 0
u2 = 2 0 1 1 1 −1 0 0
u1 = 2 1 0 0 0 1 0 0
u6 = 3 0 0 1 0 0 1 0
u7 = 0 0 0 −2 −3 3 0 1
To compute the new reduced costs, we want to set c2 to 0 so we subtract 8× row 1 from
row 0 and we get the tableau
4 0 0 −8 −14 13 0 0
u2 =2 0 1 1 1 −1 0 0
u1 =2 1 0 0 0 1 0 0
u6 =3 0 0 1 0 0 1 0
u7 =0 0 0 −2 −3 3 0 1
The only possible incoming column corresponds to j + = 5. We have the ratios (for
positive entries on column 5)
2/1, 0/3,
and since the minimum is 0, we pick the outgoing column to be column k − = 7. The pivot
3 is indicated in red and the new basis is K = (2, 1, 6, 5). Since the minimum is 0, the basis
K = (2, 1, 6, 5) is degenerate (indeed, the component corresponding to the index 5 is 0).
Next we apply row operations to reduce column 5 to the fourth vector of the identity matrix I4. For this, we multiply row 4 by 1/3, then add the normalized row 4 to row 1 and subtract the normalized row 4 from row 2, to obtain the tableau:
4 0 0 −8 −14 13 0 0
u2 = 2 0 1 1/3 0 0 0 1/3
u1 = 2 1 0 2/3 1 0 0 −1/3
u6 = 3 0 0 1 0 0 1 0
u5 = 0 0 0 −2/3 −1 1 0 1/3
To compute the new reduced costs, we want to set c5 to 0 so we subtract 13× row 4 from
row 0 and we get the tableau
4 0 0 2/3 −1 0 0 −13/3
u2 = 2 0 1 1/3 0 0 0 1/3
u1 = 2 1 0 2/3 1 0 0 −1/3
u6 = 3 0 0 1 0 0 1 0
u5 = 0 0 0 −2/3 −1 1 0 1/3
The only possible incoming column corresponds to j+ = 3. We have the ratios (for positive entries on column 3)

2/(1/3), 2/(2/3), 3/1,

and since the minimum is 3, we pick the outgoing column to be column k− = 1. The pivot 2/3 is indicated in red and the new basis is K = (2, 3, 6, 5). Next we apply row operations to reduce column 3 to the second vector of the identity matrix I4. For this, we multiply row 2 by 3/2, subtract (1/3)× (normalized row 2) from row 1, subtract the normalized row 2 from row 3, and add (2/3)× (normalized row 2) to row 4, to obtain the tableau:
4 0 0 2/3 −1 0 0 −13/3
u2 = 1 −1/2 1 0 −1/2 0 0 1/2
u3 = 3 3/2 0 1 3/2 0 0 −1/2
u6 = 0 −3/2 0 0 −3/2 0 1 1/2
u5 = 2 1 0 0 0 1 0 0
To compute the new reduced costs, we want to set c3 to 0 so we subtract (2/3)× row 2
from row 0 and we get the tableau
2 −1 0 0 −2 0 0 −4
u2 = 1 −1/2 1 0 −1/2 0 0 1/2
u3 = 3 3/2 0 1 3/2 0 0 −1/2
u6 = 0 −3/2 0 0 −3/2 0 1 1/2
u5 = 2 1 0 0 0 1 0 0
Since all the reduced costs are ≤ 0, we have reached an optimal solution, namely (0, 1, 3, 0, 2, 0, 0), with optimal value −2.
The progression of the simplex algorithm from one basic feasible solution to another
corresponds to the visit of vertices of the polyhedron P associated with the constraints of
the linear program illustrated in Figure 7.4.
Figure 7.4: The polytope P associated with the linear program optimized by the tableau
method. The red arrowed path traces the progression of the simplex method from the origin
to the vertex (0, 1, 3).
As a final comment, if it is necessary to run Phase I of the simplex algorithm, in the event that the simplex algorithm terminates with an optimal solution (u∗, 0m) and a basis K∗ such that some u∗i = 0, then the basis K∗ contains indices of basic columns Aj corresponding to slack variables that need to be driven out of the basis. This is easy to achieve by performing a pivoting step involving some other column j+ corresponding to one of the original variables (not a slack variable) for which (γ_{K∗}^{j+})_i ≠ 0. In such a step, it doesn't matter whether (γ_{K∗}^{j+})_i < 0 or (c̄_{K∗})_{j+} ≤ 0. If the original matrix A has no redundant equations, such a step is always possible. Otherwise, (γ_{K∗}^j)_i = 0 for all non-slack variables j, so we detected that the ith equation is redundant and we can delete it.
Other presentations of the tableau method can be found in Bertsimas and Tsitsiklis [10]
and Papadimitriou and Steiglitz [45].
The worst-case behavior of the simplex algorithm is exhibited by the Klee–Minty example, the linear program with n variables

maximize ∑_{j=1}^{n} 10^{n−j} xj

subject to

2 ∑_{j=1}^{i−1} 10^{i−j} xj + xi ≤ 100^{i−1},
xj ≥ 0,

for i = 1, . . . , n and j = 1, . . . , n.
If p = max(m, n), then, in terms of worst-case behavior, for all currently known pivot rules, the simplex algorithm has exponential complexity in p. However, as we said earlier, in practice, nasty examples such as the Klee–Minty example seem to be rare, and the number of iterations appears to be linear in m.
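The Klee–Minty instance is easy to generate. A small sketch (our own helper), with the standard facts about this family noted as comments:

import numpy as np

def klee_minty(n):
    """The Klee-Minty program displayed above:
       maximize sum_j 10^(n-j) x_j
       s.t. 2*sum_{j<i} 10^(i-j) x_j + x_i <= 100^(i-1), x >= 0."""
    c = np.array([10.0 ** (n - j) for j in range(1, n + 1)])
    A = np.zeros((n, n))
    for i in range(1, n + 1):
        for j in range(1, i):
            A[i - 1, j - 1] = 2 * 10.0 ** (i - j)
        A[i - 1, i - 1] = 1.0
    b = np.array([100.0 ** (i - 1) for i in range(1, n + 1)])
    return A, b, c

# With the classical largest-coefficient rule, the simplex algorithm is
# known to visit all 2^n vertices of this deformed cube; the optimum is
# 100^(n-1), attained at x = (0, ..., 0, 100^(n-1)).
A, b, c = klee_minty(3)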
Whether there exists a pivot rule (even a clairvoyant rule) for which the simplex algorithm runs in polynomial time in terms of m is still an open problem.
The Hirsch conjecture claims that there is some pivot rule such that the simplex algorithm finds an optimal solution in O(p) steps. The best bound known so far, due to Kalai and Kleitman, is m^{1+ln n} = (2n)^{ln m}. For more on this topic, see Matousek and Gardner [41] (Section 5.9) and Bertsimas and Tsitsiklis [10] (Section 3.7).
Researchers have investigated the problem of finding upper bounds on the expected
number of pivoting steps if a randomized pivot rule is used. Bounds better than 2m (but of
course, not polynomial) have been found.
Recall that the polyhedral cone spanned by vectors a1, . . . , an ∈ Rm is the set

C = {λ1a1 + · · · + λnan | λi ≥ 0, i = 1, . . . , n}.

The Farkas–Minkowski proposition (Proposition 8.2) asserts that if b ∉ C, then there is a linear hyperplane H, the kernel of some nonzero linear form y ∈ (Rm)∗, separating b from C, in the sense that

1. yai ≥ 0 for i = 1, . . . , n;
2. yb < 0.
A direct proof of the Farkas–Minkowski proposition not involving Proposition 8.1 is given
at the end of this section.
Proposition 8.3. (Farkas Lemma, Version Ib) Let A be an m × n matrix and let b ∈ Rm be any vector. The linear system Ax = b has no solution x ≥ 0 iff there is some nonzero linear form y ∈ (Rm)∗ such that yA ≥ 0_n^⊤ and yb < 0.

Proof. First, assume that there is some nonzero linear form y ∈ (Rm)∗ such that yA ≥ 0 and yb < 0. If x ≥ 0 is a solution of Ax = b, then we get

yAx = yb,

but since yA ≥ 0 and x ≥ 0 we have yAx ≥ 0, contradicting yb < 0. Therefore, Ax = b has no solution x ≥ 0.

Conversely, if Ax = b has no solution x ≥ 0, then b does not belong to the polyhedral cone spanned by the columns A1, . . . , An of A, so by the Farkas–Minkowski proposition there is some nonzero linear form y ∈ (Rm)∗ such that

1. yAj ≥ 0 for j = 1, . . . , n;
2. yb < 0,

which says that yA ≥ 0_n^⊤ and yb < 0.
Proposition 8.4. (Farkas Lemma, Version IIb) Let A be an m × n matrix and let b ∈ Rm be any vector. The system of inequalities Ax ≤ b has no solution x ≥ 0 iff there is some nonzero linear form y ∈ (Rm)∗ such that y ≥ 0_m^⊤, yA ≥ 0_n^⊤, and yb < 0.

Proof. We use the trick of linear programming which consists of adding "slack variables" zi to convert inequalities ai x ≤ bi into equations ai x + zi = bi with zi ≥ 0. If we let z = (z1, . . . , zm), it is obvious that the system Ax ≤ b has a solution x ≥ 0 iff the equation

[A  Im] (x; z) = b

has a solution (x; z) with x ≥ 0 and z ≥ 0. Now by Farkas Ib, the above system has no solution with x ≥ 0 and z ≥ 0 iff there is some nonzero linear form y ∈ (Rm)∗ such that

y [A  Im] ≥ 0_{n+m}^⊤

and yb < 0, that is, yA ≥ 0_n^⊤, y ≥ 0_m^⊤, and yb < 0.
In the next section we use Farkas IIb to prove the duality theorem in linear programming.
Observe that by taking the negation of the equivalence in Farkas IIb we obtain a criterion
of solvability, namely:
The system of inequalities Ax ≤ b has a solution x ≥ 0 iff for every nonzero linear form y ∈ (Rm)∗ such that y ≥ 0_m^⊤, if yA ≥ 0_n^⊤, then yb ≥ 0.
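Numerically, the Farkas alternative of Version IIb can be explored with an LP solver: first search for x, and if the solver reports infeasibility, search for the certificate y. A hedged sketch (the helper is our own; scipy assumed):

import numpy as np
from scipy.optimize import linprog

def farkas_IIb(A, b):
    """Either return x >= 0 with Ax <= b, or a linear form y with
    y >= 0, yA >= 0 and yb < 0 certifying infeasibility (Farkas IIb)."""
    A = np.asarray(A, float); b = np.asarray(b, float)
    m, n = A.shape
    # Feasibility of Ax <= b, x >= 0 (zero objective).
    res = linprog(np.zeros(n), A_ub=A, b_ub=b, bounds=[(0, None)] * n)
    if res.status == 0:
        return ("feasible", res.x)
    # Certificate: minimize yb s.t. -A^T y <= 0 (i.e. yA >= 0), y >= 0,
    # normalized by sum(y) <= 1 so that the LP is bounded.
    res = linprog(b,
                  A_ub=np.vstack([-A.T, np.ones((1, m))]),
                  b_ub=np.concatenate([np.zeros(n), [1.0]]),
                  bounds=[(0, None)] * m)
    return ("infeasible", res.x)   # here res.fun = yb < 0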
We now prove the Farkas–Minkowski proposition without using Proposition 3.3. This
approach uses a basic property of the distance function from a point to a closed set.
Let X ⊆ Rn be any nonempty set and let a ∈ Rn be any point. The distance d(a, X)
from a to X is defined as

d(a, X) = inf_{x∈X} ‖a − x‖.
Proposition 8.5. Let X ⊆ Rn be any nonempty set and let a ∈ Rn be any point. If X is
closed, then there is some z ∈ X such that ka − zk = d(a, X).
Proof. Since X is nonempty, pick any x0 ∈ X, and let r = ‖a − x0‖. If Br(a) is the closed ball Br(a) = {x ∈ Rn | ‖x − a‖ ≤ r}, then clearly

d(a, X) = inf_{x∈X} ‖a − x‖ = inf_{x∈X∩Br(a)} ‖a − x‖.

Since Br(a) is compact and X is closed, K = X ∩ Br(a) is also compact. But the function x ↦ ‖a − x‖ defined on the compact set K is continuous, and the image of a compact set by a continuous function is compact, so by Heine–Borel it has a minimum that is achieved by some z ∈ K ⊆ X.
We also use the standard projection property of Hilbert spaces: if U is a nonempty closed convex subset of a Hilbert space V, then for every v ∈ V there is a unique p ∈ U closest to v, and

⟨p − v, u − p⟩ ≥ 0 for all u ∈ U.

Here ‖w‖ = √⟨w, w⟩, where ⟨−, −⟩ is the inner product of the Hilbert space V.
We can now give a proof of the Farkas–Minkowski proposition (Proposition 8.2).
Proof of the Farkas–Minkowski proposition. Let C = cone({a1, . . . , am}) be a (nonempty) polyhedral cone and assume that b ∉ C. By Proposition 4.13, the polyhedral cone is closed, and by Proposition 8.5 there is some z ∈ C such that d(b, C) = ‖b − z‖; that is, z is a point of C closest to b. Since b ∉ C and z ∈ C, we have u = z − b ≠ 0, and we claim that the linear hyperplane H orthogonal to u does the job, as illustrated in Figure 8.1.
First let us show that
⟨u, z⟩ = ⟨z − b, z⟩ = 0. (∗1)

This is trivial if z = 0, so assume z ≠ 0. If ⟨u, z⟩ ≠ 0, then either ⟨u, z⟩ > 0 or ⟨u, z⟩ < 0. In either case we show that we can find some point z′ ∈ C closer to b than z is, a contradiction.
Case 1: ⟨u, z⟩ > 0.

Let z′ = (1 − α)z for any α such that 0 < α < 1. Then z′ ∈ C and since u = z − b,

z′ − b = (1 − α)z − (z − u) = u − αz,

so

‖z′ − b‖² = ‖u − αz‖² = ‖u‖² − 2α⟨u, z⟩ + α²‖z‖².

If we pick α > 0 such that α < 2⟨u, z⟩/‖z‖², then −2α⟨u, z⟩ + α²‖z‖² < 0, so ‖z′ − b‖² < ‖u‖² = ‖z − b‖², contradicting the fact that z is a point of C closest to b.
Figure 8.1: The hyperplane H orthogonal to u = z − b separates b from the polyhedral cone C spanned by a1, a2, a3; here z is the point of C closest to b.
Case 2: ⟨u, z⟩ < 0.

Let z′ = (1 + α)z for any α such that α ≥ −1. Then z′ ∈ C and since u = z − b we have z′ − b = (1 + α)z − (z − u) = u + αz, so

‖z′ − b‖² = ‖u + αz‖² = ‖u‖² + 2α⟨u, z⟩ + α²‖z‖²,

and if

0 < α < −2⟨u, z⟩/‖z‖²,

then 2α⟨u, z⟩ + α²‖z‖² < 0, so ‖z′ − b‖² < ‖u‖² = ‖z − b‖², a contradiction as above.
Therefore ⟨u, z⟩ = 0. Next we show that ⟨b − z, x − z⟩ ≤ 0 for all x ∈ C. Pick any x ∈ C, and for 0 < α < 1 let z′ = (1 − α)z + αx, which belongs to C since C is convex. Then

‖z′ − b‖² = ‖(z − b) + α(x − z)‖² = ‖z − b‖² + 2α⟨z − b, x − z⟩ + α²‖x − z‖²,

so if ⟨z − b, x − z⟩ < 0, then for α > 0 small enough we have 2α⟨z − b, x − z⟩ + α²‖x − z‖² < 0, which implies that ‖z′ − b‖² < ‖z − b‖², contradicting that z is a point of C closest to b.

Since ⟨b − z, x − z⟩ ≤ 0, u = z − b, and by (∗1) ⟨u, z⟩ = 0, we have

⟨u, x⟩ = ⟨u, x − z⟩ + ⟨u, z⟩ = −⟨b − z, x − z⟩ ≥ 0 for all x ∈ C.

In particular ⟨u, ai⟩ ≥ 0 for i = 1, . . . , m, and since ⟨u, b⟩ = ⟨u, z − u⟩ = ⟨u, z⟩ − ‖u‖² = −‖u‖² < 0, the linear form y given by y(w) = ⟨u, w⟩ satisfies Conditions (1) and (2), which concludes the proof.
There are other ways of proving the Farkas–Minkowski proposition, for instance using
minimally infeasible systems or Fourier–Motzkin elimination; see Matousek and Gardner [41]
(Chapter 6, Sections 6.6 and 6.7).
Consider the linear program (P)

maximize cx
subject to Ax ≤ b and x ≥ 0,

with A an m × n matrix, and assume that (P) has a feasible solution and is bounded above.
Since by hypothesis the objective function x 7→ cx is bounded on P(A, b), it might be useful
to deduce an upper bound for cx from the inequalities Ax ≤ b, for any x ∈ P(A, b). We can
do this as follows: for every inequality
ai x ≤ b i 1 ≤ i ≤ m,
pick a nonnegative scalar yi , multiply both sides of the above inequality by yi obtaining
y i ai x ≤ y i b i 1 ≤ i ≤ m,
8.2. THE DUALITY THEOREM IN LINEAR PROGRAMMING 199
(the direction of the inequality is preserved since yi ≥ 0), and then add up these m inequalities, which yields

(y1a1 + · · · + ymam)x ≤ y1b1 + · · · + ymbm.
If we can pick the yi ≥ 0 such that

c ≤ y1a1 + · · · + ymam,

then since x ≥ 0 we have

cx ≤ (y1a1 + · · · + ymam)x ≤ y1b1 + · · · + ymbm,
that is, we found an upper bound of the value cx of the objective function of (P) for any feasible solution x ∈ P(A, b). If we let y be the linear form y = (y1, . . . , ym), then since the rows of A are a1, . . . , am, the above condition says that yA ≥ c, and the upper bound is yb. This suggests associating with the linear program (P)

maximize cx
subject to Ax ≤ b and x ≥ 0,

the dual linear program (D)

minimize yb
subject to yA ≥ c and y ≥ 0,

where y ∈ (Rm)∗. The original linear program (P) is called the primal linear program.
Example 8.1. Consider the linear program (P)

maximize 2x1 + 3x2
subject to
4x1 + 8x2 ≤ 12
2x1 + x2 ≤ 3
3x1 + 2x2 ≤ 4
x1 ≥ 0, x2 ≥ 0,

whose dual (D) is

minimize 12y1 + 3y2 + 4y3
subject to
4y1 + 2y2 + 3y3 ≥ 2
8y1 + y2 + 2y3 ≥ 3
y1 ≥ 0, y2 ≥ 0, y3 ≥ 0.

It can be checked that (x1, x2) = (1/2, 5/4) is an optimal solution of the primal linear program, with the maximum value of the objective function 2x1 + 3x2 equal to 19/4, and that (y1, y2, y3) = (5/16, 0, 1/4) is an optimal solution of the dual linear program, with the minimum value of the objective function 12y1 + 3y2 + 4y3 also equal to 19/4.
Figure 8.2: The H-polytope for the linear program of Example 8.1. Note x1 → x and x2 → y.
Observe that in the primal linear program (P), we are looking for a vector x ∈ Rn maximizing the form cx, and that the constraints are determined by the action of the rows of the matrix A on x. On the other hand, in the dual linear program (D), we are looking for a linear form y ∈ (Rm)∗ minimizing the form yb, and the constraints are determined by the action of y on the columns of A. This is the sense in which (D) is the dual of (P). In most presentations, the fact that (P) and (D) perform a search for a solution in spaces that are dual to each other is obscured by excessive use of transposition.

Figure 8.3: The H-polyhedron for the dual linear program of Example 8.1 is the spatial region "above" the pink plane 8x + y + 2z = 3 and in "front" of the blue plane 4x + 2y + 3z = 2. Note y1 → x, y2 → y, and y3 → z.
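The optima claimed in Example 8.1 can be double-checked with scipy (a sketch, not from the text; linprog minimizes, so the primal objective is negated, and yA ≥ c is rewritten as −A⊤y ≤ −c):

import numpy as np
from scipy.optimize import linprog

# Primal (P) of Example 8.1: maximize 2x1 + 3x2.
A = np.array([[4.0, 8.0], [2.0, 1.0], [3.0, 2.0]])
b = np.array([12.0, 3.0, 4.0])
c = np.array([2.0, 3.0])

primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
# Dual (D): minimize yb subject to yA >= c, y >= 0.
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 3)

print(primal.x, -primal.fun)   # expected: (0.5, 1.25) and 4.75 = 19/4
print(dual.x, dual.fun)        # expected: (0.3125, 0, 0.25) and 4.75 = 19/4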
To convert the dual program (D) to a standard maximization problem, we change the objective function yb to −b⊤y⊤ and the inequality yA ≥ c to −A⊤y⊤ ≤ −c⊤. The dual linear program (D) is now stated as (D′)

maximize −b⊤y⊤
subject to −A⊤y⊤ ≤ −c⊤ and y⊤ ≥ 0,

where y ∈ (Rm)∗. Observe that the dual in maximization form (D′′) of the dual program (D′) gives back the primal program (P).
The above discussion established the following inequality known as weak duality.
Proposition 8.6. (Weak Duality) Given any linear program (P)

maximize cx
subject to Ax ≤ b and x ≥ 0,

with A an m × n matrix, for any feasible solution x ∈ Rn of the primal problem (P) and every feasible solution y ∈ (Rm)∗ of the dual problem (D), we have

cx ≤ yb.
We say that the dual linear program (D) is bounded below if {yb | y⊤ ∈ P(−A⊤, −c⊤)} is bounded below.
On the other hand, since x∗ ≥ 0 is an optimal solution of the system Ax ≤ b, by Farkas IIb
again (by taking the negation of the equivalence), since λA ≥ 0 (for the same λ as before),
we must have
λb ≥ 0. (∗1 )
We claim that z > 0. Otherwise, since z ≥ 0, we must have z = 0, but then

λb < z(µ + ε)

implies

λb < 0, (∗2)

and since λb ≥ 0 by (∗1), we have a contradiction. Consequently, we can divide by z > 0 without changing the direction of inequalities, and we obtain

(λ/z)A ≥ c
(λ/z)b < µ + ε
λ/z ≥ 0,
which shows that v = λ/z is a feasible solution of the dual problem (D). However, weak duality (Proposition 8.6) implies that cx∗ = µ ≤ yb for any feasible solution y ≥ 0 of the dual program (D), so (D) is bounded below, and by Proposition 6.1 applied to the version of (D) written as a maximization problem, we conclude that (D) has some optimal solution. For any optimal solution y∗ of (D), since v is a feasible solution of (D) such that vb < µ + ε, we must have

µ ≤ y∗b < µ + ε,

and since our reasoning is valid for any ε > 0, we conclude that cx∗ = µ = y∗b.
If we assume that the dual program (D) has a feasible solution and is bounded below,
since the dual of (D) is (P ), we conclude that (P ) is also feasible and bounded above.
The strong duality theorem can also be proved by the simplex method, because when
it terminates with an optimal solution of (P ), the final tableau also produces an optimal
solution y of (D) that can be read off the reduced costs of columns n + 1, . . . , n + m by
flipping their signs. We follow the proof in Ciarlet [19] (Chapter 10).
Theorem 8.8. Consider the linear program (P),

maximize cx
subject to Ax ≤ b and x ≥ 0,

its equivalent version (P2) in standard form,

maximize ĉx̂
subject to Âx̂ = b and x̂ ≥ 0,

where Â = [A  Im], x̂ = (x, z), and ĉ = (c, 0m), and the dual linear program (D),

minimize yb
subject to yA ≥ c and y ≥ 0,

where y ∈ (Rm)∗. If the simplex algorithm applied to the linear program (P2) terminates with an optimal solution (û∗, K∗), where û∗ is a basic feasible solution and K∗ is a basis for û∗, then y∗ = ĉ_{K∗} Â_{K∗}^{−1} is an optimal solution for (D) such that ĉû∗ = y∗b. Furthermore, y∗ is given in terms of the reduced costs by y∗ = −((c̄_{K∗})_{n+1} . . . (c̄_{K∗})_{n+m}).
Proof. We know that K∗ is a subset of {1, . . . , n + m} consisting of m indices such that the corresponding columns of Â are linearly independent. Let N∗ = {1, . . . , n + m} − K∗. The simplex method terminates with an optimal solution in Case (A), namely when

ĉj − ∑_{k∈K∗} γ_k^j ĉk ≤ 0 for all j ∈ N∗,

where Âj = ∑_{k∈K∗} γ_k^j Âk, or using the notations of Section 7.3,

ĉj − ĉ_{K∗} Â_{K∗}^{−1} Âj ≤ 0 for all j ∈ N∗.

These inequalities can be written as

ĉ_{N∗} − ĉ_{K∗} Â_{K∗}^{−1} Â_{N∗} ≤ 0_n^⊤,

or equivalently as

ĉ_{K∗} Â_{K∗}^{−1} Â_{N∗} ≥ ĉ_{N∗}. (∗1)

The value of the objective function for û∗ is

ĉ_{K∗} û∗_{K∗} = ĉ_{K∗} Â_{K∗}^{−1} b. (∗2)
Since y∗ = ĉ_{K∗}Â^{-1}_{K∗} and, up to a permutation P of the columns, Â = [A I_m] and ĉ = (c 0^⊤_m), the inequality (∗₁) together with ĉ_{K∗}Â^{-1}_{K∗}Â_{K∗} = ĉ_{K∗} says that

y∗ (A I_m) P ≥ (c 0^⊤_m) P,

which is equivalent to

y∗ (A I_m) ≥ (c 0^⊤_m),

that is,

y∗A ≥ c, y∗ ≥ 0,

and these are exactly the conditions that say that y∗ is a feasible solution of the dual program (D).

The reduced costs are given by (c̄_{K∗})_i = ĉ_i − ĉ_{K∗}Â^{-1}_{K∗}Â^i, for i = 1, …, n + m. But for i = n + 1, …, n + m, each column Â^{n+j} is the jth vector of the identity matrix I_m and ĉ_{n+j} = 0, so

(c̄_{K∗})_{n+j} = −(ĉ_{K∗}Â^{-1}_{K∗})_j = −y∗_j, j = 1, …, m,
as claimed.
The fact that the above proof is fairly short is deceptive, because this proof relies on the
fact that there are versions of the simplex algorithm using pivot rules that prevent cycling,
but the proof that such pivot rules work correctly is quite lengthy. Other proofs are given
in Matousek and Gardner [41] (Chapter 6, Section 6.3), Chvatal [18] (Chapter 5), and
Papadimitriou and Steiglitz [45] (Section 2.7).
Observe that since the last m rows of the final tableau are actually obtained by multiplying [û Â] by Â^{-1}_{K∗}, the m × m matrix consisting of the last m columns and last m rows of the final tableau is Â^{-1}_{K∗} (basically, the simplex algorithm has performed the steps of a Gauss–Jordan reduction). This fact allows saving some steps in the primal-dual method.
By combining weak duality and strong duality, we obtain the following theorem which
shows that exactly four cases arise.
Theorem 8.9. (Duality Theorem of Linear Programming) Let (P) be any linear program

maximize cx
subject to Ax ≤ b and x ≥ 0,

and let (D) be its dual program

minimize yb
subject to yA ≥ c and y ≥ 0.

Then exactly one of the following four cases arises:
(4) Both (P ) and (D) have a feasible solution. Then both have an optimal solution, and
for every optimal solution x∗ of (P ) and every optimal solution y ∗ of (D), we have
cx∗ = y ∗ b.
An important consequence of the duality theorem is that optimal solutions of (P) and (D) can be found by solving the system of inequalities

Ax ≤ b
yA ≥ c
cx ≥ yb
x ≥ 0, y ≥ 0^⊤_m.

In fact, for any feasible solution (x∗, y∗) of the above system, x∗ is an optimal solution of (P) and y∗ is an optimal solution of (D).
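As a quick illustration of Case (4), one can solve a small primal/dual pair numerically and observe that the two optimal values coincide. The following sketch uses scipy on made-up data (linprog minimizes, so the primal is passed as minimize −cx):

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([1.0, 1.0])

# Primal (P): maximize cx s.t. Ax <= b, x >= 0.
p = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)

# Dual (D): minimize yb s.t. yA >= c, y >= 0, i.e. (-A^T) y <= -c.
d = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

print(-p.fun, d.fun)  # both print 2.8: cx* = y*b, as in Case (4)
```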
Theorem 8.10. (Equilibrium Theorem) For any linear program (P) and its dual linear program (D) (with set of inequalities Ax ≤ b where A is an m × n matrix, and objective function x ↦ cx), for any feasible solution x of (P) and any feasible solution y of (D), x and y are optimal solutions iff

y_i = 0 for all i for which Σ_{j=1}^n a_{ij}x_j < b_i (∗_D)

and

x_j = 0 for all j for which Σ_{i=1}^m y_i a_{ij} > c_j. (∗_P)
Proof. First, assume that (∗_D) and (∗_P) hold. The equations in (∗_D) say that y_i = 0 unless Σ_{j=1}^n a_{ij}x_j = b_i, hence

yb = Σ_{i=1}^m y_i b_i = Σ_{i=1}^m y_i Σ_{j=1}^n a_{ij}x_j = Σ_{i=1}^m Σ_{j=1}^n y_i a_{ij}x_j.

Similarly, the equations in (∗_P) say that x_j = 0 unless Σ_{i=1}^m y_i a_{ij} = c_j, hence

cx = Σ_{j=1}^n c_j x_j = Σ_{j=1}^n Σ_{i=1}^m y_i a_{ij}x_j.
Consequently, we obtain
cx = yb.
By weak duality (Proposition 8.6), we have

cx̃ ≤ yb = cx

for every feasible solution x̃ of (P), so x is an optimal solution of (P). Similarly,

yb = cx ≤ ỹb

for every feasible solution ỹ of (D), so y is an optimal solution of (D).
Let us now assume that x is an optimal solution of (P ) and that y is an optimal solution
of (D). Then, as in the proof of Proposition 8.6,
Σ_{j=1}^n c_j x_j ≤ Σ_{i=1}^m Σ_{j=1}^n y_i a_{ij}x_j ≤ Σ_{i=1}^m y_i b_i.
By strong duality, since x and y are optimal solutions the above inequalities are actually equalities, so in particular we have

Σ_{j=1}^n (c_j − Σ_{i=1}^m y_i a_{ij}) x_j = 0.

Since y is feasible for (D), each factor c_j − Σ_{i=1}^m y_i a_{ij} is nonpositive, and since x_j ≥ 0, every term of the sum must vanish, which yields the conditions (∗_P); the conditions (∗_D) follow in the same way from the other equality.
The equations in (∗D ) and (∗P ) are often called complementary slackness conditions.
These conditions can be exploited to solve for an optimal solution of the primal problem
with the help of the dual problem, and conversely. Indeed, if we guess a solution to one
problem, then we may solve for a solution of the dual using the complementary slackness
conditions, and then check that our guess was correct. This is the essence of the primal-dual
methods. To present this method, first we need to take a closer look at the dual of a linear
program already in standard form.
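The following sketch shows the guess-and-check strategy on the toy data used above (all names and data are illustrative): given a guessed primal vertex x, the conditions (∗_P) force yA = c on the support of x, which determines a candidate y that we then verify.

```python
import numpy as np

def check_slackness(A, b, c, x, y, tol=1e-9):
    """Equilibrium theorem check: feasible x and y are optimal iff
    y_i = 0 whenever (Ax)_i < b_i, and x_j = 0 whenever (yA)_j > c_j."""
    ok_D = np.all((y > tol) <= np.isclose(A @ x, b))  # y_i > 0 => i-th primal constraint tight
    ok_P = np.all((x > tol) <= np.isclose(y @ A, c))  # x_j > 0 => j-th dual constraint tight
    return bool(ok_D and ok_P)

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([1.0, 1.0])

x = np.array([1.6, 1.2])      # guessed primal solution (both constraints tight)
y = np.linalg.solve(A.T, c)   # slackness forces yA = c, giving y = (0.4, 0.2) >= 0

print(check_slackness(A, b, c, x, y), c @ x, y @ b)  # True 2.8 2.8
```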
The dual of (P2), obtained by treating the equations Ax = b as the two sets of inequalities Ax ≤ b and −Ax ≤ −b, is

minimize y′b − y″b

subject to (y′ y″) ( A)
                   (−A) ≥ c and y′, y″ ≥ 0,

which is equivalent to

minimize (y′ − y″)b
subject to (y′ − y″)A ≥ c and y′, y″ ≥ 0,

that is, writing y = y′ − y″ (a vector with no sign constraint),

minimize yb
subject to yA ≥ c,
We obtain the following result, which shows how an optimal solution of the dual can be read off the final tableau of the simplex algorithm applied to a program in standard form.

Theorem 8.11. Consider the linear program (P2) in standard form

maximize cx
subject to Ax = b and x ≥ 0,

and its dual (D)

minimize yb
subject to yA ≥ c,
where y ∈ (R^m)^∗. If the simplex algorithm applied to the linear program (P2) terminates with an optimal solution (u∗, K∗), where u∗ is a basic feasible solution and K∗ is a basis for u∗, then y∗ = c_{K∗}A^{-1}_{K∗} is an optimal solution for (D) such that cu∗ = y∗b. Furthermore, if we assume that the simplex algorithm is started with a basic feasible solution (u₀, K₀) where K₀ = (n−m+1, …, n) (the indices of the last m columns of A) and A_{(n−m+1,…,n)} = I_m (the last m columns of A constitute the identity matrix I_m), then the optimal solution y∗ = c_{K∗}A^{-1}_{K∗} for (D) is given in terms of the reduced costs by

y∗ = c_{(n−m+1,…,n)} − (c̄_{K∗})_{(n−m+1,…,n)},

and the m × m matrix consisting of the last m columns and the last m rows of the final tableau is A^{-1}_{K∗}.
Proof. When the simplex algorithm terminates with an optimal solution (u∗, K∗), we have, as in the previous proof,

c_{K∗}A^{-1}_{K∗}A_{N∗} ≥ c_{N∗},

and since A^{-1}_{K∗}A_{K∗} = I_m, setting y∗ = c_{K∗}A^{-1}_{K∗} we get

y∗A_{K∗} = c_{K∗}A^{-1}_{K∗}A_{K∗} = c_{K∗},
y∗A_{N∗} = c_{K∗}A^{-1}_{K∗}A_{N∗} ≥ c_{N∗}.

Up to a permutation of the columns, these two conditions are equivalent to

y∗ (A_{K∗} A_{N∗}) ≥ (c_{K∗} c_{N∗}),

that is,

y∗A ≥ c,

so y∗ is a feasible solution of (D). The reduced costs are given by

(c̄_{K∗})_i = c_i − c_{K∗}A^{-1}_{K∗}A^i,

and since for j = n−m+1, …, n the column A^j is the (j+m−n)th column of the identity matrix I_m, we have

(c̄_{K∗})_j = c_j − (c_{K∗}A^{-1}_{K∗})_{j+m−n}, j = n−m+1, …, n,

that is,

y∗ = c_{(n−m+1,…,n)} − (c̄_{K∗})_{(n−m+1,…,n)},

as claimed. Since the last m rows of the final tableau are obtained by multiplying [u₀ A] by A^{-1}_{K∗}, and the last m columns of A constitute I_m, the last m rows and the last m columns of the final tableau constitute A^{-1}_{K∗}.
Let us now take a look at the complementary slackness conditions of Theorem 8.10. If
we go back to the version of (P ) given by
maximize cx

subject to ( A)     ( b)
           (−A) x ≤ (−b) and x ≥ 0,

and to the version of (D) given by

minimize y′b − y″b

subject to (y′ y″) ( A)
                   (−A) ≥ c and y′, y″ ≥ 0,
where y′, y″ ∈ (R^m)^∗, since the inequalities Ax ≤ b and −Ax ≤ −b together imply that Ax = b, we have equality for all these inequality constraints, and so the conditions (∗_D) place no constraints at all on y′ and y″, while the conditions (∗_P) assert that

x_j = 0 for all j for which Σ_{i=1}^m (y′_i − y″_i) a_{ij} > c_j.
Therefore, the slackness conditions applied to a linear program (P 2) in standard form and
to its dual (D) only impose slackness conditions on the variables xj of the primal problem.
The above fact plays a crucial role in the primal-dual method.
8.5 The Dual Simplex Algorithm
The dual simplex method applies to a linear program (P2) in standard form

maximize cx
subject to Ax = b and x ≥ 0,

with dual (D)

minimize yb
subject to yA ≥ c.
Case (B1). We have γ^j_{k⁻} ≥ 0 for all j. In this case (P2) cannot be feasible: if some v ≥ 0 satisfied Σ_{j=1}^n v_j A^j = b, then by multiplying both sides by A^{-1}_K and using the fact that by definition γ^j_K = A^{-1}_K A^j, we would obtain

Σ_{j=1}^n v_j γ^j_K = A^{-1}_K b = u_K.

But recall that by hypothesis u_{k⁻} < 0, yet v_j ≥ 0 and γ^j_{k⁻} ≥ 0 for all j, so the component of index k⁻ is zero or positive on the left, and negative on the right, a contradiction. Therefore, (P2) is indeed not feasible.
Case (B2). We have γ^j_{k⁻} < 0 for some j.

We pick the column A^{j⁺} entering the basis among those for which γ^j_{k⁻} < 0. Since we assumed that c_j − c_K γ^j_K ≤ 0 for all j ∈ N by (∗₂), consider

µ⁺ = max{ −(c_j − c_Kγ^j_K)/γ^j_{k⁻} : γ^j_{k⁻} < 0, j ∈ N } = max{ −c̄_j/γ^j_{k⁻} : γ^j_{k⁻} < 0, j ∈ N } ≤ 0,

and let j⁺ be an index achieving this maximum. Then for every i ∈ N,

c_i − c_Kγ^i_K − (γ^i_{k⁻}/γ^{j⁺}_{k⁻})(c_{j⁺} − c_Kγ^{j⁺}_K) ≤ 0,

and again c_i − c_{K⁺}γ^i_{K⁺} ≤ 0. Therefore, if we let K⁺ = (K − {k⁻}) ∪ {j⁺}, then y⁺ = c_{K⁺}A^{-1}_{K⁺}
is dual feasible. As in the simplex algorithm, θ⁺ is given by

θ⁺ = u_{k⁻}/γ^{j⁺}_{k⁻} ≥ 0,

and u⁺ is also computed as in the simplex algorithm by

u⁺_i = u_i − θ⁺γ^{j⁺}_i if i ∈ K,
u⁺_i = θ⁺ if i = j⁺,
u⁺_i = 0 if i ∉ K ∪ {j⁺}.
The change in the objective function of the primal and dual programs (which is the same, since u_K = A^{-1}_K b and y = c_K A^{-1}_K is chosen such that cu = c_K u_K = yb) is the same as in the simplex algorithm, namely

θ⁺ (c_{j⁺} − c_K γ^{j⁺}_K).

We have θ⁺ > 0 and c_{j⁺} − c_K γ^{j⁺}_K ≤ 0, so if c_{j⁺} − c_K γ^{j⁺}_K < 0, then the objective function of the dual program decreases strictly.
Case (B3). µ⁺ = 0.

The possibility that µ⁺ = 0, that is, c_{j⁺} − c_K γ^{j⁺}_K = 0, may arise. In this case, the objective function doesn't change. This is a case of degeneracy similar to the degeneracy that arises in the simplex algorithm. We still pick j⁺ ∈ N(µ⁺), but we need a pivot rule that prevents cycling. Such rules exist; see Bertsimas and Tsitsiklis [10] (Section 4.5) and Papadimitriou and Steiglitz [45] (Section 3.6).
The reader surely noticed that the dual simplex algorithm is very similar to the simplex algorithm, except that the simplex algorithm preserves the property that (u, K) is (primal) feasible, whereas the dual simplex algorithm preserves the property that y = c_K A^{-1}_K is dual feasible. One might then wonder whether the dual simplex algorithm is equivalent to the simplex algorithm applied to the dual problem. This is indeed the case: there is a one-to-one correspondence between the dual simplex algorithm and the simplex algorithm applied to the dual problem. This correspondence is described in Papadimitriou and Steiglitz [45] (Section 3.7).
The comparison between the simplex algorithm and the dual simplex algorithm is best
illustrated if we use a description of these methods in terms of (full) tableaux.
Recall that a (full) tableau is an (m + 1) × (n + 1) matrix organized as follows:
−c_K u_K | c̄₁   ⋯  c̄_j   ⋯  c̄_n
u_{k₁}   | γ₁^1  ⋯  γ₁^j  ⋯  γ₁^n
  ⋮      |  ⋮        ⋮        ⋮
u_{k_m}  | γ_m^1 ⋯  γ_m^j ⋯  γ_m^n
The top row contains the current value of the objective function and the reduced costs, the first column except for its top entry contains the components of the current basic solution u_K, and the remaining columns except for their top entry contain the vectors γ^j_K. Observe that the γ^j_K corresponding to indices j in K constitute a permutation of the identity matrix I_m. A tableau together with the new basis K⁺ = (K − {k⁻}) ∪ {j⁺} contains all the data needed to compute the new u_{K⁺}, the new γ^j_{K⁺}, and the new reduced costs c̄_i − (γ^i_{k⁻}/γ^{j⁺}_{k⁻}) c̄_{j⁺}.
When executing the simplex algorithm, we have u_k ≥ 0 for all k ∈ K (and u_j = 0 for all j ∉ K), and the incoming column j⁺ is determined by picking one of the column indices such that c̄_j > 0. Then the index k⁻ of the leaving column is determined by looking at the minimum of the ratios u_k/γ^{j⁺}_k for which γ^{j⁺}_k > 0 (along column j⁺).

On the other hand, when executing the dual simplex algorithm, we have c̄_j ≤ 0 for all j ∉ K (and c̄_k = 0 for all k ∈ K), and the outgoing column k⁻ is determined by picking one of the row indices such that u_k < 0. The index j⁺ of the incoming column is determined by looking at the maximum of the ratios −c̄_j/γ^j_{k⁻} for which γ^j_{k⁻} < 0 (along row k⁻).
More details about the comparison between the simplex algorithm and the dual simplex
algorithm can be found in Bertsimas and Tsitsiklis [10] and Papadimitriou and Steiglitz [45].
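The pivot selection just described is easy to express on a full tableau. Here is a minimal sketch in Python with numpy (the function name is ours; the tie-breaking choices are one valid option among several, and no anti-cycling rule is included):

```python
import numpy as np

def dual_simplex_pivot(T):
    """One pivot of the dual simplex method on a full tableau T:
    row 0 holds (-c_K u_K, reduced costs); rows 1..m hold (u_k, gamma entries).
    Returns the updated tableau, or None when u >= 0 (termination)."""
    u = T[1:, 0]
    if np.all(u >= 0):
        return None                          # dual and primal feasible: optimal
    r = 1 + int(np.argmin(u))                # leaving row: some u_k < 0 (here: most negative)
    neg = np.where(T[r, 1:] < 0)[0]          # entering candidates: gamma^j_{k-} < 0
    if neg.size == 0:
        raise ValueError("Case (B1): (P2) is not feasible")
    ratios = -T[0, 1 + neg] / T[r, 1 + neg]  # the quantities -cbar_j / gamma^j_{k-}
    j = 1 + neg[int(np.argmax(ratios))]      # entering column realizing mu+
    T = T.astype(float)
    T[r] = T[r] / T[r, j]                    # normalize the pivot row
    for i in range(T.shape[0]):              # eliminate column j from the other rows
        if i != r:
            T[i] = T[i] - T[i, j] * T[r]
    return T
```

Note that on the first tableau of Example 8.2 below, this sketch's argmin rule would pick k⁻ = 5 while the example picks k⁻ = 4; both are valid choices of a row with u_k < 0.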
Here is an example of the dual simplex method.
Example 8.2. Consider the following linear program in standard form:

Maximize −4x₁ − 2x₂ − x₃

subject to

(−1 −1  2  1  0  0)   (x₁)   (−3)
(−4 −2  1  0  1  0) ∙ ( ⋮) = (−4)  and (x₁, x₂, x₃, x₄, x₅, x₆) ≥ 0.
( 1  1 −4  0  0  1)   (x₆)   ( 2)

We initialize the dual simplex procedure with (u, K) where u = (0, 0, 0, −3, −4, 2)^⊤ and K = (4, 5, 6).
The initial tableau, before explicitly calculating the reduced costs, is

      0 | c̄₁  c̄₂  c̄₃  c̄₄  c̄₅  c̄₆
u₄ = −3 | −1  −1   2   1   0   0
u₅ = −4 | −4  −2   1   0   1   0
u₆ =  2 |  1   1  −4   0   0   1
Since u has negative coordinates, Case (B) applies, and we will set k⁻ = 4. We must now determine whether Case (B1) or Case (B2) applies. This determination is accomplished by scanning the first three columns in the tableau and observing that each column has a negative entry. Thus Case (B2) is applicable, and we need to determine the reduced costs. Observe that c = (−4, −2, −1, 0, 0, 0), which in turn implies c_{(4,5,6)} = (0, 0, 0). Equation (∗₂) implies that the nonzero reduced costs are
c̄₁ = c₁ − c_{(4,5,6)} (−1, −4, 1)^⊤ = −4
c̄₂ = c₂ − c_{(4,5,6)} (−1, −2, 1)^⊤ = −2
c̄₃ = c₃ − c_{(4,5,6)} (2, 1, −4)^⊤ = −1,
and our tableau becomes
      0 | −4  −2  −1   0   0   0
u₄ = −3 | −1  −1   2   1   0   0
u₅ = −4 | −4  −2   1   0   1   0
u₆ =  2 |  1   1  −4   0   0   1
Since k⁻ = 4, our pivot row is the first row of the tableau. To determine candidates for j⁺, we scan this row, locate negative entries, and compute

µ⁺ = max{ −c̄_j/γ₄^j : γ₄^j < 0, j ∈ {1, 2, 3} } = max{−4, −2} = −2.
Since µ+ occurs when j = 2, we set j + = 2. Our new basis is K + = (2, 5, 6). We must
normalize the first row of the tableau, namely multiply by −1, then add twice this normalized
row to the second row, and subtract the normalized row from the third row to obtain the
updated tableau.
      0 | −4  −2  −1   0   0   0
u₂ =  3 |  1   1  −2  −1   0   0
u₅ =  2 | −2   0  −3  −2   1   0
u₆ = −1 |  0   0  −2   1   0   1
It remains to update the reduced costs and the value of the objective function by adding
twice the normalized row to the top row.
      6 | −2   0  −5  −2   0   0
u₂ =  3 |  1   1  −2  −1   0   0
u₅ =  2 | −2   0  −3  −2   1   0
u₆ = −1 |  0   0  −2   1   0   1
We now repeat the procedure of Case (B2) and set k⁻ = 6 (since this is the only negative entry of u⁺). Our pivot row is now the third row of the updated tableau, and the new µ⁺ becomes

µ⁺ = max{ −c̄_j/γ₆^j : γ₆^j < 0, j ∈ {1, 3, 4} } = −(−5)/(−2) = −5/2,
which implies that j + = 3. Hence the new basis is K + = (2, 5, 3), and we update the tableau
by taking − 12 of Row 3, adding twice the normalized Row 3 to Row 1, and adding three
times the normalized Row 3 to Row 2.
       6 | −2   0   0    −2   0     0
u₂ =   4 |  1   1   0    −2   0    −1
u₅ = 7/2 | −2   0   0  −7/2   1  −3/2
u₃ = 1/2 |  0   0   1  −1/2   0  −1/2
It remains to update the objective function and the reduced costs by adding five times the
normalized row to the top row.
    17/2 | −2   0   0  −9/2   0  −5/2
u₂ =   4 |  1   1   0    −2   0    −1
u₅ = 7/2 | −2   0   0  −7/2   1  −3/2
u₃ = 1/2 |  0   0   1  −1/2   0  −1/2
Since u⁺ has no negative entries, the dual simplex method terminates: the objective function −4x₁ − 2x₂ − x₃ attains its maximum value −17/2 at (x₁, x₂, x₃) = (0, 4, 1/2).
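Eliminating the slack variables x₄, x₅, x₆, the program of Example 8.2 is a maximization subject to three ≤-inequalities, so the result can be cross-checked with an off-the-shelf solver (a sketch using scipy):

```python
import numpy as np
from scipy.optimize import linprog

# Example 8.2 with the slack variables eliminated.
A_ub = np.array([[-1.0, -1.0,  2.0],
                 [-4.0, -2.0,  1.0],
                 [ 1.0,  1.0, -4.0]])
b_ub = np.array([-3.0, -4.0, 2.0])
c = np.array([-4.0, -2.0, -1.0])

# linprog minimizes, so maximize cx by minimizing -cx.
res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print(res.x, -res.fun)  # approximately (0, 4, 0.5) and -8.5 = -17/2
```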
8.6 The Primal-Dual Algorithm

Consider a linear program (P2) in standard form

maximize cx
subject to Ax = b and x ≥ 0,

and its dual (D)

minimize yb
subject to yA ≥ c,

where y ∈ (R^m)^∗.
First, we may assume that b ≥ 0 by changing every equation Σ_{j=1}^n a_{ij}x_j = b_i with b_i < 0 to Σ_{j=1}^n −a_{ij}x_j = −b_i. If we happen to have some feasible solution y of the dual program (D), we know from Theorem 8.12 that a feasible solution x of (P2) is an optimal solution iff the equations in (∗_P) hold. If we denote by J the subset of {1, …, n} for which the equalities

yA^j = c_j

hold, then a feasible solution x of (P2) is optimal iff

x_j = 0 for all j ∉ J.
Let |J| = p and N = {1, …, n} − J. The above suggests looking for x ∈ R^n such that

Σ_{j∈J} x_j A^j = b
x_j ≥ 0 for all j ∈ J
x_j = 0 for all j ∉ J,

or equivalently

A_J x_J = b, x_J ≥ 0, (∗₁)

and

x_N = 0_{n−p}.
To search for such an x, we just need to look for a feasible x_J, and for this we can use the restricted primal linear program (RP) defined as follows:

maximize −(ξ₁ + ⋯ + ξ_m)
subject to A_J x_J + I_m ξ = b and x_J, ξ ≥ 0.
Since by hypothesis b ≥ 0 and the objective function is bounded above by 0, this linear
program has an optimal solution (x∗J , ξ ∗ ).
If ξ ∗ = 0, then the vector u∗ ∈ Rn given by u∗J = x∗J and u∗N = 0n−p is an optimal solution
of (P ).
Otherwise, ξ ∗ > 0 and we have failed to solve (∗1 ). However we may try to use ξ ∗ to
improve y. For this, consider the dual (DRP ) of (RP ):
minimize zb
subject to zA_J ≥ 0
           z ≥ −1^⊤_m.
Observe that the program (DRP ) has the same objective function as the original dual
program (D). We know by Theorem 8.11 that the optimal solution (x∗J , ξ ∗ ) of (RP ) yields
an optimal solution z ∗ of (DRP ) such that
z∗b = −(ξ₁∗ + ⋯ + ξ_m∗) < 0.
In fact, if K∗ is the basis associated with the optimal solution (x∗_J, ξ∗) of (RP), Â = [A_J I_m], and ĉ = [0^⊤_p −1^⊤_m], then by Theorem 8.11 we have

z∗ = ĉ_{K∗}Â^{-1}_{K∗} = −1^⊤_m − (c̄_{K∗})_{(p+1,…,p+m)},

where (c̄_{K∗})_{(p+1,…,p+m)} denotes the row vector of reduced costs in the final tableau corresponding to the last m columns.
If we write
y(θ) = y + θz ∗ ,
then the new value of the objective function of (D) is
y(θ)b = yb + θz ∗ b, (∗2 )
and since z ∗ b < 0, we have a chance of improving the objective function of (D), that is,
decreasing its value for θ > 0 small enough, provided that y(θ) is feasible for (D). This will be the case iff y(θ)A ≥ c, that is, iff

yA + θz∗A ≥ c. (∗₃)
Now since y is a feasible solution of (D) we have yA ≥ c, so if z ∗ A ≥ 0 then (∗3 ) is satisfied
and y(θ) is a solution of (D) for all θ > 0, which means that (D) is unbounded. But this
implies that (P ) is not feasible.
Let us take a closer look at the inequalities z∗A ≥ 0. Since z∗ is an optimal (hence feasible) solution of (DRP), we know that z∗A_J ≥ 0, so the constraints for the indices j ∈ J are automatically satisfied; therefore, if z∗A^j ≥ 0 for all j ∈ N, then (P) is not feasible.
Otherwise, there is some j ∈ N = {1, . . . , n} − J such that
z ∗ Aj < 0,
and then, since by the definition of J we have yA^j > c_j for all j ∈ N, if we pick θ > 0 such that

θ ≤ (yA^j − c_j)/(−z∗A^j), j ∈ N, z∗A^j < 0,

then we decrease the objective function y(θ)b = yb + θz∗b of (D) (since z∗b < 0). Therefore we pick the best θ, namely

θ⁺ = min{ (yA^j − c_j)/(−z∗A^j) : j ∉ J, z∗A^j < 0 } > 0. (∗₄)
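The improvement step (∗₄) is straightforward to implement. Here is a sketch (the function and variable names are ours, not the book's):

```python
import numpy as np

def improve_dual(A, c, y, z, J, tol=1e-9):
    """One dual-improvement step of the primal-dual method.
    y: feasible solution of (D); z: optimal solution of (DRP) with zb < 0;
    J: the indices j with yA^j = c_j. Returns the improved y, or None
    when z A^j >= 0 for all j not in J, in which case (P) is not feasible."""
    n = A.shape[1]
    N = [j for j in range(n) if j not in J]
    cand = [j for j in N if z @ A[:, j] < -tol]
    if not cand:
        return None                                    # (D) unbounded, (P) infeasible
    theta = min((y @ A[:, j] - c[j]) / -(z @ A[:, j])  # theta+ from (*_4)
                for j in cand)
    return y + theta * z
```

In summary, the primal-dual algorithm proceeds by the following steps.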
Step 1. Find some feasible solution y of the dual program (D). We will show later
that this is always possible.
Step 2. Compute
J + = {j ∈ {1, . . . , n} | yAj = cj }.
Step 3. Set J = J + and solve the problem (RP ) using the simplex algorithm, starting
from the optimal solution determined during the previous round, obtaining the
optimal solution (x∗J , ξ ∗ ) with the basis K ∗ .
Step 4.

If ξ∗ = 0, then stop with an optimal solution u∗ for (P) such that u∗_J = x∗_J and the other components of u∗ are zero.

Else let

z∗ = −1^⊤_m − (c̄_{K∗})_{(p+1,…,p+m)},

compute θ⁺ according to (∗₄), update y to y⁺ = y + θ⁺z∗, and go back to Step 2.
If (u∗, ξ∗) with the basis K∗ is the optimal solution of the program (RP), Proposition 8.13 together with the last property of Theorem 8.11 allows us to restart the (RP) in Step 3 with (u∗, ξ∗)_{K∗} as initial solution (with basis K∗). For every j ∈ J − J⁺, column j is deleted, and for every j ∈ J⁺ − J, the new column A^j is computed by multiplying Â^{-1}_{K∗} and A^j, but Â^{-1}_{K∗} is the matrix Γ∗[1:m; p+1:p+m] consisting of the last m columns of Γ∗ in the final tableau, and the new reduced cost c̄_j is given by c_j − z∗A^j. Reusing the optimal solution of the previous (RP) may improve efficiency significantly.
Another crucial observation is that for any index j₀ ∈ N such that θ⁺ = (yA^{j₀} − c_{j₀})/(−z∗A^{j₀}), we have

y⁺A^{j₀} = yA^{j₀} + θ⁺z∗A^{j₀} = c_{j₀},

and so j₀ ∈ J⁺. This fact can be used to ensure that the primal-dual algorithm terminates in a finite number of steps (using a pivot rule that prevents cycling); see Papadimitriou and Steiglitz [45] (Theorem 5.4).
It remains to discuss how to pick some initial feasible solution y of the dual program (D).
If cj ≤ 0 for j = 1, . . . , n, then we can pick y = 0.
We should note that in many applications, the natural primal optimization problem is actually the minimization of some objective function cx = c₁x₁ + ⋯ + cₙxₙ, rather than its maximization. For example, many of the optimization problems considered in Papadimitriou and Steiglitz [45] are minimization problems.

Of course, minimizing cx is equivalent to maximizing −cx, so our presentation covers minimization too. But if we are dealing with a minimization problem, the weights c_j are often nonnegative, so from the point of view of maximization we will have −c_j ≤ 0 for all j, and we will be able to use y = 0 as a starting point.
Going back to our primal problem in maximization form and its dual in minimization
form, we still need to deal with the situation where cj > 0 for some j, in which case there
may not be any obvious y feasible for (D). Preferably we would like to find such a y very
cheaply.
There is a trick to deal with this situation. We pick some very large positive number M
and add to the set of equations Ax = b the new equation
x1 + · · · + xn + xn+1 = M,
with the new variable xn+1 constrained to be nonnegative. If the program (P ) has a fea-
sible solution, such an M exists. In fact, it can be shown that for any basic feasible solution
u = (u1 , . . . , un ), each |ui | is bounded by some expression depending only on A and b; see
Papadimitriou and Steiglitz [45] (Lemma 2.1). The proof is not difficult and relies on the fact
that the inverse of a matrix can be expressed in terms of certain determinants (the adjugates).
Unfortunately, this bound contains m! as a factor, which makes it quite impractical.
Having added the new equation above, we obtain the new set of equations
( A     0_n) (  x    )   ( b )
(1^⊤_n   1 ) (x_{n+1}) = ( M ),
We now solve (RP 1) via the simplex algorithm. The initial tableau with K = (2, 3, 4) and
J = {1} is
         x₁  ξ₁  ξ₂  ξ₃
      7 | 12   0   0   0
ξ₁ = 2 |  3   1   0   0
ξ₂ = 1 |  3   0   1   0
ξ₃ = 4 |  6   0   0   1
For (RP 1), c = (0, −1, −1, −1), (x1 , ξ1 , ξ2 , ξ3 ) = (0, 2, 1, 4), and the nonzero reduced cost is
given by
c̄₁ = 0 − (−1, −1, −1)(3, 3, 6)^⊤ = 12.
Since there is only one nonzero reduced cost, we must set j⁺ = 1. Since min{ξ₁/3, ξ₂/3, ξ₃/6} = 1/3, we see that k⁻ = 3 and K = (2, 1, 4). Hence we pivot on the entry 3 in row 2 (namely we divide row 2 by 3, and then subtract 3× (row 2) from row 1, 6× (row 2) from row 3, and 12× (row 2) from row 0), to obtain the tableau
           x₁  ξ₁   ξ₂  ξ₃
        3 |  0   0   −4   0
ξ₁ =   1 |  0   1   −1   0
x₁ = 1/3 |  1   0  1/3   0
ξ₃ =   2 |  0   0   −2   1
At this stage the simplex algorithm for (RP1) terminates since there are no positive reduced costs. Since the upper left corner of the final tableau is not zero, we proceed with Step 4 of the primal-dual algorithm and compute z∗, θ⁺, and the updated y⁺.
When we substitute y⁺ into (D), we discover that the first two constraints are equalities, and that the new J is J = {1, 2}. The new restricted primal (RP2) is

Maximize −(ξ₁ + ξ₂ + ξ₃)

subject to

(3  4  1  0  0)                            (2)
(3 −2  0  1  0) (x₁, x₂, ξ₁, ξ₂, ξ₃)^⊤ = (1)  and x₁, x₂, ξ₁, ξ₂, ξ₃ ≥ 0.
(6  4  0  0  1)                            (4)
Once again, we solve (RP2) via the simplex algorithm, where c = (0, 0, −1, −1, −1), (x₁, x₂, ξ₁, ξ₂, ξ₃) = (1/3, 0, 1, 0, 2) and K = (3, 1, 5). The initial tableau is obtained from the final tableau of the previous (RP1) by adding a column corresponding to the variable x₂, namely

           (1   −1  0) ( 4)   (  6 )
Â⁻¹_K A² = (0  1/3  0) (−2) = (−2/3)
           (0   −2  1) ( 4)   (  8 ),

with

c̄₂ = c₂ − z∗A² = 0 − (−1, 3, −1)(4, −2, 4)^⊤ = 14,
and we get
           x₁    x₂  ξ₁   ξ₂  ξ₃
        3 |  0    14   0   −4   0
ξ₁ =   1 |  0     6   1   −1   0
x₁ = 1/3 |  1  −2/3   0  1/3   0
ξ₃ =   2 |  0     8   0   −2   1
Note that j⁺ = 2 since the only positive reduced cost occurs in column 2. Also observe that since min{ξ₁/6, ξ₃/8} = ξ₁/6 = 1/6, we set k⁻ = 3, K = (2, 1, 5), and pivot on the entry 6 in row 1 to obtain the tableau

           x₁  x₂    ξ₁    ξ₂  ξ₃
     2/3 |  0   0  −7/3  −5/3   0
x₂ = 1/6 |  0   1   1/6  −1/6   0
x₁ = 4/9 |  1   0   1/9   2/9   0
ξ₃ = 2/3 |  0   0  −4/3  −2/3   1
Since the reduced costs are either zero or negative the simplex algorithm terminates, and
we compute
z∗ = (−1, −1, −1) − (−7/3, −5/3, 0) = (4/3, 2/3, −1),
For the only index j ∉ J with z∗A^j < 0, namely j = 4 with A⁴ = (1, −1, 1)^⊤, we compute

yA⁴ − c₄ = (−19/42, 5/14, −5/42)(1, −1, 1)^⊤ + 1 = 1/14 and −z∗A⁴ = −(4/3, 2/3, −1)(1, −1, 1)^⊤ = 1/3,

so

θ⁺ = (1/14)/(1/3) = 3/14,

and

y⁺ = (−19/42, 5/14, −5/42) + (3/14)(4/3, 2/3, −1) = (−1/6, 1/2, −1/3).
When we plug y + into (D), we discover that the first, second, and fourth constraints are
equalities, which implies J = {1, 2, 4}. Hence the new restricted primal (RP 3) is
Maximize −(ξ₁ + ξ₂ + ξ₃)

subject to

(3  4  1  1  0  0)                                 (2)
(3 −2 −1  0  1  0) (x₁, x₂, x₄, ξ₁, ξ₂, ξ₃)^⊤ = (1)  and x₁, x₂, x₄, ξ₁, ξ₂, ξ₃ ≥ 0.
(6  4  1  0  0  1)                                 (4)
The initial tableau for (RP3), with c = (0, 0, 0, −1, −1, −1), (x₁, x₂, x₄, ξ₁, ξ₂, ξ₃) = (4/9, 1/6, 0, 0, 0, 2/3) and K = (2, 1, 6), is obtained from the final tableau of the previous (RP2) by adding a column corresponding to the variable x₄, namely

           ( 1/6  −1/6  0) ( 1)   ( 1/3)
Â⁻¹_K A⁴ = ( 1/9   2/9  0) (−1) = (−1/9)
           (−4/3  −2/3  1) ( 1)   ( 1/3),

with

c̄₄ = c₄ − z∗A⁴ = 0 − (4/3, 2/3, −1)(1, −1, 1)^⊤ = 1/3,
and we get
           x₁  x₂    x₄    ξ₁    ξ₂  ξ₃
     2/3 |  0   0   1/3  −7/3  −5/3   0
x₂ = 1/6 |  0   1   1/3   1/6  −1/6   0
x₁ = 4/9 |  1   0  −1/9   1/9   2/9   0
ξ₃ = 2/3 |  0   0   1/3  −4/3  −2/3   1
Since the only positive reduced cost occurs in column 3, we set j⁺ = 3. Furthermore, since min{x₂/(1/3), ξ₃/(1/3)} = x₂/(1/3) = 1/2, we let k⁻ = 2, K = (3, 1, 6), and pivot on the entry 1/3 in row 1 to obtain
           x₁   x₂  x₄    ξ₁    ξ₂  ξ₃
     1/2 |  0   −1   0  −5/2  −3/2   0
x₄ = 1/2 |  0    3   1   1/2  −1/2   0
x₁ = 1/2 |  1  1/3   0   1/6   1/6   0
ξ₃ = 1/2 |  0   −1   0  −3/2  −1/2   1
At this stage, there are no positive reduced costs, and we must compute z∗, θ⁺, and the updated y⁺ as before. Substituting the new y⁺ into (D), the constraints holding with equality turn out to be those with indices J = {1, 3, 4}, and the new restricted primal (RP4) is

Maximize −(ξ₁ + ξ₂ + ξ₃)

subject to

(3 −3  1  1  0  0)                                 (2)
(3  6 −1  0  1  0) (x₁, x₃, x₄, ξ₁, ξ₂, ξ₃)^⊤ = (1)  and x₁, x₃, x₄, ξ₁, ξ₂, ξ₃ ≥ 0.
(6  0  1  0  0  1)                                 (4)
The initial tableau for (RP4), with c = (0, 0, 0, −1, −1, −1), (x₁, x₃, x₄, ξ₁, ξ₂, ξ₃) = (1/2, 0, 1/2, 0, 0, 1/2) and K = (3, 1, 6), is obtained from the final tableau of the previous (RP3) by replacing the column corresponding to the variable x₂ by a column corresponding to the variable x₃, namely

           ( 1/2  −1/2  0) (−3)   (−9/2)
Â⁻¹_K A³ = ( 1/6   1/6  0) ( 6) = ( 1/2)
           (−3/2  −1/2  1) ( 0)   ( 3/2),

with

c̄₃ = c₃ − z∗A³ = 0 − (3/2, 1/2, −1)(−3, 6, 0)^⊤ = 3/2,

and we get
           x₁    x₃  x₄    ξ₁    ξ₂  ξ₃
     1/2 |  0   3/2   0  −5/2  −3/2   0
x₄ = 1/2 |  0  −9/2   1   1/2  −1/2   0
x₁ = 1/2 |  1   1/2   0   1/6   1/6   0
ξ₃ = 1/2 |  0   3/2   0  −3/2  −1/2   1
By analyzing the top row of reduced costs, we see that j⁺ = 2. Furthermore, since min{x₁/(1/2), ξ₃/(3/2)} = ξ₃/(3/2) = 1/3, we let k⁻ = 6, K = (3, 1, 2), and pivot on the entry 3/2 in row 3 to obtain
           x₁  x₃  x₄   ξ₁    ξ₂    ξ₃
       0 |  0   0   0   −1    −1    −1
x₄ =   2 |  0   0   1   −4    −2     3
x₁ = 1/3 |  1   0   0  2/3   1/3  −1/3
x₃ = 1/3 |  0   1   0   −1  −1/3   2/3
Since the upper left corner of the final tableau is zero and the reduced costs are all ≤ 0,
we are finally finished. Then y = (19/3, 8/3, −14/3) is an optimal solution of (D), but more importantly (x₁, x₂, x₃, x₄) = (1/3, 0, 1/3, 2) is an optimal solution of our original linear program, with optimal value −10/3.
The primal-dual algorithm for linear programming doesn't seem to be the favorite method for solving linear programs nowadays. But it is important because of its basic principle: use a restricted (simpler) primal problem involving an objective function with fixed weights, namely 1, and let the dual problem provide feedback to the primal by improving the objective function of the dual. This principle has led to a whole class of combinatorial algorithms (often approximation algorithms) based on the primal-dual paradigm. The reader will get a taste
approximation algorithms) based on the primal-dual paradigm. The reader will get a taste
of this kind of algorithm by consulting Papadimitriou and Steiglitz [45], where it is explained
how classical algorithms such as Dijkstra’s algorithm for the shortest path problem, and Ford
and Fulkerson’s algorithm for max flow can be derived from the primal-dual paradigm.
Chapter 9

Basics of Combinatorial Topology
In order to study and manipulate complex shapes it is convenient to discretize these shapes
and to view them as the union of simple building blocks glued together in a “clean fashion.”
The building blocks should be simple geometric objects, for example, points, line segments,
triangles, tetrahedra and more generally simplices, or even convex polytopes. We will begin
by using simplices as building blocks.
The material presented in this chapter consists of the most basic notions of combinatorial
topology, going back roughly to the 1900-1930 period and it is covered in nearly every alge-
braic topology book (certainly the "classics"). A classic text (slightly old-fashioned, especially in its notation and terminology) is Alexandrov [1], Volume 1, and another more "modern" source is Munkres [43]. An excellent treatment from the point of view of computational geometry can be found in Boissonnat and Yvinec [12], especially Chapters 7 and 10. Another fascinating book covering a lot of the basics but devoted mostly to three-dimensional topology and geometry is Thurston [61].
One of the main goals of this chapter is to define a discrete (combinatorial) analog of
the notion of a topological manifold (with or without boundary). The key for doing this is
to define a combinatorial notion of nonsingularity of a face, and technically this is achieved
by defining the notions of star and link of a face. There are actually two variants of the
notion of star: closed stars and open stars. It turns out that the notion of nonsingularity is
captured well by defining a face to be nonsingular if its link is homeomorphic to a sphere or
to a closed ball. It is intuitively clear that if every face is nonsingular then the open star of
every face is a “nice” open set, either an open ball or the intersection of an open ball with
a half space.
However, proving this fact rigorously takes a surprising amount of work and requires the
introduction of new concepts such as the suspension of a complex and the join of complexes.
Once again, our geometric intuition in dimension greater than three is very unreliable, and we have to resort to algebraic arguments involving induction to be on solid ground.
9.1 Simplicial Complexes
Definition 9.1. Let E be any normed affine space, say E = Em with its usual Euclidean
norm. Given any n+1 affinely independent points a0 , . . . , an in E, the n-simplex (or simplex)
σ defined by a0 , . . . , an is the convex hull of the points a0 , . . . , an , that is, the set of all convex
combinations λ0 a0 + · · · + λn an , where λ0 + · · · + λn = 1 and λi ≥ 0 for all i, 0 ≤ i ≤ n;
the simplex σ is often denoted by (a0 , . . . , an ). We call n the dimension of the n-simplex σ,
and the points a0 , . . . , an are the vertices of σ; we denote the set of vertices {a0 , . . . , an } by
vert(σ). Given any subset {ai0 , . . . , aik } of {a0 , . . . , an } (where 0 ≤ k ≤ n), the k-simplex
generated by ai0 , . . . , aik is called a k-face or simply a face of σ. A face s of σ is a proper
face if s 6= σ (we agree that the empty set is a face of any simplex). For any vertex ai , the
face generated by a0 , . . . , ai−1 , ai+1 , . . . , an (i.e., omitting ai ) is called the face opposite ai .
Every face that is an (n − 1)-simplex is called a boundary face or facet. The union of the
boundary faces is the boundary of σ, denoted by ∂σ, and the complement of ∂σ in σ is the
interior Int σ = σ − ∂σ of σ. The interior Int σ of σ is sometimes called an open simplex.
See Figure 9.1.
Figure 9.1: Examples of simplices: a 0-simplex (a point a₀), a 1-simplex (a segment (a₀, a₁)), a 2-simplex (a triangle (a₀, a₁, a₂)), and a 3-simplex (a tetrahedron (a₀, a₁, a₂, a₃)), each with its boundary and interior indicated.
It should be noted that for a 0-simplex consisting of a single point {a0 }, ∂{a0 } = ∅, and
Int {a0 } = {a0 }. Of course, a 0-simplex is a single point, a 1-simplex is the line segment
(a0 , a1 ), a 2-simplex is a triangle (a0 , a1 , a2 ) (with its interior), and a 3-simplex is a tetrahe-
dron (a₀, a₁, a₂, a₃) (with its interior). The inclusion relation between any two faces σ and τ of some simplex s is written σ ≼ τ.
We now state a number of properties of simplices, whose proofs are left as an exercise.
Clearly, a point x belongs to the boundary ∂σ of σ iff at least one of its barycentric co-
ordinates (λ0 , . . . , λn ) is zero, and a point x belongs to the interior Int σ of σ iff all of its
barycentric coordinates (λ0 , . . . , λn ) are positive, i.e., λi > 0 for all i, 0 ≤ i ≤ n. Then, for
every x ∈ σ, there is a unique face s such that x ∈ Int s, the face generated by those points
ai for which λi > 0, where (λ0 , . . . , λn ) are the barycentric coordinates of x.
A simplex σ is convex, arcwise connected, compact, and closed. The interior Int σ of a
simplex is convex, arcwise connected, open, and σ is the closure of Int σ.
We now put simplices together to form more complex shapes, following Munkres [43].
The intuition behind the next definition is that the building blocks should be “glued cleanly.”
Definition 9.2. A simplicial complex in Em (for short, a complex in Em ) is a set K consisting
of a (finite or infinite) set of simplices in Em satisfying the following conditions:
(1) Every face of a simplex in K also belongs to K.

(2) For any two simplices σ₁, σ₂ ∈ K, if σ₁ ∩ σ₂ ≠ ∅, then σ₁ ∩ σ₂ is a common face of both σ₁ and σ₂.
Condition (2) guarantees that the various simplices forming a complex intersect nicely.
It is easily shown that the following condition is equivalent to condition (2):
(20 ) For any two distinct simplices σ1 , σ2 , Int σ1 ∩ Int σ2 = ∅.
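Abstractly, a finite simplicial complex can be manipulated as a set of vertex sets closed under taking nonempty subsets; condition (1) then becomes closure under subsets, while condition (2) constrains only the geometric realization. A small illustrative sketch (the helper names are ours):

```python
from itertools import combinations

def closure(maximal_simplices):
    """Build an abstract simplicial complex, as a set of frozensets,
    from its maximal simplices by adding all of their nonempty faces."""
    K = set()
    for s in maximal_simplices:
        s = tuple(s)
        for k in range(1, len(s) + 1):
            K.update(frozenset(f) for f in combinations(s, k))
    return K

def is_complex(K):
    """Check condition (1): every nonempty face of a simplex in K is in K."""
    return all(frozenset(f) in K
               for s in K for k in range(1, len(s))
               for f in combinations(tuple(s), k))

# Two triangles glued along the edge (v2, v3), as in Figure 9.2:
K = closure([("v1", "v2", "v3"), ("v2", "v3", "v4")])
print(is_complex(K), len(K))  # True 11  (4 vertices + 5 edges + 2 triangles)
```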
Remarks:
1. A simplicial complex, K, is a combinatorial object, namely, a set of simplices satisfying
certain conditions but not a subset of Em . However, every complex, K, yields a subset
of Em called the geometric realization of K and denoted |K|. This object will be
defined shortly and should not be confused with the complex. Figure 9.2 illustrates
this aspect of the definition of a complex. For clarity, the two triangles (2-simplices)
are drawn as disjoint objects even though they share the common edge, (v2 , v3 ) (a
1-simplex) and similarly for the edges that meet at some common vertex.
Figure 9.2: A set of simplices forming a complex on the vertices v₁, v₂, v₃, v₄; for clarity, the two 2-simplices sharing the edge (v₂, v₃) are drawn as disjoint objects.
2. Unlike the situation for polyhedra, where all faces are external in the sense that they
belong to the boundary of the polyhedron, the situation for simplicial complexes is
more subtle; a face of a simplicial complex can be internal or external. For example,
the 1-simplex (v₂, v₃) for the simplicial complex shown in Figure 9.2 is internal, but the 1-simplex (v₁, v₂) is external. If we consider the simplicial complex consisting of the
faces of a tetrahedron, then every edge (1-simplex) is internal. However, if we consider
the simplicial complex consisting of a (solid) tetrahedron, then its facets (2-simplices)
and edges (1-simplices) are external. These matters will be clarified in Definition 9.7.
Some collections of simplices violating some of the conditions of Definition 9.2 are shown
in Figure 9.3. On the left, the intersection of the two 2-simplices is neither an edge nor a
vertex of either triangle. In the middle case, two simplices meet along an edge which is not
an edge of either triangle. On the right, there is a missing edge and a missing vertex.
Some “legal” simplicial complexes are shown in Figure 9.5.
The union |K| of all the simplices in K is a subset of Em . We can define a topology
on |K| by defining a subset F of |K| to be closed iff F ∩ σ is closed in σ for every face
σ ∈ K. It is immediately verified that the axioms of a topological space are indeed satisfied.
Figure 9.4: The geometric realization of the complex of Figure 9.2.
The resulting topological space |K| is called the geometric realization of K. The geometric
realization of the complex from Figure 9.2 is shown in Figure 9.4.
Obviously, |σ| = σ for every simplex, σ. Also, note that distinct complexes may have the
same geometric realization. In fact, all the complexes obtained by subdividing the simplices
of a given complex yield the same geometric realization.
A polytope is the geometric realization of some simplicial complex. A polytope of di-
mension 1 is usually called a polygon, and a polytope of dimension 2 is usually called a
polyhedron. Unfortunately the term “polytope” is overloaded since the polytopes induced
by simplicial complexes are generally not convex. Consequently, if we use the term polytope
for the objects defined in Chapter 5, we should really say “convex polytope” to avoid am-
biguity. When K consists of infinitely many simplices we usually require that K be locally
finite, which means that every vertex belongs to finitely many faces. If K is locally finite,
then its geometric realization, |K|, is locally compact.
In the sequel, we will consider only finite simplicial complexes, that is, complexes K
consisting of a finite number of simplices. In this case, the topology of |K| defined above
is identical to the topology induced from E^m. Also, for any simplex σ in K, Int σ coincides with the interior σ̊ of σ in the topological sense, and ∂σ coincides with the boundary of σ in the topological sense.
Definition 9.3. Given any complex, K2 , a subset K1 ⊆ K2 of K2 is a subcomplex of K2
iff it is also a complex. For any complex, K, of dimension d, for any i with 0 ≤ i ≤ d, the
subset
K (i) = {σ ∈ K | dim σ ≤ i}
is called the i-skeleton of K. Clearly, K (i) is a subcomplex of K. See Figure 9.6. We also let
K i = {σ ∈ K | dim σ = i}.
Observe that K 0 is the set of vertices of K and K i is not a complex. A simplicial complex,
K1 is a subdivision of a complex K2 iff |K1 | = |K2 | and if every face of K1 is a subset of some
face of K2 . A complex K of dimension d is pure (or homogeneous) iff every face of K is a
face of some d-simplex of K (i.e., some cell of K). See Figure 9.7. A complex is connected
iff |K| is connected.
It is easy to see that a complex is connected iff its 1-skeleton is connected. The intuition
behind the notion of a pure complex, K, of dimension d is that a pure complex is the result
of gluing pieces all having the same dimension, namely, d-simplices. For example, in Figure
9.8, the complex on the left is not pure but the complex on the right is pure of dimension 2.
Figure 9.6: A complex |K| and its 1-skeleton K^(1).
Definition 9.4. Let K be any complex and let σ be any face of K. The star St(σ) (or if we need to be very precise, St(σ, K)) of σ is the subcomplex of K consisting of all faces τ containing σ and of all faces of τ, that is,

St(σ) = {τ′ ∈ K | there is some τ ∈ K with σ ≼ τ and τ′ ≼ τ}.

The link Lk(σ) (or Lk(σ, K)) of σ is the subcomplex of K consisting of all faces in St(σ) that do not intersect σ, that is,

Lk(σ) = {τ ∈ St(σ) | τ ∩ σ = ∅}.
To simplify notation, if σ = {v} is a vertex we write St(v) for St({v}) and Lk(v) for Lk({v}). Figure 9.9 illustrates the star St(v) and the link Lk(v) of a vertex v.
If K is pure and of dimension d, then St(σ) is also pure of dimension d and if dim σ = k,
then Lk(σ) is pure of dimension d − k − 1.
For technical reasons, following Munkres [43], besides defining the complex St(σ), it is
useful to introduce the open star of σ.
Definition 9.5. Given a complex K, for any simplex σ in K, the open star of σ, denoted
st(σ), is defined as the subspace of |K| consisting of the union of the interiors Int(τ) = τ − ∂τ
of all the faces τ containing σ.
Figure 9.8: (a) A complex that is not pure. (b) A pure complex.
According to this definition, the open star of σ is not a complex but instead a subset of |K|. Note that, writing cl(st(σ)) for the closure of st(σ),

cl(st(σ)) = |St(σ)|,

that is, the closure of st(σ) is the geometric realization of the complex St(σ). Then lk(σ) = |Lk(σ)| is the union of the simplices in St(σ) that are disjoint from σ. If σ is a vertex v, we have

lk(v) = cl(st(v)) − st(v).

However, beware that if σ is not a vertex, then lk(σ) is properly contained in cl(st(σ)) − st(σ)! See Figures 9.10 and 9.11.
One of the nice properties of the open star st(σ) of σ is that it is open. To see this,
observe that for any point a ∈ |K|, there is a unique smallest simplex σ = (v₀, …, vₖ) such that a ∈ Int σ, that is,

a = λ₀v₀ + ⋯ + λₖvₖ with λᵢ > 0 for all i, 0 ≤ i ≤ k;

the point a belongs to st(v) iff v is one of the vertices v₀, …, vₖ,
and thus, |K| − st(v) is the union of all the faces of K that do not contain v as a vertex,
obviously a closed set; see Figure 9.12. Thus, st(v) is open in |K|. It is also quite clear that
st(v) is path connected. Moreover, for any k-face σ of K, if σ = (v₀, …, vₖ), then a ∈ st(σ) iff all of v₀, …, vₖ are vertices of the smallest simplex containing a, that is,

st(σ) = st(v₀) ∩ ⋯ ∩ st(vₖ).
Consequently, st(σ) is open and path connected, as illustrated in Figure 9.13.
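These notions are easy to compute for the abstract complexes of the earlier sketch. The following illustrative code (it reuses the closure helper defined above) computes St(σ) and Lk(σ), and checks the identity st(σ) = st(v₀) ∩ ⋯ ∩ st(vₖ) at the level of the simplices whose interiors make up the open stars:

```python
def star(K, sigma):
    """Closed star St(sigma): the simplices containing sigma, with all their faces."""
    top = {t for t in K if sigma <= t}
    return {f for f in K if any(f <= t for t in top)}

def link(K, sigma):
    """Link Lk(sigma): the faces of St(sigma) that do not intersect sigma."""
    return {f for f in star(K, sigma) if not (f & sigma)}

def open_star_carriers(K, sigma):
    """The simplices whose interiors are glued together to form st(sigma)."""
    return {t for t in K if sigma <= t}

K = closure([("v1", "v2", "v3"), ("v2", "v3", "v4")])  # complex from the earlier sketch
e = frozenset({"v2", "v3"})
print(sorted(map(sorted, link(K, e))))  # [['v1'], ['v4']]: Lk(e) is two vertices

# st(sigma) = st(v0) ∩ ... ∩ st(vk), compared carrier by carrier:
inter = set.intersection(*(open_star_carriers(K, frozenset({v})) for v in e))
print(open_star_carriers(K, e) == inter)  # True
```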
Unfortunately, the "nice" equation

St(σ) = St(v₀) ∩ ⋯ ∩ St(vₖ)

is false! (and analogously for Lk(σ)). For a counter-example, which is illustrated in Figure 9.14, consider the boundary of a tetrahedron and the star of a facet (a 2-simplex).
Figure 9.10: For the pure 2-dimensional complex |K|, an illustration of |St(v)|, st(v), and
lk(v).
Figure 9.11: Given the edge σ in the pure 2-dimensional complex |K|, observe that lk(σ) ⊂ cl(st(σ)) − st(σ).
Figure 9.12: The open star st(v₁) of a vertex v₁ in a 2-dimensional complex |K|.
Proposition 9.1 shows that if every face of K is nonsingular, then the link of every internal
face is a sphere whereas the link of every external face is a ball.
The main goal of the rest of this section is to show that if K is a pure complex of dimension d and if all its k-faces are nonsingular (0 ≤ k ≤ d − 1), then st(σ) is homeomorphic either to B^d or to B̄^d − B̄^{d−1}. As a consequence, the geometric realization |K| of K is a manifold.
Although the above facts are easy to check for d = 1, 2, and in some simple cases for
d = 3, a rigorous proof requires a fair amount of work and the introduction of several new
concepts. The key point is that we need to express St(σ) in terms of Lk(σ), and for this, we
need the notion of join of complexes, two special cases of which are the notion of cone and of suspension. These two notions allow building the sphere S^{d+1} and the closed ball B̄^{d+1} from the sphere S^d and the closed ball B̄^d, and allow an inductive argument on d. There are
many technical details which we will omit to simplify the exposition. Complete details and
proofs can be found in Munkres [43] (Chapter 8, Section 62).
Figure 9.13: Given the edge σ in the pure 2-dimensional complex |K|, st(σ) = st(v1 ) ∩ st(v2 ).
Definition 9.8. Given any complex K in En , if dim K = d < n, for any point v ∈ En such
that v does not belong to the affine hull of |K|, the cone on K with vertex v, denoted v ∗ K, is the complex consisting of all simplices of the form (v, a₀, …, aₖ) and their faces, where (a₀, …, aₖ) is any k-face of K. If K = ∅, we set v ∗ K = v. See Figure 9.15.
Remark: Unfortunately, the word “cone” is overloaded. It might have been better to use
the locution pyramid instead of cone as some authors do (for example, Ziegler). However,
since we have been following Munkres [43], a standard reference in algebraic topology, we
decided to stick with the terminology used in that book, namely, “cone.”
If σ is a simplex in a complex K, we will need to express St(σ) in terms of σ and its link
Lk(σ), and |St(σ)| − st(σ) in terms of ∂σ and Lk(σ). For this, we will need a generalization
Figure 9.14: Let |K| be the boundary of the solid tetrahedron. The star of a triangular face
is itself and contains only three edges. It is not the intersection of the star of its vertices,
since the star of a vertex contains all six edges of the tetrahedron.
of the above notion of cone to two simplicial complexes K and L, called the join of two
complexes.
Definition 9.9. Given any two disjoint nonempty complexes K and L in En such that
dim(K) + dim(L) ≤ n − 1, if for any simplex σ = (v0 , . . . , vh ) in K and any simplex
τ = (w0 , . . . , wk ) in L, the points (v0 , . . . , vh , w0 , . . . , wk ) are affinely independent, then we
define σ ∗ τ as the simplex
σ ∗ τ = (v0 , . . . , vh , w0 , . . . , wk );
more rigorously, σ ∗τ is the (h+k +1)-simplex spanned by the points (v0 , . . . , vh , w0 , . . . , wk ).
If the collection of all the simplices σ ∗ τ and their faces is a simplicial complex, then this
complex is denoted by K ∗ L and is called the join of K and L.
Note that if K ∗ L is a complex, then its dimension is dim(K) + dim(L) + 1, which implies
that dim(K) + dim(L) ≤ n − 1.
Figure 9.15: On the left is the two-dimensional planar complex |K|. On the right is the
geometric realization of |v∗K|. It consists of a solid blue tetrahedron and a peach tetrahedral
shell.
Observe that a cone v ∗ L corresponds to the special case where K is a complex consisting
of the single vertex v. If K = {v0 , v1 } is the complex consisting of two distinct vertices (with
no edge between them), and if {v0 , v1 } ∗ L is a complex, then it is called a suspension of L
and it is denoted by S(L) or susp(L). The suspension of L is the complex consisting of the
union of the two cones v0 ∗ L and v1 ∗ L.
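At the level of abstract complexes, the join, cone, and suspension are simple set operations. Here is a sketch (it reuses the closure helper from the earlier code; the affine independence that makes the geometric join well defined is not checked):

```python
def join(K, L):
    """Combinatorial join K * L: all unions sigma ∪ tau with sigma in K or empty
    and tau in L or empty (vertex sets assumed disjoint)."""
    Ke = K | {frozenset()}
    Le = L | {frozenset()}
    return {s | t for s in Ke for t in Le} - {frozenset()}

L = closure([("a", "b"), ("b", "c")])                    # a 1-dimensional complex (a path)
cone = join({frozenset({"v"})}, L)                       # cone v * L
susp = join({frozenset({"p0"}), frozenset({"p1"})}, L)   # suspension S(L)

dim = lambda C: max(len(s) for s in C) - 1
print(dim(cone), dim(susp))  # 2 2: dim(K * L) = dim(K) + dim(L) + 1
```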
Two problems immediately come to mind:
(1) Characterize the geometric realization |K ∗ L| of the join K ∗ L of two complexes K
and L in terms of the geometric realizations |K| and |L| of K and L, if K ∗ L is indeed
a complex.
(2) Find a sufficient condition on |K| and |L| that implies that K ∗ L is a complex.
The following proposition gives answers to these problems and gives a necessary and
sufficient condition for K ∗ L to be a complex. The proof is quite technical and not very
illuminating so we refer the reader to Munkres [43] (Chapter 8, Lemma 62.1).
Proposition 9.2. Let K and L be any two disjoint nonempty complexes in En such that
dim(K) + dim(L) ≤ n − 1.
(a) If K ∗L is a complex, then its geometric realization |K ∗L| is the union of all the closed
line segments [x, y] joining some point x in |K| to some point y in |L|. Two such line segments intersect in at most a common endpoint.
(b) Conversely, if every pair of line segments joining points of |K| and points of |L| inter-
sect in at most a common endpoint, then K ∗ L is a complex.
Proposition 9.2 shows that |K ∗ L| can be expressed in terms of the realizations of cones of the form v ∗ L, where v ∈ K, as

|K ∗ L| = ⋃_{v∈|K|} |v ∗ L|.
A few more technical propositions, all proved in Munkres [43] (see Chapter 8, Section
62), will be needed.
A surjective function f : X → Y between two topological spaces X and Y is called
a quotient map if a subset V of Y is open iff f −1 (V ) is open in X. A quotient map is
automatically continuous.
Proposition 9.3. Suppose K ∗ L is a well-defined complex where K and L are finite com-
plexes (it suffices to assume that K is locally finite). Then the map

π : |K| × |L| × [0, 1] → |K ∗ L|

given by

π(x, y, t) = (1 − t)x + ty

is a quotient map. For every x ∈ |K| and every y ∈ |L|, the map π collapses {x} × |L| × {0} to the point x and |K| × {y} × {1} to the point y. Otherwise, π is injective.
Using Proposition 9.3 we can prove the following “obvious” proposition which turns out
to be very handy.
Proposition 9.4. Suppose K ∗ L and M ∗ N are well-defined finite complexes (it suffices to assume that K is locally finite). If |K| ≅ |M| and |L| ≅ |N|, then |K ∗ L| ≅ |M ∗ N|.
The following proposition shown in Munkres [43] (Chapter 8, Lemma 62.6) is crucial.
The proof actually follows pretty much from the definitions.
Proposition 9.6. For any complex K of dimension d and any k-simplex σ ∈ K (0 ≤ k ≤ d − 1), we have

St(σ) = σ ∗ Lk(σ),

and

st(σ) ≅ |St(σ)| − |∂σ ∗ Lk(σ)|.

By convention, σ ∗ ∅ = σ if Lk(σ) = ∅, and ∅ ∗ Lk(σ) = Lk(σ) if ∂σ = ∅. See Figures 9.16 and 9.17.
Figure 9.16: A 2-dimensional complex |K|, the link Lk(v) of a vertex v, and |St(v)| = |v ∗ Lk(v)|.
Figure 9.18 shows a 3-dimensional complex. The link of the edge (v₆, v₇) is the pentagon P = (v₁, v₂, v₃, v₄, v₅) ≅ S¹. The link of the vertex v₇ is the cone v₆ ∗ P ≅ B̄². The link of (v₁, v₂) is (v₆, v₇) ≅ B̄¹, and the link of v₁ is the union of the triangles (v₂, v₆, v₇) and (v₅, v₆, v₇), which is homeomorphic to B̄².
The following technical propositions are needed to show that if K is any pure complex of dimension d, nonsingularity of all the faces implies that every open star is an open subset homeomorphic either to B^d or to B^d ∩ H^d, where

H^d = {(x₁, …, x_d) ∈ R^d | x_d ≥ 0}.
Figure 9.17: The open star st(v₁v₂) of an edge in a 2-dimensional complex |K|, obtained from |v ∗ Lk(v₁v₂)| by removing |S⁰ ∗ Lk(v₁v₂)| = |∂(v₁v₂) ∗ Lk(v₁v₂)|.
see Munkres [43] (Chapter 1, Lemma 1.1). The following formulae are also easy to show but they are essential to carry out induction on the dimension of spheres or (closed) balls. Informally, the first formula says that a cone over the sphere S^d is homeomorphic to the closed ball B̄^{d+1}, the second formula says that a cone over the closed ball B̄^d is homeomorphic to the closed ball B̄^{d+1}, the third formula says that the suspension of the sphere S^d is homeomorphic to the sphere S^{d+1}, and the fourth formula says that the suspension of the closed ball B̄^d is homeomorphic to the closed ball B̄^{d+1}.

Figure 9.18: A 3-dimensional complex on the vertices v₁, …, v₇.
Proposition 9.8. For every d ≥ 1 and every k-simplex σ (0 ≤ k ≤ d − 1), we have

|σ ∗ ∂∆^{d−k}| ≅ B̄^d
|∂σ ∗ ∂∆^{d−k}| ≅ S^{d−1}.
Proof. We proceed by induction on d ≥ 1. For the base case d = 1, we must have k = 0, so σ = v and ∂σ = ∅ for some vertex v, and then by Proposition 9.7

|v ∗ ∂∆¹| ≅ B̄¹

and

|∅ ∗ ∂∆¹| = |∂∆¹| ≅ S⁰.

For the induction step for the first formula, we use Proposition 9.7, which says that

|∂∆^{d−k} ∗ {v₀, v₁}| ≅ S^{d−k} ≅ |∂∆^{d−k+1}|.

For any k-simplex σ with 0 ≤ k ≤ d − 1, by Proposition 9.4 we have

|σ ∗ ∂∆^{d−k+1}| ≅ |σ ∗ (∂∆^{d−k} ∗ {v₀, v₁})| ≅ |(σ ∗ ∂∆^{d−k}) ∗ {v₀, v₁}|.

By the induction hypothesis, we have

|σ ∗ ∂∆^{d−k}| ≅ B̄^d = |∆^d|,

so by Proposition 9.4 and Proposition 9.7, we have

|σ ∗ ∂∆^{d−k+1}| ≅ |∆^d ∗ {v₀, v₁}| ≅ B̄^{d+1},

which concludes the induction step for the first formula; the second formula is handled by the same argument.
Since |∂∆^{d−k}| ≅ S^{d−k−1}, with a slight abuse of notation the formulae of Proposition 9.8 can be written as

|σ ∗ S^{d−k−1}| ≅ B̄^d
|∂σ ∗ S^{d−k−1}| ≅ S^{d−1}.
The following proposition is the counterpart of Proposition 9.8 for balls instead of spheres.

Proposition 9.9. For every d ≥ 1 and every k-simplex σ (0 ≤ k ≤ d − 1), we have

|σ ∗ ∆^{d−k−1}| ≅ B̄^d
|∂σ ∗ ∆^{d−k−1}| ≅ B̄^{d−1}.
Proof. We proceed by induction on d ≥ 1. For the base case d = 1, we must have k = 0, so σ = v for some vertex v, and then by Proposition 9.7

|v ∗ ∆⁰| ≅ B̄¹,

and by definition

|∅ ∗ ∆⁰| = |∆⁰| ≅ B̄⁰.

For the induction step for the first formula, we use Proposition 9.7, which says that

|∆^{d−k−1} ∗ v| ≅ B̄^{d−k} = |∆^{d−k}|.

For any k-simplex σ with 0 ≤ k ≤ d − 1, by Proposition 9.4 we have

|σ ∗ ∆^{d−k}| ≅ |σ ∗ (∆^{d−k−1} ∗ v)| ≅ |(σ ∗ ∆^{d−k−1}) ∗ v|.

By the induction hypothesis, we have

|σ ∗ ∆^{d−k−1}| ≅ B̄^d = |∆^d|,

so by Proposition 9.4 and Proposition 9.7, we have

|σ ∗ ∆^{d−k}| ≅ |(σ ∗ ∆^{d−k−1}) ∗ v| ≅ |∆^d ∗ v| ≅ B̄^{d+1}.

For a d-simplex σ, since |σ| ≅ |∆^d| and |∆⁰| = |v|, by Proposition 9.4 and Proposition 9.7, we have

|σ ∗ ∆⁰| ≅ |∆^d ∗ v| ≅ B̄^{d+1}.
This concludes the induction step for the first formula and proves that

|σ ∗ ∆^{d−k−1}| ≅ B̄^d.

For the second formula, if k = 0 then σ = v is a vertex and ∂σ = ∅, so

|∅ ∗ ∆^{d−1}| = |∆^{d−1}| ≅ B̄^{d−1}.

For any k-simplex σ with 1 ≤ k ≤ d − 1, by Proposition 9.4 we have

|∂σ ∗ ∆^{d−k}| ≅ |∂σ ∗ (∆^{d−k−1} ∗ v)| ≅ |(∂σ ∗ ∆^{d−k−1}) ∗ v|.

By the induction hypothesis, we have

|∂σ ∗ ∆^{d−k−1}| ≅ B̄^{d−1} = |∆^{d−1}|,

so by Proposition 9.4 and Proposition 9.7, we have

|∂σ ∗ ∆^{d−k}| ≅ |(∂σ ∗ ∆^{d−k−1}) ∗ v| ≅ |∆^{d−1} ∗ v| ≅ B̄^d.

For a d-simplex σ, since |∂σ| ≅ |∂∆^d| and |∆⁰| ≅ |v|, by Proposition 9.4 and Proposition 9.7, we have

|∂σ ∗ ∆⁰| ≅ |∂σ ∗ v| ≅ |∂∆^d ∗ v| ≅ B̄^d.

This concludes the induction step for the second formula and proves that

|∂σ ∗ ∆^{d−k−1}| ≅ B̄^{d−1},

as claimed.

Since |∆^{d−k−1}| ≅ B̄^{d−k−1}, with a slight abuse of notation the formulae of Proposition 9.9 can be written as

|σ ∗ B̄^{d−k−1}| ≅ B̄^d
|∂σ ∗ B̄^{d−k−1}| ≅ B̄^{d−1}.
Finally, we can prove that for any pure complex K of dimension d, nonsingularity of all the faces implies that the open star of any internal face is homeomorphic to B^d, and that the open star of any boundary face is homeomorphic to B̄^d − B̄^{d−1}. This result for pure complexes K without boundaries is stated in Thurston [61] (Chapter 3, Proposition 3.2.5).
Theorem 9.10. Let K be any pure complex of dimension d. If every face of K is nonsingular, then st(σ) ≅ B^d for every internal k-face σ of K, and st(σ) ≅ B̄^d − B̄^{d−1} for every boundary k-face σ of K (0 ≤ k ≤ d − 1).
Proof. By Proposition 9.6, for any complex K of dimension d and any k-simplex σ ∈ K (0 ≤ k ≤ d − 1), we have

St(σ) = σ ∗ Lk(σ),

and

st(σ) ≅ |St(σ)| − |∂σ ∗ Lk(σ)|.

If σ is an internal face, then

|Lk(σ)| ≅ S^{d−k−1} ≅ |∂∆^{d−k}|,

so by Proposition 9.4 and Proposition 9.8,

st(σ) ≅ |St(σ)| − |∂σ ∗ Lk(σ)|
      ≅ |σ ∗ ∂∆^{d−k}| − |∂σ ∗ ∂∆^{d−k}|
      ≅ B̄^d − S^{d−1}
      ≅ B^d.

If σ is a boundary face, then |Lk(σ)| ≅ B̄^{d−k−1} ≅ |∆^{d−k−1}|, so by Proposition 9.4 and Proposition 9.9,

st(σ) ≅ |St(σ)| − |∂σ ∗ Lk(σ)|
      ≅ |σ ∗ ∆^{d−k−1}| − |∂σ ∗ ∆^{d−k−1}|
      ≅ B̄^d − B̄^{d−1},

as claimed.
Theorem 9.10 has the following corollary, which shows that any pure complex for which every face is nonsingular is a manifold.

Proposition 9.11. Let K be any pure complex of dimension d. If every face of K is nonsingular, then its geometric realization |K| is a d-manifold (with boundary).
Proof. Any point a ∈ |K| belongs to some simplex σ, so we proceed by induction on the dimension of σ. If σ is a vertex v, then by Theorem 9.10 we have st(v) ≅ B^d or st(v) ≅ B̄^d − B̄^{d−1} ≅ B^d ∩ H^d. If σ is a simplex of dimension k + 1, then any point a ∈ ∂σ on the boundary of σ belongs to a simplex of dimension at most k, and the induction hypothesis implies that there is an open subset U ⊆ |K| containing a such that U ≅ B^d or U ≅ B^d ∩ H^d. Otherwise, a belongs to the interior of σ, and we conclude by Theorem 9.10 since st(σ) ≅ B^d or st(σ) ≅ B̄^d − B̄^{d−1} ≅ B^d ∩ H^d.
Remark: Thurston states that Proposition 9.11 holds for pure complexes without boundaries under the weaker assumption that lk(v) ≅ S^{d−1} for every vertex; see Thurston [61], Chapter 3, Proposition 3.2.5. A proof of the more general fact that if lk(v) ≅ S^{d−1} or lk(v) ≅ B̄^{d−1} for every vertex then every face is nonsingular can be found in Stallings [54]; see Section 4.4, Proposition 4.4.12. The proof requires several technical lemmas and is quite involved.
Here are more useful propositions about pure complexes without singularities.
Proposition 9.12. Let K be any pure complex of dimension d. If every facet of K is
nonsingular, then every facet of K is contained in at most two cells (d-simplices).
Proof. If |K| ⊆ E^d, then this is an immediate consequence of the definition of a complex. Otherwise, consider the link lk(σ) of a facet σ. By hypothesis, either lk(σ) ≅ B̄⁰ or lk(σ) ≅ S⁰. As B̄⁰ = {0}, S⁰ = {−1, 1} and dim Lk(σ) = 0, we deduce that Lk(σ) has either one or two points, which proves that σ belongs to at most two d-simplices.
Proposition 9.13. Let K be any pure and connected complex of dimension d. If every face of K is nonsingular, then for every pair of cells (d-simplices) σ and σ′, there is a sequence of cells σ₀, …, σ_p, with σ₀ = σ and σ_p = σ′, and such that σᵢ and σᵢ₊₁ have a common facet, for i = 0, …, p − 1.

Proof. We proceed by induction on d, using the fact that the links are connected for d ≥ 2.
Proposition 9.14. Let K be any pure complex of dimension d. If every facet of K is
nonsingular, then the boundary, bd(K), of K is a pure complex of dimension d − 1 with an
empty boundary. Furthermore, if every face of K is nonsingular, then every face of bd(K)
is also nonsingular.
Proof. Left as an exercise.
Remark: Since the building blocks of a polyhedral complex are convex polytopes it might
be more appropriate to use the term “polytopal complex” rather than “polyhedral complex”
and some authors do that. On the other hand, most of the traditional literature uses the
terminology polyhedral complex so we will stick to it. There is a notion of complex where
the building blocks are cones but these are called fans.
Every convex polytope, P , yields two natural polyhedral complexes:
(i) The polyhedral complex K(P ) consisting of P together with all of its faces. This
complex has a single cell, namely P itself.
(ii) The boundary complex K(∂P ) consisting of all faces of P other than P itself. The cells
of K(∂P ) are the facets of P .
The notions of k-skeleton and pureness are defined just as in the simplicial case. The
notions of star and link are defined for polyhedral complexes just as they are defined for
simplicial complexes except that the word “face” now means face of a polytope. Now, by
Theorem 5.7, every polytope σ is the convex hull of its vertices.
Let vert(σ) denote the set of vertices of σ. Then, we have the following crucial observation:
Given any polyhedral complex K, for every point x ∈ |K|, there is a unique polytope σ_x ∈ K such that x ∈ Int(σ_x) = σ_x − ∂σ_x. For every vertex v of K, we define the function t_v : |K| → R that tests whether x belongs to the interior of some face (polytope) of K having v as a vertex as follows:

t_v(x) = 1 if v ∈ vert(σ_x), and t_v(x) = 0 if v ∉ vert(σ_x),

where σ_x is the unique face of K such that x ∈ Int(σ_x).
Now, just as in the simplicial case, the open star st(v) of a vertex v ∈ K is given by

st(v) = {x ∈ |K| | t_v(x) = 1},

and it is an open subset of |K| (the set |K| − st(v) is the union of the polytopes of K that do not contain v as a vertex, a closed subset of |K|). Also, for any face σ of K, the open star st(σ) of σ is given by

st(σ) = {x ∈ |K| | t_v(x) = 1 for all v ∈ vert(σ)} = ⋂_{v∈vert(σ)} st(v).
Theorem (Grünbaum). If P and Q are two polytopes as above with P = conv(Q ∪ {v}), then for any proper face F of Q, the polytope conv(F ∪ {v}) is a face of P iff either

(a) v ∈ aff(F); or

(b) among the facets of Q containing F there is at least one such that v is beneath it, and at least one which is visible from v.
The above theorem implies that the new simplices that need to be added to form a
triangulation of P are the convex hulls conv(F ∪ {v}) associated with facets F of Q visible
from v. The reader should check that everything really works out!
With all this preparation, it is now quite natural to define combinatorial manifolds.
Other authors use the term triangulation of PL-manifold for what we call a combinatorial
manifold.
It is easy to see that the connected components of a combinatorial 1-manifold are either
simple closed polygons or simple chains (“simple” means that the interiors of distinct edges
are disjoint). A combinatorial 2-manifold which is connected is also called a combinatorial
surface (with or without boundary). Proposition 9.14 immediately yields the following result:
Now, because we are assuming that X sits in some Euclidean space, En , the space X
is Hausdorff and second-countable. (Recall that a topological space is second-countable iff
there is a countable family {Ui }i≥0 of open sets of X such that every open subset of X is
the union of open sets from this family.) Since it is desirable to have a good match between
manifolds and combinatorial manifolds, we are led to the definition below.
Recall that
Hd = {(x1 , . . . , xd ) ∈ Rd | xd ≥ 0}.
Definition 9.12. For any d ≥ 1, a (topological) d-manifold with boundary is a second-
countable, topological Hausdorff space M , together with an open cover (Ui )i∈I of open sets
in M and a family (ϕi )i∈I of homeomorphisms ϕi : Ui → Ωi , where each Ωi is some open
subset of Hd in the subset topology. Each pair (U, ϕ) is called a coordinate system or
chart of M, each homeomorphism ϕi : Ui → Ωi is called a coordinate map, and its inverse ϕi⁻¹ : Ωi → Ui is called a parameterization of Ui. The family (Ui, ϕi)i∈I is often called an
atlas for M . A (topological) bordered surface is a connected 2-manifold with boundary. If
for every homeomorphism ϕi : Ui → Ωi , the open set Ωi ⊆ Hd is actually an open set in Rd
(which means that xd > 0 for every (x1 , . . . , xd ) ∈ Ωi ), then we say that M is a d-manifold .
For example, the two orientations of a 2-simplex with vertices 0, 1, 2 correspond to the two classes of orderings {(0, 1, 2), (1, 2, 0), (2, 0, 1)} and {(2, 1, 0), (1, 0, 2), (0, 2, 1)}.
Remark: It is possible to define the notion of orientation of a manifold but this is quite
technical and we prefer to avoid digressing into this matter. This shows another advantage
of combinatorial manifolds: The definition of orientability is simple and quite natural.
There are non-orientable (combinatorial) surfaces, for example, the Möbius strip which
can be realized in E3 . The Möbius strip is a surface with boundary, its boundary being a
circle. There are also non-orientable (combinatorial) surfaces such as the Klein bottle or
the projective plane but they can only be realized in E4 (in E3 , they must have singularities
such as self-intersection). We will only be dealing with orientable manifolds and, most of
the time, surfaces.
One of the most important invariants of combinatorial (and topological) manifolds is
their Euler(-Poincaré) characteristic. In the next chapter, we prove a famous formula due
to Poincaré giving the Euler characteristic of a convex polytope. For this, we will introduce
a technique of independent interest called shelling.
Chapter 10

Shellings and the Euler–Poincaré Formula
10.1 Shellings
The notion of shellability is motivated by the desire to give an inductive proof of the Euler–
Poincaré formula in any dimension. Historically, this formula was discovered by Euler for three-dimensional polytopes in 1752 (but it was already known to Descartes around 1640). If f0, f1 and f2 denote the number of vertices, edges and faces of a three-dimensional polytope P (i.e., the number of i-faces of P for i = 0, 1, 2), then the Euler formula states that
that
f0 − f1 + f2 = 2.
The proof of Euler’s formula is not very difficult but one still has to exercise caution. Euler’s
formula was generalized to arbitrary d-dimensional polytopes by Schläfli (1852) but the
first correct proof was given by Poincaré. For this, Poincaré had to lay the foundations of
algebraic topology and after a first “proof” given in 1893 (containing some flaws) he finally
gave the first correct proof in 1899. If fi denotes the number of i-faces of the d-dimensional
polytope, P , (with f−1 = 1 and fd = 1), the Euler–Poincaré formula states that:
    Σ_{i=0}^{d−1} (−1)^i fi = 1 − (−1)^d.
Definition 10.1. Let K be a pure polyhedral complex of dimension d. A shelling of K is a linear ordering F1, . . . , Fs of the cells of K such that:

(i) The boundary complex K(∂F1) of the first cell F1 of K has a shelling.

(ii) For any j, 1 < j ≤ s, the intersection of the cell Fj with the previous cells is nonempty and is an initial segment of a shelling of the (d − 1)-dimensional boundary complex of Fj, that is,

    Fj ∩ (⋃_{i=1}^{j−1} Fi) = G1 ∪ G2 ∪ · · · ∪ Gr,

for some shelling G1, . . . , Gr, . . . , Gt of K(∂Fj).
Note that shellability is only defined for pure complexes. Here are some examples of shellable complexes:

(1) Every 0-dimensional complex, that is, every set of points, is shellable, by definition.
[Figure 10.1: three 2-complexes with numbered cells; the left and middle complexes (as ordered) are not shellable, while the right one is.]
(2) A 1-dimensional complex is a graph without loops and parallel edges. A 1-dimensional complex is shellable iff it is connected, which implies that it has no isolated vertices. Any ordering of the edges e1, . . . , es such that {e1, . . . , ei} induces a connected subgraph for every i will do; such an ordering can be defined inductively, due to the connectivity of the graph (see the sketch following this list of examples).
(3) Every simplex is shellable. In fact, any ordering of its facets yields a shelling. This is
easily shown by induction on the dimension, since the intersection of any two facets Fi
and Fj is a facet of both Fi and Fj .
(4) The d-cubes are shellable. By induction on the dimension, it can be shown that
every ordering of the 2d facets F1 , . . . , F2d such that F1 and F2d are opposite (that is,
F2d = −F1 ) yields a shelling.
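Example (2) can be turned into a small algorithm. Here is a minimal Python sketch (our own illustration; the function name edge_shelling is ours): it greedily orders the edges of a connected graph so that every prefix induces a connected subgraph.

    def edge_shelling(edges):
        # Order the edges of a connected graph so that every prefix
        # {e1, ..., ei} induces a connected subgraph, as in example (2):
        # grow from the first edge, always adding an edge that touches
        # an already-seen vertex.
        remaining = set(edges)
        order = [edges[0]]
        seen = set(edges[0])
        remaining.remove(edges[0])
        while remaining:
            e = next(e for e in remaining if e[0] in seen or e[1] in seen)
            order.append(e)
            seen.update(e)
            remaining.remove(e)
        return order

    # The connected graph of Figure 10.4: vertices 1..6 and
    # edges 12, 13, 34, 35, 45, 36, 56.
    print(edge_shelling([(1, 2), (1, 3), (3, 4), (3, 5), (4, 5), (3, 6), (5, 6)]))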
However, already for 2-complexes, problems arise. For example, in Figure 10.1, the left
and the middle 2-complexes are not shellable but the right complex is shellable.
The problem with the left complex is that cells 1 and 2 intersect at a vertex, which is not
1-dimensional. In the middle complex shown in Figure 10.1, the intersection of cell 8 with its
predecessors is not connected, so the particular order chosen is not a shelling. However, there
are other orders that constitute a shelling. In contrast, the ordering of the right complex is
a shelling. However, observe that the reverse ordering is not a shelling because cell 4 has an
empty intersection with cell 5!
Remarks:
1. Condition (i) in Definition 10.1 is redundant because, as we shall prove shortly, every polytope is shellable. However, if we want to use this definition for more general complexes, then Condition (i) is necessary.

2. When K is a simplicial complex, Condition (ii) can be replaced by the equivalent condition:

(ii′) For any j, with 1 < j ≤ s, the intersection of Fj with the previous cells is nonempty and pure (d − 1)-dimensional. This means that for every i < j there is some l < j such that Fi ∩ Fj ⊆ Fl ∩ Fj and Fl ∩ Fj is a facet of Fj.
The following proposition yields an important piece of information about the local struc-
ture of shellable simplicial complexes; see Ziegler [67], Chapter 8.
Proposition 10.1. Let K be a shellable simplicial complex and say F1, . . . , Fs is a shelling for K. Then, for every vertex v, the restrictions of the above sequence to the link Lk(v) and to the star St(v) are shellings.
Since the complex K(P ) associated with a polytope P has a single cell, namely P itself,
note that by condition (i) in the definition of a shelling, K(P ) is shellable iff the complex
K(∂P) is shellable. We will simply say that “P is shellable” instead of “K(∂P) is shellable.”
We have the following useful property of shellings of polytopes whose proof is left as an
exercise (use induction on the dimension):
Proposition 10.2. Given any polytope, P , if F1 , . . . , Fs is a shelling of P , then the reverse
sequence Fs , . . . , F1 is also a shelling of P .
Proposition 10.2 generally fails for complexes that are not polytopes; see the right 2-complex in Figure 10.1.
We will now present the proof that every polytope is shellable, using a technique invented
by Bruggesser and Mani (1970) known as line shelling [17]. This is quite a simple and
natural idea if one is willing to ignore the technical details involved in actually checking that
it works. We begin by explaining this idea in the 2-dimensional case, a convex polygon, since
it is particularly simple.
Consider the 2-polytope P shown in Figure 10.2 (a polygon) whose faces are labeled
F1 , F2 , F3 , F4 , F5 . Pick any line ` intersecting the interior of P and intersecting the supporting
lines of the facets of P (i.e., the edges of P ) in distinct points labeled z1 , z2 , z3 , z4 , z5 (such
a line can always be found, as will be shown shortly). Orient the line ` (say, upward), and
travel on ` starting from the point of P where ` leaves P , namely z1 . For a while, only face
F1 is visible, but when we reach the intersection z2 of ` with the supporting line of F2 , the
face F2 becomes visible, and F1 becomes invisible as it is now hidden by the supporting line
of F2 . So far, we have seen the faces F1 and F2 , in that order . As we continue traveling along
`, no new face becomes visible but for a more complicated polygon, other faces Fi would
become visible one at a time as we reach the intersection zi of ` with the supporting line of
Fi , and the order in which these faces become visible corresponds to the ordering of the zi ’s
along the line `. Then, we imagine that we travel very fast and when we reach “+∞” in the
upward direction on `, we instantly come back on ` from below at “−∞”. At this point, we
only see the face of P corresponding to the lowest supporting line of faces of P , i.e., the line
corresponding to the smallest zi , in our case z3 . At this stage, the only visible face is F3 .
We continue traveling upward on ` and we reach z3 , the intersection of the supporting line
of F3 with `. At this moment, F4 becomes visible, and F3 disappears as it is now hidden
by the supporting line of F4 . Note that F5 is not visible at this stage. Finally, we reach z4 ,
the intersection of the supporting line of F4 with `, and at this moment the last facet F5
becomes visible (and F4 becomes invisible, F3 being also invisible). Our trip stops when we
reach z5, the intersection of the supporting line of F5 with ℓ. During the second phase of our trip, we saw F3, F4 and F5, and the entire trip yields the sequence F1, F2, F3, F4, F5, which is easily seen to be a shelling of P.
[Figure 10.2: a polygon P with facets F1, . . . , F5; the line ℓ meets the supporting lines of the facets in the points z1, . . . , z5.]
This is the crux of the Bruggesser-Mani method for shelling a polytope: We travel along
a suitably chosen line and record the order in which the faces become visible during this
trip. This is why such shellings are called line shellings.
In order to prove that polytopes are shellable we need the notion of points and lines in “general position.” Recall from the equivalence of V-polytopes and H-polytopes that a polytope P in Ed with nonempty interior is cut out by t irredundant hyperplanes Hi, and by picking the origin in the interior of P, the equations of the Hi may be assumed to be of the form

    ai · z = 1,

where ai and aj are not proportional for all i ≠ j, so that

    P = {z ∈ Ed | ai · z ≤ 1, 1 ≤ i ≤ t}.
Definition 10.2. Let P be any polytope in Ed with nonempty interior and assume that P is cut out by the irredundant hyperplanes Hi of equations ai · z = 1, for i = 1, . . . , t. A point c ∈ Ed is said to be in general position w.r.t. P if c does not belong to any of the Hi, that is, if ai · c ≠ 1 for i = 1, . . . , t. A line ℓ is said to be in general position w.r.t. P if ℓ is not parallel to any of the Hi, and if ℓ intersects the Hi in distinct points.
The following proposition showing the existence of lines in general position w.r.t. a
polytope illustrates a very useful technique, the “perturbation method.” The “trick” behind
this particular perturbation method is that polynomials (in one variable) have a finite number
of zeros.
Proposition 10.3. Let P be any polytope in Ed with nonempty interior. For any two points x and y in Ed, with x outside of P, y in the interior of P, and x in general position w.r.t. P, for λ ∈ R small enough, the line ℓλ through x and yλ, with

    yλ = y + (λ, λ², . . . , λ^d),

is in general position w.r.t. P and intersects the interior of P.
Proof. Write Λ = (λ, λ², . . . , λ^d) and u = y − x, so that yλ − x = u + Λ, and the points of ℓλ are those of the form x + s(u + Λ) for s ∈ R. The line ℓλ fails to be in general position w.r.t. P iff either ℓλ is parallel to some Hi, that is,

    pi(λ) = ai · (u + Λ) = 0 for some i, 1 ≤ i ≤ t,

or ℓλ meets two distinct hyperplanes Hi and Hj in the same point, that is,

    s pi(λ) = αi and s pj(λ) = αj for some s and some i ≠ j,

where αi = 1 − ai · x and αj = 1 − aj · x. As x is in general position w.r.t. P, we have αi, αj ≠ 0, and as the Hi are irredundant, the polynomials pi(λ) = ai · (u + Λ) and pj(λ) = aj · (u + Λ) are not proportional; also, since ai ≠ 0, each pi is a nonzero polynomial, so its zero set Z(pi) is finite. Now, if λ ∉ Z(pi) ∪ Z(pj), in order for the system

    s pi(λ) = αi
    s pj(λ) = αj

to have a solution in s, we must have

    qij(λ) = αj pi(λ) − αi pj(λ) = 0,

where qij(λ) is not the zero polynomial, since pi(λ) and pj(λ) are not proportional and αi, αj ≠ 0. If we pick λ ∉ Z(qij), then qij(λ) ≠ 0. Therefore, if we pick

    λ ∉ ⋃_{i=1}^{t} Z(pi) ∪ ⋃_{i≠j} Z(qij),

the line ℓλ is in general position w.r.t. P. Finally, we can pick λ small enough so that yλ = y + Λ is close enough to y that it is in the interior of P.
Definition 10.3. Given any point x strictly outside a polytope P, we say that a facet F of P is visible from x iff for every y ∈ F the line through x and y intersects P only in y (equivalently, x and the interior of P are strictly separated by the supporting hyperplane of F).
We now prove the following fundamental theorem due to Bruggesser and Mani [17] (1970):
Theorem 10.4. (Existence of Line Shellings for Polytopes) Let P be any polytope in Ed of
dimension d. For every point x outside P and in general position w.r.t. P , there is a shelling
of P in which the facets of P that are visible from x come first.
Proof. By Proposition 10.3, we can find a line ℓ through x such that ℓ is in general position w.r.t. P and ℓ intersects the interior of P. Pick one of the two facets in which ℓ intersects the boundary of P, say F1, let z1 = ℓ ∩ F1, and orient ℓ from the inside of P to z1. As ℓ intersects the supporting hyperplanes of the facets of P in distinct points, we get a linearly ordered list of these intersection points along ℓ,

    z1, z2, · · · , zm, zm+1, · · · , zs,

where zm+1 is the smallest element, zm is the largest element, and where z1 and zs belong to the facets of P where ℓ intersects the boundary of P. Then, as in the example illustrated by Figure 10.2, by
travelling “upward” along the line ` starting from z1 we get a total ordering of the facets of
P,
F1 , F2 , . . . , Fm , Fm+1 , . . . , Fs
where Fi is the facet whose supporting hyperplane cuts ` in zi .
Remark: The trip along the line ` is often described as a rocket flight starting from the
surface of P viewed as a little planet (for instance, this is the description given by Ziegler
[67] (Chapter 8)). Observe that if we reverse the direction of `, we obtain the reversal of the
original line shelling. Thus, the reversal of a line shelling is not only a shelling but a line
shelling as well.
We can easily prove the following corollary:

Corollary 10.5. Given any polytope P, the following facts hold:

(1) For any two facets F and F′, there is a shelling of P in which F comes first and F′ comes last.

(2) For any vertex v of P, there is a shelling of P in which the facets containing v form an initial segment of the shelling.
Proof. For (1), we use a line in general position intersecting F and F′ in their interiors. For (2), we pick a point x beyond v and pick a line in general position through x intersecting the interior of P. Pick the origin O in the interior of P. A point x is beyond v iff x and O lie on different sides of every hyperplane Hi supporting a facet of P containing v, but on the same side of Hi for every hyperplane Hi supporting a facet of P not containing v. Such a point can be found on a line through O and v, as the reader should check.
Remark: A plane triangulation K is a pure two-dimensional complex in the plane such that
|K| is homeomorphic to a closed disk. Edelsbrunner proves that every plane triangulation
has a shelling, and from this, that χ(K) = 1, where χ(K) = f0 −f1 +f2 is the Euler–Poincaré
characteristic of K, where f0 is the number of vertices, f1 is the number of edges and f2 is
the number of triangles in K (see Edelsbrunner [25], Chapter 3). This result is an immediate
consequence of Corollary 10.5 if one knows about the stereographic projection map, which will be discussed in the next chapter.
We now have all the tools needed to prove the famous Euler–Poincaré Formula for Poly-
topes.
Given a polyhedral complex K of dimension d, let

    f(K) = (f0, . . . , fd) ∈ Nd+1,

where fi is the number of i-faces of K, be the f-vector associated with K (if necessary we write fi(K) instead of fi). The Euler–Poincaré characteristic χ(K) of K is defined by

    χ(K) = f0 − f1 + f2 − · · · + (−1)^d fd = Σ_{i=0}^{d} (−1)^i fi.
Given any d-dimensional polytope P , the f -vector associated with P is the f -vector associ-
ated with K(P ), that is,
f (P ) = (f0 , · · · , fd ) ∈ Nd+1 ,
where fi is the number of i-faces of P (= the number of i-faces of K(P) and thus, fd = 1), and the Euler–Poincaré characteristic χ(P) of P is defined by

    χ(P) = f0 − f1 + f2 − · · · + (−1)^d fd = Σ_{i=0}^{d} (−1)^i fi.
Moreover, the f -vector associated with the boundary ∂P of P is the f -vector associated
with K(∂P ), that is,
f (∂P ) = (f0 , · · · , fd−1 ) ∈ Nd ,
where fi is the number of i-faces of ∂P (with 0 ≤ i ≤ d − 1), and the Euler–Poincaré characteristic χ(∂P) of ∂P is defined by

    χ(∂P) = f0 − f1 + f2 − · · · + (−1)^{d−1} fd−1 = Σ_{i=0}^{d−1} (−1)^i fi.
Remark: It is convenient to set f−1 = 1. Then some authors, including Ziegler [67] (Chapter 8), define the reduced Euler–Poincaré characteristic χ′(K) of a polyhedral complex (or a polytope) K as

    χ′(K) = −f−1 + f0 − f1 + f2 − · · · + (−1)^d fd = Σ_{i=−1}^{d} (−1)^i fi = −1 + χ(K).
To prove our next theorem we will use complete induction on N × N ordered by the lexicographic ordering. Recall that the lexicographic ordering on N × N is defined as follows: (m, n) < (m′, n′) iff either m < m′, or m = m′ and n < n′.
Theorem 10.6. (Euler–Poincaré Formula) For every d-dimensional polytope P, we have

    χ(P) = Σ_{i=0}^{d} (−1)^i fi = 1,

and so

    χ(∂P) = Σ_{i=0}^{d−1} (−1)^i fi = 1 − (−1)^d  (d ≥ 1).
Proof. We prove the following statement: For every d-dimensional polytope P , if d = 0 then
χ(P ) = 1,
else if d ≥ 1 then for every shelling F1, . . . , Ffd−1 of P, for every j, with 1 ≤ j ≤ fd−1, we have

    χ(F1 ∪ · · · ∪ Fj) = 1 if 1 ≤ j < fd−1,  and  χ(F1 ∪ · · · ∪ Fj) = 1 − (−1)^d if j = fd−1.
We proceed by complete induction on (d, j) ≥ (0, 1). For d = 0 and j = 1, the polytope P
consists of a single point and so, χ(P ) = f0 = 1, as claimed.
For the induction step, assume that d ≥ 1. For 1 = j < fd−1 , since F1 is a polytope of
dimension d − 1, by the induction hypothesis, χ(F1 ) = 1, as desired.
For 1 < j < fd−1 , by (∗) we have
    χ(F1 ∪ · · · ∪ Fj−1 ∪ Fj) = χ(⋃_{i=1}^{j−1} Fi) + χ(Fj) − χ((⋃_{i=1}^{j−1} Fi) ∩ Fj).  (∗∗)
Since Fj is a (d − 1)-dimensional polytope, by the induction hypothesis, χ(Fj) = 1, and applying the induction hypothesis to (d, j − 1), we also have χ(⋃_{i=1}^{j−1} Fi) = 1. Moreover, since F1, . . . , Ffd−1 is a shelling,

    (⋃_{i=1}^{j−1} Fi) ∩ Fj = G1 ∪ · · · ∪ Gr,
for some shelling G1 , . . . , Gr , . . . , Gt of K(∂Fj ), with r < t = fd−2 (∂Fj ). The fact that
r < fd−2 (∂Fj ), i.e., that G1 ∪ · · · ∪ Gr is not the whole boundary of Fj is a property of line
shellings and also follows from Proposition 10.2. As dim(∂Fj ) = d − 2, and r < fd−2 (∂Fj ),
by the induction hypothesis, we have
    χ((⋃_{i=1}^{j−1} Fi) ∩ Fj) = χ(G1 ∪ · · · ∪ Gr) = 1.
Consequently, (∗∗) yields

    χ(F1 ∪ · · · ∪ Fj−1 ∪ Fj) = 1 + 1 − 1 = 1,

as claimed. Finally, for j = fd−1, the intersection of the last facet Ffd−1 with the previous facets is its entire boundary, that is,

    (⋃_{i=1}^{fd−1−1} Fi) ∩ Ffd−1 = G1 ∪ · · · ∪ G_{fd−2(Ffd−1)} = ∂Ffd−1,

and since ∂Ffd−1 is the complete boundary of a (d − 1)-dimensional polytope, the induction hypothesis yields χ(∂Ffd−1) = 1 − (−1)^{d−1}. Therefore, (∗∗) now gives

    χ(∂P) = χ(F1 ∪ · · · ∪ Ffd−1) = 1 + 1 − (1 − (−1)^{d−1}) = 1 − (−1)^d,

and

    χ(P) = χ(∂P) + (−1)^d = 1,

proving our theorem.
Remark: Other combinatorial proofs of the Euler–Poincaré formula are given in Grünbaum
[35] (Chapter 8), Boissonnat and Yvinec [12] (Chapter 7) and Ewald [26] (Chapter III).
Coxeter gives a proof very close to Poincaré’s own proof using notions of homology theory
[20] (Chapter IX). We feel that the proof based on shellings is the most direct and one of
the most elegant. Incidentally, the above proof of the Euler–Poincaré formula is very close to Schläfli's proof from 1852, but Schläfli did not have shellings at his disposal, so his “proof” had a gap. The Bruggesser-Mani proof that polytopes are shellable fills this gap!
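As a quick sanity check (our own illustration), the following Python sketch verifies the Euler–Poincaré formula on the boundaries of d-simplices and d-cubes, whose f-vectors are known in closed form.

    from math import comb

    def f_simplex_boundary(d):
        # an i-face of a d-simplex is any choice of i+1 of its d+1 vertices
        return [comb(d + 1, i + 1) for i in range(d)]

    def f_cube_boundary(d):
        # an i-face of the d-cube fixes d-i of the coordinates to 0 or 1
        return [comb(d, i) * 2 ** (d - i) for i in range(d)]

    for d in range(1, 9):
        for f in (f_simplex_boundary(d), f_cube_boundary(d)):
            chi = sum((-1) ** i * fi for i, fi in enumerate(f))
            assert chi == 1 - (-1) ** d
    print("Euler-Poincare formula verified for d = 1, ..., 8")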
Given a pure shellable simplicial complex K of dimension d − 1 with vertex set V, fix a shelling F1, . . . , Fs of K. For j = 1, . . . , s, identifying each face of K with its set of vertices, the restriction set Rj of the cell Fj is defined by

    Rj = {v ∈ Fj | Fj − {v} ⊆ F1 ∪ · · · ∪ Fj−1}.

Observe that R1 = ∅. The crucial property of the Rj is that the new faces G added at
step j (when Fj is added to the shelling) are precisely the faces in the set
Ij = {G ⊆ V | Rj ⊆ G ⊆ Fj }.
The proof of the above fact is left as an exercise to the reader, or see Ziegler [67] (Chapter
8, Section 8.3).
But then, we obtain a partition {I1, . . . , Is} of the set of faces of the simplicial complex (other than K itself). Note that the empty face is allowed. Now, if we define

    hi = |{j | |Rj| = i, 1 ≤ j ≤ s}|

for i = 0, . . . , d, then it turns out that we can recover the fk in terms of the hi as follows:
    fk−1 = Σ_{j=1}^{s} \binom{d−|Rj|}{k−|Rj|} = Σ_{i=0}^{k} \binom{d−i}{k−i} hi,

with 1 ≤ k ≤ d.
But more is true: the above equations are invertible and the hk can be expressed in terms of the fi as follows:

    hk = Σ_{i=0}^{k} (−1)^{k−i} \binom{d−i}{d−k} fi−1,

with 0 ≤ k ≤ d (remember, f−1 = 1).
Let us explain all this in more detail. Consider the example of a connected graph (a
simplicial 1-dimensional complex) from Ziegler [67] (Section 8.3) shown in Figure 10.4.
[Figure 10.4: a connected graph C with vertices 1, 2, 3, 4, 5, 6 and edges 12, 13, 34, 35, 45, 36, 56.]
In the above example, we have R1 = ∅, R2 = {3}, R3 = {4}, R4 = {5}, R5 = {4, 5},
R6 = {6} and R7 = {5, 6}, I1 = {∅, 1, 2, 12}, I2 = {3, 13}, I3 = {4, 34}, I4 = {5, 35},
I5 = {45}, I6 = {6, 36}, I7 = {56}, and the “minimal” new faces (corresponding to the Rj ’s)
added at every stage of the shelling are
∅, 3, 4, 5, 45, 6, 56.
Definition 10.6. For any shellable pure simplicial complex K of dimension d − 1, if hi is
the number of blocks Ij such that the corresponding restriction set Rj has size i, that is,
hi = |{j | |Rj | = i, 1 ≤ j ≤ s}| for i = 0, . . . , d,
then we define the h-vector associated with K as
h(K) = (h0 , . . . , hd ).
In other words, hi is the number of minimal faces in the partition that have i vertices, with h0 = 1.
In our example, as R1 = ∅, R2 = {3}, R3 = {4}, R4 = {5}, R5 = {4, 5}, R6 = {6} and R7 = {5, 6}, we have h0 = 1, h1 = 4, and h2 = 2, that is,
h(C) = (1, 4, 2).
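The inversion formula above is easily implemented; here is a minimal Python sketch (our own illustration) recovering h(C) = (1, 4, 2) from the f-vector of the graph C.

    from math import comb

    def h_from_f(f, d):
        # f = (f_{-1}, f_0, ..., f_{d-1});
        # h_k = sum_{i=0}^{k} (-1)^(k-i) * C(d-i, d-k) * f_{i-1}, k = 0..d
        return [sum((-1) ** (k - i) * comb(d - i, d - k) * f[i]
                    for i in range(k + 1)) for k in range(d + 1)]

    # The graph C of Figure 10.4: d = 2 and f = (1, 6, 7).
    print(h_from_f([1, 6, 7], 2))   # prints [1, 4, 2]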
Looking at Figure 10.5, we see that for every horizontal layer i (starting from 0) of the lattice, hi is the number of nodes (in bold) that are minimal in some block of the partition.
Now, let us show that if K is a shellable simplicial complex, then the f -vector can be
recovered from the h-vector. Indeed, K is a pure simplicial complex so every face is contained
in a face of dimension d − 1 which has d vertices, and if |Rj | = i, then each (k − 1)-face in
the block Ij must use all i nodes in Rj , so that there are only d − i nodes available and,
among those, k − i must be chosen. Therefore,

    fk−1 = Σ_{j=1}^{s} \binom{d−|Rj|}{k−|Rj|},

where 1 ≤ k ≤ d. Moreover, the formulae are invertible, that is, the hi can be expressed in terms of the fk. For this, form the two polynomials
    f(x) = Σ_{i=0}^{d} fi−1 x^{d−i} = fd−1 + fd−2 x + · · · + f0 x^{d−1} + f−1 x^d

and h(x) = Σ_{i=0}^{d} hi x^{d−i}. Then the two sets of equations above are equivalent to the single polynomial identity h(x) = f(x − 1).
In particular, h0 = 1, h1 = f0 − d, and
h0 + h1 + · · · + hd = fd−1 .
Now, we just showed that if K is shellable, then its f -vector and its h-vector are related
as above. But even if K is not shellable, the above suggests defining the h-vector from the
f -vector as above. Thus, we make the definition:
Definition 10.7. For any (d − 1)-dimensional pure simplicial complex K, the h-vector associated with K is the vector

    h(K) = (h0, . . . , hd)

given by

    hk = Σ_{i=0}^{k} (−1)^{k−i} \binom{d−i}{d−k} fi−1.
Note that if K is shellable, then the interpretation of hi as the number of cells, Fj , such
that the corresponding restriction set, Rj , has size i shows that hi ≥ 0. However, for an
arbitrary simplicial complex, some of the hi can be strictly negative. Such an example is
given in Ziegler [67] (Section 8.3).
We summarize below most of what we just showed:
Proposition 10.7. Let K be a (d−1)-dimensional pure simplicial complex. If K is shellable,
then its h-vector is nonnegative and hi counts the number of cells in a shelling whose restric-
tion set has size i. Moreover, the hi do not depend on the particular shelling of K.
There is a way of computing the h-vector of a pure simplicial complex from its f-vector reminiscent of the Pascal triangle (except that negative entries can turn up). This method is known as Stanley's trick; see Stanley [55]. For this, we write the numbers fi at the ends of the rows of Pascal's triangle (in the place where ordinarily we would put \binom{i+1}{i+1} = 1), and then we compute the other entries using the rule that each entry is the difference between the entry above it and the entry immediately to its left in the row above.
For example, for the graph C of Figure 10.4, for which the f-vector is f = (1, 6, 7), we obtain the following table:

    1
    1   6
    1   5   7
    1   4   2

The bottom row is the h-vector h(C) = (1, 4, 2).
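Here is a minimal Python sketch of Stanley's trick (our own illustration): starting from f−1 = 1, each new row consists of a leading 1, the successive differences of the previous row, and the next fi; one final differencing step yields the h-vector.

    def stanley_trick(f):
        # f = (f_{-1}, f_0, ..., f_{d-1}); returns the h-vector
        row = [f[0]]
        for fi in f[1:]:
            # each inner entry is (entry above) - (entry above and left)
            row = [1] + [row[j] - row[j - 1] for j in range(1, len(row))] + [fi]
        return [1] + [row[j] - row[j - 1] for j in range(1, len(row))]

    print(stanley_trick([1, 6, 7]))   # prints [1, 4, 2]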
For a simplicial 3-polytope, every facet is a triangle and every edge belongs to exactly two facets, so 2f1 = 3f2. The Dehn–Sommerville equations generalize this kind of relation to simplicial polytopes of any dimension.

Theorem 10.8. (Dehn–Sommerville Equations) Let P be any simplicial d-polytope. Then the h-vector of its boundary complex K(∂P) satisfies

    hk = hd−k,  k = 0, 1, . . . , d.

Equivalently,

    fk−1 = Σ_{i=k}^{d} (−1)^{d−i} \binom{i}{k} fi−1,  k = 0, . . . , d.

Furthermore, the equation h0 = hd is equivalent to the Euler–Poincaré formula.
Proof. We present a short and elegant proof due to McMullen. Recall from Proposition 10.2 that the reversal Fs, . . . , F1 of a shelling F1, . . . , Fs of a polytope is also a shelling. From this, we see that for every Fj, the restriction set of Fj in the reversed shelling is equal to Fj − Rj, the complement of the restriction set of Fj in the original shelling. Therefore, if |Rj| = k, then Fj contributes “1” to hk in the original shelling iff it contributes “1” to hd−k in the reversed shelling (where |Fj − Rj| = d − k). It follows that the value of hk computed in the original shelling is the same as the value of hd−k computed in the reversed shelling. However, by Proposition 10.7, the h-vector is independent of the shelling and hence, hk = hd−k.
To prove the second equation, following Ewald [26] (Chapter III, Theorem 3.7), define the polynomials F(x) and H(x) by

    F(x) = Σ_{i=0}^{d} fi−1 x^i;  H(x) = (1 − x)^d F(x/(1 − x)).
Note that H(x) = Σ_{i=0}^{d} fi−1 x^i (1 − x)^{d−i}, and an easy computation shows that the coefficient of x^k is equal to

    Σ_{i=0}^{k} (−1)^{k−i} \binom{d−i}{d−k} fi−1 = hk.
Now, the equations hk = hd−k are equivalent to
F (y − 1) = (−1)d F (−y)
for all y 6= 0 (since y = 1/(1 − x)). But F (x − 1) and (−1)d F (−x) are polynomials that
have the same value for infinitely many real values, so in fact the polynomials F (x − 1) and
(−1)d F (−x) are identical. As
As

    F(x − 1) = Σ_{i=0}^{d} fi−1 (x − 1)^i = Σ_{i=0}^{d} Σ_{j=0}^{i} fi−1 \binom{i}{i−j} (−1)^j x^{i−j},

the coefficient of x^k in F(x − 1) is Σ_{i=k}^{d} (−1)^{i−k} \binom{i}{k} fi−1.
On the other hand, the coefficient of x^k in (−1)^d F(−x) is (−1)^{d+k} fk−1. By equating the coefficients of x^k, we get

    (−1)^{d+k} fk−1 = Σ_{i=k}^{d} (−1)^{i−k} \binom{i}{k} fi−1,

and multiplying both sides by (−1)^{d+k} yields the second equation.
For example, for d = 3 the Dehn–Sommerville equations reduce to

    h0 = h3, h1 = h2,

and for d = 4 to

    h0 = h4, h1 = h3.

In terms of the f-vector, since every (d − 2)-face of a simplicial d-polytope belongs to exactly two facets and every facet has d faces of dimension d − 2, we get

    2fd−2 = dfd−1;

for d = 4 this reads 2f2 = 4f3. For d = 5 the equations are

    h0 = h5, h1 = h4, h2 = h3,

and so on.
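As a numerical illustration (ours), one can check the Dehn–Sommerville equations on the boundary of the icosahedron, a simplicial 3-polytope with f-vector (12, 30, 20).

    from math import comb

    def h_from_f(f, d):   # same formula as in the earlier sketch
        return [sum((-1) ** (k - i) * comb(d - i, d - k) * f[i]
                    for i in range(k + 1)) for k in range(d + 1)]

    # icosahedron boundary: d = 3, f = (f_{-1}, f_0, f_1, f_2) = (1, 12, 30, 20)
    h = h_from_f([1, 12, 30, 20], 3)
    print(h)                                     # prints [1, 9, 9, 1]
    assert all(h[k] == h[3 - k] for k in range(4))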
It can be shown that for general d-polytopes, the Euler–Poincaré formula is the only equation satisfied by all h-vectors, and for simplicial d-polytopes, the ⌊(d+1)/2⌋ Dehn–Sommerville equations, hk = hd−k, are the only equations satisfied by all h-vectors (see Grünbaum [35], Chapter 9).
Remark: Readers familiar with homology and cohomology may suspect that the Dehn–
Sommerville equations are a consequence of a type of Poincaré duality. Stanley proved that
this is indeed the case. It turns out that the hi are the dimensions of cohomology groups of
a certain toric variety associated with the polytope. For more on this topic, see Stanley [56]
(Chapters II and III) and Fulton [28] (Section 5.6).
As we saw for 3-dimensional simplicial polytopes, the number of vertices, n = f0, determines the number of edges and the number of faces, and these are linear in f0. For d ≥ 4, this is no longer true, and the number of facets is no longer linear in n; for d = 4, it can be quadratic in n. It is then natural to ask which d-polytopes with a prescribed number of vertices have the maximum number of k-faces. This question, which remained an open problem for some twenty years, was eventually settled by McMullen in 1970 [42]. We will present this result (without proof) in the next section.
McMullen’s proof is not really very difficult but it is still quite involved so we will only
state some propositions needed for its proof. We urge the reader to read Ziegler’s account
of this beautiful proof [67] (Chapter 8). We begin with cyclic polytopes.
First, consider the cases d = 2 and d = 3. When d = 2, our polytope is a polygon, in which case n = f0 = f1. Thus, this case is trivial.

For d = 3, we claim that 2f1 ≥ 3f2. Indeed, every edge belongs to exactly two faces, so if we add up the number of sides for all faces, we get 2f1. Since every face has at least three sides, we get 2f1 ≥ 3f2. Then, using Euler's relation, it is easy to show that

    f1 ≤ 3n − 6,  f2 ≤ 2n − 4.

Now let t1 < t2 < · · · < tn be any sequence of reals, and let Cd(n) be the cyclic polytope

    Cd(n) = conv({c(t1), . . . , c(tn)}),

the convex hull of the n points c(ti) on the moment curve c : R → Ed given by c(t) = (t, t², . . . , t^d). The first interesting fact about the cyclic polytope is that it is simplicial.
Proposition 10.9. Every d + 1 of the points c(t1 ), . . . , c(tn ) are affinely independent. Con-
sequently, Cd (n) is a simplicial polytope and the c(ti ) are vertices.
Proof. We may assume that n = d + 1. Suppose c(t1), . . . , c(tn) belong to a hyperplane H given by

    α1x1 + · · · + αdxd = β.

(Of course, not all the αi are zero.) Consider the polynomial H(t) given by

    H(t) = −β + α1t + α2t² + · · · + αdt^d,

of degree at most d. As each c(ti) belongs to H, each ti is a zero of H(t). However, there are d + 1 distinct ti, so H(t) would have d + 1 distinct roots. As H(t) has degree at most d, it must be the zero polynomial, a contradiction. Returning to the original n > d + 1, we just proved that every d + 1 of the points c(t1), . . . , c(tn) are affinely independent. Then every proper face of Cd(n) has at most d independent vertices, which means that it is a simplex.
The following proposition shows that the cyclic polytope Cd(n) has \binom{n}{k} (k − 1)-faces if 1 ≤ k ≤ ⌊d/2⌋.

Proposition 10.10. For every k with 1 ≤ k ≤ ⌊d/2⌋, every subset of k of the points c(t1), . . . , c(tn) is the set of vertices of a (k − 1)-face of Cd(n).
Proof. Consider any sequence ti1 < ti2 < · · · < tik, with k ≤ ⌊d/2⌋. We will find a supporting hyperplane of Cd(n) that meets Cd(n) exactly in F = conv({c(ti1), . . . , c(tik)}). Consider the polynomial
    p(t) = Π_{j=1}^{k} (t − tij)²
and write

    p(t) = a0 + a1t + · · · + a2k t^{2k}.
Consider the vector
a = (a1 , a2 , . . . , a2k , 0, . . . , 0) ∈ Rd
and the hyperplane, H, given by
H = {x ∈ Rd | x · a = −a0 }.
For j = 1, . . . , k, we have c(tij) · a = −a0 + p(tij) = −a0, and so, c(tij) ∈ H. On the other hand, for any other point c(ti), distinct from any of the c(tij), we have
    c(ti) · a = −a0 + p(ti) = −a0 + Π_{j=1}^{k} (ti − tij)² > −a0,
proving that c(ti) ∈ H+. But then, H is a supporting hyperplane of Cd(n) with H ∩ Cd(n) = F, so F is a (k − 1)-face.
Observe that Proposition 10.10 shows that any subset of at most ⌊d/2⌋ vertices of Cd(n) forms a face of Cd(n). When a d-polytope has this property it is called a neighborly polytope. Therefore, cyclic polytopes are neighborly. Proposition 10.10 also shows a phenomenon that only manifests itself in dimension at least 4: for d ≥ 4, the polytope Cd(n) has n pairwise adjacent vertices. For n ≫ d, this is counter-intuitive.
Finally, the combinatorial structure of cyclic polytopes is completely determined as fol-
lows:
Proposition 10.11. (Gale evenness condition, Gale (1963)) Let n and d be integers with 2 ≤ d < n. For any sequence t1 < t2 < · · · < tn, consider the cyclic polytope

    Cd(n) = conv({c(t1), . . . , c(tn)}).

A subset S ⊆ {t1, . . . , tn} with |S| = d determines a facet of Cd(n) iff for all ti < tj not in S, the number of tk ∈ S between ti and tj is even.
(The proof considers the hyperplane H = {x ∈ Rd | x · b = −b0} spanned by the d points c(tk) with tk ∈ S, and the polynomial q(t) = c(t) · b + b0, which vanishes exactly at the elements of S; we omit the details.)
In particular, Proposition 10.11 shows that the combinatorial structure of Cd (n) does
not depend on the specific choice of the sequence t1 < · · · < tn . This justifies our notation
Cd (n).
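The Gale evenness condition is purely combinatorial, so the facets of Cd(n) can be enumerated without any geometry; here is a minimal Python sketch (our own illustration) that does exactly that.

    from itertools import combinations

    def gale_facets(n, d):
        # S (a d-subset of {1, ..., n}) is a facet of C_d(n) iff for all
        # i < j not in S, the number of k in S with i < k < j is even.
        def is_facet(S):
            outside = [i for i in range(1, n + 1) if i not in set(S)]
            return all(sum(1 for k in S if i < k < j) % 2 == 0
                       for a, i in enumerate(outside)
                       for j in outside[a + 1:])
        return [S for S in combinations(range(1, n + 1), d) if is_facet(S)]

    print(len(gale_facets(8, 4)))   # prints 20, matching f3(C4(8)) = 8*(8-3)/2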
Here is the celebrated upper bound theorem first proved by McMullen [42].
Theorem 10.12. (Upper Bound Theorem, McMullen (1970)) Let P be any d-polytope with n vertices. Then, for every k, with 1 ≤ k ≤ d, the polytope P has at most as many (k − 1)-faces as the cyclic polytope Cd(n), that is,

    fk−1(P) ≤ fk−1(Cd(n)).
The first step in the proof of Theorem 10.12 is to prove that among all d-polytopes
with a given number n of vertices, the maximum number of i-faces is achieved by simplicial
d-polytopes.
Proposition 10.13. Given any d-polytope P with n vertices, it is possible to form a simplicial polytope P′ by perturbing the vertices of P such that P′ also has n vertices and fk−1(P) ≤ fk−1(P′) for 1 ≤ k ≤ d.
Furthermore, equality for k > ⌊d/2⌋ can occur only if P is simplicial.
Sketch of proof. First, we apply Proposition 9.15 to triangulate the facets of P without
adding any vertices. Then, we can perturb the vertices to obtain a simplicial polytope P 0
with at least as many facets (and thus, faces) as P .
Now, the Dehn–Sommerville equations allow us to express fk−1 in terms of the hi with i ≤ ⌊d/2⌋ only: for any simplicial d-polytope,

    fk−1 = Σ*_{i=0}^{⌊d/2⌋} ( \binom{d−i}{k−i} + \binom{i}{k−d+i} ) hi,

where the meaning of the superscript ∗ is that when d is even we only take half of the last term, for i = d/2, and when d is odd we take the whole last term, for i = (d − 1)/2 (for details, see Ziegler [67], Chapter 8). As a consequence, if we can show that the neighborly polytopes maximize not only fk−1 but also hk−1 when k ≤ ⌊d/2⌋, then the upper bound theorem will be
proved. Indeed, McMullen proved the following theorem which is “more than enough” to
yield the desired result ([42]):
Theorem 10.14. (McMullen (1970)) For every simplicial d-polytope P with f0 = n vertices, we have

    hk(P) ≤ \binom{n−d−1+k}{k}  for 0 ≤ k ≤ d.

Furthermore, equality holds for all l and all k with 0 ≤ k ≤ l iff l ≤ ⌊d/2⌋ and P is l-neighborly. (A polytope is l-neighborly iff any subset of l or fewer vertices determines a face of P.)
The proof of Theorem 10.14 is too involved to be given here, which is unfortunate since it
is really beautiful. It makes a clever use of shellings and a careful analysis of the h-numbers
of links of vertices. Again, the reader is referred to Ziegler [67], Chapter 8.
Since cyclic d-polytopes are neighborly (which means that they are ⌊d/2⌋-neighborly), Theorem 10.12 follows from Proposition 10.13 and Theorem 10.14.
Corollary 10.15. For every simplicial neighborly d-polytope with n vertices, we have

    fk−1 = Σ*_{i=0}^{⌊d/2⌋} ( \binom{d−i}{k−i} + \binom{i}{k−d+i} ) \binom{n−d−1+i}{i}  for 1 ≤ k ≤ d.

This gives the maximum number of (k − 1)-faces for any d-polytope with n vertices, for all k with 1 ≤ k ≤ d. In particular, the number of facets of the cyclic polytope Cd(n) is

    fd−1 = 2 Σ*_{i=0}^{⌊d/2⌋} \binom{n−d−1+i}{i}.
Corollary 10.15 implies that the number of facets of any d-polytope with n vertices is O(n^{⌊d/2⌋}). An unfortunate consequence of this upper bound is that, in the worst case, the output of any convex hull algorithm for n points in Ed has size Θ(n^{⌊d/2⌋}), so no such algorithm can beat this bound on all inputs.
The O(n^{⌊d/2⌋}) upper bound can be obtained more directly by a pretty argument using shellings due to R. Seidel [52]. Consider any shelling of any simplicial d-polytope P. For every facet Fj of a shelling, either the restriction set Rj or its complement Fj − Rj has at most ⌊d/2⌋ elements. So, either in the shelling or in the reversed shelling, the restriction set of Fj has at most ⌊d/2⌋ elements. Moreover, the restriction sets are all distinct, by construction. Thus, the number of facets is at most twice the number of k-faces of P with k ≤ ⌊d/2⌋. It follows that

    fd−1 ≤ 2 Σ_{i=0}^{⌊d/2⌋} \binom{n}{i},

and this rough estimate yields an O(n^{⌊d/2⌋}) bound.
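For concreteness (our own illustration), the following Python sketch compares the exact maximum number of facets from Corollary 10.15 with Seidel's rough estimate.

    from math import comb

    def ubt_facets(n, d):
        # f_{d-1}(C_d(n)) = 2 * sum*_{i=0}^{floor(d/2)} C(n-d-1+i, i),
        # where for d even only half of the last term (i = d/2) is taken
        total = 0
        for i in range(d // 2 + 1):
            term = 2 * comb(n - d - 1 + i, i)
            if d % 2 == 0 and i == d // 2:
                term //= 2
            total += term
        return total

    def seidel_bound(n, d):
        # the rough estimate 2 * sum_{i=0}^{floor(d/2)} C(n, i)
        return 2 * sum(comb(n, i) for i in range(d // 2 + 1))

    for d in (3, 4, 6, 8):
        print(d, ubt_facets(20, d), seidel_bound(20, d))
    # e.g. ubt_facets(8, 4) == 20 == 8*(8-3)/2, the facet count of C_4(8)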
Remark: There is also a lower bound theorem due to Barnette (1971, 1973) which gives a lower bound on the f-vectors of all simplicial d-polytopes with n vertices. In this case, there is an analog of the cyclic polytopes called stacked polytopes. These polytopes Pd(n) are simplicial polytopes obtained from a simplex by building shallow pyramids over the facets of the simplex. Then it turns out that if d ≥ 2, then

    fk ≥ \binom{d}{k} n − \binom{d+1}{k+1} k  if 0 ≤ k ≤ d − 2,
    fd−1 ≥ (d − 1)n − (d + 1)(d − 2).
There has been a lot of progress on the combinatorics of f -vectors and h-vectors since
1971, especially by R. Stanley, G. Kalai and L. Billera, and K. Lee, among others. We
recommend two excellent surveys:
2. Billera and Björner [11] is a more advanced survey which reports on results up to 1997.
In fact, many of the chapters in Goodman and O’Rourke [32] should be of interest to the
reader.
Generalizations of the Upper Bound Theorem using sophisticated techniques (face rings) due to Stanley can be found in Stanley [56] (Chapter II), and connections with toric varieties can be found in Stanley [56] (Chapter III) and Fulton [28].
Chapter 11

Projective Spaces and Polyhedra, Polar Duality
The fact that not just points but also vectors are needed to deal with unbounded polyhedra
is a hint that perhaps the notions of polytope and polyhedra can be unified by “going
projective”. Indeed, the goal of this chapter is to define a notion of projective polyhedron
which is a natural extension of the notion of polyhedron in affine space, and retains many
of the properties of polyhedra.
However, we have to be careful because projective geometry does not accommodate well
the notion of convexity. This is because convexity has to do with convex combinations,
but the essence of projective geometry is that everything is defined up to non-zero scalars,
without any requirement that these scalars be positive.
It is possible to develop a theory of oriented projective geometry (due to J. Stolfi [57]) in
which convexity is nicely accommodated. However, in this approach, every point comes as a
pair, (positive point, negative point), and although it is a very elegant theory, we find it a bit
unwieldy. However, since all we really need is to “embed” Ed into its projective completion Pd, so that we can deal with “points at infinity” and “normal points” in a uniform manner, in particular with respect to projective transformations, we will content ourselves with
a definition of a notion of projective polyhedron using the notion of polyhedral cone. This
notion is just what is needed in Chapter 12 to deal with the correspondence between Voronoi
diagrams and Delaunay triangulations in terms of the lifting to a paraboloid or the lifting
to a sphere. We will not attempt to define a general notion of convexity.
Definition 11.1. The (real) projective space RPn is the set of all lines through the origin
in Rn+1 , i.e., the set of one-dimensional subspaces of Rn+1 (where n ≥ 0). Since a one-
dimensional subspace L ⊆ Rn+1 is spanned by any nonzero vector u ∈ L, we can view RPn
as the set of equivalence classes of nonzero vectors in Rn+1 − {0} modulo the equivalence
relation,
u ∼ v iff v = λu, for some λ ∈ R, λ ≠ 0.
We have the projection p : (Rn+1 − {0}) → RPn given by p(u) = [u]∼ , the equivalence class
of u modulo ∼. Write [u] (or hui) for the line
[u] = {λu | λ ∈ R}
defined by the nonzero vector u. Note that [u]∼ = [u] − {0} for every u ≠ 0, so the map
[u]∼ 7→ [u] is a bijection which allows us to identify [u]∼ and [u]. Thus, we will use both
notations interchangeably as convenient.
The projective space RPn is sometimes denoted P(Rn+1 ). Since every line L in Rn+1
intersects the sphere S n in two antipodal points, we can view RPn as the quotient of the
sphere S n by identification of antipodal points. We call this the spherical model of RPn ,
which we illustrate in Figure 11.1.
Figure 11.1: The geometric construction for RP1 and RP2 via the identification of antipodal
points of S 1 and S 2 respectively.
A more subtle construction consists in considering the (upper) half-sphere instead of the sphere, where the upper half-sphere S+n is the set of points on the sphere Sn such that xn+1 ≥ 0.
This time, every line through the center intersects the (upper) half-sphere in a single point,
except on the boundary of the half-sphere, where it intersects in two antipodal points a+
and a− . Thus, the projective space RPn is the quotient space obtained from the (upper)
half-sphere S+n by identifying antipodal points a+ and a− on the boundary of the half-sphere.
We call this model of RPn the half-spherical model , which we illustrate in Figure 11.2.
Figure 11.2: The geometric construction for RP1 ≅ S1 and RP2 in terms of the antipodal boundary points of S+1 and S+2 respectively.
There is also the plane model: consider the affine hyperplane Hn+1 ⊆ Rn+1 of equation xn+1 = 1. Every line [u] not contained in the hyperplane xn+1 = 0 intersects Hn+1 in the single point (u1/un+1, . . . , un/un+1, 1), where u = (u1, . . . , un+1). The lines [u] for which un+1 = 0 are “points at infinity”. See Figure 11.3.
Figure 11.3: The plane model construction for RP1 and RP2 , where points at infinity corre-
spond to the x-axis and the xy-plane respectively.
Thus, we can view RPn as the disjoint union

    RPn = Rn ⊔ RPn−1.

We can repeat the above analysis on RPn−1 and so we can think of RPn as the disjoint union

    RPn = Rn ⊔ Rn−1 ⊔ · · · ⊔ R1 ⊔ R0,
where R0 = {0} consists of a single point. The above shows that there is an embedding Rn ↪ RPn given by (u1, . . . , un) 7→ (u1, . . . , un, 1).
It will also be very useful to use homogeneous coordinates.
Definition 11.2. For any point a = [u]∼ ∈ RPn, with u = (u1, . . . , un+1), the set

    {(λu1, . . . , λun+1) | λ ≠ 0}

is called the set of homogeneous coordinates of a. Since u ≠ 0, observe that for all homogeneous coordinates (u1, . . . , un+1) for a, some ui must be nonzero. The traditional notation for the homogeneous coordinates of a point a = [u]∼ is

    (u1 : · · · : un : un+1).
There is a useful bijection between certain kinds of subsets of Rd+1 and subsets of RPd .
For any subset S of Rd+1 , let
−S = {−u | u ∈ S}.
Geometrically, −S is the reflection of S about 0. Note that for any nonempty subset S ⊆ Rd+1, with S ≠ {0}, the sets S, −S, and S ∪ −S all induce the same set of points in projective space RPd, since [u]∼ = [−u]∼ for every u ≠ 0. Using these facts we obtain a bijection between subsets of RPd and
certain subsets of Rd+1 .
Definition 11.3. We say that a set S ⊆ Rd+1 is symmetric iff S = −S. Obviously, S ∪ −S
is symmetric for any set S. Say that a subset C ⊆ Rd+1 is a double cone iff for every
u ∈ C − {0}, the entire line [u] spanned by u is contained in C. See Figure 11.4.
[Figure 11.4: a symmetric set S = C which is a double cone: with every u ∈ C − {0}, the whole line [u] through u and −u lies in C.]
We exclude the trivial double cone, C = {0}, since the trivial vector space does not yield
a projective space. Thus, every double cone can be viewed as a set of lines through 0. Note
that a double cone is symmetric. Given any nonempty subset S ⊆ RPd, let v(S) ⊆ Rd+1 be the set of vectors

    v(S) = ⋃_{[u]∼ ∈ S} [u]∼ ∪ {0}.
Proposition 11.1. The map, v : S 7→ v(S), from the set of nonempty subsets of RPd to the
set of nonempty, nontrivial double cones in Rd+1 is a bijection.
Proof. We already noted that v(S) is a nontrivial double cone. Consider the map

    ps : C 7→ {[u]∼ | u ∈ C − {0}},

defined on nonempty, nontrivial double cones in Rd+1. We leave it as an easy exercise to check that ps ◦ v = id and v ◦ ps = id, which shows that v
and ps are mutual inverses.
Figure 11.5: In the half-spherical model, a projective line is the maroon semi-circle obtained
by intersecting the hemisphere with a plane through the origin.
It is easy to see that every projective hyperplane, H, is the kernel (zero set) of some
linear equation of the form
a1 x1 + · · · + an+1 xn+1 = 0,
Observe that the bijection, ϕi , between Ui and Rn can also be viewed as the bijection
    (x1 : · · · : xn+1) 7→ (x1/xi, . . . , xi−1/xi, 1, xi+1/xi, . . . , xn+1/xi)
between Ui and the hyperplane, Hi ⊆ Rn+1 , of equation xi = 1. We will make heavy use of
these bijections. For example, for any subset S ⊆ RPn, the “view of S from the patch Ui”, S ∩ Ui, is in bijection with v(S) ∩ Hi, where v(S) is the double cone associated with S (see Proposition 11.1).
The affine patches, U1 , . . . , Un+1 , cover the projective space RPn , in the sense that every
(x1 : · · · : xn+1 ) ∈ RPn belongs to one of the Ui ’s, as not all xi = 0. See Figures 11.6 and
11.7. The Ui ’s turn out to be open subsets of RPn and they have nonempty overlaps. When
we restrict ourselves to one of the Ui , we have an “affine view of RPn from Ui .” In particular,
on the affine patch Un+1 , we have the “standard view” of Rn embedded into RPn as Hn+1 ,
the hyperplane of equation xn+1 = 1. The complement Hi (0) of Ui in RPn is the (projective)
hyperplane of equation xi = 0 (a copy of RPn−1 ). With respect to the affine patch Ui , the
hyperplane Hi (0) plays the role of hyperplane (of points) at infinity.
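Computationally, passing to the view from a patch is just a rescaling of homogeneous coordinates; here is a minimal Python sketch (our own illustration, with patches indexed from 1).

    def patch_view(x, i):
        # Representative of (x_1 : ... : x_{n+1}) on the affine patch U_i,
        # i.e., the representative whose i-th coordinate equals 1.
        if x[i - 1] == 0:
            raise ValueError("the point lies on the hyperplane x_i = 0, not in U_i")
        return tuple(c / x[i - 1] for c in x)

    p = (2.0, 4.0, -2.0)        # homogeneous coordinates of a point of RP^2
    print(patch_view(p, 3))     # prints (-1.0, -2.0, 1.0), the view on U_3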
From now on, for simplicity of notation, we will write Pn for RPn . We need to define
projective maps. Such maps are induced by linear maps.
Figure 11.6: The space RP1 , visualized by the spherical model, is covered by the two affine
patches y = 1 or U2 , and x = 1 or U1 .
Definition 11.6. Any injective linear map, h : Rm+1 → Rn+1 , induces a map, P(h) : Pm →
Pn , defined by
P(h)([u]∼ ) = [h(u)]∼
and called a projective map. When m = n and h is bijective, the map P(h) is also bijective
and it is called a projectivity.
We have to check that this definition makes sense, that is, it is compatible with the
equivalence relation, ∼. For this, assume that u ∼ v, that is
v = λu,
with λ ≠ 0 (of course, u, v ≠ 0). As h is linear, we get
h(v) = h(λu) = λh(u),
that is, h(u) ∼ h(v), which shows that [h(u)]∼ does not depend on the representative chosen
in the equivalence class of [u]∼ . It is also easy to check that whenever two linear maps, h1
and h2 , induce the same projective map, i.e., if P(h1 ) = P(h2 ), then there is a nonzero scalar,
λ, so that h2 = λh1 .
Why did we require h to be injective? Because if h has a nontrivial kernel, then any nonzero vector u ∈ Ker(h) is mapped to 0, but as 0 does not correspond to any point of Pn, the map P(h) is undefined on P(Ker(h)).
In some cases, we allow projective maps induced by non-injective linear maps h. In this case, P(h) is a map whose domain is Pm − P(Ker(h)). An example is the map σN : P3 → P2, given by
given by
(x1 : x2 : x3 : x4 ) 7→ (x1 : x2 : x4 − x3 ),
which is undefined at the point (0 : 0 : 1 : 1). This map is the “homogenization” of the central
projection (from the north pole, N = (0, 0, 1)) from E3 onto E2 .
Figure 11.7: The space RP2 , visualized by the spherical model, is covered by the three affine
patches z = 1 or U3 , y = 1 or U2 , and x = 1 or U1 . The plane z = 1 covers everything except
the pink circle x2 + y 2 = 1. The plane y = 1 will cover this circle, excluding the x-intercepts.
These x-intercepts are then covered by x = 1.
Another way of defining functions (possibly partial) between projective spaces involves using homogeneous polynomials. If p1(x1, . . . , xm+1), . . . , pn+1(x1, . . . , xm+1) are n + 1 homogeneous polynomials all of the same degree d, and if these n + 1 polynomials do not vanish simultaneously, then we claim that the function f given by

    f(x1 : · · · : xm+1) = (p1(x1, . . . , xm+1) : · · · : pn+1(x1, . . . , xm+1))

is a well-defined map from Pm to Pn. Indeed, as the pi are all homogeneous of the same degree d, for any λ ≠ 0 we have pi(λx1, . . . , λxm+1) = λ^d pi(x1, . . . , xm+1), so f does not depend on the representative chosen in an equivalence class. For example, the “homogenization” τN : P2 → P3 of the inverse stereographic projection (from the north pole N = (0, 0, 1)) is given by

    (x1 : x2 : x3) 7→ (2x1x3 : 2x2x3 : x1² + x2² − x3² : x1² + x2² + x3²);

on the line at infinity x3 = 0 we get (0 : 0 : x1² + x2² : x1² + x2²),

that is, τN maps all the points at infinity (in H3(0)) to the “north pole,” (0 : 0 : 1 : 1). However, when x3 ≠ 0, we can prove that τN is injective (in fact, its inverse is σN, defined earlier).
Most interesting subsets of projective space arise as the collection of zeros of a (finite) set of homogeneous polynomials. Let us begin with a single homogeneous polynomial p(x1, . . . , xn+1) of degree d and set

    V(p) = {(x1 : · · · : xn+1) ∈ Pn | p(x1, . . . , xn+1) = 0}.

As usual, we need to check that this definition does not depend on the specific representative chosen in the equivalence class of [(x1, . . . , xn+1)]∼. If (y1, . . . , yn+1) ∼ (x1, . . . , xn+1), that is, (y1, . . . , yn+1) = λ(x1, . . . , xn+1) with λ ≠ 0, then as p is homogeneous of degree d,

    p(y1, . . . , yn+1) = p(λx1, . . . , λxn+1) = λ^d p(x1, . . . , xn+1),

and as λ ≠ 0,

    p(y1, . . . , yn+1) = 0 iff p(x1, . . . , xn+1) = 0,

which shows that V(p) is well defined.
Definition 11.7. For a set of homogeneous polynomials (not necessarily of the same degree)
E = {p1 (x1 , . . . , xn+1 ), . . . , ps (x1 , . . . , xn+1 )}, we set
    V(E) = ⋂_{i=1}^{s} V(pi) = {(x1 : · · · : xn+1) ∈ Pn | pi(x1, . . . , xn+1) = 0, i = 1, . . . , s}.
The set, V (E), is usually called the projective variety defined by E (or cut out by E). When
E consists of a single polynomial p, the set V (p) is called a (projective) hypersurface.
For example, if

    p(x1, x2, x3, x4) = x1² + x2² + x3² − x4²,

then V(p) is the projective sphere in P3, denoted S̃2. Indeed, if we “look” at V(p) on the affine patch U4, where x4 ≠ 0, we know that this amounts to setting x4 = 1, and we do get the set of points (x1, x2, x3, 1) ∈ U4 satisfying x1² + x2² + x3² − 1 = 0, our usual 2-sphere! However, if we look at V(p) on the patch U1, where x1 ≠ 0, we see the quadric of equation 1 + x2² + x3² = x4², which is not a sphere but a hyperboloid of two sheets! Nevertheless, if we pick x4 = 0 as the plane at infinity, note that the projective sphere does not have points at infinity since the only real solution of x1² + x2² + x3² = 0 is (0, 0, 0), but (0, 0, 0, 0) does not correspond to any point of P3.
Another example is given by

    q(x1, x2, x3, x4) = x1² + x2² − x3x4,

for which V(q) corresponds to a paraboloid in the patch U4. Indeed, if we set x4 = 1, we get the set of points in U4 satisfying x3 = x1² + x2². For this reason, we denote V(q) by P̃, and call it a (projective) paraboloid.
Definition 11.8. Given any homogeneous polynomial F(x1, . . . , xd+1), we will also make use of the hypersurface cone C(F) ⊆ Rd+1, defined by

    C(F) = {(x1, . . . , xd+1) ∈ Rd+1 | F(x1, . . . , xd+1) = 0}.
Remark: Every variety V (E), defined by a set of polynomials, E = {p1 (x1 , . . . , xn+1 ), . . .,
ps (x1 , . . . , xn+1 )}, is also the hypersurface defined by the single polynomial equation
    p1² + · · · + ps² = 0.

This fact, peculiar to the real field R, is a mixed blessing. On the one hand, the study of varieties is reduced to the study of hypersurfaces. On the other hand, this is a hint that we should expect that such a study will be hard.
Perhaps to the surprise of the novice, there is a bijective projective map (a projectivity) sending S̃2 to P̃. This map, θ, is given by

    θ(x1 : x2 : x3 : x4) = (x1 : x2 : x3 + x4 : x4 − x3),

and its inverse is given by

    θ⁻¹(x1 : x2 : x3 : x4) = (x1 : x2 : (x3 − x4)/2 : (x3 + x4)/2).
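Since θ is given by linear forms, the identity q(θ(x)) = p(x) (where p and q are the defining polynomials of the projective sphere and paraboloid above) can be checked by a one-line computation; the following Python sketch (our own illustration) verifies it numerically at random points.

    import random

    def p_sphere(x):            # x1^2 + x2^2 + x3^2 - x4^2
        return x[0]**2 + x[1]**2 + x[2]**2 - x[3]**2

    def q_paraboloid(x):        # x1^2 + x2^2 - x3*x4
        return x[0]**2 + x[1]**2 - x[2] * x[3]

    def theta(x):
        return (x[0], x[1], x[2] + x[3], x[3] - x[2])

    # q(theta(x)) = x1^2 + x2^2 - (x3 + x4)(x4 - x3) = p(x), so theta
    # maps the sphere V(p) onto the paraboloid V(q).
    for _ in range(1000):
        x = [random.uniform(-5, 5) for _ in range(4)]
        assert abs(q_paraboloid(theta(x)) - p_sphere(x)) < 1e-9
    print("theta maps V(p) to V(q): identity verified")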
Figure 11.8: The double cone C ∪ −C, where C is the V-cone C = cone{(1, 0, 1),
(0, 1, 1), (1, 1, 1)}.
It is important to observe that because C∪−C is a double cone there is a bijection between
nontrivial double polyhedral cones and projective polyhedra. So, projective polyhedra are
equivalent to double polyhedral cones. However, the projective interpretation of the lines
induced by C ∪ −C as points in Pd makes the study of projective polyhedra geometrically
more interesting.
Projective polyhedra inherit many of the properties of cones but we have to be careful
because we are really dealing with double cones, C ∪ −C, and not cones. As a consequence,
there are a few unpleasant surprises, for example, the fact that the collection of projective
polyhedra is not closed under intersection!
Before dealing with these issues, let us show that every “standard” polyhedron P ⊆ Ed has a natural projective completion P̃ ⊆ Pd, such that on the affine patch Ud+1 (where xd+1 ≠ 0), P̃ ∩ Ud+1 = P. For this, we use our theorem on the Polyhedron–Cone Correspondence (Theorem 5.20, part (2)).
Let A = X + U, where X is a set of points in Ed and U is a cone in Rd. For every point x ∈ X and every vector u ∈ U, let

    x̂ = (x, 1) ∈ Rd+1  and  û = (u, 0) ∈ Rd+1,

and let X̂ = {x̂ | x ∈ X}, Û = {û | u ∈ U} and Â = {â | a ∈ A}, with â = (a, 1). Then

    C(A) = cone(X̂ ∪ Û)

is a cone in Rd+1 such that

    Â = C(A) ∩ Hd+1,

where Hd+1 is the hyperplane of equation xd+1 = 1. If we set Ã = P(C(A)), then we get a subset of Pd, and in the patch Ud+1 the set Ã ∩ Ud+1 is in bijection with the intersection (C(A) ∪ −C(A)) ∩ Hd+1 = Â, and thus in bijection with A.
We call Ã the projective completion of A. We have an injection A −→ Ã given by

    (a1, . . . , ad) 7→ (a1 : · · · : ad : 1).

Now suppose that P = P(C) is a projective polyhedron whose cone C is pointed.
In this case, one immediately realizes that 0 is an extreme point of C and so, there is a
hyperplane, H, through 0 so that C ∩ H = {0}, that is, except for its apex, C lies in one of
the open half-spaces determined by H. As a consequence, by a linear change of coordinates,
we may assume that this hyperplane is Hd+1 (0) and so, for every projective polyhedron,
Figure 11.9: The bottom figure shows a projective polyhedron, which is the projective completion (in the half-sphere model of P2) of the infinite trough A = X + U, where X = {(−1, 0), (1, 0)} and U = cone({(−1, 1), (1, 1)}).
P = P(C), if C is pointed then there is an affine patch (say, Ud+1) where P has no points at infinity, that is, P is a polytope! On the other hand, from another patch Ui, as P ∩ Ui is in bijection with (C ∪ −C) ∩ Hi, the projective polyhedron P viewed on Ui may consist of two disjoint polyhedra.
The situation is very similar to the classical theory of projective conics or quadrics (for
example, see Brannan, Esplen and Gray, [15]). The case where C is a pointed cone corre-
sponds to the nondegenerate conics or quadrics. In the case of the conics, depending on how we slice a cone, we see an ellipse, a parabola or a hyperbola.
For projective polyhedra, when we slice a polyhedral double cone C ∪ −C, we may see a polytope (elliptic type), a single unbounded polyhedron (parabolic type), or two unbounded polyhedra (hyperbolic type). See Figure 11.10.
Now, when U = C ∩ (−C) 6= {0}, the polyhedral cone, C, contains the linear subspace,
U , and if C 6= Rd+1 , then for every hyperplane, H, such that C is contained in one of the two
closed half-spaces determined by H, the subspace U ∩ H is nontrivial. An example is the
cone, C ⊆ R3 , determined by the intersection of two planes through 0 (a wedge). In this case,
U is equal to the line of intersection of these two planes. Also observe that C ∩ (−C) = C
Figure 11.10: For the sea green double cone C ∪ −C of Figure 11.8, Figure (i.) illustrates
an elliptic type polytope, Figure (ii.) illustrates a parabolic type polyhedron, while Figure
(iii.) illustrates a hyperbolic type polyhedron.
Figure 11.11: In R3, C is the cone determined by the pink and peach half-planes, and U = C ∩ −C is the red line of intersection. Then C = U + C0, where C0 is the peach pointed cone contained in the plane perpendicular to U.
Proposition 11.2. Let C ⊆ Rd be any polyhedral cone and let U = C ∩ (−C). Then there is some pointed polyhedral cone C0 such that C = U + C0.

Proof. We already know that U = C ∩ (−C) is the largest linear subspace of C. Let U⊥ be the orthogonal complement of U in Rd and let π : Rd → U⊥ be the orthogonal projection onto U⊥. By Proposition 5.13, the projection C0 = π(C) of C onto U⊥ is a polyhedral cone. We claim that C0 is pointed and that

    C = U + C0.
Both U and C0 are uniquely determined by C. To a great extent, Proposition 11.2 reduces
the study of non-pointed cones to the study of pointed cones.
Definition 11.13. We call the projective polyhedra of the form P = P(C), where C is
a cone with a non-trivial cospan (a non-pointed cone) a projective polyhedral cylinder , by
analogy with the quadric surfaces. We also propose to call the projective polyhedra of the
form P = P(C), where C is a pointed cone, a projective polytope (or nondegenerate projective
polyhedron).
The following propositions show that projective polyhedra behave well under projective
maps and intersection with a hyperplane:
Proposition 11.3. Given any projective map h : Pm → Pn, for any projective polyhedron P ⊆ Pm, the image h(P) of P is a projective polyhedron in Pn. Even if h : Pm → Pn is only a partial map, as long as h is defined on P, the image h(P) is a projective polyhedron.
Proof. The projective map h : Pm → Pn is of the form h = P(ĥ), for some injective linear map ĥ : Rm+1 → Rn+1. Moreover, the projective polyhedron P is of the form P = P(C), for some polyhedral cone C ⊆ Rm+1, with C = cone({u1, . . . , up}), for some nonzero vectors ui ∈ Rm+1. By definition,

    P(ĥ)(P) = P(ĥ(C)).

As ĥ is linear,

    ĥ(C) = ĥ(cone({u1, . . . , up})) = cone({ĥ(u1), . . . , ĥ(up)}).

If we let Ĉ = cone({ĥ(u1), . . . , ĥ(up)}), then ĥ(C) = Ĉ is a polyhedral cone, and so

    P(ĥ)(P) = P(ĥ(C)) = P(Ĉ)

is a projective polyhedron.
Proposition 11.3 together with earlier arguments shows that every projective polytope,
P ⊆ Pd , is equivalent under some suitable projectivity to another projective polytope, P 0 ,
which is a polytope when viewed in the affine patch, Ud+1 . This property is similar to the
fact that every (non-degenerate) projective conic is projectively equivalent to an ellipse.
Since the notion of a face is defined for arbitrary polyhedra it is also defined for cones.
Consequently, we can define the notion of a face for projective polyhedra.
If C is strongly convex, then it is easy to prove that C is generated by its edges (its one-dimensional faces, which are rays), in the sense that any set of nonzero vectors spanning these edges generates C (using positive linear combinations). As a consequence, if C is strongly convex, we may say that P is “spanned” by its vertices, since P is the image under P of the set of all positive combinations of vectors representing its edges.
Remark: Even though we did not define the notion of convex combination of points in Pd ,
the notion of projective polyhedron gives us a way to mimic certain properties of convex
sets in the framework of projective geometry. That’s because every projective polyhedron
corresponds to a unique polyhedral cone.
If our projective polyhedron is the completion P̃ = P(C(P)) ⊆ Pd of some polyhedron P ⊆ Rd, then each face of the cone C(P) is of the form C(F), where F is a face of P, and so each face of P̃ is of the form P(C(F)), for some face F of P. In particular, in the affine patch Ud+1 the face P(C(F)) is in bijection with the face F of P. We will usually identify P(C(F)) and F.
We now consider the intersection of projective polyhedra but first, let us make some general remarks about the intersection of subsets of Pd. Given any two nonempty subsets P(S) and P(S′) of Pd, where S and S′ are polyhedral cones (or more generally cones with vertex 0), what is P(S) ∩ P(S′)? It is tempting to say that

P(S) ∩ P(S′) = P(S ∩ S′),

but unfortunately this is generally false! The problem is that P(S) ∩ P(S′) is the set of all lines determined by vectors both in S and S′, but there may be some line spanned by some vector u ∈ (−S) ∩ S′ or u ∈ S ∩ (−S′) such that u does not belong to S ∩ S′ or −(S ∩ S′).
Observe that

−(−S) = S
−(S ∩ S′) = (−S) ∩ (−S′).

Since P(S) = P(S ∪ −S), we have

P(S) ∩ P(S′) = P((S ∪ −S) ∩ (S′ ∪ −S′)),

and

(S ∪ −S) ∩ (S′ ∪ −S′) = (S ∩ S′) ∪ −(S ∩ S′) ∪ (S ∩ (−S′)) ∪ −(S ∩ (−S′)),

which is the union of two double cones (except for 0, which belongs to both). Therefore, if P(S) ∩ P(S′) ≠ ∅, then S ∩ S′ ≠ {0} or S ∩ (−S′) ≠ {0}, and so

P(S) ∩ P(S′) = P(S ∩ S′) ∪ P(S ∩ (−S′)),

since P(S ∩ (−S′)) = P((−S) ∩ S′), with the understanding that if S ∩ S′ = {0} or S ∩ (−S′) = {0}, then the corresponding term should be omitted.
Furthermore, if S′ is symmetric (i.e., S′ = −S′), then

(S ∪ −S) ∩ (S′ ∪ −S′) = (S ∪ −S) ∩ S′
= (S ∩ S′) ∪ ((−S) ∩ S′)
= (S ∩ S′) ∪ −(S ∩ (−S′))
= (S ∩ S′) ∪ −(S ∩ S′).
Now, if C is a pointed polyhedral cone, then C ∩ (−C) = {0}. Consequently, for any other polyhedral cone C′, we have (C ∩ C′) ∩ ((−C) ∩ C′) = {0}. Using these facts and adopting the convention that P({0}) = ∅, we obtain the following result:
Proposition 11.4. Let P = P(C) and P′ = P(C′) be any two projective polyhedra in Pd. If P(C) ∩ P(C′) ≠ ∅, then the following properties hold:

(1) P(C) ∩ P(C′) = P(C ∩ C′) ∪ P(C ∩ (−C′)),

the union of two projective polyhedra. If C or C′ is a pointed cone, i.e., P or P′ is a projective polytope, then P(C ∩ C′) and P(C ∩ (−C′)) are disjoint (if both are defined). See Figures 11.12 and 11.13.

(2) If P′ = H, for some hyperplane H ⊆ Pd, then P ∩ H is a projective polyhedron.
Figure 11.12: Let C = cone{(1, 1), (−1, 1)} and C′ = cone{(1, 2), (1, −2)}. In the half-spherical model of P1, P(C) is the bold red arc, while P(C′) is the bold blue arc.
Figure 11.13: For the cones C and C′ defined in Figure 11.12, P(C) ∩ P(C′) is illustrated by two disjoint purple arcs; the light purple arc is P(C ∩ C′), while the dark purple arc is P(C ∩ (−C′)).
Proof. We already proved (1), so only (2) remains to be proved. Of course, we may assume that P ≠ Pd. This time, using the equivalence theorem of V-cones and H-cones (Theorem 5.19), we know that P is of the form P = P(C), with C = C1 ∩ · · · ∩ Cp, where the Ci are closed half-spaces in Rd+1. Moreover, H = P(Ĥ), for some hyperplane Ĥ ⊆ Rd+1 through 0. Now, as Ĥ is symmetric,

P ∩ H = P(C) ∩ P(Ĥ) = P(C ∩ Ĥ).

Consequently,

P ∩ H = P(C ∩ Ĥ) = P((C1 ∩ · · · ∩ Cp) ∩ Ĥ).

However, Ĥ = Ĥ+ ∩ Ĥ−, where Ĥ+ and Ĥ− are the two closed half-spaces determined by Ĥ, and so

Ĉ = (C1 ∩ · · · ∩ Cp) ∩ Ĥ = (C1 ∩ · · · ∩ Cp) ∩ Ĥ+ ∩ Ĥ−

is an H-cone. Therefore, P ∩ H = P(Ĉ) is a projective polyhedron.
If C = −C, i.e., C is a linear subspace (or if C′ is a linear subspace), then

P(C) ∩ P(C′) = P(C ∩ C′).

Furthermore, if either C or C′ is pointed, the two projective polyhedra on the right-hand side of (1) are disjoint. If C and C′ both have nontrivial cospan and the two projective polyhedra on the right-hand side of (1) intersect, then by (1) applied once more,

P(C ∩ C′) ∩ P(C ∩ (−C′)) = P(C ∩ C′ ∩ (−C′)) ∪ P(C ∩ (−C) ∩ C′).
In preparation for Section 12.7, we also need the notion of tangent space at a point of a variety. Given a polynomial p ∈ R[x1, . . . , xd] and a multi-index α = (i1, . . . , id), write

Dα p(a) = ∂^{i1+···+id} p / (∂x1^{i1} · · · ∂xd^{id}) (a),

and

p_{xi}(a) = ∂p/∂xi (a).
Consider any line ℓ through a, given parametrically by

ℓ = {a + th | t ∈ R},

with h ≠ 0, and say a ∈ S is a point on the hypersurface S = V(p), which means that p(a) = 0. The intuitive idea behind the notion of the tangent space to S at a is that it is the set of lines that intersect S at a in a point of multiplicity at least two, which means that the equation giving the intersection S ∩ ℓ, namely

p(a + th) = 0,

is of the form

t² q(a, h)(t) = 0,
where q(a, h)(t) is some polynomial in t. Using Taylor’s formula, as p(a) = 0, we have
p(a + th) = t Σ_{i=1}^d p_{xi}(a) hi + t² q(a, h)(t),

for some polynomial q(a, h)(t). From this, we see that a is an intersection point of multiplicity at least 2 iff

Σ_{i=1}^d p_{xi}(a) hi = 0.    (†)
Consequently, if ∇p(a) = (p_{x1}(a), . . . , p_{xd}(a)) ≠ 0 (that is, if the gradient of p at a is nonzero), we see that ℓ intersects S at a in a point of multiplicity at least 2 iff h belongs to the hyperplane of equation (†).
Definition 11.15. Let S = V(p) be a hypersurface in Rd. For any point a ∈ S, if ∇p(a) ≠ 0, then we say that a is a non-singular point of S. When a is non-singular, the (affine) tangent space Ta(S) (or simply Ta S) to S at a is the hyperplane through a of equation

Σ_{i=1}^d p_{xi}(a)(xi − ai) = 0.
Observe that the direction of Ta S is the hyperplane through 0 parallel to Ta S, given by

Σ_{i=1}^d p_{xi}(a) xi = 0.
These notions carry over to projective space: a projective hypersurface is a subset S = V(F) ⊆ Pd, where F(x1, . . . , xd+1) is a homogeneous polynomial of total degree m. We say that a point a ∈ S is non-singular iff ∇F(a) = (Fx1(a), . . . , Fxd+1(a)) ≠ 0.
Thus, on the affine patch Ui, the tangent space Ta S is given by the homogeneous equation

Σ_{j=1, j≠i}^{d+1} F_{xj}(aⁱ)(xj − aⁱj xi) = 0,

where aⁱ denotes the representative of a whose i-th coordinate equals 1.
This looks awful, but we can make it pretty if we remember that F is a homogeneous polynomial of degree m and that we have the Euler relation

Σ_{j=1}^{d+1} F_{xj}(a) aj = m F(a),
for every a = (a1 , . . . , ad+1 ) ∈ Rd+1 . Using this, we can come up with a clean equation for
our projective tangent hyperplane. It is enough to carry out the computations for i = d + 1.
Our tangent hyperplane has the equation

Σ_{j=1}^d F_{xj}(a^{d+1}1, . . . , a^{d+1}d, 1)(xj − a^{d+1}j xd+1) = 0,

that is,

Σ_{j=1}^d F_{xj}(a^{d+1}1, . . . , a^{d+1}d, 1) xj + (Σ_{j=1}^d F_{xj}(a^{d+1}1, . . . , a^{d+1}d, 1)(−a^{d+1}j)) xd+1 = 0.

By the Euler relation, since F(a) = 0 and the (d + 1)-st coordinate of a^{d+1} is 1, the coefficient of xd+1 above equals F_{xd+1}(a^{d+1}1, . . . , a^{d+1}d, 1), so the equation of the projective tangent hyperplane Ta S is simply

Σ_{j=1}^{d+1} F_{xj}(a) xj = 0.
For example, for the quadric in P3 given by x² + y² + z² − w² = 0, the tangent hyperplane at a point a = (a1 : a2 : a3 : a4) on the quadric is

a1 x + a2 y + a3 z − a4 w = 0.
Now, if a = (a1 : · · · : ad : 1) is a point in the affine patch Ud+1, then the equation of the intersection of Ta S with Ud+1 is obtained by setting ad+1 = xd+1 = 1, that is,

Σ_{i=1}^d F_{xi}(a1, . . . , ad, 1)(xi − ai) = 0,

which has the same form as the equation of the affine tangent hyperplane obtained earlier.
This is indeed a hypersurface, because F(x1, . . . , xn+1) is a homogeneous polynomial and h∗(S) is the zero locus of the homogeneous polynomial

G(x1, . . . , xm+1) = F(Σ_{j=1}^{m+1} a_{1j} xj, . . . , Σ_{j=1}^{m+1} a_{n+1,j} xj).
Given a symmetric (d + 1) × (d + 1) matrix F = (f_{ij}), let Φ be the homogeneous polynomial of degree 2 defined by

Φ(x) = x⊤F x = Σ_{i,j=1}^{d+1} f_{ij} xi xj.

Then

(∂Φ(x)/∂x1, . . . , ∂Φ(x)/∂xd+1) = 2x⊤F.
Definition 11.20. The hypersurface S = V (Φ) ⊆ Pd is called a (projective) (hyper-)quadric
surface. We say that a quadric surface S = V (Φ) is nondegenerate iff the matrix F defining
Φ is invertible.
Recall that, in the Euclidean case, the polar dual of X is given by

X∗ = {y ∈ Rd | x · y ≤ 1 for all x ∈ X},

where X is any subset of Rd. The above suggests generalizing polar duality with respect to any nondegenerate quadric.
Let ϕ be the polar form of Φ, given by ϕ(u, v) = u⊤F v. If we set

ϕu(v) = ϕ(u, v)

for every v ∈ Rd+1, then, as ϕ is nondegenerate, the map u ↦ ϕu from Rd+1 to (Rd+1)∗ is a linear isomorphism.
Definition 11.21. Let Q = V(Φ(x1, . . . , xd+1)) be a nondegenerate quadric with corresponding polar form ϕ. For any u ∈ Rd+1, with u ≠ 0, the set

u† = {v ∈ Rd+1 | ϕ(u, v) = 0}

is a hyperplane called the polar of u (w.r.t. Q). In terms of the matrix representation of Q, the polar of u is given by the equation

u⊤F x = 0,

or

Σ_{j=1}^{d+1} (∂Φ(u)/∂xj) xj = 0.
Going over to Pd , we say that P(u† ) is the polar (hyperplane) of the point a = [u] ∈ Pd and
we write a† for P(u† ).
Note that the equation of the polar hyperplane a† of a point a ∈ Pd is identical to the
equation of the tangent plane to Q at a, except that a is not necessarily on Q. However, if
a ∈ Q, then the polar of a is indeed the tangent hyperplane Ta Q to Q at a.
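This coincidence is easy to check numerically. The following small Python sketch is ours (not part of the text); it assumes NumPy and uses as an example the quadric x² + y² + z² − w² = 0, whose matrix is F = diag(1, 1, 1, −1).

import numpy as np

# Example quadric x^2 + y^2 + z^2 - w^2 = 0 in P^3, with matrix F.
F = np.diag([1.0, 1.0, 1.0, -1.0])

a = np.array([3.0, 4.0, 0.0, 5.0])      # a lies on Q: 9 + 16 + 0 - 25 = 0
assert abs(a @ F @ a) < 1e-12

polar = F @ a        # coefficients of the polar hyperplane u^T F x = 0
grad = 2 * F @ a     # gradient of Phi(x) = x^T F x at a

assert np.allclose(grad, 2 * polar)     # same hyperplane, up to scale
assert abs(polar @ a) < 1e-12           # a lies on a†, since a lies on Q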
Proposition 11.7. Let Q = V (Φ(x1 , . . . , xd+1 )) ⊆ Pd be a nondegenerate quadric with
corresponding polar form, ϕ, and matrix, F . Then, every point, a ∈ Q, is nonsingular.
Proof. Since

(∂Φ(a)/∂x1, . . . , ∂Φ(a)/∂xd+1) = 2a⊤F,

if a ∈ Q is singular, then a⊤F = 0 with a ≠ 0, contradicting the fact that F is invertible.
We also note that a ∈ a† iff a ∈ Q, and that every hyperplane H ⊆ Pd is the polar of a unique point. Indeed, if a = [u] and b = [v] are both poles of H, then ϕu and ϕv define the same hyperplane, so

ϕv = λϕu = ϕλu,

with λ ≠ 0, and this implies v = λu, that is, a = [u] = [v] = b, and the pole of H is indeed unique.
By analogy with the Euclidean case, for any subset X ⊆ Rd+1, define the polar dual of X w.r.t. Q as

X∗ = {v ∈ Rd+1 | (∀u ∈ X)(u⊤F v ≤ 0)}.

Observe that X∗ is always a cone, even if X ⊆ Rd+1 is not. If C = cone({u1, . . . , up}) is a polyhedral cone and U is the matrix whose columns are u1, . . . , up, then

C∗ = P(U⊤F, 0),

where

P(U⊤F, 0) = {v ∈ Rd+1 | U⊤F v ≤ 0}.

Consequently, the polar dual of a polyhedral cone w.r.t. a nondegenerate quadric is a polyhedral cone.
Proof. The proof is essentially the same as the proof of Proposition 5.5. As polar duality is computed on the cone C representing P, we have

P∗ = P(C∗).
We also show that projectivities behave well with respect to polar duality: for any projectivity h = P(ĥ), where ĥ is an invertible linear map given by a matrix A, and any subset X ⊆ Rd+1,

ĥ(X∗) = (ĥ(X))∗,

where on the left-hand side X∗ is the polar dual of X w.r.t. Q, and on the right-hand side (ĥ(X))∗ is the polar dual of ĥ(X) w.r.t. the nondegenerate quadric h(Q) given by the matrix (A−1)⊤F A−1. Consequently, if X ≠ {0}, then

h((P(X))∗) = (h(P(X)))∗.
Proof. As

X∗ = {v ∈ Rd+1 | (∀u ∈ X)(u⊤F v ≤ 0)},

we have

ĥ(X∗) = {ĥ(v) ∈ Rd+1 | (∀u ∈ X)(u⊤F v ≤ 0)}
= {y ∈ Rd+1 | (∀u ∈ X)(u⊤F A−1 y ≤ 0)}
= {y ∈ Rd+1 | (∀x ∈ ĥ(X))(x⊤(A−1)⊤F A−1 y ≤ 0)}
= (ĥ(X))∗,

as claimed.
We will also need the notion of an affine quadric and polar duality with respect to an affine quadric. Fortunately, the properties we need in the affine case are easily derived from the projective case using the “trick” that the affine space Ed can be viewed as the hyperplane Hd+1 ⊆ Rd+1 of equation xd+1 = 1, and that its associated vector space Rd can be viewed as the hyperplane Hd+1(0) ⊆ Rd+1 of equation xd+1 = 0. A point a ∈ Ed corresponds to the vector â = (a, 1) ∈ Rd+1, and a vector u ∈ Rd corresponds to the vector û = (u, 0) ∈ Rd+1. This way, the projective space Pd = P(Rd+1) is the natural projective completion of Ed, which is isomorphic to the affine patch Ud+1 where xd+1 ≠ 0. The hyperplane xd+1 = 0 is the “hyperplane at infinity” in Pd.
If we write x = (x1, . . . , xd), a polynomial Φ(x) = Φ(x1, . . . , xd) of degree 2 can be written as

Φ(x) = Σ_{i,j=1}^d a_{ij} xi xj + 2 Σ_{i=1}^d bi xi + c.
Using Proposition 11.12, we can prove the following proposition, showing that projective completion and polar duality commute:

Proposition 11.13. Let Q be a nondegenerate affine quadric given by the (d + 1) × (d + 1) symmetric, invertible matrix F. For every polyhedron P ⊆ Rd, we have

P̃∗ = (P̃)∗,

where on the left-hand side P̃∗ denotes the projective completion of the polar dual P∗ of P (w.r.t. Q), and on the right-hand side (P̃)∗ is the polar dual of the projective completion P̃ of P w.r.t. the nondegenerate projective quadric Q̃ defined by F.
Proof. Now, P = conv(Y) + cone(V), for some finite set of points Y and some finite set of vectors V, and we know that

C(P) = cone(Ŷ ∪ V̂).

From Proposition 11.10,

(C(P))∗ = {v ∈ Rd+1 | Ŷ⊤F v ≤ 0, V̂⊤F v ≤ 0}.

But, by definition of C(P∗) (see Section 5.5, especially Proposition 5.20), the hyperplanes cutting out C(P∗) are obtained by homogenizing the equations of the hyperplanes cutting out P∗, and so

C(P∗) = {(x, xd+1) ∈ Rd+1 | Ŷ⊤F (x, xd+1) ≤ 0, V̂⊤F (x, xd+1) ≤ 0} = (C(P))∗,

as claimed.
Theorem 11.14. Let Q = V(Φ(x1, . . . , xd+1)) be a projective or an affine quadric over RPd or Rd+1. If Q has a nonsingular point, then for every polynomial Φ′ such that Q = V(Φ′(x1, . . . , xd+1)), there is some λ ≠ 0 (λ ∈ R) such that Φ′ = λΦ.
Chapter 12

Dirichlet–Voronoi Diagrams

In this chapter we present the concepts of a Voronoi diagram and of a Delaunay triangu-
lation. These are important tools in computational geometry and Delaunay triangulations
are important in problems where it is necessary to fit 3D data using surface splines. It is
usually useful to compute a good mesh for the projection of this set of data points onto the
xy-plane, and a Delaunay triangulation is a good candidate.
Our presentation of Voronoi diagrams and Delaunay triangulations is far from thor-
ough. We are primarily interested in defining these concepts and stating their most impor-
tant properties. For a comprehensive exposition of Voronoi diagrams, Delaunay triangula-
tions, and more topics in computational geometry, our readers may consult O’Rourke [44],
Preparata and Shamos [47], Boissonnat and Yvinec [12], de Berg, Van Kreveld, Overmars,
and Schwarzkopf [6], or Risler [48]. The survey by Graham and Yao [33] contains a very
gentle and lucid introduction to computational geometry.
In Section 12.7 (which relies on Sections 12.5 and 12.6), we show that the Delaunay
triangulation of a set of points P is the stereographic projection of the convex hull of the set
of points obtained by mapping the points in P onto the sphere using inverse stereographic
projection. We also prove in Section 12.8 that the Voronoi diagram of P is obtained by
taking the polar dual of the above convex hull and projecting it from the north pole (back
onto the hyperplane containing P ). A rigorous proof of this second fact is not trivial because
the central projection from the north pole is only a partial map. To give a rigorous proof,
we have to use projective completions. This requires defining convex polyhedra in projective
space, and we use the results of Chapter 11 (especially, Section 11.2).
12.1 Dirichlet–Voronoi Diagrams

Let E be a Euclidean space of dimension m ≥ 1. For concreteness, one may safely assume that E = Em, although what follows applies to any Euclidean
space of finite dimension. Given a set P = {p1 , . . . , pn } of n points in E, it is often useful to
find a partition of the space E into regions each containing a single point of P and having
some nice properties. It is also often useful to find triangulations of the convex hull of P
having some nice properties. We shall see that this can be done and that the two problems
are closely related. In order to solve the first problem, we need to introduce bisector lines
and bisector planes.
For simplicity, let us first assume that E is a plane, i.e., has dimension 2. Given any two distinct points a, b ∈ E, the line orthogonal to the line segment (a, b) and passing through the midpoint of this segment is the locus of all points having equal distance to a and b. It is called the bisector line of a and b. The bisector line of two points is illustrated in Figure 12.1.
If h = ½ a + ½ b is the midpoint of the line segment (a, b), letting m be an arbitrary point
on the bisector line, the equation of this line can be found by writing that hm is orthogonal
to ab. In any orthogonal frame, letting m = (x, y), a = (a1 , a2 ), b = (b1 , b2 ), the equation of
this line is
(b1 − a1 )(x − (a1 + b1 )/2) + (b2 − a2 )(y − (a2 + b2 )/2) = 0,
which can also be written as
(b1 − a1)x + (b2 − a2)y = (b1² + b2²)/2 − (a1² + a2²)/2.
The closed half-plane H(a, b) containing a and with boundary the bisector line is the locus of all points such that

(b1 − a1)x + (b2 − a2)y ≤ (b1² + b2²)/2 − (a1² + a2²)/2,

and the closed half-plane H(b, a) containing b and with boundary the bisector line is the locus of all points such that

(b1 − a1)x + (b2 − a2)y ≥ (b1² + b2²)/2 − (a1² + a2²)/2.
The closed half-plane H(a, b) is the set of all points whose distance to a is less than or equal
to the distance to b, and vice versa for H(b, a). Thus, points in the closed half-plane H(a, b)
are closer to a than they are to b.
We now consider a problem called the post office problem by Graham and Yao [33]. Given
any set P = {p1 , . . . , pn } of n points in the plane (considered as post offices or sites), for
any arbitrary point x, find out which post office is closest to x. Since x can be arbitrary,
it seems desirable to precompute the sets V (pi ) consisting of all points that are closer to pi
than to any other point pj 6= pi . Indeed, if the sets V (pi ) are known, the answer is any post
office pi such that x ∈ V (pi ). Thus, it remains to compute the sets V (pi ). For this, if x is
closer to pi than to any other point pj 6= pi , then x is on the same side as pi with respect to
the bisector line of pi and pj for every j 6= i, and thus
V(pi) = ⋂_{j≠i} H(pi, pj).
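As an illustration, here is a small Python sketch (ours, not from the text; the point set is arbitrary) of the equivalence just described: x lies in V(pi) iff x satisfies the half-plane inequality defining H(pi, pj) for every j ≠ i, which happens exactly when pi is a nearest post office.

import numpy as np

def in_halfplane(x, a, b):
    """True iff x lies in H(a, b), i.e., x is at least as close to a as to b."""
    return (b - a) @ x <= (b @ b - a @ a) / 2

def in_voronoi_region(x, i, sites):
    """True iff x lies in V(p_i), the intersection of the H(p_i, p_j), j != i."""
    return all(in_halfplane(x, sites[i], sites[j])
               for j in range(len(sites)) if j != i)

sites = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
x = np.array([1.0, 1.0])
nearest = int(np.argmin(((sites - x) ** 2).sum(axis=1)))
assert in_voronoi_region(x, nearest, sites)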
If E has dimension 3, the locus of all points having equal distance to a and b is a plane.
It is called the bisector plane of a and b. The equation of this plane is also found by writing
that hm is orthogonal to ab. The equation of this plane is
(b1 − a1)(x − (a1 + b1)/2) + (b2 − a2)(y − (a2 + b2)/2) + (b3 − a3)(z − (a3 + b3)/2) = 0,

which can also be written as

(b1 − a1)x + (b2 − a2)y + (b3 − a3)z = (b1² + b2² + b3²)/2 − (a1² + a2² + a3²)/2.
The closed half-space H(a, b) containing a and with boundary the bisector plane is the locus of all points such that

(b1 − a1)x + (b2 − a2)y + (b3 − a3)z ≤ (b1² + b2² + b3²)/2 − (a1² + a2² + a3²)/2,

and the closed half-space H(b, a) containing b and with boundary the bisector plane is the locus of all points such that

(b1 − a1)x + (b2 − a2)y + (b3 − a3)z ≥ (b1² + b2² + b3²)/2 − (a1² + a2² + a3²)/2.
The closed half-space H(a, b) is the set of all points whose distance to a is less than or equal
to the distance to b, and vice versa for H(b, a). Again, points in the closed half-space H(a, b)
are closer to a than they are to b.
Given any set P = {p1 , . . . , pn } of n points in E (of dimension m = 2, 3), it is often useful
to find for every point pi the region consisting of all points that are closer to pi than to any
other point pj ≠ pi, that is, the set

V(pi) = {x ∈ E | d(x, pi) ≤ d(x, pj), for all j ≠ i},

where d(x, y) = (xy · xy)^{1/2} is the Euclidean distance associated with the inner product · on E. From the definition of the bisector line (or plane), it is immediate that

V(pi) = ⋂_{j≠i} H(pi, pj).
Families of sets of the form V (pi ) were investigated by Dirichlet [23] (1850) and Voronoi
[66] (1908). Voronoi diagrams also arise in crystallography (Gilbert [31]). Other applications,
including facility location and path planning, are discussed in O’Rourke [44]. For simplicity,
we also denote the set V (pi ) by Vi , and we introduce the following definition.
Definition 12.1. Let E be a Euclidean space of dimension m = 2, 3. Given any set P = {p1, . . . , pn} of n points in E, the Dirichlet–Voronoi diagram Vor(P) of P = {p1, . . . , pn} is the family of subsets of E consisting of the sets Vi = ⋂_{j≠i} H(pi, pj) and of all of their intersections.
In dimension m, the bisector hyperplane of a and b has the equation

(b1 − a1)x1 + · · · + (bm − am)xm = (b1² + · · · + bm²)/2 − (a1² + · · · + am²)/2.

The closed half-space H(a, b) containing a and with boundary the bisector hyperplane is the locus of all points such that

(b1 − a1)x1 + · · · + (bm − am)xm ≤ (b1² + · · · + bm²)/2 − (a1² + · · · + am²)/2,

and the closed half-space H(b, a) containing b and with boundary the bisector hyperplane is the locus of all points such that

(b1 − a1)x1 + · · · + (bm − am)xm ≥ (b1² + · · · + bm²)/2 − (a1² + · · · + am²)/2.
The closed half-space H(a, b) is the set of all points whose distance to a is less than or equal
to the distance to b, and vice versa for H(b, a). Figure 12.2 shows the Voronoi diagram of a
set of twelve points.
In the general case where E has dimension m, the definition of the Voronoi diagram
Vor (P ) of P is the same as Definition 12.1, except that H(pi , pj ) is the closed half-space
containing pi and having the bisector hyperplane of pi and pj as boundary. Also, observe
that the convex hull of P is a convex polytope.
We will now state a proposition listing the main properties of Voronoi diagrams. It turns
out that certain degenerate situations can be avoided if we assume that the points in the set
P are in general position.
Definition 12.2. If P is a set of points in an affine space of dimension m, then we say
that the points of P are in general position if no m + 2 points from P belong to the same
(m − 1)-sphere.
Thus when m = 2, no 4 points in P are cocyclic, and when m = 3, no 5 points in P are
on the same sphere.
Proposition 12.1. Given a set P = {p1 , . . . , pn } of n points in some Euclidean space E
of dimension m (say Em ), if the points in P are in general position and not in a common
hyperplane then the Voronoi diagram of P satisfies the following conditions:
(1) Each region Vi is convex and contains pi in its interior.
(2) Each vertex of Vi belongs to m + 1 regions Vj and to m + 1 edges.
(3) The region Vi is unbounded iff pi belongs to the boundary of the convex hull of P .
(4) If p is a vertex that belongs to the regions V1 , . . . , Vm+1 , then p is the center of the
(m − 1)-sphere S(p) determined by p1 , . . . , pm+1 . Furthermore, no point in P is inside
the sphere S(p) (i.e., in the open ball associated with the sphere S(p)).
(5) If pj is a nearest neighbor of pi , then one of the faces of Vi is contained in the bisector
hyperplane of (pi , pj ).
(6) ⋃_{i=1}^n Vi = E, and V̊i ∩ V̊j = ∅, for all i, j, with i ≠ j, where V̊i denotes the interior of Vi.
Proof. We prove only some of the statements, leaving the others as an exercise (or see Risler
[48]).
(1) Since Vi = ⋂_{j≠i} H(pi, pj) and each half-space H(pi, pj) is convex, Vi is convex, as an intersection of convex sets. Also, since pi belongs to the interior of each H(pi, pj), the point pi belongs to the interior of Vi.
(2) Let Fi,j denote Vi ∩ Vj. Any vertex p of the Voronoi diagram of P must belong to r faces Fi,j. Let us pick the origin of our affine space to be p. Now, given a vector space E and any two subspaces M and N of E, recall that we have the Grassmann relation

dim(M) + dim(N) = dim(M + N) + dim(M ∩ N).

Then, since p belongs to the intersection of hyperplanes that support the boundaries of the Vi, and since a hyperplane has dimension m − 1, by the Grassmann relation, in order to obtain {p}, a subspace of dimension 0, as the intersection of hyperplanes, we must intersect at least m hyperplanes, so we must have r ≥ m. We can rename the r + 1 points pi corresponding to the regions Vi inducing the faces containing p as p1, . . . , pr+1, so that the r faces containing p are denoted F1,2, F2,3, . . . , Fr,r+1. Since Fi,j = Vi ∩ Vj, we have

d(p, p1) = d(p, p2) = · · · = d(p, pr+1).
This means that p is the center of a sphere passing through p1 , . . . , pr+1 and containing no
other point in P . By the assumption that points in P are in general position, since there
are r + 1 points pi on a sphere, we must have r + 1 ≤ m + 1, that is, r ≤ m, and thus
r = m. Thus, p belongs to V1 ∩ · · · ∩ Vm+1, but to no other Vj with j ∉ {1, . . . , m + 1}.
Furthermore, every edge of the Voronoi diagram containing p is the intersection of m of the
regions V1 , . . . , Vm+1 , and so there are m + 1 of them.
For simplicity, let us again consider the case where E is a plane. It should be noted that
certain Voronoi regions, although closed, may extend very far. Figure 12.3 shows such an
example.
It is also possible for certain unbounded regions to have parallel edges.
There are a number of methods for computing Voronoi diagrams. A fairly simple (al-
though not very efficient) method is to compute each Voronoi region V (pi ) by intersecting
the half-planes H(pi , pj ) (with pi fixed). One way to do this is to construct for each pi suc-
cessive convex polygons that converge to the boundary of the region V (pi ). At every step
we intersect the current convex polygon with the bisector line of pi and pj . There are at
most two intersection points. We also need a starting polygon, and for this we can pick a
square containing all the points. A naive implementation will run in O(n³). However, the intersection of half-planes can be done in O(n log n), using the fact that the vertices of a convex polygon can be sorted. Thus, the above method runs in O(n² log n). Actually, there
are faster methods (see Preparata and Shamos [47] or O’Rourke [44]), and it is possible to
design algorithms running in O(n log n). The most direct method to obtain fast algorithms
is to use the “lifting method” discussed in Section 12.4, whereby the original set of points is
lifted onto a paraboloid, and to use fast algorithms for finding a convex hull.
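The naive method just described can be sketched in a few lines of Python (a sketch of ours, assuming NumPy; the bounding square stands in for the plane, and the clipping step is the standard Sutherland–Hodgman clip against one half-plane at a time):

import numpy as np

def clip(poly, a, b):
    """Clip a convex polygon by the half-plane H(a, b) of points closer to a."""
    n, c = b - a, (b @ b - a @ a) / 2       # keep the points with n . x <= c
    out = []
    for p, q in zip(poly, np.roll(poly, -1, axis=0)):
        fp, fq = n @ p - c, n @ q - c
        if fp <= 0:
            out.append(p)
        if fp * fq < 0:                     # the edge crosses the bisector
            out.append(p + (fp / (fp - fq)) * (q - p))
    return np.array(out)

def voronoi_region(i, sites, box=100.0):
    """Intersect the half-planes H(p_i, p_j) starting from a big square."""
    poly = np.array([[-box, -box], [box, -box], [box, box], [-box, box]])
    for j in range(len(sites)):
        if j != i and len(poly) > 0:
            poly = clip(poly, sites[i], sites[j])
    return poly

Running this for every i gives the diagram; as noted above, sorting-based half-plane intersection lowers the total cost to O(n² log n).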
A very interesting (undirected) graph can be obtained from the Voronoi diagram as
follows: The vertices of this graph are the points pi (each corresponding to a unique region
of Vor (P )), and there is an edge between pi and pj iff the regions Vi and Vj share an edge.
The resulting graph is called a Delaunay triangulation of the convex hull of P , after Delaunay,
who invented this concept in 1934. Such triangulations have remarkable properties.
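In practice, this dual graph is exactly what computational-geometry libraries report. For instance (a sketch of ours, assuming SciPy is available), scipy.spatial.Voronoi lists in ridge_points the pairs of sites whose regions share an edge, and for sites in general position these pairs are precisely the edges of the Delaunay triangulation:

import numpy as np
from scipy.spatial import Delaunay, Voronoi

sites = np.array([[0.0, 0.0], [2.0, 0.1], [0.3, 1.7], [2.2, 2.1], [1.1, 0.9]])

# Pairs of sites whose Voronoi regions share an edge.
dual_edges = {tuple(sorted(e)) for e in Voronoi(sites).ridge_points}

# Edges of the Delaunay triangulation of the same sites.
tri = Delaunay(sites)
tri_edges = {tuple(sorted((s[i], s[j])))
             for s in tri.simplices for i in range(3) for j in range(i + 1, 3)}

assert dual_edges == tri_edges   # holds for sites in general position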
Figure 12.4 shows the Delaunay triangulation associated with the earlier Voronoi diagram
of a set of twelve points.
One has to be careful to make sure that all the Voronoi vertices have been computed
before computing a Delaunay triangulation, since otherwise, some edges could be missed. In
Figure 12.5 illustrating such a situation, if the lowest Voronoi vertex had not been computed
(not shown on the diagram!), the lowest edge of the Delaunay triangulation would be missing.
12.2 Triangulations
The concept of a triangulation relies on the notion of pure simplicial complex defined in
Chapter 9. The reader should review Definition 9.2 and Definition 9.3.
Given a finite set P of n points in the plane, and given a triangulation of the convex hull of P having P as its set of vertices, observe that the boundary of the convex hull of P is a convex polygon. Similarly, given a finite set P of points in 3-space, and given a triangulation of the convex hull of P having P as its set of vertices, observe that the boundary of the convex hull of P is a convex polyhedron.
It is interesting to know how many triangulations exist for a set of n points (in the plane
or in 3-space), and it is also interesting to know the number of edges and faces in terms
of the number of vertices in P . These questions can be settled using the Euler–Poincaré
characteristic. We say that a polygon in the plane is a simple polygon iff it is a connected
closed polygon such that no two edges intersect (except at a common vertex).
Proposition 12.2.
(1) For any triangulation of a region of the plane whose boundary is a simple polygon,
letting v be the number of vertices, e the number of edges, and f the number of triangles,
we have the “Euler formula”
v − e + f = 1.
(2) For any region, S, in E3 homeomorphic to a closed ball and for any triangulation of S,
letting v be the number of vertices, e the number of edges, f the number of triangles,
and t the number of tetrahedra, we have the “Euler formula”
v − e + f − t = 1.
(3) Furthermore, for any triangulation of the combinatorial surface, B(S), that is the
boundary of S, letting v 0 be the number of vertices, e0 the number of edges, and f 0 the
number of triangles, we have the “Euler formula”
v 0 − e0 + f 0 = 2.
Proof. All the statements are immediate consequences of Theorem 10.6. For example, part
(1) is obtained by mapping the triangulation onto a sphere using inverse stereographic pro-
jection, say from the North pole. Then, we get a polytope on the sphere with an extra facet
corresponding to the “outside” of the triangulation. We have to deduct this facet from the
Euler characteristic of the polytope and this is why we get 1 instead of 2.
It is now easy to see that in case (1), the number of edges and faces is a linear function
of the number of vertices and boundary edges, and that in case (3), the number of edges
and faces is a linear function of the number of vertices. Indeed, in the case of a planar
triangulation, each face has 3 edges, and if there are eb edges in the boundary and ei edges
not in the boundary, each nonboundary edge is shared by two faces, and thus 3f = eb + 2ei .
Since v − eb − ei + f = 1, we get

v − eb − ei + eb/3 + 2ei/3 = 1,

that is,

2eb/3 + ei/3 = v − 1,

and so ei = 3v − 3 − 2eb and f = eb/3 + 2ei/3 = 2v − 2 − eb; both are linear in v and eb.
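These little relations are easy to sanity-check on a concrete example (ours): a square split along one diagonal.

# A square split along one diagonal: v = 4 vertices, eb = 4 boundary
# edges, ei = 1 interior edge (the diagonal), f = 2 triangles.
v, eb, ei, f = 4, 4, 1, 2
assert v - (eb + ei) + f == 1           # Euler formula, case (1)
assert 3 * f == eb + 2 * ei             # every triangle has three edges
assert ei == 3 * v - 3 - 2 * eb         # the derived linear relations
assert f == 2 * v - 2 - eb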
Actually, it is not obvious that Del (P ) is a triangulation of the convex hull of P , but
this can be shown, as well as the properties listed in the following proposition.
Proposition 12.3. Let P = {p1 , . . . , pn } be a set of n points in Em , and assume that they
are in general position. Then the Delaunay triangulation of the convex hull of P is indeed a
triangulation associated with P , and it satisfies the following properties:
(1) The boundary of Del (P ) is the convex hull of P .
(2) A triangulation T associated with P is the Delaunay triangulation Del (P ) iff every
(m − 1)-sphere S(σ) circumscribed about an m-simplex σ of T contains no other point
from P (i.e., the open ball associated with S(σ) contains no point from P ).
The proof can be found in Risler [48] and O’Rourke [44]. In the case of a planar set P , it
can also be shown that the Delaunay triangulation has the property that it maximizes the
minimum angle of the triangles involved in any triangulation of P . However, this does not
characterize the Delaunay triangulation. Given a connected graph in the plane, it can also
be shown that any minimal spanning tree is contained in the Delaunay triangulation of the
convex hull of the set of vertices of the graph (O’Rourke [44]).
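Property (2) of Proposition 12.3 is the basis of the classical “in-circle” test used by planar Delaunay algorithms. A common way to implement it (a sketch of ours, assuming NumPy) is the 4 × 4 determinant built from the paraboloid lifting discussed in Section 12.4; the sign convention below assumes the triangle is given counterclockwise.

import numpy as np

def in_open_circumdisk(p1, p2, p3, p):
    """True iff p lies strictly inside the circle through p1, p2, p3,
    assuming (p1, p2, p3) is oriented counterclockwise."""
    # Each row is the paraboloid lift (x, y, x^2 + y^2) in homogeneous form.
    rows = [(x, y, x * x + y * y, 1.0) for (x, y) in (p1, p2, p3, p)]
    return np.linalg.det(np.array(rows)) > 0

# The circumcircle of the ccw triangle (0,0), (2,0), (0,2) is centered
# at (1, 1) with radius sqrt(2).
assert in_open_circumdisk((0, 0), (2, 0), (0, 2), (1.0, 1.0))
assert not in_open_circumdisk((0, 0), (2, 0), (0, 2), (3.0, 3.0))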
We will now explore briefly the connection between Delaunay triangulations and convex
hulls.
12.4 Delaunay Triangulations and Convex Hulls

Figure 12.7: The intersection of the paraboloid z = x² + y² with the cylinder x² + (y − 1)² = 1. The intersection is an ellipse lying in the plane z = 2y.
Indeed, a point p inside the circle C would lift to a point l(p) on the paraboloid. Since
no four points are cocyclic, one of the four points p1 , p2 , p3 , p is further from O than the
others; say this point is p3 . Then, the face (l(p1 ), l(p2 ), l(p)) would be below the face
(l(p1 ), l(p2 ), l(p3 )), contradicting the fact that (l(p1 ), l(p2 ), l(p3 )) is one of the downward-
facing faces of the convex hull of P . See Figure 12.8. But then, by Property (2) of Proposition
12.3, the triangle (p1 , p2 , p3 ) would belong to the Delaunay triangulation of P .
Figure 12.8: The lift of four points p1 , p2 , p3 , p. Since p is inside the green circle, the blue
triangle (l(p1 ), l(p), l(p2 )) is beneath the green triangle (l(p1 ), l(p2 ), l(p3 )), which implies that
(l(p1 ), l(p2 ), l(p3 )) is not downward facing.
Therefore, we have shown that the projection of the part of the convex hull of the lifted
set l(P ) consisting of the downward-facing faces is the Delaunay triangulation of P . Figure
12.9 shows the lifting of the Delaunay triangulation shown earlier. Another example of the
lifting of a Delaunay triangulation is shown in Figure 12.10.
The fact that a Delaunay triangulation can be obtained by projecting a lower convex
hull can be used to find efficient algorithms for computing a Delaunay triangulation. It also
holds for higher dimensions.
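For instance (a sketch of ours, assuming SciPy, whose ConvexHull wraps qhull), one can lift the sites to the paraboloid, compute the convex hull, and keep the downward-facing facets; their projections are the Delaunay triangles:

import numpy as np
from scipy.spatial import ConvexHull

sites = np.array([[0.0, 0.0], [2.0, 0.1], [0.3, 1.7], [2.2, 2.1], [1.1, 0.9]])

# Lift each (x, y) to (x, y, x^2 + y^2) on the paraboloid.
lifted = np.c_[sites, (sites ** 2).sum(axis=1)]
hull = ConvexHull(lifted)

# Each row of hull.equations is (n, c) with outward normal n; a facet is
# downward-facing iff the z-component of its normal is negative.
delaunay_triangles = hull.simplices[hull.equations[:, 2] < 0]
print(delaunay_triangles)      # vertex indices of the Delaunay triangles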
The Voronoi diagram itself can also be obtained from the lifted set l(P ). However, this
time, we need to consider tangent planes to the paraboloid at the lifted points. It is fairly
obvious that the tangent plane at the lifted point (a, b, a² + b²) is

z = 2ax + 2by − (a² + b²).
Given two distinct lifted points (a1, b1, a1² + b1²) and (a2, b2, a2² + b2²), the intersection of the tangent planes at these points is a line belonging to the plane of equation

2(a1 − a2)x + 2(b1 − b2)y = (a1² + b1²) − (a2² + b2²),

that is,

(a2 − a1)x + (b2 − b1)y = (a2² + b2²)/2 − (a1² + b1²)/2.

Now, if we project this plane onto the xy-plane, we see that the above is precisely the equation of the bisector line of the two points (a1, b1) and (a2, b2). See Figure 12.11. Therefore, if we
look at the paraboloid from z = +∞ (with the paraboloid transparent), the projection of the
boundary of the polyhedron V(P ) consisting of the intersection of the half spaces containing
the origin cut out by the tangent planes at the lifted points is the Voronoi diagram!
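A quick numeric check of this computation (ours, assuming NumPy): the tangent planes at two lifted points agree exactly above the bisector of the two sites; the example uses the sites (0, 1) and (1, 0) of Figure 12.11.

import numpy as np

def tangent(a, b):
    """Coefficients (A, B, C) of the tangent plane z = Ax + By + C to the
    paraboloid z = x^2 + y^2 at the lifted point (a, b, a^2 + b^2)."""
    return np.array([2 * a, 2 * b, -(a ** 2 + b ** 2)])

t1, t2 = tangent(0.0, 1.0), tangent(1.0, 0.0)   # z = 2y - 1 and z = 2x - 1

# The midpoint (1/2, 1/2) is equidistant from (0, 1) and (1, 0), and the
# two tangent planes take the same value above it.
m = np.array([0.5, 0.5, 1.0])                   # (x, y, 1)
assert abs(t1 @ m - t2 @ m) < 1e-12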
Figure 12.11: The intersection of the tangent plane at (0, 1, 1), with equation z = 2y − 1, and the tangent plane at (1, 0, 1), with equation z = 2x − 1, is a line whose projection y − x = 0 is the bisector line of the points (0, 1) and (1, 0).
It should be noted that the “duality” between the Delaunay triangulation, which is the
projection of the convex hull of the lifted set l(P ) viewed from z = −∞, and the Voronoi
diagram, which is the projection of the boundary of the polyhedron V(P ) cut out by the
tangent planes at the points of the lifted set l(P ) viewed from z = +∞, is reminiscent of
the polar duality with respect to a quadric. This duality will be thoroughly investigated in
Section 12.7.
The reader interested in algorithms for finding Voronoi diagrams and Delaunay triangu-
lations is referred to O’Rourke [44], Preparata and Shamos [47], Boissonnat and Yvinec [12],
de Berg, Van Kreveld, Overmars, and Schwarzkopf [6], and Risler [48].
As far as we know, Edelsbrunner and Seidel [24] were the first to find the relationship
between Voronoi diagrams and the polar dual of the convex hull of a lifted set of points onto
a paraboloid. This connection is described in Note 3.1 of Section 3 in [24]. The connection
between the Delaunay triangulation and the convex hull of the lifted set of points is described
in Note 3.2 of the same paper. Polar duality is not mentioned and seems to enter the scene
only with Boissonnat and Yvinec [12].
Brown appears to be the first person who observed that Voronoi diagrams and convex
hulls are related via inversion with respect to a sphere [16]. Brown takes a set of points
P , for simplicity assumed to be in the plane, first lifts these points to the unit sphere S 2
using inverse stereographic projection from the north pole τN : E2 → (S 2 − {N }) (which is
equivalent to an inversion of power 2 centered at the north pole), getting τN (P ), and then
takes the convex hull D(P ) = conv(τN (P )) of the lifted set. Now, in order to obtain the
Voronoi diagram of P , apply our inversion (of power 2 centered at the north pole) to each
of the faces of conv(τN (P )), obtaining spheres passing through the north pole, and then
intersect these spheres with the plane containing P , obtaining circles. The centers of some
of these circles are the Voronoi vertices. Finally, a simple criterion can be used to retain the
“nearest Voronoi points” and to connect up these vertices; see Brown [16], page 225.
Note that Brown’s method is not the method that uses the polar dual of the poly-
hedron D(P ) = conv(τN (P )), as we might have expected from the lifting method using
a paraboloid. However, Brown’s method suggests a method for obtaining the Delaunay
triangulation Del (P ) of P by lifting the set P to the sphere S d by applying the inverse
stereographic projection τN : Ed → (S d − {N }) (see Definition 12.5) instead of the lifting
function l, computing the convex hull D(P ) = conv(τN (P )) of the lifted set τN (P ), and then
applying the central projection πN from the north pole N to the hyperplane xd+1 = 0 instead
of the orthogonal projection pd+1 to the facets of the polyhedron D(P ) that do not contain
the north pole, as we will prove in Section 12.7. The central projection πN is the partial
map πN : (Ed+1 − Hd+1) → Ed given by

πN(x1, . . . , xd, xd+1) = (1/(1 − xd+1)) (x1, . . . , xd);
see Definition 12.5. For any point M = (x1 , . . . , xd , xd+1 ) not in the hyperplane Hd+1 of
equation xd+1 = 1, the point πN (M ) is the intersection of the line hN, M i through M and
N with the hyperplane Hd+1 (0) of equation xd+1 = 0. See Figure 12.12.
Thus, instead of using a paraboloid we can use a sphere, and instead of the lifting function
l we can use the inverse stereographic projection τN. Then, to get back down to Ed, we
use the central projection πN instead of the orthogonal projection pd+1 . As D(P ) is strictly
below the hyperplane xd+1 = 1, there are no problems.
It turns out that there is a “projective transformation” Θ of Ed+1 that maps the sphere Sd minus the north pole to the paraboloid P, and this map satisfies the equation

l = Θ ◦ τN.

Consequently, Θ is a bijection between Ed+1 − Hd+1 and Ed+1 − Hd+1(−1), where Hd+1(−1) is the hyperplane of equation xd+1 = −1. As we said earlier, Θ maps the sphere Sd minus the north pole to the paraboloid P (see Figure 12.13), and

l = Θ ◦ τN.
What this means is that if we think of the inverse stereographic projection τN as a lifting of
points in Ed to the sphere S d , then lifting points from Ed to S d and then mapping S d − {N }
to P by applying Θ is equivalent to lifting points from Ed to the paraboloid P using l.
It would be tempting to define the Voronoi diagram Vor (P ) as the central projection of
the polar dual D(P )∗ of D(P ). However, we have to be careful because Θ does not map
all convex polyhedra to convex polyhedra. In particular, Θ is not well-defined on any face
of D(P )∗ intersecting the hyperplane Hd+1 (of equation xd+1 = 1). Fortunately, we can
circumvent these difficulties by using the concept of a projective polyhedron introduced in
Chapter 11 and defining a projective version θ of Θ which is a total function. We can also
define projective versions of σN , τN , l, and πN , to prove that the Voronoi diagram of P is
indeed obtained from a suitable projection of the polar dual of D(P ) (actually, a projective
version of D(P )).
In summary, Voronoi diagrams, Delaunay triangulations, and their properties can also be nicely explained using inverse stereographic projection and the central projection from N, but a rigorous justification of why this “works” is not as simple as it might appear.
12.5 Stereographic Projection and the Space of Spheres
The advantage of stereographic projection over the lifting onto a paraboloid is that the d-sphere is compact. Since the stereographic projection and its inverse map (d − 1)-spheres to (d − 1)-spheres (or hyperplanes), all the crucial properties of Delaunay triangulations are preserved. The purpose of this section is to establish the properties of stereographic
projection (and its inverse) that will be needed in Section 12.7.
Recall that the d-sphere Sd ⊆ Ed+1 is given by

Sd = {(x1, . . . , xd+1) ∈ Ed+1 | x1² + · · · + xd² + xd+1² = 1}.
It will be convenient to write a point (x1, . . . , xd+1) ∈ Ed+1 as z = (x, xd+1), with x = (x1, . . . , xd). We denote N = (0, . . . , 0, 1) (with d zeros) as (0, 1) and call it the north pole, and S = (0, . . . , 0, −1) (with d zeros) as (0, −1) and call it the south pole. We also write ‖z‖ = (x1² + · · · + xd+1²)^{1/2} = (‖x‖² + xd+1²)^{1/2} (with ‖x‖ = (x1² + · · · + xd²)^{1/2}). With these notations,

Sd = {(x, xd+1) ∈ Ed+1 | ‖x‖² + xd+1² = 1}.
The stereographic projection from the north pole, σN : (Sd − {N}) → Ed, is the restriction to Sd of the central projection πN : (Ed+1 − Hd+1) → Ed from N onto the hyperplane Hd+1(0) ≅ Ed of equation xd+1 = 0; that is, M ↦ πN(M), where πN(M) is the intersection of the line ⟨N, M⟩ through N and M with Hd+1(0). Since the line through N and M = (x, xd+1) is given parametrically by
given parametrically by
hN, M i = {(1 − λ)(0, 1) + λ(x, xd+1 ) | λ ∈ R},
the intersection πN (M ) of this line with the hyperplane xd+1 = 0 corresponds to the value
of λ such that
(1 − λ) + λxd+1 = 0,
that is,

λ = 1/(1 − xd+1).

Therefore, the coordinates of πN(M), with M = (x, xd+1), are given by

πN(x, xd+1) = x/(1 − xd+1).
See Figure 12.14. The central projection πN is undefined on the hyperplane Hd+1 of equation
xd+1 = 1, and the stereographic projection σN from the north pole, which is the restriction
of πN to the sphere S d , is undefined at the north pole.
Let us find the inverse τN(P) = σN⁻¹(P) of any P ∈ Hd+1(0) ≅ Ed. This time, τN(P) is the intersection of the line ⟨N, P⟩ through P ∈ Hd+1(0) and N with the sphere Sd. Since the line through N and P = (x, 0) is given parametrically by

⟨N, P⟩ = {(1 − λ)(0, 1) + λ(x, 0) | λ ∈ R},

the intersection τN(P) of this line with the sphere Sd corresponds to the nonzero value of λ such that

λ²‖x‖² + (1 − λ)² = 1,

that is,

λ(λ(‖x‖² + 1) − 2) = 0.

Thus, we get

λ = 2/(‖x‖² + 1),

and so

τN(x) = (2x/(‖x‖² + 1), (‖x‖² − 1)/(‖x‖² + 1)).
Definition 12.5. The central projection πN : (Ed+1 − Hd+1) → Ed from N onto the hyperplane Hd+1(0) ≅ Ed of equation xd+1 = 0 is given by

πN(x, xd+1) = x/(1 − xd+1),    (xd+1 ≠ 1).

The stereographic projection from the north pole, σN : (Sd − {N}) → Ed, is the restriction of πN to the sphere Sd; its inverse is the map τN : Ed → (Sd − {N}) given by

τN(x) = (2x/(‖x‖² + 1), (‖x‖² − 1)/(‖x‖² + 1)).
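For the reader who wants to experiment, here are σN and τN in Python (a sketch of ours, assuming NumPy), together with checks that τN(x) lies on Sd and that the two maps are mutually inverse:

import numpy as np

def sigma_N(z):
    """Stereographic projection from the north pole, S^d - {N} -> E^d."""
    return z[:-1] / (1.0 - z[-1])

def tau_N(x):
    """Inverse stereographic projection, E^d -> S^d - {N}."""
    s = x @ x
    return np.append(2.0 * x, s - 1.0) / (s + 1.0)

x = np.array([0.3, -1.2, 2.0])
z = tau_N(x)
assert abs(z @ z - 1.0) < 1e-12      # tau_N(x) lies on the sphere S^3
assert np.allclose(sigma_N(z), x)    # sigma_N and tau_N are inverse maps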
Consider a hyperplane H ⊆ Ed+1 of equation

a1 x1 + · · · + ad xd + ad+1 xd+1 + b = 0

that intersects Sd and does not pass through N; we determine the image of Sd ∩ H under σN. If M = (x, xd+1) ∈ Sd ∩ H and X = σN(M), then x = (1 − xd+1)X, and substituting this in the equation of H we get

Σ_{i=1}^d ai(1 − xd+1)Xi + ad+1 xd+1 + b = 0,

that is,

Σ_{i=1}^d ai Xi + (ad+1 − Σ_{j=1}^d aj Xj) xd+1 + b = 0.

If Σ_{j=1}^d aj Xj = ad+1, then ad+1 + b = 0, which is impossible. Therefore, we get

xd+1 = (−b − Σ_{i=1}^d ai Xi) / (ad+1 − Σ_{i=1}^d ai Xi),

and so

1 − xd+1 = (ad+1 + b) / (ad+1 − Σ_{i=1}^d ai Xi).
Since ‖x‖² + xd+1² = 1 on Sd and x = (1 − xd+1)X, we have (1 − xd+1)²‖X‖² = (1 − xd+1)(1 + xd+1), that is,

(ad+1 + b)²‖X‖² = (ad+1 − Σ_{i=1}^d ai Xi)² − (b + Σ_{i=1}^d ai Xi)²
= (ad+1 + b)(ad+1 − b − 2 Σ_{i=1}^d ai Xi),

which yields

(ad+1 + b)²‖X‖² + 2(ad+1 + b)(Σ_{i=1}^d ai Xi) = (ad+1 + b)(ad+1 − b),

that is,

‖X‖² + 2 Σ_{i=1}^d (ai/(ad+1 + b)) Xi − (ad+1 − b)/(ad+1 + b) = 0,
which is indeed the equation of a (d − 1)-sphere in Ed . By “completing the square,” the
above equation can be written as
Σ_{i=1}^d (Xi + ai/(ad+1 + b))² − Σ_{i=1}^d ai²/(ad+1 + b)² − (ad+1 − b)/(ad+1 + b) = 0,

which yields

Σ_{i=1}^d (Xi + ai/(ad+1 + b))² = (Σ_{i=1}^d ai² + (ad+1 − b)(ad+1 + b))/(ad+1 + b)²,

that is,

Σ_{i=1}^d (Xi + ai/(ad+1 + b))² = (Σ_{i=1}^{d+1} ai² − b²)/(ad+1 + b)².    (∗)
However, the distance from the origin to the hyperplane H of equation

a1 x1 + · · · + ad xd + ad+1 xd+1 + b = 0

is

δ = |b| / (Σ_{i=1}^{d+1} ai²)^{1/2},

and since we are assuming that H intersects the unit sphere Sd in a sphere of positive radius, we must have δ < 1, so

b² < Σ_{i=1}^{d+1} ai²,
and (∗) is indeed the equation of a real sphere (its radius is positive). Therefore, when N ∉ H, the image of S = Sd ∩ H by σN is a (d − 1)-sphere in Hd+1(0) = Ed. See Figure 12.16.
If the hyperplane H contains the north pole, then ad+1 + b = 0, in which case, for every (x, xd+1) ∈ Sd ∩ H, we have

Σ_{i=1}^d ai xi + ad+1 xd+1 − ad+1 = 0,

that is,

Σ_{i=1}^d ai xi − ad+1(1 − xd+1) = 0,

and since xi = (1 − xd+1)Xi, dividing by 1 − xd+1 (with xd+1 ≠ 1) we get

Σ_{i=1}^d ai Xi − ad+1 = 0,

the equation of the intersection of the hyperplanes H and Hd+1(0). Therefore, the image of Sd ∩ H by σN is the hyperplane in Ed which is the intersection of H with Hd+1(0). See Figure 12.17.

Figure 12.16: Two views of the plane −x − y + z = 0 intersecting S². The bottom figure shows the stereographic projection of the intersection, namely the circle (x − 1)² + (y − 1)² = 3.
We will also prove that τN maps (d − 1)-spheres in Hd+1(0) to (d − 1)-spheres on Sd not passing through the north pole. Assume that X ∈ Ed belongs to the (d − 1)-sphere of equation

Σ_{i=1}^d Xi² + Σ_{j=1}^d aj Xj + b = 0.
For any (X, 0) ∈ Hd+1(0), we know that (x, xd+1) = τN(X) is given by

(x, xd+1) = (2X/(‖X‖² + 1), (‖X‖² − 1)/(‖X‖² + 1)).
Using the equation of the (d − 1)-sphere, we get

x = 2X/(−b + 1 − Σ_{j=1}^d aj Xj)

and

xd+1 = (−b − 1 − Σ_{j=1}^d aj Xj)/(−b + 1 − Σ_{j=1}^d aj Xj).
Then, we get

Σ_{i=1}^d ai xi = 2 Σ_{j=1}^d aj Xj / (−b + 1 − Σ_{j=1}^d aj Xj),

which yields

(−b + 1)(Σ_{i=1}^d ai xi) − (Σ_{i=1}^d ai xi)(Σ_{j=1}^d aj Xj) = 2 Σ_{j=1}^d aj Xj.
From this, we get

xd+1 = (−b − 1 − Σ_{i=1}^d ai xi)/(−b + 1),

which yields

Σ_{i=1}^d ai xi + (−b + 1)xd+1 + (b + 1) = 0,

the equation of a hyperplane H not passing through the north pole. Therefore, the image of a (d − 1)-sphere in Hd+1(0) is indeed the intersection H ∩ Sd of Sd with a hyperplane not passing through N, that is, a (d − 1)-sphere on Sd.
Given any hyperplane H′ in Hd+1(0) = Ed, say of equation

Σ_{i=1}^d ai Xi + b = 0,
for every X ∈ H′, the point (x, xd+1) = τN(X) satisfies

Σ_{i=1}^d ai xi − b xd+1 + b = 0,
which is indeed the equation of a hyperplane H passing through N . We summarize all this
in the following proposition:
Proposition 12.4. The stereographic projection σN : (Sd − {N}) → Ed induces a bijection between the set of (d − 1)-spheres on Sd and the union of the set of (d − 1)-spheres in Ed with the set of hyperplanes in Ed; every (d − 1)-sphere on Sd not passing through the north pole is mapped to a (d − 1)-sphere in Ed, and every (d − 1)-sphere on Sd passing through the north pole is mapped to a hyperplane in Ed. In fact, σN maps the (d − 1)-sphere on Sd determined by the hyperplane

a1 x1 + · · · + ad xd + ad+1 xd+1 + b = 0

not passing through the north pole (ad+1 + b ≠ 0) to the (d − 1)-sphere

Σ_{i=1}^d (Xi + ai/(ad+1 + b))² = (Σ_{i=1}^{d+1} ai² − b²)/(ad+1 + b)²,

and the map τN = σN⁻¹ maps the (d − 1)-sphere

Σ_{i=1}^d Xi² + Σ_{j=1}^d aj Xj + b = 0

to the (d − 1)-sphere on Sd determined by the hyperplane

Σ_{i=1}^d ai xi + (−b + 1)xd+1 + (b + 1) = 0,

which does not pass through the north pole.
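Proposition 12.4 is easy to test numerically (a sketch of ours, assuming NumPy; the plane is the one of Figure 12.16): we pick a point of Sd on the given hyperplane, project it, and check that it lands on the predicted (d − 1)-sphere.

import numpy as np

def sigma_N(z):
    return z[:-1] / (1.0 - z[-1])

# The hyperplane -x - y + z = 0 of Figure 12.16: a = (-1, -1, 1), b = 0.
a, b = np.array([-1.0, -1.0, 1.0]), 0.0

# A point of S^2 on this hyperplane (z = x + y with x^2 + y^2 + z^2 = 1).
p = np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)
assert abs(a @ p + b) < 1e-12 and abs(p @ p - 1.0) < 1e-12

X = sigma_N(p)
center = -a[:2] / (a[2] + b)                  # = (1, 1)
r2 = (a @ a - b ** 2) / (a[2] + b) ** 2       # = 3
assert abs((X - center) @ (X - center) - r2) < 1e-12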
Proposition 12.4 raises a natural question: What do the hyperplanes H in Ed+1 that do
not intersect S d correspond to, if they correspond to anything at all?
The first thing to observe is that the geometric definition of the stereographic projection
and its inverse makes it clear that the hyperplanes corresponding to (d − 1)-spheres in Ed
(by τN) do intersect Sd. Now, when we write the equation of a (d − 1)-sphere S, say

Σ_{i=1}^d Xi² + Σ_{i=1}^d ai Xi + b = 0,
we are implicitly assuming a condition on the ai ’s and b that ensures that S is not the empty
sphere, that is, that its radius R is positive (or zero). By “completing the square,” the above
equation can be rewritten as
Σ_{i=1}^d (Xi + ai/2)² = (1/4) Σ_{i=1}^d ai² − b,

so the square of its radius R is (1/4) Σ_{i=1}^d ai² − b, whereas its center is the point c = −(1/2)(a1, . . . , ad). Thus, our sphere is a “real” sphere of positive radius iff

Σ_{i=1}^d ai² > 4b,
or a single point, c = −(1/2)(a1, . . . , ad), iff Σ_{i=1}^d ai² = 4b.
What happens when

Σ_{i=1}^d ai² < 4b?
In this case, if we allow “complex points,” that is, if we consider solutions of our equation
Σ_{i=1}^d Xi² + Σ_{i=1}^d ai Xi + b = 0

over Cd, then we get a “complex” sphere of (pure) imaginary radius (i/2)(4b − Σ_{i=1}^d ai²)^{1/2}. The
funny thing is that our computations carry over unchanged and the image of the complex
sphere S is still the intersection of the complex sphere Sd with the hyperplane H given by

Σ_{i=1}^d ai xi + (−b + 1)xd+1 + (b + 1) = 0.
However, this time, even though H does not have any “real” intersection points with Sd, we can show that it does intersect the “complex sphere” (the zero locus of z1² + · · · + zd+1² − 1 over Cd+1). In fact, τN induces a bijection between the set of hyperplanes in Ed+1 not tangent to Sd at the north pole and the union of the following four sets:

(1) The set of all “real” (d − 1)-spheres in Ed (of positive radius);

(2) The set of all “complex” (d − 1)-spheres in Ed (of imaginary radius);

(3) The set of all hyperplanes in Ed;

(4) The set of all points of Ed (viewed as spheres of radius 0); see Figure 12.18.
Moreover, Set (1) corresponds to the hyperplanes that intersect the interior of S d and do not
pass through the north pole; Set (2) corresponds to the hyperplanes that do not intersect S d ;
Set (3) corresponds to the hyperplanes that pass through the north pole minus the tangent
hyperplane at the north pole; and Set (4) corresponds to the hyperplanes that are tangent
to S d , minus the tangent hyperplane at the north pole.
It is convenient to add the “point at infinity” ∞ to Ed , because then the above bijection
can be extended to map the tangent hyperplane at the north pole to ∞. The union of these
four sets (with ∞ added) is called the set of generalized spheres, sometimes denoted S(Ed ).
Figure 12.18: The plane −2y − 2z − 2√2 = 0, tangent to S² at the point (0, −1/√2, −1/√2), along with its corresponding stereographic projection, the point (0, −1/(√2 + 1), 0).
This is a fairly complicated space. For one thing, topologically, S(Ed) is homeomorphic to the projective space Pd+1 with one point removed (the point corresponding to the “hyperplane at infinity”), and this is not a simple space. We can get a slightly more concrete “picture” of S(Ed) by looking at the polars of the hyperplanes w.r.t. Sd. Then the “real” spheres correspond to the points strictly outside Sd that do not belong to the tangent hyperplane at the north pole (see Figure 12.19); the complex spheres correspond to the points in the interior of Sd; the points of Ed ∪ {∞} correspond to the points on Sd (see Figure 12.18); the hyperplanes in Ed correspond to the points in the tangent hyperplane at the north pole except for the north pole (see Figure 12.20). Unfortunately, the poles of hyperplanes through the origin are undefined. This can be fixed by embedding Ed+1 in its projective completion Pd+1, but we will not go into this.
There are other ways of dealing rigorously with the set of generalized spheres. One method, described by Boissonnat and Yvinec [12], is to use the embedding in which the sphere S of equation

Σ_{i=1}^d Xi² − 2 Σ_{i=1}^d ai Xi + b = 0

is mapped to the point (a1, . . . , ad, b) ∈ Ed+1.
Figure 12.19: The plane 2x + 2y + 2z = 1 with its dual, the point (2, 2, 2). Also shown is the stereographic projection of the intersection, namely the circle (x + 2)² + (y + 2)² = 11.
The quantity Σ_{i=1}^d ai² − R² is known as the power of the origin w.r.t. S. In general, the power of a point X ∈ Ed is defined as ρ(X) = ‖cX‖² − R², which, after a moment of thought, is just

ρ(X) = Σ_{i=1}^d Xi² − 2 Σ_{i=1}^d ai Xi + b.
Now, since points correspond to spheres of radius 0, we see that the image of the point X = (X1, . . . , Xd) is

l(X) = (X1, . . . , Xd, Σ_{i=1}^d Xi²).

Thus, in this model, points of Ed are lifted to the paraboloid P ⊆ Ed+1 of equation

xd+1 = Σ_{i=1}^d xi².
Actually, this method does not deal with hyperplanes, but it is possible to do so. The trick is to consider equations of a slightly more general form that capture both spheres and hyperplanes, namely equations of the form

c Σ_{i=1}^d Xi² + Σ_{i=1}^d ai Xi + b = 0.

Indeed, when c = 0, we do get a hyperplane! Now, to carry out this method, we really need to consider equations up to a nonzero scalar, that is, we consider the projective space
Figure 12.20: The plane x + y + z = 1 with its dual (1, 1, 1). Also shown is the stereographic
projection of the intersection, namely the line x + y = 1.
has a natural interpretation (with a = (a1, . . . , ad) and a′ = (a′1, . . . , a′d)). Indeed, orthogonality with respect to ρ (that is, when ρ((a, b, c), (a′, b′, c′)) = 0) says that the corresponding spheres defined by (a, b, c) and (a′, b′, c′) are orthogonal, that the corresponding hyperplanes defined by (a, b, 0) and (a′, b′, 0) are orthogonal, etc. The reader who wants to read more about this approach should consult Berger (Volume II) [8].
12.6 Relating Lifting to a Paraboloid and Lifting to a Sphere

Recall that the sphere and the paraboloid are projectively equivalent, as we showed for S2 in Section 11.1.
We defined the map Θ given by

zi = xi/(1 − xd+1), 1 ≤ i ≤ d,
zd+1 = (xd+1 + 1)/(1 − xd+1),
and showed that Θ is a bijection between Ed+1 −Hd+1 and Ed+1 −Hd+1 (−1), where Hd+1 (−1)
is the hyperplane of equation xd+1 = −1. We will show a little later that Θ maps the sphere
S d minus the north pole to the paraboloid P, and satisfies the equation
l = Θ ◦ τN .
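The identity l = Θ ◦ τN is also easy to verify numerically (a sketch of ours, assuming NumPy, with the maps transcribed directly from the formulas above):

import numpy as np

def tau_N(x):
    s = x @ x
    return np.append(2.0 * x, s - 1.0) / (s + 1.0)

def theta(x):
    """Theta: divide by 1 - x_{d+1}, sending the last coordinate
    x_{d+1} to (x_{d+1} + 1) / (1 - x_{d+1})."""
    return np.append(x[:-1], x[-1] + 1.0) / (1.0 - x[-1])

def lift(x):
    """The paraboloid lifting l(x) = (x, ||x||^2)."""
    return np.append(x, x @ x)

x = np.array([0.7, -0.4])
assert np.allclose(theta(tau_N(x)), lift(x))   # l = Theta o tau_N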
The fact that Θ is undefined on the hyperplane Hd+1 is not a problem as far as mapping
the sphere to the paraboloid because the north pole is the only point that does not have an
image. However, later on when we consider the Voronoi polyhedron V(P ) of a lifted set of
points P , we will have more serious problems because in general, such a polyhedron intersects
both hyperplanes Hd+1 and Hd+1 (−1). This means that Θ will not be well-defined on the
whole of V(P ) nor will it be surjective on its image. To remedy this difficulty, we work with
projective completions. Basically, this amounts to chasing denominators and homogenizing
equations, but we also have to be careful in dealing with convexity, and this is where the
projective polyhedra (studied in Section 11.2) will come in handy.
So, let us consider the projective completion S̃d ⊆ Pd+1 of the sphere, given by the equation

Σ_{i=1}^{d+1} xi² = xd+2².
Definition 12.6. Let θ : Pd+1 → Pd+1 be the projectivity induced by the linear map θ̂ : Rd+2 → Rd+2 given by

zi = xi, 1 ≤ i ≤ d,
zd+1 = xd+1 + xd+2,
zd+2 = xd+2 − xd+1,
which simplifies to

zd+1 zd+2 = Σ_{i=1}^d zi².

Therefore, θ(S̃d) = P̃, that is, θ maps the projective completion of the sphere to the projective completion of the paraboloid. Observe that the projective north pole Ñ = (0 : · · · : 0 : 1 : 1) is mapped to the point at infinity (0 : · · · : 0 : 1 : 0).
Recall from Definition 11.5 that for any i, with 1 ≤ i ≤ d + 1, the set

Ui = {(x1 : · · · : xd+1) ∈ Pd | xi ≠ 0}

is a subset of Pd called an affine patch of Pd. We have a bijection ϕi : Ui → Rd between Ui and Rd given by

ϕi : (x1 : · · · : xd+1) ↦ (x1/xi, . . . , xi−1/xi, xi+1/xi, . . . , xd+1/xi),

with inverse ψi : Rd → Ui ⊆ Pd given by

ψi : (x1, . . . , xd) ↦ (x1 : · · · : xi−1 : 1 : xi : · · · : xd).
The map Θ is the restriction of θ to the affine patch Ud+2, and as such, it can be fruitfully described as the composition of θ̂ with a suitable projection onto Ed+1. For this, as we have done before, we identify Ed+1 with the hyperplane Hd+2 ⊆ Ed+2 of equation xd+2 = 1 (using the injection id+2 : Ed+1 → Ed+2, where ij : Ed+1 → Ed+2 is the injection given by

(x1, . . . , xd+1) ↦ (x1, . . . , xj−1, 1, xj, . . . , xd+1)

for any (x1, . . . , xd+1) ∈ Ed+1). For each i, with 1 ≤ i ≤ d + 2, let πi : (Ed+2 − Hi(0)) → Ed+1 be the projection of center 0 ∈ Ed+2 onto the hyperplane Hi ⊆ Ed+2 of equation xi = 1 (Hi ≅ Ed+1 and Hi(0) ⊆ Ed+2 is the hyperplane of equation xi = 0), given by

πi(x1, . . . , xd+2) = (x1/xi, . . . , xi−1/xi, xi+1/xi, . . . , xd+2/xi)    (xi ≠ 0).
Then we have

Θ = πd+2 ◦ θ̂ ◦ id+2.

If we identify Hd+2 and Ed+1, we may write, with a slight abuse of notation, Θ = πd+2 ◦ θ̂.
Figure 12.21: The geometric realization of the image π3(p), where π3 : (E3 − H3(0)) → E2.
We will need some properties of the projection πd+2 and of Θ, given in the following proposition.
Proposition 12.5. The maps πd+2, πN, and Θ have the following properties:

(1) For every hyperplane H through the origin, πd+2(H) is a hyperplane in Hd+2. See Figure 12.22.

(2) Given any set of points {a1, . . . , an} ⊆ Ed+2, if {a1, . . . , an} is contained in the open half-space above the hyperplane xd+2 = 0 or {a1, . . . , an} is contained in the open half-space below the hyperplane xd+2 = 0, then the image by πd+2 of the convex hull of the ai's is the convex hull of the images of these points, that is,

πd+2(conv({a1, . . . , an})) = conv(πd+2({a1, . . . , an})).

(3) Given any set of points {a1, . . . , an} ⊆ Ed+2, if {a1, . . . , an} is contained in the open half-space above the hyperplane xd+2 = 1 or {a1, . . . , an} is contained in the open half-space below the hyperplane xd+2 = 1, then the image by πN of the convex hull of the ai's is the convex hull of the images of these points, that is,

πN(conv({a1, . . . , an})) = conv(πN({a1, . . . , an})).

(4) Given any set of points {a1, . . . , an} ⊆ Ed+1, if {a1, . . . , an} is contained in the open half-space above the hyperplane Hd+1 or {a1, . . . , an} is contained in the open half-space below Hd+1, then

Θ(conv({a1, . . . , an})) = conv(Θ({a1, . . . , an})).

(5) For any set S ⊆ Ed+1, if conv(S) does not intersect Hd+1, then

Θ(conv(S)) = conv(Θ(S)).
Figure 12.22: The intersection of the teal plane x + y + z = 0 with the magenta plane z = 1 results in the pink line x + y = −1. This line is also the projection of the teal plane via π3, as shown by the lime green rays through the origin.
Proof. (1) The image, πd+2 (H), of a hyperplane H through the origin is the intersection of
H with Hd+2 , which is a hyperplane in Hd+2 .
(2) This seems fairly clear geometrically, but the result fails for arbitrary sets of points, so to be on the safe side, we give an algebraic proof. We will prove the following two facts by induction on n ≥ 1:

(1) For all λ1, . . . , λn with λ1 + · · · + λn = 1 and λi ≥ 0, there exist μ1, . . . , μn with μ1 + · · · + μn = 1 and μi ≥ 0, so that

πd+2(λ1a1 + · · · + λnan) = μ1πd+2(a1) + · · · + μnπd+2(an).

(2) For all μ1, . . . , μn with μ1 + · · · + μn = 1 and μi ≥ 0, there exist λ1, . . . , λn with λ1 + · · · + λn = 1 and λi ≥ 0, so that

μ1πd+2(a1) + · · · + μnπd+2(an) = πd+2(λ1a1 + · · · + λnan).
Figure 12.23: The convex hull of {a1, a2, a3}, namely the dusty rose triangle above z = 0 and below z = 1, is projected by π3 to the lavender triangle in the plane z = 1.
(1) The base case is clear. Let us assume for the moment that we proved (1) for n = 2 and consider the induction step for n ≥ 2. Since λ1 + · · · + λn+1 = 1 and n ≥ 2, there is some i such that λi ≠ 1, and without loss of generality, say λ1 ≠ 1. Then, we can write

λ1a1 + · · · + λn+1an+1 = λ1a1 + (1 − λ1)((λ2/(1 − λ1))a2 + · · · + (λn+1/(1 − λ1))an+1),

where

λ2/(1 − λ1) + · · · + λn+1/(1 − λ1) = 1.
Again, by the induction hypothesis (for n), there exist β_2, \ldots, β_{n+1} with β_2 + ··· + β_{n+1} = 1 and β_i ≥ 0, so that
$$\pi_{d+2}\left( \frac{\lambda_2}{1 - \lambda_1}\, a_2 + \cdots + \frac{\lambda_{n+1}}{1 - \lambda_1}\, a_{n+1} \right) = \beta_2 \pi_{d+2}(a_2) + \cdots + \beta_{n+1} \pi_{d+2}(a_{n+1}),$$
and, by the case n = 2, there is some α_1 with 0 ≤ α_1 ≤ 1, so we get
$$\begin{aligned} \pi_{d+2}(\lambda_1 a_1 + \cdots + \lambda_{n+1} a_{n+1}) &= (1 - \alpha_1)\pi_{d+2}(a_1) + \alpha_1\bigl(\beta_2 \pi_{d+2}(a_2) + \cdots + \beta_{n+1} \pi_{d+2}(a_{n+1})\bigr) \\ &= (1 - \alpha_1)\pi_{d+2}(a_1) + \alpha_1\beta_2\, \pi_{d+2}(a_2) + \cdots + \alpha_1\beta_{n+1}\, \pi_{d+2}(a_{n+1}), \end{aligned}$$
a convex combination of the π_{d+2}(a_i), which proves (1) for n + 1. Fact (2) is proved in the same way: by the induction hypothesis there are α_2, \ldots, α_{n+1} summing to 1, and by the case n = 2 there is some β_1 with 0 ≤ β_1 ≤ 1, so that
$$\pi_{d+2}\bigl((1 - \beta_1)a_1 + \beta_1(\alpha_2 a_2 + \cdots + \alpha_{n+1} a_{n+1})\bigr) = \mu_1 \pi_{d+2}(a_1) + (1 - \mu_1)\pi_{d+2}(\alpha_2 a_2 + \cdots + \alpha_{n+1} a_{n+1}),$$
which establishes the induction step. Therefore, all that remains is to prove (1) and (2) for n = 2.
As π_{d+2} is given by
$$\pi_{d+2}(x_1, \ldots, x_{d+2}) = \left( \frac{x_1}{x_{d+2}}, \ldots, \frac{x_{d+1}}{x_{d+2}} \right) \qquad (x_{d+2} \neq 0),$$
and the same coefficient μ works in every coordinate, it suffices to consider the case of the projection π_2 : (x, y) ↦ x/y on points of E^2 with y ≠ 0.
Let a = (a_1, b_1) and b = (a_2, b_2). To prove (1), we need to show that for any λ with 0 ≤ λ ≤ 1, the point π_2((1 − λ)a + λb) is a convex combination of π_2(a) and π_2(b). But since
$$\pi_2(a) = \frac{a_1}{b_1}, \qquad \pi_2(b) = \frac{a_2}{b_2},$$
and
$$\pi_2((1-\lambda)a + \lambda b) = \pi_2\bigl((1-\lambda)a_1 + \lambda a_2,\; (1-\lambda)b_1 + \lambda b_2\bigr) = \frac{(1-\lambda)a_1 + \lambda a_2}{(1-\lambda)b_1 + \lambda b_2},$$
it is enough to show that for any λ with 0 ≤ λ ≤ 1, if b_1 b_2 > 0, then
$$\frac{a_1}{b_1} \le \frac{(1-\lambda)a_1 + \lambda a_2}{(1-\lambda)b_1 + \lambda b_2} \le \frac{a_2}{b_2} \qquad \text{if } \frac{a_1}{b_1} \le \frac{a_2}{b_2},$$
and
$$\frac{a_2}{b_2} \le \frac{(1-\lambda)a_1 + \lambda a_2}{(1-\lambda)b_1 + \lambda b_2} \le \frac{a_1}{b_1} \qquad \text{if } \frac{a_2}{b_2} \le \frac{a_1}{b_1},$$
where, of course, (1 − λ)b_1 + λb_2 ≠ 0; these inequalities prove (1) (we leave some steps as an exercise). To prove (2), given μ with 0 ≤ μ ≤ 1, we solve for λ and compute (again leaving some steps as an exercise)
$$\lambda = \frac{\mu b_1}{(1-\mu)b_2 + \mu b_1} = \frac{\mu}{(1-\mu)\frac{b_2}{b_1} + \mu}.$$
Since b_1 b_2 > 0, we have b_2/b_1 > 0, and since 0 ≤ μ ≤ 1, we conclude that 0 ≤ λ ≤ 1, which proves (2).
(3) This proof is completely analogous to the proof of (2).
(4) Since
$$\Theta = \pi_{d+2} \circ \hat{\theta} \circ i_{d+2},$$
and since i_{d+2} and θ̂ preserve convex hulls (being affine and linear, respectively), by (2) we simply have to show that θ̂ ∘ i_{d+2}({a_1, \ldots, a_n}) is either strictly below the hyperplane x_{d+2} = 0 or strictly above it. But
$$\hat{\theta}(x_1, \ldots, x_{d+2})_{d+2} = x_{d+2} - x_{d+1}$$
and i_{d+2}(x_1, \ldots, x_{d+1}) = (x_1, \ldots, x_{d+1}, 1), so
$$\bigl(\hat{\theta} \circ i_{d+2}(x_1, \ldots, x_{d+1})\bigr)_{d+2} = 1 - x_{d+1},$$
and this quantity is positive iff x_{d+1} < 1, negative iff x_{d+1} > 1; that is, either all the points a_i are strictly below the hyperplane H_{d+1} or all strictly above it.
(5) This follows immediately from (4) as conv(S) consists of all finite convex combinations
of points in S.
If a set {a1 , . . . , an } ⊆ Ed+2 contains points on both sides of the hyperplane xd+2 = 0,
then πd+2 (conv({a1 , . . . , an })) is not necessarily convex; see Figure 12.24.
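The failure of convexity is easy to see numerically on the example of Figure 12.24; the following sketch of ours samples the segment conv({a_1, a_2}) and projects it with π_3.

    import numpy as np

    def pi3(x):
        # Central projection from the origin onto the plane z = 1 in E^3.
        x = np.asarray(x, dtype=float)
        assert x[2] != 0, "pi_3 is undefined on the plane z = 0"
        return (x / x[2])[:2]

    a1 = np.array([0.5, 0.0,  0.5])
    a2 = np.array([0.5, 0.0, -0.5])
    for t in (0.0, 0.4, 0.499, 0.501, 0.6, 1.0):
        p = (1 - t) * a1 + t * a2        # a point of conv({a1, a2})
        print(t, pi3(p))
    # As t -> 1/2 from below the image runs off to +infinity along the
    # x-axis, and for t > 1/2 it comes back from -infinity: the image is
    # the two rays x >= 1 and x <= -1, not a convex set.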
Besides θ, we need to define a few more maps in order to establish the connection between the Delaunay complex on S^d and the Delaunay complex on P. We use the convention of denoting the extension to projective spaces of a map f defined between Euclidean spaces by f̃.
Figure 12.24: Let a_1 = (1/2, 0, 1/2) and a_2 = (1/2, 0, −1/2). Since π_3((1/2, 0, 0)) is undefined, the image π_3(conv({a_1, a_2})) consists of two disconnected infinite rays.
The line ⟨Ñ, x⟩ intersects the hyperplane x_{d+1} = 0 iff
$$\lambda + \mu x_{d+1} = 0,$$
so we can pick λ = −x_{d+1} and μ = 1, which yields the intersection point
$$(x_1 : \cdots : x_d : 0 : x_{d+2} - x_{d+1}),$$
as claimed.
Figure 12.25: A schematic representation, via the plane model of P^2, of the geometric image of π̃_N(x). The copy of P^1 corresponds to the intersection of the xz-plane with the plane z = 1.
and
$$\tilde{\tau}_N(x_1 : \cdots : x_{d+1}) = \left( 2x_1 x_{d+1} : \cdots : 2x_d x_{d+1} : \sum_{i=1}^{d} x_i^2 - x_{d+1}^2 : \sum_{i=1}^{d} x_i^2 + x_{d+1}^2 \right).$$
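As a quick sanity check (our own, taking S̃^d to be the quadric of equation x_1^2 + ··· + x_{d+1}^2 − x_{d+2}^2 = 0, consistent with the tangent-hyperplane equation recalled from Section 11.2 later in this chapter), one can verify numerically that τ̃_N lands on S̃^d:

    import numpy as np

    def tau_tilde_N(x):
        # tau~_N in homogeneous coordinates, as in the displayed formula.
        x = np.asarray(x, dtype=float)
        s = np.dot(x[:-1], x[:-1])       # sum_{i <= d} x_i^2
        return np.concatenate([2 * x[:-1] * x[-1],
                               [s - x[-1] ** 2, s + x[-1] ** 2]])

    y = tau_tilde_N(np.array([1.0, 2.0, 3.0]))               # d = 2
    assert abs(np.dot(y[:-1], y[:-1]) - y[-1] ** 2) < 1e-9   # on the sphere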
It is an easy exercise to check that the image of S̃^d − {Ñ} by σ̃_N is U_{d+1}, and that σ̃_N and τ̃_N ↾ U_{d+1} are mutual inverses. Observe that σ̃_N = π̃_N ↾ S̃^d, the restriction of the projection π̃_N to the sphere S̃^d.
Definition 12.9. The lifting l̃ : E^d → P̃ ⊆ P^{d+1} is given by
$$\tilde{l}(x_1, \ldots, x_d) = \left( x_1 : \cdots : x_d : \sum_{i=1}^{d} x_i^2 : 1 \right),$$
and the embedding ψ_{d+1} : E^d → P^d (the map ψ_{d+1} defined in Section 11.1) is given by
$$\psi_{d+1}(x_1, \ldots, x_d) = (x_1 : \cdots : x_d : 1).$$
These maps satisfy the following equations (Proposition 12.6):
$$\tilde{l} = \theta \circ \tilde{\tau}_N \circ \psi_{d+1},$$
$$\tilde{\pi}_N = \tilde{p}_{d+1} \circ \theta,$$
$$\tilde{\tau}_N \circ \psi_{d+1} = \psi_{d+2} \circ \tau_N,$$
$$\tilde{l} = \psi_{d+2} \circ l,$$
$$l = \Theta \circ \tau_N.$$
Proof. As
$$\tilde{\tau}_N \circ \psi_{d+1}(x_1, \ldots, x_d) = \left( 2x_1 : \cdots : 2x_d : \sum_{i=1}^{d} x_i^2 - 1 : \sum_{i=1}^{d} x_i^2 + 1 \right),$$
we get
$$\theta \circ \tilde{\tau}_N \circ \psi_{d+1}(x_1, \ldots, x_d) = \left( 2x_1 : \cdots : 2x_d : 2\sum_{i=1}^{d} x_i^2 : 2 \right) = \left( x_1 : \cdots : x_d : \sum_{i=1}^{d} x_i^2 : 1 \right) = \tilde{l}(x_1, \ldots, x_d),$$
which proves the first equation,
and
$$\psi_{d+2}(\tau_N(x)) = \left( \frac{2x_1}{\|x\|^2 + 1} : \cdots : \frac{2x_d}{\|x\|^2 + 1} : \frac{\|x\|^2 - 1}{\|x\|^2 + 1} : 1 \right) = \bigl( 2x_1 : \cdots : 2x_d : \|x\|^2 - 1 : \|x\|^2 + 1 \bigr) = \tilde{\tau}_N \circ \psi_{d+1}(x),$$
which proves the third equation. Since ψ_{d+2} adds a 1 as a (d + 2)th component, the fourth equation follows immediately from the definitions.
For the fifth equation, since
$$\tau_N(x) = \left( \frac{2x}{\|x\|^2 + 1}, \frac{\|x\|^2 - 1}{\|x\|^2 + 1} \right)$$
and
$$\Theta(x)_i = \frac{x_i}{1 - x_{d+1}} \quad (1 \le i \le d), \qquad \Theta(x)_{d+1} = \frac{x_{d+1} + 1}{1 - x_{d+1}},$$
we get
$$\Theta(\tau_N(x))_i = \frac{2x_i}{\|x\|^2 + 1} \bigg/ \left( 1 - \frac{\|x\|^2 - 1}{\|x\|^2 + 1} \right) = \frac{2x_i}{2} = x_i \quad (1 \le i \le d)$$
and
$$\Theta(\tau_N(x))_{d+1} = \left( \frac{\|x\|^2 - 1}{\|x\|^2 + 1} + 1 \right) \bigg/ \left( 1 - \frac{\|x\|^2 - 1}{\|x\|^2 + 1} \right) = \frac{2\|x\|^2}{2} = \|x\|^2,$$
and since
$$l(x) = \left( x_1, \ldots, x_d, \sum_{i=1}^{d} x_i^2 \right) = (x_1, \ldots, x_d, \|x\|^2),$$
we conclude that Θ(τ_N(x)) = l(x), which proves the fifth equation.
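The identity l = Θ ∘ τ_N just established is easy to confirm numerically; here is a compact sketch of ours using the explicit formulas above:

    import numpy as np

    def tau_N(x):
        # Inverse stereographic projection E^d -> S^d - {N}.
        x = np.asarray(x, dtype=float)
        s = np.dot(x, x)
        return np.append(2 * x, s - 1) / (s + 1)

    def lift(x):
        # Lifting onto the paraboloid: l(x) = (x, ||x||^2).
        x = np.asarray(x, dtype=float)
        return np.append(x, np.dot(x, x))

    def Theta(x):
        # Theta(x)_i = x_i/(1 - x_{d+1});
        # Theta(x)_{d+1} = (x_{d+1} + 1)/(1 - x_{d+1}).
        x = np.asarray(x, dtype=float)
        return np.append(x[:-1], x[-1] + 1) / (1 - x[-1])

    x = np.array([0.7, -1.2, 2.0])
    assert np.allclose(Theta(tau_N(x)), lift(x))     # l = Theta o tau_N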
Given a projective complex, the notions of face, vertex, edge, cell, and facet are defined in the obvious way.
If K ⊆ 2^{R^d} is a polyhedral complex, then it is easy to check that the set {C(σ) | σ ∈ K} ⊆ 2^{R^{d+1}} (where C(σ) is the V-cone associated with σ defined in Section 5.5) is a fan.
Definition 12.11. Given a polyhedral complex K ⊆ 2^{R^d}, the projective complex
$$\tilde{K} = \{ P(C(\sigma)) \mid \sigma \in K \} \subseteq 2^{P^d}$$
is called the projective completion of K. See Figure 12.27.
Figure 12.27: The projective completion K̃, in the plane model of P^2, of the two-dimensional complex K in the xy-plane consisting of the teal triangle and the periwinkle quadrilateral.
The above is not the “standard” definition of the Delaunay triangulation of P, but it is equivalent to the definition given in Section 17.3.1 of Boissonnat and Yvinec [12], as we will prove shortly. It also has certain advantages over lifting onto a paraboloid, as we will explain. Furthermore, to be perfectly rigorous, we should define Del(P) by
$$\mathrm{Del}(P) = \varphi_{d+1}\bigl( \tilde{\pi}_N(\widetilde{DC}(P)) \cap 2^{U_{d+1}} \bigr).$$
Figure 12.28: The inverse stereographic projection of P = {(0, 0, 0), (1, 0, 0), (0, 1, 0),
(2, 2, 0)}. Note that τN ((0, 0, 0)) = (0, 0, −1) and τN ((2, 2, 0)) = (4/9, 4/9, 7/9).
Consequently, as D̃C(P) = P(C(DC(P))), we immediately check that
$$\mathrm{Del}(P) = \varphi_{d+1} \circ \tilde{\pi}_N(\widetilde{DC}(P)) = \varphi_{d+1} \circ \hat{\pi}_N(C(DC(P))) = \varphi_{d+1} \circ \hat{\pi}_N\bigl(\mathrm{cone}(\widehat{DC}(P))\bigr),$$
where \widehat{DC}(P) = \{\hat{u} \mid u \in DC(P)\} and \hat{u} = (u, 1).
This suggests defining the map π_N : (R^{d+1} − H_{d+1}) → R^d by
$$\pi_N = \varphi_{d+1} \circ \hat{\pi}_N \circ i_{d+2},$$
so that
$$\mathrm{Del}(P) = \varphi_{d+1} \circ \tilde{\pi}_N(\widetilde{DC}(P)) = \pi_N(DC(P)).$$
Figure 12.29: Two views of the Delaunay polytope D(P ), where P = {(0, 0, 0), (1, 0, 0),
(0, 1, 0), (2, 2, 0)}.
Figure 12.30: Two additional views of the Delaunay polytope D(P ) circumscribed by the
unit sphere.
Proof. Note that the intersection points of the Ox_d-axis with the supporting hyperplanes of all the upper-facing facets of P are strictly above the intersection points of the Ox_d-axis with the supporting hyperplanes of all the lower-facing facets. Suppose F is visible from c. Then F cannot be lower-facing, since otherwise, for any y ∈ F, the line through c and y would have to intersect some upper-facing facet, and F would not be visible from c, a contradiction.
Now, as P is the intersection of the closed half-spaces determined by the supporting hyperplanes of its facets, by the definition of an upper-facing facet, any point c on the Ox_d-axis that lies strictly above the intersection points of the Ox_d-axis with the supporting hyperplanes of all the upper-facing facets of P has the property that c and the interior of P are strictly separated by all these supporting hyperplanes. Therefore, all the upper-facing facets of P are visible from c. It follows that the facets visible from c are exactly the upper-facing facets, as claimed.
Figure 12.31: The lifted Delaunay complex DC(P ), where P = {(0, 0, 0), (1, 0, 0), (0, 1, 0),
(2, 2, 0)}.
conv(P ∪ {x}) not containing x is identical. Moreover, the set of facets of P not visible from
x is the set of facets of conv(P ∪ {x}) that do not contain x.
Proof. If dim(P) = d, then pick any c on the Ox_d-axis above the intersection points of the Ox_d-axis with the supporting hyperplanes of all the upper-facing facets of P. Then c is in
general position w.r.t. P in the sense that c and any d vertices of P do not lie in a common
hyperplane. Now, our result follows by Lemma 8.3.1 of Boissonnat and Yvinec [12].
Corollary 12.9. Given any polytope P ⊆ Ed with dim(P ) = d, there is a point c on the
Oxd -axis so that for all x on the Oxd -axis and above c, the lower-facing facets of P are
exactly the facets of conv(P ∪ {x}) that do not contain x. See Figure 12.36.
Definition 12.14. Given any set of points P = {p_1, \ldots, p_n} ⊆ E^d, let D′(P) denote the polyhedron conv(l(P)) + cone(e_{d+1}), and let D̃′(P) be the projective completion of D′(P). Also, let DC′(P) be the polyhedral complex consisting of the bounded facets of the polytope D′(P), and let D̃C′(P) be the projective completion of DC′(P). See Figure 12.37. The complex Del′(P) = ϕ_{d+1} ∘ p̃_{d+1}(D̃C′(P)) = p_{d+1}(DC′(P)) is the standard Delaunay complex of P, that is, the orthogonal projection of DC′(P) onto E^d. See Figure 12.38.
Intuitively, adding to conv(l(P)) all the vertical rays parallel to e_{d+1} based at points in conv(l(P)) washes out the upper-facing faces of conv(l(P)). Then the bounded facets of conv(l(P)) + cone(e_{d+1}) are precisely the lower-facing facets of conv(l(P)) (if dim(conv(P)) = d).
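This characterization is exactly how the standard Delaunay complex is computed in practice: lift the points to the paraboloid, take the convex hull, and keep the lower-facing facets. Here is a sketch of ours (using SciPy's ConvexHull, whose equations field stores outward facet normals; the function name is our own):

    import numpy as np
    from scipy.spatial import ConvexHull

    def delaunay_lower_facets(points):
        # Lift each p to (p, ||p||^2) and keep the facets of the hull
        # whose outward normal has a negative last coordinate
        # (the lower-facing ones).
        pts = np.asarray(points, dtype=float)
        lifted = np.hstack([pts, (pts ** 2).sum(axis=1, keepdims=True)])
        hull = ConvexHull(lifted)
        return [s for s, eq in zip(hull.simplices, hull.equations)
                if eq[-2] < 0]

    P = [(0, 0), (1, 0), (0, 1), (2, 2)]
    print(delaunay_lower_facets(P))
    # Two triangles {(0,0),(1,0),(0,1)} and {(1,0),(0,1),(2,2)} (as
    # index triples), matching Figures 12.32 and 12.38.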
The first of the two main theorems of this chapter is that the two notions of Delaunay
complexes coincide.
Figure 12.32: The Delaunay complex Del(P), for P = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 2, 0)}, obtained by applying π_N to the complex DC(P) of Figure 12.31. The bottom triangle projects onto the planar green triangle with vertices {(0, 0, 0), (1, 0, 0), (0, 1, 0)}, while the top triangle projects onto the aqua triangle with vertices {(1, 0, 0), (0, 1, 0), (2, 2, 0)}.
More precisely,
$$\theta(\tilde{D}(P)) = \tilde{D}'(P) \qquad \text{and} \qquad \theta(\widetilde{DC}(P)) = \widetilde{DC}'(P).$$
Furthermore,
$$\mathrm{Del}(P) = \mathrm{Del}'(P).$$
Therefore, the two notions of a Delaunay complex agree. If dim(conv(P)) = d, then the bounded facets of conv(l(P)) + cone(e_{d+1}) are precisely the lower-facing facets of conv(l(P)).
Proof. Recall that
$$D(P) = \mathrm{conv}(\tau_N(P) \cup \{N\}),$$
and that D̃(P) = P(C(D(P))) is the projective completion of D(P). If we write \widehat{\tau_N(P)} for \{\widehat{\tau_N(p_i)} \mid p_i \in P\}, then
$$C(D(P)) = \mathrm{cone}\bigl(\widehat{\tau_N(P)} \cup \{\hat{N}\}\bigr).$$
By definition, we have
$$\theta(\tilde{D}) = P(\hat{\theta}(C(D))).$$
Now, as θ̂ is linear,
$$\hat{\theta}\bigl(\mathrm{cone}(\widehat{\tau_N(P)} \cup \{\hat{N}\})\bigr) = \mathrm{cone}\bigl(\hat{\theta}(\widehat{\tau_N(P)}) \cup \{\hat{\theta}(\hat{N})\}\bigr).$$
Figure 12.33: In E^3, all three edges of the planar triangle in Figure (i.) are visible from x, while in Figure (ii.), only the top face of the solid rectangular box is visible from x.
We claim that
$$\hat{\theta}(C(D(P))) = C(D'(P)),$$
where
$$D'(P) = \mathrm{conv}(l(P)) + \mathrm{cone}(e_{d+1}).$$
Indeed,
$$\hat{\theta}(x_1, \ldots, x_{d+2}) = (x_1, \ldots, x_d,\; x_{d+1} + x_{d+2},\; x_{d+2} - x_{d+1}),$$
and for any p_i = (x_1, \ldots, x_d) ∈ P,
$$\widehat{\tau_N(p_i)} = \left( \frac{2x_1}{\sum_{i=1}^{d} x_i^2 + 1}, \ldots, \frac{2x_d}{\sum_{i=1}^{d} x_i^2 + 1}, \frac{\sum_{i=1}^{d} x_i^2 - 1}{\sum_{i=1}^{d} x_i^2 + 1}, 1 \right) = \frac{1}{\sum_{i=1}^{d} x_i^2 + 1} \left( 2x_1, \ldots, 2x_d,\; \sum_{i=1}^{d} x_i^2 - 1,\; \sum_{i=1}^{d} x_i^2 + 1 \right),$$
so we get
$$\hat{\theta}\bigl(\widehat{\tau_N(p_i)}\bigr) = \frac{2}{\sum_{i=1}^{d} x_i^2 + 1} \left( x_1, \ldots, x_d,\; \sum_{i=1}^{d} x_i^2,\; 1 \right) = \frac{2}{\sum_{i=1}^{d} x_i^2 + 1}\, \widehat{l(p_i)}.$$
Also, we have
$$\hat{\theta}(\hat{N}) = \hat{\theta}(0, \ldots, 0, 1, 1) = (0, \ldots, 0, 2, 0) = 2\hat{e}_{d+1},$$
Figure 12.34: Let P be the solid mint green triangular bipyramid. The three faces on the
top are upper-facing, while the three faces on the bottom are lower-facing.
Figure 12.35: Let P be the solid mint green triangular bipyramid. The intersections of the upper-facing facets with the z-axis are the two points p_1 and p_2. Since c is above p_1, none of the three lower-facing facets in the bottom of the bipyramid is visible from c.
Since positive scalar factors do not matter in a cone, it follows that θ̂(C(D(P))) = cone(l̂(P) ∪ {ê_{d+1}}) = C(D′(P)), and thus
$$\theta(\tilde{D}(P)) = \tilde{D}'(P).$$
Now, it is clear that the facets of conv(τ_N(P) ∪ {N}) that do not contain N are mapped to the bounded facets of conv(l(P)) + cone(e_{d+1}), since N goes to the point at infinity, so
$$\theta(\widetilde{DC}(P)) = \widetilde{DC}'(P).$$
Figure 12.36: Let P be the solid mint green triangular bipyramid. Then conv(P ∪ {x}) is
the larger solid bipyramid with gray top and mint green bottom. The lower-facing facets of
P are the three mint green faces on the bottom of both P and conv(P ∪ {x}).
Figure 12.37: Two views of the tetrahedron l(P ), where P = {(0, 0, 0), (1, 0, 0), (0, 1, 0),
(2, 2, 0)}.
We can also characterize when the Delaunay complex Del (P ) is simplicial. Recall that
we say that a set of points P ⊆ Ed is in general position if no d + 2 of the points in P belong
to a common (d − 1)-sphere.
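General position can be tested with the classical lifted determinant: d + 2 points of E^d lie on a common (d − 1)-sphere (or on a common hyperplane) iff the rows (p, ‖p‖², 1) are linearly dependent. A small sketch of ours:

    import numpy as np

    def on_common_sphere(points, eps=1e-9):
        # points: (d+2) points of E^d; the determinant below vanishes
        # iff they lie on a common (d-1)-sphere or a common hyperplane.
        pts = np.asarray(points, dtype=float)
        m = np.hstack([pts,
                       (pts ** 2).sum(axis=1, keepdims=True),
                       np.ones((len(pts), 1))])
        return abs(np.linalg.det(m)) < eps

    print(on_common_sphere([(1, 0), (0, 1), (-1, 0), (0, -1)]))  # True
    print(on_common_sphere([(0, 0), (1, 0), (0, 1), (2, 2)]))    # False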
Figure 12.38: The polyhedral complex DC′(P) for P = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 2, 0)}, and its orthogonal projection (in red) onto the xy-plane. Note that this orthogonal projection gives the same Delaunay complex as in Figure 12.32.
Remark: Even when the points in P are in general position, the Delaunay polytope D(P) may not be a simplicial polytope. For example, if d + 1 points belong to a hyperplane in E^d, then the lifted points belong to a hyperplane passing through the north pole, and these d + 1 lifted points together with N form a non-simplicial facet. For example, consider the polytope obtained by lifting our original d + 1 points on a hyperplane H plus one more point not in the hyperplane H; see Figure 12.39.
Figure 12.39: Let d = 2. The three points on the line H and the orange point p are in general position. However, when the points are lifted to S^2, the convex hull of the lifted points A, B, C, the lift of p, and N is a solid pyramid with quadrilateral base ABCN.
The polar dual D(P)^∗ is the intersection of the half-spaces determined by the tangent hyperplanes to S^d at the lifted points τ_N(p_i) (with p_i ∈ P) and at the north pole N. See Figures 12.40 and 12.41. It follows that the polyhedron D(P)^∗ has exactly one facet
containing the north pole. The Voronoi diagram of P is the result of applying the central
projection πN from N to the polyhedron D(P )∗ . Under this central projection, the facet
containing the north pole goes to infinity, so instead of considering the polar dual D(P )∗ we
should consider the polar dual DC(P )∗ of the lifted Delaunay complex DC(P ) which does
not have the north pole as a vertex. Then the Voronoi diagram of P is the result of applying
the central projection πN from N to the complex DC(P )∗ . See Figures 12.42 through 12.45.
The polyhedron DC(P )∗ still contains faces intersecting the tangent hyperplane to S d
at the north pole, so we can’t simply map it to the corresponding complex obtained from
the polar dual of the lifted points l(pi ) on the paraboloid P. However, using projective
completions, we can indeed define this mapping and recover the Voronoi diagram of P .
Definition 12.15. Given any set of points P = {p_1, \ldots, p_n} ⊆ E^d, the lifted Voronoi complex associated with P is the polar dual (w.r.t. S^d ⊆ R^{d+1}) V(P) = (DC(P))^∗ ⊆ R^{d+1} of the lifted Delaunay complex DC(P), and Ṽ(P) ⊆ P^{d+1} is the projective completion of V(P). See Figure 12.46. The polyhedral complex Vor(P) = ϕ_{d+1}(π̃_N(Ṽ(P)) ∩ 2^{U_{d+1}}) ⊆ 2^{E^d} is the Voronoi complex of P, or Voronoi diagram of P. See Figure 12.47.
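For the running example, the Voronoi vertices shown in Figure 12.45 can be cross-checked with an off-the-shelf implementation (a sketch of ours using SciPy):

    import numpy as np
    from scipy.spatial import Voronoi

    P = np.array([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)])
    print(Voronoi(P).vertices)
    # Up to ordering: [0.5, 0.5] and [7/6, 7/6], the circumcenters of
    # the two Delaunay triangles, as in Figures 12.44 and 12.45.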
Figure 12.40: The polar dual D(P)^∗, where P = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 2, 0)}. The inverse stereographic projection of each point of P, along with N, is depicted by a red dot, and each of the five teal faces of the unbounded wedge is tangent to the sphere at one of the red dots.
Definition 12.16. Given any set of points P = {p_1, \ldots, p_n} ⊆ E^d, let V′(P) = (DC′(P))^∗ be the polar dual (w.r.t. P ⊆ R^{d+1}) of the “standard” Delaunay complex of Definition 12.14, and let Ṽ′(P) = (D̃C′(P))^∗ ⊆ P^{d+1} be its projective completion. The standard Voronoi diagram is given by Vor′(P) = p_{d+1}(V′(P)); see Definition 17.2.7 of Boissonnat and Yvinec [12].
In order to prove our second main theorem, we need to show that θ has good behavior with respect to tangent spaces. Recall from Section 11.2 that for any point a = (a_1 : \cdots : a_{d+2}) ∈ P^{d+1}, the tangent hyperplane T_a S̃^d to the sphere S̃^d at a is given by the equation
$$\sum_{i=1}^{d+1} a_i x_i - a_{d+2} x_{d+2} = 0.$$
If we lift a point a ∈ E^d to S̃^d by τ̃_N ∘ ψ_{d+1} and to P̃ by l̃, it turns out that the image under θ of the tangent hyperplane to S̃^d at τ̃_N ∘ ψ_{d+1}(a) is the tangent hyperplane to P̃ at l̃(a).
Figure 12.41: The polar dual D(P )∗ , where P = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 2, 0)}, and
D(P ) is the Delaunay polytope of Figures 12.29 and 12.30.
(1)
$$\theta\bigl(T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d\bigr) = T_{\tilde{l}(a)} \tilde{P},$$
and (2), for every subset S ⊆ E^d,
$$\theta(\tilde{\tau}_N \circ \psi_{d+1}(S)) = \tilde{l}(S).$$
Proof. (1) Since l̃ = θ ∘ τ̃_N ∘ ψ_{d+1}, and we proved in Section 11.3 (Proposition 11.6) that projectivities preserve tangent spaces, we get
$$\theta\bigl(T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d\bigr) = T_{\theta \circ \tilde{\tau}_N \circ \psi_{d+1}(a)}\, \theta(\tilde{S}^d) = T_{\tilde{l}(a)} \tilde{P},$$
as claimed.
(2) This follows immediately from the equation l̃ = θ ∘ τ̃_N ∘ ψ_{d+1}.
Figure 12.42: Five views of the dual to the lifted Delaunay complex of Figure 12.31.
Given any two distinct points a = (a_1, \ldots, a_d) and b = (b_1, \ldots, b_d) in E^d, recall that the bisector hyperplane H_{a,b} of a and b is given by
$$(b_1 - a_1)x_1 + \cdots + (b_d - a_d)x_d = \frac{b_1^2 + \cdots + b_d^2}{2} - \frac{a_1^2 + \cdots + a_d^2}{2}.$$
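In coordinates, the bisector is the hyperplane w · x = c with w = b − a and c = (‖b‖² − ‖a‖²)/2; a short check of ours that a point satisfying this equation is indeed equidistant from a and b:

    import numpy as np

    def bisector(a, b):
        # Returns (w, c) with H_{a,b} = { x : w . x = c }.
        a, b = np.asarray(a, float), np.asarray(b, float)
        return b - a, (np.dot(b, b) - np.dot(a, a)) / 2

    a, b = np.array([0.0, 0.0]), np.array([2.0, 2.0])
    w, c = bisector(a, b)
    x = np.array([3.0, (c - 3.0 * w[0]) / w[1]])   # a point with w.x = c
    assert abs(np.linalg.norm(x - a) - np.linalg.norm(x - b)) < 1e-9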
Proposition 12.13. Given any two distinct points a = (a_1, \ldots, a_d) and b = (b_1, \ldots, b_d) in E^d, the image under the projection π̃_N of the intersection $T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d \cap T_{\tilde{\tau}_N \circ \psi_{d+1}(b)} \tilde{S}^d$ of the tangent hyperplanes at the lifted points τ̃_N ∘ ψ_{d+1}(a) and τ̃_N ∘ ψ_{d+1}(b) on the sphere S̃^d ⊆ P^{d+1} is the embedding of the bisector hyperplane H_{a,b} of a and b into P^d; that is,
$$\tilde{\pi}_N\bigl(T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d \cap T_{\tilde{\tau}_N \circ \psi_{d+1}(b)} \tilde{S}^d\bigr) = \psi_{d+1}(H_{a,b}).$$
Proof. In view of the geometric interpretation of π̃_N given earlier, we need to find the equation of the hyperplane H passing through the intersection of the tangent hyperplanes $T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d$ and $T_{\tilde{\tau}_N \circ \psi_{d+1}(b)} \tilde{S}^d$ and through the north pole; it is then geometrically obvious that
$$\tilde{\pi}_N\bigl(T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d \cap T_{\tilde{\tau}_N \circ \psi_{d+1}(b)} \tilde{S}^d\bigr) = H \cap H_{d+1}(0),$$
Figure 12.43: Four views of the relationship between DC(P )∗ of Figure 12.42 and the De-
launay complex of Figure 12.32.
where H_{d+1}(0) is the hyperplane (in P^{d+1}) of equation x_{d+1} = 0. Recall that $T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d$ and $T_{\tilde{\tau}_N \circ \psi_{d+1}(b)} \tilde{S}^d$ are given by the equations
$$E_1 = 2\sum_{i=1}^{d} a_i x_i + \left( \sum_{i=1}^{d} a_i^2 - 1 \right) x_{d+1} - \left( \sum_{i=1}^{d} a_i^2 + 1 \right) x_{d+2} = 0$$
and
$$E_2 = 2\sum_{i=1}^{d} b_i x_i + \left( \sum_{i=1}^{d} b_i^2 - 1 \right) x_{d+1} - \left( \sum_{i=1}^{d} b_i^2 + 1 \right) x_{d+2} = 0.$$
The hyperplanes passing through $T_{\tilde{\tau}_N \circ \psi_{d+1}(a)} \tilde{S}^d \cap T_{\tilde{\tau}_N \circ \psi_{d+1}(b)} \tilde{S}^d$ are given by an equation of the form
$$\lambda E_1 + \mu E_2 = 0,$$
with λ, µ ∈ R. Furthermore, in order to contain the north pole, this equation must vanish
for x = (0 : · · · : 0 : 1 : 1). But, observe that setting λ = −1 and µ = 1 gives a solution since
the corresponding equation is
$$2\sum_{i=1}^{d} (b_i - a_i)x_i + \left( \sum_{i=1}^{d} b_i^2 - \sum_{i=1}^{d} a_i^2 \right) x_{d+1} - \left( \sum_{i=1}^{d} b_i^2 - \sum_{i=1}^{d} a_i^2 \right) x_{d+2} = 0,$$
Figure 12.44: The Voronoi diagram for the Delaunay triangulation of the red dots P = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 2, 0)}. The green and black dots are projected, via π_N, onto the aqua dots. Note that E and F are mapped to infinity.
and it vanishes on (0 : ··· : 0 : 1 : 1). But then the intersection of H with the hyperplane H_{d+1}(0) of equation x_{d+1} = 0 is given by
$$2\sum_{i=1}^{d} (b_i - a_i)x_i - \left( \sum_{i=1}^{d} b_i^2 - \sum_{i=1}^{d} a_i^2 \right) x_{d+2} = 0.$$
Since we view Pd as the hyperplane Hd+1 (0) ⊆ Pd+1 and since the coordinates of points
in Hd+1 (0) are of the form (x1 : · · · : xd : 0 : xd+2 ), the above equation is equivalent to the
equation of ψd+1 (Ha,b ) in Pd in which xd+1 is replaced by xd+2 .
The second main theorem of this chapter is that
$$\theta(\tilde{V}(P)) = \tilde{V}'(P)$$
and
$$\mathrm{Vor}(P) = \mathrm{Vor}'(P).$$
Therefore, the two notions of Voronoi diagrams agree.
Figure 12.45: Another view of the Voronoi diagram (in blue) for the Delaunay triangulation (in red) of P = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 2, 0)}.
Proof. By definition,
$$\tilde{V}(P) = \widetilde{V(P)} = \widetilde{DC(P)^{*}},$$
and by Proposition 11.13,
$$\widetilde{DC(P)^{*}} = \widetilde{DC(P)}^{\,*} = (\widetilde{DC}(P))^{*},$$
so
$$\tilde{V}(P) = (\widetilde{DC}(P))^{*}.$$
By Proposition 11.11,
$$\theta(\tilde{V}(P)) = \theta\bigl((\widetilde{DC}(P))^{*}\bigr) = \bigl(\theta(\widetilde{DC}(P))\bigr)^{*},$$
Figure 12.47: The orthogonal projection of the dark green lines onto the lighter green lines (which lie in the xy-plane) provides the Voronoi diagram of P = {(0, 0, 0), (1, 0, 0), (2, 2, 0), (0, 1, 0)}. This is precisely the same Voronoi diagram as in Figure 12.45.
and since θ(D̃C(P)) = D̃C′(P) by the first main theorem, we get
$$\theta(\tilde{V}(P)) = (\widetilde{DC}'(P))^{*}.$$
Therefore,
$$\theta(\tilde{V}(P)) = \tilde{V}'(P),$$
as claimed. Since π̃_N = p̃_{d+1} ∘ θ by Proposition 12.6, we also get Vor(P) = Vor′(P).
We can also prove the proposition below, which shows directly that Vor(P) is the Voronoi diagram of P. Recall that Ṽ(P) is the projective completion of V(P). We observed in Section 11.2 (see page 302) that in the patch U_{d+1} there is a bijection between the faces of Ṽ(P) and the faces of V(P). Furthermore, the projective completion H̃ of every hyperplane H ⊆ R^{d+1} is also a hyperplane, and it is easy to see that if H is tangent to V(P), then H̃ is tangent to Ṽ(P).
$$F = (H \cap H_1)^{-} \cap \cdots \cap (H \cap H_{k_p})^{-},$$
Each H_i = T_{τ_N(p_i)} S^d is the tangent hyperplane to S^d at τ_N(p_i), for some p_i ∈ P. Now, by definition of the projective completion, the embedding V(P) → Ṽ(P) is given by a ↦ ψ_{d+2}(a). Thus every point p ∈ P is mapped to the point ψ_{d+2}(τ_N(p)) = τ̃_N(ψ_{d+1}(p)), and we also have H̃_i = T_{τ̃_N ∘ ψ_{d+1}(p_i)} S̃^d and H̃ = T_{τ̃_N ∘ ψ_{d+1}(p)} S̃^d. By Proposition 12.13,
$$\tilde{\pi}_N(\tilde{H} \cap \tilde{H}_i)$$
is the embedding of the bisector hyperplane of p and p_i in P^d, so the first part holds.
Since dim(conv(P)) = d, every vertex of V(P) must belong to at least d + 1 faces. Now assume that some vertex v ∈ V(P) = DC(P)^∗ belongs to k ≥ d + 2 facets of V(P). By polar duality, this means that the facet F dual to v has k ≥ d + 2 vertices τ_N(p_1), \ldots, τ_N(p_k) of DC(P). However, this contradicts Proposition 12.11. The fact that Vor(P) is a simple polyhedron was already proved in Proposition 12.1.
Note that if m = dim(conv(P )) < d, then the Voronoi complex V(P ) may not have any
vertices.
We conclude our presentation of Voronoi diagrams and Delaunay triangulations with a
short section on applications.
The first example is the nearest neighbors problem. There are actually two subproblems:
Nearest neighbor queries and all nearest neighbors.
The nearest neighbor queries problem is as follows. Given a set P of points and a query
point q, find the nearest neighbor(s) of q in P . This problem can be solved by computing the
Voronoi diagram of P and determining in which Voronoi region q falls. This last problem,
called point location, has been heavily studied (see O'Rourke [44]). The all nearest neighbors problem is as follows: given a set P of points, find the nearest neighbor(s) of every point in P. This problem can be solved by building a graph, the nearest neighbor graph (nng for short). The
nodes of this undirected graph are the points in P , and there is an arc from p to q iff p is
a nearest neighbor of q or vice versa. Then it can be shown that this graph is contained in
the Delaunay triangulation of P .
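The containment nng ⊆ Delaunay is easy to test experimentally; the following sketch of ours (using SciPy; the helper names are our own) builds both edge sets for random points and checks the inclusion:

    import numpy as np
    from scipy.spatial import Delaunay, distance_matrix

    def nng_edges(points):
        # Undirected nearest neighbor graph: {p, q} is an edge iff q is
        # a nearest neighbor of p or vice versa.
        D = distance_matrix(points, points)
        np.fill_diagonal(D, np.inf)
        nn = D.argmin(axis=1)
        return {tuple(sorted((i, int(nn[i])))) for i in range(len(points))}

    def delaunay_edges(points):
        tri = Delaunay(points)
        return {tuple(sorted((int(s[i]), int(s[j]))))
                for s in tri.simplices
                for i in range(3) for j in range(i + 1, 3)}

    pts = np.random.default_rng(0).random((30, 2))
    assert nng_edges(pts) <= delaunay_edges(pts)    # nng is a subgraph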
The second example is the largest empty circle. Some practical applications of this
problem are to locate a new store (to avoid competition), or to locate a nuclear plant as
far as possible from a set of towns. More precisely, the problem is as follows. Given a set
P of points, find a largest empty circle whose center is in the (closed) convex hull of P ,
empty in that it contains no points from P inside it, and largest in the sense that there is no
other circle with strictly larger radius. The Voronoi diagram of P can be used to solve this
problem. It can be shown that if the center p of a largest empty circle is strictly inside the
convex hull of P , then p coincides with a Voronoi vertex. However, not every Voronoi vertex
is a good candidate. It can also be shown that if the center p of a largest empty circle lies
on the boundary of the convex hull of P , then p lies on a Voronoi edge.
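Algorithmically, this means the search can be restricted to the Voronoi vertices (plus boundary candidates on Voronoi edges, which are omitted in this sketch of ours):

    import numpy as np
    from scipy.spatial import Voronoi, Delaunay

    def largest_empty_circle(points):
        # Candidates: Voronoi vertices lying inside conv(points); the
        # radius at a candidate center is the distance to the nearest site.
        pts = np.asarray(points, dtype=float)
        hull = Delaunay(pts)                 # used only for in-hull tests
        best_r, best_c = -1.0, None
        for v in Voronoi(pts).vertices:
            if hull.find_simplex(v) >= 0:    # v is in the convex hull
                r = np.linalg.norm(pts - v, axis=1).min()
                if r > best_r:
                    best_r, best_c = r, v
        return best_r, best_c

    pts = np.random.default_rng(1).random((40, 2))
    print(largest_empty_circle(pts))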
The third example is the minimum spanning tree. Given a graph G, a minimum spanning
tree of G is a subgraph of G that is a tree, contains every vertex of the graph G, and minimizes
the sum of the lengths of the tree edges. It can be shown that a minimum spanning tree
is a subgraph of the Delaunay triangulation of the vertices of the graph. This can be used
to improve algorithms for finding minimum spanning trees, for example Kruskal’s algorithm
(see O’Rourke [44]).
We conclude by mentioning that Voronoi diagrams have applications to motion planning.
For example, consider the problem of moving a disk on a plane while avoiding a set of
polygonal obstacles. If we “extend” the obstacles by the diameter of the disk, the problem
reduces to finding a collision-free path between two points in the extended obstacle space.
One needs to generalize the notion of a Voronoi diagram. Indeed, we need to define the
distance to an object, and medial curves (consisting of points equidistant to two objects)
may no longer be straight lines. A collision-free path with maximal clearance from the
obstacles can be found by moving along the edges of the generalized Voronoi diagram. This
is an active area of research in robotics. For more on this topic, see O’Rourke [44].
Bibliography

[1] P.S. Alexandrov. Combinatorial Topology. Dover, first edition, 1998. Three volumes bound as one.
[2] Noga Alon and Gil Kalai. A simple proof of the upper-bound theorem. European J.
Comb., 6:211–214, 1985.
[4] Alexander Barvinok. A Course in Convexity. GSM, Vol. 54. AMS, first edition, 2002.
[5] Margaret M. Bayer and Carl W. Lee. Combinatorial aspects of convex polytopes. In
P.M. Gruber and J.M. Wills, editors, Handbook of Convex Geometry, pages 485–534.
Elsevier Science, 1993.
[7] Marcel Berger. Géométrie 1. Nathan, 1990. English edition: Geometry 1, Universitext,
Springer Verlag.
[8] Marcel Berger. Géométrie 2. Nathan, 1990. English edition: Geometry 2, Universitext,
Springer Verlag.
[9] Dimitri P. Bertsekas. Convex Optimization Theory. Athena Scientific, first edition,
2009.
[10] Dimitris Bertsimas and John N. Tsitsiklis. Introduction to Linear Optimization. Athena
Scientific, third edition, 1997.
[11] Louis J. Billera and Anders Björner. Face numbers of polytopes and complexes. In J.E. Goodman and Joe O'Rourke, editors, Handbook of Discrete and Computational Geometry, pages 291–310. CRC Press, 1997.
[14] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University
Press, first edition, 2004.
[15] David A. Brannan, Matthew F. Esplen, and Jeremy J. Gray. Geometry. Cambridge
University Press, first edition, 1999.
[16] K.Q. Brown. Voronoi diagrams from convex hulls. Inform. Process. Lett., 9:223–228,
1979.
[17] Heinz Bruggesser and Peter Mani. Shellable decompositions of cells and spheres. Math.
Scand., 29:197–205, 1971.
[18] Vašek Chvátal. Linear Programming. W.H. Freeman, first edition, 1983.
[19] P.G. Ciarlet. Introduction to Numerical Matrix Analysis and Optimization. Cambridge
University Press, first edition, 1989. French edition: Masson, 1994.
[22] Peter Cromwell. Polyhedra. Cambridge University Press, first edition, 1994.
[23] G.L. Dirichlet. Über die Reduction der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. Journal für die reine und angewandte Mathematik, 40:209–227, 1850.
[24] H. Edelsbrunner and R. Seidel. Voronoi diagrams and arrangements. Discrete Compu-
tational Geometry, 1:25–44, 1986.
[25] Herbert Edelsbrunner. Geometry and Topology for Mesh Generation. Cambridge Uni-
versity Press, first edition, 2001.
[26] Günter Ewald. Combinatorial Convexity and Algebraic Geometry. GTM No. 168.
Springer Verlag, first edition, 1996.
[27] Jean Fresnel. Méthodes Modernes En Géométrie. Hermann, first edition, 1998.
[28] William Fulton. Introduction to Toric Varieties. Annals of Mathematical Studies, No.
131. Princeton University Press, 1997.
[29] Jean H. Gallier. Curves and Surfaces In Geometric Modeling: Theory And Algorithms.
Morgan Kaufmann, 1999.
[30] Jean H. Gallier. Geometric Methods and Applications, For Computer Science and En-
gineering. TAM, Vol. 38. Springer, second edition, 2011.
[31] E.N. Gilbert. Random subdivisions of space into crystals. Annals of Math. Stat.,
33:958–972, 1962.
[32] Jacob E. Goodman and Joseph O’Rourke. Handbook of Discrete and Computational
Geometry. CRC Press, second edition, 2004.
[33] R. Graham and F. Yao. A whirlwind tour of computational geometry. American Math-
ematical Monthly, 97(8):687–701, 1990.
[34] Donald T. Greenwood. Principles of Dynamics. Prentice Hall, second edition, 1988.
[35] Branko Grünbaum. Convex Polytopes. GTM No. 221. Springer Verlag, second edition,
2003.
[36] D. Hilbert and S. Cohn-Vossen. Geometry and the Imagination. Chelsea Publishing
Co., 1952.
[38] Serge Lang. Real and Functional Analysis. GTM 142. Springer Verlag, third edition,
1996.
[40] Jiří Matoušek. Lectures on Discrete Geometry. GTM No. 212. Springer Verlag, first edition, 2002.

[41] Jiří Matoušek and Bernd Gärtner. Understanding and Using Linear Programming. Universitext. Springer Verlag, first edition, 2007.
[42] Peter McMullen. The maximum number of faces of a convex polytope. Mathematika,
17:179–184, 1970.
[43] James R. Munkres. Elements of Algebraic Topology. Addison-Wesley, first edition, 1984.
[46] Dan Pedoe. Geometry, A Comprehensive Course. Dover, first edition, 1988.
[47] F.P. Preparata and M.I. Shamos. Computational Geometry: An Introduction. Springer
Verlag, first edition, 1988.
[48] J.-J. Risler. Mathematical Methods for CAD. Masson, first edition, 1992.
[51] Alexander Schrijver. Theory of Linear and Integer Programming. Wiley, first edition,
1999.
[52] Raimund Seidel. The upper-bound theorem for polytopes: an easy proof of its asymp-
totic version. Comput. Geometry: Theory and Applications, 5:115–116, 1995.
[53] Ernst Snapper and Robert J. Troyer. Metric Affine Geometry. Dover, first edition, 1989.
[54] John Stallings. Lectures on Polyhedral Topology. Tata Institute, first edition, 1967.
[55] Richard P. Stanley. The number of faces of simplicial polytopes and spheres. In J.E. Goodman, E. Lutwak, J. Malkevitch, and R. Pollack, editors, Discrete Geometry and Convexity, pages 212–223. Annals New York Academy of Sciences, 1985.
[57] J. Stolfi. Oriented Projective Geometry. Academic Press, first edition, 1991.
[58] Gilbert Strang. Linear Algebra and its Applications. Saunders HBJ, third edition, 1988.
[59] Bernd Sturmfels. Gröbner Bases and Convex Polytopes. ULS, Vol. 8. AMS, first edition,
1996.
[60] Rekha R. Thomas. Lectures in Geometric Combinatorics. STML, Vol. 33. AMS, first
edition, 2006.
[62] Claude Tisseron. Géométries affines, projectives, et euclidiennes. Hermann, first edition,
1994.
[65] Lucas Vienne. Présentation algébrique de la géométrie classique. Vuibert, first edition,
1996.
[66] M.G. Voronoi. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. J. Reine u. Angew. Math., 134:198–287, 1908.
[67] Günter Ziegler. Lectures on Polytopes. GTM No. 152. Springer Verlag, first edition, 1997.