Introduction To Linear Algebra
Rita Fioresi
University of Bologna
Marta Morigi
University of Bologna
First edition published 2022
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
Authorized translation from Italian language edition published by CEA – Casa Editrice Ambrosiana, A Division of
Zanichelli editore S.p.A.
Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright
holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowl-
edged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including pho-
tocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission
from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the
Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are
not available on CCC please contact mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for
identification and explanation without intent to infringe.
Typeset in LM Roman
by KnowledgeWorks Global Ltd.
Contents
Preface
Bibliography
Index
Preface
This textbook arises from the need to present the essential notions of linear algebra to
Physics, Engineering and Computer Science students. We strove to keep the abstraction
and rigor of this beautiful subject, and yet to give as much as possible the intuition
behind all of the mathematical concepts we introduce. Though we provide the full
proofs of all of our statements, we introduce each topic with many examples and
intuitive explanations to guide the students to a mature understanding.
This is not meant to be a comprehensive treatment of linear algebra but an
essential guide to its foundation and heart for those who want to understand the
basic concepts and the abstract mathematics behind the powerful tools it provides.
A short tour of our presentation goes as follows.
Chapters 1, 13 and 14 are independent of each other and of the rest of the
book. Chapter 1 and/or Chapter 13 can be effectively used as a motivational
introduction to linear algebra and vector spaces. Chapter 14 contains some further topics,
like the principle of induction and Euclid's algorithm, which are essential for
computer science students, but it can easily be omitted, being independent
of the rest of the book.
Chapters 2, 3, 4 and 5 introduce the basic notions concerning vector spaces and
linear maps, while Chapters 6, 7, 8 and 9 further develop the theory to reach the
question of eigenvalues and eigenvectors. A minimal course in linear algebra can end
after Chapter 6, or even better after Chapter 9. In the remaining Chapters 10, 11
and 12 we study scalar products, the Spectral Theorem and quadratic forms, very
important for physical and engineering applications.
(Chapter dependency diagram: Ch. 2, 4, 5 lead into Ch. 6, 7, 8, 9.)
CHAPTER 1
Introduction to Linear Systems
In this chapter, we discuss how to solve linear systems with real coefficients using a
method known as the Gaussian algorithm. Later on, we will also use this method to answer
other questions; at the same time, we will interpret linear systems as special cases of
a much deeper theory.
A linear equation in the unknowns x1, x2, . . . , xn is an equation of the form
$$a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b. \tag{1.1}$$
A linear system of m equations in the n unknowns x1, . . . , xn is a set of m linear equations:
$$\begin{cases} a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n = b_1 \\ a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n = b_2 \\ \quad \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n = b_m \end{cases} \tag{1.2}$$
The numbers a11, . . . , a1n, . . . , am1, . . . , amn are called the system coefficients, while
b1, . . . , bm are called the known terms. If bi = 0 for every i = 1, . . . , m, the system
is said to be homogeneous. A solution of the linear system (1.2) is an n-tuple
(s1, s2, . . . , sn) of numbers that satisfies all the system equations. For example, (1, 2)
is a solution of the linear system
$$\begin{cases} x_1 + x_2 = 3 \\ x_1 - x_2 = -1 \end{cases}$$
In this book, we will deal exclusively with linear systems with real coefficients,
that is, systems of the form (1.2) in which all the coefficients aij of the unknowns
and all known terms bi are real numbers. The solutions that we will find, therefore,
will always be ordered n-tuples of real numbers.
Given a linear system, we aim at answering the following questions:
1. Does the system admit solutions, i.e. is it compatible?
2. If so, how many solutions does it admit and what are they?
In certain cases, it is particularly easy to answer these questions. Let us see some
examples.
$$\begin{cases} x_1 + x_2 = 3 \\ x_1 + x_2 = 1 \end{cases}$$
It is immediate to observe that the sum of two real numbers cannot be simultaneously
equal to 3 and 1. Thus, the system does not admit solutions. In other words, when
the conditions assigned by the two equations of the system are incompatible, then
the system does not have solutions.
Consider now the system
$$\begin{cases} x_1 + x_2 = 3 \\ x_2 = -1 \end{cases}$$
Substituting in the first equation the value of x2 obtained from the second one, we
get: x1 = 3 − x2 = 3 + 1 = 4. The system is therefore compatible and admits a unique
solution: (4, −1). In this example, two variables are assigned (the unknowns x1 and
x2 ), and two conditions are given (the two equations of the system). These conditions
are compatible, that is they are not contradictory, and are “independent” meaning
that they cannot be obtained one from the other. In summary:
Two real variables along with two compatible conditions give one and only one
solution.
Finally, consider the system
$$\begin{cases} x_1 + x_2 = 3 \\ 2x_1 + 2x_2 = 6. \end{cases}$$
Unlike what happened in the previous example, here the conditions given by the two
equations are not “independent”, in the sense that the second equation is obtained
by multiplying the first by 2. The two equations give the same relation between
the variables x1 and x2 . Then, solving the linear system means simply solving the
equation x1 + x2 = 3. This equation certainly has solutions: for example, we saw in
the previous example that (4, −1) is a solution, but also (1, 2) or (0, 3) are solutions.
Exactly how many solutions are there? And how can we find them? In this case,
we have two variables and one condition on them. This means that one variable is free
to vary in the set of real numbers, which are infinitely many. The equation allows
us to express one variable, say x2, as a function of the other variable x1. The solutions
are all expressible in the form (x1, 3 − x1). By this we mean that the variable x1
can take any real value, and that for the equation x1 + x2 = 3 to
be satisfied we must have x2 = 3 − x1. A more explicit, but obviously equivalent, way
to describe the solutions is {(t, 3 − t) ∣ t ∈ R}. Of course, we could decide to vary
the variable x2 and express x1 as a function of x2 . In that case, we would give the
solutions in the form (3 − x2 , x2 ), or equivalently we say that the set of solutions is:
{(3 − s, s)∣s ∈ R} . In summary:
Two real variables along with one condition give infinitely many solutions.
Definition 1.1.5 Two linear systems are called equivalent if they have the same
solutions.
For example, the system considered above,
$$\begin{cases} x_1 + x_2 = 3 \\ 2x_1 + 2x_2 = 6, \end{cases}$$
is equivalent to the system consisting of the single equation x1 + x2 = 3.
1.2 MATRICES
Given two natural numbers m, n, an m × n matrix with real coefficients is a table of
mn real numbers arranged in m rows and n columns. For example:
$$\begin{pmatrix} 5 & -6 & 0 \\ 4 & 3 & -1 \end{pmatrix}$$
is a 2 × 3 matrix.
$$\begin{pmatrix} 1 & 0 \\ 2/3 & 3 \end{pmatrix}$$
is a 2 × 2 matrix. The number located on the i-th row and the j-th column of a matrix is called its (i, j) entry. For example, in the matrix
$$A = \begin{pmatrix} 5 & -6 & 0 \\ 4 & 3 & -1 \end{pmatrix}$$
the (1, 3) entry is 0, while the (2, 2) entry is 3. Of course, two m × n matrices A and
B are equal if their entries coincide, that is, if the (i, j) entry of A coincides with the
(i, j) entry of B, for every i = 1, . . . , m and for every j = 1, . . . , n.
Given a generic m × n matrix, we can write it synthetically as A = (a_{ij}), where
i = 1, . . . , m is the row index and j = 1, . . . , n is the column index.
We now define the product rows by columns of an m × s matrix A = (a_{ij}) and an
s × n matrix B = (b_{ij}) (note that the number of columns of A must equal the number
of rows of B). The (i, j) entry of the product is obtained by multiplying the i-th row
of A by the j-th column of B:
$$c_{ij} = \begin{pmatrix} a_{i1} & a_{i2} & \dots & a_{is} \end{pmatrix} \begin{pmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{sj} \end{pmatrix} = a_{i1} b_{1j} + a_{i2} b_{2j} + \dots + a_{is} b_{sj}.$$
For example, if
$$A = \begin{pmatrix} 1 & 0 & 3 & -1 \\ 0 & -2 & 2 & 1 \\ 1 & 0 & -1 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 1 \\ -3 & 5 \\ 1 & 0 \\ 2 & -1 \end{pmatrix},$$
then
c12 = 1 ⋅ 1 + 0 ⋅ 5 + 3 ⋅ 0 + (−1) ⋅ (−1) = 2
c31 = 1 ⋅ 0 + 0 ⋅ (−3) + (−1) ⋅ 1 + 0 ⋅ 2 = −1.
At this point we define the product of A and B as
$$C = AB = (c_{ij})_{i=1,\dots,m;\ j=1,\dots,n}.$$
$$C = AB = \begin{pmatrix} 1 & 0 & 3 & -1 \\ 0 & -2 & 2 & 1 \\ 1 & 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -3 & 5 \\ 1 & 0 \\ 2 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 10 & -11 \\ -1 & 1 \end{pmatrix}.$$
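To make the row-by-column rule concrete, here is a minimal sketch in Python (the function name and the plain list-of-lists representation are our own choices, not the book's notation); applied to the matrices A and B above, it reproduces the product C just computed.

```python
def mat_mul(A, B):
    """Row-by-column product: the (i, j) entry is a_i1*b_1j + ... + a_is*b_sj."""
    m, s, n = len(A), len(B), len(B[0])
    assert all(len(row) == s for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][h] * B[h][j] for h in range(s)) for j in range(n)]
            for i in range(m)]

A = [[1, 0, 3, -1],
     [0, -2, 2, 1],
     [1, 0, -1, 0]]
B = [[0, 1],
     [-3, 5],
     [1, 0],
     [2, -1]]

print(mat_mul(A, B))   # [[1, 2], [10, -11], [-1, 1]]
```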
The product rows by columns enjoys the following properties:
1. Associative, that is, (AB)C = A(BC), where A, B, C are matrices such that
the products that appear in the formula are defined.
2. Distributive, that is, A(B + C) = AB + AC, provided that the sum and product
operations that appear in the formula are defined.
Proof. The proof is a calculation and amounts to applying the definition. We show
only the associativity of the product. Consider A ∈ Mm,s (R), B ∈ Ms,r (R), C ∈
Mr,n (R). We observe that:
$$(AB)_{iu} = \sum_{h=1}^{s} a_{ih} b_{hu}, \qquad (BC)_{hj} = \sum_{u=1}^{r} b_{hu} c_{uj};$$
then
$$((AB)C)_{ij} = \sum_{u=1}^{r} (AB)_{iu} c_{uj} = \sum_{u=1}^{r} \Big( \sum_{h=1}^{s} a_{ih} b_{hu} \Big) c_{uj} = \sum_{u=1}^{r} \sum_{h=1}^{s} a_{ih} b_{hu} c_{uj} = \sum_{h=1}^{s} \sum_{u=1}^{r} a_{ih} b_{hu} c_{uj} = \sum_{h=1}^{s} a_{ih} \Big( \sum_{u=1}^{r} b_{hu} c_{uj} \Big) = \sum_{h=1}^{s} a_{ih} (BC)_{hj} = (A(BC))_{ij}.$$
Note that the product operation between matrices is not commutative. Even if
the product AB between two matrices A and B is defined, the product BA may not
be defined. For example if
$$A = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ -1 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}$$
we have that
$$AB = \begin{pmatrix} 1 & -1 \\ 2 & -1 \\ -1 & 1 \end{pmatrix},$$
while BA is not defined. Similarly if
$$A = \begin{pmatrix} 1 & 2 \\ 0 & -3 \end{pmatrix}, \qquad B = \begin{pmatrix} -1 & 1 \\ 0 & 2 \end{pmatrix}$$
we have that
$$AB = \begin{pmatrix} -1 & 5 \\ 0 & -6 \end{pmatrix}, \qquad BA = \begin{pmatrix} -1 & -5 \\ 0 & -6 \end{pmatrix}.$$
The product rows by columns allows us to write the linear system (1.2) in a very compact way, namely as
$$A\,x = b,$$
where A = (a_{ij}) is the m × n matrix which has as entries the coefficients of the
unknowns,
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$
is the column of the n unknowns, and
$$b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$
is the column of the m known terms. The matrix A = (a_{ij}) is called the incomplete matrix
associated with the system, while the matrix (A∣b), obtained from A by adding the
column b as its last column, is called the complete matrix associated with the system.
Consider, for example, the linear system
$$\begin{cases} \sqrt{2}\, x_1 + 2 x_2 - x_3 = 2 \\ x_1 - x_3 = 1 \end{cases}$$
in the unknowns x1, x2, x3.
Then the incomplete matrix and the complete matrix associated with the system
are, respectively:
$$A = \begin{pmatrix} \sqrt{2} & 2 & -1 \\ 1 & 0 & -1 \end{pmatrix} \quad \text{and} \quad (A\mid b) = \begin{pmatrix} \sqrt{2} & 2 & -1 & 2 \\ 1 & 0 & -1 & 1 \end{pmatrix}.$$
Using matrices is simply a more convenient way to write and deal with linear
systems. Each row of the complete matrix associated with a linear system is equivalent
to an equation in which the unknowns are implied.
Definition 1.3.2 A matrix is said to be in row echelon form or staircase form if the
following conditions are met:
(a) rows consisting of zeros, if any, are found at the bottom of the matrix;
(b) the first nonzero element of each (nonzero) row is located to the right of the
first nonzero element of the previous row.
For example, the matrix
$$A = \begin{pmatrix} 1 & -1 & -1 & 2 & -4 \\ 0 & 0 & -1 & 3 & 5 \\ 0 & 0 & 0 & 1/3 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
is a row echelon matrix because it satisfies conditions (a) and (b) of Definition 1.3.2.
On the contrary, the matrix
$$B = \begin{pmatrix} 2 & -1 & -1 & 2 & -4 \\ 0 & 1 & -1 & 3 & 5 \\ 0 & 2 & 0 & 1 & 1/5 \end{pmatrix}$$
is not in such a form because the first nonzero element of the third row is not located
to the right of the first nonzero element of the second row (but below it).
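Conditions (a) and (b) of Definition 1.3.2 are easy to check mechanically. A minimal sketch in Python (the function name and tolerance are our own choices); the first test matrix is the row echelon matrix A above, the second is a small matrix whose third row violates condition (b).

```python
def is_row_echelon(M, eps=1e-12):
    """Check conditions (a) and (b) of Definition 1.3.2."""
    def leading(row):
        # column index of the first nonzero element, or None for a zero row
        return next((j for j, x in enumerate(row) if abs(x) > eps), None)
    seen_zero_row, prev = False, -1
    for row in M:
        lead = leading(row)
        if lead is None:
            seen_zero_row = True            # (a) zero rows must stay at the bottom
        elif seen_zero_row or lead <= prev:
            return False                    # (b) pivots must move strictly to the right
        else:
            prev = lead
    return True

print(is_row_echelon([[1, -1, -1, 2, -4],
                      [0, 0, -1, 3, 5],
                      [0, 0, 0, 1/3, 1],
                      [0, 0, 0, 0, 0]]))                      # True
print(is_row_echelon([[2, -1, -1], [0, 1, -1], [0, 2, 0]]))   # False
```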
Definition 1.3.4 Let A be a row echelon matrix (by rows). We call pivot of A the
first nonzero element of each nonzero row of A. We call row rank of A, denoted by
rr(A), the number of nonzero rows, or equivalently, the number of its pivots.
For example, the pivots of the row echelon matrix
$$A = \begin{pmatrix} 1 & -1 & -1 & 2 & -4 \\ 0 & 0 & -1 & 3 & 5 \\ 0 & 0 & 0 & 1/3 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
considered above are 1, −1 and 1/3, so rr(A) = 3. Note that, since each nonzero row of a row echelon m × n matrix A contains exactly one pivot and the pivots lie in distinct columns, we always have
rr(A) ≤ m, (1.3)
rr(A) ≤ n. (1.4)
Definition 1.3.7 The linear system Ax = b is called row echelon if the matrix A is
row echelon.
We will explain how to quickly solve a linear system whose matrix is in row
echelon form.
Example 1.3.8 Consider the linear system in the unknowns x1, x2, x3, x4 whose complete matrix is
$$(A\mid b) = \begin{pmatrix} 4 & 2 & 3 & 4 & 1 \\ 0 & 1 & -2 & 0 & 2 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix},$$
which is in row echelon form and has row rank 4. Obviously also the incomplete
matrix A is in row echelon form, and we note that it also has row rank 4. The fact
that the matrix A is in row echelon form means that if you choose any of the system
equations, there exists one unknown which appears in that equation but not in the
following ones. The linear system can therefore be easily solved starting from the
bottom and proceeding with successive replacements, that is, from the last equation
and going back to the first. From the fourth equation we get x4 = 1; replacing x4 = 1
in the third equation we get x3 = x4 = 1. Replacing x3 = 1 in the second equation we
get x2 = 2 + 2 = 4. Finally, replacing x2 = 4 and x3 = x4 = 1 in the first equation, we
obtain x1 = (1/4)(1 − 2x2 − 3x3 − 4x4) = (1/4)(1 − 8 − 3 − 4) = −7/2. The system therefore has
only one solution: (−7/2, 4, 1, 1).
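The back substitution just performed is easy to mechanize. A minimal sketch in Python (the function name and the list-of-lists representation of (A∣b) are our own choices), applied to the complete matrix of this example:

```python
def back_substitution(Ab):
    """Solve a square row echelon system from its complete matrix (A|b),
    assuming one pivot per row (unique solution), working from the bottom up."""
    n = len(Ab)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(Ab[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Ab[i][n] - s) / Ab[i][i]
    return x

Ab = [[4, 2, 3, 4, 1],
      [0, 1, -2, 0, 2],
      [0, 0, 1, -1, 0],
      [0, 0, 0, 1, 1]]
print(back_substitution(Ab))   # [-3.5, 4.0, 1.0, 1.0], i.e. (-7/2, 4, 1, 1)
```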
Example 1.3.9 Consider the linear system in the unknowns x1 , x2 , x3 , x4 obtained
from that of the previous example by deleting the last equation:
$$\begin{cases} 4x_1 + 2x_2 + 3x_3 + 4x_4 = 1 \\ x_2 - 2x_3 = 2 \\ x_3 - x_4 = 0. \end{cases}$$
The complete matrix associated with the system is:
$$(A\mid b) = \begin{pmatrix} 4 & 2 & 3 & 4 & 1 \\ 0 & 1 & -2 & 0 & 2 \\ 0 & 0 & 1 & -1 & 0 \end{pmatrix}.$$
It is in row echelon form and has row rank 3. The incomplete matrix A is in row echelon
form and also has row rank 3. Of course, the solution (−7/2, 4, 1, 1) found in the
previous example continues to be a solution of the system, so that the system is
certainly compatible. However, how many solutions does the system have? Also
in this case, we can proceed from the bottom with subsequent replacements because,
as before, for each equation there exists one unknown which appears in that equation
but not in the following ones. From the last equation, we get x3 = x4. Replacing
x3 = x4 in the second equation, we get x2 = 2 + 2x3 = 2 + 2x4. Replacing x2 and x3
in the first equation we obtain x1 = (1/4)(1 − 2x2 − 3x3 − 4x4) = (1/4)(1 − 4 − 4x4 − 3x4 − 4x4) = (1/4)(−3 − 11x4). The system therefore has infinitely many solutions of the form
((1/4)(−3 − 11x4), 2 + 2x4, x4, x4), where the variable x4 is allowed to vary in the set of
real numbers.
What we have illustrated in Examples 1.3.8, 1.3.9 is a general fact, and we have
the following proposition.
Proposition 1.3.10 Let Ax = b be a linear system of m equations in n unknowns, whose complete matrix (A∣b) is in row echelon form. Then:
(a) the system admits solutions if and only if rr(A) = rr(A∣b);
(b) if rr(A) = rr(A∣b) = n, the system admits exactly one solution;
(c) if rr(A) = rr(A∣b) = k < n, the system admits infinitely many solutions, which
depend on n − k free variables.
Proof. First we observe that by deleting the column b from the matrix (A∣b) we still
have a matrix in row echelon form, thus also the incomplete matrix A associated
with the system is a matrix in such a form. Also, by deleting the column b from
the matrix (A∣b), the number of pivots can decrease by at most one. More precisely,
this happens if and only if the matrix A has at least one zero row, say the i-th,
whose known term bi is different from 0. Going back to the corresponding equation, we
see that it reads 0 = bi with bi ≠ 0, which evidently cannot be
satisfied. So if rr(A) ≠ rr(A∣b), the system does not admit solutions. Now suppose
that rr(A) = rr(A∣b) = n. This means that the number of pivots, i.e. the number of
“steps” , coincides with the number of unknowns, so the system consists of exactly n
equations, the unknown x1 appears only in the first equation, x2 appears only in the
first two equations, x3 appears only in the first three and so on. In particular the last
equation of the system contains only the unknown xn and establishes its value. By
substituting this value in the next-to-last equation, we obtain the value of the variable
xn−1 and so on, proceeding by subsequent replacements from below as in Example
1.3.8, we get the solution of the system, which is unique. If, instead, rr(A) = rr(A∣b) =
k < n it is possible, by proceeding from the bottom with subsequent replacements,
to express the k variables corresponding to the pivots of nonzero rows as a function
of the other n − k variables, which remain free to vary in the set of real numbers. In
this way, we get infinitely many solutions.
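Proposition 1.3.10 translates directly into a small decision procedure. A minimal sketch in Python (function names and tolerance are ours; the complete matrix is assumed to be already in row echelon form):

```python
def row_rank(M, eps=1e-12):
    """Number of nonzero rows of a matrix in row echelon form."""
    return sum(1 for row in M if any(abs(x) > eps for x in row))

def classify(Ab, n):
    """Apply Proposition 1.3.10 to an echelon complete matrix (A|b) in n unknowns."""
    rk_A = row_rank([row[:n] for row in Ab])
    rk_Ab = row_rank(Ab)
    if rk_A != rk_Ab:
        return "no solutions"
    if rk_A == n:
        return "exactly one solution"
    return "infinitely many solutions, %d free variable(s)" % (n - rk_A)

# Complete matrix of Example 1.3.9: rank 3, four unknowns.
Ab = [[4, 2, 3, 4, 1],
      [0, 1, -2, 0, 2],
      [0, 0, 1, -1, 0]]
print(classify(Ab, 4))   # infinitely many solutions, 1 free variable(s)
```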
Consider, for example, the linear system in the unknowns x1, x2, x3, x4:
$$\begin{cases} x_1 - x_2 + x_3 - x_4 = 1 \\ x_3 + \frac{1}{2} x_4 = 0 \end{cases}$$
whose complete matrix is
$$(A\mid b) = \begin{pmatrix} 1 & -1 & 1 & -1 & 1 \\ 0 & 0 & 1 & 1/2 & 0 \end{pmatrix}.$$
We note that rr(A) = rr(A∣b) = 2 so, by Proposition 1.3.10 (a), the system has
solutions. Since the number of variables is 4 > 2 , by Proposition 1.3.10 (c), the
system admits infinitely many solutions. Basically, we have four variables and two
conditions on them, thus two variables remain free to vary in the set of real numbers,
and we can express two variables as a function of the other two. Proceeding with
subsequent substitutions from the bottom we have:
$$x_3 = -\frac{1}{2} x_4$$
$$x_1 = x_2 - x_3 + x_4 + 1 = x_2 + \frac{1}{2} x_4 + x_4 + 1 = x_2 + \frac{3}{2} x_4 + 1.$$
The infinitely many solutions are of the form (x2 + (3/2)x4 + 1, x2, −(1/2)x4, x4), with
x2, x4 ∈ R.
We observe that we could choose to express the variable x4 as depending on the
variable x3 and, for example, the variable x2 as depending on the variables x1 and
x3 (x2 = x1 + 3x3 − 1). In other words, the choice of the free variables is not forced.
However, we can always choose as free those variables corresponding to the columns
of the matrix A containing no pivots and express the unknowns corresponding to the
columns that contain the pivots as a function of the others. For example, in this case
the pivots, both equal to 1, are located on the first and third column of the matrix
A and, as a first choice, we have left the variables x2 and x4 to be free and we have
expressed x1 and x3 as a function of x2 and x4 .
For example, the linear systems
$$\begin{cases} x_1 - x_2 = 1 \\ x_1 + x_2 = 2 \end{cases} \qquad\qquad \begin{cases} x_1 - x_2 = 1 \\ 2x_2 = 1 \end{cases}$$
are equivalent. In fact, we can easily see that in both cases the solution is (3/2, 1/2). We
note that the first equation is the same in the two systems and that the second system
can be obtained by substituting the second equation with the difference between the
second equation itself and the first equation:
2nd equation → 2nd equation − 1st equation.
How can we switch from one system to another one equivalent to it? For example,
by performing the following operations:
(a) exchange of the i-th equation with the j-th equation;
(b) multiplication of the i-th equation by a real number α ≠ 0;
(c) substitution of the i-th equation with the sum of the i-th equation and the j-th
equation multiplied by a real number α. In summary:
i-th equation → i-th equation + α (j-th equation).
It is straightforward to verify that operations (a) and (b) do not alter the system
solutions. As for operation (c), it is enough to observe that it involves only the i-th
and the j-th equations of the system, and that this pair of equations imposes the same
conditions on the unknowns before and after the operation. In terms of the complete
matrix associated with the system:
• Operation (a) is equivalent to exchanging the i-th row with the j-th row.
• Operation (b) is equivalent to multiplying the i-th row by the real number α ≠ 0.
• Operation (c) is equivalent to replacing the i-th row of the complete matrix as-
sociated to the system, with the sum of the i-th row and the j-th row multiplied
by a real number α.
Let us see a little better what we mean. Let (a_{i1} . . . a_{in} b_i) and (a_{j1} . . . a_{jn} b_j)
be, respectively, the i-th and the j-th row of the matrix (A∣b). Adding to the i-th row
the j-th row multiplied by a number α means taking the sum
$$(a_{i1} + \alpha a_{j1} \quad a_{i2} + \alpha a_{j2} \quad \dots \quad a_{in} + \alpha a_{jn} \quad b_i + \alpha b_j).$$
Because of the importance of these operations, we give them a name. The elementary operations on the rows of a matrix are:
(a) exchanging the i-th row with the j-th row;
(b) multiplying the i-th row by a real number α ≠ 0;
(c) replacing the i-th row with the sum of the i-th row and the j-th row multiplied
by a real number α.
Observation 1.4.3 We observe that the elementary operation (c) does not require
that the number α is not zero. In fact, if α = 0 the operation (c) amounts to leaving
the i-th row unchanged.
Given any matrix A = (aij ) we can turn it into a row echelon matrix by elementary
operations on the rows of A. This process is known as Gaussian reduction, and the
algorithm that is used is called Gaussian algorithm and operates as follows:
1. If a11 = 0, exchange the first row of A with a row whose first element is
nonzero. We denote this nonzero element by a. If the first element of each row
of A is zero, consider the matrix obtained by deleting the first column
and start again.
2. Check all the rows except the first, one after the other. If the first element of a
row is zero, leave the row unchanged. If the first element of a row, say the i-th
(i > 1), is equal to b ≠ 0, replace the i-th row with the sum of the i-th row and
the first row multiplied by −b/a.
3. At this point all the elements of the first column, except possibly the first, are
zero. Consider the matrix that is obtained by deleting the first row and the first
column of the matrix and start again from step one.
Example 1.4.4 Consider
$$A = \begin{pmatrix} 0 & 1 & -1 & 0 \\ 1 & 2 & 0 & 1 \\ 2 & -1 & 1 & 2 \end{pmatrix}.$$
Let us use the Gaussian algorithm to reduce A in row echelon form.
Since the entry in place (1, 1) is zero, we exchange the first with the second row,
obtaining the matrix:
$$\begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 2 & -1 & 1 & 2 \end{pmatrix}.$$
The first entry of the second row is zero, hence we leave this row unchanged. The
first element of the third row is 2, hence we substitute the third row with the sum of
the third row and the first one multiplied by −2. We thus obtain:
$$\begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 2 & -1 & 1 & 2 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & -5 & 1 & 0 \end{pmatrix}.$$
Every entry of the first column except for the first one is zero. We then consider the
matrix obtained by deleting the first row and first column:
$$\begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & -5 & 1 & 0 \end{pmatrix}$$
(the entries of the first row and of the first column no longer take part in the algorithm).
We apply again the Gaussian algorithm. The first entry of the first row is nonzero,
hence we leave the first row as it is. We substitute the second row with the sum of
the second row with the first multiplied by 5. We obtain:
$$\begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & -5 & 1 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & -4 & 0 \end{pmatrix}.$$
We have thus obtained the row echelon matrix
$$B = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & -4 & 0 \end{pmatrix}.$$
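The three steps of the Gaussian algorithm can be turned into a short program. A minimal sketch in Python (function name, tolerance and the copy-then-modify style are our own choices); applied to the matrix A of Example 1.4.4 it reproduces the row echelon form B obtained above.

```python
def gauss_reduce(M, eps=1e-12):
    """Reduce M (a list of rows) to row echelon form: swap to get a nonzero
    pivot, clear the entries below it, then repeat on the submatrix obtained
    by deleting the first row and column."""
    A = [row[:] for row in M]          # work on a copy
    rows, cols = len(A), len(A[0])
    r = c = 0                          # top-left corner of the current submatrix
    while r < rows and c < cols:
        # Step 1: look for a row (from r downwards) with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if abs(A[i][c]) > eps), None)
        if pivot is None:              # whole column is zero: delete the column
            c += 1
            continue
        A[r], A[pivot] = A[pivot], A[r]
        # Step 2: add -b/a times the pivot row to every lower row.
        a = A[r][c]
        for i in range(r + 1, rows):
            b = A[i][c]
            if abs(b) > eps:
                A[i] = [x + (-b / a) * y for x, y in zip(A[i], A[r])]
        # Step 3: start again on the submatrix (delete first row and column).
        r += 1
        c += 1
    return A

A = [[0, 1, -1, 0],
     [1, 2, 0, 1],
     [2, -1, 1, 2]]
for row in gauss_reduce(A):
    print(row)
# [1, 2, 0, 1]
# [0, 1, -1, 0]
# [0.0, 0.0, -4.0, 0.0]
```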
At this point, we are able to solve any linear system Ax = b. The complete matrix
associated with the system is (A∣b). Using the Gaussian algorithm we can reduce
(A∣b) to a row echelon matrix (A′∣b′). The linear system A′x = b′ is row echelon and
equivalent to the given one, so by Proposition 1.3.10 it admits either no solutions,
exactly one solution, or infinitely many solutions.
This means that there is no linear system with real coefficients with a finite number of
solutions greater than 1: when a linear system with real coefficients has 2 solutions,
then it has infinitely many.
Observation 1.4.5 In the Gaussian algorithm the operations are not “forced”. In
Example 1.4.4, for example, instead of exchanging the first with the second row, we
could exchange the first with the third row. In this way, completing the algorithm,
we would have obtained a different row echelon form of the matrix. From the point of
view of linear systems, this simply means that we get different row echelon systems,
but all equivalent to the initial system (and therefore equivalent to each other).
Example 1.4.6 We want to solve the following linear system of four equations in
five unknowns u, v, w, x, y:
$$\begin{cases} u + 2v + 3w + x + y = 4 \\ u + 2v + 3w + 2x + 3y = -2 \\ u + v + w + x + y = -2 \\ -3u - 5v - 7w - 4x - 5y = 0. \end{cases}$$
The complete matrix associated with the system is
$$(A\mid b) = \begin{pmatrix} 1 & 2 & 3 & 1 & 1 & 4 \\ 1 & 2 & 3 & 2 & 3 & -2 \\ 1 & 1 & 1 & 1 & 1 & -2 \\ -3 & -5 & -7 & -4 & -5 & 0 \end{pmatrix}.$$
We first reduce the matrix (A∣b) to a row echelon form using the Gaussian algorithm;
then we solve the linear system associated with the reduced matrix.
In this first example, we describe the steps of the Gaussian algorithm and at
the same time we describe the operations on the equations. The advantage of the
Gaussian algorithm is that we can forget the equations and the unknowns, focusing
only on matrices, so our present description is merely explanatory.
The entry (1, 1) is not zero, so we leave the first row unchanged. Then we perform
the following elementary row operations on (A∣b):
- 2nd row → 2nd row − 1st row;
- 3rd row → 3rd row − 1st row;
- 4th row → 4th row + 3 (1st row).
$$\begin{pmatrix} 1 & 2 & 3 & 1 & 1 & 4 \\ 0 & 0 & 0 & 1 & 2 & -6 \\ 0 & -1 & -2 & 0 & 0 & -6 \\ 0 & 1 & 2 & -1 & -2 & 12 \end{pmatrix} \iff \begin{cases} u + 2v + 3w + x + y = 4 \\ x + 2y = -6 \\ -v - 2w = -6 \\ v + 2w - x - 2y = 12 \end{cases}$$
Now we exchange the second row with the fourth row:
$$\begin{pmatrix} 1 & 2 & 3 & 1 & 1 & 4 \\ 0 & 1 & 2 & -1 & -2 & 12 \\ 0 & -1 & -2 & 0 & 0 & -6 \\ 0 & 0 & 0 & 1 & 2 & -6 \end{pmatrix} \iff \begin{cases} u + 2v + 3w + x + y = 4 \\ v + 2w - x - 2y = 12 \\ -v - 2w = -6 \\ x + 2y = -6 \end{cases}$$
Now we replace the third row with the sum of the third row and the second one:
$$\begin{pmatrix} 1 & 2 & 3 & 1 & 1 & 4 \\ 0 & 1 & 2 & -1 & -2 & 12 \\ 0 & 0 & 0 & -1 & -2 & 6 \\ 0 & 0 & 0 & 1 & 2 & -6 \end{pmatrix} \iff \begin{cases} u + 2v + 3w + x + y = 4 \\ v + 2w - x - 2y = 12 \\ -x - 2y = 6 \\ x + 2y = -6 \end{cases}$$
Finally, we substitute the fourth row with the sum of the fourth row and the third
one:
$$\begin{pmatrix} 1 & 2 & 3 & 1 & 1 & 4 \\ 0 & 1 & 2 & -1 & -2 & 12 \\ 0 & 0 & 0 & -1 & -2 & 6 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \iff \begin{cases} u + 2v + 3w + x + y = 4 \\ v + 2w - x - 2y = 12 \\ -x - 2y = 6 \\ 0 = 0 \end{cases}$$
The initial system is equivalent to the row echelon system that we have obtained, in
which the last equation has become an identity. The rank of the incomplete matrix
and the rank of the complete matrix of the row echelon matrix obtained coincide and
are equal to 3. The number of unknowns of the system is 5, then the system admits
infinitely many solutions that depend on 5 − 3 = 2 free variables. We solve the system
from the bottom using subsequent substitutions. Using the third equation we can
express the variable x as a function of y:
x = −2y − 6.
In the second equation, we replace x with its expression in terms of y and we obtain
v in terms of w:
v = 12 − 2w + x + 2y = 12 − 2w − 2y − 6 + 2y = −2w + 6.
Finally, in the first equation, we substitute x with its expression depending on y and v
with its expression depending on w, and we obtain u as a function of w and y:
u = 4 − 2v − 3w − x − y = 4 − 2(−2w + 6) − 3w + 2y + 6 − y = w + y − 2.
So the system has infinitely many solutions of the type (w + y − 2, −2w + 6, w, −2y −
6, y), that depend on two free variables, w, y ∈ R.
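As a quick sanity check (a sketch in Python; the helper name is our own), one can substitute the parametric solution back into the four original equations for a few values of the free variables w and y and verify that all residuals vanish:

```python
def residuals(w, y):
    """Plug (u, v, w, x, y) = (w + y - 2, -2w + 6, w, -2y - 6, y) into the system."""
    u, v, x = w + y - 2, -2 * w + 6, -2 * y - 6
    return [u + 2 * v + 3 * w + x + y - 4,
            u + 2 * v + 3 * w + 2 * x + 3 * y + 2,
            u + v + w + x + y + 2,
            -3 * u - 5 * v - 7 * w - 4 * x - 5 * y]

print(residuals(0, 0))      # [0, 0, 0, 0]
print(residuals(2.5, -7))   # [0.0, 0.0, 0.0, 0.0]
```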
1.5.1 Solve the following linear system in the unknowns x, y, z, t:
$$\begin{cases} x - 2y = 5 \\ -x + 2y - 3z = -2 \\ -2y + 3z - 4t = -11 \\ -3z + 4t = 15 \end{cases}$$
Solution. The complete matrix associated with the system is:
$$(A\mid b) = \begin{pmatrix} 1 & -2 & 0 & 0 & 5 \\ -1 & 2 & -3 & 0 & -2 \\ 0 & -2 & 3 & -4 & -11 \\ 0 & 0 & -3 & 4 & 15 \end{pmatrix}.$$
We reduce the matrix (A∣b) to row echelon form using the Gaussian algorithm:
$$\begin{pmatrix} 1 & -2 & 0 & 0 & 5 \\ 0 & 0 & -3 & 0 & 3 \\ 0 & -2 & 3 & -4 & -11 \\ 0 & 0 & -3 & 4 & 15 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 0 & 0 & 5 \\ 0 & -2 & 3 & -4 & -11 \\ 0 & 0 & -3 & 0 & 3 \\ 0 & 0 & -3 & 4 & 15 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 0 & 0 & 5 \\ 0 & -2 & 3 & -4 & -11 \\ 0 & 0 & -3 & 0 & 3 \\ 0 & 0 & 0 & 4 & 12 \end{pmatrix} = (A'\mid b').$$
The matrix in row echelon form is the complete matrix associated to the linear system:
$$\begin{cases} x - 2y = 5 \\ -2y + 3z - 4t = -11 \\ -3z = 3 \\ 4t = 12 \end{cases}$$
Note that rr(A′) = rr(A′∣b′) = 4. The system, therefore, admits a unique solution
that we can calculate by proceeding with subsequent substitutions from the bottom.
From the fourth equation we have
t = 3;
and from the third equation we have
z = −1;
replacing these values of t and z in the second equation we get
y = −2;
finally, by replacing the values of t, z, y in the first equation we get
x = 1.
So the system has only one solution: (1, −2, −1, 3).
1.5.2 Determine the solutions of the following linear system in the unknowns x, y,
z, t, depending on the real parameter α:
$$\begin{cases} x + y + z + t = 0 \\ x - z - t = -1 \\ x + 2y + (2\alpha + 1)z + 3t = 2\alpha - 1 \\ 3x + 4y + (3\alpha + 2)z + (\alpha + 5)t = 3\alpha - 1. \end{cases}$$
Solution. In this exercise, we are dealing with a linear system in which a real
parameter α appears. This means that as α varies in R we get infinitely many different
linear systems that we will solve by treating them as much as possible as one. The
procedure is always the same, we behave as if the parameter were a fixed real number.
First of all, then, let us write the complete matrix associated with the system:
$$(A\mid b) = \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 1 & 0 & -1 & -1 & -1 \\ 1 & 2 & 2\alpha+1 & 3 & 2\alpha-1 \\ 3 & 4 & 3\alpha+2 & \alpha+5 & 3\alpha-1 \end{pmatrix}$$
and reduce it to row echelon form using the Gaussian algorithm:
$$\to \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & -1 & -2 & -2 & -1 \\ 0 & 1 & 2\alpha & 2 & 2\alpha-1 \\ 0 & 1 & 3\alpha-1 & \alpha+2 & 3\alpha-1 \end{pmatrix} \to \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & -1 & -2 & -2 & -1 \\ 0 & 0 & 2\alpha-2 & 0 & 2\alpha-2 \\ 0 & 0 & 3\alpha-3 & \alpha & 3\alpha-2 \end{pmatrix} \to$$
$$\to \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & -1 & -2 & -2 & -1 \\ 0 & 0 & \alpha-1 & 0 & \alpha-1 \\ 0 & 0 & 3\alpha-3 & \alpha & 3\alpha-2 \end{pmatrix} \to \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & -1 & -2 & -2 & -1 \\ 0 & 0 & \alpha-1 & 0 & \alpha-1 \\ 0 & 0 & 0 & \alpha & 1 \end{pmatrix} = (A'\mid b').$$
We now have to determine what happens when the parameter α varies in the set of
real numbers. We must therefore answer the following questions:
1. For which values of α is the system compatible?
2. For the values of α for which the system is compatible, how many solutions do we
have, and can we determine them explicitly?
As we know, the answer is given by Proposition 1.3.10 (a): we must compare the rank
of A′ with the rank of (A′∣b′), and these ranks depend on the value of α.
More precisely, rr(A′) = rr(A′∣b′) = 4 for α ≠ 0, 1. In this case, the system has a
unique solution, which we obtain by back substitution from (A′∣b′): t = 1/α, z = 1,
y = −1 − 2/α, x = 1/α. For α = 0 we have
$$(A'\mid b') = \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & -1 & -2 & -2 & -1 \\ 0 & 0 & -1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix},$$
so rr(A′) = 3 while rr(A′∣b′) = 4 (the last row reads 0 = 1), and the system has no solutions.
For α = 1 we have
$$(A'\mid b') = \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & -1 & -2 & -2 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}$$
therefore rr(A′) = 3 = rr(A′∣b′), so the system has infinitely many solutions depending
on a free variable. As usual, we can determine such solutions by proceeding with
subsequent substitutions: (x3, −1 − 2x3, x3, 1), x3 ∈ R.
1.5.3 Consider the following linear system in the unknowns x1, x2, x3, depending on the real parameter α:
$$\Sigma_\alpha : \begin{cases} \alpha x_1 + (\alpha + 3)x_2 + 2\alpha x_3 = \alpha + 2 \\ \alpha x_1 + (2\alpha + 2)x_2 + 3\alpha x_3 = 2\alpha + 2 \\ 2\alpha x_1 + (\alpha + 7)x_2 + 4\alpha x_3 = 2\alpha + 4 \end{cases}$$
1. Determine the solutions of the linear system Σα as the parameter α ∈ R varies.
2. Determine, as α ∈ R varies, the solutions of the linear system in the unknowns x1, x2, x3, x4 obtained from Σα by adding to each equation the unknown x4 with coefficient zero.
Solution.
1. Consider the complete matrix (A∣b) associated with the linear system Σα :
$$\begin{pmatrix} \alpha & \alpha+3 & 2\alpha & \alpha+2 \\ \alpha & 2\alpha+2 & 3\alpha & 2\alpha+2 \\ 2\alpha & \alpha+7 & 4\alpha & 2\alpha+4 \end{pmatrix}$$
and reduce it to row echelon form:
$$\begin{pmatrix} \alpha & \alpha+3 & 2\alpha & \alpha+2 \\ \alpha & 2\alpha+2 & 3\alpha & 2\alpha+2 \\ 2\alpha & \alpha+7 & 4\alpha & 2\alpha+4 \end{pmatrix} \to \begin{pmatrix} \alpha & \alpha+3 & 2\alpha & \alpha+2 \\ 0 & \alpha-1 & \alpha & \alpha \\ 0 & -\alpha+1 & 0 & 0 \end{pmatrix} \to \begin{pmatrix} \alpha & \alpha+3 & 2\alpha & \alpha+2 \\ 0 & \alpha-1 & \alpha & \alpha \\ 0 & 0 & \alpha & \alpha \end{pmatrix} = (A'\mid b').$$
If α ≠ 0 and α ≠ 1, then rr(A′) = rr(A′∣b′) = 3, equal to the number of unknowns,
so the system has a unique solution, which we obtain by back substitution: x3 = 1,
x2 = 0, x1 = (2 − α)/α. If α = 0 we have
$$(A'\mid b') = \begin{pmatrix} 0 & 3 & 0 & 2 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},$$
which is not in row echelon form, but this can be fixed by replacing the second
row with the sum of the second row and the first row multiplied by 1/3:
$$\begin{pmatrix} 0 & 3 & 0 & 2 \\ 0 & 0 & 0 & 2/3 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
The given system is therefore equivalent to the linear system
$$\begin{cases} 3x_2 = 2 \\ 0 = 2/3 \end{cases}$$
which obviously has no solutions.
Finally, if α = 1 we have:
$$(A'\mid b') = \begin{pmatrix} 1 & 4 & 2 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}$$
which is not in row echelon form; replacing the third row with the difference between
the third row and the second row, we obtain
$$(A''\mid b'') = \begin{pmatrix} 1 & 4 & 2 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
We have rr(A′′) = rr(A′′∣b′′) = 2, and the system is equivalent to
$$\begin{cases} x_1 + 4x_2 + 2x_3 = 3 \\ x_3 = 1. \end{cases}$$
This system has infinitely many solutions depending on a parameter and the
set of solutions is: {(1 − 4x2 , x2 , 1) ∣ x2 ∈ R}.
2. Adding the unknown x4 means to add to the complete matrix (A∣b) associ-
ated with the system a column of zeros corresponding to the coefficients of x4 .
Therefore, by reducing (A∣b) to row echelon form, we get the matrix:
$$(A'\mid b') = \begin{pmatrix} \alpha & \alpha+3 & 2\alpha & 0 & \alpha+2 \\ 0 & \alpha-1 & \alpha & 0 & \alpha \\ 0 & 0 & \alpha & 0 & \alpha \end{pmatrix}.$$
Therefore, reasoning as above, but taking into account that in this case the
number of variables is 4, we obtain that:
for α ∈ R ∖ {0, 1} the system has infinitely many solutions, and they are of the
form ((2 − α)/α, 0, 1, x4) with x4 ∈ R;
for α = 0 the system has no solutions;
for α = 1 the system has infinitely many solutions, and they are of the form:
(1 − 4x2 , x2 , 1, x4 ), with x2 , x4 ∈ R.
1.5.4 Determine if there are values of the real parameter k such that the linear
system
$$\Sigma : \begin{cases} 2x_1 + x_2 - x_3 = 0 \\ 4x_1 - x_2 = 0 \\ x_1 + \frac{1}{2} x_2 - x_3 = -\frac{3}{2} \end{cases}$$
is equivalent to the linear system
$$\Pi_k : \begin{cases} x_1 + x_2 - \frac{1}{2} x_3 = 1 \\ 2x_1 - x_2 + x_3 = 2 \\ kx_1 - 4x_2 + 3x_3 = k \end{cases}$$
Solution. Two systems are equivalent if they have the same solutions. First we solve
the linear system Σ. The complete matrix associated with the system is:
$$(A\mid b) = \begin{pmatrix} 2 & 1 & -1 & 0 \\ 4 & -1 & 0 & 0 \\ 1 & 1/2 & -1 & -3/2 \end{pmatrix}.$$
Using the Gaussian algorithm we can reduce (A∣b) to row echelon form, obtaining
the matrix
$$(A'\mid b') = \begin{pmatrix} 2 & 1 & -1 & 0 \\ 0 & -3 & 2 & 0 \\ 0 & 0 & -1/2 & -3/2 \end{pmatrix}.$$
Hence rr(A′) = rr(A′∣b′) = 3 and Σ admits a unique solution, which we find by back
substitution: x3 = 3, x2 = 2, x1 = 1/2, i.e. the only solution of Σ is (1/2, 2, 3).
If Σ and Πk were equivalent, (1/2, 2, 3) would have to be a solution of Πk as well;
substituting it into the third equation of Πk gives k/2 − 8 + 9 = k, that is, k = 2.
It remains to check whether Π2 is equivalent to Σ. The complete matrix associated with Π2 is
$$\begin{pmatrix} 1 & 1 & -1/2 & 1 \\ 2 & -1 & 1 & 2 \\ 2 & -4 & 3 & 2 \end{pmatrix}.$$
By reducing this matrix to row echelon form, we obtain the matrix:
$$(A''\mid b'') = \begin{pmatrix} 1 & 1 & -1/2 & 1 \\ 0 & -3 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
Hence rr(A′′) = rr(A′′∣b′′) = 2 < 3, so Π2 admits infinitely many solutions, while Σ admits
exactly one. We can therefore conclude that there are no values of k such that the
systems Σ and Πk are equivalent to each other.
1.6.1 Solve the following linear systems in the unknowns x, y, z:
1.
$$\begin{cases} x + y + z = 1 \\ 2x + 2y + z = 1 \\ 3y + z = 1 \end{cases}$$
2.
$$\begin{cases} x - y + 4z = 10 \\ 3x + y + 5z = 15 \\ x + 3y - 3z = 6 \end{cases}$$
1.6.2 Solve the following linear systems in the unknowns x, y, z, w:
1.
$$\begin{cases} x - y + 2z - 3w = 0 \\ 2x + y - w = 3 \\ 2y + z + w = -3 \\ 2x + z = 0 \end{cases}$$
2.
$$\begin{cases} x + y - z + w = 0 \\ 2x - z - w = 0 \\ x - y - 2w = 0 \\ 3x + y - 2z = 0 \end{cases}$$
3.
$$\begin{cases} x + z = 7 \\ x + y = 2 \\ 4x + 12y + z = 1 \\ 5x + 6y + 2z = -1 \end{cases}$$
1.6.3 Consider the following linear system in the unknowns x, y, z, depending on the
real parameter k:
$$\begin{cases} x + 2y + kz = 0 \\ x + y = -1 \\ x + ky = -2. \end{cases}$$
Determine for which value of k the system admits solutions and, when possible,
determine such solutions.
1.6.4 Determine for which values of a ∈ R the following linear system, in the unknowns x, y, z, t, admits solutions and, when possible, determine them:
$$\begin{cases} 2x + y - z = 1 \\ -2x + 3z + t = 1 \\ 2x + 3y + (a^2 + 2a + 3)z + (a^2 - 2)t = a + 6 \\ y + 2(a^2 + 2a + 1)z + (3a^2 - 2a - 7)t = 3a + 4. \end{cases}$$
1.6.5 Given the linear system in the unknowns x, y, z:
$$\Sigma_{a,b} : \begin{cases} x + (2 + a)y = b \\ (2 + 2a)x + 3y - (b + 1)z = 1 + b \\ bx + by - (b + 4)z = b^2 + 3b. \end{cases}$$
2. Determine for which among the values of a, b found in the previous point the
system Σa,b is solvable and determine its solutions.
1.6.6 Determine for which values of the real parameter k the following system in
the unknowns x1 , x2 , x3 , x4 is compatible. Determine the system’s solutions when
possible.
$$\begin{cases} x_1 + 3x_2 + kx_3 + 2x_4 = k \\ x_1 + 6x_2 + kx_3 + 3x_4 = 2k + 1 \\ -x_1 - 3x_2 + (k - 2)x_4 = 1 - k \\ kx_3 + (2 - k)x_4 = 1 \end{cases}$$
CHAPTER 2
Vector Spaces
In this chapter, we want to introduce the main character of linear algebra: the vector
space. It is a generalization of concepts that we already know very well. The Cartesian
plane, the set of functions studied in calculus, the set of m × n matrices introduced in
the previous chapter, the set of polynomials, the set of real numbers are all examples
of sets that have a natural vector space structure. The vector space will also be the
right environment in which to read and interpret the results obtained in the previous
chapter. Before giving its precise definition, we see some concrete examples.
• admits neutral element, i.e. there exists a number, 1, such that 1a = a1 = a for
every a ∈ R;
One of the most important properties of real numbers, which distinguishes them
from other sets of numbers, is their continuity. Geometrically this means that we
think of real numbers as distributed along a straight line. More precisely, given a
line, a fixed point on it (origin) and a unit of measure, there is a correspondence
between the points on the line and the set of real numbers. In other words, every real
number uniquely identifies one and only one point on the line.
2.2 THE VECTOR SPACE R^N AND THE VECTOR SPACE OF MATRICES
We denote by the symbol R^2 the set of ordered pairs of real numbers:
R^2 = {(x, y) ∣ x, y ∈ R}.
The fact that the pairs are ordered means, for example, that the element (1, 2) is
different from the element (2, 1).
Once we fix a Cartesian coordinate system in a plane, there is a correspondence
between R^2 and the set of points in the plane. Attaching a Cartesian reference to the
plane means fixing two oriented perpendicular lines r and s and a unit of measure.
The point of intersection between the two straight lines is called the origin of the
reference system. Each point of the plane is then uniquely identified by a pair of real
numbers, called coordinates of the point, which indicate the distance of the point
from the line s and its distance from the line r, respectively. The student who is not
familiar with the Cartesian plane can think of the boardgame Battleship.
(Figure: the point of coordinates (2, 1) in the Cartesian plane.)
It is natural to try to extend the operations that we perform with numbers to the
pairs of real numbers. We then define the following:
• Sum:
+ : R^2 × R^2 → R^2, ((x, y), (x′, y′)) ↦ (x, y) + (x′, y′) = (x + x′, y + y′).
• Product by scalars:
· : R × R^2 → R^2, (λ, (x, y)) ↦ λ(x, y) = (λx, λy).
Note that in the product λ(x, y), we have omitted the symbol for the multiplica-
tion, just like we usually do when we multiply two real numbers.
We now try to interpret geometrically the operations defined in the case of R^2. To
this aim, we think of each element (a, b) of R^2 as the endpoint of a vector applied at
the origin, that is, as an oriented segment going out of the origin with the arrow
pointing to the point of coordinates (a, b). In this case, the way to add two elements
of R^2 coincides with the well-known parallelogram rule used to add up forces
in physics. This rule states that the sum of two vectors u⃗ and v⃗ applied at a point is
a vector applied at the same point with the direction and length of the diagonal of
the parallelogram having u⃗ and v⃗ as sides.
(Figure: the vectors u⃗ and v⃗, their sum u⃗ + v⃗ obtained with the parallelogram rule, and the multiple −(1/2)u⃗.)
Some students will remember from physics that there are other operations that
can be performed with vectors (the dot product, cross product, etc.), but now we are
not interested in them and we will not take them into account.
Almost immediately we can verify that the sum of elements of R^2 satisfies the
following properties:
1. commutative: (x, y) + (x′, y′) = (x′, y′) + (x, y) for every (x, y), (x′, y′) ∈ R^2;
2. associative: ((x, y) + (x′, y′)) + (x″, y″) = (x, y) + ((x′, y′) + (x″, y″)) for every (x, y), (x′, y′), (x″, y″) ∈ R^2;
3. existence of a neutral element: (0, 0) + (x, y) = (x, y) + (0, 0) = (x, y) for every (x, y) ∈ R^2;
4. existence of opposite: for every (x, y) ∈ R^2 there exists an element (a, b), called
opposite of (x, y), such that (a, b) + (x, y) = (x, y) + (a, b) = (0, 0). Obviously
we have (a, b) = (−x, −y);
5. 1(x, y) = (x, y) for every (x, y) ∈ R^2;
6. (λ + µ)(x, y) = λ(x, y) + µ(x, y), for every (x, y) ∈ R^2 and for all λ, µ ∈ R;
7. λ((x, y) + (x′, y′)) = λ(x, y) + λ(x′, y′), for every (x, y), (x′, y′) ∈ R^2 and for every λ ∈ R;
8. (λµ)(x, y) = λ(µ(x, y)), for every (x, y) ∈ R^2 and for all λ, µ ∈ R.
Of course, we can generalize what has been done for R^2 to the set of ordered
n-tuples of real numbers, for every n ∈ N:
R^n = {(x1, . . . , xn) ∣ x1, . . . , xn ∈ R}.
The sum and the product by real numbers are defined componentwise:
(x1, . . . , xn) + (y1, . . . , yn) = (x1 + y1, . . . , xn + yn),
λ(x1, . . . , xn) = (λx1, . . . , λxn).
With a little patience, we can check that properties 1 through 8 listed above hold for
the sum and the product by real numbers in R^n.
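A quick computational illustration (a minimal sketch in Python; tuples stand for elements of R^n and the function names are ours): the componentwise operations are immediate to define, and spot-checking some of the properties above on concrete vectors is easy. Of course this is not a proof, which rests on the properties of real numbers.

```python
def vec_add(u, v):
    """Componentwise sum in R^n."""
    return tuple(x + y for x, y in zip(u, v))

def scal_mul(lam, u):
    """Product of a vector of R^n by the scalar lam."""
    return tuple(lam * x for x in u)

u, v = (1.0, -2.0, 0.5), (3.0, 4.0, -1.0)
lam, mu = 2.0, -3.0

print(vec_add(u, v) == vec_add(v, u))                       # commutativity of the sum
print(scal_mul(lam + mu, u) ==
      vec_add(scal_mul(lam, u), scal_mul(mu, u)))           # (lam + mu)u = lam u + mu u
print(scal_mul(lam, vec_add(u, v)) ==
      vec_add(scal_mul(lam, u), scal_mul(lam, v)))          # lam(u + v) = lam u + lam v
```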
Let us examine another example. Consider the set of 2 × 2 matrices with real
coefficients:
$$M_2(R) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \ \middle|\ a, b, c, d \in R \right\}$$
introduced in Chapter 1. We define in M2(R) the following operations of sum + and
product · by a real number:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} = \begin{pmatrix} a + a' & b + b' \\ c + c' & d + d' \end{pmatrix}, \qquad \lambda \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \lambda a & \lambda b \\ \lambda c & \lambda d \end{pmatrix}.$$
Also in this case, with patience, it is possible to verify properties 1 through 8. Students
are strongly encouraged to do so. For example, we prove the commutativity of +:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} = \begin{pmatrix} a+a' & b+b' \\ c+c' & d+d' \end{pmatrix} = \begin{pmatrix} a'+a & b'+b \\ c'+c & d'+d \end{pmatrix} = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} + \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
Once again it all depends on the properties of the real number operations. This is
precisely the strategy to verify the properties 1 through 8 in M2 (R) and more in
general in any vector space.
It is clear that, similarly to what was done for 2 × 2 matrices, it is possible to
define a sum and a product by real numbers also in the set of m × n matrices, and
with some patience one can show that such operations satisfy all the properties listed
above. So we give the definition of the operations of sum and product by real numbers
in Mm,n (R):
Sum:
$$\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} + \begin{pmatrix} a'_{11} & a'_{12} & \dots & a'_{1n} \\ a'_{21} & a'_{22} & \dots & a'_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a'_{m1} & a'_{m2} & \dots & a'_{mn} \end{pmatrix} = \begin{pmatrix} a_{11}+a'_{11} & a_{12}+a'_{12} & \dots & a_{1n}+a'_{1n} \\ a_{21}+a'_{21} & a_{22}+a'_{22} & \dots & a_{2n}+a'_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}+a'_{m1} & a_{m2}+a'_{m2} & \dots & a_{mn}+a'_{mn} \end{pmatrix}.$$
Product by scalars: λ(a_{ij}) = (λ a_{ij}), that is, every entry of the matrix is multiplied by λ.
Note that the neutral element of the sum in Mm,n (R) is the zero matrix
$$\begin{pmatrix} 0 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 0 \end{pmatrix}.$$
The sets described so far with the sum and product operations are all examples
of vector spaces.
Definition 2.3.1 A real vector space is a set V equipped with two operations called,
respectively, sum and multiplication by scalars:
+ : V × V ⟶ V,    · : R × V ⟶ V,
satisfying the following properties:
1. the sum is commutative: u + v = v + u for every u, v ∈ V;
2. the sum is associative: (u + v) + w = u + (v + w) for every u, v, w ∈ V;
3. there exists a neutral element for the sum, i.e. there is 0 ∈ V such that 0 + u =
u + 0 = u for each u in V;
4. each element of V has an opposite, that is, for every u ∈ V there exists a vector
a such that a + u = u + a = 0;
5. 1u = u for every u ∈ V;
6. (λ + µ)u = λu + µu for every u ∈ V and for all λ, µ ∈ R;
7. λ(u + v) = λu + λv for every u, v ∈ V and for every λ ∈ R;
8. (λµ)u = λ(µu) for every u ∈ V and for all λ, µ ∈ R.
The elements of a vector space are called vectors, while the real numbers are called
scalars. The neutral element of the sum in V is called zero vector. To distinguish
vectors from numbers we will indicate the vectors in bold.
In the previous section, we have seen that R^n and Mm,n(R) are real vector spaces.
Consider now the set of vectors of the space applied at a point, with the sum given by
the parallelogram rule and the product by a scalar α ∈ R defined as follows: α ⋅ v⃗ is the
vector lying on the same line as v⃗, whose length is the length of v⃗ multiplied by the
factor ∣α∣ (where ∣α∣ is the absolute value of α) and whose direction is the same as that
of v⃗ or opposite to it, depending on whether the sign of α is positive or negative.
In this way, the set of vectors of the space applied at a point turns out to be a
real vector space.
Example 2.3.3 The functions. Let F(R) be the set of functions f : R ⟶ R with
two operations: the sum, defined by (f + g)(x) = f(x) + g(x), and the product by a
scalar λ ∈ R, defined by (λf)(x) = λ f(x), for every x ∈ R. With these operations,
F(R) is a real vector space.
We now show some useful properties that are valid for any vector space:
Proposition 2.3.5 Let V be a vector space. Then we have the following properties:
i) The zero vector is unique and will be denoted by 0V .
ii) If u is a vector of V , its opposite is unique and it will be denoted by −u.
iii) λ0V = 0V , for each scalar λ ∈ R.
iv) 0u = 0V for each u ∈ V (note the different sense of zero in the first and second
member!).
v) If λu = 0V , then it is λ = 0 or u = 0V .
vi) (−λ)u = λ(−u) = −λu.
Proof. Notice that, while the properties 1 through 8 in Definition 2.3.1 are given, the
statements i) through vi), though they may appear obvious to the student, must be
proven, and they are a direct consequence of Definition 2.3.1.
i) If 0′ is another vector that fulfills property 3 of Definition 2.3.1, we have that
0′ + 0V = 0V (here we take u = 0V). Furthermore, using the fact that 0V satisfies
property 3 and taking u = 0′, we also have 0′ + 0V = 0′. It follows that 0V = 0′ + 0V = 0′.
ii) If a and a′ are both opposites of u, by property 4 we have in particular that
a + u = 0V and u + a′ = 0V. Then a = a + 0V = a + (u + a′) = (a + u) + a′ = 0V + a′ = a′,
so the opposite of u is unique.
Definition 2.3.6 The trivial vector space, denoted with {0V }, is a vector space
consisting only of the zero vector.
0V + 0V = 0V
Observation 2.3.8 Let us think about the definition of R-vector space. First of
all we observe that, by definition, a real vector space can never be empty. In fact, it
must contain at least the zero vector that is the neutral element of the sum. It could
happen that a vector space contains only the zero vector; In this case, it is called
trivial.
Now suppose that V is a nontrivial vector space, i.e. that it contains at least one
vector v ≠ 0V . How many elements does V contain? As we can multiply by real
numbers, which are infinitely many, V will contain all the infinitely many multiples
of v, that is, all the vectors of the form λv, for every λ ∈ R.
Since λ varies in the set of real numbers we get infinitely many different elements
of V . To rigorously prove this statement, we have to show that if λ and µ are distinct
real numbers, that is, λ ≠ µ, and v ∈ V is a nonzero vector, then λv ≠ µv. In fact,
if not, we would have:
λv = µv ↔ (λ − µ)v = 0V
with λ − µ ≠ 0 and v ≠ 0V , which would contradict property (v) of Proposition 2.3.5.
2.4 SUBSPACES
How can we recognize and describe a vector space? How can we single out a subset of a
vector space with the same characteristics? To answer these questions it is necessary
to introduce the definition of subspace.
Definition 2.4.1 A subset S of a vector space V is called a subspace of V if the following conditions hold:
1) S is not empty;
2) S is closed with respect to the sum: for every s1, s2 ∈ S, we have s1 + s2 ∈ S;
3) S is closed with respect to the product by scalars: for every s ∈ S and every λ ∈ R, we have λs ∈ S.
Example 2.4.2 Consider the subset X = {(x, 0) ∣ x ∈ R} of R^2. The set X is:
1) not empty: it contains infinitely many pairs of real numbers (x, 0);
2) closed under the sum: given two elements (x1 , 0), (x2 , 0) in X, the sum of
(x1 , 0) + (x2 , 0) = (x1 + x2 , 0) still belongs to X;
3) closed with respect to the product by scalars: given any real number α and any
element (x, 0) ∈ X, the product α(x, 0) = (αx, 0) belongs to X.
Geometrically, after setting a Cartesian coordinate system in R^2, we can identify
the set X with the x-axis. Then adding two vectors lying on the x-axis, or
multiplying one of them by a scalar, we still get a vector that lies on the x-axis.
More generally, if a is a real number, set Wa = {(x, y) ∈ R^2 ∣ y = ax}. We observe
first that (0, 0) ∈ Wa. Furthermore, given two elements (x1, ax1) and (x2, ax2) in Wa,
their sum
(x1, ax1) + (x2, ax2) = (x1 + x2, a(x1 + x2))
belongs to Wa, i.e. Wa is closed with respect to the sum. Likewise, given λ ∈ R and
(x1, ax1) ∈ Wa, we have:
λ(x1, ax1) = (λx1, λax1) = (λx1, a(λx1)),
which again belongs to Wa. Hence Wa, the line through the origin of equation y = ax,
is a subspace of R^2. On the contrary, if a line does not pass through the origin, say
S = {(x, y) ∈ R^2 ∣ y = ax + q} with q ≠ 0, we can immediately
say that S is not a subspace of R^2 as it does not contain 0_{R^2} = (0, 0). Therefore, not
all the straight lines of the plane give subspaces of R^2, but only those through (0, 0).
Example 2.4.4 The set
$$X = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_2(R) \ \middle|\ b = 1 \right\} \subseteq M_2(R)$$
is not a subspace of M2(R), because the zero matrix $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ does not belong to X.
Observation 2.4.5 If S is a subspace of a vector space V, the condition 0V ∈ S is
necessary, but not sufficient, for S to be a subspace of V . We give a counterexample.
Example 2.4.6 Let S = {(x, y, z) ∈ R^3 ∣ xy = z}. Despite the fact that the set S contains
the zero vector (0, 0, 0) of R^3, S is not a subspace of R^3 because it is not closed with
respect to the sum. Indeed the vectors v = (1, 1, 1) and w = (−1, −1, 1) belong to S,
since they satisfy the equation xy = z, but their sum v + w = (1, 1, 1) + (−1, −1, 1) =
(0, 0, 2) does not belong to S since 0 ⋅ 0 ≠ 2.
Observation 2.4.8 The examples given so far highlight two different types of rea-
soning. To prove that a subset S of a vector space V is a subspace of V , we must prove
it satisfies properties 1), 2) and 3) of Definition 2.4.1. These properties must apply
always, that is, for each pair of vectors in S (property 2) and for all real numbers
(property 3).
On the contrary, to prove that a subset S of a vector space V is not a subspace of
V, it is enough to show that one of properties 1), 2) and 3) of Definition 2.4.1 fails,
i.e. that S = ∅, or that there exists a pair of vectors of S whose sum is not in S, or that
there is a vector of S and a scalar whose product is not in S.
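The two kinds of reasoning just described can be complemented by a quick numerical experiment: testing closure on a handful of sample vectors can only disprove, never prove, that a subset is a subspace, but it is a handy way to hunt for a counterexample. A minimal sketch in Python (the membership predicates and the sample points are our own choices), applied to the sets of Examples 2.4.2 and 2.4.6:

```python
import random

def closure_counterexample(members, belongs, trials=200, seed=0):
    """Try random sums and scalar multiples of the given member vectors;
    return a counterexample to closure if one is found, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        u, v = rng.choice(members), rng.choice(members)
        s = tuple(x + y for x, y in zip(u, v))
        if not belongs(s):
            return ("sum", u, v, s)
        lam = rng.uniform(-5, 5)
        w = tuple(lam * x for x in u)
        if not belongs(w):
            return ("scalar", lam, u, w)
    return None

# X = {(x, 0)} of Example 2.4.2: no counterexample is found.
X_members = [(1.0, 0.0), (-2.5, 0.0), (7.0, 0.0)]
print(closure_counterexample(X_members, lambda p: p[1] == 0.0))        # None

# S = {(x, y, z) | xy = z} of Example 2.4.6: closure under the sum fails.
S_members = [(1.0, 1.0, 1.0), (-1.0, -1.0, 1.0), (0.0, 0.0, 0.0)]
print(closure_counterexample(S_members, lambda p: p[0] * p[1] == p[2]))
# e.g. ('sum', (1.0, 1.0, 1.0), (-1.0, -1.0, 1.0), (0.0, 0.0, 2.0))
```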
Example 2.4.9 In the vector space R[x] of polynomials with real coefficients in a
variable x, consider the subset R2[x] consisting of polynomials of degree less than or
equal to 2:
R2[x] = {p(x) = a + bx + cx^2 ∣ a, b, c ∈ R}.
Then R2[x] is a subspace of R[x]. In fact, adding two polynomials of degree less
than or equal to 2, we obtain a polynomial of degree less than or equal to 2:
(a + bx + cx^2) + (a′ + b′x + c′x^2) = (a + a′) + (b + b′)x + (c + c′)x^2,
and multiplying a polynomial of degree less than or equal to 2 by a real number λ we
get λ(a + bx + cx^2) = λa + λbx + λcx^2, which again has degree less than or equal to 2.
In other words, R2[x] (which is certainly not empty) is closed with respect to the
sum and the product by scalars, therefore it is a subspace of R[x].
Example 2.4.10 In the vector space R[x] of polynomials with real coefficients in a
variable x, we consider the subset S = {p(x) = a + bx + cx^2 ∈ R[x] ∣ ac = 0}. The
subset S contains, for example, the polynomial identically zero and the monomial x,
therefore it is different from the empty set. However, it is not closed with respect
to the sum defined in R[x]. In fact, S contains the polynomials p(x) = 1 + x and
q(x) = x + x^2 but does not contain their sum: p(x) + q(x) = 1 + 2x + x^2.
Since a subspace of a vector space V is primarily a subset of V , it is natural to ask
what happens when carrying out the operation of set theoretic union and intersection
of two (or more) subspaces of V .
Example 2.4.11 Consider the set W = X ∪ Y with
X = {(x, y) ∈ R^2 ∣ y = 0} and Y = {(x, y) ∈ R^2 ∣ x = 0}.
In Example 2.4.2 we showed that X is a subspace of R^2. In a similar way, one can
show that Y is a subspace of R^2. However, their union W is not a subspace of R^2,
because it is not closed with respect to the sum: in fact, the vector (1, 0) belongs to
W because it is an element of X, and the vector (0, 1) belongs to W because it is an
element of Y. However their sum (1, 0) + (0, 1) = (1, 1) belongs neither to X nor to
Y.
The geometric reasoning is also simple: we can think of X and Y , respectively, as
the x-axis and the y-axis in a Cartesian reference in the plane, and W as the union
of the two axes. It is clear from the parallelogram rule that the sum of a vector that
lies on the x-axis and a vector that lies on the y-axis will be outside of the two lines,
hence W is not a subspace.
We observe that the set W can be described as follows:
W = {(x, y) ∈ R^2 ∣ xy = 0}.
In fact, since x and y are real numbers, their product is zero if and only if at least one
of the two factors is zero.
The example above shows that, in general, the union of two subspaces of a vector
space V is not a subspace. More precisely, we have the following proposition.
Proposition 2.4.12 Let W1 , W2 be two subspaces of a vector space V . Then W1 ∪W2
is a subspace if and only if W1 ⊆ W2 or W2 ⊆ W1 .
Proof. “⇐” If W1 ⊆ W2 (resp. if W2 ⊆ W1 ) then W1 ∪W2 = W2 (resp. W1 ∪W2 = W1 ),
which is a subspace by hypothesis.
“⇒” To prove this implication, we show that, if W1 ⊈ W2 and W2 ⊈ W1, then
W1 ∪ W2 is not a subspace of V. As W1 ⊈ W2, there exists a vector v1 ∈ W1 ∖ W2;
similarly, as W2 ⊈ W1, there exists a vector v2 ∈ W2 ∖ W1. If W1 ∪ W2 were a
subspace, then v = v1 + v2 should be an element of W1 ∪ W2 as it is the sum of an
element of W1 and one of W2 . If v were in W1 , then also v2 = v − v1 would belong to
W1 , but we had chosen v2 ∈ W2 \W1 . Similarly, if v were in W2 , then also v1 = v−v2
would belong to W2 , but we had chosen v1 ∈ W1 \ W2 . So v ∉ W1 ∪ W2 .
With the intersection of two subspaces we have fewer problems.
Proposition 2.4.13 The intersection S1 ∩ S2 of two subspaces S1 and S2 of a vector
space V is a subspace of V .
Proof. We have to show that S1 ∩ S2 is a subspace of V : we observe first that this
intersection is not empty since 0V belongs both to S1 and S2 , so it belongs to S1 ∩ S2 .
Now we show that S1 ∩S2 is closed with respect to the sum of V : so let v1 , v2 ∈ S1 ∩S2 .
This means, in particular, that v1 , v2 ∈ S1 , which is a subspace of V , so v1 + v2 ∈ S1 .
Similarly since S2 is a subspace, we have that v1 + v2 ∈ S2 . Then v1 + v2 ∈ S1 ∩ S2 .
Similarly, we show that S1 ∩ S2 is closed with respect to the product by
scalars. Let v ∈ S1 ∩ S2 and λ ∈ R. In particular, v ∈ S1 , which is a subspace, so
λv ∈ S1 ; similarly, v ∈ S2 , which is a subspace, so λv ∈ S2 . Thus λv belongs both
to S1 and to S2 , so that it belongs to their intersection.
Example 2.4.14 Consider the subspaces
$$S = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_2(R) \ \middle|\ b = -c \right\} \quad \text{and} \quad T = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_2(R) \ \middle|\ a + b + c + d = 0 \right\}$$
of M2(R). What is S ∩ T? The subspace S ∩ T consists of the elements of M2(R) belonging
to both S and T, that is:
$$S \cap T = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_2(R) \ \middle|\ b = -c,\ a + b + c + d = 0 \right\}.$$
Hence:
$$S \cap T = \left\{ \begin{pmatrix} a & -c \\ c & -a \end{pmatrix} \in M_2(R) \right\}.$$
It is easy to verify that this subset of M2 (R) is closed with respect to the sum and
the product by scalars, as guaranteed by Proposition 2.4.13.
2.5.1 Show that the subset X = {(r, s, r − s) ∣ r, s ∈ R} is a subspace of R^3.
Solution. First of all, we observe that X is not the empty set because (0, 0, 0) ∈ X
(just take r = s = 0).
Let us now consider two generic elements of X: (r1 , s1 , r1 −s1 ) and (r2 , s2 , r2 −s2 ).
Their sum is: (r1 , s1 , r1 − s1 ) + (r2 , s2 , r2 − s2 ) = (r1 + r2 , s1 + s2 , r1 − s1 + r2 − s2 ) =
(r1 + r2 , s1 + s2 , r1 + r2 − (s1 + s2 )), and it still belongs to X as it is of the type
(r, s, r − s), with r = r1 + r2 and s = s1 + s2 .
Consider (r1 , s1 , r1 −s1 ) ∈ X and λ ∈ R. Then λ(r1 , s1 , r1 −s1 ) = (λr1 , λs1 , λ(r1 −
s1 )) = (λr1 , λs1 , λr1 − λs1 ) still belongs to X as it is of the type (r, s, r − s), with
r = λr1 and s = λs1. So X is a subspace of R^3.
2.5.2 Establish whether W = {(x, y, z) ∈ R^3 ∣ 2x + z^2 = 0} is a subspace of R^3.
Solution. The set W is not empty, since (0, 0, 0) ∈ W. However, W is not closed with
respect to the sum: if (x1, y1, z1) and (x2, y2, z2) belong to W, then
2(x1 + x2) + (z1 + z2)^2 = (2x1 + z1^2) + (2x2 + z2^2) + 2z1z2 = 0 + 0 + 2z1z2 = 2z1z2,
and it is not true that 2z1z2 is always equal to zero.
For example, the elements (−2, 1, 2) and (−8, 3, 4) belong to W but (−2, 1, 2) +
(−8, 3, 4) = (−10, 4, 6) ∉ W, because 2 ⋅ (−10) + 6^2 ≠ 0. (Note that these elements of W
were not chosen randomly, but so as to satisfy the requirement 2z1z2 ≠ 0.)
So W is not a subspace of R^3.
2.5.3 Determine a non-empty subset of R^3 closed with respect to the sum but not
with respect to the product by scalars.
Solution. The set X = {(x, y, z)∣x, y, z ∈ R, x ≥ 0} has this property. In fact, X is not
empty because, for example, (0, 0, 0) ∈ X. Let us check if X is closed with respect
to the sum. Let (x1 , y1 , z1 ),(x2 , y2 , z2 ) ∈ X, with x1 , x2 ≥ 0. Then (x1 , y1 , z1 ) +
(x2 , y2 , z2 ) = (x1 + x2 , y1 + y2 , z1 + z2 ) ∈ X because x1 + x2 ≥ 0 (the sum of two
non-negative real numbers is a non-negative real number). Now let (x1 , y1 , z1 ) ∈ X
and λ ∈ R. We have that λ(x1 , y1 , z1 ) = (λx1 , λy1 , λz1 ) belongs to X if and only
if λx1 ≥ 0. But if we choose λ negative and x1 > 0, for example λ = −1 and
(x1, y1, z1) = (3, −2, 1), this condition is not satisfied. So X is not closed with
respect to the product by scalars.
2.5.4 Determine for which values of k ∈ R the set
$$X_k = \left\{ \begin{pmatrix} r & s \\ r + k & k^2 - k \end{pmatrix} \ \middle|\ r, s \in R \right\} \subseteq M_2(R)$$
is a subspace of M2(R).
Solution. We know that in order for Xk to be a subspace of M2(R), the null matrix
must belong to Xk, that is to say that $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ must be of the form $\begin{pmatrix} r & s \\ r+k & k^2-k \end{pmatrix}$ for some
r, s ∈ R. This happens if
$$\begin{cases} r = 0 \\ s = 0 \\ k = 0 \\ k^2 - k = 0, \end{cases}$$
that is, k = 0.
Let us now see whether $X_0 = \left\{ \begin{pmatrix} r & s \\ r & 0 \end{pmatrix} \ \middle|\ r, s \in R \right\}$ is a subspace of M2(R). Certainly X0
is not empty, since it contains the null matrix.
Let $\begin{pmatrix} r_1 & s_1 \\ r_1 & 0 \end{pmatrix}, \begin{pmatrix} r_2 & s_2 \\ r_2 & 0 \end{pmatrix} \in X_0$. We have:
$$\begin{pmatrix} r_1 & s_1 \\ r_1 & 0 \end{pmatrix} + \begin{pmatrix} r_2 & s_2 \\ r_2 & 0 \end{pmatrix} = \begin{pmatrix} r_1 + r_2 & s_1 + s_2 \\ r_1 + r_2 & 0 \end{pmatrix}.$$
The matrix obtained therefore belongs to X0. Similarly, if λ ∈ R, also $\lambda \begin{pmatrix} r_1 & s_1 \\ r_1 & 0 \end{pmatrix} = \begin{pmatrix} \lambda r_1 & \lambda s_1 \\ \lambda r_1 & 0 \end{pmatrix}$
belongs to X0, so X0 is closed with respect to the sum and with respect
to the product by scalars, so it is a vector subspace of M2(R).
In conclusion, Xk is a subspace of M2 (R) for k = 0.
iii) Wn = {p(x) ∈ R[x] ∣ deg(p(x)) = n}, n ∈ N. (Here deg(p(x)) indicates the
degree of the polynomial p(x).)
iv) D = $\left\{ \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \in M_2(R) \right\}$.
v) T = $\left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in M_2(R) \right\}$.
vi) A = {(a_{ij})_{i,j=1,...,3} ∈ M3(R) ∣ a11 + a22 + a33 = 0}.
x) X = $\left\{ \begin{pmatrix} 0 & r \\ 2r & s \end{pmatrix} \ \middle|\ r, s \in R \right\} \subseteq M_2(R)$.
xi) X = $\left\{ \begin{pmatrix} r & 2r \\ r^2 & r \end{pmatrix} \ \middle|\ r \in R \right\} \subseteq M_2(R)$.
2.6.2 Show that the set of solutions of the homogeneous linear system
$$\begin{cases} x_1 + 3x_2 - x_3 = 0 \\ 2x_1 + 4x_2 - 4x_3 - x_4 = 0 \\ x_2 + x_3 + 2x_4 = 0 \end{cases}$$
in the unknowns x1, x2, x3, x4 is a subspace of R^4.
Let SR be the set of sequences (an) of real numbers, with the operations of sum and
product by a real number defined term by term: (an) + (bn) = (an + bn) and k(an) = (kan),
for every (an), (bn) ∈ SR and k ∈ R. Show that with these operations SR is a vector
space over R.
2.6.9 Let C(R; R) be the set of continuous functions from R to R. Consider the
operation of sum of functions and the operation of product of any function by a real
number defined as in Example 2.3.3. Show that with these operations C(R; R) is a
vector space over R.
CHAPTER 3
Linear Combination and Linear Independence
In the previous chapter, we have seen the definition of vector space and subspace.
We now want to describe these objects in a more efficient way. We introduce for
this purpose the concept of linear combination of a set of vectors and the concept
of linearly independent vectors. These are two fundamental definitions within the
theory of vector spaces, whose understanding is necessary to get to the key concepts
of basis and linear transformation, which we will treat later.
(Figure: the vectors (1, a), (2, 2a) and (−3/2, −(3/2)a), all lying on the same line through the origin.)
We say that the vector (1, a) generates the subspace W represented by the line
y = ax. The word “generate” is not accidental since, in fact, all vectors of the subspace
W are multiples of (1, a). We also note that the choice of the vector (1, a) as a
generator of W is arbitrary; we could as well have chosen any of its multiples, like
(2, 2a) or (−3/2, −(3/2)a).
Graphically it is clear that if we know a point of a straight line (in the plane, but
also in three-dimensional space) different from the origin, then we can immediately
draw the line passing through it and the origin. We will see later that the fact of
knowing the generators of a vector space allows us to determine it uniquely.
Now let us see another example. In R^2, we consider the two vectors (1, 0) and
(0, 1). We ask ourselves: what is the smallest subspace W of R^2 that contains both of
these vectors? From the previous reasoning, we know that this subspace must contain
the two subspaces W1 and W2 generated by (1, 0) and (0, 1):
W1 = {λ(1, 0) = (λ, 0) ∣ λ ∈ R},  W2 = {µ(0, 1) = (0, µ) ∣ µ ∈ R}.
We also know that the sum of two vectors of W still belongs to W (by the definition
of subspace). For instance (1, 0) + (0, 1) = (1, 1) ∈ W , but also (1, 2) + (3, 4) =
(4, 6) ∈ W. The student is invited to draw sums of vectors in R^2, considering the points
of the plane associated with them and using the parallelogram rule. In this way, we
can convince ourselves that actually W = R^2. But the graphic construction is not
sufficient to prove this fact, as it is not possible to draw all the vectors of the plane, so
let us look at an algebraic proof. We take the generic vector (λ, 0) in W1 and the
generic vector (0, µ) in W2, and we take their sum: (λ, 0) + (0, µ) = (λ, µ). It is clear
that all vectors (x, y) in R^2 can be written in this way, choosing λ = x and µ = y. So
we found that the smallest subspace of R^2 containing the vectors (1, 0) and (0, 1) is
all of R^2.
Now we formalize the concept of generation of subspace, which we have described
with the previous examples.
For example, (1, 1) is a linear combination of (1, 0) and (0, 1) with scalars λ1 = 1
and λ2 = 1, but also a linear combination of (2, 1) and (1, 0) with scalars λ1 = 1 and
λ2 = −1.
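For readers who want to double-check such computations numerically, finding the scalars of a linear combination amounts to solving a small linear system. The following is only a sketch and assumes Python with NumPy, which is not used elsewhere in this book:

import numpy as np

# Columns are the vectors (2, 1) and (1, 0); we solve for the scalars
# lambda_1, lambda_2 such that lambda_1*(2,1) + lambda_2*(1,0) = (1,1).
M = np.array([[2.0, 1.0],
              [1.0, 0.0]])
target = np.array([1.0, 1.0])
print(np.linalg.solve(M, target))  # expected: [ 1. -1.]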
We now come to the concept of vector space generated by some vectors, the main
concept of this chapter along with that of linear independence.
Definition 3.1.2 Let V be a vector space and let {v1 , . . . , vn } be a set of vectors
of V . The subspace generated (or spanned) by the vectors v1 , . . . , vn is the set of all
their linear combinations, in symbols
⟨v1 , . . . , vn ⟩ = {λ1 v1 + ⋯ + λn vn ∣ λ1 , . . . , λn ∈ R}.
We have seen that, for example, the subspace generated by a nonzero vector in
R² corresponds to a straight line, while the subspace generated by the two vectors
(1, 0) and (0, 1) of R² is all of R².
Definition 3.1.4 Let V be a vector space and let {v1 , . . . , vn } be a set of vectors of
V . We say that v1 , . . . , vn generate V , or {v1 , . . . , vn } is a set of generators of V if
V = ⟨v1 , . . . , vn ⟩.
In the example above, we saw that the vectors (1, 0) and (0, 1) generate the vector
space R², as each vector (a, b) of R² can be written as a linear combination of (1, 0)
and (0, 1), namely (a, b) = a(1, 0) + b(0, 1).
Proposition 3.1.5 Let V be a vector space and let {v1 , . . . , vn } be a set of vectors of
V . Then we have that ⟨v1 , . . . , vn ⟩ is a subspace of V . Moreover, if Z is a subspace of
V containing v1 , . . . vn , then ⟨v1 , . . . , vn ⟩ ⊆ Z, therefore ⟨v1 , . . . , vn ⟩ is the smallest
subspace of V containing {v1 , . . . , vn }.
Proof. First of all, we note that 0 ∈ ⟨v1 , . . . , vn ⟩, as 0 = 0v1 + ⋅ ⋅ ⋅ + 0vn . Now let
v, w ∈ ⟨v1 , . . . , vn ⟩. Then by definition there exist scalars α1 , . . . , αn and β1 , . . . , βn
such that:
v = α1 v1 + ⋯ + αn vn , w = β1 v1 + ⋯ + βn vn
thus
v + w = (α1 + β1 )v1 + ⋯ + (αn + βn )vn ∈ ⟨v1 , . . . , vn ⟩ .
Moreover if k ∈ R, then kv = kα1 v1 + ⋯ + kαn vn ∈ ⟨v1 , . . . , vn ⟩, so ⟨v1 , . . . , vn ⟩
is a subspace of V . Finally, if Z is a subspace of V containing v1 , . . . , vn , then Z
contains every linear combination of v1 , . . . , vn , because it is closed with respect to
the sum and to the product by scalars; hence ⟨v1 , . . . , vn ⟩ ⊆ Z.
Let us look at an example that is linked to what we have seen in Chapter 1 about
the solution of linear systems depending on a parameter.
Example 3.1.6 We want to determine the subspace generated by the vectors (1, 1),
(2, k), depending on the parameter k.
We see at once that if k = 2, then the two points lie on the same line through the
origin, thus the smallest subspace that contains both of them will be precisely this
line, whose equation is y = x.
If k ≠ 2, the two points lie on two distinct lines through the origin, then the
smallest subspace that contains both of them must contain such lines, and also
the sum of any two points on these lines. Therefore, with a reasoning similar to
the one made at the beginning of this chapter, we have that the smallest subspace
that contains both points whose coordinates are (1, 1), (2, k) is the whole plane, i.e.
the vectors (1, 1), (2, k) generate R².
Let us now see an algebraic proof of this fact. Let (a, b) be a generic vector of
R². We want to determine when (a, b) belongs to ⟨(1, 1), (2, k)⟩, that is, when there
exist scalars λ1 , λ2 such that λ1 (1, 1) + λ2 (2, k) = (a, b). This amounts to the linear
system:
⎧ λ1 + 2λ2 = a
⎩ λ1 + kλ2 = b
We leave as an exercise to verify that this system in the unknowns λ1 , λ2 always has
a solution if k ≠ 2. If instead k = 2 the complete matrix associated to the system is:
⎛ 1 2 │ a ⎞
⎝ 1 2 │ b ⎠ ,
If we solve the linear system, depending on the parameter k, with the Gaussian
algorithm, an easy calculation shows that this system always has a solution (for every
fixed a and b) provided we have k ≠ 2. But when k = 2 the system has a solution only
if a = b, so ⟨(1, 1), (2, 2)⟩ is just the line y = x, as we already observed.
Proposition 3.1.8 Let V be a vector space and let v1 , . . . , vn , w be vectors of V . If
w is a linear combination of v1 , . . . , vn , then
⟨v1 , . . . , vn ⟩ = ⟨v1 , . . . , vn , w⟩.
For example, since (1, 1) = (1, 0) + (0, 1), we have
⟨(1, 0), (0, 1), (1, 1)⟩ = ⟨(1, 0), (0, 1)⟩ = R².
Conversely if
⟨v1 , . . . , vn ⟩ = ⟨v1 , . . . , vn , w⟩
then w is a linear combination of v1 , . . . , vn .
Proof. In order to show the first part of the result, it is enough to observe that w ∈
⟨v1 , . . . , vn ⟩ by assumption, so it follows from Proposition 3.1.5 that Z = ⟨v1 , . . . , vn ⟩
is a subspace containing {v1 , . . . , vn , w}, thus ⟨v1 , . . . , vn , w⟩ ⊆ ⟨v1 , . . . , vn ⟩, again
by Proposition 3.1.5. The inclusion ⟨v1 , . . . , vn ⟩ ⊆ ⟨v1 , . . . , vn , w⟩ is obvious,
To show the converse, it is enough to note that since ⟨v1 , . . . , vn ⟩ =
⟨v1 , . . . , vn , w⟩ we have that w ∈ ⟨v1 , . . . , vn ⟩, i.e. w is a linear combination of
v1 , . . . , vn .
Definition 3.2.1 Let V be a vector space. The vectors v1 , . . . , vn ∈ V are said to be
linearly independent¹ if the only scalars λ1 , . . . , λn ∈ R such that
λ1 v1 + ⋅ ⋅ ⋅ + λn vn = 0
are λ1 = ⋯ = λn = 0. Otherwise, the vectors are said to be linearly dependent.
Let us review the previous examples. The set of vectors {(1, 0), (0, 1)} in R² is a
set of vectors which are linearly independent, as their only linear combination that
gives the zero vector is obtained with all zero scalars:
α(1, 0) + β(0, 1) = (α, β) = (0, 0) if and only if α = β = 0.
On the other hand, the vectors of the set {(1, 0), (0, 1), (1, 1)} are linearly depen-
dent, because there is a linear combination of the given vectors with scalars, not all
zero, which is equal to the zero vector: for instance 1 ⋅ (1, 0) + 1 ⋅ (0, 1) + (−1) ⋅ (1, 1) = (0, 0).
Example 3.2.2 Consider the following set of vectors in R2 [x]: {x + 1, x² − 1, 2, x − 1}.
Is this a set of linearly independent vectors? If we knew something more about linear
algebra the answer would be immediate; for the moment we have to perform the
computation directly. We set a generic linear combination of the given vectors equal
to the zero polynomial:
α1 (x + 1) + α2 (x² − 1) + α3 ⋅ 2 + α4 (x − 1) = 0.
From which:
α2 x² + (α1 + α4 )x + (α1 − α2 + 2α3 − α4 ) = 0.
¹ The words “linearly independent” can be used for the vectors v1 , . . . , vn , but also for the set
of vectors {v1 , . . . , vn } indifferently, i.e. the two terminologies have the same meaning.
A polynomial is zero if and only if all its coefficients are zero, so we obtain the linear
system:
⎧ α2 = 0
⎨ α1 + α4 = 0
⎩ α1 − α2 + 2α3 − α4 = 0 .
We leave as an exercise for the student to verify that this system admits infinitely
many solutions. For example, it has the solution: α1 = 1, α2 = 0, α3 = −1, α4 = −1.
So we can explicitly write a linear combination of the given vectors, which is equal
to the zero vector, while the scalars are not all zero:
1 ⋅ (x + 1) − 1 ⋅ 2 − 1 ⋅ (x − 1) = 0.
In particular, one of the given vectors is a linear combination of the others:
(x + 1) = 2 + (x − 1).
Of course, in a set of linearly dependent vectors, it is not true that each of the given
vectors can be expressed as a function of the others; for example, we see that there
is no way to express x² − 1 as a linear combination of the others.
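The same dependence check can be carried out numerically by passing to coefficient vectors, as the text does by hand. The following sketch assumes Python with NumPy (not a tool used in the book):

import numpy as np

# Coordinates with respect to {x^2, x, 1} of x+1, x^2-1, 2, x-1, one per row.
vectors = np.array([
    [0.0, 1.0, 1.0],   # x + 1
    [1.0, 0.0, -1.0],  # x^2 - 1
    [0.0, 0.0, 2.0],   # 2
    [0.0, 1.0, -1.0],  # x - 1
])
# Four vectors but rank 3: they are linearly dependent.
print(np.linalg.matrix_rank(vectors), "out of", len(vectors))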
The important thing to note is that, in a set of linearly dependent vectors, if we
eliminate one vector that is a linear combination of the others, the subspace they
generate does not change (see Proposition 3.1.8), and the vectors of the new set thus
obtained may have become linearly independent. Be careful, however, that this is not
always the case, for example, in the set {2x, 3x, 4x} even if we eliminate a vector,
the remaining vectors are linearly dependent, as the student may verify. Somehow
the linear independence tells us that we have reached the smallest number of vectors
to describe the subspace. This concept will be explored very carefully in the next
chapter on bases.
Observation 3.2.3 We note that, if a set of vectors contains the zero vector, then
it is always a set of linearly dependent vectors. In fact, if we consider the set {v1 =
0, v2 , . . . , vn } we have that:
1 ⋅ v1 + 0 ⋅ v2 + ⋯ + 0 ⋅ vn = 1 ⋅ 0 = 0.
So we obtained a linear combination equal to the zero vector, while the first scalar is
not zero.
Proposition 3.2.4 Let V be a vector space. The vectors v1 , . . . , vn ∈ V are linearly
dependent if and only if at least one of them is a linear combination of the others.
Proof. Suppose first that v1 , . . . , vn are linearly dependent, so that there exist scalars
α1 , . . . , αn , not all zero, with
α1 v1 + ⋯ + αn vn = 0 .
Since at least one of the scalars is nonzero, we have that αk ≠ 0 for some k. Then:
vk = −(α1 /αk ) v1 − ⋯ − (αk−1 /αk ) vk−1 − (αk+1 /αk ) vk+1 − ⋯ − (αn /αk ) vn ,
so vk is a linear combination of the other vectors.
Conversely, suppose that one of the vectors, say vk , is a linear combination of the
others: vk = α1 v1 + ⋯ + αk−1 vk−1 + αk+1 vk+1 + ⋯ + αn vn . Then
α1 v1 + ⋯ + αk−1 vk−1 + (−1)vk + αk+1 vk+1 + ⋯ + αn vn = 0,
and at least one of the coefficients is not zero, that of vk . So the vectors v1 , . . . , vn
are linearly dependent.
Observation 3.2.6 In order to prove that the vectors of a set are linearly dependent,
it is enough to find a vector that is a linear combination of the others. For example
if we see that a vector is a multiple of another, then we know that the vectors of the
set are linearly dependent. The vectors of the following sets are linearly dependent,
and we can verify it without any calculation (but the student should do it if he does
not see why and wants to convince himself!).
• In R3 [x]: {0, x, 1 − x, x³}.
• In M2 (R):
{ ⎛ 1 0 ⎞  ⎛ 0 0 ⎞  ⎛ 0  0 ⎞  ⎛ 3 0 ⎞ }
  ⎝ 0 1 ⎠, ⎝ 0 3 ⎠, ⎝ 0 √2 ⎠, ⎝ 0 3 ⎠   .
The next proposition shows that removing some vectors from a set of linearly
independent vectors, we get a set of vectors that are still linearly independent.
Example 3.2.9 Suppose we want to determine for which values of k the following
vectors in M2 (R):
⎛ 1  0 ⎞   ⎛ 6   0 ⎞   ⎛ k 0 ⎞
⎝ 1 −3 ⎠ , ⎝ k −18 ⎠ , ⎝ 1 5 ⎠ ,
are linearly independent. We proceed as suggested by the definition: we write a linear
combination of the given vectors, and we see if there are nonzero scalars that allow
us to get the zero vector. For the values of k for which this happens, we will have
that the given vectors are linearly dependent.
So we write a generic linear combination of the given vectors and set it equal to
the zero vector:
λ1 ⎛ 1  0 ⎞ + λ2 ⎛ 6   0 ⎞ + λ3 ⎛ k 0 ⎞ = ⎛ 0 0 ⎞        (3.1)
   ⎝ 1 −3 ⎠      ⎝ k −18 ⎠      ⎝ 1 5 ⎠   ⎝ 0 0 ⎠
from which:
⎛ λ1 + 6λ2 + kλ3                 0 ⎞ = ⎛ 0 0 ⎞ .
⎝ λ1 + kλ2 + λ3   −3λ1 − 18λ2 + 5λ3 ⎠   ⎝ 0 0 ⎠
Then equality (3.1) is satisfied if and only if λ1 , λ2 , λ3 are solutions of the homoge-
neous linear system:
⎧ λ1 + 6λ2 + kλ3 = 0
⎨ λ1 + kλ2 + λ3 = 0
⎩ −3λ1 − 18λ2 + 5λ3 = 0.
The complete matrix associated with the system is:
         ⎛  1    6   k │ 0 ⎞
(A∣b) =  ⎜  1    k   1 │ 0 ⎟ ,
         ⎝ −3  −18   5 │ 0 ⎠
and, if we reduce it to row echelon form with the Gaussian algorithm, it becomes:
           ⎛ 1    6      k    │ 0 ⎞
(A′∣b′) =  ⎜ 0  k − 6  1 − k  │ 0 ⎟ .
           ⎝ 0    0    5 + 3k │ 0 ⎠
Note that the system always admits the zero solution λ1 = λ2 = λ3 = 0. The question
is whether there are also nonzero solutions or not. The row echelon form of the ma-
trix is particularly appropriate to understand whether or not we have only the zero
solution. We can immediately observe that if the initial vectors were 5, of course at
least one of the 5 initial unknowns (the scalars that give the zero linear combina-
tion) would be indeterminate, in other words the vectors would definitely be linearly
dependent, because we could arbitrarily assign a nonzero value to that unknown. In
the next chapter, using the concept of basis, we will formalize this reasoning which,
however, even now should be clear intuitively.
Returning to the example in question, if k ≠ 6 and k ≠ −5/3, we have that rr(A′) =
rr(A′∣b′) = 3 is equal to the number of unknowns, thus the system admits a unique
solution, the zero one, and the given vectors are linearly independent. If instead k = 6,
reducing further to row echelon form, we get:
            ⎛ 1 6  6 │ 0 ⎞
(A′′∣b′′) = ⎜ 0 0 −5 │ 0 ⎟ ,
            ⎝ 0 0  0 │ 0 ⎠
thus rr(A′′) = rr(A′′∣b′′) = 2, and the system admits infinitely many solutions de-
pending on one parameter. In particular, there are nonzero solutions, and the given
vectors are linearly dependent.
Finally, if k = −5/3 we get:
           ⎛ 1     6    −5/3 │ 0 ⎞
(A′∣b′) =  ⎜ 0  −23/3    8/3 │ 0 ⎟ .
           ⎝ 0     0      0  │ 0 ⎠
Thus rr(A′) = rr(A′∣b′) = 2, and as before the given vectors are linearly dependent.
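The computation with the parameter k can also be checked symbolically. The sketch below assumes Python with SymPy (our assumption, not a tool used in the book): since the second column of the coordinate matrix is zero, it is enough to test when a 3×3 minor of the remaining columns vanishes.

import sympy as sp

k = sp.symbols('k')
# Coordinates of the three matrices of Example 3.2.9 with respect to the
# canonical basis of M2(R), written as rows.
M = sp.Matrix([[1, 0, 1, -3],
               [6, 0, k, -18],
               [k, 0, 1, 5]])
minor = M.extract([0, 1, 2], [0, 2, 3]).det()
print(sp.factor(minor))  # (k - 6)*(3*k + 5): zero exactly for k = 6 and k = -5/3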
We conclude this chapter with some exercises that clarify the techniques for ver-
ifying linear dependence or independence and the concept of generators.
3.3.1 Determine for which values of the parameter k the polynomials x² + 2x + k,
5x² + 2kx + k², kx² + x + 3 generate R2 [x].
Solution. The given polynomials generate R2 [x] if and only if every polynomial
ax² + bx + c of R2 [x] can be written as a linear combination of them, i.e. if there
exist scalars λ1 , λ2 , λ3 such that
ax² + bx + c = λ1 (x² + 2x + k) + λ2 (5x² + 2kx + k²) + λ3 (kx² + x + 3),
for every choice of a, b, c ∈ R. Equating the coefficients, we obtain a linear system in
the unknowns λ1 , λ2 , λ3 , whose complete matrix is:
         ⎛ 1   5   k │ a ⎞
(A∣b) =  ⎜ 2  2k   1 │ b ⎟ ,
         ⎝ k  k²   3 │ c ⎠
which, reduced to row echelon form with the Gaussian algorithm, becomes:
           ⎛ 1      5        k     │ a          ⎞
(A′∣b′) =  ⎜ 0   2k − 10   1 − 2k  │ b − 2a     ⎟ .
           ⎝ 0      0     3 − k/2  │ c − (k/2)b ⎠
If k ≠ 5 and k ≠ 6, then rr(A′) = rr(A′∣b′) = 3 and the system always admits a
solution, independently from the values of a, b, c, therefore each vector of the type
ax² + bx + c belongs to ⟨x² + 2x + k, 5x² + 2kx + k², kx² + x + 3⟩, and the given vectors
generate R2 [x].
If k = 5, further reducing the matrix to row echelon form, we obtain:
            ⎛ 1 5 5 │ a             ⎞
(A′′∣b′′) = ⎜ 0 0 1 │ 2c − 5b       ⎟ ,
            ⎝ 0 0 0 │ −a − 22b + 9c ⎠
and if −a − 22b + 9c ≠ 0 the system does not admit a solution, because 3 = rr(A′′∣b′′) ≠
rr(A′′) = 2; thus for k = 5 the given vectors do not generate R2 [x].
If k = 6 we obtain
           ⎛ 1 5   6 │ a      ⎞
(A′∣b′) =  ⎜ 0 2 −11 │ b − 2a ⎟ ,
           ⎝ 0 0   0 │ c − 3b ⎠
and if c ≠ 3b the system does not admit a solution, thus the given vectors do not
generate R2 [x]. For example 3x² + 5x − 8 ∉ ⟨x² + 2x + k, 5x² + 2kx + k², kx² + x + 3⟩.
3.3.2 Let W be the vector subspace of R⁴ given by the set of solutions of the homo-
geneous linear system
⎧ x1 + x2 − x4 = 0
⎩ 2x1 + x2 − x3 + 3x4 = 0 .
Determine a set of generators of W .
Solution. The complete matrix associated with the system is
(A∣b) = ⎛ 1 1  0 −1 │ 0 ⎞ ,
        ⎝ 2 1 −1  3 │ 0 ⎠
which reduced to row echelon form with the Gaussian algorithm becomes:
(A′∣b′) = ⎛ 1  1  0 −1 │ 0 ⎞ .
          ⎝ 0 −1 −1  5 │ 0 ⎠
The unknowns x3 , x4 are free; from the second equation x2 = −x3 + 5x4 and from the
first x1 = −x2 + x4 = x3 − 4x4 , so every solution has the form x3 (1, −1, 1, 0) +
x4 (−4, 5, 0, 1), and W = ⟨(1, −1, 1, 0), (−4, 5, 0, 1)⟩.
3.3.3 Determine for which values of the parameter k the vectors v1 = (2, 2k, k², 2k + 2)
and v2 = (−1, −k, 2k + 2, −k − 1) of R⁴ are linearly dependent.
Solution. We need to see if there are two nonzero scalars λ1 , λ2 , such that λ1 v1 +
λ2 v2 = 0. It must happen that:
λ1 (2, 2k, k², 2k + 2) + λ2 (−1, −k, 2k + 2, −k − 1) = (0, 0, 0, 0),
i.e.
(2λ1 − λ2 , 2kλ1 − kλ2 , k²λ1 + (2k + 2)λ2 , (2k + 2)λ1 − (k + 1)λ2 ) = (0, 0, 0, 0),
whose complete associated matrix is:
         ⎛   2       −1    │ 0 ⎞
(A∣b) =  ⎜  2k       −k    │ 0 ⎟ ,
         ⎜  k²     2k + 2  │ 0 ⎟
         ⎝ 2k + 2  −k − 1  │ 0 ⎠
which reduced to row echelon form with the Gaussian algorithm becomes:
           ⎛ 2      −1    │ 0 ⎞
(A′∣b′) =  ⎜ 0   (k + 2)² │ 0 ⎟ .
           ⎜ 0       0    │ 0 ⎟
           ⎝ 0       0    │ 0 ⎠
If k ≠ −2, then rr(A′) = rr(A′∣b′) = 2, so the system has only one solution, which is
the zero one, and the vectors v1 , v2 are linearly independent. If instead k = −2,
then rr(A′) = rr(A′∣b′) = 1, the system has infinitely many solutions that depend on
one parameter, so v1 , v2 are linearly dependent; indeed in this case v1 = −2v2 .
Given the matrices
{ ⎛ 1  0 ⎞  ⎛ 0  3 ⎞  ⎛  1  3 ⎞ }
  ⎝ 2 −1 ⎠, ⎝ 1 −3 ⎠, ⎝ −2 −1 ⎠   ⊆ M2 (R),
determine if they are linearly independent and determine the subspace generated by
them.
3.4.6 Determine for which values of k the polynomial x² + 2k belongs to ⟨x² + kx, x² −
(k + 1)x − k⟩.
A = ⎛ 1 0 −2 ⎞ ,    B = ⎛ −1 −1 1 ⎞ .
    ⎝ 4 1  0 ⎠          ⎝  0 −2 1 ⎠
a) Determine the values of k for which the three vectors v1 , v2 , v3 are linearly
independent.
b) Determine the values of k for which w ∈ ⟨v1 , v2 , v3 ⟩.
3.4.10 a) Determine the solutions of the following linear system as the parameter k
varies:
⎧ x − z = 1
⎨ kx − ky + 2z = 0
⎩ 2x + 3ky − 11z = −1.
b) Determine for which values of the parameter k the polynomial x² − 1 belongs to
the subspace generated by the polynomials x² + kx + 2, kx² − 3k, x² − 2x + 11.
3.4.12 Find the values of h and k for which the vectors of the set {x + h, kx −
2
generate R2 [x]?
b) Choose a value of k for which v1 , v2 , v3 are linearly dependent and write one of
them as a linear combination of the others.
3.4.14 a) Establish for which values of the parameter k the matrices
⎛ 1 2 ⎞    ⎛ k 0 ⎞    ⎛ −1  k − 2 ⎞
⎝ 0 0 ⎠ ,  ⎝ 4 0 ⎠ ,  ⎝  k    0   ⎠
are linearly independent.
b) Establish for which values of the parameter k such vectors generate the sub-
space W = { ⎛ r s ⎞ ∣ r, s, t ∈ R } of M2 (R).
            ⎝ t 0 ⎠
3.4.15 Determine for which values of the parameter k we have that:
⎛ 3 −2 ⎞ ∈ ⟨ ⎛ 2 0 ⎞ , ⎛ 1  k ⎞ , ⎛ k  6 ⎞ ⟩ .
⎝ 2  2 ⎠     ⎝ 2 0 ⎠   ⎝ 0 −k ⎠   ⎝ 1 −6 ⎠
i) S = {(x, y, z) ∈ R³ ∣ x + 2y = 0}.

CHAPTER 4
Basis and Dimension
The concepts of basis and dimension, which are closely related, are central in the
theory of vector spaces.
Let us start with some examples, mainly, but not only, in R² and R³. Thanks to
these examples, we will develop geometric intuition, which will be valuable in order
to understand what happens in vector spaces that cannot be visualized. Then, we will
discuss the theory and state the Completion Theorem. This is the most important
result in this chapter; starting from the concept of basis it allows us to reach the
definition of dimension. At the end, we will revisit the Gaussian algorithm, described
in Chapter 1, and we will see how it can be effectively used to answer the main
questions regarding a basis or the dimension of a vector space.
Example 4.1.1 In the previous chapter, we have seen several examples of sets of
generators of the vector space R². We will mention a few:
R² = ⟨(1, 0), (0, 1)⟩ = ⟨(1, 0), (0, 1), (1, 1)⟩.
If we add a vector to a set that generates R², this set continues to generate R² by
Proposition 3.1.8. The question that we ask ourselves is: how can we find a minimal
set, i.e. a set as small as possible, of generators for the space R²?
Proposition 3.1.8 comes again to help us: if we remove from the set a vector which
is a linear combination of the others, the new set obtained generates the same vector
space. In the example we are considering, we can remove the vector (1, 1) as it is a
linear combination of (1, 0) and (0, 1): (1, 1) = (1, 0) + (0, 1). If now, however, we try
to further decrease the number of generators in the set, the vector space generated by
them changes. Indeed ⟨(1, 0)⟩ is just the x-axis, while ⟨(0, 1)⟩ is just the y-axis. So, if
we remove one of the two vectors (1, 0) or (0, 1), from the given set, the vector space
generated by the set changes, in other words, there are no “redundant generators”. The
important difference between the two sets: {(1, 0), (0, 1)} and {(1, 0), (0, 1), (1, 1)}
cannot go unnoticed. The first set consists of linearly independent vectors, while the
vectors of the second set are linearly dependent. So we have seen in this example
that, starting from a set of generators, we can delete one by one the generators that
are linear combination of other vectors in the set, until we obtain a set of linearly
independent vectors, that is a set in which no vector is a linear combination of the
other vectors (Proposition 3.2.4 of Chapter 3). At this point, we cannot remove any
vector from the set, without changing the vector space generated by the vectors in
the set.
The next proposition formalizes the conclusions of the previous example and gives
us an algorithm to obtain a minimal set of generators; as we will see, this set is called
basis.
Proposition 4.1.2 Let V = ⟨v1 , . . . , vn ⟩ ≠ {0}. Then there exists a subset of
{v1 , . . . , vn }, consisting of linearly independent vectors, which generates V .
Proof. We proceed algorithmically by steps.
Step one. We have that V = ⟨v1 , . . . , vn ⟩ by assumption. If v1 , . . . , vn are linearly
independent, then we have proved the statement. Otherwise, one of the vectors,
suppose vn , is a linear combination of the others, by Proposition 3.2.4. By Proposition
3.1.8, we have:
V = ⟨v1 , . . . , vn ⟩ = ⟨v1 , . . . , vn−1 ⟩.
Step two. In step one, we have eliminated the vector vn from the set of generators
of V , thus V = ⟨v1 , . . . , vn−1 ⟩. If v1 , . . . , vn−1 are linearly independent, then we have
finished our proof. Otherwise, we go back to step one, that is, one of the vectors,
suppose vn−1 , is a linear combination of the others. By Proposition 3.1.8 in Chapter
3 we have:
V = ⟨v1 , . . . , vn ⟩ = ⟨v1 , . . . , vn−1 ⟩ = ⟨v1 , . . . , vn−2 ⟩.
It is clear that, after a finite number of steps, n − 1 at most, we get a set in which
no vector is a linear combination of the others, therefore by Proposition 3.2.4, it is a
set of linearly independent vectors.
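The procedure of Proposition 4.1.2 can be phrased as a small algorithm: scan the generators and keep a vector only if it is not a linear combination of those already kept. A possible sketch (in Python with NumPy, which is our assumption; the helper name is ours as well) is the following:

import numpy as np

def independent_generators(vectors):
    # Discard every vector that is a linear combination of the ones already kept.
    kept = []
    for v in vectors:
        candidate = np.array(kept + [v], dtype=float)
        if np.linalg.matrix_rank(candidate) > len(kept):
            kept.append(v)
    return kept

# Example: the generators (1,0), (0,1), (1,1) of R^2 reduce to (1,0), (0,1).
print(independent_generators([[1, 0], [0, 1], [1, 1]]))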
We are ready for the definition of basis.
Definition 4.1.3 Let V be a vector space. The set {v1 , . . . , vn } is called a basis if:
1. The vectors v1 , . . . , vn are linearly independent.
2. The vectors v1 , . . . , vn generate V .
We say also that V is finitely generated, if there exists a finite set of generators
of V , i.e. V = ⟨v1 , . . . , vn ⟩.
If V admits a basis {v1 , . . . , vn }, then it is finitely generated. We will soon see
that the converse is also true.
Henceforth, we will say that a set X is maximal (minimal) with respect to a
certain property if X enjoys that property, but as soon as we add (remove) an
element to (from) X, then X does not enjoy the property anymore.
Example 4.1.5 Consider the following vectors of R3 [x]: x³, x², 2, 5, x + 2, 3x, −7x, 2x².
We want to find a basis for the subspace W they generate. The procedure we will follow is
not the standard one, but only an example of the procedure described in Proposition
4.1.2. First, we see immediately that 5 is a linear combination of 2, as it is a multiple
of it; similarly, −7x is a multiple of 3x and 2x² is a multiple of x². Eliminating these
vectors, by Proposition 3.1.8 we have:
W = ⟨x³, x², 2, 5, x + 2, 3x, −7x, 2x²⟩ = ⟨x³, x², 2, x + 2, 3x⟩.
Moreover x + 2 = (1/3)(3x) + 2, so we can also eliminate x + 2, and
W = ⟨x³, x², 2, 3x⟩.
To verify that these vectors are linearly independent and thus form a basis of W , we
have to show that the equation:
a x³ + b x² + 2c + 3d x = 0
holds only for a = b = c = d = 0; this is clear, since a polynomial is zero if and only if
all of its coefficients are zero.
Observation 4.1.8 Proposition 4.1.6 ensures that each vector space, different from
the zero space and generated by a finite number of vectors, has at least one basis.
However, this basis is not unique. For example it is easy to verify that, if k ≠ 0, then
the set Bk = {(k, 1), (0, 1)} is a basis of R², so R² has infinitely many bases. We
invite the student to convince himself that every vector space that admits a basis,
actually admits infinitely many.
• The canonical basis is given by C = {E1,1 , . . . , Em,n }, where Ei,j is the matrix
that has 1 in position (i, j) and 0 in the other positions. For example the
canonical basis of M2,3 is
      ⎛ 1 0 0 ⎞   ⎛ 0 1 0 ⎞   ⎛ 0 0 1 ⎞
C = { ⎝ 0 0 0 ⎠ , ⎝ 0 0 0 ⎠ , ⎝ 0 0 0 ⎠ ,
      ⎛ 0 0 0 ⎞   ⎛ 0 0 0 ⎞   ⎛ 0 0 0 ⎞
      ⎝ 1 0 0 ⎠ , ⎝ 0 1 0 ⎠ , ⎝ 0 0 1 ⎠ } .
The knowledge of the canonical bases tells us immediately the dimension of the
vector spaces considered above:
dim(Rⁿ) = n,    dim(Rn [x]) = n + 1,    dim(Mm,n (R)) = mn.
If one keeps in mind this fact, many exercises become simple.
The next result compares the dimension of a vector space with that of its subspaces.
Let V be a finitely generated vector space and let W be a subspace of V . Then:
a) dim(W ) ≤ dim(V );
b) if dim(W ) = dim(V ), then W = V .
Proof. a) Recall that the dimension of a vector space is the number of elements
of a basis, which is also a maximal set of linearly independent vectors. Since W
is contained in V , we cannot choose in W a larger number of linearly independent
vectors than the number of linearly independent vectors in V , therefore the dimension
of W cannot be larger than the dimension of V .
b) Since the vectors of a basis of W are linearly independent, by the Completion
Theorem, we can add to them dim(V ) − dim(W ) vectors to obtain a basis of V . If
dim(W ) = dim(V ), it means that a basis of W is already a basis of V , and then in
particular generates V , i.e. W = V .
principle, we have to verify that the two vectors are linearly independent and that
they generate R². However, while linear independence is obvious, because one vector
is not a multiple of the other, for generation we should do the calculations. Now we
see that the calculations are not necessary. In fact, we have two linearly independent
vectors in R², so the subspace generated by them has dimension two. So, by the
previous theorem, it must be equal to R².
Similarly, in Example 4.1.5 we have shown that the vectors x³, x², 2, 3x are
linearly independent. Then, since there are four of them, we can immediately
conclude, without making calculations, that they are a basis of R3 [x], therefore
⟨x³, x², 2, 3x⟩ = R3 [x].
In fact, a much stronger property is true. In general, given a set of vectors, the
property of being linearly independent or the property of being generators of a certain
vector space are not related with each other. But, if we are in a vector space of
dimension n and we consider a set with exactly n vectors, then the two properties
are equivalent.
Proposition 4.2.6 Let V be a vector space of dimension n and let v1 , . . . , vn be n
vectors of V . The following statements are equivalent:
a) {v1 , . . . , vn } is a basis of V ;
b) v1 , . . . , vn are linearly independent;
c) v1 , . . . , vn generate V .
We now see that a basis is a very “efficient” way to represent vectors in a vector
space. Let us see an example.
Example 4.2.7 In R² we know that all the vectors are linear combinations of the
two vectors of the canonical basis e1 = (1, 0), e2 = (0, 1). We can also verify that
each vector in R² is not only a linear combination of e1 and e2 , but it is so in a unique
way. In fact, if we take the vector (2, 3), we can write (2, 3) = 2(1, 0) + 3(0, 1) and
the numbers 2 and 3 are the only scalars which give us (2, 3) as a linear combination
of (1, 0), (0, 1). But the situation is different if we take the three vectors (1, 0), (0, 1)
and (1, 1). Indeed, we already know that the vector (1, 1) is somehow “redundant”,
that is, we know that: ⟨(1, 0), (0, 1), (1, 1)⟩ = ⟨(1, 0), (0, 1)⟩, and this is because (1, 1)
is a linear combination of (1, 0) and (0, 1). This is reflected by the fact that a vector
in R² is no longer a linear combination in a unique way of these three vectors. Indeed,
for instance, (2, 3) = 2(1, 0) + 3(0, 1) + 0 ⋅ (1, 1) = 1 ⋅ (1, 0) + 2 ⋅ (0, 1) + 1 ⋅ (1, 1).
Theorem 4.2.8 Let B = {v1 , . . . , vn } be an ordered basis for the vector space V
(that is, we fixed an order in the set of vectors numbering them) and let v ∈ V . Then
there exists a unique n−tuple of scalars (α1 , . . . , αn ), such that
v = α1 v1 + ⋯ + αn vn .
Proof. Since B is a basis, v can certainly be written as a linear combination of
v1 , . . . , vn , so such scalars exist:
v = α1 v1 + ⋯ + αn vn .
Suppose now that also
v = β1 v1 + ⋯ + βn vn .
Subtracting the two expressions we obtain (α1 − β1 )v1 + ⋯ + (αn − βn )vn = 0, and
since v1 , . . . , vn are linearly independent, αi = βi for every i, which proves uniqueness.
Definition 4.2.9 The scalars (α1 , . . . , αn ) are called the components of v ∈ V in the
basis B or also the coordinates of v with respect to the basis B and will be denoted
by (v)B = (α1 , . . . , αn ).
Example 4.2.10 As an example, we prove by exercise that B = {(1, −1), (2, 0)} is
a basis of R², and we determine the coordinates of v = (−3, 1) with respect to this
basis.
Clearly the vectors in B are linearly independent as (2, 0) is not a multiple of
(1, −1). At this point, as R² has dimension 2, we already know that B is a basis. But
let us find the coordinates anyway: we look for scalars α1 , α2 such that α1 (1, −1) +
α2 (2, 0) = (−3, 1), that is α1 + 2α2 = −3 and −α1 = 1. Hence α1 = −1, α2 = −1,
and (v)B = (−1, −1).
We observe that, if we have a matrix A ∈ Mm,n (R), we can consider its rows
as vectors of Rⁿ; such vectors will be called row vectors of A. For example, if
A = ⎛ 0  1 −3 ⎞ , its row vectors are R1 = (0, 1, −3) and R2 = (2, −1, 1).
    ⎝ 2 −1  1 ⎠
Proposition 4.3.1 Given a matrix A ∈ Mm,n (R), the elementary row operations do
not change the subspace of Rⁿ generated by the row vectors of A.
Proof. Recall that the elementary row operations are:
(a) exchanging two rows;
(b) multiplying a row by a nonzero real number;
(c) replacing the i-th row with the sum of the i-th row and the j-th row multiplied by
any real number α.
It is immediate to verify that the statement is true for the operations of type
(a) and (b). For operations of type (c), it is sufficient to show that if Ri and Rj
are two row vectors of A and α ∈ R, we have that ⟨Ri , Rj + αRi ⟩ = ⟨Ri , Rj ⟩. We
obviously have that Ri , Rj + αRi ∈ ⟨Ri , Rj ⟩, so ⟨Ri , Rj ⟩ is a subspace of Rⁿ containing
{Ri , Rj + αRi }, and therefore ⟨Ri , Rj + αRi ⟩ ⊆ ⟨Ri , Rj ⟩ by Proposition 3.1.5. Since
Rj = (Rj + αRi ) − αRi , the same argument gives the opposite inclusion, so the two
subspaces coincide.
Observation 4.3.2 The elementary row operations do not change the subspace
of Rⁿ generated by the row vectors of A, but they do change the subspace of Rᵐ
generated by the column vectors of A. We invite the reader to verify this fact with
generated by the column vectors of A. We invite the reader to verify this fact with
an example in order to convince himself.
Proposition 4.3.3 If a matrix A is row echelon, its nonzero row vectors are linearly
independent.
Proof. Let R1 , . . . , Rk be the nonzero rows of A, and let a1j1 , . . . , akjk be the cor-
responding pivots. Now let λ1 R1 + ⋅ ⋅ ⋅ + λk Rk = 0, and we want to prove that
λ1 = λ2 = ⋅ ⋅ ⋅ = λk = 0. In the vector λ1 R1 + ⋅ ⋅ ⋅ + λk Rk , the element in the position
j1 is λ1 a1j1 , the element in the position j2 is λ1 a1j2 + λ2 a2j2 , and so on, until we reach
the element in position jk , which is λ1 a1jk + λ2 a2jk + ⋅ ⋅ ⋅ + λk akjk . So, from the fact
that λ1 R1 + ⋅ ⋅ ⋅ + λk Rk = 0, it follows that:
⎧ λ1 a1j1 = 0
⎪ λ1 a1j2 + λ2 a2j2 = 0
⎨ ⋮
⎩ λ1 a1jk + λ2 a2jk + ⋅ ⋅ ⋅ + λk akjk = 0.
From the first equation, since the pivot a1j1 ≠ 0, it follows that λ1 = 0; substituting
into the second equation, since a2j2 ≠ 0, we get λ2 = 0, and proceeding in the same
way we obtain λ3 = ⋯ = λk = 0, as we wanted.
Let us now see an example of how these propositions can be applied in the exer-
cises.
Example 4.3.4 Given the following vectors of R³:
v1 = (2, 3, −1),    v2 = (0, −1, 3),    v3 = (2, 2, 2),
we want to establish if they are linearly independent and if they are a basis of R³.
Moreover, we want to find a basis of the subspace W generated by them and to
calculate the dimension of W .
The matrix:
⎛2 3 −1⎞
⎜0 −1 3 ⎟
A=⎜ ⎟
⎝2 2 2⎠
can be reduced to the following row echelon form:
⎛1 0 4 ⎞
⎜0 1 −3⎟
⎜ ⎟.
⎝0 0 0 ⎠
W = ⟨(2, 3, −1), (0, −1, 3), (2, 2, 2)⟩ = ⟨(1, 0, 4), (0, −1, 3)⟩.
Therefore, we have that the subspace W is generated by the two vectors u1 = (1, 0, 4)
and u2 = (0, −1, 3), which are linearly independent by Proposition 4.3.3. Therefore,
{u1 , u2 } is a basis for W , which consequently has dimension 2. Since by Theorem
4.1.4, the number of vectors in a basis is the maximum number of linearly independent
vectors in the vector space, we have that v1 , v2 , v3 are linearly dependent, therefore
they cannot be a basis of R³.
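The whole computation of Example 4.3.4 can be reproduced with a computer algebra system; the sketch below assumes Python with SymPy (not part of the text):

import sympy as sp

A = sp.Matrix([[2, 3, -1],
               [0, -1, 3],
               [2, 2, 2]])     # rows: v1, v2, v3
R, pivots = A.rref()           # reduced row echelon form
print(R)                       # Matrix([[1, 0, 4], [0, 1, -3], [0, 0, 0]])
print(A.rank())                # 2: dim W = 2, so v1, v2, v3 are dependent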
Let us now see in general how to proceed to find a basis of the subspace W gener-
ated by vectors v1 , . . . , vk ∈ Rⁿ and how to decide if they are linearly independent,
that is, we formalize what we learned from the previous example.
just add the vector (0, 0, 1), or any vector of the type (0, 0, h) with h ≠ 0.
Given a vector space V of dimension n with a fixed ordered basis B = {v1 , . . . , vn },
consider the map c ∶ V ⟶ Rⁿ which
associates to every vector its coordinates with respect to the basis B. If we write
v ∈ V as a linear combination of the elements of B, v = α1 v1 + ⋯ + αn vn , we have
c(v) = (v)B = (α1 , . . . , αn ). By Theorem 4.2.8, c is a bijection. If v has coordinates
(α1 , . . . , αn ) and λ ∈ R is a scalar, then the coordinates of λv are (λα1 , . . . , λαn ),
and if w is another vector, whose coordinates are (β1 , . . . , βn ), then the coordinates
of v + w are c(v + w) = (α1 + β1 , . . . , αn + βn ). The student is invited to verify these
assertions, that we will recall later on, when we talk about isomorphisms of vector
spaces. For now, we are just content to note that, thanks to these properties, questions
about linear independence and generation in any finitely generated vector space can
be answered by working with the coordinate vectors. For example, given the following
polynomials of R2 [x]:
p1 = 2x² + 3x − 1,    p2 = −x + 3,    p3 = 2x² + 2x + 2,
we want to establish if they are linearly independent and determine a basis of the
subspace W they generate. Consider the canonical basis C = {x², x, 1} of R2 [x]. With
respect to this basis the coordinates of the given polynomials are (2, 3, −1), (0, −1, 3)
and (2, 2, 2), that is, exactly the rows of the matrix A of Example 4.3.4.
We can then proceed as in Example 4.3.4, making exactly the same calculations, and
we obtain a basis of W given by the polynomials whose coordinates are u1 = (1, 0, 4)
and u2 = (0, −1, 3). Returning to polynomials, a basis of W is {x² + 4, −x + 3}.
4.4.1 Let W be the subspace of R⁴ generated by the vectors v1 = (1 − k, 1, 1, −k),
v2 = (2, 2 − k, 2, 0), v3 = (1, 1, 1 − k, k). Determine, as the parameter k varies, the
dimension and a basis of W , and for one value of k for which W ≠ R⁴ complete the
basis found to a basis of R⁴.
Solution. We write the matrix A having the given vectors as rows,
     ⎛ 1 − k    1      1    −k ⎞
A =  ⎜   2    2 − k    2     0 ⎟ ,
     ⎝   1      1    1 − k   k ⎠
and we reduce it to row echelon form with the Gaussian algorithm:
      ⎛ 1   1    1 − k        k      ⎞
A′ =  ⎜ 0  −k      2k       −2k      ⎟ .
      ⎝ 0   0   4k − k²   −4k + k²   ⎠
If k ≠ 0 and 4k − k² ≠ 0, that is, if k ≠ 0 and k ≠ 4, the matrix A′ has 3 nonzero
rows, which are linearly independent, therefore W has dimension 3 and the three
nonzero rows of A′ form a basis of W .
If k = 0, the matrix A′ has only one nonzero row, so W = ⟨(1, 1, 1, 0)⟩ has
dimension 1.
If k = 4 we get:
⎛1 1 −3 4 ⎞
A =⎜ ⎜0 −4 8 −8⎟
′
⎟,
⎝0 0 0 0⎠
so A has 2 nonzero rows and W = ⟨(1, 1, −3, 4), (0, −4, 8, −8)⟩ has dimension 2.
′
We now choose k = 4. To complete {(1, 1, −3, 4), (0, −4, 8, −8)} to a basis of R⁴
we have to add 2 row vectors having the pivots in the “missing steps”, i.e. in the
third and fourth place. For example, we can add (0, 0, −1, 2), (0, 0, 0, 5).
So {(1, 1, −3, 4), (0, −4, 8, −8), (0, 0, −1, 2), (0, 0, 0, 5)} is a basis of R⁴ obtained
by completing a basis of W .
4.4.2 Let v1 = (1, 2, 1, 0), v2 = (4, 8, k, 5), v3 = (−1, −2, 3 − k, −k). Determine for
which values of k the vectors v1 , v2 , v3 are linearly independent. Set k = 1 and de-
termine, if possible, a vector w ∈ R⁴ , such that w ∉ ⟨v1 , v2 , v3 ⟩.
Solution. We write the matrix A that has the given vectors as rows, and we apply
the Gaussian algorithm to reduce it to row echelon form. We have
     ⎛  1  2    1     0 ⎞
A =  ⎜  4  8    k     5 ⎟ ,
     ⎝ −1 −2  3 − k  −k ⎠
which reduced to row echelon form becomes:
      ⎛ 1  2     0       0   ⎞
A′ =  ⎜ 0  0   k − 4     5   ⎟ .
      ⎝ 0  0     0     5 − k ⎠
If k ≠ 4 and k ≠ 5, the matrix A′ has three nonzero rows,
which are linearly independent, so W = ⟨(1, 2, 1, 0), (0, 0, k − 4, 5), (0, 0, 0, k − 5)⟩
has dimension 3. Since v1 , v2 , v3 generate W , by Proposition 4.2.6 they are linearly
independent.
If k = 4 we get:
⎛1 2 1 0⎞
A =⎜
⎜0 0 0 5⎟
′
⎟.
⎝0 0 0 1⎠
Now W = ⟨(1, 2, 1, 0), (0, 0, 0, 5), (0, 0, 0, 1)⟩ = ⟨(1, 2, 1, 0), (0, 0, 0, 1)⟩ has dimension
2, and then v1 , v2 , v3 are linearly dependent.
If k = 5 we get:
⎛1 2 1 0⎞
A =⎜ ⎜0 0 1 5⎟
′
⎟,
⎝0 0 0 0⎠
which is a row echelon form matrix with two nonzero rows, so W has dimension 2
and v1 , v2 , v3 are linearly dependent.
′
Now let k = 1. We replace this value in A to get a basis of W :
⎛1 2 1 0⎞
⎜0 0 −3 5⎟
A =⎜
′
⎟.
⎝0 0 0 4⎠
We have that W has dimension 3, and if we choose a row vector that has
the second nonzero pivot, for example w = (0, −2, 3, −1), it follows from
Proposition 4.3.3 that the vectors (1, 2, 1, 0), (0, −2, 3, −1), (0, 0, −3, 5), (0, 0, 0, 4)
are linearly independent, so by Proposition 3.2.4 we have that (0, −2, 3, −1) ∉
⟨(1, 2, 1, 0), (0, 0, −3, 5), (0, 0, 0, 4)⟩, that is w ∉ W .
4.4.3 Let W be the subspace of R⁵ generated by the set
{(1, 3, −1, 1, 2), (2, 6, −2, 4, 4)}.
Determine a basis B of W and complete it to a basis of R⁵.
Solution. We write the matrix A having the given vectors as rows,
A = ⎛ 1 3 −1 1 2 ⎞ ,
    ⎝ 2 6 −2 4 4 ⎠
which reduced to row echelon form with the Gaussian algorithm becomes:
A′ = ⎛ 1 3 −1 1 2 ⎞ .
     ⎝ 0 0  0 2 0 ⎠
The two nonzero rows of A′ are linearly independent, so B = {(1, 3, −1, 1, 2), (0, 0, 0, 2, 0)}
is a basis of W .
At this point, to complete {(1, 3, −1, 1, 2), (0, 0, 0, 2, 0)} to a basis of R⁵ , we need
to add 3 row vectors having the pivots in the “missing steps”, i.e. the second, third
and fifth places. For example, we can add (0, 1, −1, 0, 1), (0, 0, 2, 1, −3), (0, 0, 0, 0, 1).
As
As
W = ⟨(1, 3, −1, 1, 2), (2, 6, −2, 4, 4)⟩ = ⟨(1, 3, −1, 1, 2), (0, 0, 0, 2, 0)⟩,
it is easy to see that
⟨(1, 3, −1, 1, 2), (2, 6, −2, 4, 4), (0, 1, −1, 0, 1), (0, 0, 2, 1, −3), (0, 0, 0, 0, 1)⟩
= ⟨(1, 3, −1, 1, 2), (0, 0, 0, 2, 0), (0, 1, −1, 0, 1), (0, 0, 2, 1, −3), (0, 0, 0, 0, 1)⟩
= R⁵ .
So if B̃ is the set
{(1, 3, −1, 1, 2), (2, 6, −2, 4, 4), (0, 1, −1, 0, 1), (0, 0, 2, 1, −3), (0, 0, 0, 0, 1)}
the vectors of B̃ generate R⁵ , therefore, by Proposition 4.2.6, B̃ is a basis of R⁵ , and
it was obtained by completing the basis B of W .
4.4.4 Consider the following vector subspaces of M2,3 (R):
U = { ⎛ a 0 b ⎞ ∣ a, b, c, d ∈ R } ,
      ⎝ c a d ⎠
W = { ⎛ r s t ⎞ ∣ r + s + t + u + x + y = 0 } ,
      ⎝ u x y ⎠
and determine a basis of U ∩ W .
Solution. We have that:
U ∩ W = { ⎛ a 0 b ⎞ ∣ a, b, c, d ∈ R, a + b + c + a + d = 0 } ,
          ⎝ c a d ⎠
that is
U ∩ W = { ⎛ a 0 b ⎞ ∣ a, b, c, d ∈ R, d = −2a − b − c } = { ⎛ a 0     b       ⎞ ∣ a, b, c ∈ R }
          ⎝ c a d ⎠                                         ⎝ c a −2a − b − c ⎠
       = { a ⎛ 1 0  0 ⎞ + b ⎛ 0 0  1 ⎞ + c ⎛ 0 0  0 ⎞ ∣ a, b, c ∈ R }
             ⎝ 0 1 −2 ⎠     ⎝ 0 0 −1 ⎠     ⎝ 1 0 −1 ⎠
       = ⟨ ⎛ 1 0  0 ⎞ , ⎛ 0 0  1 ⎞ , ⎛ 0 0  0 ⎞ ⟩ .
           ⎝ 0 1 −2 ⎠   ⎝ 0 0 −1 ⎠   ⎝ 1 0 −1 ⎠
So the vectors v1 = ⎛ 1 0  0 ⎞ , v2 = ⎛ 0 0  1 ⎞ , v3 = ⎛ 0 0  0 ⎞ gener-
                    ⎝ 0 1 −2 ⎠        ⎝ 0 0 −1 ⎠        ⎝ 1 0 −1 ⎠
ate U ∩ W . To show that they are linearly independent, let us consider their co-
ordinates with respect to the canonical basis, that is: (v1 )C = (1, 0, 0, 0, 1, −2),
(v2 )C = (0, 0, 1, 0, 0, −1), (v3 )C = (0, 0, 0, 1, 0, −1). We observe that the matrix A
whose rows are (v1 )C , (v2 )C , (v3 )C is in row echelon form, so by Proposition 4.3.3
we have that (v1 )C , (v2 )C , (v3 )C are linearly independent. Then also v1 , v2 , v3 are
linearly independent, so they are a basis of U ∩ W .
4.5.3 Find a basis of ⟨(1, 0, 3), (2, 3, 0), (1, 1, 1)⟩ and complete it to a basis of R³.
4.5.4 Determine which of the following sets of vectors generate R³ and which of them
are a basis of R³.
i) S1 = {(0, 0, −1), (2, 0, 0), (0, −3, 0)}.
ii) S2 = {(1, 1, −1), (2, 0, 1), (1, −1, 1), (0, 6, −3)}.
iii) S3 = {(1, 3, 1), (−2, 1, −9), (0, 5, −5)}.
4.5.5 Determine if the vectors v1 = (1, 1), v2 = (−1, 1), v3 = (2, 1) are linearly
independent. Do they generate R²?
4.5.6 Determine for which values of k the vector w = (−1, k, 1) belongs to
⟨(1, −1, 0), (k, −k, 1), (−1, k², 1)⟩.
4.5.7 Determine for which values of k the vectors v1 = (1, 2k, 0), v2 = (1, 0, −3),
v3 = (−1, 0, k + 2) are linearly independent. Set k = 1 and establish if (1, −2, −6) ∈
⟨v1 , v2 , v3 ⟩.
4.5.8 Let v1 = (1, 1, 1, 0), v2 = (2, 0, 2, −1), v3 = (−1, 1, −1, 1). Determine a basis of
⟨v1 , v2 , v3 ⟩ and then complete it to a basis of R . Also determine for which values of
4
3
subspace of R and determine a basis for it.
4.5.10 Determine for which values of k the vectors v1 = (0, 1, 0), v2 = (1, k, 4), v3 =
(k, 2, 3) generate R . Set k = 0 and determine if the vector (4, −1, 6) belongs to
3
⟨v1 , v2 , v3 ⟩.
2 2
4.5.11 i) Determine for which values of k the vectors v1 = x + 2x − 1, v2 = x +
kx + 1 − k, v3 = 5x + k are linearly dependent.
ii) Choose one value of k found in point i) and write one of the 3 vectors as a linear
combination of the others. Then find a basis of ⟨v1 , v2 , v3 ⟩.
2
4.5.12 i) Determine for which values of k the vectors v1 = 6x − 6x − k, v2 =
2
−kx + kx + 6 are linearly independent.
ii) Set k = 0. Determine the dimension of ⟨v1 , v2 ⟩, and if possible, find a vector w
such that w does not belong to ⟨v1 , v2 ⟩ and {v1 , v2 , w} does not generate R2 [x].
4.5.13 Determine for which values of k the vectors v1 = (1, 2, 0), v2 = (−k, 3, 0),
v3 = (2, 0, 1) are a basis B of R . Put k = 0 and determine the coordinates of the
3
4.5.14 Determine, if possible, 4 nonzero vectors of R2 [x] that do not generate R2 [x].
Determine, if possible, two distinct subspaces of R2 [x] of dimension 2 that both
2
contain the vector x + x.
a b
4.5.15 Let X = {( ) ∈ M2 (R)∣A, b ∈ R}. Prove that X is a subspace of M2 (R)
−b a
and determine its dimension.
4.5.16 Show that B = { ⎛ −1 ⎞ , ⎛  1 ⎞ } is a basis of R² and determine the coordinates
                       ⎝  2 ⎠   ⎝ −1 ⎠
of the vectors ⎛  3 ⎞ and ⎛ 0 ⎞ with respect to this basis.
               ⎝ −1 ⎠     ⎝ 1 ⎠
ii) Choose a value of k and determine if the vector (2, k, k) belongs to ⟨v1 , v2 , v3 ⟩.
2 2
4.5.18 i) Determine for which values of k the vectors v1 = x + 2x + 2, v2 = −x +
2kx + k − 1, v3 = kx + (2k + 4)x + 3k are linearly independent.
2
ii) Set k = 0 and determine, if possible, a vector w that does not belong to ⟨v1 , v2 , v3 ⟩.
4.5.22 Prove that a nonzero vector space of finite dimension has infinitely many
bases.
We write here, for the interested reader, the proof of the Completion Theorem. We
start with a technical lemma.
Lemma 4.6.1 Let {w1 , . . . , wn } be a basis of the vector space V and let v ∈ V with
v = λ1 w1 + ⋅ ⋅ ⋅ + λk wk + ⋅ ⋅ ⋅ + λn wn .
If λk ≠ 0, then the set B′ = {w1 , . . . , wk−1 , v, wk+1 , . . . , wn }, obtained by replacing
wk with v, is also a basis of V .
Proof. We show that the vectors of B′ are linearly independent. Suppose that
β1 w1 + ⋅ ⋅ ⋅ + βk v + ⋅ ⋅ ⋅ + βn wn = 0.
Substituting the expression of v and collecting terms, we obtain a linear combination
of w1 , . . . , wn equal to zero, in which the coefficient of wk is βk λk and the coefficient
of wi , for i ≠ k, is βi + βk λi .
Since the wi are linearly independent, we have that all coefficients must be zero. In
particular, it must happen that βk λk = 0 and βi + βk λi = 0 for every i ≠ k. From the
first equality, being λk ≠ 0, it follows that βk = 0, and by substituting in the others
we get βi = 0 for every i ≠ k. So all βi are zero, which shows that the vectors of B′
are linearly independent.
We can now prove the Completion Theorem (Theorem 4.2.1): if {w1 , . . . , wn } is a
basis of V and v1 , . . . , vm ∈ V are linearly independent vectors, then m ≤ n and
{v1 , . . . , vm } can be completed to a basis of V by adding to it n − m suitably chosen
vectors wk .
Proof. Since {w1 , . . . , wn } is a basis of V , we can write
v1 = α1 w1 + ⋅ ⋅ ⋅ + αn wn ,
where not all the coefficients are zero. Possibly rearranging the wk , we can assume
α1 ≠ 0, and then by Lemma 4.6.1 we have that {v1 , w2 , . . . , wn } is a basis of V .
Now we consider v2 and the basis {v1 , w2 , . . . ,wn }. We can write v2 in the form:
v2 = β1 v1 + β2 w2 + ⋅ ⋅ ⋅ + βn wn ,
where at least one of the βj with j ≥ 2 is not zero, otherwise v2 would be a multiple of
v1 contradicting the hypothesis that the vectors v1 , . . . , vm are linearly independent.
It is not restrictive to assume β2 ≠ 0, and therefore by Lemma 4.6.1 we have that
{v1 , v2 , w3 , . . . ,wn } is basis.
We can then continue in the same way. At the i-th step we can assume that
{v1 , . . . , vi−1 , wi , . . . , wn } is a basis. We can write vi in the form:
vi = λ1 v1 + ⋅ ⋅ ⋅ + λi−1 vi−1 + λi wi + ⋅ ⋅ ⋅ + λn wn ,
where at least one of the λj with j ≥ i is not zero, otherwise we would have that
vi ∈ ⟨v1 , . . . , vi−1 ⟩, contradicting the hypothesis that the vectors v1 , . . . , vm are
linearly independent. It is not restrictive; suppose that it is λi ≠ 0, and then by
Lemma 4.6.1 we have that {v1 , . . . , vi , wi+1 ,. . . , wn } is a basis for V .
If m ≤ n, possibly rearranging the vectors wk appropriately, after m steps we
obtain that {v1 , . . . , vm , wm+1 , . . . , wn } is a basis for V , as we wanted.
If m > n, after n steps we get that {v1 , . . . , vn } is a basis of V , from which
it follows that vn+1 ∈ ⟨v1 , . . . , vn ⟩, but this contradicts the hypothesis that vectors
v1 , . . . , vm are linearly independent. So it must be m ≤ n, and this ends the proof.
CHAPTER 5
Linear Transformations
Linear transformations are functions between vector spaces that preserve their struc-
ture, i.e. they are compatible with the operations of sum of vectors and multiplication
of a vector by a scalar. As we will see, linear maps are represented very efficiently
using matrices. The purpose of this chapter is to introduce the concept of linear
transformation and understand how it is possible to uniquely associate a matrix to
n m
each linear transformation between R and R , once we fix the canonical bases in
both spaces. Then we will study the kernel, the image of a linear transformation and
the Rank Nullity Theorem, which is one of the most important results in the theory
of vector spaces of finite dimension.
On the other hand, there are other functions f ∶ R ⟶ R, that behave well
with respect to the vector space structure, i.e. they verify the equalities f (u + v) =
f (u) + f (v) and f (λv) = λf (v). Consider for example, the function f (x) = 3x. We
see immediately that f (x1 + x2 ) = 3(x1 + x2 ) = 3x1 + 3x2 = f (x1 ) + f (x2 ) and also
that f (λx1 ) = 3λx1 = λ(3x1 ) = λf (x1 ).
As we will see, those functions are linear transformations between vector spaces,
and they preserve the structure, i.e. the sum of vectors has as image, via the function,
the sum of the images of the vectors, and the image of the product of a vector by a
scalar is the product of the scalar and the image of the vector.
Before the formal definition of linear map, we give the definition of function and
image.
Definition 5.1.1 We define a function f between two sets A and B as a law which
associates to each element of A one and only one element of B and denote this law
as f ∶ A ⟶ B. The set A is called domain of the function, while the set B is
called codomain of the function. We define image of an element a ∈ A, the element
f (a) ∈ B. The set of images of all the elements of A is called image of f and is
denoted by Im(f ) or sometimes with f (A).
Not all laws that associate elements of a set to elements of another set are func-
tions. For example, we can define a law that goes from the set A of all human beings,
to the set B of all human beings (A and B can be the same set), that associates to
every person a brother. This is not a function because someone may have more than
one brother.
Another example: We consider the law that goes from the set of natural numbers,
to the set of natural numbers, that associates to every number one of its divisors.
Also this law is not a function.
Let us define linear transformations.
2. Let D ∶ R[x] ⟶ R[x] be defined by: D(p(x)) = p′(x), i.e. D is the function
that associates to each polynomial its derivative.
Given a matrix
     ⎛ a11  a12  ⋯  a1n ⎞
A =  ⎜ a21  a22  ⋯  a2n ⎟ ∈ Mm,n (R),
     ⎜  ⋮    ⋮   ⋱   ⋮  ⎟
     ⎝ am1  am2  ⋯  amn ⎠
we can associate the function LA ∶ Rⁿ → Rᵐ so defined:
LA ∶ Rⁿ → Rᵐ
⎛ x1 ⎞     ⎛ a11 x1 + a12 x2 + . . . + a1n xn ⎞
⎜ x2 ⎟  ↦  ⎜ a21 x1 + a22 x2 + . . . + a2n xn ⎟
⎜ ⋮  ⎟     ⎜                ⋮                ⎟
⎝ xn ⎠     ⎝ am1 x1 + am2 x2 + . . . + amn xn ⎠
                        ⎛ a11  a12  ⋯  a1n ⎞ ⎛ x1 ⎞       ⎛ x1 ⎞
                     =  ⎜ a21  a22  ⋯  a2n ⎟ ⎜ x2 ⎟  = A  ⎜ x2 ⎟
                        ⎜  ⋮    ⋮   ⋱   ⋮  ⎟ ⎜ ⋮  ⎟       ⎜ ⋮  ⎟
                        ⎝ am1  am2  ⋯  amn ⎠ ⎝ xn ⎠       ⎝ xn ⎠
where the product of A by the vector (x1 , . . . , xn ) is the product rows by columns
defined in Chapter 1.
In other words, for every column vector (x1 , . . . , xn ) of Rⁿ:
LA ⎛ x1 ⎞ = A ⎛ x1 ⎞ .
   ⎜ ⋮  ⎟     ⎜ ⋮  ⎟
   ⎝ xn ⎠     ⎝ xn ⎠
For example, given the matrix
A = ⎛  2 1 0 ⎞ ,
    ⎝ −1 1 3 ⎠
it follows that the linear transformation LA ∶ R³ → R² is defined by:
LA ⎛ x1 ⎞ = ⎛  2 1 0 ⎞ ⎛ x1 ⎞ = ⎛    2x1 + x2     ⎞
   ⎜ x2 ⎟   ⎝ −1 1 3 ⎠ ⎜ x2 ⎟   ⎝ −x1 + x2 + 3x3 ⎠
   ⎝ x3 ⎠              ⎝ x3 ⎠
          = x1 ⎛  2 ⎞ + x2 ⎛ 1 ⎞ + x3 ⎛ 0 ⎞ ;
               ⎝ −1 ⎠      ⎝ 1 ⎠      ⎝ 3 ⎠
in other words:
the images of the canonical basis vectors are the columns of the matrix A.
This fact will be crucial for the exercises, when we have to determine the image of a
linear transformation.
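This remark is easy to verify numerically; the following sketch (Python with NumPy, an assumption of ours) applies the matrix of the example to the canonical basis vectors of R³:

import numpy as np

A = np.array([[2, 1, 0],
              [-1, 1, 3]])
for i in range(3):
    e = np.zeros(3)
    e[i] = 1.0
    print(A @ e)   # the i-th column of A: [2 -1], [1 1], [0 3]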
We now wish to know how many possibilities we have for a linear transformation
F ∶ R ⟶ R such that F (1) = a ∈ R. We observe that, by property 1 of Definition
5.1.2 we have that F (x) = xF (1) = ax. So the only linear transformations from R to
R correspond to the straight lines passing through the origin (hence the name linear
transformation). This example is particularly instructive because it showed us that,
in order to fully understand a linear transformation from R to R, it is sufficient to
know only one value; we chose F (1), but the student can convince himself that the
value of F in any other point (as long as nonzero) would have determined F . This
Theorem 5.1.7 Let V and W be two vector spaces. If {v1 , . . . , vn } is a basis of V and
w1 , . . . , wn are arbitrary vectors of W , then there is a unique linear transformation
L ∶ V → W , such that L(v1 ) = w1 , . . . , L(vn ) = wn .
Proof. Let v ∈ V . Since {v1 , . . . , vn } is a basis, by Theorem 4.2.8 there is a unique
n-tuple of scalars (α1 , . . . , αn ) such that v = α1 v1 + ⋯ + αn vn . We define:
L(v) = α1 w1 + ⋅ ⋅ ⋅ + αn wn .
Let us verify that L is linear. If u = β1 v1 + ⋯ + βn vn is another vector of V , then
v + u = (α1 + β1 )v1 + ⋯ + (αn + βn )vn , so
L(v + u) = (α1 + β1 )w1 + ⋯ + (αn + βn )wn
         = α1 w1 + ⋯ + αn wn + β1 w1 + ⋯ + βn wn = L(v) + L(u).
Similarly, for λ ∈ R we have λv = λα1 v1 + ⋯ + λαn vn , so
L(λv) = λα1 w1 + ⋯ + λαn wn =
      = λ(α1 w1 + ⋯ + αn wn ) = λL(v).
So L is a linear transformation.
Now let us prove uniqueness. Suppose that G is a linear transformation G ∶ V →
W , such that G(v1 ) = w1 , . . . , G(vn ) = wn and L is the linear transformation defined
above. Then:
G(v) = G(α1 v1 + ⋯ + αn vn ) = α1 G(v1 ) + ⋅ ⋅ ⋅ + αn G(vn ) =
= α1 w1 + ⋅ ⋅ ⋅ + αn wn = L(v).
So G = L, as we wanted.
Corollary 5.1.8 Let V and W be two vector spaces. If two linear transformations,
T, S ∶ V → W coincide on a basis of V , then they coincide on the whole V .
R³ , i.e. e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1). We want to determine F (x, y);
carrying out the computation one finds F (x, y) = (2y, x − y, x + y).
We want to express F (x1 , . . . , xn ), that is, we want to write the image of any vector
(x1 , . . . , xn ) ∈ Rⁿ.
We proceed exactly as in the example, the reasoning is the same, only more
complicated to write.
F (x1 , . . . , xn ) = F (x1 e1 + x2 e2 + ⋅ ⋅ ⋅ + xn en )
= x1 F (e1 ) + x2 F (e2 ) + ⋅ ⋅ ⋅ + xn F (en )
= x1 (a11 e1 + a21 e2 + ⋅ ⋅ ⋅ + am1 em ) + x2 (a12 e1 + a22 e2 + ⋅ ⋅ ⋅ + am2 em )
  + ⋅ ⋅ ⋅ + xn (a1n e1 + a2n e2 + ⋅ ⋅ ⋅ + amn em )
= (a11 x1 + a12 x2 + . . . + a1n xn )e1 +
  + (a21 x1 + a22 x2 + . . . + a2n xn )e2 + . . .
  + (am1 x1 + am2 x2 + . . . + amn xn )em
  ⎛ a11 x1 + a12 x2 + . . . + a1n xn ⎞
= ⎜ a21 x1 + a22 x2 + . . . + a2n xn ⎟ .
  ⎜                ⋮                ⎟
  ⎝ am1 x1 + am2 x2 + . . . + amn xn ⎠
Let us take a step further, noting that F (x1 , . . . , xn ) can also be written in a
more compact form, using the notation of multiplication of a matrix by a vector:
F ⎛ x1 ⎞ = A ⎛ x1 ⎞ ,    where    A = ⎛ a11 . . . a1n ⎞ .
  ⎜ ⋮  ⎟     ⎜ ⋮  ⎟                   ⎜  ⋮         ⋮  ⎟
  ⎝ xn ⎠     ⎝ xn ⎠                   ⎝ am1 . . . amn ⎠
In summary, a linear transformation F ∶ Rⁿ ⟶ Rᵐ can be described equivalently:
1. by the images F (e1 ), . . . , F (en ) of the canonical basis vectors;
2. by the formula
                      ⎛ a11 x1 + a12 x2 + . . . + a1n xn ⎞
F (x1 , . . . , xn ) = ⎜ a21 x1 + a22 x2 + . . . + a2n xn ⎟ ;
                      ⎜                ⋮                ⎟
                      ⎝ am1 x1 + am2 x2 + . . . + amn xn ⎠
3. in compact form, by F (x) = A x, where
x = ⎛ x1 ⎞    and    A = ⎛ a11 . . . a1n ⎞ .
    ⎜ ⋮  ⎟               ⎜  ⋮         ⋮  ⎟
    ⎝ xn ⎠               ⎝ am1 . . . amn ⎠
In other words, to each linear transformation F ∶ Rⁿ ⟶ Rᵐ (once we fix the canonical
bases) we can associate the matrix
A = ⎛ a11 . . . a1n ⎞
    ⎜  ⋮         ⋮  ⎟
    ⎝ am1 . . . amn ⎠
and vice versa.
Observation 5.2.4 We observe that, if the linear transformation F ∶ Rⁿ → Rᵐ is
associated with the matrix A, where we fix the canonical bases in the domain and
codomain, indicating as usual with {e1 , . . . , en } the canonical basis of Rⁿ , we have
that the i-th column of the matrix A (which we denote with Aⁱ ) is F (ei ).
Example 5.3.1 Consider the functions f, g ∶ R ⟶ R given by f (x) = x² − 1 and
g(x) = x + 2. We compute f ◦ g ∶ R → R and g ◦ f ∶ R → R:
f ◦ g ∶ x ↦ g(x) = x + 2 ↦ f (x + 2) = (x + 2)² − 1 = x² + 4x + 3,
g ◦ f ∶ y ↦ f (y) = y² − 1 ↦ g(y² − 1) = (y² − 1) + 2 = y² + 1.
In this case, f ◦ g ≠ g ◦ f .
Let us now see an important example in linear algebra.
Example 5.3.2 Consider the two linear transformations LA ∶ R³ ⟶ R² , LB ∶
R² ⟶ R² associated with the matrices:
A = ⎛ −1 1 2 ⎞ ,    B = ⎛ 2 1 ⎞ ,
    ⎝  3 1 0 ⎠          ⎝ 1 3 ⎠
with respect to the canonical bases in R² and R³. We see immediately that LB ◦ LA
is defined, while LA ◦ LB is not defined. This is because LA must have as argument
a vector in R³ , while for every v ∈ R² we have that LB (v) ∈ R² , so LA (LB (v)) does
not make sense.
On the other hand, LB ◦ LA is the linear transformation associated with the product
matrix BA: for every (x1 , . . . , xn ) in the domain,
(LB ◦ LA ) ⎛ x1 ⎞ = LB (LA (x)) = B(A x) = (BA) ⎛ x1 ⎞ = LBA ⎛ x1 ⎞ .
           ⎜ ⋮  ⎟                               ⎜ ⋮  ⎟       ⎜ ⋮  ⎟
           ⎝ xn ⎠                               ⎝ xn ⎠       ⎝ xn ⎠
In this check, we used the associativity of the multiplication rows by columns between
matrices.
which are the polynomials p(x) whose image is zero, i.e. such that D(p(x)) = 0. From
calculus, we know they are all the constant polynomials. So Ker (D) = {c ∣c ∈ R}.
Let us look at the image of D. We ask which polynomials are derivatives of other
polynomials. From calculus we know they are all the polynomials (we are in fact
integrating), so Im(D) = R[x].
2. Consider now the linear transformation L ∶ R³ ⟶ R² defined by: L(e1 ) = 2e1 − e2 ,
L(e2 ) = e1 , L(e3 ) = e1 + 2e2 , that is, L(x, y, z) = (2x + y + z, −x + 2z). Its kernel is
Ker (L) = {(x, y, z) ∣ 2x + y + z = 0, −x + 2z = 0}.
As for the image, we have
Im(L) = { w ∈ R² ∣ w = ⎛ 2x ⎞ + ⎛ y ⎞ + ⎛ z  ⎞ with x, y, z ∈ R }
                       ⎝ −x ⎠   ⎝ 0 ⎠   ⎝ 2z ⎠
      = { w ∈ R² ∣ w = x ⎛  2 ⎞ + y ⎛ 1 ⎞ + z ⎛ 1 ⎞ with x, y, z ∈ R }
                         ⎝ −1 ⎠     ⎝ 0 ⎠     ⎝ 2 ⎠
      = ⟨ ⎛  2 ⎞ , ⎛ 1 ⎞ , ⎛ 1 ⎞ ⟩ .
          ⎝ −1 ⎠   ⎝ 0 ⎠   ⎝ 2 ⎠
The fact that linear maps preserve both operations of a vector space makes both
the kernel and the image of a given linear transformation vector subspaces.
Proposition 5.4.3 Let L ∶ V ⟶ W be a linear transformation.
1. The kernel of L is a subspace of the domain V .
2. The image of L is a vector subspace of the codomain W .
Proof. (1) We note first that Ker (L) is not the empty set, because 0V ∈ Ker (L) by
Proposition 5.1.4. We have then to verify that Ker (L) is closed with respect to the
sum of vectors and the multiplication of a vector by a scalar. Let us start with the sum.
Let u, v ∈ Ker (L). Then L(u) = L(v) = 0W , so L(u+v) = L(u)+L(v) = 0W +0W =
0W , thus u + v ∈ Ker (L). Now we verify that Ker (L) is closed with respect to the product
by a scalar. If α ∈ R and u ∈ Ker (L) one has L(αu) = αL(u) = α0W = 0W , so
αu ∈ Ker (L).
(2) Let us now see the same two properties for Im(L). We have that 0W ∈ Im(L)
by Proposition 5.1.4. Let now w1 , w2 ∈ Im(L). So there exist v1 , v2 ∈ V , such that
L(v1 ) = w1 and L(v2 ) = w2 . Therefore, w1 + w2 = L(v1 ) + L(v2 ) = L(v1 + v2 ) ∈
Im(L) and αw1 = αL(v1 ) = L(αv1 ) ∈ Im(L) for every α ∈ R.
Definition Let
f ∶ A ⟶ B
be a function between two sets.
1. We say that f is injective if whenever f (x) = f (y) then x = y, i.e. two distinct
elements x and y can never have the same image.
2. We say that f is surjective if every element of B is the image of some element
of A, i.e. Im(f ) = B.
Let now L ∶ V ⟶ W be a linear transformation. Then:
1. L is injective if and only if Ker (L) = 0V , that is, its kernel is the zero subspace
of the domain V .
2. L is surjective if and only if Im(L) = W , i.e. the image of L coincides with the
codomain.
Proof. (1) We show that if L is injective then Ker (L) = 0V . If u ∈ Ker (L) then L(u) =
0W = L(0V ) and since L is injective, u = 0V .
Vice versa, let Ker (L) = 0V and suppose that L(u) = L(v) for some u, v ∈ V .
Then L(u − v) = L(u) − L(v) = 0W . So u − v ∈ Ker (L) = 0V , and we have that
u − v = 0V , then u = v, therefore L is injective.
(2)This is precisely the definition of surjectivity.
The next proposition tells us that the injective linear transformations preserve
linear independence.
Proposition Let L ∶ V ⟶ W be an injective linear transformation. If v1 , . . . , vn ∈ V
are linearly independent, then L(v1 ), . . . , L(vn ) are linearly independent in W .
We now come to the central result of this chapter.
Theorem (Rank Nullity Theorem) Let L ∶ V ⟶ W be a linear transformation, with
V finitely generated. Then
dim(V ) = dim(Ker (L)) + dim(Im(L)).        (5.1)
Proof. Let {u1 , . . . , ur } be a basis for the subspace Ker L. By Theorem 4.2.1, we can
complete it to a basis B of V . Let
B = {u1 , . . . , ur , wr+1 , . . . , wn } .
If we prove that B1 = {L(wr+1 ), . . . , L(wn )} is a basis for Im(L) then the theorem
is proved, as dim(Ker (L)) = r, dim(V ) = n and dim(Im(L)) = n − r (the dimension
of Im(L) is the number of vectors in a basis and B1 contains n − r vectors).
Certainly B1 is a system of generators for Im(L), by Proposition 5.4.4. Now we
show that the vectors in B1 are linearly independent. Let
αr+1 L(wr+1 ) + ⋯ + αn L(wn ) = 0.
By linearity L(αr+1 wr+1 + ⋯ + αn wn ) = 0, so αr+1 wr+1 + ⋯ + αn wn ∈ Ker (L), and
therefore it can be written as α1 u1 + ⋯ + αr ur for suitable scalars α1 , . . . , αr . Thus
αr+1 wr+1 + ⋯ + αn wn − (α1 u1 + ⋯ + αr ur ) = 0,
and being B a basis for V this implies that α1 = . . . = αn = 0, concluding the proof
of the theorem.
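The Rank Nullity Theorem is easy to observe numerically: for a matrix A representing L, the rank gives dim Im(L) and the number of free columns gives dim Ker(L). A small illustration (Python with NumPy; the matrix is an example of ours, not taken from the text):

import numpy as np

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])     # third row = first + second
rank = np.linalg.matrix_rank(A)          # dim Im(L)
print(rank, A.shape[1] - rank, A.shape[1])  # 2 + 2 = 4 = dim of the domain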
Formula 5.1 places restrictions on the type and the existence of linear maps be-
tween two given vector spaces.
Example 5.6.2 Consider the linear map L ∶ R2 [x] ⟶ R³ defined by: L(x²) =
(1, 0, 0), L(x) = (0, 1, 0), L(1) = (0, 0, 1). This linear transformation is invertible.
To show this, we can determine the kernel and see that it is the zero subspace and de-
termine the image and see that it is all of R³. We leave this as an exercise. Alternatively,
we can define the linear transformation T ∶ R³ ⟶ R2 [x], such that T (e1 ) = x² ,
T (e2 ) = x, T (e3 ) = 1 and verify that it is the inverse of L (the student may want to
do these verifications as an exercise). Therefore R2 [x] and R³ are isomorphic. Some-
how, it is as if they were the same space, as we created a one to one correspondence
that associates to a vector in R2 [x], one and only one vector in R³, and vice versa.
This correspondence also preserves the operations of sum of vectors and multiplica-
tion of a vector by a scalar. In fact, we had already noticed that, once we fix a basis in
R2 [x], each vector is written using three coordinates, just like a vector in R³. If we
fix the canonical basis {x², x, 1}, the linear map that associates to each polynomial
its vector of coordinates is precisely an isomorphism between R2 [x] and R³.
The next theorem is particularly important, since it tells us that not only R2 [x],
N
but any vector space of finite dimension is isomorphic to R for a certain N (which
of course depends on the vector space we consider). So the calculation methods we
N
have described to solve various problems in the vector space R s can be applied to
any vector space V using, instead of the N -tuples of real numbers, the coordinates
of the vectors of the vector space V with respect to a fixed basis.
Theorem 5.6.3 Two vector spaces V and W are isomorphic if and only if they have
the same dimension.
Since we know the dimensions of the vector spaces Mm,n (R) and Rd [x], we im-
mediately get the following corollary.
Example 5.7.1 Consider the linear map F ∶ R⁴ ⟶ R² defined by: F (e1 ) = −e2 ,
F (e2 ) = 3e1 − 4e2 , F (e3 ) = −e1 , F (e4 ) = 3e1 + e2 . We want to determine a basis
for the kernel of F . We write the matrix A associated with F with respect to the
canonical bases:
A = ⎛  0  3 −1 3 ⎞ .
    ⎝ −1 −4  0 1 ⎠
Therefore F (x1 , x2 , x3 , x4 ) = (3x2 − x3 + 3x4 , −x1 − 4x2 + x4 ), and Ker F is the set
of solutions of the homogeneous linear system:
⎧ 3x2 − x3 + 3x4 = 0
⎩ −x1 − 4x2 + x4 = 0,
Solving the system (x3 and x4 are the free variables) we get
Ker F = {(−(4/3)x3 + 5x4 , (1/3)x3 − x4 , x3 , x4 ) ∣ x3 , x4 ∈ R}
      = {(−(4/3)x3 , (1/3)x3 , x3 , 0) + (5x4 , −x4 , 0, x4 ) ∣ x3 , x4 ∈ R}
      = {x3 (−4/3, 1/3, 1, 0) + x4 (5, −1, 0, 1) ∣ x3 , x4 ∈ R}
      = ⟨(−4/3, 1/3, 1, 0) , (5, −1, 0, 1)⟩ .
We observe that the vectors (−4/3, 1/3, 1, 0) , (5, −1, 0, 1) not only generate Ker F , but
they are also linearly independent, because they are not one a multiple of the other,
so they are a basis of Ker F .
Another way to understand the above equalities is the following: Ker F is the set
of linear combinations of the vectors (−4/3, 1/3, 1, 0) , (5, −1, 0, 1), obtained by placing,
respectively, first x3 = 1, x4 = 0, then x3 = 0, x4 = 1.
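SymPy can return a basis of the kernel directly; the following sketch (Python with SymPy, our assumption) recovers exactly the two vectors found above:

import sympy as sp

A = sp.Matrix([[0, 3, -1, 3],
               [-1, -4, 0, 1]])
print(A.nullspace())
# [Matrix([[-4/3], [1/3], [1], [0]]), Matrix([[5], [-1], [0], [1]])]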
Definition 5.7.3 We call row rank of a matrix M ∈ Mm,n (R) the maximum number
of linearly independent rows of M , which is the dimension of the subspace of Rⁿ
generated by the rows of M . The row rank of M is denoted by rr(M ).
If A ∈ Mm,n (R) is a matrix in echelon form the definition of row rank given
above coincides with Definition 1.3.4. In fact, by Proposition 4.3.3 the nonzero rows
of a matrix A in row echelon form are linearly independent, so the dimension of the
subspace generated by the rows of A coincides with the number of nonzero rows of
A.
4.3.1, we have that rr(A) = rr(A′) = r, and in turn r is equal to the number of
non-zero rows of A′, that is the number of pivots of A′. Then, as we saw in Chapter
1, we can assign an arbitrary value to each of the n − r variables, and write the r
variables corresponding to the pivots in terms of these values. Let xi1 , . . . , xin−r be
the free variables and let wj be the solution of the system obtained by putting xij = 1
and the other free variables equal to zero. Proceeding exactly as in Example 5.7.1,
we obtain that the vectors w1 , . . . , wn−r generate W . We now show that they are also
linearly independent, from which it follows that they are a basis of W , and so W has
dimension n − r. Suppose that w = λ1 w1 + ⋅ ⋅ ⋅ + λn−r wn−r = 0 with λ1 , . . . , λn−r ∈ R.
We observe the element of place ij of wh is 1 if h = j, otherwise it is zero, thus the
element of place ij of w is exactly λj . The hypothesis that w = 0 means that all
elements of w are zero, in particular λ1 = ⋅ ⋅ ⋅ = λn−r = 0, and this shows that the
vectors w1 , . . . , wn−r are linearly independent.
We now want to proceed and calculate a basis for the image of a linear transfor-
mation.
Suppose we have a linear map F ∶ Rⁿ ⟶ Rᵐ , and we want to determine a basis
for the image. We endow Rⁿ and Rᵐ with the canonical bases; then, by Proposition
5.2.2, we have that F (x) = Ax for a suitable matrix A ∈ Mm,n (R). By Proposition
5.4.4 we have:
Im(F ) = ⟨F (e1 ), . . . , F (en )⟩ = ⟨A1 , . . . , An ⟩,
where A1 , . . . , An are the columns of A. At this point, we simply apply the Gaussian
algorithm to the vectors that form the columns of A. Recall that, to perform the
Gaussian algorithm, we must write the vectors as rows. Let us see an example.
Example 5.7.5 Let F ∶ R³ ⟶ R⁴ be defined by F (x, y, z) = (x, 2x, x + y + z, y).
The matrix associated with F with respect to the canonical bases is:
A = ⎛ 1 0 0 ⎞ .
    ⎜ 2 0 0 ⎟
    ⎜ 1 1 1 ⎟
    ⎝ 0 1 0 ⎠
The image of F is generated by the columns of A, i.e. ImF = ⟨(1, 2, 1, 0),
(0, 0, 1, 1), (0, 0, 1, 0)⟩. So we apply the Gaussian algorithm to the matrix
Aᵀ = ⎛ 1 2 1 0 ⎞ ,
     ⎜ 0 0 1 1 ⎟
     ⎝ 0 0 1 0 ⎠
where Aᵀ denotes the transpose of the matrix A, i.e. it is the matrix which has the
columns of the matrix A as rows. Reducing Aᵀ to row echelon form, we get
⎛ 1 2 1  0 ⎞
⎜ 0 0 1  1 ⎟ .
⎝ 0 0 0 −1 ⎠
We then see that a basis for the image of F is:
{(1, 2, 1, 0), (0, 0, 1, 1), (0, 0, 0, −1)}.
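The same computation can be delegated to a computer algebra system; the sketch below (Python with SymPy, an assumption of ours) row-reduces Aᵀ. Note that the reduced echelon form gives a different, but equally valid, basis of the image:

import sympy as sp

A = sp.Matrix([[1, 0, 0],
               [2, 0, 0],
               [1, 1, 1],
               [0, 1, 0]])
R, _ = A.T.rref()
basis = [R.row(i) for i in range(R.rows) if any(R.row(i))]
print(basis)   # three nonzero rows, so dim Im(F) = 3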
5.8.1 Let Fk ∶ R⁴ → R³ be the linear transformation defined by: Fk (e1 ) = e1 + 2e2 +
ke3 , Fk (e2 ) = ke2 + ke3 , Fk (e3 ) = ke1 + ke2 + 6e3 , Fk (e4 ) = ke1 + (6 − k)e3 .
a) Determine for which values of k we have that Fk is injective and for which
values of k we have that Fk is surjective.
b) Having chosen a value of k for which Fk is not surjective, determine a vector
v ∈ R³ such that v ∉ ImFk .
Solution. By Proposition 5.5.2 Fk is never injective. We now study surjectivity. The
matrix associated with Fk with respect to the canonical bases in the domain and in
the codomain is:
⎛1 0 k k ⎞
A=⎜ ⎜ 2 k k 0 ⎟ ⎟
⎝k k 6 6 − k ⎠
(see Observation 5.2.4). The image of Fk is the subspace generated by the columns
T
of A. We write these columns as rows (i.e. we consider the matrix A , the transpose
of A), and we perform the Gaussian algorithm. We obtain:
A^T = ⎛1 2 k ⎞
      ⎜0 k k ⎟
      ⎜k k 6 ⎟
      ⎝k 0 6 − k⎠
→ ⎛1 2 k ⎞
  ⎜0 k k ⎟
  ⎜0 0 k^2 − k − 6⎟
  ⎝0 0 0 ⎠
If k ≠ 0, k ≠ −2 and k ≠ 3, this matrix has three nonzero rows, so ImFk has dimension
3 and Fk is surjective.
If k = 0, after the exchange of the second with the third row we get:
⎛1 2 0 ⎞
⎜0 0 −6⎟
⎜0 0 0 ⎟
⎝0 0 0 ⎠
5.8.2 Determine, if possible, a linear transformation G ∶ R^4 → R^3 such that ImG = ⟨(1, 1, 0), (0, 3, −1), (3, 0, 1)⟩ and Ker G = ⟨(−1, 0, 1, −3)⟩.
Solution. Let us see, first of all, if the requests made are compatible with the
Rank Nullity Theorem. We first want to find a basis of ⟨(1, 1, 0), (0, 3, −1), (3, 0, 1)⟩.
To do this, we reduce the matrix to row echelon form:
⎛1 1 0 ⎞
⎜0 3 −1⎟
⎜ ⎟
⎝3 0 1 ⎠
and we get
⎛1 1 0 ⎞
⎜
⎜0 3 −1⎟
⎟.
⎝0 0 0 ⎠
So we would have that ImG has dimension 2 and Ker G has dimension 1, but dim R^4 = 4 ≠ 1 + 2 = dim(Ker G) + dim(ImG), consequently a linear transformation with the
required properties cannot exist.
5.8.3 Determine, if possible, a linear transformation F ∶ R^3 → R^3 such that Ker F = ⟨e1 − 2e3⟩ and ImF = ⟨2e1 + e2 − e3, e1 − e2 − e3⟩. Is this transformation unique?
Solution. We observe that the requests are compatible with the Rank Nullity The-
orem. In fact, ⟨e1 − 2e3 ⟩ has a dimension of 1, ⟨2e1 + e2 − e3 , e1 − e2 − e3 ⟩
has dimension 2 (because the two vectors are not one multiple of the other) and
dim R^3 = 3 = 1 + 2 = dim(Ker F) + dim(ImF).
Let us now try to determine the matrix A associated with such F with respect
to the canonical basis (in the domain and in the codomain). It must happen that
F (e1 − 2e3 ) = 0, i.e. F (e1 ) − 2F (e3 ) = 0 (because F is linear), so F (e1 ) = 2F (e3 ).
Since F (e1 ) is represented by the first column of A, and F (e3 ) by third column, we
have that the first column of A must be twice the third. Furthermore, the subspace
generated by the columns of A, that is ImF must be equal to ⟨(2, 1, −1), (1, −1, −1)⟩.
For example, the matrix:
⎛2 2 1⎞
A = ⎜−2 1 −1⎟
⎜ ⎟
⎝−2 −1 −1⎠
meets all of the requirements. In fact by construction e1 − 2e3 ∈ Ker F and ImF =
⟨2e1 + e2 − e3 , e1 − e2 − e3 ⟩; also from the Rank Nullity Theorem, we get that
dim(Ker F) = dim R^3 − dim(ImF) = 3 − 2 = 1, thus Ker F = ⟨e1 − 2e3⟩.
This F is not unique; in fact, it can be verified that for example also the matrix
⎛6 2 3⎞
A=⎜
⎜−6 1 −3⎟
⎟
⎝−6 −1 −3⎠
meets all the given requirements.
5.8.4 Let Gk ∶ R^2 → R^3 be the linear transformation defined by:
Gk(x, y) = (kx + 5y, 2x + (k + 3)y, (2k − 2)x + (7 − k)y). Determine for which values of k we have that Gk is injective.
Solution. To determine if Gk is injective we need to understand if the kernel of Gk
contains only the zero vector or not. By Corollary 5.2.3 the matrix associated with
Gk with respect to the canonical bases in the domain and in the codomain is:
⎛ k 5 ⎞
A=⎜
⎜ 2 k + 3⎟⎟.
⎝2k − 2 7 − k ⎠
To find the kernel of Gk we need to solve the homogeneous linear system associated
with A. Reducing the matrix (A∣0) to row echelon form, we get:
(A′∣0) = ⎛1 (k + 3)/2 0⎞
         ⎜0 −k^2 − 3k + 10 0⎟
         ⎝0 0 0⎠
If k ≠ −5 and k ≠ 2, then rr(A′∣0) = rr(A′) = 2. Since the unknowns are 2, the system has only one solution, the null one, therefore Ker Gk = {(0, 0)} and Gk is injective. If k = −5 or k = 2, we easily see that rr(A′∣0) = rr(A′) = 1, then the system admits infinitely many solutions that depend on one parameter, hence Gk is not injective.
Alternatively, by the Rank Nullity Theorem, Gk is injective if and only if the image of Gk has dimension 2. We then calculate a basis for the image of Gk.
We must consider the matrix
(k 2 2k − 2)
(5 k + 3 7 − k)
and reduce it to row echelon form. If k ≠ −5 and k ≠ 2, there are two nonzero rows, so the image of Gk has dimension
2 and Gk is injective. If k = −5 or k = 2 there is only one nonzero row, so the image
of Gk has dimension 1 and Gk is not injective. We have thus found once again the
result obtained with the previous method.
5.9.3 Given linear transformations F ∶ R^3 → R^2 and G ∶ R^2 → R^3 defined by: F(x, y, z) = (x − y, 2x + y + z) and G(x, y) = (3y, −x, 4x + 2y), determine, if possible, F ◦ G and G ◦ F.
5.9.4 Given the linear transformations F ∶ R^2 → R^2 and G ∶ R^2 → R defined by: F(e1) = −e1 − e2, F(e2) = e1 + e2, G(e1) = 2, G(e2) = −1; determine, if possible, F ◦ G and G ◦ F.
5.9.5 Consider the linear transformation F ∶ R^2 ⟶ R^2 defined by F(e1) = 3e1 − 3e2, F(e2) = 2e1 − 2e2. Compute a basis of the kernel and a basis for the image of F.
5.9.6 Establish which of the following linear transformations are isomorphisms:
i) F ∶ R^3 ⟶ R^3 defined by F(x, y, z) = (x + 2z, y + z, z);
iii) F ∶ R^3 → R^3 defined by F(e1) = 2e1 + e2, F(e2) = 3e1 − e3, F(e3) = e1 − e2 − e3.
5.9.7 Find a basis for the kernel and one for the image of each of the following linear transformations. Establish if they are injective, surjective and/or bijective.
i) F ∶ R^3 ⟶ R^3 defined by F(x, y, z) = (x − z, x + 2y − z, x − 4y − z).
iv) F ∶ R^3 ⟶ R^3 associated with A, with respect to the canonical basis, where
A = ⎛−1 1 −1⎞
    ⎜ 0 0 1 ⎟
    ⎝ 1 0 1 ⎠
x1 − x3 − x4 x1 + 2x3
F (x1 , x2 , x3 , x4 ) = ( ),
x1 + x4 0
F (e2 ) = 2e1 + 2e2 + 2e3 , F (e3 ) = e1 + e2 + e3 ; find a basis for Ker F and establish
if the vector e1 − e2 + e3 belongs to ImF .
5.9.10 Determine, if possible, a surjective linear transformation T ∶ R^3 → R^2 and an injective linear transformation F ∶ R^4 → R^3.
5.9.11 Are there injective transformations T ∶ R^3 → R^4? If yes, determine one; if not, give reasons for the answer.
5.9.12 Let Tk ∶ R^2 ⟶ R^2 be the linear transformation associated with the matrix A with respect to the canonical basis, where
A = (1 4)
    (k 0)
⎛1 2 0⎞
A=⎜
⎜1 0 1⎟
⎟.
⎝2 k 3⎠
2z, y, 2x + 3y + 4z, 3x − y + 6z). Find a basis for Ker (T ) and a basis for Im(T ) and
their dimensions. Is T injective?
5.9.15 Determine, if possible, a linear transformation F ∶ R^3 → R^2 such that Ker F = ⟨e1⟩ and ImF = ⟨e1 − e2⟩.
5.9.16 Determine, if possible, a linear transformation F ∶ R^2 → R^3 such that Ker F = ⟨e2⟩ and ImF = ⟨e1 − e2 + 2e3⟩.
5.9.17 Determine, if possible, a linear transformation F ∶ R^2 → R^4 such that ImF has dimension 1.
5.9.18 Determine, if possible, a linear transformation F ∶ R^3 → R^2 such that (1, 1) ∉ ImF.
5.9.19 Let F ∶ R^3 → R^2 be the linear transformation defined by: F(e1) = 2e1 + ke2, F(e2) = ke1 + 2e2, F(e3) = −2e1 − ke2. Determine for which values of k we have that F is not surjective. For k = −1, determine a vector v1 that belongs to Ker F and a vector v2 which does not belong to Ker F.
4e2 + ke3 , Fk (e2 ) = −3e1 + ke2 + 3e3 , determine for which values of k we have that
Fk is injective and for which values of k we have that Fk is surjective.
CHAPTER 6
Linear Systems
In this chapter, we want to revisit the theory of linear systems and interpret the
results already discussed in Chapter 1 in terms of linear transformations, using the
knowledge we have gained in Chapter 5. We will use the notation and terminology
introduced in Chapter 1.
6.1 PREIMAGE
The inverse image or preimage of a vector w ∈ W under a linear map f ∶ V ⟶ W consists of all the vectors in the vector space V whose image is w. It is a basic concept in mathematics; for us, it will be a useful tool for expressing the solutions of a linear system.
We already know an example of inverse image, namely the kernel of a linear transformation F. In fact, Ker(F) is the inverse image of the zero vector of W, i.e. it consists of all vectors whose image under F is 0_W.
Let us look at the definition.
Notice that the notation F^{−1}(w), introduced in the previous definition, does not have anything to do with the invertibility of the function. When we speak of the inverse image of a vector under a function F, we are not saying that F is an invertible function: the notation F^{−1}(w) simply indicates a subset of the domain.
Example 6.1.2 Let F ∶ R^2 → R^3 be the linear transformation defined by:
F(x, y) = (x + y, x + y, x).
What is the inverse image of the vector (1, 1, 3) under F? By definition we have:
F^{−1}(1, 1, 3) = {(x, y) ∈ R^2 ∣ x + y = 1, x + y = 1, x = 3} = {(3, −2)}.
This means that F(3, −2) = (1, 1, 3), and there are no other elements of R^2 whose image is (1, 1, 3).
Similarly, the inverse image of the vector (1, 0, 0) is:
F^{−1}(1, 0, 0) = {(x, y) ∈ R^2 ∣ x + y = 1, x + y = 0, x = 0} = ∅.
The vector (1, 0, 0) is not the image of any vector of R^2, i.e. (1, 0, 0) ∉ ImF.
These examples show that calculating the preimage of a vector under a linear
transformation is equivalent to solving a linear system. We will now deepen our
understanding on this point.
Proposition 6.1.4 Let F ∶ V ⟶ W be a linear transformation, let w ∈ ImF and let v be an element of V such that F(v) = w. Then:
F^{−1}(w) = {v + z ∣ z ∈ Ker F}. (6.1)
Proof. Let v′ be an element of F^{−1}(w). Then
F(v′) = F(v) = w,
so
F(v′ − v) = 0_W,
i.e.:
v′ − v ∈ Ker F.
So any element v′ of F^{−1}(w) is written as v′ = v + (v′ − v) = v + z, with z ∈ Ker F. Conversely, if z ∈ Ker F, then
F(v + z) = F(v) + F(z) = w + 0_W = w.
This gives the other inclusion and we have shown the result.
Definition 6.2.1 We call column rank of a matrix A ∈ Mm,n(R), the maximum number of linearly independent columns of A, i.e. the dimension of the subspace of R^m generated by the columns of A.
The following observation is already known and yet, given its importance in the context that we are studying, we want to re-examine it.
Observation 6.2.2 If we write A as the matrix associated with the linear transformation LA ∶ R^n ⟶ R^m with respect to the canonical bases, then the column rank of A is the dimension of the image of LA. Indeed, the image is generated by the columns of the matrix A.
Although in general the row vectors and column vectors of a matrix A ∈ Mm,n (R)
are elements of different vector spaces, the row and column rank of A always coincide.
This number is simply called rank of A, denoted by rk(A).
Proposition 6.2.3 If A ∈ Mm,n (R), then the row rank of A is equal to the column
rank of A.
Proof. Let LA ∶ R^n ⟶ R^m be the linear transformation associated with the matrix A with respect to the canonical bases. The kernel of LA is the set of solutions of the homogeneous linear system associated with the matrix A and, by Proposition 5.7.4, has dimension n − rr(A), where rr(A) is the row rank of A. By the Rank Nullity Theorem 5.5.1, we also know that the dimension of Ker LA is equal to n − dim(ImLA). It follows that rr(A) = dim(ImLA), i.e. rr(A) is equal to the column rank of A.
Example 6.2.4 Let
A = (1 2 0 −1)
    (1 2 0 −1)
The row rank of A is 1. Also the column rank is 1, since the column vectors (1, 1), (2, 2), (0, 0), (−1, −1) of R^2 are all multiples of (1, 1). So rk(A) = 1.
Observation 6.2.5 If A ∈ Mm,n(R) is a matrix in row echelon form, the definition of rank coincides with Definition 1.3.4. In fact, by Proposition 4.3.3 the nonzero rows of a matrix A in row echelon form are linearly independent, hence the dimension of the subspace generated by the rows of A coincides with the number of nonzero rows of A. Computing the rank of a matrix in row echelon form is therefore immediate, while calculating the rank of a generic matrix requires more time.
Proposition 4.3.1 provides an effective method for calculating the rank of a matrix
as the elementary operations on the rows of a matrix preserve the rank, since the
subspace generated by the rows remains unchanged. Hence, to compute the rank of
a matrix, A, we reduce A in row echelon form with the Gaussian algorithm and then
we compute the rank of the reduced matrix, which simply amounts to counting the
number of nonzero rows.
⎛ 0 1 3 ⎞
⎜ 1 −1 5 ⎟
A=⎜ ⎟.
⎝ −1 1 0 ⎠
We have:
⎛ 1 −1 5 ⎞
rk(A) = rk ⎜
⎜ 0 1 3 ⎟⎟ = 3.
⎝ 0 0 5 ⎠
If the linear system Ax = b admits solutions, then its set of solutions is
S = {v + z ∣ z ∈ Ker A},
where v is a particular solution of the system and Ker A is the set of solutions of the associated homogeneous linear system Ax = 0.
Proof. This proposition is basically a rewriting of Proposition 6.1.4 with F = LA. In fact, if we view the matrix A as the matrix associated to the linear transformation LA ∶ R^n → R^m with respect to the canonical bases, then determining the solutions of the linear system Ax = b is the same as determining the vectors x ∈ R^n such that LA(x) = b. In other words, it is the same as determining the preimage LA^{−1}(b) of the vector b ∈ R^m. Since by hypothesis the system admits solutions, b belongs to the image of LA, i.e. LA(v) = Av = b for a suitable vector v of R^n. The set S of solutions of the system is then the preimage LA^{−1}(b), which by Proposition 6.1.4 is {v + z ∣ z ∈ Ker LA}.
The theorem that follows is the most important result in the theory of linear systems.
Theorem 6.2.9 (Rouché-Capelli Theorem). A linear system Ax = b of m equa-
tions in n unknowns admits solutions if and only if rk(A) = rk(A∣b). If this condition
is satisfied, then the system has:
1. exactly one solution if and only if rk(A) = rk(A∣b) = n;
2. infinitely many solutions if and only if rk(A) = rk(A∣b) < n. In this case, the
solutions of the system depend on n − rk(A) parameters.
Proof. (1). We view the matrix A as the matrix associated with the linear transformation LA ∶ R^n → R^m with respect to the canonical bases. The solutions of the linear system Ax = b correspond to the vectors x ∈ R^n such that LA(x) = b. Hence, we must determine when the preimage LA^{−1}(b) is not empty. This preimage is not empty if and only if b ∈ ImLA. In other words, the system has solutions if and only if b ∈ ImLA. As ImLA is generated by the column vectors of the matrix A, b ∈ ImLA if and only if the subspace generated by the column vectors of A coincides with the subspace generated by the column vectors of A and the column vector b, that is, if and only if rk(A) = rk(A∣b).
2. dim(Ker A) > 0, thus Ker A contains infinitely many elements, being a real vector subspace of R^n.
2. We use the Gaussian algorithm to reduce (A∣b) to a row echelon matrix in the form (A′∣b′);
3. The starting linear system Ax = b is equivalent to the row echelon linear system A′x = b′;
In this case, using subsequent substitutions, we obtain all the solutions of the
system.
The Rouché-Capelli theorem gathers all the information on linear systems that we have already seen in Chapter 1. In particular it states that:
A linear system with real coefficients which admits solutions has either one solution
or infinitely many.
This is exactly the situation that we described in Chapter 1, reinterpreted in terms of
linear transformations. In essence, a compatible linear system is a set of compatible
conditions that are assigned on n real variables. These conditions then lower the
number of degrees of freedom of the system: if the system rank is k, the set of
the solutions no longer depends on n free variables, but on n − k. What counts is
the rank of the system and not the number of equations, because the rank of the
system quantifies independent conditions and eliminates those conditions that can
be deduced from the others and so are redundant.
Here are some examples to illustrate the results shown above.
Example 6.2.11 Consider the linear transformation LA ∶ R^3 → R^3 whose matrix with respect to the canonical basis (both in the domain and in the codomain) is:
A = ⎛1 0 2⎞
    ⎜2 1 1⎟
    ⎝3 1 3⎠
We want to compute the inverse image of the vector b = (3, 0, 3), i.e. to find the vectors (x, y, z) such that
A ⎛x⎞ = ⎛3⎞
  ⎜y⎟   ⎜0⎟
  ⎝z⎠   ⎝3⎠
that is:
⎛ x + 2z      ⎞   ⎛3⎞
⎜ 2x + y + z  ⎟ = ⎜0⎟
⎝ 3x + y + 3z ⎠   ⎝3⎠
In other words, computing the inverse image of b by LA means solving the linear
system
⎧
⎪ x + 2z = 3
⎪
⎪
⎨
⎪ 2x + y + z = 0
⎪
⎪
⎩ 3x + y + 3z = 3.
To calculate the rank of matrix (A∣b) and compare it with the rank of A, we reduce
the matrix (A∣b) and, simultaneously, the matrix A in row echelon form using the
Gaussian algorithm and then we calculate the rank of the reduced matrices. We have:
(A∣b) = ⎛1 0 2 3⎞
        ⎜2 1 1 0⎟
        ⎝3 1 3 3⎠
→ ⎛1 0 2 3 ⎞
  ⎜0 1 −3 −6⎟
  ⎝0 1 −3 −6⎠
→ ⎛1 0 2 3 ⎞
  ⎜0 1 −3 −6⎟
  ⎝0 0 0 0 ⎠
So rk(A) = rk(A∣b) = 2. This means that the vector b belongs to the image of LA
and dim(Ker LA ) = 3 − 2 = 1. The inverse image of b is given by the elements: v + z,
where v is a particular solution of the system Ax = b and z ∈ Ker A, the kernel of
the matrix A, i.e. the set of solutions of the homogeneous linear system Ax = 0. To
compute v, we observe that the starting system is equivalent to the system:
x + 2z = 3
{
y − 3z = −6.
Setting z = 0, we obtain the particular solution v = (3, −6, 0). To find Ker A we solve the associated homogeneous system:
x + 2z = 0
y − 3z = 0;
we can solve from the bottom with subsequent substitutions: y = 3z, x = −2z. The inverse image of b is thus S = {(3, −6, 0) + (−2z, 3z, z) ∣ z ∈ R}.
⎛ 1 −1 3 0 1 2 ⎞
⎜
(A∣b) = ⎜ 2 1 8 −4 2 3 ⎟
⎟,
⎝ 1 2 5 −3 4 1 ⎠
(A′∣b′) = ⎛1 −1 3 0 1 2 ⎞
          ⎜0 3 2 −4 0 −1⎟
          ⎝0 0 0 1 3 0 ⎠
We have rk(A′) = rk(A′∣b′) = 3 so the system is solvable, and the solutions depend on 5 − 3 = 2 parameters.
The pivots are on the first, second and fourth column of A′, so we can obtain the unknowns x1, x2 and x4 in terms of x3 and x5.
The system associated with the row echelon form matrix (A′∣b′) is:
⎧
⎪ x1 − x2 + 3x3 + x5 = 2
⎪
⎪
⎨
⎪ 3x2 + 2x3 − 4x4 = −1
⎪
⎪
⎩ x4 + 3x5 = 0.
Solving by back substitution, we obtain the set of solutions:
{(5/3, −1/3, 0, 0, 0) + z ∣ z ∈ ⟨(−11/3, −2/3, 1, 0, 0), (−5, −4, 0, −3, 1)⟩}.
6.3.2 Determine the solutions of the following linear system in the unknown x1 , x2 ,
x3 :
⎧
⎪ x1 − x2 + x3 = 2
⎪
⎪
⎨
⎪ 2x1 − x2 + 3x3 = −1
⎪
⎪
⎩ x1 + 2x3 = 1.
Solution. The complete matrix associated with the system is:
⎛ 1 −1 1 2 ⎞
(A∣b) = ⎜
⎜ 2 −1 3 −1 ⎟
⎟,
⎝ 1 0 2 1 ⎠
Reducing it to row echelon form we obtain:
(A′∣b′) = ⎛1 −1 1 2 ⎞
          ⎜0 1 1 −5⎟
          ⎝0 0 0 4 ⎠
Thus, we have rk(A′) = 2 ≠ rk(A′∣b′) = 3. Therefore, the system does not admit solutions by the Rouché-Capelli Theorem. We note that the linear system associated with the matrix (A′∣b′) is
⎧
⎪ x1 − x2 + x3 = 2
⎪
⎪
⎨
⎪ x2 + x3 = −5
⎪
⎪
⎩ 0 = 4,
which is clearly not compatible.
6.3.3 Let S = {(1, 2, 1) + z ∣ z ∈ ⟨(1, 1, 1)⟩}. Determine, if possible:
(a) a linear system having S as set of solutions;
(b) a linear system of 3 equations having S as set of solutions;
(c) a linear system of rank 1 having S as set of solutions.
Solution. Since S is a subset of R^3, each linear system having S as set of solutions is a linear system in 3 unknowns. We indicate these unknowns with x, y, z. The set of solutions of a linear system of the form Ax = b is S = {v + z ∣ z ∈ Ker A}, where v is a particular solution of the system. So if Ax = b is a linear system having S as set of solutions, (1, 2, 1) is a solution of the system and Ker A = ⟨(1, 1, 1)⟩. In particular,
dim(ker A) = 1 = 3 − rk(A). So the system we seek has rank 2 and therefore must
necessarily consist of at least 2 equations. This immediately allows us to answer the
question (c): there is no linear system of rank 1 having S as a set of solutions.
To determine a linear system having S as a set of solutions and then answer question (a), we could write a generic linear system consisting of two equations and impose the required conditions on its coefficients. This method, which certainly works, is however not the most effective. We then choose a smarter approach.
What we want to do is to describe by equations the set of elements (x, y, z) ∈ S, that is the set of elements (x, y, z) of R^3 such that:
(x, y, z) = (1, 2, 1) + z, with z ∈ ⟨(1, 1, 1)⟩,
or, equivalently,
(x, y, z) − (1, 2, 1) = z, with z ∈ ⟨(1, 1, 1)⟩.
Note that the vector (x, y, z) − (1, 2, 1) = (x − 1, y − 2, z − 1) belongs to the subspace
⟨(1, 1, 1)⟩ if and only if it is a multiple of (1, 1, 1), i.e. if and only if
1 1 1
rk ( ) = 1.
x−1 y−2 z−1
1 1 1 1 1 1
rk ( ) = rk ( ).
x−1 y−2 z−1 0 y−1−x z−x
The rank of this matrix is equal to 1 if and only if the second row of the matrix is
null, that is if and only if
−x + y − 1 = 0
{
−x + z = 0.
We have therefore found a linear system having S as set of solutions. Naturally, every
system equivalent to the one found has S as a set of solutions. In particular, to answer
the question (b) we will have to determine a linear system of 3 equations equivalent
to the one just written. Just add an equation that is a linear combination of the two
equations found. For example:
⎧
⎪ −x + y − 1 = 0
⎪
⎪
⎨
⎪ −x + z = 0
⎪
⎪
⎩ −2x + y + z − 1 = 0.
Given the linear transformation Tk ∶ R^4 → R^3 defined by:
Tk(x1, x2, x3, x4) = (x1 − 5x2 + kx3 − kx4, x1 + kx2 + kx3 + 5x4, 2x1 − 10x2 + (k + 1)x3 − 3kx4),
determine for which values of k the vector wk = (1, k, −1) belongs to Im(Tk ). Set
k = 0 and determine the preimage of w0 under T0 .
6.4.3 Given the linear transformation T ∶ R^3 → R^3 associated with the matrix
⎛3k 3 k + 2⎞
A=⎜
⎜ 1 k k ⎟⎟,
⎝1 2 2 ⎠
6.4.4 Let S = {(1, 2, 0, 3) + z ∣ z ∈ ⟨(1, −1, 2, 1), (1, 5, −2, 5)⟩}. Determine if S is a vector subspace of R^4 and determine, if possible, a homogeneous linear system having S as set of solutions.
6.4.5 Construct, if possible, a linear transformation F ∶ R^3 → R^3 such that F^{−1}(1, 0, 0) = {(1, 0, 0) + v ∣ v ∈ ⟨(1, 1, 1), (0, 1, −1)⟩}. Establish whether such a transformation is unique.
1. a linear equation having S = (2, 1, 0, 1) + ⟨(2, 1, 2, 2), (1, −1, 2, 1)⟩ as a set of
solutions;
CHAPTER 7
Determinant and Inverse
In this chapter, we introduce two basic concepts: the determinant and the inverse of a square matrix. The importance of these two concepts will be summarized by Theorem 7.6.1, which contains essentially all we learnt about the linear maps from R^n to R^n.
⎛1 0 0⎞
I=⎜
⎜0 1 0⎟
⎟.
⎝0 0 1⎠
A = (2 0)
    (0 3),
then
det(A) = 2 det (1 0) = 2 ⋅ 3 det (1 0) = 2 ⋅ 3 det(I) = 6.
               (0 3)             (0 1)
We will see later how to exploit these properties in a suitable manner to obtain
the determinant of any matrix.
For the moment, we have defined the determinant as a function that has some
properties, however this does not guarantee that such a function exists or, if it exists,
that it is unique. The next proposition, which we do not prove, establishes such facts.
(a) If B is obtained from A by exchanging two rows, then:
det(A) = − det(B).
(b) If B is obtained from A by adding to a row any linear combination of the other
rows, then:
det(A) = det(B).
(c) If A is an upper (or lower) triangular matrix, that is, the coefficients below
(respectively above) the main diagonal are all equal to zero, then the determinant
of A is the product of the elements that are located on its main diagonal.
Proof. (a) Consider the matrix (R1 + R2, R1 + R2, R3, . . . , Rn), which has two equal rows; by property (3) of Definition 7.1.2 its determinant is zero, while by property (1):
0 = det(R1 + R2, R1 + R2, R3, . . . , Rn) =
= det(R1 + R2, R1, R3, . . . , Rn) + det(R1 + R2, R2, R3, . . . , Rn) =
= det(R1, R1, R3, . . . , Rn) + det(R1, R2, R3, . . . , Rn) + det(R2, R1, R3, . . . , Rn) + det(R2, R2, R3, . . . , Rn) =
= det(R1, R2, R3, . . . , Rn) + det(R2, R1, R3, . . . , Rn),
where at the last step we used again property (3). Then
det(R1 , R2 , R3 , . . . , Rn ) + det(R2 , R1 , R3 , . . . , Rn ) = 0,
from which (a) follows, relatively to the first two rows. It is clear that this can be
repeated, in an identical way, for two generic rows.
(b) Let A = (R1, R2, . . . , Rn) and B = (R1 + λ2 R2 + ⋅ ⋅ ⋅ + λn Rn, R2, . . . , Rn). Then, by properties (1) and (2) of Definition 7.1.2, we have:
det(B) = det(R1, R2, . . . , Rn) + λ2 det(R2, R2, . . . , Rn) + ⋅ ⋅ ⋅ + λn det(Rn, R2, . . . , Rn) = det(R1, R2, . . . , Rn) = det(A),
since every summand other than the first is the determinant of a matrix with two equal rows, hence zero.
(c) Consider first a diagonal matrix with diagonal coefficients d1, . . . , dn. By property (2) of Definition 7.1.2, applied to each row, and since det(I) = 1, we have:
det ⎛d1 0 . . . 0⎞
    ⎜0 d2 . . . 0⎟
    ⎜⋮        ⋮⎟
    ⎝0 . . . 0 dn⎠ = d1 d2 ⋯ dn.
A lower triangular matrix with nonzero diagonal coefficients can be reduced to this diagonal form by elementary operations of type (b), which do not change the determinant.
In the case when one or more coefficients on the diagonal are equal to zero, we cannot
obtain a diagonal matrix, however, it is easy to see, by applying the Gaussian algo-
rithm, that we obtain a matrix in which a row consists of all zeros and consequently
the determinant is zero. We leave to the reader the details of this case. The reasoning
for an upper triangular matrix is similar.
Given a square matrix A, using elementary row operations, we can always reduce
A to a triangular form and then calculate the determinant using the properties seen
above. One must pay attention to the fact that the elementary row operations may
change the determinant: if we exchange two rows, we must remember that the deter-
minant changes sign; if we multiply a row by a scalar, the determinant is multiplied
by the same scalar; while finally, if we add to a row a linear combination of the others,
the determinant does not change.
We see an explicit example, although the method that we describe is not the most
efficient for the calculation of the determinant in general.
A = ⎛0 2 4 6⎞
    ⎜1 1 2 1⎟
    ⎜1 1 2 0⎟
    ⎝1 1 1 2⎠
We want to bring this matrix into triangular form, using the Gaussian algorithm, but
we must take into account all the exchanges and all the multiplications by a scalar
we make.
We exchange the first row with the second, in this case by Proposition 7.1.4 (a)
the determinant changes sign and the matrix becomes:
⎛1 1 2 1⎞
⎜0 2 4 6⎟
⎜1 1 2 0⎟
⎝1 1 1 2⎠
Now we perform the following elementary operations, so as to set to zero all the coefficients in the first column:
3rd row → 3rd row − 1st row,
4th row → 4th row − 1st row.
In this way, the determinant does not change, by Proposition 7.1.4 (b), and we obtain:
⎛1 1 2 1 ⎞
⎜0 2 4 6 ⎟
⎜0 0 0 −1⎟
⎝0 0 −1 1 ⎠
Then, we exchange the third with the fourth row; by Proposition 7.1.4 (a) the determinant changes sign and the matrix becomes:
⎛1 1 2 1 ⎞
⎜0 2 4 6 ⎟
⎜0 0 −1 1 ⎟
⎝0 0 0 −1⎠
Now we can use property (c) of the previous proposition, which tells us that the
determinant of a triangular matrix is the product of the coefficients on the diagonal.
det ⎛1 1 2 1 ⎞
    ⎜0 2 4 6 ⎟ = 1 ⋅ 2 ⋅ (−1) ⋅ (−1) = 2.
    ⎜0 0 −1 1 ⎟
    ⎝0 0 0 −1⎠
To get the correct result, we must then multiply the determinant just obtained by (−1) as many times as the row swaps we performed (in this case two), so:
det(A) = 2 ⋅ (−1) ⋅ (−1) = 2.
In this way, by Proposition 7.1.4 (b), the determinant does not change and we obtain the triangular matrix:
(a11 a12)
(0 a22 − (a21/a11) a12).
Now we simply take the product of the diagonal coefficients and we have that:
det(A) = a11 (a22 − (a21/a11) a12) = a11 a22 − a12 a21.
Case 2. Suppose a11 = 0. We exchange the first and the second row; the determinant changes sign, and we get:
(a21 a22)
(0 a12).
since a11 = 0.
In both cases, then, the following formula holds:
det (a11 a12) = a11 a22 − a12 a21.
    (a21 a22)
For example:
det (1 2) = 1 ⋅ 4 − 2 ⋅ 3 = −2.
    (3 4)
Proceeding in a similar manner as in the 2×2 case (obviously the Gaussian algorithm
requires a greater number of steps), it is possible to show that the determinant is:
det(A) = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 − a12 a21 a33 − a11 a23 a32 .
Let us see a mnemonic aid to remember the formula. We rewrite the first two columns
of A next to A to the right
Definition 7.3.1 Let A ∈ Mn×n(R) be a square matrix of order n (i.e. with n rows and n columns). We denote by Ai j the square submatrix of A obtained by deleting the i-th row and the j-th column of A. Ai j is called a minor of A of order n − 1.
Example 7.3.2 If
A = ⎛−1 0 4 7 ⎞
    ⎜ 3 −1 2 1 ⎟
    ⎜ 3 6 −1 0 ⎟
    ⎝−2 −1 0 −1⎠
we have
A1 1 = ⎛−1 2 1 ⎞
       ⎜ 6 −1 0 ⎟
       ⎝−1 0 −1⎠ ,
A2 3 = ⎛−1 0 7 ⎞
       ⎜ 3 6 0 ⎟
       ⎝−2 −1 −1⎠ ,
A4 2 = ⎛−1 4 7⎞
       ⎜ 3 2 1⎟
       ⎝ 3 −1 0⎠ .
• If A has order 1, i.e. A = (a1 1) has one row and one column, we set
det(A) = a1 1.
• Suppose now that we know how to compute the determinant of matrices of order n − 1. Let
Γi j = (−1)^{i+j} det(Ai j),
then
det(A) = a1 1 Γ1 1 + a1 2 Γ1 2 + . . . + a1 n Γ1 n = ∑_{k=1}^{n} a1 k Γ1 k.
This is the method for the calculation of the determinant by expanding along the
first row. Let us see how it works in practice, for 2 × 2 and 3 × 3 matrices: we find
the results seen before.
In fact, we see at once that if
A = (a b)
    (c d)
is a 2 × 2 matrix, we have
det(A) = aΓ1 1 + bΓ1 2 = ad − bc.
For example:
det (2 3) = 2 ⋅ 4 − 3 ⋅ 1 = 8 − 3 = 5.
    (1 4)
Let us see now the case of 3 × 3 matrices. Let
A = ⎛a1 1 a1 2 a1 3⎞
    ⎜a2 1 a2 2 a2 3⎟
    ⎝a3 1 a3 2 a3 3⎠ .
We have, by definition:
Γ1 1 = (−1)^{1+1} det A1 1 = det (a2 2 a2 3)
                                 (a3 2 a3 3) ,
Γ1 2 = (−1)^{1+2} det A1 2 = − det (a2 1 a2 3)
                                   (a3 1 a3 3) ,
Γ1 3 = (−1)^{1+3} det A1 3 = det (a2 1 a2 2)
                                 (a3 1 a3 2) .
Hence
det(A) = a1 1 Γ1 1 + a1 2 Γ1 2 + a1 3 Γ1 3 =
= a1 1 (a2 2 a3 3 − a2 3 a3 2) − a1 2 (a2 1 a3 3 − a2 3 a3 1) + a1 3 (a2 1 a3 2 − a2 2 a3 1) =
= a1 1 a2 2 a3 3 + a1 2 a2 3 a3 1 + a1 3 a2 1 a3 2 − a1 1 a2 3 a3 2 − a1 2 a2 1 a3 3 − a1 3 a2 2 a3 1,
as we saw earlier.
It is possible to expand the determinant according to the r-th row:
det(A) = ar 1 Γr 1 + ar 2 Γr 2 + . . . + ar n Γr n = ∑_{k=1}^{n} ar k Γr k.
For example, consider the matrix
A = ⎛1 3 0 0 ⎞
    ⎜2 0 0 −1⎟
    ⎜0 −3 5 0 ⎟
    ⎝1 0 −1 0 ⎠ .
Expanding det(A) according to the third row, we have det(A) = −3 Γ3 2 + 5 Γ3 3. Now
Γ3 2 = (−1)^{3+2} det ⎛1 0 0 ⎞ = −(−1) = 1,
                      ⎜2 0 −1⎟
                      ⎝1 −1 0 ⎠
Γ3 3 = (−1)^{3+3} det ⎛1 3 0 ⎞ = +(−3) = −3,
                      ⎜2 0 −1⎟
                      ⎝1 0 0 ⎠
then det(A) = −3 ⋅ 1 + 5(−3) = −18.
Let us now instead expand det(A) according to the second column: det(A) = 3 Γ1 2 − 3 Γ3 2. Now
Γ1 2 = (−1)^{1+2} det ⎛2 0 −1⎞ = −(+5) = −5,
                      ⎜0 5 0 ⎟
                      ⎝1 −1 0 ⎠
and, since Γ3 2 = 1 as computed above, det(A) = 3 ⋅ (−5) − 3 ⋅ 1 = −18, as before.
Given a square matrix A, we now ask whether there exists a square matrix B such that AB = BA = I, where I is the identity matrix, which plays here the same role as the unit in real numbers. This matrix B is called the inverse of A. By Binet theorem, which we have just seen, it is clear that if det(A) = 0, then it is not possible to find a matrix B with this property, because we would have 1 = det(I) = det(AB) = det(A) det(B) = 0, which is impossible. We will see shortly that the condition det(A) ≠ 0 is also sufficient for the existence of the inverse.
We begin our discussion with the definition of the inverse of a square matrix and
then move on to the various methods of calculation.
We will compute the inverse of a matrix. A first direct method is given by the
proof of the following theorem, which characterizes invertible matrices.
Theorem 7.4.2 The matrix A is invertible if and only if its determinant is nonzero.
Proof. We prove first that if A is invertible then its determinant is different from zero. By definition we have that AA^{−1} = I, then, by Binet theorem, det(A) det(A^{−1}) = det(I) = 1, hence det(A) ≠ 0.
Conversely, if det(A) ≠ 0, the inverse of A is given by:
(A^{−1})_{i j} = (1/det(A)) (−1)^{i+j} det(Aj i), (7.1)
where (A^{−1})_{i j} indicates the i, j entry of the matrix A^{−1}. We omit the proof of the fact that the matrix defined by this formula is indeed the inverse of A; it is given in the Appendix (Section 7.9).
Observation 7.4.3 Assume we have two square matrices A, B such that AB = I. Then, by Binet Theorem 7.3.5, 1 = det(I) = det(AB) = det(A) det(B), so det(A) and det(B) are both nonzero. By Theorem 7.4.2, both A and B are invertible, thus by multiplying (on the left) both sides of the equality AB = I by A^{−1} we obtain that B = A^{−1} is the inverse of A.
Thanks to this observation we get that, given two square matrices A and B, then AB = I if and only if BA = I and the inverse of a matrix is unique.
Now let us see the explicit formula for the inverse of a 2 × 2 matrix.
Let A = (a b)
        (c d).
If the determinant det(A) = ad − bc of A is not zero, the inverse of A can be calculated with formula (7.1) of the previous proposition. We have that:
A^{−1} = ( d/(ad − bc)  −b/(ad − bc) )
         ( −c/(ad − bc)  a/(ad − bc) ).
A = (1 3)
    (1 4).
To compute the inverse, we must apply the Gaussian algorithm to the matrix:
(1 3 1 0)
(1 4 0 1).
We carry out the following elementary operation: 2nd row → 2nd row − 1st row, and we get:
(1 3 1 0)
(0 1 −1 1).
a
2 row →
a a
2 row − 1 row ⎛ 1 k 0 1 0 0 ⎞
a a a ⎜
→ ⎜ 0 k − 2 0 −1 1 0 ⎟⎟
3 row → 3 row − 2k ⋅ 1 row ⎝ 0 −2k 2 k −2k 0 1 ⎠
⎛ 1 k 0 1 0 0 ⎞
⋅ 2 row → ⎜ 0 ⎟
a 1 a 1 1
2 row → k−2 ⎜ 0 1 0 − k−2 k−2 ⎟
⎝ 0 −2k k −2k
2
0 1 ⎠
⎛ 1 k 0 1 0 0 ⎞
a a ⎜
3 row → 3 row + 2k ⋅ 2 row → ⎜
2 1 a 1 ⎟
⎟
⎜ 0 1 0 − k−2 k−2
0 ⎟
⎝ 0 0 k 1 ⎠
2
4k 2k
k−2 k−2
2k−2 k
a a a
1 row → 1 row − k ⋅ 2 row ⎛ 1 0 0 k−2 − k−2 0 ⎞
a 1 a →⎜
⎜ 1
⎜ 0 1 0 − k−2
1
0 ⎟
⎟
⎟
3 row → ⋅ 3 row k−2
k ⎝ 0 0 1 4 2k 1 ⎠
k−2 k−2 k
The inverse is therefore:
2k−2 k
⎛ k−2 − k−2 0 ⎞
⎜
⎜ 1 1
0 ⎟
⎟
⎜ − k−2 k−2 ⎟.
⎝ 4 2k 1 ⎠
k−2 k−2 k
In the next section, we will relate the concept of determinant and inverse of a
matrix with the properties of the linear transformation associated with it, once we
have fixed the canonical bases in the domain and the codomain.
7.6 THE LINEAR MAPS FROM R^N TO R^N
Now that we have introduced the concept of determinant and inverse of a matrix, we can give an important result that allows us to characterize invertible linear transformations from R^n to R^n.
Theorem 7.6.1 Let F ∶ R^n ⟶ R^n be a linear map, and let A be the matrix associated to F with respect to the canonical basis (in the domain and codomain). The following statements are equivalent.
1. F is an isomorphism.
2. F is injective.
3. F is surjective.
4. dim(Im(F)) = n.
5. rk(A) = n.
6. The columns of A are linearly independent.
7. The rows of A are linearly independent.
8. The homogeneous linear system Ax = 0 has only the zero solution.
9. The linear system Ax = b has a unique solution for every b ∈ R^n.
10. A is invertible.
11. det(A) ≠ 0.
Proof. By Proposition 5.5.2, we immediately have the equivalence between (1), (2),
(3). We now show that the statements (3) through (9) are equivalent, showing that
each of them implies the next and then that (9) implies (2). We will show then,
finally, that (1), (10), (11) are equivalent.
n
(3) implies (4), because if F is surjective, then ImF = R has dimension n.
(4) implies (5), because rk(A) = dim(Im(F )), by Observation 6.2.2.
(5) implies (6), by the definition of rank of a matrix (which is in particular the column rank).
(6) implies (7), because the row rank of a matrix is equal to the column rank (Propo-
sition 6.2.3).
(7) implies (8), because if the rows of A are linearly independent, when we apply the
Gaussian algorithm for solving the system A x = 0, we find a row echelon matrix
with exactly n pivots, so there is a unique solution.
We now show that (8) implies (9). If the system Ax = 0 has a unique solution, reducing the matrix A in row echelon form, we get a matrix A′ with exactly n pivots. Then, reducing the matrix (A∣b) in row echelon form, we get a matrix of the type (A′∣b′), which also has exactly n pivots (those of A′). Then the system Ax = b admits a unique solution.
We show that (9) implies (2). By Proposition 5.4.8, it is enough to show that Ker(F) = 0. But Ker(F) consists of the solutions of the homogeneous linear system A ⋅ x = 0, and by (9), applied with b = 0, this system has a unique solution, which must be the zero solution; thus F is injective.
We have shown that the conditions (2) through (9) are equivalent.
We show that (1) implies (10). Let G be the inverse of F , then F ◦ G = G ◦ F = idRn .
Let B be the matrix associated with G with respect to the canonical basis. Then
AB = BA = I, so B is the inverse of A.
(10) implies (1), because, if B is the inverse of A and LB ∶ R^n ⟶ R^n is the linear map associated with it, then LB is the inverse of F.
(10) is equivalent to (11) by Theorem 7.4.2.
Solution. The matrix associated with F with respect to the canonical bases of the
domain and codomain is:
⎛2 0 1⎞
A=⎜ ⎜−1 1 −1⎟
⎟.
⎝0 k 1⎠
If we calculate the determinant of A with any of the methods that we have seen, for example expanding according to the first row, we get:
det(A) = (−1)^{1+1} 2(1 + k) + 0 + (−1)^{1+3} (−k) = k + 2.
By Theorem 7.6.1 we know that F is an isomorphism if and only if the determinant of A is nonzero, and therefore F is an isomorphism if and only if k ≠ −2. We can therefore choose any value of k other than −2 to calculate F^{−1}. We choose k = 0, since this will simplify the calculations. The matrix associated with the inverse of F in the canonical bases for the domain and codomain is the inverse of the matrix A. We compute this inverse using formula (7.1):
A^{−1} = ⎛ 2 0 1 ⎞^{−1}   ⎛1/2 0 −1/2⎞
         ⎜−1 1 −1⎟      = ⎜1/2 1 1/2 ⎟
         ⎝ 0 0 1 ⎠        ⎝ 0  0  1  ⎠ .
Although not necessary for this exercise, it is always a good idea to make sure that A^{−1} is actually the inverse of A. To that purpose, it is necessary to perform the rows by columns product of A and A^{−1} and verify that the result is the identity matrix:
A A^{−1} = ⎛ 2 0 1 ⎞ ⎛1/2 0 −1/2⎞
           ⎜−1 1 −1⎟ ⎜1/2 1 1/2 ⎟ = I.
           ⎝ 0 0 1 ⎠ ⎝ 0  0  1  ⎠
Therefore the inverse of F is
F^{−1}(x, y, z) = ((1/2)x − (1/2)z, (1/2)x + y + (1/2)z, z).
⎛0 1 0⎞
A=⎜
⎜1 0 1⎟
⎟
⎝2 a 3⎠
is invertible.
Choose one of the values for which it is invertible and compute the inverse.
7.8.5 Let e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) be the canonical basis of the real vector space R^3 and let a ∈ R. Let T ∶ R^3 ⟶ R^3 be the linear transformation such that T(e1) = e1 − ae2, T(e2) = e2 + e3, T(e3) = ae3.
a) Find the values of a for which the map is invertible.
b) Choosing one of the values of a for which T is invertible, compute the inverse.
−2 4
A=( ).
−1 2
⎛1 −1 −1⎞
⎜2 −1 0 ⎟
A=⎜ ⎟
⎝0 0 1⎠
Determine T ◦ LA .
7.9 APPENDIX
In this appendix, we want to give an alternative, but equivalent, definition of the determinant of a square n × n matrix. Instead of defining it indirectly through its properties, like we have done in the text, we will see a direct definition, through the concept of permutation, which is extremely important, even if it does not appear so in our choice of exposition of the theory.
This appendix is not necessary to continue reading, but it represents a deepening
of the concepts presented in this chapter. By the nature and depth of the topics of
this appendix it will be impossible to give a complete treatment, and we refer the
interested reader to the fundamental text of S. Lang, Introduction to linear algebra
[5], for further details.
Definition 7.9.1 Let {1, . . . , n} be the set of the first n natural numbers. A permu-
tation is a bijective function σ ∶ {1, . . . , n} ⟶ {1, . . . , n}.
A permutation σ is usually represented with the notation:
σ = ( 1 . . . n )
    (σ(1) . . . σ(n)).
For example:
σ1 = (1 2 3 4)
     (1 4 3 2).
This permutation exchanges (or we also say permutes) the elements 2 and 4 and leaves 1 and 3 unchanged. It is also denoted for simplicity by σ1 = (2, 4) and, as we have seen, we call it a transposition.
σ2 = (1 2 3 4)
     (2 3 1 4).
for every s1 , s2 ∈ Sn .
Now that we have introduced the concept of permutation, we can give an alternative definition of determinant.
Definition 7.9.6 Let A = (ai j) be an n × n matrix. We define:
det(A) = ∑_{σ∈Sn} (−1)^{p(σ)} a1,σ(1) ⋯ an,σ(n),
where with ∑_{σ∈Sn} we denote the fact that we are doing the sum of the elements a1,σ(1) . . . an,σ(n) as σ varies among all permutations of Sn.
Let A be a 2 × 2 matrix:
A = (a11 a12)
    (a21 a22).
According to the new definition, its determinant is:
det(A) = a11 a22 − a12 a21 ,
as S2 , the set of permutations of two elements, consists only of the identity and the
transposition (1, 2). We can immediately note that this expression coincides with the
formula for the determinants of 2 × 2 matrices obtained in Section 7.2.
Let A be a 3 × 3 matrix:
A = ⎛a11 a12 a13⎞
    ⎜a21 a22 a23⎟
    ⎝a31 a32 a33⎠ .
The set of permutations of three elements is:
S3 = {id, (1, 2), (2, 3), (1, 3), (3, 2, 1), (2, 3, 1)},
with respective parities: p(1, 2) = p(2, 3) = p(1, 3) = −1, p(id) = p(3, 2, 1) =
p(2, 3, 1) = 1. The determinant of A is therefore given by:
det(A) = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 − a12 a21 a33 − a11 a23 a32 .
We introduce a notation that will be useful later.
Let A = (ai j), i, j = 1, . . . , n, be a generic matrix. The i-th row of A is given by:
Ai = ai 1 e1 + ai 2 e2 + ⋯ + ai n en,
where e1, . . . , en denote the vectors of the canonical basis, and we write A = (A1, . . . , An) as the sequence of its rows.
To clarify the new notation that we introduced, if
A = (a11 a12)
    (a21 a22),
we write:
A = (a11 e1 + a12 e2, a21 e1 + a22 e2).
We now want to show that in the case of a 2 × 2 matrix the properties that
define the determinant (see Definition 7.1.2) determine it in a unique way. Let A =
(a11 e1 + a12 e2 , a21 e1 + a22 e2 ). Thanks to property (1) of Definition 7.1.2 we have:
det(A) = det(a11 e1 + a12 e2, a21 e1 + a22 e2) =
= det(a11 e1, a21 e1) + det(a11 e1, a22 e2) + det(a12 e2, a21 e1) + det(a12 e2, a22 e2) =
= a11 a21 det(e1, e1) + a11 a22 det(e1, e2) + a12 a21 det(e2, e1) + a12 a22 det(e2, e2),
where in the last step we used property (2) of Definition 7.1.2.
By Proposition 7.1.4, we have that: det(e1 , e1 ) = det(e2 , e2 ) = 0 (as they are matrices
with two equal rows) and det(e1 , e2 ) = − det(e2 , e1 ). Finally by property (3) of
Definition 7.1.2, we have that det(e1 , e2 ) = 1. Therefore
det(A) = (a11 a22 − a12 a21 ) det(e1 , e2 ) = a11 a22 − a12 a21 .
This shows that the function defined in 7.1.2, must necessarily be expressed by the
formula in 7.9.6, and therefore this function is unique.
The procedure we have described for 2×2 matrices can be replicated identically in
the case of n × n matrices, allowing us to get the equivalence of the two definitions of
determinant. Let us look at this in more detail in the proof of the following theorem,
which is the most significant result of this appendix.
Theorem 7.9.7 The function defined in 7.1.2 exists and is unique, and it is ex-
pressed by the formula of the Definition 7.9.6. The two Definitions 7.9.6 and 7.1.2 of
determinant are therefore equivalent.
Proof. Let
A = (∑_{k1=1}^{n} a1 k1 ek1, ∑_{k2=1}^{n} a2 k2 ek2, . . . , ∑_{kn=1}^{n} an kn ekn).
By properties (1) and (2) of Definition 7.1.2 we have:
det(A) = ∑_{k1, . . . , kn} a1 k1 a2 k2 ⋯ an kn det(ek1, ek2, . . . , ekn).
Note that the matrix (ek1 , ek2 , . . . , ekn ) (written as a sequence of rows, according
to our convention) has, in the i -th row, 1 in the ki position and zero elsewhere. It
is therefore clear that if there are some values repeated among k1 , . . . , kn the matrix
(ek1 , ek2 , . . . , ekn ) has two equal rows and therefore by property (3) of Definition 7.1.2
its determinant is zero. So det(ek1 , ek2 , . . . , ekn ) ≠ 0 only if k1 , . . . , kn are all distinct,
that is, the function s defined by s(i) = ki is a permutation of {1, . . . , n}. At this
point we have
det(ek1, ek2, . . . , ekn) = { 1 if p(s) = 1,
                             { −1 if p(s) = −1,
because we can reorder the rows of (ek1, ek2, . . . , ekn) to get the identity matrix (which has determinant 1), and to do this we make a number of exchanges corresponding to the parity of s.
This proves the equivalence between the two definitions, but also how the number
defined by the four properties in 7.1.2 exists and is unique, that is how these properties
determine it uniquely.
The determinant can equivalently be computed by summing over permutations of the row indexes:
det(A) = ∑_{τ∈Sn} (−1)^{p(τ)} aτ(1),1 ⋯ aτ(n),n
(note that, with respect to Definition 7.9.6, the permutations are carried out on the row and not on the column indexes). In particular we have that
det(A) = det(A^T).
Proof. By the previous theorem we have:
det(A) = ∑_{σ∈Sn} (−1)^{p(σ)} a1,σ(1) . . . an,σ(n) =
= ∑_{σ∈Sn} (−1)^{p(σ)} a_{σ^{−1}(1),1} . . . a_{σ^{−1}(n),n} =
= ∑_{τ∈Sn} (−1)^{p(τ)} aτ(1),1 . . . aτ(n),n,
because, if σ varies among all the permutations in Sn, also τ = σ^{−1} varies among all permutations in Sn and moreover p(τ) = p(σ).
Thanks to this new definition and to the previous theorem, we can prove what
we have just stated in the text concerning the determinant calculation procedures.
We now want to get a proof of Theorem 7.3.3, which provides us with a valid
tool for calculating the determinant. Before going to the proof of the formula that
appears in Theorem 7.3.3, also known as the formula for Laplace expansion of the
determinant, we need a technical lemma.
If
B = ⎛ 1 0 0 . . . 0 ⎞
    ⎜ b21 b22 b23 . . . b2n ⎟
    ⎜ ⋮ ⋮ ⋮ . . . ⋮ ⎟
    ⎝ bn1 bn2 bn3 . . . bnn ⎠
then
det(B) = det ⎛ b22 b23 . . . b2n ⎞
             ⎜ ⋮ ⋮ . . . ⋮ ⎟
             ⎝ bn2 bn3 . . . bnn ⎠ .
Proof. We have det(B) = ∑_{σ∈Sn} (−1)^{p(σ)} b1,σ(1) . . . bn,σ(n). Since b1,j = 0 for every j ≠ 1, only the terms with σ(1) = 1 survive in the sum.
Now the set of permutations σ of {1, . . . , n} that fix 1 can be seen as the set of all permutations τ of the set {2, . . . , n}. Considering that b1,1 = 1 we have:
det(B) = ∑_{τ∈Sn−1} (−1)^{p(τ)} b2,τ(2) . . . bn,τ(n),
and this sum is exactly the determinant of the (n − 1) × (n − 1) matrix obtained from B by deleting the first row and the first column.
where Γk l denotes the determinant of the matrix obtained from A by deleting the k-th row and the l-th column, multiplied by (−1)^{k+l}.
Proof. We look at the proof of the first of these properties, the second is quite similar. We also suppose i = 1. The general case is only more complicated to write, but does not offer any additional conceptual difficulty. We write A as A = (∑_{j=1}^{n} a1 j ej, A2, . . . , An). By properties (1) and (2) of Definition 7.1.2, we have that:
det(A) = ∑_{j=1}^{n} a1 j det(ej, A2, . . . , An).
Mj = (ej, A2, . . . , An) =
⎛ 0 0 . . . 0 1 0 . . . 0 ⎞
⎜ a21 a22 . . . a2(j−1) a2j a2(j+1) . . . a2n ⎟
⎜ ⋮ ⋮ ⋱ ⋮ ⋮ ⋮ ⋱ ⋮ ⎟
⎝ an1 an2 . . . an(j−1) anj an(j+1) . . . ann ⎠ .
Bringing the j-th column to the first position with j − 1 exchanges of adjacent columns, each of which changes the sign of the determinant, we obtain:
det(Mj) = (−1)^{j−1} det ⎛ 1 0 . . . 0 0 0 . . . 0 ⎞
                         ⎜ a2j a21 a22 . . . a2(j−1) a2(j+1) . . . a2n ⎟
                         ⎜ ⋮ ⋮ ⋮ . . . ⋮ ⎟
                         ⎝ anj an1 an2 . . . an(j−1) an(j+1) . . . ann ⎠ .
By the previous lemma, this last determinant equals the determinant of the matrix obtained by deleting the first row and the first column, and this is precisely the matrix obtained from A by deleting the first row and the j-th column, that is A1 j. Then we get:
det(A) = ∑_{j=1}^{n} a1 j (−1)^{1+j} det(A1 j) = ∑_{j=1}^{n} a1 j Γ1 j,
which is the expansion of det(A) along the first row.
Let us now prove Binet Theorem 7.3.5; as the two definitions of determinant are
equivalent we can use the one that best fits what we want to do.
n
Proof. We know that AB is the matrix whose coefficient of place i, j is ∑k=1 aik bkj ,
then apply Definition 7.9.6:
n n
det(AB) = ∑σ∈Sn (−1) (∑k1 =1 a1k1 bk1 ,σ(1) ) . . . (∑kn =1 ankn bkn ,σ(n) ) =
p(σ)
This is the determinant of the matrix having as rows the rows k1 , . . . , kn of the matrix
B, i.e.:
p(σ)
∑ (−1) bk1 ,σ(1) . . . bkn ,σ(n) = det(∑ bk1 ,i1 ei1 , . . . , ∑ bk1 ,in ein ),
σ∈Sn i1 in
using the notation introduced previously. By property (3) of Definition 7.1.2, we have that if k1, . . . , kn are not all distinct, that determinant is zero. So from now on in the expression of det(AB) we sum up only the terms where k1, . . . , kn are distinct. In this case, we note that the matrix (∑_{i1} bk1,i1 ei1, . . . , ∑_{in} bkn,in ein) is obtained starting from B with a number of row exchanges corresponding to the parity of the permutation τ defined by τ(i) = ki. Therefore:
det(∑_{i1} bk1,i1 ei1, . . . , ∑_{in} bkn,in ein) = (−1)^{p(τ)} det(B).
Hence:
det(AB) = ∑_{τ∈Sn} (−1)^{p(τ)} a1,τ(1) . . . an,τ(n) det(B) = det(A) det(B).
If A is a square matrix we denote with A1, . . . , An the rows of A and with Γ̃i the column vector (Γi 1, . . . , Γi n)^T, where the numbers Γi j are defined as in Laplace theorem, and they are called the algebraic complements of the matrix A. The first formula of Laplace theorem can then be written as:
Ai Γ̃i = det(A), for each i = 1, . . . , n, (7.2)
where the product is the usual product rows by columns.
It makes sense to ask if the product Ai Γ̃j has any meaning even when i ≠ j. The answer is given by the following proposition.
On the other hand, the matrix A′ has two equal rows (the i-th and the j-th), then by property (3) of Definition 7.1.2 the determinant of A′ is equal to zero. This proves precisely formula (7.3), that is, we have:
ai,1 Γj,1 + ⋯ + ai,n Γj,n = 0 for every i, j = 1, . . . , n, i ≠ j.
Let us now prove formula (7.1) for the computation of the inverse matrix of a
square matrix A with non zero determinant. Let det(Ai j ) be the determinant of the
matrix obtained from A by removing the i-th row and the j-th column and consider
the matrix B whose elements are defined by:
(B)i j = (1/det(A)) (−1)^{i+j} det(Aj i). (7.4)
Note that the j-th column of the matrix B is (1/det(A)) Γ̃j. We now compute the product, rows by columns, AB. The element of place i, j is Ai Bj = (1/det(A)) Ai Γ̃j, and by formulas (7.2) and (7.3) this is (1/det(A)) det(A) = 1 if i = j and 0 if i ≠ j.
So AB = I. Similarly, starting from Laplace theorem and expanding the deter-
minant according to the columns, we will have that BA = I, hence B is the inverse
of A.
Finally, we prove the correctness of the method described in Section 7.5 to com-
pute the inverse of an invertible matrix using the Gaussian algorithm.
Let A = (ai j), i, j = 1, . . . , n, be an invertible square matrix and let B = (bi j), i, j = 1, . . . , n, be its inverse. We have AB = I, where I is the identity matrix of order n. Let A1, . . . , An be the row vectors of A and B̃1, . . . , B̃n the column vectors of B. The coefficients b11, . . . , bn1 of B̃1 satisfy the relations: A1 B̃1 = (AB)11 = 1, A2 B̃1 = (AB)21 = 0, . . . , An B̃1 = (AB)n1 = 0 (where the products are rows by columns), i.e. they are a solution of the linear system associated with the matrix:
⎛ a11 a12 . . . a1n 1 ⎞
⎜ a21 a22 . . . a2n 0 ⎟
⎜ ⋮ ⋮ ⋮ ⋮ ⎟
⎝ an1 an2 . . . ann 0 ⎠ . (7.5)
Since A is invertible, by Theorem 7.6.1, the system admits a unique solution, and
then solving this system we determine uniquely the elements of the column B ̃1 . As
described in Section 7.5, through elementary operations on the rows, it is possible to
obtain the identity matrix on the left, that is
⎛ 1 0 . . . 0 c11 ⎞
⎜ 0 1 . . . 0 c21 ⎟
⎜ ⋮ ⋮ ⋮ ⋮ ⎟
⎝ 0 0 . . . 1 cn1 ⎠ . (7.6)
Since the two systems associated with the matrices in (7.5) and (7.6) have the same
solutions, it must be bj1 = cj1 for each j = 1, . . . , n, i.e. in (7.6) the column on the
̃1 .
left is precisely B
We proceed in the same way for the generic column B̃i, whose coefficients are the solutions of the linear system associated with the matrix:
(A ∣ ei), (7.7)
where ei denotes the i-th column of the identity matrix of order n.
To solve this system, we can perform on the rows of A exactly the same elementary
̃1 , and thus we obtain:
operations that we did in the case of B
⎛ 1 0 . . . 0 c1i ⎞
⎜ 0 1 . . . 0 c2i ⎟
⎜ ⋮ ⋮ ⋮ ⋮ ⎟
⎝ 0 0 . . . 1 cni ⎠ (7.8)
and again, since the two systems associated with the matrices in (7.7) and (7.8) have
the same solutions, it must be bji = cji for each j = 1, . . . , n, i.e. in (7.8) the column
̃i . Since we have to solve n linear systems that all have the same
on the left is just B
matrix of coefficients, we can solve them at the same time by considering the matrix
A∣I.
For all we said, after performing the elementary operations on the rows needed to reduce A to the identity matrix, we obtain a matrix of the type (I ∣ B): on the right-hand side we read precisely the inverse of A.
• The determinant of a matrix A is zero if and only if the matrix A has one row (or a column) which is a linear combination of the others.
• If the matrix A′ is obtained from the matrix A by exchanging two rows (or two columns), the determinant of A′ is the opposite of the determinant of A.
• If the matrix A′ is obtained from the matrix A by multiplying a row (or a column) by a scalar λ, the determinant of A′ is the product of λ by the determinant of A.
• If the matrix A′ is obtained from the matrix A by adding to a row (or a column) a linear combination of the others, the determinant of A′ is equal to the determinant of A.
• det(AB) = det(BA) = det(A) det(B) for each pair of square matrices A and B of the same order (but caution, in general AB ≠ BA!).
CHAPTER 8
Change of Basis
In this chapter, we want to address one of the most technical topics of this theory, i.e.
the change of basis within a vector space. We will also understand how to change the
matrix associated with a linear transformation, if we change the bases in the domain
and codomain.
In Chapter 5 we saw that there is a one-to-one correspondence between linear transformations and matrices:
F ∶ R^n ⟶ R^m ↦ (F(e1), . . . , F(en)).
The matrix (F(e1), . . . , F(en)), associated with the linear transformation F ∶ R^n ⟶ R^m in such one to one correspondence, has as columns the images of the vectors of the canonical basis of R^n. Let us see an example.
Consider the linear transformation F ∶ R^2 ⟶ R^2, F(e1) = e1 − e2, F(e2) = 3e2. This transformation is associated, with respect to the canonical basis in the domain and codomain, to the matrix:
A = ( 1 0)
    (−1 3).
We know that the choice of the canonical basis to represent the vectors in R^n is arbitrary, while being extremely convenient. For example, we have seen that a vector expressed with respect to two different ordered bases has, in general, different coordinates. Up to now, using a basis other than the canonical one to represent vectors seemed unnecessary. However, as we shall see in the next chapter,
resent vectors seemed unnecessary. However, as we shall see in the next chapter,
it provides us the key to understanding the concepts of eigenvalues and eigenvec-
tors, which are of fundamental importance not only in linear algebra but also in its
applications.
We now want to generalize the correspondence between matrices and linear transformations described above. Let us start with some observations (see Chapter 5).
Let V and W be two vector spaces of finite dimension and let B = {v1, . . . , vn} and B′ = {w1, . . . , wm} be two ordered bases of V and W, respectively. If F ∶ V → W is a linear transformation, we can write F(vi) = a1 i w1 + a2 i w2 + ⋯ + am i wm for suitable scalars aj i, and for v = x1 v1 + ⋯ + xn vn we have:
F(v) = x1 F(v1) + ⋯ + xn F(vn) =
= (a1 1 x1 + a1 2 x2 + . . . + a1 n xn)w1 +
+ (a2 1 x1 + a2 2 x2 + . . . + a2 n xn)w2 + . . .
+ (am 1 x1 + am 2 x2 + . . . + am n xn)wm.
So, if the coordinates of v with respect to the basis B are (v)B = (x1, . . . , xn), we have that the coordinates of F(v) with respect to the basis B′ are:
(F(v))B′ = ⎛ a1 1 x1 + a1 2 x2 + . . . + a1 n xn ⎞
           ⎜ a2 1 x1 + a2 2 x2 + . . . + a2 n xn ⎟
           ⎜ ⋮ ⎟
           ⎝ am 1 x1 + am 2 x2 + . . . + am n xn ⎠ = A ⋅ (v)B,
where A is the matrix defined in (8.1), which has as columns the coordinates of the vectors F(v1), . . . , F(vn) with respect to the basis B′ = {w1, . . . , wm}, and A ⋅ (v)B denotes the product rows by columns of the matrix A and the vector (v)B, which represents the coordinates of v with respect to the basis B.
We are therefore able to associate an m × n matrix A to F once we fix arbitrary ordered bases B and B′ in the domain and codomain. If V = R^n, W = R^m and we fix as B and B′ the canonical bases of the domain and the codomain, we have that the matrix A is precisely the matrix associated to F defined in the previous chapters and recalled at the beginning of this chapter. This matrix has as columns F(e1), . . . , F(en),
namely the coordinates, with respect to the canonical basis, of the images of the vectors of the canonical basis (in R^n, if we do not specify otherwise, we always consider the coordinates of vectors with respect to the canonical basis).
We now want to formalize what we observed in the following definition.
Let us make an important distinction. While until now we interchangeably used
rows or columns to indicate the coordinates, in this chapter we require more accuracy,
since the coordinates of a vector with respect to a given basis will form a column
vector, which must then be multiplied (rows by columns) by a matrix. Henceforth,
we shall denote the coordinates of a vector with respect to a given basis via a column
vector.
(F(v))B′ = ⎛ y1 ⎞ = AB,B′ ⎛ x1 ⎞
           ⎜ ⋮ ⎟          ⎜ ⋮ ⎟
           ⎝ ym ⎠          ⎝ xn ⎠ .
From previous observations, we have that the i-th column of AB,B′ is given by the coordinates of F(vi) with respect to the basis B′.
1. What are the coordinates of the vectors v1 and v2 with respect to the basis B?
2. How can we write the matrix associated with F with respect to the canonical
basis in the domain and the basis B in the codomain?
The answer to the first question is obvious. The vectors v1 and v2 have, respectively, coordinates (1, 0)^T and (0, 1)^T (T denotes the transpose, namely the fact that the coordinates represent a column vector). Indeed, v1 = 1 ⋅ v1 + 0 ⋅ v2, v2 = 0 ⋅ v1 + 1 ⋅ v2.
The answer to the second question is also quite simple. Indeed, we have already
seen that changing the basis that we choose to represent vectors within a vector
space, does not change vectors, but only how we write them, i.e. their coordinates.
In the above example, the vectors v1 and v2 are the same, what changes passing from the canonical basis to the basis B are just their coordinates, which change from (v1)C = (1, −1)^T, (v2)C = (0, 3)^T to (v1)B = (1, 0)^T, (v2)B = (0, 1)^T.
The same reasoning is valid for linear transformations, where the concept of co-
ordinates is replaced by the concept of matrix associated with the transformation,
with respect to two given ordered bases in the domain and codomain. Let us see a
concrete example.
Consider the linear transformation F ∶ R^2 ⟶ R^2, such that F(e1) = v1, F(e2) = v2. Let us now represent F using the coordinates with respect to the canonical basis C = {e1, e2} in the domain and the coordinates with respect to the basis B = {v1, v2} in the codomain. We have that (F(e1))B = (1, 0)^T, (F(e2))B = (0, 1)^T, where we use
an index to remind us that we use coordinates with respect to a certain basis, which
is not necessarily the canonical one. So we have that the matrix associated to F with
respect to the bases C of the domain and B of the codomain is
1 0
AC,B = ( ).
0 1
In fact, taking the product rows by columns, we see that:
1 1
AC,B ( )= ( )= (F (e1 ))B
0 0
0 0
AC,B ( ) = ( ) = (F (e2 ))B .
1 1
Therefore, the matrix associated to F with respect to the bases C in the domain
and B in the codomain is just AC,B , that is, the identity matrix.
We can easily generalize what we have just said.
Proposition 8.1.2 Let F ∶ V ⟶ V be a linear transformation such that F(v1) = w1, . . . , F(vn) = wn, where B = {v1, . . . , vn}, B′ = {w1, . . . , wn} are two ordered bases of V. Then the matrix associated to F, with respect to the basis B in the domain and the basis B′ in the codomain, is the identity matrix.
The proof is an easy exercise and follows the previous reasoning.
The choice to avoid treating the change of basis in this generality is dictated only by the hope of increasing clarity, and does not involve conceptual issues.
We now ask a question in some way related to the previous ones.
What is the matrix associated to the identity id ∶ R^n ⟶ R^n with respect to the bases B = (v1 . . . vn), B′ = (w1 . . . wn), respectively, in the domain and in the codomain?
Certainly we know that the identity matrix is associated to the identity map if we
fix the canonical bases in the domain and codomain, however we already know from
the previous example that changing the basis can radically change the appearance of
the matrix associated to the same linear transformation.
We now look at a simple example to help our understanding.
2 2
Example 8.2.1 Consider the identity map id ∶ R ⟶ R and fix the basis B =
{v1 , v2 } in the domain and the canonical basis C in the codomain, with v1 = 2e1 , v2 =
e1 + e2 .
The identity map always behaves in the same way even if we change the way we
represent it: id still sends a vector to itself. Let us see what happens:
We write the coordinates of the vectors with respect to the canonical basis:
(id(v1))C = (2, 0)^T, (id(v2))C = (1, 1)^T.
Therefore, the matrix associated to the identity with respect to the basis B in the
domain and the canonical basis C in the codomain is:
IB,C = (2 1)
       (0 1).
Now we wonder what happens if we want to represent the identity using the canonical
basis C in the domain and the basis B in the codomain. The identity always associates
to each vector itself; the problem is to understand what are the right coordinates.
So:
1/2 −1/2
(id(e1 ))B = ( ), (id(e2 ))B = ( ).
0 1
Therefore, the matrix associated to the identity with respect to the canonical basis
C of the domain and the basis B of the codomain is:
1/2 −1/2
IC,B = ( ).
0 1
In this very simple example, it was possible to calculate easily the coordinates of e1
146 Introduction to Linear Algebra
and e2 with respect to the basis B = {v1 , v2 }; in general this is not always so easy.
−1
However, in this example, note that IC,B = IB,C . Therefore, the coordinates of the
vectors e1 and e2 with respect to the basis B can be read from the columns of the
−1
matrix IB,C . Remember that the matrix IB,C can be easily calculated and it has as
columns the coordinates of the vectors v1 , v2 with respect to the canonical basis.
we have:
• the matrix associated to id with respect to the basis B in the domain and the
canonical basis C in codomain is:
where (vi )C are the coordinates of the vector vi with respect to the canonical
basis;
• the matrix associated to id with respect to the canonical basis C in the domain
−1
and B in the codomain is IB,C .
Proof. The first point is clear; substantially it is what we saw in Example 8.2.1. Indeed
(id(v1 ))C = (v1 )C are precisely the coordinates of the vector v1 in the canonical basis.
The same is true for (id(v2 ))C , . . . , (id(vn ))C .
Change of Basis 147
To show the second point, we show that IC,B is the inverse of IB,C , that is,
IB,C IC,B = IC,B IB,C = I, where I is the identity matrix.
Consider the composition of the identity with itself, with respect to different
bases, as indicated in the following diagram:
n id n id n
R ⟶ R ⟶ R .
B C B
The composite function id◦id = id is still the identity, and if we consider the matrices
associated with it, by Observation 8.2.3 we get: IC,B IB,C = IB . Now, by Proposition
8.2.2 we have that IB = I, so that IC,B IB,C = I. Similarly, considering the diagram:
n id n id n
R ⟶ R ⟶ R ,
C B C
−1
we get that IB,C = IC,B . This concludes the proof.
This theorem also answers another question, which was asked previously, that is,
how we can write the coordinates of a vector v with respect to a given basis B.
So far we have responded with a very explicit calculation in each case, but now
we can state a corollary that contains the answer in general.
n n
Corollary 8.2.5 Let C be a basis of R , and let v be a vector of R . Then the
coordinates of v with respect to the basis B are given by:
Proof. We have that v = id(v), so we choose the canonical basis C in the domain
and the basis B in the codomain. Now by Theorem 8.2.4, we know that, with respect
−1
to these bases, the identity map is represented by the matrix IC,B = IB,C .
−e2 . We want find the coordinates of the vector v = −e1 + 3e2 with respect to
−1 0
the basis B. We know that (v)C = (−1, 3) and IB,C = ( ). With an easy
2 −1
−1 0
calculation, we obtain that IB,C = ( ), then the coordinates are:
−1
−2 −1
−1 0 −1 1
(v)B = IB,C ⋅ (v)C = ( )( ) = ( ).
−1
−2 −1 3 −1
Indeed:
v = 1 ⋅ v1 − 1 ⋅ v2 = −e1 + 2e2 + e2 = −e1 + 3e2 .
148 Introduction to Linear Algebra
id ↑ ↓ id
n F m ′
B R ⟶ R B
This is what is called a commutative diagram, because the path we choose in the
diagram does not influence the result:
F = id ◦ F ◦ id.
The equality we wrote appears as a tautology and not particularly interesting, how-
ever, when we associate with each transformation its matrix with respect to the fixed
bases in the domain and codomain, this same equality will provide a complete answer
to the question we set at the beginning of this section, and that is perhaps the most
technical point of our linear algebra notes.
Therefore, we associate to each linear transformation the corresponding matrix
on the same diagram using coordinates, that is, using matrices to represent linear
transformations. As we know from the previous section, we get:
n F m
R ⟶ R
IB,C (v)B (v)C ↦ AC,C ′ (v)C (v)C ′
↑ id ↑ ↓ id ↓ .
F
(v)B IC ′ ,B′ (v)C ′
n m
R ⟶ R
We want to use this diagram to determine AB,B′ , that is, the matrix associated
′
to F with respect to the bases B in the domain and B in the codomain.
Thanks to Observation 8.2.3, we have that the equality:
F = id ◦ F ◦ id
corresponds to:
−1
AB,B′ = IC ′ ,B′ AC,C ′ IB,C = IB′ ,C ′ AC,C ′ IB,C ,
where, for the last equality, we have used Theorem 8.2.4.
Thus, we have proved the following theorem.
n m
Theorem 8.3.1 Let F ∶ R ⟶ R be a linear transformation and let AC,C ′ be the
′
matrix associated with F with respect to the canonical bases C and C in the domain
and codomain, respectively. Then the matrix associated to F , with respect to the bases
′
B in the domain and B in the codomain, is given by:
−1
AB,B′ = IB′ ,C ′ AC,C ′ IB,C ,
where:
• AC,C ′ has, as columns, the coordinates of the vectors which are the images of
the vectors of the canonical basis, i.e. F (e1 ), . . . , F (en ), expressed in terms of
m
the canonical basis of R ;
′
• IB′ ,C ′ has, as columns, the coordinates of the vectors of the basis B expressed
m
with respect to the canonical basis of R ;
• IB,C has, as columns, the coordinates of the vectors of the basis B, expressed in
n
terms of the canonical basis of R .
2 3
Example 8.3.2 Consider the linear transformation F ∶ R ⟶ R associated to the
matrix AC,C ′ with respect to the canonical bases, where
⎛ 1 2⎞
AC,C ′ =⎜
⎜ 0 1⎟
⎟.
⎝−1 0⎠
We want to find the matrix AB,B′ associated with F with respect to the bases: B =
{e1 + e2 , −e1 − 2e2 } in the domain and B = {2e3 , e1 + e3 , e1 + e2 } in the codomain.
′
1 −1 ⎛0 1 1⎞
IB,C =( ), IB ,C
′ ′ =⎜
⎜0 0 1⎟
⎟.
1 −2 ⎝2 1 0⎠
150 Introduction to Linear Algebra
−1
We calculate IB′ ,C ′ with any method:
⎛−3/2 2 ⎞
=⎜
⎜ 2 −3⎟
⎟.
⎝ 1 −2⎠
0 1 0 0
column is F (e12 ) = F ( ) = 0 + 0 = 0, the third is F (e21 ) = F ( ) = 0 and
0 0 1 0
0 0
the fourth is F (e22 ) = F ( ) = 1, thus A = (1, 0, 0, 1).
0 1
the matrix AC,B associated with F with respect to the canonical basis in the domain
and the basis B in the codomain.
Solution. The matrix associated with F with respect to the canonical bases of the
domain and codomain is:
2 1 1
AC,C ′ = ( ).
−1 0 1
We want to change basis in the codomain, therefore we consider the following com-
position of functions:
3 F 2 id 2
R ⟶ R ⟶ R .
′
C C B
The composition is id ◦ F = F , and the matrix associated to it is AC,B = IC ′ ,B AC,C ′ . In
−1
addition, IC ′ ,B = IB,C ′ , where IB,C is the matrix that has as columns the coordinates
2 1
of the vectors of B with respect to the canonical basis C , thus: IB,C ′ = ( ).
′
−1 −1
1 1
We have that IB,C ′ = ( ), therefore
−1
−1 −2
1 1 2 1 1 1 1 2
AC,B = IC ′ ,B AC,C ′ = ( )( )=( ).
−1 −2 −1 0 1 0 −1 −3
8.4.3 Let B = {(2, −1), (1, 1)} be a basis of R and let G ∶ R ⟶ R be the linear
2 2 3
transformation defined by: G(2, −1) = (1, −1, 0), G(1, 1) = (2, 1, −2). Determine the
matrix AC,C ′ associated with G with respect to the canonical bases of the domain and
of the codomain.
Solution. The matrix associated with G with respect to the basis B of the domain
′
and the canonical basis C of the codomain is:
⎛1 2⎞
AB,C ′ = ⎜
⎜−1 1 ⎟
⎟.
⎝ 0 −2⎠
Since we have changed the basis in the domain, we consider the following composition
of functions:
2 id 2 F 3
R ⟶ R ⟶ R .
′
B C C
The composition is G ◦ id = G, and the matrix associated with it is AB,C ′ = AC,C ′ IB,C ,
−1 −1
therefore, multiplying to the right-both members by IB,C we get: AC,C ′ = AB,C ′ IB,C .
152 Introduction to Linear Algebra
1
2 1 − 13
We have IB,C = ( ) and IB,C ′ = ( 31 ), so
−1
2
−1 1 3 3
⎛1 2⎞ 1
− 13 ⎛ 1 1 ⎞
AC,C ′ = ⎜
⎜−1 1 ⎟
⎟ ( 31 2 )=⎜
⎜ 0 1 ⎟⎟.
⎝ 0 −2⎠ 3 3 ⎝− 2 4⎠
−3
3
8.5.2 Consider the linear transformation F ∶ R ⟶ R defined by: F (e1 ) = −e2 +e3 ,
2 3
b) Determine the coordinates of the vector v = (1, −1, 1) with respect to the basis
′ 3
B of R .
a) Determine the matrix associated to F with respect to the canonical basis and the
matrix AB associated to F with respect to the basis B = {−e1 + e2 , −2e1 + 3e2 , −e3 }.
b) Say if F is an isomorphism and give a motivation for your answer.
c) If the answer in point (b) is affirmative, compute the inverse of F .
3 2
8.5.5 a) Given the linear transformation T ∶ R → R defined by:
T (x, y, z) = (3kx+y −2kz, 3x+ky −2z), determine for which values of k we have that
T is surjective, motivating the procedure followed. Set k = 0 and determine ker T .
b) Let B = {4e1 −2e2 , −e1 +e2 } be another R basis. Set k = 0. Determine the matrix
2
3
AC,B associated with T with respect to the canonical basis C of R in the domain
and at the basis B in the codomain.
3 3
8.5.6 a) Let F ∶ R → R be the linear transformation defined by:
F (x, y, z) = (x − 4y − 2z, −x + ky + kz, kx − 4ky + z).
Determine for which values of k we have that F is surjective.
Change of Basis 153
3 3
b) Set k = 0 and determine, if possible, a linear transformation G ∶ R → R such
that G ◦ F is the identity.
c) Let B = {e1 + e2 , −e1 + e3 , 2e2 } be another basis of R . Set k = 0. Determine the
3
matrix AC,B associated with F with respect to the basis B in the domain and the
3
canonical basis C of R in the codomain.
8.5.7 Consider the linear transformation D ∶ R3 [x] ⟶ R3 [x] that associates its
derivative to each polynomial. Determine the matrix associated with D with respect
to the basis {x , x , x, 1} of R3 [x].
3 2
Eigenvalues and
Eigenvectors
In this chapter, we want to address one of the most important questions of linear
algebra, namely the problem of diagonalizing a linear transformation together with
the concepts of eigenvalue and eigenvector.
9.1 DIAGONALIZABILITY
The idea behind the problem of diagonalizability is very simple: given a linear trans-
n n
formation F ∶ R ⟶ R , we ask if there is a basis, both for the domain and the
codomain, such that the matrix associated to F , with respect to this basis, has the
simplest possible form, namely the diagonal one. Let us see an example.
1 0
AB = ( ).
0 −1
We can see this right away without calculations, however, to convince ourselves, we
can just use Definition 8.1.1 or the formula for changing the basis in Chapter 8.
In this case, it is very simple to see what happens geometrically. The transforma-
tion φ is the reflection of the plane with respect to the line x = y. Indeed, φ(e1 ) = e2 ,
155
156 Introduction to Linear Algebra
φ(e2 ) = e1 . We see, geometrically, that the vector v1 , lying on the straight line y = x,
is fixed by the transformation, while the vector v2 , which is perpendicular to the line
y = x, is sent to −v2 . Based on these observations, we can conclude without any
calculation that, with respect to the basis B = {v1 , v2 }, the matrix associated with
φ is in the specified diagonal form.
6y
φ(v2 ) = −v2
v1 = φ(v1 )
@
I
@
@
@
@
@ -
@
@ x
@
@
@
R
@
v2
n n
Definition 9.1.2 A linear transformation T ∶ R ⟶ R is said to be diagonalizable,
n
if there is a basis B for R , such that the matrix AB associated with T with respect
to B (in domain and codomain) is a diagonal matrix.
In the above example, the transformation φ is diagonalizable and B = {e1 +
e2 , e1 − e2 } is the basis with respect to which matrix associated with φ is diagonal.
Just as we gave the definition of diagonalizable linear map, we can also give
the definition of diagonalizable matrix: this is essentially a matrix associated with a
diagonalizable linear transformation. We now see the precise definition.
Definition 9.1.3 A square matrix A is called diagonalizable, if there is a invertible
−1
matrix P , such that P AP is diagonal.
n n
Proposition 9.1.4 Let T ∶ R ⟶ R be a linear transformation associated with the
matrix A with respect to the canonical basis (in domain and codomain). Then T is
diagonalizable if and only if A is diagonalizable. Also, if T and A are diagonalizable,
then the coordinates of the vectors forming the basis B are the columns of the matrix
−1
P such that P AP is diagonal.
Proof. The statement is straightforward, if we remember how to make basis changes
from the previous chapter. Suppose that T is diagonalizable. Then there is a basis B
with respect to which the matrix AB associated with T is diagonal. If P = IB,C , we
have that the formula in Theorem 8.3.1 becomes:
−1
AB = P AP,
Eigenvalues and Eigenvectors 157
λ1 0
AB = ( ).
0 λ2
But it is easy to see that a rotation is not fixing any direction, hence v1 and v2 cannot
exist, that is, A is not diagonalizable.
Note that the argument would be different, if we allowed the scalars to take
complex values. In fact, in this case there would be two vectors namely v1 = (1, −i)
and v2 = (i, −1), such that ψ(v1 ) = −iv1 , ψ(v2 ) = iv2 .
This example suggests a third question:
3) If we allow scalars to take complex values, then, can we always diagonalize a
given matrix (or linear transformation)?
158 Introduction to Linear Algebra
The answer is no, but we can always bring a matrix to a form which is almost
diagonal; this is called the Jordan form, and we will not discuss it, because it would
take us too far.
...
T (vn ) = λn vn = 0v1 + 0v2 + ⋅ ⋅ ⋅ + λn vn ,
so the matrix associated with T respect to the basis B is:
⎛λ1 . . . 0⎞
AB = ⎜
⎜⋮ ⋮⎟⎟.
⎝ 0 ... λn ⎠
So the transformation T is diagonalizable by definition.
Conversely, if T is diagonalizable, it means that there is a basis B with respect to
which the matrix associated with T is diagonal, i.e.
⎛λ1 . . . 0⎞
AB = ⎜
⎜⋮ ⋮⎟⎟.
⎝ 0 ... λn ⎠
But then, this matrix has precisely the eigenvalues on the diagonal because
⎛λ1 . . . 0 ⎞ ⎛0⎞ ⎛ 0 ⎞
.
(T (vn ))B = A(vn )B = ⎜
⎜⋮ ⋮⎟⎟⎜⎜⋮⎟
⎟=⎜
⎜⋮⎟ ⎟
⎝ 0 ... λn ⎠ ⎝1⎠ ⎝λn ⎠
We now want to give a concrete method for calculating eigenvalues and eigenvec-
tors of a given matrix or linear transformation.
The fact that pA (x) is actually a polynomial in x, for example, follows from the
recursive calculation of the determinant expanded according to a row (or a column)
of A − xI.
In the following, if A is a matrix, for the sake of brevity we will denote by Ker A the
kernel of the linear transformation LA associated with A with respect to the canonical
basis (in the domain and codomain). Equivalently, Ker A is the set of solutions of the
homogeneous linear system A x = 0.
Proof. If λ is an eigenvalue of A, then there exists v ≠ 0 such that Av = λv, that is,
Av − λv = 0, i.e. (A − λI)v = 0 and so v ∈ Ker (A − λI). Thus by Theorem 7.6.1
the determinant of the matrix A − λI is zero.
Conversely, if det(A − λI) = 0, by Theorem 7.6.1 there is a nonzero vector v ∈
Ker (A − λI). Then (A − λI)v = 0, so Av = λv and v is an eigenvector of eigenvalue
λ.
Definition 9.2.6 Let A and B be two n × n matrices. A and B are said similar, if
there exists an invertible matrix n × n P such that:
−1
B=P AP.
Observation 9.2.7 • If A and B are similar then A and B represent the same
linear transformation with respect to different bases. This immediately follows
from our discussion on basis change.
pB (x) = det(P
−1 −1 −1
AP − xI) = det(P AP − P P xI) =
(In this proof, we have also used the fact that if P is a matrix and x is a scalar, then
P (xI) = (xI)P ).
Eigenvalues and Eigenvectors 161
have the same characteristic polynomial pA (x) = (3 − x) = pB (x), but using the
2
techniques we will learn at the end of this chapter it is easy to deduce that A and B
are not similar, becauseB is not diagonalizable.
Observation 9.2.10 From the proof of Theorem 9.2.5, we have that a vector v ≠ 0
is an eigenvector of a linear transformation T with associated eigenvalue λ if and only
if it belongs to Ker (A − λI), where A is the matrix associated with T with respect
to the canonical basis.
Definition 9.2.11 If λ is an eigenvalue of T ∶ R ⟶ R then Vλ = {v ∈ R ∣ T (v) =
n n n
The associated equation has solutions: 1 and 2, so there are two eigenvalues for
A, λ = 1 and λ = 2.
x x
A( ) = ( ),
y y
5x − 4y = x
{
3x − 2y = y .
162 Introduction to Linear Algebra
5−1 −4
( ),
3 −2 − 1
3 −4
V2 = Ker (A − 2I) = Ker ( ) = ⟨(4/3, 1)⟩.
3 −4
• The vectors (1, 1), (4/3, 1) are linearly independent, so we have that A is diag-
onalizable, and it is similar to the diagonal matrix:
1 0
D=( ) = P AP,
−1
0 2
1 4/3
where P = ( ).
1 1
P is the matrix of the change of basis, which allows us to pass from the canonical
basis to the basis formed by the eigenvectors of A. By the theory on the basis change
(see Chapter 8), the columns of P consist of the coordinates of the eigenvectors with
respect to the canonical basis.
β1 v1 + β2 v2 = 0. (9.1)
Applying T to both members and taking into account that T (v1 ) = λ1 v1 , T (v2 ) =
λ2 v2 , we obtain:
β1 λ1 v1 + β2 λ2 v2 = 0. (9.2)
Eigenvalues and Eigenvectors 163
Now we subtract from the second equality the first equality multiplied by λ2 , and we
get:
β1 (λ1 − λ2 )v1 = 0.
As v1 ≠ 0, we have that β1 (λ1 − λ2 ) = 0, and so β1 = 0, being λ1 ≠ λ2 . By replacing
β1 = 0 in (9.1), we get that β2 v2 = 0. But v2 is an eigenvector, thus v2 ≠ 0, and it
follows that β2 = 0, as we wanted.
After k − 1 steps, we have shown that the vectors v1 , . . . , vk−1 are linearly inde-
pendent.
- Step k. We show that v1 , . . . , vk−1 , vk are linearly independent. Let
β1 , . . . , βk−1 , βk ∈ R, such that:
Applying T to both sides and taking into account that T (vi ) = λi vi we obtain:
Now we subtract from equality (9.4) equality (9.3) multiplied by λk and we get:
βk vk = 0,
therefore also βk = 0, being vk ≠ 0. So the βi are all zero, and v1 , . . . , vk are linearly
independent, as we wanted.
After n steps we get what wanted, namely that v1 , . . . , vn are linearly indepen-
dent.
⎛ −1 2 0 ⎞
A=⎜
⎜ 1 1 0 ⎟
⎟
⎝ −1 1 4 ⎠
⎛ −1 − x 2 0 ⎞
pA (x) = det ⎜ ⎟ 2
⎜ 1 1 − x 0 ⎟ = (x − 3)(4 − x) .
⎝ −1 1 4−x ⎠
√ √
x1 = 3, x2 = − 3, x3 = 4.
These roots are the eigenvalues of A. Since A has three distinct eigenvalues, by
3
the previous theorem there is a basis of R consisting of eigenvectors of A. So
we can immediately answer one of the questions: the matrix A is diagonalizable.
⎛ x ⎞ ⎛ x ⎞
A⎜ ⎟ = 4⎜
⎜ y ⎟ ⎜ y ⎟
⎟,
⎝ z ⎠ ⎝ z ⎠
⎧
⎪ −x + 2y = 4x
⎪
⎪
⎪
⎨ x + y = 4y
⎪
⎪
⎪
⎪
⎩−x + y + 4z = 4z.
We note that the matrix associated with this linear system is:
⎛ −5 2 0 ⎞
⎜
⎜ 1 −3 0 ⎟
⎟,
⎝ −1 1 0 ⎠
⎛ −1 1 0 ⎞
⎜
⎜ 0 −2 0 ⎟
⎟.
⎝ 0 0 0 ⎠
√
• We calculate the eigenspace V√3 corresponding eigenvalue 3, which consists
of the vectors (x, y, z) ∈ R such that:
3
⎛ x ⎞ √ ⎛ x ⎞
A⎜ ⎟ = 3⎜
⎜ y ⎟ ⎟.
⎜ y ⎟
⎝ z ⎠ ⎝ z ⎠
⎛ x ⎞ √ ⎛ x ⎞
A⎜ ⎟ = − 3⎜
⎜ y ⎟ ⎜ y ⎟
⎟,
⎝ z ⎠ ⎝ z ⎠
that is, the √solutions of the homogeneous linear system associated with the
matrix A + 3I: √
⎛ −1 + 3 2√ 0 ⎞
A=⎜ ⎜ 1 1+ 3 0√ ⎟ ⎟.
⎝ −1 1 4+ 3 ⎠
⎛ 4 √0 0 ⎞
⎜
D=⎜ 0 3 0√ ⎟
⎟
⎝ 0 0 − 3 ⎠
166 Introduction to Linear Algebra
−1
is similar to A, and we have that D = P AP , where
√ √
⎛ 0 −1 − 3√3 3√3 − 1 ⎞
P =⎜⎜ 0 −5 − 2 3 2 3 − 5 ⎟ ⎟.
⎝ 1 1 1 ⎠
We now return to the general theory. We know that the eigenvalues of a matrix A
are the roots of its characteristic polynomial. The fact that pA (λ) = 0 is equivalent
to saying that x − λ divides pA (x), i.e. we can write pA (x) = (x − λ)f (x), where
f (x) is a polynomial in x.
We now want to be more precise.
pA (x).
If λ is an eigenvalue of A, the dimension of Ker (A − λI) is called the geometric
multiplicity of λ.
⎛λ 0 ⋯ 0 b1 s+1 ⋯ b1 n ⎞
⎜
⎜ 0 λ ⋯ 0 b2 s+1 ⋯ b2 n ⎟ ⎟
⎜
⎜ ⎟
⎜
⎜ ⋮ ⋮ ⋱ ⋮ ⋮ ⋱ ⋮ ⎟ ⎟
⎟
⎜
⎜ ⎟
⎜
AB = ⎜
⎜ 0 0 ⋯ λ bs s+1 ⋯ bs n ⎟ ⎟
⎟
⎟.
⎜
⎜ ⋯ bs+1 n ⎟
⎟
⎜
⎜ 0 0 ⋯ 0 bs+1 s+1 ⎟
⎟
⎜
⎜ ⎟
⎜⋮ ⋮ ⋱ ⋮ ⋮ ⋱ ⋮ ⎟ ⎟
⎝0 0 ⋯ 0 bn s+1 ⋯ bn n ⎠
We observe now that, by Theorem 9.2.8, we have pA (x) = pAB (x), because the two
matrices A and AB are similar. Compute det(A − xI) = det(AB − xI) developing
according to the first column, and then again according to the first column of the
Eigenvalues and Eigenvectors 167
only minor of order n − 1 with nonzero determinant appearing in the formula, and
so on. We get:
det(AB − xI) =
⎛λ − x 0 ⋯ 0 b1 s+1 ⋯ b1 n ⎞
⎜
⎜ 0 λ − x ⋯ 0 b2 s+1 ⋯ b2 n ⎟⎟
⎜
⎜ ⎟
⎜
⎜ ⋮ ⋮ ⋱ ⋮ ⋮ ⋱ ⋮ ⎟ ⎟
⎟
⎜ ⎟
det ⎜
⎜
⎜
⎜ 0 0 ⋯ λ−x bs s+1 ⋯ bs n ⎟⎟
⎟
⎟ =
⎜
⎜ ⎟
⎟
⎜
⎜ 0 0 ⋯ 0 bs+1 s+1 − x ⋯ bs+1 n ⎟⎟
⎜
⎜ ⎟
⎜ ⋮ ⋮ ⋱ ⋮ ⋮ ⋱ ⋮ ⎟ ⎟
⎝ 0 0 ⋯ 0 bn s+1 ⋯ bn n − x⎠
⎛λ − x ⋯ 0 b2 s+1 ⋯ b2 n ⎞
⎜
⎜ ⋮ ⋱ ⋮ ⋮ ⋱ ⋮ ⎟ ⎟
⎜
⎜ ⎟
⎜ bs n ⎟⎟
(λ − x) det ⎜ ⎟
⎜ 0 ⋯ λ−x bs s+1 ⋯ ⎟=
⎜
⎜ 0 ⋯ bs+1 n ⎟⎟
⎜
⎜ ⋯ 0 bs+1 s+1 − x ⎟
⎟
⎜
⎜ ⎟
⎜ ⋮ ⋱ ⋮ ⋮ ⋱ ⋮ ⎟ ⎟
⎝ 0 ⋯ 0 bn s+1 ⋯ bn n − x⎠
⎛λ − x bs s+1 ⋯ bs n ⎞
⎜ 0 ⋯ bs+1 n ⎟
= (λ − x) det ⎜ ⎟
s−1
⎜ bs+1 s+1 − x ⎟=
⎜
⎜ ⋮ ⎟ ⎟
⎜ ⋮ ⋮ ⋱ ⎟
⎝ 0 bn s+1 ⋯ bn n − x⎠
The following proposition allows us to have a clear strategy for figuring out if a
certain n × n matrix is diagonalizable.
Proof. Let T be the linear application associated with A with respect to the canonical
basis. For each eigenvalue λi , we consider a basis Bi = {vi 1 , . . . , vi ni } of the eigenspace
Vλi . We then show that the union B = {v1 1 , . . . , v1 n1 , . . . , vr 1 , . . . , vr nr } of such
m
bases is a basis for R . Once shown this, by Proposition 9.2.3, we have that T is
diagonalizable, because B is a basis of eigenvectors.
Since n1 + ⋅ ⋅ ⋅ + nr = n the set B contains n vectors, so, by Proposition 4.2.6, to
n
prove that B is a basis of R , it is enough to prove that the vectors of B are linearly
independent. Suppose
λ1 1 v1 1 + ⋅ ⋅ ⋅ + λ1 n1 v1 n1 + ⋅ ⋅ ⋅ + λr 1 vr 1 + ⋅ ⋅ ⋅ + λr nr vr nr = 0. (9.5)
1 ⋅ w1 + 1 ⋅ w2 + ⋅ ⋅ ⋅ + 1 ⋅ wr = 0. (9.6)
If some of the wi were not zero, by equality (9.6) such wi would be linearly dependent,
but this contradicts the fact that eigenvectors with distinct eigenvalues are linearly
independent (Theorem 9.2.14). So wi = 0 for each i = 1, . . . , r, i.e. λi 1 vi 1 + ⋅ ⋅ ⋅ +
λi n1 vi n1 = 0, for each i = 1, . . . , r. Let us now exploit the fact that the vectors
vi 1 , . . . , vi ni are linearly independent, because they form a basis of Vλi . We get:
λi 1 = ⋅ ⋅ ⋅ = λi ni = 0. Thus, in the linear combination in the first member of equality
(9.5), the coefficients must be all zero, and this shows that the vectors of B are linearly
independent, so B is a basis of eigenvectors.
To see the reverse implication, let us assume that A is diagonalizable and let
B = {v1 , . . . , vn } be a basis of R consisting of eigenvectors of T . The matrix D
n
is the algebraic multiplicity of the eigenvalue λi , but it also is the number of distinct
eigenvectors of eigenvalue λi in B, thus mi ≤ ni = dim(Vλi ) for all i = 1, . . . , r.
Moreover, m1 + ⋅ ⋅ ⋅ + mr = n. As ni ≤ mi by Proposition 9.2.19, it follows that
ni = mi for all i = 1, . . . , r. Therefore n1 + ⋅ ⋅ ⋅ + nr = n, as we wanted to prove.
• If the eigenvalues are not distinct, for each eigenvalue λ we calculate its geo-
metric multiplicity. If the sum of all geometric multiplicities is n, this will allow
us to find n linearly independent eigenvectors and therefore a basis of V ; con-
sequently A is diagonalizable. If the sum of all geometric multiplicities is less
than n, then A is not diagonalizable.
Eigenvalues and Eigenvectors 169
Solution. We first write the matrix associated with L with respect to the canonical
basis:
1 −1
A=( ).
1 3
Now we compute the characteristic polynomial of this matrix:
1 − x −1 2 2
pA (x) = det ( ) = x − 4x + 4 = (x − 2) .
1 3−x
The characteristic polynomial has the unique root x = 2, with algebraic multiplicity
2. So L has only one eigenvalue.
We get that V2 = ⟨(−1, 1)⟩ has dimension 1. So it is not possible to find a basis of
2
R consisting of eigenvectors and so the matrix A is not diagonalizable.
Since A is not diagonalizable, no diagonal matrix D can be found that is similar to
−1 0
A. In particular, if we take for example B = ( ), certainly B is not similar to
0 5
A.
9.3.2 Let L ∶ R ⟶ R be the linear transformation defined by L(e1 ) = 8e1 + 3e2 ,
3 3
Solution. The A matrix associated with L with respect to the canonical basis is:
⎛ 8 −18 9 ⎞
A=⎜
⎜ 3 −7 3 ⎟
⎟.
⎝ 0 0 −1 ⎠
⎛ 8−x −18 9 ⎞
⎜
pA (x) = det ⎜ 3 −7 − x 3 ⎟ 2
⎟ = −(1 + x)(x − x − 2) .
⎝ 0 0 −1 − x ⎠
We now find the roots of the characteristic polynomial, i.e. the solutions of (1 +
x)(x − x − 2) = 0, that is (1 + x) (x − 2) = 0:
2 2
x1 = −1, x2 = 2.
Such roots are the eigenvalues of A. We notice that the eigenvalue −1 has algebraic
multiplicity 2 and the eigenvalue 2 has algebraic multiplicity 1.
By Observation 9.2.20, we know that the eigenspace V2 has dimension 1, while
by Proposition 9.2.19 we can only say that the dimension of V−1 is at most 2. If V−1
has dimension 2, then, by Proposition 9.2.21, we have that A is diagonalizable; if
instead V−1 has a dimension smaller than 2, then it is not possible to find a basis of
3
R consisting of eigenvectors and therefore A is not diagonalizable.
So we compute the eigenspace corresponding to the eigenvalue x = −1; it consists
of the solutions of the homogeneous linear system associated with the matrix A + I:
⎛ 8 + 1 −18 9 ⎞
⎜
⎜ 3 −7 + 1 3 ⎟
⎟.
⎝ 0 0 −1 + 1 ⎠
⎛ 1 −2 1 ⎞
⎜
⎜ 0 0 0 ⎟⎟,
⎝ 0 0 0 ⎠
⎛ 8 − 2 −18 9 ⎞
⎜
⎜ 3 −7 − 2 3 ⎟
⎟.
⎝ 0 0 −1 − 2 ⎠
Eigenvalues and Eigenvectors 171
⎛ 1 −3 1 ⎞
⎜ 0 0 3 ⎟
⎜ ⎟,
⎝ 0 0 0 ⎠
⎛ −1 0 0 ⎞
D=⎜
⎜ 0 −1 0 ⎟
⎟.
⎝ 0 0 2 ⎠
−1
A matrix P1 such that P1 AP1 = D is, for example, the basis change matrix IB1 ,C :
⎛ 2 −1 3 ⎞
⎜ 1 0 1 ⎟
P1 = ⎜ ⎟.
⎝ 0 1 0 ⎠
−
If we want another matrix P2 such that P2 1AP2 = D, we must choose another
3
ordered basis of R constisting of eigenvectors, making sure that the first two columns
consist always of the coordinates of eigenvectors with eigenvalue −1, while in third
columns we have the coordinates of an eigenvector with eigenvalue 2, for example B1 =
{(−3, 0, 3), (2, 1, 0), (−3, −1, 0}. In this case, the basis change matrix P2 = IB2 ,C is:
⎛ −3 2 −3 ⎞
⎜ 0 1 −1 ⎟
P2 = ⎜ ⎟.
⎝ 3 0 0 ⎠
⎛ −8 − x 18 2 ⎞
pA (x) = det ⎜ ⎟ 2
⎜ −3 7 − x 1 ⎟ = (1 − x)(x + x − 2) .
⎝ 0 0 1−x ⎠
We find the roots of the characteristic polynomial, i.e. the solutions of (1 − x)(x +
2
x − 2) = 0, that is (1 − x) (2 + x) = 0:
2
x1 = 1, x1 = −2
172 Introduction to Linear Algebra
and such roots are the eigenvalues of A. We note that the eigenvalue 1 has algebraic
multiplicity 2 and the eigenvalue −2 has algebraic multiplicity 1.
As in the previous exercise, we know that the V−2 eigenspace is 1, while to deter-
mine the dimension of V1 we need to explicitly compute this eigenspace.
V1 consists of the solutions of the homogeneous linear system associated with the
A − I matrix:
⎛ −8 − 1 18 2 ⎞
⎜
⎜ −3 7−1 1 ⎟ ⎟.
⎝ 0 0 1−1 ⎠
Reducing the matrix with the Gaussian algorithm we obtain
⎛ 1 −2 −1/9 ⎞
⎜ ⎟,
⎜ 0 −12 1/3 ⎟
⎝ 0 0 0 ⎠
⎛1 1 1 + k ⎞
A=⎜
⎜2 2 2 ⎟ ⎟
⎝0 0 −k ⎠
⎛1 − x 1 1+k ⎞
⎜
pA (x) = det ⎜ 2 2−x 2 ⎟
2
⎟ = (x − 3x)(−k − x).
⎝ 0 0 −k − x⎠
x1 = 0, x2 = 3, x3 = −k,
and such roots are the eigenvalues of A. If k ≠ 0 and k ≠ −3 we get that A has 3
distinct eigenvalues, so it is diagonalizable.
If k = 0 we get that
⎛1 1 1⎞
A=⎜ ⎜2 2 2⎟
⎟,
⎝0 0 0⎠
and as we have just seen, we know that A has eigenvalues 0 and −3 with algebraic
multiplicity 2 and 1, respectively. By Observation 9.2.20, we know that the eigenspace
V−3 has dimension 1, while to determine the dimension of V0 we need to explicitly
compute this eigenspace.
Eigenvalues and Eigenvectors 173
V0 consists of the solutions of the homogeneous linear system associated with the
matrix A − 0I = A, i.e. V0 = ker A. Reducing the matrix with the Gaussian algorithm
we obtain
⎛1 1 1⎞
⎜0 0 0⎟
⎜ ⎟,
⎝0 0 0⎠
so the system solutions depend on 3 − 1 = 2 parameters. So V0 has dimension 2 and
A is diagonalizable.
If k = −3 we get that
⎛1 1 −2⎞
A=⎜ ⎜2 2 2 ⎟ ⎟,
⎝0 0 3 ⎠
and as we have just seen, we know that A has eigenvalues 0 and 3 with algebraic mul-
tiplicity 1 and 2, respectively. As before, to determine if A is diagonalizable we have
to determine the dimension of the eigenspace relative to the eigenvalue of algebraic
multiplicity 2, that is V3 .
We must solve the homogeneous linear system associated with the matrix:
⎛−2 1 −2⎞
⎜ 2 −1 2 ⎟
A − 3I = ⎜ ⎟.
⎝0 0 0⎠
⎛2 −1 2⎞
⎜0 0 0⎟
⎜ ⎟,
⎝0 0 0⎠
9.4.1 Find eigenvalues and eigenvectors of the following matrices or linear transfor-
mations:
i) the matrix:
⎛ 2 1 0 ⎞
⎜
⎜ 0 1 −1 ⎟
⎟
⎝ 0 2 4 ⎠
2 2
ii) the linear transformation L ∶ R → R defined by:
3 3
iii) the linear transformation L ∶ R → R defined by:
L(x, y, z) = (x + y, x + z, y + z)
2 2
iv) the linear transformation L ∶ R → R defined by:
L(x, y) = (x − 3y, −2x + 6y)
2 2
v) the linear transformation L ∶ R → R defined by:
L(e1 ) = e1 − e2 , L(e2 ) = 2e1
9.4.2 a) Given the matrix:
⎛7 0 0⎞
A = ⎜0 7 −1⎟
⎜ ⎟
⎝0 14 −2⎠
compute its eigenvalues and eigenvectors.
′
Is A diagonalizable? If so, determine a diagonal matrix A similar to A.
b) Is it possible to find a matrix B such that AB = I (where I is the identity matrix)?
Clearly motivate the answer.
2 2
9.4.3 Determine a linear transformation T ∶ R ⟶ R that has e1 − e2 as an
eigenvector of eigenvalue 2.
9.4.4 Consider the matrix:
⎛3 2 −1⎞
⎜
A=⎜0 2 0⎟⎟.
⎝−1 −2 3 ⎠
AB associated with T with respect to the basis B (in domain and codomain).
1) e1 − e3 , T (e2 ) = ke2 + (k + 1)e3 , T (e3 ) = ke3 and let A be the matrix associated
with T with respect to the canonical basis.
a) Determine for which values of k we have that T is diagonalizable.
b) Determine for which values of k we have that 2e1 − 2e3 is an eigenvector of T .
b) For the values of k found in point a) determine, if possible, two distinct diagonal
−1
matrices D1 and D2 similar to A. Also determine a matrix P such that P AP = D1 .
(2x + 2y + z, 2x − y − 2z, kz), and let A be the matrix associated with F with respect
to the canonical basis.
a) Determine for which values of k we have that F is diagonalizable.
b) Choose any value of k for which F is diagonalizable and determine all the diagonal
matrices D which are similar to A.
Scalar Products
In the definition of vector space, we have the two operations of sum of vectors and
multiplication of a vector by a scalar (see Chapter 2). In this chapter, we want to
introduce a new operation: the scalar product of two vectors. The result of this
operation is a scalar, that is a real number. In addition to its vast importance in the
applications to physics, we will see how the scalar product is essential in linear algebra
for the solution of the problem of diagonalization of symmetric matrices, which we
will discuss later.
In other words, for any fixed vector u ∈ V the functions g(u, ⋅) ∶ V ⟶ R and
g(⋅, u) ∶ V ⟶ R are linear applications, hence the term bilinear.
g is called symmetric if g(u, v) = g(v, u) for every u, v ∈ V . A symmetric bilinear
form is called a scalar product on V and will be denoted with < , >.
We shall return to the definition of scalar product in Section 10.4, where we examine
it in more detail.
177
178 Introduction to Linear Algebra
In a completely similar way, we can also verify the property (2). It is therefore a
bilinear form. Note, however, that g(e1 , e2 ) = 2 while g(e2 , e1 ) = 0, so g is not a
scalar product.
g((x1 , . . . , xn ), (y1 , . . . , yn )) = x1 y1 + ⋅ ⋅ ⋅ + xn yn ,
We leave to the reader to verify the properties (1) and (2) of Definition 10.1.1. So,
we have a bilinear form. Since g(u, v) = g(v, u), g is a scalar product. This scalar
n
product on R is called Euclidean product or standard product. We shall denote this
n
product between two vectors u, v ∈ R as < u, v >e or also as u ⋅ v.
g(vi , vj ) = cij .
u = α1 v1 + ⋅ ⋅ ⋅ + αn vn , w = β1 v1 + ⋅ ⋅ ⋅ + βn vn . (10.1)
n
where we used the symbol ∑i,j=1 to indicate the sum for all possible i, j = 1, . . . n. In
full:
g(u, w) = α1 β1 c11 + α1 β2 c12 + ⋅ ⋅ ⋅ +
We must now verify that it is a bilinear application, that is, that it satisfies the
properties of Definition 10.1.1. We check the first of the conditions in (1) leaving the
others by exercise.
Consider the three vectors in V :
′ ′ ′
u = α1 v1 + ⋅ ⋅ ⋅ + αn vn , u = α1 v1 + ⋅ ⋅ ⋅ + αn vn , w = β1 v1 + ⋅ ⋅ ⋅ + βn vn .
i,j=1
n n
′
= ∑ αi βj cij + ∑ αi βj cij =
i,j=1 i,j=1
′
= g(u, w) + g(u , w).
g̃(u, w) = g̃(α1 v1 + ⋅ ⋅ ⋅ + αn vn , β1 v1 + ⋅ ⋅ ⋅ + βn vn ) =
Since g̃(vi , vj ) = g(vi , vj ) by the very definition of g̃ (see (10.2)), we get g̃(u, w) =
g(u, w)
of matrices (rows by columns) is the product of the transposed matrices, with the
factors order reversed.
In formulas, if A ∈ Mm,r (R), B ∈ Mr,n (R), then:
T T
(AB) = B A . (10.3)
Let us now continue our discussion on the one-to-one correspondence between bilinear
forms and matrices, once fixed a basis of the given vector space.
⎛ g(v1 , v1 ) . . . g(v1 , vn ) ⎞
C=⎜
⎜ ⋮ ⋮ ⎟
⎟.
⎝g(vn , v1 ) . . . g(vn , vn )⎠
is associated with the matrix C ∈ Mn (R) where (u)B denotes the coordinate
column of the vector u relative to the basis B.
Scalar products, i.e. symmetric bilinear forms, correspond to the symmetric matrices
in Mn (R).
Proof. The first point of this correspondence is a direct consequence of the previous
proposition: to each bilinear application we can associate n scalars g(vi , vj ).
2
Now let us see the second point, that is how to associate a bilinear application directly
to a matrix.
We define
⎛ c11 . . . c1n ⎞ ⎛ y1 ⎞
g(u, v) = (u)B C (v)B = (x1 . . . xn ) ⎜ ⋮ ⎟⎟⎜⎜⋮⎟
T
⎜ ⋮ ⎟,
⎝cn1 . . . cnn yn ⎠
⎠ ⎝
where
⎛ x1 ⎞ ⎛ y1 ⎞
(u)B = ⎜
⎜⋮⎟ ⎟, (v)B = ⎜
⎜⋮⎟ ⎟
⎝xn ⎠ ⎝yn ⎠
are the coordinates of u and v with respect to the basis B. It is immediate to verify
that
g(vi , vj ) = cij ,
Scalar Products 181
′
= g(u, v) + g(u , v).
We note now that g is symmetric if and only if the corresponding matrix C = (cij )
T
is symmetric, i.e. C = C .
If g is symmetric then cij = g(vi , vj ) = g(vj , vi ) = cji for every i, j = 1, . . . , n , so C
is symmetric.
T
Conversely, suppose that C is symmetric, that is, C = C . We observe that for every
T
u, v ∈ V we have that g(u, v) = g(u, v), since g(u, v) ∈ R.
For each u, v ∈ V by (10.3), we have therefore that:
T T T
g(u, v) = (u)B C (v)B = ((u)B C (v)B )
T T T
= (v)B C (u)B = (v)B C (u)B = g(v, u).
This shows that g is symmetric.
3 1
C=( ),
1 2
2
is associated, with respect to the canonical basis of R , to the scalar product <
(x1 , x2 ), (y1 , y2 ) >= 3x1 y1 + x1 y2 + x2 y1 2 + x2 y2 . Indeed:
3 1 y1
(x1 x2 ) ( ) ( ) = 3x1 y1 + x1 y2 + x2 y1 + 2x2 y2 .
1 2 y2
similar: the matrix associated to the given bilinear form can take very different forms
and yet the bilinear form does not change.
Let IB,C be the basis change matrix; it is the matrix associated with the identity
map, where we have fixed an ordered basis B in the domain and the canonical basis
n
C in the codomain. We then have that, for each vector u ∈ R :
where (u)B denotes the column of coordinates of the vector u with respect to the
ordered basis B.
Assume that C is the matrix associated with a given bilinear form, with respect to
n ′
the canonical basis of R . We want to determine the matrix C associated with the
same bilinear form with respect to the basis B. Let us replace (u)C , (v)C using (10.3):
T T T T
< u, v >= (u)C C(v)C = (IBC (u)B ) C(IBC (v)B ) = (u)B IBC CIBC (v)B .
Example 10.3.1 Consider the scalar product < , > associated to the matrix
2 1
C=( )
1 2
with respect to the the canonical basis. In particular, notice that scalar products are
bilinear forms.
Suppose we choose B = {v1 = e1 , v2 = e1 + e2 } as a basis, therefore:
1 1
IBC = ( ).
0 1
′
The matrix C associated with the same scalar product with respect to the basis B
is:
1 0 2 1 1 1 2 3
C =( )( )( )=( ).
′
1 1 1 2 0 1 3 3
Scalar Products 183
Consider the scalar product of v1 and v2 , i.e. < e1 , e1 +e2 >. Using first the canonical
basis and then the basis B, we verify that the result is the same:
2 1 1
< e1 , e1 + e2 >= (1 0) ( )( ) = 3
1 2 1
2 3 0
< e1 , e1 + e2 >= (1 0) ( ) ( ) = 3.
3 3 1
Observation 10.3.2 It is useful to compare the basis change formula for a linear
application and that of the basis change for a bilinear form.
• Two matrices A and B represent the same linear application (with respect to
different bases) if and only if they are similar, that is, there is an invertible
−1
matrix P such that B = P AP .
• Two matrices A and B represent the same bilinear form (with respect to dif-
T
ferent bases) if and only if an invertible P matrix exists such that B = P AP .
T −1
It is clear by looking at these two formulas that matrices with the property P = P ,
i.e. such that their transpose coincides with their inverse, are of particular importance.
We will do a more detailed study of these matrices and their properties in later
sections.
Example 10.4.2 We can immediately verify that the standard scalar product or
n
Euclidean scalar product in R , defined in the Example 10.1.3 given by:
Definition 10.4.3 Let V be a vector space with a scalar product. We say that
u, v ∈ V are perpendicular (orthogonal) to each other if < u, v >= 0. We will also
use the u ⊥ v notation to indicate two vectors perpendicular to each other.
So we can reformulate the notions defined above in the following way:
A scalar product is non-degenerate if and only if there is no nonzero vector perpendic-
ular to all the others. Furthermore, a positive definite scalar product is automatically
not degenerate: in fact, being positive definite implies that there is no vector orthog-
onal to itself, while being degenerate requires that such a vector exists.
Observation 10.4.4 1. The notion of orthogonality depends on the scalar product
2
chosen. For example, in R we consider the two scalar products:
We see that the vectors e1 and e2 are perpendicular with respect to the Euclidean
product <, >e , but not with respect to the other scalar product, in fact:
< (a, b), (1, 0) >m = a = 0 < (a, b), (0, 1) >m = −b = 0.
Proposition 10.5.1 Let W ⊆ V be a vector subspace of V and < , > a scalar product
on V . Then the set:
is a vector subspace of V .
Proof. Let us check the three properties of Definition 10.4.1. We immediately see
⊥
that 0V ∈ W . Indeed, by property (2) of Definition 10.4.1:
Definition 10.5.2 Given a vector subspace W of V and a scalar product < , > on
V , the orthogonal subspace to W is:
⊥
In fact, if u ∈ W certainly < u, wi >= 0, but the converse is also true. In fact, let
u ∈ V , such that < u, wi >= 0. If w = λ1 w1 + ⋅ ⋅ ⋅ + λn wn ∈ W , then
< u, w > =< u, λ1 w1 + ⋅ ⋅ ⋅ + λn wn >=
Thanks to Observation 10.5.3, we are able to determine the dimension of the subspace
n
orthogonal to W ⊆ R with respect to the Euclidean scalar product.
n ⊥
Proposition 10.5.5 Let W be a vector subspace of R and let W be the subspace
orthogonal to W with respect to the Euclidean scalar product. Then:
dim(W ) + dim(W ) = n.
⊥
Proof. Let B = {v1 , . . . , vm } be an ordered basis of W and let A = (aij ) ∈ Mm,n (R)
be the matrix having the coordinates of the vectors v1 , . . . , vm as rows. By Observa-
tion 10.5.3, we have that (x1 , . . . , xn ) ∈ W if and only if:
⊥
We know that the dimension of the image of LA is the rank of A, which is the
dimension of the subspace generated by its columns or equivalently by its rows. Since
the rows of A are given by the coordinates of the vectors of a basis of W , this
dimension is just dim(W ). Therefore:
dim(W ) = n − dim(W ).
⊥
Scalar Products 187
Definition 10.6.1 Let V be a vector space with a scalar product < , >, which is
positive definite and B = {u1 , . . . , un } a basis of V . We say that B is an orthogonal
basis if ui ⊥ uj for every i, j = 1, . . . , n and that B is an orthonormal basis if:
1 for i = j
< ui , uj >= { ,
0 for i ≠ j
that is, if it is an orthogonal basis and each vector of the basis has norm equal to 1.
where the function δij is called Kronecker delta and is defined by:
1 for i = j
∆ij = { .
0 for i ≠ j
Definition 10.6.2 Let V be a vector space with a positive definite scalar product,
and let u, v. The orthogonal projection of the vector v on the vector u is given by
the vector:
< v, u >
proju (v) = < u, u > u.
v
BB
BB
u2 u1 = u
BB
BM
B 1
B B
B 1B
B
B proju v
B
B
B
188 Introduction to Linear Algebra
From the figure, where we chose the Euclidean product, we can see and verify with
an easy calculation in the general setting, that if {u, v} is a basis of R , {u1 = u, u2 =
2
v − proju1 (v)} is an orthogonal basis of R . This process can be iterated and allows
2
⋮ ⋮
1
uk = vk − proju1 (vk ) ⋅ ⋅ ⋅ − projuk−1 (vk ), fk = uk ,
∥uk ∥
√
where ∥ui ∥ = < ui , ui >.
With the procedure described above, called Gram-Schmidt algorithm, we immediately
obtained a set of mutually orthogonal vectors.
Proof. The fact that the vectors f1 , . . . , fn are mutually orthogonal and their norm is
equal to 1 is easy to verify. Since the dimension of V is n, in order to show f1 , . . . , fn
is a basis, it is enough to verify linear independence. If
λ1 f1 + ⋅ ⋅ ⋅ + λn fn = 0,
then:
< fi , λ1 f1 + ⋅ ⋅ ⋅ + λn fn >= λi = 0 for all i = 1, . . . , n.
So f1 , . . . , fn are linearly independent and form an orthonormal basis of V .
= α1 β1 + ⋅ ⋅ ⋅ + αn βn ,
where we used the bilinearity properties of the scalar product and the fact that
< vi , vj >= δij . Hence, if we choose an orthonormal basis B, the matrix associated
to the given positive definite scalar product is the identity, just as it happens for
n
the standard scalar product in R and the canonical basis, and the scalar product of
two vectors v and w coincides with the Euclidian product of their coordinates with
respect to the basis B.
1 −2
C=( ).
−2 0
′
It is a scalar product since the matrix is symmetric. The matrix C associated with
the same scalar product with respect to the basis B is:
T
1 2 1 −2 1 2 −3 0
C =( ) ( )( )=( ).
′
1 −1 −2 0 1 −1 0 12
3
10.7.2 Consider the vector subspace W of R generated by the vectors w1 = e1 +
⊥
e2 − 3e3 and w2 = −e1 + 2e2 − 3e3 . Determine a basis for W , computed with respect
3
to the Euclidean scalar product in R . Also determine an orthonormal basis for W
⊥
and an orthonormal basis for W .
consists of the vectors (x, y, z) ∈ R , such that:
⊥ 3
Solution. W
x + y − 3z = 0
{
−x + 2y − 3z = 0.
190 Introduction to Linear Algebra
We immediately get that W = {(z, 2z, z)∣z ∈ R}, therefore a basis for W is given
⊥ ⊥
√ √ √
u2 = w2 − <w2 ,u1 >
u ,
<u1 ,u1 > 1
f2 = 1
u
∥u2 ∥ 2
= (−7/ 66, 2 2/33, −1/ 66).
An orthonormal basis for W is obtained by taking a generator, for example (1, 2, 1),
⊥
Solution. We observe that we can write the equations that define W in the following
way:
(1, 1, 1, −1) ⋅ (x, y, z, t) = 0, (1, 2, −1, 1) ⋅ (x, y, z, t) = 0.
Therefore the vectors (1, 1, 1, −1) and (1, 2, −1, 1) belong to W as they are perpen-
⊥
3x2 y2 .
a) Write the matrix associated with it with respect to the canonical basis.
b) Write the matrix associated with it with respect to the basis B = {v1 = e1 +
e2 , v2 = −2e2 }.
4
10.8.2 Let W be the vector subspace of R defined from the equation x+y+2z−t = 0
4
and consider the Euclidian scalar product of R .
⊥
a) Determine a basis for W .
4
10.8.3 Let W be the vector subspace of R generated by the vectors e1 + e4 , e2 −
2e3 + e4 .
⊥
a) Determine a basis for W .
b) Determine an orthogonal basis of W .
4
10.8.4 Let W be the vector subspace of R defined by the following equations:
⎧
⎪ x+y+z =0
⎪
⎪
⎪
⎨ −x + y + z + w = 0
⎪
⎪
⎪
⎪ 1
⎩y + z + 2 w = 0.
4
Calculate an orthonormal basis for W relative to the Euclidean scalar product of R .
10.8.5 Let W be the vector subspace of R generated by the vectors (1, 2, −1),
3
d) Determine the matrix associated with the scalar product of point (c) with respect
to the ordered basis
B = {(1, 1, 1, −1), (0, 1, 0, 1), (0, 0, 1, 0), (1, 0, 1, 2)}.
192 Introduction to Linear Algebra
Spectral Theorem
The Spectral Theorem represents one of the most important results of elementary
linear algebra. In Chapter 9, we examined the problem of calculating eigenvalues and
eigenvectors and the question of diagonalizability for a square matrix A of order n
with real entries. We have seen that it is not always possible to find a diagonal matrix
similar to the given matrix, because sometimes we do not have a basis of eigenvectors
n
of A for the space R . However, if the matrix A is symmetric, i.e. it coincides with its
transpose, the Spectral Theorem guarantees that it is diagonalizable. Furthermore,
not only is A similar to a diagonal matrix, but, if we denote with < , > the scalar
product associated with it, there is a basis consisting of eigenvectors of A such that
the matrix associated with < , > with respect to this basis is diagonal.
In order to prove all these results, it is necessary to introduce the concepts of orthog-
onal linear transformation and symmetric linear transformation.
In other words, the linear transformation U preserves the scalar product given in V .
Proposition 11.1.2 Let V be a real vector space of dimension n with a positive
definite scalar product <, >. Let U ∶ V ⟶ V be a linear map and let u, v ∈ V . The
following statements are equivalent.
1. U is orthogonal, i.e. < U (u), U (v) >=< u, v > for each u, v ∈ V .
2. U preserves the norm, i.e.: < U (u), U (u) >=< u, u > for each u ∈ V .
3. If B = {v1 , . . . , vn } is an orthonormal basis of V (with respect to the scalar
product <, >) then {U (v1 ), . . . , U (vn )} is also an orthonormal basis of V .
193
194 Introduction to Linear Algebra
Thus:
< U (u), U (u) > −2 < U (u), U (v) > + < U (v), U (v) >=
2 2
= α1 < v1 , v1 > +α1 α2 < v1 , v2 > + ⋅ ⋅ ⋅ + αn < vn , vn >
using the bilinearity of the scalar product and the linearity of U . We recall that by
hypothesis also {U (v1 ), . . . U (vn )} is an orthonormal basis, i.e. < U (vi ), U (vj ) >=
δij , we get:
2 2
< U (u), U (u) >= α1 + ⋅ ⋅ ⋅ + αn
So, by (11.1), we have < U (u), U (u) >=< u, u >.
This concludes the proof.
n
Let us consider the vector space R with the standard scalar product < , >e :
⎛ β1 ⎞
⎜⋮⎟
< u, v >e = (α1 . . . αn ) ⎜ ⎟ = α1 β1 + ⋅ ⋅ ⋅ + αn βn (11.2)
⎝βn ⎠
where we denote with Au the product rows by columns of the matrix A by the column
vector u.
linear transformation associated with it, if we fix the canonical bases in domain and
codomain; if u ∈ R , LA (u) = Au.
n
Proof. The equivalence between (a) and (b) is immediate by Proposition 11.1.2.
Before proceeding with the proof, we recall that the standard scalar product of two
column vectors u = (α1 . . . αn ) , v = (β1 . . . βn ) , can be expressed as the product
T T
rows by columns:
⎛ β1 ⎞
u v = (α1 . . . αn ) ⎜⎜⋮⎟
T
⎟
⎝βn ⎠
So we have that:
T T T
< Au, Av >e = (Au) (Av) = u (A A)v (11.3)
196 Introduction to Linear Algebra
T T T T
< Au, Av >e = (Au) (Av) = u (A A)v = u v =< u, v >e ,
as we wanted to prove.
(c) ⟺ (d). If v1 , . . . , vn are the column vectors of A the equation A A = I is
T
equivalent to:
vi ⋅ vj = 0, i ≠ j, vi ⋅ vi = 1.
that is, it is equivalent to the fact that these columns are orthonormal vectors.
T
For the row vectors we argue similarly. The equation A A = I is equivalent to the
fact that the rows of A are orthonormal vectors. We conclude by remembering that
n
A is an invertible matrix if and only if its rows (columns) form a basis of R .
T
Remark 11.2.4 We observe that in the previous proposition, condition (c), AA =
T
I = A A, is equivalent to only one of the two equalities. In fact, in general, given
X, Y ∈ Mn (R), if XY = I then, by Binet theorem, det(XY ) = det(X) det(Y ) = 1.
Hence both matrices are invertible and they are one the inverse of the other (see
Observation 7.4.3 of Chapter 7).
Similarly in part (d), the two conditions “ the columns of A form a basis” and “ the
rows of A form a basis” are equivalent. This is a consequence of the general fact that
the row rank of a matrix is equal to its column rank.
Because of the previous remark and the above discussion we have the following propo-
sition.
Spectral Theorem 197
Proposition 11.2.5 Let V be a real vector space of finite dimension with a positive
definite scalar product. Let U ∶ V ⟶ V be a linear map and let A be the matrix
associated with it, with respect to an orthonormal B basis. Then U is orthogonal if
and only if the matrix A is orthogonal.
Let us see some examples of orthogonal matrices. We leave to the reader the easy
−1 T
verification that A = A .
2
Example 11.2.7 1. Rotations in R :
cos(t) sin(t)
A=( )
−sin(t) cos(t)
As the reader can easily verify, this matrix is associated with a rotation of the plane
by an angle t, centered on the origin of the Cartesian axes.
2
2. Reflection with respect to the line x = y in R :
0 1
A=( )
1 0
As the reader can easily verify, this matrix it is associated with a reflection, i.e. the
points that belong to the line x = y are fixed, and every other point of the plane is
sent to the his symmetric with respect to that line.
3
3. Rotations in R . Let us consider a linear transformation that rotates any vector
applied in the origin around a given axis passing through the origin, of a certain fixed
angle t. Such a linear transformation preserves the Euclidean norm and therefore
is an orthogonal transformation (with respect to the Euclidean scalar product). A
3
famous theorem by Euler states that every orthogonal transformation of R whose
matrix has determinant equal to 1 is of this type.
198 Introduction to Linear Algebra
Definition 11.3.1 Let V be a real vector space with a positive definite scalar product
< , >. We say that the linear transformation T ∶ V ⟶ V is symmetric if
0⎞ ⎛ ⎞
0
⎛1 0 . . . ⎜ ⋮⎟
⎜0 1 . . . 0⎟ ⎜
⎜ ⎟
⎟
< T (ei ), ej >e = (Aei ) ej = (a1i . . .
T
ani ) ⎜
⎜
⎜
⎜
⎟
⎟
⎟
⎟
⎜
⎜
⎜
⎜1 ⎟
⎟
⎟
⎟ = aji
⎜⋮ ⋮⎟ ⎜
⎜ ⎟
⎟
⎝0 ⎜⋮⎟
... 1⎠ ⎝ ⎠
0
⎛1 0 . . . 0⎞ ⎛ a1j ⎞
⎜0 1 . . . 0⎟
⎟⎜ a2j ⎟
< ei , T (ej ) >e = = (0 . . . 1 . . . 0) ⎜
⎜ ⎟ ⎜
⎜ ⎟
⎟
T
ei Aej ⎜
⎜ ⋮⎟
⎟ ⎜ ⎟ = aij
⎜⋮ ⎟⎜ ⋮ ⎟
⎜ ⎟
⎝0 ... 1⎠ ⎝anj ⎠
and therefore by the condition (11.4), we have aij = aji , that is, the matrix A is
symmetric.
The same happens for the general case: every symmetric linear map is associated to
a symmetric matrix, if we fix an orthonormal basis in domain and codomain.
Spectral Theorem 199
Proposition 11.3.2 Let V be a real vector space, with a positive definite scalar
product < , > and let B = {v1 , . . . , vn } be an orthonormal basis. Let T ∶ V ⟶ V be
a linear map and let A = (aij ) be the matrix associated with T with respect to the
basis B. Then T is symmetric if and only if the matrix A is symmetric.
Proof. We first prove that the matrix associated to the symmetric linear map T with
respect to the basis B is symmetric. We have:
from which:
< T (vj ), vi > = a1j < v1 , vi > + ⋅ ⋅ ⋅ + aij < vi , vi > + . . .
proof.
Remark 11.3.3 Note that in the preliminary observations to the previous proposi-
tion we have proved that if A is a n × n matrix with real entries then:
T
< Au, v >e =< u, A v >e
n
where < , >e is the Euclidean scalar product and u, v ∈ R are column vectors.
We observe that once proven that a matrix A with real entries admits an eigenvalue
n
λ ∈ R, we immediately have that the eigenspace Vλ = ker(A − λI) ⊆ R contains
a nonzero vector and therefore there exists also a real eigenvector relative to the
eigenvalue λ.
Let us now establish another result that will be fundamental in the proof of the
spectral theorem.
0 = λ < u, w >e =< λu, w >e =< Au, w >e =< u, Aw >e ,
Let us summarize with a corollary what we proved for symmetric matrices in terms
of symmetric linear transformations.
Corollary 11.4.4 Let V be a real vector space of finite dimension with a positive
definite scalar product, and let T ∶ V ⟶ V be a symmetric linear map. Then:
Proof. Let B be an orthonormal basis for the positive definite scalar product. Such
a basis exists thanks to the Gram-Schmidt algorithm. By Proposition 11.3.2, the
matrix A associated to T with respect to the basis B is symmetric. Therefore, the
statements of the corollary immediately follow from Lemmas 11.4.1, 11.4.2 and from
Corollary 11.4.3.
Spectral Theorem 201
We can finally state the spectral theorem for real symmetric matrices and symmetric
linear maps at the same time.
Theorem 11.4.5 Let V be a real vector space of dimension n with a positive definite
scalar product. Let T ∶ V ⟶ V be a symmetric linear transformation and let A ∈
Mn (R) be the symmetric matrix associated with T with respect to an orthonormal
basis B.
Then:
• T is diagonalizable, and there exists an orthonormal basis N of eigenvectors of
T.
• A is diagonalizable by an orthogonal matrix, that is, there exists an orthogonal
−1
matrix P , such that D = P AP is diagonal.
Before starting with the proof, we observe that the two statements of the theorem are
completely equivalent. The orthonormal basis N is a basis of mutually perpendicular
eigenvectors of T that have norm 1. The existence of this basis of eigenvectors is
−1
equivalent to the existence of an orthogonal matrix P , such that P AP is diagonal.
This matrix has as columns the coordinates of the eigenvectors with respect to the
basis B.
Proof. Let λ1 be a real eigenvalue of T , and let u1 ∈ V be an eigenvector of norm 1 of
eigenvalue λ1 . We know that such λ1 and u1 exist by Lemma 11.4.1. Let W1 = ⟨u1 ⟩ .
⊥
Then, we have dim(W1 ) = n − 1. Let us now consider the linear map T1 = T ∣W1 ,
that is, let us look at the restriction of T1 to the subspace W1 , then T1 ∶ W1 ⟶ V .
For Lemma 11.4.2, since u is also perpendicular to T1 (w) for each w ∈ W1 , we have
Im(T1 ) ⊆ W1 = ⟨u1 ⟩ , therefore we can write T1 ∶ W1 ⟶ W1 . Let us now repeat all
⊥
The matrix Q, which has as columns the eigenvectors of the basis {v1 , v2 , v3 }, diag-
onalizes the matrix A, however it is not orthogonal:
⎛7 0 0 ⎞ ⎛1 −1/2 −1 ⎞
D=⎜
⎜0 7 0 ⎟ Q=⎜ −1/2⎟
−1
⎟ = Q AQ, ⎜0 1 ⎟.
⎝0 0 −2⎠ ⎝1 0 1 ⎠
T (e2 ) = 2e1 +e2 . Check that this is a symmetric linear transformation and determine
an orthonormal basis B with respect to which T is associated with a diagonal matrix.
Solution. The matrix associated with T with respect to the canonical basis (in domain
and codomain) is:
1 2
A=( ).
2 1
Since A is a symmetric matrix, T is a symmetric linear transformation.
The eigenvalues of A and T are: −1, 3. We compute the eigenspaces:
(see Definition 10.6.2). Verify that this is a symmetric linear transformation and
determine an orthonormal basis B with respect to which proju is associated to a
diagonal matrix D. Then write D explicitly.
Spectral Theorem 203
Solution. We determine the A matrix associated with proju with respect to the canon-
ical basis (in domain and codomain). We have:
proju (e1 ) = <e 1 ,u>e
<u,u>
(e1 − 2e3 ) = 15 (e1 − 2e3 ) = 15 e1 − 52 e3
e
proju (e2 ) = <e2 ,u>e
<u,u>e
(e1 − 2e3 ) = 0(e1 − 2e3 ) = 0
proju (e3 ) = <e3 ,u>e
<u,u>
(e1 − 2e3 ) = − 25 (e1 − 2e3 ) = − 52 e1 + 45 e3 .
e
Therefore:
1
⎛ 5 0 − 25 ⎞
A=⎜
⎜ 0 0 0 ⎟ ⎟.
⎝− 2 0 54 ⎠
5
Since A is a symmetric matrix, proju is a symmetric linear transformation.
We want to determine an orthonormal basis B with respect to which proju is as-
sociated with a diagonal matrix. We can proceed as in the previous exercise or
observe that u is an eigenvector of proju relative to the eigenvalue 1, in fact:
proju (u) = <u,u>
<u,u>e
u = u. Moreover, if v is any nonzero vector orthogonal to u,
e
i.e. < v, u >e = 0, we have that proju (v) = 0u, so v is an eigenvector of proju relative
to the eigenvalue 0.
Let us now consider an orthonormal basis B obtained by applying the Gram-Schmidt
algorithm to basis {u, e1 , e2 }.
Using the notation of Theorem 10.6.3, we have:
1 1 2
u1 = u, f1 = u = √ e1 − √ e3
∥u∥ 5 5
4 2 1 2 1
u2 = e1 − proju (e1 ) = e1 + e3 , f2 = u2 = √ e1 + √ e3
5 5 ∥u2 ∥ 5 5
1
u3 = e2 − proju (e2 ) − proju2 (e2 ) = e2 , f3 = u3 = e2 .
∥u3 ∥
The vector f1 is a multiple of u so it is an eigenvector of proju of eigenvalue 1, the
vectors f2 , f3 are perpendicular to u by construction, so they are eigenvectors of proju
of eigenvalue 0. A basis with the required properties is: B = {f1 , f2 , f3 } and the matrix
D is:
⎛1 0 0⎞
D=⎜ ⎜0 0 0⎟ ⎟.
⎝0 0 0⎠
⎛3 1 1⎞ 2 −1 ⎛3 2 0⎞ ⎛2 0 0⎞
⎜1 3 1⎟
⎜ ⎟, ( ), ⎜2 6 0⎟
⎜ ⎟, ⎜
⎜0 1 −2⎟
⎟.
⎝1 1 3⎠ −1 1 ⎝0 0 3⎠ ⎝0 −2 4 ⎠
2 2
11.6.5 Determine the linear transformation proju ∶ R → R associating to each
vector its projection on the vector u = e1 −e2 . Write explicitly the matrix T associated
to it with respect to the canonical basis. Say if it is a symmetric and/or orthogonal
linear transformation. Determine (if possible) a diagonal matrix similar to T .
2 2
11.6.6 Let proju ∶ R → R be the linear transformation associating to any vector
its projection on the vector u = 3e1 − 4e2 . Let A be the matrix associated with it
with respect to the canonical basis. Determine if there is an orthogonal matrix P ,
0 0
such that P AP = ( ).
−1
0 1
The difference between a hermitian product and a scalar product is that for a her-
mitian product we require the linearity of the function < u, ⋅ >∶ V ⟶ C, but
the antilinearity of the function < ⋅, u >∶ V ⟶ C, i.e. for u ∈ V fixed we have
< u, µv >= µ < u, v >.
We are particularly interested in the following example.
n
Example 11.7.2 In the complex vector space C we define:
We leave to the reader to verify this is an hermitian product. This product is called
n
the standard hermitian product in C . It is immediate to verify that it is positive
definite, indeed:
2 2
< (x1 , . . . , xn ), (x1 , . . . , xn ) >h = x1 x1 + ⋅ ⋅ ⋅ + xn xn = ∣x1 ∣ + ⋅ ⋅ ⋅ + ∣xn ∣ ≥ 0,
having 1 in the i-th position and 0 elsewhere. We observe that, as we did for scalar
n
products, to each hermitian product < , > in C we can associate a C matrix, with
cij =< ei , ej >, such that
′
⎛ x1 ⎞
< (x1 , . . . , xn ), (x1 , . . . , xn ) >= (x1 , . . . , xn )C ⎜
⎜ ⋮, ⎟
′ ′
⎟.
⎝xn ′ ⎠
In the case of the standard hermitian product, the matrix associated with it is the
identity matrix I since < ei , ej >h = δij. It is not difficult to prove, in complete analogy with the case of scalar products, that a matrix C is associated with a hermitian product if and only if C = \overline{C}^T, that is, it coincides with its complex conjugate transpose.
In other words, the hermitian product of vectors in Rⁿ coincides with the usual Euclidean product in Rⁿ.
If A is a matrix with real entries, we have that:
\[
\langle Au, v \rangle_h = \langle u, A^T v \rangle_h \quad \text{for each } u, v \in C^n. \qquad (11.7)
\]
In fact, (11.7) is true for u = ei, v = ej (where the ei are the vectors of the canonical basis); therefore, by the linearity and antilinearity of the hermitian product, it is not difficult to verify that it is true also for generic vectors u, v ∈ Cⁿ.
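As an illustration (our addition, not from the text), relation (11.7) is easy to check numerically for a random real matrix and random complex vectors; the sketch below assumes NumPy and uses names of our choosing.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4
    A = rng.standard_normal((n, n))                       # real matrix
    u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    def herm(x, y):
        # standard hermitian product <x, y>_h = conj(x)^T y
        return np.vdot(x, y)

    print(np.isclose(herm(A @ u, v), herm(u, A.T @ v)))   # expected: True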
Proof. A is a matrix with real entries, however, as real numbers are contained in
the complex field we also have that A ∈ Mn (C). By the Fundamental Theorem of
algebra (see Appendix A), the characteristic polynomial of A, det(A − λI), is equal
to zero for at least one complex number, λ0 ∈ C. We want to prove that λ0 is real,
that is λ0 = λ̄0. Let u ∈ Cⁿ be an eigenvector of eigenvalue λ0, and let < , >_h be the standard hermitian product in Cⁿ, i.e.:
\[
\langle u, v \rangle_h = \overline{u}^T v, \quad \text{for all } u, v \in C^n.
\]
We have:
\[
\overline{\lambda_0}\,\langle u, u \rangle_h = \langle \lambda_0 u, u \rangle_h = \langle Au, u \rangle_h = \overline{u}^T\,\overline{A}^T u = \overline{u}^T A\,u = \langle u, Au \rangle_h,
\]
since A^T = A and, because A is a matrix with real entries, Ā = A. Therefore,
\[
\overline{\lambda_0}\,\langle u, u \rangle_h = \langle u, Au \rangle_h = \langle u, \lambda_0 u \rangle_h = \lambda_0\,\langle u, u \rangle_h.
\]
Hence:
\[
(\lambda_0 - \overline{\lambda_0})\,\langle u, u \rangle_h = 0.
\]
From the fact that u ≠ 0, since it is an eigenvector, it follows that < u, u >_h ≠ 0, and then λ0 = λ̄0.
In the rest of this appendix, we want to revisit the results we have stated in this chapter for a real vector space with a positive definite scalar product, in the case of a complex vector space with a positive definite hermitian product. As we will see, all the main theorems, including the Spectral Theorem, have statements and proofs similar to those already seen, which we therefore leave as an exercise.
Reading this part is not necessary for understanding the real case; we include it for completeness.
In analogy with the real case, we can give the following definitions. In the complex
case, the notions unitary and hermitian linear maps or matrices replace the corre-
sponding notions of real symmetric and orthogonal ones, respectively.
Definition 11.7.5 Let V be a complex vector space with a positive definite hermitian
product < , >h . We say that a linear transformation T ∶ V ⟶ V is unitary if
\[
\langle T(u), T(v) \rangle_h = \langle u, v \rangle_h \quad \text{for all } u, v \in V,
\]
and that it is hermitian if
\[
\langle T(u), v \rangle_h = \langle u, T(v) \rangle_h \quad \text{for all } u, v \in V.
\]
We say that a complex coefficient matrix A is unitary if A⁻¹ = A*, where A* = \overline{A}^T; instead we say it is hermitian if A = A*, that is, A coincides with its transpose complex conjugate. We note that these two operations, that is, transposition of A and conjugation of each entry of A, can be interchanged; that is, the result is independent of which one we choose to do first.
It is easy to verify that the hermitian condition on A corresponds to the fact that with respect to the hermitian scalar product in Cⁿ we have:
\[
\langle Au, v \rangle_h = \langle u, Av \rangle_h \quad \text{for all } u, v \in C^n.
\]
Also note that if A is a real symmetric matrix, then it is also a hermitian matrix. In fact, it satisfies the condition A = A*, as the complex conjugate of a real number is the number itself.
We can state the analogue of Proposition 11.7.6, whose proof is the same as in the
real case.
Given a complex vector space V with an hermitian scalar product < , >h , we say that
u, v ∈ V are perpendicular (orthogonal) if < u, v >h = 0.
If V is finite dimensional and the hermitian scalar product is positive definite, with
the same calculations as in Section 10.6 it can be proved that there exists a basis B
of V consisting of vectors of norm 1 such that any two of them are orthogonal, where
the norm of a vector u is defined as ∥u∥ = √(< u, u >_h).
Proposition 11.7.6 Let V be a complex vector space of finite dimension, with a pos-
itive definite hermitian product < , >h . Let T ∶ V ⟶ V be a linear transformation,
and let B be an orthonormal basis. Then:
1. T is hermitian if and only if its associated matrix with respect to the basis B is
an hermitian matrix.
2. T is unitary if and only if its associated matrix with respect to the basis B is a
unitary matrix.
Similarly to the real case, we can also state and prove the following results.
Finally, we can state the Spectral Theorem for hermitian linear maps and equivalently
for hermitian matrices. The proof is the same as the one for the real case.
Applications of Spectral Theorem and Quadratic Forms
In this chapter, we want to study some consequences of the Spectral Theorem for
scalar products and quadratic forms associated with them.
where IB,A is the basis change matrix between the bases B and A.
Now suppose we have a vector space V with a positive definite scalar product <, >V .
As we saw in Observation 10.6.4, if we fix an orthonormal basis, <, >V is associated
with the identity matrix. Therefore, if we write the vectors of V using the coordinates
n
with respect to the chosen orthonormal basis, we can identify V with R and < , >V
with the standard scalar product.
Now, consider an arbitrary scalar product < , > in V (not necessarily positive definite or non-degenerate). We will see shortly that, thanks to the Spectral Theorem, it is possible to choose a basis N, orthonormal with respect to < , >_V, such that the matrix associated with the scalar product < , > with respect to the basis N is diagonal. Hence, N will be an orthogonal basis (not necessarily orthonormal) also with respect to < , >. This will allow us to determine immediately some fundamental properties of the scalar product < , >. For example, we can determine if the product is non-degenerate or positive definite, simply by looking at the signs of the elements on the diagonal (which are, in fact, the eigenvalues) of the matrix associated to < , > with respect to N.
We start with an equivalent statement of the Spectral Theorem.
Theorem 12.1.1 Let V be a vector space of dimension n with a positive definite
scalar product < , >V . Let < , > be another scalar product in V . Then, there exists an
orthonormal basis N for < , >V , which is also orthogonal for < , >.
Proof. By the Gram-Schmidt Theorem 10.6.3, there exists a basis A, which is or-
thonormal for < , >V . Let C be the matrix associated with < , > with respect to the
basis A and let T ∶ V ⟶ V be the linear application associated with C with re-
spect to the basis A in both the domain and codomain. By the Spectral Theorem,
there exists a basis N , orthonormal for the positive definite scalar product < , >V ,
consisting of eigenvectors of T . If P = IN ,A is the matrix of basis change between N
and A, we have, again by the Spectral Theorem, that P is orthogonal, i.e. P⁻¹ = P^T. Therefore, since N is a basis of eigenvectors, we can write:
\[
D = P^{-1} C P = P^T C P, \qquad (12.2)
\]
Remark 12.1.2 Formula (12.2) tells us a surprising fact: given a vector space V
with a positive definite scalar product and a fixed orthonormal basis A, there exists
an orthonormal basis N , such that we can write the basis change formula for a
symmetric linear application T as the basis change formula for a scalar product < , >
in the same way!
Hence, we can use the theory of diagonalization of linear applications, which we studied in Chapter 9, to solve the diagonalizability problem for scalar products. We need to keep in mind two things:
1. Unlike what happens for linear applications, scalar products are always diagonalizable. This happens because, once an ordered basis is fixed, a scalar product is associated to a symmetric matrix, and the Spectral Theorem guarantees us the diagonalizability of such matrices via an orthogonal matrix P.
dealt with. We want to emphasize that orthogonal basis changes are particularly
useful in applications, especially in physics.
From the previous theorem, we immediately have a corollary, which is very important
for applications.
Corollary 12.1.3 Let < , > be a scalar product in Rⁿ. Then, there exists a basis N, orthonormal for the Euclidean scalar product, such that the matrix associated with < , > is diagonal.
Let us now see how the above results can be applied to determine if an arbitrary
scalar product is positive definite and non-degenerate.
Proposition 12.1.4 Let V be a vector space of dimension n with a scalar product
< , > associated with a diagonal matrix D with respect to a given basis N . Then:
1. < , > is non-degenerate if and only if all the elements on the diagonal of D are
non zero;
2. < , > is positive definite if and only if all elements on the diagonal of D are
positive.
Proof. Given two vectors u, v ∈ V, let (u)_N = (α1, …, αn)^T, (v)_N = (β1, …, βn)^T.
This implies that at least one of the λi is negative or null, which gives a contradiction.
The matrix associated with it with respect to the canonical basis is:
\[
C = \begin{pmatrix} -4 & 2 \\ 2 & -4 \end{pmatrix}.
\]
The Spectral Theorem guarantees us that such a matrix can be diagonalized through
an orthogonal change of basis. With an easy calculation, we see that the eigenvalues
of C are λ1 = −2, λ2 = −6, and its eigenspaces are: V₋₂ = ⟨(1, 1)⟩, V₋₆ = ⟨(1, −1)⟩.
By the Spectral Theorem we immediately have that
\[
\begin{pmatrix} -2 & 0 \\ 0 & -6 \end{pmatrix} = P^{-1} C P = P^T C P =
\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}
\begin{pmatrix} -4 & 2 \\ 2 & -4 \end{pmatrix}
\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}.
\]
By the previous proposition, with respect to the ordered orthonormal basis N = ((1/√2, 1/√2), (1/√2, −1/√2)), the scalar product < , > is associated to the diagonal matrix:
\[
D = \begin{pmatrix} -2 & 0 \\ 0 & -6 \end{pmatrix}.
\]
We can then conclude that < , > is non-degenerate but not positive definite.
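As a numerical aside (ours, not the book's, assuming NumPy), the eigenvalues and the orthogonal diagonalization used in this example are immediate to confirm:

    import numpy as np

    C = np.array([[-4.0,  2.0],
                  [ 2.0, -4.0]])

    # Orthonormal eigenvectors as columns of P.
    P = np.array([[1, 1],
                  [1, -1]]) / np.sqrt(2)

    print(np.linalg.eigvalsh(C))        # expected: [-6. -2.]
    print(np.round(P.T @ C @ P, 10))    # expected: diag(-2, -6)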
Thanks to the previous example, we can make an easy observation, which is very
important for exercises.
Observation 12.1.6 Let < , > be a scalar product in a vector space V of dimension
n, and let C be the associated matrix, with respect to any basis of V . Then:
1. < , > is non-degenerate if and only if C has no null eigenvalues;
2. < , > is positive definite if and only if C has no negative or null eigenvalues.
In fact, by Corollary 12.1.3 there exists a basis N , such that the matrix associated
to < , > is diagonal. Looking at the proof of Theorem 12.1.1, we see that this matrix
has the eigenvalues of C on its diagonal. Hence, our claims descend from Proposition
12.1.4.
\[
C = \begin{pmatrix} -2 & 2 & 0 \\ 2 & 0 & -1 \\ 0 & -1 & 1 \end{pmatrix}.
\]
Observation 12.2.4 In Rⁿ consider the following function q : Rⁿ ⟶ R:
\[
q(x_1, \dots, x_n) = a_{11}x_1^2 + a_{12}x_1x_2 + a_{13}x_1x_3 + \dots + a_{1n}x_1x_n + a_{22}x_2^2 + a_{23}x_2x_3 + \dots + a_{2n}x_2x_n + \dots + a_{nn}x_n^2.
\]
We construct the symmetric matrix C as follows:
\[
C = \begin{pmatrix}
a_{11} & \frac{a_{12}}{2} & \frac{a_{13}}{2} & \dots & \frac{a_{1n}}{2} \\
\frac{a_{12}}{2} & a_{22} & \frac{a_{23}}{2} & \dots & \frac{a_{2n}}{2} \\
\dots \\
\frac{a_{1n}}{2} & \frac{a_{2n}}{2} & \frac{a_{3n}}{2} & \dots & a_{nn}
\end{pmatrix}. \qquad (12.6)
\]
Then we can write:
\[
q(x_1, \dots, x_n) = (x_1, \dots, x_n)\; C \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
\]
So the function q represents a quadratic form in Rⁿ and we can immediately check, using (12.6), that every quadratic form in Rⁿ is of this form.
Let us see an example.
Example 12.2.5 Consider q : R³ ⟶ R given by: q(x, y, z) = x² + 2xy + 3zy − 2z². We have that q is a quadratic form, and the matrix associated with it is given by:
\[
q(x, y, z) = (x\;\; y\;\; z) \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 3/2 \\ 0 & 3/2 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.
\]
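A small Python check of Example 12.2.5 (our addition, assuming NumPy): the matrix built from the coefficients reproduces q at randomly chosen points.

    import numpy as np

    C = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.5],
                  [0.0, 1.5, -2.0]])

    def q(x, y, z):
        return x**2 + 2*x*y + 3*z*y - 2*z**2

    rng = np.random.default_rng(1)
    for _ in range(5):
        v = rng.standard_normal(3)
        assert np.isclose(q(*v), v @ C @ v)   # x^T C x reproduces the quadratic form
    print("ok")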
Definition 12.2.7 Let q be a quadratic form on Rⁿ. We define the signature of q
as the pair (r, s), where r and s are the number of positive and negative eigenvalues
respectively, of the matrix C associated with q with respect to the canonical basis,
each counted with multiplicity.
We note that, in this definition, instead of the canonical basis, we may choose any
orthonormal basis to determine C. In fact, we know that all matrices associated with
q have the same eigenvalues.
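For instance (our sketch, not from the text, assuming NumPy), the signature can be read off numerically from the eigenvalues of the associated symmetric matrix:

    import numpy as np

    def signature(C, tol=1e-10):
        """Return (r, s): numbers of positive and negative eigenvalues of the
        symmetric matrix C, counted with multiplicity."""
        eigenvalues = np.linalg.eigvalsh(C)
        r = int(np.sum(eigenvalues > tol))
        s = int(np.sum(eigenvalues < -tol))
        return r, s

    # Matrix of Example 12.2.5.
    C = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.5],
                  [0.0, 1.5, -2.0]])
    print(signature(C))    # expected: (2, 1)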
q(x, y) = c, (12.7)
• λ1 , λ2 > 0 ellipse;
Thanks to the Principal Axis Theorem 12.2.6, we can treat the general case as one
of these two geometric figures. We will therefore say that a quadratic form is in
canonical form, if it takes the expression (12.8). Let us see an example.
Example 12.3.1 We want to draw the curve in the plane whose points have coordinates satisfying the equation: 5x² − 4xy + 5y² = 48.
The matrix associated with the quadratic form q(x, y) = 5x² − 4xy + 5y² is:
\[
A = \begin{pmatrix} 5 & -2 \\ -2 & 5 \end{pmatrix}.
\]
The eigenvalues of A are 3 and 7, and the corresponding eigenvectors of unit length are u1 = (1/√2, 1/√2), u2 = (−1/√2, 1/√2). Using the coordinates with respect to the basis B = {u1, u2}, we have q(x′, y′) = 3(x′)² + 7(y′)². So q(x, y) = 48 is an ellipse.
[Figure: the ellipse drawn with respect to the rotated axes x′ and y′.]
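A short NumPy verification of this example (our addition): the eigen-decomposition of A gives the coefficients of the canonical form.

    import numpy as np

    A = np.array([[ 5.0, -2.0],
                  [-2.0,  5.0]])

    eigenvalues, eigenvectors = np.linalg.eigh(A)
    print(eigenvalues)      # expected: [3. 7.]
    print(eigenvectors)     # columns: unit eigenvectors u1, u2 (up to sign)

    # In the rotated coordinates the curve 5x^2 - 4xy + 5y^2 = 48 becomes
    # 3(x')^2 + 7(y')^2 = 48, an ellipse.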
\[
A = \begin{pmatrix} 1 & -4 \\ -4 & -5 \end{pmatrix}.
\]
The eigenvalues of A are −7 and 3, and the corresponding eigenvectors of unit length are u1 = (1/√5, 2/√5), u2 = (−2/√5, 1/√5). Using coordinates with respect to the basis B = {u1, u2}, we have q(x′, y′) = −7(x′)² + 3(y′)². So q(x, y) = 16 is a hyperbola.
Say if it is non-degenerate, positive definite and give its signature. Write the scalar
product associated with it with respect to the canonical basis.
Also, determine a basis with respect to which this scalar product is associated with
a diagonal matrix.
Solution. First we write the matrix associated with the given quadratic form, in the
canonical basis:
\[
C = \begin{pmatrix} 1 & 0 & 1 \\ 0 & -1 & -1 \\ 1 & -1 & 0 \end{pmatrix}.
\]
Let us compute the eigenvalues:
\[
\lambda_1 = -\sqrt{3}, \qquad \lambda_2 = \sqrt{3}, \qquad \lambda_3 = 0.
\]
The quadratic form is degenerate as one of the eigenvalues of the associated matrix
is equal to zero. It is not positive definite as it is degenerate. The signature is (1, 1).
The scalar product associated with q with respect to the canonical basis is given by:
\[
\langle (x, y, z), (x', y', z') \rangle = (x, y, z) \begin{pmatrix} 1 & 0 & 1 \\ 0 & -1 & -1 \\ 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}.
\]
We can see immediately, without the need for further calculations, that it is a hyper-
bola, whose canonical form is given by:
\[
q(x', y') = (1 + \sqrt{5})(x')^2 + (1 - \sqrt{5})(y')^2.
\]
[Figure: the hyperbola drawn with respect to the original axes x, y and the rotated axes x′, y′.]
a) Write the matrix associated with it with respect to the canonical basis.
b) Write the quadratic form in canonical form q(x1, x2) = ax1² + bx2² for appropriate a and b.
c) Draw the curve described by the equation q(x1 , x2 ) = 48 in the cartesian plane.
12.5.2 Given the quadratic form q(x1, x2, x3) = 3x1² + 2x2² + x3² + 4x1x2 + 4x2x3.
a) Write the matrix A associated with it with respect to the canonical basis.
b) Say if q is positive definite.
c) Find (if possible) a matrix P such that D = P⁻¹AP is diagonal.
d) Write the quadratic form q1 associated with D and establish the relation between
q1 and q.
12.5.3 Given the matrix:
\[
C = \begin{pmatrix} 4 & 2 & 3 \\ 2 & 0 & 2 \\ 3 & 2 & 4 \end{pmatrix}.
\]
a) Write the scalar product and the quadratic form associated with it with respect to the canonical basis of R³.
b) Determine whether the scalar product in (a) is positive definite and/or non-
degenerate.
c) Compute the signature of the quadratic form in (a).
d) Determine a basis in which the scalar product given in (a) is associated with a
diagonal matrix.
[Help: 8 and −1 are eigenvalues of C.]
12.5.4 Consider the quadratic form: q(x, y) = x² + 5xy.
1) Write the scalar product associated with it and determine if it is non-degenerate
or positive definite. Compute the signature of q.
2) Given the curve in the plane consisting of the points with coordinates satisfying the equation x² + 5xy = 1, say what curve it is.
12.5.5 Draw the curve x1² − 8x1x2 − 5x2² = 16.
Lines and Planes

13.1 POINTS AND VECTORS IN R³

Consider R³, the set of ordered triples of real numbers:
\[
R^3 = \{(x, y, z) \mid x, y, z \in R\}.
\]
We can represent the elements of R³ as points, where:
• We have a point O, called the origin, corresponding to the element (0, 0, 0).
• We choose three lines through O perpendicular to each other, and we call them
coordinate axes, denoted respectively as the x, y and z axis. We usually think of the x and y axes as horizontal and of the z axis as vertical (see Fig. 13.1).
• We set an orientation of the axes according to the right-hand rule (see also Fig.
13.1). We first choose a direction for the x-axis and one for the y-axis and we
call these positive directions. For the z axis the positive direction is identified
as follows: if we wrap the fingers of the right hand (pointer, middle, ring finger,
little finger) around the z-axis from the x-axis in the positive direction, to the
y-axis in the positive direction, the thumb points in the direction that we call
positive for the z axis.
[Fig. 13.1: the coordinate axes x, y and z, oriented according to the right-hand rule.]
We can then associate with a point P the three ordered real numbers (a, b, c) that we call the coordinates of P (see Fig. 13.2). From now on, we will identify R³ with the points of space through this representation.
Fig. 13.2
We leave it to the reader as an easy exercise to prove that d(P, Q) actually represents the length of the segment which connects P with Q.
We now define the notion of vector, which is extremely important for our discussion.
In this appendix we will use round brackets for the points of R³ and square brackets for vectors, in order to mark the difference between these two notions.
Given a point P = (a, b, c) ∈ R³, we define the position vector of P as:
\[
v = \overrightarrow{OP} = [a, b, c].
\]
We represent the position vector as an arrow from the origin with its tip in P .
Given the points P = (a, b, c), Q = (a′, b′, c′) ∈ R³, we define the vector \overrightarrow{PQ} as:
\[
w = \overrightarrow{PQ} = [\,a' - a,\; b' - b,\; c' - c\,].
\]
We can represent the vector w as an arrow starting from the origin and parallel to
the segment P Q.
[Fig. 13.3: the vector w = \overrightarrow{PQ}, drawn as an arrow from the origin parallel to the segment PQ.]
Given two vectors u = [u1 , u2 , u3 ], v = [v1 , v2 , v3 ] we can define their sum as follows:
u + v = [u1 + v1 , u2 + v2 , u3 + v3 ].
Similarly we can define multiplication by a scalar, i.e. by a real number λ:
λu = [λu1 , λu2 , λu3 ].
The two operations of sum between vectors and multiplication of a vector by a scalar
equip the set of vectors with the structure of vector space, that is, the 8 properties
of Definition 2.3.1 of Chapter 2 apply.
We leave the easy verification to the reader.
The dot product has the following properties, which we leave as an easy exercise for
the reader. For each vector u, v, w and for each scalar λ we have:
• Commutativity:
u⋅v=v⋅u
• Distributivity:
u ⋅ (v + w) = u ⋅ v + u ⋅ w
Note that √(u · u) = √(u1² + u2² + u3²) represents the distance of the point P = (u1, u2, u3) from the origin and therefore represents the length of the segment OP. We define as length of the vector u the number √(u · u), which we also denote as ∥u∥.
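As a concrete illustration (our addition, not part of the text), the dot product and the length of a vector are immediate to compute in Python; the helper names below are ours.

    import math

    def dot(u, v):
        # dot product of two 3-component vectors
        return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

    def length(u):
        # length (norm) of u, i.e. sqrt(u . u)
        return math.sqrt(dot(u, u))

    u = [1.0, 2.0, 2.0]
    v = [2.0, -2.0, 1.0]
    print(dot(u, v))     # 0.0: the vectors are perpendicular
    print(length(u))     # 3.0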
Let us now remind the reader of an elementary result of Euclidean geometry, the
cosine theorem, which allows us to find the length of an edge of a triangle with
vertices A, B, C, knowing the lengths of the other two edges and the angle θ between
them (see Fig. 13.4). This theorem states that:
\[
BC^2 = AC^2 + AB^2 - 2 \cdot AC \cdot AB \cos\theta,
\]
[Fig. 13.4: a triangle with vertices A, B and C; the sides AB and AC form the angle θ at the vertex A, and BC is the side opposite to θ.]
Theorem 13.2.1 Let u and v be two nonzero vectors and let θ be the angle having
as sides the half-lines determined by the two vectors. Then
\[
u \cdot v = \|u\|\,\|v\| \cos\theta.
\]
By the cosine theorem, if we consider the triangle with a vertex in the origin and
sides of length ∥u∥, ∥v∥ and ∥u − v∥, we immediately have that:
\[
\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\,\|u\|\,\|v\| \cos(\theta). \qquad (13.3)
\]
From the previous theorem, we can easily obtain the following corollary, which es-
tablishes when two vectors are perpendicular, that is, when the angle between them
is π/2. This result will be very useful for exercises.
Corollary 13.2.2 Two vectors u and v are perpendicular to each other if and only
if u ⋅ v = 0.
Let us now turn to another extremely important product for our discussion: the cross
product, also called vector product. Let u = [u1 , u2 , u3 ], v = [v1 , v2 , v3 ] be two vectors.
We define their vector product as:
\[
u \times v = [\,u_2 v_3 - u_3 v_2,\; u_3 v_1 - u_1 v_3,\; u_1 v_2 - u_2 v_1\,].
\]
The vector product has the following properties, which we leave as an exercise. For each vector u, v, w and for each scalar λ:
• u × u = 0
• Anticommutativity:
u × v = −(v × u)
• Distributivity:
u × (v + w) = (u × v) + (u × w)
• Compatibility with multiplication by a scalar:
(λu) × v = u × (λv) = λ(u × v)
We conclude the section with a result of great importance for the exercises. It is an
immediate consequence of Corollary 13.2.2.
Proposition 13.2.3 The cross product between two vectors u and v is perpendicular
to both u and v.
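A quick numerical illustration of Proposition 13.2.3 (our addition, using NumPy): the cross product of two vectors is orthogonal to both.

    import numpy as np

    u = np.array([1.0, 2.0, 3.0])
    v = np.array([-1.0, 0.0, 2.0])

    w = np.cross(u, v)   # [u2*v3 - u3*v2, u3*v1 - u1*v3, u1*v2 - u2*v1]
    print(w)             # expected: [ 4. -5.  2.]
    print(np.dot(w, u), np.dot(w, v))   # both 0: w is perpendicular to u and v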
13.3 LINES IN R³

In the space R³, we want to describe the points lying on a straight line r using equations. We require our line to pass through a point P0 = (x0, y0, z0), and we ask that its direction is determined by the vector v = [v1, v2, v3]. In short, with the help of a drawing, we can immediately write the equation for P = (x, y, z), the generic point of the line r:
\[
\overrightarrow{OP} = \overrightarrow{OP_0} + t\,[v_1, v_2, v_3]. \qquad (13.4)
\]
Therefore:
[x, y, z] = [x0 , y0 , z0 ] + t[v1 , v2 , v3 ]. (13.5)
We call equation (13.5) a vector parametrization or vector equation of the line r.
[Figure: the line r through the point P0 with direction vector v; the generic point P of r is obtained from P0 by adding a multiple of v.]
In general, the line r is the set of points with coordinates (x, y, z) ∈ R³ expressed as:
\[
\begin{cases} x = x_0 + t v_1 \\ y = y_0 + t v_2 \\ z = z_0 + t v_3 \end{cases} \qquad (13.6)
\]
as the parameter t ∈ R changes. The equations in (13.6) are called parametric equa-
tions of the line r. The vector v is called the direction vector of r. Note that the vector v identifies the direction of r; however, we can use any nonzero multiple of v to define the same line.
Let us see a concrete example.
Example 13.3.1 We want to write parametric equations of the line r through the point P0 = (1, 0, −1) with direction given by the vector v = [2, 1, −1]. We also want to compare r with the line r′ of vector equation [x, y, z] = [2, 0, −3] + t′[−4, −2, 2]. Substituting in formula (13.6), we immediately have the parametric equations of r:
\[
\begin{cases} x = 1 + 2t \\ y = t \\ z = -1 - t. \end{cases} \qquad (13.7)
\]
The two lines r and r′ have the same direction, since the direction vector of r, v = [2, 1, −1], is a multiple of the direction vector of r′, v′ = [−4, −2, 2]. Therefore, the two lines are parallel or they coincide. To establish their mutual position it is sufficient to check if the point P0 belongs to the line r′, that is, if there exists a value of the parameter t′ such that [1, 0, −1] = [2, 0, −3] + t′[−4, −2, 2]. We leave it to the reader to verify that such a value does not exist. Therefore, the two given lines are parallel and they do not coincide.
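A possible way to automate the checks in this example (our sketch, assuming NumPy; the helper names are ours): test whether the direction vectors are proportional and whether P0 satisfies the parametric equations of r′.

    import numpy as np

    def parallel(v, w, tol=1e-12):
        # two direction vectors define parallel lines iff their cross product vanishes
        return np.allclose(np.cross(v, w), 0, atol=tol)

    def on_line(point, p0, v, tol=1e-12):
        # point lies on the line p0 + t v iff (point - p0) is a multiple of v
        return parallel(np.asarray(point) - np.asarray(p0), v, tol)

    P0, v = np.array([1.0, 0.0, -1.0]), np.array([2.0, 1.0, -1.0])    # line r
    Q0, w = np.array([2.0, 0.0, -3.0]), np.array([-4.0, -2.0, 2.0])   # line r'

    print(parallel(v, w))       # True: same direction
    print(on_line(P0, Q0, w))   # False: P0 is not on r', so the lines are parallel and distinct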
Example 13.3.2 We want to write parametric equations of the line r̂ through the
points P1 = (3, 1, −2) and P2 = (5, 2, −3). A direction of r̂ is obtained by taking the
difference between the coordinates of P2 and those of P1 : v = [2, 1, −1]. So r̂ is given
by:
\[
\begin{cases} x = 3 + 2\hat{t} \\ y = 1 + \hat{t} \\ z = -2 - \hat{t}, \end{cases} \qquad (13.8)
\]
where we have chosen P0 = P1 in formula (13.6) (we could just as well have chosen P2 instead).
We now ask whether the line r̂ is parallel to or coincides with the line r of the previous example, since they have the same direction. A quick calculation shows that P1 = (3, 1, −2) belongs to r (take t = 1 in (13.7)), and therefore the two lines coincide.
These examples show that a parametric form of a given line is not unique: we can in fact change the point P0 = (x0, y0, z0) used in the representation (13.6), choosing it arbitrarily among the infinitely many points of the line, or we can multiply the parameter t by an arbitrary nonzero constant: in both cases the line does not change, even though its parametric equations can take a different form.
Let us now see an equivalent way of describing the points of a line in R³ without using a parameter. For any set of parametric equations of a line r, it is always possible to obtain the parameter t from one of the equations and, replacing it in the other two, to obtain a linear system of two equations in the unknowns x, y, z. Such equations are called Cartesian equations of the line r. It is obvious that if the coordinates of a point satisfy parametric equations of the line r, then they also satisfy the Cartesian equations; the converse will be clear at the end of the next section. In fact, we will see that the two linear equations of the system we obtain represent two planes, both containing the line r. The solutions of the linear system are the points that lie in the intersection of the two planes, that is, the points of a line, and this line must indeed be r. Let us see an example.
Example 13.3.3 We want to write the line r of the previous example in Cartesian
form (see 13.3.1). In this case, it is very simple; since t = y, just substitute y instead
of t directly in the other equations:
\[
\begin{cases} x = 1 + 2y \\ z = -1 - y \end{cases} \qquad (13.9)
\]
So x − 2y − 1 = 0 and y + z + 1 = 0 are the Cartesian equations of the line r. We can
then think of the points of the line as the set of solutions of the linear system (13.9).
In the next section, we will see how the previous example can be reinterpreted to see a line as the intersection of two planes in R³.
13.4 PLANES IN R³
We now want to determine the equation that describes the points of a plane perpen-
dicular to a given line with direction n = [a, b, c] and passing through a fixed point
P0 = (x0 , y0 , z0 ). By Corollary 13.2.2, we have that two vectors are perpendicular if
and only if their dot product is zero. Therefore, the set of points of the plane con-
taining P0 = (x0 , y0 , z0 ), perpendicular to the vector n, is obtained by imposing that
the generic point P = (x, y, z) on the plane satisfies the equation:
\[
n \cdot \overrightarrow{P_0 P} = 0.
\]
We write this equation as:
a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0. (13.10)
Equation (13.10) is called a Cartesian equation of the given plane, and the vector n is called a normal vector to the plane.
[Figure: the plane π through the point P0 with normal vector n.]
Example 13.4.1 We want to determine the plane through the point P0 = (1, −1, 2)
and with normal vector n = [2, −3, −1]. Substituting in equation (13.10) we imme-
diately have:
2(x − 1) − 3(y + 1) − (z − 2) = 0.
Thus the plane consists of all the points (x, y, z) that satisfy the equation 2x − 3y − z = 3.
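The computation in Example 13.4.1 is easy to script (our addition; the function name is ours): expand a(x − x0) + b(y − y0) + c(z − z0) = 0 into the form ax + by + cz = d.

    def plane_through(point, normal):
        """Return (a, b, c, d) such that the plane through `point` with normal
        vector `normal` has Cartesian equation ax + by + cz = d."""
        a, b, c = normal
        x0, y0, z0 = point
        d = a * x0 + b * y0 + c * z0    # from a(x-x0) + b(y-y0) + c(z-z0) = 0
        return a, b, c, d

    print(plane_through(point=(1, -1, 2), normal=(2, -3, -1)))
    # expected: (2, -3, -1, 3), i.e. 2x - 3y - z = 3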
It is easy to verify that, on the other hand, any linear equation of the type:
ax + by + cz = d
represents a plane with normal vector n = [a, b, c].
\[
\begin{cases} x = 11 - 2t + 5s \\ y = t \\ z = s. \end{cases}
\]
Let us look at another more complicated example.
\[
\begin{cases} x = 1 + t + 3s \\ y = 0 + 2t + s \\ z = -1 + 2t + 3s. \end{cases} \qquad (13.12)
\]
To determine the Cartesian form, we could obtain t and s from two equations and replace them into the third. However, from formula (13.10) we know that, to determine the plane, it is enough to know the coordinates of a point in the plane and a vector normal to it. To determine a vector normal to the plane, we can take the vector product of the two direction vectors. Such a product is in fact always perpendicular to both vectors. Let us see the calculation:
\[
n = \overrightarrow{PQ} \times \overrightarrow{PR} = [4, 3, -5].
\]
The plane therefore has equation 4(x − 1) + 3y − 5(z + 1) = 0, i.e. 4x + 3y − 5z = 9.
x − 2y + 4z = 5.
The line we need to find has the vector normal to the plane as its direction vector, and therefore we can immediately write the equations of the line r in parametric form:
\[
\begin{cases} x = 1 + t \\ y = -2 - 2t \\ z = 3 + 4t, \end{cases} \qquad (13.13)
\]
which correspond to the equations in Cartesian form:
\[
\begin{cases} 4x - z = 1 \\ 2x + y = 0. \end{cases} \qquad (13.14)
\]
13.5.2 Determine the distance of the point P = (2, −1, 3) from the plane passing
through Q = (1, −1, −8) with normal vector n = (2, −2, −1).
Solution. We immediately write the Cartesian equation of the plane:
2x − 2y − z = 12
and then a set of parametric equations of the line passing through P and with direc-
tion vector n is:
\[
\begin{cases} x = 2 + 2t \\ y = -1 - 2t \\ z = 3 - t. \end{cases}
\]
We then calculate the point R of intersection between this line and the given plane by substituting the generic point obtained from the parametric equations of the line in the equation of the plane:
\[
2(2 + 2t) - 2(-1 - 2t) - (3 - t) = 12.
\]
We obtain t = 1, thus R = (4, −3, 2). Using the distance formula (13.1), we get that the distance between P and R is √(4 + 4 + 1) = 3.
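The whole computation of Exercise 13.5.2 can be verified numerically (our addition, NumPy-based):

    import numpy as np

    P = np.array([2.0, -1.0, 3.0])
    Q = np.array([1.0, -1.0, -8.0])
    n = np.array([2.0, -2.0, -1.0])

    d = np.dot(n, Q)                        # plane: 2x - 2y - z = d
    t = (d - np.dot(n, P)) / np.dot(n, n)   # parameter of the foot of the perpendicular
    R = P + t * n                           # intersection of the line through P with the plane
    print(R)                                # expected: [ 4. -3.  2.]
    print(np.linalg.norm(P - R))            # expected: 3.0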
13.5.3 Consider the line r through the two points P = (1, 0, −2) and Q = (−1, 1, −1), and the line r′ through the point R = (3, 1, −5) with direction vector v′ = [0, −1, 1]. Say if r is perpendicular to r′ and compute the distance between Q and r′.
Solution. We compute a direction vector: v = [−2, 1, 1] and therefore the line r has
parametric equations:
\[
\begin{cases} x = 1 - 2t \\ y = 0 + t \\ z = -2 + t. \end{cases}
\]
We immediately see that v · v′ = 0; however, two lines are perpendicular if they intersect and if they have perpendicular direction vectors. So we have to check if they intersect. We write parametric equations of r′:
\[
\begin{cases} x = 3 \\ y = 1 - t' \\ z = -5 + t'. \end{cases}
\]
At this point, to check whether or not they intersect we solve the system:
\[
\begin{cases} 3 = 1 - 2t \\ 1 - t' = 0 + t \\ -5 + t' = -2 + t. \end{cases}
\]
We see that this system admits a solution for t = −1 and t′ = 2, that is, the point S = (3, −1, −3) belongs to both lines. Therefore, r and r′ are perpendicular.
We now want to calculate the distance between Q and r′. As the lines are perpendicular and intersect at the point S, this distance is given by the distance between the points Q and S, which we can immediately calculate using formula (13.1):
\[
\sqrt{16 + 4 + 4} = 2\sqrt{6}.
\]
x + y − z = 0, y + 2z = 6. (13.15)
Solution. Before proceeding, let us note a very important fact: the expression of a line in Cartesian form, that is, as a system of two equations in the unknowns x, y, z, corresponds to the intersection of two planes, each identified by its own Cartesian equation. The direction of the line is uniquely identified, as it is perpendicular to both normal directions of the two planes. So, to find a vector perpendicular to both the normal vectors of the planes, we calculate the vector product of the two normal vectors n1 = [1, 1, −1] and n2 = [0, 1, 2]:
\[
n_1 \times n_2 = [3, -2, 1].
\]
Let us now choose an arbitrary point on the line, that is, a point that satisfies both equations (13.15). We can for example set z = 0 and obtain the values of x and y from the equations: y = 6 and x = −6. Hence, parametric equations of the line intersection of the two planes are:
\[
\begin{cases} x = -6 + 3t \\ y = 6 - 2t \\ z = t. \end{cases}
\]
r1 ∶ x = 1 + t, y = t, z = 2 − 5t
r2 ∶ x + 1 = y − 2 = 1 − z,
r3 ∶ x = 1 + t, y = 4 + t, z =1−t
\[
\begin{cases} x = 1 + s + 2t \\ y = 3s \\ z = 2 + t. \end{cases}
\]
Determine the line r perpendicular to π and passing through the point P = (2, 1, 0).
b) Determine the plane π′ containing Q = (1, 0, −1) and r, both in parametric and in Cartesian form.
\[
r: \begin{cases} x + y - 1 = 0 \\ z - 1 = 0 \end{cases}
\qquad\qquad
s: \begin{cases} x + y - 2z - 2 = 0 \\ z + 1 = 0; \end{cases}
\]
b) Determine the Cartesian equation of the plane π containing r and the point
P = (1, 0, 1).
c) Find parametric equations for the line s passing through P and orthogonal to
π.
13.6.7 In R³, given the plane π with Cartesian equation π : x + y − 2z + 4 = 0 and the line r with Cartesian equations
\[
\begin{cases} x - 2z + 12 = 0 \\ y - 4 = 0 \end{cases}
\]
π ∶x−y+z+1=0
c) Find Cartesian equations for the line s passing through Q and parallel to the
vector v = [1, 0, 1].
13.6.9 Consider two generic vectors u, v lying in the xy plane. Prove that the norm ∥u × v∥ is the area of the parallelogram with sides u and v.
Introduction to Modular Arithmetic
In this chapter, we want to study the arithmetic of the integers. We will start with
the principle of induction, a result of fundamental importance that has several ap-
plications in various areas of mathematics. We will then continue with the division
algorithm and Euclid's algorithm, arriving at congruences, the most important topic of
elementary discrete mathematics.
We will denote this statement with P (n). First of all, we check its validity for n = 0:
we need to show that the sum of natural numbers between 0 and 0 is 0. In fact,
0 = 0(0 + 1)/2 = 0. Let us now verify that P (1) is true: the sum of the integers
between 0 and 1 is 0 + 1 = 1 = 1(1 + 1)/2. Similarly, P (2): 0 + 1 + 2 = 2(2 + 1)/2 = 3.
It is clear that, with a little patience, we could go on like this by verifying formula
(14.1) when n is very large, but we want to prove that it is true for all natural
numbers n. The induction principle helps us by allowing us to prove the validity of a
statement for all natural numbers.
First of all, we state the axiom of good ordering. We will see later that the principle
of induction and the axiom of good ordering are equivalent to each other. However in
the theory we will illustrate we shall take one as an axiom and then prove the other.
Axiom of good ordering. Each non-empty subset of the set of natural numbers
contains an element that is smaller than all the others.
Example 14.1.2 Let us see how to use the principle of induction to prove the
formula (14.1), which is the following statement P (n):
the sum of the first n natural numbers is n(n + 1)/2.
We have already seen that this statement is true for n = 0, that is, hypothesis 1) of
the principle of induction is true. Now assume that P (k) is true, that is:
\[
0 + 1 + 2 + 3 + \dots + k = \frac{k(k+1)}{2}.
\]
We have to show that P(k) ⟹ P(k + 1) (hypothesis 2 of the induction principle), i.e. that:
\[
0 + 1 + 2 + 3 + \dots + k + (k + 1) = \frac{(k+1)(k+2)}{2}.
\]
Since P(k) is true, we have:
\[
0 + 1 + \dots + k + (k + 1) = \frac{k(k+1)}{2} + k + 1 = \frac{k^2 + k + 2k + 2}{2} = \frac{(k+1)(k+2)}{2},
\]
so P(k + 1) is true, and by the principle of induction formula (14.1) holds for every n ∈ N.
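A tiny computational check of formula (14.1) for the first few values of n (our addition):

    for n in range(0, 20):
        assert sum(range(n + 1)) == n * (n + 1) // 2   # formula (14.1)
    print("formula verified for n = 0, ..., 19")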
1. P (0) is true;
2. if P (j) is true for every j < k then P (k) is true;
then P (n) is true for every n ∈ N.
Proof. It is an immediate consequence of the principle of induction.
This apparently convoluted proof is, however, instructive because it proves the equiv-
alence of three statements, which look quite different from each other.
Proof. Let us first prove the existence of q and r with the principle of complete induction. P(0) is true because n = 0 = 0 · b + 0, with q = r = 0. Assume that P(j) is true for every j such that 0 ≤ j < k; we want to show P(k) (that is, we want to verify hypothesis 2 of the principle of complete induction). If k < b then:
\[
k = 0 \cdot b + k, \quad \text{with } 0 \le k < b,
\]
so P(k) is true for k < b. Now assume that k ≥ b. Since 0 ≤ k − b < k, by applying the inductive hypothesis to k − b we have:
\[
k - b = q_1 b + r_1, \quad \text{with } 0 \le r_1 < b,
\]
hence k = (q1 + 1)b + r1, with 0 ≤ r1 < b, and P(k) is true.
Observation 14.2.2 With a similar proof, which uses the axiom of good ordering,
we obtain that the division algorithm holds for any two integers a and b, not necessarily
belonging to natural numbers. More precisely we have that:
Theorem 14.2.3 Given two integers n and b with b ≠ 0, there exists a unique pair of integers q and r, respectively called quotient and remainder, such that
\[
n = qb + r, \quad \text{with } 0 \le r < |b|.
\]
We now want to introduce the concept of divisibility and of greatest common divisor.
Definition 14.2.4 Let a and b be two integers. We say that b divides a if there is
an integer c such that a = bc and we write b∣a. We say that d is the greatest common
divisor between two numbers a and b if it divides them both and it is the largest
integer with this property. We will denote the greatest common divisor between a
and b with gcd(a, b).
Observation 14.2.5 Note that if a, b and c are integers and a divides both b and
c, then a also divides b + c and b − c. We leave to the reader the easy verification of
this property that we will use several times later.
We now want to find an efficient algorithm to determine the greatest common divisor
between two integers.
Theorem 14.2.6 (Euclid’s algorithm) Let a and b be two positive integers such
that b ≤ a and b does not divide a. Then we have:
\[
\begin{aligned}
a &= q_0 b + r_0, & \text{where } 0 &\le r_0 < b \\
b &= q_1 r_0 + r_1, & \text{where } 0 &\le r_1 < r_0 \\
r_0 &= q_2 r_1 + r_2, & \text{where } 0 &\le r_2 < r_1 \\
&\;\;\vdots \\
r_{t-1} &= q_{t+1} r_t,
\end{aligned}
\]
and the last nonzero remainder rt is the greatest common divisor between a and b.
Proof. By Theorem 14.2.1 we can write: a = q0 b + r0 . Now we want to show that
gcd(a, b) = gcd(b, r0 ). In fact if c∣a and c∣b then c∣r0 , since r0 = a − q0 b. Similarly,
if c∣b and c∣r0 then c∣a = q0 b + r0 . So the set of integers that divide both a and b
coincides with the set of integers that divide both b and r0. Therefore the greatest common divisor of the two pairs (a, b) and (b, r0) is the same. Once this is established, the result follows immediately from the chain of equalities:
\[
\gcd(a, b) = \gcd(b, r_0) = \gcd(r_0, r_1) = \dots = \gcd(r_{t-1}, r_t) = r_t.
\]
Let us see concretely how to use this algorithm to determine the greatest common
divisor of two given numbers.
Example 14.2.7 We want to compute gcd(603, 270). We use Euclid’s algorithm
(Theorem 14.2.6):
603 = 2 ⋅ 270 + 63
270 = 4 ⋅ 63 + 18
63 = 3 ⋅ 18 + 9
18 = 2 ⋅ 9.
So gcd(603, 270) = 9.
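Euclid's algorithm translates directly into a few lines of Python (our sketch, not from the text); running it on 603 and 270 reproduces the result of Example 14.2.7.

    def gcd(a, b):
        # Euclid's algorithm: repeatedly replace (a, b) by (b, a mod b).
        while b != 0:
            a, b = b, a % b
        return a

    print(gcd(603, 270))   # expected: 9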
The following theorem is a consequence of Euclid’s algorithm and will be the funda-
mental tool for the resolution of congruences, which we will study in Section 14.4.
Theorem 14.2.8 (Bézout Identity). Let a, b be positive integers and let d =
gcd(a, b). Then, there are two integers u, v (not unique) such that:
d = ua + vb.
Proof. The proof of this result uses Euclid’s algorithm 14.2.6. We show that at each
step there exist ui , vi ∈ Z such that ri = ui a + vi b.
For r0 the result is true, with u0 = 1 and v0 = −q0 , indeed
r0 = a − q0 b.
Then we have that:
r1 = b − r0 q1 = b − (u0 a + v0 b)q1 = −u0 q1 a + (1 − v0 q1)b
and the result is also true for r1, just take u1 = −u0 q1 and v1 = 1 − v0 q1. In general,
after the step i − 1 we know ui−2 , ui−1 , vi−2 , vi−1 ∈ Z such that:
ri−2 = ui−2 a + vi−2 b, ri−1 = ui−1 a + vi−1 b.
So we have that:
ri = ri−2 − ri−1 qi = ui−2 a + vi−2 b − (ui−1 a + vi−1 b)qi
= (ui−2 − ui−1 qi )a + (vi−2 − vi−1 qi )b,
so the result is true for ri , with ui = ui−2 − ui−1 qi and vi = vi−2 − vi−1 qi . Since
gcd(a, b) is the last nonzero remainder rt , after the step t we know ut and vt such
that rt = gcd(a, b) = ut a + vt b and u = ut , v = vt are the integers we were looking
for.
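The proof above is constructive, and it is essentially how the extended Euclidean algorithm is usually implemented; here is a possible Python version (our addition), returning d = gcd(a, b) together with u, v such that d = ua + vb.

    def extended_gcd(a, b):
        """Return (d, u, v) with d = gcd(a, b) and d = u*a + v*b."""
        old_r, r = a, b
        old_u, u = 1, 0
        old_v, v = 0, 1
        while r != 0:
            q = old_r // r
            old_r, r = r, old_r - q * r
            old_u, u = u, old_u - q * u
            old_v, v = v, old_v - q * v
        return old_r, old_u, old_v

    d, u, v = extended_gcd(603, 270)
    print(d, u * 603 + v * 270)   # both equal 9, as Bezout's identity requires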
Observation 14.2.9 In the previous theorem, the existence of two numbers u, v such
that d = ua+vb does not guarantee that d = gcd(a, b). For example 10 = 15⋅14−25⋅8
but gcd(14, 8) = 2.
We conclude this section with a result that we do not prove and that we will not use
later, but which plays a fundamental role for the theory of the integer numbers and
whose generalizations are extremely important in number theory (see [2]).
Definition 14.2.11 Let us say that a positive integer p is prime if its only divisors
are ±p and ±1.
Theorem 14.2.12 (Fundamental Theorem of Arithmetic). Each integer
greater than 1 is the product of primes in a unique way up to reordering:
n = p1 p2 . . . pr ,
2. ac ≡n bd.
Let us now define the congruence classes, i.e. the sets that contain all integer numbers
which are congruent to each other modulo a certain integer n.
[a]n = {b ∈ Z ∣ b ≡n a} = {a + kn ∣ k ∈ Z}.
Note that:
[4]4 = [0]4 = [−4]4 = . . .
Proposition 14.3.5 Let [a]n and [b]n be two congruence classes modulo n. Then
[a]n = [b]n or [a]n and [b]n are disjoint, that is they do not have common elements.
Proof. Assume there is a common element c between [a]n and [b]n, that is c ≡n a and c ≡n b. So, by the transitive property of congruences, a ≡n b, and then, again by the transitive property, [a]n = [b]n.
2. [0]n , [1]n , . . . , [n − 1]n are all the distinct congruence classes modulo n.
We have come to the most important definition of this chapter: the set Zn .
Definition 14.3.7 The set of integers modulo n, denoted with Zn, is the set of congruence classes modulo n:
\[
Z_n = \{[0]_n, [1]_n, \dots, [n-1]_n\}.
\]
Definition 14.3.8 We define the following sum and product operations on the set
Zn :
[a]n + [b]n = [a + b]n , [a]n [b]n = [ab]n .
Observation 14.3.9 The operations just defined do not depend on the numbers a and b chosen to represent the congruence classes which we add or multiply, but only on their congruence class. In this case, it is said that the operations are well defined. For example, in Z4 we have: [1]4 = [5]4 and [2]4 = [6]4. By definition, [1]4 + [2]4 = [3]4 = [11]4 = [5]4 + [6]4.
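A minimal Python sketch of the operations on Zn (our addition), illustrating that they are well defined: different representatives of the same classes give the same result.

    def add(a, b, n):
        # sum of the classes [a]_n and [b]_n, represented by an integer in {0, ..., n-1}
        return (a + b) % n

    def mul(a, b, n):
        # product of the classes [a]_n and [b]_n
        return (a * b) % n

    # In Z_4: [1] = [5] and [2] = [6], and indeed [1] + [2] = [5] + [6] = [3].
    print(add(1, 2, 4), add(5, 6, 4))   # expected: 3 3
    print(mul(1, 2, 4), mul(5, 6, 4))   # expected: 2 2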
Example 14.3.10 We compute the tables of addition and multiplication for Z3 and Z4, inviting the student to practice by building the analogous tables for Z5 and Z6:
We note some very important facts: in Z3 each element other than [0]3 admits an inverse, that is, for every [a]3 ≠ [0]3 there is an element [b]3 such that [a]3[b]3 = [1]3. This inverse is denoted with [a]3⁻¹. So we have: [1]3⁻¹ = [1]3, [2]3⁻¹ = [2]3. This property does not hold in the case of Z4. In fact, the multiplication table shows that there is no inverse of the class [2]4. As we will see in detail in the next section, this difference is linked to the fact that 3 is a prime number while 4 is not.
14.4 CONGRUENCES
In this section, we aim at solving linear equations in which the unknown belongs to
the set Zn introduced in the previous section.
Let us start by examining the structure of Zp , with p a prime number.
Proposition 14.4.1 The following statements are equivalent:
(1) p is a prime number.
(2) The equation [a]p x = [1]p , with [a]p ≠ [0]p , has a solution in Zp , that is,
every element [a]p ≠ [0]p in Zp admits an inverse.
(3) If [a]p [b]p = [0]p in Zp then [a]p = [0]p or [b]p = [0]p .
Proof. (1) ⟹ (2): since [a]p ≠ [0]p , p does not divide a, so gcd(a, p) = 1. Then
by Theorem 14.2.8 there exist u, v ∈ Z such that 1 = au + pv. Taking the congruence classes modulo p, we have: [1]p = [a]p[u]p + [p]p[v]p = [a]p[u]p, therefore x = [u]p ∈ Zp is a solution of [a]p x = [1]p.
(2) ⟹ (3): we have [a]p [b]p = [0]p with [a]p ≠ [0]p . By hypothesis there is an
inverse of [a]p, that is, there is an element [u]p such that [u]p[a]p = [1]p. Multiplying both sides of the equality [a]p[b]p = [0]p by [u]p we get: [u]p[a]p[b]p = [u]p[0]p = [0]p, i.e. [b]p = [0]p.
(3) ⟹ (1): we assume that p = ab and we show that necessarily a and b are
equal to ±1 or ±p, that is the only divisors of p are, up to changing the sign, p itself
and 1. Considering the absolute values, we observe that ∣a∣∣b∣ = ∣ab∣ = ∣p∣ = p so
∣a∣ ≤ p and ∣b∣ ≤ p. The equality p = ab, translated in Zp , becomes the equality
[p]p = [a]p [b]p i.e. [a]p [b]p = [0]p . By hypothesis we know that either [a]p = [0]p
or [b]p = [0]p . If [a]p = [0]p then a = ±p and b = ±1. If [b]p = [0]p then b = ±p and
a = ±1.
Corollary 14.4.2 If p is a prime number the equation [a]p x = [b]p , with [a]p ≠
[0]p , has a single solution in Zp .
Proof. By property (2) of the previous proposition we know that [a]p is invertible. Multiplying the given equation by [a]p⁻¹ we get x = [a]p⁻¹[b]p, which shows at the same time the existence and the uniqueness of the solution.
Proof. Since gcd(a, n) = 1, by Theorem 14.2.8 there exist u, v ∈ Z such that au+nv =
1, so [a]n is invertible in Zn , with inverse [u]n . Arguing as in the proof of the previous
corollary we get the result.
(2) if d∣b then equation [a]n x = [b]n has exactly d distinct solutions in Zn .
Proof. Assume that [c]n is a solution of the given equation, then: [a]n [c]n = [ac]n =
[b]n i.e. ac ≡n b or, equivalently, n∣ac − b. Consequently d∣b since d∣n and d∣a.
Assume now that d divides b and let a = a′d, n = n′d, b = b′d. Observe that gcd(a′, n′) = 1, otherwise d would not be the greatest common divisor between a and n. Then by Proposition 14.4.3, the equation [a′]n′ x = [b′]n′ has a unique solution in Zn′; let it be [c]n′. Thus we have: [a′]n′ [c]n′ = [a′c]n′ = [b′]n′, i.e. n′ ∣ a′c − b′. Now let [e]n be any solution of the equation [a]n x = [b]n; then n ∣ ae − b, hence n′ ∣ a′e − b′ and therefore n′ ∣ a′c − a′e. It follows that [a′]n′ [e]n′ = [a′]n′ [c]n′ and then [e]n′ is a solution of the equation [a′]n′ x = [b′]n′. By Proposition 14.4.3 we have [e]n′ = [c]n′, i.e. e = c + kn′, with k ∈ Z.
Then it is easy to verify that [e]n ∈ {[c]n, [c + n′]n, [c + 2n′]n, …, [c + (d − 1)n′]n} = X and that the elements of X are all distinct and are all solutions of the equation [a]n x = [b]n. This shows what we wanted.
Example 14.4.6 We want to determine all solutions in Z74 of the equation [33]74 x =
[5]74 .
We use Euclid’s algorithm:
74 = 2 ⋅ 33 + 8
33 = 4 ⋅ 8 + 1
8 =8⋅1
As gcd(33, 74) = 1, we know that the solution exists and is unique, and the calculations just made allow us to compute the inverse of [33]74. We have in fact that:
\[
1 = 33 - 4 \cdot 8 = 33 - 4\,(74 - 2 \cdot 33) = 9 \cdot 33 - 4 \cdot 74,
\]
so [33]74⁻¹ = [9]74 and the solution is x = [9]74 [5]74 = [45]74.
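This is easy to double-check in Python (our addition): since Python 3.8, pow(a, -1, n) returns the inverse of a modulo n when gcd(a, n) = 1.

    a, b, n = 33, 5, 74
    a_inv = pow(a, -1, n)      # modular inverse, available since Python 3.8
    x = (a_inv * b) % n
    print(a_inv, x)            # expected: 9 45
    print((a * x) % n == b)    # True: 33 * 45 = 5 (mod 74)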
Now let us see how Theorem 14.4.5 also allows us to solve linear congruences.
[a]n x = [b]n, found for example with the method used in point (1), the integers e such that [e]n is a solution of the equation [a]n x = [b]n are all those of the type e = c + kn′, with k ∈ Z. Thus the solutions of the linear congruence ax ≡n b are precisely those of the type e = c + kn′, with k ∈ Z.
have that 63 = 3 ⋅ 21, 375 = 3 ⋅ 125, 24 = 3 ⋅ 8, thus we solve the equation [21]125 x =
[8]125 . We want to find the inverse of [21]125 , and to do so we first use Euclid’s
algorithm to compute gcd(125, 21):
125 = 5 ⋅ 21 + 20
21 = 1 ⋅ 20 + 1
20 = 20 ⋅ 1.
Proceeding backwards:
\[
1 = 21 - 1 \cdot 20 = 21 - (125 - 5 \cdot 21) = 6 \cdot 21 - 1 \cdot 125,
\]
so [21]125⁻¹ = [6]125 and the solution is x = [6]125 [8]125 = [48]125.
14.6.2 Prove the Fundamental Theorem of Arithmetic 14.2.12 using the principle of
complete induction.
14.6.3 Prove (by induction) that if n is a non-negative integer then 2ⁿ > n.
14.6.4 Say if there are two classes [a]37 , [b]37 in Z37 , both nonzero such that
[a]37 [b]37 = [0]37. If they exist compute them, if they do not exist explain why. Answer
the same question replacing Z37 with Z36 .
14.6.5 Consider the two congruence classes [0]6 and [3]12. Say if they are the same, if they are different from each other, or if one is contained in the other.
We can describe a set by listing its elements, for example:
X = {3, 4, 5, 6, 7},
or we can assign a property of its elements; for example, in the previous case, X is
the set of natural numbers greater than 2 and smaller than 8, and can be referred to
as:
X = {x∣ x is a natural number and 2 < x < 8}.
To denote that an element belongs to a set, we use the symbol ∈, whose negation is ∉. For example, in the previous case we have 5 ∈ X, 9 ∉ X.
Some sets often used in the text are:
N = {0, 1, 2, . . . , n, n + 1, . . . } set of natural numbers,
Z = {0, ±1, ±2, . . . , ±n, . . . } set of integer numbers,
R set of real numbers.
Two sets X and Y are the same if they have the same elements, and in this case we
write X = Y .
Definition 14.7.1 If X and Y are sets, let us say X is a subset of Y if each element
of X is also an element of Y , and we write X ⊆ Y .
Then there is a special set, the empty set, that is the set with no elements and is
denoted with ∅. Note that the empty set it is a subset of any set X. We have for
example:
\[
\{x \mid x \in R,\; x^2 = -1\} = \emptyset,
\]
because no real number squared gives −1 as a result. Care must be taken;
for example X = {0} is the subset of the real numbers that contains the single element
zero, however, it is not the empty set because it contains an element.
Let us now recall the two fundamental operations that can be carried out between
sets.
• The union of two sets X and Y is the set of all the elements that belong to X
or to Y and is denoted with X ∪ Y .
• The intersection of two sets X and Y is the set of all the elements that belong
to both X and Y and is denoted with X ∩ Y .
APPENDIX A
Complex Numbers
In this appendix, we introduce the set of complex numbers, necessary for a deeper
understanding of the question of finding solutions of algebraic equations. All linear
algebra results we describe in this book concerning real vector spaces are also true if we replace real numbers with complex numbers, without any modification to the theory.
Since this topic involves an additional difficulty, we prefer to present our treatment
of linear algebra limiting ourselves to the case of real scalars and leaving the complex
number case in this appendix.
The set of real numbers is a subset of C, because we can write any real number
a ∈ R in the form a = a + 0i. We call complex numbers of the type bi = 0 + bi purely
imaginary. If z = a + bi is a complex number, the real numbers a and b are called the
real part and imaginary part of z, respectively.
We can represent complex numbers in the Cartesian plane as follows: we associate to a + bi the pair of real numbers (a, b). In this plane, the x-axis represents the real numbers and is called the real axis, while the y-axis represents the purely imaginary complex numbers and we call it the imaginary axis.
[Figure: the complex numbers −1 + 2i, 1 + i and 1/2 − 2i represented as points in the plane, with real axis Re and imaginary axis Im.]
Given a complex number α = a + bi, we define its complex conjugate (or conjugate) $\overline{\alpha}$ as $\overline{\alpha} = a - bi$. We also define the modulus of a complex number α as
\[
|\alpha| = \sqrt{\alpha\overline{\alpha}} = \sqrt{a^2 + b^2}.
\]
Conjugation satisfies the following properties:
• $\overline{\alpha} = \alpha$ if and only if α ∈ R,
• $\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$ for each α, β ∈ C,
• $\overline{\alpha\beta} = \overline{\alpha}\,\overline{\beta}$ for each α, β ∈ C.
One of the most important properties of complex numbers is that the inverse α⁻¹ of any nonzero complex number α = a + bi can be computed explicitly:
\[
\alpha^{-1} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i.
\]
This allows us to immediately compute the quotient of two complex numbers. Instead
of remembering the formula, we invite the student to understand the procedure de-
scribed in the following example.
Example A.1.2 Let us consider the quotient of complex numbers (3 − 2i)/(1 − i). We want to express this quotient as a + bi for appropriate a and b. We proceed by multiplying the numerator and denominator by the complex conjugate of the denominator. The student will recognize the analogy with the procedure used to rationalize the denominator of a fraction:
\[
\frac{3 - 2i}{1 - i} = \frac{3 - 2i}{1 - i} \cdot \frac{1 + i}{1 + i} = \frac{(3 - 2i)(1 + i)}{|1 - i|^2} = \frac{3 - 2i + 3i - 2i^2}{2} = \frac{5}{2} + \frac{1}{2}\, i.
\]
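Python's built-in complex type gives a quick way to confirm this computation (our addition):

    z = (3 - 2j) / (1 - 1j)
    print(z)                   # expected: (2.5+0.5j), i.e. 5/2 + (1/2)i

    # The same result via the conjugate trick used in the example:
    w = (3 - 2j) * (1 + 1j) / ((1 - 1j) * (1 + 1j))
    print(abs(w - z) < 1e-12)  # True: same value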
We conclude this section with a list of the properties of operations in complex num-
bers, the verification of which is left to the reader as an easy exercise.
• product associativity:
(αβ)γ = α(βγ) for each α, β, γ ∈ C;
\[
\alpha_1 \alpha_2 = \rho_1(\cos\theta_1 + i\sin\theta_1)\,\rho_2(\cos\theta_2 + i\sin\theta_2)
= \rho_1\rho_2\big((\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + (\cos\theta_1\sin\theta_2 + \sin\theta_1\cos\theta_2)\,i\big)
= \rho_1\rho_2\big(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\big). \qquad (A.1)
\]
The trigonometric form of a complex number allows us to compute its n-th roots fairly
quickly, through De Moivre’s formula. Thanks to formula (A.1) we can compute the
powers of a complex number:
\[
\begin{aligned}
\alpha &= \rho(\cos\theta + i\sin\theta) \\
\alpha^2 &= \rho^2(\cos 2\theta + \sin 2\theta\, i) \\
\alpha^3 &= \rho^3(\cos 3\theta + \sin 3\theta\, i) \\
&\;\;\vdots \\
\alpha^n &= \rho^n(\cos n\theta + \sin n\theta\, i).
\end{aligned} \qquad (A.2)
\]
Example A.2.1 We want to determine all the cube roots of 1 + i. According to the formula (A.3) they are given by:
\[
\sqrt[6]{2}\,\{\cos[(\pi/4 + 2k\pi)/3] + \sin[(\pi/4 + 2k\pi)/3]\, i\}, \qquad k = 0, 1, 2,
\]
that is:
\[
\begin{aligned}
\alpha_1 &= \sqrt[6]{2}\,\{\cos(\pi/12) + \sin(\pi/12)\, i\} \\
\alpha_2 &= \sqrt[6]{2}\,\{\cos[(\pi/4 + 2\pi)/3] + \sin[(\pi/4 + 2\pi)/3]\, i\} = \sqrt[6]{2}\,\{\cos\tfrac{3\pi}{4} + \sin\tfrac{3\pi}{4}\, i\} \\
\alpha_3 &= \sqrt[6]{2}\,\{\cos[(\pi/4 + 4\pi)/3] + \sin[(\pi/4 + 4\pi)/3]\, i\} = \sqrt[6]{2}\,\{\cos\tfrac{17\pi}{12} + \sin\tfrac{17\pi}{12}\, i\}.
\end{aligned}
\]
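One way to check these values numerically (our addition) is with Python's cmath module:

    import cmath

    z = 1 + 1j
    rho, theta = cmath.polar(z)      # rho = sqrt(2), theta = pi/4

    roots = [
        rho ** (1 / 3) * cmath.exp(1j * (theta + 2 * cmath.pi * k) / 3)
        for k in range(3)
    ]
    for r in roots:
        print(r, r ** 3)             # each r**3 is (numerically) equal to 1+1j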
[Figure: the three cube roots α1, α2, α3 of 1 + i in the complex plane.]
We conclude this section by stating a very important result: the Fundamental The-
orem of Algebra, whose proof is particularly difficult. Since it is beyond the scope
of this book, we refer the reader to one of several specific texts (see for instance S.
Lang, Algebra [3]).
Theorem A.2.2 Any polynomial of degree n with complex coefficients
\[
p(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0, \qquad a_n, \dots, a_0 \in C,
\]
factors into a product of linear factors:
\[
p(x) = a_n (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n),
\]
with α1, …, αn ∈ C.
This specifically implies that a polynomial equation of degree n with coefficients in
C always has n complex solutions, although not necessarily distinct.
Let us see an example.
Example A.2.3 We want to find all the solutions of the equation x⁴ − 16 = 0. We can immediately factor the polynomial as:
\[
x^4 - 16 = (x^2 - 4)(x^2 + 4) = (x - 2)(x + 2)(x + 2i)(x - 2i).
\]
Therefore the zeros of the polynomial, corresponding to the solutions of the given
equation, are: ±2, ±2i. We can obtain the same result also by applying formula
(A.3):
2 = 2[cos(0) + sin(0) i],
Solutions of some suggested exercises
3.4.15 k ≠ 3.
3.4.16 a) {(−2, 1, 0), (0, 0, 1)}.
⟨v1 , v2 , v3 ⟩.
4.5.11 k = −5, k = 2.
4.5.12 k ≠ ±6.
4.5.13 k ≠ − 23 .
4.5.17 (a) k ≠ 0, k ≠ 1/10.
10.8.4 Basis for W: {(0, −1, 1, 0), (1/2, −1/2, 0, 1)}. Orthonormal basis for W: {(0, −1/√2, 1/√2, 0), (2/√22, −1/√22, −1/√22, 4/√22)}. Orthonormal basis for W⊥: {(0, 1/√3, −1/√3, 1/√3)}. c) No.
10.8.7 a) Basis for W⊥: {(0, 1, −1, 1)}. b) Orthonormal basis for W: {(1/√3, −1/√3, 0, 1/√3), (√(2/3), 1/√6, 0, −1/√6), (0, 1/√6, √(2/3), 1/√6)}.
5x − 9y + 4z = 1.
13.6.6 a) Parametric equations for r: x = t, y = 1+t, z = −1/3−(2/3)t. b) Cartesian
equation for π: x − 3y − 3z + 2 = 0. c) Parametric equations for s: x = 1 + t, y = −3t,
z = 1 − 3t.
13.6.7 a) Parametric equations for r: x = −12 + 2t, y = 4, z = t. b) They are parallel. c) The distance is (2/3)√6.
[4] S. Lang. Undergraduate Algebra. Springer Science & Business Media, 2005.
[5] S. Lang. Introduction to Linear Algebra. Springer Science & Business Media,
2012.
Index
R², 26
Zn, 243
basis, 58
basis change for scalar products, 182
Bezout identity, 239
bilinear application, 177
bilinear form, 177
good ordering axiom, 235
Gram-Schmidt algorithm, 188
greatest common divisor, 238
hermitian product, 204
  non-degenerate, 205
  positive definite, 205
homogeneous equation, 1