
Revision notes - MA2101

Ma Hongqiang
April 27, 2017

Contents
1 Vector Spaces over a Field

2 Vector Subspaces

3 Linear Spans and Direct Sums of Subspaces

4 Linear Independence, Basis and Dimension

5 Row Space and Column Space

6 Quotient Spaces and Linear Transformations

7 Representation Matrices of Linear Transformations

8 Eigenvalue and Cayley-Hamilton Theorem

9 Minimal Polynomial and Jordan Canonical Form

10 Quadratic Forms, Inner Product Spaces and Conics

11 Problems

1 Vector Spaces over a Field
Definition 1.1 (Field, Rings, Groups).
Let F be a set containing at least two elements and equipped with the following two binary
operations + (the addition, or plus) and × (the multiplication, or times), where F × F :=
{(x, y) | x, y ∈ F} is the product set of F with itself:

+ :F × F → F
(x, y) 7→ x + y;
× :F × F → F
(x, y) 7→ x × y;

Axiom(0) Addition and multiplication are well defined on F in the sense that:

∀x ∈ F, ∀y ∈ F ⇒ x + y ∈ F
∀x ∈ F, ∀y ∈ F ⇒ xy ∈ F

Namely, the operation map +(resp. ×) takes element (x, y) in the domain F × F to some ele-
ment x + y(resp. xy) in the codomain F. The quintuple (F, +, 0; ×, 1) with two distinguished
elements 0 (the additive identity) and 1 (the multiplicative identity), is called a field if the
following Eight Axioms (and also Axiom (0)) are satisfied.
(1) Existence of an additive identity 0F or simply 0:

x + 0 = x = 0 + x, ∀x ∈ F

(2) (Additive) Associativity:

(x + y) + z = x + (y + z), ∀x, y, z ∈ F

(3) Additive Inverse


for every x ∈ F, there is an additive inverse −x ∈ F of x such that

x + (−x) = 0 = (−x) + x

(4) Existence of a multiplicative identity 1F or simply 1:

x1 = x = 1x, ∀x ∈ F

(5) (Multiplicative) Associativity:

(xy)z = x(yz), ∀x, y, z ∈ F

(6) Multiplicative Inverse for nonzero element:


for every 0 6= x ∈ F, there is a multiplicative inverse x−1 ∈ F such that

xx−1 = 1 = x−1 x.

(7) Distributive Laww:
(x + y)z = xz + yz, ∀x, y, z ∈ F
z(x + y) = zx + zy, ∀x, y, z ∈ F

(8) Commutativity for addition and multiplication:

x + y = y + x, ∀x, y ∈ F
xy = yx, ∀x, y ∈ F

The triplet (F, +, ×) with only the Axioms (1)—(5) and (7)—(8) satisfied is called a (com-
mutative) ring.
The pair (F, +) with only Axioms (1)—(3) satisfied by its binary operation +, is called an
(additive) group.

Notation 1.1 (about F× ). For a field (F, +, 0; ×, 1), we use F× to denote the set of nonzero
elements in F:
F× := F \ {0}

Definition 1.2 (Polynomial ring).


Let (F, +, 0; ×, 1) be a field (or a ring), e.g. F = Z, Q, R, C.
n
X
g(x) = ai xi = an xn + an−1 xn−1 + · · · + a1 x + a0
i=0

with the leading coefficient an = 6 0, is called a polynomial of degree n ≥ 0, in one


variable x and with coefficients ai ∈ F. Let
d
X
F[x] := { bj xj | d ≥ 0, bj ∈ F}
j=0

be the set of all polynomials in one variable x and with coefficients in F.

Theorem 1.1 (Uniqueness of identity and inverse). Let F be a field.

(1) F has only one additive identity 0.

(2) F has only one multiplicative identity 1.

(3) Every x ∈ F has only one additive inverse −x.

(4) Every x ∈ F× has only one multiplicative inverse x−1 .

Theorem 1.2 (Properties of F).

(1) (Cancellation Law) Let b, x, y ∈ F. Then

b + x = b + y ⇒ x = y

(2) (Killing Power of 0)
0x = 0 = x0, ∀x ∈ F

(3) In F, we have
0F 6= 1F

(4) If x ∈ F× , then its multiplicative inverse

x−1 ∈ F×

(5) If x + x′ = 0, then x and x′ are additive inverses of each other.

(6) If xx′′ = 1, then x and x′′ are multiplicative inverses of each other.
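As a concrete illustration of Definition 1.1 (the example is my own, not from the notes): the residues {0, 1, 2, 3, 4} with addition and multiplication mod 5 form a field, and the sketch below brute-forces the multiplicative inverse of every nonzero element.

    # Minimal sketch: in Z/5Z every nonzero element has a (unique) multiplicative inverse.
    p = 5
    for a in range(1, p):
        inv = next(b for b in range(1, p) if (a * b) % p == 1)
        print(f"{a}^(-1) = {inv} (mod {p})")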

Definition 1.3 (Vector Space).


Let F be a field and V a non-empty set, with a binary vector addition operation

+:V ×V →V
(v1 , v2 ) 7→ v1 + v2

and scalar multiplication operation

× : F × V → V
(c, v) 7→ cv

Axiom(0) These two operations are well defined in the sense that

∀vi ∈ V ⇒ v1 + v2 ∈ V
∀c ∈ F, ∀v ∈ V ⇒ cv ∈ V

(V, +, ×) is a vector space over the field F if the following Seven Axioms are satisfied.

(1) Existence of zero vector 0V :

v + 0 = v = 0 + v, ∀v ∈ V

(2) (Additive) Associativity:

(u + v) + w = u + (v + w), ∀u, v, w ∈ V

(3) Additive Inverse:


for every v ∈ V , there is an additive inverse −v of v such that

v + (−v) = 0 = (−v) + v

(4) The effect of 1 ∈ F on V :


1v = v, ∀v ∈ V

(5) (Multiplicative) Associativity:

(ab)v = a(bv), ∀a, b ∈ F, v ∈ V

(6) Distributive Law:


(a + b)v = av + bv, ∀a, b ∈ F, v ∈ V
a(u + v) = au + av, ∀a ∈ F, u, v ∈ V

(7) Commutativity for the vector addition:

u + v = v + u, ∀u, v ∈ V

Remark: Every vector space V contains the zero vector 0V .

2 Vector Subspaces
Definition 2.1 (Subspace).
Let V be a vector space over a field F. A non-empty subset W ⊆ V is called a vector
subspace of V if the following two conditions are satisfied:

(CA) (Closed under vector Addition)

∀wi ∈ W ⇒ w1 + w2 ∈ W

(CS) (Closed under Scalar Multiplication)

∀a ∈ F, ∀w ∈ W ⇒ aw ∈ W

Obvious subspaces of V are V itself and {0}.


Remark: Every vector subspace W of V contains the zero vector 0W .

Definition 2.2 (Linear Combination).


Let V be a vector space over a field F and W ⊆ V a non-empty subset. Then the following
are equivalent:

(i) W is a vector subspace of V , i.e. W is closed under vector addition and scalar multi-
plication, in the sense of Definition of Subspace.

(ii) W is closed under linear combination:

∀ai ∈ F, ∀wi ∈ W ⇒ a1 w1 + a2 w2 ∈ W

(iii) W together with the vector addition + and the scalar multiplication ×, becomes a
vector space.

Theorem 2.1 (Intersection of Subspace being a subspace).


Let V be a vector space over a field F and let Wα ⊆ V (α ∈ I) be vector subspaces of V .
Then the intersection
∩α∈I Wα
is again a vector subspace of V

Remark: Union of subspaces may not be a subspace. However, union of subspaces is closed
under scalar multiplication.

3 Linear Spans and Direct Sums of Subspaces
Definition 3.1 (Linear combination and Linear Span).
Let V be a vector space over a field F. A vector v ∈ V is called a linear combination of
some vectors vi ∈ V (1 ≤ i ≤ s) if

v = a1 v1 + a2 v2 + · · · + as vs

for some scalars ai ∈ F. Let S ⊆ V be a non-empty subset. The subset Span(S) :=

{v ∈ V | v is a linear combination of some vectors in S}

of V is called the vector subspace of V spanned by the subset S.

Theorem 3.1 (Span being a subspace).

(i) The subset Span(S) of V is indeed a vector subspace of V .

(ii) Span(S) is the smallest vector subspace of V containing the set S:


firstly, Span(S) is a vector subspace of V containing S;
secondly, if W is another vector subspace of V containing S, then W ⊇ Span(S)

Definition 3.2 (Sum of subspaces).


Let V be a vector space over a field F, and let U and W be vector subspaces of V . The
subset
U + W := {u + w | u ∈ U, w ∈ W }
is called the sum of the subspaces U and W .

Theorem 3.2 (Sum being a subspace).


Let U and W be vector subspaces of a vector space V over a field F. For the sum U + W ,
we have:

1. U + W = Span(U ∪ W )

2. U + W is indeed a vector subspace of V .

3. U + W is the smallest vector subspace of V containing both U and W :


first, U + W is a vector subspace of V containing both U and W ; secondly, if T is
another vector subspace of V containing both U and W then, T ⊇ U + W .

Note 1: Let U and W be two vector subspaces of a vector space V over a field F. Then the
following are equivalent.

1. The union U ∪ W is a vector subspace of V .

2. Either U ⊆ W or W ⊆ U .

Definition 3.3 (Sum of many subspaces).
Let V be a vector space over a field F and let Wi (1 ≤ i ≤ s) be vector subspaces of V . The
subset

Σ_{i=1}^{s} Wi := {w1 + · · · + ws | wi ∈ Wi } = W1 + · · · + Ws

is called the sum of the subspaces Wi .

Theorem 3.3 (Sum of many being a subspace).


Let Wi (1 ≤ i ≤ s) be vector subspaces of a vector space V over a field F. For the sum
Σ_{i=1}^{s} Wi , we have:

1. Σ_{i=1}^{s} Wi = Span(∪_{i=1}^{s} Wi )

2. Σ_{i=1}^{s} Wi is indeed a vector subspace of V .

3. Σ_{i=1}^{s} Wi is the smallest vector subspace of V containing all Wi .

Definition 3.4 (Direct Sum of Subspaces).


Let V be a vector space over a field F and let W1 , W2 be vector subspace of V . We say that
the sum W1 + W2 is a direct sum of two vector subspaces W1 , W2 if the intersection

W1 ∩ W2 = {0}

In this case, we denote W1 + W2 as W1 ⊕ W2 .


We write W = W1 ⊕ W2 if W is a direct sum of W1 and W2 .

Theorem 3.4 (Equivalent Direct Sum Definition).


Let W1 and W2 be two vector subspaces of a vector space V over a field F. Set W := W1 +W2 .
Then the following are equivalent.

1. We have
W1 + W2 = W1 ⊕ W2
i.e., W1 + W2 is a direct sum of W1 , W2 ,i.e., W1 ∩ W2 = {0}

2. (Unique expression condition) Every vector w ∈ W can be expressed as

w = w1 + w2

for some wi ∈ Wi and such expression of w is unique.
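The direct-sum condition can be tested on concrete subspaces. A minimal sketch in Python/SymPy (the subspaces of Q^3 are my own example; the intersection dimension is obtained from the dimension formula of Theorem 6.16(2) proved later):

    from sympy import Matrix

    # dim(W1 ∩ W2) = dim W1 + dim W2 - dim(W1 + W2); the sum is direct iff this is 0.
    W1 = Matrix([[1, 0], [0, 1], [0, 0]])   # columns span W1
    W2 = Matrix([[0], [0], [1]])            # column spans W2
    dim_sum = Matrix.hstack(W1, W2).rank()
    dim_int = W1.rank() + W2.rank() - dim_sum
    print(dim_int == 0)                     # True: W1 ∩ W2 = {0}, so W1 + W2 = W1 ⊕ W2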

Definition 3.5 (Direct Sum of Many Subspaces).


Let V be a vector space over a field F and let Wi (1 ≤ i ≤ s; s ≥ 2) be vector subspaces of V .
We say that the sum Σ_{i=1}^{s} Wi is a direct sum of the vector subspaces Wi if the intersection

(Σ_{i=1}^{k−1} Wi ) ∩ Wk = {0}   (2 ≤ ∀k ≤ s)

Theorem 3.5 (Equivalent Direct Multiple Sum Definition).
Let Wi (1 ≤ i ≤ s, s ≥ 2) be vector subspaces of a vector space V over a field F. Set
W := Σ_{i=1}^{s} Wi . Then the following are equivalent.

1. We have
W1 + · · · + Ws = W1 ⊕ · · · ⊕ Ws
i.e., Σ_{i=1}^{s} Wi is a direct sum of the Wi .

2. (Σ_{i≠l} Wi ) ∩ Wl = {0}   (∀ 1 ≤ l ≤ s)

3. (Unique expression condition) Every vector w ∈ W can be expressed as

w = w1 + · · · + ws

for some wi ∈ Wi and such expression of w is unique.

4 Linear Independence, Basis and Dimension
Definition 4.1 (Linear (in)dependence).
Let V be a vector space over a field F. Let T be a (not necessarily finite) subset of V and
let
S = {v1 , . . . , vm }
be a finite subset of V .
(1) We call S a linear independent set or L.I., if the vector equation below

x1 v1 + · · · + xm vm = 0

has only the so called trivial solution

(x1 , . . . , xm ) = (0, . . . , 0)

(2) We call S a linear dependent set or L.D. if there are scalars a1 , . . . , am in F which
are not all zero (i.e. (a1 , . . . , am ) 6= (0, . . . , 0)) such that

a1 v1 + · · · + am vm = 0

(3) The set T is a linearly independent set if every non-empty finite subset of T is
linearly independent. The set T is a linearly dependent set if at least one non-
empty finite subset of T is linearly dependent.
Theorem 4.1 (L.D./L.I. Inheritance).
(1) Let S1 ⊆ S2 . If the smaller set S1 is linearly dependent then so is the larger set S2 .
Equivalently, if the larger set S2 is linearly independent then so is the smaller set S1 .

(2) {0} is a linearly dependent set.

(3) If 0 ∈ S, then S is a linearly dependent set.


Definition 4.2 (Equivalent L.I./L.D. Definitions).
Let S = {v1 , . . . , vm } be a finite subset of a vector space V over a field F. Then we have:
(1) Let |S| ≥ 2. Then S is a linear dependent set if and only if some vk ∈ S is a linear
combination of the others, i.e. there are scalars

a1 , . . . , ak−1 , ak+1 , . . . , am

in F (with all these scalars vanishing allowed) such that


vk = Σ_{i≠k} ai vi = a1 v1 + · · · + ak−1 vk−1 + ak+1 vk+1 + · · · + am vm

(2) Let |S| ≥ 2. Then S is linearly independent if and only if no vk ∈ S is a linear


combination of others.

(3) Suppose that S = {v1 } (a single vector). Then S is linearly dependent if and only if
v1 = 0. Equivalently, S is linearly independent if and only if v1 6= 0.

(4) Suppose that S = {v1 , v2 } (two vectors). Then S is linearly dependent if and only if
one of v1 , v2 is a scalar multiple of the other. Equivalently, S is linearly independent
if and only if neither one of v1 , v2 is a scalar multiple of the other.
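In practice, linear (in)dependence of finitely many column vectors is a rank computation. A minimal sketch (the three vectors in Q^3 are my own example):

    from sympy import Matrix

    v1, v2, v3 = Matrix([1, 0, 1]), Matrix([2, 1, 0]), Matrix([3, 1, 1])
    A = Matrix.hstack(v1, v2, v3)
    print(A.rank())        # 2 < 3, so {v1, v2, v3} is linearly dependent (v3 = v1 + v2)
    print(A.nullspace())   # a non-trivial solution of x1*v1 + x2*v2 + x3*v3 = 0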
Definition 4.3 (Basis, (in)finite dimension).
Let V be a nonzero vector space over a field F. A subset B of V is called a basis if the
following two conditions are satisfied.
(1) (Span) V is spanned by B: V = Span(B)

(2) (L.I.) B is a linearly independent set.


If V has a basis B with cardinality |B| < ∞ we say that V is finite dimensional and
define the dimension of V over the field F as the cardinality of B:

dimF V := |B|

Otherwise, V is called infinite-dimensional.


If V equals the zero vector space {0}, we define

dim{0} = 0

Theorem 4.2 (Equivalent Basis Definition I).


Let B = {v1 , . . . , vn }(with vi 6= 0V ) be a finite subset of a vector space V over a field F.
Then the following are equivalent.
(1) B is a basis of V .

(2) (Unique expression condition) Every vector v ∈ V can be expressed as

v = a1 v1 + · · · + an vn

for some scalars ai ∈ F and such expression of v is unique.

(3) V has the following direct sum decomposition:

V = Span{v1 } ⊕ · · · ⊕ Span{vn } = Fv1 ⊕ · · · ⊕ Fvn

Theorem 4.3 (Deriving a basis from a spanning set).


Suppose that a nonzero vector space V over a field F is spanned by a finite subset B =
{v1 , . . . , vs }, then we have:
(1) There is a subset B1 ⊆ B such that B1 is a basis of V . In particular

dimF V = |B1 | ≤ |B|

(2) Let B2 be a maximal linearly independent subset of B: first, B2 is L.I.; secondly,
every subset B3 of B strictly larger than B2 is L.D. Then B2 is a basis of V = Span(B).

Theorem 4.4 (Dimension being well Defined).


Let B = {v1 , . . . , vn } be a basis of a vector space V over a field F. Then we have:

(1) Suppose that S is a subset of V with |S| > n = |B|. Then S is L.D.

(2) Suppose that T is a subset of V with |T | < n. Then T does not span V .

(3) Suppose that B 0 is another basis of V . Then |B 0 | = |B|. So the dimension dimF V (=
|B|) of V depends only on V , but not on the choice of its basis.
In other words, dimF V is well defined.

Theorem 4.5 (Expanding an L.I. set).


Let B be a L.I. subset of a vector space V over a field. Then exactly one of the following
two cases is true.

(1) B spans V and hence B is a basis of V .

(2) Let w ∈ V \ Span(B) (and hence w ∉ B). Then

B ∪ {w}

is a L.I. subset of V .

In particular, if V is of finite dimension n, then one can find n − |B| vectors

w|B|+1 , · · · , wn

in V \ Span(B) such that the disjoint union

B ⊔ {w|B|+1 , · · · , wn }

is a basis of V .

Theorem 4.6 (Equivalent Basis Definition II).


Let B be a subset of a vector space V of finite dimension dimF V = n ≥ 1. Then the
following are equivalent.

(1) B is a basis of V .

(2) B is L.I. and |B| = n.

(3) B spans V and |B| = n.

Theorem 4.7 (Basis of a direct sum).


Let V be a (not necessarily finite-dimensional) vector space over a field F.

(1) Suppose that B is a basis of V . Decompose it as a disjoint union
B = B1 ⊔ B2 ⊔ · · · ⊔ Bs

of non-empty sets Bi . Then Bi is a basis of Wi := Span(Bi ) and

V = W1 ⊕ · · · ⊕ Ws

is a direct sum of nonzero vector subspaces Wi of V .

(2) Conversely, suppose that


V = W1 ⊕ · · · ⊕ Ws
is a direct sum of nonzero vector subspaces Wi of V . Let Bi be a basis of Wi . Then
B = B1 ⊔ B2 ⊔ · · · ⊔ Bs

is a basis of V and a disjoint union of non-empty sets Bi .

(3) In particular, if
V = W1 ⊕ · · · ⊕ Ws
is a direct sum, then
dimF V = Σ_{i=1}^{s} dimF Wi

5 Row Space and Column Space
Definition 5.1 (Column/Row Space, Nullspace, Nullity, Range of A).
Let A = (aij ) be an m × n matrix with entries in a field F, with (i, j)-entry aij . Let

Col(A) := Span{c1 := (a11 , . . . , am1 )t , . . . , cn := (a1n , . . . , amn )t }

be the column space of A, and let

R(A) := Span{r1 := (a11 , . . . , a1n ), . . . , rm := (am1 , . . . , amn )}

be the row space of A, so that we can write A = (c1 , . . . , cn ) in terms of its columns and, stacking its rows, A = (r1 ; . . . ; rm ).
The range of A is defined as

R(A) = {AX | X ∈ Fnc }

The nullity of A is defined as the dimension of the nullspace or kernel

Ker(A) := Null(A) := {X ∈ Fnc | AX = 0}

i.e. nullity(A) = dim Null(A)
Theorem 5.1 (Rank of Matrix, Matrix Dimension Theorem).

(1) The range equals the column space


R(A) = Col(A)

(2) Column and row spaces have the same dimension


dimF Col(A) = dimF R(A) := rank(A)
which is called the rank of A.
(3) There is a dimension theorem
rank(A) + nullity(A) = n
where n is the number of columns in A.
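A minimal sketch of Theorem 5.1 on a concrete matrix (the 3 × 4 matrix over Q is my own example):

    from sympy import Matrix

    A = Matrix([[1, 2, 1, 0],
                [2, 4, 0, 2],
                [3, 6, 1, 2]])
    print(A.rank(), len(A.nullspace()), A.cols)   # 2 2 4: rank + nullity = number of columns
    print(A.rref()[1])                            # (0, 2): pivot columns give a basis of Col(A)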

Theorem 5.2 (L.I. vs L.D.).
In previous theorem, suppose that m = n so that A is a square matrix of order n. Then the
following are equivalent.
(1) A is an invertible matrix, i.e. A has a so called inverse A−1 ∈ Mn (F) such that
AA−1 = In = A−1 A

(2) A has nonzero determinant


det(A) = |A| ≠ 0

(3) The column vectors


c1 , . . . , cn
of A form a basis of the column vector n-space Fnc .
(4) The row vectors
r1 , . . . , rn
of A form a basis of the row vector n-space Fnc .
(5) The column vectors
c1 , . . . , cn
of A are linearly independent in Fnc .
(6) The row vectors
r1 , . . . , rn
of A are linearly independent.
(7) The matrix equation
AX = 0
has the trivial solution only: X = 0.
Theorem 5.3 (Row operation preserves columns relations).
Suppose that A and B are row equivalent. Then we have:
(1) If the column vectors
ai1 , . . . , ais , aj1 , . . . , ajt
of A satisfies a relation
ci1 ai1 + · · · + cis ais = cj1 aj1 + · · · + cjt ajt
for some scalars cik ∈ F, then the corresponding column vectors
bi1 , . . . , bis , bj1 , . . . , bjt
of B satisfies exactly the same relation
ci1 bi1 + · · · + cis bis = cj1 bj1 + · · · + cjt bjt
The converse is also true.

(2) The column vectors
ai1 , . . . , ais , aj1 , . . . , ajt
of A are linearly dependent if and only if the corresponding column vectors

bi1 , . . . , bis , bj1 , . . . , bjt

of B are linearly dependent.

(3) The column vectors


ai1 , . . . , ais , aj1 , . . . , ajt
of A are linearly independent if and only if the corresponding column vectors

bi1 , . . . , bis , bj1 , . . . , bjt

of B are linearly independent.

(4) The column vectors


ai1 , . . . , ais
of A forms a basis of the column space

Col(A) = Span{a1 , . . . , an }

if and only if the corresponding column vectors

bi1 , . . . , bis

form a basis of the column space

Col(B) = Span{b1 , . . . , bn }

(5) If
B1 := {ai1 , . . . , ais }
is a maximal L.I. subset of the set

C = {a1 , . . . , an }

of all column vectors then B1 is a basis of the column space Col(A) of A.

(6) Suppose that B is in row-echelon form with leading entries at columns

i1 , . . . , i s

Then
ai1 , . . . , ais
forms a basis of the column space Col(A) of A.

(7) The row space of A and B are identical

R(A) = R(B)

But the column spaces of A and B may not be the same.

(8) Suppose that B = (bij ), with rows b′1 , . . . , b′m , is in row-echelon form with its leading entries in rows

j1 , . . . , jt

Then

b′j1 , . . . , b′jt

form a basis of the row space R(A) = R(B) of A.

6 Quotient Spaces and Linear Transformations
Definition 6.1 (Sum of subsets of a space).
Let V be a vector space over a field F, and let S and T be subsets (which are not necessarily
subspaces) of V . Define the sum of S and T as

S + T := {s + t | s ∈ S, t ∈ T }

In general, given subsets Si (1 ≤ i ≤ r) of V , we can define the sum of Si as


Σ_{i=1}^{r} Si = { Σ_{i=1}^{r} xi | xi ∈ Si }

Theorem 6.1 (Inclusion and sum for subsets). (1) Associativity

(S1 + S2 ) + S3 = S1 + (S2 + S3 )

(2) Commutativity
S1 + S2 = S2 + S1

(3) If S1 ⊆ S2 and T1 ⊆ T2 then


S1 + T1 ⊆ S2 + T2

(4) If W is a subspace of V , then

W + {0} = W, W +W =W

(5) Suppose that W is a subspace of V . Then

S+W =W ⇔S ⊆W

Definition 6.2 (Coset v̄).


Let V be a vector space over a field F and W a subspace of V . For any given v ∈ V , the
subset
v + W := {v + w | w ∈ W }
of V is called the coset of W containing v. This subset is often denoted as

v̄ := v + W

The vector v is a representative of the coset v̄.

Theorem 6.2 (Coset Relations).


Let W be a subspace of a vector space V . The following are equivalent

(1) v + W = W ,i.e.,v̄ = 0̄

(2) v ∈ W

(3) v + W ⊆ W

(4) W ⊆ v + W

Theorem 6.3 (To be the same coset).


Let W be a subspace of V . Then for v̄i = vi + W ,

v¯1 = v¯2 ⇔ v1 − v2 ∈ W

Remark:Suppose that V = U ⊕ W is a direct sum of subspaces U and W . Then the map


below is a bijection(and indeed an isomorphism)

f : U → V /W
u 7→ ū = u + W

Definition 6.3 (Quotient Space).


Let W be a subspace of V . Let

V /W := {v̄ = v + W | v ∈ V }

be the set of all cosets of W . It is called the quotient space of V modulo W .


We define a binary addition operation on V /W :

+ : V /W × V /W → V /W
(v̄1 , v̄2 ) ↦ v̄1 + v̄2 := (v1 + v2 ) + W

and a scalar multiplication operation

× : F × V /W → V /W
(a, v̄1 ) ↦ a v̄1 := (a v1 ) + W

Theorem 6.4 (Quotient Space being Well Defined).


Let V be a vector space over a field F and W a vector subspace of V . Then we have:

(1) The binary addition operation and scalar multiplication operation on V /W is well
defined.

(2) V /W together with these binary addition and scalar multiplication operations, becomes
a vector space over the same field F, with the zero vector

0V /W = 0¯V = w̄

for any w ∈ W .

Definition 6.4 (Linear transformation, and its Kernel and Image; Isomorphism).
Let Vi be two vector spaces over the same field F. A map

ϕ : V1 → V2

is called a linear transformation from V1 to V2 if ϕ is compatible with the vector addition
and scalar multiplication on V1 and V2 in the sense below:
ϕ(v1 + v2 ) = ϕ(v1 ) + ϕ(v2 )
ϕ(av) = aϕ(v)
When ϕ : V → V is a linear transformation from V to itself, we call ϕ a linear operator on
V.
A linear transformation is called an isomorphism if it is a bijection. In this case, we denote
V1 ' V2
Remark: If T : V → W is a linear transformation, then T (0V ) = 0W .
Remark:(Direct sum vs quotient space)
Let V = U ⊕ W , where U, W are subspaces of V . Then the map below is an isomorphism.
f : U → V /W
u 7→ ū = u + W
Theorem 6.5 (Equivalent Linear Transformation definition).
Let ϕ : V1 → V2 be a map between two vector spaces Vi over the same field F. Then the
following are equivalent.
(1) ϕ is a linear transformation.
(2) ϕ is compatible with taking linear combination in the sense below:
ϕ(a1 v1 + a2 v2 ) = a1 ϕ(v1 ) + a2 ϕ(v2 )
for all ai ∈ F, vi ∈ V .
Theorem 6.6 (Evaluate T at a basis).
Let V be a vector space over a field F and with a basis B = {u1 , u2 , . . .}. Let T : V → W
be a linear transformation.
Then T is uniquely determined by its valuations T (ui ) (i = 1, 2, . . .) at the basis B.
Namely, if T 0 : V → W is another linear transformation such that T 0 (ui ) = T (ui )∀i, then
they are equal: T 0 = T .
Theorem 6.7 (Quotient map).
Let V be a vector space over a field F and W a subspace of V . The quotient map
γ : V → V /W , v ↦ v̄ = v + W
is a linear transformation. One verifies that γ is surjective and
ker(γ) = W
Theorem 6.8 (Image being a vector subspace).
Let
T :V →W
be a linear transformation between two vector spaces over the same field F. Let V1 be a
vector subspace of V . Then the image of V1 :
T (V1 ) = {T (u) | u ∈ V1 }
is a vector subspace of W .
In particular, T (V ) is a vector subspace of W .

Theorem 6.9 (Subspace vs. Kernel).
Let V be a vector space over a field F.
(1) Suppose that
ϕ:V →U
is a linear transformation. Then the kernel ker(ϕ) is a vector subspace of V .
(2) Conversely, suppose W is a vector subspace of V . Then there is a linear transformation

ϕ:V →U

such that
W = ker(ϕ)

Theorem 6.10 (To be injective). Let

ϕ:V →W

be a linear transformation. Then ϕ is injective if and only if ker(ϕ) = {0}.


Theorem 6.11 (First Isomorphism Theorem).
Let ϕ : V → U be a linear transformation. Then there is an isomorphism
ϕ̄ : V / ker(ϕ) ' ϕ(V ) ⊆ U
v̄ ↦ ϕ(v)
such that
ϕ = ϕ̄ ◦ γ
where
γ : V → V / ker(ϕ), v ↦ v̄
is the quotient map, a linear transformation.
In particular, when ϕ is surjective, we have an isomorphism

ϕ̄ : V / ker(ϕ) ' U

Theorem 6.12 (Finding basis of the quotient).


Let V be a vector space over a field F of finite dimension n. Let W be a subspace with a
basis B1 = {w1 , . . . , wr }.
(1) B1 extends to a basis
B := B1 ⊔ {wr+1 , . . . , wn }
of V .
(2) The cosets
{w̄r+1 , . . . , w̄n }
form a basis of the quotient space V /W . In particular,

dimF V /W = dimF V − dimF W

(3) B1 ⊔ {ur+1 , . . . , un } is a basis of V if and only if the cosets

{ūr+1 , . . . , ūn }

form a basis of V /W .
Theorem 6.13 (Goodies of Isomorphism).
Let ϕ : V → W be an isomorphism and let B be a subset of V . Then we have:
(1) If there is a relation
Σ_{i=1}^{r} ai vi = Σ_{i=r+1}^{s} ai vi

among vectors vi ∈ V , then exactly the same relation

Σ_{i=1}^{r} ai ϕ(vi ) = Σ_{i=r+1}^{s} ai ϕ(vi )

holds among vectors ϕ(vi ) ∈ W . The converse is also true.

(2) B is linearly dependent if and only if so is ϕ(B).

(3) B is linearly independent if and only if so is ϕ(B);

(4) We have
ϕ(Span(B)) = Span(ϕ(B))
In particular,
ϕ(Span{v1 , . . . , vs }) = Span{ϕ(v1 ), . . . , ϕ(vs )}

(5) B spans V if and only if ϕ(B) spans W .

(6) B is a basis of V if and only if ϕ(B) is a basis of W . In particular,

dim V = dim W

Theorem 6.14 (To be isomorphic finite-dimensional spaces).


Let V and W be finite-dimensional vector spaces over the same field F. Then the following
are equivalent.
(1) dimF V = dimF W = n.

(2) There is an isomorphism


ϕ:V 'W

(3) For some n, we have:


V ' Fn ' W

Theorem 6.15 (Dimension Theorem).
Let ϕ : V → W be a linear transformation between vector spaces over a field F. Then

dimF ker(ϕ) + dimF ϕ(V ) = dimF V

Theorem 6.16 (2nd Isomorphism Theorem).


Let W1 , W2 be vector subspaces of a vector space V .

(1) The map


ϕ : W1 /(W1 ∩ W2 ) → (W1 + W2 )/W2
w + (W1 ∩ W2 ) ↦ w + W2
is a well defined isomorphism between vector spaces.

(2) A dimension formula:

dim W1 + dim W2 = dim(W1 + W2 ) + dim(W1 ∩ W2 )

Theorem 6.17 (Equivalent isomorphism definition).


Let
ϕ:V →W
be a linear transformation between vector spaces over a field F and of the same finite dimen-
sion n. Then the following are equivalent.

(1) ϕ is an isomorphism

(2) ϕ is an injection

(3) ϕ is a surjection

7 Representation Matrices of Linear Transformations
Definition 7.1 (Coordinate vector).
Let V be vector space of dimension n ≥ 1 over a field F. Let
B = BV = (v1 , . . . , vn )
be a basis of V . Every vector v ∈ V can be expressed as a linear combination
v = c1 v1 + · · · + cn vn
and this expression is unique. We gather the coefficients ci and form a column vector

[v]B := (c1 , . . . , cn )t ∈ Fnc

which is called the coordinate vector of v relative to the basis B.
One can recover v from its coordinate vector [v]B :
v = B[v]B
Theorem 7.1 (Isomorphism V → Fnc ).
Let V be an n-dimensional vector space over a field F and with a basis B = {v1 , . . . , vn }.
Then the map
ϕ : V → Fnc
v 7→ [v]B
is an isomorphism between the vector space V and Fnc .
Theorem 7.2 (Representation matrix).
Let
T :V →W
be a linear transformation between vector spaces over a field F. Let
B = {v1 , . . . , vn }
be a basis of V , and
BW = {w1 , . . . , wm }
be a basis of W . Let A ∈ Mm×n (F). Then the following three conditions on A are equivalent.
[T (v)]BW = A[v]B
A = ([T (v1 )]BW , . . . , [T (vn )]BW )
(T (v1 ), . . . , T (vn )) = (w1 , . . . , wm )A
We denote the above matrix A as
[T ]B,BW := A = ([T (v1 )]BW , . . . , [T (vn )]BW )
and call it the representation matrix of T relative to B and BW .
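A small worked example of Theorem 7.2 (the operator, the spaces and the basis are my own illustrative choice, not from the notes): the representation matrix of the differentiation operator T = d/dx on V = W = {polynomials of degree ≤ 2}, relative to B = BW = (1, x, x^2), assembled column by column as [T(vj)]BW.

    from sympy import Matrix, Poly, diff, symbols

    x = symbols('x')
    B = [1, x, x**2]
    cols = []
    for v in B:
        c = Poly(diff(v, x), x).all_coeffs()[::-1]    # coordinates of T(v) in (1, x, x^2)
        cols.append(Matrix(c + [0] * (3 - len(c))))
    A = Matrix.hstack(*cols)
    print(A)    # Matrix([[0, 1, 0], [0, 0, 2], [0, 0, 0]]): columns are [T(vj)]_BW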

Theorem 7.3 (Linear transformation theory = Matrix theory).
For every matrix
A ∈ Mm×n (F)
there is a unique linear transformation
T :V →W
such that the representation matrix
[T ]B,BW = A
Consequently, the map
ϕ : HomF (V, W ) → Mm×n (F)
T 7→ [T ]B,BW
is an isomorphism of vector spaces over F.
Theorem 7.4 (Close relation between the space of vectors and space of their coordinates).
Let
T :V →W
be a linear transformation between the vector spaces V and W over the same field F, of
dimensions n and m, respectively. Let
BV := (v1 , . . . , vn )
be a basis of V , and
BW := (w1 , . . . , wm )
be a basis of W . Let
A := [T ]BV ,BW = (a1 , . . . , an ) ∈ Mm×n (F)
Then the following are isomorphisms:
ϕ : Ker(T ) → Null(A)
v 7→ [v]BV ,
ψ : Null(A) → Ker(T )
X 7→ BV X,
ξ : R(T ) → R(TA ) = col.sp. of A = Span{a1 , . . . , an }
η : R(TA ) → R(T )
Y 7→ BW Y = (w1 , . . . , wm )Y
Below are some consequences of the isomorphism above
(1) The subset
{X1 , . . . , Xs }
of Fnc is a basis of Null(A) if and only if the vectors
BV X1 , . . . , BV Xs
of V form a basis of Ker(T ).

(2) The subset
{Y1 , . . . , Yt }
of Fmc is a basis of R(TA ) if and only if the vectors

BW Y1 , . . . , BW Yt

of W form a basis of R(T ).

(3) The range of T is given by

R(T ) = Span{BW a1 , . . . , BW an }

(4) T : V → W is an isomorphism if and only if its representation matrix A = [T ]BV ,BW is


an invertible matrix in Mn (F).

Theorem 7.5 (Representation Matrix of a Composite Map).


Let V1 , V2 , V3 be vector spaces of finite dimension over the same field F and let B1 , B2 , B3 be
their respective bases. Let
T1 : V1 → V2
and
T2 : V2 → V3
be linear transformations. Then we have:

[T2 ◦ T1 ]B1 ,B3 = [T2 ]B2 ,B3 [T1 ]B1 ,B2

Theorem 7.6 (Representation matrix of inverse of an isomorphism).


Let
T :V →W
be an isomorphism between vector spaces over the same field F and of finite dimension. Let

T −1 : W → V

be the inverse isomorphism of T . Let BV (resp. BW ) be a basis of V (resp. W ). Then

[T −1 ]BW ,BV = ([T ]BV ,BW )−1

Theorem 7.7 (Representation matrix of map combination).


Let
Ti : V → W
be two linear transformations between finite-dimensional vector spaces over the same field F.
Let B(resp. BW ) be a basis of V (resp. W ). Then for any ai ∈ F, the map linear combination
a1 T1 + a2 T2 has the representation matrix

[a1 T1 + a2 T2 ]B,BW = a1 [T1 ]B,BW + a2 [T2 ]B,BW

Theorem 7.8 (Equivalent transition matrix definition).
Let V be a vector space over a field F and of finite dimension n ≥ 1. Let

B = (v1 , . . . , vn )

and
B 0 := (v10 , . . . , vn0 )
be two bases of V . Let P ∈ Mn (F). Then the following are equivalent.

(1)
P = ([v10 ]B , . . . , [vn0 ]B )

(2)
B 0 = BP

(3) For any v ∈ V , we have


P [v]B 0 = [v]B

This P is denoted as PB 0 →B and called the transition matrix from basis B 0 to B. P is


invertible.

Theorem 7.9 (Basis change theorem for representation matrix).


Let V be a vector space over a field F and of finite dimension n ≥ 1. Let

B = (v1 , . . . , vn )

and
B 0 := (v10 , . . . , vn0 )
be two bases of V , and let T : V → V be a linear operator. Then
[T ]B 0 = P −1 [T ]B P
where
P = PB 0 →B

Definition 7.2 (Similar Matrices).


Two square matrices (of the same order) A1 , A2 ∈ Mn (F) are similar if there is an invertible
matrix P ∈ Mn (F) such that
A2 = P −1 A1 P
In this case, we denote
A1 ∼ A2
The similarity property is an equivalence relation.

Theorem 7.10. Similar matrices have the same determinant:

A1 ∼ A2 ⇒ |A1 | = |A2 |

Definition 7.3 (Determinant/Trace of a linear operator).
Let
T :V →V
be a linear operator on a finite-dimensional vector space V .
We define the determinant det(T ) of T as

det(T ) := det([T ]B )

and the trace of T as


Tr(T ) = Tr([T ]B )
where B is any basis of V .

Definition 7.4 (Characteristic polynomial pA (x), pT (x)). (1) Let A ∈ Mn (F).

pA (x) : = |xIn − A|
= xn + bn−1 xn−1 + · · · + b1 x + b0

is called the characteristic polynomial of A, which is of degree n.

(2) Let
T :V →V
be a linear operator on an n-dimensional vector space V . Set

A := [T ]B

where B is any basis of V . Then

pT (x) : = |xIn − A|
= xn + bn−1 xn−1 + · · · + b1 x + b0

is called the characteristic polynomial of T , which is of degree n = dim V .

Theorem 7.11. Similar matrices have equal characteristic polynomial.

Theorem 7.12.
For A ∈ Mn (F), we have
Tr(A) = −bn−1
det(A) = (−1)n pA (0)
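Both identities are easy to check on an example (the 2 × 2 matrix is mine):

    from sympy import Matrix, symbols

    x = symbols('x')
    A = Matrix([[1, 2], [3, 4]])
    p = A.charpoly(x).as_expr()               # x**2 - 5*x - 2
    print(A.trace() == -p.coeff(x, 1))        # True: Tr(A) = -b_{n-1}
    print(A.det() == (-1)**2 * p.subs(x, 0))  # True: det(A) = (-1)^n p_A(0)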

8 Eigenvalue and Cayley-Hamilton Theorem
Definition 8.1 (Eigenvalue, eigenvector).
Assume that
λ∈F

(1) Let V be a vector space over a field F. Let

T :V →V

be a linear operator. A nonzero vector v in V is called an eigenvector of T corre-


sponding to the eigenvalue λ ∈ F of T if

T (v) = λv

(2) For an n × n matrix A in Mn (F), a nonzero column vector u in Fnc is called an eigen-
vector of A corresponding to the eigenvalue λ ∈ F of A if

Au = λu

Definition 8.2 (Equivalent definition of eigenvalue and eigenvector).


Let V be a vector space of dimension n over a field F and with a basis B, Let

T :V →V

be a linear operator. Assume that


λ∈F
Then the following are equivalent:

(1) λ is an eigenvalue of T (corresponding to an eigenvector 0 6= v ∈ V of T , i.e. T (v) =


λv).

(2) λ is an eigenvalue of [T ]B (corresponding to an eigenvector 0 6= [v]B ∈ Fnc of [T ]B , i.e.


[T ]B [v]B = λ[v]B ).

(3) The linear operator


λIV − T : V → V
x 7→ λx − T (x)
is not an isomorphism, i.e. there is some

0 6= v ∈ Ker(λIV − T )

(4) The matrix λIn − [T ]B is not invertible, i.e. the matrix equation

(λIn − [T ]B )X = 0

has a non-trivial solution.

(5) λ is a zero of the characteristic polynomial pT (x) of T

pT (λ) = |λIn − [T ]B | = 0

Theorem 8.1 (Determinant |A| as product of eigenvalues).


Let A ∈ Mn (F). Let p(x) be the characteristic polynomial. Factorise

p(x) = (x − λ1 ) · · · (x − λn )

in some overfield of F (i.e. a field extension of F). Then the determinant of A equals

det(A) = Π_{i=1}^{n} λi
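A quick check of Theorem 8.1 (the matrix is my own example and its eigenvalues already lie in Q, so no overfield is needed):

    from sympy import Matrix

    A = Matrix([[4, 1], [2, 3]])
    product = 1
    for lam, mult in A.eigenvals().items():   # {5: 1, 2: 1}: eigenvalue -> multiplicity
        product *= lam**mult
    print(product, A.det())                   # 10 10: det(A) is the product of the eigenvalues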

Definition 8.3 (Eigenspace of an eigenvalue).


Let λ ∈ F be an eigenvalue of a linear operator

T :V →V

on an n-dimensional vector space V over the field F. The subspace (of all the eigenvectors
corresponding to the eigenvalue λ, plus 0V ):

Vλ : = Vλ (T )
: = Ker(λIV − T )
= {v ∈ V | T (v) = λv}

of V is called the eigenspace of T corresponding to the eigenvalue λ.

Definition 8.4 (Geometric/Algebraic Multiplicity).


Let λ ∈ F and T : V → V be as the previous definition.

(1) The dimension


dim Vλ
of the eigenspace Vλ of T is called the geometric multiplicity of the eigenvalue λ of
T . We have
1 ≤ dim Vλ ≤ n

(2) The algebraic multiplicity of the eigenvalue λ of T is defined to be the largest positive
integer k such that (x − λ)k is a factor of the characteristic polynomial pT (x),i.e.

(x − λ)k | pT (x), (x − λ)k+1 - pT (x)

We shall see that

geometric multiplicity of λ ≤ alg. multiplicity of λ

Theorem 8.2 (Eigenspace of T and [T ]B ).
Let
T :V →V
be a linear operator on an n-dimensional vector space V with a basis B. Set

A := [T ]B

The map
f : Ker(T − λIV ) → Null(A − λIn )
w 7→ [w]B
gives an isomorphism. In particular,

dim Vλ (T ) = dim Vλ (A)

Due to this isomorphism, the following are equivalent:

1. The subset
{u1 , · · · , us }
of V is a basis of the eigenspace Vλ (T ) of T .

2. The subset
{[u1 ]B , . . . , [us ]B }
of Fnc is a basis of the eigenspace Vλ ([T ]B ) of the representation matrix [T ]B of T
relative to a basis B of V .

Also, the following are equivalent:

1. The subset
{X1 , · · · , Xs }
of Fnc is a basis of the eigenspace Vλ ([T ]B ) of the representation matrix [T ]B of T
relative to a basis B of V .

2. The subset
{BX1 , · · · , BXs }
of V is a basis of the eigenspace Vλ (T ) of T .

Theorem 8.3 (Eigenspaces of similar matrices).


Let A ∈ Mn (F). Suppose that P −1 AP = C. Then

Fnc ⊇ Vλ (A) = P Vλ (C) := {P X | X ∈ Vλ (C)}

Theorem 8.4 (Sum of eigenspaces).


Let
λ1 , . . . , λ k

be some distinct eigenvalues of a linear operator T on a vector space V over a field F. Then
the sum of eigenspaces
W := Σ_{i=1}^{k} Vλi (T ) = Vλ1 (T ) + · · · + Vλk (T )
is a direct sum:
W = ⊕_{i=1}^{k} Vλi (T ) = Vλ1 (T ) ⊕ · · · ⊕ Vλk (T )

Definition 8.5 (Multiplication of linear operators S1 , . . . , Sr ).


Let
T :V →V
be a linear operator on a vector space V over a field F. Define

T s := T ◦ · · · ◦ T (s times)

which is a linear operator on V .


T s : V → V
v ↦ T s (v)
By convention, set
T 0 := IV = idV
More generally, for a polynomial
f (x) = Σ_{i=0}^{r} ai x^i
define
f (T ) := Σ_{i=0}^{r} ai T^i
Then f (T ) is a linear operator on V :

f (T ) : V → V
v ↦ f (T )(v)

Similarly, for linear operators


Si : V → V
Define
S1 S2 · · · Sr := S1 ◦ S2 ◦ · · · ◦ Sr
which is a linear operator.

Theorem 8.5 (Polynomials in T ).


Let
T :V →V

be a linear operator on a vector space V over a field F and with a basis

B = {u1 , . . . , un }

Let
f (x), g(x) ∈ F[x]
be polynomials. We have

(1)
[f (T )]B = f ([T ]B )

(2) The multiplication f (T )g(T ) as polynomials in T equals the composite f (T ) ◦ g(T ) as


linear operators:
f (T )g(T ) = f (T ) ◦ g(T )

(3) Commutativity:
f (T )g(T ) = g(T )f (T )

(4) If P ∈ Mn (F) is invertible, then

f (P −1 AP ) = P −1 f (A)P

(5) If
S:V →V
is an isomorphism with inverse isomorphism

S −1 : V → V

Then
f (S −1 T S) = S −1 f (T )S

Definition 8.6 (T -invariant Subspace).


Let
T :V →V
be a linear operator on a vector space V . A subspace W of V is called T -invariant if the
image of W under the map T is included in W :

T (W ) := {T (w) | w ∈ W }

i.e.,
T (w) ∈ W ∀w ∈ W
In this case, define the restriction of T on W as:

T |W :W →W
w 7→ T (w)

Theorem 8.6 (Kernels and Images of Commutative Operators).
Let Ti : V → V be two linear operators that commute with each other, i.e.
T1 ◦ T2 = T2 ◦ T1
as maps. Namely,
T1 (T2 (v)) = T2 (T1 (v)) (∀v ∈ V )
Both
Ker(T2 ), Im(T2 )
are T1 -invariant subspaces of V .
Theorem 8.7 (Evaluate T on a basis of a subspace).
Let
T :V →V
be a linear operator on a vector space V over a field F. Let W be a subspace of V with a
basis BW = {w1 , w2 , · · · }. Then W is T -invariant, if and only if
T (BW ) ⊆ W
Theorem 8.8 (T -cyclic subspace).
Let
T :V →V
be a linear operator on a vector space V over a field F. Fix a vector
0 6= w1 ∈ V
1. The subspace
W := Span{T s (w1 ) | s ≥ 0}
of V is T -invariant.
W is called the T -cyclic subspace of V generated by w1 .
2. Suppose that V is finite-dimensional. Let s be the smallest positive integer such that
T s (w1 ) ∈ Span{w1 , T (w1 ), . . . , T s−1 (w1 )}
We have
dimF W = s
and
B := {w1 , T (w1 ), . . . , T s−1 (w1 )}
is a basis of W .
3. In (2), if
T s (w1 ) = c0 w1 + c1 T (w1 ) + · · · + cs−1 T s−1 (w1 )
for some scalars ci ∈ F, then the characteristic polynomial of the restriction operator
T | W on W is
pT |W (x) = −c0 − c1 x − · · · − cs−1 xs−1 + xs

Theorem 8.9 (Characteristic Polynomial of the Restriction Operator).
Let
T :V →V
be a linear operator on a vector space V over a field F and of dimension n ≥ 1. Let W be
a T -invariant subspace of V . Then the characteristic polynomial pT |W (x) of the restriction
operator T | W on W is a factor of the characteristic polynomial pT (x) of T , i.e.

pT (x) = q(x)pT |W (x)

for some polynomial q(x) ∈ F[x].


Theorem 8.10 (To be T -invariant in terms of [T ]B ).
A subspace W of an n-dimensional space V is T -invariant for a linear operator T on V , if
and only if every basis BW of W can be extended to a basis

B = BW ∪ B2

of V such that the representation matrix of T relative to B, is of the form:


 
[T ]B =
[ A1  A2 ]
[ 0   A3 ]

for some square matrices A1 , A3 (automatically with A1 = [T | W ]BW ).


In this case, the matrix
A2 = 0
if and only if
W2 := Span(B2 )
is a T -invariant subspace of V (automatically with [T | W2 ]B2 = A3 )
Theorem 8.11 (Upper Triangular Form of a Matrix).
Let T : V → V be a linear operator on an n-dimensional vector space V over a field F.
Suppose that the characteristic polynomial p(x) is factorised as

p(x) = (x − λ1 )n1 · · · (x − λk )nk

for some λi ∈ F. Then there is a basis B of V such that the representation matrix [T ]B is
upper triangular.
Theorem 8.12 (Characteristic Polynomials of Direct Sums).
Let
T :V →V
be a linear operator on an n-dimensional vector space V over a field F. Suppose that there
are T -invariant subspaces
Wi (1 ≤ i ≤ r)
of V such that V is the direct sum
V = ⊕ri=1 Wi

of Wi .
Then the characteristic polynomial pT (x) of T is the product:
pT (x) = Π_{i=1}^{r} pT |Wi (x)

of the characteristic polynomials of the restriction operators T | Wi on Wi .

Theorem 8.13 (To be direct sum of T -invariant subspaces).


An n-dimensional vector space V with a linear operator T is a direct sum

V = ⊕ri=1 Wi

of some T -invariant subspaces Wi , if and only if every set of bases Bi of Wi gives rise to a
basis
B = B1 ⊔ · · · ⊔ Br
of V such that the representation matrix of T relative to B is in the block diagonal form

[T ]B = diag[A1 , A2 , . . . , Ar ]

with Ai of order |Bi | = dim Wi .

Theorem 8.14 (Cayley-Hamilton Theorem).


Let
pT (x) = |xIn − [T ]B | = Σ_{i=0}^{n} bi x^i

be the characteristic polynomial of a linear operator

T :V →V

on an n-dimensional vector space V over a field F and with a basis B. Then T satisfies the
equation pT (x) = 0, i.e.
pT (T ) = 0IV
which is the zero map on V .
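A numerical verification of the Cayley-Hamilton theorem on one matrix (my own example; p_A is evaluated at A by Horner's rule):

    from sympy import Matrix, eye, symbols, zeros

    x = symbols('x')
    A = Matrix([[2, 1, 0], [0, 2, 1], [1, 0, 2]])
    p = A.charpoly(x)                   # x**3 - 6*x**2 + 12*x - 9
    pA = zeros(3)
    for c in p.all_coeffs():            # Horner evaluation of p at the matrix A
        pA = pA * A + c * eye(3)
    print(pA == zeros(3))               # True: p_A(A) is the zero matrix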

9 Minimal Polynomial and Jordan Canonical Form
Definition 9.1 (Minimal Polynomial).
Let
T :V →V
be a linear operator on an n-dimensional vector space over a field F. A nonzero polynomial
m(x) ∈ F[x]
is a minimal polynomial of T if it satisfies:
1. m(x) is monic,
2. Vanishing condition:
m(T ) = 0IV
3. Minimality degree condition:
Whenever f (x) ∈ F[x] is another nonzero polynomial such that f (T ) = 0IV , we have
deg(f (x)) ≥ deg(m(x))

We can similarly define the minimal polynomial of a matrix A ∈ Mn (F).


Remark: The existence of minimal polynomial is proven by Cayley-Hamilton theorem.
Theorem 9.1 (Uniqueness of a minimal polynomial mT (x)).
Let T : V → V be a linear operator on an n-dimensional vector space V over a field F. Let
m(x) be a minimal polynomial of T . Let f (x) ∈ F[x]. Then the following are equivalent.
(1) f (T ) = 0IV .
(2) m(x) is a factor of f (x),i.e., m(x) | f (x).
In particular, there is exactly one minimal polynomial of T and will be denoted as
mT (x) = m(x)
Further, if A = [T ]B , then mT (x) = mA (x).
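In practice the minimal polynomial can be found by lowering the exponents in the factorisation of pA (x) until the matrix is no longer annihilated; a minimal sketch (example matrix mine):

    from sympy import Matrix, eye, symbols, zeros

    x = symbols('x')
    A = Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 2]])
    print(A.charpoly(x).as_expr())            # x**3 - 6*x**2 + 12*x - 8 = (x - 2)**3
    for k in range(1, 4):                     # smallest k with (A - 2I)**k = 0 wins
        if (A - 2*eye(3))**k == zeros(3):
            print(f"m_A(x) = (x - 2)**{k}")   # k = 2 here, and m_A(x) divides p_A(x)
            break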
Theorem 9.2 (Minimal polynomials of similar matrices).
If two matrices Ai are similar: A1 ∼ A2 , then they have the same minimal polynomial
mA1 (x) = mA2 (x)
Theorem 9.3 (Minimal polynomials of direct sums).
Consider the block diagonal matrix

A = diag[A1 , A2 , . . . , Ar ]
where Ai ∈ Mni (F) are square matrices. The minimal polynomial mA (x) of A is equal to
the least common multiple of the minimal polynomials mAi (x) of Ai ,i.e.,
mA (x) = lcm{mA1 (x), . . . , mAr (x)}

Theorem 9.4.
The set of zeros of pT (x) and that of mT (x) are identical.
Definition 9.2 (Jordan Block).
Let λ be a scalar in a field F. The matrix below
 
J := Js (λ) =
[ λ 1 0 · · · 0 ]
[ 0 λ 1 · · · 0 ]
[ 0 0 λ · · · 0 ]
[ ⋮ ⋮ ⋮ ⋱ 1 ]
[ 0 0 0 · · · λ ]
∈ Ms (F)

is called the Jordan Block of order s with eigenvalue λ.


The characteristic polynomial and minimal polynomial of J are identical:

mJ (x) = (x − λ)s = pJ (x)

The eigenspace
Vλ (J) = Span{e1 }
has dimension 1, i.e. the geometric multiplicity of λ is 1, but the algebraic multiplicity of λ
is s.
Definition 9.3 (Jordan Canonical Form).
Let λ be a scalar in a field F. Let

s1 ≤ s2 ≤ · · · ≤ se

be positive integers.

The following block diagonal matrix

A(λ) := diag[Js1 (λ), Js2 (λ), . . . , Jse (λ)]

is called a Jordan canonical form with eigenvalue λ.


The order of A(λ) is
s = Σ_{i=1}^{e} si

The characteristic polynomial and minimal polynomial of A are

pA(λ) (x) = (x − λ)s , mA(λ) (x) = (x − λ)se

where s is also called algebraic multiplicity of λ of A(λ).


The eigenspace of A

Vλ (A(λ)) = Span{e1 , e1+s1 , . . . , e1+s1 +···+se−1 }

has dimension equal to e.
The geometric multiplicity of the eigenvalue λ of A(λ) is dim Vλ (A(λ)) = e. And we have,

e≤s

More generally, let


λ1 , . . . , λ k
be distinct scalars in F. WLOG, we assume λ1 < λ2 < · · · < λk when F = R. Then the
block diagonal matrix

J = diag[A(λ1 ), A(λ2 ), . . . , A(λk )]
is called a Jordan canonical form where A(λi ) is a Jordan canonical form with eigenvalue
λi as shown above.
Each A(λi ) is of order
s(λi )
and s(λi ) is also the number of times the same scalar λi appears on the diagonal of J and
also the algebraic multiplicity of the eigenvalue λi of J.
So J is of order equal to

Σ_{i=1}^{k} s(λi )

There are exactly


e(λi )
Jordan blocks (with eigenvalue λi ) in A(λi ), the largest of which is of order

se (λi )

This se (λi ) is also the multiplicity of λi in the minimal polynomial mJ (x).


There are exactly
Σ_{i=1}^{k} e(λi )

Jordan blocks in J.
Now we have
pJ (x) = Π_{i=1}^{k} (x − λi )^{s(λi )}
mJ (x) = Π_{i=1}^{k} (x − λi )^{se (λi )}

The eigenspace Vλi (J) has dimension e(λi ) and is spanned by the e(λi ) vectors corresponding
to the first columns of the e(λi ) Jordan blocks in A(λi ).
We also have
dim Vλi (J) = e(λi ) ≤ s(λi )
Sometimes, a block diagonal matrix J below

J = diag[Js1 (λ1 ), Js2 (λ2 ), . . . , Jsr (λr )]

is also called a Jordan Canonical Form, where each Jsi (λi ) is a Jordan block with eigen-
value λi ∈ F, but these λi ’s may not be distinct.
Assume that there are exactly k distinct elements in the set

{λ1 , . . . , λr }

and we assume that

λm1 , . . . , λmk

are these k distinct ones. These λmi are just the distinct eigenvalues of J.
Let
s(λmi )
be the number of times the same scalar λmi appears on the diagonal of J. Let

e(λmi )

be the number of Jordan blocks (among the r such in J) with eigenvalue of the same λmi ;
among these e(λmi ) Jordan blocks, the largest is of order say

se (λmi )

The eigenspace Vλmi (J) has dimension e(λmi ) and is spanned by e(λmi ) vectors corresponding
to the first columns of these e(λmi ) Jordan blocks.
Also,
pJ (x) = Π_{i=1}^{k} (x − λmi )^{s(λmi )}
mJ (x) = Π_{i=1}^{k} (x − λmi )^{se (λmi )}

As in the case of A(λ), for the matrix J, we have

dim Vλmi (J) = e(λmi ) ≤ s(λmi )

Theorem 9.5 (Jordan Canonical Form of a Linear Operator).
Let V be a vector space of dimension n over a field F and

T :V →V

a linear operator with characteristic polynomial pT (x) and minimal polynomial mT (x) as
follows
pT (x) = (x − λ1 )n1 · · · (x − λk )nk
mT (x) = (x − λ1 )m1 · · · (x − λk )mk
where
λ1 , . . . , λ k
are distinct scalars in F.
Then there is a basis B of V such that the representation matrix [T ]B equals a Jordan
canonical form J ∈ Mn (F), with

ni = s(λi ), mi = se (λi )

Such a block diagonal J is called a Jordan canonical form of T . It is unique up to re-ordering


of λi . The basis B of V is called a Jordan canonical basis of T .
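SymPy can compute a Jordan canonical form together with a Jordan canonical basis; a minimal sketch (the matrix is my own example):

    from sympy import Matrix

    A = Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 2]])
    P, J = A.jordan_form()          # the columns of P form a Jordan canonical basis
    print(J)                        # one 2x2 block J_2(2) and one 1x1 block J_1(2), up to ordering
    print(P.inv() * A * P == J)     # True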

Theorem 9.6 (A canonical form of T is a canonical form of [T ]B ).


Let V be an n-dimensional vector space over a field F and

T :V →V

a linear operator. Let


A = [T ]B 0
be the representation matrix of T relative to a basis

B 0 = (v1 , . . . , vn )

of V . Let J be a Jordan canonical form. Then the following are equivalent:

(1) There is an invertible matrix P ∈ Mn (F) such that

P −1 AP = J

(2) There is an invertible matrix P ∈ Mn (F) such that the representation matrix [T ]B
relative to the new basis
B = B0P
is J, i.e.
[T ]B = J

Theorem 9.7 (Existence of Jordan Canonical Form).


Let F be a field. Let A be a matrix in Mn (F). Let p(x) = pA (x) be the characteristic
polynomial. Then the following are equivalent.

(1) A has a Jordan canonical form J ∈ Mn (F).

(2) Every zero of the characteristic polynomial p(x) belongs to F.

(3) We can factor p(x) as


p(x) = (x − λ1 ) · · · (x − λn )
where all λi ∈ F.

In particular, if F is so-called algebraically closed, then every matrix A ∈ Mn (F) and every
linear operator T on an n-dimensional vector space V over F have a Jordan canonical form J ∈ Mn (F).

Theorem 9.8 (Consequences of Jordan canonical forms).


Let A be a matrix in Mn (F). Set p(x) = pA (x) and m(x) = mA (x).

1. The characteristic polynomial p(x) and the minimal polynomial m(x) have the same
zero sets.
{α ∈ F | p(α) = 0} = {α ∈ F | m(α) = 0}
Also, the multiplicity ni and mi of a zero λi of p(x) and m(x) satisfy

ni ≥ mi ≥ 1

2. If J ∈ Mn (F) is a Jordan canonical form of A, then we have

dim Vλi (A) = dim Vλi (J) = e(λi ) ≤ s(λi )

Theorem 9.9 (Canonical forms of similar matrices).


Let Ai ∈ Mn (F) and Ji ∈ Mn (F) be its Jordan canonical form. Then the following are
equivalent.

(1) A1 and A2 are similar.

(2) We have J1 = J2 after re-ordering of their Jordan blocks.

Definition 9.4 (Diagonalisable Operator).


Let V be an n-dimensional vector space over a field F. A linear operator T : V → V is
diagonalisable over F, if the representation matrix [T ]B relative to some basis B of V is a
diagonal matrix in Mn (F):

[T ]B = J = diag[λ1 , λ2 , . . . , λn ]

where λi are scalars in F. This J is then automatically a Jordan canonical form of T .


Clearly,
λ1 , . . . , λn

exhaust all zeros of pT (x) and the characteristic polynomial of T is

pT (x) = (x − λ1 ) · · · (x − λn )

A square matrix A ∈ Mn (F) is diagonalisable over F, if A is similar to a diagonal matrix in


Mn (F), i.e.

P −1 AP = J = diag[λ1 , λ2 , . . . , λn ]
for some invertible P ∈ Mn (F), where λi are scalars in F. This J is then automatically a
Jordan canonical form of A.
Write
P = (p1 , . . . , pn )
with pj the jth column of P .
The diagonalisability condition on A is equivalent to

AP = P diag[λ1 , λ2 , . . . , λn ]

i.e.
(Ap1 , . . . , Apn ) = (λ1 p1 , . . . , λn pn )
i.e. each pi is an eigenvector of A corresponding to the eigenvalue λi .
Suppose that A = [T ]B 0 . Then the condition above is equivalent to

[T (vi )]B 0 = [T ]B 0 [vi ]B 0 = λi [vi ]B 0

where vi = B 0 pi ∈ V with
[vi ]B 0 = pi
i.e.
T (vi ) = λi vi
i.e.

T (v1 , . . . , vn ) = (v1 , . . . , vn ) diag[λ1 , λ2 , . . . , λn ]

i.e.

[T ]B = diag[λ1 , λ2 , . . . , λn ]
where
B = (v1 , . . . , vn ) = B 0 P
is a basis of V since
([v1 ]B 0 , . . . , [vn ]B 0 ) = (p1 , . . . , pn )
is a basis of column vector space Fnc .
Theorem 9.10.

(1) T is diagonalisable if and only if the representation matrix [T ]B 0 relative to every basis
B 0 is diagonalisable.

(2) A matrix A ∈ Mn (F) is diagonalisable if and only if the matrix transformation TA on


the column n-space Fnc is diagonalisable.
Theorem 9.11 (Equivalent Diagonalisable Condition).
Let V be an n-dimensional vector space over a field F, and

T :V →V

a linear operator. Then the following are equivalent:


1. T is diagonalisable over F, i.e. the representation matrix of T relative to some basis
B of V is a diagonal matrix in Mn (F):

[T ]B = diag[λ1 , λ2 , . . . , λn ]

2. [T ]B 0 is diagonalisable over F for every basis B 0 of V , i.e. there exists an invertible


P ∈ Mn (F) such that
P −1 [T ]B 0 P = diag[λ1 , . . . , λn ]
for some scalars λi ∈ F(automatically being eigenvalues of T ).

3. A basis
B = (v1 , . . . , vn )
of V is formed by eigenvectors vi of T .

4. There are n linearly independent eigenvectors vi of T .

5. For the representation matrix [T ]B 0 relative to every basis B 0 of V , a basis

P = (p1 , . . . , pn )

of the column n-space Fnc is formed by eigenvectors pi of [T ]B 0 .

6. For the representation matrix [T ]B 0 relative to every basis B 0 of V , there are n linearly
independent eigenvectors pi of [T ]B 0 .

7. Let
λm1 , . . . , λmk
be the only distinct eigenvalues of T and let Bi be a basis of the eigenspace Vλmi (T ).
Then
B = (B1 , . . . , Bk )
is a basis of V , automatically with
 
[T ]B = diag[λm1 I|B1 | , λm2 I|B2 | , . . . , λmk I|Bk | ]

8. Let
λm1 , . . . , λmk
be the only distinct eigenvalues of T . Then V is a direct sum of the eigenspaces

V = Vλm1 (T ) ⊕ · · · ⊕ Vλmk (T )

9. Let
λm1 , . . . , λmk
be the only distinct eigenvalues of T . Then
Σ_{i=1}^{k} dim Vλmi (T ) = dim V

10. T has a Jordan canonical form J which is diagonal.

Theorem 9.12 (Minimal polynomial and diagonalisability).


Let F be a field. Let A be a matrix in Mn (F). Let m(x) = mA (x) be minimal polynomial of
A. Then the following are equivalent:

(1) A is diagonalisable over F.

(2) The minimal polynomial m(x) is a product of distinct linear polynomials in F[x].

m(x) = (x − λ1 ) · · · (x − λk )

where λi are distinct scalars in F


(3) We can factor m(x) over F as

m(x) = (x − λ1 ) · · · (x − λk )

for some scalars λi ∈ F and m(x) has only simple zeros.


(4) Let p(x) = pA (x) be the characteristic polynomial. Then we can factorise p(x) over F
as
p(x) = (x − λ1 )n1 · · · (x − λk )nk
where λi are distinct scalars in F. The dimension of the eigenspace satisfies:

dim Vλi = ni
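A minimal sketch of the criterion (the matrices are my own examples; SymPy's is_diagonalizable and diagonalize do the work):

    from sympy import Matrix

    A = Matrix([[4, 1], [2, 3]])    # m_A(x) = (x - 5)(x - 2): only simple zeros
    B = Matrix([[2, 1], [0, 2]])    # m_B(x) = (x - 2)**2: a repeated zero
    print(A.is_diagonalizable())    # True
    print(B.is_diagonalizable())    # False
    P, D = A.diagonalize()          # P**-1 * A * P = D is diagonal
    print(D)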

Theorem 9.13.
Let V be an n-dimensional vector space over a field F.
A linear operator
T :V →V
on V is nilpotent if
T m = 0IV
for some positive integer m.
Suppose that T has a Jordan canonical form J ∈ Mn (F). The following are equivalent.
(1) T is nilpotent.
(2) J equals some A(λ) with λ = 0.
(3) Every eigenvalue of T is zero.
(4) The characteristic polynomial of T is pT (x) = xn .
(5) The minimal polynomial of T is mT (x) = xs for some s ≥ 1.
Theorem 9.14 (Additive Jordan Decomposition).
Suppose a linear operator
T :V →V
has a Jordan canonical form in Mn (F). Then there are linear operators

Ts : V → V

and
Tn : V → V
satisfying the following:

(1) A decomposition
T = Ts + Tn

(2) Ts is semi-simple.

(3) Tn is nilpotent.

(4) Commutativity
Ts ◦ Tn = Tn ◦ Ts

(5) There are polynomials f (x), g(x) in F[x] such that

Ts = f (T ) Tn = g(T )

This decomposition is unique. We call it Jordan decomposition.

10 Quadratic Forms, Inner Product Spaces and Conics
Definition 10.1 (Bilinear forms).
Let V be a vector space over a field F. Consider the map H below:
H :V ×V →F
(x, y) 7→ H(x, y)
(1) H is called a bilinear form on V if H is linear in both variables, i.e., for all
xi , yj , x, y ∈ V, ai , bi ∈ F
we have
H(a1 x1 + a2 x2 , y) = a1 H(x1 , y) + a2 H(x2 , y)
H(x, b1 y1 + b2 y2 ) = b1 H(x, y1 ) + b2 H(x, y2 )

(2) A bilinear form H on V is symmetric if


H(x, y) = H(y, x) ∀x, y ∈ V

Theorem 10.1 (Representation Matrix).


Suppose that
B = (v1 , . . . , vn )
is a basis of a vector space V over a field F. Let

A = (aij ) ∈ Mn (F)

be a matrix. We define the function

HA : V × V → F
( Σ_{i=1}^{n} xi vi , Σ_{j=1}^{n} yj vj ) ↦ HA ( Σ_{i=1}^{n} xi vi , Σ_{j=1}^{n} yj vj ) := Σ_{i=1}^{n} Σ_{j=1}^{n} aij xi yj = X t AY

where X = (x1 , . . . , xn )t and Y = (y1 , . . . , yn )t .

(1) Then HA is a bilinear form on V and called the bilinear form associated with A
(and relative to the basis B of V ).

(2) Conversely, every bilinear form H on V is of the form HA for some A in Mn (F). Indeed,
just set
aij = H(vi , vj ), A := (aij )
Then one can use the bilinearity of H, show that H = HA .
The matrix A is called the representation matrix of H relative to the basis B.

(3) HA is a symmetric bilinear form if and only if A is a symmetric matrix.

Definition 10.2 (Non-degenerate bilinear forms).


A bilinear form H on V is non-degenerate if for every y0 ∈ V , we have:

H(x, y0 ) = 0(∀x ∈ V ) ⇒ y0 = 0

A bilinear form H = HA is non-degenerate if and only if its representation matrix A is


invertible.

Definition 10.3 (Congruent matrices).


Two matrices A and B in Mn (F) are congruent if there is an invertible matrix P ∈ Mn (F)
such that
B = P t AP
Being congruent is an equivalence relation.
Consider the bilinear form
H : Fnc × Fnc → F
(X, Y ) 7→ X t AY
If we write
X = PY
with an invertible matrix P ∈ Mn (F) and introduce Y as a new coordinate system for Fnc ,
then
H(X1 , X2 ) = X1t AX2
= (P Y1 )t A(P Y2 )
= Y1t (P t AP )Y2
Thus, the bilinear form above would have a simpler form in new coordinates Y , if P t AP
(which is congruent to A) is simpler. This simplification is very useful in classifying all conics.

Theorem 10.2 (Weak version of Principal Axis Theorem).


Let A ∈ Mn (F) be a symmetric matrix. Then there is an invertible matrix P in Mn (F) such
that the matrix P t AP is diagonal:

P t AP = diag[d1 , . . . , dn ] =: D

i.e. A is congruent to a diagonal matrix D.
In this case, the bilinear form
H(X1 , X2 ) = Σ_{i=1}^{n} Σ_{j=1}^{n} aij xi yj
= X1t AX2
= Y1t DY2
= Σ_{j=1}^{n} dj y1j y2j

where we have used the substitution:

Xi = P Yi

Definition 10.4 (Inner Product, Orthogonal, Norm).


We start with the real version.
Consider a function H:
V ×V →R
(x, y) 7→ H(x, y)
on a vector space V over the field R of real numbers.
The function H is called a real inner product and V a real inner product space, if the
following three conditions are satisfied, where we denote

hx, yi := H(x, y)

(1) H is a bilinear form, i.e., for all

xi , yj , x, y ∈ V, ai , bi ∈ R

we have
ha1 x1 + a2 x2 , yi = a1 hx1 , yi + a2 hx2 , yi
hx, b1 y1 + b2 y2 i = b1 hx, y1 i + b2 hx, y2 i

(2) H is symmetric, i.e., for all x, y ∈ V , we have

hx, yi = hy, xi

(3) Positivity:
For all 0 6= x ∈ V , we have
hx, xi > 0

Next is the complex version. Consider a function H:

V ×V →C
(x, y) 7→ H(x, y)

on a vector space V over the field C of complex numbers.


The function H is called a complex inner product and V a complex inner product
space, if the following three conditions are satisfied, where we denote

hx, yi := H(x, y)

(1) H is linear in the first variable and conjugate-linear in the second (a sesquilinear form), i.e., for all

xi , yj , x, y ∈ V, ai , bi ∈ C

we have
⟨a1 x1 + a2 x2 , y⟩ = a1 ⟨x1 , y⟩ + a2 ⟨x2 , y⟩
⟨x, b1 y1 + b2 y2 ⟩ = b̄1 ⟨x, y1 ⟩ + b̄2 ⟨x, y2 ⟩

(2) H is conjugate-symmetric (Hermitian), i.e., for all x, y ∈ V , we have

\langle x, y \rangle = \overline{\langle y, x \rangle}

(3) Positivity:
For all 0 6= x ∈ V , we have
hx, xi > 0

We have three more definitions:


(1) The norm of a vector x ∈ V is denoted and defined as:

‖x‖ = √⟨x, x⟩

We have
‖x‖ ≥ 0
and
‖x‖ = 0 ⇔ x = 0V

(2) Two vectors x, y in V are orthogonal to each other and denoted as

x⊥y

if their inner product


hx, yi = 0

(3) Sometimes, we use


(V, h, i)
to denote a vector space V with an inner product h, i.

Definition 10.5 (Non-degenerate Inner Product).
Let (V, h, i) be an inner product space over a field F with F = R or F = C. Then the product
h, i is non-degenerate in the sense:
for every u0 ∈ V
hu0 , yi = 0(∀y ∈ V ) ⇒ u0 = 0V
and for every v0 ∈ V ,
hx, v0 i = 0(∀x ∈ V ) ⇒ v0 = 0V
Definition 10.6 (Orthonormal basis).
Let (V, h, i) be a real or complex inner product space. A basis B = (v1 , . . . , vn ) is called an
orthonormal basis of the inner product space V , if it satisfies the following two conditions:
(1) Orthogonality:
for all i 6= j, we have:
vi ⊥ vj i.e., hvi , vj i = 0

(2) Normalised:
for all i, we have
kvi k = 1
Namely, vi is a unit vector.
Theorem 10.3 (Gram-Schmidt Process).
Let V = Fnc with F = R or C. Employ the standard inner product h, i for V .
Let
(u1 , . . . , ur )
be a basis of a subspace W of V . Then one can apply the following Gram-Schmidt process
to get an orthonormal basis
(v1 , . . . , vr )
of W .
v_1' = u_1, \qquad
v_2' = u_2 - \frac{\langle u_2, v_1' \rangle}{\|v_1'\|^2}\, v_1', \qquad
v_k' = u_k - \sum_{i=1}^{k-1} \frac{\langle u_k, v_i' \rangle}{\|v_i'\|^2}\, v_i' \quad (k = 2, \ldots, r), \qquad
v_j = \frac{v_j'}{\|v_j'\|} \quad (j = 1, \ldots, r)
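
A compact NumPy sketch of the Gram-Schmidt process (each v_i is normalised as soon as it is produced, which is equivalent to the two-step description above; for serious numerical work one would use np.linalg.qr instead):

import numpy as np

def gram_schmidt(U):
    """Orthonormalise the columns of U (assumed linearly independent),
    using the standard inner product on R^n or C^n."""
    U = np.array(U, dtype=complex)
    V = []
    for k in range(U.shape[1]):
        v = U[:, k].copy()
        for w in V:
            # subtract the component of u_k along each earlier (already unit) vector
            v -= np.vdot(w, U[:, k]) * w
        V.append(v / np.linalg.norm(v))
    return np.column_stack(V)

U = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = gram_schmidt(U)
# the columns of Q are orthonormal: Q* Q = I
assert np.allclose(Q.conj().T @ Q, np.eye(2))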

Definition 10.7 (Adjoint matrices A∗ ).


For a matrix A = (aij ) ∈ Mn (C), the adjoint of A is defined as

A∗ = (Ā)t = (āij )t

i.e., the (i, j)-entry of A∗ equals āji . Note that

A∗ = \overline{(A^t)}

Theorem 10.4 (Adjoint matrix A∗ and inner product).
Let V = Fnc with F = R or C. Employ the standard inner product h, i for V . For a matrix
A ∈ Mn (F), we have
hAX, Y i = hX, A∗ Y i
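
A quick numerical check of Theorem 10.4, using the convention of Definition 10.4 that the inner product is linear in the first slot and conjugate-linear in the second (the matrix and vectors below are random illustrative data):

import numpy as np

rng = np.random.default_rng(1)

def inner(X, Y):
    # <X, Y> = sum_i x_i * conj(y_i): linear in X, conjugate-linear in Y
    return np.sum(X * np.conj(Y))

A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
X = rng.standard_normal(3) + 1j * rng.standard_normal(3)
Y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

A_star = A.conj().T                  # the adjoint A* of Definition 10.7
assert np.isclose(inner(A @ X, Y), inner(X, A_star @ Y))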
Theorem 10.5 (Adjoint Linear Operator).
Let T : V → V be a linear operator on an n-dimensional inner product space V over a field
F. Then we have:
(1) There is a unique linear operator

T∗ : V → V

on V such that, for all u, v ∈ V ,
hT (u), vi = hu, T ∗ (v)i
Such T ∗ is called the adjoint linear operator of T .

(2) Let B = (w1 , . . . , wn ) be an orthonormal basis of the inner product space V . Then

[T ∗ ]B = ([T ]B )∗

Theorem 10.6 (Adjoint of adjoint). (T ∗ )∗ = T .


Theorem 10.7 (Adjoint of linear map combinations).

(1) Suppose that T = αIV is a scalar map. Then

T ∗ = ᾱIV (for F = R this is just αIV )

(2)
(a1 T1 + a2 T2 )∗ = ā1 T1∗ + ā2 T2∗ (the bars denote complex conjugation; they may be dropped when F = R)

(3)
(T1 ◦ T2 )∗ = T2∗ ◦ T1∗

Definition 10.8 (Orthogonal, Unitary, Self-adjoint, Normal linear operators).


Let A ∈ Mn (C) (resp. let T : V → V be a linear operator on an n-dimensional inner product
space over a field F = R or C and with an orthonormal basis B). Let A∗ (resp. T ∗ ) be the
adjoint of A (resp. T ).
(1) A linear operator T over a real inner product space is orthogonal if

T T ∗ = IV

(2) A real matrix A in Mn (R) is orthogonal if

AAt = In

(3) A linear operator T over a complex inner product space is unitary if

T T ∗ = IV

(4) A complex matrix A in Mn (C) is unitary if

AA∗ = In

(5) T is self-adjoint if its adjoint T ∗ equals itself:

T = T∗

When the field F = R, a self-adjoint operator is also called a symmetric operator.
(6) A complex matrix A ∈ Mn (C) is self-adjoint if the adjoint matrix of A equals itself:

A∗ = A

(7) A linear operator T over a complex inner product space is normal if

T T ∗ = T ∗T

(8) A complex matrix A ∈ Mn (C) is normal if

AA∗ = A∗ A

Orthogonal, Unitary, self-adjoint operators are normal.


Theorem 10.8. T is orthogonal, unitary, self-adjoint or normal if and only if its representation
matrix A := [T ]B (relative to one, hence every, orthonormal basis B) is respectively
orthogonal, unitary, self-adjoint or normal.
Theorem 10.9 (Equivalent unitary matrix definition).
For a complex matrix P in Mn (C), the following are equivalent, if we employ the standard inner
product on Cnc .
(1) P is unitary, i.e. P P ∗ = In .
(2) Write
P = (p1 , . . . , pn )
where the pj are the column vectors of P . Then the column vectors p1 , . . . , pn form
an orthonormal basis of Cnc .
(3) The matrix transformation
TP : Cnc → Cnc
X 7→ P X
preserves the standard inner product, i.e., for all X, Y in Cnc , we have

hP X, P Y i = hX, Y i

(4) The matrix transformation TP preserves the distance, i.e. for all X, Y in Cnc , we have

kP X − P Y k = kX − Y k

(5) The matrix transformation TP preserves the norm, i.e. for all X in Cnc , we have

kP Xk = kXk

(6) For one and hence every orthonormal basis

B = (v1 , . . . , vn )

of Cnc , the new basis


B 0 = BP
is again an orthonormal basis of Cnc .
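
The equivalences can be checked numerically on an example; the unitary matrix below (a rotation times a phase) is an illustrative choice:

import numpy as np

theta = 0.7
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]]) * np.exp(1j * 0.3)

# (1) P P* = I
assert np.allclose(P @ P.conj().T, np.eye(2))

# (2) the columns of P are orthonormal (their Gram matrix is the identity)
assert np.allclose(P.conj().T @ P, np.eye(2))

# (5) the transformation X -> PX preserves the norm
X = np.array([1.0 + 2.0j, -3.0j])
assert np.isclose(np.linalg.norm(P @ X), np.linalg.norm(X))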
Theorem 10.10 (Eigenvalues of orthogonal or unitary matrices).

(1) If a real matrix P ∈ Mn (R) is orthogonal, then every zero of pP (x) has modulus equal
to 1. In particular, the determinant

|P | = ±1

(2) If a complex matrix P ∈ Mn (C) is unitary, then every eigenvalue

\lambda_i = r_1 + r_2\sqrt{-1}

of P has modulus
|\lambda_i| = \sqrt{r_1^2 + r_2^2} = 1
In particular, the determinant |P | ∈ C has modulus 1.
Theorem 10.11 (Eigenvalue of self-adjoint linear operators).

(1) Suppose that a real matrix A ∈ Mn (R) is symmetric. Every zero of pA (x) is a real
number.

(2) Suppose a complex matrix A ∈ Mn (C) is self-adjoint. Every zero of pA (x) is a real
number.

(3) More generally, suppose that T is a self-adjoint linear operator. Every zero of pT (x) is
a real number.

(4) Suppose that T is a self-adjoint linear operator. Let vi (i = 1, 2) be two eigenvectors


corresponding to two distinct eigenvalues λi of T . We have

hv1 , v2 i = 0
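
A small numerical illustration of parts (2) and (4) (the Hermitian matrix below is an arbitrary choice):

import numpy as np

A = np.array([[2.0,      1.0 - 1j],
              [1.0 + 1j, 3.0     ]])
assert np.allclose(A, A.conj().T)        # A is self-adjoint

# np.linalg.eigh is designed for Hermitian matrices: it returns real eigenvalues
# together with an orthonormal family of eigenvectors (the columns of V).
eigvals, V = np.linalg.eigh(A)
assert np.all(np.isreal(eigvals))        # eigenvalues 1 and 4, both real

# eigenvectors for the two distinct eigenvalues are orthogonal, as in part (4)
v1, v2 = V[:, 0], V[:, 1]
assert np.isclose(np.vdot(v1, v2), 0.0)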

Definition 10.9 (Positive/Negative definite linear operators).
Let A ∈ Mn (C) and let T : V → V be a linear operator on an n-dimensional inner product
space V over a field F = R or C.
(1) T is positive definite if T is self-adjoint and, for all 0 ≠ v ∈ V ,

hT (v), vi > 0

(2) T is negative definite if T is self-adjoint and, for all 0 ≠ v ∈ V ,

hT (v), vi < 0

Thus T is negative definite if and only if −T is positive definite.

(3) A is positive definite if A is self-adjoint and, for all 0 ≠ X ∈ Cnc ,

(AX)t X = X t At X > 0

(4) A is negative definite if A is self-adjoint and, for all 0 ≠ X ∈ Cnc ,

(AX)t X = X t At X < 0

Thus, A is negative definite if and only if −A is positive definite.


Theorem 10.12 (Equivalent Positive-Definite Definition).
Let
A = (aij ) ∈ Mn (R)
be a symmetric real matrix. Then A is positive definite if and only if all its leading principal
submatrices
(aij )1≤i,j≤r (1 ≤ r ≤ n)
of order r have positive determinants.
Let T be a self-adjoint linear operator on an inner product space V which is over F = R or
C and with an orthonormal basis B. Set

A := [T ]B ∈ Mn (F)

Then the following are equivalent.


(1) T is positive definite.

(2) A is positive definite.

(3) Every eigenvalue of T is positive.

(4) Every eigenvalue of A is positive.

(5) One can write A as


A = C ∗C
for some invertible complex matrix C ∈ Mn (C)
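
These criteria can be checked side by side on an example (the symmetric matrix below is an illustrative choice; np.linalg.cholesky supplies one factorisation A = C*C as in (5)):

import numpy as np

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# leading principal minors: det of the top-left r x r blocks, r = 1..n
minors = [np.linalg.det(A[:r, :r]) for r in range(1, 4)]
print(minors)                             # approximately [2.0, 3.0, 4.0], all positive

# every eigenvalue is positive
assert np.all(np.linalg.eigvalsh(A) > 0)

# A = C* C for an invertible C: take C = L* from the Cholesky factorisation A = L L*
L = np.linalg.cholesky(A)                 # lower-triangular with A = L @ L.conj().T
C = L.conj().T
assert np.allclose(C.conj().T @ C, A)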

Theorem 10.13. Let A ∈ Mn (C). The function H on V := Cnc

H :V ×V →C
(X, Y ) 7→ hX, Y i := (AX)t Y

defines an inner product on V if and only if A is positive definite.

Theorem 10.14 (Principal Axis Theorem).

1. Let T : V → V be a linear operator on a real inner product space V of dimension n.


Then T is self-adjoint (i.e., T ∗ = T ) if and only if there is an orthonormal basis B such
that
[T ]B
is a diagonal matrix in Mn (R).

2. A real matrix A ∈ Mn (R) is self-adjoint (i.e. A∗ = A) if and only if there is an


orthogonal matrix P such that

P −1 AP = P t AP

is a diagonal matrix in Mn (R).

3. Let T : V → V be a linear operator on a complex inner product space V of dimension
n. Then T is self-adjoint (i.e., T ∗ = T ) if and only if there is an orthonormal basis B
such that
[T ]B
is a diagonal matrix in Mn (C) with real diagonal entries.

4. A complex matrix A ∈ Mn (C) is self-adjoint (i.e. A∗ = A) if and only if there is a
unitary matrix U such that
U −1 AU = U ∗ AU
is a diagonal matrix in Mn (C) with real diagonal entries.
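
Numerically, parts 1-2 amount to what np.linalg.eigh computes for a real symmetric matrix; a minimal sketch with an illustrative A:

import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 4.0]])

# eigh returns the eigenvalues and an ORTHOGONAL matrix P of eigenvectors,
# so P^{-1} A P = P^t A P is diagonal, as in parts 1-2 of the theorem.
eigvals, P = np.linalg.eigh(A)
assert np.allclose(P.T @ P, np.eye(2))    # P is orthogonal
D = P.T @ A @ P
assert np.allclose(D, np.diag(eigvals))
print(np.round(D, 6))                     # diag(3, 5)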

Theorem 10.15 (Orthogonal Complement).


Let W be a subspace of an inner product space V . Take an orthonormal basis BW of W .

(1) One can extend BW to an orthonormal basis B = (BW , B2 ) of V .

(2) B2 is an orthonormal basis of the so-called orthogonal complement of W :

W ⊥ := {x ∈ V | ⟨x, w⟩ = 0, ∀w ∈ W }

(3)
V = W ⊕ W⊥
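
One way to compute W ⊥ concretely (a sketch for V = R^4 with an illustrative W ; here an orthonormal basis B2 is read off from an SVD rather than by extending BW by hand):

import numpy as np

# W is spanned by the two columns of M
M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# W-perp is the null space of M^t (the vectors orthogonal to every column of M);
# the rows of Vt beyond the rank span that null space and are already orthonormal.
U, S, Vt = np.linalg.svd(M.T)
rank = int(np.sum(S > 1e-12))
W_perp = Vt[rank:].T                      # columns: an orthonormal basis of W-perp

assert np.allclose(M.T @ W_perp, 0)       # each basis vector is orthogonal to W
assert W_perp.shape[1] + rank == 4        # dim W + dim W-perp = dim V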

Definition 10.10 (Quadratic Form).
Let V be a vector space over a field F. A function

K:V →F

or simply K(x) is a quadratic form if there is a symmetric bilinear form

H :V ×V →F

such that
K(x) = H(x, x)

Theorem 10.16 (Principal Axis Theorem for Quadratic Forms).


Let

f(x_1, \ldots, x_n) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j

be a quadratic form in coordinates

X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
with
A = (aij ) ∈ Mn (R)
a symmetric matrix. Then there is an orthogonal matrix P such that f has the following
standard form
f (x1 , . . . , xn ) = λ1 y1² + · · · + λn yn²
in the new coordinates

Y := \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = P^{-1} X
where λi ∈ R are the eigenvalues of A.
This standard form is unique up to relabelling of the terms λi yi².
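
A worked example (our own choice of quadratic form, not from the notes): reducing f(x1, x2) = 2x1² + 2x1x2 + 2x2² to standard form.

import numpy as np

# the symmetric matrix of f(x1, x2) = 2 x1^2 + 2 x1 x2 + 2 x2^2
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eigh(A)       # P orthogonal with P^t A P = diag(eigvals)
print(eigvals)                       # [1. 3.]  ->  f = y1^2 + 3 y2^2 in Y = P^{-1} X

# spot check at a point: X^t A X equals sum_i lambda_i y_i^2 with Y = P^t X
X = np.array([0.7, -1.2])
Y = P.T @ X                          # P^{-1} = P^t since P is orthogonal
assert np.isclose(X @ A @ X, np.sum(eigvals * Y**2))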

11 Problems
1 Let A ∈ Mn (C) be a complex matrix of order n ≥ 9 and let

f (x) := (x − 1)² (x − 2)³ (x − 3)⁴

Suppose that A is self-adjoint and f (A) = 0. Find all possible minimal polynomials
mA (x) of A.

2 Let V be a finite-dimensional inner product space and T : V → V invertible. Prove


that there exists a unitary operator U and a positive operator P on V such that
T = U ◦ P.

3 AY1314Sem2 Question 6(iii)

4 AY1415Sem2 Question 8(iv) –(vi)

5 Let V be a finite dimensional vector space over a field F and let T be a linear operator
on V . Suppose there exists v ∈ V such that {v, T (v), . . . , T n−1 (v)} is a basis for V
where n = dim(V ).

(a) Prove that the linear operators IV , T, . . . , T n−1 are linearly independent. (Done)
(b) Let S be a linear operator on V such that S ◦ T = T ◦ S. Write

S(v) = a0 v + a1 T (v) + · · · + an−1 T n−1 (v)

where a0 , a1 , · · · , an−1 ∈ F.
Prove that S = p(T ) where p(x) = a0 + a1 x + · · · + an−1 xn−1 .
(c) Suppose pT (x) = (x−λ1 )r1 (x−λ2 )r2 · · · (x−λk )rk where λ1 , λ2 , . . . , λk are distinct
eigenvalues of T . Find mT (x).

6 Let A be an invertible n × n matrix over a field F.

(a) Show that c_{A^{-1}}(x) = x^n [c_A(0)]^{-1} c_A(1/x). (Done)


(b) Show that m_{A^{-1}}(x) = x^k [m_A(0)]^{-1} m_A(1/x), where k = deg m_A(x).
