Let X and Y be any vector spaces. Suppose that to each vector $x \in X$ we assign a unique vector $y \in Y$.
Then we say that a mapping (or transformation or operator) of X into Y is given. Such a
mapping is denoted by a capital letter, say F, and we write $F : X \to Y$. The vector $y \in Y$ assigned
to a vector $x \in X$ is called the image of x and is denoted by $F(x) = y$.
F is called a linear mapping or linear transformation if, for all vectors v and x in X and all scalars
$c \in K$, where K is the field of scalars, it satisfies the following conditions:
(a) $F(v + x) = F(v) + F(x)$
(b) $F(cx) = cF(x)$.
Thus $F : X \to Y$ is linear if it preserves vector addition and scalar multiplication.
Let X be a vector space over the field K. A mapping $F : X \to K$ is termed a linear functional
(or linear form) if for every $u, v \in X$ and all scalars $q, r \in K$, $F(qu + rv) = qF(u) + rF(v)$.
That is, a linear functional on X is a linear mapping from X to K.
Let L be the operator defined by $L(x) = 3x$ for each $x \in \mathbb{R}^2$. Then for any scalar c we have
$L(cx) = 3(cx) = c(3x) = cL(x)$ (scalar multiplication)
and
$L(x + y) = 3(x + y) = 3x + 3y = L(x) + L(y)$ (vector addition).
Hence L is a linear mapping.
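The two conditions for linearity can also be spot-checked numerically. The following is a minimal sketch, not a proof: the helper `looks_linear` and the contrasting map `shift` are hypothetical names introduced here, and random sampling can only refute linearity, never establish it.

```python
import numpy as np

def L(x):
    """The operator L(x) = 3x from the example above."""
    return 3 * np.asarray(x, dtype=float)

def shift(x):
    """A translation x -> x + 1, included as a map that is NOT linear."""
    return np.asarray(x, dtype=float) + 1.0

def looks_linear(F, dim, trials=100, seed=0):
    """Test F(x + y) = F(x) + F(y) and F(cx) = cF(x) on random samples."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = rng.normal(size=dim), rng.normal(size=dim)
        c = rng.normal()
        if not np.allclose(F(x + y), F(x) + F(y)):
            return False
        if not np.allclose(F(c * x), c * F(x)):
            return False
    return True

print(looks_linear(L, 2), looks_linear(shift, 2))
```

The translation fails because `shift(x + y)` adds 1 once while `shift(x) + shift(y)` adds it twice.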
1-2.2 Linear functional
Let V be a vector space over a field K. The dual space of V, denoted by $V^*$, is the set of all
linear maps from V to K:
$V^* = \mathcal{L}(V, K) = \{F : V \to K \mid F \text{ is linear}\}$.
That is, $V^*$ consists of the maps $F : V \to K$ such that, for all $u, v \in V$ and $q, r \in K$,
$F(qu + rv) = qF(u) + rF(v)$.
Such a map is known as a linear functional. Thus a linear functional is a linear map from a vector space to its field of scalars.
The space of all linear functionals on V is known as the dual space of V, denoted by $V^*$.
Example 1
The mapping $L : \mathbb{R}^2 \to \mathbb{R}^1$ defined by $L(x) = x_1 + x_2$ is a linear transformation since
$L(x + y) = (x_1 + y_1) + (x_2 + y_2) = (x_1 + x_2) + (y_1 + y_2) = L(x) + L(y)$.
Example 2
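Let L be the mapping from $C[a,b]$ to $\mathbb{R}$ defined by $L(f) = \int_a^b f(x)\,dx$. If f and g are any
vectors in $C[a,b]$, then
$L(f + g) = \int_a^b (f + g)(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx = L(f) + L(g)$,
and in the same way $L(cf) = cL(f)$ for every scalar c.
Therefore, L is a linear transformation. Since its values are scalars, L is a linear functional on $C[a,b]$.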
Let $V = K^n$, the vector space of n-tuples, which we write as column vectors. Then the dual
space $V^*$ can be identified with the space of row vectors. Any linear functional
$\varphi = (a_1, \dots, a_n)$ in $V^*$ has the representation
$\varphi(x_1, x_2, \dots, x_n) = (a_1, a_2, \dots, a_n)(x_1, x_2, \dots, x_n)^t = a_1x_1 + a_2x_2 + \dots + a_nx_n$.
The mapping $L : \mathbb{R}^2 \to \mathbb{R}^1$ defined by $L(x) = x_1 + x_2$ is a linear transformation since
$L(x + y) = (x_1 + y_1) + (x_2 + y_2)$
$= (x_1 + x_2) + (y_1 + y_2)$
$= L(x) + L(y)$.
Let $L : P_1 \to P_2$ be defined as indicated. Is L a linear transformation? Justify your answer.
a. $L[p(t)] = tp(t) + t^2 + 1$
b. $L(at + b) = at^2 + (a - b)t$
Suppose $\{v_1, \dots, v_n\}$ is a basis of V over K. Let $\varphi_1, \dots, \varphi_n \in V^*$ be the linear
functionals defined by
$\varphi_i(v_j) = \delta_{ij} = 1$ if $i = j$, and $0$ if $i \neq j$.
Then $\{\varphi_1, \dots, \varphi_n\}$ is a basis of $V^*$.
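We first show that $\{\varphi_1, \dots, \varphi_n\}$ spans $V^*$. Let $\varphi$ be an arbitrary element of $V^*$ and
suppose $\varphi(v_1) = k_1, \varphi(v_2) = k_2, \dots, \varphi(v_n) = k_n$.
Set $\sigma = k_1\varphi_1 + \dots + k_n\varphi_n$.
Then $\sigma(v_1) = (k_1\varphi_1 + \dots + k_n\varphi_n)(v_1)$
$= k_1\varphi_1(v_1) + \dots + k_n\varphi_n(v_1)$
$= k_1 \cdot 1 + k_2 \cdot 0 + \dots + k_n \cdot 0 = k_1$.
Similarly $\sigma(v_i) = k_i$, so $\sigma(v_i) = \varphi(v_i)$ for $i = 1, \dots, n$. Since $\varphi$ and $\sigma$ agree on the basis vectors,
$\varphi = \sigma = k_1\varphi_1 + k_2\varphi_2 + \dots + k_n\varphi_n$. Hence $\{\varphi_1, \dots, \varphi_n\}$ spans $V^*$.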
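It remains to show that $\{\varphi_1, \varphi_2, \dots, \varphi_n\}$ is linearly independent.
Suppose $a_1\varphi_1 + a_2\varphi_2 + \dots + a_n\varphi_n = 0$.
Applying both sides to $v_1$, we obtain
$0 = 0(v_1) = (a_1\varphi_1 + a_2\varphi_2 + \dots + a_n\varphi_n)(v_1)$
$= a_1\varphi_1(v_1) + a_2\varphi_2(v_1) + \dots + a_n\varphi_n(v_1)$
$= a_1 \cdot 1 + a_2 \cdot 0 + \dots + a_n \cdot 0$
$= a_1$.
Applying both sides to $v_i$ in the same way gives $a_i = 0$ for $i = 2, \dots, n$. Hence the set is linearly independent and is therefore a basis of $V^*$.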
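The above basis $\{\varphi_i\}$ is termed the dual basis to $\{v_i\}$. Using the Kronecker delta
$\delta_{ij}$ we have
$\varphi_1(v_1) = 1, \varphi_1(v_2) = 0, \dots, \varphi_1(v_n) = 0$
$\varphi_2(v_1) = 0, \varphi_2(v_2) = 1, \dots, \varphi_2(v_n) = 0$
$\dots$
$\varphi_n(v_1) = 0, \varphi_n(v_2) = 0, \dots, \varphi_n(v_n) = 1$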
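Consider the following basis of $\mathbb{R}^2$: $\{v_1 = (2, 1), v_2 = (3, 1)\}$. Find the dual basis $\{\varphi_1, \varphi_2\}$.
The linear functionals have the form $\varphi_1(x, y) = ax + by$ and $\varphi_2(x, y) = cx + dy$, such that
$\varphi_1(v_1) = 1$, $\varphi_1(v_2) = 0$, $\varphi_2(v_1) = 0$, $\varphi_2(v_2) = 1$.
$\varphi_1(v_1) = \varphi_1(2, 1) = 2a + b = 1$
$\varphi_1(v_2) = \varphi_1(3, 1) = 3a + b = 0$
so $a = -1$, $b = 3$.
$\varphi_2(v_1) = \varphi_2(2, 1) = 2c + d = 0$
$\varphi_2(v_2) = \varphi_2(3, 1) = 3c + d = 1$
so $c = 1$, $d = -2$.
Hence the dual basis is $\{\varphi_1(x, y) = -x + 3y,\ \varphi_2(x, y) = x - 2y\}$.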
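The conditions $\varphi_i(v_j) = \delta_{ij}$ say that the matrix whose rows are the coefficient vectors of the $\varphi_i$ is the inverse of the matrix whose columns are the $v_j$. The sketch below uses that observation for the basis in the example; the function name `dual_basis` is an illustrative choice, and NumPy is assumed to be available.

```python
import numpy as np

def dual_basis(basis):
    """Return a matrix whose i-th row holds the coefficients of phi_i.

    The basis vectors are stacked as columns of V; the requirement
    phi_i(v_j) = delta_ij is then Phi @ V = I, so Phi = V^{-1}.
    """
    V = np.column_stack([np.asarray(v, dtype=float) for v in basis])
    return np.linalg.inv(V)

V = np.column_stack([(2.0, 1.0), (3.0, 1.0)])
Phi = dual_basis([(2, 1), (3, 1)])
print(Phi)  # rows: coefficients (a, b) of phi_1 and (c, d) of phi_2
```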
Consider the following basis of $\mathbb{R}^2$: $\{v_1 = (1, 1), v_2 = (1, 2)\}$. Find the dual basis $\{\varphi_1, \varphi_2\}$.
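Let $\{v_1, \dots, v_n\}$ be a basis of V and let $\{\varphi_1, \dots, \varphi_n\}$ be the dual basis of $V^*$.
Then for any vector $u \in V$,
$u = \varphi_1(u)v_1 + \varphi_2(u)v_2 + \dots + \varphi_n(u)v_n$ (1)
and for any linear functional $\sigma \in V^*$,
$\sigma = \sigma(v_1)\varphi_1 + \sigma(v_2)\varphi_2 + \dots + \sigma(v_n)\varphi_n$. (2)
Suppose $u = a_1v_1 + a_2v_2 + \dots + a_nv_n$. (3)
Then $\varphi_1(u) = a_1\varphi_1(v_1) + a_2\varphi_1(v_2) + \dots + a_n\varphi_1(v_n)$
$= a_1 \cdot 1 + a_2 \cdot 0 + \dots + a_n \cdot 0 = a_1$,
$\varphi_2(u) = a_1\varphi_2(v_1) + a_2\varphi_2(v_2) + \dots + a_n\varphi_2(v_n)$
$= a_1 \cdot 0 + a_2 \cdot 1 + \dots + a_n \cdot 0 = a_2$.
Similarly, for $i = 3, \dots, n$ we have
$\varphi_i(u) = a_1\varphi_i(v_1) + a_2\varphi_i(v_2) + \dots + a_i\varphi_i(v_i) + \dots + a_n\varphi_i(v_n)$
$= a_1 \cdot 0 + a_2 \cdot 0 + \dots + a_i \cdot 1 + \dots + a_n \cdot 0 = a_i$.
That is, $\varphi_1(u) = a_1, \varphi_2(u) = a_2, \dots, \varphi_n(u) = a_n$. Substituting these values into (3) gives (1).
Next we prove (2). Applying the linear functional $\sigma$ to both sides of (1) we get
$\sigma(u) = \varphi_1(u)\sigma(v_1) + \varphi_2(u)\sigma(v_2) + \dots + \varphi_n(u)\sigma(v_n)$
$= \sigma(v_1)\varphi_1(u) + \sigma(v_2)\varphi_2(u) + \dots + \sigma(v_n)\varphi_n(u)$
$= (\sigma(v_1)\varphi_1 + \sigma(v_2)\varphi_2 + \dots + \sigma(v_n)\varphi_n)(u)$.
Since this holds for every $u \in V$, (2) follows.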
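$\begin{pmatrix} 2 & 2+3i & 4-5i \\ 2-3i & 5 & 6+2i \\ 4+5i & 6-2i & 7 \end{pmatrix}$ is a Hermitian matrix.
But $\begin{pmatrix} 3 & 2-i & 4+i \\ 2-i & 6 & i \\ 4+i & i & 3 \end{pmatrix}$ is not a Hermitian matrix even though it is symmetric.
A real matrix is Hermitian if and only if it is symmetric. For example,
$\begin{pmatrix} 4 & 3 & 5 \\ 3 & 2 & 1 \\ 5 & 1 & 6 \end{pmatrix}$
is Hermitian.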
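Let A be a Hermitian matrix and let $f(X, Y) = X^tA\bar{Y}$. For all $a, b \in \mathbb{C}$ and all $X_1, X_2, Y \in \mathbb{C}^n$,
$f(aX_1 + bX_2, Y) = (aX_1 + bX_2)^tA\bar{Y}$
$= (aX_1^t + bX_2^t)A\bar{Y}$
$= aX_1^tA\bar{Y} + bX_2^tA\bar{Y}$
$= af(X_1, Y) + bf(X_2, Y)$.
Hence f is linear in the first variable. Also,
$\overline{f(X, Y)} = \overline{X^tA\bar{Y}} = \overline{(X^tA\bar{Y})^t}$
$= \overline{\bar{Y}^tA^tX}$
$= Y^t\bar{A}^t\bar{X}$, and $\bar{A}^t = A$ because A is Hermitian, so
$= Y^tA\bar{X}$
$= f(Y, X)$.
Hence f is a Hermitian form on $\mathbb{C}^n$. (We use the fact that $X^tA\bar{Y}$ is a scalar, hence equal to its transpose.)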
Let U and V be vector spaces of finite dimension over a field K. A bilinear form is a mapping
$f : U \times V \to K$ which satisfies the following:
(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$
(ii) $f(u, av_1 + bv_2) = af(u, v_1) + bf(u, v_2)$
for all $a, b \in K$ and all $u_i \in U$ and $v_i \in V$. When $U = V$ we speak of a bilinear form on V.
Let $(u_1, u_2, \dots, u_n)$ be a basis of U and $(v_1, v_2, \dots, v_n)$ a basis of V.
Example: (i)
Let f be the dot product on $\mathbb{R}^n$; that is, $f(u, v) = u \cdot v = a_1b_1 + a_2b_2 + \dots + a_nb_n$
where $u = (a_i)$ and $v = (b_i)$. Then f is a bilinear form on $\mathbb{R}^n$.
(ii) Let $A = (a_{ij})$ be any $n \times n$ matrix over K. Then A may be viewed as a bilinear form
f on $K^n$ by defining
$f(X, Y) = X^tAY = (x_1, x_2, \dots, x_n) \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$
$= \sum_{i,j=1}^{n} a_{ij}x_iy_j = a_{11}x_1y_1 + a_{12}x_1y_2 + \dots + a_{nn}x_ny_n$.
The above formal expression in the variables $x_i$, $y_j$ is termed the bilinear polynomial corresponding
to the matrix A.
Let f be the bilinear form on $\mathbb{R}^2$ defined by $f((x_1, x_2), (y_1, y_2)) = 2x_1y_1 - 3x_1y_2 + x_2y_2$.
(i) Find the matrix A of f in the basis $\{u_1 = (1, 0), u_2 = (1, 1)\}$.
(ii) Find the matrix B of f in the basis $\{v_1 = (2, 1), v_2 = (1, -1)\}$.
(iii) Find the transition matrix P from the basis $\{u_i\}$ to the basis $\{v_i\}$, and verify that
$B = P^tAP$.
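Set $A = (a_{ij})$ where $a_{ij} = f(u_i, u_j)$.
In the standard basis the matrix of f is
$D = \begin{pmatrix} 2 & -3 \\ 0 & 1 \end{pmatrix}$,
so that $f(X, Y) = X^tDY$. For example,
$a_{11} = f(u_1, u_1) = u_1^tDu_1 = (1, 0) \begin{pmatrix} 2 & -3 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = 2$.
The remaining entries are found in the same way: $a_{12} = f(u_1, u_2) = -1$, $a_{21} = f(u_2, u_1) = 2$ and $a_{22} = f(u_2, u_2) = 0$, so
$A = \begin{pmatrix} 2 & -1 \\ 2 & 0 \end{pmatrix}$.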
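All three parts of this problem can be checked numerically. The sketch below is one possible way to do so, assuming NumPy; the helper names `f` and `gram` are illustrative and not part of the text. It builds the matrix of f in each basis entry by entry and obtains P by expressing each $v_j$ in the basis $\{u_i\}$.

```python
import numpy as np

# Matrix of f in the standard basis: f(x, y) = 2*x1*y1 - 3*x1*y2 + x2*y2
D = np.array([[2.0, -3.0],
              [0.0, 1.0]])

def f(x, y):
    """Evaluate the bilinear form f(x, y) = x^t D y."""
    return np.asarray(x, dtype=float) @ D @ np.asarray(y, dtype=float)

def gram(basis):
    """Matrix of f in the given basis: entry (i, j) is f(b_i, b_j)."""
    return np.array([[f(bi, bj) for bj in basis] for bi in basis])

u = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
v = [np.array([2.0, 1.0]), np.array([1.0, -1.0])]

A = gram(u)
B = gram(v)
# Column j of P holds the coordinates of v_j relative to {u_1, u_2}.
P = np.linalg.solve(np.column_stack(u), np.column_stack(v))

print(A, B, P, sep="\n")
print(np.allclose(B, P.T @ A @ P))
```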
Let f be the bilinear form on $\mathbb{R}^2$ defined by
$f((x_1, x_2), (y_1, y_2)) = 3x_1y_1 - 2x_1y_2 + 4x_2y_1 - x_2y_2$.
a. Find the matrix A of f in the basis $\{u_1 = (1, 1), u_2 = (1, 2)\}$.
b. Find the matrix B of f in the basis $\{v_1 = (1, -1), v_2 = (3, 1)\}$.
c. Find the transition matrix P from $\{u_i\}$ to $\{v_i\}$ and verify that $B = P^tAP$.
A bilinear form f on V is symmetric if $f(u, v) = f(v, u)$ for every $u, v \in V$.
If A is a matrix representation of f, we can write $f(X, Y) = X^tAY = (X^tAY)^t = Y^tA^tX$.
We use the fact that $X^tAY$ is a scalar and therefore equals its transpose. Thus if f is symmetric,
$Y^tA^tX = f(X, Y) = f(Y, X) = Y^tAX$, and since this is true for all vectors X, Y it follows that
$A = A^t$, that is, A is symmetric. Conversely, if A is symmetric, then f is symmetric.
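$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right)$
We next apply the row operation $R_3 \to -2R_2 + R_3$ and then the corresponding column operation $C_3 \to -2C_2 + C_3$:
$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right)$ and then $\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right)$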
1 1 3
Let A= 1 2 1 , a symmetric matrix. Use elementary row operation to diagonalize A.
3 1
A bilinear form H on a finite-dimensional vector space V is called diagonalizable if there is
an ordered basis for V such that the matrix of H with respect to that basis is a diagonal matrix. If A is the matrix of a bilinear
form and A has distinct eigenvalues, then A is diagonalizable, i.e. $P^{-1}AP = B$,
where P is the matrix of eigenvectors and B is the diagonal matrix with the eigenvalues on the
main diagonal.
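Consider the matrix $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$.
Then the characteristic equation is
$\det(A - \lambda I) = \begin{vmatrix} 1 - \lambda & 2 \\ 3 & 2 - \lambda \end{vmatrix} = 0$
$(1 - \lambda)(2 - \lambda) - 6 = 0$
$\lambda^2 - 3\lambda + 2 - 6 = 0$
$\lambda^2 - 3\lambda - 4 = 0$, so $\lambda = 4$ or $\lambda = -1$.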
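The eigenvectors satisfy $Ax = \lambda x$.
When $\lambda = 4$, we have
$\begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 4 \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
$x_1 + 2x_2 = 4x_1$
$3x_1 + 2x_2 = 4x_2$
$-3x_1 + 2x_2 = 0$
$3x_1 - 2x_2 = 0$
so $3x_1 = 2x_2$, and an eigenvector is $(2, 3)^t$.
When $\lambda = -1$, we have
$\begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = -\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
$x_1 + 2x_2 = -x_1$
$3x_1 + 2x_2 = -x_2$
$2x_1 + 2x_2 = 0$
$3x_1 + 3x_2 = 0$
so $x_1 = -x_2$, and an eigenvector is $(1, -1)^t$.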
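Let $P = \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix}$, so that $P^{-1} = \begin{pmatrix} 1/5 & 1/5 \\ 3/5 & -2/5 \end{pmatrix}$.
Then A is similar to the diagonal matrix
$B = P^{-1}AP = \begin{pmatrix} 1/5 & 1/5 \\ 3/5 & -2/5 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix} = \begin{pmatrix} 4 & 0 \\ 0 & -1 \end{pmatrix}$.
The diagonal elements 4 and -1 of the diagonal matrix B are the eigenvalues corresponding
to the given eigenvectors.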
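The same diagonalization can be reproduced numerically. This is a minimal sketch assuming NumPy; `np.linalg.eig` returns unit-length eigenvectors in an order of its own choosing, so its matrix P differs from a hand-computed one by column scaling and possibly column order, while $P^{-1}AP$ is still diagonal.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])

# Columns of P are eigenvectors; eigenvalues[i] belongs to P[:, i].
eigenvalues, P = np.linalg.eig(A)

# Similarity transformation to a diagonal matrix.
B = np.linalg.inv(P) @ A @ P

print(np.sort(eigenvalues))
print(np.round(B, 10))
```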
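Let $A = \begin{pmatrix} 1 & -1 & 4 \\ 3 & 2 & -1 \\ 2 & 1 & -1 \end{pmatrix}$. Then
$\det(A - \lambda I) = \begin{vmatrix} 1 - \lambda & -1 & 4 \\ 3 & 2 - \lambda & -1 \\ 2 & 1 & -(1 + \lambda) \end{vmatrix} = 0$.
The eigenvalues are $\lambda_1 = 1$, $\lambda_2 = -2$ and $\lambda_3 = 3$, and the corresponding eigenvectors are
$V_1 = \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix}$, $V_2 = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}$, $V_3 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$.
$P = (V_1, V_2, V_3) = \begin{pmatrix} -1 & -1 & 1 \\ 4 & 1 & 2 \\ 1 & 1 & 1 \end{pmatrix}$, $P^{-1} = \frac{1}{6} \begin{pmatrix} -1 & 2 & -3 \\ -2 & -2 & 6 \\ 3 & 0 & 3 \end{pmatrix}$
$P^{-1}AP = B = \frac{1}{6} \begin{pmatrix} -1 & 2 & -3 \\ -2 & -2 & 6 \\ 3 & 0 & 3 \end{pmatrix} \begin{pmatrix} 1 & -1 & 4 \\ 3 & 2 & -1 \\ 2 & 1 & -1 \end{pmatrix} \begin{pmatrix} -1 & -1 & 1 \\ 4 & 1 & 2 \\ 1 & 1 & 1 \end{pmatrix}$
$= \begin{pmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$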
The difference between the number of positive eigenvalues and the number of negative
eigenvalues is called the signature.
If a matrix A has 8, -1, -1 as its eigenvalues, then it has one positive and two negative eigenvalues, so the signature is
given as 1 - 2 = -1.
Let $H = \begin{pmatrix} 1 & 1+i & 2i \\ 1-i & 4 & 2-3i \\ -2i & 2+3i & 7 \end{pmatrix}$, a Hermitian matrix. Find a non-singular matrix
P s.t. $P^TH\bar{P}$ is diagonal. Find also the signature of H.
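Let u and v be vectors in $\mathbb{R}^n$, regarded as $n \times 1$ matrices. Then $u^tv$ is a scalar. The number $u^tv$ is called the inner product of u and v and
is written as $u \cdot v$. This is also referred to as the dot product. If $u = (u_1, \dots, u_n)$ and $v = (v_1, \dots, v_n)$, then
$u \cdot v = u^tv = u_1v_1 + \dots + u_nv_n$.
Let u, v and w be vectors in $\mathbb{R}^n$, and let c be a scalar. Then
(a) $u \cdot v = v \cdot u$
(b) $(u + v) \cdot w = u \cdot w + v \cdot w$
(c) $(cu) \cdot v = c(u \cdot v) = u \cdot (cv)$
(d) $u \cdot u \geq 0$, and $u \cdot u = 0$ if and only if $u = 0$.
The length (or norm) of v is the nonnegative scalar $\|v\|$ defined by
$\|v\| = \sqrt{v \cdot v} = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}$, and $\|v\|^2 = v \cdot v$.
(Figure: in $\mathbb{R}^2$ the length of $v = (v_1, v_2)$ is $\sqrt{v_1^2 + v_2^2}$, the hypotenuse of a right triangle with legs $|v_1|$ and $|v_2|$.)
A vector whose length is 1 is called a unit vector. The unit vector in the direction of a nonzero vector v is given by $u = v / \|v\|$.
The process of creating u from v is sometimes called normalising v. We say that u is in the same
direction as v.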
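Let $v = (1, -2, 2)$. Find a unit vector u in the same direction as v.
$\|v\|^2 = v \cdot v = 1^2 + (-2)^2 + 2^2 = 9$, so $\|v\| = 3$ and $u = \frac{1}{3}v = \left(\frac{1}{3}, -\frac{2}{3}, \frac{2}{3}\right)$.
Check that $\|u\| = 1$:
$u \cdot u = \left(\frac{1}{3}\right)^2 + \left(-\frac{2}{3}\right)^2 + \left(\frac{2}{3}\right)^2 = 1$.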
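For u and v in $\mathbb{R}^n$, the distance between u and v is the length of the vector $u - v$, that is, $\|u - v\|$.
If $u = (u_1, \dots, u_n)$ and $v = (v_1, \dots, v_n)$,
then $\|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \dots + (u_n - v_n)^2}$.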
Let u = (2,3,2,-1) and v = (4,2,1,3). Find,
1. The norm of each of the vectors
2. The distance between the two vectors
3. The angle between the two vectors
SESSION 2-3 Orthogonal Vectors
v 0 v
u u v v u v
u u u v v u v v
u u v v u v v u
Two vectors u and v in $\mathbb{R}^n$ are orthogonal to each other if $u \cdot v = 0$.
Two vectors u and v are orthogonal if and only if $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.
1. Compute and
2. Find a unit vector in the direction of
3. Show that is orthogonal to
4. Show that is an orthogonal set where
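If $u = 0$ then both sides of the Cauchy-Schwarz inequality (1), $|\langle u, v \rangle| \leq \|u\| \|v\|$, are zero and hence (1) is true in this case.
If $u \neq 0$, let W be the subspace spanned by u.
Recall that $\|cu\| = |c| \|u\|$ for any scalar c.
Thus $\|\operatorname{proj}_W v\| = \left\| \frac{\langle v, u \rangle}{\langle u, u \rangle} u \right\| = \frac{|\langle v, u \rangle|}{|\langle u, u \rangle|} \|u\| = \frac{|\langle v, u \rangle|}{\|u\|^2} \|u\| = \frac{|\langle u, v \rangle|}{\|u\|}$.
Since $\|\operatorname{proj}_W v\| \leq \|v\|$, this gives (1). For the triangle inequality,
$\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + 2\langle u, v \rangle + \langle v, v \rangle$
$\leq \|u\|^2 + 2\|u\| \|v\| + \|v\|^2 = (\|u\| + \|v\|)^2$.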
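Find the lengths of, and the distance between, the vectors u = (2, 3, 1) and v = (4, 1, -3). Find also
the angle between the two vectors.
$\|u\| = \sqrt{(u, u)} = \sqrt{(2, 3, 1) \cdot (2, 3, 1)} = \sqrt{4 + 9 + 1} = \sqrt{14}$
$\|v\| = \sqrt{(v, v)} = \sqrt{(4, 1, -3) \cdot (4, 1, -3)} = \sqrt{16 + 1 + 9} = \sqrt{26}$
$\|u\| \|v\| = \sqrt{14}\sqrt{26} = \sqrt{364} = 2\sqrt{91}$
The distance is $\|u - v\| = \|(-2, 2, 4)\| = \sqrt{4 + 4 + 16} = \sqrt{24} = 2\sqrt{6}$.
$\cos\theta = \frac{(u, v)}{\|u\| \|v\|} = \frac{(2, 3, 1) \cdot (4, 1, -3)}{\sqrt{364}} = \frac{8 + 3 - 3}{\sqrt{364}} = \frac{8}{\sqrt{364}}$
$\theta = \cos^{-1}\left(\frac{8}{\sqrt{364}}\right) \approx 65.2^\circ$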
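The quantities in this example follow directly from the dot product, so they are easy to recompute. The sketch below uses only the Python standard library; the function names are illustrative choices.

```python
import math

def dot(u, v):
    """Dot product u . v of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length ||u|| = sqrt(u . u)."""
    return math.sqrt(dot(u, u))

def distance(u, v):
    """Distance ||u - v|| between two vectors."""
    return norm([a - b for a, b in zip(u, v)])

def angle_degrees(u, v):
    """Angle between u and v, from cos(theta) = u.v / (||u|| ||v||)."""
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))

u = (2, 3, 1)
v = (4, 1, -3)
print(norm(u), norm(v), distance(u, v), angle_degrees(u, v))
```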
2.3.4 Matrices
(Ext. of IPS. and norm)
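If the vector space is $V = M_{m \times n}$, the space of $m \times n$ matrices, then the following are norm functions.
(a) $\|A\| = \max_{i,j} |A_{ij}|$ is called the max norm.
(b) $\|A\| = \max_i \sum_{j=1}^{n} |A_{ij}|$ is called the max row sum norm.
E.g. for $A = \begin{pmatrix} 2 & -3 \\ 1 & 2 \end{pmatrix}$: for i = 1, $\sum_{j=1}^{2} |a_{1j}| = |2| + |-3| = 5$; for i = 2,
$\sum_{j=1}^{2} |a_{2j}| = |1| + |2| = 3$. Hence $\|A\| = 5$.
(c) The Frobenius norm is $\|A\|_F = \left( \sum_{i,j} (A_{ij})^2 \right)^{1/2}$.
For $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} 2 & -3 \\ 1 & 2 \end{pmatrix}$,
$\|A\|_F = \sqrt{4 + 9 + 1 + 4} = \sqrt{18}$.
(d) A p-norm is defined by $\|A\|_p = \left( \sum_{i,j} |A_{ij}|^p \right)^{1/p}$, where p = 1, 2, 3, ...; p = 2 gives the Frobenius norm.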
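The matrix norms (a) to (d) can be evaluated for the example matrix as follows. This is a sketch assuming NumPy; `entrywise_p_norm` is a hypothetical helper written for norm (d), which is an entrywise norm and not the induced p-norm that `np.linalg.norm(A, p)` computes for matrices.

```python
import numpy as np

A = np.array([[2.0, -3.0],
              [1.0, 2.0]])

# (a) max norm: largest absolute entry
max_norm = np.abs(A).max()

# (b) max row sum norm: largest sum of absolute values along a row
row_sum_norm = np.abs(A).sum(axis=1).max()

# (c) Frobenius norm: square root of the sum of squared entries
frobenius = np.sqrt((A ** 2).sum())

def entrywise_p_norm(M, p):
    """(d) entrywise p-norm: (sum |M_ij|^p)^(1/p)."""
    return (np.abs(M) ** p).sum() ** (1.0 / p)

print(max_norm, row_sum_norm, frobenius, entrywise_p_norm(A, 2))
```

For comparison, NumPy exposes the max row sum norm as `np.linalg.norm(A, np.inf)` and the Frobenius norm as `np.linalg.norm(A, 'fro')`.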
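For a vector space $V = \mathbb{R}^n$, the norm functions are
(a) $\|v\|_\infty = \max_j |v_j|$, called the max absolute norm.
E.g. for $v = (2, -3, 4, -6)$,
$\|v\|_\infty = \max_j |v_j| = \max [|2|, |-3|, |4|, |-6|]$
$= \max [2, 3, 4, 6] = 6$.
(b) $\|v\|_2 = \left( \sum_{j=1}^{n} |v_j|^2 \right)^{1/2}$, called the Euclidean norm.
$\|v\|_2 = \left( 2^2 + (-3)^2 + 4^2 + (-6)^2 \right)^{1/2}$
$= (4 + 9 + 16 + 36)^{1/2} = \sqrt{65}$.
(c) $\|v\|_p = \left( \sum_{j=1}^{n} |v_j|^p \right)^{1/p}$.
$\|v\|_3 = \left( |2|^3 + |-3|^3 + |4|^3 + |-6|^3 \right)^{1/3}$
$= (8 + 27 + 64 + 216)^{1/3} = \sqrt[3]{315}$.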
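The vector norms above can be recomputed in a few lines. The following sketch uses plain Python; `p_norm` and `max_norm` are illustrative names.

```python
def max_norm(v):
    """Max absolute norm: the largest |v_j|."""
    return max(abs(x) for x in v)

def p_norm(v, p):
    """p-norm: (sum |v_j|^p)^(1/p); p = 2 is the Euclidean norm."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

v = (2, -3, 4, -6)
print(max_norm(v), p_norm(v, 2), p_norm(v, 3))
```

As p grows, `p_norm(v, p)` decreases towards `max_norm(v)`, which is why the max absolute norm is often written with the subscript infinity.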
1. (i) Let $x_1, x_2, \dots, x_n$ be distinct real numbers. For each pair of polynomials p, q in
$P_n$ define $(p, q) = \sum_{i=1}^{n} p(x_i)q(x_i)$, where $x_i = (i - 3)/2$ for i = 1, ..., 5.
(iii) Define the norm in $P_5$ by $\|p\| = \sqrt{(p, p)} = \left( \sum_{i=1}^{5} [p(x_i)]^2 \right)^{1/2}$.
(iv) Compute (a) $\|x\|$, (b) $\|x^2\|$, (c) the distance between x and $x^2$.
2. (i) Find the length and the distance between the vectors .
(ii) Find also the angles between the two vectors.
In this unit we treat orthogonal and orthonormal sets. We introduce you to orthogonal
projection and the Gram-Schmidt orthogonalization process. We also treat the best
approximation and the method of least squares.
A set of vectors $\{u_1, \dots, u_p\}$ in $\mathbb{R}^n$ is said to be an orthogonal set if each pair of distinct
vectors from the set is orthogonal, that is, if $u_i \cdot u_j = 0$ whenever $i \neq j$.
Let us consider the three possible pairs of distinct vectors, namely, and
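Theorem 1
If $S = \{u_1, \dots, u_p\}$ is an orthogonal set of nonzero vectors in $\mathbb{R}^n$, then S is linearly
independent and hence is a basis for the subspace spanned by S.
If $0 = c_1u_1 + \dots + c_pu_p$ for some scalars $c_1, \dots, c_p$, then
$0 = 0 \cdot u_1 = (c_1u_1 + c_2u_2 + \dots + c_pu_p) \cdot u_1$
$= (c_1u_1) \cdot u_1 + (c_2u_2) \cdot u_1 + \dots + (c_pu_p) \cdot u_1$
$= c_1(u_1 \cdot u_1) + c_2(u_2 \cdot u_1) + \dots + c_p(u_p \cdot u_1)$
$= c_1(u_1 \cdot u_1)$, because $u_1$ is orthogonal to $u_2, \dots, u_p$.
Since $u_1$ is nonzero, $u_1 \cdot u_1$ is not zero and so $c_1 = 0$. Similarly, $c_2, \dots, c_p$ must be zero. Thus
S is linearly independent.
Let S be an orthogonal set of nonzero vectors in $\mathbb{R}^n$ and let W be the subspace spanned by S.
Then S is called an orthogonal basis for W because it is both an orthogonal set and a basis for
W. If there are n vectors in S, then $W = \mathbb{R}^n$ and S is an orthogonal basis for $\mathbb{R}^n$.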
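Theorem 2
Let $\{u_1, \dots, u_p\}$ be an orthogonal basis for a subspace W of $\mathbb{R}^n$. Then each y in W has a
unique representation as a linear combination of $u_1, \dots, u_p$. In fact, if
$y = c_1u_1 + \dots + c_pu_p$, then $c_j = \frac{y \cdot u_j}{u_j \cdot u_j}$ for $j = 1, \dots, p$.
For example, with three basis vectors the computation takes the form
$y = \frac{y \cdot u_1}{u_1 \cdot u_1}u_1 + \frac{y \cdot u_2}{u_2 \cdot u_2}u_2 + \frac{y \cdot u_3}{u_3 \cdot u_3}u_3 = \frac{11}{11}u_1 + \frac{-12}{6}u_2 + \frac{-33}{33/2}u_3$
$= u_1 - 2u_2 - 2u_3$.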
Which of the following are orthogonal sets of vectors?
a. {(1,-1,2), (0,2,-1), (-1,1,1)}
b. {(1,2,-1,1), (0,-1,-2,0), (1,0,0,-1)}
c. {(0,1,0,-1), (1,0,1,1), (-1,1,-1,2)}
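Show that $\{v_1, v_2, v_3\}$ is an orthonormal basis of $\mathbb{R}^3$, where
$v_1 = \left( \frac{3}{\sqrt{11}}, \frac{1}{\sqrt{11}}, \frac{1}{\sqrt{11}} \right)$, $v_2 = \left( -\frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}} \right)$, $v_3 = \left( -\frac{1}{\sqrt{66}}, -\frac{4}{\sqrt{66}}, \frac{7}{\sqrt{66}} \right)$.
$v_1 \cdot v_2 = -\frac{3}{\sqrt{66}} + \frac{2}{\sqrt{66}} + \frac{1}{\sqrt{66}} = 0$, and similarly $v_1 \cdot v_3 = 0$ and $v_2 \cdot v_3 = 0$.
Thus $\{v_1, v_2, v_3\}$ is an orthogonal set. Also
$v_1 \cdot v_1 = \frac{9}{11} + \frac{1}{11} + \frac{1}{11} = 1$
$v_2 \cdot v_2 = \frac{1}{6} + \frac{4}{6} + \frac{1}{6} = 1$
$v_3 \cdot v_3 = \frac{1}{66} + \frac{16}{66} + \frac{49}{66} = 1$
which shows that $v_1$, $v_2$ and $v_3$ are unit vectors. Thus $\{v_1, v_2, v_3\}$ is an orthonormal set.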
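An $m \times n$ matrix U has orthonormal columns if and only if $U^TU = I$ (I the identity matrix).
To see this, let us assume that U has only three columns $u_1, u_2, u_3$, each a vector in $\mathbb{R}^m$.
Then the (i, j) entry of $U^TU$ is $u_i^Tu_j$. (1)
Let U be an $m \times n$ matrix with orthonormal columns and let x and y be in $\mathbb{R}^n$. Then
(a) $\|Ux\| = \|x\|$
(b) $(Ux) \cdot (Uy) = x \cdot y$
(c) $(Ux) \cdot (Uy) = 0$ if and only if $x \cdot y = 0$.
Properties (a) and (c) say that the linear mapping $x \mapsto Ux$ preserves length and orthogonality.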
1 2
2 3
U 2 and x 2
2 3 3
0 1
Verify that
1-4.3 Orthogonal projection
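Given nonzero vectors u and y in $\mathbb{R}^n$, consider the problem of decomposing y into the sum of
two vectors, one a multiple of u and the other orthogonal to u. We wish to write
$y = \hat{y} + z$, where $\hat{y} = \alpha u$ for some scalar $\alpha$ and $z = y - \hat{y}$ is orthogonal to u.
Find $\alpha$ to make $y - \hat{y}$ orthogonal to u.
Given any scalar $\alpha$, let $z = y - \alpha u$. Then $y - \hat{y}$ is orthogonal to u if and only if
$0 = (y - \alpha u) \cdot u = y \cdot u - \alpha (u \cdot u)$, that is, $\alpha = \frac{y \cdot u}{u \cdot u}$ and $\hat{y} = \frac{y \cdot u}{u \cdot u}u$.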
Let and . Find the orthogonal projection of unto u. Then write as the
sum of two orthogonal vectors, one in space and one orthogonal to u.
The orthogonal projection of onto u is and the component
of orthogonal to u is
Which shows that is an orthogonal set.
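The decomposition of y into a multiple of u plus a vector orthogonal to u can be sketched as follows. The vectors y = (7, 6) and u = (4, 2) are hypothetical values chosen only for illustration, since they are an assumption and not taken from the text; NumPy is assumed.

```python
import numpy as np

def project_onto(y, u):
    """Split y into y_hat (a multiple of u) and z (orthogonal to u)."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    alpha = (y @ u) / (u @ u)   # the scalar that makes y - alpha*u orthogonal to u
    y_hat = alpha * u
    return y_hat, y - y_hat

# Illustrative vectors (an assumption, not from the text).
y_hat, z = project_onto([7, 6], [4, 2])
print(y_hat, z, z @ np.array([4.0, 2.0]))
```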
Let $\{v_1, \dots, v_n\}$ be an arbitrary basis of an inner product space V. Then there exists an orthonormal
basis $\{u_1, \dots, u_n\}$ of V s.t. the transition matrix from $\{v_i\}$ to $\{u_i\}$ is triangular; that is,
for i = 1, ..., n,
$u_i = a_{i1}v_1 + a_{i2}v_2 + \dots + a_{ii}v_i$.
We set $u_1 = v_1 / \|v_1\|$; then $\{u_1\}$ is orthonormal. Next we set $w_2 = v_2 - (v_2, u_1)u_1$ and $u_2 = w_2 / \|w_2\|$.
By the Lemma above, $w_2$ (and hence $u_2$) is orthogonal to $u_1$; then $\{u_1, u_2\}$ is orthonormal.
Next we set $w_3 = v_3 - (v_3, u_1)u_1 - (v_3, u_2)u_2$ and $u_3 = w_3 / \|w_3\|$.
By the Lemma above, $w_3$ (and hence $u_3$) is orthogonal to $u_1$ and $u_2$; then $\{u_1, u_2, u_3\}$ is orthonormal.
Continuing in this way, we obtain an orthonormal basis $\{u_1, \dots, u_n\}$ of V.
Given $A = \begin{pmatrix} 1 & -1 & 4 \\ 1 & 4 & -2 \\ 1 & 4 & 2 \\ 1 & -1 & 0 \end{pmatrix}$, find an orthonormal basis for the column space of A.
Let $v_1 = (1, 1, 1, 1)^t$, $v_2 = (-1, 4, 4, -1)^t$ and $v_3 = (4, -2, 2, 0)^t$.
Step 1
$\|v_1\| = \sqrt{1^2 + 1^2 + 1^2 + 1^2} = \sqrt{4} = 2$, so $u_1 = \frac{v_1}{\|v_1\|} = \frac{1}{2}(1, 1, 1, 1)$.
Step 2
$(v_2, u_1) = \frac{1}{2}(-1 + 4 + 4 - 1) = 3$, so
$w_2 = v_2 - (v_2, u_1)u_1 = (-1, 4, 4, -1) - \frac{3}{2}(1, 1, 1, 1) = \frac{5}{2}(-1, 1, 1, -1)$,
$\|w_2\| = \sqrt{25} = 5$ and $u_2 = \frac{w_2}{\|w_2\|} = \frac{1}{2}(-1, 1, 1, -1)$.
Step 3
$(v_3, u_1) = 2$ and $(v_3, u_2) = -2$, so
$w_3 = v_3 - (v_3, u_1)u_1 - (v_3, u_2)u_2$
$= (4, -2, 2, 0) - (1, 1, 1, 1) + (-1, 1, 1, -1)$
$= (4, -2, 2, 0) - (2, 0, 0, 2)$
$= (2, -2, 2, -2) = 2(1, -1, 1, -1)$,
$\|w_3\| = \sqrt{16} = 4$ and $u_3 = \frac{w_3}{\|w_3\|} = \frac{1}{2}(1, -1, 1, -1)$.
Check that $\|u_j\| = 1$ for j = 1, 2, 3 and $u_1 \cdot u_2 = u_1 \cdot u_3 = u_2 \cdot u_3 = 0$.
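The construction $w_k = v_k - \sum_{i<k} (v_k, u_i)u_i$, $u_k = w_k / \|w_k\|$ translates directly into code. The following is a minimal sketch of the classical Gram-Schmidt process, assuming NumPy and a linearly independent input list; the function name and the tolerance are illustrative choices, and the signs of the example vectors are those reconstructed for this worked example.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same space as `vectors`."""
    basis = []
    for v in vectors:
        v = np.asarray(v, dtype=float)
        w = v.copy()
        for u in basis:
            w = w - (v @ u) * u          # remove the component of v along u
        length = np.linalg.norm(w)
        if length > tol:
            basis.append(w / length)     # normalise w to get the next u
        else:
            raise ValueError("input vectors are linearly dependent")
    return basis

u1, u2, u3 = gram_schmidt([(1, 1, 1, 1), (-1, 4, 4, -1), (4, -2, 2, 0)])
print(u1, u2, u3)
```

In floating-point work the modified variant, which projects the running vector `w` instead of the original `v`, is usually preferred for numerical stability; the classical form is shown here because it matches the formulas in the text.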
Exercise 1:
Given $\{v_1 = (1, 1, 1), v_2 = (0, 1, 1), v_3 = (0, 0, 1)\}$, use the Gram-Schmidt orthogonalization
process to find the orthonormal vectors $\{u_i\}$.
Exercise 2:
Consider the basis $S = \{u_1, u_2, u_3\}$ for $\mathbb{R}^3$ where
$u_1 = (1, 1, 1)$, $u_2 = (-1, 0, -1)$ and $u_3 = (-1, 2, 3)$.
Use the Gram-Schmidt process to transform S to an orthonormal basis in $\mathbb{R}^3$.
The Best Approximation Theorem
Let W be a subspace of $\mathbb{R}^n$, y be a vector in $\mathbb{R}^n$ and $\hat{y}$ be the orthogonal projection of y onto
W determined by an orthogonal basis of W. Then $\hat{y}$ is the closest point in W to y, in the
sense that $\|y - \hat{y}\|$ is strictly smaller than $\|y - v\|$ for all v in W distinct from $\hat{y}$.
The vector $\hat{y}$ is called the best approximation to y by elements of W. The distance from y to
v, given by $\|y - v\|$, can be regarded as the 'error' of using v in place of y. By the above
theorem, the error is minimized when $v = \hat{y}$ ($\|y - \hat{y}\|$ is the error).
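Example 2
The distance from a point y in $\mathbb{R}^n$ to a subspace W is defined as the distance from y to the
nearest point in W. Find the distance from y to $W = \operatorname{span}\{u_1, u_2\}$, where
$y = \begin{pmatrix} -1 \\ -5 \\ 10 \end{pmatrix}$, $u_1 = \begin{pmatrix} 5 \\ -2 \\ 1 \end{pmatrix}$, $u_2 = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}$.
$\hat{y} = \frac{15}{30}u_1 + \frac{-21}{6}u_2 = \frac{1}{2} \begin{pmatrix} 5 \\ -2 \\ 1 \end{pmatrix} - \frac{7}{2} \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} = \begin{pmatrix} -1 \\ -8 \\ 4 \end{pmatrix}$
$y - \hat{y} = \begin{pmatrix} -1 \\ -5 \\ 10 \end{pmatrix} - \begin{pmatrix} -1 \\ -8 \\ 4 \end{pmatrix} = \begin{pmatrix} 0 \\ 3 \\ 6 \end{pmatrix}$
Hence the distance from y to W is $\|y - \hat{y}\| = \sqrt{3^2 + 6^2} = \sqrt{45} = 3\sqrt{5}$.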
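The projection onto a subspace with an orthogonal basis, and the resulting distance, can be computed as below. This is a sketch assuming NumPy; the signs of y, u1 and u2 are reconstructed from the magnitudes in the worked example and should be treated as an assumption, and the helper name is illustrative.

```python
import numpy as np

def project_onto_subspace(y, orthogonal_basis):
    """Orthogonal projection of y onto span(orthogonal_basis).

    The basis vectors must be mutually orthogonal (not necessarily unit).
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.zeros_like(y)
    for u in orthogonal_basis:
        u = np.asarray(u, dtype=float)
        y_hat += (y @ u) / (u @ u) * u
    return y_hat

# Values assumed for the example (signs reconstructed).
y = np.array([-1.0, -5.0, 10.0])
u1 = np.array([5.0, -2.0, 1.0])
u2 = np.array([1.0, 2.0, -1.0])

y_hat = project_onto_subspace(y, [u1, u2])
dist = np.linalg.norm(y - y_hat)
print(y_hat, dist)
```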
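If A is $m \times n$ and b is in $\mathbb{R}^m$, a least-squares solution of $Ax = b$ is an $\hat{x}$ in $\mathbb{R}^n$ s.t.
$\|b - A\hat{x}\| \leq \|b - Ax\|$ for all x in $\mathbb{R}^n$.
The term least-squares arises from the fact that $\|b - Ax\|$ is the square root of a sum of
squares. We therefore look for an x that makes Ax the closest point in Col A to b.
The vector $\hat{y}$ is called the orthogonal projection of y onto W and is often written as $\operatorname{proj}_W y$.
Let $\hat{b} = \operatorname{proj}_{\operatorname{Col} A} b$. Suppose that $\hat{x}$ satisfies $A\hat{x} = \hat{b}$. Then,
by the above theorem, the projection $\hat{b}$ has the property that $b - \hat{b}$ is orthogonal to Col A. So
$b - A\hat{x}$ is orthogonal to each column of A. If $a_j$ is any column of A then
$a_j \cdot (b - A\hat{x}) = 0$, that is, $a_j^T(b - A\hat{x}) = 0$, and since each $a_j^T$ is a row of $A^T$,
$A^T(b - A\hat{x}) = 0$, i.e. $A^TA\hat{x} = A^Tb$. These equations are called the
normal equations of $Ax = b$.
The set of least-squares solutions of $Ax = b$ coincides with the nonempty set of solutions of the
normal equations $A^TAx = A^Tb$.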
Find a least-squares solution of the in consistent system for
Then becomes
Find a least-square solution of for
The matrix $A^TA$ is invertible if and only if the columns of A are linearly independent. In this
case, the equation $Ax = b$ has only one least-squares solution $\hat{x}$, and it is given by $\hat{x} = (A^TA)^{-1}A^Tb$.
When a least-squares solution $\hat{x}$ is used to produce $A\hat{x}$ as an approximation
to b, the distance from b to $A\hat{x}$ is called the least-squares error of this approximation.
Let and be the column vectors of A and they are also orthogonal. The orthogonal
projections of b onto is given by
in is a list of weights that will build out of the columns of A. Hence the weight to
place on the columns of A to produce
Find a least-square solution for
For instance, if we consider the polynomial $f(x) = x^3 - x + 1$ with the following four
points: (-1.3, 0.103), (-0.1, 1.099), (0.2, 0.808), (1.3, 1.897), and graph the points, we see that
they lie nearly on a straight line. A widely used principle for fitting straight lines is the
Method of Least Squares by Gauss.
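In the Method of Least Squares, the straight line $y = a + bx$ should be fitted through the given
points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ so that the sum of the squares of the distances of
those points from the straight line is minimum, where distance is measured in the vertical
direction (the y-direction).
The point on the line with abscissa $x_j$ has the ordinate $a + bx_j$. Hence its distance from $(x_j, y_j)$
is $|y_j - a - bx_j|$ and the sum of squares is
$q = \sum_{j=1}^{n} (y_j - a - bx_j)^2$, where q depends on a and b.
(Figure: the line $y = a + bx$, a data point $(x_j, y_j)$, and its vertical distance $y_j - a - bx_j$ from the point $a + bx_j$ on the line.)
A necessary condition for q to be minimum is that $\partial q / \partial a = 0$ and $\partial q / \partial b = 0$, which gives
$an + b\sum x_j = \sum y_j$
$a\sum x_j + b\sum x_j^2 = \sum x_jy_j$.
The above equations are called the normal equations of our problem.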
Using the method of least squares, fit a straight line to the points (-1.3, 0.103), (-0.1, 1.099),
(0.2, 0.808), (1.3, 1.897).
$n = 4$, $\sum x_j = 0.1$, $\sum x_j^2 = 3.43$, $\sum y_j = 3.907$, $\sum x_jy_j = 2.3839$.
Hence the normal equations are
$4a + 0.10b = 3.9070$
$0.1a + 3.43b = 2.3839$.
Solving the above equations simultaneously, we have a = 0.9601, b = 0.6670 and obtain
$y = 0.9601 + 0.6670x$.
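The worked example can be reproduced by forming the sums and solving the two normal equations for a straight line, $an + b\sum x_j = \sum y_j$ and $a\sum x_j + b\sum x_j^2 = \sum x_jy_j$. The sketch below assumes NumPy and uses the four data points of the example.

```python
import numpy as np

xs = np.array([-1.3, -0.1, 0.2, 1.3])
ys = np.array([0.103, 1.099, 0.808, 1.897])
n = len(xs)

# Coefficient matrix and right-hand side of the normal equations.
M = np.array([[n, xs.sum()],
              [xs.sum(), (xs ** 2).sum()]])
rhs = np.array([ys.sum(), (xs * ys).sum()])

a, b = np.linalg.solve(M, rhs)   # intercept a and slope b of y = a + b x
print(round(a, 4), round(b, 4))
```

The same line is returned by `np.polyfit(xs, ys, 1)`, which lists the slope first and the intercept second.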
Fit a straight line to the points (0,3), (2, 1), (3, -1), (5, -2) by the method of least squares.
1. Fit a parabola $p(x) = b_0 + b_1x + b_2x^2$ to the points (-1, 3), (1, 1), (2, 2), (3, 6) by the method
of least squares.
2. Derive the normal equations for a cubic parabola $p(x) = b_0 + b_1x + b_2x^2 + b_3x^3$ and hence fit a
cubic parabola by least squares to (-2, -8), (-1, 0), (0, 1), (1, 2), (2, 12), (4, 80).
In this unit we introduce you to quadratic forms and similar matrices, and show how to diagonalize
matrices, in particular the symmetric matrices of quadratic forms.
1. If A is symmetric, then any two eigenvectors from different eigenspaces are orthogonal.
2. An $n \times n$ matrix A is orthogonally diagonalizable if and only if A is a symmetric matrix.
Quadratic forms and eigenvalues: Let A be an $n \times n$ symmetric matrix. Then a quadratic form
$x^tAx$ is:
a. Positive definite if and only if the eigenvalues of A are all positive.
b. Negative definite if and only if the eigenvalues of A are all negative.
c. Positive semi-definite if and only if the eigenvalues of A are all nonnegative (if the form is not positive definite, at least one eigenvalue is 0 and the others
are positive).
d. Negative semi-definite if and only if the eigenvalues of A are all nonpositive (if the form is not negative definite, at least one eigenvalue is 0 and the others
are negative).
e. Indefinite if and only if A has both positive and negative eigenvalues.
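The classification by eigenvalue signs can be sketched as a small function. This assumes NumPy and a symmetric input; the name `classify_quadratic_form` and the tolerance `tol`, used to decide when a computed eigenvalue counts as zero, are illustrative choices. The semi-definite labels are returned only when the form is not definite, and the zero matrix is reported as positive semi-definite.

```python
import numpy as np

def classify_quadratic_form(A, tol=1e-10):
    """Classify x^t A x from the eigenvalues of the symmetric matrix A."""
    eig = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    if np.all(eig > tol):
        return "positive definite"
    if np.all(eig < -tol):
        return "negative definite"
    if np.all(eig > -tol):
        return "positive semi-definite"
    if np.all(eig < tol):
        return "negative semi-definite"
    return "indefinite"

print(classify_quadratic_form([[2, 0], [0, 3]]))
print(classify_quadratic_form([[1, 0], [0, -1]]))
```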
Let , compute for the following matrices (a) (b)
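(b) $X^tAX = (x_1, x_2) \begin{pmatrix} 3 & -2 \\ -2 & 7 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
$= (x_1, x_2) \begin{pmatrix} 3x_1 - 2x_2 \\ -2x_1 + 7x_2 \end{pmatrix}$
$= 3x_1x_1 - 2x_1x_2 - 2x_1x_2 + 7x_2x_2$
$= 3x_1^2 - 4x_1x_2 + 7x_2^2$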
For in , let
Write this quadratic form as
Note that the coefficients and go on the diagonal of A. To make A symmetric, the
coefficient of for must be split evenly between the and entries in A.
The coefficient of
Let . Compute the value of for and
Let Q(x,y,z)= . Write this in the form .
Let . Then the matrix of quadratic form is given by
First we orthogonally diagonalize A. The eigenvalues are and and the
associated unit eigenvectors are
Note that these are automatically orthogonal, hence they provide an orthonormal basis for
Let P ,D=
The matrix A above is an indefinite matrix since it has both positive and negative
eigenvalues. Its eigenvectors are nevertheless orthogonal, because A is symmetric and they belong to different eigenvalues.
Find the value of when
We know that , we have
Make a change of variable x Pu that transforms the quadratic q( x) 5x12 4x1x2 5x22 into
quadratic form with no cross product term and hence find the value of Q( x) when x (2, 2)
In some cases, especially when there is a repeated eigenvalue, the eigenvectors computed for a symmetric matrix are not
mutually orthogonal. In order to get an orthogonal matrix P, we use the Gram-Schmidt
process to construct mutually orthogonal eigenvectors from the original eigenvectors
and hence obtain the needed orthogonal matrix for $P^TAP = B$.
A matrix B is said to be similar to a matrix A if there is a non-singular matrix P such that $B = P^{-1}AP$.
Example 1
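Similarity has the following properties:
1. A is similar to A.
2. If B is similar to A, then A is similar to B.
3. If A is similar to B and B is similar to C, then A is similar to C.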
By property 2 we replace the statements “A is similar to B” and “B is similar to A” by “A and
B are similar”.
We shall say that the matrix A is diagonalizable if it is similar to a diagonal matrix. In this
case we also say that A can be diagonalized.
Example 2
If A and B are as in Example 1, then A is diagonalizable, since it is similar to B.
Similar matrices have the same eigenvalues.
Let A and B be similar. Then $B = P^{-1}AP$ for some nonsingular matrix P. We prove that A
and B have the same characteristic polynomials, $f_A(\lambda)$ and $f_B(\lambda)$, respectively. We have
$f_B(\lambda) = \det(\lambda I_n - B) = \det(\lambda I_n - P^{-1}AP)$
$= \det(P^{-1}\lambda I_nP - P^{-1}AP) = \det(P^{-1}(\lambda I_n - A)P)$
$= \det(P^{-1}) \det(\lambda I_n - A) \det(P)$ (1)
$= \det(P^{-1}) \det(P) \det(\lambda I_n - A)$
$= \det(\lambda I_n - A) = f_A(\lambda)$.
Since $f_A(\lambda) = f_B(\lambda)$, it follows that A and B have the same eigenvalues.
It follows from Exercise 1 that the eigenvalues of a diagonal matrix are the entries on its
main diagonal. The following theorem establishes when a matrix is diagonalizable.
An $n \times n$ matrix A is diagonalizable if and only if it has n linearly independent eigenvectors.
We see that the eigenvectors of the matrix in Example 1 above are linearly independent.
If the roots of the characteristic polynomial of an $n \times n$ matrix A are all distinct, then A is
diagonalizable.
1. Let A be a 3x3 matrix whose eigenvalues are -3 4 and 4and associated eigenvectors
1 0 0
are 0 , 0 and 1 respectively. Find a diagonal matrix D that is similar to A.
1 1 1
5 3 1
2. Let A , find the matrix B such that P AP B where P is an invertible
3 5
matrix. What can you say about the matrices A and B?
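Given the quadratic form $q(x, y, z) = 2x^2 + 6xy + 5y^2 - 2yz + 2z^2$, we can rewrite the
quadratic form as $q(x, y, z) = 2x^2 + 3xy + 3yx + 5y^2 - yz - zy + 2z^2$.
Hence $A = \begin{pmatrix} 2 & 3 & 0 \\ 3 & 5 & -1 \\ 0 & -1 & 2 \end{pmatrix}$.
Now the characteristic polynomial is $\det(A - \lambda I) = -\lambda(\lambda - 2)(\lambda - 7)$ and hence A has the eigenvalues
$\lambda_1 = 2$, $\lambda_2 = 7$ and $\lambda_3 = 0$. The corresponding normalized eigenvectors are
$V_1 = \frac{1}{\sqrt{10}} \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix}$, $V_2 = \frac{1}{\sqrt{35}} \begin{pmatrix} 3 \\ 5 \\ -1 \end{pmatrix}$ and $V_3 = \frac{1}{\sqrt{14}} \begin{pmatrix} -3 \\ 2 \\ 1 \end{pmatrix}$.
$V_1 \cdot V_2 = 0$, $V_1 \cdot V_3 = 0$, $V_2 \cdot V_3 = 0$, so they are orthogonal.
$P = \begin{pmatrix} \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{35}} & -\frac{3}{\sqrt{14}} \\ 0 & \frac{5}{\sqrt{35}} & \frac{2}{\sqrt{14}} \\ \frac{3}{\sqrt{10}} & -\frac{1}{\sqrt{35}} & \frac{1}{\sqrt{14}} \end{pmatrix}$
Hence $B = P^TAP = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.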
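Orthogonal diagonalization of the matrix of a quadratic form can be checked with a symmetric eigen-solver. The sketch below assumes NumPy and takes the matrix of $q(x, y, z) = 2x^2 + 6xy + 5y^2 - 2yz + 2z^2$; `np.linalg.eigh` returns eigenvalues in ascending order, so its diagonal matrix lists them as 0, 2, 7, a different order from a hand computation, and the vector `x` is an arbitrary test point.

```python
import numpy as np

# Symmetric matrix of q(x, y, z) = 2x^2 + 6xy + 5y^2 - 2yz + 2z^2
A = np.array([[2.0, 3.0, 0.0],
              [3.0, 5.0, -1.0],
              [0.0, -1.0, 2.0]])

# eigh is intended for symmetric matrices; columns of Q are orthonormal eigenvectors.
eigenvalues, Q = np.linalg.eigh(A)
B = Q.T @ A @ Q

# With y = Q^T x, the form has no cross terms: q = sum(lambda_i * y_i^2).
x = np.array([1.0, -2.0, 3.0])
y = Q.T @ x
q_direct = x @ A @ x
q_diagonal = np.sum(eigenvalues * y ** 2)

print(np.round(eigenvalues, 10))
print(np.round(B, 10))
```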
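Given the quadratic form $q(x, y, z) = 5x^2 + 8xy + 5y^2 + 4xz + 4yz + 2z^2 = 100$, write
$q(x, y, z) = 5x^2 + 4xy + 4yx + 5y^2 + 2xz + 2zx + 2yz + 2zy + 2z^2 = 100$.
Hence the matrix of the quadratic form is
$A = \begin{pmatrix} 5 & 4 & 2 \\ 4 & 5 & 2 \\ 2 & 2 & 2 \end{pmatrix}$, i.e. $(x, y, z) \begin{pmatrix} 5 & 4 & 2 \\ 4 & 5 & 2 \\ 2 & 2 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 100$.
Diagonalizing A:
$\begin{vmatrix} 5 - \lambda & 4 & 2 \\ 4 & 5 - \lambda & 2 \\ 2 & 2 & 2 - \lambda \end{vmatrix} = 0$ gives $\lambda_1 = 1$, $\lambda_2 = 1$, $\lambda_3 = 10$.
The eigenvectors $(x_1, x_2, x_3)$ are found from
$(5 - \lambda)x_1 + 4x_2 + 2x_3 = 0$
$4x_1 + (5 - \lambda)x_2 + 2x_3 = 0$
$2x_1 + 2x_2 + (2 - \lambda)x_3 = 0$.
The eigenvectors are
$V_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{pmatrix}$, $V_2 = \begin{pmatrix} \frac{1}{3\sqrt{2}} \\ \frac{1}{3\sqrt{2}} \\ -\frac{4}{3\sqrt{2}} \end{pmatrix}$, $V_3 = \begin{pmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{pmatrix}$.
(i) They have been normalized. Because $\lambda = 1$ is a repeated eigenvalue, $V_1$ and $V_2$ have been chosen to be mutually orthogonal within its eigenspace.
(ii) $V_1 \cdot V_2 = 0$, $V_1 \cdot V_3 = 0$, $V_2 \cdot V_3 = 0$, so they are orthogonal.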
Let q be the quadratic form associated with the symmetric bilinear form f, that is, $q(v) = f(v, v)$. Show that f is given by the
polar form $f(u, v) = \frac{1}{2}[q(u + v) - q(u) - q(v)]$.
$q(u + v) - q(u) - q(v) = f(u + v, u + v) - f(u, u) - f(v, v)$
$= f(u, u) + f(u, v) + f(v, u) + f(v, v) - f(u, u) - f(v, v)$
$= 2f(u, v)$, since f is symmetric.
Assuming the characteristic of the field K is not two, we may divide by 2 to obtain $f(u, v) = \frac{1}{2}[q(u + v) - q(u) - q(v)]$.
Consider the quadratic forms K on a real inner product space V. Find a symmetric bilinear
form H s.t. $K(x) = H(x, x)$ for all $x \in V$. Then find an orthonormal basis for V s.t. the matrix of H in that basis is a
diagonal matrix.
(a) $K : \mathbb{R}^3 \to \mathbb{R}$ defined by $K(x, y, z) = 3x^2 + 3y^2 + 3z^2 - 2xz$
(b) $K : \mathbb{R}^2 \to \mathbb{R}$ defined by $K(x, y) = 7x^2 - 8xy + y^2$
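E.g. $A = \begin{pmatrix} 5 & 4 \\ 1 & 2 \end{pmatrix}$ has eigenvectors $\begin{pmatrix} 4 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
Hence $X = \begin{pmatrix} 4 & 1 \\ 1 & -1 \end{pmatrix}$ and
$X^{-1}AX = \begin{pmatrix} 0.2 & 0.2 \\ 0.2 & -0.8 \end{pmatrix} \begin{pmatrix} 5 & 4 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 4 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 0.2 & 0.2 \\ 0.2 & -0.8 \end{pmatrix} \begin{pmatrix} 24 & 1 \\ 6 & -1 \end{pmatrix} = \begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix}$.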
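Diagonalize the matrix
$A = \begin{pmatrix} 7.3 & 0.2 & -3.7 \\ -11.5 & 1.0 & 5.5 \\ 17.7 & 1.8 & -9.3 \end{pmatrix}$.
The characteristic equation is $-\lambda^3 - \lambda^2 + 12\lambda = 0$. Hence the roots (eigenvalues of A) are
$\lambda_1 = 3$, $\lambda_2 = -4$, $\lambda_3 = 0$. The corresponding eigenvectors are
$v_1 = \begin{pmatrix} -1 \\ 3 \\ -1 \end{pmatrix}$, $v_2 = \begin{pmatrix} 1 \\ -1 \\ 3 \end{pmatrix}$, $v_3 = \begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix}$, so that
$X = \begin{pmatrix} -1 & 1 & 2 \\ 3 & -1 & 1 \\ -1 & 3 & 4 \end{pmatrix}$, $X^{-1} = \begin{pmatrix} -0.7 & 0.2 & 0.3 \\ -1.3 & -0.2 & 0.7 \\ 0.8 & 0.2 & -0.2 \end{pmatrix}$
$D = X^{-1}AX = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 0 & 0 \end{pmatrix}$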
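Any quadratic form $q(x) = x^TAx$ in n variables is equivalent, by means of an orthogonal
matrix P (with $x = Py$), to a quadratic form $\lambda_1y_1^2 + \lambda_2y_2^2 + \dots + \lambda_ny_n^2$, where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of A.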
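Find out what type of conic section the following quadratic form represents and transform it
to principal axes: $Q = 17x_1^2 - 30x_1x_2 + 17x_2^2 = 128$.
We have $Q = x^TAx$, where $A = \begin{pmatrix} 17 & -15 \\ -15 & 17 \end{pmatrix}$.
The characteristic equation is $(17 - \lambda)^2 - 15^2 = 0$, so
$\lambda_1 = 2$, $\lambda_2 = 32$.
Hence $Q = 2y_1^2 + 32y_2^2$. But Q = 128, so
$2y_1^2 + 32y_2^2 = 128$, that is, $\frac{y_1^2}{8^2} + \frac{y_2^2}{2^2} = 1$,
which is an ellipse.
If we want to know the direction of the principal axes in the $x_1x_2$-coordinates, we normalize
the eigenvectors from $(A - \lambda I)x = 0$ with $\lambda = 2$ and $\lambda = 32$.
We get $\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$ and $\begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$.
Hence $x = Xy = \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$, that is,
$x_1 = y_1/\sqrt{2} - y_2/\sqrt{2}$
$x_2 = y_1/\sqrt{2} + y_2/\sqrt{2}$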
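Consider the quadratic form $3x^2 + 4xy + 8xz + 4zy + 3z^2 = 1$. This has the matrix representation
$(x, y, z) \begin{pmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ 4 & 2 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 1$.
The eigenvalues are $\lambda_1 = 8$, $\lambda_2 = -1$, $\lambda_3 = -1$ and the eigenvectors are
$v_1 = \begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix}$, $v_2 = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$, $v_3 = \begin{pmatrix} 0 \\ -2 \\ 1 \end{pmatrix}$.
These are not mutually orthogonal: $v_1 \cdot v_2 = 0$ and $v_1 \cdot v_3 = 0$, but $v_2 \cdot v_3 = 4$. Hence, even after normalization, they
cannot provide an orthogonal matrix.
We may use $C = (v_1\ v_2\ v_3)$ and $C^{-1}AC = D$. However, we cannot get a canonical
transformation this way. We can, however, construct from $(v_1, v_2, v_3)$ a new set of mutually orthogonal unit eigenvectors
$(u_1, u_2, u_3)$ and set $Q = (u_1\ u_2\ u_3)$; then $Q^TAQ = D$ and we can have a
canonical transformation from $X^TAX = X^TQDQ^TX = (Q^TX)^TD(Q^TX) = y^TDy$, where $y = Q^TX$.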
1. Find an orthogonal matrix Q that diagonalizes the symmetric matrix
$A = \begin{pmatrix} 5 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 5 \end{pmatrix}$
2. Consider the conic section whose equation is $q(x) = 2x^2 + 2xy + 2y^2 = 9$. Find the
canonical transformation.
3. Diagonalize the matrix of the quadratic form
$Q(x, y, z) = 3x^2 + 4xy + 8xz + 4yz + 3z^2$