Gaussian Elimination and its Variants

Matrix Multiplication

Abdellatif Serghini

Laurentian University

January 8th, 2025


Vectors and Matrices.

1. Vectors
2. Inner products
3. Matrices
4. Matrix-Vector product
5. Matrix-Matrix product
6. Block matrix and block matrix operations
7. Linear equations
Vectors

A vector of dimension n is an ordered collection of n elements, which
are called components.
◮ A column vector

    x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}

◮ Also written as a row vector: x = (x_1, x_2, \cdots, x_n)
◮ The set of n-vectors is denoted R^n
◮ x_i: ith element or component of x
Special vectors:
◮ x = 0 (zero vector): x_i = 0, i = 1, \cdots, n
◮ x = e_i (ith basis vector or ith unit vector): x_i = 1, x_k = 0 for k ≠ i
Vector operations

◮ Scalar multiplication of a vector x with a scalar \alpha

    \alpha x = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix}

◮ Addition and subtraction of two n-vectors x, y

    x + y = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}, \quad
    x - y = \begin{bmatrix} x_1 - y_1 \\ x_2 - y_2 \\ \vdots \\ x_n - y_n \end{bmatrix}
Example of vectors

Image: N × N pixels, numbered 1 to N^2 row by row:

    \begin{matrix}
    1 & 2 & 3 & \cdots & N \\
    N+1 & N+2 & N+3 & \cdots & 2N \\
    2N+1 & 2N+2 & 2N+3 & \cdots & 3N \\
    \vdots & \vdots & \vdots & & \vdots \\
    N^2-N+1 & N^2-N+2 & N^2-N+3 & \cdots & N^2
    \end{matrix}

◮ Grayscale image: can be represented by a vector x of size
n = N^2; x_i is the grayscale level of pixel i
◮ Grayscale is a range of shades of gray without apparent color.
◮ Black is represented by 0, and white is represented by 255.
◮ Color image: the color of each pixel is given by three values (the
amount of red, green and blue); the image can be represented by
three vectors x, y, z of size N^2. Thus a color image uses three
times as much memory as a grayscale image.
Inner products.

The inner product of two n-vectors x, y is defined as

    x^T y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n

where x^T is the transpose of the column vector x (the transpose of a
column vector is a row vector and vice versa).
Properties
◮ (\alpha x)^T y = \alpha (x^T y) for all scalars \alpha
◮ (x + y)^T z = (x^T z) + (y^T z)
◮ x^T y = y^T x
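Translating the definition into code (a minimal sketch in Python; the function name `inner` and the sample vectors are chosen here for illustration):

```python
def inner(x, y):
    """Inner product x^T y of two n-vectors, following the definition."""
    assert len(x) == len(y), "dimensions must agree"
    return sum(xi * yi for xi, yi in zip(x, y))

x, y, z = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]
print(inner(x, y))                        # 1*4 + 2*5 + 3*6 = 32.0
# The listed properties can be checked directly, e.g. symmetry and additivity:
print(inner(x, y) == inner(y, x))                              # True
print(inner([a + b for a, b in zip(x, y)], z)
      == inner(x, z) + inner(y, z))                            # True
```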
Matrices

m × n-matrix A:

    A_{m,n} = \begin{bmatrix}
    a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
    a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
    \vdots  & \vdots  & \ddots & \vdots  \\
    a_{m,1} & a_{m,2} & \cdots & a_{m,n}
    \end{bmatrix}

◮ a_{i,j} are the elements or coefficients
◮ the set of m × n-matrices is denoted R^{m×n}
◮ m, n are the dimensions
Special matrices
◮ A = 0 (zero matrix): a_{i,j} = 0 for i = 1, \cdots, m, j = 1, \cdots, n
◮ A = I (identity matrix): m = n, a_{i,i} = 1, a_{i,j} = 0 for i ≠ j
Matrix transpose

m × n-matrix A:

    A^T_{n,m} = \begin{bmatrix}
    a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
    a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
    \vdots  & \vdots  & \ddots & \vdots  \\
    a_{1,n} & a_{2,n} & \cdots & a_{m,n}
    \end{bmatrix}

Properties
◮ A^T is n × m if A is m × n
◮ (A^T)^T = A
◮ a square matrix A is symmetric if A = A^T, i.e., a_{i,j} = a_{j,i}
Scalar multiplication and addition

◮ Scalar multiplication of an m × n-matrix A with a scalar \alpha

    \alpha A_{m,n} = \begin{bmatrix}
    \alpha a_{1,1} & \alpha a_{1,2} & \cdots & \alpha a_{1,n} \\
    \alpha a_{2,1} & \alpha a_{2,2} & \cdots & \alpha a_{2,n} \\
    \vdots         & \vdots         & \ddots & \vdots         \\
    \alpha a_{m,1} & \alpha a_{m,2} & \cdots & \alpha a_{m,n}
    \end{bmatrix}

◮ Addition and subtraction of two m × n-matrices A and B

    A_{m,n} \pm B_{m,n} = \begin{bmatrix}
    a_{1,1} \pm b_{1,1} & a_{1,2} \pm b_{1,2} & \cdots & a_{1,n} \pm b_{1,n} \\
    a_{2,1} \pm b_{2,1} & a_{2,2} \pm b_{2,2} & \cdots & a_{2,n} \pm b_{2,n} \\
    \vdots              & \vdots              & \ddots & \vdots              \\
    a_{m,1} \pm b_{m,1} & a_{m,2} \pm b_{m,2} & \cdots & a_{m,n} \pm b_{m,n}
    \end{bmatrix}
Matrix-vector product

Product of m × n-matrix A with n-vector x

    A_{m,n} x = \begin{bmatrix}
    a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n \\
    a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n \\
    \vdots \\
    a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n
    \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} = b

◮ Dimensions must be compatible: the number of columns of the
matrix A (n) must equal the number of elements of the vector x (n)
◮ the number of elements of the vector b is m
◮ The ith component of b is:

    b_i = a_{i,1} x_1 + a_{i,2} x_2 + \cdots + a_{i,n} x_n = \sum_{j=1}^{n} a_{i,j} x_j
Matrix-vector product

From the equation

    b_i = a_{i,1} x_1 + a_{i,2} x_2 + \cdots + a_{i,n} x_n = \sum_{j=1}^{n} a_{i,j} x_j

we can obtain a computer pseudocode to perform matrix-vector
multiplication (note that each b_i must be set to zero before its sum
is accumulated):

for i = 1 → m do
    b_i ← 0
    for j = 1 → n do
        b_i ← b_i + a_{i,j} x_j
    end for
end for
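The pseudocode translates directly into runnable Python (a minimal sketch; the name `matvec` and the sample data are chosen here for illustration, with matrices stored as lists of rows):

```python
def matvec(A, x):
    """Compute b = A x for an m x n matrix A (list of m rows) and n-vector x."""
    m, n = len(A), len(x)
    b = [0.0] * m
    for i in range(m):              # b_i <- 0, then accumulate the row sum
        for j in range(n):
            b[i] += A[i][j] * x[j]
    return b

A = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]                    # a 3 x 2 matrix
x = [1.0, 1.0]
print(matvec(A, x))                 # [3.0, 7.0, 11.0]
```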
Matrix-vector product

◮ Another way to view matrix-vector multiplication:

    A_{m,n} x = \begin{bmatrix}
    a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n \\
    a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n \\
    \vdots \\
    a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n
    \end{bmatrix}
    = \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{m,1} \end{bmatrix} x_1
    + \begin{bmatrix} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{m,2} \end{bmatrix} x_2
    + \cdots
    + \begin{bmatrix} a_{1,n} \\ a_{2,n} \\ \vdots \\ a_{m,n} \end{bmatrix} x_n
    = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}

◮ b is a linear combination of the columns of A
◮ If we let A_j denote the jth column of A, we have

    b = \sum_{j=1}^{n} A_j x_j
The cost of matrix-vector operations
Evaluated by counting the number of floating-point operations (real
numbers are stored in computers in a floating-point format)
◮ one floating-point operation (flop): +, −, /, or ×
Example: Matrix-vector product b = Ax for A_{m×n}

for i = 1 → m do
    b_i ← 0
    for j = 1 → n do
        b_i ← b_i + a_{i,j} x_j
    end for
end for

◮ b_i + a_{i,j} x_j involves two flops (one multiplication and one addition)
◮ the inner loop

    for j = 1 → n do
        b_i ← b_i + a_{i,j} x_j
    end for

involves \sum_{j=1}^{n} 2 = 2n flops
◮ to calculate b = Ax, the total number of flops is

    \sum_{i=1}^{m} 2n = m × (2n) = 2mn
Matrix-matrix product.
If A is m × n and B is n × p, the product C = AB is m × p.
The entry (i, j) of C is

    c_{i,j} = \sum_{k=1}^{n} a_{i,k} b_{k,j}

= the dot product of the ith row of A with the jth column of B
Example: A computer program to multiply A by B

C ← 0
for i = 1 → m do
    for j = 1 → p do
        for k = 1 → n do
            c_{i,j} ← c_{i,j} + a_{i,k} b_{k,j}
        end for
    end for
end for

◮ Cost of matrix-matrix operations:

    \sum_{i=1}^{m} \sum_{j=1}^{p} \sum_{k=1}^{n} 2 = m × (p × (2n)) = 2mnp
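The triple loop above can be sketched in Python as follows (the name `matmul` and the 2 × 2 example are chosen here for illustration):

```python
def matmul(A, B):
    """Compute C = A B for an m x n matrix A and an n x p matrix B."""
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "inner dimensions must agree"
    C = [[0.0] * p for _ in range(m)]      # C <- 0
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1.0, 2.0],
     [3.0, 4.0]]
B = [[5.0, 6.0],
     [7.0, 8.0]]
print(matmul(A, B))    # [[19.0, 22.0], [43.0, 50.0]]
```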
Block matrix notation.

    B = \begin{bmatrix} 2 & 2 \\ 1 & 3 \end{bmatrix}, \quad
    C = \begin{bmatrix} 0 & 2 & 3 \\ 5 & 4 & 7 \end{bmatrix}, \quad
    D = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad
    E = \begin{bmatrix} -1 & 6 & 0 \end{bmatrix}

    A = \begin{bmatrix} B & C \\ D & E \end{bmatrix}
      = \begin{bmatrix} 2 & 2 & 0 & 2 & 3 \\ 1 & 3 & 5 & 4 & 7 \\ 1 & 0 & -1 & 6 & 0 \end{bmatrix}

◮ A is a block matrix; B, C, D, E are the blocks of A


◮ Dimensions of the blocks must be compatible: B and D have the
same number of columns; B and C have the same number of
rows, etc.
◮ Partitioning matrices into blocks is a useful tool for: constructing
algorithms, developing faster variants of algorithms and proving
theorems.
Block matrix notation.

◮ Adding block matrices


    \begin{bmatrix} A & B \\ C & D \end{bmatrix} +
    \begin{bmatrix} \tilde{A} & \tilde{B} \\ \tilde{C} & \tilde{D} \end{bmatrix} =
    \begin{bmatrix} A + \tilde{A} & B + \tilde{B} \\ C + \tilde{C} & D + \tilde{D} \end{bmatrix}

dimensions must conform (the additions must make sense)

◮ Multiplying block matrices

    \begin{bmatrix} A & B \\ C & D \end{bmatrix}
    \begin{bmatrix} W & X \\ Y & Z \end{bmatrix} =
    \begin{bmatrix} AW + BY & AX + BZ \\ CW + DY & CX + DZ \end{bmatrix}

dimensions must conform (the multiplications and additions must make sense)
Linear equations

m equations in n variables x_1, x_2, \cdots, x_n:

    a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n = b_1
    a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n = b_2
    \vdots
    a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n = b_m

in matrix form: Ax = b, where:

    A = \begin{bmatrix}
    a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
    a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
    \vdots  & \vdots  & \ddots & \vdots  \\
    a_{m,1} & a_{m,2} & \cdots & a_{m,n}
    \end{bmatrix}, \quad
    x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad
    b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
Curve interpolation

◮ Curve interpolation is a problem that arises frequently in
computer graphics and in robotics (path planning).
◮ Interpolation problems require finding curves passing through
some given data points and possibly satisfying some extra
constraints.
◮ There are a number of interpolation problems; we consider one of
the most common, which can be stated as follows:
Problem: Given n data points (t_1, y_1), \ldots, (t_n, y_n), find a curve p
such that p(t_i) = y_i for all i, 1 ≤ i ≤ n.
Polynomial interpolation

Fit a polynomial

    p(t) = x_1 + x_2 t + x_3 t^2 + \cdots + x_n t^{n-1}

through n points (t_1, y_1), \cdots, (t_n, y_n)

◮ problem data: t_1, t_2, \cdots, t_n, y_1, y_2, \cdots, y_n
◮ problem variables: x_1, x_2, \cdots, x_n
Write out the conditions on x:

    p(t_1) = x_1 + x_2 t_1 + \cdots + x_n t_1^{n-1} = y_1
    p(t_2) = x_1 + x_2 t_2 + \cdots + x_n t_2^{n-1} = y_2
    \vdots
    p(t_n) = x_1 + x_2 t_n + \cdots + x_n t_n^{n-1} = y_n

n linear equations in n variables, in matrix form: Ax = b, where:

    A = \begin{bmatrix}
    1 & t_1 & t_1^2 & \cdots & t_1^{n-1} \\
    1 & t_2 & t_2^2 & \cdots & t_2^{n-1} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    1 & t_n & t_n^2 & \cdots & t_n^{n-1}
    \end{bmatrix}, \quad
    x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad
    b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
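To see the interpolation problem end to end, the sketch below builds the matrix A and solves Ax = b with a small Gaussian elimination routine with partial pivoting (a generic illustrative implementation, not code from the slides; the sample data points are also chosen here), then checks p(t_i) = y_i:

```python
def solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting (illustrative)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]          # augmented matrix [A | b]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):                         # eliminate below the pivot
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                        # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

t = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 5.0, 10.0]                      # sample data (here y = t^2 + 1)
n = len(t)
A = [[ti ** j for j in range(n)] for ti in t]  # rows [1, t_i, t_i^2, ..., t_i^{n-1}]
x = solve(A, y)
p = lambda s: sum(x[j] * s ** j for j in range(n))
print([round(p(ti), 10) for ti in t])          # [1.0, 2.0, 5.0, 10.0] — matches y
```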
Electrical circuits

Consider the electrical circuit (see page 14). Suppose all the voltage
and currents are constant. We need to determine the four nodal
voltages x1 , x2 , x3 , x4 .
◮ Kirchhoff’s current law: the sum of the currents away from a
node must be zero (\sum I_{in} = \sum I_{out} at any node).
◮ Ohm’s law: V = IR, the voltage drop (volts) is equal to the
current (amperes) times the resistance (ohms),
where I is the current and R the resistance.
Write out the conditions on x:
I_{3,4}: the current from node 3 to node 4 through the 5 ohm resistor; by
Ohm’s law we obtain x_3 − x_4 = 5 I_{3,4}, or I_{3,4} = (1/5)(x_3 − x_4)
I_{1,3}: the current from node 1 to node 3 through the 1 ohm resistor; by
Ohm’s law we obtain x_1 − x_3 = I_{1,3}
I_{5,3}: the current from node 5 to node 3 through the 2 ohm resistor; by
Ohm’s law we obtain x_5 − x_3 = 2 I_{5,3}
I_{5,3} + I_{1,3} = I_{3,4} (Kirchhoff’s current law), or
(1/2)(x_5 − x_3) + (x_1 − x_3) = (1/5)(x_3 − x_4), with x_5 = 6V (see Equation 3
below)
Applying the same procedure to nodes 1, 2 and 4, we get the
following system (4 linear equations in 4 unknowns)

    2x_1 − x_2 − x_3 = 0
    −x_1 + 1.5x_2 − 0.5x_4 = 0
    −x_1 + 1.7x_3 − 0.2x_4 = 3
    −0.5x_2 − 0.2x_3 + 1.7x_4 = 0

The above system can be written in matrix form Ax = b with
b = [0, 0, 3, 0]^T. The system has exactly one solution (the matrix A
is nonsingular).
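As a check, the system can be solved numerically; the sketch below uses a generic Gaussian elimination routine with partial pivoting (an illustrative implementation, not code from the slides) and verifies that Ax reproduces b:

```python
def solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting (illustrative)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]          # augmented matrix [A | b]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                        # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

A = [[ 2.0, -1.0, -1.0,  0.0],
     [-1.0,  1.5,  0.0, -0.5],
     [-1.0,  0.0,  1.7, -0.2],
     [ 0.0, -0.5, -0.2,  1.7]]
b = [0.0, 0.0, 3.0, 0.0]
x = solve(A, b)
# Residual check: Ax should reproduce b (to rounding error)
residual = [sum(A[i][j] * x[j] for j in range(4)) - b[i] for i in range(4)]
print(max(abs(r) for r in residual) < 1e-12)   # True
```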
Nonsingularity and Uniqueness of Solutions

Recall that the n × n identity matrix is denoted by I (it is the unique
matrix such that AI = IA = A for every n × n matrix A).
For a given matrix A, if there is a unique matrix B such that
AB = BA = I, then B is called the inverse of A and denoted by A^{−1}.
Theorem
◮ A^{−1} exists
◮ There is no nonzero x such that Ax = 0
◮ The columns of A are linearly independent
◮ The rows of A are linearly independent
◮ det(A) ≠ 0 (the determinant of A is nonzero)
◮ Given any vector b, there is exactly one vector x such that
Ax = b
The six conditions above are equivalent: if any one holds, they all
hold.
Numerical Solution of Differential Equations
Differential equations are widely applied to model natural
phenomena, engineering systems and many other situations. Solving
differential equations is a fundamental problem in science and
engineering.
Consider a differential equation

    −u''(x) + c u'(x) + d u(x) = f(x),   0 < x < 1,

with boundary conditions

    u(0) = 0,  u(1) = 0.

The problem is to solve for the function u, given the constants c and
d and the function f.
◮ The finite difference method obtains an approximate solution for
u(x) at a finite set of x values.
◮ Here we take the discrete x uniformly spaced in the
interval 0 ≤ x ≤ 1, such that

    x_j = jh,  j = 0, 1, \cdots, m

where the m + 1 nodes x_0, \cdots, x_m include those on the boundary.
Given m, the spacing between the x_j is h = 1/m.
Numerical Solution of Differential Equations

◮ The finite difference method involves using discrete
approximations like

    u'(x_i) \simeq \frac{u(x_i + h) − u(x_i − h)}{2h} = \frac{u(x_{i+1}) − u(x_{i−1})}{2h}

and

    u''(x_i) \simeq \frac{u(x_i + h) − 2u(x_i) + u(x_i − h)}{h^2} = \frac{u(x_{i+1}) − 2u(x_i) + u(x_{i−1})}{h^2}

◮ Now, since the differential equation holds at each grid point, we
have

    −u''(x_i) + c u'(x_i) + d u(x_i) = f(x_i)

for i = 1, 2, \cdots, m − 1
Numerical Solution of Differential Equations

◮ Substituting these approximations for the derivatives into the
differential equation, we obtain

    −\frac{u(x_{i+1}) − 2u(x_i) + u(x_{i−1})}{h^2} + c \frac{u(x_{i+1}) − u(x_{i−1})}{2h} + d u(x_i) \simeq f(x_i)

for i = 1, 2, \cdots, m − 1
◮ Next, we replace u(x_i) by the symbol u_i, an approximation of
u(x_i), and f(x_i) by f_i, to get

    −\frac{u_{i+1} − 2u_i + u_{i−1}}{h^2} + c \frac{u_{i+1} − u_{i−1}}{2h} + d u_i = f_i

◮ We obtain a linear system of m − 1 equations in the unknowns
u_0, u_1, \cdots, u_m; but u_0 = u(0) = 0 and u_m = u(x_m) = 0, leaving
m − 1 unknowns (u_1, u_2, \cdots, u_{m−1})
Numerical Solution of Differential Equations

For m = 6 we get a system of 5 equations in 5 unknowns. The
system of equations can be represented in matrix form as

    \begin{bmatrix}
    a   & b_1 & 0   & 0   & 0   \\
    b_2 & a   & b_1 & 0   & 0   \\
    0   & b_2 & a   & b_1 & 0   \\
    0   & 0   & b_2 & a   & b_1 \\
    0   & 0   & 0   & b_2 & a
    \end{bmatrix}
    \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \end{bmatrix} =
    \begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \end{bmatrix}

where a = 72 + d, b_1 = −36 + 3c and b_2 = −36 − 3c (since 1/h^2 = 36
and 1/(2h) = 3 for h = 1/6).
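A quick sanity check of these coefficients (choosing c = 0, d = 0 and f ≡ 1 purely for illustration): the exact solution of −u'' = 1 with u(0) = u(1) = 0 is u(x) = x(1 − x)/2, and since u is quadratic the centered differences are exact, so the grid values satisfy each interior equation to rounding error:

```python
m, h = 6, 1.0 / 6.0
c, d = 0.0, 0.0                           # illustrative choice; f(x) = 1
a  = 2.0 / h**2 + d                       # = 72 + d
b1 = -1.0 / h**2 + c / (2.0 * h)          # = -36 + 3c
b2 = -1.0 / h**2 - c / (2.0 * h)          # = -36 - 3c

x = [j * h for j in range(m + 1)]
u = [xi * (1.0 - xi) / 2.0 for xi in x]   # exact solution of -u'' = 1

# Each interior equation b2*u_{i-1} + a*u_i + b1*u_{i+1} = f_i = 1 holds:
residuals = [b2 * u[i - 1] + a * u[i] + b1 * u[i + 1] - 1.0 for i in range(1, m)]
print(max(abs(r) for r in residuals) < 1e-10)   # True
```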


Triangular Systems

◮ A square matrix A is lower triangular if a_{i,j} = 0 for j > i

    A_{n,n} = \begin{bmatrix}
    a_{1,1}   & 0         & \cdots & 0           & 0       \\
    a_{2,1}   & a_{2,2}   & \cdots & 0           & 0       \\
    \vdots    & \vdots    & \ddots & 0           & 0       \\
    a_{n-1,1} & a_{n-1,2} & \cdots & a_{n-1,n-1} & 0       \\
    a_{n,1}   & a_{n,2}   & \cdots & a_{n,n-1}   & a_{n,n}
    \end{bmatrix}

◮ A is upper triangular if A^T is lower triangular (i.e., if a_{i,j} = 0 for j < i)
◮ A triangular matrix is nonsingular if and only if its diagonal
elements are nonzero (a_{i,i} ≠ 0, i = 1, 2, \cdots, n).
◮ A triangular system is a system of the form Ax = b with A a
triangular matrix.
Solving a lower triangular system

Example
Solve Ax = b with A lower triangular and nonsingular:

    \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1 & 1 & 1 \end{bmatrix}
    \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} =
    \begin{bmatrix} -1 \\ -7 \\ -6 \end{bmatrix}

Using forward substitution:

    x_1 = −1
    x_2 = −7 + 2x_1 = −9
    x_3 = −6 + x_1 − x_2 = 2
Solving a lower triangular system
Solve Ax = b with A lower triangular and nonsingular:

    \begin{bmatrix}
    a_{1,1} & 0       & \cdots & 0       \\
    a_{2,1} & a_{2,2} & \cdots & 0       \\
    \vdots  & \vdots  & \ddots & \vdots  \\
    a_{n,1} & a_{n,2} & \cdots & a_{n,n}
    \end{bmatrix}
    \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} =
    \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}

Algorithm

    x_1 = b_1 / a_{1,1}
    x_2 = (b_2 − a_{2,1} x_1) / a_{2,2}
    x_3 = (b_3 − a_{3,1} x_1 − a_{3,2} x_2) / a_{3,3}
    \vdots
    x_i = (b_i − a_{i,1} x_1 − a_{i,2} x_2 − \cdots − a_{i,i-1} x_{i-1}) / a_{i,i}
        = a_{i,i}^{-1} (b_i − \sum_{j=1}^{i-1} a_{i,j} x_j)
    \vdots
    x_n = (b_n − a_{n,1} x_1 − a_{n,2} x_2 − \cdots − a_{n,n-1} x_{n-1}) / a_{n,n}
Algorithm

It is usual for a computer to store x over b, since once we have
calculated x_i we no longer need b_i (b_1 is only used in calculating x_1,
b_2 is only used in calculating x_2, and so on).

for i = 1 → n do
    for j = 1 → i − 1 do
        b_i ← b_i − a_{i,j} b_j
    end for
    if a_{i,i} = 0 then
        set error, exit
    end if
    b_i ← b_i / a_{i,i}
end for

Cost

    \sum_{i=1}^{n} \sum_{j=1}^{i-1} 2 = \sum_{i=1}^{n} 2(i − 1) = n(n − 1) = n^2 − n
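The in-place algorithm above in Python (a minimal sketch; `forward_substitution` is a name chosen here), checked on the 3 × 3 example solved earlier by hand:

```python
def forward_substitution(A, b):
    """Solve Ax = b for lower-triangular A, overwriting b with x (in place)."""
    n = len(b)
    for i in range(n):
        for j in range(i):
            b[i] -= A[i][j] * b[j]      # b_i <- b_i - a_{i,j} b_j
        if A[i][i] == 0:
            raise ZeroDivisionError("matrix is singular")
        b[i] /= A[i][i]
    return b

# The 3 x 3 example from the slides:
A = [[ 1.0, 0.0, 0.0],
     [-2.0, 1.0, 0.0],
     [-1.0, 1.0, 1.0]]
b = [-1.0, -7.0, -6.0]
print(forward_substitution(A, b))       # [-1.0, -9.0, 2.0]
```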
Algorithm

Cost
◮ The operations performed outside the j loop: a_{i,i} is
compared with zero n times (the test a_{i,i} = 0), and there are n
divisions (b_i / a_{i,i}). The total is proportional to n.
◮ Terms that are proportional to n are insignificant
(compared to n^2) if n is large enough.
◮ We ignore the lesser costs: the cost of doing forward substitution
is n^2 flops (n^2 − n ≃ n^2).
Algorithm

Cost
◮ With the operation count going like O(n^2), we can write

    T_n = kn^2

for some unknown constant k.
◮ To determine how long solving A_{2n,2n} x = b will take, we write

    T_{2n} = k(2n)^2 = 4kn^2 = 4T_n

◮ Doubling the size of the matrix is expected to increase the
computational time by a factor of 2^2 = 4.
Solving an upper triangular system
Solve Ax = b with A upper triangular and nonsingular:

    \begin{bmatrix}
    a_{1,1} & \cdots & a_{1,n-1}   & a_{1,n}   \\
    \vdots  & \ddots & \vdots      & \vdots    \\
    0       & \cdots & a_{n-1,n-1} & a_{n-1,n} \\
    0       & \cdots & 0           & a_{n,n}
    \end{bmatrix}
    \begin{bmatrix} x_1 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix} =
    \begin{bmatrix} b_1 \\ \vdots \\ b_{n-1} \\ b_n \end{bmatrix}

Algorithm

    x_n = b_n / a_{n,n}
    x_{n-1} = (b_{n-1} − a_{n-1,n} x_n) / a_{n-1,n-1}
    x_{n-2} = (b_{n-2} − a_{n-2,n-1} x_{n-1} − a_{n-2,n} x_n) / a_{n-2,n-2}
    \vdots
    x_1 = (b_1 − a_{1,2} x_2 − a_{1,3} x_3 − \cdots − a_{1,n} x_n) / a_{1,1}

Cost: n^2 flops (by the same count as forward substitution)
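The same idea in Python for the upper-triangular case (a minimal sketch; `back_substitution` and the 3 × 3 example are chosen here for illustration):

```python
def back_substitution(A, b):
    """Solve Ax = b for upper-triangular A, overwriting b with x (in place)."""
    n = len(b)
    for i in range(n - 1, -1, -1):      # work upward from the last equation
        for j in range(i + 1, n):
            b[i] -= A[i][j] * b[j]
        if A[i][i] == 0:
            raise ZeroDivisionError("matrix is singular")
        b[i] /= A[i][i]
    return b

A = [[2.0, 1.0, 1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 4.0]]
b = [9.0, 7.0, 8.0]
print(back_substitution(A, b))          # [3.0, 1.0, 2.0]
```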
Positive Definite Systems; Cholesky Decomposition
Cholesky decomposition is widely used in many fields; one example is
the Relevance Vector Machine (RVM) algorithm [M. E. Tipping,
Sparse Bayesian Learning and the Relevance Vector Machine,
Journal of Machine Learning Research, vol. 1, pp. 211-244, 2001].
The RVM is used in areas such as machine learning and image
pattern recognition, and involves intensive matrix operations such as
matrix-matrix multiplication, Cholesky decomposition, and matrix
inversion.
Positive Definite Matrix
◮ A is a positive definite matrix if A is symmetric and

    x^T A x > 0 for all x ≠ 0

◮ If A is symmetric of order n, then

    x^T A x = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i,j} x_i x_j
            = \sum_{i=1}^{n} a_{i,i} x_i^2 + 2 \sum_{i>j} a_{i,j} x_i x_j
Positive Definite Systems; Cholesky Decomposition

Examples
◮
    A = \begin{bmatrix} 9 & 6 \\ 6 & 5 \end{bmatrix}, \quad
    B = \begin{bmatrix} 9 & 6 \\ 6 & 3 \end{bmatrix}

◮ A is positive definite:

    x^T A x = 9x_1^2 + 12x_1 x_2 + 5x_2^2 = (3x_1 + 2x_2)^2 + x_2^2

◮ B is not positive definite:

    x^T B x = 9x_1^2 + 12x_1 x_2 + 3x_2^2 = (3x_1 + 2x_2)^2 − x_2^2

Positive Definite Systems; Cholesky Decomposition

Theorem
Let M be any n × n nonsingular matrix, and let A = M^T M. Then A is
positive definite.
Proof
◮ A^T = (M^T M)^T = M^T (M^T)^T = M^T M = A (A is symmetric)
◮ We must show x^T A x > 0 for all x ≠ 0:

    x^T A x = x^T (M^T M) x = (Mx)^T Mx

Let y = Mx; then x^T A x = y^T y = \sum_{i=1}^{n} y_i^2 > 0 for all y ≠ 0.
◮ But y = Mx ≠ 0, since M is nonsingular and x ≠ 0.
◮ Then x^T A x > 0 for all nonzero x.

Theorem
If A is positive definite, then A is nonsingular.
Proof
See p. 34
Cholesky Decomposition

Theorem
If A is positive definite, then A can be decomposed in exactly one way
into a product

    A = R^T R

such that R is upper triangular with r_{i,i} > 0 (all main diagonal
entries are positive)
Proof
See p. 34
◮ Cholesky decomposition is useful because R and R^T are
triangular
◮ To solve Ax = b with A = R^T R:
◮ Let y = Rx
◮ We have R^T y = b; since R^T is lower triangular, we can solve for y
(by forward substitution)
◮ Now from Rx = y, with R upper triangular, we can solve for x (by
back substitution)
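A minimal sketch of this two-triangular-solve strategy, using the positive definite matrix A = [[9, 6], [6, 5]] from the earlier examples; the factorization loop below is the standard Cholesky recurrence (stated here without derivation), and `cholesky`/`solve_spd` are names chosen for illustration:

```python
def cholesky(A):
    """Return upper-triangular R with A = R^T R (standard Cholesky recurrence)."""
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s = A[i][i] - sum(R[k][i] ** 2 for k in range(i))
        R[i][i] = s ** 0.5                  # r_{i,i} > 0 when A is positive definite
        for j in range(i + 1, n):
            R[i][j] = (A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))) / R[i][i]
    return R

def solve_spd(A, b):
    """Solve Ax = b via A = R^T R: forward solve R^T y = b, then back solve Rx = y."""
    R = cholesky(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):                      # forward substitution with R^T
        y[i] = (b[i] - sum(R[k][i] * y[k] for k in range(i))) / R[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # back substitution with R
        x[i] = (y[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x

A = [[9.0, 6.0], [6.0, 5.0]]
print(cholesky(A))                 # [[3.0, 2.0], [0.0, 1.0]], matching (3x1+2x2)^2 + x2^2
print(solve_spd(A, [9.0, 6.0]))    # [1.0, 0.0], since A [1, 0]^T = [9, 6]^T
```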
Cholesky Decomposition

Algorithm
Write the decomposition A = R^T R entrywise:

    \begin{bmatrix}
    a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,n} \\
    a_{2,1} & a_{2,2} & a_{2,3} & \cdots & a_{2,n} \\
    a_{3,1} & a_{3,2} & a_{3,3} & \cdots & a_{3,n} \\
    \vdots  & \vdots  & \vdots  & \ddots & \vdots  \\
    a_{n,1} & a_{n,2} & a_{n,3} & \cdots & a_{n,n}
    \end{bmatrix} =
    \begin{bmatrix}
    r_{1,1} & 0       & 0       & \cdots & 0 \\
    r_{1,2} & r_{2,2} & 0       & \cdots & 0 \\
    r_{1,3} & r_{2,3} & r_{3,3} & \cdots & 0 \\
    \vdots  & \vdots  & \vdots  & \ddots & \vdots \\
    r_{1,n} & r_{2,n} & r_{3,n} & \cdots & r_{n,n}
    \end{bmatrix}
    \begin{bmatrix}
    r_{1,1} & r_{1,2} & r_{1,3} & \cdots & r_{1,n} \\
    0       & r_{2,2} & r_{2,3} & \cdots & r_{2,n} \\
    0       & 0       & r_{3,3} & \cdots & r_{3,n} \\
    \vdots  & \vdots  & \vdots  & \ddots & \vdots  \\
    0       & 0       & 0       & \cdots & r_{n,n}
    \end{bmatrix}
