MAT 461/561: 3.2 LU Decomposition: Announcements
James V. Lambers
Announcements
Homework submission:
Introduction
Current procedure for solving Ax = b (MAT 326-style):
1. Form the augmented matrix [A | b]
2. Use row operations (Gaussian elimination) to reduce augmented matrix to upper triangular
form U (row-echelon form)
3. Use back substitution on U and modified b
The choice of row operations performed on A and b is determined solely by A.
Ideally, we should be able to re-use this information if solving Ax2 = b2.
To do this, we need a "record" of the row operations performed on A to use later with a new b.
Gaussian elimination on A (not b) is by far the most expensive part of solving Ax = b, so it should only have to be performed once, but we need to "store" that sequence of row operations.
To do this, we will examine Gaussian elimination without involving b
This is useful when solving many systems with the same A but different b. One example of when this happens: using an implicit time-stepping method to solve a linear system of ODEs with a time-independent Jacobian (y' = Jy), as will be seen in Chapter 12.
Derivation of the LU Decomposition
Elementary Row Matrices
Recall elementary row operations: 1) scaling rows, 2) interchanging rows, 3) subtracting a multiple
of one row from another
All of these can be described by multiplying A on the left by elementary row matrices:
• Interchanging rows 1 and 3:

         [ 0 0 1 0 ] [  1  2 −1  3 ]   [ 10  3 −4  2 ]
    EA = [ 0 1 0 0 ] [  4 −7  6 −5 ] = [  4 −7  6 −5 ]
         [ 1 0 0 0 ] [ 10  3 −4  2 ]   [  1  2 −1  3 ]
         [ 0 0 0 1 ] [  1  0 −6  8 ]   [  1  0 −6  8 ]
If we can easily describe each individual row operation, can we efficiently (in terms of time and
space) represent ALL of the operations needed to carry out Gaussian elimination?
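The interchange example above is easy to check by direct computation. The following is a minimal sketch in plain Python using nested lists for matrices (the course itself uses MATLAB; the helper `matmul` is my own, not from the notes):

```python
# Multiply the elementary interchange matrix E (swapping rows 1 and 3,
# in 1-based indexing) against the 4x4 matrix A from the example above.

def matmul(X, Y):
    """Naive matrix product of two lists-of-lists."""
    n, m, p = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

E = [[0, 0, 1, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1]]

A = [[1, 2, -1, 3],
     [4, -7, 6, -5],
     [10, 3, -4, 2],
     [1, 0, -6, 8]]

EA = matmul(E, A)
print(EA[0])  # first row of EA is the third row of A: [10, 3, -4, 2]
```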
More concisely,

    M^(n−1) · · · M^(2) M^(1) A = U

where M^(j) is the product of the elementary row matrices that eliminate the subdiagonal entries in column j. That is,

    M^(j) = E_{n,j} · · · E_{j+2,j} E_{j+1,j}
We know that E_{ij} is the identity matrix I, with −m_ij in row i, column j. What does M^(j) look like? It is the identity matrix I, with −m_ij in the (i, j) position, for i = j + 1, . . . , n.
Example:

    [  1 0 0 0 ] [ 1 0 0 0 ] [  1 0 0 0 ]   [  1 0 0 0 ]
    [  0 1 0 0 ] [ 0 1 0 0 ] [ −3 1 0 0 ]   [ −3 1 0 0 ]
    [  0 0 1 0 ] [ 1 0 1 0 ] [  0 0 1 0 ] = [  1 0 1 0 ]
    [ −2 0 0 1 ] [ 0 0 0 1 ] [  0 0 0 1 ]   [ −2 0 0 1 ]
and

    [  1 0 0 0 ]^(−1)   [  1 0 0 0 ]
    [ −3 1 0 0 ]      = [  3 1 0 0 ]
    [  1 0 1 0 ]        [ −1 0 1 0 ]
    [ −2 0 0 1 ]        [  2 0 0 1 ]
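The sign-flip rule for the inverse can be verified directly: multiplying the matrix by its claimed inverse should give the identity. A quick sketch in plain Python (the names are mine):

```python
# M is M^(1) from the example above; Minv is the same matrix with the
# subdiagonal entries (the multipliers) negated. Their product should be I.

def matmul(X, Y):
    """Naive matrix product of two lists-of-lists."""
    n, m, p = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

M = [[1, 0, 0, 0],
     [-3, 1, 0, 0],
     [1, 0, 1, 0],
     [-2, 0, 0, 1]]

Minv = [[1, 0, 0, 0],
        [3, 1, 0, 0],
        [-1, 0, 1, 0],
        [2, 0, 0, 1]]

product = matmul(M, Minv)
print(product)  # the 4x4 identity matrix
```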
From

    M^(n−1) · · · M^(2) M^(1) A = L^(−1) A = U

we have

    A = [M^(1)]^(−1) [M^(2)]^(−1) · · · [M^(n−2)]^(−1) [M^(n−1)]^(−1) U

or

    A = L^(1) L^(2) · · · L^(n−1) U

where

    L^(j) = [M^(j)]^(−1)

is the identity matrix I with m_ij in the (i, j) position for i = j + 1, . . . , n. But what happens when we multiply all of the L^(j)? We obtain

    L^(1) L^(2) · · · L^(n−1) = L

where L is the identity matrix I with m_ij in the (i, j) position for all i > j. Thus we have

    A = LU
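The derivation above translates into a short routine: perform Gaussian elimination on A, saving each multiplier m_ij in L as the corresponding entry is zeroed. A minimal sketch in plain Python (no pivoting, so every pivot is assumed nonzero; the 3×3 test matrix is my own example, not from the notes):

```python
def lu(A):
    """LU factorization by Gaussian elimination, no pivoting.
    Returns unit lower triangular L and upper triangular U with A = L U.
    Assumes every pivot U[j][j] encountered is nonzero."""
    n = len(A)
    U = [[float(x) for x in row] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]     # multiplier m_ij
            L[i][j] = m               # record it: this builds L
            for k in range(j, n):     # row operation: row i -= m * row j
                U[i][k] -= m * U[j][k]
    return L, U

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
L, U = lu(A)
# L = [[1,0,0],[2,1,0],[4,3,1]], U = [[2,1,1],[0,1,1],[0,0,2]]
```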
Solution of Ax = b
How we solve the system: Ax = b means LUx = b. Let y = Ux. Then Ly = b.
The algorithm:
1. Perform Gaussian elimination on A: A = LU (about 2n^3/3 flops)
2. Solve Ly = b using forward substitution (n^2 − n flops). This implicitly applies the row operations previously performed on A to b. Compare the b_i = b_i − m_ij b_j step from Gaussian elimination to the y_i = y_i − l_ij y_j step of forward substitution.
3. Solve Ux = y using back substitution to obtain x (n^2 flops).
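The three steps can be sketched end to end: forward and back substitution are each a few lines. This plain-Python sketch reuses the no-pivoting `lu` routine from before; the test system (chosen so that x = [1, 1, 1]) is my own assumed example:

```python
def lu(A):
    """A = L U by Gaussian elimination without pivoting."""
    n = len(A)
    U = [[float(x) for x in row] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]
            L[i][j] = m
            for k in range(j, n):
                U[i][k] -= m * U[j][k]
    return L, U

def forward_sub(L, b):
    """Solve L y = b for unit lower triangular L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        # compare with the elimination step b_i = b_i - m_ij * b_j
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    """Solve U x = y for upper triangular U."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
b = [4.0, 10.0, 24.0]   # chosen so that x = [1, 1, 1]
L, U = lu(A)
x = back_sub(U, forward_sub(L, b))
```

Once L and U are stored, solving with a new right-hand side b2 costs only the two O(n^2) substitution steps.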
Existence and Uniqueness
Not every matrix, even if invertible, has an LU Decomposition! For example, consider the matrices

        [ 0 −1  2 ]        [ 1  2 −7 ]
    A = [ 3  9 −4 ] ,  A = [ 2  4  1 ]
        [ 5  0  1 ]        [ 3  0 −5 ]
What goes wrong? In the first A, a_11 = 0, so the multipliers for column j = 1 are undefined.
For the second A, we can proceed with the first column, using multipliers m_21 = 2/1 = 2 and m_31 = 3/1 = 3. But then a_22 = 0 in the updated matrix!
The LU Decomposition A = LU exists if and only if the leading principal minors of A are nonzero.
The leading principal submatrices of A, denoted A_1, A_2, . . . , A_n, are defined as follows:

    A_1 = [ a_11 ]

    A_2 = [ a_11  a_12 ]
          [ a_21  a_22 ]

    A_k = [ a_11  a_12  · · ·  a_1k ]
          [ a_21  a_22  · · ·  a_2k ]
          [  :     :            :   ]
          [ a_k1  · · ·  · · ·  a_kk ]

The leading principal minors of A are det(A_1), det(A_2), . . . , det(A_n). All must be nonzero for the LU Decomposition to exist.
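The minor test can be applied to the two failing examples above. A plain-Python sketch (cofactor expansion is fine for these tiny matrices, though far too slow for large ones; the helper names are mine):

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def leading_minors(A):
    """det(A_1), ..., det(A_n) for the leading principal submatrices."""
    n = len(A)
    return [det([row[:k] for row in A[:k]]) for k in range(1, n + 1)]

A1 = [[0, -1, 2], [3, 9, -4], [5, 0, 1]]
A2 = [[1, 2, -7], [2, 4, 1], [3, 0, -5]]
print(leading_minors(A1))  # [0, 3, -67]: det(A_1) = 0, elimination fails at once
print(leading_minors(A2))  # [1, 0, 90]: det(A_2) = 0, elimination fails in column 2
```

Both matrices are invertible (det(A_3) is nonzero in each case), yet a zero leading minor is exactly where elimination breaks down.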
If it exists, is it unique? Assume A = L_1 U_1 = L_2 U_2. Because A is nonsingular (invertible), the L_i and U_i are also nonsingular. Therefore

    U_1 U_2^(−1) = L_1^(−1) L_2.

Recall that multiplying and inverting preserves triangularity! Therefore U_1 U_2^(−1) is upper triangular, and L_1^(−1) L_2 is unit lower triangular. What can we conclude if a lower triangular and an upper triangular matrix are equal? They must be diagonal, because all entries below the main diagonal must be zero (upper) and all entries above the main diagonal must also be zero (lower).
Furthermore, all the diagonal entries must be 1, because L_1^(−1) L_2 is unit lower triangular. That is,

    L_1^(−1) L_2 = I

so L_1 = L_2, and then U_1 = U_2: the LU Decomposition is unique.
Pivoting
If the LU Decomposition of A does not exist, then we must certainly perform row interchanges to
ensure that it does, BUT due to floating-point arithmetic error, we have another reason to perform
row interchanges.
Consider the main step in Gaussian elimination:

    a_ik^(j+1) = a_ik^(j) − m_ij a_jk^(j)

where A^(1) is the original matrix. Note that all entries a_ik^(j) will be "contaminated" with roundoff error. That is, a_jk^(j) = ã_jk^(j) (1 + ε_jk), where ã_jk^(j) is the result of performing Gaussian elimination in exact arithmetic and ε_jk is the (relative) roundoff error.
Suppose the multiplier m_ij is large. Then the roundoff error in a_jk^(j) is amplified by m_ij, causing a_ik^(j+1) to be "more erroneous"! Therefore, even if there is no danger of dividing by zero, we should interchange rows (pivot) to reduce the size of m_ij.
Using the fact that m_ij = a_ij / a_jj, we want to perform an interchange among rows j, j + 1, . . . , n so that |a_jj| is as large as possible.
Partial Pivoting
In partial pivoting, we examine a_jj, a_{j+1,j}, a_{j+2,j}, . . . , a_nj and find the largest entry in absolute value. Suppose that entry is a_qj. Then we interchange rows j and q. This yields the smallest possible multipliers from the entries in column j; it guarantees |m_ij| ≤ 1. It is also efficient, requiring only O(n^2) comparisons compared to the O(n^3) flops of elimination itself. The "backslash" operator in MATLAB (\) performs Gaussian elimination with partial pivoting. HOWEVER, even with small multipliers, the entries of U can still get very large (see homework).
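Partial pivoting is a small change to the elimination sketch from earlier: before eliminating column j, find the largest entry in that column on or below the diagonal and swap rows. In this plain-Python sketch the permutation is recorded as an index vector p rather than an explicit matrix, and the test matrix is the first example from the existence discussion (which has no LU without pivoting):

```python
def lu_partial_pivot(A):
    """P A = L U via Gaussian elimination with partial pivoting.
    Returns (p, L, U) where p is a row-permutation index vector:
    row i of P A is row p[i] of A."""
    n = len(A)
    U = [[float(x) for x in row] for row in A]
    L = [[0.0] * n for _ in range(n)]
    p = list(range(n))
    for j in range(n - 1):
        # largest entry (in absolute value) in column j, rows j..n-1
        q = max(range(j, n), key=lambda i: abs(U[i][j]))
        # swap rows of U, the stored multipliers in L, and the pivot record
        U[j], U[q] = U[q], U[j]
        L[j], L[q] = L[q], L[j]
        p[j], p[q] = p[q], p[j]
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]   # guaranteed |m| <= 1
            L[i][j] = m
            for k in range(j, n):
                U[i][k] -= m * U[j][k]
    for i in range(n):
        L[i][i] = 1.0
    return p, L, U

A = [[0.0, -1.0, 2.0],
     [3.0, 9.0, -4.0],
     [5.0, 0.0, 1.0]]   # has no LU decomposition without pivoting
p, L, U = lu_partial_pivot(A)
```

Note that the already-stored multipliers in L are swapped along with the rows of U, which is exactly the bookkeeping needed to get the correct L in P A = LU.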
Complete Pivoting
Before eliminating subdiagonal entries in column j, we examine rows and columns j, j + 1, . . . , n to
find the largest entry in the entire submatrix. Suppose that entry is apq , where p, q ≥ j. We then
swap rows j and p, and columns j and q. Therefore A is replaced by P1 AP2 , where P1 and P2 are
permutation matrices. Example:
    P_1 = [ 1 0 0 0 ]
          [ 0 0 0 1 ]
          [ 0 0 1 0 ]
          [ 0 1 0 0 ]
swaps rows 2 and 4 if multiplied on the left, or columns 2 and 4 on the right.
Therefore Ax = b becomes

    P_1 A P_2 P_2^(−1) x = P_1 b

or

    Ã x̃ = b̃

where à = P_1 A P_2, b̃ = P_1 b, and x̃ = P_2^(−1) x. Therefore we can carry out à = LU, solve Ly = b̃ (forward substitution) and U x̃ = y (back substitution). Then x = P_2 x̃.
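Complete pivoting can be sketched the same way as partial pivoting, now tracking both a row index vector p and a column index vector q, so that entry (i, j) of P_1 A P_2 is A[p[i]][q[j]]. A plain-Python sketch (index vectors instead of explicit permutation matrices; the test matrix is the second example from the existence discussion):

```python
def lu_complete_pivot(A):
    """P1 A P2 = L U via Gaussian elimination with complete pivoting.
    Returns (p, q, L, U) with A[p[i]][q[j]] == (L U)[i][j]."""
    n = len(A)
    U = [[float(x) for x in row] for row in A]
    L = [[0.0] * n for _ in range(n)]
    p, q = list(range(n)), list(range(n))
    for j in range(n - 1):
        # largest entry in the trailing submatrix, rows/cols j..n-1
        r, c = max(((i, k) for i in range(j, n) for k in range(j, n)),
                   key=lambda rc: abs(U[rc[0]][rc[1]]))
        U[j], U[r] = U[r], U[j]          # row swap (swap stored multipliers too)
        L[j], L[r] = L[r], L[j]
        p[j], p[r] = p[r], p[j]
        for row in U:                    # column swap
            row[j], row[c] = row[c], row[j]
        q[j], q[c] = q[c], q[j]
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]
            L[i][j] = m
            for k in range(j, n):
                U[i][k] -= m * U[j][k]
    for i in range(n):
        L[i][i] = 1.0
    return p, q, L, U

A = [[1.0, 2.0, -7.0],
     [2.0, 4.0, 1.0],
     [3.0, 0.0, -5.0]]
p, q, L, U = lu_complete_pivot(A)
```

Only the rows of L need swapping along with U; the column swaps touch columns ≥ j of U, where no multipliers have been stored yet.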
Complete pivoting: why or why not?
• pro: smaller multipliers than partial pivoting, and therefore less propagation of rounding error. Also, Gaussian elimination is proven to be "backward stable" (Sec 2.1), meaning that the computed output of Gaussian elimination is the exact LU decomposition of a "nearby" matrix. Recall that the forward error (error in x) is the backward error (error in A) times the condition number of A. Note that Gaussian elimination was not initially known to be stable, until Jim Wilkinson performed a careful roundoff error analysis.
• con: complete pivoting requires O(n^3) comparisons, compared to O(n^2) for partial pivoting.
Recall the process without pivoting:
• A = LU (Gaussian elimination)
• Ly = b (forward substitution)
• U x = y (back substitution)
With the row interchanges P_1, P_2, . . . , P_{n−1} included, effectively we have:
    M̃^(n−1) · · · M̃^(1) P_{n−1} · · · P_2 P_1 A = U

where M̃^(j) = P_{n−1} · · · P_{j+1} M^(j) P_{j+1}^T · · · P_{n−1}^T is unit lower triangular. This means

    P A = LU

where P = P_{n−1} · · · P_2 P_1 carries out ALL row interchanges performed during Gaussian elimination, and L contains the multipliers. BUT: to ensure the multipliers end up in the correct places, whenever rows j and q are interchanged, the previously computed multipliers stored in those rows (columns 1, . . . , j − 1) must be interchanged as well. This ensures the correct L in P A = LU.
The process now:
• P A = LU (Gaussian elimination with row interchanges)
• Ly = P b (forward substitution)
• U x = y (back substitution)
Storage: do NOT store P explicitly. Recommended: start with a vector of indices 1, . . . , n. When
swapping rows of A, swap indices in this vector. Then use these indices to rearrange b. MATLAB:
use b(P) where P is the vector of indices.
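The index-vector idea looks like this in plain Python (the pivot vector p here is a hypothetical output of elimination with row interchanges; `perm_sign` is my own helper recovering det(P) = ±1 from the cycle structure of p):

```python
# p is a pivot index vector, as produced by elimination with row
# interchanges (hypothetical values): row i of P A is row p[i] of A.
p = [2, 0, 1]
b = [10.0, 20.0, 30.0]

# Apply P to b without ever forming the permutation matrix.
# This is the MATLAB idiom b(P).
Pb = [b[i] for i in p]

def perm_sign(p):
    """det(P): a cycle of length c costs c - 1 transpositions, so each
    even-length cycle flips the sign once."""
    sign = 1
    seen = [False] * len(p)
    for i in range(len(p)):
        if not seen[i]:
            length = 0
            j = i
            while not seen[j]:
                seen[j] = True
                j = p[j]
                length += 1
            if length % 2 == 0:
                sign = -sign
    return sign
```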
We use the fact that interchanging two rows of a matrix negates its determinant, so det(P) = ±1.