LYAPACK Users' Guide
Thilo Penzl
Preface
Control theory has been one of the most rapidly developing disciplines of mathematics and
engineering in the second half of the 20th century. In the past decade, implementations
of numerically robust algorithms for many types of dense problems in control theory have
become available in software packages, such as SLICOT [7]. However, before 1990 little
research had been done on efficient numerical methods for control problems related to
large sparse or structured dynamical systems. In the last few years, quite a number of
approaches for several types of large control problems have been proposed, but, at present,
it is often not clear which of them are the most promising. Needless to say, there is little
software for large control problems available. In this situation, the author took
the opportunity to implement the software package LYAPACK (“Lyapunov Package”),
which covers one particular approach to a class of large problems in control theory. An
efficient ADI-based solver for large Lyapunov equations is the “workhorse” of LYAPACK,
which also contains implementations of two model reduction methods and modifications of
the Newton method for the solution of large Riccati equations and linear-quadratic optimal
control problems. Most of the underlying algorithms have been developed by the author in
the past three years. A part of this research was done simultaneously and independently
by Jing-Rebecca Li. Her work benefited LYAPACK in particular through an improvement
in the efficiency of the Lyapunov solver.
LYAPACK aims at two goals. First, of course, the package will hopefully be used
to solve problems that arise from practical applications. The availability of easy-to-use
software is surely one step to make practitioners consider alternative numerical techniques:
“unless mathematics is put into software, it will never be used” [The SIAM Report on
Mathematics in Industry, 1996]. (This statement might be somewhat too strong. And, of
course, the reverse statement is not necessarily true.) Second, LYAPACK can be considered
a contribution to a fair and comprehensive comparison of the existing methods for large
Lyapunov equations, model reduction problems, etc., which is yet to be done.
For several reasons, LYAPACK has been implemented in MATLAB¹ rather than in
programming languages like FORTRAN, C, or JAVA. MATLAB codes are easier to
understand, to modify, and to verify. On the other hand, their performance cannot compete
with that of codes in the aforementioned programming languages. However, this does not
mean that LYAPACK is restricted to the solution of “toy problems”. Several measures,
such as the use of global variables for large data structures, have been taken to enhance the
computational performance of LYAPACK routines. To put this into the right perspective,
Lyapunov equations of order larger than 12000 were solved by LYAPACK within a few hours
on a regular workstation. When using standard methods, supercomputers are needed to
solve problems of this size.
LYAPACK was implemented and tested in a UNIX environment. Note, in particular,
that the file names of some routines do not comply with the DOS-like “xxxxxxxx.yyy” naming
convention.
The author acknowledges the support of the DAAD (Deutscher Akademischer
Austauschdienst = German Academic Exchange Service). He is grateful to Peter Benner,
Peter Lancaster, Jing-Rebecca Li, Volker Mehrmann, Enrique Quintana-Orti, and Andras
Varga for their direct or indirect help on the project. He also wants to thank the staff of
“The First Cup” (University of Calgary), where the considerable quantity of coffee needed
to realize the LYAPACK project was produced.

¹MATLAB is a trademark of The MathWorks Inc.
Finally, it should be stressed that any kind of feedback from people who applied or
tried to apply this package is highly appreciated.
Thilo Penzl
Calgary, November 1999
Addendum to Preface
This manuscript was mostly finished just before Thilo Penzl died in a tragic accident
in December 1999, a few days before his return to work in the Numerical Analysis Group
at TU Chemnitz where he also completed his PhD in 1998. I felt that this very nice
piece of work should be made available to the scientific community and we therefore tested
the codes, proofread the manuscript and performed minor corrections in the text. The
MATLAB codes were tested by Falk Ebert and the corrections to the Users’ Guide were
performed by myself.
Any comments or questions concerning the package should be addressed to Volker
Mehrmann mehrmann@mathematik.tu-chemnitz.de.
The LYAPACK codes are available at http://www.tu-chemnitz.de/sfb393/lyapack
Volker Mehrmann
Chemnitz, August 2000
Disclaimer and usage notes
3 Lyapunov equations 10
3.1 Low Rank Cholesky Factor ADI . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1.1 Theory and algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1.2 Stopping criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.1.3 The routine lp_lradi . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Computation of ADI shift parameters . . . . . . . . . . . . . . . . . . . . . 17
3.2.1 Theory and algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.2 The routine lp_para . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3 Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4 Model reduction 21
4.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Low rank square root method . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.2.1 Theory and algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.2.2 Choice of reduced order . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2.3 The routine lp_lrsrm . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2.4 Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.3 Dominant subspaces projection model reduction . . . . . . . . . . . . . . . 24
4.3.1 Theory and algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.3.2 Choice of reduced order . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.3.3 The routine lp_dspmr . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.3.4 Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
C Case studies 42
C.1 Demo programs for user-supplied functions . . . . . . . . . . . . . . . . . . 42
C.1.1 Demo program demo_u1 . . . . . . . . . . . . . . . . . . . . . . . . . 42
C.1.2 Demo program demo_u2 . . . . . . . . . . . . . . . . . . . . . . . . . 45
C.1.3 Demo program demo_u3 . . . . . . . . . . . . . . . . . . . . . . . . . 48
C.2 Demo program for LRCF-ADI iteration and algorithm for computing ADI
parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
C.2.1 Demo program demo_l1 . . . . . . . . . . . . . . . . . . . . . . . . . 51
C.2.2 Results and remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
C.3 Demo programs for model reduction algorithms . . . . . . . . . . . . . . . . 54
C.3.1 Demo program demo_m1 . . . . . . . . . . . . . . . . . . . . . . . . . 54
C.3.2 Results and remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
C.3.3 Demo program demo_m2 . . . . . . . . . . . . . . . . . . . . . . . . . 59
C.3.4 Results and remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
C.4 Demo program for algorithms for Riccati equations and linear-quadratic
optimal control problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
C.4.1 Demo program demo_r1 . . . . . . . . . . . . . . . . . . . . . . . . . 64
C.4.2 Results and remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
1 Introduction
1.1 What is LYAPACK?
LYAPACK is the acronym for “Lyapunov Package”. It is a MATLAB toolbox (i.e., a set of
MATLAB routines) for the solution of certain large scale problems in control theory, which
are closely related to Lyapunov equations. Basically, LYAPACK works on realizations

ẋ(τ) = Ax(τ) + Bu(τ),   y(τ) = Cx(τ)   (1)

of dynamical systems and on Lyapunov equations of the type

F X + XF^T = −GG^T,

where F ∈ R^{n,n} and G ∈ R^{n,t} are given and X ∈ R^{n,n} is the solution. In some
applications the solution X itself might be of interest, but mostly it is only an
auxiliary matrix, which arises in the course of the numerical solution of another
problem. Such problems are model reduction, Riccati equations, and linear-quadratic
optimal control problems, for example.
x̂̇(τ) = Âx̂(τ) + B̂u(τ),   y(τ) = Ĉx̂(τ)   (3)
of smaller order k, whose behavior is similar to that of the original one in some sense.
There exists a large number of model reduction methods which rely on Lyapunov
equations [2]. LYAPACK contains implementations of two such methods. Both are
based on the Lyapunov equations

A X_B + X_B A^T = −BB^T,   (4)
A^T X_C + X_C A = −C^T C.   (5)
subject to the dynamical system (1) and the initial condition x(0) = x0 is called the
linear-quadratic optimal control problem (LQOCP). Its optimal solution is described
by the state-feedback
LYAPACK contains routines for these three types of problems. The underlying
algorithms are efficient w.r.t. both memory and computation for many large scale problems.
• Stability. In most cases, the matrix A must be stable, i.e., its spectrum must be a
subset of the open left half of the complex plane. For the solution of Riccati equations
and optimal control problems it is sufficient that a matrix K^(0) is given, for which
A − B(K^(0))^T is stable.
• The number of inputs and outputs must be small compared to the
number of states, i.e., m ≪ n and q ≪ n. As a rule of thumb, we recommend
n/m, n/q ≥ 100. The larger these ratios are, the better the performance of
LYAPACK is compared to implementations of standard methods.
• The matrix A must have a structure that allows the efficient solution
of (shifted) systems of linear equations and the efficient realization of
products with vectors. Examples of such matrices are classes of sparse matrices,
products of sparse matrices and inverses of sparse matrices, circulant matrices,
Toeplitz matrices, etc.
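As an illustration of the kind of structure meant here, a shifted sparse solve can be realized by factoring A + pI once and reusing the factorization for many right-hand sides. The following Python/SciPy sketch is purely illustrative (LYAPACK itself realizes such operations in MATLAB user-supplied functions; the matrix and the shift below are made up):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n = 1000
# sparse, stable A: a 1-D diffusion (tridiagonal) stencil
A = sp.diags([np.ones(n - 1), -2.0 * np.ones(n), np.ones(n - 1)],
             [-1, 0, 1], format='csc')

p = -1.5                                  # a (real, negative) ADI shift
lu = splu((A + p * sp.identity(n, format='csc')).tocsc())  # factor A + pI once

rng = np.random.default_rng(0)
X = rng.standard_normal((n, 2))           # block of right-hand sides
Y = lu.solve(X)                           # reuse the factorization
assert np.allclose((A + p * sp.identity(n)) @ Y, X)
```

For a tridiagonal A the factors stay banded, so both the factorization and each solve cost O(n); this is exactly the situation in which LYAPACK pays off.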
At this point, it should be stressed that problems related to certain generalized dynamical
systems
M x̃̇(τ) = N x̃(τ) + B̃u(τ),   y(τ) = C̃ x̃(τ)   (9)
where M, N ∈ R^{n,n}, can be treated with LYAPACK as well. However, it is necessary
that the generalized system can be transformed into a stable, standard system (1). This
is the case when M is invertible and M^{−1}N is stable. The transformation is done by an
LU factorization (or a Cholesky factorization in the symmetric definite case) of M, i.e.,
M = M_L M_U. Then an equivalent standard system (1) is given by

A = M_L^{−1} N M_U^{−1},   B = M_L^{−1} B̃,   C = C̃ M_U^{−1},   x(τ) = M_U x̃(τ).
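In Python/NumPy notation, the matrix part of this transformation can be sketched as follows (illustrative only, with made-up test matrices and assuming the standard system matrix A = M_L^{−1} N M_U^{−1}; LYAPACK keeps the factors sparse and never forms A or the inverses explicitly):

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(0)
n = 50
M = 4.0 * np.eye(n) + 0.02 * rng.standard_normal((n, n))   # invertible "mass" matrix
N = -2.0 * np.eye(n) + 0.02 * rng.standard_normal((n, n))  # N with M^{-1}N stable

P, L, U = lu(M)                 # M = (P L) U, i.e., M_L = P L and M_U = U
ML = P @ L
A = np.linalg.solve(ML, N) @ np.linalg.inv(U)   # A = M_L^{-1} N M_U^{-1}

# A is similar to M^{-1} N (via M_U = U), so the two systems are equivalent:
assert np.allclose(A @ U, U @ np.linalg.solve(M, N))
assert np.linalg.eigvals(A).real.max() < 0      # stability carries over
```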
• LYAPACK cannot solve Lyapunov equations and model reduction problems, where
the system matrix A is not stable. It cannot solve Riccati equations and optimal
control problems if no (initial) stabilizing feedback is provided.
• LYAPACK cannot be used to solve problems related to singular systems, i.e.,
generalized systems (9) where M is singular.
• LYAPACK is not able to efficiently solve problems which are highly “ill-conditioned”
(in some sense). LYAPACK relies on iterative methods. Unlike direct methods,
whose complexity usually does not depend on the conditioning of the problem,
iterative methods generally perform poorly w.r.t. both accuracy and complexity if
the problem to be solved is highly ill-conditioned.
• LYAPACK is inefficient if the system is of small order (say, n ≤ 500). In this case,
it is recommended to apply standard methods to solve the problem; see §7.
• LYAPACK is inefficient if the number of inputs and outputs is not much smaller than
the system order. (For example, there is not much sense in applying LYAPACK to
problems with, say, 1000 states, 100 inputs, and 100 outputs.)
• LYAPACK is not very efficient if it is not possible to realize basic matrix operations,
such as products with vectors and the solution of certain (shifted) systems of linear
equations with A, in an efficient way. For example, applying LYAPACK to systems
with an unstructured, dense matrix A is dubious.
• LYAPACK is not intended to solve discrete-time problems. However, such problems
can be transformed into continuous-time problems by the Cayley transformation. It
is possible to implement the structured, Cayley-transformed problem in user-supplied
functions; see §2.2.
• LYAPACK cannot handle more complicated types of problems, such as problems
related to time-varying or nonlinear dynamical systems.
• A basic concept of LYAPACK is that matrix operations with A are implicitly realized
by so-called user-supplied functions (USFs). For general problems, these routines
must be written by the users themselves. However, for the most common problems
such routines are provided in LYAPACK.
• Multiplications with A or A^T:

Y ← AX or Y ← A^T X.

• Solutions of systems of linear equations with A or A^T:

Y ← A^{−1}X or Y ← A^{−T}X.

• Solutions of shifted systems of linear equations:

Y ← (A + p_i I)^{−1}X or Y ← (A + p_i I)^{−T}X.
• Adequate structures for storing the data, which correspond to the matrix A, can
be used. (In other words, one is not restricted to storing A explicitly in a sparse or
dense array.)
• Adequate methods for solving linear systems can be used. (This means that one is not
restricted to using “standard” LU factorizations. Instead, Cholesky factorizations,
Krylov subspace methods, or even multi-grid methods can be used.)
In general, users have to implement user supplied functions themselves in a way that is
as efficient as possible w.r.t. both computation and memory demand. However, user supplied
functions for the following most common types of matrices A (and ways to implement the
corresponding basic matrix operations) are already contained in LYAPACK. Note that the
basis name, which must be provided as the input parameter name to many LYAPACK routines,
is the first part of the name of the corresponding user supplied function.
• [basis name] = as: A in (1) is sparse and symmetric. (Shifted) linear systems are
solved by sparse Cholesky factorization. In this case, the ADI shift parameters p_i
must be real. Note: This is not guaranteed in the routine lp_lrnm for Riccati
equations and optimal control problems. If this routine is used, the unsymmetric
version au must be applied instead of as.
• [basis name] = au: A in (1) is sparse and (possibly) unsymmetric. (Shifted) linear
systems are solved by sparse LU factorization.

• [basis name] = au_qmr_ilu: A in (1) is sparse and (possibly) unsymmetric. (Shifted)
linear systems are solved iteratively by QMR using ILU preconditioning [14].
• [basis name] = msns: Here, the system arises from a generalized system (9), where
M and N are symmetric. Linear systems involved in all three types of basic matrix
operations are solved by sparse Cholesky factorizations. In this case, the ADI shift
parameters p_i must be real. Note: This is not guaranteed in the routine lp_lrnm
for Riccati equations and optimal control problems. If this routine is used, the
unsymmetric version munu must be applied instead of msns.
• [basis name] = munu: Here, the system arises from a generalized system (9), where
M and N are sparse and possibly unsymmetric. Linear systems involved in all three
types of basic matrix operations are solved by sparse LU factorizations.
Although these classes of user supplied functions can be applied to a great variety of
problems, users might want to write their own user supplied functions (or modify
the user supplied functions contained in LYAPACK). For example, this might be the
case if A is a dense Toeplitz or circulant matrix, or if alternative iterative solvers or
preconditioners should be applied to solve linear systems. Obviously, it is impossible to
provide user supplied functions in LYAPACK for all possible structures the matrix A can
have.
For each type of problem listed above, the following routines are needed. Here, one or
two extensions are added to the basis name:
Five different first extensions are possible. They have the following meaning:
For some classes of user supplied functions preprocessing and postprocessing routines do
not exist because they are not needed. There is no second extension if [extension 1] = pre
or pst. If [extension 1] = m, l, or s, there are the following three possibilities w.r.t. the
second extension:
• [extension 2] = i: initialization of the data needed for the corresponding basic matrix
operations.
• no [extension 2]: the routine actually performs the basic matrix operations.
There are a few situations in which preprocessing is not necessary. Examples are standard
systems (1) where A is a tridiagonal matrix and (shifted) linear systems are solved directly
(here, reordering would be superfluous), or where A is sparse and (shifted) linear systems
are solved by QMR [14].
Usually, the preprocessing step consists of an equivalence transformation of the system.
In rare cases not only the system matrices, but also further matrices must be transformed.
In particular, this applies to nonzero initial stabilizing state-feedback matrices K^(0) when
a Riccati equation or an optimal control problem is to be solved.
It is important for users to understand what is done during the preprocessing and
to distinguish carefully between “original” and “transformed” (preprocessed) data. Often
the output data of LYAPACK routines must be backtransformed (postprocessed) in order
to obtain the solution of the original problem. Such data are, for example, the low rank
Cholesky factor Z that describes the (approximate) solution of a Lyapunov equation or a
Riccati equation, or the (approximate) state-feedback K for solving the optimal control
problem. For instance, if as_pre or au_pre have been applied for preprocessing, then the
rows of Z or K must be reordered by the inverse permutation. If msns_pre or munu_pre
are used, these quantities must be transformed with the inverse of the Cholesky factor
M_U and subsequently re-reordered. These backtransformations are implemented in the
corresponding user supplied functions [basis name]_pst for postprocessing.
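The permutation-based pre- and postprocessing can be mimicked as follows (Python/NumPy sketch; a dense Lyapunov solver and a random permutation stand in for the LYAPACK routines, and prm/iprm play the roles of the corresponding outputs of au_pre; for a factor Z only the rows would be reordered):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(1)
n = 30
A = -3.0 * np.eye(n) + 0.2 * rng.standard_normal((n, n))  # stable test matrix
G = rng.standard_normal((n, 2))

prm = rng.permutation(n)          # stands in for a fill-reducing reordering
iprm = np.argsort(prm)            # inverse permutation

# preprocessing: symmetric reordering of the system data
A0 = A[np.ix_(prm, prm)]
G0 = G[prm]

# solve the preprocessed Lyapunov equation A0 X0 + X0 A0^T = -G0 G0^T
X0 = solve_continuous_lyapunov(A0, -G0 @ G0.T)

# postprocessing: the inverse permutation recovers the original solution
X = X0[np.ix_(iprm, iprm)]
assert np.allclose(X, solve_continuous_lyapunov(A, -G @ G.T))
```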
In some cases, postprocessing can be omitted even though preprocessing has been done.
This is the case when the output data do not depend on what has been done in the
preprocessing (which is usually an equivalence transformation of the system). An example
is model reduction by LRSRM or DSPMR. Here, the reduced systems are invariant
w.r.t. equivalence transformations of the original system.
...
[A0,B0,C0,prm,iprm] = au_pre(A,B,C); % Step 1
au_m_i(A0); % Step 2
Y0 = au_m('N',X0); % Step 3
...
au_l_i; % Step 4
Y0 = au_l('N',X0); % Step 5
...
p = lp_para(...); % Step 6
au_s_i(p); % Step 7
...
Y0 = au_s('N',X0,i); % Step 8
...
au_s_d(p); % Step 9
...
p = lp_para(...); % Step 10
...
au_s_i(p); % Step 11
...
Y0 = au_s('N',X0,i); % Step 12
...
au_s_d(p); % Step 13
...
Z = au_pst(Z0,iprm); % Step 14
au_l_d; % Step 15
au_m_d; % Step 16
...
Step 2: Initialization of data for multiplications with A0. Here, the input parameter A0
is stored in the “hidden” global variable LP_A.

Step 4: Initialization of data for the solution of linear systems with A0. Here, an LU
factorization of the matrix A0 (provided as LP_A) is computed and stored in the
global variables LP_L and LP_U.

Step 7: Initialization of data for the solution of shifted linear systems with A0. Here,
the LU factors of the matrices A0 + p_1 I, . . . , A0 + p_l I (A0 is provided in LP_A) are
computed and stored in the global variables LP_L1, LP_U1, . . . , LP_Ll, LP_Ul.

Step 8: Solution of the shifted linear system (A0 + p_i I) Y0 = X0. au_s has access to the
global variables LP_Li and LP_Ui.

Step 10: Possibly, a new set of shift parameters is computed, which is used for a further
run of the LRCF-ADI iteration. (This is the case within the routine lp_lrnm, but
typically not for model reduction problems.)

Step 11: (Re)initialization of data for the solution of shifted linear systems with A0 and
the new shift parameters. Again, the LU factors are stored in the global variables
LP_L1, LP_U1, . . . , LP_Ll, LP_Ul. Here, the value of l may differ from that in Step 7.

Step 13: Delete the data generated in Step 11, i.e., clear the global variables LP_L1, LP_U1,
. . . , LP_Ll, LP_Ul. (Steps 9–13 can be repeated several times.)

Step 14: Postprocessing, which has been discussed in §2.3. The result Z0 of the preprocessed
problem is backtransformed into Z.

Step 15: Delete the data generated in Step 4, i.e., clear the global variables LP_L and
LP_U.

Step 16: Delete the data generated in Step 2, i.e., clear the global variable LP_A.
The other user supplied functions contained in LYAPACK are organized in a similar way.
Consult the corresponding m-files for details.
The following table shows which user supplied functions are invoked within the individual
LYAPACK main routines. [b.n.] means [basis name].
The calling sequences for these user supplied functions are fixed. It is mandatory to stick
to these sequences when implementing new user supplied functions. The calling sequences
are shown below. There it is assumed that X0 (parameter X0) is a complex n × t matrix,
p is a vector containing shift parameters, and the flag tr is either ’N’ (“not transposed”)
or ’T’ (“transposed”).
3 Lyapunov equations
3.1 Low Rank Cholesky Factor ADI
3.1.1 Theory and algorithm
This section gives a brief introduction to the solution technique for continuous time
Lyapunov equations used in LYAPACK. For more details, the reader is referred to [31, 6, 33, 37].
We consider the continuous time Lyapunov equation
F X + XF^T = −GG^T,   (11)
where F ∈ R^{n,n} is stable, G ∈ R^{n,t}, and t ≪ n. It is well-known that such continuous time
Lyapunov equations have a unique solution X, which is symmetric and positive semidefinite.
Moreover, in many cases, the eigenvalues of X decay very fast, which is discussed
for symmetric matrices F in [40]. Thus, there often exist very accurate approximations to X
of a rank that is much smaller than n. This property is most important for the efficiency of
LYAPACK.
The ADI iteration [36, 50] for the Lyapunov equation (11) is given by X_0 = 0 and

(F + p_i I) X_{i−1/2} = −GG^T − X_{i−1} (F^T − p_i I),
(F + p̄_i I) X_i^H = −GG^T − X_{i−1/2}^H (F^T − p̄_i I)

for i = 1, 2, . . .. It is one of the most popular iterative techniques for solving Lyapunov
equations. This method generates a sequence of matrices X_i which often converges very fast
towards the solution, provided that the ADI shift parameters p_i are chosen (sub)optimally.
The basic idea for a more efficient implementation of the ADI method is to replace the
ADI iterates by their Cholesky factors, i.e., X_i = Z_i Z_i^H, and to reformulate the iteration
in terms of the factors Z_i. Generally, these factors have t·i columns. For this reason, we call
them low rank Cholesky factors (LRCFs) and their products, which are equal to the ADI
iterates, low rank Cholesky factor products (LRCFPs). Obviously, the low rank Cholesky
factors Z_i are not uniquely determined. Different ways to generate them exist; see [31, 37].
The following algorithm, which we refer to as Low Rank Cholesky Factor ADI (LRCF-ADI),
is the most efficient of these ways. It is a slight modification of the iteration proposed in
[31]. Note that the number of iteration steps i_max need not be fixed a priori. Instead,
several stopping criteria, which are described in §3.1.2, can be applied.
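For the special case of real negative shifts, LRCF-ADI can be sketched in a few lines of Python/NumPy (a dense, simplified illustration: the actual routine lp_lradi works through user-supplied functions, also handles complex shift pairs, and takes its shifts from lp_para, whereas the shifts below are chosen ad hoc for a test matrix with known spectrum):

```python
import numpy as np

def lrcf_adi(F, G, shifts):
    """LRCF-ADI with real negative shifts p_i; returns Z with Z Z^T ~ X."""
    n = F.shape[0]
    I = np.eye(n)
    p = shifts
    V = np.sqrt(-2.0 * p[0]) * np.linalg.solve(F + p[0] * I, G)
    Z = V
    for i in range(1, len(p)):
        V = np.sqrt(p[i] / p[i - 1]) * (
            V - (p[i] + p[i - 1]) * np.linalg.solve(F + p[i] * I, V))
        Z = np.hstack([Z, V])
    return Z

def nrn(F, Z, G):
    """Normalized residual norm, cf. (13)."""
    R = F @ Z @ Z.T + Z @ Z.T @ F.T + G @ G.T
    return np.linalg.norm(R) / np.linalg.norm(G @ G.T)

# stable symmetric test matrix (1-D Laplacian) with two inputs
n = 100
F = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))
G = np.random.default_rng(3).standard_normal((n, 2))

# ad hoc shifts: logarithmically spaced over the (known) spectral interval
lmin = 2.0 - 2.0 * np.cos(np.pi / (n + 1))
shifts = -np.logspace(np.log10(lmin), np.log10(4.0), 25)

Z = lrcf_adi(F, G, shifts)
assert Z.shape == (n, 2 * len(shifts))   # t*i columns after i steps
assert nrn(F, Z, G) < 1e-5               # fast decay of the residual
```

Note that Z grows by t columns per step, so a low rank approximation is only obtained when the iteration converges after i ≪ n/t steps.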
• stagnation of the normalized residual norm (most likely caused by round-off errors);
Here, the normalized residual norm corresponding to the low rank Cholesky factor Z is
defined as

NRN(Z) = ‖F ZZ^T + ZZ^T F^T + GG^T‖_F / ‖GG^T‖_F.   (13)
Note: In LYAPACK, a quite efficient method for the computation of this quantity is
applied; see [37] for details. However, the computation of the values NRN(Z_i) in the
course of the iteration can still be very expensive. Sometimes, this amount of computation
can even exceed the cost of the actual iteration itself! Besides this, computing
the normalized residual norms can require a considerable amount of memory, which
is about proportional to t·i. For this reason, it can be preferable to avoid stopping criteria
based on the normalized residual norm (tolerance for the NRN, stagnation of the NRN)
and to use cheaper, possibly heuristic criteria instead.
In the sequel, we discuss the above stopping criteria and show some sample convergence
histories (in terms of the normalized residual norm) for LRCF-ADI runs. Here, this method
is applied to a given test example, but the iterations are stopped by different stopping
criteria and different values of the corresponding stopping parameters. It should be noted
that the convergence history plotted in Figures 2–5 is quite typical for LRCF-ADI, provided
that shift parameters generated by lp_para are used in the given order. In the first
stage of the iteration, the logarithm of the normalized residual norm decreases about
linearly. Typically, this slope becomes less steep when more “ill-conditioned” problems are
considered. (Such problems are in particular Lyapunov equations where many eigenvalues
of F are located near the imaginary axis, but far away from the real axis. In contrast,
even symmetric problems where the condition number of F is quite large can usually be
solved by LYAPACK within a reasonable number of iteration steps.) In the second stage,
the normalized residual norm curve nearly stagnates on a relatively small level (mostly
between 10^{-12} and 10^{-15}), which is caused by round-off errors. That means the accuracy
(in terms of the NRN) of the low rank Cholesky factor product Zi ZiH cannot be improved
after a certain number of steps. Note, however, that the stagnation of the error norm
kZi ZiH − XkF can occur a number of iteration steps later. Unfortunately, the error cannot
be measured in practice because the exact solution X is unknown.
Each of the four stopping criteria can be “activated” or “avoided” by the choice of the
corresponding input argument (stopping parameter) of the routine lp_lradi. If more than
one criterion is activated, the LRCF-ADI iteration is stopped as soon as (at least) one of
the “activated” criteria is fulfilled.
need to be performed to evaluate it. The drawback of this stopping criterion is that
it is not related to the attainable accuracy of the delivered low rank Cholesky factor
product ZZ^H. This is illustrated by Figure 2.
Figure 2: Stopping criterion: maximal number of iteration steps. Solid line: max_it = 20;
dash-dotted line: max_it = 60; dotted line: max_it = 100. The other three criteria are
avoided.
This criterion can be avoided by setting min_res = 0. (Because of round-off errors
it is practically impossible to attain NRN(Z_i) = 0.) It requires the computation of
normalized residual norms and is computationally expensive. A further drawback of
this criterion is that it will either stop the iteration before the maximal accuracy is
attained (see min_res = 10^{-5}, 10^{-10} in Figure 3) or it will not stop the iteration
at all (see min_res = 10^{-15} in Figure 3). If one wants to avoid this criterion, but
compute the convergence history provided by the output vector res, one should set
min_res to a value much smaller than the machine precision (say, min_res = 10^{-100}).
Figure 3: Stopping criterion: tolerance for the normalized residual norm. Solid line:
min_res = 10^{-5}; dash-dotted line: min_res = 10^{-10}; dotted line: min_res = 10^{-15}. The
other three criteria are avoided.
Figure 4: Stopping criterion: stagnation of the normalized residual norm. Solid line:
with_rs = 'S'; dotted line: with_rs = 'N'. The other three criteria are avoided.
‖V_i‖_F^2 / ‖Z_i‖_F^2 ≤ min_in
Figure 5: Stopping criterion: smallness of the values ‖V_i‖_F. Solid line: min_in = 10^{-5};
dash-dotted line: min_in = 10^{-10}; dotted line: min_in = 10^{-15}. The other three criteria
are avoided.
We recommend using (only) the stopping criterion related to with_rs if Lyapunov
solutions of high accuracy are to be computed and if it is affordable to compute residual
norms. If the computation of the residual norms must be avoided, the criterion related to
min_in is probably the best choice.
F X + XF^T = −GG^T   (14)

F^T X + XF = −G^T G.   (15)
Here, F = A − B_f K_f^T. Basic matrix operations must be supplied by user-supplied functions
for the matrix A (not F!). G in (14) or G^T in (15) should contain very few columns
compared to the system order. B_f and K_f are matrices that represent a so-called state
feedback. They are needed in the routine lp_lrnm for the Newton method, in which
lp_lradi is invoked. In general, users will not use this option, which means B_f = K_f = 0.
However, if this is not the case, the matrices B_f and K_f must contain very few columns
to guarantee the efficiency of the routine.
The approximate solution of either Lyapunov equation is given by the low rank
Cholesky factor Z, for which ZZ^H ≈ X. Z typically has fewer columns than rows. (Otherwise,
this routine and LYAPACK itself are useless!) In general, Z can be a complex
matrix, but the product ZZ^H is real. lp_lradi can perform an optional internal postprocessing
step, which guarantees that the delivered low rank Cholesky factor Z is real.
More precisely, the complex low rank Cholesky factor delivered by the LRCF-ADI iteration
is transformed into a real low rank Cholesky factor of the same size, such that both
low rank Cholesky factor products are identical. However, doing this requires additional
computation. (This option is not related to the “real version” of LRCF-ADI described in
[6].)
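The identity underlying such a transformation can be seen for a single pair of conjugate complex columns: if z and its conjugate are adjacent columns of Z, they may be replaced by the real columns √2 Re z and √2 Im z without changing the product ZZ^H. A NumPy check of this identity (only the principle, not the actual algorithm used in lp_lradi):

```python
import numpy as np

rng = np.random.default_rng(4)
z = rng.standard_normal(6) + 1j * rng.standard_normal(6)

Zc = np.column_stack([z, z.conj()])                    # conjugate pair of columns
Zr = np.sqrt(2.0) * np.column_stack([z.real, z.imag])  # real replacement

P = Zc @ Zc.conj().T                   # Z Z^H for the complex factor
assert np.allclose(P.imag, 0.0)        # the product is real ...
assert np.allclose(P.real, Zr @ Zr.T)  # ... and unchanged by the replacement
```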
Furthermore, there exists an option for directly generating the product of the (approximate)
solution with a matrix, i.e., K_out = ZZ^H K_in is computed without forming the low
rank Cholesky factor Z. Here, K_in must contain only a few columns. This option is not
intended for direct use; it is needed in the implicit version of the Newton method.
If this mode is used, stopping criteria based on the residual cannot be applied.
Calling sequences:
Depending on the choice of the mode parameter zk, the following two calling sequences
exist. However, it is recommended to use only the first mode.
• zk = 'Z':

[Z, flag, res, flp] = lp_lradi( tp, zk, rc, name, Bf, Kf, G, p, ...
max_it, min_res, with_rs, min_in, info)

• zk = 'K':

[K_out, flag, flp] = lp_lradi( tp, zk, rc, name, Bf, Kf, G, p, ...
K_in, max_it, min_in, info)
Input parameters:
tp: Mode parameter, which is either 'B' or 'C'. If tp = 'B', CALE (14) is solved.
Otherwise, CALE (15) is solved.

zk: Mode parameter, which is either 'Z' or 'K'. If zk = 'Z', the low rank Cholesky
factor Z is computed. Otherwise, K_out = ZZ^H K_in is computed directly.

rc: Mode parameter, which is either 'R' or 'C'. If rc = 'C', the routine delivers a
low rank Cholesky factor, which is not real when non-real shift parameters are used.
Otherwise, the low rank Cholesky factor resulting from the LRCF-ADI iteration is
transformed into a real low rank Cholesky factor Z̃, which describes the identical
approximate solution Z̃Z̃^T. Z̃ is returned instead of Z.

name: The basis name of the user supplied functions that realize the basic matrix
operations with A.

Bf: Feedback matrix B_f, which is not used explicitly in general. For B_f = 0, set Bf =
[].
Kf: Feedback matrix K_f, which is not used explicitly in general. For K_f = 0, set Kf =
[].

G: The matrix G.

p: Vector containing the suitably ordered ADI shift parameters P = {p_1, . . . , p_l}, which
are delivered by the routine lp_para. If the number l of distinct parameters is
smaller than i_max in Algorithm 1, the shift parameters are used cyclically. That means,
p_{l+1} = p_1, p_{l+2} = p_2, . . ., p_{2l} = p_l, p_{2l+1} = p_1, . . .
K in: The matrix Kin , which is only used in the mode zk = ’K’.
Output parameters:
Z: The low rank Cholesky factor Z, which is complex if rc = ’C’ and p is not a real
vector.
K out: The matrix Kout , which is only returned in the mode zk = ’K’.
flag: A flag that shows by which stopping criterion (or stopping parameter) the iteration has been stopped. Possible values are ’I’ (for max it), ’R’ (for min res), ’S’ (for with rs), and ’N’ (for min in).
res: A vector containing the history of the normalized residual norms. res(1) = 1 and res(i + 1) is the normalized residual norm w.r.t. iteration step i. If the stopping criteria are chosen so that the normalized residual norms need not be computed, res = [] is returned.
flp: A vector containing the history of the flops needed for the iteration. flp(1) = 0 and flp(i + 1) is the number of flops required for iteration steps 1 to i. flp reflects only the flops required for the iteration itself. The numerical costs for initializing and generating data by USFs, the computation of ADI shift parameters, and the computation of normalized residual norms are not included.
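For orientation, the core recursion implemented by lp lradi (Algorithm 1, with the cyclic reuse of the shift parameters described above) can be sketched in Python. This dense illustration is not the LYAPACK routine itself: it ignores USFs, stopping criteria, and the real transformation rc = ’R’, and it forms explicit matrices, which is only feasible for small n.

```python
import numpy as np

def lrcf_adi(F, G, p, max_it):
    """Dense sketch of the LRCF-ADI iteration for F X + X F^T = -G G^T.
    The shift list p is reused cyclically: p_{l+1} = p_1, etc."""
    n = F.shape[0]
    l = len(p)
    # first iterate: V1 = sqrt(-2 Re p1) (F + p1 I)^{-1} G
    V = np.sqrt(-2.0 * p[0].real) * np.linalg.solve(F + p[0] * np.eye(n), G)
    Z = V
    for i in range(1, max_it):
        pi, pm = p[i % l], p[(i - 1) % l]
        # Vi = sqrt(Re pi / Re p_{i-1}) [V - (pi + conj(p_{i-1})) (F + pi I)^{-1} V]
        V = np.sqrt(pi.real / pm.real) * (
            V - (pi + np.conj(pm)) * np.linalg.solve(F + pi * np.eye(n), V))
        Z = np.hstack([Z, V])  # append the new block column to Z
    return Z
```

The approximate solution is then X ≈ Z Z^H; each iteration step adds as many columns to Z as G has.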
3.2 Computation of ADI shift parameters

    sP(t) = |(t − p1) · ... · (t − pl)| / |(t + p1) · ... · (t + pl)|

    max_{t ∈ σ(F)} sP(t)
is minimized. Unfortunately, the spectrum σ(F ) is not known in general and it cannot be
computed inexpensively if F is very large. Furthermore, even if the spectrum or bounds for
the spectrum are known, no algorithms are available to compute the optimal parameters
pi .
Our algorithm for the computation of a set of suboptimal shift parameters is numerically inexpensive and heuristic. It is based on two ideas. First, we generate a discrete set which “approximates” the spectrum. This is done by a pair of Arnoldi processes; see, e.g., [19]. The first process, w.r.t. F, delivers k+ values that tend to approximate the “outer” eigenvalues well, which are generally not close to the origin. The second process, w.r.t. F^{−1}, is used to get k− approximations of eigenvalues near the origin, whose consideration in the ADI minimax problem is crucial. The eigenvalue approximations delivered by the Arnoldi processes are called Ritz values. Second, we choose a set of shift parameters, which is a subset of the set of Ritz values R. This is done by a heuristic that delivers a suboptimal solution of the resulting discrete optimization problem. Note that the order in which this heuristic delivers the parameters is advantageous. Loosely speaking, the parameters are ordered such that parameters related to a strong reduction in the ADI error are applied first. For more details on the parameter algorithm, see [37].
INPUT: F, l0, k+, k−
OUTPUT: P = {p1, ..., pl}, where l = l0 or l0 + 1
1. Choose b0 ∈ R^n at random.
2. Perform k+ steps of the Arnoldi process w.r.t. (F, b0) and compute the set of Ritz values R+.
3. Perform k− steps of the Arnoldi process w.r.t. (F^{−1}, b0) and compute the set of Ritz values R−.
4. R = {ρ1, ..., ρ_{k+ + k−}} := R+ ∪ (1/R−)
5. IF R ⊄ C−, remove unstable elements from R and display a warning.
6. Detect i with max_{t ∈ R} s_{ρi}(t) = min_{ρ ∈ R} max_{t ∈ R} s_{ρ}(t) and initialize
   P := {ρi} if ρi is real, P := {ρi, ρ̄i} otherwise.
WHILE card(P) < l0
7. Detect i with sP(ρi) = max_{t ∈ R} sP(t) and set
   P := P ∪ {ρi} if ρi is real, P := P ∪ {ρi, ρ̄i} otherwise.
END WHILE
Obviously, the output of this algorithm is a proper parameter set; see §3.1.1. The number of shift parameters is either l0 or l0 + 1. Larger values of k+ and k− lead to better approximations of the spectrum, but also increase the computational cost, because k+ matrix-vector multiplications with F must be computed in the first Arnoldi algorithm and k− systems of linear equations with F must be solved in the second one. A typical choice of the triple (l0, k+, k−) is (20, 50, 25). For “tough” problems these values should be increased; for “easy” ones they can be decreased. Note that decreasing l0 will reduce the memory demand if shifted SLEs are solved directly, because in this case the amount of memory needed to store the matrix factors is proportional to l.
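The greedy selection in Steps 5 to 7 of Algorithm 2 can be sketched as follows. This is an illustrative Python version operating on a precomputed set of Ritz values; the actual MATLAB routine lp para additionally performs the two Arnoldi processes that produce R.

```python
import numpy as np

def suboptimal_shifts(ritz_values, l0):
    """Heuristic choice of ADI shift parameters from a set of Ritz values
    (Steps 5-7 of Algorithm 2). `ritz_values` plays the role of the set R."""
    R = np.asarray(ritz_values, dtype=complex)
    # Step 5: discard unstable Ritz values (not in the open left half-plane).
    R = R[R.real < 0]

    def s(P, t):
        # ADI shift function s_P(t) = prod |t - p| / prod |t + p| over p in P
        P = np.asarray(P)
        return np.prod(np.abs(t - P)) / np.prod(np.abs(t + P))

    def add(P, p):
        # keep P closed under conjugation, so a real factor can be formed later
        P.append(p)
        if p.imag != 0.0:
            P.append(np.conj(p))

    # Step 6: start with the Ritz value minimizing the worst case of s over R.
    best = min(R, key=lambda p: max(s([p], t) for t in R))
    P = []
    add(P, best)
    # Step 7: greedily add the Ritz value where s_P is currently largest.
    while len(P) < l0:
        worst = max(R, key=lambda t: s(P, t))
        add(P, worst)
    return P
```

As in Algorithm 2, the returned set has l0 or l0 + 1 elements, depending on whether the last element added was real or a complex conjugate pair.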
Steps 6 and 7 require that R is contained in C−. However, this can only be guaranteed if F + F^T is negative definite and exact machine precision is used. If F is unstable, then LYAPACK cannot be applied anyway, because the ADI iteration diverges or, at least, stagnates. Experience shows that even in the case when F is stable but F + F^T is not definite, the Ritz values tend to be contained in the left half of the complex plane. If this is not the case, unstable Ritz values are removed in Step 5, which is little more than an emergency measure. If the LRCF-ADI iteration run with the resulting parameters diverges despite this measure, the matrix F is most likely unstable. In connection with the LRCF-NM or LRCF-NM-I applied to ill-conditioned CAREs, this might be caused by round-off errors. There, the so-called closed-loop matrix A − Bf Kf^T can be proved to be stable (in exact arithmetic), but the closed-loop poles (i.e., the eigenvalues of A − Bf Kf^T) can be extremely sensitive to perturbations, so that stability is not guaranteed in practice.
Figure 6 shows the result of the parameter algorithm for a random example of order
n = 500. The triple (l0 , k+ , k− ) is chosen as (20,50,25). 21 shift parameters were returned.
The picture shows the eigenvalues of F , the set R of Ritz values, and the set P of shift
parameters. Note that the majority of the shift parameters is close to the imaginary axis.
[Figure 6 plot: eigenvalues of F, Ritz values R, and shift parameters P in the complex plane; real axis from −2000 to 0, imaginary axis from −500 to 500.]
Calling sequences:
[p,err_code,rw,Hp,Hm] = lp_para(name,Bf,Kf,l0,kp,km)
[p,err_code,rw,Hp,Hm] = lp_para(name,Bf,Kf,l0,kp,km,b0)
Input parameters:
name: The basis name of the USFs that realize BMOs with A.
Bf: Feedback matrix Bf, which in general is not used explicitly. For Bf = 0, set Bf = [].
Kf: Feedback matrix Kf, which in general is not used explicitly. For Kf = 0, set Kf = [].
l0: Parameter l0 in Algorithm 2.
kp: Parameter k+.
km: Parameter k−.
b0: This optional argument is an n-vector that is used as the starting vector in both Arnoldi processes. If b0 is not provided, this vector is chosen at random, which means that different results can be returned by lp para in two different runs with identical input parameters.
Output parameters:
err code: This parameter is an error flag, which is either 0 or 1. If err code = 1, the
routine encountered Ritz values in the right half of the complex plane, which are
removed in Step 5 of Algorithm 2. err code = 0 is the standard return value.
4 Model reduction
4.1 Preliminaries
Roughly speaking, model reduction is the approximation of the dynamical system

    ẋ(τ) = A x(τ) + B u(τ),  y(τ) = C x(τ)   (16)

by a reduced system

    x̂˙(τ) = Â x̂(τ) + B̂ u(τ),  ŷ(τ) = Ĉ x̂(τ)   (17)

with Â ∈ R^{k,k}, B̂ ∈ R^{k,m}, Ĉ ∈ R^{q,k} (or, possibly, Â ∈ C^{k,k}, B̂ ∈ C^{k,m}, Ĉ ∈ C^{q,k}), and k < n. In particular, we consider the case where the system order n is large, and m and q are much smaller than n. Furthermore, we assume that A is stable. Several ways exist
to evaluate the approximation error between the original system and the reduced system.
Frequently, the difference between the systems (16) and (17) is measured in the L∞ norm (18).
Here, SB , SC ∈ Cn,k are certain projection matrices, which fulfill the biorthogonality con-
dition
SCH SB = Ik .
Furthermore, both model reduction algorithms rely on low rank approximations to the solutions (Gramians) XB and XC of the continuous time Lyapunov equations

    A XB + XB A^T = −B B^T,   (20)
    A^T XC + XC A = −C^T C.   (21)

This means that we assume that low rank Cholesky factors ZB ∈ C^{n,rB} and ZC ∈ C^{n,rC} with rB, rC << n are available, such that ZB ZB^H ≈ XB and ZC ZC^H ≈ XC. In LYAPACK
these low rank Cholesky factors are computed by the LRCF-ADI iteration; see §3.1. In
§§4.2 and 4.3 we will briefly describe the two model reduction algorithms LRSRM [39, 33]
and DSPMR [32, 39]. In [39] a third method called LRSM (low rank Schur method)
is proposed. However, this less efficient method is not implemented in LYAPACK. The
distinct merit of LRSRM and DSPMR compared to standard model reduction algorithms, such as standard balanced truncation methods [47, 44, 48] or all-optimal Hankel norm approximation [18], is their low numerical cost w.r.t. both memory and computation. On the other hand, unlike some standard methods, the algorithms implemented in LYAPACK generally do not guarantee the stability of the reduced system. If stability is crucial, this
property must be checked numerically after running LRSRM or DSPMR. If the reduced
system is not stable, several measures can be tried. For example, one can simply remove the
unstable modes by modal truncation [11]. Another option is to run LRSRM or DSPMR
again using more accurate low rank Cholesky factors ZB and ZC . Note that for some
problems the error function ‖G(ω) − Ĝ(ω)‖ in ω, which characterizes the frequency response of the difference of both systems, can be evaluated by supplementary LYAPACK routines; see §6.
If the low rank Cholesky factors ZB and ZC delivered by the LRCF-ADI iteration are not real, then the reduced systems are not guaranteed to be real. This problem is discussed in more detail in [33] for the low rank square root method. If the reduced system needs to be real, it is recommended to check a posteriori whether the result of the low rank square root method or dominant subspace projection model reduction is real. It is possible to transform a reduced complex system into a real one by a unitary equivalence transformation; see [33]. A much simpler way, of course, is to use the option rc = ’R’, for which the routine lp lradi delivers real low rank Cholesky factors (at the price of a somewhat increased numerical cost).
INPUT: A, B, C, ZB, ZC, k
OUTPUT: Â, B̂, Ĉ
1. UC Σ UB^H := ZC^H ZB (“thin” SVD with singular values in descending order)
2. SB = ZB UB(:, 1:k) Σ(1:k, 1:k)^{−1/2}, SC = ZC UC(:, 1:k) Σ(1:k, 1:k)^{−1/2}
3. Â = SC^H A SB, B̂ = SC^H B, Ĉ = C SB
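The LRSRM projection can be sketched as follows in Python (for illustration only; the MATLAB routine lp lrsrm additionally selects k from max ord and tol). The final step forms the reduced matrices with the biorthogonal bases SB and SC.

```python
import numpy as np

def lrsrm(A, B, C, ZB, ZC, k):
    """Sketch of the low rank square root method: project (A, B, C)
    onto the dominant directions of the small matrix ZC^H ZB."""
    # Step 1: "thin" SVD of the rC x rB matrix ZC^H ZB.
    UC, sigma, UBh = np.linalg.svd(ZC.conj().T @ ZB, full_matrices=False)
    UB = UBh.conj().T
    # Step 2: biorthogonal projection matrices with SC^H SB = I_k.
    d = 1.0 / np.sqrt(sigma[:k])
    SB = ZB @ UB[:, :k] * d
    SC = ZC @ UC[:, :k] * d
    # Final step: reduced system matrices.
    Ar = SC.conj().T @ A @ SB
    Br = SC.conj().T @ B
    Cr = C @ SB
    return Ar, Br, Cr, SB, SC, sigma
```

Note that only the thin rC × rB product ZC^H ZB enters the SVD, which is the source of the cost reduction discussed below.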
The only difference between the classical square root method and this algorithm is that here (approximate) low rank Cholesky factors ZB and ZC are used instead of exact Cholesky factors of the Gramians, which possibly have full rank. In particular, this considerably reduces the numerical cost of the singular value decomposition in Step 1.
However, there are two basic drawbacks of LRSRM compared to the “exact” square root method. Unlike LRSRM, the latter delivers stable reduced systems under mild conditions. Furthermore, there exists an upper error bound for (18) for the standard square root method, which does not apply to the low rank square root method. Thus, it is not surprising that the performance of Algorithm 3 depends on the accuracy of the approximate low rank Cholesky factor products ZB ZB^H and ZC ZC^H and on the value k, where k ≤ rank ZC^H ZB.
This makes the choice of the quantities rB, rC, and k a trade-off. Large values of rB and rC, and values of k much smaller than rank ZC^H ZB, tend to keep the deviation of the low rank square root method from the standard square root method small. On the other hand, the computational efficiency of the low rank square root method is decreased in this way. However, LRCF-ADI often delivers low rank Cholesky factors ZB and ZC whose products
approximate the system Gramians nearly up to machine precision. In this case the results of the LYAPACK implementation of the low rank square root method will be about as good as those of any standard implementation of the balanced truncation technique, which, however, can still be numerically much more expensive.
Finally, note that the classical square root method is well suited to compute (numerically) minimal realizations; e.g., [48]. LRSRM (as well as DSPMR) can be used to compute such realizations for large systems. The term “numerically minimal realization” is not well-defined. Loosely speaking, it refers to computing a reduced system for which the (relative) approximation error (18) is of the magnitude of the machine precision. See Figure 14 in §C.3.3.
• Maximal reduced order. The input parameter max ord of the routine lp lrsrm prescribes the maximal admissible value for the reduced order k, i.e., k ≤ max ord is required. If the choice of this value should be avoided, one can set max ord = n or max ord = [].
• Maximal ratio σk /σ1. The input parameter tol prescribes the maximal admissible value for the ratio σk /σ1. That means k is chosen as the largest index for which σk /σ1 ≥ tol. Consequently, one will generally choose a value of tol between the machine precision and 1.
In general, both parameters will determine different values of k. The routine lp lrsrm
uses the smaller value.
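The interplay of the two parameters can be sketched as follows. This is a hypothetical helper for illustration; the name and the exact tie-breaking inside lp lrsrm may differ.

```python
import numpy as np

def choose_reduced_order(sigma, max_ord, tol):
    """Choice of the reduced order k (sketch): k is the largest index
    with sigma[k-1]/sigma[0] >= tol, but at most max_ord."""
    ratios = np.asarray(sigma) / sigma[0]
    k_tol = int(np.sum(ratios >= tol))  # largest index with ratio >= tol
    return min(k_tol, max_ord)          # the smaller of the two candidates
```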
Calling sequence:
[Ar ,Br, Cr, SB, SC, sigma] = lp_lrsrm( name, B, C, ZB, ZC, ...
max_ord, tol)
Input parameters:
name: The basis name of the user supplied functions that realize basic matrix operations
with A.
B: System matrix B.
C: System matrix C.
ZB: Low rank Cholesky factor ZB; see §4.1.
ZC: Low rank Cholesky factor ZC; see §4.1.
max ord: A parameter for the choice of the reduced order k; see §4.2.2.
tol: A parameter for the choice of the reduced order k; see §4.2.2.
Output parameters:
Ar: Matrix Â ∈ C^{k,k} of the reduced system.
Br: Matrix B̂ ∈ C^{k,m} of the reduced system.
Cr: Matrix Ĉ ∈ C^{q,k} of the reduced system.
SB: Projection matrix SB .
SC: Projection matrix SC .
sigma: Vector containing the singular values computed in Step 1.
INPUT: A, B, C, ZB, ZC, k
OUTPUT: Â, B̂, Ĉ
1. Z = [ (1/‖ZB‖F) ZB   (1/‖ZC‖F) ZC ]
2. U Σ V^H := Z (“thin” SVD with singular values in descending order), S = U(:, 1:k)
3. Â = S^H A S, B̂ = S^H B, Ĉ = C S
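The DSPMR projection can be sketched in Python as follows (illustrative only; the MATLAB routine lp dspmr additionally selects k from max ord and tol). In contrast to LRSRM, the projection here is orthogonal: the same matrix S is used on both sides.

```python
import numpy as np

def dspmr(A, B, C, ZB, ZC, k):
    """Sketch of dominant subspace projection model reduction:
    orthogonal projection onto dominant directions of [ZB ZC]."""
    # Step 1: concatenate the normalized low rank Cholesky factors.
    Z = np.hstack([ZB / np.linalg.norm(ZB, 'fro'),
                   ZC / np.linalg.norm(ZC, 'fro')])
    # Step 2: dominant left singular vectors span the projection subspace.
    U, sigma, _ = np.linalg.svd(Z, full_matrices=False)
    S = U[:, :k]                      # orthonormal columns: S^H S = I_k
    # Step 3: orthogonal projection of the system matrices.
    return S.conj().T @ A @ S, S.conj().T @ B, C @ S, S
```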
• Maximal reduced order. The input parameter max ord of the routine lp dspmr prescribes the maximal admissible value for the reduced order k, i.e., k ≤ max ord is required. To avoid this choice, one can set max ord = n or max ord = [].
• Maximal ratio σk /σ1. The input parameter tol determines the maximal admissible value for the ratio σk /σ1. More precisely, k is chosen as the largest index for which σk /σ1 ≥ √tol. Note that here the square root of tol is used, in contrast to LRSRM. (Note also that the values σi have somewhat different meanings in LRSRM and DSPMR.)
In general, both parameters will determine different values of k. The routine lp dspmr uses the smaller value. Finally, it should be mentioned that, at least in exact arithmetic, both LRSRM and DSPMR (run with identical values max ord and tol) deliver the same result for state-space symmetric systems (i.e., systems with A = A^T and C = B^T).
Calling sequence:
[Ar ,Br, Cr, S] = lp_dspmr( name, B, C, ZB, ZC, max_ord, tol)
Input parameters:
name: The basis name of the user supplied functions that realize basic matrix operations
with A.
B: System matrix B.
C: System matrix C.
max ord: A parameter for the choice of the reduced order k; see §4.3.2.
tol: A parameter for the choice of the reduced order k; see §4.3.2.
Output parameters:
Ar: Matrix Â ∈ C^{k,k} of the reduced system.
Br: Matrix B̂ ∈ C^{k,m} of the reduced system.
Cr: Matrix Ĉ ∈ C^{q,k} of the reduced system.
S: Projection matrix S.
C T QC + AT X + XA − XBR−1 B T X = 0, (22)
where A ∈ Rn,n , B ∈ Rn,m , and C ∈ Rq,n with m, q << n. Moreover, we assume that
Q ∈ Rq,q is symmetric, positive semidefinite and R ∈ Rm,m is symmetric, positive definite.
Unlike in the other sections of this document, we do not assume here that A is stable, but it is required that a matrix K(0) is given such that A − B K(0)^T is stable. Such a matrix K(0) can be computed by partial pole placement algorithms [21], for example.
In general, the solution of (22) is not unique. However, under the above assumptions, a
unique, stabilizing solution X exists, which is the solution of interest in most applications;
e.g., [34, 29]. A solution X is called stabilizing if the closed-loop matrix A − BR−1 B T X is
stable.
Algebraic Riccati equations arise from numerous problems in control theory, such as
robust control or certain balancing and model reduction techniques for unstable systems.
Another application, for which algorithms are provided by LYAPACK, is the solution
of the linear quadratic optimal control problem. In this paragraph, we briefly describe
the connection between linear quadratic optimal control problems and algebraic Riccati
equations. The linear quadratic optimal control problem is a constrained optimization
problem. The cost functional to be minimized is

    J(u, y, x0) = (1/2) ∫₀^∞ ( y(τ)^T Q y(τ) + u(τ)^T R u(τ) ) dτ,   (23)
where Q = Q^T ≥ 0 and R = R^T > 0. The constraints are given by the dynamical system

    ẋ(τ) = A x(τ) + B u(τ),  x(0) = x0,   (24)
    y(τ) = C x(τ).   (25)

The solution of this optimal control problem can be expressed as the state feedback

    u(τ) = −K^T x(τ),   (26)

where K = X B R^{−1} and X is the stabilizing solution of the algebraic Riccati equation (22).
To sum up, we consider two problems in this section. The first one is the numerical
computation of the stabilizing solution of the continuous time algebraic Riccati equations
(22). The second problem is the solution of the linear quadratic optimal control problem
(23,24,25), which is a particular application of algebraic Riccati equations. Its solution
can be described by the stabilizing solution X, from which the optimal state-feedback can
easily be computed via (26), or by the feedback K itself.
LYAPACK contains implementations of the low rank Cholesky factor Newton method
(LRCF-NM) and the implicit low rank Cholesky factor Newton method (LRCF-NM-I) pro-
posed in [6]. LRCF-NM delivers a LRCF Z, such that the product ZZ H approximates the
Riccati solution X. This means that LRCF-NM can be used to solve both continuous time
algebraic Riccati equations and linear quadratic optimal control problems. The implicit
version LRCF-NM-I, which directly computes an approximation to K without forming Z
or X, can only be used to solve the linear quadratic optimal control problem in a more
memory efficient way.
Both LRCF-NM and LRCF-NM-I are modifications of the classical Newton method
for algebraic Riccati equations [28], or more precisely, combinations of the Newton method
with the LRCF-ADI iteration. We will describe these combinations in §§5.2 and 5.3. The
classical formulation of the Newton method is given by the double step iteration
where the matrices Q̃ ∈ R^{q,h} (h ≤ q) and R̃ ∈ R^{m,m} have full rank. Thus, the Lyapunov equations to be solved in (27) have the structure

    F(k) X(k) + X(k) F(k)^T = −G(k) G(k)^T,

where F(k) = A^T − K(k−1) B^T and G(k) = [ C^T Q̃   K(k−1) R̃ ]. Note that G(k) contains only t = m + h << n columns. Hence, these Lyapunov equations can be solved efficiently by the
LRCF-ADI iteration. The Lyapunov solutions form a sequence of approximate solutions to the algebraic Riccati equation (22). Therefore, the inclusion of Algorithm 1 in the Newton iteration (27) can be utilized to determine low rank Cholesky factor products which approximate the solution of the algebraic Riccati equation (22). The resulting algorithm, the low rank Cholesky factor Newton method, is described below.
INPUT: A, B, C, Q, R, K(0) for which A − B K(0)^T is stable (e.g., K(0) = 0 if A is stable)
OUTPUT: Z = Z(kmax), such that Z Z^H approximates the solution X of the algebraic Riccati equation (22)
FOR k = 1, 2, ..., kmax
1. Determine (sub)optimal ADI shift parameters p1(k), p2(k), ... with respect to the matrix F(k) = A^T − K(k−1) B^T.
2. G(k) = [ C^T Q̃   K(k−1) R̃ ]
3. Compute the matrix Z(k) by Algorithm 1, such that the low rank Cholesky factor product Z(k) Z(k)^H approximates the solution of F(k) X(k) + X(k) F(k)^T = −G(k) G(k)^T.
4. K(k) = Z(k) (Z(k)^H B R^{−1})
END
Similar to the LRCF-ADI iteration for the solution of Lyapunov equations, the distinct
merit of this algorithm is that the (approximate) solution of the algebraic Riccati equations
is provided as a low rank Cholesky factor product rather than an explicit dense matrix. In
particular, this allows the application of the algorithm to problems of large order n, where dense n × n matrices cannot be stored in the computer memory. Moreover, the LRCF-NM often requires much less computation than the standard implementation, where the Lyapunov equations are solved directly by the Bartels-Stewart or the Hammarling method; see §7. See [6] for more technical details of the LRCF-NM.
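For illustration, the Newton iteration of Algorithm 5 can be sketched in Python with a dense Lyapunov solver standing in for the inner LRCF-ADI iteration. This shows only the structure of the method: it forms X explicitly and assumes Q positive definite, and is therefore unsuitable for large n.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def newton_care(A, B, C, Q, R, K0, kmax=15):
    """Newton iteration behind LRCF-NM (dense illustrative version).
    Solves C^T Q C + A^T X + X A - X B R^{-1} B^T X = 0 for the
    stabilizing X, starting from a stabilizing feedback K0."""
    Qt = np.linalg.cholesky(Q)   # Q = Qt Qt^T (assumes Q positive definite)
    Rt = np.linalg.cholesky(R)   # R = Rt Rt^T
    Rinv = np.linalg.inv(R)
    K = K0
    for _ in range(kmax):
        F = A.T - K @ B.T                    # F^(k) = A^T - K^(k-1) B^T
        G = np.hstack([C.T @ Qt, K @ Rt])    # G^(k) = [C^T Qt, K^(k-1) Rt]
        # inner step: solve F X + X F^T = -G G^T (LYAPACK: LRCF-ADI)
        X = solve_continuous_lyapunov(F, -G @ G.T)
        K = X @ B @ Rinv                     # Step 4: feedback update
    return X, K
```

In LRCF-NM proper, Step 3 replaces the dense solve by Algorithm 1 and keeps X in the factored form Z Z^H throughout.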
where

    Ki(k) := Zi(k) Zi(k)^H B R^{−1} = Σ_{j=1}^{i} Vj(k) Vj(k)^H B R^{−1}.   (29)

This means that the (exact) matrix K is the limit of the matrices Ki(k) for k, i → ∞. This consideration motivates the following Algorithm 6, which is best understood as a version of the LRCF-NM with an inner loop (Steps 4 and 5) consisting of interlaced sequences based on Step 3 in Algorithm 1 and the partial sums given by the right-hand term in (29).
INPUT: A, B, C, Q, R, K(0) for which A − B K(0)^T is stable (e.g., K(0) = 0 if A is stable)
OUTPUT: K(kmax), which approximates K given by (26)
FOR k = 1, 2, ..., kmax
1. Determine (sub)optimal ADI shift parameters p1(k), p2(k), ... with respect to the matrix F(k) = A^T − K(k−1) B^T.
2. G(k) = [ C^T Q̃   K(k−1) R̃ ]
3. V1(k) = sqrt(−2 Re p1(k)) (F(k) + p1(k) In)^{−1} G(k), K1(k) = V1(k) V1(k)^H B R^{−1}
FOR i = 2, 3, ..., imax(k)
4. Vi(k) = sqrt(Re pi(k) / Re pi−1(k)) ( Vi−1(k) − (pi(k) + p̄i−1(k)) (F(k) + pi(k) In)^{−1} Vi−1(k) )
5. Ki(k) = Ki−1(k) + Vi(k) Vi(k)^H B R^{−1}
END
6. K(k) = Ki(k) with i = imax(k)
END
• stagnation of the normalized residual norm (most likely caused by round-off errors):
used in LRCF-NM only;
• smallness of the relative change of the feedback matrix (RCF): used in LRCF-NM and LRCF-NM-I;
• stagnation of the relative change of the feedback matrix: used in LRCF-NM and
LRCF-NM-I.
Here, the normalized residual norm corresponding to the low rank Cholesky factor Z(k) is defined as

    NRN(Z(k)) = ‖C^T Q C + A^T Z(k)Z(k)^H + Z(k)Z(k)^H A − Z(k)Z(k)^H B R^{−1} B^T Z(k)Z(k)^H‖F / ‖C^T Q C‖F,   (30)
whereas the relative change of the feedback matrix related to the matrices K(k−1) and K(k) is

    RCF(K(k−1), K(k)) = ‖K(k) − K(k−1)‖F / ‖K(k)‖F.   (31)
Many of the remarks on stopping criteria for the LRCF-ADI iteration made in §3.1.2 also apply to stopping criteria for LRCF-NM or LRCF-NM-I. In particular, the application of stopping criteria that require the computation of normalized residual norms is numerically expensive. Although the applied computational method [6] exploits the low rank structure of the approximate solutions, it can be more expensive than the iteration itself. Moreover, it is not possible to use residual based stopping criteria for LRCF-NM-I, because there the low rank Cholesky factors Z(k) are not formed at all, which is the only reason why one would apply LRCF-NM-I instead of LRCF-NM.
The consideration of (31) for the construction of heuristic stopping criteria is related to the fact that, in some sense, the matrices K(k) rather than the low rank Cholesky factors Z(k) or their products are the quantities of interest when the optimal control problem is to be solved. However, stopping criteria related to K(k) are somewhat dubious when
the optimal feedback K or, more precisely, the product BK T is very small compared to A,
because then small relative changes in K hardly change the closed-loop matrix A − BK T .
On the other hand, the accuracy of K does not play a crucial role in such situations, which
means that a possibly premature termination of the Newton iteration would not be very
harmful.
We will now discuss the five stopping criteria. Convergence plots generated for an
example problem illustrate their effects. Note that, similar to the criteria for the LRCF-
ADI iteration described in §3.1.2, the following stopping criteria can be “activated” or
“avoided”.
[Figure 7 plot: normalized residual norm (10^0 down to 10^−15) versus iteration steps (0 to 20).]
Figure 7: Stopping criterion: maximal number of iteration steps. Solid line: max it r = 5;
dash-dotted line: max it r = 10; dotted line: max it r = 20. The other four criteria are
avoided.
• Stopping criterion: tolerance for the normalized residual norm. This criterion is represented by the input parameter min res r in the routine lp lrnm.
This criterion can be avoided by setting min res r = 0. (Because of round-off errors
it is practically impossible to attain NRN(Z (k) ) = 0.) It requires the computation of
normalized residual norms and is computationally expensive. A further drawback of
this criterion is that it will either stop the iteration before the maximal accuracy is
attained (see min res r = 10−5 , 10−10 in Figure 8) or it will not stop the iteration
at all (see min res r = 10−15 in Figure 8). If you want to avoid this criterion, but
compute the convergence history provided by the output vector res r, set min res r
to a value much smaller than the machine precision (say, min res r = 10−100 ).
[Figure 8 plot: normalized residual norm (10^0 down to 10^−15) versus iteration steps (0 to 20).]
Figure 8: Stopping criterion: tolerance for the normalized residual norm. Solid line:
min res r = 10−5 ; dash-dotted line: min res r = 10−10 ; dotted line: min res r = 10−15 .
Here, the dash-dotted and the solid line are identical. The other four criteria are avoided.
[Figure 9 plot: residual norm versus iteration steps (0 to 20).]
Figure 9: Stopping criterion: Stagnation of the normalized residual norms. Solid line:
with rs r = ’S’; dotted line: with rs r = ’N’. The other four criteria are avoided.
Calling sequences:
Depending on the choice of the mode parameter zk, the following two calling sequences exist. For zk = ’Z’, the low rank Cholesky factor Z is computed by LRCF-NM, whereas for zk = ’K’ the matrix K(kmax) is computed directly by LRCF-NM-I.
[Figure 10 plot: residual norm versus iteration steps (0 to 20).]
Figure 10: Stopping criterion: smallness of the relative change of the feedback matrix.
Solid line: min ck r = 10−4 ; dash-dotted line: min ck r = 10−8 ; dotted line: min ck r
= 10−16 . The other four criteria are avoided. Note that it is mere coincidence, that
min ck r = 10−8 (dash-dotted line) leads to the termination after the “optimal” number
of steps.
• zk = ’Z’:
• zk = ’K’:
Input parameters:
zk: Mode parameter, which is either ’Z’ or ’K’. If zk = ’Z’, the low rank Cholesky factor Z = Z(kmax) is computed by LRCF-NM. Otherwise, K(kmax) is computed directly by LRCF-NM-I.
rc: Mode parameter, which is either ’R’ or ’C’. If rc = ’C’, the routine delivers a low
rank Cholesky factor, which is not real when non-real shift parameters are used in
the last Newton step. Otherwise, this possibly complex low rank Cholesky factor
is transformed into a real low rank Cholesky factor Z̃, which describes the identi-
cal approximate solution Z̃ Z̃ T . Z̃ is returned instead of Z. The parameter rc is
not needed in the mode for LRCF-NM-I, because the returned feedback (parameter
K out) is always real, provided that K (0) (parameter K in) is real.
name: The basis name of the user supplied functions that realize basic matrix operations
with A.
[Figure 11 plot: residual norm versus iteration steps (0 to 20).]
Figure 11: Stopping criterion: stagnation of the relative change of the feedback matrix. Solid line: with ks r = ’L’; dotted line: with ks r = ’N’. The other four criteria are avoided. For this particular example, no stagnation in the relative change of the feedback matrix is observed within 20 iteration steps.
B: System matrix B.
C: System matrix C.
K in: The stabilizing initial state feedback K (0) . If A is stable, K (0) = 0 can be used, for
example.
min res r: Stopping parameter for (outer) Newton iteration. See §5.4.
info r: Parameter, which determines the “amount” of information on the (outer) Newton
iteration that is provided as text and/or residual history plot. The following values
are possible: info r = 0 (no information), 1, 2, and 3 (most possible information).
l0: Parameter l0 for the ADI parameter routine lp para, which is invoked in each Newton
step. Note that k+ + k− > 2l0 is required.
max it l: Stopping parameter for the (inner) LRCF-ADI iterations. See §3.1.2.
min res l: Stopping parameter for the (inner) LRCF-ADI iterations. See §3.1.2.
with rs l: Stopping parameter for the (inner) LRCF-ADI iterations. See §3.1.2.
min in l: Stopping parameter for the (inner) LRCF-ADI iterations. See §3.1.2.
info l: Parameter, which determines the “amount” of information on the (inner) LRCF-
ADI iterations that is provided as text and/or residual history plot. The following
values are possible: info l = 0 (no information), 1, 2, and 3 (most possible infor-
mation).
Output parameters:
Z: The low rank Cholesky factor Z, which is the result of LRCF-NM. It can be complex
under certain circumstances.
flag r: A flag that shows by which stopping criterion (or stopping parameter) the (outer) Newton iteration has been stopped. Possible values are ’I’ (for max it r), ’R’ (for min res r), ’S’ (for with rs r), ’K’ (for min ck r), and ’L’ (for with ks r).
res r: A vector containing the history of the algebraic Riccati equation normalized residual norms (30). res r(1) = 1 and res r(i + 1) is the normalized residual norm w.r.t. Newton step i. If the stopping criteria are chosen so that the normalized residual norms need not be computed, res r = [] is returned.
flp r: A vector containing the history of the flops needed for the algorithm. flp r(1) = 0 and flp r(i + 1) is the number of flops required for Newton steps 1 to i. flp r displays the number of flops required for the actual iteration. It also contains the numerical costs for all user supplied functions invoked within lp lrnm as well as the computation of the sets of ADI shift parameters. However, the costs for the computation of Riccati equation or Lyapunov equation normalized residual norms are not included.
flag l: Vector containing the values flag returned by the LRCF-ADI routine lp lradi,
which is called in each Newton step.
its l: Vector containing the number of iteration steps of the (inner) LRCF-ADI itera-
tions.
res l: Matrix whose columns contain the normalized residual norm history vectors res
returned by the LRCF-ADI routine lp lradi. Here, normalized residual norms in
the sense of (13) are considered.
flp l: Matrix whose columns contain the flop history vectors flp returned by the LRCF-
ADI routine lp lradi in each Newton step.
    ‖F ZZ^H + ZZ^H F^T + G G^T‖F   (32)

    ‖C^T Q C + A^T ZZ^H + ZZ^H A − ZZ^H B R^{−1} B^T ZZ^H‖F.   (33)
The following two LYAPACK routines can be used to compute such norms.
lp nrm: Computes the Lyapunov equation residual norm (32) by the technique described
in [39].
lp rcnrm: Computes the Riccati equation residual norm (33) by the technique described
in [6].
Note that these routines do not evaluate the residual matrices, i.e., the terms inside the norms. They rather make use of the low rank structure of ZZ^H, which is often much more efficient w.r.t. both memory and computation. However, neither routine is efficient if the number of columns in Z approaches or even exceeds n.
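The idea behind these routines can be sketched for the Lyapunov residual (32) as follows. This illustrative Python version (not the LYAPACK code) writes the residual as T M T^H with the thin matrix T = [F Z, Z, G], so that only a QR factorization of T and the norm of a small dense matrix are needed.

```python
import numpy as np

def lyap_residual_norm(F, G, Z):
    """Frobenius norm of F ZZ^H + ZZ^H F^T + G G^T without forming
    any n x n matrix: the residual equals T M T^H for T = [F Z, Z, G]."""
    r = Z.shape[1]
    m = G.shape[1]
    T = np.hstack([F @ Z, Z, G])            # n x (2r + m)
    # middle matrix M: couples FZ with Z and keeps the G G^H term
    M = np.zeros((2 * r + m, 2 * r + m))
    M[:r, r:2 * r] = np.eye(r)              # contributes (F Z) Z^H
    M[r:2 * r, :r] = np.eye(r)              # contributes Z (F Z)^H
    M[2 * r:, 2 * r:] = np.eye(m)           # contributes G G^H
    Q, Rfac = np.linalg.qr(T, mode='reduced')
    # ||T M T^H||_F = ||Rfac M Rfac^H||_F, since Q has orthonormal columns
    return np.linalg.norm(Rfac @ M @ Rfac.conj().T, 'fro')
```

The cost is governed by the QR factorization of the thin n × (2r + m) matrix T, not by n².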
lp trfia: Generates the matrices G(ωi ) (i = 1, . . . , imax ). Their stacked columns are
stored in an mq × imax “transfer function sample” matrix Gs .
lp gnorm: Computes ‖G(ωi)‖ (i = 1, ..., imax), where the matrices G(ωi) are retrieved from the matrix Gs generated by lp trfia.
Unlike the other LYAPACK routines, which have access to the system matrix A,
lp trfia does not make use of user supplied functions. On the other hand, this rou-
tine can be applied to the more general form of a dynamical system (which is slightly more
general than (9))
    E ẋ(τ) = A x(τ) + B u(τ),
    y(τ) = C x(τ) + D u(τ)   (34)
to generate its transfer function G(s) = C (sE − A)^{−1} B + D on the imaginary axis. However, it is required that all matrices are given explicitly. A and E should preferably be sparse.
Typically, lp trfia and lp gnorm will be used subsequently. It is important that the same set of frequency sampling points ωi is used in both routines. If the mq Bode magnitude plots of a system with multiple inputs or multiple outputs are to be generated, then lp gnorm must be applied mq times to the individual rows of the matrix Gs generated by lp trfia. The approximation error function ‖G(ω) − Ĝ(ω)‖ can be evaluated easily. First, lp trfia is applied to both the original and the reduced system, which results in the transfer function samples Gs and Ĝs. Then, lp gnorm is applied to the difference Gs − Ĝs, which delivers the desired result.
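A minimal Python sketch of this lp trfia / lp gnorm workflow for the system (34) follows (illustrative only; the LYAPACK routines store the stacked samples in the matrix Gs instead of returning the norms directly):

```python
import numpy as np

def transfer_norms(A, B, C, D, E, omegas):
    """Sample G(s) = C (sE - A)^{-1} B + D on the imaginary axis and
    return the 2-norm ||G(i w)|| for each frequency w in omegas."""
    norms = []
    for w in omegas:
        Gw = C @ np.linalg.solve(1j * w * E - A, B) + D
        norms.append(np.linalg.norm(Gw, 2))  # largest singular value
    return np.array(norms)
```

Applying the same routine to the difference of two sampled transfer functions yields the pointwise error ‖G(ω) − Ĝ(ω)‖ described above.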
fdm_2d_matrix: Generates the negative stiffness matrix for a 2D parabolic differential
equation semidiscretized by the finite difference method (FDM). This stiffness
matrix can be used as the system matrix A.

fdm_2d_vector: Generates the corresponding load vectors, which can be used as the
system matrices B and C^T.
The matrices of a generalized system (9), which arises from the semidiscretization of a steel
rail cooling problem (see, e.g., [39]) by the finite element method (FEM), are provided in
two MATLAB data files.
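A stripped-down Python analog of fdm_2d_matrix, restricted to the pure diffusion case (fdm_2d_matrix itself also accepts convection and reaction terms), may clarify what these test-example generators produce:

```python
from scipy.sparse import diags, identity, kron

def fdm_2d_laplacian(n0):
    """Negative stiffness matrix of the heat equation on the unit square
    with homogeneous Dirichlet boundary conditions, discretized by
    central finite differences on an n0 x n0 interior grid. The result
    is a sparse, symmetric, negative definite matrix of order n = n0^2,
    usable as a system matrix A."""
    h = 1.0 / (n0 + 1)
    T = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n0, n0)) / h**2
    I = identity(n0)
    return kron(I, T) + kron(T, I)
```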
7 Alternative methods
Under certain conditions LYAPACK works very well for the types of problems described
in §1.1. However, we are far from claiming that the methods implemented in this package
are the ultimate solution techniques for the respective problems. In this section, we give
a brief and by no means complete survey of alternative methods. In many cases, no
comparative studies of these methods have been done; LYAPACK is one step in this
direction.
• Model reduction. Model reduction methods for small, possibly dense systems
are abundant. The perhaps most popular techniques for reducing stable systems
are balanced truncation [35] and optimal Hankel norm approximation [18]. Numeri-
cally elaborate implementations of the balanced truncation technique are proposed
in [47, 44, 48]. Algorithms for solving large dense model reduction problems on par-
allel computers can be found in [9]. The majority of model reduction methods for
large sparse problems are related to Padé approximations of the underlying transfer
function, e.g., [41, 15, 12, 16]. A quite detailed survey of this topic can be found
in [13]. Methods that are (directly) based on Krylov subspace techniques have been
proposed in [25, 23, 26]. The algorithms implemented in LYAPACK are described
at length in [32, 39, 33].
• Riccati equations and optimal control problems. In LYAPACK, only the solu-
tion of large optimal control problems via Riccati equations is considered [6].
However, "Riccati equation-free" solution techniques for optimal control problems
surely exist. Standard techniques for small, possibly dense Riccati equations are the
Schur method [30], the (standard) Newton method and its modifications [28, 34, 29, 4],
and the sign function method, e.g., [42, 10, 17, 27].
Numerically reliable and versatile codes for dense problems of moderate size can be
found in the freeware subroutine library SLICOT (Subroutine Library in Control Theory)
[7].
lp_lrnm: Both versions of the Newton method (LRCF-NM and LRCF-NM-I) for solving
Riccati equations and optimal control problems.
The following routines and data files are used for generating test examples.
rail821.mat: Data file for steel rail cooling problem (order n = 821).
rail3113.mat: Data file for steel rail cooling problem (order n = 3113).
lp_nrmu: Efficient computation of the Lyapunov equation residual norm based on updated
QR factorizations.

lp_prm: Bandwidth reduction by reordering the rows and columns of a matrix or a matrix
pair.
• [basis name]:
as: Standard system (1); sparse matrix A is symmetric and shift parameters are
real; (shifted) linear systems are solved directly.
au: Standard system (1); sparse matrix A is (possibly) unsymmetric or shift pa-
rameters are not necessarily real; (shifted) linear systems are solved directly.
au_qmr_ilu: Standard system (1); sparse matrix A is (possibly) unsymmetric or
shift parameters are not necessarily real; (shifted) linear systems are solved
iteratively by QMR with ILU preconditioning.
msns: Generalized system (9); sparse (definite) matrices M and N are symmetric
and shift parameters are real; (shifted) linear systems are solved directly.
munu: Generalized system (9); sparse matrices M and N are (possibly) unsymmetric
or shift parameters are not necessarily real; (shifted) linear systems are solved
directly.
• [extension(s)]:
demo_u1, demo_u2, demo_u3: Demo programs for user-supplied functions.

demo_r1: Demo program for Riccati equations and optimal control problems.
C Case studies
In this section we provide listings of the demo programs included in LYAPACK.
In these programs, matrices that correspond to transformed (preprocessed) problems
usually carry a zero subscript (e.g., A0) to distinguish them from data related to the
original problem (e.g., A).
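The pre-/postprocessing convention can be illustrated with a symmetric reordering in Python. Reverse Cuthill-McKee here stands in for the bandwidth-reducing permutation computed by lp_prm; in LYAPACK, au_pre applies it and au_pst undoes it. All names below are illustrative:

```python
import numpy as np
from scipy.sparse import identity, random as sprandom
from scipy.sparse.csgraph import reverse_cuthill_mckee

# Build a sparse test matrix with a symmetric nonzero pattern.
A = (sprandom(50, 50, density=0.05, random_state=0) + 10 * identity(50)).tocsr()
A = A + A.T
prm = reverse_cuthill_mckee(A, symmetric_mode=True)  # bandwidth-reducing order
iprm = np.argsort(prm)                               # inverse permutation
A0 = A[prm][:, prm]                                  # preprocessed matrix "A0"
# Any factor computed in the permuted coordinates must have its rows
# re-reordered afterwards, which is what the postprocessing step does:
Z0 = np.arange(50.0).reshape(-1, 1)                  # stands for a computed LRCF
Z = Z0[iprm]                                         # back to original ordering
```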
% -----------------------------------------------------------------------
% Generate test problem
% -----------------------------------------------------------------------
%
% As test example we use a simple FDM-semidiscretized PDE problem
% (an instationary convection-diffusion heat equation on the unit square
% with homogeneous 1st kind boundary conditions).
% We reorder the columns and rows of the resulting stiffness matrix by
% a random permutation to generate a "bad" nonzero pattern.
disp('Problem dimensions:')
% -----------------------------------------------------------------------
% Preprocessing
% -----------------------------------------------------------------------
[A0,dummy,dummy,prm,iprm] = au_pre(A,[],[]);
C.1 Demo programs for user-supplied functions
% -----------------------------------------------------------------------
% Multiplication of matrix A0 with X0
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Multiplication of (transposed) matrix A0’ with X0
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Solution of system of linear equations with A0
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Solution of (transposed) system of linear equations with A0’
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Solve shifted systems of linear equations, i.e.
% solve (A0+p(i)*I)*Y0 = X0.
% -----------------------------------------------------------------------
disp('Shift parameters:')
p = [ -1; -2+3*j; -2-3*j ]
Y0 = au_s('N',X0,1);
test_5 = norm(A0*Y0+p(1)*Y0-X0,'fro')
Y0 = au_s('N',X0,2);
test_6 = norm(A0*Y0+p(2)*Y0-X0,'fro')
Y0 = au_s('N',X0,3);
test_7 = norm(A0*Y0+p(3)*Y0-X0,'fro')
% -----------------------------------------------------------------------
% Solve (transposed) shifted systems of linear equations, i.e.
% solve (A0’+p(i)*I)*Y0 = X0.
% -----------------------------------------------------------------------
Y0 = au_s('T',X0,1);
test_8 = norm(A0'*Y0+p(1)*Y0-X0,'fro')
Y0 = au_s('T',X0,2);
test_9 = norm(A0'*Y0+p(2)*Y0-X0,'fro')
Y0 = au_s('T',X0,3);
test_10 = norm(A0'*Y0+p(3)*Y0-X0,'fro')
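In Python, the shifted solves exercised above correspond to factoring A0 + p(i) I once per shift and reusing the factorization for all right-hand sides. The sketch below (an illustrative stand-in, not the LYAPACK code; the test matrix and sizes are arbitrary) also repeats the residual check:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n = 100
A0 = sp.diags([np.ones(n - 1), -2.0 * np.ones(n), np.ones(n - 1)],
              [-1, 0, 1], format='csc')             # a sparse test matrix
p = [-1.0, -2.0 + 3.0j, -2.0 - 3.0j]                # shift parameters
lus = [splu((A0 + pi * sp.identity(n)).tocsc()) for pi in p]
X0 = np.random.default_rng(0).standard_normal((n, 2))
Y0 = lus[1].solve(X0.astype(complex))               # (A0 + p(2)*I)*Y0 = X0
res = np.linalg.norm(A0 @ Y0 + p[1] * Y0 - X0, 'fro')
```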
% -----------------------------------------------------------------------
% Postprocessing
% -----------------------------------------------------------------------
%
% There is no postprocessing.
% -----------------------------------------------------------------------
% Destroy global data structures (clear "hidden" global variables)
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Generate test problem
% -----------------------------------------------------------------------
%
% As test example, we use a simple FDM-semidiscretized PDE problem
% (an instationary convection-diffusion heat equation on the unit square
% with homogeneous 1st kind boundary conditions).
% We reorder the columns and rows of the resulting stiffness matrix by
% a random permutation to generate a "bad" nonzero pattern.
disp('Problem dimensions:')
% -----------------------------------------------------------------------
% Preprocessing
% -----------------------------------------------------------------------
%
% There is no preprocessing.
% -----------------------------------------------------------------------
% Multiplication of matrix A with X
% -----------------------------------------------------------------------
disp('NOTE: The USFs will return a warning message, when they fail to')
disp('      fulfill the stopping criteria for the ILU-QMR iteration.')
disp('      Also, the attained accuracy is displayed, which allows the')
disp('      user to judge, whether the results are still acceptable or')
disp('      not.')
pause(5)
au_qmr_ilu_m_i(A,mc,max_it_qmr,tol_qmr,tol_ilu,info_qmr);
% initialization and generation of data needed for matrix
% multiplications with A
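Internally, the au_qmr_ilu_* functions combine an incomplete LU factorization with QMR [14]. The SciPy-based sketch below shows the same combination (the test matrix, tolerances, and names are ours, not the LYAPACK code; note that QMR also needs transposed applications of the operator and the preconditioner):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, qmr, spilu

n = 400
off = np.ones(n - 1)
A = sp.diags([off, -4.0 * np.ones(n), off], [-1, 0, 1], format='csc')
b = np.ones(n)
ilu = spilu(A, drop_tol=1e-3)                  # incomplete LU preconditioner
M1 = LinearOperator((n, n), matvec=ilu.solve,
                    rmatvec=lambda v: ilu.solve(v, 'T'), dtype=np.float64)
M2 = LinearOperator((n, n), matvec=lambda v: v,
                    rmatvec=lambda v: v, dtype=np.float64)  # no right prec.
x, info = qmr(A, b, M1=M1, M2=M2, maxiter=200)  # info == 0 means convergence
res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
```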
% -----------------------------------------------------------------------
% Multiplication of (transposed) matrix A’ with X
% -----------------------------------------------------------------------
test_2 = norm(Y-T,'fro')
% -----------------------------------------------------------------------
% Solution of system of linear equations with A
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Solution of (transposed) system of linear equations with A’
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Solve shifted systems of linear equations, i.e.
% solve (A+p(i)*I)*Y = X.
% -----------------------------------------------------------------------
disp('Shift parameters:')
p = [ -1; -2+3*j; -2-3*j ]
Y = au_qmr_ilu_s('N',X,1);
test_5 = norm(A*Y+p(1)*Y-X,'fro')
Y = au_qmr_ilu_s('N',X,2);
test_6 = norm(A*Y+p(2)*Y-X,'fro')
Y = au_qmr_ilu_s('N',X,3);
test_7 = norm(A*Y+p(3)*Y-X,'fro')
% -----------------------------------------------------------------------
% Solve (transposed) shifted systems of linear equations, i.e.
% solve (A'+p(i)*I)*Y = X.
% -----------------------------------------------------------------------
Y = au_qmr_ilu_s('T',X,1);
test_8 = norm(A'*Y+p(1)*Y-X,'fro')
Y = au_qmr_ilu_s('T',X,2);
test_9 = norm(A'*Y+p(2)*Y-X,'fro')
Y = au_qmr_ilu_s('T',X,3);
test_10 = norm(A'*Y+p(3)*Y-X,'fro')
% -----------------------------------------------------------------------
% Postprocessing
% -----------------------------------------------------------------------
%
% There is no postprocessing.
% -----------------------------------------------------------------------
% Destroy global data structures (clear "hidden" global variables)
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Generate test problem
% -----------------------------------------------------------------------
%
% As test example, we use an FEM-semidiscretized problem, which leads to
% a generalized system where M (the mass matrix) and N (the negative
% stiffness matrix) are sparse, symmetric, and definite.
disp('Problem dimensions:')
% -----------------------------------------------------------------------
% Preprocessing
% -----------------------------------------------------------------------
[M0,ML,MU,N0,dummy,dummy,prm,iprm] = munu_pre(M,N,[],[]);
% -----------------------------------------------------------------------
% Multiplication of matrix A0 with X0
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Solution of system of linear equations with A0
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Solve shifted systems of linear equations, i.e.
% solve (A0+p(i)*I)*Y0 = X0.
% -----------------------------------------------------------------------
disp('Shift parameters:')
p = [ -1; -2+3*j; -2-3*j ]
Y0 = munu_s('N',X0,1);
test_3 = norm(ML\(N0*(MU\Y0))+p(1)*Y0-X0,'fro')
Y0 = munu_s('N',X0,2);
test_4 = norm(ML\(N0*(MU\Y0))+p(2)*Y0-X0,'fro')
Y0 = munu_s('N',X0,3);
test_5 = norm(ML\(N0*(MU\Y0))+p(3)*Y0-X0,'fro')
% -----------------------------------------------------------------------
% Postprocessing
% -----------------------------------------------------------------------
%
% There is no postprocessing.
% -----------------------------------------------------------------------
% Destroy global data structures (clear "hidden" global variables)
% -----------------------------------------------------------------------
C.2 Demo program for LRCF-ADI iteration and algorithm for computing ADI parameters

C.2.1 Demo program demo_l1
%
% SOLUTION OF LYAPUNOV EQUATION BY THE LRCF-ADI METHOD (AND GENERATION
% OF ADI PARAMETERS)
%
% This demo program shows how the routines 'lp_para' (computation of
% ADI shift parameters) and 'lp_lradi' (LRCF-ADI iteration for the
% solution of the Lyapunov equation F*X+X*F'=-G*G') work. Also, the
% use of user-supplied functions is demonstrated.
% -----------------------------------------------------------------------
% Generate test problem
% -----------------------------------------------------------------------
%
% As test example, we use a simple FDM-semidiscretized PDE problem
% (an instationary convection-diffusion heat equation on the unit square
% with homogeneous 1st kind boundary conditions).
F = fdm_2d_matrix(n0,'10*x','100*y','0');
G = fdm_2d_vector(n0,'.1<x<=.3');
disp('Problem dimensions:')
% -----------------------------------------------------------------------
% Initialization/generation of data structures used in user-supplied
% functions and computation of ADI shift parameters
% -----------------------------------------------------------------------
%
% Note that the routines 'au_m_i', 'au_l_i', and 'au_s_i' create global
% variables, which contain the data that is needed for the efficient
% realization of basic matrix operations with F (multiplications,
% solution of systems of linear equations, solution of shifted systems
% of linear equations).
f = flops;
[F0,G0,dummy,prm,iprm] = au_pre(F,G,[]); % preprocessing (reordering
% for bandwidth reduction)
% Note the dummy parameter,
% which will be set to [] on
% exit.
% -----------------------------------------------------------------------
% Solution of Lyapunov equation F*X+X*F' = -G*G' (or, more precisely,
% the transformed equation F0*X0+X0*F0' = -G0*G0') by LRCF-ADI iteration
% -----------------------------------------------------------------------
%
% The approximate solution is given by the low rank Cholesky factor Z0,
% i.e., Z0*Z0' is approximately X0
%
% The stopping criteria are chosen such that the iteration is stopped
% shortly after the residual curve stagnates. This requires the sometimes
% expensive computation of the residual norms. (If you want to avoid
% this, you might choose max_it = 500 (large value), min_res = 0
% ("avoided"), with_rs = 'N' ("avoided"), min_in = 1e-12 ("activated").)
[Z0,flag,res,flp] = lp_lradi(tp,zk,rc,name,Bf,Kf,G0,p,max_it,min_res,...
with_rs,min_in,info);
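For orientation, the recursion performed by lp_lradi can be written down compactly. The dense Python sketch below covers real negative shifts only (lp_lradi itself works matrix-free through the user-supplied functions and also handles complex shift pairs):

```python
import numpy as np

def lrcf_adi(F, G, shifts):
    """LRCF-ADI iteration for F*X + X*F' = -G*G' with real negative
    shifts p_1, p_2, ...; the approximate solution is X ~= Z*Z'."""
    I = np.eye(F.shape[0])
    p = shifts[0]
    V = np.sqrt(-2.0 * p) * np.linalg.solve(F + p * I, G)
    Z = V
    for q in shifts[1:]:              # in practice the shifts are cycled
        V = np.sqrt(q / p) * (V - (p + q) * np.linalg.solve(F + q * I, V))
        Z = np.hstack([Z, V])
        p = q
    return Z
```

Each step costs one (shifted) linear solve with a block of right-hand sides, which is why the basic matrix operations above are all LYAPACK needs from the user.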
% -----------------------------------------------------------------------
% Postprocessing, destroy global data structures
% -----------------------------------------------------------------------
%
% NOTE: The matrices F and G have been reordered in the preprocessing
% step ('au_pre'), resulting in F0 and G0. This means that the rows of
% the matrix Z0 must be re-reordered in a postprocessing step to obtain
% the solution of the original Lyapunov equation!
Z = au_pst(Z0,iprm);
disp('Size of Z:');
size_Z = size(Z)
disp('Is Z real ( 0 = no, 1 = yes )?')
is_real = ~any(any(imag(Z)))
% -----------------------------------------------------------------------
% Verify the result
% -----------------------------------------------------------------------
%
% Note that this is only an "illustrative" way of verifying the accuracy
% by computing the (normalized) residual norm. A more practical (because
% less expensive) way is to evaluate the residual norm by means of the
% routine 'lp_nrmu' (which must be applied before postprocessing!), if the
% residual norms have not been generated during the iteration.
Figure 12: Residual history for the LRCF-ADI iteration in demo_l1. (Plot not
reproduced.)

C.3 Demo programs for model reduction algorithms

% -----------------------------------------------------------------------
% Generate test problem
% -----------------------------------------------------------------------
%
% This is an artificial test problem of a system whose Bode plot shows
% "spires".
disp('Problem dimensions:')
% -----------------------------------------------------------------------
% Initialization/generation of data structures used in user-supplied
% functions and computation of ADI shift parameters
% -----------------------------------------------------------------------
%
% See 'demo_u1', 'demo_u2', 'demo_u3', and 'demo_l1' for more detailed
% comments.
%
% Note that A is a tridiagonal matrix. No preprocessing needs to be done.
name = 'au';
au_m_i(A); % initialization for multiplication with A
au_l_i; % initialization for solving systems with A
% -----------------------------------------------------------------------
% Solution of Lyapunov equations A*X+X*A' = -B*B' and
% A'*X+X*A = -C'*C
% -----------------------------------------------------------------------
zk = 'Z';
rc = 'C';
Bf = [];
Kf = [];
info = 3;
[ZB,flag_B] = lp_lradi(tp,zk,rc,name,Bf,Kf,B,p,max_it,min_res,...
with_rs,min_in,info);
% compute ZB
[ZC,flag_C] = lp_lradi(tp,zk,rc,name,Bf,Kf,C,p,max_it,min_res,...
with_rs,min_in,info);
% compute ZC
% -----------------------------------------------------------------------
% Plot the transfer function of the system for a certain frequency range
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Generate reduced systems
% -----------------------------------------------------------------------
disp('Reduced order:')
disp(length(Ars))
disp('Reduced order:')
disp(length(Ard))
% -----------------------------------------------------------------------
% Destroy global data structures
% -----------------------------------------------------------------------
au_m_d;
au_l_d;
au_s_d(p);
In the demo program demo_m1 we use very inaccurate Gramians: the normalized residual
norms are only ≈ 7.8 · 10^-2. The reduced order of the systems delivered by LRSRM and
DSPMR is as low as 10. The result of the demo program is shown in Figure 13, where
simultaneous Bode magnitude plots of the original system and both reduced systems
are displayed.
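The step from the two low rank Cholesky factors to the reduced system in LRSRM amounts to an SVD of ZC' ZB followed by an oblique projection. A dense Python stand-in for lp_lrsrm (the truncation threshold is ours):

```python
import numpy as np

def lrsrm(A, B, C, ZB, ZC, k):
    """Low rank square root method: reduce (A, B, C) to order at most k
    using low rank Gramian factors ZB (controllability) and ZC
    (observability), i.e., XB ~= ZB*ZB' and XC ~= ZC*ZC'."""
    U, s, Vt = np.linalg.svd(ZC.T @ ZB)
    k = min(k, int(np.sum(s > 1e-14)))           # drop numerically zero values
    SB = ZB @ Vt[:k].T / np.sqrt(s[:k])          # right projection basis
    SC = ZC @ U[:, :k] / np.sqrt(s[:k])          # left projection basis
    return SC.T @ A @ SB, SC.T @ B, C @ SB       # reduced (Ar, Br, Cr)
```

With exact full-rank Gramian factors and k equal to the system order, this projection is a state-space transformation, so the reduced transfer function coincides with the original one; the interesting regime is k much smaller than n.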
Figure 13: Simultaneous Bode magnitude plots of the original system (dotted) and of
the reduced systems generated by LRSRM and DSPMR in the demo program demo_m1.
The Bode plots of the two reduced systems are almost identical and are shown as a
solid line. (Plot not reproduced.)
%
% MODEL REDUCTION BY THE ALGORITHMS LRSRM AND DSPMR. THE GOAL IS TO
% GENERATE A "NUMERICALLY MINIMAL REALIZATION" OF THE GIVEN SYSTEM
% AS WELL AS A REDUCED SYSTEM OF RELATIVELY SMALL ORDER.
%
% This demo program shows how the model reduction routines 'lp_lrsrm'
% and 'lp_dspmr' work. Also, the use of 'lp_lradi', supplementary
% routines, and user-supplied functions is demonstrated.
% -----------------------------------------------------------------------
% Generate test problem
% -----------------------------------------------------------------------
%
% As test example, we use an FEM-semidiscretized problem, which leads to
% a generalized system where M (the mass matrix) and N (the negative
% stiffness matrix) are sparse, symmetric, and definite.
disp('Problem dimensions:')
% -----------------------------------------------------------------------
name = 'msns';
[M0,MU,N0,B0,C0,prm,iprm] = msns_pre(M,N,Btilde,Ctilde); % preprocessing
msns_m_i(M0,MU,N0); % initialization for multiplication with A0
msns_l_i; % initialization for solving systems with A0
% -----------------------------------------------------------------------
% Solution of Lyapunov equations A0*XB0+XB0*A0' = -B0*B0' and
% A0'*XC0+XC0*A0 = -C0'*C0
% -----------------------------------------------------------------------
zk = 'Z';
rc = 'C';
Bf = [];
Kf = [];
info = 3;
% -----------------------------------------------------------------------
% Plot the transfer function of the system for a certain frequency range
% -----------------------------------------------------------------------
% -----------------------------------------------------------------------
% Generate reduced systems of high accuracy and possibly high order
% -----------------------------------------------------------------------
disp(' ')
disp('Generate reduced systems of high accuracy and possibly high order')
disp('-----------------------------------------------------------------')
disp('Reduced order:')
disp(length(Ars))
disp('Reduced order:')
disp(length(Ard))
% -----------------------------------------------------------------------
% Generate reduced systems of low order
% -----------------------------------------------------------------------
disp(' ')
disp('Generate reduced systems of low order')
disp('-------------------------------------')
disp('Reduced order:')
disp(length(Ars))
disp('Reduced order:')
disp(length(Ard))
% -----------------------------------------------------------------------
% Destroy global data structures
% -----------------------------------------------------------------------
msns_m_d;
msns_l_d;
msns_s_d(p);
Figure 14: Results of demo_m2. The dotted line is the Bode magnitude plot of the
original system, i.e., the function ‖G(ω)‖. The solid and dashed lines are the
approximation errors, i.e., the functions ‖G(ω) − Ĝ(ω)‖, for LRSRM and DSPMR,
respectively. The lower two curves correspond to the first run (highly accurate
reduced systems, "numerically minimal realization") and the upper two to the second
run (low reduced order). (Plot not reproduced.)
C.4 Demo program for algorithms for Riccati equations and linear-quadratic optimal control problems

C.4.1 Demo program demo_r1
%
% SOLUTION OF RICCATI EQUATION BY LRCF-NM AND SOLUTION OF LINEAR-
% QUADRATIC OPTIMAL CONTROL PROBLEM BY LRCF-NM-I
%
% This demo program shows how both modes (i.e., the one for LRCF-NM and
% the one for LRCF-NM-I) work. Also, the use of user-supplied functions
% is demonstrated in this context.
% -----------------------------------------------------------------------
% Generate test problem
% -----------------------------------------------------------------------
%
% As test example, we use a simple FDM-semidiscretized PDE problem
% (an instationary heat equation on the unit square with homogeneous 1st
% kind boundary conditions).
%
% Note that the negative stiffness matrix A is symmetric.
A = fdm_2d_matrix(n0,'0','0','0');
B = fdm_2d_vector(n0,'.1<x<=.3');
C = (fdm_2d_vector(n0,'.7<x<=.9'))';
Q0 = 10 % Q = Q0*Q0' = 100
R0 = 1 % R = R0*R0' = 1
disp('Problem dimensions:')
% -----------------------------------------------------------------------
% Initialization/generation of data structures used in user-supplied
% functions
% -----------------------------------------------------------------------
%
% Note that we use the routines 'au_*' rather than the routines 'as_*',
% although A is symmetric. This is because the ADI shift parameters w.r.t.
% the nonsymmetric closed loop matrix A-B*K' (generated in the routine
% 'lp_lrnm') might not be real. The routines 'as_*' are restricted to
% problems where the shift parameters are real.
name = 'au';
% -----------------------------------------------------------------------
% Compute LRCF Z0 by LRCF-NM
% -----------------------------------------------------------------------
%
% The approximate solution is given by the low rank Cholesky factor Z0,
% i.e., Z0*Z0' is approximately X0, where X0 is the solution of the
% transformed Riccati equation
%
% C0'*Q0*Q0'*C0+A0'*X0+X0*A0-X0*B0*inv(R0*R0')*B0'*X0 = 0.
%
% The stopping criteria for both the (outer) Newton iteration and the
% (inner) LRCF-ADI iteration are chosen such that the iterations are
% stopped shortly after the residual curves stagnate. This requires
% the sometimes expensive computation of the Lyapunov equation
% and Riccati equation residual norms.
% (criterion is "avoided")
disp('Termination flag:')
flag_r
disp('Termination flags:')
flag_l
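In the dense case, the outer iteration of LRCF-NM is Kleinman's Newton method: each step solves one Lyapunov equation with the current closed loop matrix A − B K'. A Python sketch (for a stable A, so K = 0 is an admissible start; LRCF-NM replaces the dense Lyapunov solves by LRCF-ADI with low rank factors):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def care_newton(A, B, C, Q, R, steps=20):
    """Newton iteration for the Riccati equation
        C'*Q*C + A'*X + X*A - X*B*inv(R)*B'*X = 0,
    started with the zero feedback (valid because A is assumed stable)."""
    K = np.zeros((A.shape[0], B.shape[1]))
    X = np.zeros_like(A)
    for _ in range(steps):
        Ac = A - B @ K.T                          # closed loop matrix
        W = C.T @ Q @ C + K @ R @ K.T             # right-hand side of the step
        X = solve_continuous_lyapunov(Ac.T, -W)   # Ac'*X + X*Ac = -W
        K = X @ B @ np.linalg.inv(R)              # next feedback iterate
    return X, K
```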
% -----------------------------------------------------------------------
% Compute (approximately) optimal feedback K0 by LRCF-NM-I
% -----------------------------------------------------------------------
%
% Here, the matrix K0 that solves the (transformed) linear-quadratic
% optimal control problem is computed by LRCF-NM-I.
%
% The stopping criteria for both the (outer) Newton iteration and the
% (inner) LRCF-ADI iteration are chosen by inexpensive heuristic
% criteria.
randn('state',0);
disp('Termination flag:')
flag_r
disp('Termination flags:')
flag_l
% -----------------------------------------------------------------------
% Postprocessing, destroy global data structures
% -----------------------------------------------------------------------
%
% Note that both the LRCF Z0 and the state feedback K0 must be
% postprocessed in order to obtain the results for the original problems.
Z = au_pst(Z0,iprm);
K = au_pst(K0,iprm);
disp('Size of Z:');
size_Z = size(Z)
disp('Is Z real ( 0 = no, 1 = yes )?')
Z_is_real = ~any(any(imag(Z)))
disp('Is K real ( 0 = no, 1 = yes )?')
K_is_real = ~any(any(imag(K)))
% -----------------------------------------------------------------------
% Verify the result
% -----------------------------------------------------------------------
%
% Note that this is only an "illustrative" way of verifying the accuracy
% by computing the (normalized) residual norm of the Riccati equation.
% A more practical (because less expensive) way is to evaluate the
% residual norm by means of the routine 'lp_rcnrm' (which must be applied
% before postprocessing!), if the residual norms have not been generated
% during the iteration.
%
% In general, the result of LRCF-NM-I cannot be verified this way. However,
% we will compare the delivered feedback K with the feedback matrix computed
% by use of the LRCF Z.
    ‖K − K^(E)‖_F / max{‖K‖_F, ‖K^(E)‖_F} ≈ 5.9 · 10^-16.
Figure 15: Normalized residual norm history of the LRCF-NM in demo_r1. (Plot not
reproduced.)
References
[1] F. Aliev and V. Larin, Construction of square root factor for solution of the Lya-
punov matrix equation, Sys. Control Lett., 20 (1993), pp. 109–112.
[4] P. Benner and R. Byers, An exact line search method for solving general-
ized continuous-time algebraic Riccati equations, IEEE Trans. Automat. Control, 43
(1998), pp. 101–107.
[6] P. Benner, J. Li, and T. Penzl, Numerical solution of large Lyapunov equations,
Riccati equations, and linear-quadratic optimal control problems, in preparation, Zen-
trum f. Technomathematik, Fb. Mathematik und Informatik, Univ. Bremen, 28334
Bremen, Germany, 2000.
[10] R. Byers, Solving the algebraic Riccati equation with the matrix sign function, Linear
Algebra Appl., 85 (1987), pp. 267–279.
[11] E. Davison, A method for simplifying linear dynamic systems, IEEE Trans. Automat.
Control, 11 (1966), pp. 93–101.
[12] P. Feldmann and R. Freund, Efficient linear circuit analysis by Padé approx-
imation via the Lanczos process, IEEE Trans. Computer-Aided Design, 14 (1995),
pp. 639–649.
[13] R. Freund, Reduced-order modeling techniques based on Krylov subspaces and their
use in circuit simulation. Numerical Analysis Manuscript No. 98-3-02, Bell Labora-
tories, Murray Hill, New Jersey, 1998.
[14] R. Freund and N. Nachtigal, QMR: a quasi-minimal residual method for non-
Hermitian linear systems, Numer. Math., 60 (1991), pp. 315–339.
[16] , A rational Lanczos algorithm for model reduction, Numer. Alg., 12 (1996),
pp. 33–63.
[17] J. Gardiner and A. Laub, Parallel algorithms for the algebraic Riccati equations,
Internat. J. Control, 54 (1991), pp. 1317–1333.
[18] K. Glover, All optimal Hankel norm approximations of linear multivariable systems
and their L∞ -error bounds, Internat. J. Control, 39 (1984), pp. 1115–1193.
[19] G. Golub and C. Van Loan, Matrix Computations, The Johns Hopkins University
Press, Baltimore, 3rd ed., 1996.
[20] S. Hammarling, Numerical solution of the stable, non–negative definite Lyapunov
equation, IMA J. Numer. Anal., 2 (1982), pp. 303–323.
[21] C. He and V. Mehrmann, Stabilization of large linear systems, in Preprints of the
European IEEE Workshop CMP’94, Prague, September 1994, L. Kulhavá, M. Kárný,
and K. Warwick, eds., 1994, pp. 91–100.
[22] D. Hu and L. Reichel, Krylov-subspace methods for the Sylvester equation, Linear
Algebra Appl., 172 (1992), pp. 283–313.
[23] I. Jaimoukha, A general minimal residual Krylov subspace method for large scale
model reduction, IEEE Trans. Automat. Control, 42 (1997), pp. 1422–1427.
[24] I. Jaimoukha and E. Kasenally, Krylov subspace methods for solving large Lya-
punov equations, SIAM J. Numer. Anal., 31 (1994), pp. 227–251.
[25] , Oblique projection methods for large scale model reduction, SIAM J. Matrix
Anal. Appl., 16 (1995), pp. 602–627.
[26] , Implicitly restarted Krylov subspace methods for stable partial realizations,
SIAM J. Matrix Anal. Appl., 18 (1997), pp. 633–652.
[27] C. Kenney and A. Laub, The matrix sign function, IEEE Trans. Automat. Control,
40 (1995), pp. 1330–1348.
[28] D. Kleinman, On an iterative technique for Riccati equation computations, IEEE
Trans. Automat. Control, 13 (1968), pp. 114–115.
[29] P. Lancaster and L. Rodman, Algebraic Riccati Equations, Clarendon Press, Ox-
ford, 1995.
[30] A. Laub, A Schur method for solving algebraic Riccati equations, IEEE Trans. Au-
tomat. Control, 24 (1979), pp. 913–921.
[31] J. Li, F. Wang, and J. White, An efficient Lyapunov equation-based approach for
generating reduced-order models of interconnect, in Proc. 36th IEEE/ACM Design
Automation Conference, New Orleans, LA, 1999.
[32] J. Li and J. White, Efficient model reduction of interconnect via approximate sys-
tem Grammians, in Proc. IEEE/ACM International Conference on Computer Aided
Design, San Jose, CA, 1999.
[33] J. Li and T. Penzl, Approximate balanced truncation of large generalized state-
space systems, in preparation, Fak. f. Mathematik, TU Chemnitz, D-09107 Chemnitz,
2000.
[34] V. Mehrmann, The Autonomous Linear Quadratic Control Problem, Theory and
Numerical Solution, vol. 163 of Lecture Notes in Control and Information Sciences,
Springer-Verlag, Heidelberg, 1991.
[36] D. Peaceman and H. Rachford, The numerical solution of elliptic and parabolic
differential equations, J. Soc. Indust. Appl. Math., 3 (1955), pp. 28–41.
[37] T. Penzl, A cyclic low rank Smith method for large sparse Lyapunov equations. To
appear in SIAM J. Sci. Comput.
[39] , Algorithms for model reduction of large dynamical systems. Submitted for pub-
lication., 1999.
[40] , Eigenvalue decay bounds for solutions of Lyapunov equations: the symmetric
case. Submitted for publication., 1999.
[41] L. Pillage and R. Rohrer, Asymptotic waveform evaluation for timing analysis,
IEEE Trans. Computer-Aided Design, 9 (1990), pp. 352–366.
[42] J. Roberts, Linear model reduction and solution of the algebraic Riccati equation by
use of the sign function, Internat. J. Control, 32 (1980), pp. 677–687.
[43] Y. Saad, Numerical solution of large Lyapunov equations, in Signal Processing, Scat-
tering, Operator Theory and Numerical Methods, M. Kaashoek, J. V. Schuppen, and
A. Ran, eds., Birkhäuser, Boston, MA, 1990, pp. 503–511.
[44] M. Safonov and R. Chiang, A Schur method for balanced-truncation model reduc-
tion, IEEE Trans. Automat. Control, 34 (1989), pp. 729–733.
[46] G. Starke, Optimal alternating direction implicit parameters for nonsymmetric sys-
tems of linear equations, SIAM J. Numer. Anal., 28 (1991), pp. 1431–1445.
[50] , Iterative solution of the Lyapunov matrix equation, Appl. Math. Lett., 1 (1988),
pp. 87–90.
[51] , The ADI minimax problem for complex spectra, in Iterative Methods for Large
Linear Systems, D. Kincaid and L. Hayes, eds., Academic Press, San Diego, 1990,
pp. 251–271.