Wavelets 99
Wavelets 99
(,)
(t)
_
(,)R
2
(,)
(t) = s
1/2
(
t
s
)e
it
where
(, ) R
2
( phase space)
(s R
+
, s = const ( scale)).
2
The decomposition of f in terms of phase space components is given by
f =
1
2
_
d
_
d
,
, f)
,
. (1)
Denition 2.2 (Ane Coherent States) An ane coherent state is a family, D, of square
integrable functions generated from a single function L
2
(R) by dilations and translations of
time-frequency atoms:
D =
_
(s,)
(t)
_
(s,)Reals
+
R
(s,)
(t) = s
1/2
(
t
s
)e
i
0
t
s
where
R, s R
+
,
0
= constant.
The Decomposition of f in terms of time-frequency atoms is given by
f =
1
2
_
d
_
ds
(s,)
, f)
(s,)
. (2)
The elements
(s,)
are called wavelets if the generating function satises
1.
_
()[
2
<
2.
(0) = 0 (i.e.
_
(t)dt = 0)
where
is the Fourier transform of .
The set of functions
(s,)
(t)[s R
+
, R constitutes a basis for L
2
(R).
Remark 1. Frequency is now a function of scale. The more we know about the location of certain
elements, the less we know about its frequency content. This is a manifestation of Heisenbergs
uncertainty principle.
Remark 2. If we allow the decomposition around various values of
0
the family of functions is
called a wave packet. Clearly, wave packet families are over complete.
2.1 Dictionaries
In general, from the family of functions
D = (
(t))
R
+
R
2
sometimes referred to as dictionaries [3], we can construct any function f(t) L
2
(R):
f =
1
2
_
R
+
R
2
, f)
d. (3)
The family, D is highly redundant, or over complete, i.e. it covers L
2
(R) but its elements are not
linearly independent as with a basis. The challenge therefor in using these families of functions
is to nd the best representation of f in terms of
n
(t)
where
c
n
=
n
, f) = (Tf)
n
, T : L
2
(R) l
2
(Z
3
).
We require that the mapping T be continuous, bounded, one-to-one and, for practical applications,
numerically stable. Thus the operator T must satisfy
aI T
T bI (4)
(0 < a b < ), or
a[[f[[
2
n
[
n
, f)[
2
b[[f[[
2
. (5)
Denition 2.3 The set of vectors
j
[j J H, a Hilbert space, for which
jJ
[
j
, f)[
2
satises eqn(5) above is called a frame. The constants a and b are the frame bounds. When a = b
the frame is called a tight frame.
If a = b = 1 then the frame constitutes a basis, otherwise the constant a = b for tight frames
indicates the rate of redundancy of the frame. While it is not always desirable or possible to have
a tight frame, for easy inversion formulas with rapid convergence we wish to have the ratio
a
b
be as
close to 1 as possible [3].
Given the frame (
j
: j J) H, a Hilbert space, dene the mapping T : H l
2
(J) by
(Tf)
j
=
j
, f)
where l
2
(J) is the space of square summable complex sequences indexed by the set J. T is the
frame operator corresponding to the frame (
j
)
jJ
. The adjoint operator T
maps l
2
(J) to H and
is dened by
T
c =
jJ
c
j
j
where c = (c
j
)
jJ
l
2
(J). So by eqn(4) we have aI T
T is
invertible with bounded inverse. Hence b
1
I (T
T)
1
a
1
I.
Proposition 2.4 (Daubechies [3]) The family of functions (
j
)
jJ
dened by
j
= (T
T)
1
j
is the dual frame to (
j
)
jJ
. The family (
j
)
jJ
is called an atom and satises
1. b
1
[[f[[
2
jJ
[
j
, f)[
2
a
1
[[f[[
2
for f H
2.
T = T(T
T)
1
satises
T
T = (T
T)
1
3.
T
T = I = T
T where
TT
= T
T
T, and
T =
T
T = T
1
, then
1. T =
jJ
j
j
, )
2. aI T bI and
3.
j
=
T
1
j
.
Thus we can construct the inverse transformation as follows:
T f =
jJ
j
j
, f)
which implies that
f =
jJ
T
1
j
, f)
or
f =
jJ
j
, f). (6)
Similarly
f =
jJ
j
, f)
j
. (7)
Daubechies shows that the series expansion for
j
in terms of
j
is
j
=
2
a + b
k=0
_
I
2T
a + b
_
k
j
. (8)
Thus we have a formula for calculating the coecients of the expansion of f in terms of
j
. This
is known as the atomic decomposition of f and is the foundation of the Basis Pursuit algorithm
discussed below.
3 From Filter Bank Theory to the Discrete Wavelet Transform
Bandpass lters for decomposing a signal into components at dierent frequencies are frequently
used in speech analysis and acoustic signal processing. We begin by developing ideal bandpass
lters to motivate the evolution toward the discrete wavelet transform.
3.1 Ideal Bandpass Filters
We lter our signal f into m 2 frequency channels. By a frequency channel we mean a Hilbert
space of sequences (c
k
)
kZ
satisfying the following:
1.
k
[c
k
[
2
<
2. f() =
c
k
e
ik
= 0 for all / I where I [0, 2] is an interval of length 2/m.
5
Each bandpass lter output is formed by the convolution
f
i
(t) =
_
f()K
i
(t )d. (9)
The lters, K
i
are constructed so that the transfer functions (i.e. the windows in the frequency
domain) sum to 1 for all frequencies, thus
i=0
f
i
(t) = f(t). Thus we have a lossless coding of the
signal f. Compact coding may be achieved by subband coding, which we review next.
3.2 Subband Coding
For bandlimited signals there is a great deal of redundant information. Subband coding seeks to
throw out unnecessary information while maintaining lossless coding. To our signal f(t) we apply
the constraint that it be bandlimited, i.e.
T[f(t)] =
f() = 0 for
,
where
f is the Fourier transform of the signal and
n
=
1
2t
. We
partition the frequency axis into subintervals of length
N/2
and lter the signal into two frequency
channels with the lters K
0
and K
1
. Next we subsample the corresponding output, retaining every
nth data point. This operation is known as decimation, denoted by the decimation operator D.
Subband coding is the basis for quadrature mirror lters which Mallat exploited to develop his
multi-resolution analysis.
3.3 Quadrature Mirror Filters
Denition 3.1 K
0
and K
1
are quadrature mirror lters if for all signals of nite energy
[[
f
0
[[
2
+[[
f
1
[[
2
= [[f[[
2
where
f
i
= DK
i
(f) = T
i
(f), f is the sampled signal, and D is the decimation operator.
Theorem 3.2 (Meyer [8]) Let K
0
() and K
1
() be the transfer functions of the lters K
0
and
K
1
. The following two properties are equivalent:
1.
1
2
_
K
0
() K
1
()
K
0
( + ) K
1
( + )
_
is unitary a.e. [0, 2].
2. The operator T
0
T
1
: l
2
(Z) l
2
(2Z) l
2
(2Z) is an isometric isomorphism, i.e. T
0
T
0
(f)
T
1
T
1
(f).
6
For proof and further discussion see [8].
If the lters K
0
and K
1
satisfy certain conditions they can separate the signal f into trends and
uctuations. Before dening trend and uctuation we must rst introduce a bit of notation.
Let (t) L
2
(R) satisfy
k
(t) = (t k) is an orthonormal sequence in L
2
(R), k Z.
Let V
0
L
2
(R) be the closed linear subspace generated by the sequence
k
. More generally,
dene
V
j
= f [ f(2
j
t) V
0
.
Suppose V
j
, j Z form a nested sequence satisfying
V
j
= 0, and
V
j
is dense in L
2
(R).
Dene
j,k
(t) = 2
j/2
(2
j
t k), j, k Z.
Denition 3.3 The trend f
j
at scale 2
j
of f L
2
(R) is dened by
f
j
(t) =
k
f,
j,k
)
j,k
(t) = (T
0
T
0
)
j
(f).
Denition 3.4 Fluctuations, denoted by d
j
(t), are dened by
d
j
(t) = f
j+1
(t) f
j
(t)
= f
j+1
(T
0
T
0
)
j
f
= T
1
T
1
f
j
= (T
1
T
1
)(T
0
T
0
)
j
f.
By construction one can make the trend
f
0
twice as regular as the signal f. Furthermore,
f
0
can be subsampled keeping only half as many points. This is the basis for the superior compression
achieved by wavelet decompositions over traditional Fourier methods.
Denition 3.5 Multi-resolution analysis is the process of achieving ner frequency resolution by
iterating the quadrature mirror lters. Specically, a multi-resolution analysis consists of partition-
ing L
2
(R) into a nested sequence of spaces V
j
satisfying
. . . V
1
V
0
V
1
. . .
where V
m
L
2
(R) as m . Orthonormal projections of f onto V
j
correspond to approxima-
tions of resolution 2
j
.
7
3.4 Mallats Herringbone algorithm [10]
Let J
j
= 2
j
Z constitute a sequence of nested grids from ne, J
N
(N 1) to coarse, J
0
. Let f(jt)
be the sampled signal on the grid J
N
, denoted by f
0
l
2
(J
N
). Given two quadrature mirror lters
K
0
and K
1
, dene the sequence (r
1
, r
2
, . . . , r
N
, f
N
), f
N
l
2
(J
0
) by
r
j
= T
1
T
1
f
j1
,
f
j
= (T
0
T
0
)f
j1
where T
0
= DK
0
, T
1
= DK
1
, and D is the decimation operator.
Property 3.6 The mapping f l
2
(J
N
) (r
1
, r
2
, . . . , r
N
, f
N
) is an orthogonal isometry.
The decomposition of f into the above sequence is Mallats herringbone algorithm. This
algorithm has many nice properties that lead ultimately to wavelets. The above fact guarantees
that we can achieve a perfect reconstruction with nice functional properties. In fact, we can
go even further to assert that, given lters K
0
and K
1
satisfying certain decay properties, then
Mallats algorithm converges to the decomposition of f in an orthonormal wavelet basis. Since this
algorithm is performed on a discrete grid we must restrict the signal accordingly by preltering
with g satisfying
1. g(t) C
r
and g(t) and its derivatives, g
(t), . . . , g
(r)
(t) all decrease rapidly at innity.
2.
_
g(t)dt = 1 and
3. g(2k) = 0 if k Z, k ,= 0.
If we consider the grid J
N
= 2
N
Z and dene g
N
(t) = 2
N
g(2
N
t) then in the limit as N tends to
we have the following result:
Theorem 3.7 Suppose the quadrature mirror lters K
0
and K
1
decay rapidly at innity and that
the transfer functions of K
0
satises K
0
(0) =
2, K
0
() ,= 0 for [/2, /2]. Then Mallats
herringbone algorithm applied to g
N
f, the preltered signal, yields a decomposition of f in an
orthonormal wavelet basis as N .
Mallats algorithm provided the tools and inspiration necessary for Daubechies to demonstrate
by construction that for each integer r 0 there exists a function (t) C
r
such that
2
j/2
(2
j
t k), (j, k Z)
is and orthonormal basis of L
2
(R) [8].
3.5 The Wavelet Transform
Denition 3.8 A basic wavelet (or generating wavelet) is a function (t) R
n
satisfying
1.
_
()[
2
<
8
2.
(0) = 0 ( i.e.
_
(t)dt = 0)
where
is the Fourier transform of .
The set of functions
a,b
(t)[a, b R, a > 0 where
a,b
=
1
a
(
t b
a
)
constitutes a basis for L
2
(R).
Denition 3.9 The continuous wavelet transform of f(t) with respect to the wavelet (t) is
W
f
(a, b) = f,
a,b
) =
_
f(t)
a,b
(t)dt.
Denition 3.10 The corresponding inverse wavelet transform is given by
f(t) =
1
c
_
0
W
f
(a, b)
a,b
(t)
1
a
2
dbda
where a is the scale parameter and b is the position parameter.
For a bandlimited continuous function f(t) and a set of orthonormal wavelets we have:
c
j,k
=
_
f(t)
j,k
(t)dt
and
f(t) =
j,k
c
j,k
j,k
(t).
Denition 3.11 The discrete wavelet transform is given by:
f(it) =
j,k
c
j,k
j,k
(it)
where
c
j,k
=
t
f(it)
j,k
(it).
We dene the generating wavelet via the quadrature mirror lters and use the results of
Theorem 3.4.2. The quadrature mirror lters giving rise to a variety of wavelets have been dened
and tabulated for use with the discrete wavelet transform [4]. Recall, however, that we are not
limited to wavelet bases, and may venture into decompositions in terms of overcomplete dictionaries
to nd a better t. This is discussed briey in the sections that follow.
9
4 Further Renements: Wave Packets
Wave packets are a collection D of orthonormal bases composed of functions of the form
n
(2
j
tk)
where n is the frequency parameter, j the scale and k the location. These functions can be
trigonometric or non-trigonometric and form a covering of the frequency space. The functions are
computed with a quadrature mirror lter bank algorithm to form the generating function . The
Basis Pursuit, Matching Pursuit, and Best Orthogonal Basis algorithms all choose elements from
these families of basis functions to form the optimal decomposition of the signal f. We discuss
these algorithms below.
5 Wavelet Decomposition Algorithms
5.1 Information Measures
With the exception of the discrete wavelet transform, most algorithms for nding wavelet decom-
positions seek to minimize the information cost of the functional decomposition. The information
measures used by the algorithms are:
1. #k such that [c
k
[ > (cardinality)
2.
k
log(1 +[c
k
/) (bit length norm)
3. [[c[[, c l
2
(standard norms)
4.
2
(v; H
i
) =
[[v
2
i
ln[[v
i
[[
2
(entropy)
5.2 Best Orthogonal Basis
The Best Orthogonal Basis algorithm of Coifman and Wickerhauser nds the wave packet decom-
position of a signal that minimizes the information cost of the decomposition, i.e
(T) Min
D
/(f)
where /() is and additive information cost functional, (i.e /(0) = 0 and /(f
i
) =
/(f
i
))
and is a basis belonging to the orthogonal family of bases D. The preferred cost functional is
Shannon entropy (see [2] and [1] for details).
5.3 Matching Pursuit
Matching pursuit seeks to solve the problem
(T) Min
0
D
[[Rf[[
where
Rf = f
0
, f)
0
and
D = (
)
J
; J is an index set with J = R
+
R
2
.
10
The algorithm works by iteratively projecting the residuals onto the element of D closest to
the function at the previous step. A choice function, c, nds the element
0
D that is nearly
optimal in the sense that
[f,
0
)[ sup
J
[f,
)[
where is a relaxation parameter, [0, 1]. See [11] for details.
5.4 Basis Pursuit
Basis Pursuit is similar to the method of frames algorithm, but exploits the duality of frames
and atoms in a primal-dual logarithmic barrier interior point method that reduces the problem
to a perturbed linear program. In both the method of frames and basis pursuit the following
optimization problem is solved:
(T) Min
cl
2
(J)
[[c[[
subject to
c
j
j
= f
where c is the vector of coecients and | | is the l
1
-norm in the method of frames and the l
2
-norm
in basis pursuit. Recall from section 2 that c
j
=
j
, f) = T(T
T)
1
f. The algorithm solves
(T
jJ
[
j
, f)[
2
b[[f[[
2
, (0 < a b ).
Thus, for tight frames,
aI =
af =
f
f = a
1
f = a
1
m,n
(m,n)
(m,n)
, f)
y = (
)
1
a
1
f.
Thus the challenge that both the method of frames and basis pursuit address is that of nding the
dictionary that will give a tight frame for the signal f. See [9] for details.
6 Conclusion
Wavelets are a powerful too for signal analysis. One goal is to build a library of signal processing
algorithms that will cover a wide class of signals with several goals in mind, foremost among these
is diagnostics. To this end we are also interested in several other stand-alone topics in signal
processing: coding, quantization and compression, transmission and storage, and reconstruction.
11
Spectral analysis is the rst step in isolating key features of signals which neural nets, in turn,
use for discrimination. Because of the inherent diculty of representing functions with compact
support by the superposition of functions that are not compactly supported, the use of standard
Fourier techniques is not appropriate for certain classes of signals, particularly non-stationary
signals. Multi-resolution analysis and time-frequency analysis, developed in the forties for the
analysis of acoustic signals [12, 5] provided the framework and the inspiration for transformations
using functions of nite duration, known today as wavelets. Interestingly, it was also in the eld
of geophysics that the rst wavelet transform was proposed [6, 7]. There are many wavelets to
choose from, each particularly well suited for specic types of signals. The introduction of these
new techniques has expanded our ability to gather and represent information, but the variety of
methods has introduced a new problem: which one to use.
References
[1] R.R. Coifman and M.V. Wickerhauser. Best-adapted wave packet bases. Technical report,
Department of Mathematics, Yale University, May 1990.
[2] R.R. Coifman and M.V. Wickerhauser. Entropy-based algorithms for best-basis selection.
IEEE Transactions on Information Theory, 38:713718, 1992.
[3] I. Daubechies. The wavelet transform, time-frequency localization and signal analysis. IEEE.
Trans.Info. Thry., 36(5), 1990.
[4] D.Donoho et al. WaveLab.701. ftp://playfair.stanford.edu/pub/wavelab, 1996.
[5] D. Gabor. Theory of communication. IEEE Proc, (93):429441, 1946.
[6] A. Grossman and J. Morlet. Decomposition of hardy functions into square integrable wavelets
of constant shape. SIAM J. Math. Anal, 15:723736, 1984.
[7] A. Grossman, J. Morlet, and T. Paul. Transforms associated to square integrable group
representations. J. Math Phys, 26:24732479, 1985.
[8] Y. Meyer. Wavelets: Algorithms and Applications. SIAM, 1993.
[9] S.Chen. Basis Pursuit. PhD thesis, Stanford Univ., 1995.
[10] S.Mallat. A theory for multiresolution signal decomposition: The wavelet representation. IEEE
Trans. Pattern Anal.Machine Intell., 11:674693, 1989.
[11] S.Mallat and Zhifeng Zhang. Matching pursuits with time-frequency dictionaries. IEEE Trans.
Signal Proc., 41(12):3397415, 1993.
[12] J. Ville. Theorie et applications de la notion de signal analytique. Revue Cables et Tansmis-
sions, 1:6174, 1947.
12